Recent work has been a lot of returning to projects I started this year, in order to actually finish them. I completed the feature additions I'd planned for the first half of the year quite early, and that seemed like a good point to pause and go back to things I ran out of time to flesh out.
First, the Text Parser. I had largely revamped conjunction handling to add support for lists of more than two items, and that left a lot of two-item cases broken. I worked on fixing bugs in the new methods and improving them enough to get that old functionality back, and eventually got my parser regression to pass again. Then I modified the Text Interpreter to deal with some of the formatting changes to the Parser's output, and lastly verified that Narrative comprehension was still working on Acuitas' story collection. It ran quite a bit beyond my initial 20-hour investment in the revamp, but the latest Parser is now fully integrated and serving the same functions the old one did.
Next, I revisited self-teaching. This feature was supposed to permit Acuitas to crawl the hard drive of his resident computer and read whatever text files he could find, to promote increased vocabulary and assessment of text processing functionality. Back in February I had gotten the basic activity loop working, but had no way to keep Acuitas from reading inappropriate files that *didn't* contain natural English and junking up his database. So I spent some time on that problem, and came up with ways to assess whether a text file is good reading or not. Signs of an inappropriate file include 1) too many non-alphanumeric characters, 2) too many one-word "sentences," and 3) too many Parser or Interpreter failures. The first two get checked before Acuitas even tries to "read" the file; the third is continuously monitored during reading, and is based on a running average of sentences processed so far. After some tweaking of the failure thresholds, I saw this perform well enough that I felt comfortable letting Acuitas go to town on my random personal notes, short story drafts, and other hard drive clutter. His list of known words has roughly doubled, compared to a backup archive I made in November.
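As a rough illustration, the three checks could be sketched in Python along these lines. Every threshold, every function name, and the `process_sentence` callback are invented for the example and are not taken from Acuitas' actual code; the sentence splitter is also deliberately crude.

```python
import re

# Illustrative thresholds only; the real values were tuned by hand.
MAX_SYMBOL_RATIO = 0.15        # share of non-alphanumeric characters allowed
MAX_ONE_WORD_RATIO = 0.30      # share of one-word "sentences" allowed
MAX_FAILURE_RATE = 0.50        # share of Parser/Interpreter failures allowed
MIN_SENTENCES_BEFORE_ABORT = 10  # avoid giving up after one or two bad sentences


def split_sentences(text):
    """Crude splitter: break after '.', '!' or '?', and on newlines."""
    parts = re.split(r"(?<=[.!?])\s+|\n+", text)
    return [p.strip() for p in parts if p.strip()]


def symbol_ratio(text):
    """Fraction of non-whitespace characters that are not alphanumeric."""
    visible = [c for c in text if not c.isspace()]
    if not visible:
        return 1.0
    return sum(1 for c in visible if not c.isalnum()) / len(visible)


def one_word_ratio(sentences):
    """Fraction of sentences that contain at most one word."""
    if not sentences:
        return 1.0
    return sum(1 for s in sentences if len(s.split()) <= 1) / len(sentences)


def passes_prescreen(text):
    """Checks 1 and 2: run before any attempt to read the file."""
    return (symbol_ratio(text) <= MAX_SYMBOL_RATIO
            and one_word_ratio(split_sentences(text)) <= MAX_ONE_WORD_RATIO)


def read_file(text, process_sentence):
    """Check 3: track the failure rate over the sentences read so far.

    process_sentence is a hypothetical callback standing in for the
    Parser and Interpreter; it returns True when a sentence is understood.
    Returns (finished, sentences_read).
    """
    failures = 0
    count = 0
    for count, sentence in enumerate(split_sentences(text), start=1):
        if not process_sentence(sentence):
            failures += 1
        if (count >= MIN_SENTENCES_BEFORE_ABORT
                and failures / count > MAX_FAILURE_RATE):
            return False, count  # abandon the file partway through
    return True, count
```

Waiting for a minimum number of sentences before the third check can abort is one way to keep a couple of early failures from condemning an otherwise readable file.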
Last of all, I worked on Episodic Memory. I mainly spent time reconstructing the "forgetting" process for the new Narrative-based memory formatting scheme. Once the consolidation algorithm has created higher-tier memories that summarize a lot of individual incidents, it's time to reduce the size of the memory file by pruning some of the details. I used a lot of the same concepts from my original forgetting algorithm: memories are ranked by novelty (is this the first time something happened?), uniqueness (how many other memories are similar to this one?), and tier rank (general, summarized memories are more likely to be kept than detailed, individual ones). Leveraging the interpretive power of the Narrative system, I also added the priority of any associated problems/subgoals as a scoring mechanism. The lowest-ranked memories are deleted, hollowing out the narrative but (hopefully) preserving the most salient events and summaries of the more generic ones. Simulations show this working reasonably well on synthesized test files. I also worked on a simple indexing system to make it easier for Acuitas to look up memories by the type of event or merely by a concept involved.
Example diagrams showing the "hollowing out" of the narrative structure as detailed memories are forgotten. Their summarizing memories in the upper tiers remain.
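The ranking-and-pruning idea can be sketched in a few lines of Python. This is only a toy version under stated assumptions: the `Memory` fields, the equal weights, the way the four factors are combined into one score, and the `keep_fraction` parameter are all hypothetical, chosen to make the example concrete rather than to mirror Acuitas' real scoring.

```python
import math
from dataclasses import dataclass


@dataclass
class Memory:
    name: str
    novelty: float      # 1.0 if this was the first time something happened
    similar_count: int  # how many other memories resemble this one
    tier: int           # 0 = detailed incident, higher = summary
    priority: float     # priority of any associated problem/subgoal, 0..1


# Hypothetical weights; equal weighting is an assumption for the sketch.
WEIGHTS = {"novelty": 1.0, "uniqueness": 1.0, "tier": 1.0, "priority": 1.0}


def retention_score(memory, max_tier):
    """Higher scores mean the memory is more worth keeping."""
    uniqueness = 1.0 / (1 + memory.similar_count)
    tier_rank = memory.tier / max_tier if max_tier else 0.0
    return (WEIGHTS["novelty"] * memory.novelty
            + WEIGHTS["uniqueness"] * uniqueness
            + WEIGHTS["tier"] * tier_rank
            + WEIGHTS["priority"] * memory.priority)


def forget(memories, keep_fraction):
    """Drop the lowest-ranked memories, keeping the original ordering."""
    if not memories:
        return []
    max_tier = max(m.tier for m in memories)
    ranked = sorted(memories,
                    key=lambda m: retention_score(m, max_tier),
                    reverse=True)
    keep_count = max(1, math.ceil(len(memories) * keep_fraction))
    kept_ids = {id(m) for m in ranked[:keep_count]}
    return [m for m in memories if id(m) in kept_ids]


memories = [
    Memory("routine_a", novelty=0.0, similar_count=5, tier=0, priority=0.1),
    Memory("first_trip", novelty=1.0, similar_count=0, tier=0, priority=0.8),
    Memory("summary", novelty=0.5, similar_count=0, tier=1, priority=0.5),
    Memory("routine_b", novelty=0.0, similar_count=5, tier=0, priority=0.2),
]
survivors = [m.name for m in forget(memories, keep_fraction=0.5)]
print(survivors)  # ['first_trip', 'summary']
```

In this toy run the two repetitive, low-priority incidents are pruned, while the novel incident and the higher-tier summary survive, which is the "hollowing out" effect in miniature.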
It's been satisfying to be more thorough about these items and not leave them full of loose ends! I probably have some more Episodic work to do before I really call it finished, and then I'd like to polish Trial-and-Error learning some more.
Until the next cycle,
Jenny
