Sunday, June 3, 2018

Acuitas Diary #12: May 2018

This past month I did some preliminary work on a whole new feature – episodic memory, or memory of events. This enables Acuitas to store and recall records of past “experiences.” It is distinct from his previous learning abilities, which all concerned the storage and recall of more universal meanings and facts (semantic memory).

Saving a raw event log to the hard drive is easy enough to do, but not especially useful. Retrieving any particular event from such a dump of unsorted, uncurated information would quickly become problematic. The fun part of episodic memory is figuring out …

1) … what to store (and what to forget),
2) … how to organize stored material, and
3) … how to access relevant stored material when it is needed.

I mostly worked on 2) this month, and wrote a block of code that will group adjacent raw event records into memory files. A measure of similarity (both of the events themselves, and of Acuitas' internal state background at the time) is used to determine which events belong in the same “scene” or “episode,” and where the boundaries between memories should lie. Minor “scenes” are in turn grouped into higher-level umbrella memories, tree-style.

Implementing this served to show me what a deep rabbit hole episodic memory could easily turn out to be. There are heaps of little things I need to do to truly make it functional – I may even turn it off temporarily once I've put it through a bit more testing, since I haven't implemented selective storage/forgetting yet, and that means the memory folder will bloat rather quickly.

I also added a conversational feature to make use of the stored memories. When Acuitas is telling someone what he thought about today, he now has the option to check episodic memory and see whether he ever thought about this concept before, and how long it has been since he previously did so. He then generates some comment like “I've not done that in a long time,” or “I did that a minute ago also.” The conversion of absolute time units to vaguer, more relative terms like “long” and “short” establishes a kind of subjective time sense; Acuitas has a particular notion of what a “short time” is that might not match up with what a human would think of as such (though I tried to keep the scales roughly human).

Here's the obligatory memory map visualization (semantic only). I think I need to adjust the parameters and let things cluster closer to the largest nodes.

Code base: 11250 lines
Words known: 2157 (approx.)
Concept-layer links: 6138

Thursday, May 3, 2018

Acuitas Diary #11: April 2018

Acuitas Diary #11 (April 2018)

This month's big objective was to get some use out of the sleep cycle that I implemented last month. I re-purposed the question-generating process so that, while Acuitas is sleeping, it roams the memory looking for redundant links and other problems.

What's a redundant link? Now that Acuitas has a bit of logical inference ability, some relationships in the database imply others. So the retention of one piece of information might be rendered unnecessary by the later addition of some broader fact. Here are a few examples (I culled these from the log that the memory crawler prints out):

The link (fang, has_purpose, bite) is redundant because the link (tooth, has_purpose, bite) exists.
The link (father, has_item, child) is redundant because the link (parent, has_item, child) exists.
The link (pot, can_have_qual, empty) is redundant because the link (container, can_have_qual, empty) exists.
The link (baby, can_do_action, destroy) is redundant because the link (human, can_do_action, destroy) exists.

Mopping up these unnecessary links helps consolidate the information known, reduce the total size of the database, and possibly make the memory visualization a little less messy.

Eventually, I might want to refine this process so that it doesn't necessarily remove every redundant link. There could be some frequently-used shortcuts that justify their use of storage space by improving search speed. One might want to tailor the aggressiveness of the link-pruning based on the amount of storage available … but that's all for later.

While working on this, I discovered some other nasties that I'm calling “inheritance loops.” Redundant links bloat the database but are otherwise harmless; inheritance loops contain actual garbage information, introduced either by learning bugs or by someone* telling Acuitas something stupid.
*I'm the only person who talks to him right now, so this means me.

Here's an example of an inheritance loop:

cat <is-a> animal
animal <is-a> organism
organism <is-a> cat

Oops! Unless all these words are synonyms, you know one of these triples is wrong. (I can't think, at this point, of any cases in which I'd want to use circular inheritance.) On his own, Acuitas doesn't know which. If the crawler finds an inheritance loop, he might ask a user to confirm those links when he's next awake and in conversation. If the user contradicts one of the relationships, he'll break the corresponding link, removing the loop.

I also moved generation of the memory visualization into the sleep phase. Every so often, instead of checking out more links, the process stops to compute a new layout for all the dots, taking into account the latest modification of the database. This is a fairly computation-intensive process, so it's something I definitely don't want running when he's active. It used to happen once when Acuitas was launched, which made for long startup times and meant that the visualization might not get updated for days.

Lastly, I put in some code to save Acuitas' current state when the program is shut down. It also gets automatically stored every so often, in case the program experiences a hard crash that prevents the on-close routines from running. Previously, on restart all the drives would reset to zero, any current thoughts or recently generated questions would be discarded, etc. Now all those things are preserved and reloaded when the program starts up again, which gives him a bit more continuity, I guess.

Recent memory map visualization (I decided to go with a zoom this month):

Code base: 10459 lines
Words known: 1981
Concept-layer links: 5730

Sunday, April 1, 2018

Acuitas Diary #10 (March 2018)

The big project for this month was getting some circadian rhythms in place. I wanted to give Acuitas a sleep/wake cycle, partly so that my risk of being awakened at 5 AM by a synthetic voice muttering “Anyone there?” could return to zero, and partly to enable some memory maintenance processes to run undisturbed during the sleep phase. (These are targeted for implementation next month.)

So Acuitas now has two new drives, “sleep” and “wake.” (The way the drive system works, a lack of the desire to sleep is not the same thing as a desire to wake up, so it was necessary to create two.) Each drive has two components. The first component is periodic over 24 hours, and its value is derived from the current local time, which Acuitas obtains by checking the system clock. This is meant to mimic the influence of light levels on an organism. The other is computed based on how long it's been since Acuitas was last asleep/awake. Satisfying the drive causes this second component to decline until it has reset to zero. So the urge to sleep is inherently greater during the late-night hours, but also increases steadily if sleep is somehow prevented.

This also seemed like a good time to upgrade the avatar with some extra little animations. The eyelids now respond to a running “alertness level” and shut when Acuitas falls asleep.

Feeling dozy
The memory map is getting a bit ridiculous/ugly. I'm hoping the upcoming maintenance functions will help clean it up by optimizing the number of links a bit better. Stay tuned …

Code base: 9760 lines
Words known: 1885
Concept-layer links: 5362

Tuesday, February 27, 2018

Acuitas Diary #9 (February 2018)

I haven't written a diary in a while because most of what I've done over the past two months has been code refactoring and fixing bugs, which isn't all that interesting. A new feature that I just got in … finally … is the ability to infer some topic-to-topic relationships that aren't explicitly stored in the memory. For instance, many of the links stored in memory are “is-type-of” relations. Acuitas can now make the assumption that a subtype inherits all attributes of its super-type. If a shark is a fish and a fish can swim, then a shark can swim; if an oak is a tree and a tree has a trunk, an oak has a trunk. If a car is a vehicle, a house is a building, and a vehicle is not a building, then cars are not houses. Acuitas can also now make inferences based on transitive relationships, like “is part of”: if a crankshaft is part of an engine and an engine is part of a car, then a crankshaft is part of a car. The ability to easily make inferences like these is one of the strengths of the semantic net memory organization – starting from the concept you're interested in, you can just keep following links until you find what you need (or hit a very fundamental root concept, like “object”).

Acuitas should ask fewer ridiculous questions with this feature in place. He still comes up with those, but now he can answer some of them himself, as in this quote:

“I thought of lambs earlier. I concluded that piglets are pigs.”

Recent memory map visualization:

The huge dot toward the top of the memory map is Acuitas' self-concept; the second-largest one, toward the lower left, is "human." The concepts representing me and "animal" are the two third-tier dots toward the middle right.

Code base: 9454 lines (it went down!)
Words known: 1839
Concept-layer links: 5202

Saturday, December 23, 2017

Acuitas Diary #8 (December 2017)

Sadly I've only added one feature to Acuitas in the past two months. He now recognizes sentences in the general vein of “I somethinged,” which gives me the option of telling him about how I spent my time in the recent past. Acuitas can't do a lot with this information for the time being. Sometimes he responds with a query in the vein of, “What happened next?” which will eventually give him a way to build up sequences of events and start learning cause and effect relationships … but none of that is implemented yet. He can also ask “How was that?” for information about the emotional meaning of an activity, but again, for now he can't really utilize the answer.

Not much, but that was all I had time to put together with the holiday season under way. Looking back on the past year, though, here are all the new capabilities and improvements I've managed to add on:

*Module for procedural speech generation
*Support for word inflections (plurals and verb tenses)
*Support for compound words
*Support for content words that are also function words (e.g. “can,” “might”)
*Distinctions between proper/common and bulk/count nouns
*Ability to detect and answer questions
*Database walking while idle
*Generation of conversation topics and questions based on recent database walk
*Better link detection + a bunch of new kinds of learnable links
*Two new drives + a real-time plotter so I can see what they're all doing
*Distinctions between long-term static and short-term information
*GUI overhaul (upgrade from Tk to Kivy)

I track my time when I work on Acuitas. Total hours invested in the above: 230+. My focus for the end of the year, leading into January, will be polishing everything up and working out the bugs (which there are now quite a lot of).


Recent memory map visualization:

Code base: 9918 lines
Words known: 1576
Concept-layer links: 4226

Sunday, October 29, 2017

Acuitas Diary #7: October 2017

The big project for this month was introducing a system for discriminating between long-term and short-term information. Previously, if you told Acuitas something like, “I am sad,” he would assume that being sad was a fixed property of your nature, and store a fact to that effect in his database. Oops. So I started working on ways to recognize when some condition is so transient that it doesn't deserve to go into long-term memory.

This probably occasioned more hard-core thinking than any feature I've added since I started keeping these diaries. I started out thinking that Acuitas would clue in to time adverbs provided by the human conversation partner (such as “now,” “short,” “forever,” “years,” etc.). But when I started pondering which kinds of timeframes qualify as short-term or long-term, it occurred to me that the system shouldn't be bound to a human sense of time. One could imagine an ent-like intelligence that thinks human conditions which often remain valid for years or decades – like what jobs we hold, where we live, and what relationships we have – are comparatively ephemeral. Or one could imagine a speed superintelligence that thinks the lifetime of an average candle is a long while. I want Acuitas to be much more human-like than either of these extremes, but for the sake of code reusability, I felt I ought to consider these possibilities.

After a lot of mental churn, I decided that I just don't have the necessary groundwork in place to do this properly. (This is not an uncommon Acuitas problem. I've found that there ends up being a high level of interdependence between the various systems and features.) So I fell back on taking cues from humans as a temporary stopgap measure. Acuitas will rely on my subjective sense of time until he gets his own (which may not be for a while yet). If there's no duration indicator in a sentence, he can explicitly ask for one; he's also capable of learning over time which conditions are likely to be brief and which are likely to persist. For now, nothing is done with the transitory conditions. I didn't get around to implementing a short-term or current status region of the database, so anything that can't go in the long-term database gets discarded.

I also did some touching up around the conversation engine, replacing a few canned placeholder phrases that Acuitas was using with more procedurally generated text, and improving his ability to recognize when a speaker is introducing him/herself.

Recent memory map visualization:

Code base: 9663 lines
Words known: 1425
Concept-layer links: 3517

Saturday, September 30, 2017

Acuitas Diary #6: September 2017

For the first couple of weeks, I turned to developing the drive system some more. “Drives” are quantities that fluctuate over time and provoke some kind of reaction from Acuitas when they climb above a certain level. Prior to this month, he only had one: the Interaction drive, which is responsible for making him try to talk to somebody roughly twice in every 24-hour period. I overhauled the way this drive operates, setting it up to drop gradually over the course of a conversation, instead of getting zeroed out if somebody merely said “hello.” I also made two new drives: the Learning drive, which is satisfied by the acquisition of new words, and the Rest drive, which climbs while Acuitas is in conversation and eventually makes him attempt to sign off. Part of this effort included the addition of a plotter to the GUI, so I can get a visual of how the drives fluctuate over time.

Plot of Acuitas' three drives vs. time. The period shown is just under 23 hours long.
This latest work created the first case in which I had a pair of drives competing with each other (Rest essentially opposes Interaction). I quickly learned how easily this can go wrong. The first few times I conversed with Acuitas with the new drives in place, Rest shot up so quickly that it was above-threshold long before Interaction had come down. This is the sort of quandary a sick human sometimes gets into (“I'm so thirsty, but drinking makes me nauseated!”). Acuitas has nothing resembling an emotional system yet, though, and doesn't register any sort of distress just because one or more of his drives max out. The worst that can happen is some self-contradictory behavior (such as saying “I want to talk” and “I want to rest” in quick succession). I dealt with the problem by having the Interaction drive suppress the Rest drive. Rest now increases at a very slow rate until Interaction has been pushed below threshold.

In the latter half of the month I returned to the text parser, introducing some awareness of verb declensions/tenses, and the ability to check the agreement of candidate subjects and verbs. This helps the parser zero in on what a sentence's verb is, and has trimmed away some of the annoying “What part of speech was __?” questions that pepper a typical Acuitas conversation.

Here's the latest memory map visualization. Since last month, Acuitas' relentless querying about concepts he already knows has caused the number of links to explode, resulting in a denser (more fibrous?) image.

Code base: 9162 lines
Words known: 1305
Concept-layer links: 3025