Thursday, June 1, 2017

Acuitas Diary #2: May 2017

My focus this past month was on giving Acuitas the ability to learn more types of inter-word relationships. That meant doing some work in what I call the “Text Interpreter,” the module downstream from the Text Parser.

The Parser attempts to tag each word in the input with its part of speech and determine its function within the input. Basically, it figures out all the information you'd need to know in order to diagram a sentence. But beyond that there is some more work to be done to actually extract meaning, and the Interpreter handles this. Consider some of the possible ways of expressing the idea that a cat belongs to the category animal:

A cat is an animal.
Cats are animals.
A cat is a type of animal.
One type of animal is a cat.
A cat is among the animals.

By removing the content words and abstracting away some grammatical information, it's possible to generalize these into sentence skeletons that describe the legal ways of saying “X is in category Y” in English:

[A] <subject> <be-verb> [a] <direct object>
[A] <subject> <be-verb> a <subcategory word> of <object-of-preposition>
One <subcategory word> of <object-of-preposition> <be-verb> [a] <direct object>
[A] <subject> <be-verb> among the <object-of-preposition>

I've nicknamed these syntactic structures “forms.” The Interpreter's job is to detect forms and match them to concept-linking relationships. As the previous example should have shown, a single relationship such as class membership can be expressed by multiple forms, each of which has numerous possible variations of word choice, etc.
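To make the idea of form detection concrete, here is a minimal sketch in Python. It is not the actual Acuitas code: it matches raw lowercase words instead of the Parser's part-of-speech output, and every name in it (`FORMS`, `match_form`, `interpret`, the `is_a` label, the tiny word lists) is a hypothetical stand-in. Each form is a list of required literals, optional literals, and slots that capture a content word.

```python
# Hypothetical sketch: none of these names come from the Acuitas code base.
ARTICLES = {"a", "an"}
BE_VERBS = {"is", "are"}
SUBCATEGORY_WORDS = {"type", "kind", "sort"}
FUNCTION_WORDS = ARTICLES | BE_VERBS | {"of", "the", "among", "one"}

# Each form mirrors one sentence skeleton for "X is in category Y".
FORMS = [
    # [A] <subject> <be-verb> [a] <direct object>
    [("opt", ARTICLES), ("slot", "member"), ("lit", BE_VERBS),
     ("opt", ARTICLES), ("slot", "category")],
    # [A] <subject> <be-verb> a <subcategory word> of <object-of-preposition>
    [("opt", ARTICLES), ("slot", "member"), ("lit", BE_VERBS),
     ("lit", ARTICLES), ("lit", SUBCATEGORY_WORDS), ("lit", {"of"}),
     ("slot", "category")],
    # One <subcategory word> of <object-of-preposition> <be-verb> [a] <direct object>
    [("lit", {"one"}), ("lit", SUBCATEGORY_WORDS), ("lit", {"of"}),
     ("slot", "category"), ("lit", BE_VERBS), ("opt", ARTICLES),
     ("slot", "member")],
    # [A] <subject> <be-verb> among the <object-of-preposition>
    [("opt", ARTICLES), ("slot", "member"), ("lit", BE_VERBS),
     ("lit", {"among"}), ("lit", {"the"}), ("slot", "category")],
]


def singular(word):
    """Crude plural stripping; only good enough for the demo sentences."""
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


def match_form(form, words):
    """Return {slot: word} if the words fit the form exactly, else None."""
    def walk(f, w, found):
        if f == len(form):
            # The form only matches if every word was consumed.
            return found if w == len(words) else None
        kind, arg = form[f]
        word = words[w] if w < len(words) else None
        if kind == "slot":
            if word is None or word in FUNCTION_WORDS:
                return None
            return walk(f + 1, w + 1, {**found, arg: singular(word)})
        if word in arg:  # required or optional literal is present
            result = walk(f + 1, w + 1, found)
            if result is not None:
                return result
        if kind == "opt":  # optional literal may also be skipped
            return walk(f + 1, w, found)
        return None

    return walk(0, 0, {})


def interpret(sentence):
    """Map a sentence to a (member, 'is_a', category) link, or None."""
    words = sentence.lower().rstrip(".").split()
    for form in FORMS:
        slots = match_form(form, words)
        if slots is not None:
            return (slots["member"], "is_a", slots["category"])
    return None
```

With these definitions, all five example sentences above reduce to the same link: `interpret("One type of animal is a cat.")` and `interpret("Cats are animals.")` both return `("cat", "is_a", "animal")`. Supporting another way of phrasing the relationship only means appending another entry to `FORMS`.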

Up until now, the only links Acuitas could add to his database were class memberships (<thing> is a <thing>) and qualities (<thing> is <descriptive word>), plus their negations – and he only recognized a single form for each. I overhauled the form detection method, making it more powerful and general, and making new forms easier to add to the code. Then I added more forms and support for a number of new link relationships, including ...

<thing> can do <action>
<thing> is for <action>
<thing> is part of <thing>
<thing> is made of <thing>
<thing> has <thing>

The first two are particularly important, since they mean he can finally start learning some generic verbs.
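As a rough illustration of how typed links like these might be stored, here is a small Python sketch. It is an assumption-laden toy, not Acuitas' actual database: the `ConceptLinks` class, the link-type labels, and the `known_actions` helper are all hypothetical names. Each link records a source concept, a relationship type, a target, and whether it is a negation.

```python
from collections import defaultdict

# Hypothetical labels for the relationships listed above, plus the two
# older ones (class membership and quality).
LINK_TYPES = {"is_a", "has_quality", "can_do", "is_for",
              "part_of", "made_of", "has"}
ACTION_LINKS = {"can_do", "is_for"}  # the two whose targets are verbs


class ConceptLinks:
    """Toy store of (source, link type, target, negated) relationships."""

    def __init__(self):
        self._links = defaultdict(set)

    def add(self, source, link_type, target, negated=False):
        if link_type not in LINK_TYPES:
            raise ValueError(f"unknown link type: {link_type}")
        self._links[source].add((link_type, target, negated))

    def targets(self, source, link_type, negated=False):
        """Sorted targets linked from `source` by `link_type`."""
        return sorted(t for lt, t, n in self._links.get(source, ())
                      if lt == link_type and n == negated)

    def known_actions(self):
        """Every verb seen as the target of an action link, negated or not."""
        return sorted({t for links in self._links.values()
                       for lt, t, _ in links if lt in ACTION_LINKS})


links = ConceptLinks()
links.add("bird", "can_do", "fly")
links.add("penguin", "can_do", "fly", negated=True)
links.add("knife", "is_for", "cut")
links.add("wheel", "part_of", "car")
print(links.known_actions())  # ['cut', 'fly']
```

In this sketch the verbs "fly" and "cut" enter the vocabulary only because action-type links exist, which is the sense in which those two relationships open the door to learning verbs.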

I spent the latter half of the month upgrading Acuitas' GUI library from Tkinter to Kivy. This was a somewhat unwelcome distraction from real development work, but it had to be done. Acuitas is a multi-threaded program, and using multiple threads with Tkinter is ... not straightforward. As the program grew more complex, my hacky method of letting all the threads update the GUI was becoming increasingly unsupportable and causing instability. Of course Kivy does just about everything differently, so porting all of the GUI elements I'd developed was a serious chore, but the new version looks slick and, most importantly, doesn't crash. All the drawn graphics have anti-aliasing now, which makes the memory visualizations look nicer when zoomed out.
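For anyone facing the same threading problem, a common toolkit-independent pattern is to let worker threads post their GUI updates to a queue and have only the GUI thread apply them. The sketch below uses just the Python standard library, and the names `GuiUpdateQueue` and `run_demo` are made up for illustration; it is not the Acuitas code. In a Kivy app, something like `drain` would typically be called from a callback scheduled with Kivy's `Clock`, or the queue could be replaced by the `kivy.clock.mainthread` decorator.

```python
import queue
import threading


class GuiUpdateQueue:
    """Worker threads post callables; only the GUI thread runs them."""

    def __init__(self):
        self._pending = queue.Queue()  # thread-safe FIFO

    def post(self, func, *args):
        """Safe to call from any thread."""
        self._pending.put((func, args))

    def drain(self):
        """Call from the GUI thread only, e.g. on a periodic timer tick."""
        handled = 0
        while True:
            try:
                func, args = self._pending.get_nowait()
            except queue.Empty:
                return handled
            func(*args)
            handled += 1


def run_demo(worker_count=4):
    updates = GuiUpdateQueue()
    shown = []      # stands in for a widget; only the draining thread touches it
    ran_on = set()  # names of the threads that actually applied updates

    def show(value):
        ran_on.add(threading.current_thread().name)
        shown.append(value)

    workers = [threading.Thread(target=updates.post, args=(show, n))
               for n in range(worker_count)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()

    handled = updates.drain()  # the calling thread plays the GUI thread
    return handled, sorted(shown), ran_on
```

Calling `run_demo()` applies all four updates on the calling thread, even though four different threads requested them, so the stand-in widget is never touched concurrently.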

Code base: 6361 lines
Words known: 896
Concept-layer links: 1474
