Wednesday, October 30, 2019

Acuitas Diary #21 (October 2019)

I set some smaller Acuitas goals this month so I would have a little more time to fix bugs and clean up my own mess.  The first goal was to enable identification of people from first names only, while allowing for the possibility that multiple people have the same first name.

Using someone's full name with Acuitas establishes a link between the first name (as an isolated word) and the full name (with its connected person-concept).  If the first name is later used in isolation, Acuitas will infer the full name from it.  If multiple full names containing that first name are known, Acuitas will ask the user which one is meant.  (He does not yet have the ability to guess from context which person is most likely implied.  Sense disambiguation, not just for names but for any word with more than one meaning, is a big topic that stays on the to-do list for later down the road …)
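
To make the lookup rule concrete, here is a minimal Python sketch of the behavior described above.  It is not Acuitas's actual code: the `NameIndex` class, its method names, and the status strings are all hypothetical, and it assumes the first name is simply the first whitespace-separated token of the full name.

```python
from collections import defaultdict


class NameIndex:
    """Hypothetical illustration: maps first names to known full names."""

    def __init__(self):
        # first name (lowercased) -> set of full names that contain it
        self._by_first = defaultdict(set)

    def learn_full_name(self, full_name):
        """Record a full name and link its first name to it."""
        parts = full_name.split()
        if len(parts) < 2:
            return  # not a full name; nothing to link
        self._by_first[parts[0].lower()].add(full_name)

    def resolve(self, first_name):
        """Return (status, candidates) for a first name used in isolation."""
        candidates = sorted(self._by_first.get(first_name.lower(), ()))
        if not candidates:
            return ("unknown", [])
        if len(candidates) == 1:
            return ("resolved", candidates)  # infer the one full name
        return ("ask_user", candidates)      # several matches: ask which


index = NameIndex()
index.learn_full_name("John Smith")
print(index.resolve("John"))
index.learn_full_name("John Doe")
print(index.resolve("John"))
```

Once a second "John" is known, the sketch stops inferring and returns every candidate, so the caller can ask the user which one is meant.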

The second thing I worked on was text parser support for a new grammar feature, with a focus on expanding the range of possible “I want” sentences Acuitas can understand.  The ability to parse infinitives, as in “I want to live,” was already present.  This month I worked on infinitives with subjects, as in “I want John to live.”

This is a tricky business.  To see why, consider the following sentences:

1. I want Bob to eat.
2. I want a fruit to eat.
3. I want food to live.

They all follow the exact same pattern, yet have completely different meanings.  The first sentence expresses my desire that Bob do something; the second sentence is about what I want to do something to; and the third sentence is about why I want something.  Notice that in the second sentence, you could move some words and get “I want to eat a fruit” without changing the implications too much.  Doing this to the third sentence would be bizarre (“I want to live food”) and doing it to the first sentence would be horrifying (“I want to eat Bob”).  In keeping with their varied meanings, they're all grammatically different, as you can see from the diagrams in the image.  So, how does one tell them apart?  I'm not worried about distinguishing sentences 2 and 3 for now; I just want to separate sentence 1 from both of them.

The first sentence is the only one in which the noun (Bob/fruit/food) is the subject of the infinitive.  The key factor is who will be doing the action expressed by the infinitive.  In the first sentence, Bob is the one who will be eating if I get my way; in the latter two sentences, I'm the one eating and I'm the one living.  And that information is not actually in the sentence – it's in your background knowledge.  To properly understand these sentences, it's helpful to be aware of things like …

* I am a human
* Humans can eat fruit
* Bob is probably also a human
* I am probably not a cannibal, and therefore don't want to eat Bob
* Food can be used to sustain a human's life
* Once something is food, it's not living (or won't be living much longer)
* Living isn't an action you can perform on food

So here is where we bring to bear the full power of the system by having the Text Parser query the semantic memory for already-known facts about these words.  Acuitas can't quite store all of the facts listed above, but he does know that “humans can eat” and “fruits can be eaten.”  He might also know that the speaker and “Bob” are humans.  At this early, sketchy phase, that's enough for the parser to start discriminating.

Some sentences of this type are just plain ambiguous, especially when taken in isolation.  Take “I want a plant to grow,” for example.  Plants can grow (on their own), but they can also be grown (by a cultivator, who might be the speaker).  Upon detecting an ambiguity like this, Acuitas will, as usual, ask the speaker about it.  The same fallback applies when the information in the database is not yet extensive enough to decide.
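
For the curious, the decision can be sketched in a few lines of Python.  This is only a toy under stated assumptions, not the real parser or memory format: the fact tables `IS_A`, `CAN_DO` and `CAN_BE_DONE_TO`, the function names, and the return strings are all made up for illustration.

```python
# Hypothetical stand-in for the semantic memory: a few hand-entered facts.
IS_A = {"Bob": "human", "speaker": "human"}
CAN_DO = {("human", "eat"), ("plant", "grow")}          # "humans can eat"
CAN_BE_DONE_TO = {("fruit", "eat"), ("plant", "grow")}  # "fruits can be eaten"


def category(word):
    """Look up what kind of thing a word names; default to the word itself."""
    return IS_A.get(word, word)


def classify(noun, verb):
    """Decide whether `noun` is the subject of the infinitive `verb`
    in a sentence shaped like "I want <noun> to <verb>"."""
    kind = category(noun)
    can_act = (kind, verb) in CAN_DO
    can_receive = (kind, verb) in CAN_BE_DONE_TO
    if can_act and not can_receive:
        return "noun_is_subject"      # e.g. "I want Bob to eat."
    if can_receive and not can_act:
        return "noun_is_not_subject"  # e.g. "I want a fruit to eat."
    # Both readings fit, or the memory knows too little: ask the speaker.
    return "ask_user"


for noun, verb in [("Bob", "eat"), ("fruit", "eat"),
                   ("plant", "grow"), ("food", "live")]:
    print(noun, verb, classify(noun, verb))
```

With only these facts, “I want food to live” also comes back as a question, because the toy memory knows nothing about food and living; adding more facts would let it settle such cases without asking.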

Until the next cycle,