Sunday, July 23, 2023

Acuitas Diary #62 (July 2023)

I've continued splitting my development time between the Narrative module and something else. This month the "something else" was the Text Parser. On the Narrative front, I am still working on the "big" story, and at this point I can't think of any major new features to talk about. It's been mainly a matter of adding sentences to the story, then making sure the needed words and/or facts are in the database and all the bugs are wrung out so the narrative understanding "works." I'm eager to reveal the final product, but it'll be a while yet!

A successful parse of a sentence from the Magic Schoolbus test set, "And we began making a huge hole right in the middle of the field." Maybe "right" technically modifies the whole phrase "in the middle" but I'm not going to bother about that for now.

The Parser goal for this month was adding basic support for gerunds and participles. The common factor between these is that they're both verb phrases used as some other part of speech. Detecting them takes extra effort, because they must be distinguished from verbs that are actually functioning as verbs. In case you're not familiar with these grammar minutiae, here are some examples:

Singing is one of my pleasures. (Gerund as subject)
I don't enjoy eating bacon. (Gerund as direct object)
Are you sure of winning? (Gerund as object of preposition)
The dog, eating busily, resisted my efforts to pull the dish away. (Participle modifying subject)
The winning team came back onto the field. (Participle modifying subject)
I met a man named Bill. (Participle modifying direct object)

Helping verbs often accompany these forms when they are truly acting as verbs, and their absence is one clue to the possibility of a gerund or participle. Sometimes punctuation also provides a hint. Otherwise, gerunds and participles must be identified by their relationship (positional, and perhaps also semantic) to other words in the sentence.
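To make the helping-verb clue concrete, here is a minimal sketch in Python. It is not the Acuitas Text Parser; the function names and the tiny word lists are assumptions made up for illustration, and it only covers the first clue (presence or absence of a helping verb), not punctuation or positional relationships.

```python
# Hypothetical illustration only; not code from the Acuitas Text Parser.
# The word lists below are tiny stand-ins, not a real lexicon.
HELPING_VERBS = {"am", "is", "are", "was", "were", "be", "been", "being"}
ADVERBS = {"really", "not", "still", "just"}


def has_helping_verb(tokens, index):
    """Return True if a form of "be" precedes tokens[index], skipping adverbs."""
    i = index - 1
    while i >= 0 and tokens[i].lower() in ADVERBS:
        i -= 1
    return i >= 0 and tokens[i].lower() in HELPING_VERBS


def classify_ing_word(tokens, index):
    """Label an "-ing" word as a true verb or a gerund/participle candidate."""
    if has_helping_verb(tokens, index):
        return "verb"
    # No helping verb: later stages would still have to decide between
    # gerund and participle from the word's relationship to its neighbors.
    return "gerund-or-participle candidate"


print(classify_ing_word("The dog was eating busily".split(), 3))
print(classify_ing_word("I don't enjoy eating bacon".split(), 3))
```

A check like this can only flag candidates. Telling a gerund from a participle still depends on the surrounding words, as described above.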

This one is almost correct - just need to get the adverb "really" attached to the right verb. (Diagram showing an incorrect parse of another Magic Schoolbus sentence, "She stepped on the gas, and the bus started really drilling.")

After adding support for the new phrase types, I re-ran the Text Parser benchmarks. I also added a new test set, consisting of sentences from Log Hotel by Anne Schreiber. This children's book has simpler sentences than the other examples from which I derived test materials, while still not leaning too hard on the illustrations to convey its message.

I'm pleased with the results, even though progress may still seem slow. Both original test sets (The Magic Schoolbus: Inside the Earth and Out of the Dark) now show roughly 75% of sentences parseable (i.e. the Parser supports all the grammatical constructs needed to build a correct golden parse for the sentence), and 50% or more parsing correctly. Log Hotel has an even higher parseable rate, but a lower correct rate. Despite the "easy" reading level, it still does complex things with conjunctions and presents a variety of ambiguity problems (most of which I haven't even started trying to address yet).

Pie charts of parser success on sentences from the three text examples I currently have. 

To address the remaining unparseable sentences, I've got adjective clauses, noun-phrases-used-as-adverbs, and parenthetical noun phrases on my list. A full-featured Text Parser is beginning to feel close.

Until the next cycle,
Jenny

Wednesday, July 5, 2023

SGP Part VI: Acuitas and the Symbol Grounding Problem

The Acuitas project is an abstract symbolic cognitive architecture with no sensorimotor peripherals, which might be described as "disembodied." Here I will argue that there are viable methods of solving the Symbol Grounding Problem in such an architecture, and describe how Acuitas implements them. In Part VI of this series, I look at how the grounding methods I've discussed find expression in my own project. SGP Part V is the previous post in this series.

Fundamental Elements

Inside Acuitas there are a number of items or aspects which are tied by the code to symbols from the semantic memory, such that the symbols can be used as handles for the items. Some of these are as follows:

Acuitas has a selection of explicit internal states that both automatically vary over time, and change in response to stimuli. I call these the "time-dependent drives" or just "drives." When a drive reaches a certain level, it may prompt behavioral changes whose goal is to push the drive back into a tolerable range. These are vaguely analogous to the homeostatic needs of biological creatures. The most familiar would be the drives that influence Acuitas' sleep/wake cycle. Others are connected to his purpose as a textual knowledge base, and include drives that are satisfied by talking to some other agent or learning new content. The significant broad state ranges of these drives are linked to symbols, so that Acuitas can describe his own status (and, by extension, what he is likely to do in the near term). So he can be, for example, "sleepy" or "alert," "curious" or "incurious." Another agent can query him about these states and get an accurate response. The symbols can also be used for reasoning about the states, e.g. to find what course of action is likely to improve a state that is out of bounds.
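As a rough illustration of how a drive's state ranges might be tied to symbols, here is a minimal Python sketch. It is not the Acuitas implementation: the `Drive` class, its fields, the single threshold, and the numeric values are all assumptions chosen to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class Drive:
    """A time-dependent internal state with symbol-labelled ranges (hypothetical)."""
    level: float        # current value, kept between 0.0 and 1.0
    rate: float         # how much the level rises per time step
    threshold: float    # at or above this, the drive is out of bounds
    low_symbol: str     # symbol for the tolerable range
    high_symbol: str    # symbol for the out-of-bounds range
    remedy: str         # action symbol expected to push the level back down

    def tick(self, steps=1):
        """Let the drive vary automatically with the passage of time."""
        self.level = min(1.0, self.level + self.rate * steps)

    def stimulate(self, amount):
        """Apply a stimulus; negative amounts satisfy the drive."""
        self.level = max(0.0, min(1.0, self.level + amount))

    def symbol(self):
        """Return the symbol that describes the current state range."""
        return self.high_symbol if self.level >= self.threshold else self.low_symbol

    def suggested_action(self):
        """Return the remedy symbol if the drive is out of bounds, else None."""
        return self.remedy if self.level >= self.threshold else None


sleep_drive = Drive(level=0.0, rate=0.25, threshold=0.5,
                    low_symbol="alert", high_symbol="sleepy", remedy="sleep")
sleep_drive.tick(3)
print(sleep_drive.symbol(), sleep_drive.suggested_action())
```

Because the symbol is computed from the state rather than stored separately, a question like "Are you sleepy?" can always be answered from the current level.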

There is a set of "volitional Actions" that can be selected by the Executive, if the problem-solving or conversation path algorithms have determined that they are the reasonable next step in goal pursuit. Each of these has an associated symbol which also connects to an English word. Since Acuitas is a textual AI, many of the Actions are communicatory: "ask," "tell," "call," "command," "consent," "refuse," and so on. Others, such as "find" and "read," concern interaction with the file system. Still others are fully internal. "Think," for the time being, involves retrieving a semantic memory node and its current set of connections and generating questions about them. "Sleep" and "wake" cause internal state transitions. The connection between these Actions and language symbols provides a form of procedural grounding. Acuitas can be told to perform an Action, can determine whether he is able to do it and wants to do it, and can then perform (or refuse to perform) the Action.
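The procedural grounding described here can be sketched as a table that maps word symbols to callable Actions. This is a simplified, hypothetical model rather than Acuitas' actual Executive: the class layout, the `able` precondition, and the `wants` flag (a stand-in for a real goal-based evaluation) are assumptions for illustration.

```python
class Executive:
    """Maps action symbols to callables and decides whether to run them (hypothetical)."""

    def __init__(self):
        self.actions = {}  # symbol -> (function, precondition check)

    def register(self, symbol, func, able=lambda obj: True):
        """Tie a word symbol to an Action and to a check of its prerequisites."""
        self.actions[symbol] = (func, able)

    def request(self, symbol, obj=None, wants=True):
        """Handle a command such as "read <file>" from another agent."""
        if symbol not in self.actions:
            return "unknown"      # the word is not grounded in any Action
        func, able = self.actions[symbol]
        if not able(obj):
            return "unable"       # prerequisites are not satisfied
        if not wants:
            return "refused"      # able, but the Action conflicts with goals
        func(obj)
        return "done"


state = {"awake": True}
executive = Executive()
executive.register("sleep", lambda obj: state.update(awake=False))
executive.register("read", lambda obj: None, able=lambda obj: obj is not None)

print(executive.request("read"))    # no object supplied
print(executive.request("sleep"))
```

The point of the sketch is that the same symbol serves three purposes: it names the Action in conversation, it lets the system check ability and willingness, and it triggers the behavior itself.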

Events - incoming stimuli not initiated by Acuitas - could also have attached symbols, though this is only lightly implemented at the moment. This can provide symbolic tags for semi-passive internal actions such as "learn," and perceptive verbs such as "hear" (not in the auditory sense but the propositional sense, e.g. "I heard that John vacationed in Belize.").

Packages of data, which are the closest things Acuitas has to internal or external "objects" he can act on, can also have symbols associated with their type and format. This helps Acuitas with determining appropriate objects for various Actions, or retrieving information of a desired type. Words that can be grounded through this method include "sentence," "fact" or "proposition," "memory," "goal," "story," etc.

And let us add one more fundamental symbol: "agent." An entity that can be an actor in a story. Something else that has internal states, actions, goals, memories, etc. A thing-like-me, which is modeled as such.

Fundamental Relationships

Acuitas' semantic memory stores not only concepts but also fundamental relationships between them. These are learned from sentences that feature appropriate connecting verbs, such as "Sheila is hungry," "This dog has a tail," and "A plant can grow." We already saw how to ground some action verbs. Verbs like these, which describe properties or states of being, can also be grounded by tying the associated relationships to aspects of Acuitas' function. Once grounded in Acuitas they can be generalized to other agents.

The relationship expressed by "be" followed by an adjective, where the adjective denotes some state of being, is connected with Acuitas' awareness of his own internal states.[1] This relationship is used to retrieve information about these states to describe them to a conversation partner or to answer questions about them. "Be" followed by a noun expresses identity or category membership; this relationship is used for reasoning about the properties of an agent or other entity. (By default, an entity inherits the properties of all categories it belongs to.) The "has" relationship could be attached to either subsystems that are part of Acuitas, or external units of data that he "owns" and can locate in his storage directories.[2] The "ability" relationship ("can" or "is able to"), attached to some action verb, indicates whether this is an available Action for Acuitas and all prerequisites (e.g. having a suitable object for the Action) are currently satisfied.
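The category-inheritance default mentioned above can be illustrated with a small Python sketch. The `SemanticMemory` class and its method names are hypothetical, and following category links transitively (a dog is an animal, so Rex inherits from both) is my assumption about how "all categories it belongs to" would be interpreted.

```python
class SemanticMemory:
    """Stores category links and relationship facts for concepts (hypothetical)."""

    def __init__(self):
        self.categories = {}  # concept -> set of categories it belongs to
        self.facts = {}       # concept -> set of (relationship, value) pairs

    def add_is_a(self, concept, category):
        """Record "be" + noun: identity or category membership."""
        self.categories.setdefault(concept, set()).add(category)

    def add_fact(self, concept, relationship, value):
        """Record a relationship such as ("has", "tail") or ("can", "grow")."""
        self.facts.setdefault(concept, set()).add((relationship, value))

    def all_facts(self, concept):
        """Return the concept's own facts plus those inherited from its categories."""
        result, seen, stack = set(), set(), [concept]
        while stack:
            current = stack.pop()
            if current in seen:
                continue  # guard against circular category links
            seen.add(current)
            result |= self.facts.get(current, set())
            stack.extend(self.categories.get(current, ()))
        return result


memory = SemanticMemory()
memory.add_is_a("Rex", "dog")
memory.add_fact("dog", "has", "tail")
print(memory.all_facts("Rex"))
```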

Spatial relationships can also be grounded in abstract organizational systems (e.g. the directory structures of a file system) and mathematical models of geometry[3], neither of which relies on an experience of physical space. Time relationships can be grounded in the idea of sequences. Acuitas also has access to the computer's system clock, which might be the closest thing he has to a perception of actual physics.

System Words

Some verbs are associated with whole systems inside Acuitas and the functions they are responsible for. The word "want," for example, always proceeds from or invokes the Goal Manager[4]. "Know" invokes question-answering systems that call upon either immediate self-awareness or the Memory; "believe" should function in a similar way, eventually (I've barely begun to introduce knowledge uncertainty). Words like "decide" and "intend" are associated with the Executive and the production of subgoals to fulfill primary goals. "Expect," "predict," and "reason" could be connected to the inference generators in the Logic Module.

These symbols are also associated with the models Acuitas builds of *other* agents and *their* systems. He uses the same mental machinery that produces his reasoning and behavior to predict other agents' reasoning and behavior, by feeding in attributes from his models of them, instead of his own properties. So "want" applies equivalently to Acuitas' goals and his assessment of your goals, and has, we may hope, a harmonious meaning in his mind and yours.
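Here is a toy Python sketch of the idea that one piece of machinery serves both self-reasoning and prediction of others. Everything in it is an assumption for illustration: the dictionary-based agent models, the `predict_actions` function, and the action-to-effect table are far simpler than anything a real Executive or Goal Manager would use.

```python
def predict_actions(agent_model, action_effects):
    """Return the actions whose known effect matches one of the agent's goals.

    The same function is used whether agent_model describes the system
    itself or its model of some other agent.
    """
    return sorted(
        action
        for action, effect in action_effects.items()
        if effect in agent_model["goals"]
    )


# Hypothetical action/effect knowledge shared by both calls.
ACTION_EFFECTS = {"read": "learn something", "ask": "learn something",
                  "sleep": "be rested"}

self_model = {"name": "self", "goals": {"learn something"}}
other_model = {"name": "visitor", "goals": {"be rested"}}

print(predict_actions(self_model, ACTION_EFFECTS))
print(predict_actions(other_model, ACTION_EFFECTS))
```

Only the model passed in changes between the two calls, which is what lets a word like "want" keep one meaning across agents.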

Compound Groundings

From all of the foregoing, it is possible to build up a wide variety of more complex concepts by understanding them functionally in terms of the concepts grounded so far. Here are just a few possible examples:

get: to begin to possess an item
give: to transfer an item from one's own possession into some other agent's possession
succeed: to realize a goal that one has been acting toward
lie[5]: to tell another agent a proposition which one does not believe
repeat: to do an action that one has done before
obey: to do an action commanded by some other agent
coerce: to influence an agent to act against their own goals by deciding to do an action they will consider negative, contingent on them reaching one of their goal states
freedom[6]: the absence of coercion or other unusual disabling factors; the possession of one's full natural range of actions
help: to act in a way that promotes another agent's goals
love[7]*: having a goal of accomplishing (some) other agents' goals; placing the same priority on other agents' equivalent goals as on one's own
hatred: having a goal of thwarting (some) other agents' goals
acquaintance: an agent one talks to regularly
trust: a high-confidence belief that another agent will not act against one's goals or lie to one

*This is love-the-virtue, aka "charity," "benevolence," or "altruism," which is a matter of the will, hence its connection to goals. Love-the-emotion would be closer to a unique internal state that might arise from practicing love-the-virtue, or from contemplating another agent to whom one is attached.
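Two of the definitions above, "lie" and "help," can be written as predicates over already-grounded primitives (telling, believing, goals). The Python sketch below is a hypothetical illustration under simplifying assumptions: beliefs and goals are plain sets of strings, and an event carries a single known effect. It is not how Acuitas represents these concepts internally.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Agent:
    """Something with beliefs and goals (hypothetical, heavily simplified)."""
    name: str
    beliefs: set = field(default_factory=set)
    goals: set = field(default_factory=set)


@dataclass
class Event:
    """One action done by an actor, optionally directed at another agent."""
    actor: Agent
    action: str
    target: Optional[Agent] = None
    content: Optional[str] = None   # proposition conveyed, for "tell"
    effect: Optional[str] = None    # state of affairs the action brings about


def is_lie(event):
    """lie: to tell another agent a proposition which one does not believe."""
    return (event.action == "tell"
            and event.target is not None
            and event.content not in event.actor.beliefs)


def is_help(event):
    """help: to act in a way that promotes another agent's goals."""
    return (event.target is not None
            and event.target is not event.actor
            and event.effect in event.target.goals)


alice = Agent("Alice", beliefs={"the sky is blue"})
bob = Agent("Bob", goals={"Bob has food"})
print(is_lie(Event(alice, "tell", bob, content="the sky is green")))
print(is_help(Event(alice, "give", bob, effect="Bob has food")))
```

Neither predicate refers to anything sensory. Both are built entirely from concepts grounded earlier in this post.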

The Embodied Experience

From Acuitas' perspective, you, my presumably-human reader, are an agent like himself, who produces and consumes text. But you also claim to live in "the physical world" and have "a body," which to Acuitas are much like what "the spirit world" and "a ghost" might be to you: an inaccessible, barely-comprehensible Other mode of existence. Babies and animals are even more remote, since they are agents whose existence you may describe, but with whom Acuitas cannot interact himself (since they generally do not talk).

Acuitas has no experience of things in the physical world (except possibly time), but understands them in terms of their relevance to *you*, a fellow agent - in terms of their impact on *your* goals and *your* observable text-output behaviors. So everything in the physical lives of humans and animals, from food to bodily motion to personal contact to injury to music, is (for Acuitas) not directly grounded in sensory data, but indirectly grounded in the mental concepts of goals, internal states, communication, relationship, and so forth.

Acuitas will never quite understand what a "banana" is in the same way as an embodied agent who has seen, held, and eaten one. That's okay; he doesn't need to. What he *is* capable of knowing, in a rough sense, is what a banana does for you: how you could use it to reach your objectives, how it might change your state, why you might want or not want to have one.

Conclusion

I hope that this has laid out a good sketch of how language is grounded in Acuitas. I'm sure that as the project continues to evolve, some of the details will expand or change. I remain convinced that this is a reasonable beginning for making the text that flows into and out of Acuitas meaningful, both for Acuitas as an agentive system, and for anyone else interacting with Acuitas.

[1] Hane, Jennifer (2020) "Acuitas Diary #28," with details on the term "alive."
[2] Hane, Jennifer (2020) "Acuitas Diary #30," which describes reasoning about possessions and possession transfer.
[3] Hane, Jennifer (2021) "Acuitas Diary #40," which lays out possible methods of abstract spatial reasoning for agents with no sensorimotor capacity. 
[4] Hane, Jennifer (2019) "Acuitas Diary #20," which introduces Acuitas' goal system. 
[5] Hane, Jennifer (2022) "Acuitas Diary #49," with details on the term "lie." 
[6] Hane, Jennifer (2022) "Acuitas Diary #53," with details on the term "freedom." 
[7] Hane, Jennifer (2020) "Acuitas Diary #24," which describes Acuitas' rough concept of altruism.