Wednesday, June 21, 2023

Acuitas Diary #61 (June 2023)

 It's tiny demo day! I've got the "game playing" features whipped into enough shape that I can walk Acuitas through a tiny text adventure of sorts. So without further ado, here's the video:


I start by setting the scene. I can enter multiple sentences and, while each is received as a distinct input, Acuitas will process them all as a group; he waits for a little while to see if I have anything more to say before generating a response.

First I tell him what sort of character he is ("You are a human"). This nameless human is entered as a character in the game's Narrative Scratchboard, but is also specially designated as *his* character. Future references to "you" are assumed to apply to this character. Then I supply a setting: I tell him where his character is, and mention some objects that share the space with him. Finally, I mention a goal-relevant issue: "You are hungry."

Given something that is obviously a problem for a human character, Acuitas will work on solving it. The obvious solution to hunger is to eat some food (a previously known fact in the cause-and-effect database, which the solution search process can retrieve). But there is no "food" in the game - there is only a room, an apple, and a table. Acuitas has to rely on more prior knowledge - that an apple qualifies as food - and choose this specific object as the target of his character's next action. He also has to check the necessary prerequisites for the action "eat," at which point he remembers a few more things:

- To eat something, you must have it in your possession. This generates a new Problem, because Acuitas doesn't currently have the apple.
- Problem-solving on the above indicates that getting something will enable you to have it. This generates a new Subgoal.
- To get something, you must be co-located with it. Acuitas' character is already co-located with the apple, so this is not a problem.

Acuitas will work on the lowest subgoal in this tree; before trying to eat the apple, he will get it. He generates a response to me to express this intention.
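For readers who like to see the shape of such a process, here is a minimal sketch of the kind of solution search and prerequisite checking described above. It is not Acuitas' actual code; the database contents, the names (CAUSE_EFFECT_DB, PREREQUISITES, and so on), and the flat list standing in for the subgoal tree are all illustrative assumptions on my part.

```python
# Minimal sketch of the solution search and subgoal expansion described above.
# Not Acuitas' real code; data structures and names are illustrative only.

# Cause-and-effect knowledge: eating food resolves hunger.
CAUSE_EFFECT_DB = [{"action": "eat", "object_type": "food", "resolves": "hungry"}]

# Category knowledge: an apple qualifies as food.
IS_A = {"apple": {"food", "fruit"}, "table": {"furniture"}}

# Prerequisites for actions, and the actions that enable those prerequisites.
PREREQUISITES = {"eat": ["have"], "get": ["near"]}
ENABLERS = {"have": "get"}

def find_solution(problem_state, objects_in_scene):
    """Pick an (action, object) pair whose known effect resolves the problem."""
    for fact in CAUSE_EFFECT_DB:
        if fact["resolves"] != problem_state:
            continue
        # There is no literal "food" in the scene, so look for an object that IS food.
        for obj in objects_in_scene:
            if fact["object_type"] in IS_A.get(obj, set()):
                return fact["action"], obj
    return None

def expand_subgoals(action, target, world_state):
    """Check prerequisites recursively; return subgoals with the lowest one first."""
    plan = [(action, target)]
    frontier = [(action, target)]
    while frontier:
        act, tgt = frontier.pop()
        for prereq in PREREQUISITES.get(act, []):
            if world_state.get((prereq, tgt)):
                continue  # e.g. the character is already co-located with the apple
            enabler = ENABLERS.get(prereq)
            if enabler:  # a new Problem becomes a new Subgoal
                plan.insert(0, (enabler, tgt))
                frontier.append((enabler, tgt))
    return plan

world = {("near", "apple"): True, ("have", "apple"): False}
goal = find_solution("hungry", ["room", "apple", "table"])  # ('eat', 'apple')
print(expand_subgoals(*goal, world))  # [('get', 'apple'), ('eat', 'apple')]
```

The lowest subgoal comes out first, which is why Acuitas announces "I get the apple" before he tries to eat it.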

Now something else interesting happens. Acuitas can't just automatically send "I get the apple" to the Narrative Scratchboard. He'll *attempt* the action, but that doesn't mean it will necessarily happen; there might be some obstacle to completing it that he isn't currently aware of. So he simply says "I get the apple" to me, and waits to see whether I confirm or deny that his character actually did it. At this point, I don't have to be boring and answer "You get the apple." If I instead tell him that one of the expected results of his desired action has come to pass, he'll take that as positive confirmation that he performed the action.

Once I confirm that he's done it, the action is sent to the Scratchboard, followed by my latest statement. This fulfills one subgoal and solves one problem. Now he'll fall back on his original subgoal of eating the apple, and tell me that he does so. I confirm that he ate it and ... boom, hunger problem disappears.
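The confirm-before-commit step could be sketched like this. Again, this is hypothetical: the predicate representation and the EXPECTED_RESULTS table are my own inventions for illustration, not Acuitas internals.

```python
# Sketch of the confirm-before-commit step: an attempted action is only written
# to the scratchboard once the human's reply matches the action itself or one
# of its expected results. All names here are illustrative.

EXPECTED_RESULTS = {
    ("get", "apple"): [("you", "have", "apple")],
    ("eat", "apple"): [("you", "be", "not hungry")],
}

def confirms(intended_action, reply_fact):
    """Does the human's statement confirm that the attempted action happened?"""
    action, target = intended_action
    if reply_fact == ("you", action, target):  # "You get the apple."
        return True
    return reply_fact in EXPECTED_RESULTS.get(intended_action, [])  # "You have the apple."

scratchboard = []
intended = ("get", "apple")
reply = ("you", "have", "apple")   # my statement, confirming an expected result
if confirms(intended, reply):
    scratchboard.append(intended)  # the action is committed, fulfilling the subgoal
    scratchboard.append(reply)     # followed by my latest statement
print(scratchboard)  # [('get', 'apple'), ('you', 'have', 'apple')]
```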

Since the game-playing code has a Narrative Scratchboard attached, I can generate a Narrative diagram representing what happens in the game, just as I could for one of the stories in which Acuitas is a passive listener. This diagram appears in the latter part of the video.

And that's the story for this month! I've also continued refining and adding to the abilities of the new Text Generator, but it's not ready for integration yet.

Until the next cycle,
Jenny

Thursday, June 8, 2023

SGP Part V: Symbol Grounding for Disembodied Agents

The Acuitas project is an abstract symbolic cognitive architecture with no sensorimotor peripherals, and so might be described as "disembodied." In this series I argue that there are viable methods of solving the Symbol Grounding Problem in such an architecture, and describe how Acuitas implements them. Here in Part V, I consider some possible generalized methods for symbol grounding in systems that are not embodied in the traditional sense.

[Image: "Black and Violet," by Wassily Kandinsky - brightly colored geometric abstractions (triangles, rectangles, and curves) on a tan background.]

Since I've argued that symbol grounding is possible for artificial intelligence programs without bodies, just how might this be accomplished? I already hinted at the techniques in the last article, but let's consider some proposed methods more thoroughly now.

1. Experience-grounded semantics

I suspect Pei Wang coined this term, as so far I've seen it only in his papers. It's the primary method of grounding attempted in his NARS (Non-Axiomatic Reasoning System) project. So I'll start by letting some quotes from him describe what it is:

"In this kind of semantics, both meaning and truth are defined with respect to the experience of the system. Briefly speaking, an experience-grounded semantics first defines the form of experience a system can have, then defines truth value and meaning as functions of given experience." [1]

"As a computerized reasoning system, NARS uses an artificial language, Narsese, to communicate with its environment. The syntax of this language is precisely specified in a formal grammar. Because NARS, in the current version, only interacts with its environment through this language, the “environment” of the system consists of a human user or another computer system. The system accepts declarative knowledge and questions (as sentences of the language) from its environment." [2]

"For an intelligent system likes[sic] NARS (or for adaptive systems in general), ... the concept of “meaning” still makes sense, because the system uses the terms in Narsese in different ways, not because they have different shapes, but because they correspond to different experiences." [3]

"For an actual term in NARS, its meaning is indicated by its available relations with other terms." [4]

NARS (in its original or default form) does not have sensory experiences of a physical environment; rather, its experiences consist only of text inputs, in the form of valid sentences from this "Narsese" language. Based on its past experience of how words appear in association with each other and with other Narsese symbols, such as the inference operator, NARS determines which words to return when asked a question.

This is still a pretty *weak* form of grounding, in my opinion ... if it can really be called grounding at all. NARS is still relying on the relationships between symbols as a form of "meaning," rather than going outside the symbol system to truly connect symbols with their referents. As I discussed in Part II, the graph topology of connections between symbols, taken by itself, doesn't really seem to share much in common with a human, or agentive, idea of "meaning." Individual symbols cease to be interchangeable because they all have different relations, but then, the graph as a whole can be considered arbitrary. Poisoning NARS' experience with misinformation might cause it to regurgitate some of the misinformation as incorrect answers to questions, but would not materially affect its behavior in any other way.

However, I like the general *idea* here, and if we get away from only relating terms to each other, I think it can be extended in interesting directions. What if, rather than just saying "meaning arises from relations between terms, which appear in experience," we allowed terms to be related to types of experience? For example, the reception of text input via various methods could be designated as the referent for words like "hear" or "be told." The arrival of multiple text input units in a distinct cluster could be tied to terms like "speech" or "conversation." Such inputs can produce a cascade of internal experiences such as introduction of new data to a database ("learn") or automatic retrieval of similar past experiences ("remember"). A disembodied intelligent system can also have "experiences" that relate to state change rather than reception of input, and these also can be given names, such as "activation" and "deactivation."
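As a purely hypothetical illustration of what that might look like in code (the event kinds and the word list are my own, not drawn from NARS or Acuitas):

```python
# Grounding words in *types* of internal experience rather than in other words:
# each kind of event the system can undergo is the referent of a term.
from dataclasses import dataclass

@dataclass
class InternalEvent:
    kind: str        # e.g. "text_input", "db_insert", "memory_retrieval"
    payload: object

# The word points at a class of experiences, not at a web of other words.
EXPERIENCE_GROUNDINGS = {
    "hear": "text_input",
    "learn": "db_insert",
    "remember": "memory_retrieval",
    "activate": "activation",
    "deactivate": "deactivation",
}

def word_for(event: InternalEvent) -> str:
    """Name a recent experience using the grounded vocabulary."""
    for word, kind in EXPERIENCE_GROUNDINGS.items():
        if kind == event.kind:
            return word
    return "unknown"

print(word_for(InternalEvent("db_insert", ("apple", "is_a", "food"))))  # "learn"
```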

The above proposal preserves the idea of the system using different symbols on different occasions not because of their arbitrary shapes, but because of their correspondence to different experiences. And it enables one of the things we're trying to get out of Symbol Grounding, namely, the capacity for true communication. A system that has names for its experiences can accurately tell an interrogator what happened to it recently. It can also learn what happened to another and relate this to its own memories of what happened to itself, assuming the other is capable of similar experiences.

The fact that the experiences of a disembodied system are inevitably somewhat alien does not prevent this technique from being a valid form of Symbol Grounding. The AI is using words representationally, to designate stimuli coming from its environment; symbols are being joined with referents. It is merely the case that the referents and the environment are rather *strange* by human standards.

2. Procedure-grounded semantics

"The idea of procedural semantics is that the semantics of natural language sentences can be characterized in a formalism whose meanings are defined by abstract procedures that a computer (or a person) can either execute or reason about. In this theory the meaning of a noun is a procedure for recognizing or generating instances, the meaning of a proposition is a procedure for determining if it is true or false, and the meaning of an action is the ability to do the action or to tell if it has been done ... The procedural semantics approach allows a computer to understand, in a single, uniform way, the meanings of conditions to be tested, questions to be answered, and actions to be carried out." [5]

The above quote from Woods considers grounding language not in things that happen to an AI program, but in things the program can *do*. This idea is equally feasible for application to embodied or disembodied artificial agents, since any reasonable program does *something.* Symbols in the language simply need connections to function calls or other pointers linked to the program's activity. This activity could be internal ("think," "decide," "plan") or external, directly retrieving inputs or creating outputs ("ask," "tell," "take," "give," "find," "read").
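A toy version of this, with invented function names standing in for whatever a real system can actually do, might look like:

```python
# Procedural grounding at its simplest: verbs are bound to procedures the
# program can execute (or reason about). Function names are illustrative.

memory = {}

def tell(listener, text):
    print(f"(to {listener}) {text}")

def learn(fact):
    memory[fact] = True

# A verb's "meaning" is the ability to perform the action or tell if it was done.
PROCEDURAL_GROUNDINGS = {"tell": tell, "learn": learn}

def perform(verb, *args):
    """Execute the procedure a verb is grounded in, if the system has one."""
    procedure = PROCEDURAL_GROUNDINGS.get(verb)
    if procedure is None:
        raise ValueError(f"'{verb}' has no procedural grounding")
    return procedure(*args)

perform("learn", ("apple", "is_a", "food"))
perform("tell", "user", "I just learned something.")
```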

This method of grounding is meaningful for outside observers, since it permits real communication about the program's output (past, present, or future). A system with procedural grounding can explain what it is currently doing, accurately announce what it is about to do or describe what it habitually does, and generalize to speaking of what other agents do in the level of reality where it operates.

I don't really favor trying to reduce all grounding to procedural grounding, as Woods suggests. For instance, thinking of a noun as being *defined* by a procedure for recognizing the associated entity doesn't sit well with me; I would rather go back to Method #1 and ground the noun in the experiences generated by the entity's presence. But I don't know whether this is a practical quibble so much as a different way of conceptualizing things.

Combining experiential grounding and procedural grounding permits communication about both halves of enaction - the agent's shaping of its own experiences through action. Suppose an AI said "I noticed you inputting commands to the word processing program, and I spoke, because I thought that it might lead you to speak back." This could be a fully grounded statement for an AI with both experiential and procedural grounding. Through these two groundings the agent's language can achieve subjective meaning, since they allow discussion of what the agent's goals are (what kind of experiences it is pursuing) and how it will achieve those goals (through action).

3. Structure-grounded semantics

A disembodied AI could also theoretically ground meaning in aspects of itself - modules, subsystems, or properties - and in abstract constructs that it operates upon - data structures, other programs used as tools, and so on. From this we can obtain reasonable groundings for words like "memory," "thought," "fact," "goal," "sentence," "story," "file," "directory," and more.

This is a little different from experience-grounded semantics because the associations between these symbols and their referents aren't necessarily based on experiences of the referents; they can be directly "baked in." For example, any internal package of data in a given format could be accompanied by a pointer to that format's symbolic name. This is not quite the same thing as having an internal experience and then assigning it a name, as a human would; the name in this case is pre-embedded. So this is getting far afield from the way that humans do grounding, since all our symbols are arbitrary and learned. But I don't consider it infeasible.
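A sketch of what "baked in" might mean in practice (the class and field names are mine, not Acuitas internals):

```python
# Structure grounding: every internal data format carries a pointer to the
# symbol that names it, so a word like "fact" or "goal" refers directly to the
# structure itself rather than being a label learned from experience.
from dataclasses import dataclass, field

@dataclass
class Fact:
    subject: str
    relation: str
    obj: str
    symbol: str = field(default="fact", init=False)   # pre-embedded name

@dataclass
class Goal:
    description: str
    symbol: str = field(default="goal", init=False)

def name_of(thing) -> str:
    """Answer "what is this?" without having to learn the label."""
    return getattr(thing, "symbol", "unnamed structure")

print(name_of(Fact("apple", "is_a", "food")))        # "fact"
print(name_of(Goal("resolve the hunger problem")))   # "goal"
```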

Combining all three methods, we now have potential groundings for what an AI experiences, what it does, and what it is or has. For a disembodied AI mind, these all have their roots in a purely mental space consisting of structured information, some of which the AI would regard as part of itself and some of which would be coming from or going to "outside" (the environment), which includes other minds.

Groundings for elements of the physical world, which the AI can never experience or act upon *directly*, must then be derived by relations to the AI's mental-space groundings. For example, though such an AI might never properly understand what "water" is in the same way a human does, it can conceptualize water as something a human needs regularly to achieve a survival goal. This goes a long way toward an essential "understanding" of what a human means when telling a story about attempts to obtain water. The truly important thing for the AI to grasp is not the sensory experience of touching or drinking water (this is specific to an embodied existence, and the disembodied AI has no need for it), but the functional role that water plays in the lives of biological agents. Using the grounded mental terms as metaphors for inaccessible physical concepts is an additional option.
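One hypothetical way to represent such a derived grounding (the relation names are invented for illustration):

```python
# "Water" grounded by its functional role relative to already-grounded terms
# like goals and agents, rather than by any sensory experience of it.
KNOWLEDGE = [
    ("human", "pursues_goal", "survival"),
    ("water", "is_needed_for", "survival"),
    ("human", "obtains", "water"),
]

def functional_role(concept):
    """Collect what a concept does for agents, not how it looks or feels."""
    return [triple for triple in KNOWLEDGE if concept in triple]

print(functional_role("water"))
# With this, a story about a character seeking water reads as goal pursuit.
```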

In the sixth and final installment of this series, I'll finally get down to brass tacks and sketch out some aspects of how I'm doing (and plan to do) grounding in the Acuitas project.

[1] Wang, Pei (2004) "Experience-Grounded Semantics: A theory for intelligent systems," p. 2
[2] Wang, Pei (2004) "Experience-Grounded Semantics: A theory for intelligent systems," p. 3
[3] Wang, Pei (2004) "Experience-Grounded Semantics: A theory for intelligent systems," p. 7
[4] Wang, Pei (2004) "Experience-Grounded Semantics: A theory for intelligent systems," p. 15
[5] Woods, William A. (2007) "Meaning and Links," AI Magazine, Volume 28, No. 4, p. 75