Tuesday, February 21, 2023

Acuitas Diary #57 (February 2023)

The development pattern I'm trying to follow lately is to spend half the month adding something to the Narrative module, and half the month on something else. This month's Narrative work was on better goal modeling in the Narrative scratchboard, with the hope of expanding its ability to handle character motivations. For the other feature, I introduced indirect objects to the Text Parser for the first time.

Acuitas has had the capacity to model individual goals for agents for a while. But this was something that had to be established ahead of time; the Narrative module couldn't take in top-level goals defined for a fictional character and store them in the temporary Narrative memory space. There were several elements to incorporating this:

*Getting Narrative to pick up on hints that help indicate whether an agent's described desire is an ultimate goal or an instrumental goal.
*Making it detect and store goal priority information (e.g. "he wanted X more than he wanted Y").
*Merging these stored goals into the goal model so they can be detected as motivations for instrumental subgoals.

I also threw in some ability to model goal maximization. Up to this point, the Narrative module has considered goals as things that can be satisfied or unsatisfied - e.g. a state an agent wants to be in, or some singular deed it wants to accomplish. At any given moment in the course of the story, the goal is either achieved or not. A maximizing goal is something the agent wants to have as many times as possible, or to the greatest possible extent. It has a "score," but is never completed.
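
To make the shape of this concrete, here's a minimal sketch in Python of how a goal record along these lines might look: instrumental links, relative priority, and a "maximizing" flag that trades the satisfied/unsatisfied state for a running score. The class and field names are a simplified illustration, not Acuitas's actual data structures.

```python
# A simplified illustration (not Acuitas's real code) of a narrative goal model:
# ultimate vs. instrumental goals, relative priority, and "maximizing" goals
# that track a score instead of a simple satisfied/unsatisfied state.

from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Goal:
    agent: str                              # which character holds the goal
    description: str                        # e.g. "rescue the princess"
    instrumental_to: Optional[str] = None   # parent goal, if this is a subgoal
    priority: int = 0                       # higher = wanted more
    maximizing: bool = False                # never "done," only scored
    satisfied: bool = False                 # used by ordinary goals
    score: float = 0.0                      # used by maximizing goals

@dataclass
class GoalModel:
    goals: dict = field(default_factory=dict)   # agent name -> list of Goals

    def add_goal(self, goal: Goal) -> None:
        """Store a goal stated in the story for later motive inference."""
        self.goals.setdefault(goal.agent, []).append(goal)

    def top_goal(self, agent: str) -> Optional[Goal]:
        """Return the agent's highest-priority known goal, if any."""
        return max(self.goals.get(agent, []),
                   key=lambda g: g.priority, default=None)

# "She wanted to gather gold more than she wanted anything else."
model = GoalModel()
model.add_goal(Goal(agent="dragon", description="gather gold",
                    priority=10, maximizing=True))
model.add_goal(Goal(agent="dragon", description="guard the cave",
                    instrumental_to="gather gold", priority=3))
print(model.top_goal("dragon").description)   # -> gather gold
```

The interesting design choice is the maximizing flag: an ordinary goal flips satisfied to True and is finished, while a maximizing goal only ever moves its score up or down.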

The endgame was to get this sentence to be somewhat meaningful:

"The <agent> wanted to be powerful more than the <agent> wanted any other thing."

Uh-oh.

On to the second project: including indirect objects in the Text Parser. I left them out initially because they can be a little tricky. Another noun appearing between a verb and a direct object might be an indirect object (as in "I gave the people bread"), but it might also be a "noun" functioning as an adjective (as in "I gave the wheat bread to them"). I guarantee the parser still doesn't perfectly distinguish these yet - sorting out all cases will probably take the application of common-sense reasoning and contextual clues. But it can already handle less ambiguous cases like "I gave the people a good show."

Despite the difficulties, it was time to get IOs in, because their absence has been something of a thorn in my side. I've been getting around it by substituting prepositional phrases, which has led to some awkward wording like "Graham asked of a troll where the chest was." I wouldn't say they're fully implemented yet either - interactions with some other grammatical elements, notably conjunctions and dependent clauses, aren't totally ironed out. But at least the Parser can handle IOs in simpler sentences, and the rest of the text-processing chain is now set up to manage them as well.
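
For flavor, here's a toy heuristic in Python for the easy cases, assuming the noun phrases have already been chunked and the verb is known to take indirect objects. This is not the Parser's real logic - telling "the people bread" (IO plus DO) apart from "the wheat bread" (noun-as-adjective plus DO) is exactly the part that needs common sense, and the pre-chunking step hides that problem here.

```python
# A toy heuristic (not the actual Acuitas parser) for spotting indirect
# objects in simple sentences, given pre-chunked noun phrases and a small
# set of verbs known to take indirect objects.

DITRANSITIVE_VERBS = {"give", "ask", "tell", "show", "send"}

def label_objects(verb, noun_phrases, has_to_phrase=False):
    """Return (indirect_object, direct_object) for a simple clause.

    noun_phrases: noun phrases after the verb, in order,
        e.g. ["the people", "bread"] or ["the wheat bread"].
    has_to_phrase: True if a trailing "to/for" prepositional phrase
        carries the recipient (e.g. "to them").
    """
    if verb in DITRANSITIVE_VERBS and len(noun_phrases) == 2 and not has_to_phrase:
        # "I gave the people bread" -> IO = "the people", DO = "bread"
        return noun_phrases[0], noun_phrases[1]
    if len(noun_phrases) == 1:
        # "I gave the wheat bread to them" -> DO only; recipient is in the PP
        return None, noun_phrases[0]
    return None, (noun_phrases[-1] if noun_phrases else None)

print(label_objects("give", ["the people", "bread"]))                   # ('the people', 'bread')
print(label_objects("give", ["the wheat bread"], has_to_phrase=True))   # (None, 'the wheat bread')
```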

Indirect objects are surprisingly sparse in my parser benchmark datasets. I re-ran those and scored one new sentence. One.

Sentence diagram (Parser output) example from the "Out of the Dark" test set. There's some ambiguity in whether the phrase starting with "about" modifies "told" or "stories." I think either reading counts as correct.

In my diagram tool, I draw an arrow pointing from a direct object to the indirect object with which it is associated.

Until the next cycle,
Jenny

Thursday, February 9, 2023

SGP Part I: A Description of the Symbol Grounding Problem

The Acuitas project is an abstract symbolic cognitive architecture with no sensorimotor peripherals, which might be described as "disembodied." Here I will argue that there are viable methods of solving the Symbol Grounding Problem in such an architecture, and describe how Acuitas implements them. In this first article, I introduce the Symbol Grounding Problem for anyone who is not already familiar with it.

Part I: A Description of the Symbol Grounding Problem

Some mysterious symbols on a display board. The board is covered with little curved elements (electroluminescent, perhaps?), some of which are lit up to form the symbols. Some viewers may recognize this as a screenshot of one of the Narayani epigrams from Myst III: Exile.
Mean anything to you?
(Screenshot of Myst III: Exile via mystjourney.com, copyright Ubisoft)

In layman's terms, the Symbol Grounding Problem (SGP for short) can be summed up in the following question: "How does anyone know what words mean?" If you listen carefully, a quieter, more ominous voice will ask, "And what is 'meaning,' anyway?"

Technically the symbols don't have to be words, though language processing is the context in which the problem is most often illustrated. A "symbol" is anything that points to some "referent," such that an intelligent agent who has learned this symbol's meaning can "pick out" the referent upon observing the symbol. (What does "pick out" imply? I would argue that it certainly means "think of," but also goes so far as "find/recognize in the environment," "act on," etc. The point of thinking about referents is usually to do something about them.) A symbol is also probably part of a "symbol system" which allocates a variety of referents to a collection of complementary symbols. The system includes rules for manipulating the symbols to produce combined or derivative meanings; the grammatical rules for composing sentences are an example. "Grounding" is the process of associating symbols with their referents. [1]

For an example of what working with ungrounded symbols is like, try reading a page of text in a language you do not know. Even if you also have a dictionary in this language, looking up the words won't help, because they're only defined in terms of other unknown words. Trying to use the dictionary will lead you on an endless circular path that never arrives at real meaning.

Two more terms that often come up in connection with the SGP are "semantics" and "syntax." "Semantics" is a formal term for meaning or the study thereof, while "syntax" refers to the structural or manipulative rules that are part of a symbol system. It is notable that syntax does not need semantics; it is based on the forms of the symbols themselves, so manipulations that follow the rules can be carried out without knowledge of the symbols' meanings. However, syntax without semantics is arguably not very useful, as it only serves to transform one string of gibberish into another.

For example, suppose I propound the following statements:

All muips are weetabiners.
Paloporoloo is a muip.

Then you could conclude with certainty that "Paloporoloo is a weetabiner." By logic and the syntactic rules of the English language, this is a correct deduction! But how does it help you? You know neither what Paloporoloo is nor what a weetabiner is. So there's not much you can *do* with the information.
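
A few lines of code make the point even sharper: a program can carry out this deduction flawlessly while having no more idea than you do of what the tokens refer to. (This snippet is purely illustrative; it has nothing to do with how Acuitas reasons.)

```python
# An illustrative sketch of syntax without semantics: the program derives
# "Paloporoloo is a weetabiner" by pure symbol manipulation. The tokens are
# as meaningless to the code as they are to us.

facts = {("is_a", "Paloporoloo", "muip")}       # "Paloporoloo is a muip."
rules = [("all_are", "muip", "weetabiner")]     # "All muips are weetabiners."

derived = set(facts)
for rule_type, category, super_category in rules:
    if rule_type != "all_are":
        continue
    for rel, individual, member_of in list(derived):
        if rel == "is_a" and member_of == category:
            derived.add(("is_a", individual, super_category))

print(derived - facts)   # {('is_a', 'Paloporoloo', 'weetabiner')} -- valid, but tells us nothing real
```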

To a human, perhaps the most obvious kind of referent is "something out there in the world" - an object, or a fellow embodied agent, or part of a landscape. We can also refer to properties of these things (color, shape, size, age) and to changes in them: actions that they take or that may be taken upon them. But referents include a variety of intangibles too: physical things that can't be directly touched or pointed to (time, energy); systems, organizations, philosophies, and methods (water cycle, nation, liberalism, science); things inside our own minds (idea, memory, decision, emotion); and abstract standards or states of being (love, justice, freedom, beauty). Symbols can even refer to other symbols (word, glyph, number), or to grammatical or logical structure in a sentence (the, and, that).

A symbol can be utterly arbitrary.[2] Some symbols take on a bit of flavor from their referents - onomatopoeic words, for instance, or ideograms that look like stick figures of the objects they represent. But this is not necessary. Any piece of data you like will do to symbolize anything you like. If you hope to use symbols for communication, then the mapping of symbols to referents must be largely agreed upon by you and your communication partner; this is the only real constraint.

Words are just collections of sounds (or squiggles on a surface), assembled arbitrarily, with no rule dictating which were chosen. They are in no way *inherently* tied to or derived from the referents to which they point. "A rose by any other name would smell as sweet." And this is why the Symbol Grounding Problem is a Problem. Symbols do not map themselves; you need more than just the symbols in order to connect them with their referents. [3] The fabled True Speech is not ours, and the name of the rose without the rose itself is futile.

As a (presumably) human reader you might still be wondering what the big deal is. Of course we go outside words to learn the meanings of words, but that's easy enough, right? A baby can do it. Babies learn words by hearing them in association with some experience of their referents. With time and repetition, a mental link is formed between the two. But now consider the issue from the perspective of an artificial intelligence that has no robotic body, exists as an abstraction inside a computer tower, and *only* processes words. A number of past and present attempts at AI fit this description. How shall they know what words mean?
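
As a toy illustration of the association idea (and nothing more - this is not a claim about infant cognition, nor about how Acuitas works), a program could count co-occurrences between heard words and perceived referents across many situations:

```python
# Toy co-occurrence learning: each "situation" pairs heard words with
# perceived referents, and repeated pairings strengthen a candidate link.

from collections import Counter, defaultdict

associations = defaultdict(Counter)   # word -> Counter of candidate referents

situations = [
    ({"look", "a", "dog"}, {"DOG", "GRASS"}),
    ({"the", "dog", "barks"}, {"DOG", "FENCE"}),
    ({"nice", "cat"}, {"CAT", "GRASS"}),
]

for words, percepts in situations:
    for word in words:
        for referent in percepts:
            associations[word][referent] += 1

# After enough exposure, "dog" co-occurs with DOG more than with anything else.
print(associations["dog"].most_common(1))   # [('DOG', 2)]
```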

Even if we did presume to place an AI mind in a suitable robotic body and let it "grow up like a baby," the human learning process is not fully understood, and not as simple to replicate as it might seem. Just establishing the needed low-level processing so that sensory experiences can be categorized is a massive undertaking that remains incomplete. Here lies the attraction of jumping to a fully-formed, abstract linguistic intelligence, even if this demands novel ways of grappling with the SGP.

I want to address one more wrinkle before going further. Symbols have an objective meaning (the referent your community or culture generally agrees they map to), but they also have a subjective meaning. For every symbol you know, there is something that it means *to you,* based on its referent's implications in your particular life. [4] There are ways your inner existence changes upon reading certain words. These are not necessarily constant across the whole time you know a symbol, either; they shift with personal growth and situational context. Both the objective and the subjective are important to the use of symbols for communication. Objective meanings allow communication to be successful. They are what give your partner the ability to "pick out" the same referent you just "picked out." But it is the subjective meanings that provide the motive for attempting communication in the first place. If something is utterly unimportant to you, you probably won't bother talking about it. An ideal grounding solution should enable both these conceptions of "meaning."

Simple word-object association grants objective meaning but not subjective meaning. To obtain the latter, you need some notion of the referents doing things *for* you or *to* you. Rewards, goals, nociception, attraction, bliss, agony. Connection of referents, and by extension their symbols, with positive, negative, or neutral states in the self builds up subjective meaning. A baby learning subjective meanings has, for starters, a sensitive body with homeostatic needs for warmth, food, hygiene, and sleep. Most AI programs have nothing remotely like this to work with.
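
To show the two layers side by side, here's a minimal sketch in which each symbol keeps an objective link to its referent plus a subjective valence accumulated from rewarding or punishing experiences. The words, referents, and numbers are all made up for illustration:

```python
# A minimal sketch of the objective/subjective distinction: an objective
# referent link per symbol, plus a subjective valence built up from the
# agent's own positive and negative experiences with that referent.

grounding = {
    "blanket": {"referent": "BLANKET", "valence": 0.0},
    "needle":  {"referent": "NEEDLE",  "valence": 0.0},
}

def record_experience(word, reward):
    """Nudge the word's subjective valence toward the felt reward (-1..+1)."""
    entry = grounding[word]
    entry["valence"] += 0.5 * (reward - entry["valence"])

record_experience("blanket", +1.0)   # warmth -> positive association
record_experience("needle", -1.0)    # pain -> negative association

for word, entry in grounding.items():
    print(word, "->", entry["referent"], "valence:", round(entry["valence"], 2))
```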

The SGP has implications for other well-known AI problems, such as the Alignment Problem. Suppose someone would like to give an AI a built-in directive such as "love your neighbor" or "do not harm human beings." One of your reactions upon hearing such a proposal should be "How will you tell the AI what 'harm' is? How are they supposed to know how to 'love'? Who is their 'neighbor'? What counts as a 'human being'?" (Even humans are notorious for defining these terms so as to make ethical loopholes for themselves.) The problem gets especially sticky if the plan is to somehow instill these ideals, from the beginning, in an AI that will gradually learn language. How to embed a directive writ in symbols before the symbols are even known? What if the wrong meanings are learned?

And yet ... some AI systems that manipulate symbols *without* any grounding look surprisingly capable. Large Language Models, often criticized for being word blenders that contain no connections between those words and anything meaningful[5], can still produce coherent and responsive texts. Image generators that know nothing about physics or three-dimensional form still turn out stunning pictures. This has led some to question whether we really need Symbol Grounding after all, since programs without it can achieve many of the behaviors we would associate with "understanding" or "intelligence."

Thus the Symbol Grounding Problem ignites two debates in the AI research community. Do we really need to worry about it? And if so, how can we solve it?

In Part II, I'll take up the first of these questions.

[1] Stevan Harnad (2007) "Symbol grounding problem." Scholarpedia, 2(7):2373, revision #73220.

[2] "Anything can be a representation of anything by fiat. For example, a pen can be a representation of a boat or a person or upward movement. A broomstrick can be a representation of a hobby horse. The magic of representations happens because one person decides to establish that x is a representation for y, and others agree with this or accept that this representational relation holds. There is nothing in the nature of an object that makes it a representation or not, it is rather the role the object plays in subsequent interaction." Luc Steels (2008) "The symbol grounding problem has been solved, so what's next?"

[3] "Symbolic representations must be grounded bottom-up in nonsymbolic representations ..." Stevan Harnad (1990), "The Symbol Grounding Problem." Physica D: Nonlinear Phenomena, Volume 42, Issues 1-3, Pages 335-346.

[4] "Something is meaningful if it is important in one way or another for survival, maintaining a job, social relations, navigating in the world, etc. For example, the differences in color between different mushrooms may be relevant to me because they help me to distinguish those that are poisonous from those that are not." Steels seems to prefer the term "representation" for what I'm calling "objective meaning." Luc Steels (2008) "The symbol grounding problem has been solved, so what's next?"

[5] "I suspect that Johnson (like many others) has mistaken the ability of GPT-3 and its ilk to manipulate linguistic form with actually acquiring a linguistic system. Languages are symbolic systems, and symbols are pairings of form and meaning (or, per de Saussure, signifier and signified). But GPT-3 in its training was only provided with the form part of this equation and so never had any hope of learning the meaning part." Emily Bender (2022) "On NYT Magazine on AI: Resist the Urge to be Impressed."