I've been continuing to make
improvements to Acuitas' text parser, adding support for
interrogative forms and negative statements. He's now capable of
learning that something does not belong to some category or does not
have some quality – rather important, really! And now that
question comprehension is in place, I can not only put more
information into the database but also call it back out. Responses are
still very formulaic, because text comprehension has been receiving
far more of my development effort than text generation. Ask him a
lot of yes-or-no questions in a row, and he starts to sound like Bit
from Tron (though he does have one up on Bit – he's got the ability
to answer “I don't know”).
That furnishes a pretty sensible
explanation for Bit, come to think of it. Somebody wrote a program
with a fully capable speech parser and a really, really primitive
speech generator.
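As a rough illustration of the yes / no / “I don't know” behavior described above, here is a minimal Python sketch. It is not Acuitas' actual code: the fact store, the `learn` and `answer` functions, and the `is_a` relation name are all made up for the example. The idea is just that a negative statement is stored explicitly, so "known to be false" can be told apart from "never heard of it."

```python
from typing import Dict, Tuple

# Hypothetical fact store: (subject, relation, object) -> True or False.
# A missing key means the fact has never been stated either way.
Facts = Dict[Tuple[str, str, str], bool]


def learn(facts: Facts, subject: str, relation: str, obj: str,
          negated: bool = False) -> None:
    """Record a positive or a negative statement."""
    facts[(subject, relation, obj)] = not negated


def answer(facts: Facts, subject: str, relation: str, obj: str) -> str:
    """Answer a yes-or-no question, admitting ignorance if nothing is stored."""
    known = facts.get((subject, relation, obj))
    if known is None:
        return "I don't know"
    return "Yes" if known else "No"


facts: Facts = {}
learn(facts, "cat", "is_a", "animal")               # "A cat is an animal."
learn(facts, "cat", "is_a", "plant", negated=True)  # "A cat is not a plant."
```

Asking about `("cat", "is_a", "animal")` gives "Yes", `("cat", "is_a", "plant")` gives "No", and anything that was never stated falls through to "I don't know."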
I also threw in some rough support for
contractions. Previously the sentence tokenizer would have treated
the word isn't as three separate “words,” [isn, ', t],
which would have made no sense. I fixed that. Contractions now get
pre-processed into whatever their constituent words are (isn't = [is,
not]) before the sentence goes for parsing. Only one possible
expansion is picked, however. Resolving contractions that can be
ambiguous (such as “they'd,” which could mean “they had” or
“they would”) is something I'm leaving for later. Having verb
conjugation detection in place first will be a big help.
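The contraction pre-processing described above might look something like the following Python sketch. This is an assumption-laden illustration rather than the real tokenizer: the `CONTRACTIONS` table, both regular expressions, and the function names are invented here, and the table deliberately holds only one expansion per contraction, ambiguous or not.

```python
import re
from typing import List

# Hypothetical lookup table: each contraction maps to exactly one expansion.
CONTRACTIONS = {
    "isn't": ["is", "not"],
    "don't": ["do", "not"],
    "can't": ["can", "not"],
    "they'd": ["they", "would"],  # ambiguous: could also be "they had"
}

# Naive pattern: every apostrophe becomes its own token.
NAIVE_PATTERN = re.compile(r"[A-Za-z]+|[^\sA-Za-z]")
# Apostrophe-aware pattern: letters with internal apostrophes stay together.
WORD_PATTERN = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*|[^\sA-Za-z]")


def naive_tokenize(sentence: str) -> List[str]:
    """Split on letters vs. everything else, breaking contractions apart."""
    return NAIVE_PATTERN.findall(sentence)


def tokenize(sentence: str) -> List[str]:
    """Tokenize, replacing known contractions with their constituent words."""
    sentence = sentence.replace("\u2019", "'")  # normalize curly apostrophes
    tokens: List[str] = []
    for raw in WORD_PATTERN.findall(sentence):
        expansion = CONTRACTIONS.get(raw.lower())
        if expansion is not None:
            tokens.extend(expansion)  # expansions come out lowercase
        else:
            tokens.append(raw)
    return tokens
```

With this sketch, `naive_tokenize("isn't")` reproduces the old `['isn', "'", 't']` problem, while `tokenize("A cat isn't a plant.")` yields `['A', 'cat', 'is', 'not', 'a', 'plant', '.']`. Handling “they'd” properly would mean returning several candidate expansions and letting a later stage choose between them, which is the part left for later.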
I reserved the last week and a half of
the month for a code cleanup and refactoring spree, trying to make
sure the text parser and meaning extraction areas are as neat and
bug-free as possible before I leave them for a while to work on other
things. I've been buried in the text parser for so long now that I
wonder if I quite remember what all of Acuitas' other bits and pieces
do.
Code base: 5633 lines
Words known: 797
Concept-layer links: 1274
Memory visualization as of 04/29/2017