Sunday, April 30, 2017

Acuitas Diary #1: April 2017

I've been continuing to make improvements to Acuitas' text parser, adding support for interrogative forms and negative statements. He's now capable of learning that something does not belong to some category or have some quality – rather important, really! And now that question comprehension is in place, I can not only put more information into the database, I can call it back out. Responses are still very formulaic, because text comprehension has been receiving far more of my development effort than text generation. Ask him a lot of yes-or-no questions in a row, and he starts to sound like Bit from Tron (though he does have one up on Bit – he's got the ability to answer “I don't know”).

That furnishes a pretty sensible explanation for Bit, come to think of it. Somebody wrote a program with a fully capable speech parser and a really, really primitive speech generator.

I also threw in some rough support for contractions. Previously the sentence tokenizer would have treated the word isn't as three separate “words,” [isn, ', t], which would have made no sense. I fixed that. Contractions now get pre-processed into whatever their constituent words are (isn't = [is, not]) before the sentence goes for parsing. Only one possible combination is picked, however. Resolving contractions that can be ambiguous (such as “they'd,” which could mean “they had” or “they would”) is something I'm leaving for later. Getting verb conjugation detection put in before I do that will be a big help.

I reserved the last week and a half of the month for a code cleanup and refactoring spree, trying to make sure the text parser and meaning extraction areas are as neat and bug-free as possible before I leave them for a while to work on other things. I've been buried in the text parser for so long now that I wonder if I quite remember what all of Acuitas' other bits and pieces do.

Code base: 5633 lines
Words known: 797

Concept-layer links: 1274

Memory visualization as of 04/29/2017

1 comment: