Sunday, June 30, 2024

Acuitas Diary #73 (June 2024)

This was a light month for Acuitas work - which doesn't necessarily mean I spent less time on it, just that I was busy taking care of technical debt. My main objective was to get that shiny new Text Parser revision I wrote last year integrated into the rest of the code. I also converted another of my benchmark test sets to the new Parser format.

There were some small, but significant, alterations to the output format of the Parser, so the greatest part of the work was revising the Text Interpreter to properly handle the new form of input. Nothing else in Acuitas views the output of the Parser directly, so these changes were nicely isolated. I did not have to crawl through the whole system making revisions, as I did during the knowledge representation refactor. It was sufficient to re-harmonize the Parser and Interpreter, and get the Interpreter regression to pass.

I converted and ran tests on the "Out of the Dark" benchmark set. Accuracy is sitting where it was the last time I benchmarked this set, about 50% (and if I spend some more time on Parser bugs, I am almost certain I can bring this up). The important difference is that many more sentences have moved out of the "Unparseable" category. Only 6 out of 116 sentences (about 5%) remain Unparseable, because they include parenthetical noun phrases or other oddities that I might not bother with for a long while. The previous Unparseable portion for this set, from last July, was 27%. Better handling of conjunctions, dependent clauses, and noun phrases used as adverbs enabled most of the improvements.
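For anyone who wants to check the arithmetic, here is a minimal sketch of how per-category shares for a benchmark run could be tallied. The function name and the "parsed" label are made up for illustration (it lumps together every sentence that got some parse, right or wrong); only the counts of 6 and 116 come from the numbers above.

```python
from collections import Counter

def category_shares(labels):
    """Return each category's share of the benchmark as a percentage."""
    counts = Counter(labels)
    total = len(labels)
    return {cat: round(100 * n / total, 1) for cat, n in counts.items()}

# 6 Unparseable sentences out of 116; the remaining 110 received some parse.
labels = ["unparseable"] * 6 + ["parsed"] * 110
shares = category_shares(labels)
print(shares)  # {'unparseable': 5.2, 'parsed': 94.8}
```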

The integration process and the new benchmark set flushed out a number of Parser bugs that hadn't shown up previously. Some of these were growing pains for the new features. For example, multiple sentences failed because the Parser's new facility for collecting groups of capitalized words into proper names was being too aggressive. The Parser can now, at least in theory, recognize "The End of Line Club" as a single unit. However, in a sentence like "But Flynn went to work anyway," it wanted to treat "But Flynn" as a full name. You mean you never heard about Kevin Flynn's *other* first name, But? I cleaned up a lot of that stuff as I was working.
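To make the "But Flynn" problem concrete, here is a rough sketch of one way such a grouping rule could work. This is not the actual Acuitas Parser code; the function name, the word lists, and the rule itself are assumptions for illustration. It merges runs of capitalized words into one unit, lets a few lowercase connectors like "of" sit inside a name, and refuses to start a name on a sentence-initial conjunction, since that word is capitalized only because of its position.

```python
# Hypothetical word lists; a real parser would need far more than this.
CONNECTORS = {"of", "the"}                       # lowercase words allowed inside a name
CONJUNCTIONS = {"but", "and", "or", "so", "yet"}  # capitalized only by position

def is_cap(word):
    return word[:1].isupper()

def group_proper_names(tokens):
    """Merge runs of capitalized tokens into single proper-name units."""
    result = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        sentence_initial_conjunction = (i == 0 and tok.lower() in CONJUNCTIONS)
        if not is_cap(tok) or sentence_initial_conjunction:
            result.append(tok)
            i += 1
            continue
        end = i + 1   # exclusive end of the name confirmed so far
        j = i + 1
        while j < len(tokens):
            if is_cap(tokens[j]):
                j += 1
                end = j          # a capitalized word confirms everything before it
            elif tokens[j] in CONNECTORS:
                j += 1           # tentative: kept only if a capitalized word follows
            else:
                break
        result.append(" ".join(tokens[i:end]))
        i = end
    return result

club = group_proper_names("The End of Line Club".split())
flynn = group_proper_names("But Flynn went to work anyway".split())
print(club)   # ['The End of Line Club']
print(flynn)  # ['But', 'Flynn', 'went', 'to', 'work', 'anyway']
```

A rule this simple still has obvious holes (it would happily treat the pronoun "I" as a name, for one), which is the kind of thing that only shows up once real sentences are run through it.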

I'm still not quite ready to bring the newest Parser and Interpreter into the live code, because I want to test them on the stories and ensure there are no major hiccups. That is (hopefully!) a quick thing that I can do in the background while I keep working on the Conversation Engine.

Until the next cycle,
Jenny
