8 April 2012

QA machine

I've started the QA AI agent project, and most of the bones and infrastructure for the application have been implemented at this point. The high-level functionality of the bot is as follows.

The bot should greet a user with a random greeting and wait for input. It takes the input and tries to extract some kind of meaning from it so that it can look up any material that might be relevant. The bot should then build a response from the material most likely to be relevant to the original input and send it back to the user. At each step the bot should check for exit conditions so that it can output a sign-off message.
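The flow described above can be sketched roughly as follows. This is only a minimal Python illustration, not the bot's actual code: the names `run_bot`, `is_exit` and `find_material`, the hard-coded greeting and goodbye lists, and the word-matching approach to exit detection are all assumptions made for the example.

```python
import random
import re
from typing import Callable, Iterable, List

# Hypothetical placeholder lines; the real bot reads its lines from an XML file.
GREETINGS = ["Hello!", "Hi there."]
GOODBYES = ["Goodbye!", "See you later."]

# Assumed exit words; only "goodbye" and "bye" are named in the post.
EXIT_WORDS = {"goodbye", "bye"}


def is_exit(text: str) -> bool:
    """Return True if any whole word in the input is a known exit word."""
    words = re.findall(r"[a-z']+", text.lower())
    return any(word in EXIT_WORDS for word in words)


def run_bot(
    inputs: Iterable[str],
    find_material: Callable[[str], str],
    rng: random.Random,
) -> List[str]:
    """Run one conversation and return every line the bot said."""
    said = [rng.choice(GREETINGS)]        # greet with a random greeting
    for text in inputs:                    # wait for the next user input
        if is_exit(text):                  # check for an exit condition
            break
        said.append(find_material(text))   # look up material and respond
    said.append(rng.choice(GOODBYES))     # sign off
    return said
```

Passing the lookup in as `find_material` keeps the loop independent of however the meaning extraction ends up being done; for now any function from a string to a string will do, for example `run_bot(["what is xml?", "Bye!"], lambda t: "echo: " + t, random.Random())`.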

So far the bot greets users and can identify a number of exit points, such as the user saying "goodbye" or "bye" or something of that nature. The responses are picked at random from an XML file containing all of the conversation data. The XML looks as follows:

Loading the conversation resources from an XML file is a deliberate design choice: new lines for the bot to use can be added without having to rebuild the project. It might then be possible to add attributes to the elements that carry semantic data about when those lines should be used. This could be used to give the bot moods, or even for localisation. I'm sure the XML will change in the future to enrich the data for each entry, but for now the structure is satisfactory. The conversation tag is the parent tag for conversation data, and each child group relates to some part of a conversation such as a greeting, a goodbye, a follow up and so on. The data set that the bot searches for relevant information could also take its answers from this file, but that's something I still have to work on.
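To make the idea concrete, here is a small Python sketch of loading such a file and picking a random line, optionally filtered by an attribute. The sample XML is hypothetical: only the `conversation` parent tag comes from the description above, while the `greetings`, `goodbyes` and `line` tag names, the `mood` attribute, and the helper names `load_lines` and `pick` are assumptions for illustration.

```python
import random
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional

# Hypothetical sample; the real file's child tags and attributes may differ.
SAMPLE_XML = """
<conversation>
  <greetings>
    <line mood="cheerful">Hello there!</line>
    <line mood="neutral">Hi.</line>
  </greetings>
  <goodbyes>
    <line mood="neutral">Goodbye.</line>
  </goodbyes>
</conversation>
"""


def load_lines(xml_text: str) -> Dict[str, List[ET.Element]]:
    """Map each child group of <conversation> to its list of line elements."""
    root = ET.fromstring(xml_text.strip())
    return {group.tag: list(group) for group in root}


def pick(
    lines: Dict[str, List[ET.Element]],
    group: str,
    rng: random.Random,
    mood: Optional[str] = None,
) -> str:
    """Pick a random line from a group, optionally restricted to one mood."""
    candidates = [
        element for element in lines[group]
        if mood is None or element.get("mood") == mood
    ]
    return rng.choice(candidates).text
```

In a real run the text would come from a file on disk (for example via `ET.parse(path).getroot()`), so editing the file changes what the bot says without a rebuild. The same attribute filter could just as well select on a language code for localisation.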

The only thing missing at the minute is the meaning extraction from the user input. This is quite obviously the part where all of the real work is done, but I've yet to look into which natural language processing frameworks or tools might work best for it. Having read a couple of white papers so far, I have a better idea of how to approach this, which is where the outline above came from. In terms of the actual work needed to tag meaning or to classify an answer based on the input, though, I still have to work out the details and do a lot more research. For now most of the infrastructure is in place, and building simple chat functionality into the bot is kind of fun too.
