15 April 2012

10,000 lotto checker downloads!

At some point today my lotto checker app will have passed 10,000 downloads, and to mark the event I'm working on an update. There are a number of things I'm going to change, the first of which is that I'm now going to include ads in the app. I'm going to keep these small, and they will only appear on the results display page.

The second update is the ability to search for historic results. This will appear within a tabbed view in the app, and it will work by posting data to the lotto's website and parsing the HTML returned, or possibly by displaying the returned page within some kind of browser view.

For now I'm battling an API version problem with the latest Google AdSense jar. It seems to require version 13 of the Android SDK before it will display test ads.

8 April 2012

QA machine

I've started the QA AI agent project, and most of the bones and infrastructure of the application have been implemented at this point. The high-level functionality of the bot is as follows.

The bot should greet a user with a random greeting and wait for input. It takes the input and tries to extract some kind of meaning from it so that it can perform a lookup for any material that might be relevant. The bot should then create a response using the material most likely to be relevant to the original input and send it back to the user. At each step the bot should look for exit conditions so that it can output a sign-off message.
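The flow described above can be sketched roughly as follows. This is only an illustration, not the bot's actual code: the class name `ChatLoopSketch`, the sample greetings, the exit word list and the placeholder `respond` method are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Random;
import java.util.Scanner;
import java.util.Set;

// Hypothetical sketch of the greet / read / respond / sign-off loop.
class ChatLoopSketch {
    // Assumed sample data, hard-coded here only to keep the sketch self-contained.
    static final List<String> GREETINGS = Arrays.asList("Hello!", "Hi there.", "Good day.");
    static final Set<String> EXIT_WORDS = new HashSet<>(Arrays.asList("bye", "goodbye"));

    // Picks one line at random from the given list.
    static String pickRandom(List<String> lines, Random random) {
        return lines.get(random.nextInt(lines.size()));
    }

    // True if any word in the input is an exit word, ignoring case and punctuation.
    static boolean isExit(String input) {
        for (String word : input.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (EXIT_WORDS.contains(word)) {
                return true;
            }
        }
        return false;
    }

    // Placeholder for the meaning-extraction and lookup step.
    static String respond(String input) {
        return "I don't know about that yet.";
    }

    public static void main(String[] args) {
        Random random = new Random();
        Scanner in = new Scanner(System.in);
        System.out.println(pickRandom(GREETINGS, random));
        while (in.hasNextLine()) {
            String line = in.nextLine();
            if (isExit(line)) {
                // Exit condition found, so sign off and stop.
                System.out.println("Goodbye!");
                break;
            }
            System.out.println(respond(line));
        }
    }
}
```

Checking for the exit condition before anything else on each turn keeps the sign-off independent of whatever the response step ends up doing.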

So far the bot greets users and can identify a number of exit points, such as the user saying "goodbye" or "bye" or something of that nature. The responses are randomly picked from an XML file containing all of the conversation data. The XML looks as follows:

Loading the conversation resources from an XML file is a design feature that I've added so that new lines for the bot can be added without having to rebuild the project. It might then be possible to add attributes to the elements to carry semantic data about when those lines should be used. This could be used to give the bot moods, or even for localisation. I'm sure the XML will change in the future to enrich the data for each entry, but for now the structure is satisfactory. The conversation tag is the parent tag for conversation data, and each child group relates to some part of a conversation, such as a greeting, a goodbye, a follow-up and so on. The data set that the bot searches for relevant information could also take its answers from this file, but that's something I've still to work on.
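As a rough idea of how loading such a file might look with the standard Java DOM parser, here is a minimal sketch. The element names (`greeting`, `goodbye`, `line`), the sample lines and the class name `ConversationLoaderSketch` are assumptions for illustration only and are not the real file's schema; only the `conversation` parent tag with one child group per part of the conversation comes from the description above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical loader: maps each child group of <conversation> to its lines.
class ConversationLoaderSketch {
    // Assumed layout: one child element per group, each holding <line> entries.
    static final String SAMPLE_XML =
        "<conversation>"
      + "<greeting><line>Hello!</line><line>Hi there.</line></greeting>"
      + "<goodbye><line>Bye for now.</line></goodbye>"
      + "</conversation>";

    static Map<String, List<String>> load(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Parser configuration, I/O and syntax errors are all reported the same way here.
            throw new IllegalArgumentException("Could not parse conversation XML", e);
        }
        Map<String, List<String>> groups = new LinkedHashMap<>();
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace and comments between groups
            }
            Element group = (Element) child;
            List<String> lines = new ArrayList<>();
            NodeList entries = group.getElementsByTagName("line");
            for (int j = 0; j < entries.getLength(); j++) {
                lines.add(entries.item(j).getTextContent().trim());
            }
            groups.put(group.getTagName(), lines);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Prints the group names with their lines.
        System.out.println(load(SAMPLE_XML));
    }
}
```

Because the group names are read from the file rather than hard-coded, a new group could be added to the XML without changing the loader, which fits the aim of not rebuilding the project for new lines.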

The only thing missing at the minute is the meaning extraction from the user input. This is quite obviously the part where all of the work is done, but I've yet to look into which natural language processing frameworks or tools might work best for it. Having read a couple of white papers on the subject, I have a better idea of how to approach it, which is where the outline above came from. However, I still have to work out the details of how to tag meaning or classify an answer based on the input, and there's a lot more research to do. For now most of the infrastructure is in place, and building simple chat functionality into the bot is kind of fun too.

6 April 2012

Upcoming projects

I've decided to start a couple of new projects that I'm planning on working on in my spare time. I'm not going to post up a schedule or anything because they're very much in the planning stages and they're going to be quite large projects.

The first project is a natural language processing agent that responds to questions with logically correct answers from a given dataset. There are a number of agents across the web that respond to a large variety of general questions, purely to try to emulate humans and pass a Turing test. The focus of this project will be a more directed search for an answer given a key, where the key is parsed or extrapolated from the question. The project is more concerned with classifying a correct answer or response given these keys than with extracting information from a question and dynamically generating an answer in an attempt to fool whoever submitted it. It will involve the use of machine learning techniques and natural language processing. I may need to look at what is out there already to speed up development when it comes to parsing or extrapolating meaning from questions.
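To make the idea of a key-directed lookup a little more concrete, here is a deliberately naive sketch. It stands in plain keyword overlap for the machine learning classification that the project is actually meant to use, and everything in it (the class name `AnswerLookupSketch`, the stop word list, the sample dataset) is made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: derive keys from a question, then pick the best matching entry.
class AnswerLookupSketch {
    // Assumed stop words; a real list would be much longer.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
        "a", "an", "the", "is", "are", "what", "when", "where", "who", "how", "do", "does", "of", "to"));

    // Made-up sample dataset mapping a topic phrase to an answer.
    static Map<String, String> sampleDataset() {
        Map<String, String> data = new LinkedHashMap<>();
        data.put("opening hours", "We are open from 9 to 5.");
        data.put("delivery cost", "Delivery costs 5 euro.");
        return data;
    }

    // The "keys" are simply the lower-cased words that are not stop words.
    static Set<String> extractKeys(String text) {
        Set<String> keys = new LinkedHashSet<>();
        for (String word : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!word.isEmpty() && !STOP_WORDS.contains(word)) {
                keys.add(word);
            }
        }
        return keys;
    }

    // Returns the answer whose topic shares the most keys with the question, or null if none do.
    static String bestAnswer(String question, Map<String, String> dataset) {
        Set<String> questionKeys = extractKeys(question);
        String best = null;
        int bestScore = 0;
        for (Map.Entry<String, String> entry : dataset.entrySet()) {
            Set<String> shared = extractKeys(entry.getKey());
            shared.retainAll(questionKeys);
            if (shared.size() > bestScore) {
                bestScore = shared.size();
                best = entry.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(bestAnswer("What are your opening hours?", sampleDataset()));
    }
}
```

A trained classifier would replace the overlap count in `bestAnswer`, but the shape stays the same: keys in, one scored choice from the dataset out.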

The second project is an e-commerce project, which I've been working on for about a week or two at this point. It is a platform for integrating with a payment gateway that can be used for storing payer data and scheduling a recurring payment over a period of time. I'm trying to design this as a platform so that additional functionality can be built on top of it and any website can utilise its features through some sort of interface or API.

I'll be picking a project to start with over the weekend, and maybe doing a bit of research into some of the Java natural language processing tools available. I've used NLTK in Python before, which was excellent and easy to use, but I'd like to stick with Java this time as I'm not great with Python. Update coming soon...