Master Thesis Project

Title ALIRA: An (Almost) Lazy Information Retrieval Application
Student Ho E.
Supervisor Arno Knobbe
Abstract The ALIRA system is an experimental prototype that borrows techniques from information retrieval and artificial intelligence to retrieve relevant documents and classify user queries. Document retrieval is based on the Okapi BM25 ranking mechanism. Given the result set containing the k best-ranked documents for a specific query, the ALIRA system classifies that query following the paradigm of nearest neighbours, a lazy learning scheme. The primary goal of this research project is to optimise the ranking and classification accuracy of the system with respect to its base performance on two specific datasets. This is achieved by combining several components that optimise the original feature space and the parameter settings of the Okapi BM25 algorithm, and consequently improve its mapping of features. The secondary goal is to design a virtual helpdesk framework that uses the ALIRA system as a backbone and a FAQ model as its document reference collection. This research is part of the M.Sc. Agents & Computational Intelligence course at Utrecht University. Most of the practical work was done at Clockwork B.V., a Dutch company focused on interactive multi-channel e-business solutions. Performance was measured using two real-life datasets donated by Q-Go, a company specialised in online marketing and customer interaction services.
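
The two-stage idea described in the abstract (rank documents with Okapi BM25, then let the k best-ranked documents vote on the class of the query) can be illustrated with a minimal Python sketch. It is not the ALIRA implementation. The function names bm25_scores and classify, the parameter defaults k1 = 1.2 and b = 0.75, the non-negative IDF variant, the majority vote, and the toy FAQ data are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenised document against a tokenised query with Okapi BM25.

    k1 and b are common default values, not the tuned settings of the thesis.
    Assumes docs is a non-empty list of non-empty token lists.
    """
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n  # average document length
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Assumption: the "+1 inside the log" IDF variant, which stays
            # non-negative even for very frequent terms.
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def classify(query, docs, labels, k=3):
    """Label a query by majority vote over its k best-ranked documents.

    Documents with a score of zero share no term with the query and are
    excluded from the vote. Returns None when no document matches.
    """
    scores = bm25_scores(query, docs)
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)[:k]
    votes = Counter(labels[i] for i in ranked if scores[i] > 0)
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical toy FAQ collection: each document is a token list with a class.
faq_docs = [
    ["reset", "password", "account"],
    ["forgot", "password", "login"],
    ["change", "password"],
    ["open", "savings", "account"],
    ["close", "account", "fees"],
]
faq_labels = ["password", "password", "password", "account", "account"]

predicted = classify(["forgot", "password"], faq_docs, faq_labels, k=3)
```

Because the classifier is lazy, nothing is trained in advance: all work happens at query time, and the quality of the vote depends directly on the ranking, which is why tuning the BM25 parameters and the feature space affects both ranking and classification accuracy.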