A short summary of existing work on dialogue in QA

Preliminaries for a paper on QA dialogue.

What is QA?

Question answering (QA) is traditionally the task of answering clear, factual questions with short, simple answers by extracting the information from an unstructured database. QA answers questions such as: How many people suffer from RSI each year? Or: What does the abbreviation "RSI" stand for?

A QA system typically answers such questions by classifying each question into one of a limited number of classes, such as "Definition question", "Person question", "Number question", or "Yes/No question". A specific answer-finding strategy can then be executed for each question type.
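The classify-then-dispatch idea can be illustrated with a minimal sketch. The class names come from the text above; the regular expressions, the strategy descriptions, and the function names classify and choose_strategy are assumptions made for illustration only and are not taken from any existing QA system.

```python
import re

# Hypothetical surface patterns, one per question class named in the text.
# The first matching rule wins, so more specific patterns come first.
RULES = [
    ("Number question", re.compile(r"^how (many|much)\b", re.IGNORECASE)),
    ("Person question", re.compile(r"^who\b", re.IGNORECASE)),
    ("Definition question", re.compile(r"^what (is|are|does)\b", re.IGNORECASE)),
    ("Yes/No question", re.compile(
        r"^(is|are|was|were|do|does|did|can|could|will|would|should|has|have)\b",
        re.IGNORECASE)),
]

# Placeholder descriptions of answer-finding strategies (assumed, not real).
STRATEGIES = {
    "Number question": "look for a numeric expression near the question keywords",
    "Person question": "look for a person name near the question keywords",
    "Definition question": "look for a defining phrase or abbreviation expansion",
    "Yes/No question": "look for a sentence that confirms or denies the statement",
}


def classify(question):
    """Return the first question class whose pattern matches, else 'Unknown'."""
    text = question.strip()
    for label, pattern in RULES:
        if pattern.search(text):
            return label
    return "Unknown"


def choose_strategy(question):
    """Return the question class together with its strategy description."""
    label = classify(question)
    return label, STRATEGIES.get(label, "no class-specific strategy available")


if __name__ == "__main__":
    for q in ["How many people suffer from RSI each year?",
              'What does the abbreviation "RSI" stand for?',
              "Why do people get RSI?"]:
        print(q, "->", choose_strategy(q))
```

In this sketch an explanatory question such as "Why do people get RSI?" matches none of the classes and ends up as "Unknown", which is the kind of question such a scheme does not cover.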

What is QA in IMIX?

In IMIX, the questions are more difficult than in traditional QA. Many of them require long, explanatory answers. Such questions may even be considered inherently ambiguous, as the kind and amount of explanation required depends on the user's information need. Examples (translated from Dutch) are: What kinds of exercises are there against RSI? Why do people get RSI? What is the best way to recover from RSI?

Such questions cannot be answered using the question classification strategies described above. QA in IMIX assumes that a proper answer to such a question is given by some text fragment found in the database. In other words, the assumption is that somebody has already recognised that people might have such a question and has written a piece of text to answer it. The difference from traditional QA is that the text fragment that answers the question is much larger: rather than just part of a sentence, it may be several sentences or a whole paragraph.

All that the IMIX QA has to do is find that fragment. Naturally, there will not be one perfect text fragment, but rather a set of text fragments that are each likely to satisfy the user's information need only partially. The strategies currently employed in IMIX are surface-based; strategies based on deep analysis are likely AI-complete. With surface-based strategies, the likely accuracy of the answers diminishes even further. It is therefore natural to return a set of answers to each question, and this is what the QA modules do. It also seems natural to present these multiple answers to the user. IMIX does not currently do this, but it would be a natural extension, as would providing a link to the document from which each fragment came.
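To make the idea of returning a ranked set of fragments concrete, here is a minimal sketch of one possible surface-based strategy: scoring each fragment by word overlap with the question. This is not the method used by the IMIX QA modules, which the text does not specify. The overlap measure, the function names, and the (document id, text) representation of a fragment are all assumptions for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def overlap_score(question, fragment):
    """Fraction of distinct question words that also occur in the fragment."""
    question_words = set(tokenize(question))
    fragment_words = set(tokenize(fragment))
    if not question_words:
        return 0.0
    return len(question_words & fragment_words) / len(question_words)


def rank_fragments(question, fragments, top_n=3):
    """Return up to top_n (score, document_id, text) tuples, best first.

    fragments is a list of (document_id, text) pairs. Keeping the document
    id with each fragment lets a caller link back to the source document.
    Fragments that share no word with the question are dropped.
    """
    scored = [(overlap_score(question, text), doc_id, text)
              for doc_id, text in fragments]
    scored = [item for item in scored if item[0] > 0]
    # sort() is stable, so fragments with equal scores keep their input order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    # Made-up fragments, used only to exercise the sketch.
    example_fragments = [
        ("doc1", "RSI is caused by repetitive movements and poor posture."),
        ("doc2", "Stretching exercises can help people recover from RSI."),
        ("doc3", "The cafeteria is closed on Sundays."),
    ]
    for score, doc_id, text in rank_fragments("Why do people get RSI?",
                                              example_fragments):
        print(f"{score:.2f} [{doc_id}] {text}")
```

In the made-up example, the fragment that explains the cause of RSI is ranked below one that merely shares more words with the question, which shows why a purely surface-based score gives limited accuracy and why returning several candidates is the safer choice.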

Literature on QA and dialogue

Relatively little work has been done on combining QA with dialogue. The most significant work is listed below.