A short summary of existing work on dialogue in QA
Preliminaries for a paper on QA dialogue.
What is QA?
QA (question answering) is traditionally the answering of clear, factual
questions with simple, short answers by extracting the information from an
unstructured database. QA answers questions such as: How many people suffer
from RSI each year? Or: What does the abbreviation "RSI" stand for?
A QA system typically answers such questions by classifying each question into
one of a limited number of classes, such as "Definition question", "Person
question", "Number question", or "Yes/No question". A specific answer-finding
strategy can then be executed for each type of question.
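The classify-then-dispatch approach described above can be sketched in a few lines of Python. This is only an illustration: the cue patterns, the "Unknown" label, and the strategy descriptions are assumptions made for this example, not the rules of any particular QA system.

```python
import re

# Hypothetical cue patterns, checked in order; the first match wins.
QUESTION_CLASSES = [
    ("Number question", re.compile(r"^how (many|much)\b", re.IGNORECASE)),
    ("Person question", re.compile(r"^who\b", re.IGNORECASE)),
    ("Definition question",
     re.compile(r"^what (is|are)\b|\bstand for\b", re.IGNORECASE)),
    ("Yes/No question",
     re.compile(r"^(is|are|do|does|did|can|will)\b", re.IGNORECASE)),
]

# Hypothetical answer-finding strategies, one per class (descriptions only).
STRATEGIES = {
    "Number question": "look for a numeric expression near the keywords",
    "Person question": "look for a person name near the keywords",
    "Definition question": "look for a defining phrase for the term",
    "Yes/No question": "check whether a supporting statement exists",
}
FALLBACK = "fall back to plain passage retrieval"


def classify(question):
    """Return the first class whose cue pattern occurs in the question."""
    text = question.strip()
    for label, pattern in QUESTION_CLASSES:
        if pattern.search(text):
            return label
    return "Unknown"


def choose_strategy(question):
    """Pick the strategy registered for the question's class."""
    return STRATEGIES.get(classify(question), FALLBACK)


if __name__ == "__main__":
    for q in ["How many people suffer from RSI each year?",
              'What does the abbreviation "RSI" stand for?']:
        print(classify(q), "->", choose_strategy(q))
```

A real system would use a richer classifier than a handful of regular expressions, but the structure (a small fixed set of classes, one strategy per class) is the same.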
What is QA in IMIX?
In IMIX, the questions are more difficult than in traditional QA. Many of
the questions require long, explanatory answers. Such questions may even be
considered inherently ambiguous, as the kind and amount of explanation
required depends on the user's information need. Examples (translated from
Dutch) are: What kinds of exercises are there against RSI? Why do people get
RSI? What is the best way to recover from RSI?
Such questions cannot be answered using the question classification strategy
described above. The assumption behind QA in IMIX is that a proper answer to
the question is given by some text fragment found in the database. In other
words, the assumption is that somebody has already recognised that people
might have such a question and has written a piece of text to answer it. The
difference from traditional QA is that the text fragment that answers the
question is much larger: rather than just part of a sentence, it may be
several sentences or a whole paragraph.
All that the IMIX QA has to do is find that fragment. Naturally, there will
not be one perfect text fragment, but rather, a set of text fragments that are
likely to partially satisfy the user's information need.
The strategies currently employed in IMIX are surface-based, since strategies
based on deep analysis are likely AI-complete. Relying on surface techniques
means that the likely accuracy of answers diminishes even further. It is
therefore natural to return a set of answers to each question, which is what
the QA modules do. It also seems natural to present these multiple answers to
the user. This does not happen in IMIX, but would be a natural extension.
Providing a link to the document that each fragment came from is a natural
step as well.
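To make the idea of returning a ranked set of fragments, each with a link to its source document, more concrete, here is a minimal Python sketch. The keyword-overlap score, the `Fragment` type, the `top_k` parameter, and the example fragments are all assumptions made for this illustration; they do not describe the actual IMIX QA modules.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    text: str
    source: str  # identifier or link of the document the fragment came from


def tokenize(text):
    """Lower-case the words and strip surrounding punctuation."""
    return {w.strip('.,?!"').lower() for w in text.split()} - {""}


def rank_fragments(question, fragments, top_k=3):
    """Return up to top_k (score, fragment) pairs, best first.

    The score is the number of words shared with the question (a purely
    surface-based measure); fragments that share no word are skipped.
    """
    q_words = tokenize(question)
    scored = []
    for fragment in fragments:
        overlap = len(q_words & tokenize(fragment.text))
        if overlap > 0:
            scored.append((overlap, fragment))
    # sort() is stable, so ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Made-up fragments and document names, for illustration only.
    corpus = [
        Fragment("Stretching exercises can help against RSI.", "doc-b"),
        Fragment("People get RSI from repetitive movements.", "doc-a"),
    ]
    for score, fragment in rank_fragments("Why do people get RSI?", corpus):
        print(score, fragment.source, fragment.text)
```

Presenting every returned pair, rather than only the top one, lets the user judge which fragment best matches the information need and follow the source link for context.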
Literature on QA and dialogue
Relatively little work has been done on combining QA with dialogue. The most
significant work is listed below.
- BirdQuest: QA with a structured database.
- HITIQA; homepage
  - Narrowing dialogue: ask questions that would allow the system to reduce
    the size of the answer set.
  - Expanding dialogue: ask questions that would allow the system to decide
    if the answer set needs to be expanded by information just outside of it
    (near-misses).
  - Fact seeking dialogue: allow the user to ask questions seeking additional
    facts and specific examples, or similar situations.
- De Boni's work
- Dialogue & Question Answering Discussion Group
- QACIAD (Question Answering Challenge for Information Access Dialogue), Kato
  et al.; Publications
- Overview of QA, including dialogue component (COLLATE; homepage)
- Inui et al., Dialogue management for language-based information seeking.
  Proposes an algorithm for finding an answer using dialogue. The system
  supports follow-up questions and system clarification questions. It follows
  these steps:
  - Analyse the question, obtaining keywords, question type, and answer
    class. Obtain question type and answer class from the dialogue history if
    missing.
  - Add keywords from the dialogue history (both system and user utterances).
  - Remove keywords that have low weights or the same semantic class as the
    answer class; then, for each semantic class, keep only the keyword with
    the highest weight.
  - Retrieve the answer.
  - If exactly one answer is found, provide it.
  - If more than one answer is found, prompt the user to add more
    constraints.
  - If no answer is found, relax the current request until an answer is
    found.
  - If no answer could be found even after relaxation, report the error.
- Article: Using dialogues to access semantic knowledge in a web legal IR
  system
- Deriving disambiguous queries in a spoken interactive ODQA system.
  Proposes an algorithm for asking spoken clarification questions to
  disambiguate the user query using surface analysis techniques.
  - Speech is filtered for spontaneous speech phenomena and recognition
    errors by taking the words from the utterance that maximise a
    summarisation score (that is, how appropriate the words are for a
    summary).
  - An answer is retrieved. If no appropriate answer can be found, the system
    asks a clarification question:
    - First it selects the phrase from the question that is likely to be the
      most ambiguous, by looking at generality and structural dependencies in
      the sentence.
    - Then it asks a question about that phrase, using one of a number of
      canned questions (such as what year or what kind). The question is
      selected by some statistical technique.
- Adaptive question answering with a dialogue interface
- Abductive Dialogue Planning for Concept-Based Multimedia Information
  Retrieval
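
The answer-finding steps listed above for Inui et al. (filter the keywords, retrieve, then answer, ask for constraints, relax, or report an error) can be sketched roughly as follows. This is a sketch under stated assumptions, not the authors' implementation: the `Keyword` type, the function names, the 0.3 weight threshold, the caller-supplied `retrieve` function, and the choice to relax a request by dropping the lowest-weighted keyword are all hypothetical. Question analysis and the merging of dialogue history are assumed to have happened already.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Keyword:
    word: str
    weight: float
    sem_class: str


def filter_keywords(keywords, answer_class, min_weight=0.3):
    """Drop low-weight keywords and those in the answer's semantic class;
    keep only the highest-weighted keyword per remaining semantic class."""
    best = {}
    for kw in keywords:
        if kw.weight < min_weight or kw.sem_class == answer_class:
            continue
        current = best.get(kw.sem_class)
        if current is None or kw.weight > current.weight:
            best[kw.sem_class] = kw
    # Highest weight first, so relaxation can drop from the end.
    return sorted(best.values(), key=lambda kw: kw.weight, reverse=True)


def answer_turn(keywords, answer_class, retrieve, min_weight=0.3):
    """Return a (status, payload) pair for one user turn.

    `retrieve` is a caller-supplied function mapping a list of words to a
    list of candidate answers.
    """
    active = filter_keywords(keywords, answer_class, min_weight)
    while True:
        answers = retrieve([kw.word for kw in active])
        if len(answers) == 1:
            return ("answer", answers[0])
        if len(answers) > 1:
            return ("ask_for_constraints", answers)
        if len(active) <= 1:
            return ("error", None)  # nothing left to relax
        active = active[:-1]  # relax: drop the lowest-weighted keyword
```

The `ask_for_constraints` status is where a dialogue manager would prompt the user; the keywords from the user's reply would then be added to `keywords` for the next call.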