Multimedia Information Retrieval Reducing information OveRload
Like its real-life equivalent, our miRRor shows you similar objects
Index:
research statement,
introduction,
research topics,
approach,
publications,
other information.
In this PhD project, we study multimedia query processing, and in particular
its implications for database design. We assume a modern extensible database
system such as Illustra or Monet. By extending the database, new
representations of the multimedia data can be used, and advanced search
techniques can be incorporated into the database architecture.
From a user perspective, the main unsolved problem is how to make use of these
different representations and techniques to fulfill an information need. We
propose that a multimedia query processor must provide an iterative query
process using relevance feedback. Also, the query processor must identify
which of the available representations are most promising for answering the
query. In addition, it should combine evidence from different sources.
Recently, we have started to design and implement a prototype database system
that can provide this functionality to the user. In particular, we focus on
information retrieval using Bayesian reasoning over a concept space of
automatically generated clusters.
Information superhighways are hyped by the media. However, these worldwide
computer networks are really nothing more than data
highways. Television and radio channels blast news reports, documentaries, and
talk shows through the air. Thousands of magazines and papers are printed
all over the world. The amount of data around us is so huge that it has become
impossible to deal with in an efficient manner. People call this the problem
of information overload.
This research program focuses on the application of database technology to
multimedia data. Ideally, hundreds of television and radio broadcasts would be
covered by a database application. This database system can notify its clients
whenever interesting data becomes available.
Most information systems that claim to be multimedia databases are
no more than huge collections of data like video, audio, text and images. The
only query facility provided uses manually added textual descriptions of these
multimedia data.
We believe that true multimedia database systems provide
content-based retrieval facilities. Multimedia objects in such a database are
first-class citizens. It should be possible to formulate queries referring to
several types of data simultaneously. For example, a user may be interested in
web pages about inference networks containing a photograph of Bayes.
Full-text retrieval systems have been developed since the early sixties. Many
ideas from this field can be applied to multimedia data relatively easily. Text
retrieval systems were the first information systems to deal with approximate
queries, using similarity between documents to drive the retrieval process.
Unfortunately, full-text retrieval systems are never integrated in database
systems. Therefore, one of the purposes of this project is to integrate
information retrieval techniques into the traditional database environment.
Querying a multimedia database requires new querying strategies. Multimedia
queries are hard to formulate explicitly. For example, try to explain what
music you like. Often, it is far easier to show an example document and
have the system identify similar documents. This querying strategy is better
known as query-by-example. Relevance feedback, in which the user judges the
retrieved documents and the system refines the query accordingly, plays an
important role.
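The combination of query-by-example and relevance feedback can be illustrated with a small sketch. It assumes that every document is represented by a numeric feature vector, ranks documents by cosine similarity to the example, and refines the query with a Rocchio-style update. Rocchio's formula is a standard technique from text retrieval that is swapped in here for illustration; it is not necessarily what the Mirror prototype uses. The document names, feature values, and weights are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query, collection):
    """Return document ids ordered by decreasing similarity to the query."""
    return sorted(collection, key=lambda d: cosine(query, collection[d]), reverse=True)


def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query towards relevant examples and away from non-relevant ones."""
    def centroid(vectors):
        if not vectors:
            return [0.0] * len(query)
        return [sum(col) / len(vectors) for col in zip(*vectors)]

    pos, neg = centroid(relevant), centroid(non_relevant)
    return [alpha * q + beta * p - gamma * n for q, p, n in zip(query, pos, neg)]


# Hypothetical three-dimensional feature vectors for four documents.
collection = {
    "jazz_1": [0.9, 0.1, 0.0],
    "jazz_2": [0.8, 0.2, 0.1],
    "rock_1": [0.1, 0.9, 0.2],
    "talk_1": [0.0, 0.1, 0.9],
}

# Query-by-example: the user shows one document instead of describing a need.
example = collection["jazz_1"]
first = rank(example, collection)

# Relevance feedback: the user marks one result as relevant and one as not.
refined = rocchio(
    example,
    relevant=[collection["jazz_2"]],
    non_relevant=[collection["rock_1"]],
)
second = rank(refined, collection)
```

Each feedback round produces a new query vector, so the loop of ranking and judging can be repeated until the user is satisfied with the results.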
A somewhat related strategy to find interesting documents in a
multimedia database is the navigational querying paradigm. Similar documents
are grouped together and by wandering through the search space you can find
answers to the queries.
One of the research questions to be addressed in this project is whether such
querying paradigms are really beneficial to the users.
This project is still in progress, and several steps have already been taken.
This section attempts to reflect the progress of the research project. However,
the focus has shifted slightly over the last three years. The most recent
information can be found in the summary of what should become my thesis
after another year of writing, research, and development. The core ideas about
multimedia retrieval are still reflected in the other subsections of this
section.
VLDB99 Demo Webpages
We gave a demo of the Mirror DBMS (using an image retrieval application) at VLDB
'99. A tour of this demo is now available (to be included on the ACM
SIGMOD CD-ROM).
Architecture
We chose to integrate IR and database technology to address the
problems introduced above. The research topics concerning the black box
in the middle of the Mirror architecture are the common thread running through
my PhD work. One of the hypotheses that I try to test in my work is that the
techniques from the next subsection can provide the functionality needed to
fill the black box.
Bayesian inference networks
The information retrieval system INQUERY uses Bayesian
inference networks to find documents fulfilling an information need. This
approach seems very suitable to describe retrieval processes in multimedia
databases. However, the INQUERY system is a dedicated system. We investigate
whether the inference process can be expressed as database queries.
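To make the idea of expressing inference as database queries concrete, the following is a minimal sketch using SQLite. It assumes a hypothetical table belief(doc, concept, p) that stores the estimated belief that a document is about a concept, and it combines the evidence for several query concepts by averaging, with a missing concept counting as zero belief. The table layout, the belief values, and the averaging operator are illustrative assumptions; this is not the INQUERY or Mirror implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per (document, concept) pair with a belief value.
conn.execute("CREATE TABLE belief (doc TEXT, concept TEXT, p REAL)")
conn.executemany(
    "INSERT INTO belief VALUES (?, ?, ?)",
    [
        ("page_a", "inference network", 0.8),
        ("page_a", "photo of Bayes", 0.6),
        ("page_b", "inference network", 0.9),
        ("page_c", "photo of Bayes", 0.7),
    ],
)


def rank(conn, concepts):
    """Rank documents by the average belief over the query concepts.

    Concepts that a document has no row for contribute zero belief,
    because the sum is divided by the number of query concepts.
    """
    placeholders = ",".join("?" for _ in concepts)
    sql = f"""
        SELECT doc, SUM(p) / ? AS score
        FROM belief
        WHERE concept IN ({placeholders})
        GROUP BY doc
        ORDER BY score DESC, doc
    """
    return conn.execute(sql, [float(len(concepts)), *concepts]).fetchall()


# Query for web pages about inference networks that contain a photo of Bayes.
ranking = rank(conn, ["inference network", "photo of Bayes"])
for doc, score in ranking:
    print(doc, round(score, 2))
```

Because the combination is an ordinary aggregate query, the database system is free to optimise it like any other query, which is one motivation for moving the inference process into the DBMS.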
Student projects
Several groups of students work on aspects of the MIRROR research topics. We
provide an overview of the student
activities on a separate web page. Projects include research into advanced
indexing techniques, audio retrieval and the design of television for the
future.
-
Arjen P. de Vries, Mark G.L.M. van Doorn, Henk M. Blanken, Peter M.G. Apers,
The Mirror MMDBMS architecture,
technical demo at VLDB '99, Edinburgh, 1999.
-
Arjen P. de Vries, Mirror:
Multimedia Query Processing in Extensible Databases, in 14th
Twente Workshop on Language Technology. Language Technology in Multimedia
Information Retrieval, Enschede, The Netherlands, December 1998.
-
Arjen P. de Vries, Annita N. Wilschut, On the integration of IR and
databases, accepted as short paper, in 8th IFIP 2.6 Working
Conference on Database Semantics (DS-8).
-
Arjen P. de Vries, Henk M. Blanken, Database technology and the management
of multimedia data in Mirror, in Multimedia Storage and Archiving
Systems III, volume 3527 of Proceedings of SPIE, Boston, November
1998.
-
Arjen P. de Vries, Brian Eberman, David E. Kovalcin, The design and implementation of an
infrastructure for multimedia digital libraries,
in Proceedings of the 1998 International Database Engineering & Applications
Symposium, pages 103-110, Cardiff, UK, July 1998.
-
Arjen P. de Vries, Henk M. Blanken, The Relationship between IR and
Multimedia Databases, Accepted for publication at IRSG98.
-
Arjen P. de Vries, Gerrit C. van der Veer, Henk M. Blanken, Let's talk
about it: Dialogues with multimedia databases. Database support for human
activity, Displays, 18(4):215-220, 1998.
-
Arjen P. de Vries, Intelligent
Television: A testbed for multimedia information filtering, CTIT
Technical Report series, No. 97-35.
-
Wolfgang Klas, Arjen de Vries, Christian Breiteneder,
Multimedia databases in perspective, chapter Current and
emerging applications, pages 13-30, Springer Verlag, 1997.
-
Arjen P. de Vries, Television
Information Filtering through Speech Recognition, in
Interactive Distributed Multimedia Systems and Services (IDMS '96),
Berlin, Germany, 1996, pages 59-69.
-
Arjen P. de Vries, Multimedia Information
Access, Master's Thesis, University of Twente, August 1995.
This is a large document! Therefore, I also provide a short overview of this Master's project.
-
My presentation given at the Dagstuhl
Seminar on Multimedia Database Support for Digital Libraries.
-
Peter Apers and Martin Kersten, Content-based retrieval in multimedia
databases based on feature models, invited paper AMCP Conference, Japan,
November 1998.
-
Let us advertise a good textbook on multimedia databases for graduate
students, published by Springer Verlag:
Multimedia Databases in Perspective, edited by Peter Apers, Henk
Blanken and Maurice Houtsma, 1997.
This book originated as an Advanced Course in Boekelo near University of
Twente, June 8th and 9th, 1995, sponsored by the IDOMENEUS Network of
Excellence of the European Commission. Several outstanding researchers
contributed to this course and presented recent work.
-
Gerrit van der Veer, Roel Vertegaal and Arjen de Vries, De beste
interface bestaat niet ("The best interface does not exist"),
Automatiseringsgids, 5 July 1996.
-
My BibTeX references are also available in HTML
format.
Grant award
We are proud to mention that we participate in the Informix Engines
for Innovation Research Grant Program. We investigate the possible advantages
of extensible database technology for our projects and try to identify
requirements for further refinement.
Last updated: $Id: mmdb.html,v 1.26 1999/08/27 20:58:05 arjen Exp $
Maintained by: arjen@cs.utwente.nl