Archive for the 'Uncategorized' Category

How Cyril Cleverdon set the stage for IR research

Thursday, December 11th, 2008, posted by Djoerd Hiemstra

Cyril Cleverdon (9 September 1914 – 4 December 1997) was a British librarian and computer scientist who is best known for his work on the evaluation of information retrieval systems.

Cyril Cleverdon was born in Bristol, England. He worked at the Bristol Libraries from 1932 to 1938, and from 1938 to 1946 he was the librarian of the Engine Division of the Bristol Aeroplane Co. Ltd. In 1946 he was appointed librarian of the College of Aeronautics at Cranfield (later the Cranfield Institute of Technology), where he served until his retirement in 1979, the last two years as professor of Information Transfer Studies.

With the help of NSF funding, Cleverdon started a series of projects in 1957 that lasted for about 10 years, in which he and his colleagues set the stage for information retrieval research. In the Cranfield project, retrieval experiments were conducted on test databases in a controlled, laboratory-like setting. The aim of the research was to improve the retrieval effectiveness of information retrieval systems by developing better indexing languages and methods. The components of the experiments were: 1) a collection of documents, 2) a set of user requests or queries, and 3) a set of relevance judgments, that is, a set of documents judged to be relevant to each query. Together, these components form an information retrieval test collection. The test collection serves as a gold standard for testing retrieval approaches, and the success of each approach is expressed in two measures: precision and recall. Test collections and evaluation measures based on precision and recall are still driving forces behind research on search systems today. Cleverdon’s research approach forms a blueprint for the successful Text Retrieval Conference series that started in 1992.
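To make the two measures concrete: precision is the fraction of retrieved documents that are relevant, and recall is the fraction of relevant documents that are retrieved. The following is a minimal sketch in Python, not code from the Cranfield project; the function name `precision_recall`, the document ids and the toy judgments are all made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: document ids returned by the retrieval approach
    relevant:  document ids judged relevant for the query
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical toy test collection: relevance judgments for one query.
judgments = {"q1": {"d1", "d3", "d4"}}
# Hypothetical output of some retrieval approach for the same query.
run = {"q1": ["d1", "d2", "d3", "d5"]}

p, r = precision_recall(run["q1"], judgments["q1"])
print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy case two of the four retrieved documents are relevant, and two of the three relevant documents were found. In a real test collection the same computation would be repeated for every query and averaged.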

Cleverdon’s Cranfield studies not only introduced experimental research into computer science; the outcomes of the project also established the basis of automatic indexing as done in today’s search engines. In essence, Cleverdon found that using single terms from the documents, as opposed to manually assigned thesaurus terms, synonyms, etc., achieved the best retrieval performance. These results were very controversial at the time. In the Cranfield 2 Report, Cleverdon says:

This conclusion is so controversial and so unexpected that it is bound to throw considerable doubt on the methods which have been used (…) A complete recheck has failed to reveal any discrepancies (…) there is no other course except to attempt to explain the results which seem to offend against every canon on which we were trained as librarians.

Cyril Cleverdon also ran, for many years, the Cranfield conferences, which provided a major international forum for discussion of ideas and research in information retrieval. This function was taken over by the SIGIR conferences in the 1970s.

Cleverdon received several awards during his life: the Professional Award of the Special Libraries Association (1962), the Award of Merit of the American Society for Information Science (1971), and the Gerard Salton Award of the Special Interest Group on Information Retrieval of the Association for Computing Machinery (1991).

Written for Wikipedia.

Dr. Kawashima says: Search the Web

Thursday, October 16th, 2008, posted by Djoerd Hiemstra

Scientists have found that searching the Internet triggers key centers in the brain that control decision-making and complex reasoning. The findings demonstrate that Web search activity may help stimulate and possibly improve brain function. According to Dr. Gary Small, director of UCLA’s Memory and Aging Research Center: “Our most striking finding was that Internet searching appears to engage a greater extent of neural circuitry that is not activated during reading, but only in those with prior Internet experience.” Researchers found that during Web searching, volunteers with prior Internet search experience registered a twofold increase in brain activation compared with those with little Internet experience.

More at UCLA

SIGIR Salton Award 2009

Tuesday, August 12th, 2008, posted by Djoerd Hiemstra

SIGIR accepts nominations for the Salton Award to honor members of our community who have made “…significant, sustained and continuing contributions to research in information retrieval”. The Salton Award is given triennially at the SIGIR Conference: 2009 is a Salton year. The selection committee consists of the available past Salton Award winners. Nominations can be sent to the SIGIR Chair, Elizabeth D. Liddy, who coordinates the discussion and nomination.

More information at the SIGIR Award Page

Help Joost de Wit with his graduation project

Friday, November 30th, 2007, posted by Djoerd Hiemstra

Take part for a chance to win a personalized DVD box!

As more and more films, books, photos, news articles and other content appear on the web, it becomes increasingly difficult to tell interesting items from uninteresting ones. Recommender systems are programs that try to help you discover items that are worthwhile for you. A well-known example is Amazon’s “Customers Who Viewed This Item Also Viewed” feature.

Joost de Wit is conducting a user study at TNO ICT to discover which aspects contribute to the quality of TV programme recommendations. For the study it is important that enough feedback is collected: the more feedback, the better. To encourage participation, TNO ICT has made a personalized DVD box available. The box will consist of the 5 DVDs that the recommender system marks as most interesting for you. Do you give high ratings to programmes with a lot of action? Then an action film may well end up in your DVD box. So it is important that you give plenty of good feedback.

To take part in the study, click here.

What is SIGIR?

Tuesday, November 6th, 2007, posted by Djoerd Hiemstra

SIGIR is the Association for Computing Machinery’s Special Interest Group on Information Retrieval. The scope of the group’s specialty is the theory and application of computers to the acquisition, organization, storage, retrieval and distribution of information; emphasis is placed on working with non-numeric information, ranging from natural language to highly structured databases.

The annual international SIGIR conference series, which began in 1978 (there was an initial SIGIR conference in 1971), is considered one of the most important conferences in the field of information retrieval. The 31st SIGIR conference (SIGIR 2008) took place in Singapore, and the next SIGIR conference (SIGIR 2009) will be held in Boston, MA, USA. SIGIR also sponsors the annual Joint Conference on Digital Libraries (JCDL) in association with SIGWEB, the Conference on Information and Knowledge Management (CIKM), and in 2008 the International Conference on Web Search and Data Mining (WSDM) in association with SIGKDD, SIGMOD, and SIGWEB.

The group gives several awards for contributions to the field of information retrieval. The most important award is the Gerard Salton Award (named after the computer scientist Gerard Salton), which is awarded every three years to an individual who has made “significant, sustained and continuing contributions to research in information retrieval”.

Written for Wikipedia.

New group member: Rongmei Li

Monday, January 22nd, 2007, posted by Djoerd Hiemstra
Rongmei Li joined our group on 15 January. She will be working on EfFoRT: Effective Focused Retrieval Techniques, a project funded by the Netherlands Organisation for Scientific Research (NWO). Welcome Rongmei!

TSR: Emphasis on exceptional academic efforts

Friday, October 20th, 2006, posted by Djoerd Hiemstra
As a university, we are doing too little for our excellent students. Students with difficulties may expect a lot of attention from lecturers and BOZ (the educational bureau). Excellent students get a high grade, and that’s it. The academic climate on campus is looking more and more like that of a high school.

The idea for an academic student journal at the University of Twente was born in the autumn of 2005, when a small group of students was discussing original ways to improve the academic climate on campus. They felt that more emphasis should be put on exceptional academic efforts. Students should have the will to excel and be proud of their work. Realizing that many interesting scholarly activities by students lead only to an inches-thick report, they decided that a peer-reviewed journal would be of tremendous added value to the student community and could help existing and potential academic talent flourish.

Check out the TSR Web Site

Large national survey among teachers

Monday, April 3rd, 2006, posted by Djoerd Hiemstra
Onderwijs aan het woord offers teachers and educational support staff in all education sectors the chance to give their opinion about their profession. It is the first large national survey in which the more than 350,000 education professionals are asked for their opinion on the daily practice of education. They determine which themes get on the agenda.

Making your opinion heard is easy: fill in a web survey before 21 April (it takes less than 20 minutes). In June 2006, an agenda with concrete points for improvement, based on the survey, will be presented to Minister Van der Hoeven and State Secretary Rutte of Education.

Wouter Alink graduates on information retrieval for digital forensics

Friday, October 28th, 2005, posted by Djoerd Hiemstra

Wouter Alink did his graduation project at CWI Amsterdam and the Nederlands Forensisch Instituut (NFI). His Master’s thesis addresses problems in current digital forensic investigations. It proposes the XIRAF system as a novel approach to integrating existing forensic analysis tools using XML technology. The concept of integrating these tools can be compared to the concept of concurrent XML hierarchies. The representation of concurrent XML has been widely studied, but concurrent XML hierarchies cause a variety of unsolved problems when such data has to be queried. Querying concurrent XML hierarchies, however, has many practical applications, including digital forensics, question answering, and multimedia retrieval. This thesis introduces Burkowski axis steps in XPath as a viable solution for the digital forensics application area. The steps can be used in stand-off XML annotation, in which the content is separated from the annotations. This approach has many advantages over inline annotation, especially in the field of digital forensics. The introduced steps have been implemented in an existing open source XQuery system called MonetDB/XQuery.
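The idea behind stand-off annotation can be sketched in a few lines of Python: each annotation stores only the region of the content it describes, and queries relate annotations through those regions rather than through XML nesting. This is a rough illustration under assumptions, not the thesis’s XPath syntax or the MonetDB/XQuery implementation. The names `Region`, `select_narrow` and `select_wide`, the half-open byte offsets, and the reading of the two steps as “contained in” and “overlaps with” are all choices made for this example.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    """A stand-off annotation: a label plus the span of content it covers."""
    name: str
    start: int  # offset of the first covered byte (inclusive)
    end: int    # offset just past the last covered byte (exclusive)


def select_narrow(context: Region, candidates: List[Region]) -> List[Region]:
    """Candidates whose region lies entirely inside the context region."""
    return [c for c in candidates
            if context.start <= c.start and c.end <= context.end]


def select_wide(context: Region, candidates: List[Region]) -> List[Region]:
    """Candidates whose region overlaps the context region at least partly."""
    return [c for c in candidates
            if c.start < context.end and context.start < c.end]


# Hypothetical annotations produced by different tools over one disk image.
file_region = Region("file", 0, 100)
others = [
    Region("jpeg", 10, 30),    # fully inside the file
    Region("email", 40, 120),  # starts inside the file, ends outside it
    Region("log", 150, 200),   # unrelated part of the image
]

print([r.name for r in select_narrow(file_region, others)])
print([r.name for r in select_wide(file_region, others)])
```

Because the annotations never nest inside each other, the output of different tools can overlap freely, which inline XML markup cannot express directly.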