Archive for the 'Expert Search' Category

Expert group formation using facility location analysis

Thursday, April 3rd, 2014, posted by Djoerd Hiemstra

by Mahmood Neshati, Hamid Beigy, and Djoerd Hiemstra

In this paper, we propose an optimization framework to retrieve an optimal group of experts to perform a multi-aspect task. Because a diverse set of skills is needed to perform a multi-aspect task, the group of assigned experts should be able to collectively cover all the required skills. We consider three types of multi-aspect expert group formation problems and propose a unified framework to solve them accurately and efficiently. The first problem is concerned with finding the top k experts for a given task when the required skills of the task are described only implicitly. In the second problem, the required skills of the tasks are explicitly described using keywords, but each expert has a limited capacity and should therefore be assigned to a limited number of tasks. Finally, the third problem is the combination of the first and the second. Our proposed optimization framework is based on Facility Location Analysis, a well-known branch of Operations Research. In our experiments, we compare the accuracy and efficiency of the proposed framework with state-of-the-art approaches to the group formation problems. The experimental results show the effectiveness of our proposed methods in comparison with these approaches.
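To give a feel for why covering skills collectively differs from picking the k individually strongest experts, here is a minimal sketch of a facility-location-style objective solved greedily. It is not the optimization method of the paper: the function name `greedy_group`, the score table, and the greedy strategy are all assumptions made for illustration. Experts play the role of facilities, skills play the role of customers, and each skill is "served" by the best selected expert.

```python
from typing import Dict, List


def greedy_group(scores: Dict[str, Dict[str, float]],
                 skills: List[str], k: int) -> List[str]:
    """Greedily pick up to k experts (hypothetical helper, not from the paper).

    Objective: sum over skills of the best score among the selected experts.
    """
    selected: List[str] = []
    best = {s: 0.0 for s in skills}  # best coverage so far, per skill
    for _ in range(k):
        best_gain, best_expert = 0.0, None
        for expert in sorted(scores):  # sorted for deterministic tie-breaking
            if expert in selected:
                continue
            # Marginal gain: how much this expert improves each skill's coverage.
            gain = sum(max(scores[expert].get(s, 0.0) - best[s], 0.0)
                       for s in skills)
            if gain > best_gain:
                best_gain, best_expert = gain, expert
        if best_expert is None:  # nobody improves coverage any further
            break
        selected.append(best_expert)
        for s in skills:
            best[s] = max(best[s], scores[best_expert].get(s, 0.0))
    return selected


# Made-up expert/skill scores, for illustration only.
example_scores = {
    "alice": {"ir": 0.9, "ml": 0.2},
    "bob": {"ir": 0.8, "ml": 0.2},
    "carol": {"ml": 0.9},
}
print(greedy_group(example_scores, ["ir", "ml"], 2))
```

In this toy example, alice and bob have the highest individual totals, but the greedy selection pairs alice with carol because carol covers the skill that alice lacks.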

Published in Information Processing & Management 50(2), March 2014, Pages 361–383

[download pdf]

Assigning reviewers to papers

Monday, November 12th, 2012, posted by Djoerd Hiemstra

Multi-Aspect Group Formation using Facility Location Analysis

by Mahmood Neshati, Hamid Beigy, and Djoerd Hiemstra

In this paper, we propose an optimization framework to retrieve an optimal group of experts to perform a given multi-aspect task or project. Each task needs a diverse set of skills, and the group of assigned experts should be able to collectively cover all required aspects of the task. We consider three types of multi-aspect team formation problems and propose a unified framework to solve these problems accurately and efficiently. Our proposed framework is based on Facility Location Analysis, a well-known branch of Operations Research. Our experiments on a real dataset show significant improvement in comparison with state-of-the-art approaches to the team formation problem.

The paper will be presented at the 17th Australasian Document Computing Symposium (ADCS 2012) at the University of Otago, Dunedin, New Zealand, on the 5th and 6th of December, 2012.

[download pdf]

Pavel Serdyukov defends PhD thesis on Expert Search

Thursday, June 25th, 2009, posted by Djoerd Hiemstra

by Pavel Serdyukov

The automatic search for knowledgeable people within an organization is a key function that makes modern enterprise search systems commercially successful and socially in demand. A number of effective approaches to expert finding were recently proposed in academic publications. Although most of them use reasonably defined measures of personal expertise, they often limit themselves to rather unrealistic and sometimes oversimplified principles. In this thesis, we explore several ways to go beyond the state-of-the-art assumptions used in research on expert finding and propose several novel solutions for this and related tasks. First, we describe measures of expertise that do not assume independent occurrence of terms and persons in a document, which makes them perform better than measures based on the independence of all entities in a document. One of these measures makes persons central to the process of term generation in a document. Another assumes that the position of a person’s mention in a document, relative to the positions of query terms, indicates the relation of the person to the document’s relevant content. Second, we find ways to go beyond direct expertise evidence for a person, which is concentrated within the document space of the person’s current employer and only within those organizational documents that mention the person. We successfully utilize the predictive potential of additional indirect expertise evidence that is publicly available on the Web and in organizational documents implicitly related to a person. Finally, besides the expert finding methods we propose, we also demonstrate solutions for tasks from related domains. In one case, we use several algorithms of multi-step relevance propagation to search for typed entities in Wikipedia. In another case, we suggest generic methods for placing photos uploaded to Flickr on the world map, using language models of locations built entirely on the annotations provided by users, with a few task-specific extensions.

[download pdf]

2nd SIKS/Twente Seminar on Searching and Ranking

Monday, June 8th, 2009, posted by Djoerd Hiemstra

On June 24, 2009 at the University of Twente

http://www.cs.utwente.nl/~hiemstra/ssr2009/

The goal of the one-day seminar is to bring together researchers from companies and academia working on enterprise search problems. Speakers at the seminar are David Hawking from Funnelback Internet and Enterprise Search & the Australian National University, who will talk about Practical Methods for Evaluating Enterprise Search; Iadh Ounis from the University of Glasgow, who will present Voting Techniques for Expert Search; and Maarten de Rijke from the University of Amsterdam, who will talk about Expert Profiling Out In the Wild.

University of Twente at the TREC 2008 Enterprise Track

Friday, October 24th, 2008, posted by Djoerd Hiemstra

Using the Global Web as an expertise evidence source

by Pavel Serdyukov, Robin Aly, Djoerd Hiemstra

This is the fourth (and last) year of the TREC Enterprise Track and the second year the University of Twente submitted runs for the expert finding task. In the methods that were used to produce these runs, we mostly rely on the predictive potential of those expertise evidence sources that are publicly available on the Global Web, but not hosted at the website of the organization under study (CSIRO). This paper describes follow-up studies complementary to our recent research, which demonstrated how taking the web factor seriously significantly improves the performance of expert finding in the enterprise.

The paper will be presented at the 17th Text Retrieval Conference (TREC), November 19-21, at the United States National Institute of Standards and Technology in Gaithersburg, USA.

[download draft paper] [More info]

Multi-step Relevance Propagation for Expert Finding

Friday, August 15th, 2008, posted by Djoerd Hiemstra

by Pavel Serdyukov, Henning Rode, and Djoerd Hiemstra

[Figure: A fragment of the real expertise graph with links between documents (white nodes) and candidate experts (black nodes) for query 'sustainable ecosystems']

An expert finding system allows a user to type a simple text query and retrieve names and contact information of individuals that possess the expertise expressed in the query. This paper proposes a novel approach to expert finding in large enterprises or intranets by modeling candidate experts (persons), web documents and various relations among them with so-called expertise graphs. As distinct from the state-of-the-art approaches estimating personal expertise through one-step propagation of relevance probability from documents to the related candidates, our methods are based on the principle of multi-step relevance propagation in topic-specific expertise graphs. We model the process of expert finding by probabilistic random walks of three kinds: finite, infinite and absorbing. Experiments on TREC Enterprise Track data originating from two large organizations show that our methods using multi-step relevance propagation improve over the baseline one-step propagation based method in almost all cases.
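The difference between one-step and multi-step propagation can be illustrated with a small finite walk on a document-candidate graph. This is a simplified sketch under stated assumptions, not the model from the paper: the function `propagate_relevance` is hypothetical, transitions are uniform over a node's links, and only document-candidate links are modeled. With `rounds=1` it reduces to a one-step baseline in which each document splits its relevance probability among the candidates it mentions.

```python
from collections import defaultdict
from typing import Dict, List


def propagate_relevance(doc_relevance: Dict[str, float],
                        doc_to_candidates: Dict[str, List[str]],
                        rounds: int) -> Dict[str, float]:
    """Finite walk on a bipartite graph (illustrative, uniform transitions).

    Each round moves probability mass from documents to candidates; between
    rounds the mass flows back from candidates to their documents.
    """
    if rounds < 1:
        raise ValueError("rounds must be at least 1")

    # Reverse index: which documents mention each candidate.
    cand_to_docs: Dict[str, List[str]] = defaultdict(list)
    for doc, cands in doc_to_candidates.items():
        for cand in cands:
            cand_to_docs[cand].append(doc)

    doc_mass: Dict[str, float] = dict(doc_relevance)
    cand_mass: Dict[str, float] = {}
    for r in range(rounds):
        # Documents -> candidates (mass of documents without candidates is lost).
        cand_mass = defaultdict(float)
        for doc, mass in doc_mass.items():
            cands = doc_to_candidates.get(doc, [])
            for cand in cands:
                cand_mass[cand] += mass / len(cands)
        if r == rounds - 1:
            break
        # Candidates -> documents, ready for the next round.
        doc_mass = defaultdict(float)
        for cand, mass in cand_mass.items():
            docs = cand_to_docs[cand]
            for doc in docs:
                doc_mass[doc] += mass / len(docs)
    return dict(cand_mass)


# Toy graph: d1 mentions candidate a; d2 mentions candidates a and b.
toy_relevance = {"d1": 0.5, "d2": 0.5}
toy_links = {"d1": ["a"], "d2": ["a", "b"]}
print(propagate_relevance(toy_relevance, toy_links, rounds=1))
print(propagate_relevance(toy_relevance, toy_links, rounds=2))
```

In the toy graph, a second round shifts some mass from candidate a towards candidate b, because b's only document also receives mass that flowed back from a.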

The paper will be presented at the ACM Conference on Information and Knowledge Management (CIKM 2008) in Napa Valley, USA.

[download pdf]

Being Omnipresent to be Almighty

Friday, June 20th, 2008, posted by Djoerd Hiemstra

The Importance of the Global Web Evidence for Organizational Expert Finding

by Pavel Serdyukov and Djoerd Hiemstra

Modern expert finding algorithms are developed under the assumption that all possible expertise evidence for a person is concentrated in the company that currently employs the person. Evidence that can be acquired outside of the enterprise is traditionally overlooked. At the same time, the Web is full of personal information that is sufficiently detailed to judge a person’s skills and knowledge. In this work, we review various sources of expertise evidence outside of an organization and experiment with rankings built on the data acquired from six different sources, accessible through the APIs of two major web search engines. We show that these rankings and their combinations are often more realistic and of higher quality than rankings built on organizational data only.

The paper will be presented at the Future Challenges in Expertise Retrieval (fCHER) workshop in Singapore.

[download pdf]

Pavel Serdyukov wins ECIR best student paper award

Tuesday, April 1st, 2008, posted by Djoerd Hiemstra

Pavel shows his check

Great news: yesterday, Pavel Serdyukov won the best student paper award at the European Conference on Information Retrieval (ECIR) in Glasgow for his paper Modeling documents as mixtures of persons for expert finding. The award includes a check for $1200, sponsored by Yahoo.

[download pdf]

ECIR tutorial slides on-line

Monday, March 31st, 2008, posted by Djoerd Hiemstra

Djoerd performing at ECIR

I enjoyed giving the advanced language modeling tutorial at the European Conference on Information Retrieval (ECIR). The slides are now available for download below.

[download pdf]

Modeling documents as mixtures of persons

Friday, March 28th, 2008, posted by Djoerd Hiemstra

by Pavel Serdyukov and Djoerd Hiemstra

In this paper we address the problem of searching for knowledgeable persons within the enterprise, known as the expert finding (or expert search) task. We present a probabilistic algorithm based on the assumption that terms in documents are produced by the people who are mentioned in them. We represent documents retrieved for a query as mixtures of candidate experts’ language models. Two methods for extracting personal language models are proposed, as well as a way of combining them with other evidence of expertise. Experiments conducted with the TREC Enterprise collection demonstrate the superiority of our approach in comparison with the best of the existing solutions.

[download pdf]