====== CriES Pilot Challenge ======
In the pilot challenge of the [[:cries|CriES Workshop]] we instantiate the problem setting as an expert finding task: the goal is to identify the expertise of online community members and to suggest experts for new problems, questions, or help requests in multi-lingual social media. Expert users in online communities are often multi-lingual, i.e. they participate in discussions in several languages. Frequently, a user's expertise is language-independent, so the user could provide meaningful assistance with questions and requests stated in any language they know. Analyzing a user's multi-lingual contributions (e.g. past answers or postings) together with their social environment (e.g. past interaction with other community members, contact/favorite lists) may therefore provide better indications that the user has the expertise needed to address a request, irrespective of its language.
===== Evaluation Guidelines =====
Please refer to the [[evaluation_guideline|Evaluation Guideline]] for information about:
* Submission of results
* Pooling of results
* Relevance assessment
===== Important Dates =====
* December 2009: Release of pilot dataset
* 16.05.2010: Submission deadline for results on pilot challenge
* 20.06.2010: Release of relevance assessments and evaluation results
* 04.07.2010: Submission deadline for papers
* 18.07.2010: Notification of acceptance
* 15.08.2010: Camera ready deadline
===== Challenge =====
The pilot challenge can be summarized as follows:
* **Given:** (by organizers)
  * Document corpus consisting of questions and answers to these questions
  * User ids of authors of questions and answers
  * Set of 60 multi-lingual topics (15 English, 15 German, 15 French, 15 Spanish)
* **Task:** (by participants)
  * For each topic, generate a ranked list of the ids of users who are likely to have expertise on the topic
* **Evaluation:** (by organizers)
  * Relevance assessment of experts for topics (using result pooling)
  * Evaluation of the participants' retrieval results using standard evaluation measures (e.g. recall, precision, MAP)
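The measures named above can be illustrated with a minimal Python sketch. It is not the organizers' evaluation tool. The data layout is an assumption made for this example: a run maps each topic id to a ranked list of user ids, and the relevance assessments map each topic id to the set of user ids judged as experts. All function names and ids are hypothetical.

```python
from typing import Dict, List, Set


def precision_at_k(ranking: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k ranked user ids that were judged relevant."""
    if k <= 0:
        return 0.0
    return sum(1 for user in ranking[:k] if user in relevant) / k


def recall_at_k(ranking: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant user ids that appear in the top k."""
    if not relevant:
        return 0.0
    return sum(1 for user in ranking[:k] if user in relevant) / len(relevant)


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """Mean of the precision values at each rank holding a relevant user id.

    Assumes the ranking contains no duplicate user ids.
    """
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, user in enumerate(ranking, start=1):
        if user in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(run: Dict[str, List[str]],
                           qrels: Dict[str, Set[str]]) -> float:
    """Average precision averaged over all assessed topics.

    A topic missing from the run counts as an empty ranking.
    """
    if not qrels:
        return 0.0
    scores = [average_precision(run.get(topic, []), relevant)
              for topic, relevant in qrels.items()]
    return sum(scores) / len(scores)


# Hypothetical toy data: two topics with made-up user ids.
run = {"t1": ["u3", "u1", "u7"], "t2": ["u2", "u9"]}
qrels = {"t1": {"u1", "u7"}, "t2": {"u2"}}
map_score = mean_average_precision(run, qrels)
```

In the toy data, topic ''t1'' finds its two experts at ranks 2 and 3, and topic ''t2'' finds its single expert at rank 1, so a run that moves experts towards the top of each list gets a higher MAP.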
===== Dataset and Participation =====
The challenge uses a dataset provided by Yahoo! through the Webscope Program that contains questions and answers from [[http://answers.yahoo.com/|Yahoo! Answers]]. We identified a subset of this dataset that is suitable for expert search and selected 60 queries in four languages: English, German, French and Spanish.
We will provide participants in the pilot challenge with the following:
* Description of [[webscope|how to acquire]] the dataset from Yahoo!
* [[preprocessing|Preprocessing tool]] that has the following outputs:
  * Restriction of the dataset to the selected categories
  * Queries in TREC Topic format
  * Questioner/answerer graph
* Evaluation of retrieval results by human assessors
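To give an idea of what a questioner/answerer graph can look like, here is a minimal Python sketch. The actual output format of the preprocessing tool is not described on this page, so the input layout (pairs of asker id and answerer id, one per answer) and all function names are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def build_qa_graph(pairs: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, int]]:
    """Build a directed, weighted graph: asker -> answerer -> number of answers.

    Self-answers (a user answering their own question) are skipped.
    """
    graph: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for asker, answerer in pairs:
        if asker != answerer:
            graph[asker][answerer] += 1
    # Convert nested defaultdicts to plain dicts for predictable behaviour.
    return {asker: dict(targets) for asker, targets in graph.items()}


def answers_received(graph: Dict[str, Dict[str, int]]) -> Dict[str, int]:
    """Weighted in-degree: how many answers each user gave to other users."""
    counts: Dict[str, int] = defaultdict(int)
    for targets in graph.values():
        for answerer, n in targets.items():
            counts[answerer] += n
    return dict(counts)


# Hypothetical toy data: (asker id, answerer id) for each answer.
pairs = [("u1", "u2"), ("u1", "u3"), ("u4", "u2"), ("u2", "u2")]
graph = build_qa_graph(pairs)
in_degree = answers_received(graph)
```

The weighted in-degree is only one simple signal that could be read off such a graph. It does not consider the topic or the language of the answers.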
===== Contact =====
Please contact [[philipp.sorg@kit.edu|Philipp Sorg]] if you want to participate or would like more information about the pilot challenge.