This repo holds a collection of tools for the TREC Microblog tracks, which officially ended in 2015. The track mailing list can be found at [email protected].
The Microblog tracks in 2013 and 2014 used the "evaluation as a service" (EaaS) model, where teams interact with the official corpus via a common API. Although the evaluation has ended, the API is still available for researcher use.
To request access to the API, follow these steps:
The main Maven artifact for the TREC Microblog API is twitter-tools-core. The latest releases of Maven artifacts are available at Maven Central.
You can clone the repo with the following command:
$ git clone git://github.com/lintool/twitter-tools.git
Once you've cloned the repository, change directory into twitter-tools-core and build the package with Maven:
$ cd twitter-tools-core
$ mvn clean package appassembler:assemble
For more information, see the project wiki.
One advantage of the TREC Microblog API is that it is possible to deploy a community baseline whose results are replicable by anyone. The raw results are simply the output of the API unmodified. The baseline results are the raw results that have been post-processed to remove retweets and break score ties by reverse chronological order (earliest first).
To run the raw results for TREC 2011, issue the following command:
sh target/appassembler/bin/RunQueriesThrift \
  -host [host] -port [port] -group [group] -token [token] \
  -queries ../data/topics.microblog2011.txt > run.microblog2011.raw.txt
And to run the baseline results for TREC 2011, issue the following command:
sh target/appassembler/bin/RunQueriesBaselineThrift \
  -host [host] -port [port] -group [group] -token [token] \
  -queries ../data/topics.microblog2011.txt > run.microblog2011.baseline.txt
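The tie-breaking half of the baseline post-processing can be approximated directly on a run file. The sketch below is not the project's implementation. It assumes the run is in the six-column format that trec_eval reads (topic, Q0, tweet id, rank, score, tag) and that a larger tweet id means a later tweet. Ties are listed with the later tweet first; drop the r from -k3,3nr if earliest-first is wanted. break_ties is a hypothetical helper name, and retweet removal is left out because a run file carries no retweet information.

```shell
#!/bin/sh
# Sketch only: re-rank a TREC-format run file.
# Assumed columns: topic, Q0, tweet id, rank, score, run tag.

# Within each topic, order by score (highest first); equal scores are
# ordered by tweet id, larger id first (assumed to be the later tweet).
# The rank column is then renumbered from 1 within each topic.
# Note: -g (general numeric) is a GNU/BSD sort extension, used here so
# that scores written in scientific notation still sort correctly.
break_ties() {
  LC_ALL=C sort -k1,1 -k5,5gr -k3,3nr "$@" |
    awk 'NR == 1 || $1 != prev { rank = 0; prev = $1 }
         { rank++; $4 = rank; print }'
}
```

For example, break_ties run.microblog2011.raw.txt > run.microblog2011.sorted.txt writes a re-ranked copy and leaves the original run untouched.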
trec_eval is included in twitter-tools/etc (it just needs to be compiled), and the qrels are stored in twitter-tools/data (they just need to be uncompressed), so you can evaluate as follows:
../etc/trec_eval.9.0/trec_eval ../data/qrels.microblog2011.txt run.microblog2011.raw.txt
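The run and evaluation commands above differ only in the track year and the run type, so they can be generated from a small script. The sketch below only prints the commands and executes nothing. runner_for and print_commands are hypothetical helper names, and the topics and qrels file names for years other than 2011 are assumed to follow the 2011 naming pattern, so check twitter-tools/data before relying on them.

```shell
#!/bin/sh
# Sketch only: print the run and evaluation commands for one track year.
# Intended to be used from inside twitter-tools-core after the build.

# Map a run type to the program name used in this README.
runner_for() {
  case "$1" in
    raw)      echo "RunQueriesThrift" ;;
    baseline) echo "RunQueriesBaselineThrift" ;;
    *)        echo "unknown run type: $1" >&2; return 1 ;;
  esac
}

# Print the two commands for a given year and run type.
# [host], [port], [group] and [token] stay as placeholders to fill in.
# File names for years other than 2011 are an assumption (see above).
print_commands() {
  year="$1"
  kind="$2"
  runner=$(runner_for "$kind") || return 1
  run="run.microblog${year}.${kind}.txt"
  echo "sh target/appassembler/bin/${runner} -host [host] -port [port] -group [group] -token [token] -queries ../data/topics.microblog${year}.txt > ${run}"
  echo "../etc/trec_eval.9.0/trec_eval ../data/qrels.microblog${year}.txt ${run}"
}

print_commands 2011 raw
print_commands 2011 baseline
```

Printing rather than executing keeps the credentials out of the script and lets the commands be checked before they are run.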
Similar commands will allow you to replicate runs for TREC 2012 and TREC 2013. With trec_eval, you should get exactly the following results:
Licensed under the Apache License, Version 2.0.
This work is supported in part by the National Science Foundation under award IIS-1218043. Any opinions, findings, and conclusions or recommendations expressed are those of the researchers and do not necessarily reflect the views of the National Science Foundation.