Corpora

A number of annotated corpora can be used for research on temporal expressions and/or event ordering. The following can serve as gold-standard corpora:

| Corpus | Short description | TIMEX version |
| --- | --- | --- |
| MUC-6 | The corpus from the 6th Message Understanding Conference, available from LDC under catalogue number LDC2003T13. | MUC-6 TIMEX |
| MUC-7 | The corpus from the 7th Message Understanding Conference, available from LDC under catalogue number LDC2001T02. | MUC-7 TIMEX |
| TIDES | This corpus consists of two parts: (1) 95 Spanish dialogs (part of the Enthusiast corpus) and their English translations; (2) 193 documents from the TDT-2 corpus. Only the first part is available from MITRE's website on TIMEX2. | TIMEX2 2001 v.1.0.2 (June 2001) |
| ACE-2004 Dev | The development corpus used at the Automatic Content Extraction (ACE) evaluations in 2004, available from LDC under catalogue number LDC2005T07. | TIMEX2 2003 v.1.3 (April 2004) |
| ACE-2004 Eval | The corpus used for the official evaluation of the ACE 2004 TERN task. | TIMEX2 2003 v.1.3 (April 2004) |
| ACE-2005 Dev | The development corpus used at the ACE evaluations in 2005, available from LDC under catalogue number LDC2006T06. | TIMEX2 (April 2005) |
| ACE-2005 Eval | The evaluation corpus used at the ACE evaluations in 2005. This corpus is not yet publicly available. | TIMEX2 (April 2005) |
| ACE-2007 Dev | The development corpus used at the ACE evaluations in 2007, consisting of selected domains in Arabic and Spanish only. This corpus is not yet publicly available. | TIMEX2 (April 2005) |
| ACE-2007 Eval | The evaluation corpus used at the ACE evaluations in 2007. This corpus is not yet publicly available. | TIMEX2 (April 2005) |
| TimeBank 1.1 | Version 1.1 of the TimeBank corpus, available for download from the MITRE website. See the release notes. | TIMEX3 (TimeML 1.1) |
| TimeBank 1.2 | Version 1.2 of the TimeBank corpus, available from LDC under catalogue number LDC2006T08. | TIMEX3 (TimeML 1.2.1) |
| WikiWars | A corpus of English Wikipedia articles about wars. | TIMEX2 (Sep 2005) |
| WikiWarsDE | A German version of WikiWars created from the corresponding German Wikipedia articles. | TIMEX2 (Sep 2005) |
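
Because the corpora above follow different TIMEX versions, the code that reads their annotations usually has to be parameterised by tag name and by the attribute that holds the normalised value. The following is a minimal Python sketch of that idea, not a loader for any specific release. It assumes inline, well-formed XML, with `VAL` on TIMEX2 tags and `value` on TIMEX3 tags; the two sample sentences and the helper `extract_timexes` are hypothetical and are not taken from any of the listed corpora. Actual distributions may use SGML, stand-off annotation, or different attribute casing, so the release documentation should be checked first.

```python
import xml.etree.ElementTree as ET

# Hypothetical sentences written for illustration; not taken from any corpus.
TIMEX2_SAMPLE = (
    '<s>The talks ended on '
    '<TIMEX2 VAL="2004-04-12">April 12, 2004</TIMEX2>.</s>'
)
TIMEX3_SAMPLE = (
    '<s>The talks ended on '
    '<TIMEX3 tid="t1" type="DATE" value="2004-04-12">April 12, 2004</TIMEX3>.</s>'
)


def extract_timexes(xml_text, tag, value_attr):
    """Return (surface text, normalised value) pairs for every `tag` element.

    Assumes `xml_text` is well-formed XML with inline annotations.
    """
    root = ET.fromstring(xml_text)
    return [
        ("".join(el.itertext()), el.get(value_attr))
        for el in root.iter(tag)
    ]


timex2 = extract_timexes(TIMEX2_SAMPLE, "TIMEX2", "VAL")
timex3 = extract_timexes(TIMEX3_SAMPLE, "TIMEX3", "value")
print(timex2)
print(timex3)
```

Keeping the tag and attribute names as arguments lets the same function serve both annotation styles, at the cost of ignoring scheme-specific attributes such as `tid` and `type`.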