Corpora
A number of annotated corpora can be used for research on temporal expressions and/or event ordering. The following can serve as gold-standard corpora:
| Corpus | Short Description | TIMEX version |
|---|---|---|
| MUC-6 | The corpus from the 6th Message Understanding Conference, available at LDC under the catalogue number LDC2003T13. | MUC-6 TIMEX |
| MUC-7 | The corpus from the 7th Message Understanding Conference, available at LDC under the catalogue number LDC2001T02. | MUC-7 TIMEX |
| TIDES | This corpus consists of two parts: (1) 95 Spanish dialogs (part of the Enthusiast corpus) and their English translations; (2) 193 documents from the TDT-2 corpus. Only the first part is available, from MITRE's TIMEX2 website. | TIMEX2 2001 v.1.0.2 (June 2001) |
| ACE-2004 Dev | This was the development corpus used at the Automatic Content Extraction (ACE) evaluations in 2004, available at LDC under the catalogue number LDC2005T07. | TIMEX2 2003 v.1.3 (April 2004) |
| ACE-2004 Eval | This corpus was used for official evaluation at the ACE 2004 TERN task. | TIMEX2 2003 v.1.3 (April 2004) |
| ACE-2005 Dev | This was the development corpus used at the Automatic Content Extraction (ACE) evaluations in 2005, available at LDC under the catalogue number LDC2006T06. | TIMEX2 (April 2005) |
| ACE-2005 Eval | This was the evaluation corpus used at the Automatic Content Extraction (ACE) evaluations in 2005. This corpus is not yet publicly available. | TIMEX2 (April 2005) |
| ACE-2007 Dev | This was the development corpus, consisting of selected domains in Arabic and Spanish only, used at the Automatic Content Extraction (ACE) evaluations in 2007. This corpus is not yet publicly available. | TIMEX2 (April 2005) |
| ACE-2007 Eval | This was the evaluation corpus used at the Automatic Content Extraction (ACE) evaluations in 2007. This corpus is not yet publicly available. | TIMEX2 (April 2005) |
| TimeBank 1.1 | Version 1.1 of the TimeBank corpus, available for download from the MITRE website. See the release notes. | TIMEX3 (TimeML 1.1) |
| TimeBank 1.2 | The TimeBank corpus in the 1.2 version, available at LDC under the catalogue number LDC2006T08. | TIMEX3 (TimeML 1.2.1) |
| WikiWars | A corpus of English Wikipedia articles about wars. | TIMEX2 (Sep 2005) |
| WikiWarsDE | A German version of WikiWars created from the corresponding German Wikipedia articles. | TIMEX2 (Sep 2005) |
page revision: 7, last edited: 29 Oct 2011 18:42





