MITIE trained NER models on the GMB 2.2.0 corpus


About this Collection


These models are the product of research undertaken under the 2017 APS Data Fellowship program. This collection includes trained MITIE NER models that are ready for use with a standard MITIE library classifier. A summary table of precision, recall and F1 performance statistics is included for future reference.
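As a rough illustration of how precision, recall and F1 figures of this kind can be computed for NER output, the sketch below scores predicted entities against gold annotations. It assumes exact matching on (start, end, tag) spans; the scoring scheme actually used for the included statistics table is not stated here, and the function name and sample tags are illustrative only.

```python
def precision_recall_f1(gold, predicted):
    """Entity-level scores, assuming exact (start, end, tag) span matching."""
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)

    # Guard against empty inputs so the ratios are always defined.
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative spans only: (start token, end token, tag).
gold = [(0, 2, "per"), (5, 6, "geo"), (8, 9, "org")]
predicted = [(0, 2, "per"), (5, 6, "org")]
precision, recall, f1 = precision_recall_f1(gold, predicted)
```

In this sample, a span with the right boundaries but the wrong tag counts as both a false positive and a false negative, which is the usual consequence of exact-match scoring.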

Econometric and Statistical Methods; Knowledge Representation and Machine Learning; Natural Language Processing

May 2017

CSIRO Enquiries
1300 363 400

Named Entity Recognition; NER; NLP; MITIE; GMB

Models in this collection were trained on the publicly available GMB 2.2.0 corpus using the full set of 20 tags. The models also include the publicly available MITIE total word feature extraction (TFE) dictionaries of CCA word embeddings and morphological features. The TFE dictionaries were trained on the LDC English Gigaword corpus.

GMB 2.2.0: Valerio Basile, Johan Bos, Kilian Evang, Noortje Venhuizen (2012): Developing a large semantically annotated corpus. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC-2012), pages 3196-3200. European Language Resources Association (ELRA).

MITIE TFE dictionary original source link:

Creative Commons Attribution 4.0 International Licence

Australian Securities And Investments Commission (Australia), CSIRO (Australia)

Scherer, Tariq (2017): MITIE trained NER models on the GMB 2.2.0 corpus. v2. CSIRO. Data Collection.

All Rights (including copyright) CSIRO 2017.

The metadata and files (if any) are available to the public.


About this Project


NER model training

The included models are the product of a structural support vector machine (SVM) training procedure and are compatible with the MITIE NER library.
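A minimal sketch of loading one of these models with the MITIE Python bindings is shown below. It assumes the `mitie` package is installed and uses a hypothetical model path, since the file names in this collection are not listed here. The import is done inside the function so that the helper can be used without MITIE present.

```python
def entity_text(tokens, token_range):
    """Join the tokens covered by an entity's token range into one string."""
    words = []
    for i in token_range:
        token = tokens[i]
        # MITIE's tokenizer returns bytes under Python 3; accept str as well.
        words.append(token.decode("utf-8") if isinstance(token, bytes) else token)
    return " ".join(words)


def extract_entities(model_path, text):
    """Return (tag, text) pairs found by a trained MITIE NER model."""
    # Imported lazily: requires the MITIE Python bindings to be installed.
    from mitie import named_entity_extractor, tokenize

    ner = named_entity_extractor(model_path)
    tokens = tokenize(text)
    results = []
    for entity in ner.extract_entities(tokens):
        # Each entity starts with (token range, tag); a score may follow.
        token_range, tag = entity[0], entity[1]
        results.append((tag, entity_text(tokens, token_range)))
    return results


# Hypothetical usage; the model file name is an assumption:
# extract_entities("gmb_ner_model.dat", "Johan Bos works in Groningen.")
```

The tags returned depend on the label set the model was trained with, which can be inspected with the extractor's `get_possible_ner_tags()` method.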


ASIC; Data61; MITIE distributed training library

Tariq Scherer
