MITIE trained NER models on the GMB 2.2.0 corpus
These models are the product of research undertaken under the 2017 APS Data Fellowship program. This collection includes trained MITIE NER models that are ready for use with a standard MITIE library classifier. A summary table of precision, recall and F1 performance statistics is included for reference.
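For readers interpreting the summary table, the sketch below shows how precision, recall and F1 are conventionally derived from per-tag counts. The function name and the example counts are illustrative assumptions; they are not taken from this collection's results.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Return (precision, recall, f1) from raw entity counts.

    Any ratio with a zero denominator is reported as 0.0.
    """
    predicted = true_positives + false_positives   # entities the model emitted
    actual = true_positives + false_negatives      # entities in the gold annotation
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical counts for a single tag, for illustration only.
p, r, f = precision_recall_f1(true_positives=8, false_positives=2, false_negatives=4)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

F1 is the harmonic mean of precision and recall, so it is pulled toward the lower of the two.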
Econometric and Statistical Methods
Knowledge Representation and Machine Learning
Natural Language Processing
Named Entity Recognition
Models in this collection were trained on the publicly available GMB 2.2.0 corpus using the full set of 20 tags. The models also include the publicly available MITIE total word feature extraction (TFE) CCA word embedding and morphological features dictionaries. The TFE dictionaries were trained on the LDC English Gigaword corpus.
Valerio Basile, Johan Bos, Kilian Evang, Noortje Venhuizen (2012): Developing a large semantically annotated corpus. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC-2012), pages 3196-3200. European Language Resources Association (ELRA).
MITIE TFE dictionary original source link:
Creative Commons Attribution 4.0 International Licence
Australian Securities And Investments Commission (Australia), CSIRO (Australia)
Scherer, Tariq (2017): MITIE trained NER models on the GMB 2.2.0 corpus. v2. CSIRO. Data Collection.
All Rights (including copyright) CSIRO 2017.
The metadata and files (if any) are available to the public.
NER model training
The included models are the product of a structural support vector machine procedure compatible with the MITIE NER library.
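A minimal sketch of loading one of these models through the MITIE Python bindings is given below, assuming the bindings are installed. The model file name, the sample sentence and the tag labels in the comments are hypothetical placeholders, not names from this collection; `named_entity_extractor`, `tokenize` and `extract_entities` are the calls used in MITIE's own Python examples.

```python
def describe_entities(tokens, entities):
    """Turn MITIE entity tuples into (tag, text) pairs.

    Each entity is expected to start with (token_range, tag); any further
    fields, such as a score, are ignored. Tokens may be bytes or str.
    """
    rows = []
    for entity in entities:
        token_range, tag = entity[0], entity[1]
        words = []
        for i in token_range:
            token = tokens[i]
            words.append(token.decode("utf-8") if isinstance(token, bytes) else token)
        rows.append((tag, " ".join(words)))
    return rows


def tag_text(model_path, text):
    """Load a trained model and return (tag, text) pairs for the input text."""
    # Imported here so the helper above can be used without MITIE installed.
    from mitie import named_entity_extractor, tokenize

    ner = named_entity_extractor(model_path)
    tokens = tokenize(text)
    return describe_entities(tokens, ner.extract_entities(tokens))


# Hypothetical usage; "gmb_ner_model.dat" stands in for a file from this collection.
# print(tag_text("gmb_ner_model.dat", "Alice Smith visited Canberra last week."))
```

The formatting helper is kept separate from model loading so that the output handling can be checked without the model files present.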
ASICdata61MITIE distributed training library