Showing results for: [ Natural Language Processing ]
These models are the product of research undertaken under the 2017 APS Data Fellowship program. This collection includes trained MITIE NER models that are ready for use with a standard MITIE library classifier. A summary table of precision, recall and F1 performance statistics is included for future reference.
Legacy data - NER model training - Published 02 Jun 2017
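For readers unfamiliar with the statistics reported in the summary table, the following is a minimal sketch of how precision, recall and F1 are conventionally computed from entity counts. The function name and the example counts are illustrative assumptions and are not taken from the collection.

```python
def precision_recall_f1(true_pos, false_pos, false_neg):
    """Return (precision, recall, F1) from raw counts.

    The counts are hypothetical inputs: entities correctly predicted,
    entities predicted but wrong, and reference entities that were missed.
    """
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    # Guard against division by zero when nothing was predicted or expected.
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    total = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Illustrative counts only: 8 correct, 2 spurious, 2 missed entities.
p, r, f1 = precision_recall_f1(8, 2, 2)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Whether the table in the collection uses entity-level or token-level counts is not stated in this record, so the sketch makes no assumption about that.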
This is one of two collection records. Please see the link below for the other collection of associated audio files.
Both collections together comprise an open clinical dataset of three sets of 101 nursing handover records, very similar to real documents in Australian English. Each record consists of a patient profile, spoken free-form text document, written free-form text document, and written structured document.
This collection contains 3 sets of text documents.
Data Set 1 for Training and Development
The data set, released in June 2014, includes the following documents:
Folder initialisation: Initialisation details for speech recognition using Dragon Medical 11.0 (i.e., i) DOCX for the written, free-form text document that originates from the Dragon software release and ii) WMA for the spoken, free-form text document by the RN)
Folder 100profiles: 100 patient profiles (DOCX)
Folder 101writtenfreetextreports: 101 written, free-form text documents (TXT)
Folder 100x6speechrecognised: 100 speech-recognized, written, free-form text documents for six Dragon vocabularies (TXT)
Folder 101informationextraction: 101 written, structured documents for information extraction that include i) the reference standard text, ii) features used by our best system, iii) form categories with respect to the reference standard and iv) form categories with respect to our best information extraction system (TXT in CRF++ format).
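The structured documents in the last folder are described as TXT files in CRF++ format. As a minimal sketch, assuming the usual CRF++ layout (one token per line, whitespace-separated columns, the label in the last column, and a blank line between sequences), such a file could be read as follows. The function name, the sample tokens and the label names are hypothetical and do not come from the data set; the actual column layout should be checked against the files themselves.

```python
def parse_crfpp(text):
    """Parse CRF++-style text into a list of sequences.

    Each sequence is a list of (features, label) pairs, where features is
    the list of all columns except the last and label is the last column.
    """
    sequences = []
    current = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current sequence.
            if current:
                sequences.append(current)
                current = []
            continue
        columns = line.split()
        current.append((columns[:-1], columns[-1]))
    if current:
        sequences.append(current)
    return sequences


# Hypothetical sample: token, one feature column, then an invented label.
sample = """Bed NN LABEL_A
three CD LABEL_A

Patient NN LABEL_B
"""

for sequence in parse_crfpp(sample):
    print([(features[0], label) for features, label in sequence])
```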
An Independent Data Set 2
The aforementioned data set was supplemented in April 2015 with an independent set that was used as a test set in the CLEFeHealth 2015 Task 1a on clinical speech recognition and can be used as a validation set in the CLEFeHealth 2016 Task 1 on handover information extraction. Hence, when using this set, please avoid its repeated use in evaluation – we do not wish to overfit to these data sets.
The set released in April 2015 consists of 100 patient profiles (DOCX), 100 written, and 100 speech-recognized, written, free-form text documents for the Dragon vocabulary of Nursing (TXT). The set released in November 2015 consists of the respective 100 written free-form text documents (TXT) and 100 written, structured documents for information extraction.
An Independent Data Set 3
For evaluation purposes, the aforementioned data sets were supplemented in April 2016 with an independent set of another 100 synthetic cases.
Legacy data - Generation of synthetic nursing handover data set - Published 21 Mar 2017
This is one of two collection records. Please see the link below for the other collection of associated text files.
The two collections together comprise an open clinical dataset of three sets of 10 nursing handover records, very similar to real documents in Australian English. Each record consists of a patient profile, spoken free-form text document, written free-form text document, and written structured document.
This collection contains 3 × 100 spoken free-form audio files in WAV.
The CSIRO Adverse Drug Event Corpus (Cadec) is a richly annotated corpus of medical forum posts on patient-reported Adverse Drug Events (ADEs). The corpus is useful for studies in information extraction, or more generally text mining, from social media that aim to detect possible adverse drug reactions from direct patient reports.
CLSD 1057.1 Drug Side-Effect Discovery - Text Mining for Pharmacovigilence from Medical Forums - Published 15 Apr 2016