Med7: a transferable clinical natural language processing model for electronic health records

Abstract

Electronic health record systems are ubiquitous, and the majority of patients’ data are now collected electronically in the form of free text. Deep learning has significantly advanced the field of natural language processing, and self-supervised representation learning and transfer learning have become the methods of choice, in particular when high-quality annotated data are limited. Identifying medical concepts and extracting information is a challenging task, yet an important ingredient for parsing unstructured data into a structured, tabulated format for downstream analytical tasks. In this work we introduce a named-entity recognition (NER) model for clinical natural language processing. The model is trained to recognise seven categories: drug names, route of administration, frequency, dosage, strength, form and duration. The model was first pre-trained on the task of predicting the next word, using a collection of 2 million free-text patients’ records from the MIMIC-III corpora, and then fine-tuned on the named-entity recognition task.
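
To make the step from unstructured text to a structured, tabulated format concrete, the following is a minimal sketch and not the model itself. It assumes an NER model has already returned character spans tagged with one of the seven categories. The label strings, the `tabulate` helper, the example note and its annotations are all hypothetical and hand-written for illustration; they are not actual model predictions.

```python
from collections import defaultdict

# The seven categories named in the abstract. The exact label strings
# used here are an assumption made for this illustration.
LABELS = ("DRUG", "ROUTE", "FREQUENCY", "DOSAGE", "STRENGTH", "FORM", "DURATION")


def tabulate(text, spans):
    """Group (start, end, label) character spans into one row keyed by label."""
    row = defaultdict(list)
    for start, end, label in sorted(spans):
        if label not in LABELS:
            raise ValueError(f"unknown label: {label}")
        row[label].append(text[start:end])
    # Emit every category, so that missing ones appear as empty lists.
    return {label: row.get(label, []) for label in LABELS}


# Hypothetical note with hand-written annotations standing in for model output.
note = "Aspirin 75 mg tablet, 1 tablet orally once daily for 7 days"
annotations = [
    ("Aspirin", "DRUG"),
    ("75 mg", "STRENGTH"),
    ("tablet", "FORM"),
    ("1 tablet", "DOSAGE"),
    ("orally", "ROUTE"),
    ("once daily", "FREQUENCY"),
    ("for 7 days", "DURATION"),
]
# Locate each phrase at its first occurrence to obtain character offsets.
spans = [(note.index(p), note.index(p) + len(p), label) for p, label in annotations]

row = tabulate(note, spans)
for label in LABELS:
    print(f"{label:<10} {row[label]}")
```

Each note becomes one row with a fixed set of columns, which is the shape that downstream analytical tasks typically expect.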

Andrey Kormilitzin
Senior Researcher

My research is centred around translating advances in mathematics, statistical machine learning and deep learning to address challenges involved in learning, inference and ethical decision making using complex biomedical and health data.

Alejo J Nevado-Holgado
Associate Professor

I am an Associate Professor in the Department of Psychiatry and the Big Data Institute, and part of Dementia Research Oxford. I am very glad to supervise the AI team in the TNDR, formed by 10 excellent machine learners and bioinformaticians. Our focus is on applications of machine learning and bioinformatics to mental health care. I also hold a position at the Big Data Institute, where we collaborate on applying machine learning to genomics and target discovery, and I am a consultant to a number of AI companies.
