Linking electronic health data to identify relationships between physical and mental health

The Question

Does the presence of post-stroke depression affect overall recovery?

Can we discover more about this relationship by using machine learning techniques to ‘read’ medical notes which describe brain scan data?

Principal Investigator: Dr Heather Whalley


Brain scan data

Brain scans are often done on stroke patients to assess the position and extent of the blockage or bleed. Doctors also look for changes in brain tissue and blood vessels. Each of these can lead to physical or psychiatric symptoms in the patient.

Post-stroke depression

Recent research has shown that one of the strongest predictors of recovery, mortality and eventual quality of life for stroke patients is the presence or absence of post-stroke depression.

Correspondingly, if improvements are seen in depressive symptoms, physical symptoms also tend to improve.

So perhaps doctors should focus more on treating post-stroke depression?
Or perhaps a stroke should be seen as a risk factor for developing depression or other psychiatric conditions?


The opportunity

Analysing written health records

When a hospital doctor looks at a brain scan, they write detailed notes in the patient’s electronic health record. These written notes contain a wealth of untapped data, which could be used for mental health research.

However, it is very time-consuming to read these notes and convert them into a useful format for data analysis. This is why recent advances in computing techniques, which allow computers to do this work for us, are very exciting.

What is Machine Learning?

Machine learning means that computers use the data they are given to teach themselves how to do tasks, how to recognise patterns and how to make decisions. Machine learning makes it possible for computing systems to become ‘smarter’ as they encounter additional data.

For our researchers, this means they give the computer an example of the data as a starting point (training data). Once the computer has found patterns in this training data, it will know what to look for in any similar dataset it is given. 
Our researchers then examine and interpret these data patterns, by comparing them to currently known facts about that health condition and patterns found by other research methods (e.g. data linkage).
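To make the idea of training data more concrete, here is a minimal sketch in Python. It is not the method used in this project: the four example reports and their labels are invented for illustration, and the simple word-counting approach (a naive Bayes classifier) is just one of many possible techniques. The function names `train` and `predict` are our own.

```python
from collections import Counter, defaultdict
import math

# Invented training data: (report text, label) pairs. Not real patient data.
TRAINING_DATA = [
    ("acute infarct in the left middle cerebral artery territory", "ischaemic"),
    ("established infarct in the right occipital lobe", "ischaemic"),
    ("acute haemorrhage in the right basal ganglia", "haemorrhagic"),
    ("large haemorrhage with surrounding oedema", "haemorrhagic"),
]

def train(examples):
    """Count how often each word appears under each label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def predict(text, word_counts, label_counts):
    """Return the label with the highest naive Bayes log-score."""
    vocabulary = {w for counts in word_counts.values() for w in counts}
    total_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_examples)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing so an unseen word does not rule a label out.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(TRAINING_DATA)
print(predict("small acute infarct in the left thalamus", *model))
```

With only four examples the patterns it finds are very crude, which is why real projects need much larger sets of carefully labelled training data.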

Example: Natural Language Processing

Imagine a researcher is looking at a brain scan and reading the accompanying report. Their job is to decide what type of stroke was diagnosed by the doctors. They could do this relatively easily for one report or even a series of reports, but what if they wanted to identify all the patients in Scotland with that particular type of stroke? This would be very time-consuming.

However, computers can be taught to analyse language and process large data collections. They can learn that words like ‘haemorrhage’ and ‘infarct’ relate to stroke. They can also learn which other words often appear near them (e.g. ‘established’ versus ‘acute’, referring to whether the stroke happened a long time ago or recently).

Once the computer has learnt such expressions from the training data, it can recognise them (and other similar phrases) in any new set of reports it is given, even if it has never seen them before.
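The idea of looking at nearby words can be sketched in a few lines of Python. This is a hand-written rule rather than something the computer has learnt, and it is only an illustration: the word lists, the three-word window and the function name `find_stroke_mentions` are assumptions made for this example, and the report text is invented.

```python
import re

# Assumed vocabulary, based on the example words mentioned above.
STROKE_TERMS = {"haemorrhage": "haemorrhagic", "infarct": "ischaemic"}
TIMING_TERMS = {"acute": "recent", "established": "old"}

def find_stroke_mentions(report, window=3):
    """Return a (stroke_type, timing) pair for each stroke term found.

    Timing comes from the nearest timing word within `window` words
    before the stroke term, or None if there is no such word.
    """
    words = re.findall(r"[a-z]+", report.lower())
    mentions = []
    for i, word in enumerate(words):
        if word in STROKE_TERMS:
            timing = None
            # Search backwards, starting with the closest preceding word.
            for previous in reversed(words[max(0, i - window):i]):
                if previous in TIMING_TERMS:
                    timing = TIMING_TERMS[previous]
                    break
            mentions.append((STROKE_TERMS[word], timing))
    return mentions

# Invented report text, not real patient data.
report = (
    "There is an acute infarct in the left frontal lobe. "
    "Established right cerebellar infarct noted."
)
print(find_stroke_mentions(report))
```

A rule this simple misses variations such as ‘infarction’ or misspellings, which is one reason learnt models are attractive for large collections of reports.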

How will we use machine learning?

Our researchers will use Natural Language Processing technology to ‘read’ the medical notes of stroke patients and investigate the relationship between mental health and recovery after a stroke.


Research Planned

Analyse notes on brain scans

Using electronic health records, we will identify patients who have had a brain scan after their stroke. We will use natural language processing and supervised machine learning to convert their doctor's notes into structured (useable) data. So instead of having lots of words which describe the patient's signs and symptoms and their diagnosis, the computer will output a series of numbers in a table which ‘code’ for this data.
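As a rough picture of what such a table of numbers might look like, the Python sketch below turns two invented reports into rows of 0s and 1s. The coding scheme (one column per term), the patient IDs and the function name `encode_report` are all hypothetical; the project's actual coding will be produced by trained models and is likely to be much richer.

```python
import csv
import io

# Hypothetical coding scheme: each column is 1 if the term appears, else 0.
COLUMNS = ["haemorrhage", "infarct", "acute", "established"]

def encode_report(patient_id, report):
    """Turn free text into one row: the patient ID followed by 0/1 codes."""
    text = report.lower()
    return [patient_id] + [1 if term in text else 0 for term in COLUMNS]

# Invented reports and IDs, not real patient data.
reports = {
    "P001": "Acute infarct in the left frontal lobe.",
    "P002": "Established haemorrhage in the right thalamus.",
}

rows = [encode_report(pid, text) for pid, text in reports.items()]

# Write the table in CSV form so it can be opened in analysis software.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["patient_id"] + COLUMNS)
writer.writerows(rows)
print(buffer.getvalue())
```

Simple term matching like this cannot tell ‘haemorrhage’ from ‘no haemorrhage’, so it only shows the shape of the output, not how the coding should be done.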

Combine with electronic health records

We will then combine this structured brain scan data with clinical information about the patient’s mental health.
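Combining the two sources amounts to matching records that belong to the same person. A minimal Python sketch, assuming both datasets share a patient identifier, is shown below. The IDs, field names and the function `link_records` are made up for this example, and real data linkage involves additional safeguards that are not shown here.

```python
# Hypothetical structured scan data, keyed on a shared patient ID.
scan_table = {
    "P001": {"infarct": 1, "haemorrhage": 0},
    "P002": {"infarct": 0, "haemorrhage": 1},
}

# Hypothetical clinical records about mental health, keyed the same way.
depression_records = {
    "P001": {"depression_diagnosis": 1},
    "P003": {"depression_diagnosis": 0},
}

def link_records(scans, clinical):
    """Keep only patients found in both datasets and merge their fields."""
    return {
        pid: {**scans[pid], **clinical[pid]}
        for pid in scans
        if pid in clinical
    }

linked = link_records(scan_table, depression_records)
print(linked)
```

Only patients who appear in both datasets end up in the combined table, so in this example P002 and P003 are left out.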

Identify patterns and links

With our research, we hope to identify patterns and links between what has been seen on a brain scan and whether or not someone develops depression.

If we are successful, we hope that this technique could be used for many other conditions such as traumatic head injury, Alzheimer’s or Parkinson’s disease.



  1. Chemerinski E et al. Stroke; a journal of cerebral circulation 32, 113-117, (2001).


Useful Links

Related Videos


How Can Computers Understand Human Language?
Natural Language Processing Explained



Resilience and Depression:
STRADL Project



James Boardman
Growing up following premature birth
