Precision Health

Precision Health: Population Health Data Science

Addressing health inequalities using population-level data to predict, prevent and manage ill health in diverse groups of people

About the programme

‘Precision Health: Population health data science’ is a Barts Life Sciences research programme, co-led by Queen Mary’s Clinical Effectiveness Group (CEG) and Barts Cancer Institute – Centre for Cancer Biomarkers and Biotherapeutics. The research has initial funding of £5.7m from Barts Charity to develop new ways of using health data for the benefit of patients and the public.  

Our researchers are analysing data from the patient health records held by GP practices and hospitals, and data derived from tissue samples, imaging and genomics. We are studying disease risk factors and outcomes on a population scale to find new ways of preventing, diagnosing and managing chronic diseases in people who share the same characteristics. The long-term aim is to improve population health equity by enabling clinicians to tailor their approaches for specific patient groups. 

Population Health Data Science Seminar Series

Join us for examples of research excellence in health data science from across the Wolfson Institute of Population Health, Barts Cancer Institute, Queen Mary, Barts Health and beyond.


Current research

  • Stratified care for women with breast cancer
    Leads: Professor Claude Chelala, Professor Louise Jones  |  Team: Dr Dayem Ullah, Dr Vivek Singh
    Professor Chelala’s team are preparing research-ready data from primary and secondary care records, plus data from imaging and sequencing, for breast cancer patients at St Bartholomew’s Hospital who are dually consented to the Breast Cancer Now Tissue Bank and the 100,000 Genomes Project (Genomics England). The resulting research will guide stratified care for women with or at risk of breast cancer and will be a proof-of-concept exemplar to be validated on the entire 100,000 Genomes Project national breast cancer cohort. This will help us ensure we have the most relevant and targeted care for our population.
  • Addressing inequalities in global cardiometabolic conditions
    Lead: Professor Rohini Mathur  |  Team: Dr Moneeza Siddiqui
    Led by Professor Mathur, this work will harness local, national and global data to inform new approaches to managing cardiometabolic conditions in globally diverse populations. The team are using clinical, genetic and cohort data on risk factors, trajectories of progression and response to treatment to: i) develop advanced statistical methods for causal inference from observational data, ii) identify novel genetic and clinical determinants of disease risk, and iii) work closely with UK and international partners to address local healthcare needs. The resulting research will help reduce inequalities, improve healthcare resource allocation and inform health policy.
  • Statistical machine learning for life-course health
    Lead: Professor Jianhua Wu  |  Team: Dr Harriet Larvin, Dr Paris Baptiste
    Professor Wu and his team are generating insights into life-course health and inequalities in cardiovascular diseases and oral health. The team are using electronic health records and other longitudinal and multimodal data resources to: i) explore the application of statistical machine learning to predict health outcomes over the life-course, based on demographic, clinical and lifestyle data; ii) develop and apply statistical machine learning to integrate multimodal longitudinal health data from multiple sources to identify risk factors and patterns of disease trajectories; and iii) investigate the factors driving inequalities and progression for multiple long-term conditions. This work will generate actionable health analytics and population-level intervention strategies to inform health policy and improve patient health and care.

Data sources

The programme has trusted access to national and local health datasets, including primary and secondary care data from North East London, one of the most ethnically diverse and economically disadvantaged NHS regions in the UK. We are using pseudonymised phenotypic data (which includes patient demographics and symptoms as recorded in the patient health record) and biological data (derived from tissue samples, scans and genomics).

  • Barts Health NHS Trust Precision Medicine Platform - A new resource that will utilise one of the largest, richest and most complex datasets from a diverse community within the NHS to improve research and clinical outcomes.
  • Clinical Practice Research Datalink (CPRD) - Anonymised patient data from a network of GP practices across the UK. It includes primary care data linked to a range of other health-related data to provide a longitudinal dataset representative of the UK population. About Queen Mary's access to CPRD.
  • Discovery Data Service – An NHS-hosted data service that brings together electronic health record data from GP practices, hospitals, community care, urgent care, social care and public health settings in London.
  • Breast Cancer Now Tissue Bank (BCNTB) - The UK’s largest unique collection of high-quality breast tissue, breast cells and blood samples from breast cancer patients.
  • Pancreatic Cancer Research Fund Tissue Bank (PCRFTB) – The UK’s largest unique collection of samples of tissue, blood, saliva and urine from people with pancreatic cancer and other diseases of the pancreas, alongside samples from first degree relatives of patient donors and other healthy volunteers.
  • UK Biobank - In-depth genetic and health information from half a million UK participants.
  • 100,000 Genomes Project – A Genomics England initiative that sequenced 100,000 genomes from 85,000 NHS patients affected by rare disease or cancer.
  • Cancer Imaging Archive - A US-based resource of de-identified and highly curated radiology and histopathology imaging, including related data such as patient outcomes, treatment details, genomics and pathology.
  • Genes and Health - A long-term study of 100,000 people of Bangladeshi and Pakistani origin, led by colleagues at Queen Mary. The dataset links genes with health records to study diseases and treatments.


Programme leads: Professor Carol Dezateux, Dr John Robson (Clinical Effectiveness Group); Professor Claude Chelala, Professor Louise Jones (Barts Cancer Institute).

Programme Manager: Mary Thomas

