Brighton & Sussex Medical School

National datasets

National datasets A–Z

A

Avon Longitudinal Study of Parents and Children – ALSPAC

ALSPAC (also widely known as Children of the 90s) is a world-leading health study following the health and wellbeing of thousands of pregnant women and their families, recruited between 1991 and 1992 in Bristol (UK). The study explores all aspects of health and wellbeing, from obesity and liver disease to air pollution and mental health. As of 2025, three generations of participants are taking part: along with the Children of the 90s and their parents, there is now a new set of babies who are the children of the Children of the 90s.

Detailed information has been collected on these women, their partners, and subsequent children and grandchildren, using self-completed questionnaires, data extraction from medical notes, linkage to routine information systems and hands-on research clinics. Participants contribute their health data in many ways, building a detailed picture of population health.

Explore the ALSPAC data dictionary >

Find out more about ALSPAC and how to access the data in the ALSPAC resource pack >

B

UK Biobank

UK Biobank is a large-scale biomedical database and research resource, containing in-depth genetic and health information from half a million UK participants. It is a major national and international health resource, with the aim of improving prevention, diagnosis and treatment of a wide range of serious and life-threatening illnesses.

A summary of all the information gathered and available for research via UK Biobank can be found in the Data Showcase tab here >

UK Biobank resource pack >

C

Clinical Practice Research Datalink – CPRD

CPRD collects anonymised patient data from a network of GP practices across the UK. Primary care data are linked to a range of other health-related data to provide a longitudinal, representative UK population health dataset. The data encompass 60 million patients, including 18 million currently registered patients, and contain information on diagnoses, symptoms, referrals, prescriptions, test results, and patient health behaviours.

CPRD resource pack >

Clinical Record Interactive Search – CRIS

CRIS allows researchers to carry out projects using information from South London and Maudsley (SLaM) NHS Foundation Trust clinical records. The Trust is a specialist mental healthcare Trust, providing mental health and Improving Access to Psychological Therapies (IAPT) services for four London boroughs: Lambeth, Southwark, Lewisham and Croydon, which have a collective population of around 1.3 million. The main source of data for CRIS is SLaM’s electronic health records system and the equivalent IAPT services system managed by SLaM. These records date back to 2007. Researchers cannot access any patient identifiable information (PII); CRIS data is provided in an anonymised format.

CRIS resource pack >

Community Services Dataset – CSDS 

As a secondary use dataset, CSDS reuses clinical and operational data for purposes other than direct care. CSDS sets out national definitions for the extraction of data about children and adults, including: personal and demographic details, social and personal circumstances, breastfeeding and nutrition, care event and screening activity, diagnoses (including long-term conditions and disabilities), and scored assessments. CSDS information is captured from a wide variety of publicly funded community services in England.

CSDS resource pack >

D

Diagnostic Imaging Dataset – DID

The DID is a central collection of detailed information about diagnostic imaging tests carried out on NHS patients, extracted from local radiology information systems and submitted monthly. It captures information about referral source and patient type, details of the test, demographic information, plus items about waiting times for each diagnostic imaging event.

For further information and a full description of each of the fields available in the DID, please see the data dictionary tab on the webpage here >

DID resource pack >

E

English Longitudinal Study of Ageing – ELSA

ELSA was established in 2002 as a multi-faceted survey to track the dynamics of ageing, with the aim of advancing research and informing policy. ELSA is a unique and rich resource of information on the dynamics of health, social, wellbeing and economic circumstances in the English population aged 50+. The ELSA project has over 18,000 participants and has been successfully running for over 20 years.

During each wave of ELSA, interviewers ask respondents to complete a core self-completion questionnaire covering topics such as wellbeing, relationships, and alcohol consumption. Some ELSA waves have also included a nurse visit, which provides additional information such as physical examination and performance data.

The ELSA sample was selected from three survey years of Health Survey for England data. Households were included in ELSA if they contained at least one adult aged 50 or over who had agreed to be re-contacted at some point in the future. Those who become ELSA participants are interviewed by the study team every two years, and their responses make up the ELSA dataset.

ELSA resource pack >

Emergency Care Dataset – ECDS

ECDS is the national dataset for urgent and emergency care, replacing the previous Accident and Emergency Commissioning Dataset. ECDS collects information about why people attend emergency departments and the treatment they receive. The variables collected in this dataset can be found here >

ECDS resource pack >

G

Genomics England

The Department of Health and Social Care set up Genomics England in 2013 with the aim of transforming healthcare, accelerating research and protecting citizens. Genomics England data is held within a secure Research Environment and currently contains the following types of de-identified data:

  • Genomics data: de-identified genomic sequences, variants and genes from over 100,000 genomes, so you can compare large amounts of data to support your research. You’ll also have access to the de-identified genomics data from over 14,700 Covid-19 study participants.
  • Clinical data: genomics data on its own can’t tell you much without robust patient health data. The Research Environment collates this information so that you can look at the whole picture of what might be causing disease. Further data, such as proteomic and transcriptomic data and digital histopathology, are also being added.
  • Omics samples.

Genomics England resource pack >

H

Health Survey for England – HSE

HSE is a cross-sectional annual survey that monitors trends in the nation’s health and care, collecting information on adults (16+) and children (0–15 years) living in private households in England. The survey aims to help inform and shape health policy and improve health services so that the UK population can stay healthier, for longer.

HSE started in 1991 and has been administered every year since (excluding 2020 due to Covid-19 restrictions). Participants are chosen at random through their postcode, meaning every private household address in England has an equal chance of being included in the survey, ensuring a truly representative picture. Each survey includes core questions, measurements and, with consent, the analysis of biological samples. Participants are interviewed at home and, if consent is given, are followed up by a nurse visit to collect measurements and samples. Around 8,000 adults and 2,000 children take part in the survey each year.

HSE resource pack >

Hospital Episode Statistics – HES

HES is a data warehouse containing details of all admissions, outpatient appointments and A&E attendances at NHS hospitals in England. HES data covers all NHS CCGs in England.

The HES data dictionary can be found here >

HES resource pack >

I

Improving Access to Psychological Therapies Dataset – IAPT

IAPT collects information about people in contact with adult psychological therapy services in England. The IAPT dataset was developed with the IAPT programme as a patient-level, output-based, secondary uses dataset. Data has been collected since 2012 and is a mandatory submission for all NHS-funded care, including care delivered by independent sector healthcare providers. The IAPT dataset provides a comprehensive national picture of the use of IAPT services in England and supports a variety of secondary use functions.

IAPT resource pack >

M

Maternity Services Dataset – MSDS 

MSDS is a patient-level dataset that captures information about activity carried out by Maternity Services relating to a mother and baby, from the point of the first booked appointment until mother and baby are discharged.

MSDS resource pack >

Mental Health Services Dataset – MHSDS 

MHSDS collects data from the health records of individual children, young people and adults who are in contact with mental health services. MHSDS brings together information captured on clinical systems as part of patient care. It covers not only services provided in hospitals but also those provided in outpatient clinics and in the community, where the majority of people in contact with these services are treated. It is mandatory for NHS-funded care providers to submit MHSDS data.

Information on the data submitted to the MHSDS can be found here >

MHSDS resource pack >

N

National Cancer Registration Analysis Service – NCRAS 

The National Cancer Registration Analysis Service (NCRAS) is the population-based cancer registry for England. It is responsible for collecting, quality assuring and analysing data on all patients diagnosed with a primary tumour in England.

NCRAS resource pack >

National Survey of Sexual Attitudes and Lifestyles – Natsal

Natsal is a cross-sectional study currently in its fourth round of data collection, exploring the sexual behaviours and attitudes of people living in Great Britain. Natsal conducts surveys roughly every 10 years. The first was administered in 1990 in response to the emerging HIV/AIDS epidemic, and was followed by Natsal-2 in 2001, Natsal-3 in 2010 and Natsal-4 in 2024. The survey takes place in the participant’s home, administered face-to-face by an interviewer. The computer-assisted interview allows questions of a more sensitive nature to be self-completed by the participant. A biological sample was also included in Natsal-2 and Natsal-3.

The consistent methodology and repetition of the surveys have made it possible to look at differences in sexual behaviour, attitudes and lifestyles over time, capturing the dramatic changes in Britain. Natsal provides evidence of the context, influences and consequences of sexual lifestyles, and the surveys continue to be vital for informing national and international sexual health interventions, strategies, guidelines, and sex and relationship education. Over 45,000 people have taken part in Natsal surveys, spanning those born throughout much of the 20th century. The Natsal project is among the largest and most detailed scientific studies of sexual behaviour in the world.

Natsal data has been deposited with the UK Data Service and is available for registered users. Further details and additional guidance on accessing Natsal data can be found in the resource pack below.

Natsal resource pack >

NHS DigiTrials Service

NHS DigiTrials is a service that aims to help clinical trials progress more easily at every stage, using data that NHS England already collects from health and care organisations across England. It makes trial follow-up data available in one place for linkage to clinical trial data, supporting a better understanding of the longer-term outcomes of new treatments and improving healthcare across the country.

NHS DigiTrials is a relatively new service (started in 2019) powered by NHS Digital to support clinical trials, with the ultimate aim of improving NHS services and access to evidence-based diagnostics, vaccines and treatments that benefit the health of patients. It was developed and supported by a diverse panel, representing the voice of patients and the public. Whilst NHS DigiTrials is a new service, it has already supported over 72 research studies, sent around 23.8 million invitations, and helped recruit over 1.2 million study participants.

NHS DigiTrials service resource pack >

O

Office for National Statistics (ONS) Mortality Data

Mortality statistics in England and Wales are derived from the registration of deaths certified by a doctor or coroner. The data pass through a number of processes before becoming usable for analysis and available through ONS. The ONS has over 128 mortality datasets, with the majority open access and freely available.

Some of the England and Wales datasets available via ONS are:

  • Provisional counts of weekly and monthly death registrations
  • Monthly mortality analysis
  • Provisional rate and number of suicide deaths
  • Provisional quarterly rates and number of alcohol-specific deaths
  • Annual mortality statistics

Variables include: age, sex, area, religious groups, care home residents, place of death, Covid-19, vaccination status, cause of death, homelessness, drug and alcohol abuse, plus many more.

ONS Mortality Data resource pack >

U

UK Data Service – UKDS

UKDS is a service that hosts the largest trusted digital archive of economic, social and population data in the UK. It specialises in data curation, data literacy and actively managing long-term access to high-quality data, transforming social science research, teaching and learning.

To find out more about UKDS, check out the resource pack below.

UKDS resource pack >