Brighton & Sussex Medical School

Beginners guide to data manipulation for routinely collected data using Python

This page provides introductory Python coding guides covering basic data manipulation tasks commonly needed when working with routinely collected data.

About this resource

These introductory coding resources are provided as Jupyter notebooks with a set of associated example data files. Jupyter is free software, available here: Project Jupyter | Home, or through Anaconda Navigator: Anaconda Navigator | Anaconda.

A useful introduction to using Jupyter notebooks can be found here: How to Use Jupyter Notebook: A Beginner’s Tutorial – Dataquest

The notebooks include annotation, cells containing Python code, and example output. They can be used simply as reference documents, or the code can be edited and run using the example data. Some of the example data provided (Encounters2.csv and medications2.csv) are edited versions of the Synthea synthetic patient data, which are available here: Home | Synthea. The notebooks are intended as a very basic guide to getting started, and include links to further resources.

To run the code in the Jupyter notebooks, download the example data files, then edit the file paths given in the notebooks so that they point to wherever you have saved the data.
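
As a rough illustration of the file path step, the sketch below keeps the data folder in a single variable so that only one line needs editing. It is not taken from the notebooks: the folder location `DATA_DIR` and the helper names `resolve_data_file` and `read_rows` are assumptions made for this example, and it uses only the Python standard library.

```python
import csv
from pathlib import Path

# Hypothetical location: replace this with the folder where you saved the example data.
DATA_DIR = Path.home() / "Downloads" / "example_data"


def resolve_data_file(filename, data_dir=DATA_DIR):
    """Return the full path to a data file, failing early with a clear message."""
    path = Path(data_dir) / filename
    if not path.is_file():
        raise FileNotFoundError(
            f"Could not find {path}. Check that the data folder path is correct."
        )
    return path


def read_rows(filename, data_dir=DATA_DIR, limit=5):
    """Read up to `limit` rows of a CSV file as dictionaries keyed by column name."""
    path = resolve_data_file(filename, data_dir)
    with path.open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # zip stops once `limit` rows have been read, so large files are not fully loaded.
        return [row for _, row in zip(range(limit), reader)]
```

Once `DATA_DIR` points at the right folder, a call such as `read_rows("medications2.csv")` would return the first few rows of that file. Using `pathlib.Path` rather than a hand-typed string avoids the usual problems with backslashes in Windows paths.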

Example data files >

Jupyter notebooks >