Save the Titanic: Hands-on anonymisation and risk control

January 29, 2015

Interested in how to publish data without passing on people’s personal information? Try the Open Data Institute’s walkthrough guide, developed from a session at the Open Knowledge Festival last year. It allows you to download passenger data from the Titanic and test out a few practical ways in which the data might be used.



The key trade-off is between protecting data subjects’ privacy and keeping the data useful. Even after you remove direct identifiers (such as the passengers’ names), other variables in the data can still be used, alone or in combination, to identify a person.

The most common variables that are not direct identifiers but still carry a high risk of re-identification are:

dates (e.g. birth, admission, discharge, …)
geolocators (e.g. post codes, spatial data)
unusual education (e.g. PhD in statistical disclosure control procedures)
unusual occupation (e.g. organiser of the OK Festival).
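
One rough way to see how risky such variables are is to count how many records share each combination of their values: a combination held by only one record points to a single person. The Python sketch below illustrates that idea (often called k-anonymity) on a handful of made-up records. The field names, the values and the helper functions `smallest_group` and `age_band` are assumptions for illustration; they are not taken from the guide or from the real passenger list.

```python
from collections import Counter

# Hypothetical records: names are already removed, but indirect
# identifiers (age, sex, passenger class) are still present.
records = [
    {"age": 22, "sex": "male", "pclass": 3},
    {"age": 24, "sex": "male", "pclass": 3},
    {"age": 27, "sex": "male", "pclass": 3},
    {"age": 31, "sex": "female", "pclass": 1},
    {"age": 35, "sex": "female", "pclass": 1},
    {"age": 38, "sex": "female", "pclass": 1},
]


def smallest_group(rows, fields):
    """Return k, the size of the smallest group of rows that share
    the same values for the given fields."""
    counts = Counter(tuple(row[f] for f in fields) for row in rows)
    return min(counts.values())


def age_band(age, width=10):
    """Generalise an exact age into a band such as '20-29'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


fields = ["age", "sex", "pclass"]

# With exact ages every combination is unique, so k is 1.
print(smallest_group(records, fields))

# Replacing exact ages with ten-year bands makes records share
# combinations, so each person hides in a group of three.
generalised = [dict(row, age=age_band(row["age"])) for row in records]
print(smallest_group(generalised, fields))
```

Banding the ages raises k from 1 to 3 here, at the cost of losing the exact ages, which is the privacy versus usefulness trade-off described above in miniature.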

The guide has a series of other practical examples to think about.

About the contributor

Tom started out writing and editing for newspapers, consultancies and think tanks on topics including politics and corruption in sub-Saharan Africa and Asia, then moved into designing and managing election-related projects in countries including Myanmar, Bangladesh, Rwanda and Bolivia. After getting interested in what data and technology could add in those areas and elsewhere, he made a beeline for The Engine Room. Tom is trying to read all of the Internet, but mostly spends his time picking out useful resources and trends for organisations using technology in their work.
