Now that storing data is easy and cheap, it can be tempting to keep everything that you collect. Bruce Schneier points out the pitfalls of this approach in his discussion of the recent Sony hack:
> If Sony had had an aggressive data deletion policy, much of what was leaked couldn’t have been stolen and wouldn’t have been published…
>
> Unless there are laws requiring an organization to save a particular type of data for a prescribed length of time, deletion should be the norm.
>
> This has always been true, but many organizations have forgotten it in the age of big data. In the wake of the devastating leak of terabytes of sensitive Sony data, I hope we’ll all remember it now.
Want to know where to get started with a data deletion policy? Two resources that have come out of Responsible Data Forums can help.
- The book Ways to Practice Responsible Development Data, available in the Responsible Data Forum’s resources section, covers how to decide what data to delete at the end of a project, and how to delete it (starting on page 135).
- Geeks Without Bounds’ guide on the Responsible Humanitarian and Disaster Response Project Lifecycle has suggestions on how to close down your project safely (pages 2 and 6 are particularly useful).
Does your organisation have a policy for dealing with data retention? Can you recommend any resources for creating one? Any tips or tricks for what to consider? Let us know in the comments.