Responsible Open Source Investigations for Human Rights Research

When so much is possible, drawing a line between can and should becomes essential

When I started working with open source information I knew it under different names – ‘citizen evidence’, ‘publicly available data’ or ‘user-generated content’. But regardless of what it was called, I was drawn to the idea that some of the same techniques and data sources used by companies and governments to collect information on individuals could be flipped around to expose abuses of power. In theory, all that was needed was an internet connection, some tools and a creative mindset. 

Working on a project called Exposing the Invisible, we highlighted different approaches, technologies and projects that either made use of public data or highlighted ways, both digital and physical, to create data using low-cost and accessible means. I learned about projects that used cameras attached to kites to capture higher resolution imagery of towns where none had existed before, investigations that used flight spotting and tracking websites to track planes being used to clandestinely deport people, and techniques to analyse websites in order to link particular companies. 

At the same time as researchers and tool developers were increasingly testing the boundaries of what was possible, a community was forming around questions of ethics and responsibility in this type of work. The 2016 collaboratively-written DatNav report, for example, featured guidance on navigating some of the important ethical dilemmas of the time, and in 2017 a Responsible Data Forum gathered others interested in these questions.

Nowadays, open source data is used in a range of ways, and in different types of investigation. One of the most well-known of these involves researchers investigating crisis and conflict situations from afar, predominantly through visual data. By knowing how to analyse videos and photographs shared online alongside satellite imagery, researchers are able to determine exact locations, dates and impacts of particular violations. 

The benefits of being able to gain this kind of knowledge are clear, but these new and increasingly widespread capabilities also give rise to significant concerns around issues like what is being covered, what stories and data are being missed, how to gain meaningful consent to publish findings and re-publish content, what to do if consent is not possible (without it, how can you know you aren’t retraumatising those captured in, or those who captured, a video?), who is credited in the investigative process, and who has and does not have access to the investigative techniques, data and tools being used. 

In a chapter I wrote with Zara Rahman for the book Digital Witness, titled Ethics in Open Source Investigations, we explored questions around who is able to do these kinds of investigations and who is getting credit for them, and looked at different elements of harm. Like many others, we don’t believe it’s possible to always ‘do no harm’ (a framework often applied to human rights research) with this type of fast-moving work in an ever-changing landscape, especially when the work involves making decisions for others from afar – but we do need to put extra work into reducing the harm that can be caused.

This applies both to those who have captured evidence or who appear in evidence, and to those working on investigations. In work I’ve done in the past that focused on videos and photographs of human rights violations, I sought out and developed individual strategies to reduce and manage the risk of vicarious traumatisation when viewing distressing content, learning from others in this space such as the Eyewitness Media Hub, Sam Dubberley and Amnesty International’s Digital Verification Corps, who have built this out in their programmes. 

I’ve learnt that while individual strategies are useful, they can only go so far. For those working in organisations, there needs to be an organisational commitment to, and a system-wide approach toward, enabling people to work sustainably with this kind of material. This is the path we’re taking at Human Rights Watch as part of the work of our internal Stress and Resilience Task Force, through devising an organisational approach to safeguarding the wellbeing of our staff and those we work with when working with distressing material. 

While collaborating with the Syrian-led NGO Mnemonic on their Syrian Archive project, I learned the importance of knowing both the context and those who have been doing the work of capturing videos and photographs. The Syrian Archive’s founder, Hadi Al Khatib, often spoke about how essential it was to do open source work in collaboration with those who were working in Syria, and, when possible, those who recorded the videos the organisation was working with.

Documenting human rights violations generally works best, in my experience, when you combine as many different types of data as possible with knowledge of the place and context in which you are working. When it comes to using and publishing open source data, I find that focusing on the source – whether that be the person represented in the dataset you have, the person behind or in front of the camera, the whistleblower, or the person represented in a data leak – can help to guide decisions. 

Looking forward

In a field that relies on technology and publicly available data, what it means to conduct responsible open source investigations is constantly changing. Undoubtedly, I will look back in six years’ time and the types of data available, the ways in which data could be used, and the norms around using that data will have evolved. With thought, care, and clear standards, this evolution can happen in a rights-respecting way.  

Right now, taking a human rights-based approach helps to guide me through these difficult questions, using human rights principles such as proportionality and necessity, alongside other related approaches such as duty of care, harm minimisation, and radical empathy (for more on how this can be applied in open source investigations, see an article I wrote with Sophie Dyer: What would a feminist open source investigation look like?). In working on these questions with Zara Rahman, we used the maxim ‘the ends cannot justify the means’ as a touchstone to help guide the use of these techniques in human rights research.

Some people currently see open source investigations as somewhat magical. This can lead to putting the practice on a pedestal, or believing it lies out of reach – that a person couldn’t do these kinds of investigations themselves. But these techniques should not be seen as magic – they should be reviewed, questioned, and replicated.

Due to restrictions in movement over the past year, much human rights research has of necessity moved online, leading researchers to further develop their skills in working with open sources of data. My hope is for these techniques to become more normalised and less sensationalised, and that they will become part of a researcher’s standard toolbox, ready if and when needed.

The reason I started in this area of work is the same reason I continue – in the hopes that these techniques will be accessible to as wide a group of people as possible, in order to continue to expose human rights abuses. 

Latest Event / View Past Events

/ ON March 29, 2016 / San Francisco

Human Rights documentation

How are new types of data changing the way that we document human rights violations?

A Human Rights-Based Approach to Data

As people increasingly live their lives online, the role of the internet becomes more pressing for human dignity. In the past, for people to eat, they had to physically go to the shops. In today’s world, you simply order a meal online via applications which take your personal information on registration and refresh your memory when you want to make an order. But how safe is data in the hands of social media, tech or telecommunications companies? This article looks generally at data protection in Africa.

From the initial point of registration for getting a SIM card or joining an online platform, it is critical to figure out who holds the power of information. Apart from all the data that is collected by online platforms, in places like Kenya and Zimbabwe mobile money is what sustains economic livelihoods, and telecommunications companies collect large amounts of data from clients who expect that their information is secure.

Data breaches have been witnessed in countries across the globe. A report on the top 15 biggest data breaches of the 21st century lists platforms like Canva, Equifax, and LinkedIn, all of which have had data breaches in the past seven years. Complaints have been made against Google and other tech companies. Data breaches happen without the consent of data subjects, and in violation of their privacy.

While national constitutions prescribe that governments have a duty to protect human rights, business entities have increasingly been part of the threat to privacy. In the absence of domestic laws that secure personal data, violations of privacy remain unchecked.  In some instances, companies have been weaponized by states through disclosure of the troves of information they keep on data subjects. 

But over the past seven years, there has also been steady growth in the development of guidelines for data protection, necessitated by the need to place safeguards on how the private sector utilises the data it collects in a manner that respects human rights. 

South Africa’s Information Regulator, for example, recently called for Facebook to seek consent when making use of information collected from the WhatsApp messaging platform, in a bid to protect the privacy of many users.

The European General Data Protection Regulation (GDPR) came into effect on 25 May 2018 to harmonise data protection laws in Europe. It sets out that the protection of individuals in relation to the processing of personal data is a fundamental right, echoing article 8(1) of the Charter of Fundamental Rights of the European Union and article 16(1) of the Treaty on the Functioning of the European Union, which state that everyone has the right to protection of personal data concerning himself or herself.

The GDPR has set the tone for data protection in Africa, given that most private companies collecting data in Africa are affiliated to Europe, hence falling within the ambit of the regulation. The GDPR promotes the need for informed consent when the data subject has their information collected, and private companies who breach privacy run the risk of being fined, which acts as a safeguard. 

While this is progressive, a more substantial remedy would be found in homegrown regional frameworks. With its slow pace in ratifying the African Union Convention on Cyber Security and Personal Data Protection (the Malabo Convention), Africa is lagging behind.

The treaty, which lays a foundation for data protection in Africa, was adopted in June 2014 – before the GDPR – but it needs to be ratified by 15 countries before it can come into force. In the meantime, it has inspired data protection laws within some African countries.

The common thread between the GDPR and the Malabo Convention is a human-rights-based approach to handling data. Like the GDPR, the Malabo Convention also provides for the punishment of any violations. Basic principles laid out include the principle of consent and legitimacy as well as the principle of transparency of personal data processing. 

Both the GDPR and the Malabo Convention articulate that a data subject has the right to be ‘forgotten’ on a particular platform and to object to the processing of their personal data in certain ways. 

As digital citizenry grows, so too does the risk to privacy and, most pertinently, the need for the protection of personal data. In ensuring adequate data protection in Africa, the following measures must be taken:

  • There is a need for responsible data handling that ensures transparency from private companies on how they process data. As laid out in the Responsible Data community’s Responsible Data Principles, just because data can be used in a certain way does not mean it should be. Private companies must be guided by policies that ensure consent is sought for the processing of information collected.
  • States that have not ratified the Malabo Convention must do so, to bring it into operation.
  • States without data protection laws must enact them and provide deterrent penalties and guidelines to guard against data breaches. 

In Africa, it is critical that data protection is prioritised both by states and by private companies. The Malabo Convention is a comprehensive treaty which needs to be ratified, domesticated and relied upon to regulate the use of personal data by private companies in Africa: a homegrown solution is readily accessible.  

On Echo Chambers and Challenging Assumptions: Responsible Data in Fragile and Conflict Settings

There has been a growing realisation that responsible data practices are important, need improvement, and can be especially critical in fragile and conflict settings. Decision-makers at international organisations in Brussels or New York or Geneva seem to understand that there is a systemic problem with responsible data in humanitarian settings. This is excellent, and is in part due to the diligent work of many members of the Responsible Data community.

But although we now know that technology and data are not neutral, we must also recognise that what is meant by ‘responsible’ data is not neutral either. 

Most of the discourse around responsible data in humanitarian contexts focuses on the relationship between aid providers and direct users or recipients of aid. Local NGOs, civil society and communities are not usually part of the conversation. And unfortunately, for all of the immense value of this community, they are not often part of our conversations, either. This reflects and exacerbates deeply-rooted power imbalances and top-down engagement strategies, reinforcing existing silos and echo chambers. 

Take, for example, the need for additional data in this field. We recognise that to ‘do no harm’ we need to learn what the harms actually are. And we readily admit that so far, we have tended to act based on what we think the risks are. 

As Ben Parker rightly argues, this is cause for data professionals to ask themselves, “Have we even thought of all the possible ways this could go wrong and, if we haven’t, who could help us think it through?” There is a plethora of helpful guidelines on data in humanitarian contexts. They include the Principles for Digital Development, the Harvard Signal Code, UN OCHA’s report on Humanitarian Data Ethics, and the ICRC’s Handbook on Data Protection in Humanitarian Action. But, as Parker argues, guidelines are not enough, and data professionals need to be personally dedicated to a responsible data approach.

And they must also take concrete steps to make sure that their responsible data approach reflects the values of affected communities. To be ‘responsible’, data practitioners must meaningfully engage with civil society or community leaders in humanitarian contexts – especially those who are most impacted by humanitarian data practices – to help them understand all of the possible ways they could go wrong.

So while it is true that there urgently needs to be more research and evidence in this area, these must be grounded in participatory and inclusive methods that center the perspectives of local communities. Otherwise, we cannot arrive at an understanding of what responsible data means in these contexts.

How can we do this in practice? Taking a justice-centered approach, as reflected by innovative work on design justice, applied data justice, or data feminism, is a useful starting point for improving responsible data practice. These diverse approaches have a common base of good practices, including:

  1. Designing collective solutions 
  2. Accounting for structural inequities and power relations 
  3. Focusing on marginalised people and communities whose knowledge and data often get ignored 
  4. Considering the political economy of knowledge production 
  5. Ensuring meaningful participation in decisions, and 
  6. Recognising community-based traditions, knowledge, and practices.

It is imperative to be continuously challenging our own assumptions about what ‘responsible data’ means in fragile and conflict settings by listening to and learning from affected communities. We need to start mainstreaming these concepts into our work, internal and external advocacy, and day-to-day decision-making. 

This likely requires a community re-think, and an opportunity to streamline our definition of what a justice-centered approach to responsible data looks like. It will also require a concerted effort to reach beyond our networks, set aside time and budget for inclusive consultations, and become comfortable with shifting our programmatic plans – sometimes perhaps radically – to incorporate diverse views into responsible data work.

What Real Accountability in the Humanitarian Sector Can Look Like

Humanitarian organisations are bound by a duty to do no harm; however, sometimes in their bid to integrate data technologies into aid provision, they might end up doing more harm than good.

Data breaches are one such instance. When these occur in the humanitarian sector, very sensitive data can end up in the hands of potentially harmful actors. While some humanitarian organisations enjoy immunity from being held legally accountable, most do not; but there is also no common understanding of what ‘accountability’ means, and without one, accountability cannot be meaningful and actionable.

Due to the sensitive nature of the work and the commonly-held belief that humanitarians are ‘doing good’, humanitarian action has not been subject to regulation in the same ways as other sectors. In the absence of legal reporting requirements for data breaches, humanitarian accountability generally flows towards donors and member states, rather than towards beneficiaries of humanitarian aid.

Humanitarian agencies are bound by their charters, their responsibilities as outlined by country agreements, and – for organisations without privileges and immunities – by domestic laws. There is often no requirement to inform beneficiaries that data collected from them or about them has been accessed, without authorisation, by a third party. In the case of organisations with immunity, even if the beneficiaries learn of a data breach, they cannot sue the organisation for damages or reparations unless the organisation waives its immunity, which is exceedingly rare. In the case of organisations without immunity, the relative power of the organisation and a weak rule of law situation in many contexts can be a barrier to meaningful redress.

For the past seven years, the Responsible Data community has grappled with questions of ethics and accountability in the humanitarian sector, demanding that a breach not cause further persecution. Members of the community have mapped responsible data practices in the humanitarian sector, called on international bodies to help create a safe and inclusive digital future, created resource guides for safe and responsible data collection, and formulated shared strategies to uphold data ethics during the COVID-19 pandemic.

Community members have shown how to bring about greater transparency to data partnerships and promote new models of data protection centring economic development. The community has advanced justice and rights-based approaches to responsibly handling beneficiary data and promoted an equity-based framework. Community members have also uplifted children’s rights by highlighting the need for children’s participation to prevent harms across the data life cycle.

Members of the Responsible Data community have called on humanitarian agencies to acknowledge the possible harms caused by humanitarian innovation and the importance of decolonising humanitarian data practices, emphasising the need to protect group data as humanitarian situations often demand.

Now the question is: What can accountability look like?

For humanitarian aid to move towards justice, agencies can address the power imbalance between givers and receivers. As notions of consent evolve in humanitarian contexts, so can notions of accountability. 

Humanitarian aid beneficiaries can be integrated into the process of distributing aid, with real leadership and power in how their data is collected and utilised. In a paper titled “Should international organisations include beneficiaries in decision-making? Arguments for mediated inclusion,” researcher Chris Tenove makes the case that according to the ‘affected interests’ principle, those impacted by governance decisions ought to be included in international organisations’ decision-making processes. In the case of humanitarian data governance, beneficiaries’ normative claim to inclusion would require a ‘mediated inclusion’, wherein representatives can make claims on beneficiaries’ behalf and have a meaningful ability to influence decisions on data collection, processing and storage.

However, even in the absence of a legal requirement, humanitarians have the opportunity to disclose data breaches, offer support to affected data subjects, and promote transparency as they fix systems to maintain trust among beneficiaries. They can share how previous breaches were remedied and demonstrate how they are preparing to mitigate future harms and adopting safer data collection practices, both during and beyond the COVID-19 pandemic.

Humanitarian organisations can set up internal processes where complaints can be received anonymously, to remedy the fear of backlash, and where both individual and group-level concerns are met. The ICRC Data Protection Commission offers an example of what has been done in this area.

To practise accountability to communities, humanitarian agencies can share and publicise privacy impact assessments prior to data collection, alongside their plans for responding to a breach. Agencies can spell out what responsibility looks like and outline actionable consequences for data breaches.

There is also an opportunity to engage in stronger external accountability. Donors can play an important role in setting data policy priorities and increasing compliance with voluntary certification processes that uphold core humanitarian standards. Such processes could include recurring recertification options, with mid-term reviews and final evaluations by third parties.

Since the current accountability mechanisms still depend a great deal on individual action, there is an opportunity for internal and external bodies to take suo motu, or unprompted, action to review and report on a data breach. Community members have also recommended the creation of an independent investigatory body to examine the extent of legal harm caused by a data breach.

The humanitarian sector still relies on self-regulation and internal compliance. For this reason, grassroots-led accountability efforts can provide essential oversight and counterbalance (as Jennifer Easterday points out in her contribution to this collection). The Responsible Data community – composed of activists, scholars, and other experts – has provided useful guidance and support to humanitarian organisations’ processes over the years. Onwards and upwards to many more years of responsible and equitable data practices!
