TABLE OF CONTENTS
- Case study: data loss due to overprotection
- Case study: undocumented analysis and personnel change
- Case study: data obtained from local authorities
- Key resources
You have just joined a medical NGO where data protection procedures are especially strict, and for good reason: the operational context is severely degraded. Armed groups are active in the recently destabilised region, and well-established foreign companies in the country are known for predatory practices, including industrial espionage and aggressive data collection to better target consumers. A newly arrived staff member, fresh from dense briefings on data protection and security risks, is made responsible for processing the data from the latest survey, with no instruction other than “prepare the data for analysis.” The staff member completes the task by anonymizing all of the collected data to ensure that no beneficiary is identifiable.
Unfortunately, the identifiable database was needed for the donor audit and for subsequent follow-up of the persons concerned.
- Loss of data, and therefore credibility of the NGO
- Unsuitable subsequent response
- Reputational risk with the donor, potential financial losses for non-compliance with commitments, and future plans with this donor compromised
- Financial losses or misuse of resources if the data needs to be collected once again (and loss of trust of the beneficiaries toward the NGO)
If the data has already been anonymized, it is impossible to go back: the identifying information is permanently and irretrievably lost. Particular caution should therefore be exercised with such procedures, since they involve irreversible change. For example, if you need individualised patient follow-up, anonymization is not a viable solution.
If the procedure has not yet been launched, or if the organisation is still deciding what to implement, it should start with an initial diagnosis. If it needs individualised data, or at least a certain level of granularity for its analysis, it can turn to pseudonymisation or aggregation procedures, which allow differentiated analysis while maintaining a certain level of data protection. If it really does not need personal data, anonymization techniques can be considered, but staff capable of processing this type of aggregated data are then required to draw a relevant analysis (which can be complex in some cases). Leaving files without any processing is generally not recommended, except where no personal or sensitive data is collected.
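As an illustration of the pseudonymisation option, here is a minimal Python sketch. It assumes hypothetical field names (`beneficiary_id`, `name`, `phone`) and a secret key that is stored separately from the data; none of these come from the case study. Direct identifiers are dropped and the identifier is replaced by a keyed hash (HMAC-SHA256), so records belonging to the same person can still be linked for follow-up.

```python
import hashlib
import hmac


def pseudonymise(records, secret_key, id_field="beneficiary_id",
                 drop_fields=("name", "phone")):
    """Return new records without direct identifiers, with the ID replaced by a keyed hash."""
    output = []
    for record in records:
        # Copy every field except the direct identifiers and the raw ID.
        cleaned = {k: v for k, v in record.items()
                   if k not in drop_fields and k != id_field}
        # Keyed hash: the same ID and key always give the same pseudonym.
        digest = hmac.new(secret_key, str(record[id_field]).encode("utf-8"),
                          hashlib.sha256).hexdigest()
        cleaned["pseudonym"] = digest[:16]  # shortened for readability
        output.append(cleaned)
    return output


# Hypothetical survey rows, for illustration only.
survey = [
    {"beneficiary_id": "B-001", "name": "A. Example", "phone": "000",
     "village": "North", "household_size": 5},
    {"beneficiary_id": "B-002", "name": "B. Example", "phone": "111",
     "village": "South", "household_size": 3},
    {"beneficiary_id": "B-001", "name": "A. Example", "phone": "000",
     "village": "North", "household_size": 6},  # follow-up visit
]

# Assumption: in practice the key is generated randomly and kept apart from the data.
key = b"replace-with-a-secret-stored-separately"
pseudonymised = pseudonymise(survey, key)
```

The function returns new records and leaves the originals untouched, so the identifiable file can be kept under restricted access for audit purposes. Pseudonymised data should still be treated as personal data, since anyone holding the key can recompute the pseudonyms and re-link them to individuals.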
Whenever some level of de-identification is required, it is important to think broadly about what could identify someone. With current big data and data harvesting techniques, even data that seems harmless (the quantity of water distributed, priority needs, etc.) can allow people to be re-identified, as long as it refers to a person or to a user group of a certain type.
In any case, this example is a clear case of means poorly matched to ends, a common occurrence in responsible data management. Much the same problem arises when complex technologies or processes (VPNs, sandboxes, etc.) are imposed systematically and indiscriminately across an organisation: some staff respond by circumventing, omitting or avoiding them, which can be far more problematic than the situation the measures were meant to prevent.
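One simple way to spot this risk is to count how many records share each combination of seemingly harmless attributes; combinations held by very few people are the ones most likely to single someone out. The sketch below is only an illustration: the column names, the sample rows and the threshold `k` are assumptions, not values from this guide.

```python
from collections import Counter


def small_groups(records, quasi_identifiers, k=5):
    """Return attribute combinations shared by fewer than k records."""
    counts = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return {combo: n for combo, n in counts.items() if n < k}


# Hypothetical de-identified rows: no names, yet some combinations are rare.
rows = [
    {"village": "North", "age_band": "18-39", "priority_need": "water"},
    {"village": "North", "age_band": "18-39", "priority_need": "water"},
    {"village": "North", "age_band": "18-39", "priority_need": "water"},
    {"village": "North", "age_band": "60+", "priority_need": "shelter"},
    {"village": "South", "age_band": "18-39", "priority_need": "food"},
    {"village": "South", "age_band": "18-39", "priority_need": "food"},
]

risky = small_groups(rows, ("village", "age_band", "priority_need"), k=3)
for combo, n in risky.items():
    print(combo, "appears only", n, "time(s)")
```

Rows flagged this way can be generalised further (wider age bands, larger geographic units) or suppressed before the dataset is shared. The appropriate threshold depends on the context and is a judgment call for the organisation.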
A data analysis was started by one person but could not be finished because that person left the organisation precipitously and a handover was not an option. Unfortunately, the procedure was not documented and as it stands the analysis document has multiple tabs and countless tables and formulas without any explanation. A new person arrives and has to take over the analysis midstream.
- Extension of the estimated time needed to complete the analysis and unnecessary additional costs
- Poor use of resources, waste of time, frustration of teams
- An erroneous or even irrelevant analysis
- Contact the person who left to see if they can provide a memo on what they have already done.
- To avoid compounding wasted time and analytical errors, it may actually be smarter to restart the analysis from scratch rather than trying to understand what has been done.
- If an analysis plan exists, follow it in order to limit unnecessary analyses as far as possible. If it does not exist, develop one, even a succinct one, so that the analysis can be resumed as efficiently as possible.
Systematically develop an analysis plan when building a survey protocol. It provides a framework for an analysis to ensure that it remains relevant to the objectives of a program. By limiting the analysis work to the chosen themes, it makes it possible to follow the evolution of said work. If possible, establish one-off communication during the analysis phase between the program manager and the person performing the data analysis in order to anticipate problems and avoid surprises during the final analysis.
Wanting to avoid collecting data that may already exist, you contact an employee in the social affairs department of the local authority of the region where your project (food security of nomadic populations) will take place. Following a first exchange, your contact person provides you, in an impromptu way, via an intermediary and on an unencrypted USB key, with a database containing a whole range of data. After a quick assessment, you feel that some of this data could indeed be useful for your project. However, the database also contains personal data from collections that appear to have taken place at different times. In addition, there is no metadata, and no information on the collection methods or on whether the people surveyed gave their consent. What should you do?
- For beneficiaries, whose data are shared, a loss of confidence in the action of local authorities
- For your NGO, a loss of credibility with regard to the populations concerned
- Personal and sensitive data accessible by third parties due to an insecure sharing method
- Disclosure of personal and sensitive data that does not comply with humanitarian and data protection principles
- Risk of bias (selection, social or process) present in the data and over which you would have no visibility
- Get in touch with your contact person to establish for what purposes and under what conditions this data was collected. For example, try to determine whether consent was obtained from the people surveyed. If it turns out that no consent was obtained, or if you have any doubt that people were fully informed about the possibility of their data being shared with NGOs working in the region, you should not use this data for your purposes.
As you were not responsible for the collection of this data, there was actually nothing you could have done upstream to avoid this situation. On the other hand, it is now your responsibility not to use this data for your project even though this may force you to organise a new collection. However, the situation can also be an opportunity to raise awareness of personal data protection issues among local authorities and support them so that they develop skills and improve their practices on these subjects. In many countries, the protection of personal data is still perceived as a non-essential topic - despite legislation sometimes being in place - and therefore relegated to the bottom of the list of priorities with very few resources allocated to implement best practices. It may still be possible to involve local authorities when you train your own teams in responsible data management to provide them with awareness training and resources relevant to their work.
Be careful not to assume that, because you were not responsible for collecting this data, it is not incumbent upon you to verify it, or that no one will ever find out. This may be tempting in the interest of efficiency, but in ethical terms it would be quite contrary to humanitarian principles, and the reputational burden and risk it places on the project and the NGO are too great not to be considered carefully.
- The Data Analysis and Data Visualisation toolboxes are useful resources on associated methodological aspects
- This guide from the Urban Institute helps you better understand how to present visualisations responsibly
- Should you have no knowledge of this, we recommend the following reading, to be shared with your colleagues: here and here