2 Glossary
TABLE OF CONTENTS
- Data
- Data Literacy
- Data Protection
- Primary Data
- Secondary Data
- Information Management
- Data Management Cycle
- Database
- Data collection
- Mobile Data Collection
- Data Quality
- Data Analysis
- Data Visualization
- IT Infrastructure
- Personal Data
- Sensitive Data
- GIS & Mapping
- Geodata
- ICT4D
- Business Intelligence
- Indicator
- Data aggregation
This part of the toolbox often refers to:
- the Data Analysis toolbox which provides an understanding of the basic principles and tricks of data analysis,
- the Data Visualisation toolbox which introduces you to the creation of relevant and effective data visualisations,
- the Responsible Data Management toolbox which provides an accessible and comprehensive overview of the different components, or “pillars”, of Responsible data and Data protection applied within the aid sector,
- the Covid-19 program data toolbox which provides a number of resources for adapting programme data to a physically remote context.
Data
Data is a standalone element that has not been interpreted or put into context. Bringing data together through processing can lead to information. Therefore, data must be interpreted to be informative, and if this information is useful, then it becomes knowledge.
There are many types of data that can be processed into information in the humanitarian and development sphere. Quantitative data expresses a specific ‘quantity’, usually numerical, and has units. For example, ‘the number of people in a household’ is a piece of quantitative data (where the unit is persons) that can be useful to determine the level of support needed in some types of humanitarian programming. Qualitative data is instead focused on the characteristics of something, and cannot be expressed in numbers. An example is information collected through interviews or focus group discussions that can then help in the design of humanitarian or development programming. Further information in relation to defining the essentials of data was published by the Global Disaster Preparedness Center, here.
To go further, see:
- Section 2.1 of the Data analysis toolbox
- Section 4.2 Recognizing basic data types (Reconnaître les types de données de base) of the Data analysis toolbox
Data Literacy
Data literacy is the ability to read, work with and analyze, and form arguments with data. According to the ICRC, data literacy is “the basic skills, knowledge, attitudes, and social structures required for different populations to use data”. Data literacy is the basic requirement and first step for organizations to use data to make evidence-based decisions. As such, improving data literacy within an organization can have a multitude of benefits, from improving collaboration to improving performance and efficiency to promoting accountability and transparency.
For further details, CartONG is publishing a series of blog posts to help organizations better master the topic, including a range of resources, found here.
Data Protection
Data protection includes the processes, systems and practices used to safeguard information from being lost, corrupted, or accessed by unauthorized parties. It refers to the fundamental rights of individuals: the right to data protection is derived from the Right to Privacy.
As stated by ICRC, “Protecting individuals’ Personal Data is an integral part of protecting their life, integrity, and dignity” (Handbook on data protection in humanitarian action). New technologies have allowed for easier and faster processing of personal data, which in turn leads to concerns about intrusion into private lives. In the humanitarian and development sphere, NGOs collect and process personal data to perform humanitarian activities. However, in such environments where the rule of law may not be fully applied, “the protection of Personal Data of beneficiaries and staff is often necessary to safeguard their security, lives and work” (ICRC, Handbook on data protection in humanitarian action, pp. 28-29).
Further elements of definition are available in the glossary of the Responsible Data Management Toolbox.
Primary Data
Primary data is data that has been collected directly by the research team for the purposes of answering one or more research questions. Primary data collection is when field teams collect data directly from beneficiaries or from other related stakeholders. Examples of primary data in development or humanitarian work include all sources of data relating to M&E that are collected by the M&E field teams. Primary data can be either quantitative (for example, data collected through surveys) or qualitative (for example, data collected through interviews or focus group discussions).
Secondary Data
Secondary data is data that has been collected by sources external to the research team. A review of secondary data can be used for many purposes, such as determining the research approach/design, formulating hypotheses and research questions, or triangulating findings developed through the analysis of primary data. Examples of secondary data in the humanitarian and development sphere include reports from other actors/stakeholders, or open data that can be further analyzed in relation to the target beneficiary population.
To go further on this topic, see section 6.7 Share and transfer of the Responsible Data Management Toolbox.
Information Management
Information Management (IM) in the humanitarian and development sector is the set of internal processes an organization undertakes to manage program data. IM therefore includes all stages of a ‘data management cycle’ (see definition below), which is everything needed to produce and use data for humanitarian and development programs. In sum, Information Management supports decision-making processes through the use of data.
A detailed overview of Information Management and how it is related to similar topics can be found in Chapter 2 of the 2020 study produced by CartONG found here.
Data Management Cycle
The data management cycle describes all the specific steps required for effective information management. As such, the data management cycle outlines the different processes required to produce and use data, thereby facilitating effective evidence-based decision-making in humanitarian and development activities.
To go further, see:
- Answer to question 3.1.1 What exactly is Information Management? in the FAQ
- Section 2.3 Where it stands in the data management cycle of the Data Analysis toolbox
Database
A database is “a tool that stores data, and lets you create, read, update, and delete the data in some manner” (ACAPS, 2013, p. 3). Usually, the database is the Excel file that teams use to follow their program data, but a database can take many forms (from paper to an Excel sheet to more complex purpose-built software). Databases are necessary for program data and Information Management because they provide a storage location for project data that can then easily be retrieved, updated, and analyzed to make evidence-based decisions. A database design guide was produced by ACAPS in 2013, found here.
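As an illustration of the four operations named in the ACAPS definition (create, read, update, delete), here is a minimal sketch using Python's built-in `sqlite3` module. The `households` table, its columns and the village names are hypothetical and chosen only for the example; a real program database would be designed around the project's own needs.

```python
import sqlite3


def crud_demo():
    """Run create, read, update and delete on a hypothetical 'households' table."""
    conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
    conn.execute(
        "CREATE TABLE households (id INTEGER PRIMARY KEY, village TEXT, members INTEGER)"
    )

    # Create: insert two example records
    conn.executemany(
        "INSERT INTO households (village, members) VALUES (?, ?)",
        [("Village A", 5), ("Village B", 3)],
    )

    # Read: retrieve the records as stored
    before = conn.execute(
        "SELECT village, members FROM households ORDER BY id"
    ).fetchall()

    # Update: correct the household size for Village A
    conn.execute(
        "UPDATE households SET members = ? WHERE village = ?", (6, "Village A")
    )

    # Delete: remove the Village B record
    conn.execute("DELETE FROM households WHERE village = ?", ("Village B",))

    after = conn.execute(
        "SELECT village, members FROM households ORDER BY id"
    ).fetchall()
    conn.close()
    return before, after


before, after = crud_demo()
print(before)
print(after)
```

The same four operations apply whatever form the database takes, including an Excel sheet; only the way they are carried out differs.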
Data collection
Data collection is the process of gathering data through either qualitative or quantitative methods (for example, focus group discussions or surveys, respectively). Data collection occurs after the development of a research approach and the creation of tools to collect data (e.g. survey tools with questions that can accurately measure the research question). Data collection requires planning and managing the logistics of field teams. It also requires careful attention to a ‘sampling plan’, which is an outline of what data will be collected, from whom, and by whom.
Mobile Data Collection
Mobile data collection (MDC) is the use of mobile technology (mostly smartphones and tablets) to collect data. MDC can be used to improve the quality of data, information, analysis and therefore decision-making. It has been used in the humanitarian and development sector since 2008, alongside the widespread availability of Android devices. There is a breadth of tools that can be used and the tool(s) selected by an organization should be based on a range of factors specific to different needs. To compare different MDC tools, please see the Benchmarking of Mobile Data Collection Solutions produced by CartONG, found here.
The MDC Toolkit produced by CartONG and Tdh, providing information on all stages of an MDC project, can be found here.
Data Quality
A common way of defining the quality of data is in relation to its ‘fitness for purpose’, that is, how well it can be used to make data-driven decisions. Quality is a multi-faceted concept that includes relevance, accuracy, timeliness, accessibility, and comparability of data.
Data relevance refers to the degree to which it meets the needs of the users. Accuracy refers to how well the data describes what it is trying to measure. Timeliness relates to how ‘up-to-date’ the data is, thereby impacting relevance and accuracy. Accessibility refers to the ease with which the data can be obtained. Finally, comparability of data refers to the ability to compare and analyze data in relation to other sources.
More information and other dimensions of data quality can be found in UNOCHA’s Humanitarian Data Exchange Quality Assurance Framework.
Data Analysis
Data analysis is the process of applying techniques to data to discover useful information and support decision-making. After data collection, each separate piece of data can be brought together and through data analysis techniques can provide more useful information. The information that is discovered through data analysis can then be used to facilitate decision-making.
A guide to data analysis in the humanitarian sector produced by ACAPS can be found here.
To go further on this topic, you can open the Data Analysis toolbox.
Data Visualization
Data visualization is the visual representation of information produced through data analysis. After data is analyzed to discover information it can be transformed into tables, charts and graphs to be more easily understood by a wider audience.
To improve your methodological skills in producing graph-type visualizations please see the Data Visualization Toolkit produced by Tdh and CartONG.
To go further on this topic, see:
- the Data Visualisation toolbox,
- section 2.1.3 Data visualisation and analysis of the Covid-19 Program data toolbox.
IT Infrastructure
IT infrastructure is the hardware and software used by an organization, such as servers, networks, security systems, media, software, etc. IT infrastructure is required for effective Information Management, but Information Management is not solely the realm of IT staff: it also depends on program staff. In practice, IM produces and uses data with the available IT infrastructure.
Personal Data
Personal data is any information relating to a natural person (or “data subject”) who can be identified directly or indirectly. More precisely, it includes:
- A name, a picture, a fingerprint or iris scan;
- An identification number, an employee number or an internal registration number;
- A phone or social security number;
- Location data such as a postal address;
- An email address, an online identifier, an IP address;
- A voice recording;
- One or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.
Personally Identifiable Information
Also called “direct identifiers”, Personally Identifiable Information (PII) is specific personal data that can directly identify a person. PII can include data such as a respondent’s name, address, or ID number. Not all personal data is PII; for example, ‘number of persons in the household’ and ‘household income’ are personal data that is not PII, as this information can apply to many households.
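To make the distinction concrete, the sketch below separates the direct identifiers in a survey record from the remaining personal data. The field names and the sample record are hypothetical, and the list of direct identifiers is an assumption made for this example; each organization would need to define its own. Removing direct identifiers does not by itself make data anonymous, since a person may still be identified indirectly.

```python
# Hypothetical field names treated as direct identifiers (PII) in this example
DIRECT_IDENTIFIERS = {"name", "address", "id_number"}


def split_record(record):
    """Split a record into (direct identifiers, other personal data)."""
    pii = {k: v for k, v in record.items() if k in DIRECT_IDENTIFIERS}
    other = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    return pii, other


# Made-up survey record, for illustration only
respondent = {
    "name": "Respondent 1",
    "address": "Example street 1",
    "id_number": "X-0001",
    "household_size": 5,
    "household_income": 120,
}

pii, other = split_record(respondent)
print(sorted(pii))    # the fields that directly identify the respondent
print(sorted(other))  # personal data that is not PII
```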
An overview of existing guidance on the protection of personal data in humanitarian action can be found in OCHA’s 2019 Data Responsibility Guidelines (found here).
To go further on this topic, see:
- Personal data definition in the Responsible Data Management Toolbox Glossary,
- Personal Identifiable Information definition in the Responsible Data Management Toolbox Glossary,
- Pseudonymisation & anonymisation definitions you can find in the Responsible Data Management Toolbox Glossary.
Sensitive Data
Sensitive data is data that can cause harm if disclosed or accessed without proper authorization. Examples may include data in relation to health, race or ethnicity, or affiliation to religious and political groups. Sensitive data could cause harm to a person or have a negative impact on an organization’s ability to carry out its activities. Because humanitarian work is done in diverse contexts, the sensitivity of data and the appropriate safeguards must be determined on a contextual basis.
An overview of existing guidance on the protection of personal data in humanitarian action can be found in ICRC’s 2020 Handbook on Data Protection and OCHA’s 2019 Data Responsibility Guidelines.
To go further on this topic, see the sensitive data definition you can find in the Responsible Data Management Toolbox Glossary.
GIS & Mapping
A geographic information system (GIS) is an Information Management system that includes data related to positions on the Earth’s surface. GIS is used to understand spatial patterns and relationships between different variables through mapping. As such, GIS connects a database to geographic analysis, providing added analysis benefits.
There are multiple applications of GIS to humanitarian and development aid, with some examples shown here. In addition, CartONG has produced a GIS toolkit that both shows applications of GIS in the humanitarian sector and provides tutorials on key GIS applications (found here, only in French).
Geodata
Geodata provides information about locations that is stored in a format used by a geographic information system (GIS) (Esri, 2016). Latitude and longitude are examples of geodata that have a spatial reference. Geodata can be collected by a range of Mobile Data Collection (MDC) tools. Geodata has a wide range of analysis possibilities that can be used in humanitarian and development activities through GIS.
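As one small example of what latitude and longitude make possible, the sketch below estimates the straight-line ("great-circle") distance between two points with the haversine formula. It is a simplified illustration under the assumption of a spherical Earth with a mean radius of 6371 km, not a substitute for a GIS; the function name is chosen for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; the Earth is treated as a sphere


def haversine_km(lat1, lon1, lat2, lon2):
    """Approximate great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# One degree of latitude is roughly 111 km
print(round(haversine_km(0.0, 0.0, 1.0, 0.0), 1))
```

A calculation like this could, for instance, give a rough distance between a surveyed household and the nearest water point, provided both locations were recorded during data collection.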
ICT4D
Information and communications technology for development (ICT4D) is the “practice of using technology to assist poor and marginalized people in developing communities” (Catholic Relief Services). Sometimes called ‘digital development’ (Plan International), ICT4D refers to digital solutions within programs to provide support to target beneficiaries. By comparison, technology used for Information Management relates more to internal, organizational processes.
To provide examples, The Rockefeller Foundation and FHI360 published an Inventory of Digital Technologies for Resilience in Asia-Pacific, found here.
To go further on this topic, see section 2.1.6 General resources on ICT4D and digital initiatives of the Covid-19 program data toolbox.
Business Intelligence
Business intelligence (BI) refers to the strategies and technologies used for data analysis, which in turn transforms data into actionable insights. BI tools access datasets and present data in reports, summaries, dashboards, charts and maps. Tools used for BI in the humanitarian sphere include Power BI, Tableau and R. However, Excel is by far the most commonly used BI software through its use as a database software to automate and create charts, graphs and tables.
The goal of BI solutions is to provide an accessible and holistic picture of data, which in turn allows for a comprehensive understanding as the basis for decisions. Examples in the humanitarian sphere also include the use of dashboards that incorporate GIS and mapping.
Indicator
An indicator is a variable (can be either quantitative or qualitative) providing a measurement or a description reflecting a change, usually related to a project or program. Indicators need to be specific, reliable and relevant, accurate, and easily interpretable. Additionally, they need to be S.M.A.R.T.:
- Specific: the indicator is not common to several items but is specific to the item to be observed;
- Measurable: the data on the indicator needs to be available;
- Achievable: the phenomenon to be observed needs to be achievable in terms of conditions for the implementation of the project;
- Realistic: data needed can be obtained remaining within the resources allocated for the data collection;
- Time-Bound: the period in which the observation is run is clearly defined.
Data aggregation
Data aggregation is any process in which information is gathered and expressed in summary form, for purposes such as statistical analysis. A common aggregation purpose is to get more information about particular groups based on specific variables such as age, profession, or income.
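The sketch below illustrates aggregation on a small, made-up dataset: individual records are grouped by age bracket, then each group is summarised by a count and an average income. The records, the age brackets and the function names are assumptions chosen for the example only.

```python
from collections import defaultdict
from statistics import mean

# Made-up individual records, for illustration only
records = [
    {"age": 17, "income": 0},
    {"age": 25, "income": 120},
    {"age": 34, "income": 180},
    {"age": 61, "income": 90},
    {"age": 70, "income": 60},
]


def age_group(age):
    """Assign an age to one of three hypothetical brackets."""
    if age < 18:
        return "0-17"
    if age < 60:
        return "18-59"
    return "60+"


def aggregate_by_age_group(rows):
    """Summarise records per age bracket: number of people and mean income."""
    incomes = defaultdict(list)
    for row in rows:
        incomes[age_group(row["age"])].append(row["income"])
    return {
        group: {"count": len(values), "mean_income": mean(values)}
        for group, values in incomes.items()
    }


summary = aggregate_by_age_group(records)
for group, stats in sorted(summary.items()):
    print(group, stats)
```

The summary no longer contains individual records, only one line per group, which is what makes aggregated data easier to read and to compare.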