5.5.1 Key considerations
16-Sept-2022 5 mins
The purpose of this section is to help you build a form that keeps the captured data concise and makes the collected data as “ready to use” as possible for analysis. Keep in mind that “there is no such thing as common sense”.
- Check whether some of the data is already available in another database. If so, you can either avoid collecting it in this survey or reinject it directly into your submissions through calculated values, using “external CSV” features, to reduce errors and limit data entry
- Consider any multimedia that might be relevant for your operational understanding, quality control or accountability/reporting (a photo of the infrastructure to double-check its categorisation, a GPS point, an audio recording of the interviewee, a drawing…)
- Apply exhaustive data validation logic. There should never be open numeric fields: maximum and minimum constraints can nearly always be derived from secondary data and/or piloting/experience. You can also use regex constraints; limit the number of characters entered in text fields if it makes sense; ensure that multiple-option answers are consistent (i.e. that a minimum or maximum number of options has been selected, that “none” has not been selected at the same time as another option, etc.); and check that a selected date is in the right range (past or future).
- Make as many questions mandatory as you can, but only questions that can be filled in in 100% of cases (e.g. do not make a GPS point mandatory, as capturing it depends on how reliably your mobile device supports this feature)
- Verify that the mandatory single-choice/multiple-option questions all have the necessary safeguards (“Don’t know”, “Unknown”, “Other”, “N/A”, “None of the above”, “Refuse to answer”, etc.) to avoid forcing your enumerator to enter incorrect information in order to submit the data.
- Use multiple-answer questions only where it really makes sense and where you are sure you have the capacity to analyse them (they can be more complex to analyse than single-option questions). Consider using “ranking” questions, which give you a weighted understanding of the responses (such as “What is your primary source of water?” / “What is your secondary source of water?”…)
- Set up the necessary calculations for your indicators directly in the form to avoid having to set them up later in the analysis tools
- Reflect on ways to triangulate data on key indicators to double-check its plausibility (e.g. showing the result of calculations to the enumerator directly in the field and asking for confirmation, or adding alert messages in case of possibly inconsistent data)
- Add any necessary tips (hints, customised constraint messages, audio explanations) for all difficult definitions, notions, jargon, acronyms and measurement units, or provide an enumerator’s guide if necessary
- Group questions, add section titles and use colour in your labels to help the enumerator understand the form visually
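The validation rules listed above (numeric ranges, a regex constraint, multiple-option consistency and a date range) can be sketched in Python as follows. This is only an illustration of the logic, not the syntax of any particular form tool: the field names, the limits and the phone pattern are all hypothetical and would need to be derived from your own secondary data or piloting.

```python
import re
from datetime import date

# Hypothetical limits; in practice derive them from secondary data or piloting.
HOUSEHOLD_SIZE_RANGE = (1, 30)
PHONE_PATTERN = re.compile(r"\+?[0-9]{8,15}")  # assumed phone format
MAX_WATER_SOURCES = 3


def validate_submission(answers, today=None):
    """Return a list of problems; an empty list means the submission passes."""
    today = today or date.today()
    problems = []

    # Numeric field: never left open, always bounded.
    low, high = HOUSEHOLD_SIZE_RANGE
    if not low <= answers["household_size"] <= high:
        problems.append(f"household_size must be between {low} and {high}")

    # Text field: regex constraint.
    if not PHONE_PATTERN.fullmatch(answers["phone"]):
        problems.append("phone must contain 8 to 15 digits")

    # Multiple-option field: "none" is exclusive, and the count is bounded.
    sources = answers["water_sources"]
    if "none" in sources and len(sources) > 1:
        problems.append("'none' cannot be combined with another option")
    if not 1 <= len(sources) <= MAX_WATER_SOURCES:
        problems.append(f"select between 1 and {MAX_WATER_SOURCES} water sources")

    # Date field: a visit cannot take place in the future.
    if answers["visit_date"] > today:
        problems.append("visit_date cannot be in the future")

    return problems
```

In a real form these checks would be written as constraints on each question, each with its own customised constraint message, so that the enumerator is alerted in the field rather than after submission.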
Identify all your free-text questions and check whether it would be more relevant to offer a list of options with the “other” safeguard instead (administrative information, names of enumerators…)
- Plan preparatory work, such as Focus Group Discussions, if you are in doubt about the possible list of answers in the given context
- Avoid open-ended questions, in particular for data that you want to analyse quantitatively or use for disaggregation (this also saves the enumerator time when filling in submissions and reduces interviewee fatigue)
- Keep in mind that using combined qualitative and quantitative methods can be very useful, but it might not be relevant to combine the methods in electronic data collection
- Give your form a title that speaks for itself and that you are sure will not create misunderstandings for enumerators (with the relevant information such as thematic component, location, year, etc.)
- Include the necessary project metadata (start date and time of the submission, intermediary time stamps, IMEI number of the phone used, etc.). Refer to section 5.6.6 Quality control.
- Look into building a unique identifier for your dataset if it is relevant to your project, and decide how best to capture it (calculated automatically from other information, selected from a list, filled in manually with advanced constraints, scanned from a barcode or QR code, etc.)
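As an illustration of a unique identifier “calculated automatically based on other information”, here is a minimal Python sketch. The components (a P-code, an enumerator code, the visit date and a sequence number), the assumed P-code pattern and the function name `build_record_id` are hypothetical choices made for this example; pick the components that are actually unique in your own project.

```python
import re
from datetime import date


def build_record_id(pcode, enumerator_code, visit_date, sequence):
    """Combine answers already captured in the form into one identifier."""
    # Assumed P-code shape: two capital letters followed by digits.
    if not re.fullmatch(r"[A-Z]{2}[0-9]+", pcode):
        raise ValueError(f"unexpected P-code format: {pcode!r}")
    # Keep the sequence within three digits so identifiers have a fixed width.
    if not 1 <= sequence <= 999:
        raise ValueError("sequence must be between 1 and 999")
    return f"{pcode}-{enumerator_code}-{visit_date:%Y%m%d}-{sequence:03d}"


record_id = build_record_id("ML0101", "E07", date(2022, 9, 16), 3)
print(record_id)  # ML0101-E07-20220916-003
```

Building the identifier from validated fields means that a typo in one component is caught by that field's own constraint instead of silently producing a wrong identifier.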
Try to integrate standards as much as possible. For example:
- If you are using administrative data, integrate standard and interoperable P-codes (place or position codes) that will remain understandable if you need to link your dataset with other past or future data collections. Refer to section 5.7.3 Managing administrative data - P-codes. (link to HR post on the question).
- Try to use HXL standards to make your data more interoperable and more easily understandable to someone else.
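In practice, HXL works by adding a row of hashtags directly beneath the column headers of the exported data. The Python sketch below shows the idea on a small export. The column names and the sample row are hypothetical, and the mapping to hashtags is an assumption to check against the HXL hashtag dictionary for your own dataset.

```python
import csv
import io

# Assumed mapping from (hypothetical) form column names to HXL hashtags.
HXL_TAGS = {
    "admin1_pcode": "#adm1+code",
    "admin1_name": "#adm1+name",
    "visit_date": "#date",
    "people_reached": "#reached",
}


def to_hxl_csv(rows):
    """Write rows as CSV with an HXL hashtag row beneath the header row."""
    columns = list(HXL_TAGS)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)                         # human-readable headers
    writer.writerow([HXL_TAGS[c] for c in columns])  # HXL hashtag row
    for row in rows:
        writer.writerow([row[c] for c in columns])
    return buffer.getvalue()


sample = [{"admin1_pcode": "ML01", "admin1_name": "Kayes",
           "visit_date": "2022-09-16", "people_reached": 120}]
print(to_hxl_csv(sample))
```

Because the hashtags are carried in the file itself, someone receiving the dataset can identify the P-code and date columns without needing the original form definition.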
Don’t hesitate to refer to Module 5: Making data useful, useable and shareable of the Data Playbook, in particular:
- Exercise 5 - 5 Generating a Data Quality Checklist, which could be very useful when thinking about how to code the form correctly.
- Exercise 5 - 6 Data Quality: Opportunities and Barriers, which could also be relevant to understand what can be improved (in terms of resources, skills, time, etc.) to achieve better data quality.
- The 5 - 7 Data Quality Workflows presentation on page 34, which clarifies why quality is important for making evidence-based decisions and what the dimensions of data quality are.