Mobile Data Collection toolbox

2.3.2 Consequences of a badly run MDC




As we’ve seen, mobile data collection (MDC) offers numerous advantages over traditional paper-based collection. However, a badly run MDC can be counter-productive, so you shouldn’t jump into MDC if you don’t have the time or skills needed to make it a success. Beyond the very serious data protection risks of collecting data you should not collect, failing to secure your data collection tools, or following dubious ethical data collection practices (see also the Responsible data management toolbox on this), below are examples of situations you may find yourself in that can be problematic in terms of data quality or efficiency.

Please note that specific solutions exist for each issue and can be found in later sections of the MDC toolbox.

Data quality issues

Description of the issue, followed by an example:
Poorly developed constraints cause input errors in the field.
  • Unnoticed errors, such as an extra « 0 » added to an integer by mistake or through a misunderstanding of the unit.
  • An integer that shouldn’t be negative and has been input as negative by mistake.
Non-adapted question skipping. A group of questions that should apply only to children aged 0 to 5 years is displayed in all cases, leading investigators to believe they have to answer it for everyone.
No data triangulation to detect potential errors. If an answer seems surprising, ask the investigator to double-check it (e.g. through redundancy: asking the same question in a different way), or to cross-check two answers that seem incompatible.
Incorrect data input due to the lack of a more adapted answer. Forgetting to add options such as « other » (with a possibility to specify what other means afterwards), « do not know », « none » or « not applicable » in a mandatory question.
Lack of automatically saved metadata. The date and time at which the form was filled in are missing. This makes it hard to track how long the form takes to fill in, and to verify that the date on the phone is the current date, which matters especially when age calculations are based on it.
Lack of translation or neutrality in the questions, leading to misunderstanding by the field teams or survey respondents (and so to misinterpretations). This can matter when the way a question is asked can influence the answer: it is always better to ask the question in an identical, neutral, unambiguous manner, in a specific language (to prevent interpreters from translating questions freely).
Information is entered as free text rather than in a structured way (i.e. lists), making data analysis difficult. List of villages with spelling errors, similar names, accents, etc. due to free-text entry (even though such information could easily be structured with a list of answers) → implies time-consuming data cleaning and sometimes makes it impossible to link results to other databases (other thematic databases, map databases…).
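
The constraint and skip-logic issues listed above can be illustrated outside any specific tool. The following is a minimal Python sketch, not the way KoBoToolbox or any other MDC tool implements these checks: the `Question` class, the `check` function, the question names and the numeric ranges are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Question:
    name: str
    # Returns True when the answer is acceptable (the "constraint").
    constraint: Optional[Callable[[int], bool]] = None
    # Message shown to the investigator when the constraint fails.
    message: str = ""
    # Returns True when the question should be displayed (the skip logic).
    relevant: Optional[Callable[[dict], bool]] = None


def check(questions, answers):
    """Return a list of (question name, problem) pairs for one submission."""
    problems = []
    for q in questions:
        shown = q.relevant is None or q.relevant(answers)
        value = answers.get(q.name)
        if not shown:
            # A skipped question should not carry an answer.
            if value is not None:
                problems.append(
                    (q.name, "answered although the question should have been skipped")
                )
            continue
        if value is not None and q.constraint is not None and not q.constraint(value):
            problems.append((q.name, q.message))
    return problems


def is_young_child(answers):
    """Skip logic: only true for respondents aged 0 to 5 years."""
    age = answers.get("age_years")
    return age is not None and 0 <= age <= 5


# Hypothetical form: the ranges are illustrative assumptions, not recommendations.
FORM = [
    Question("age_years", constraint=lambda v: 0 <= v <= 120,
             message="Age must be between 0 and 120 years."),
    Question("household_size", constraint=lambda v: 1 <= v <= 30,
             message="Household size must be between 1 and 30."),
    Question("child_weight_kg", constraint=lambda v: 1 <= v <= 40,
             message="Weight is in kilograms (1 to 40).",
             relevant=is_young_child),
]

# An extra "0" typed by mistake (50 instead of 5) is caught by the range constraint.
print(check(FORM, {"age_years": 3, "household_size": 50, "child_weight_kg": 12}))
# A child-only question filled in for an adult is flagged by the skip logic.
print(check(FORM, {"age_years": 30, "household_size": 5, "child_weight_kg": 70}))
```

A range constraint with an explicit message catches both the accidental negative value and the extra digit, while the skip logic stops a child-only question from being filled in for everyone.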

Loss of time issues

Description of the issue, followed by an example:
Extended input time. Too few question skips, or too many questions.
Unclear form structure, with an illogical path (no section titles, numbering, or hints/notes).
  • Non-adapted constraint messages.
  • User-friendly definitions of terms, presented ergonomically.
  • No reuse of answers to previous questions to simplify the rest of the form.
Free-text entry of information that could be structured into lists, and in particular cascaded lists. The list of villages should automatically adapt to the district chosen, and the list of districts to the region chosen, etc., without the need to « scroll » for 5 minutes to find the right village in the entire list of villages.
Lack of general ergonomics of the form: appearance parameters etc.
  • Grouping of certain questions on the same page – e.g. group diameters/widths/lengths and units of measurement within one screen on the phone. However, make sure not to group too much information on one screen.
  • Display of the name of parts of the form.
  • Show pictures as answer options for some difficult questions (e.g. type of water treatment used).
Lack of use of directly integrated calculations within the form. Numerous calculations can be integrated directly into the form, which avoids repeating them in the analysis tools, and makes it possible to directly visualize this information in real time on the form.
Not all features of the tool (repeated questions, electronic signatures, external lists, etc.) are well known. For a family with 5 children, a complete form is submitted 5 times, once per child, each time repeating the same general information about the family. Instead, use a group of repeated questions within a single record, containing only the questions related to the children.
The column names and the answer values are not explicit enough (because they are generated automatically by the tool and not reworked by the person who designed the form) → time wasted cleaning each export file.
  • Variable name: «_38_Please_you_cit_you_know_38 » for the question « Could you cite 2 or 3 foods in the food groups that you know? »
  • Answer itself: « improves_growth_and_n » for the wording « Improves growth and nutritional status »
  • If your questions/answers are long, the variable name itself will be long, truncated, and may not make much sense.
  • If several questions/answers have the same “label”, their names will differ only by an incremented number and will not be specific (e.g. « If other, specify »).
  • If the questions/answers contain special characters (e.g. accented letters), these will be replaced by underscores, making the name hard to read.
Form not designed with the analysis tool in mind: this can make it difficult to analyse data in real time or before the end of the survey, as things cannot be modified during the survey.
  • Non-KoBoToolbox-compatible design (or not adapted to specifically designed analysis tools).
  • Multiple-choice questions are very hard to analyse and can lead to contradictory answers and inconsistencies in the data: use them only when truly necessary.
Having to manage different versions of projects, either because design errors need to be corrected or because the data itself is likely to change throughout the survey: the results of the whole survey cannot be analysed in real time, and time is lost merging all the versions.
  • 5 form versions with X submissions each.
  • If some information is likely to change often, it might be better to put it in an external file rather than coding it directly within the form.
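
The cascaded lists described above, combined with the advice to keep frequently changing information in an external file, can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the mechanism of any particular MDC tool: the CSV layout, the place names and the `load_choices` and `options` helpers are hypothetical.

```python
import csv
import io

# Hypothetical external choice list: one row per village, with its parents.
# In practice this would be a separate file that can be updated
# without touching the form itself.
CHOICES_CSV = """region,district,village
North,Alpha,Village A
North,Alpha,Village B
North,Beta,Village C
South,Gamma,Village D
"""


def load_choices(text):
    """Parse the external list into a list of dicts, one per village."""
    return list(csv.DictReader(io.StringIO(text)))


def options(rows, level, **parents):
    """Distinct values of `level` among the rows matching every parent answer."""
    seen = []
    for row in rows:
        matches = all(row[key] == value for key, value in parents.items())
        if matches and row[level] not in seen:
            seen.append(row[level])
    return seen


rows = load_choices(CHOICES_CSV)

# Each list is narrowed by the answers already given.
print(options(rows, "region"))                                     # ['North', 'South']
print(options(rows, "district", region="North"))                   # ['Alpha', 'Beta']
print(options(rows, "village", region="North", district="Alpha"))  # ['Village A', 'Village B']
```

Because each list is filtered by the previous answers, the investigator chooses among a handful of villages instead of scrolling through the full list, and the place names stay consistent for later linking with other databases.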