4.3 Using basic descriptive statistics


You can calculate a number of summary statistics to get an overview of the basic characteristics of your dataset. This is particularly useful when you have a large dataset, as it can be challenging to gain any insight or spot trends in long tables of numbers or attributes. For example, the table below from the case study shows just a portion of the responses to just one question in the survey: “Which of your household’s basic needs can you not afford?”
In this section, we’re going to walk you through three different categories of descriptive statistics. These are all univariate[1] measures that can be applied to a single variable in your dataset.
Warning: We should be careful about interpreting any of these summary statistics in isolation from each other. While these statistics give us a simple way to understand the basic characteristics of our dataset, they can in many cases oversimplify and hide interesting patterns. With reference to the case study, if we looked only at the average of the FCS scores, or even the median, we would conclude that our population has an ‘Acceptable’ level of food security, overlooking vulnerable households that fall in the ‘borderline’ or ‘poor’ categories.
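To make the warning concrete, here is a minimal sketch in Python. The ten FCS scores are invented purely for illustration and are not taken from the case study. The cut-offs used (up to 21 is poor, above 21 and up to 35 is borderline, above 35 is acceptable) are an assumption based on commonly cited thresholds; the case study may use different ones. The names `fcs_scores` and `fcs_category` are hypothetical.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical FCS scores for ten households (invented for illustration only).
fcs_scores = [18.0, 20.5, 27.0, 33.5, 44.0, 48.5, 52.0, 58.0, 63.5, 71.0]


def fcs_category(score):
    """Classify a score using ASSUMED cut-offs: <=21 poor, <=35 borderline, else acceptable."""
    if score <= 21:
        return "poor"
    if score <= 35:
        return "borderline"
    return "acceptable"


# Two single-number summaries of the whole dataset.
average = mean(fcs_scores)
middle = median(fcs_scores)

# The distribution across categories, which the single numbers hide.
counts = Counter(fcs_category(score) for score in fcs_scores)
shares = {
    category: counts[category] / len(fcs_scores)
    for category in ("poor", "borderline", "acceptable")
}

print(f"mean = {average:.2f} ({fcs_category(average)})")
print(f"median = {middle:.2f} ({fcs_category(middle)})")
for category, share in shares.items():
    print(f"{category}: {share:.0%} of households")
```

In this made-up sample, both the mean and the median fall in the acceptable range, yet four of the ten households are classified as poor or borderline. Reporting the share of households in each category alongside the mean and median keeps those households visible.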
As we will discuss in more detail in the following sections, you should also create a number of basic plots, such as histograms and box-and-whisker plots, to explore and summarize your data effectively.
This section is divided into three sub-sections:
-
[1] Univariate simply means that we make observations on a single characteristic or attribute.