4.5 Correlation analysis
The correlation coefficient is used to determine whether there is a possible linear relationship between two variables. To calculate the correlation coefficient in Excel, use the function: =CORREL(matrix1; matrix2)
- Matrix1 is the first range of values
- Matrix2 is the second range of values
- Both tables must be the same length
Suppose we have the following variables, where we want to measure the correlation between the number of active health centres and the annual number of patients:
Our correlation coefficient formula is as follows:
= CORREL (C2: C37 ; D2: D37)
The formula shows a strong positive correlation (about 0.99) between the number of active health centres and the number of patients (as the number of health centres increases, the number of patients increases), as you can see in the screenshot below:
The following exercise is only available on the French page at this stage- translation coming soon!
In order to practice analysing the potential correlation between 2 variables, here is a practical exercise using clean data from the case study - developed by CartONG as part of the Data Analysis Toolbox publication. A more complete presentation of this case study can be found in the “Data Analysis Toolbox” – Part “7 The case study ”.
In this example, we propose to generate analyses of the correlation between the average duration of food security assistance (for a household) and the size of the household.
Use the [“Excel Toolbox - Analysis 4.5 - Practice exercise - V1 - Statement”] file to practice correlation analysis.
In case of difficulties, or to compare your results with those of the case study, refer to the file [“Excel Toolbox - Analysis 4.5 - Practical exercise - V1 – Solutions”].