7.1 Arrays formulas
TABLE OF CONTENTS
Array formulas are powerful formulas that allow you to perform complex calculations that are often impossible with standard spreadsheet functions. They are also called Ctrl-Shift-Enter or CSE formulas because you must press Ctrl+Shift+Enter to enter them.
Array formulas can be used to:
- Count the number of characters in a range of cells.
- Sum numbers that meet certain conditions, such as the lowest values in a range or numbers between an upper limit and a lower limit or sum each nth value in a range of values.
Excel provides two types of array formulas: those that perform multiple calculations to generate a single result (see example below) and those that compute multiple results. Some spreadsheet functions return values tables or require a values table as an argument.
Example of an array formula that calculates a single result:
This type of array formula can simplify a spreadsheet template by replacing several different formulas with a single array formula.
- Click the cell in which you want to enter the array formula.
- Enter the formula to use. Array formulas use a standard formula syntax. They always start with an equal sign (=), and you can use one of the Excel functions built into your array formulas.
- For instance, this formula calculates the total value of an array that contains the amount saved and places the result in the cell next to the Total Value Saved.
The challenge will be to find, in a column, the value(s) corresponding to the corresponding maximum value, especially of a date. To illustrate, you work in different villages in which your NGO has built different structures (warehouses, latrines, water points, etc…) that you regularly visit. It is indeed important for you to check the functionality status of the infrastructure. The latter all have a unique identifier to facilitate data entry. It looks like this:
So, you have 3 columns, for your infrastructures (ID), the date of visit, and the variable of interest in this example to download here
Since your teams regularly visit these villages, you have several feature states for each infrastructure (id). In your reference table (where each infrastructure is listed), you wish to obtain the feature state at the most recent date.
You will use for this the combination of MAX and IF (this function exists in the latest versions of excel, from 2019 or office 365 under the name MAXIFS)
This function will make it possible to dynamically find the last date and its equivalent (i.e., the corresponding variable in column B (functionality). For this, the syntax will be as follows:
But beware! If you enter the formula only this way, you will not obtain the right result, you will need to enter the formula using Ctrl + Shift + Enter, so as to create an array formula and end up with the first date corresponding to each infrastructure, as follows:
(Do not forget to always format your data into a table, to facilitate referencing, and writing your formulas)
You can download the file here (final version) to check the formula syntax if necessary.
When you try to find a unique value in an array that matches multiple criteria, it can be difficult to find the right formula. Especially since when you try to use index match to find data in your database with multiple conditions, you will encounter difficulties. Indeed, if we take the example of an infrastructure monitoring database, with multiple visits per infrastructure, per time units, you would not be able to find the feature state of a specific infrastructure, for a given date, with a simple “index/match”. Suppose for instance that we wanted to find the corresponding feature state of the ID-3 infrastructure at the date of the last visit.
Download the file here.
The formula must therefore
- Find the corresponding ID (for example ID-1) in the list of infrastructures
- Then find the date of the last visit that matches
- And finally, return the corresponding state to these two coordinates
The problem with the classic index/match function is that your formula will “stop” at the first line where it finds the infrastructure code (so ID-1), then find the date of last visit and return an error since the last visit is never the entry corresponding to the first code (except in the singular case where you have only one entry).
So, you’ll get an error code return, regardless of the syntax you use. It is therefore necessary to use
For this formula to work, you will need to use an array formula (link to other sections). But it will be necessary to slightly adapt the syntax of the formula for this.
As such, we will write the formula in a Boolean way, that is to say by asking it to look for an exact match (i.e., 1 in Boolean language) with all the criteria that interest us, which is equivalent to:
= INDEX(column where we look for the value; MATCH(1); (criterion 1 = X)*(criterion 2 = Y)* etc.)
And the formula will therefore search for the cell matching all of the criteria so as to obtain:
= INDEX(column where we look for the value; MATCH(1;1-1*1*1 etc.)
In our example file, this means that we will use the following syntax:
= INDEX(visit[Condition];MATCH(1;(@[Date of last visit]=visit[Date])*([@Infrastructure]=visit[Code]);0))
This will allow you to find the unique result corresponding to the date of the last visit for a given infrastructure code, which you can verify in this file.