4.4 Data management
Because of the file formats involved and the volume of data handled in GIS, good data management practices are essential to sustain our work. Whether you store your data locally on your computer or in a shared online space, well-organized GIS data is easier to find and makes you more efficient. This document covers two basic aspects of data management:
- Structured organization of folders and files
- Naming convention
TABLE OF CONTENTS
- Data storage
- Data organization
- Naming convention
- Map naming convention
- Data naming convention
- Some examples of naming habits
For data storage, there are several possibilities, depending on your needs and capacities:
- Local storage on your computer
- Storage on an online platform such as a cloud service
- Storage and centralization on an enterprise server (able to manage geographic data).
If you choose online storage or share files with colleagues, use secure channels and infrastructure whenever the files contain personal or sensitive data.
Personal data is any information relating to a natural person who can be identified (name, address, number…). Personal data is said to be sensitive if it includes information related to racial or ethnic origin, political opinions, etc.
A structured organization of folders and files is a first step towards efficient data management. It is advisable to organize data into folders and subfolders. This organization applies to the first two types of storage listed above (local and online).
Here is an example of how the files can be organized by country.
Example of data organization
Please note that there is no one best way to organize folders and files, so feel free to adapt them to your needs.
As the figure shows, a main folder carries the name of the country and contains subfolders by theme (Data, Maps, Projects). Each folder is named after the type of file it contains, so other users can easily find their way around.
It is also possible to organize by theme: for example, a main folder named after a project, with subfolders grouping the different themes (data, maps, etc.).
Without a structured organization of your files, you risk losing data over time and breaking the links between the data and the project files of GIS software (QGIS, ArcGIS Pro, etc.).
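The country-based layout described above can be created with a short script. This is only a minimal sketch: the function name `create_country_folders` and the `THEMES` constant are hypothetical, and the theme list simply reuses the Data, Maps and Projects folders from the example. Adapt it to your own structure.

```python
from pathlib import Path

# Theme subfolders taken from the example in the text; adjust as needed.
THEMES = ("Data", "Maps", "Projects")


def create_country_folders(root, country, themes=THEMES):
    """Create <root>/<country>/<theme> folders and return their paths."""
    country_dir = Path(root) / country
    created = []
    for theme in themes:
        folder = country_dir / theme
        # exist_ok=True makes the call safe to repeat on an existing tree.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

For instance, `create_country_folders("GIS", "Uganda")` would create `GIS/Uganda/Data`, `GIS/Uganda/Maps` and `GIS/Uganda/Projects` (the root folder and country are placeholders).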
File naming is as important as file organization. Following a consistent naming rule makes it easier to identify the documents you are looking for, and to exchange and transfer files.
The example below compares poorly named files with files that follow a naming convention.
File naming example
Here, we notice that the names of the files in the red box do not allow us to understand the content of the files. The naming system in the green box immediately makes it possible to know what the files refer to (location, type, format, date, etc.) without opening the files.
For a better organization of GIS data and products, we suggest below a naming convention that you can follow.
This naming example includes six levels of detail or information:
- iso3 : the three-letter country code (ISO 3166-1 alpha-3). If several countries are displayed on the map, use the code of the main country on the map. The list of codes is available at https://www.iban.com/country-codes
- map type : a key value to designate the type of the map
- Location : the name of the place that is displayed on the map. It can be the administrative name of a border, a city, a neighborhood, a health zone, a national park, a region of the world, etc.
- Description : Additional information on the map if the type and location are not enough. It is mostly used for epidemiological maps to specify the type of disease or the epidemiological week (ebola_ew21, cholera_ew25, etc.) or for logistic maps (road guide, flights, etc.).
- Printing size : the print size of the map. Contains the size and orientation (L for landscape and P for portrait): A4P, A1L, A3L, etc.
- Date : the day the map was produced, written as YYMMDD: for example, 150325 for March 25, 2015.
The key values for the map type are:
- BM : Basemap / Base map (topographic map)
- ELE : Elevation map
- LOG: Logistics (road access, obstacles on the road, etc.)
- EPI: Epidemiology (cases of infection by health zone, by village, etc.)
- HLT: Health (health infrastructure, health administrative divisions, etc.)
- POP: Population (census, population displacement, etc.)
- SEC: Security (incident mapping, evacuation routes, etc.)
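The map naming elements above can be assembled programmatically. The sketch below is illustrative only: the function name `build_map_name`, the underscore separator, the element order (taken from the list above) and the upper-casing of the country code are assumptions, since the text does not fix them. The map type is checked against the key values listed above.

```python
from datetime import date

# Map type key values listed in the text.
MAP_TYPES = {"BM", "ELE", "LOG", "EPI", "HLT", "POP", "SEC"}


def build_map_name(iso3, map_type, location, size, produced, description=""):
    """Join the naming elements with underscores (separator is an assumption)."""
    if len(iso3) != 3 or not iso3.isalpha():
        raise ValueError(f"iso3 must be a three-letter code, got {iso3!r}")
    if map_type.upper() not in MAP_TYPES:
        raise ValueError(f"unknown map type: {map_type!r}")
    parts = [iso3.upper(), map_type.upper(), location]
    # The description is optional and skipped when empty.
    if description:
        parts.append(description)
    parts.append(size.upper())
    # YYMMDD, as described for the date element.
    parts.append(produced.strftime("%y%m%d"))
    return "_".join(parts)
```

With these assumptions, `build_map_name("cod", "EPI", "Goma", "A4L", date(2015, 3, 25), "ebola_ew21")` returns `COD_EPI_Goma_ebola_ew21_A4L_150325`.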
Data file names are built from the following elements:
- iso3 : the code for the country. If more than one country is displayed on the map, use the iso3 of the main country.
- Location: the name of the place that is displayed on the map. It can be the administrative name of a border, a city, a neighborhood, a health zone, a national park, a region of the world, etc.
- Code : feature code (see table below)
- Subcode : feature subcode (see table below)
- type : file geometry:
  - a for areas (polygons)
  - l for lines (and polylines)
  - p for points (point markers)
- source : the source of the data. If you use abbreviations, create a document in the “doc” folder that maps them to the full names (see Data Sources).
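The data naming elements above can be modeled in the same way. This is a hedged sketch rather than a prescribed implementation: the `DataName` class, the underscore separator, the upper-cased country code and the choice to omit an empty subcode are all assumptions made for illustration. Only the geometry letters a, l and p come from the text.

```python
from dataclasses import dataclass

# Geometry letters defined in the text.
GEOMETRY_TYPES = {"a": "areas (polygons)", "l": "lines", "p": "points"}


@dataclass(frozen=True)
class DataName:
    iso3: str
    location: str
    code: str
    subcode: str   # empty string when the feature has no subcode
    geometry: str  # one of "a", "l", "p"
    source: str

    def __post_init__(self):
        if self.geometry not in GEOMETRY_TYPES:
            raise ValueError(f"geometry must be a, l or p, got {self.geometry!r}")

    def filename(self, sep="_"):
        """Join the elements in the listed order, skipping an empty subcode."""
        parts = [self.iso3.upper(), self.location, self.code,
                 self.subcode, self.geometry, self.source]
        return sep.join(p for p in parts if p)
```

For example, `DataName("cod", "Goma", "landmark", "", "a", "osm").filename()` gives `COD_Goma_landmark_a_osm`, where `osm` stands for a hypothetical source abbreviation that would be documented in the “doc” folder.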
This table lists the different codes that can be used to name data.
| Category (folder name) | Code | Subcode | Description |
|---|---|---|---|
| Environment | landmark | N/A | Park / National Park |
| Structure | hlt | N/A | Hotels and guesthouses |