When organizing data, a few key attributes need attention. Spreadsheets are widely used [1] for storing numeric and non-numeric data. They can also be used alongside other software packages such as R, Python, and SPSS. Spreadsheets (e.g., MS Excel, LibreOffice Calc, and Google Sheets) themselves provide numerous functions that are sufficient for primary data analysis, and even more can be done with them. This article shows how to organize data well in Google Sheets. Image 1 illustrates the following points.
1. Headers
Headers in the top row should be well defined. For example, it is not good practice to use headers (aka variable names) such as " Data Organization"; instead, use "data_organization" or "dataOrganization" (Figure 1). There is no rocket science behind this naming convention; it is simply a practice that keeps variable names unique.
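The naming convention above can also be applied after the data leaves the spreadsheet. The following is a minimal sketch in Python, not part of the original workflow; the function name to_snake_case is hypothetical and the rules it applies (trim, split camelCase, replace separators with underscores, lowercase) are one reasonable interpretation of the convention.

```python
import re


def to_snake_case(header: str) -> str:
    """Convert a free-form header into a snake_case variable name.

    Hypothetical helper for illustration only.
    """
    # Drop leading and trailing whitespace such as in " Data Organization".
    cleaned = header.strip()
    # Put an underscore at each lowercase-to-uppercase boundary,
    # so "dataOrganization" becomes "data_Organization".
    cleaned = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", cleaned)
    # Replace every run of non-alphanumeric characters with one underscore.
    cleaned = re.sub(r"[^A-Za-z0-9]+", "_", cleaned)
    # Remove stray underscores at the ends and lowercase the result.
    return cleaned.strip("_").lower()


headers = [" Data Organization", "dataOrganization", "data_organization"]
print([to_snake_case(h) for h in headers])
```

All three spellings map to the same name, which makes it easy to spot headers that would collide once they are normalized.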
2. Categorical naming
Categorical data should be coherent. For example, suppose a gender variable has three categories (Male/Female/Other). Avoid entering the same category in multiple ways (Male, male, and M); stick with a single name for each category (Figure 2).
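If inconsistent entries have already crept in, they can be mapped back to one canonical name. The sketch below is illustrative only: the alias table GENDER_ALIASES and the function normalize_gender are assumptions made for this example, and a real dataset would need its own mapping.

```python
# Hypothetical alias table: each known variant (lowercased) maps to
# the single canonical category name used in the sheet.
GENDER_ALIASES = {
    "male": "Male",
    "m": "Male",
    "female": "Female",
    "f": "Female",
    "other": "Other",
}


def normalize_gender(value: str) -> str:
    """Return the canonical category for a raw entry, or raise ValueError."""
    key = value.strip().lower()
    if key not in GENDER_ALIASES:
        # Fail loudly rather than guess at an unknown spelling.
        raise ValueError(f"Unrecognized gender entry: {value!r}")
    return GENDER_ALIASES[key]


raw_entries = ["Male", "male", "M"]
print([normalize_gender(v) for v in raw_entries])
```

Raising an error on unknown values is a deliberate choice here: silently passing them through would reintroduce the inconsistency the mapping is meant to remove.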
3. Data validation
Data validation is one of the most useful features in spreadsheets. It lets users restrict what can be entered in a cell, which keeps records unique and uniform and reduces inconsistent entries (see Figure 1).
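The same idea can be checked outside the spreadsheet, for instance on an exported column. This is a rough sketch and not a description of how Google Sheets implements validation; the function find_invalid_entries and the assumption that data starts in row 2 (below a single header row) are made up for illustration.

```python
def find_invalid_entries(values, allowed, first_row=2):
    """List (row_number, value) pairs for entries not in the allowed set.

    first_row is the sheet row of the first data value; 2 assumes
    a single header row above the data (an assumption of this sketch).
    """
    invalid = []
    for offset, value in enumerate(values):
        if value not in allowed:
            invalid.append((first_row + offset, value))
    return invalid


allowed_genders = {"Male", "Female", "Other"}
column = ["Male", "male", "Female", "M"]
print(find_invalid_entries(column, allowed_genders))
```

The comparison is intentionally exact and case-sensitive, mirroring a dropdown-style rule that accepts only the listed values.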
4. Use of delimiter
5. Use of special characters
6. Formatting
7. Avoiding blank cells
8. Use of lower case
9. Exporting and saving
10. Maintain uniformity
| Image 1 |
References
[1] Chan, Yolande E., and Veda C. Storey. "The use of spreadsheets in organizations: Determinants and consequences." Information & Management 31.3 (1996): 119-134.
[2] Wikipedia contributors. "ISO 8601." Wikipedia, The Free Encyclopedia. Wikipedia, The Free Encyclopedia, 31 Dec. 2021.