Some of our products support saving their data to a file that can then be read by other applications. This process is called exporting the data. Exporting differs from an app’s normal save files because the data must be organized in a way that other applications can read and understand it. This is where the Comma-separated Values (CSV) file format comes in to play. Simply put, CSV organizes data into lines of values (which may be called “rows”) having each value in the line separated by a comma (establishing “columns” in the data). Here is a contrived example of a CSV formatted file:
Name,Age,Favorite Color
Ted,23,Blue
Jane,54,Red
Alex,73,Green
You might have guested, especially if you’re familiar with office applications, that CSV format is a standard and well-supported format for exporting data to spreadsheet applications, such as Apple’s Numbers or Microsoft’s Excel programs. Each line of a CSV file becomes a spreadsheet row, while the comma-separated values in each row aligned to the columns.
In the above example, there are four rows of data – the first row being a “header row”, naming each column of data. This is a common practice but not a requirement of the CSV format itself. The first column of data is:
Name
Ted
Jane
Alex
From the CSV file alone there is nothing to enforce interpreting the first column as such; however, if the example data were imported into a spreadsheet program these listed values would be the first column. At least, that is true assuming the spreadsheet application can read and import CSV formatted files.
It is an important point to understand that not all applications necessarily interpret CSV formatting the same way. In fact, there are several nuanced situations that many spreadsheet applications may ask for the user to specify a preference or approach to take. For example, specifying whether the first row of data is a “header row” is one such detail a person might need to specify during import.
Another nuance that you might have to deal with is the character encoding of the file. You can read more about it at the link provided, but briefly, text stored in files (such as CSV files) is represented by numbers. Each character gets a number. Which character is assigned which number is dictated by the character encoding. The CSV files exported by our products use the UTF-8 character encoding. This encoding format allows for many special characters not supported by other character encoding formats, including such things as Emoji.
In our testing of spreadsheet programs (Spring 2022), we found that Microsoft Excel did not correctly read UTF-8 CSV files in two of the three ways it provides to import CSV files. The methods we identified for importing CSV using Excel were:
Using the “Get Data” wizard and selecting “65001: Unicode (UTF-8)” from the File Origin menu correctly interpreted the text. This behaved the same on both Window and Macintosh versions of Excel. However, the iOS version of Excel did not appear to have this wizard. When loading CSV files containing characters outside of iOS Excel’s default encoding those characters (their numbers) will be mapped incorrectly and appear “corrupted”, i.e., they will be some other characters. Often these corrupt characters are just small rectangles and are sometimes just called “control characters”.
If you find yourself in this situation the best option is to try to learn how to inform the spreadsheet application to interpret the CSV file in UTF-8 encoding. In the case of desktop Excel (for both Window and Macintosh) this was identified by searching the web for a solution. And in the case of Excel, it was simply knowing which import procedure to use. However, in the case of iOS Excel, there may not currently be a solution without first going to the desktop version. The Apple Numbers application and the build-in preview (on desktop and iPhone) all support UTF-8 by default.
Of course, if you only use the standard range of characters and numbers in your content, then you likely won't see any character encoding issues during import.
Copyright © 2022 Rylolan Software, LLC - All Rights Reserved.
Logo by Josh Mindemann