Video: Why Excel isn't a database

Video: Why Excel isn't a database

In the context of the recent controversy surrounding one of the reporting sources for the various Coronavirus Positive tests having been miss-counting data due to truncation in Excel.

What's not mentioned in the news stories , or poorly understood is what went wrong and why.

It's nothing to do with databases, more a lack of trainging and/or using the tools incorrectly.

Their reporting was predicated upon reading the last ( most recent ) row(s) of a 'csv' file ( A large text file, with data/variables in rows, separated by a comma. )

This went wrong when they tried using excel to open up the entire file ( as opposed to using an appropriate tool to read just the last row. ).

This could have been achieved reliably by opening in -

Notepad ( yes - the data is just a text file, and depending on the complexity of said data,  it could have simply been opened in the default windows text editor ).

SQL Server ( or similar )  -  Import the CSV into an actual database, use Select statment with a filter to return only the last/most recent row.

Python -  Python can read the CSV into a data-frame ( similar to a database table ) using Pandas - again the data could be filtered to present the last row.

Excel -  Instead of trying to open the entire file into excel , and getting truncation ( Excel can only show 1,048,756 rows - so if a file had 1,048,757 rows, you'd never see the last row)  , the file should have been opened as a data connection and transformed / filtered with PowerQuery.