top of page

Excel: The Square Peg Of Data Management

  • Writer: Doug Dimiceli
    Doug Dimiceli
  • Jan 31, 2020
  • 6 min read

How to know when it is time to move from Excel to a relational database.




Microsoft Excel is an extremely useful application (more specifically a spreadsheet), so much so that it often gets used in ways for which it was never intended. Because Excel is so ubiquitous on the corporate desktop, it is common for people to use Excel as a database. When keeping track of a simple set of data, to be used by just on person at a time, Excel does a pretty good job. In fact, I would say it does so well at this limited use that it lulls many users into thinking it is a database. As soon as your data grows in complexity or others want to have access, it will likely start to feel like trying to fit a square peg in a round hole. The spreadsheet was designed to help answer the "What if" questions whereas databases were designed to answer "What is" or dictate "What should be". In this post, I will examine when using Excel as a database can lead to frustrations down the road and offer what might be a better solution.


The Spreadsheet was designed to help answer "What if" questions whereas databases were designed to answer "What is" or dictate "what should be".

One of the biggest limitations of using Excel as a database is that only one person can be in the data at any one time. If the data being tracked is updated by a team, or even if more than one member of the team wants to review the data, Excel instantly becomes a bottle neck. Once you do gain access to the file, any sorts or filters left by the previous user are still applied. This can lead to a tug-of-war within your work group. Most modern databases are multi-user so everyone on the team can update or review the data at the same time.


A free-for-all when it comes to data entry.

Another big issue I see with Excel is the lack of real data validation. There are some features in Excel to help limit what a user can enter into a cell but I have rarely seen anyone utilize them. What is much more likely is a free-for-all when it comes to data entry. For example, team members might enter phone numbers using different formats or just plain misspell a value used in a column like a city name. Along these lines, Excel can be setup to provide the user with a dropdown of valid values. If you decide to stick it out with Excel, I highly recommend adding a new tab at the end of the workbook labelled something like "Lookups" and use it to store all the possible values for columns that should be locked down to a set of acceptable values. You can find these features in the Data > Data Validation option and examples of how to configure just by doing a Google search.


Once you give a user access to the Excel file to change one thing, nothing prevents them from changing everything...

Another area where I think Excel falls short when being used as a workplace database is the lack of role based functionality. Once you give a user access to the Excel file to change one thing, nothing prevents them from changing everything (either by accident or on purpose). There is no mechanism in Excel that allows the database owner to dictate that other users can only make changes to certain columns or rows. Most modern databases have this type of granular security built-in.

Relational databases allow for a more logical design that can better reflect how your data is naturally organized.

The most important data management weakness Excel has is the lack of relational data concepts. While not always obvious to Excel users I think this is the most important distinction. Relational databases allow for a more logical design that can better reflect how your data is naturally organized. Unlike the generic grid of cells in a spreadsheet databases use the concept of tables. Tables are analogous to a grid but the columns are more defined e.g. column name, data type, data size and each row often has a unique identifier that can be used to link to other tables. For example, if you have a table of contacts and you find that these contacts can have various phone numbers you might want to have a separate table to store the phone numbers and then make a link (or a relationship) between each contact and their different phone numbers by adding a column to the phone numbers table called "ContactId" and populate it with the unique identifier for the correct contact . This type of design (or schema) means that a single contact can have as many phone numbers as needed. In Excel, if one contact needed an additional phone number you would have to add a new column to all the rows just to accommodate the contact in question. In the database world this example would represent a one-to-many relationship where one contact can have many phone numbers.


I would say impossible, but I have seen people come up with crazy ways to force that square peg into the round hole.

Another example could be a database where you want to track company meetings. You could start off with a table named "Meetings" to store data about each meeting e.g. "Title", "Room", "Date" and "Time" and a table named "Participants" to store information about each participant e.g. "Name", "Job Title", and "Email". The relational magic starts to take shape when you link these two tables using a third table (lets call it "MeetingParticipants") that stores the unique identifier of a user and the unique identifier of a meeting in one row. This would be considered a many-to-many relationship where one could query meetings a participant should attend or query which participants are assigned to a particular meeting. This could be done in Excel by creating a matrix where you put the meetings in the cells along the top row and the participants in cells down column A and put a "X" in the cells where meetings intersect the correct participants. While not pretty it works, but what if we want to add another one-to-many relationship to the mix? What if wanted to have a table of AV equipment and be able to assign one or more devices to a meeting. What if we then want to add a table to track any deliverables or tasks that arise from the meeting and which participants are responsible for each. Trying to do all of this inside of Excel would be extremely difficult. I would say impossible, but I have seen people come up with crazy ways to force that square peg into the round hole.



OK we get it, Excel is not a great option.. but what should we do instead?

At this point you are probably saying "OK we get it, Excel is not a great option.. but what should we do instead?". If your company uses Microsoft products and you have developers either in-house or on contract, I would suggest moving your data to SQL Server. If your organization already has SQL Server up and running and it isn't being used to capacity adding another database wouldn't cost anything. If you don't have SQL Server already, Microsoft offers a free version called "SQL Server Express". This version of SQL Server has limitations but if you were using Excel I doubt any of the limitations would be an issue. SQL Server implements a client-server model which means a client (or front-end) makes query requests and then displays the results from the server (or back-end). I would suggest building an ASP.net web application (Microsoft's web technologies) as the front-end to your data. This web application can be hosted on one of your organization's internal servers making it so that access is restricted to browsers inside your company network. The web application is where you can create any logic to validate data entry and give certain users access to different aspects of the data. Of course, all of this comes at a cost to develop, but if your Excel database has become critical to your (or your team's) productivity then it probably deserves the investment.



Comments


WHAT'S UP?

CURRENT PROJECT

 

We are working on making it easier to programmatically generate SSRS reports in XLSX format.  Stay tuned! 

 

EVENTS
MAR  10

OUR 10TH ANNIVERSARY!

We would like to thank all of our clients for making this possible.

CONTACT

Los Angeles, CA

Tel: 424-234-6424
 

Email: info@devarium.com

MAR
26

LAPA TECH EXPO  2020

Los Angeles, CA

  • w-facebook
  • Twitter Clean
AUG
23-27

ILTACON 2020

Nashville, TN

© 2019 BY DEVARIUM INC.

bottom of page