Every single problem I've ever had with excel stems from excel trying to be too clever. Nearly all my problems with importing data into it could be solved if excel left everything as is until I asked it to format values a certain way.
>Every single problem I've ever had with excel stems from excel trying to be too clever. Nearly all my problems with importing data into it could be solved if excel left everything as is
We're biased to bash on Microsoft for being "too clever" but maybe we need a reality check by looking at the bigger picture.
Examples of other software not written by Microsoft that also drops the leading zeros and users asking questions on how to preserve them:
I'm certainly not singling out excel, this thread is just about excel so thought I'd share my frustration with that. I've had similar experiences with pandas certainly. I've not used the rest of your examples, but do have similar issues with libreoffice calc. Regards pandas it is certainly easier sorting out the assumptions independent programmers make by including 'dtype=str'.
Does it? I think it should treat everything as a string since it can't be certain about the correct type. It could also suggest the correct type rather than applying it without user confirmation.
You want Excel to auto-interpret a table of numbers, that come from a CSV, as entirely text?
Looking forward to the first time anyone tries to use your excel on a table of numbers and then immediately has to multiply everything by *1 (in a separate table) just to get it back into numbers...
YES!! I seem to remember a few versions of Excel ago there was a way to import data and set the data type for each column in a preview window to see what it would look like. And to your point, it's very easy to select an existing column in Excel and change the data type anyway. So why not just make everything a string by default?
You are right on the cause of the issue. But the software does not HAVE to guess the type. If I paste something into Excel, I want what I pasted to be what's in Excel. Changing the type and inadvertently affecting the data isn't something any data software should do by default. And the fact that OP said they can't turn it off means that this is beyond a design mistake, it is a FEATURE. If Excel took itself seriously as a data platform, it would not make a change to any text by default.
But Excel has never been a data platform. It's a spreadsheet.
It's user's faults for using it in ways that it was never designed for.
Excel has always been about sticking numbers in boxes and calculating with them.
If you want unmodified string input, input strings into a tool intended to handle them.
Project specifications can be hard. Using 1) .xls files after they were superseded, 2) ANY data transfer method without considering capacity or truncation issues, speaks of incompetence.
You can format the column as text before you paste or you can load data from text and set the types for every column. Excel has features to handle all of these issues. I think it's biggest problem is that it's just too easy to use. It will allow novices to do all sorts of things with their data without providing guardrails for common mistakes.
Agreed with this. Excel does have data import feature that will allows us to fine-tune the import process to prevent this kind of mistake.
People just double click the CSV and complained that it didn't do it correctly. It is the same situation with scientific research data that researchers don't bother to use escape marker or blindly open the file without going through the proper import process. Then they blamed Excel for the that without understanding how Excel works.
Yes, Excel does have their quirks. But there are ways around those quirks, they have thousands of thousands guides out there about Excel. There is no excuses for people to complain about Excel didn't do the way that users want it to do without looking up for information.
Yeah it does, libreoffice has methods of recognizing numbers that are to be interpreted as strings as well. This is user error and the programmer trying to blame that on the program rather than it being a process issue. The spreadsheet can't read your mind
As an intern I was once given a spreadsheet of manually entered daily data, stored by a human being over the previous couple of years. Obviously he'd entered the dates in whatever form he felt like at the time and Excel had parsed them in a variety of godawful ways not least of which was defaulting to American month/day assumptions before UK day/month.
I think it would have been far quicker to just manually write a new column interpreting the dates based on previous/next etc. Instead I spent God knows how long trying to be clever, failing, and being embarrassed that I could not solve this obviously trivial problem.