I still don't understand who at Microsoft decided that it was a good idea to have "localized programming". If you use a non-English version of Excel you cannot use "if" but rather have to use whatever the word for if is in that language (for example "wenn" in German). If you think you really need that localization why can't both versions work or why can't you force English (the CAD software Rhino does this with an underscore for example). Yeah sure you can change language but it's such a kludge. Causes so much confusion in international settings, makes searching for stuff non-trivial etc. etc.
I'm really grateful I never had to work with Excel in any job. My excel skills section of my CV reads "basic, would rather not use it". The closest I came was importing Excel files. I do however respect people who can use it well and I've seen people be really efficient in excel.
This sentiment generalizes pretty well as a bilingual programmer, or even just a bilingual user of a lot of tech.
- Silent localization of decimal separator in Libre Office calc.
- Localization of error messages (much harder googling). I have known hundreds of french software engineers, absolutely no one cares for GCC errors in french.
- Google search trying to be way to smart. Good luck trying to figure out how to get English results for a word that has the same spelling in french.
- The entire keyboard layout software stack being a giant us-centric leaky abstraction at every layer.
- And the absolute worst of all: automatic translation of YouTube video titles with neither possibility to opt-out or consent, for both viewer and author.
> Good luck trying to figure out how to get English results for a word that has the same spelling in french.
Quite easy just not discoverable but if you remember query parameters from 20 years ago it's easy, just remove _all_ query parameters except `q` (the query itself) from the URL and append parameter `hl=en`, then click `show only results in English`.
Google was always very bad with often mixing translation and localisation.
This problem gets worse with localized formats. In my locale, $13.377 is 13377 dollars and zero cents. And $13,377 is 13 dollars and 37(and something) cents. Obviously, CSV exports randomly chose my locale, their locale or something else. Same for date-formats. But at least date-format errors are visible and often obvious. Having a giant table of prices and a very few being off by a thousand, isn't.
Add, indeed, the localized formulas and you end up with sheets that are completely unportable, unsharable.
I'm certain that this is because Excel was designed in a pre-internet era. Where collaboration, if it existed at all, evolved around intranets, shared drives and company-managed computers.
This is an issue with CSV numerical values stored as strings, for numbers it's a bit better. Libraries sometimes allow for a raw import that preserves the internal data type of the format, then you get proper numbers, not adjusted to locale formats.
If you do it for excel it even handles dates pretty well because they're in a numerical format and you can infer that a column is filled with dates because of the range.
The pre-internet conclusion is right. They try to keep backwards compatibility. Also, Excel doesn't handle big data well (1-2M rows) and neither do import libraries (at least in JS land).
> If you use a non-English version of Excel you cannot use "if" but rather have to use whatever the word for if is in that language (for example "wenn" in German)
Not just that, it also requires "native" decimal separator.
For example, if you installed English Excel on system with Russian calendar/numbers/currency, the formula can look like: =COUNTIF(B2:B6;">1,5") (syntax error otherwise).
English Excel in English environment: =COUNTIF(B2:B6;">1.5")
Russian Excel in Russian environment: =СЧЁТЕСЛИ(B2:B6;">1,5")
What will happen when you enter 1.5 into English Excel? Everyone knows it is 01.май. What happens if you enter 100.5? It is a string, Excel on this machine expects different separator.
And with Dutch (in Google Sheets), the argument separators differ too. =IF(B2,42,1337), in Dutch, =ALS(B2;42;1337) or, when using english formulas, still is =IF(B2;42;1337). Terrible when you copy formulas around.
Yes. And the keyboard bindings are also localized, so in Excel in Swedish on would press CTRL-F to make the text bold (F is for "fetstil") and to find one would press CTRL-B, logically for "sök".
(for unaware users, those are exchanged between Excel in English and in Swedish. In english CTRL-F for find, CTRL-B for bold. )
Knowing Microsoft, this is likely a backwards compatibility thing. There is no way this is easier to implement than just using the same keywords in any locale.
Since Joel Spolsky is now famous for a lot of different things (his "Joel on Software" blog, Fogbugz, Trello, and Stack Overflow), also important to point out that he was also a product manager on Excel in the 90s.
That series just killed me. The garbage bag one had me in hysterics. And the ending twist was fun given how people were so convinced the voice was him.
I think Joel was inspired by this series. One of the videos on that playlist you shared appears in this Trello board[1] that Joel was referring to during the talk.
Wow that's a blast from the past! I'd totally forgetten about it, thanks for the link I'm goung to have to rewatch it for some laughs whilst I feel old.
This is excel 101. And with some bits of that are questionable.
Range names are nice until you need to duplicate a tab, then you end up with a combination of local and global range names, where it isn't clear which is which. And god forbid you want to move a tab to another workbook, particularly if that workbook already has range names.
He does an INDEX/MATCH without specifying the match type to exact, which forces him to sort the lookup table. Specifying the match type should be muscle memory. And after 20 years of ignoring their users, the excel team finally introduced an xlookup fonction that does that. They also introduced other functions that are useful (and long overdue): SORT, UNIQUE, TEXTJOIN (somehow TEXTSPLIT only came several years later, I guess it was a hard computer science problem)
There are still functions that would be nice to have in Excel. An interpolate function that could take an X value, an X range and Y range, and an interpolation method (linear, polynomial, cubic splines, etc) and interpolate the value.
Excellent post. I'd also argue that the Excel team would do the data analysis world a huge favor by eliminating the "General" datatype interpolation guesswork Excel does when pasting data or opening a CSV file. I'm convinced more data files have been ruined by uninformed users opening a CSV in Excel and then saving it, than any other method.
I adore this video. I wish there was a version of a talk like this but for Google Sheets, or at least comparison of Excel and Google Sheets features as of 2023. I saw an article[1] that tries to replicate the cool automatic data-table features of Excel in Google Sheets but it seems easier to just use Excel.
Excel Online has Typescript-based [1] automation with Office Scripts [2]. There's even an online VS Code-based editor in development [3]. You can run these scripts non-interactively in Power Automate and integrate with pretty much anything, it's all pretty powerful.
Every time this comes up on HN I end up watching a decent amount of the video, if someone were to ask you "what does 'evergreen' content mean?" pointing them at this video would be a great example.
I've gotten a long way in life never using Excel except to type things in cells for my idiot former company which used it as a word processor, which made me hate it with all my heart.
And yet I assume, based on the slavish devotion to it by others, that it is actually great if you know how to use it effectively. Aside from this video, where can I learn the foundations as quickly as possible so that I can do cool things with it?
I used to work in the securities division of Goldman. Traders are generally amazing at excel and one way to completely lose their respect is to suck at excel. I've more than once seen a trader look on in horror as a programmer fumbles around in some excel model the trader has asked for help with. At one point I had to turn some genuinely insane excel models (think 10mins to load the sheet, 10-15mins to recalc) into code (my version of that model ticked >1000x per second and had more functionality than the original), so I learned to get good at excel and learned the kinds of things people who are amazing with it are able to do and how powerful it is.
So with that here's one super easy tip and one foundation pathway for long-term learning.
The super-easy tip is to learn some basic shortcuts[1] so you can move quickly and get shit done without constantly reaching for the mouse. In particular, learn the "Ctrl - arrow" shortcut (move to the end of the contiguous data in the direction of the arrow), the "Ctrl + Home" shortcut (Move to the start/top left corner of the spreadsheet) and realise that when you're holding down shift this means you can select regions of data for cut and paste or other operations really quickly. Also learn:
1) If you're using a mac you'll need to turn off some of the exposé features or rebind them if you want decent excel keyboard shortcuts. Small price to pay imo but your opinion may differ of course.
2) If you're using anything other than an archeological version of excel you're going to have to come to terms with that stupid ribbon thing. Luckily from a keyboard shortcut pov the ribbon means you have one simple set of shortcuts to learn to access any icon on the ribbon from your keyboard so learn it. On PC just press "Alt" and your ribbon will light up with all the keyboard shortcuts for everything from the ribbon on Mac I can't remember how you do this thing and a quick scan doesn't reveal. My chops are a little rusty because I only ever really used excel seriously on windows.
OK now you won't be painfully hobbling about the app one row or column at a time and reaching for the mouse the whole time, the long-term learning. Understanding the power of excel comes down to realising you are editing a model of a graph computation, then learning and understanding a few key features which are really powerful, and will hint at other directions to explore to learn more. I'll give you some examples, but these really are the tip of an incredibly huge iceberg.
1) Autofiltering
Put yourself in a sheet where you have column headings at the top and one contiguous block of data in rows and columns beneath. Go ctrl-home to move to the top left of your data, then go shift-ctrl-end (or shift-ctrl-right and shift-ctrl-down) to go to the end of your data. All your rows and columns should now be selected. Now click on the "AutoFilter"[2] icon or if you've been paying attention to 1, use "Alt" to choose the icon of a hopper thing on your ribbon. It says "filter" next to it and the link below has a picture. This allows you to sort and filter your data in very flexible ways with a UI that's very intuitive for non-technical users. I often point UX folks at this feature when they (inevitably) come up with a pale shadow of this capability.
2) Pivot tables
With your data still selected, go to "Insert" on your ribbon and go "Pivot table", select "new worksheet". OK here you have a thing that basically does a select ... group by on your data with various aggregations, sorting, filtering and a bunch of related functionality all in a pretty simple wrapper. Play around and get familiar with this. You can do amazing things really quickly with pivot tables. Yes I know you can do all this and more in pandas but your mba colleague can do this with excel in seconds and they can't write a line of code. You may be getting a sense of why people consider excel powerful.
3) Vlookup, hlookup, sumif, countif and friends
OK that was the entry-level drug, now go find out about vlookup, and sumif. These are simple functions that look data up in a table. Typically vlookup takes a sorted table on some reference sheet, looks up some key in the leftmost column and gives you back the value of some cell in that row. Realise this adds higher-order dependencies to the graph of your computation. People use this to do amazing shit with vlookup.
Sumif is a simpler lookup. It takes a table and a predicate and sums up values matching that predicate. It is often used to look up single values where the table isn't sorted but you know you only have one of each key.
4) Index, Indirect and Address
We've gone too far to stop now. If you're writing a sheet that uses these you already know you are a bad person and don't care. these are the `eval()` of excel, allowing you to construct arbitrary references to cells or arbitrary functions as strings, dereference and evaluate them. You can then compose these into other functions. More details of this depravity can be found here[3]. It always makes my day if I am making a sheet that requires any of these functions.
XLOOKUP is a better VLOOKUP. It was added in the last few years. Other really good new ones:
- LET for defining temporary variables inside a formula.
- LAMBDA for defining new functions
- IFS is like if-elif-else with a flat structure, solves the deeply nested IF problem
- SWITCH does what you think
- TEXTJOIN join a list of strings on a delimiter
Then there's the whole spill-arrays feature that completely changes the game. Much better than the old dynamic array formulae. You can finally treat ranges kind of as if they're dynamic-length arrays in a conventional programming language. There's MAP, FILTER, REDUCE, UNIQUE, SORTBY, HSTACK/VSTACK, etc.
Other than INDEX, please don't, for the very reasons you say. They're like eval. When someone hands me a spreadsheet that heavily uses INDIRECT I have to spend a long time figuring out what's happening. They're also volatile, meaning they're recalculated any time you do anything, rather than when they're needed, because Excel can't statically determine their cell dependencies.
Other important features: tables (i.e. the structured-reference tables, not pivot tables), Powerquery and its associated M language, VBA if you have to deal with a lot of legacy documents.
I sadly had to go deep in excel for a year or so early in my career. This is a good overview.
Only change I’d make would be to move Index to your third category alongside Vlookup, and add the Match function there as well, making sure to advise using exact match (match type 0, which isn’t the default).
I ditched Vlookup for Index-Matching and was able to scale notebooks quite a bit further. I don’t know why this was, however. Pushing all the heavy data processing to the SQL database, by pre-computing every possible metric to be shown to users, made my spreadsheets super lightweight and responsive. Essentially only doing index-match lookups against a single (but big) data tab. The finance team loved it.
INDEX and ADDRESS aren't bad functions, INDEX will not fetch a value outside of the range you specify so it doesn't mess up with the calculation model (you are explicit about your dependency). INDIRECT, yes, that's the nasty one. For advanced excel users, this website is worth a read: https://www.decisionmodels.com/calcsecrets.htm
It's also worth going through the function list in the documentation once, just so you know what is possible.
5) learn how to use array formula. They can be used for two use cases, either a function returns an array, though this is now deprecated with the SPILL feature in more recent version of excel. But more useful: you can do array based calculations (more powerful than SUMIF). For instance give me the number of time the value changed in a timeseries: {=SUM(1*(A1:A100<>A2:A101))}.
6) of the rare useful new features of excel, there is PowerQuery, which allows you to load data from a csv file or database. Very useful when you need to refresh that data. You can parametrize power query so that for instance the filepath of the csv file is defined in an excel range. It avoids this repeated pattern of manually opening a csv file and copy pasting the data when you need to refresh the report.
> learn the "Ctrl - arrow" shortcut (move to the end of the contiguous data in the direction of the arrow), the "Ctrl + Home" shortcut (Move to the start/top left corner of the spreadsheet) and realise that when you're holding down shift this means you can select regions of data for cut and paste or other operations really quickly
These are actually standard shortcuts that work in all Office programs and many others, including browsers. It's amazing how many people still use the mouse to select, and spend an inordinate amount of time doing so.
To me, conditional sums are a replacement for filters and pivot tables, because they let you have all the information in one go, without constantly clicking to select criteria.
Just select unique values for criteria and do conditional sums for each of them, and you have a visual that shows you all possible combinations and results at the same time.
The way I keep my Excel skills up to scratch is by staying updated with the Pro Excel scene. One method I employ is watching Krazam as he goes through the release notes. You can find his videos at: https://youtu.be/xubbVvKbUfY
The same Martin Shkreli as Martin "chased failed investment debt by buying patents for cheap (but life-dependent) drugs and hiking the prices by several orders of magnitude, but still failed to recover and was eventually prosecuted, and convicted, by the FBI for running this Ponzi Scheme" Shkreli?
I live in Excel. This video was basic. I was hoping for more sophisticated instruction. But from reading the comments, it’s clear I need to learn a data analysis language.
I think Python and Julia are popular for this purpose, Python much more so.
You will probably want to use one of the “notebook” environments that make your Python easier to use interactively and display graphs and the like. Jupiter is the one I know but there are others too.
In terms of popularity I'd say Python>R>Julia
R might be closest in syntax to how excel works, especially with the ifelse functions. R also has dplyr, which is unmatched in how it makes data manipulation easy and streamlined
I'm a data engineer and have previously worked on projects in a large and long running company where some important operational metric is calculated using lots and lots of spreadsheets tied together by a combination of formulas and macros. Sooner or,ore likely, later, they would stop being possible to maintain and run and my team would get called in to rebuild the business logic in a (non excel and code instead) data pipeline.
The resulting pipelines were always very simple, but the process of working through complex spreadsheets was always a special layer of Dante's hell that he didn't even dare writing about.
These weren't created because people sucked at excel (those people made very simple spreadsheets with a few calculated columns in) - they where made by people who where very skilled but didn't spot that they should stop.
All those bad spreadsheets cost hundreds of person hours to maintain, and ended up being finacially and emotionally expensive to untangle.
Analysts of the world: please ignore this video and continue sucking at excel.
The problem is, that you always start with a table just like in video, with 12 rows and simple functions.... coding everything for 12 rows is overkill.
Then you just add two more rows and one column. And 6 more rows and two more columns. And just two more formulas. And then you need just one minor thing, that can be done with a macro in five minutes.
Yes, companies such as Apple need specialized software to handle stuff, and there are solutions to handle this (eg. SAP, no matter how pain in the ass SAP is). But expecting Apple to use SAP when it was just the two Steves in a garage is stupid... and upgrading to SAP (or whatever other solution) is a thing you do, once the excell spreadsheet gets to an unmaintainable mess and not a second earlier.
That’s a shame. As scummy as he is those videos were excellent. Just how fast he was able to get data into excel and do things with it seemed like magic.
I enjoy using SQL but the idea of using it instead of Excel to visually explore and manipulate data makes me wince. I can do without Word or PowerPoint but Excel, despite all its data handling warts, remains an exceptionally powerful tool for quick one-off data explorations.
I prefer combining SQL with Excel by generating a query that gets most of what I want inside the database and then importing it into excel's powerquery to further explore the individual vectors.
I have multiple excel spreadsheets that query a database for data using a SQL query I wrote, that then displays excel based reports/dashboards. You can also grab data from a bunch of other sources - sharepoint, etc.
Then you can just update from excel and everything is in the right format for non programming people to use.
The thing about excel is that for large orgs its probably on every computer and almost everyone sort of knows how to use the basics of it. that gives it value.
The real value would be if there was some way I could get excel on a random machine without any privileges to refresh the data itself on a schedule.
One thing about Excel is that people don't see it as "programming," so they might consider it to be a job skill even if you're not marketing yourself as a programmer per se.
Just spent a few hours searching for a way to display 3d plots in excel, got frustrated and bought Matlab. Probably missed something hidden in thick UI forest.
There is a reason for Excels dominance. CFOs and executives down to junior analysts can quickly do tasks same day that would take weeks to build using overly complex programming languages. This is just more empire building by “I’m
Smarter than all you dumb people, why can’t you think like me” programmers.
I don’t need c++ or Python to build a net present value analysis that I could show the CFO. I can build it in excel in an afternoon, and I can hand it to the least technical salesperson, and he can make changes in my model and see the results instantly then send it back to me. And then my boss can. And then someone in marketing can.
And the bonus is I don’t get a bunch of useless errors that no one understands like “Stack Overflow Error” or “Cannot process because of error 1994505 super duper data transformation canooter value fluctuation.” The worst error I get is that it crashes and I lose 10 mins of work.
Face it - Excel is a phenomenal achievement and it burns arrogant programmers like you because we can do all this and completely shut you out. That’s why it’s a “horrible tool” lol.
Learning how to script a problem out of Excel brings many advantages. Its beyond the pure practicality. It helps you to think in terms of data structures. People who spend their lives on Excel have no idea how to best format data for processing and over and over repeat the same inefficient patterns.
A large majority of the people I work with have been working 20+ years. A lot of them make 200k-300k and have never written a line of code. They are doing great without it and my company stock value keeps increasing without coding in Finance.
There’s no need to hide behind the salary figure. If you and your colleagues r fine with a tool that’s “good enough” and would rather not undertake the mental burden of learning a new thing that’s fine. But don’t toss out all these irrelevant reasons that try to prop excel up as some truth
I didn't say I was making that salary but our executives are. The "mental burden" for us is thinking deeply about our business and how our products affect our profitability, not wasting understanding the difference between a list and dictionary. While not perfect, Excel is a fantastic tool and the only relevance I need to prove my point is the fact that it dominates the industry because it gets things done.
I'd be willing to bet you also probably look down on people who use Windows instead of Linux desktop. Or why they don't "burden" themselves to learn the oh so easy command line. If this is you, I get exactly the person you are and it defends my point even more.
My initial comment came across more antagonistic than I intended lol and I wasn’t trying to hand wave away the “mental burden.” I chose the word because it acknowledges the effort required, so apologies.
And I didn’t mean that Excel is bad or anything, it’s extremely good at what it’s for. And like u mention, it’s much easier for someone to pick up excel and “get stuff done” than say python. My point was simply that there’s merits to one suggesting an alternative to excel related tasks. It’s not unequivocally a better choice for everyone, but it certainly can be.
And lol windows is more so just the privacy thing and msft fucking their users around. I’ve heard great things about powershell. But ironically, the command line rlly isn’t difficult to grok, and certainly not hard relative to working with ridiculously nested XLOOKUP’s
It is better said that the "spreadsheet" concept is a phenomenal achievement (and it is true that currently Excel is the best spreadsheet program).
Already Lotus 1-2-3 for MS-DOS, more than 30 years ago, matched or exceeded almost all features provided today by MS Excel (and it had an optimized keyboard-based user interface that enabled experienced users to perform most tasks faster than in modern spreadsheet programs like Excel or LibreOffice Calc).
Excel has just provided the same features, with nothing original of any importance, but only with trivial improvements enabled later by better computers.
The fact that Excel has succeeded to replace Lotus 1-2-3 has not been caused by any technical superiority but by the integration in MS Office and by the fact that Microsoft has cheated, by ensuring that nobody else could keep up with the changes in the public Windows API and with the undocumented parts of the API.
Actually I have been both a heavy user of Lotus 1-2-3 in Antiquity and a heavy user of Excel in recent times.
It is obvious that a program which may have a size of many megabytes is able to include a lot of improvements over a program whose size was limited to a few hundred kilobytes.
Nevertheless, all improvements provided by Excel are quantitative not qualitative, they do not enable any essentially new application.
For instance, there is no doubt that it is much easier to write a maintainable Excel script in Basic, than in the awkward macro language of Lotus 1-2-3.
Even so, using just the macro language of Lotus 1-2-3, it was possible to write amazingly complex applications, e.g. for the automation of the tracking in real time and of the generation of reports about the flow of partially processed products through the steps of complex technological processes in some factories (with thousands of different products and with more than one hundred manufacturing process steps through which they might have to pass, depending on the part number), also of the printing of all the documents that accompanied the manufacturing batches, and where the data was stored in a database embedded in the spreadsheets.
Today, writing such a program by using a modern database, a modern programming language and a modern computer would be very easy, but doing the same within 640 kB and with a 33 MHz CPU was not a little accomplishment. At that time, Lotus 1-2-3 was one of the most useful applications for most businesses, especially when computers and programs were much less affordable and many would not have been able to buy any other program, except perhaps some word processor.
Excel has taken all the features of Lotus 1-2-3, except the user interface, which was no longer suitable in Windows, but everything added later were just enhancements that were obvious when faster CPUs and more memory became available.
Excel and spreadsheets in general are horrible in many ways, but being proficient in them is a basic life skill at this point and should be taught at school if it isn't already.
It's not just analysis; data representation, manipulation and collaboration are essential parts as well (especially the last one). Being able to quickly import numbers, format and surface the important part and show it to someone to discuss and tweak in real time makes a world of difference.
in python and R you can do a very complex graph in one line of code while it would require 30 clicks at least to do the same thing in Excel when its even possible. the power of abstraction.
In excel I click on a block of data and click on the graph I want and it appears. I use Python (Pandas) and Matplotlib pretty proficiently and I haven’t found a need to sub those for an excel graph. No picky data types. No installing and teaching someone Jupiter or Pandas or an IDE. The anti tech sales guy can just change the data in the cell, or pull the little box down. Or double click in the graph where he wants to insert a title.
You’re fighting a losing battle here. I have need for Python in what I do, but none of my Finance partners do. Coding has to get much much simpler before it will ever replace excel.
I’d bet money you are also the person who wonders why the world uses Windows over Linux for many tasks. When you can answer why regular people choose Excel over code and why regular people choose Windows over Linux, you’ll get it, and you can make a positive change in code to bring Linux and Code more mainstream.
I came to excel far after learning lots of programming languages, including those aimed at data analysis.
The only thing that comes even close for my current Excel spreadsheets use cases would be Rshiny. Interactive responsive analysis of various scenarios, with traceable computation for a very wide audience, is far better via Excel than even Rshiny, which hides all the code.
Starting with Excel is probably a great way to get to other tools. Especially when computations are straightforward and don't require the advanced stats libraries available in something like R.
IF your job is largely dedicated to data analysis and IF you don't need to communicate or hand off the raw work to a non-technical audience, and IF you've got the time and support to learn those skills, then you're probably right. But most of us don't live in that world.
Nothing prevents you from exporting the result of a analysis to excel at the end. This way you give people what they like while you dont have to use it for all the intermediate work.
I'm really grateful I never had to work with Excel in any job. My excel skills section of my CV reads "basic, would rather not use it". The closest I came was importing Excel files. I do however respect people who can use it well and I've seen people be really efficient in excel.