Hacker News new | past | comments | ask | show | jobs | submit login
New Text and Array Functions for Excel (microsoft.com)
122 points by eDameXxX on Sept 6, 2022 | hide | past | favorite | 54 comments



Array functions feel like they're just too far from the excel design thinking. A single function that affects nearby cells is hard for me to swallow.

I kind of wish they went for the matlab cell array style where a function can return an array, but it just becomes a data structured stored within a single cell.

So TEXTSPLIT (which is great, finally), would return an object like ARRAY("I", "SAW","A","CAT") and if you wanted to unpack it you could drag a formula that was something like =$A$1?0, =$A$1?1, =$A$1?2, etc.

Or maybe just one single black magic affects-nearby-cells function called "UNPACK"


I somewhat agree, but I've bent Excel to do things that aren't "excel thinking" for literally decades.

Reshaping is something could have used many times in the past. I used to pull data into PERL first to reshape it but that took a bit of code, then I learned about numpy doing it in one line. But I still have to import back into Excel. The fact that it is in there now is useful to me.

I 1000% agree with your matlab retval suggestion however. I hate how Excel fudges array return values by just blasting a range of cells one time!

My excel knowledge has been somewhat stagnant in the past decade. Have they added an ARRAYFUNC() like in Google Sheets, or do I still need to hit "ctrl-shift-enter" to designate one?


> I kind of wish they went for the matlab cell array style where a function can return an array, but it just becomes a data structured stored within a single cell.

I had a go at writing a DLL plugin for Excel that did this years ago. I ended up with a kind of SQL, where each cell has a result set of records. The purpose was to make a functional language for consultants starting with a familiar environment to them. I even integrated a system where you clicked the cell and a pop up would show the data records. It was an ugly proof-of-concept, using strings that just identified each result set, and using custom functions. Excel is beautifully functional, with some nice parallels with SQL, and your data flow/dependencies are naturally visible. Excel is far less scary to most consultants than imperative programming is. I wanted to be able to model the data flows, use sheets for consultants to define custom pure functions for our system, and the final outcome was a reactive data system where data updates could flow (push) into outputs. I failed to get it delivered because I failed to get the COM interfaces working working: I failed to tie together Excel automation as a library engine (Excel COM API), Excel custom functions (plug in DLL), Delphi 7, and my own code.


>I kind of wish they went for the matlab cell array style where a function can return an array, but it just becomes a data structured stored within a single cell. you mean kinda like json and jsonb in postgres?



No, these functions use an aggregation function (like SUM). If you don't use an aggregation function the value of the cell is the top left element of the array. The parent suggests a cell which value is an array object, which can then be queried by another formula.


Ah thanks for clearing that up


It's been a while, but I think these functions must be map-reduce expressions. You cannot return an array and pull out it's elements elsewhere. You cannot do operations with the resulting array. Like a merge or whatever.


All they need to do is implement Google Sheets ARRAYFORMULA().


> Scenarios to try: Use “ “ (space) as a delimiter with TEXTBEFORE to extract the first name and TEXTAFTER to extract the last name

I guess this warrants linking to https://shinesolutions.com/2018/01/08/falsehoods-programmers....


20 years overdue...

Another one that is massively overdue: take multiple arrays as arguments and return the distinct values, sorted (kind of like the remove duplicate button, but that doesn't require to click a button). [edit] actually it was introduced in 2021 ("UNIQUE" function)

Also take multiple arrays and returns the values that are in common (like an inner join). Use case: you want to align two time series by creating a 3rd time series made of the dates common to both original time series.

Then you can have all sort of finance related function. Validate the checksum for an ISIN, CUSIP, SEDOL, etc.

Excel should also come with the most common holiday list (all the major cities at least).


> Also take multiple arrays and returns the values that are in common (like an inner join). Use case: you want to align two time series by creating a 3rd time series made of the dates common to both original time series.

You just have to add a column and you are an XLOOKUP and a quick filtering away from the result you want. It’s a fairly common operation.

> Excel should also come with the most common holiday list (all the major cities at least).

It’s the same. You just have to add a table and do a lookup.


> You just have to add a column and you are an XLOOKUP and a quick filtering away from the result you want. It’s a fairly common operation.

Yeah you can do that by introducing multiple columns or creating a VBA UDF. My point isn’t that it cannot be done (like TEXTSPLIT or XLOOKUP, there were more convoluted ways to do that already). It’s more that it is something common enough that there should be a simpke function for that.

On the holiday lists I mean the list of bank holidays by major city. I believe right now you need to provide that yourself, but it is something microsoft could build and maintain centrally instead of everyone reinventing the wheel. They do that for timezones/time change in the OS already.


Yes, I get what you mean. I was very happy to replace 90% of my use of convoluted EQUIV MATCH with XLOOKUP. Sometimes new functions are great.

The point I wanted to make is that sometimes you can just get results quickly by some manual manipulation like sorting in complement to formula.


All these new text functions and still no support for regex? Google Sheets has had regex support for years.


Agreed. You can do regex via VBA but I really want a built-in cell function for regex search, regex extract, and regex replace.


XLOOKUP has some simple simple regex patterns.


Apple’s Numbers spreadsheet also has regex support FYI.


Excel makes me feel like such a novice. I often just write a python script instead out of frustration. I don't think little of people that write complex formulas and macros with it. I am still waiting for "python in excel".


That would be awesome. With Sheets, I often jump into AppScript to write a quick javascript function to do what I want instead of having to deal with weird syntax. I wish I could do the same in Python.


Doesn't the libre office equivalent do python?


Ah, very interesting. Though it seems they only have the document control API. What I like about the Sheets integration is that you can write normal formulas you can use inside cells. like

    function REVERSE_STRING(input) {
      return input.split("").reverse().join("");
    }
and then call that inside a cell formula.


FYI, I wasn't wishing, this is planned by MS.


In the latest (that I can find) Excel team's Reddit AMA [0], Python was not addressed at all.

[0] https://www.reddit.com/r/IAmA/comments/m22uc6/we_are_the_mic...


If you're really a novice, watch Joel Spolsky's talk. I go back to it whenever I end up in Excel for something.

https://youtu.be/0nbkaYsR94c


I created an addin that gives you Python in Excel. Take a look at https://xlslim.com it is easy to use, yet very powerful.


If you want to lean into the novice feeling you could watch this: https://www.youtube.com/watch?v=x1RVNGDSdw4


A while ago I watched a couple of videos by Krazam [0][1] on competitive Excel. Thought it was just a sketch, didn't realize it was actually a thing.

[0] https://www.youtube.com/watch?v=ICp2-EUKQAI [1] https://www.youtube.com/watch?v=xubbVvKbUfY


I'm surprised it's 2022 and they haven't embraced a multiline equation editor.


Not sure when it became a feature, but since at least 5 years ago you can click and drag the equation edit bar down to reveal multiple lines for editing.

Pair this with Alt+Enter (line break) and adding 4 spaces at the start and you can nest functions in a familiar, albeit manual, way.


The resizable formula bar was introduced in Excel 2007: http://projectwoman.com/tag/formula-bar-resizing


There is a microsoft garage project that helps you do it that way:

https://www.microsoft.com/en-us/garage/profiles/advanced-for...


Excel is the second best tool for every task!

If you want to do lots of reshaping data and performing operations on tables of data, you are better using ETL tools (Extract Transform Load) that were designed for this task. For example: Easy Data Transform, Tableau Prep or Alteryx.


The functionality I want is a count-by-format function. I had to write my own in VBA and of course that means every time I open that sheet I have to approve its use of macros (and it also doesn’t always catch format changes that impact the calculation)


Encoding data in cell formatting is questionable practice.


> Encoding data in cell formatting is questionable practice.

"Let's force the user to employ data-management best practices" is, for better or for worse, very much not the design philosophy of Excel. (More to the point, if you must consume the data that someone else produces, then you'd like very much to be able to deal with their less-than-best practices.)


Agreed, but I can’t control half the sheets I interact with.

Users really like to highlight rows, or use coloring to track their progress, or do some insane multi-color-mixed-with-other-formatting system to indicate complex statuses.

I’d like to be able to work with that terrible data.


Why? Because the tools don’t provide nice ways to interact with it? I use colors and other formatting as ways of providing metadata to cells that I want to be able to move around in the spreadsheet.


Can’t you achieve that with the new excel lambda function?


Next challenge: make the find/replace dialog better than the confusing tabbed mess it is today. And make it non-modal for simple search/replace.


CTRL+F… text not found… what do you mean, I’m looking right at an example of this string… Edit->Find… oh by default it ignores the value of a cell and searches inside the formula that generates that value… change to search by value.

Next day… CTRL+F… text not found … what do you mean, I’m looking right at an example of this string…


I think it would be nice if they added a feature where you could visually tell what cell you have highlighted. Bigger screens nowadays, I always have to look in the upper left to see what cell I am in, and then find the row and the column on the left and then trace across and down and voila, there is the highlighted cell.

Making it a substantially different color outline or something would be a nice feature. Maybe a 2032 feature.


I am very surprised they have not yet embraced the dplyr/tidyr-style melt/cast gather/spread pivot/unpivot functions for arrays


I am not familiar with the R ecosystem, but have you tried PoweredQuery?

I had become my go-to for all sorts for all sorts of data munging in excel.


For me personally,these new text functions will be great for working with IP address, MAC address, FQDN, and URL

There are already ugly, kludgey ways of doing it (or doing it outside of Excel altogether) but this will be faster and more elegant


I wish they completed the IF variants of statistical functions.

Why do we have AVERAGEIF, but not AVERAGEAIF or GEOMEANIF, MAX and MAXIFS, but not MAXIF, etc?

I can’t find logic in that.

(https://support.microsoft.com/en-us/office/excel-functions-b...)

Ideally, the “IF” would be a separate function that filters out cells, so that it could be used with VBA functions taking multiple cells, too.

While at it, these IF functions should use lambdas for specifying the criteria instead of strings such as “>5”



That doesn’t do what the FooIF functions do: take a range, evaluate a user specified function on each of its cells, and then call Foo on the cells for which the function returns true.

Yes, you can create new cells that compute the Boolean, then use IF to populate a new range with the values that pass the test, and then call Foo on the new range, but avoiding such ‘pollution’ of spreadsheets is the whole reason these functions exist.


love to see this stuff, reacting to users in the world's #1 accidental entry point to programming. I wish they did this 20 years ago when i was neck deep in Excel daily!


It’s 2022 and how come they don’t have GPT4 and some stable diffusion built in?


seriously though, I think a copilot-like ML to improve flash fill is a genuinely amazing idea which could save insane hours, especially with people less familiar with complex formulas/coding


Someone just created that!

https://excelformulabot.com/

Apparently Microsoft is creating a plugin.

(Not affiliated)


Not quite what I meant. Basically you fill in some cells, and then expand the region and Excel will autofill. For example if you have

    Full Name        | Nickname
    Smith, John      | Johnny
    Ayers, Danielle  | Dani
    McDougle, Jack   | Jack
    Jason, Charles   |
    James, Trent     |
    Myers, Jessica   |
    David, Katherine |
And if you select the first 4 entries on the right and drag down, it will fill in the last 4.

This is known as flash fill and already implemented: https://support.microsoft.com/en-us/office/using-flash-fill-.... Except AFAIK it still uses hard-coded pattern recognition. This is exactly the type of thing ML is good for!


Google Sheets has this and it works really well.

I had an idea Excel had something as well but maybe not?




Consider applying for YC's first-ever Fall batch! Applications are open till Aug 27.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: