Understanding map, filter, and fold

gingerlime · on Dec 24, 2012

I liked this post, and find the use of map/filter/reduce one of the most fundamental changes to the way you can manipulate data. Truly eye opening.

Even though I have relatively little front-end experience, and use primarily python/ruby- I only started to use these with Javascript (via the underscore library).

The only thing I didn't quite agree with is " ... the syntax is obvious enough for most people to understand even if they don't quite grok Haskell".

Whilst the early examples were manageable, I was struggling with the syntax later on. The syntax looks clean, so it seems like Haskell is a neat language, but it wasn't enough to make it easy for me to fully 'get it'. Perhaps I'm too locked in a ruby/python mindset.

biesnecker · on Dec 24, 2012

Point taken. I might redo this with Ruby at some point in the future -- the functions are all available and it might be interesting to see the differences between the two.

ricardobeat · on Dec 24, 2012

MDN also has pretty good documentation for map/filter/reduce:

https://developer.mozilla.org/en-US/docs/JavaScript/Referenc...

What's with the space shuttle / F22 hovering beneath the text right in the middle of the screen? :/

biesnecker · on Dec 24, 2012

Haha, I've been playing with different background images... maybe that one isn't optimum for reading :-|

MaxGabriel · on Dec 24, 2012

I liked the article. Just to let you know, when you zoom on an iPhone the background covers the text for a ~ second: https://www.dropbox.com/s/95f8ivqxodh2qnc/Photo%20Dec%2023%2...

biesnecker · on Dec 24, 2012

Thanks for the head's up. I'm beginning to think the background image is more trouble than it's worth :(

deltasquared · on Dec 24, 2012

If you are working with C# these functions are available in linq: map is Select, filter is Where, and reduce/inject/fold is Aggregate in Microsoft speak.

I use those methods all the time when I have to do C# code.

I do wonder why these operations lack a standard name. Still, I think map-reduce sounds a lot better than map-inject or Select-Aggregate or map-fold.

ConstantineXVI · on Dec 24, 2012

If you're already familiar with SQL (which your average C# programmer likely is), select/where/aggregate look familiar to you, so less of a mental leap to understand them. And being able to say 'it's like in-line SQL!' makes Linq more acceptable to pointy-haired-bosses at the same time.

gnaritas · on Dec 24, 2012

Easily fixed with extension methods for aliases you like.

ollysb · on Dec 24, 2012

Going from java to ruby these functions were the most obvious improvement for my day to day coding.

mrgriscom · on Dec 24, 2012

I agree. These functions have completely changed how I approach algorithms and data manipulation.

I was super-excited to learn Go, but it was a showstopper to discover that Go's lack of support for generics effectively made this pattern of programming impossible. Major disappointment.

0x0 · on Dec 24, 2012

Genuinely curious, what do generics have to do with this?

gridaphobe · on Dec 24, 2012

The type of map is `(a -> b) -> [a] -> [b]`, which means that map takes a function from any type `a` to any type `b` and a list of `a`s, and returns a list of `b`s. Generics, aka parametric polymorphism, are required for this because the definition of map makes no assumptions on what types `a` and `b` will be instantiated to.

DougWebb · on Dec 24, 2012

What does "from any type 'a' to any type 'b'" mean? That sounds like a range of types, but I don't see how that makes sense. Did you just mean that the map converts from 'a' to 'b'?

FreeFull · on Dec 24, 2012

a and b are placeholders (type variables). They can be any type, for example Integer, or a Float. It doesn't matter.

[a] is a list of a, [b] is a list of b. Map takes a function that takes an a and returns a b, and applies it to every element of [a] in order to produce the [b]. An example usage is

  map lessThanThree [1,2,3,4]

In this case, map takes a function where a is an integer, and b is a boolean, then takes a list of integers and produces a list of booleans.

The result is [True,True,False,False]

jjaredsimpson · on Dec 26, 2012

It's also illuminating to note that haskell curries function application. This allows us to think of map as a function of one argument.

map :: (a -> b) -> ([a] -> [b])

so instead of thinking that map takes a function and a list and applies that function to the list, we can think that map takes a function and "lifts" that functions argument and result types to the list type.

This thinking leads us directly to Functors and fmap

GlennS · on Dec 24, 2012

I recently started a Java job after having avoided it in favour of better languages for some time. I've found the Google Guava library makes Java programming a lot more bearable: https://code.google.com/p/guava-libraries/

Thinks that will make your life easier

* Collections2.filter() -> this is just filter()

* Collections2.transform() -> this is map()

* ImmutableMap.of() -> for when you need to create a quick lookup

You still have to make an anonymous inner class every time you want to pass a function as an argument though.

ollysb · on Dec 26, 2012

Yeah, I actually wrote some similar functions for java when I first started experimenting with ruby on the side. I agree it's a big improvement on for loops but there's still a lot of line noise; the anonymous class(as you mention) and having to passing the collection into a static method which is never great. Importing Collections2 statically does make it a bit more pleasant though.

tel · on Dec 24, 2012

This is such a key insight core to many of the benefits of FP. Other resources on the matter that I like include

Meijer, E., Fokkinga, M., & Paterson, R. (1991). Functional programming with bananas, lenses, envelopes and barbed wire, 124–144.

Hutton, G. (1999). A tutorial on the universality and expressiveness of fold. Journal of Functional Programming, 9(4), 355–372.

Gibbons, J., Hutton, G., & Altenkirch, T. (2001). When is a function a fold or an unfold? Electronic Notes in Theoretical Computer Science, 44(1), 146–160.

(The third one is for theoretical interest mostly)

gordon_syme · on Dec 24, 2012

Good article but he makes the same mistake I did at first with foldl and foldr by saying they fold from the left and right respectively. Which leads to the mistaken thinking that foldr can't operate on infinite lists (it absolutely can, foldl can't).

It's better to think of them as left- and right-associative.

dfeltey · on Dec 24, 2012

This was my one complaint as well, and it happens with the example given of any'

Defining any' as

any' f xs = foldl (\acc x -> f x || acc) False xs

Then trying to evaluate any' even [1..] runs forever, but if we modify it slightly to use foldr instead as

any' f xs = foldr (\x acc -> f x || acc) False xs

Then evaluating any' even [1..] terminates basically immediately with True.

biesnecker · on Dec 24, 2012

Thanks much for the feedback. I clearly didn't understand them as well as I thought I did, though after reading http://stackoverflow.com/a/3085516/337184 I think I understand a bit more.

I'll update the post with a correction, and eventually write a "how I learned to stop worrying and love the difference between foldl and foldr" post. :-)

xiaoma · on Dec 24, 2012

Nice to see you on HN, John!

I had previously learned these as map and reduce/inject (thank you, ruby). It's only this past few months when I took an algorithms course on Coursera that I realized how much of a pain it was to do some of this stuff in other langauges (like Java). To me it was just natural to write functions that operated on an array of "stuff", rather than arrays of ints or strings, etc... The course literally had us building identical functions for handling different data types and then finally solving the problem via generics and the use of an ugly typecast.

carlob · on Dec 24, 2012

Doesn't building map with a fold hide its inherent parallelizability?

The nice thing about map, as opposed to fold, is that you don't need to know the result of the ith step to do the i+1th.

Martijn · on Dec 24, 2012

Not really. There's a really nice paper by Graham Hutton, "A tutorial on the universality and expressiveness of fold" which goes into a bit more depth than OP does. Your question is answered directly in section 3.2, "The fusion property of fold". Basically, a sequence of multiple folds can be fused to a single fold.

See http://www.cs.nott.ac.uk/~gmh/fold.pdf

carlob · on Dec 26, 2012

The article you linked proves the universality of fold, that I never doubted for one second.

What I was trying to say is that map is not universal, in that all function evaluations can be performed in parallel, because you can't use the result of the ith to compute the i+1th. In this respect map is in fact a restriction of fold.

drudru · on Dec 24, 2012

Awesome - I had the same question. I was basically wondering how to inform the system of the opportunity to parallelize.

BTW, Hutton's book on Haskell is just so excellent.

DougWebb · on Dec 24, 2012

That depends on your map function, for most languages. If your map function includes external state (like a variable from an outer scope) then the execution sequence could change the map function's output. Eg, in C#:

    var i = 5;
    var result = input.select(item => { 
        var val = item.v + i; 
        i = val; 
        return val;
    });

This would be an awful programming practice, but it serves as an example that map isn't necessarily parrallelizable.

ddfreyne · on Dec 24, 2012

A small mistake: “In Ruby, these functions are map, filter” — is incorrect; filter in Ruby is called select.

biesnecker · on Dec 24, 2012

Cheers, fixed (as soon as the cache updates)

haileys · on Dec 24, 2012

Also Ruby has 'reduce' as a synonym for 'inject'

gingerlime · on Dec 24, 2012

...and thank god for that. I find inject a really confusing name.

gnaritas · on Dec 24, 2012

The Ruby names are copied from Smalltalk, as is much of Ruby.

talaketu · on Dec 24, 2012

Also: (++) :: [a] -> [a] -> [a]

Article implies: (++) :: [a] -> a -> [a]

BasDirks · on Dec 24, 2012

Before people go off writing this `foldx foo initial bars` pattern:

http://www.haskell.org/ghc/docs/latest/html/libraries/base/D...

biesnecker · on Dec 24, 2012

I ignored the existence of foldl1 and foldr1 because though they're available in Haskell and make a lot of things easier, they're not available in every language and not fundamental to understanding the concepts themselves.

groenie · on Dec 24, 2012

> In Ruby, these functions are map, filter, and (confusingly!) inject

The ruby I'm using right now (1.8.7) does not understand "filter", but it does have the equivalent called "select".

drudru · on Dec 24, 2012

Also, if you use map and you want to easily filter out values, and 'nil' is an ok sentinel for this, then just use compact after the map.