Haskell's effect on my C++ : exploit the type system (vallettaventures.com)
57 points by steeleduncan on Feb 5, 2012 | hide | past | favorite | 54 comments



  > Although the newcomer C# was only introduced ten years
  > ago, there is little in the language to reflect the
  > thirty years of research into programming languages since
  > C++ was introduced.
Not a huge fan of C#, but he couldn't have picked a worse example for language stagnation.


Agreed, especially considering the amount of cross-pollination there is between C# and Haskell, possibly through F#.

The people that develop F# and those who develop a large part of Haskell drink coffee from the same coffee machine.


I can say from experience that C# has several interesting features built on research from Haskell; you just have to probe the edges a bit. And once you probe the edges too hard, your code becomes worthless to the standard blub programmer who'd prefer not to have to learn the syntax.


If your only example of C++ verbosity fails to compile and is also patently wrong, you are not making a good point. (The constructor is private, and the variable naming prevents short accessors such as Rectangle::width().) Better:

struct Rectangle { double x,y, width, length; };

int main() { Rectangle r = {0., 0., 10., 10. }; return 0; }

This even compiles in C++03. If you really insist on private members, you will need initializer lists and some changed member names.

The small example error has been worked around for years by simply placing the constant on the left side of the comparison. Not pretty, but effective.

I agree with the general point, though. A functional style can make C++ much more efficient. It is also the only way to do template meta-programming and everybody who tries to go into that direction can benefit from learning Haskell.



Also found a definition of type classes that I like: automatic instantiation + constraint propagation

http://blog.garillot.net/post/15147165154/every-single-fresh...


I should have mentioned those articles. They are a great example of Haskell influencing a C++ programmer. Although this article might show what the average C++ programmer thinks `using the type system`, `verbosity` and `influencing their style` mean.


If you replace Rectangle r with struct Rectangle r, your example even compiles in C89. :)


While modern C++ (or Java, for that matter) may not reflect 30 years of programming language research, it has certainly benefited from 30 years of programmer feedback and development.

In C++11, the language is unaltered from the perspective of a PL researcher, its type system unchanged, but wildly improved from the perspective of a programmer. Initializer lists, auto, lambdas, and for-each loops were added, all of which make the language more compact in ways which most programmers will appreciate. Futures will do for multi-threading in C++ what RAII did for resource management.

Programmers are not computer scientists. From the perspective of a programmer, the big three languages -- C++, Java, Python -- are quite alive and changing in ways which allow us to write better, more interesting programs. Even if they are built on thirty-year-old type systems.


One of the more clever, not so well known C++ type system hacks I have seen is the way Ogre3D (a graphics engine for games) handles degrees versus radians. Basically they have a class for each and have overloaded the operators so that you can mix and match operations freely (e.g. you get a radian result from some computation, which you can then add twenty degrees to simply by saying + Degree(20)), and the C++ type system automatically converts back and forth. And since non-virtual classes (think Java classes where all members are final) take no extra memory and can be trivially inlined, there is no runtime performance hit at all.

It is a joy to use, too.


Do you mean "members are final" or "classes are final"? A final member is constant, which has nothing to do with inheritance. A final class has no subclasses.


All members are final, and it inherits from nothing -- that way there is no need to store a table of methods with each object, so the per-object size is just the data itself.


Do people really still make the `=` vs `==` mistake? GCC -Wall has had "warning: suggest parentheses around assignment used as truth value" since as long as I can remember.


That's an easy and one might say picturesque example to give, but there are many variants, such as an if statement that in one branch turns out to change the value in that variable, then later on you forget about it and use it incorrectly. Or it gets passed into something taking it as a reference which unexpectedly modifies it, or so on and so forth. All of these are a lot harder to give examples of in a blog post because they don't really hit you in bite-sized chunks of example code. By definition, you're supposed to understand everything going on in an example chunk of code, so it's hard to show the problems that can emerge with hard-to-understand code.


It's the one mistake that everyone seems to cite when complaining about C, and I've literally never seen it happen in all my years of programming. Right up there with fall-through cases.


I have made both mistakes.


I use Yoda Conditions[1] in C-oid languages, so the error would be visible even without warnings (because the program would refuse to compile).

[1]: http://stackoverflow.com/questions/2349378/new-programming-j...


That's fine when you make a comparison against a constant, but what if you're comparing two variables?


You make a good point, but the mistake is still made. Semantic correctness has other advantages, such as more optimisation when possible.


I can sympathize; I too use C++ by day, and have become increasingly bitter/jealous/frustrated as I've spent more time with Haskell by night.

To the point that I've started writing several libraries that bring to C++ several of the features I miss most from Haskell. In particular, I've missed the natural syntax of higher-order functions, e.g., currying and functional composition, as well as the pithy expressiveness of the Prelude vs the STL. With C++11, we can have these things, and while Boost gives us a wealth of functionality, it's hardly the most natural (and lightweight) library.

fc (functional composition) (https://github.com/jdduke/fc) and fpcpp (functional programming with C++)(https://github.com/jdduke/fpcpp) are my recent (very much work-in-progress) efforts towards this end.

With fc, one can naturally compose functions, lambdas and function objects, e.g., given composable functions f, g, h, write auto fgh = f + g + h; or, auto fgh = compose(f, compose (h,g));, and then call it as fgh(a,b,c);

fpcpp enables syntax like

  let pi = [](unsigned samples) -> double {
      typedef std::pair<double,double> point;
      let dxs = map([](const point& p) { return p.first*p.first + p.second*p.second; },
                    zip(take(samples, rand_range(-1.0,1.0)),
                        take(samples, rand_range(-1.0,1.0))));
      return 4.0 * filter([](double d) { return d <= 1.0; }, dxs).size() / dxs.size();
  };
  EXPECT_NEAR(pi(10000), 3.14, 0.05);
In writing fc and fpcpp, I've actually become less and less concerned for the future of C++; having used a good number of the C++11 features available, I've regained a good deal of my passion for C++ programming. Really, C++ needs a more expansive and natural standard library, iterators are powerful but are NOT the answer.


map and fold are available in C++, though they're called std::transform (in <algorithm>) and std::accumulate (in <numeric>). Totally agree with the author's point, the C++ I write nowadays is heavily influenced by my Scala/Haskell/Lisp exposure and is much better for it.


Yep. <algorithm>, <numeric>, <functional>, and <iterator> are all incredibly useful, yet so underused.


Does 'accumulate' work on non-'numeric' types?

Part of the beauty of Haskell is the aggressive effort to define each function on the most general type it applies to (and then specialize to concrete types during compilation, a la C++, when performance is requested via a pragma), leading to a rich hierarchy of tiny type classes (type classes are similar to C++ abstract classes).


Yeah, it's a real fold, just misleadingly named:

  #include <iostream>
  #include <numeric>
  #include <string>
  using namespace std;

  int main() {
    string s[] = { "a", "b", "c" };
    // will print "abc"
    cout << accumulate(s, s+3, string(""), [](string l, string r) { return l+r; });
    return 0;
  }
I think a closer C++ equivalent for typeclasses is template overloading (function templates can't be partially specialized, but overloads fill the same role)--for example, if I define:

  template<class InputIterator>
  string accumulate(InputIterator first, InputIterator last, string init);

then my call to accumulate(s, s+3, string("")) will use that overload.


> int a = f();

> if (a < 0) a = 0;

> has been replaced with

> const int a = MAX(0,f());

Going by MAX capitalization, it must be a macro, in which case the code is wrong. I am guessing this is a synthetic example, but still - try and get the basics right.


With GCC extensions, you can define MAX as a macro without calling the function twice.

  #define max(a,b) \
        ({ typeof (a) _a = (a); \
        typeof (b) _b = (b); \
        _a > _b ? _a : _b; })


Technically, the code is wrong only if f() is not pure.


Yes, I know, but there is no way to enforce it. This is a classic "noob" mistake with macros - using a macro argument more than once.


That's why in C++ people seem to generally recommend inline functions instead. Not only do the arguments only get evaluated once, but the whole thing is type safe too.

Of course, people still misuse macros.


Macros, by definition, are always misused, at least from the language's point of view. They exist to let the programmer get the job done when the language falls short.


Well, I would define misuse as using them when the language doesn't fall short, like for the max example.


I feel pretty safe assuming a pure function in a blog post about the advantages of immutability, really.


That first example is terrible. C doesn't guarantee that the arguments to a macro are evaluated only once. At best, you might be doing the computation twice. At worst (if the function has internal state), you could get different answers for the comparison and the assignment. The non-Haskell style is better!


I also find the non-Haskell style much more readable, but I've spent a lot more time reading C code than Haskell code, so that could just be idiomatic familiarity. To me it's clearer that it's doing: "use this value, except that if it's below 0, clamp it to 0". The MAX() reads to me semantically strangely, since I tend to view "max" as idiomatic when selecting between values, like max(left_child,right_child), but not as an idiomatic way of clamping values. It works, of course, just takes me longer to unpack, the way using short-circuit booleans for control flow takes me a second to unpack.

If it's a case where you can retrieve the value essentially free w/o side effects (as opposed to the result of a computation), one idiomatic alternative is:

    const int a = foo >= 0 ? foo : 0;


Better to use the std::max template function.


Do note that calling functions with the unsafe side effects is not Haskell-style, as that expression is not even possible in Haskell without literally typing word "unsafe" into your program at least twice. It is supposedly "Haskell-style" only in the OP's incorrect description.


  int a = f();
  if (a < 0) a = 0;
   has been replaced with
  const int a = MAX(0,f());
I believe the latter will generally result in 2 calls to f() (unless it returns less than zero), which could cause undesired behavior or just be inefficient. Of course, it depends on what f() does.

Assuming MAX(0,f()) expands to something like this:

  ((0) < (f()) ? (f()) : (0))


It was a poor choice on the part of the author to use the macro-looking MAX() in his example. There is in fact a standard max() function in the <algorithm> header that does not suffer from silly macro-expansion problems:

    template<class T>
    const T& max(const T& a, const T& b) {
        return a < b ? b : a;
    }

    template<class T, class Predicate>
    const T& max(const T& a, const T& b, Predicate p) {
        return p(a, b) ? b : a;
    }
And your definition of MAX() is wrong, not just because of multiple evaluation of the arguments. The “conventionally wrong” MAX() would be this:

    #define MAX(a, b) ((a) < (b) ? (b) : (a))


The problem is a loss of explicitness. The next person looking at the code isn't necessarily going to know how MAX is defined, and saving one line of code isn't worth making it less clear.

That said, for me, your code would be clear, as not being in all caps, I would assume it's a function and therefore doesn't have the same weakness.

As someone else pointed out, there's no way you can look at MAX(0,f()) and know that it's enforced that f() is only used once within the macro.


If MAX evaluates f() twice, then it is broken. It's trivial to write a version that only evaluates its arguments once in C++ using templates, and in C using common extensions that allow for macros with multiple statements that act as an expression.


C++11 adopted strongly-typed enums, but I wish it also added strongly-typed typedefs. I've written (and found others') template classes to implement something similar, but the implementation is surprisingly (?) verbose and may introduce code bloat or runtime overhead. The compiler could implement this with no overhead!

C++11's strongly-typed enums look like:

  enum class Enumeration { Val1, Val2};
  enum class Enum2 : unsigned int {Val1, Val2};
A strongly-typed typedef could use similar "declare a class deriving from a native type" syntax:

  class Celsius : double;      // something like this?
  typedef Fahrenheit : double; // or maybe this?
  Celsius celsius = 0.0;
  Fahrenheit fahrenheit = celsius; // ERROR


My somewhat biased stance: functional programming languages make you progress as a programmer quickly. Becoming great requires deliberate practice and intrinsically hard problems. So you learn programming and CS about 5-10 times as fast in a functional language. This is part of why enterprise Java/C++ devs plateau at a skill level of 1.2 (even into their gray-haired years) while ML and Haskell programmers can reach 1.5 in a few years and generally cross 2.0 if they make a career out of software. (For reference, here's where I explain that scale: http://michaelochurch.wordpress.com/2012/01/26/the-trajector... ; it's a 0.0-to-3.0 scale and 1.5 is what most of us would consider a seriously good programmer.)

So if you want to grow as an engineer and computer scientist, learn these languages. You'll learn a lot fast, and not just about one programming paradigm (FP) but about complexity and how to manage it. You'll just learn a lot more about software because the way to learn software is to write it and one person can actually accomplish significant things in these languages. Java and C++ demote the individual contributor: "commodity" developers crank out classes and a more senior "architect" figures out how to bolt them together.

That said, these languages won't improve your performance in typical Java and C++ jobs. Six months of exposure might give you some new ideas and make your code and performance better. Beyond that, they might diminish it. Why? Because you start hating these languages (Java and C++) and the accidental complexity they throw at you, which starts to look foreign and unnecessary. You get to a point where your own code (if you use the idioms of these languages, and you probably should, because it's not just about you) starts to look like someone else's. When you start hating your own code, disengagement sets in and it gets ugly fast. I can "hack" Java and C++ but I certainly wouldn't do a major project in them at this point.

My Java and C++ are uninspired now that I know better languages. Often, the only way I can write decent Java is to write it first in Scala (which I like, a lot) and transliterate the solution. Whether this produces "good" Java I don't even know.

Also, I really believe that C++ is the 21st-century COBOL. Java was gunning for that distinction, but Scala and Clojure gave it an assembly-like status, as the lowest-level language on the JVM (which will remain important for at least 15 years because of Scala and Clojure).


I agree with one side of your message.

There are very few people who have learnt only functional programming. I have met a few in academia, and they tend to be as "bad" as the pure Java/C++ programmers, but in a different way. They have trouble understanding how their code will boil down to CPU instructions, and so have trouble with performance, particularly (unsurprisingly) with mutating algorithms, which are vital for performance in some situations.

After I have defined a high-performance algorithm, they often want to modify it to make it "more beautiful" (which means functional) and in the process destroy the performance.

Also, I had quite a lot of exposure to Haskell, and I haven't found C++ hate. I got some Haskell hate, because doing some things (mutating algorithms and bit fiddling) were really, really hard, while I found most of the things I loved from Haskell I could do in C++, with care.

I work on low-level algorithm design in A.I., where performance is everything and there are many interlocking, low-level and complex algorithms and data structures. I know that other problems might well have very different characteristics. For example, I have no real idea how Haskell vs C++ would compare for the average web app (or, of course, something like Ruby/Python, which seem to be much more popular).


To make it clear, I don't mean to rag on C, which is a great mid-level language. It's the ++ that I hate. Assembly is infantry: can do a lot of fine-grained work, but limited to about 5 mph with equipment. C is a tank-- a pretty powerful and impressive machine. It can shoot on the run and do 50 mph off-road. High-level languages are airplanes, perfect at 500 mph but coarse-grained from an operational standpoint. C++ exists because someone saw a tank and a plane and said, "I'm going to put wings on a fucking tank and see what happens." Thus, the tank-plane was born. The wings destroy its maneuverability on the ground and the weight of the tank makes it a shitty plane, and it looks fucking ridiculous, but there you go. Java exists because someone else concluded that the problem with tank-planes was that people were trying to fly them too fast and built an equally ridiculous invention-- the tank-icopter.

After I have defined a high-performance algorithm, they often want to modify it to make it "more beautiful" (which means functional) and in the process destroy the performance.

That's annoying. They should understand the concept of interface as separate from implementation. Implementations are allowed to be (and sometimes should be) ugly if performance concerns merit it.

Besides, mutable state, intelligently used, isn't "ugly". A lot of programs are simpler and more attractive when mutable state is used. Mutable state just needs to be used with taste.

What really pisses me off about that behavior pattern is when the tinkering is unnecessary. If it's working code, then who cares? I'm a hard-core FP advocate but I write stateful programs all the time. I try to wrap them behind interfaces that are as functional as possible. I will use state in controlled ways when appropriate; I just don't want to impose it on other people.

I work on low-level algorithm design in A.I., where performance is everything and there are many interlocking, low-level and complex algorithms and data structures.

I think what makes C++ appropriate for your purposes is that once code is written to spec, it doesn't need to be maintained (except for performance optimizations). I'm guessing that (except possibly for some STL usage) you're actually writing C.

Where C++ and Java really drop the ball (legacy disasters) is when code has to be read more times than it is written. If your goal is to write fast software one time and, once it's working code, never need to look at it again, C++ is probably appropriate.


Thanks for your reply, it was very interesting.

One small comment - what makes C++ appropriate for my usage is mainly templates. Lots and lots of templates.

It is very useful to be able to easily compile an algorithm for different types. If compiling a whole 10,000+ line algorithm 4 times for char/short/int/long inputs with templates and doing run-time branching between the algorithms (but not in the algorithms) saves 15% CPU time, then that is well worth the work. I don't know (well) any other language that offers that. Obviously if you introduce some kind of abstract base class / interface like Java, you will immediately lose the benefit of going int -> short -> char, when a class is always at least... 12 bytes? 16?

In C doing that requires a lot of macros and rapidly gets unusable, in Java it requires introducing lots of interfaces and hoping things get inlined.

What I like about C++ (compared again to Java/C, not Haskell), is that I can without fear of cost (other than compile time) introduce another layer of abstraction. When benchmarking code this is amazing, I can easily swap in a vector, or a deque, or an optimised fixed-size stack array.

In Java it is common for people to 'un-Java' code, introducing arrays of primitives instead of arrays of classes for example.

I don't know how well Haskell does at that kind of thing.


I think C++'s template feature is pretty unusual in languages. Lisps have macros, which are much more fully-featured and more hygienic than C macros, but I don't think any Lisps are fast enough to be interesting to you. I know that the Scala community is working on putting macros in.

Have you looked at Ocaml and Scala? Those might be close enough to suit your needs, and I think it's only a matter of time before Scala (or something like it) starts outperforming C++ on parallel problems.

What I'll say for C++ is that, in the style of development you're doing, it's definitely faster. The problem I have with C++ is all the use of it that occurs where it's not appropriate and actually makes performance worse because per unit time, programmers can accomplish less in it.

It's clearly possible for expert programmers to optimize the hell out of C++ in a way that just doesn't exist in GC languages, but I don't think average-case code is better, especially if it's framed as an economic problem and programmers of equivalent skill are involved and given the exact same amount of time to do it. Under those constraints, for a large problem, I would bet on Ocaml and Scala winning just because the programmers would have more time to spend on optimization. For a small problem, though, I'd bet on C++.


C++ templates aren't just unusual, they're unique among mainstream languages. They're C++'s secret weapon. These days, if you're not making heavy use of templates in your C++ code then you're using the wrong language. Like you said, many people use C++ to solve problems they could be solving more quickly and more safely with another language, but it's not C++'s fault, it's their fault.

In particular, the key feature of C++ templates (as opposed to other similar systems of polymorphism) is that they abstract over not just code but data representation. This is what CJefferson was getting at and it's orthogonal to macros. Ocaml and Scala don't offer it. It's about data, not code, so you'll never be able to gloss over it with extra parallelism.

In C++ a std::pair<double,double> really is just two doubles, 16 bytes. It can be passed into and returned from functions in registers. If I have a singly-linked list of them, each list cell is 24 bytes (16 for the pair, 8 for the next pointer.) In any other (mainstream) language, the pair is really a pair of pointers to boxed doubles which reside in the heap, and the list cell holds a pointer to that. The overhead of this is unacceptable in many situations, and these situations are where C++ still thrives. (cf. the Eigen linear algebra library for an example of "doing it right.")

Most languages have given up on this feature because of the drawbacks (code bloat) and the unified runtime representation of data (basically everything is a void*) makes many things easier (garbage collection, serialization, etc.) (I'm not super familiar with C# but I understand the struct types address this somewhat, but not without their own drawbacks.) But thanks to LLVM (written in C++ btw) people are pushing this area of language development forward (Rust, Deca, and others) in an attempt to fix the (many) problems of C++ while preserving its strengths.

Oh and BTW: A tank with wings is called an A-10 warthog :)


> I think it's only a matter of time before Scala (or something like it) starts outperforming C++ on parallel problems.

It is certainly plausible. But a few things need to change

(i) JVM applications tend to be memory heavy; even if you can approach 75% of the run time speed of a C or C++ application, you would typically need a lot more memory. In my experience it has been 8 times or more, YMMV.

(ii) The other problem is that all the exciting native SIMD instructions are mostly out of reach. This can be a drag especially when your code uses log and exponentiation a lot. Without SIMD they are terribly expensive.

(iii) The kind of things that I use C++ for is heavy in array operations, in particular manipulating sparse arrays. In these applications the loop bounds are not constant and it is impossible to ensure at compile time that the array indexing will not go out of bounds. There Java's runtime bounds checking does slow it down. A way to override the safety would help.

(iv) The other good thing about C++ is the ability to delay evaluation via the use of templates; I am talking about expression templates. It's not pretty by anyone's standards, but with suitable libraries you can get the job done.

> I don't think any Lisps are fast enough to be interesting to you

Under favorable circumstances Lisps can easily give C++ a run for its money, even on numeric code. For that, take a look at Stalin. It's unbelievable how much one can optimize, even without type annotations.

There are a few new curly-brace languages out there that are trying to simplify generic programming. D is the most mature among these, and the new exciting ones are Clay, Deca and Rust. All have been discussed on HN. Exciting times ahead.

A multithreaded runtime for OCaml would be really cool. I do not need threads to be surfaced as an API, but the possibility that parallelism exposed by the functional style can be exploited by the runtime (perhaps based on options used to start the runtime).

Yes, there is F#, but Mono, I hear, is not that great on Linux, though I have not tried it myself.

OCaml / Stalin: an old and civil discussion https://groups.google.com/group/comp.lang.scheme/browse_thre...

Clay: http://claylabs.com/clay/ and recently discussed here http://news.ycombinator.com/item?id=3474722

Deca: discussed here http://news.ycombinator.com/item?id=3413936 and here http://news.ycombinator.com/item?id=3413936

Rust and D I think do not need championing :)


A multithreaded runtime for OCaml would be really cool. I do not need threads to be surfaced as an API, but the possibility that parallelism exposed by the functional style can be exploited by the runtime (perhaps based on options used to start the runtime).

This. I would love to see a multithreaded Ocaml.


> C++ exists because someone saw a tank and a plane and said, "I'm going to put wings on a fucking tank and see what happens."

What a great metaphor. Hilarious, great imagery, and it's even mostly true!

Thanks for that, I'll remember it.


To that, C++ proponents will reply: when done well, that gives you an awesome flying tank, as in http://en.wikipedia.org/wiki/Fairchild_Republic_A-10_Thunder.... It may not be sexy, but it sure gets the job done.

I think that is apt for C++, too. It is not sexy as in 'objects all the way down', 'functions all the way down', 'consign all the way down', or whatever else rocks one's boat, but it is sturdy and does get the job done.


We must have had very different experiences. I found Java to be a lot more limiting than C++, and I find them more different than similar. In C++, templates let you implement many interesting ideas in the language. The syntax may be verbose, but there is definitely a lot of room for experimentation. Smart pointers, for example, are a great concept implemented with templates, and they make C++ programming a lot easier than it used to be.

In Java, switching to a different JVM language is pretty much the only choice if you want more power. And even those languages are constrained by the JVM if what you actually want is performance and predictability.

The main strength of C++ is how easy it is to "embed" C into the more performance-critical parts of the code. All the high-level concepts in C++ provide easily understandable and transparent tradeoffs and can be turned off if you need to go "low level". Not every system needs this, and there are many other ways to use C with higher-level languages. But there are definitely situations where C++ will be the nicest choice.


you don't pick the language. The language picks you!


I've been reading predictions of the imminent breakthrough of FP into the mainstream for over 15 years now. I'm still waiting.



