How I want to write Node: Stream all the things (caolanmcmahon.com)
107 points by nua on Feb 14, 2014 | 114 comments



Forgive me for being "That Guy" but I really think Javascript is ill-suited for this paradigm!

Streams, honestly, are hard to keep straight when the program gets big without a stronger type system. IMHO.

Some really sharp people have been working on stream computing software in Haskell for a while - Gabriel's Pipes package is a good example of generalized stream computing with strong equational reasoning as its foundation.

Maybe if you really want to try and do this in Node you can gain some inspiration from his journey: http://hackage.haskell.org/package/pipes


Here's the main enlightenment of becoming a node.js guy:

Node follows the unix way. Everything is a stream. It's just Buffers and JS objects flying around. It's really stupid, and sometimes it's nasty. This isn't helped by Javascript's warts.

But there's an enormous upside to this: following the stupid Unix way means that no matter what you need to do with your data, there's an npm module for it. Just .pipe() your stream in and your code is done. This is amazing. And it's possible only because of how bare-bones and loose the Buffer stream API is.
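
A minimal sketch of that style, using only core modules (the file names are made up): a readable stream pipes through a gzip Transform into a writable stream, and any Transform from npm plugs into the same spot.

  var fs = require('fs');
  var zlib = require('zlib');

  fs.createReadStream('access.log')
    .pipe(zlib.createGzip())                       // any Transform stream slots in here
    .pipe(fs.createWriteStream('access.log.gz'));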

Strong typing has its place, but it would ruin node's biggest selling point. It's hard to realize this without trying it.


I don't know Node very well and know JavaScript only minimally, but I understand what streaming I/O is and have used generators / iteratees for a long time in many languages.

I understand the benefits and that's why Pipes in HS is such an exciting thing because it gives us a formally reasoned and general set of stream computing tools - you can compute anything with type-level guarantees. It's just as flexible and general as, say, Unix pipes but better because there are guarantees of the library's tooling and there are guarantees of the programs you produce! You can't say that in Node / JS, Python, Ruby, etc...

JS's lack of strong typing limits your ability to reason about streams (a lot more than just streams, too) and further limits your ability to write performant stream computing software. Pipes, in Haskell, give you the big three: Effects (I/O), Composition (function composition with fusion), and Streams (generators and iteratees); because of the type system Haskell (and some nudges here and there by the library author) can fuse and optimize that code to a ridiculous degree in addition to all of the other nice guarantees you get from the type system (separate of I/O from pure code, etc...)

I personally don't think dynamic typing is a selling point, ever - I write software faster and with fewer bugs in Haskell than I ever have before in Python, Ruby, Erlang, or Scheme. But that's a totally different topic and I don't want to derail this one.

Don't misunderstand me as being aggressive, please. I fully respect what people decide to like and work on, I'm just trying to expand the awareness that there are tools in existence that do it better.


> JS's lack of strong typing limits your ability to reason about streams (a lot more than just streams, too) and further limits your ability to write performant stream computing software

I think you missed my point so I'll restate: in practice you don't reason about streams in Node, because the community (a product of the simplicity of the streams API) has a packaged solution to your problem. It plugs right in. And this ecosystem exists because of the simplicity and dynamicity of the constructs used.

I actually agree with you that Haskell does it "better". It's purer and cleaner. You'll probably have fewer bugs if you write everything in Haskell.

Except it doesn't matter to me, because Haskell doesn't have anything close to the plug-and-playability of npm modules -- and this is a pure social product of the stupid interface that Node exposes compared to Haskell. Node is shittier, and that's why it's more capable at solving the problem I have -- constructing powerful apps in close to no time, and zero lines of my own code.

I guess what I'm saying is that sometimes worse is better.


Can you qualify "Haskell doesn't have anything close to the plug-and-playability of npm modules" because I don't quite get what you mean? Maybe it is because I can't think of anything plug-and-play in node that isn't plug-and-play in Haskell.


Sure. I want a git server with push notifications. Two lines with the pushover module in node.

I would be pleasantly surprised if this existed at all in the Haskell community. Even more so with two lines of my own code.


I think that has much more to do with the size of Node than the methods of streaming being used. Such things are easily accomplishable atop libraries in Haskell, but pushover hasn't been written as a library because the ecosystem isn't at the point where it specializes so far yet.

But the API provided is trivially replicated, and, yes, easily built atop pipes.


Yes, and "the unix way" isn't just "everything's a stream", but rather "everything's a text stream." In that regard, node's version of "everything's a stream" is actually a half-step up in abstraction.


Isn't it more accurately a byte stream (I don't know, which is why I'm asking)?


The default built-in streams found in the stdlib are byte streams (byte-chunk streams, to be more exact). But at the user level there are object streams as well; those are also mentioned in the docs: http://nodejs.org/api/stream.html#stream_object_mode - so they are kind of official.
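
For illustration, a minimal object-mode Transform (streams2 style; the names here are made up):

  var Transform = require('stream').Transform;

  var upcaseNames = new Transform({ objectMode: true });
  upcaseNames._transform = function (obj, enc, done) {
    this.push({ name: String(obj.name).toUpperCase() });  // push JS objects, not Buffers
    done();
  };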


I'm a Haskeller but I'll defend Node a bit.

The absolute most compelling thing about Node is how it has hacked organizational dynamics in large companies. Walmart and PayPal basically used it to completely liberate their frontend groups from their backend systems using a facade system, with huge improvements in customer-facing systems.

What is sad is that all the other high concurrency systems are going to end up implementing much of the GHC runtime without the reliability of Haskell...


Streams are being used in node.js already with a lot of success. Both basic byte chunk streams and object streams. As long as components are well behaved and only emit uniform data (everything they emit is of the same type), it's not better or worse than any function call.


Right, but the burden is on the programmer to ensure those types. If the stream composition is of simple things, like:

    cat some.txt | sort | uniq > yay.txt
Then it isn't a problem - it's obvious and simple, but I think there will be difficulties as the programs get larger and type-level awareness occupies more space in the programmer's brain vs. it being handled by the compiler...


I didn't say loose/dynamic typing would not put the burden on the programmer to ensure the types. And it definitely has its problems. I just don't think that this is specific to streams; it's true for every kind of composition.


Interesting example. The fact that you know you have to call sort before uniq already shows that even with plain GNU/Unix utilities the user must know what they are doing.


If you're doing Node.js, Caolan's async library is pretty much part of the standard toolkit.

I know Caolan's been thinking about this and reworking it for a while, so I'll be interested to see whether it manages to see significant takeup.


From Great British Node Conf, maybe two months ago: a speaker talking about promises asked what people use for flow control:

- promises: about 20% of the room.

- async: about 80% of the room

For me async.waterfall([list of functions]) is a little nicer than 10 chained .then()s.

And people advocating promises still keep saying it keeps things flat. No it doesn't, we're already flat because we're all using async. Stop pretending async doesn't exist and isn't massively popular.

And way, way better documented. Q.spawn what? And this is the best promises library?

Stack Overflow question: Simplest fs.readFile example with generators and Q?

Current only answer:

  Q.spawn(function* () {
      …
      var data = yield Q.ninvoke(fs, "readFile", somefile);
      …
  });
Answer from Highland docs:

    var data = _.wrapCallback(fs.readFile)('myfile');
- What's Q? (Yes, it's a module, but what does it mean? Is it supposed to be a misspelt queue or something else?)

- What does 'ninvoking' something do?

- Shouldn't I just be able to put the variable declaration outside of the scope?

- Why do competing Open Source implementations of the same standard exist? Can't there just be one reference implementation?

That's not the future.

I might be really ignorant here. I probably am - I could read a shit tonne of docs to work out what this strange beast does and technically someone can probably do a better job answering that Stack Overflow question. But nobody has, because very few people know how to operate the current state of the art generators/promises setup.

From the Q docs: "If you have a number of promise-producing functions that need to be run sequentially"

No, I don't have a number of promise producing functions. Nobody in nodeland has that. I just have functions. I could read about turning them into promise producing functions, and calculate whether this abstraction layer is adding value, but then again, I could do productive work with async.

And from the looks of it, Highland too.


I have a question about the async library: what does it do if an asynchronous function or a callback raises an exception? The word "catch" doesn't appear in https://github.com/caolan/async/blob/master/lib/async.js.

I ask because I wrote some Lisp macros (I work in a Lisp that compiles to JS) to implement a few async patterns I need, and making sure that exceptions are trapped and threaded into the callback chain correctly was the most complicated part.


It doesn't do anything about thrown exceptions. The correct way to deal with an error in asynchronous code is to pass an object describing the error as the first argument to the callback. Any code that takes a callback is expected to know this and not throw exceptions.
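
For reference, a minimal hypothetical sketch of that convention (doWork is made up):

  var fs = require('fs');

  function doWork(path, callback) {
    fs.readFile(path, function (err, data) {
      if (err) return callback(err);      // report the error, never throw
      callback(null, data.toString());
    });
  }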

From a practical perspective, it doesn't make sense to try to catch exceptions in asynchronous code, anyway. Once you do something asynchronous, you lose the stack and thus the try block. The way to catch thrown exceptions in asynchronous code is with domains, which something as low level as async would not be expected to handle.


> The correct way to deal with an error in asynchronous code is to pass an object describing the error as the first argument to the callback.

Sure, but what if the error is thrown at you as an exception in the first place—which happens a fair amount, because that's how the JS runtime tells you when something is wrong? How do you get from there to the callback way?

What the Lisp macro I mentioned does is generate a separate try-catch around each block of code that runs at a different time and thus might throw an exception that would not otherwise get caught. In that way it catches every exception that's thrown, converts it to an error object, and passes the error back through the callback chain. The async library could do the same, albeit with a lot more code. I'm curious why it doesn't.

> From a practical perspective, it doesn't make sense to try to catch exceptions in asynchronous code, anyway.

I don't think that's right. Asynchronous code is just synchronous code that runs at different times. Each block of synchronous code can generate exceptions. I agree that if you don't catch them then, they become useless; but you can catch them then. The reason this is not a "practical perspective" in JS is not that it doesn't make sense, it's that the language doesn't support it. Even the minimum code necessary to catch every exception involves so many try-catch blocks as to obscure the rest of the program. So no one writes such code by hand in JS.

Yet it is, I think, code that one wants, because without it you don't have a consistent error model. You end up having one model for first-class errors—the ones you detect and pass to callbacks before an exception has a chance to arise—and a second one for the dregs—the ones that come from any code that didn't know about or follow the callback convention (which, critically, includes the language runtime). The latter kind of error either crashes the server or gets caught by a top-level handler so it "only" crashes the request it was processing. That's a half-baked system.


> Sure, but what if the error is thrown at you as an exception in the first place—which happens a fair amount, because that's how the JS runtime tells you when something is wrong? How do you get from there to the callback way?

It's on you to catch that, not your libraries. This shouldn't be terribly common, though. The only thing I can remember having to wrap in a try/catch in the codebase I work on is JSON.parse.

> The async library could do the same, albeit with a lot more code. I'm curious why it doesn't.

It couldn't, without domains. try-catch wouldn't do it. Domains are something that is not very well understood, in my experience, and they're expected to be handled at a higher level than libraries like async.


I don't understand most of this. For example, I don't know why you say that the async library couldn't try-catch every place that an exception might occur (mostly its calls to the functions that get passed in to it). It would be interesting if it couldn't, since then we'd have an example of something macros can do that functions cannot. But it seems obvious to me that it could; you'd just need a lot of try-catches. What am I missing?

As for domains, I don't know what you mean by them, but if they're catching errors at a higher level than the async library, my guess is that they must be some more sophisticated sort of top-level handler; perhaps something that keeps track of which async calls are in progress and attempts to bind exceptions back to their context? Whatever it is, it sounds complicated.

But what I understand least of all is how you guys all seem to write Javascript code that generates almost no exceptions. To me that sounds almost like bug-free code. No null references, for example? I get stuff like that all the time.


> No null references, for example?

We write in CoffeeScript, where a null reference check is so astonishingly easy to write that you use them everywhere you might get a null. I'm not sure what other exceptions you're seeing. We do basically no math, so /0 errors aren't a problem.

> For example, I don't know why you say that the async library couldn't try-catch every place that an exception might occur

Let's build a typical function you might pass to async:

  function(next) { request.get(url, function(err, data) { JSON.parse(data); }); }

Let's assume the server doesn't serve JSON like we expect - so JSON.parse throws an exception. The only thing async could have wrapped in a try/catch is the main function, but we've fired off a request and then the call stack wrapped up, including the try/catch. Next, an event occurs that calls our callbacks, not going through async at all. That's where the exception occurs. The stack trace generated by that exception doesn't contain any code in the async lib, so it can't possibly have a try/catch active.

Domains are a way of fixing this. You create a Domain and bind callbacks to it - if that callback throws an exception, the Domain instead emits an error event.
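
A rough sketch of that, using Node's built-in 'domain' module and the same request/JSON.parse example from above:

  var domain = require('domain');
  var d = domain.create();

  d.on('error', function (err) {
    console.error('caught async error:', err);
  });

  d.run(function () {
    // callbacks created inside run() are bound to the domain
    request.get(url, function (err, data) {
      JSON.parse(data); // if this throws, d emits 'error' instead of crashing the process
    });
  });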


Ok, thanks, I get it now. In my case a macro transforms the body of each callback to catch exceptions and pass them back as error args through the callback chain. So in your example, there would be a generated try-catch around the JSON.parse(data). I forgot this detail (sign of a successful abstraction?) and it does seem an example of something macros can do that functions cannot.

Re null reference checks, to get behavior analogous to a null exception you have not only to check for null, but also pass back an explicit error if you find it. That's a lot more work than adding in an extra question mark. Null checks that do nothing but not crash are a mixed blessing; 90+% of the time they do what you want, but when they don't, you get a silent failure and a debugging goose chase. I'd be surprised if you told me that that never happens.

I took a look at Node.js domains and they do seem really complicated. If I were working in Javascript instead of having control over the language, I doubt I would use them; I would probably just crash-and-restart as one of the other commenters described. That's not a good solution, but probably the best tradeoff given the alternatives.


Our use-case for domains is to allow the process to finish serving its other in-progress reqs before crashing. When an error occurs, we stop accepting new connections in that process, give them 10-15 seconds to complete, and then do the crash-and-restart cycle.

That said, we get very thorough testing from our large user base, and we quickly fix crashers. Our server proc crash rate is almost 0, brought up by occasional spikes on releases.



Last I checked, exceptions were not widely used in javascript because the try... catch block was a huge performance loss. I forget why exactly -- I want to say that the browser would spin up a whole new interpreter for catch blocks, just like with eval -- and it might be fixed in more modern JS engines, but I've still never seen exceptions used in JS. So I wouldn't be surprised if async doesn't handle them at all.


But runtime errors are exceptions, so even if your code doesn't throw them, you have this problem. No?


You generally pass expected errors up the chain, exceptions are for things like syntax and type errors etc, that only happen if you have coded it wrong. I run the server on forever.js, then go back in and fix my mistakes if that happens.


  var Promise = require('bluebird');
  var fs = Promise.promisifyAll(require('fs'));

  Promise.spawn(function* () { 
    var data = yield fs.readFileAsync(somefile);
  });
Documentation: https://github.com/petkaantonov/bluebird/blob/master/API.md

My favorite example is doing a diff using an async diff service which provides a function `svc.diff(string1, string2)`. But also imagine that you need to preprocess the files using the sync function `removeEmptyLines(buffer)`. This is what the function looks like when implemented using Bluebird:

  function diffTwoFiles(f1, f2) {
    var file1 = fs.readFileAsync(f1).then(removeEmptyLines),
        file2 = fs.readFileAsync(f2).then(removeEmptyLines);
    return Promise.join(file1, file2).spread(svc.diff);
  }
I'd love to see someone come up with a better example using callbacks and async.
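
For comparison, a rough callback/async sketch of the same flow (assuming a callback-style svc.diff(a, b, cb), with async and fs in scope):

  function diffTwoFiles(f1, f2, callback) {
    async.map([f1, f2], fs.readFile, function (err, buffers) {
      if (err) return callback(err);
      svc.diff(removeEmptyLines(buffers[0]), removeEmptyLines(buffers[1]), callback);
    });
  }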


- Why BlueBird? What's wrong with Q?

- Isn't data a scope down?

- https://github.com/petkaantonov/bluebird/blob/master/API.md is API based, not task based. async is API based too, but the API has names like 'waterfall' and 'parallel'. I can click them because I know what they mean.

Promises just make me feel like I'm reading about an endless series of abstractions.


There is nothing wrong with Q, but Bluebird is a bit more node-oriented and also has really, really low CPU/memory overhead (lower than caolan's async). Also it provides the best debugging experience, period - because of its long stack traces spanning multiple previous async events.

I didn't quite understand the comment about data being a scope down. What do you mean?

Yes, promises do have quite a steep learning curve :/ However they're a lot more flexible than a utility grab-bag of functions that never quite fit the problem you're having. By that I mean I often had to massage my functions (by creating new closures or using bind etc) to make them fit the signature that async requires.

I wrote a bunch of examples for common tasks here - http://promise-nuggets.github.io/


Thanks for the awesome blog post. I read it extensively when I was trying to use promises for everything. I ended up deciding that it was not worth the trouble of learning a lib with like 30 methods for a tiny bit of syntax sugar. Callbacks have never even really bothered me. Named functions FTW.


Those methods are there for convenience. Most of the time while I'm working with promises, I don't use anything else except `Promise.all` and `Promise.prototype.then`. Similarly, when working with caolan's async, most of the time you don't use anything else except waterfall, series, parallel, mapSeries and map. (Note however that async's utility grab-bag approach results in a larger commonly used subset :P)

Promises are not about syntax sugar. They're about utilizing the whole power of the language and providing a parallel for most features found in synchronous code:

1. Functions have return values

When using node style callbacks, we're ignoring the fact that the language was designed with functions that have return values. Instead we use half-functions. It's no wonder those compose quite badly - the language wasn't designed for that kind of composition. The language was designed to work with functions that take input values and return an output value. Callback-based functions do only the first half. That's why to get them to compose we resort to a bunch of hairy helpers and boilerplate code.

Callback-based functions that don't return anything are seriously crippled in power, and promises fix that, restoring much of the power.
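
A tiny illustration (all names hypothetical): plain functions compose through their return values, while callback "half-functions" need glue code at every step.

  // with return values, composition is just nesting or chaining
  var report = summarize(parse(input));

  // with callback half-functions, every step needs explicit wiring
  parseAsync(input, function (err, parsed) {
    if (err) return done(err);
    summarizeAsync(parsed, done);
  });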

2. Errors can bubble like exceptions

When using node style callbacks, we must explicitly handle all errors. On one hand, this is a good thing: we should deal with all errors. On the other hand, it's quite tedious: most of the time we can't deal with the error at the exact place it appears but must pass it up one level in the call chain.

Promises do the error bubbling automatically. We can attach the appropriate error handler at the appropriate place to deal with the error.

This simple feature results in tons of useful patterns, one of which is the ability to manage resources with constructs like C#'s `using` keyword. - https://github.com/spion/promise-using
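
A small hypothetical example of that bubbling: one .catch at the end handles a rejection (or a thrown exception) from any step above it.

  readConfig()
    .then(parseConfig)
    .then(applyConfig)
    .catch(function (err) {
      // errors from readConfig, parseConfig or applyConfig all land here
    });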

3. Values in variables can be accessed multiple times

When using node style callbacks and event emitters, we must make sure to "capture" the value exactly when it comes. If we don't do that, poof, it's gone forever - we missed it.

In contrast, promises will keep the value for us. If we need to access that value later, we can simply attach another callback handler. An example where this may be useful is a database connection:

We initialize the connection and get a promise for that connection:

  var pConn = db.connect(host, port);
How do we implement a query method that is immediately available and will queue up queries until the connection is established? Easily:

  function query(q, params) {
    return pConn.then(function(conn) {
      return conn.queryAsync(q, params);
    });
  }
It doesn't matter whether the connection was established a long time ago or hasn't been established yet - the query will either execute immediately or its execution will be delayed until the connection becomes available.

Now try doing this with callbacks :)
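
For contrast, a rough sketch of the same pattern with plain callbacks (assuming a callback-style db.connect and conn.query; error handling kept minimal): you end up tracking the connection state and the queue by hand.

  var conn = null, waiting = [];

  db.connect(host, port, function (err, c) {
    conn = c;
    waiting.forEach(function (run) { run(err, c); });
    waiting = [];
  });

  function query(q, params, callback) {
    if (conn) return conn.query(q, params, callback);
    waiting.push(function (err, c) {
      if (err) return callback(err);
      c.query(q, params, callback);
    });
  }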


+1 examples.

For me, nearly everything is .waterfall(), .each(), or .parallel(), and has been for two years now.


Native Promises already landed in Chrome 32, and Q still does not support native promises. Bluebird delegates to native if supported.

Promises can be used as flow control, but more importantly, it's an object that encapsulates asynchronous mechanics.

I'd like to see how async can launch an asynchronous operation, then allow listeners to be attached later to capture the result. Now, you may say that if you want to attach listeners to capture the result, you'll want to use event emitters. True, but event emitters have their own problem, because event emitting and attaching listeners are synchronous. What if the event was emitted before anybody has a chance to attach a listener to it?

Promises do not have these issues.
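
A hypothetical illustration of the difference:

  var EventEmitter = require('events').EventEmitter;
  var emitter = new EventEmitter();
  emitter.emit('done', 42);                      // emitted before anyone listens...
  emitter.on('done', function (v) {});           // ...so this listener never fires

  var Promise = require('bluebird');             // any Promises/A+ library works here
  var p = Promise.resolve(42);
  setTimeout(function () {
    p.then(function (v) { console.log(v); });    // still logs 42, the value is kept
  }, 1000);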


Using async, if you want something to allow listeners to be attached later:

  var outcome = Boxon();
  fs.readFile( "test.txt", outcome );
  // Attach listener later:
  outcome( function( err, content ){ ... } );
See https://github.com/JeanHuguesRobert/l8/wiki/AboutBoxons

Boxon objects are light compared to promises, but they interop well. Best of both worlds!


This is false. Bluebird does not and never will delegate to native promises.


ok... then can you explain what's happening here: https://github.com/petkaantonov/bluebird/blob/master/js/brow...

Where the line checks for `window.Promise`


For more on why bluebird crushes Q, check out the original HN thread: https://news.ycombinator.com/item?id=6494622

In short, its API is nearly as extensive as Q's with essentially no overhead—bluebird is hardly more expensive than callbacks, while Q is something like 10x slower than using callbacks.


Bluebird is the best promises library: https://github.com/petkaantonov/bluebird


Promises aren't just about keeping things flat.

The biggest value to me is being able to avoid passing callbacks around and relying on their varying conventions (some async functions are function(args, ..., callback(err, data)), some are function(args, ..., callback(data), errback(err)), some are function({success: callback, error: errback})).

Promises solve this problem by not passing around callbacks _at all_. Instead you return the promise and let the consumer attach the callback itself. And once we have promises widely available and part of the standard library, the calling conventions will be standardized too.
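
A hypothetical example of the difference in calling convention:

  // callback style: the callee dictates the convention
  getUser(id, function (err, user) { /* ... */ });

  // promise style: the callee just returns a promise, the caller attaches handlers
  getUser(id).then(function (user) { /* ... */ }, function (err) { /* ... */ });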

EDIT: I agree that as it stands, the lack of standardization of promises (jQuery's are mutable, for instance) is a pain, and that documentation could certainly be better.


I can install pretty much anything from npm at this point and put money on it being a single callback, err first.


  Boxon.cast( fs.readFile, 'myfile' )
  .then( function( content ){ console.log( content ) })
  .catch( function( err ){ console.log( "Error", err ) });
Boxon is promise implementation agnostic, works with Q, bluebird, etc... See https://github.com/JeanHuguesRobert/l8/wiki/AboutBoxons


async:

    async.waterfall([
      fn1, fn2, fn3
    ], function(err, result) {
    });
Q:

    resultP = [ fn1, fn2, fn3 ].reduce(Q.when, void 0);
    resultP
      .then(function(result) {})
      .catch(function(err) {});
If you chain thens in Q, you are not doing it right (imho).


Q docs show both. Either way: .waterfall() is still simpler.


It would be trivial to create a q.series or q.waterfall abstraction. The real benefits of promises are that we get back `return`, `throw`, `try` and `catch`, not that it makes our code 'look' more sequential, which is totally possible using callbacks and async.js.
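
For example, a rough promise-based waterfall might be as small as this (assuming a Promises/A+ library in scope as Promise):

  function waterfall(fns) {
    return fns.reduce(function (chain, fn) {
      return chain.then(fn);     // each fn receives the previous fn's result
    }, Promise.resolve());
  }

  // waterfall([fn1, fn2, fn3]).then(onResult).catch(onError);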


Waterfall is an easier top-level API that hides a clean low-level API. When it's all you need and it works well then you won't want to switch. When it breaks or you want it to behave better then a nice low-level API is beneficial.


Indeed caolan/async has been a de-facto standard until recently. But for me personally it has become obsolete, with the standardization of generators and promises.


This. The best part of Harmony is to wipe libraries like this off the face of the planet. I dread the days of trying to pick between 10 different flow control libraries. A few years ago 5 new flow control libraries were coming out every month.


Agreed, almost everyone at Groupon is using Async over the alternatives.


Really can't wait for "all the things" to stop being used as an acceptable replacement for "everything."


+1


Most of the things he describes can be done with Dart's async and collections libraries. Compiles to js, works on server side, used in production. Even if someone wants to reinvent the wheel, they should take a look at those libraries and see how it is done there.


I use async.map when I need to do a bunch of file operations or something asynchronously and wait for them to finish. I have largely avoided using stream-specific syntax because the event model was more familiar to me. I have used ToffeeScript instead of async.waterfall or async.series because ToffeeScript is cleaner.

With these improvements to streams in Highland making things more convenient and broadly applicable I expect to be using Highland streams for certain things.


I get that arrays are meant to stand in for more asynchronous sources of data but those things seem so different to me that I don't understand why you'd want a library that treats them the same. If you need to map an array that's taken care of. I know I'm being dense, I just don't get it.


I'm curious which features highland provides which RxJS doesn't. From what I understand, composable streams from any data source with backpressure support is pretty much the definition of Rx.

Sometimes simplicity is a feature, too, though.


Rx doesn't handle back-pressure or laziness, so it's really only for handling events.


RxJS advocates are unhappy with this comment so I'm going to qualify it a little. Apologies for any misunderstanding...

Rx doesn't handle automatic back-pressure (like Node Streams) but does have mechanisms to avoid overwhelming slow consumers. Rx also has delayed subscription which you can call lazy, but not by turning the stream into a pull-stream (allowing you to sequence actions in the way Highland does).

If any of the above needs further qualification or comment please weigh in on the issue by commenting here... but for now I'll leave it at that. I actually list RxJS in the blogpost because it's a good example!


Coming in 2.3, we will have full capabilities for backpressure. We already have window/buffer/throttle, etc. But, I think it's naive to have only one style of backpressure because many are valid. Just an example of RxJS, and what can be done, which includes a style in which you can do several forms of backpressure, and yes, push to pull based models: https://gist.github.com/mattpodwysocki/9010149

Still fleshing it out, but pretty close to calling it complete: https://github.com/Reactive-Extensions/RxJS/tree/master/src/...

We're more than open to pull requests though if anyone thinks we're missing something here.


Since `fastSource.map(slowThing)` automatically pauses the source while the slow thing is processing, how is this different from iteration? It also states that in the case of a non-pausable source it will buffer the data. How does it know when to pause vs. when to buffer?

Is there some way to know what kind of source you've got, or are the sources constructed in a way that chooses which behavior you get?


@caolanmcmahon: consider using methods?

- We have Object.defineProperty() in ES5 to avoid enumeration.

- You can use user-specified prefixing to avoid future conflicts.

Eg:

    ({foo: 1, bar: 2}).hlPairs();
rather than:

    _.pairs({foo: 1, bar: 2});


This is highly intriguing, but I must ask why the JS community has this fascination with obscure identifiers like "_". It decreases readability when what should be a logically-chosen descriptive identifier for your class is replaced with a single character that visually recedes into the language syntax. Edit: I've grudgingly given jquery a pass on this because of its ubiquity, but come on, a stream library? Not to mention the fact that a reasonably popular lib already seems to have squatted on underscore.


In my codebases, literally 1 in 10 lines or more contain at least one call to underscore.js, frequently more than one (not even including async). It's for brevity. If you wrote out underscore.* or Highland.* your code would become hard to read. These are utility functions that are used heavily.

I'm excited about this, as someone who uses underscore.js and async together, heavily.


underscore.js is a pretty good sized collection of utilities, which is why you call it a lot. I am having a hard time believing you're going to be constructing so many Highland streams that you need a one-character identifier for it to increase readability.


If you look at the full docs, there is a lot of overlap: highland provides a lot of the functions from underscore.js, and since from what I can tell it's trying to unify the "javascript utility belt", I can only imagine more will be added.


"highland provides a lot of of the functions from underscore.js"

"A lot of the functions" (versus "all of the functions") sounds like a one-way ticket to readability hell, since it means that an inattentive reader may assume the functions ARE from Underscore.


To me the benefit of a convenient shorthand outweighs this; someone would figure it out really quickly. It seems like a non-issue. A few different libs use $ and the world doesn't end.


JS devs rely heavily on libraries to do things that most programmers would expect the language to handle. $ and _ have been adopted as toolkit identifiers and are littered throughout most JS codebases. Highland does much of what one would expect from _ via Underscore/Lodash. I'd bet most people think they improve readability.


I agree; what I personally wonder is why Underscore and similar libraries don't make use of JavaScript's prototype mechanism and add methods to the array and object prototypes. There are probably things I'm overlooking here, but [1, 2, 3].map(stuff) is much nicer than _.map([1, 2, 3], stuff) and the like.


Ha ha.

You're describing the state of affairs before Underscore existed, back when functional-ish programming in JavaScript was ruled by Prototype.js:

http://prototypejs.org/

... which added a lot of useful methods to native prototypes.

While handy in controlled and limited environments, mucking about with native prototypes quickly becomes extremely dangerous and difficult — once you have two third-party modules on the page that expect different versions of your patched prototype method ... once you have a new version of a browser that implements one of your previously-extended functions, but does it differently — you're pretty well screwed. Both of those things tended to happen in large sites.


Whereas with _.map, you always know exactly which implementation you're getting, right? :)


Actually, you do. You've loaded it, and you can lock it down privately to your library or app with _.noConflict().

You can have ten different versions of Underscore loaded on the page, living in peace and harmony, in ten different third-party modules. Not that you should. But that you could.


Well then I guess this _ has a bug because it doesn't have a 'noConflict' method. :)

But to be perfectly fair, you are 100% correct: this is not a technical complaint and perhaps this entire sub-thread is, as has been claimed, "bike-shedding." Since this lib is designed to be loaded via an AMD-style mechanism, users can call it whatever they want, so _ is just as valid an identifier as any other. Except the obvious issue that readers of the code, examples, and any code that follows suit, will end up with this completely pointless ambiguity because _ is ultimately a meaningless name if it evolves to mean simply "some library I loaded." You may as well write sample code like:

var $ = require('http'); $.createServer(...);

I would think people would call that out as ridiculous and confusing.


Extending native prototypes creates other less obvious hurdles for libraries too. Craft.js, es5-shim, Modernizr, MooTools, Prototype.js, and Sugar.js, to name a few, have all, at one time or another, added incorrect shims to native prototypes.

While Underscore is in a better position than those that extend native prototypes, regarding api/environment conflicts, it can still be tripped up by poor shims because it defers to many ES5 methods if they exist. For example, if Prototype 1.6.0 and Underscore.js are included on a page Underscore's `_.reduce` method won't work properly. This is one of the reasons why libs/frameworks like Dojo, Ember, Lo-Dash, RequireJS, Sizzle, and YUI do native checks too.


It's considered bad practice to add methods to the prototypes of built-in objects because it has the chance of breaking other people's code.


Most people are very hesitant to add methods to array, string, etc. because if more than one person does that you'll end up with all sorts of problems.


Underscore uses the "_" variable because, well the library is actually named "underscore." Highland uses the "_" because... I don't have anything nice to say.


var Highland = require('highland');


Yes, clearly you can do that, but it's obviously not your intent. All the docs use _ as the class name, and that's what you use internally. The class is named "_" and that's what people using it are going to expect to see if it becomes popular. You could make a better choice.


> You could make a better choice.

But you can call it whatever you want, so... who cares? The choice is yours... Being a pedant is hardly constructive.


I think I'm getting the hang of HN ;)


But seriously, it seems like a pretty neat library. I just think your nomenclature is the suxxors and eventually we're going to look back on this trend and think, "boy, making everyone reading your code wonder which _ you're talking about is a pretty silly way to save a couple characters."


var ‾ = require('highland'); // Meet overline, the new underscore


Just seems like a silly thing to get hung up on.


Naming is important for code readability. I think a lot of people recognize that. Which is why I'm so baffled by this trend of "screw it, I won't even bother with a name! Call everything underscore!"


Please don't bikeshed in technical threads. One comment on something like this is more than enough; zero is probably better.

When a new thing comes out, the discussion should focus on what's significant about it.


This sounds intriguing. I wish there were some more complex examples. Part of the greatness of Promises is taking a big chunk of pyramid code and turning it into a set of simple steps... I'd like to see how this would handle that.


Good idea, I'll definitely post a follow-up with some real code done using async/callbacks and highland/streams. The comparisons usually start to look more favourable with longer examples, due to the Highland API being so composable.


Nice. I need to compare code written with highland to code with async.series/.each etc. (but not waterfall, I don't like that dude ;)).


Loads of examples at http://highlandjs.org/.


I really like Node, but I want a language with sane semantics. What's the best option apart from JS and CoffeeScript? TypeScript is awesome, but I'm afraid that there will be no good community because of the MS stigma.


ClojureScript is sane and pretty mature now. http://himera.herokuapp.com/synonym.html


Definitely a good community; have a look at all of the type definitions alone:

https://github.com/borisyankov/DefinitelyTyped


Google Dart and TypeScript are pretty good alternatives to JS in some cases and Dart has the notion of streams.

https://www.dartlang.org/docs/tutorials/streams/


You can use Haskell for the same semantics if I understand you correctly. Actually coming from Node, Go might be a better fit.


In other words, I would like to use a language that runs on Node but isn't JS or CS.


When did sane and static typing become synonymous?


Sane has nothing to do with static typing. Python is a language which is dynamic and has sane semantics. E.g. in JS, {} + {} is NaN, and there are many other examples.


There's a language where `{} + {}` does something useful? Or is the complaint that it should `TypeError`?


It's obviously {}, duh! ;)

(In one of the ways of constructing the natural numbers within ZFC set theory in mathematics, one identifies 0 with the empty set {}, 1 with {{}}={0}, 2 with { {}, {{}}} = {0,1}, and so on.)


If it does anything at all, it certainly shouldn't generate NaN


Why not? If (+) is numeric addition and coercion is free, then {} coerces to NaN, NaN + (anything) is NaN, and (anything) + NaN is NaN, so {} + {} is NaN.

Which is not to say that it's a good choice, but instead to say that the badness stems from overloading (+) too much and having free coercion.


Right, sane semantics: http://cs.brown.edu/research/plt/dl/lambda-py/lambda-py.pdf

Saner perhaps, but still ungodly complex with weird edge cases.


How would using TypeScript fix that?


Perhaps Dart?


Along these lines, John Resig recently created http://nodestreams.com/ as a way of composing and conceptualizing stream processing. Pretty nifty.


This is amazing! And it's something that I wanted for some time.

LazyJS just received stream support, but I'm pretty much sold on highlandjs.


It looks like this whole notion of 'stream' is just syntactic sugar for javascript guys who lost themselves in a bunch of nested stupid callbacks, which is worse than lisp parens. Instead of nesting whole callbacks, create a stream in the middle; execute the first half of the callback chain and dump the result into the stream so that the next half of the callback chain can be executed later.

Why is this such an enlightenment for node guys? UNIX has done it right since the epoch - simple programs perform simple tasks and are connected via pipes. Python has gevent, so you don't even need 'a stream' or other bullshit; you just write the code as-is and greenlets provide the concurrency needed.

The real enlightenment comes from 'programming properly': you start with C and torture your brain with function pointers and realize why it is a good idea to treat functions as first-class objects. Then you learn some 'proper' functional programming language like Lisp to learn how to think in a functional way, which is the only guaranteed and proven path to prevent yourself from shooting your own foot by writing 20+ nested callbacks. If you start by binding an anonymous function to a <button>'s click event and think you can do this to do real programming, you'll never get it right.


Wise words of wisdom from kimjotki2, the only real programmer on the internet. Bow down, bitches.


This is not reddit.


What is it about JS that causes people to re-invent everything and call it progress?


yup! node does streams horribly wrong


I am considering the idea of spending some time studying node streams. Your remark worries me. Are there any sources I can consult about potential weaknesses of node streams? Thanks.



