Stop Filing Bugs, File a Container (runkit.com)
171 points by tolmasky on Feb 1, 2017 | 45 comments



I hate to be negative, but this feels like an invitation for very low-quality bug reports, where now you have to debug an application AND an OS (the container).


Hi there, I'm on the RunKit team.

I think we've all been in a situation where we have a bug reproducing right in front of us, but when we ask for help by describing it, or even giving the code to someone else, they say "hmm, it works on my machine". Since software rarely runs in a vacuum, having better system-level access can sometimes be the only way to debug it.

Our hope here is that RunKit notebooks make it easier to see what's wrong, even when the problem is less obvious.


Probably makes more sense to use the container as a backup for cases like that. If the bug does reproduce on other machines, then a code snippet is going to be easier to diagnose than messing around with a whole container.


I think there may be a bit of a misunderstanding here. RunKit does not send you a container for you to interact with manually; in fact, there is nothing in the RunKit UI to "mess around" with the container itself. RunKit is a coding notebook that is backed by containers (and automatic shrink-wrapping of dependencies) to guarantee determinism over multiple runs. In other words, it is a code snippet, but with a run button attached.

Nothing prevents you from 1) copy-pasting the code to your own computer to run it exactly the same way you would have previously done, or, even better, 2) hitting the "download" link on the left, which downloads the code + shrink-wrap file so you are using the same dependencies as the user. On top of that, we have also made stack traces and other elements of the UI a lot friendlier.
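
Once downloaded, the local workflow is roughly the following (the file names here are just for illustration, not the exact archive layout):

    npm install     # resolves against the included shrink-wrap file, so you get the reporter's exact dependency tree
    node my-bug.js  # run the notebook code the same way the reporter did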

From a reproducibility perspective, the notebook represents an undeniable instance that the bug did happen, along with all the background information you are usually asked for (what version of node? what version of the package? any other dependencies?). The goal is to make all of that apparent in the code itself, vs. in extraneous other files like package.json or such. Here is an example: https://runkit.com/tolmasky/my-bug/1.0.0

If you are interested in the underlying technology, we've documented it here: http://blog.runkit.com/2015/09/10/time-traveling-in-node.js-...


The most common situations for having a bug that works on my machine but not other people's have to do with memory access violations. But it seems like when you containerize the environment, you will change the memory layout and risk the same situation where the bug disappears because you've moved it to a different machine.


>The most common situations for having a bug that works on my machine but not other people's have to do with memory access violations.

Not even close. Different classpaths, different versions of packages, some ENV setting like LOCALE causing havoc, a different configuration between the two installations of the programs being debugged, a single-core vs. multi-core CPU that masks some race conditions; there are literally MILLIONS of things that can, and in my experience have, caused something to work on one machine and not another.

In fact memory access violations do not even register as a blip on the top-100 reasons...


> The most common situations for having a bug that works on my machine but not other people's have to do with memory access violations.

That has not been my experience. Especially with software written under time constraints including the constraint of "this is FOSS software that I'm writing in my free time."


The hardest bug I had to track down exhibited different behavior on different systems. On one system, it only took an hour or two for the program (a daemon) to crash; on another system, several days could pass before a crash.

I was able to track it down once I realized that signal handlers were, technically speaking, another thread of execution in a special context ...


If part of the idea is to find differences between the working and non-working environments, does RunKit include a way to diff two containers, or something?


It appears to be a notebook backed by containers, so you could presumably use:

  docker container diff


But doesn't that just give you the diff between the image that the container was built from and the current state? If part of the premise is that environmental differences might be the cause of the bug, wouldn't you want to compare two potentially-dissimilar containers?
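
Concretely, I'd expect something more along these lines (the container names are made up):

    docker export container-a | tar -tf - | sort > a.txt   # file listing of the first container's filesystem
    docker export container-b | tar -tf - | sort > b.txt   # and of the second
    diff a.txt b.txt                                       # compare the two environments directly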


Thanks for the info!


I agree with this. For an open source project it's just going to be ridiculous. "Hey you know all that free work you already do? Well now you can troubleshoot some user's OS issues as well! Doesn't that sound like fun, we all know you love unpaid work!"


I've responded to this further down: https://news.ycombinator.com/item?id=13545862 , but this is not how this feature works. The confusion is completely understandable if you've never used RunKit, but the idea is not to give people containers, but rather a "common ground" where users can provide reproducible test cases.


Gotcha, thanks for the additional explanation.


Wow. Not sure why all the hate. This is amazing.

It's like the live JavaScript/HTML/CSS editors, but for systems!

https://runkit.com/npm/lodash and poof, you can run queries against lodash, share it, and they can edit it (which you could do in a plunkr too, if you loaded it). I didn't even need to sign up...

Looks like http://code.runnable.com/ does similar for more stacks, but you have to sign up for that.


I find Nix invaluable for this kind of thing; especially how it rebuilds anything whose dependencies have changed, and caches everything else.

I tried to play with Docker, but haven't seen much use for it in my current workflows.


AFAIK, Nix even has a testing framework, where you can start a VM and simulate various actions inside it. All described by a single Nix script. I haven't explored it myself yet, but I understand it's used for unit-testing various Nix packages.


+1 for Nix. _nix-shell_ is a great way to get stuff up and running without having to wrestle with the container boundary.


I don't work on projects where Docker is useful in production, but I've found it tremendously useful in continuous integration. It guarantees a reproducible environment, which a VM could do, but just as importantly it documents how to produce that environment from scratch. I've found that hugely valuable.
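
For concreteness, the kind of CI step I mean is roughly this (the image name and test script are placeholders):

    docker build -t project-ci .                  # the Dockerfile documents the environment from scratch
    docker run --rm project-ci ./run-tests.sh     # tests always run inside that known environment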

I've always liked the idea of Nix but I've never used it. How easy is it to extract and understand the details of the environment you've constructed to build your project?


> Docker... documents how to produce that environment from scratch. I've found that hugely valuable.

> How easy is it to extract and understand the details of the [Nix] environment you've constructed to build your project?

The "extract" part is easy: Nix uses a declarative language to describe the environment, so in that sense it's like docker, but more fine-grained: each package is built in isolation, and installed to a read-only filesystem.

As far as "understanding the details", it depends on what you want to know. You can use "nix-shell" to enter a build environment in a shell, e.g.

    nix-shell -E 'some Nix expression'
will enter the build environment of 'some Nix expression',

    nix-shell -p 'some Nix expression'
will create an environment with 'some Nix expression' installed, etc.

For example, I'm currently tracking down a space leak in a Haskell program by running the following command after each edit of the source code:

    nix-shell -p 'with import <nixpkgs> {}; profiledHaskellPackages.callPackage (runCabal2nix { url = ../myTester; }) { myDependency = profiledHaskellPackages.callPackage (runCabal2nix { url = ./.; }) {}; }' --run 'tester +RTS -M100M -xc'
The command 'tester +RTS -M100M -xc' is a Haskell program which dies if it uses more than 100MB of heap, exposing the space leak and dumping a stack trace.

This will be run in an environment containing Haskell packages built from the source code in "./." (the current directory, for "myDependency") and "../myTester" (for the "tester" command). These packages and all of their dependencies will be built with profiling enabled.

Running this command over and over will just re-use the previously-built environment. If I edit the source in ./. or in ../myTester, Nix will notice that the hashes have changed, rebuild the packages and their dependents and make a new environment using those. This makes it easy for me to test potential fixes.


The problem is Docker builds aren't really reproducible; e.g. if you're using Ubuntu, you have to do apt-get update or your apt-get installs will fail with 404s.
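
For example, the usual pattern inside a Dockerfile RUN step (the package name is just an illustration):

    apt-get update && apt-get install -y nodejs   # resolves to whatever the mirrors serve on build day, so two builds can differ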

I also haven't used Nix, but I hear it truly solves the problem.


So now someone can reproduce the bug, but can't test a fix to it. They can take apart the container and attempt to reproduce in their own environment, but that's a lot of work.


We provide a download link on the left which will download the file with all the dependencies shrink-wrapped, so you can certainly start hacking on it locally too. This is a great start and a lot easier than manually trying to npm install the right combination of sub-dependencies, etc. If you have any other ideas, we're happy to implement them!


The point is that complexity is most effectively managed by discrete (modular) consideration, not by another layer of abstraction, which tends to hide it. The build process is a discrete development-process problem space with its own group of established solutions. It should not be conflated with bug reporting.

All problems in computer science can be solved by another level of indirection, except of course for the problem of too many indirections. - David Wheeler


If you're dealing with the type of bug RunKit targets, getting the exact same env right will take you days. With RunKit, you do have a little setup to do, but those days can be spent debugging instead.

This is a great concept: remove the work of reproducing an environment, and use the freed-up resources to understand said env.

Major advantages are:

- the bug reporter may not know his/her case is special and won't provide you with the information you need to see it. Providing a running container with the bug simply obviates that.

- the running container can come with a Dockerfile, which gives you everything: libs, versions, settings. It's basically a summary of the stuff you need for it to go wrong.

- you can try the stuff locally, even if your setup is completely different, without messing with your setup. Because complex bugs are rarely reproducible with a bunch of pip install, apt-get and other yum incantations.

It's not good for all bugs. I would say it's actually bad for most bugs. But for the bugs that need it, it seems fantastic.


> getting the exact same env right will take you days

If your codebase has significant undocumented environmental dependencies and these are hard to script in place, there's something far more fundamentally wrong with your development process than bug reporting.


Lol, you react as if most of us have software deployed in controlled environments.

I create Python libs that are used in so many various configurations by so many people I don't know. They come back and say "hey, I got this stack trace" (in the best case).

Now I have to play the guessing game. Is it me, is it you? Wrong path? Permission problem? Bad conf? Network is having trouble? Server is this particular Linux version and there is something important about the SELinux setup here? Oh, but upstart/systemd don't behave the same way. Output is redirected here, look. Stuff is not the expected encoding. Nah, it was not a bug but your file is corrupted. What the heck is this data format?

Etc, etc.

You think you thought about every single thing? Your error handling is perfect for all IO? You deal with all encodings, all user inputs perfectly? You know all the little OS peculiarities that will make your subprocess run in the exact way you think?

Of course you don't, nobody is perfect, we don't have infinite resources. But there are infinite ways to fail.

And for many things, you can figure it out with just your code base and the error, because it's a simple cause. But from time to time arrives this terrible bug that is a mix of a strange LOCALE, this particular version of the VM you use, but only in one time zone, with this env variable set. And for that, yes, a good container with a reproducible bug in the proper env is an interesting idea.

Apparently, you don't ship software that is widely used enough, or you would not be that arrogant.

Deploying on your own 100 servers is hard. Try seeing your code deployed on 1000 servers that you neither own nor configure.


I'm trying to help point you in the right direction based on my own experience, not be arrogant. It sounds like you would get a lot out of spending some time working on CI/CD concepts.

In short, throwing your hands up in frustration at the complexity of software is not a solution and gets you nowhere. The way we deal with "infinite ways to fail" is to control the environment. These days, quality projects are expected to version-control their environments and conduct test deployments within a representative set of environments using a representative set of configurations.

Docker provides an easy way to do this ("always deploy on <distro>-<os>-<version>"), but it's only one approach. Another free and relatively straightforward place to start getting up to speed would be automating build and test processes with Travis CI for an open source project.

Deploying to any number of tested environments is trivial.


You are completely missing the point.

If you create a Python lib, 10,000 people will pip install it. You have no control over the env.

If you create a deb package, it will end up in PPAs and be installed in many different envs. You have no control over the env.

Again, you definitely have no experience in shipping software outside of your bubble.

A lib is not "a web project". A command-line tool is not either. You still need to debug them. People will run them on Windows, Linux, Mac, BSD, and who knows where. And they will come for you.

Now you can choose to simplify the problem and only support a limited number of envs. But I guess I'm quite happy the guys who created Apache and ffmpeg didn't force me to only use them on Linux with LOCALE set to accept only ASCII and CEST.


No. As I said, in these cases you use a CI tool to test on a broad range of environments.

For example, here is a library I maintain with 12,000+ installs per month that is tested on 7 different environments every commit using Travis: https://github.com/globalcitizen/php-iban


7 env. LOL.

Don't get me wrong. It's a good thing. Most people I know don't even have unit tests. Having a CI is fantastic. 7 env is more than a lot of people do.

But that doesn't even scratch the surface of the combinations of factors you can get. It can't.

You can't install all the locales, all the lib versions, simulate all network conditions and all user inputs.

You will get strange bugs that your CI didn't take into consideration. And some of them will be very hard to reproduce. That's just a fact.

Now, knowing you do have all this setup, I just can't understand why we still disagree. It's impossible you didn't run into those.


Filing a container-disassembler container RSN.


I wish we had better reproducibility tools. It should be possible to write a failing test case and have everything that is not of value to that test case stripped out, keeping just the parts of the program/environment that can affect the test case (including sources of non-determinism like clocks and RNGs). Kinda like C-Reduce, but on steroids.
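
For comparison, C-Reduce's interface today is roughly this (the script and file names are placeholders):

    # repeatedly shrinks testcase.c as long as ./still-fails.sh exits 0, i.e. the bug still reproduces
    creduce ./still-fails.sh testcase.c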

And we should be able to look at the problem probabilistically too: given the form of the test case, the value pipeline through the debugger, and the structure of the program, we should be able to narrow down the places to look as much as possible using some form of Bayesian search theory.


Minimal test cases sounds like QuickCheck.


We have something "similar" for the .NET Native compiler. You pass a special flag to the compiler and it will generate a file with more or less everything we need in house to diagnose your issue.

Compared to cut down repros, the packages are huge (200+MB) and debugging requires more gumption but we save a huge amount of time going back and forth with customers. It's not ideal but we'd rather have a "bad" repro than never hear about it.

We've been taking customer issues this way for almost 2 years and it's been a good thing to have in the tool belt.


Hi I'm part of the RunKit team and am happy to answer any questions!


This looks really cool. I wish I could do this with other languages.


You could do something similar with http://code.runnable.com/

[Disclaimer - I work at Runnable]


This looks really awesome. I really like that you allow users to try it out without having to sign up.

I think probably the biggest hurdle to something like this gaining traction is getting users to remember that it's available.

Is it possible to fork from an existing container? If that's possible, you could probably get maintainers to include a link to a working container with the lib already set up in the project's new-issue template.


You can link to an existing notebook and add "/clone" to the end of the URL to basically get this behavior. That's how the 'Try It Now' button at the bottom of the page works; we should definitely make this more obvious. Additionally, maintainers can embed RunKit on their pages with our embed API: https://runkit.com/embed. For example, Lodash uses the embed on their documentation: https://lodash.com/docs/.


Or send a coredump :)


I think this would make a lot of sense on platforms that undergo massive fragmentation (all fingers pointing at Android here). Browsers are also fragmented, but websites like Can I Use[1] should be able to reduce the problem to a manageable extent. I love the idea of reproducible bugs, because often the hardest part of solving any bug is first being able to reproduce it, followed by finding the source. I presume these containers would also help in reliably reproducing those one-off bugs which happen only on first install, first open, or on some other event.

[1]:http://caniuse.com/


Down in microcontroller land we generally just send an email describing the problem. But not knowing much about web development, I suppose problems in that area are much more complicated.


Am I the only one put off by the bad "s-t" ligatures as being too twee? It may or may not be a good approach, but I stopped reading, literally ("Archer" literally), because I got too annoyed by the distraction of that font face. Edit: I want to downvote myself; my wife's W10 box shows no annoying ligatures. My Linux Mint 18/Firefox setup is otherwise. Sorry for the distraction!



