Machine Learning (xkcd.com)
546 points by tchalla on May 17, 2017 | 128 comments



Reminds me of this horrifying stack exchange post: https://stats.stackexchange.com/questions/185507/what-happen...


One of my colleagues (an engineer) suggested something like this when I worked at a factory. My jaw dropped and I just stared at him. I had nothing to say. I guess he's a manager now.


I can't believe that's real!


wow, that is actually horrifying.


ELI5?


The manager of the person who asked the question thinks that if you take data in the form of pairs (X, Y), split the pairs up, sort the Xs and the Ys independently, and combine them again, you'll get better results. In fact, such an operation obviously destroys the relationship X had to Y, so the result is meaningless.
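
To make that concrete, here's a tiny numpy sketch (synthetic data, nothing to do with the original post) showing that sorting X and Y independently can manufacture a strong positive trend out of a strongly negative one:

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=1000)
    y = -2 * x + rng.normal(scale=0.1, size=1000)     # Y really *decreases* as X increases

    print(np.corrcoef(x, y)[0, 1])                    # ~ -1.0: the true relationship
    print(np.corrcoef(np.sort(x), np.sort(y))[0, 1])  # ~ +1.0: pure artifact of the sorting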


Ha, gotcha! Maybe if you do it enough times...in the cloud?


No no, results stored on The Blockchain.


A chatbot can read them back to us.


Only if it's implemented in Rust. Any other language wouldn't be safe.


But Rust is not Web Scale™. Go.js is Web Scale™.


You'd be better off just going with whatever "result" you want regardless of the data, and save on the AWS costs.


Doesn't it also assume you'll have access to the correct answer in production as well? Is X the observation and Y the correct answer, as often indicated by that notation?


Suppose your dataset consists of the following points.

(1,5)

(2,3)

(3,10)

(4,4)

Clearly there's really not a linear relationship between the first coordinate and the second coordinate. In other words, if you plot these four points and try to approximate them with a single line in the plane, you just can't do a very good job.

The person's manager suggested sorting the second coordinates in this list while keeping the first coordinates fixed, resulting in the following:

(1,3)

(2,4)

(3,5)

(4,10)

These points are still not collinear, but they certainly can be better approximated by a line than before. The problem is that this is simply a completely different set of points, so a linear approximation here implies nothing about the original dataset.
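
Here's the same comparison in numpy, for anyone who wants to check the numbers (np.polyfit with degree 1 is just an ordinary least-squares line):

    import numpy as np

    x = np.array([1, 2, 3, 4])
    y_original = np.array([5, 3, 10, 4])   # the real pairs
    y_sorted = np.sort(y_original)         # the manager's version: [3, 4, 5, 10]

    for label, y in [("original", y_original), ("sorted", y_sorted)]:
        slope, intercept = np.polyfit(x, y, 1)
        sse = np.sum((y - (slope * x + intercept)) ** 2)
        print(label, round(slope, 2), round(intercept, 2), round(sse, 2))
    # original: slope 0.4, intercept 4.5, squared error ~28.2 (poor fit)
    # sorted:   slope 2.2, intercept 0.0, squared error ~4.8  (nice fit, to the wrong data)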


Imagine you were trying to measure whether or not increasing X proportionally increases Y. Now imagine you first sort your data points, Xi and Yi, independently -- it will now misleadingly appear like Y increases when X increases since the variables are both sorted. You've scrambled your data (Xi, Yi) to be (Xk, Yj). So you're essentially working with a totally different dataset now.


It's actually quite a simple question, just asked using a lot of overly complex language which probably confused the questioner and his boss.

It's no different from having a list of key/value pairs, sorting the keys and the values independently, and hoping to get a meaningful result, which is absurd.
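
Or, in code (made-up prices):

    prices = {"apple": 3, "banana": 1, "cherry": 5}
    scrambled = dict(zip(sorted(prices), sorted(prices.values())))
    print(scrambled)  # {'apple': 1, 'banana': 3, 'cherry': 5} -- apple no longer costs 3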


That is an excellent analogy.


OMG. This is a data scientist trap. They can't help themselves! It's actually a great question to test fundamental understanding and ability to explain.


Meanwhile I don't know how to get enough votes on my new stack exchange account to enable commenting.


Wtf wow


Is this a joke?


Tom Scott yesterday made a video for laypeople on the topic of 'black box' machine learning and how it can be difficult to get it to behave as you want, too.[0]

It's an interesting watch - I'd recommend it if you're interested in learning about it.

(Heck, I'd recommend the channel. Tom does some great videos on a number of different topics.)

[0] https://www.youtube.com/watch?v=BSpAWkQLlgM


His second channel is very nice as well. More casual and conversation-based, and a bit less interesting topic-wise, but nice nonetheless.


Isn't he the guy from the hello internet podcast?


No, HI is Brady Haran and CGP Grey, though I suspect Brady may have collaborated with Tom.


"Just stir the pile until [the answers] start looking right" is actually a pretty decent description of gradient descent.


It's clearly simulated annealing. You stir less as you get tired. :)


Simulated annealing doesn't use gradients, but it does exploit the structure of the objective landscape (worse moves are accepted with a probability that shrinks as the temperature drops), so it is not equivalent to random search, although it is a stochastic algorithm.
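
For the curious, here's a toy sketch of the acceptance rule that gives simulated annealing that structure; the objective function, proposal width and cooling schedule below are arbitrary choices for illustration:

    import math, random

    def f(x):
        return x ** 4 - 3 * x ** 2 + x   # some bumpy 1-D objective

    x, temp = 5.0, 10.0
    for step in range(10_000):
        candidate = x + random.gauss(0, 0.5)           # a random local "stir"
        delta = f(candidate) - f(x)
        # always accept improvements; accept worse moves with probability exp(-delta/temp)
        if delta < 0 or random.random() < math.exp(-delta / temp):
            x = candidate
        temp *= 0.999                                  # cool down: stir less as time goes on
    print(x, f(x))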


With gradient descent you are walking in the direction of the negative gradient. Stirring the pile implies that you are walking randomly from point to point.
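
For contrast, plain gradient descent on a toy 1-D objective (made-up function and step size), where every step follows the negative gradient:

    def f(x):
        return (x - 3) ** 2

    def grad_f(x):
        return 2 * (x - 3)

    x, lr = 10.0, 0.1
    for _ in range(100):
        x -= lr * grad_f(x)   # always step against the gradient, never randomly
    print(x)                  # converges to 3.0, the minimum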


No, by stirring you're just making the pile move, and things descend down the gradient of gravity. :)


The little things more so. Brazil nut effect!


When you stir, you're stirring in a pattern, probably a circular one. Plus you're probably not paying too much attention to the shape of the pile, so it's gradient-free. It's pattern search. Except it's only approximately a circular pattern. There are random variations. Stochastic pattern search.


So more like a genetic algorithm then?


GAs also exploit structure in the search space.

To see this most clearly, read up on estimation of distribution algorithms (EDAs), which generalise the ways in which GAs work.


It is more like tuning the hyper-parameters, which is guided by anything but logic.


It actually seems almost like the definition of RANSAC.

https://en.wikipedia.org/wiki/Random_sample_consensus


Gradient descent is not random search, so this is an inaccurate description. It works by exploiting the gradient in a structured landscape of solutions.


I'm not so sure. Gradient descent refers to reaching the best solution available. Xkcd seems to be referring to something more like the Bonferroni principle, where you look for patterns without a hypothesis and justify them post facto.


I think the point is about how data manipulation is about 80% of machine learning work. If an algorithm is giving you crap results for some data set, once you've fiddled with its hyperparameters through CV and so on, there's not much you can do besides data manipulations like PCA, ICA and the like to try and get a better result. Most algorithms work pretty badly with raw data.
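
For concreteness, that fiddling loop typically looks something like this sklearn sketch (synthetic data, illustrative parameter grid):

    from sklearn.datasets import make_classification
    from sklearn.decomposition import PCA
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import Pipeline
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=500, n_features=50, n_informative=10, random_state=0)

    # transform the data (PCA), then tune hyperparameters by cross-validation
    pipe = Pipeline([("pca", PCA()), ("clf", SVC())])
    grid = GridSearchCV(pipe, {"pca__n_components": [5, 10, 20], "clf__C": [0.1, 1, 10]}, cv=5)
    grid.fit(X, y)
    print(grid.best_params_, grid.best_score_)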

I'm guessing that, in the sciences, that is a big no-no. Imagine if doctors, seeing that a new drug being trialled is failing to cure a disease, simply started chucking out the sick subjects until all the ones that were left were healthy, declaring the sick ones to be "noise" and the trial a success. Somehow, I don't think that would fly...


This is confusing "cherry picking" (selecting some data and discarding other data) with "data transformation". Transformations are applied for perfectly good reasons, such as dimensionality reduction, or to enable the subsequent application of certain statistical tests that rely on assumptions not valid for the data in its original form.


> I think the point is about how data manipulation is about 80% of machine learning work. If an algorithm is giving you crap results for some data set, once you've fiddled with its hyperparameters through cv and so on there's not much you can do besides data manipulations like PCA, ICA and the like, to try and get a better result.

Data manipulation is different from data transformation. Manipulation changes the nature of data, transformation does not change the nature of data. PCA is data transformation, not data manipulation.


Tell me what you mean by "the nature of data" and I'll tell you if I agree with your definition. I don't see why the dimensionality of a dataset is not its "nature" for example.


AKA half the work done in Machine Learning nowadays.


To be precise: the best solution reachable from a given starting point, which may be a local minimum rather than a global minimum.


(iterative) stochastic gradient descent to be specific. Data is stirred for each pass to prevent cycles.
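
A minimal sketch of that per-pass stirring, on a toy linear-regression problem (synthetic data, untuned learning rate):

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    true_w = np.array([1.0, -2.0, 0.5])
    y = X @ true_w + rng.normal(scale=0.01, size=200)

    w, lr = np.zeros(3), 0.05
    for epoch in range(50):
        order = rng.permutation(len(X))          # "stir" the data before each pass
        for i in order:
            grad = (X[i] @ w - y[i]) * X[i]      # gradient of the squared error for one sample
            w -= lr * grad
    print(w)                                     # close to [1.0, -2.0, 0.5]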


This is _so_ on point considering a lot of my customer interactions. Until I go in and establish a process framework that goes through the models in use and sets criteria for their evaluation (even something as simple as ROC), a lot of ML work in companies is mostly tinkering with things (sometimes with wrong theoretical underpinnings) until it performs adequately.

The hype is only real if you systematically work it into a measurable process, not virtuoso jamming.
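
For reference, the "even something as simple as ROC" criterion really is a one-liner once you have held-out labels and scores; the values below are toy numbers:

    from sklearn.metrics import roc_auc_score

    y_true = [0, 0, 1, 1, 1, 0, 1, 0]
    y_score = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.9, 0.5]   # model's predicted probabilities
    print(roc_auc_score(y_true, y_score))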


Stone-age tools, as Chomsky calls them. The problem with simple-to-use tools is that the simplicity masks two key facts from the user: whether aptitude for the tool exists, and the amount of effort and time needed to reach mastery.


That's some nice snarkiness about how modern machine learning works. But let's not forget that this apparently "dumb" approach has beaten out much more intelligent seeming systems on many tasks. To me, this means we shouldn't be so confident that we know what an intelligent system looks like. Maybe effectiveness in AI doesn't have much to do with human interpretability, and even less to do with whether humans find the approach intellectually satisfying.


Is it survivorship bias? ML attempts that fail are cancelled and never heard from again, those that succeed are publicized.

The same approach works with pig entrails: a bunch of people make predictions, the ones that fail go away, the ones that happen to succeed a few times "must work".

Or in stock market terms, "past performance is no guarantee of future results."


In a way you're exactly correct and that's exactly why machine learning works. The models specify a huge range of possible input-output mappings and then do a dumb search to find the ones that work; the mappings that don't succeed are discarded. It turns out, surprisingly, this is the best approach we have come up with.

> Or in stock market terms, "past performance is no guarantee of future results."

Past performance isn't a guarantee, but under mild conditions it is very strong evidence that there will be future results.


And let's not forget that one lucky hit with pig entrails is enough to build a career that lasts a couple of decades.

I'm trying to remember where I read about this but, allegedly, there used to be a gentleman in New Orleans (if memory serves) who went around handing, at random, sealed envelopes with "boy" or "girl" written on a piece of paper inside, to pregnant women.

The idea was that he could expect to hit the right sex of the unborn child a few times and that those lucky hits would make people think he had some sort of gift for seeing the future. As to the misses, the women would be too preoccupied with having just given birth to raise a stink. Note the envelopes were handed out for free. It was his advertisement, see.


>Is it survivorship bias? ML attempts that fail are cancelled and never heard from again

Which is exactly how we got from single-cell organisms to human-level intelligence. Of course, it took 4 billion-ish years for that to happen. Life, and hence intelligence, is survivorship; that's the selection mechanism.


Like other tech fads, there is a reasonable basis of useful stuff underneath a much larger lather of big, frothy bubbles. Almost every idiot is running around talking about using "machine learning" for their new system without understanding what this means, just as they're doing with "containerization", "orchestration", and so on.

The rule of thumb is that if Google or Facebook releases some toolkit for something, the next 3-5 years are going to be a hellscape of idiots clogging up the channels for the relatively small number of people who may have an actual, legitimate use for these tools. But since the idiots are bored at work and can tell their bosses, "Well, Google does it this way, so it must be the best! We're important just like Google, right?", we have to languish through their tiresome drivel, and watch as they drag their companies through a quagmire, only to propose the next fad as the savior a couple of years later.


Human intelligence is rapidly becoming obsolete.


True that. But you could replace "Machine Learning System" with "code base" and "linear algebra" with "C++" and it would be an accurate description of most corporate software development I've seen.


Eh, we (collectively) have a pretty good grip on how code-bases, linear algebra and C++ work, but we still have no idea how minds and intelligence work. And so the best we can do with ML right now is stumble blindly in the dark, poking and prodding things until it 'seems' to give the right answers, some of the time.


Imho, it's more typical of machine learning. Sure, both can be very similar and are often signs of a lack of deeper understanding.

I'm no ML expert by any means, but I've seen several bachelor/master theses and even ML competitions where ensembles performed best. Sure, this isn't necessarily aimless stirring and could combine models that really capture different aspects of the data. But often enough it's just several algorithms that do the same general thing, combined to achieve a slightly higher score.

Imho this is most relevant when competitions provide data that is not readable by humans (e.g. simplified: "classify these documents where all words are given as word IDs and never as actual strings").

To me this has a touch of pouring in data, stirring (build many classifiers and plug them together in an ensemble), and getting answers on the right side.

Optimizing hyperparameters goes in a similar direction, imho. I can really see an analogy to stirring.
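
For what it's worth, the "several algorithms doing the same general thing, plugged together" kind of ensemble described above takes only a few lines in sklearn; a sketch with synthetic data:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier, VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=500, random_state=0)
    ensemble = VotingClassifier(
        [("lr", LogisticRegression(max_iter=1000)),
         ("rf", RandomForestClassifier()),
         ("svm", SVC(probability=True))],
        voting="soft",   # average the predicted probabilities of the three models
    )
    print(cross_val_score(ensemble, X, y, cv=5).mean())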


You don't throw random C++ code at the compiler to see if you get the correct result.

At least, most of the time you don't.


I think this is one of Munroe's general digs at corporate culture, and not specifically aimed at machine learning, that's just the current buzzword.


It feels like a dig at how people do machine learning these days though. My impression is that people just take TensorFlow and a random neural net, and keep throwing gigabytes of data at it until they feel the results look like they should...


Random Neural Nets should totally be a real thing.

I mean more than they already are.



At University, we did neural nets via genetic algorithms. Does that count?


Bingo. :)


> Random Neural Nets should totally be a real thing.

Reservoir computing. Some are critical of this method.


There's a long history of that: http://catb.org/jargon/html/koans.html#id3141241


Courtesy of "Machine Learning in $(some single digit number) Minutes".


I like a different analogy. In old times, the oracles and shamans danced in masks, wielding amulets of power and magical weapons. Modern-day shamans dance with datasets and clusters, but the main principle is the same. If my performance produced the desired effect (correlation does not imply causation), that is, obviously, due to my undoubted magic powers. If it fails, which is usually the case, that is because the data was not big enough and there was not enough money given to perform a big-enough sacrifice to please the gods.

We actually might learn a lot from Tibetan tantric practitioners. It seems that the Wall Street guys and economists did.


My advice : if you see a pile of sticks growing in the car park, make a swift exit.


As a famous Russian-Ukrainian saying goes, "is it a defeat or a victory?"


I don't know what the Russian equivalent of this saying is, but the Ukrainian version that has become very popular in recent years has an interesting cultural twist. The word that is used as an antonym of "victory" does not translate to English as "defeat"; instead it literally means "betrayal". This is because in Ukrainian history we almost never lost by being defeated, and almost every time our defeats were the result of betrayal. And the same thing applies to the ongoing war: we don't really fear a military defeat as much as a political betrayal (either internal or external).


> in Ukrainian history we almost never lost by being defeated

Nice to see Ukrainians reinventing the German stab-in-the-back mythology from the interwar period.


As in all things, the Greeks invented this first. Remember Ephialtes, the guy who betrayed the Spartans at Thermopylae?

(In modern times too: all my friends who like football, their teams never lose a match. It's always the referee who is on the side of the other team.)


It's a very powerful hammer in our psychological toolkit, and it's been used countless times throughout history.

No one wants to blame the Everyman and his masculine valor for failing the country. So the parts of the national leadership who started the war need to deflect blame, and they do it by attaching themselves to martial myths and posing as defenders of the Everyman. Not like those other dastardly effete leaders who oppose war, who are in a rhetorically weaker position because they're correctly acknowledging that individual valor doesn't play a major role in the outcome of the war. They're easily portrayed as trivializing valor and not glamorizing it sufficiently.

If you talk to old white American soldiers, you can hear the same thing about Vietnam (damn liberals!) and more recently Iraq (damn liberals!). I'm sure you could hear similar things from old Brits nostalgic for Empire.


It's much more complex than that.


After making such a statement it would be proper to elaborate on what you meant there. Don't get me wrong, I am not yearning for a debate on Ukrainian historiography here, I just believe that borderline chauvinistic statements like that should be checked.


I get your point. It would be chauvinistic if I said that we are so strong that we can't be taken by force (which is impossible), while actually I'm just saying that we are too corrupt (so that every time a pivotal point comes, someone gets bribed and betrays). And also, I'm speaking of the last 300 years, and not really the _entire_ history.


"Is it a treason or a victory?" would be a more precise translation. And yes, it's Ukrainian only. It's quite surprising to see this funny national meme pops up on HN.


"Yes"


that would actually be a very Chinese answer


or a very boolean algebra answer


I'm waiting for the follow-on: I stirred it up, I got good results, and now I can't seem to reproduce them again :-(


There is this famous paper "I Just Ran Two Million Regressions". Any half decent statistician will tell you what's wrong with that type of approach.
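
A sketch of the basic problem: regress pure noise against enough random predictors and roughly 5% of them will come out "significant" anyway (the counts here are illustrative, nowhere near two million):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    y = rng.normal(size=100)                 # a pure-noise "outcome"
    false_positives = 0
    for _ in range(2000):
        x = rng.normal(size=100)             # a random, unrelated "predictor"
        if stats.linregress(x, y).pvalue < 0.05:
            false_positives += 1
    print(false_positives)                   # roughly 100, i.e. ~5% spuriously "significant"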


Could one describe machine learning as "combine enough bad statistical tests that the cries of 'it seems to work' from managers drown out all the 'it's wrong' cries from the statisticians"?


In machine learning circles, this is called "overfitting".
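
A minimal numerical sketch of it (toy data, numpy): a high-degree polynomial nails the training points and does much worse on fresh data from the same source:

    import numpy as np

    rng = np.random.default_rng(0)

    def sample(n):
        x = rng.uniform(-1, 1, n)
        return x, x ** 2 + rng.normal(scale=0.1, size=n)   # the true relationship is quadratic

    x_train, y_train = sample(10)
    x_test, y_test = sample(100)

    for degree in (2, 9):
        coeffs = np.polyfit(x_train, y_train, degree)
        train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
        test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
        print(degree, train_err, test_err)   # degree 9: near-zero train error, far larger test error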


So I am curious: for a software engineer who has never learnt much stats, how difficult has it been to use these things in the field to solve problems, if they're doing so at all? Do they run into trouble because they don't have a grounding in stats, or do these things pretty much "work", assuming the models are scoped to fit the problems?


Not hot dog.


Seems to be related to this recent one:

https://xkcd.com/1831/

Something tells me Randall Munroe had a run-in with some over-eager data scientists recently.


So far Deep Learning has made this Xkcd obsolete: https://xkcd.com/1425/

Check whether a photo is of a bird.


Can a computer algorithm reach a human-level of skill in identifying birds in a photo in all situations? That sounds like a very hard problem indeed.

For instance could an algorithm conclusively identify the birds in all of these pictures without having too many false positives?

https://ak3.picdn.net/shutterstock/videos/6087611/thumb/1.jp...

http://www.hippoquotes.com/img/impact-of-nature-quotes-in-fr...

http://www.birdsasart.com/baacom/wp-content/gallery/cache/17...

https://vztravels.files.wordpress.com/2014/05/img_1390.jpg

http://www.zastavki.com/pictures/1920x1200/2011/Animals_Bird...

This is not a rhetorical question by the way, I genuinely don't know the state of the art in this field. If it's indeed possible to do that today I'll be extremely impressed.


Detection accuracy is fine. We are actually at a point where NNs can make photos :)

Convert string:"this small bird has a pink breast and crown, and black primaries and secondaries." into a photo.

https://www.youtube.com/watch?v=rAbhypxs1qQ


That's impressive, but I'll point out that the bird photos in this video are all clean, well focused close ups, that's probably easier to process than random pictures.

If you wanted a general algorithm working on non-curated data (like tagging facebook photos for instance) I'm sure it would be significantly harder.


Check out the (deliberately blurry) examples in https://arxiv.org/pdf/1703.05393.pdf where it can distinguish between blurred, low resolution pictures of different types of crows.

It's only ~50% accuracy, but the photos are terrible. Much worse than Facebook pics.

OTOH, this is classification into hundreds of classes, not millions as in the case of FB face recognition. (Although of course FB can use the connectivity graph as a filter on that, too.)


This is all doable today. As an example, check out some bird photos[1] from the Visual Genome[2] project that are similar to your examples. I selected the photos and hosted them on Imgur in hopes we don't kill Visual Genome with traffic ;) The systems to do this today are not highly efficient or without flaws but it can certainly be done.

The research group I am part of, Salesforce Research (formerly MetaMind), have a model that does this "accidentally" - and there's even an example image of a bird[3]! The model is only meant to provide a caption for an image, not to segment the image into the various objects, but learns to "focus" on the bird as part of describing the image. For those particularly interested, check out the paper "Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning"[5].

Systems made specifically to segment an image into objects would obviously do far better. For an example of that, check out "CRF as RNN - Semantic Image Segmentation Live Demo"[4]. There are many more systems of this style floating about.

[1]: http://imgur.com/a/1UPnn

[2]: http://visualgenome.org/

[3]: http://imgur.com/vbX8NNZ

[4]: http://www.robots.ox.ac.uk/~szheng/crfasrnndemo

[5]: https://arxiv.org/abs/1612.01887


I think you underestimate the problem, which is not to get an output that says "Bird", but one that says "Specific breed of bird."

Human experts can get enough clues from the bird shape and the context to do that in the sample photos. I doubt your captioning system can.

This is a good example of a standard problem in ML - underestimating the complexity of the problem domain.

You could argue that your system only needs to do the simpler task to be useful, and that's likely true. But if the goal is to approach human expert levels of classification, it needs to improve by at least a few levels.

I suspect getting it there would run into some interesting performance constraints, and possibly some theoretical issues too.


No, ML is very, very good at doing breeds. See, for example, https://arxiv.org/pdf/1603.06765.pdf which gets 88.9% accuracy on the Stanford Dogs dataset, and 84.3% on the Caltech Birds dataset.

These are way better than anything a non-expert human can do. For example, it can distinguish between the Rhinoceros Auklet and the Parakeet Auklet.

I'm not sure what expert performance is, but around 94% is where humans top out on most tasks.

Also, the parent poster knows what they are talking about: https://www.semanticscholar.org/author/Stephen-Merity/337544...


How far away is ML from identifying everything in a picture?

For example if we could take a photo of Noah's Ark loading up every animal?

Do you just loop through each NN you have on each species?


A single NN can predict more than one class of object. The full ImageNet dataset has over 20,000 classes (the ILSVRC competition uses a 1,000-class subset).

There's also image segmentation as another poster has pointed to.

In the case of FB face tagging, they'd have to learn an embedding space for faces, and when a new image comes in they'd place it in the embedding space along with all the person's connections and find the nearest neighbors.

See https://arxiv.org/abs/1503.03832 or the implementation https://cmusatyalab.github.io/openface/
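
The lookup step of that scheme is simple once the embeddings exist. In the sketch below the 128-dimensional vectors are random stand-ins for what a trained network like FaceNet/OpenFace would produce, and the distance threshold is made up:

    import numpy as np

    rng = np.random.default_rng(0)
    friend_embeddings = {name: rng.normal(size=128) for name in ["alice", "bob", "carol"]}

    def identify(new_face, threshold=15.0):
        # nearest neighbour in embedding space; below the threshold we trust the match
        name, dist = min(
            ((n, np.linalg.norm(new_face - e)) for n, e in friend_embeddings.items()),
            key=lambda pair: pair[1],
        )
        return name if dist < threshold else "unknown"

    noisy_bob = friend_embeddings["bob"] + rng.normal(scale=0.1, size=128)
    print(identify(noisy_bob))   # -> "bob"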


This seems to be what you are looking for: https://code.facebook.com/posts/561187904071636/segmenting-a...


The problem posed in the xkcd is "check if the photo is of a bird", not to identify the bird in question. As for identifying the bird species, that would probably be harder, because I'm guessing there are very few human experts who could reliably do that across a wide spectrum of species, and without knowing the context of the photo.


ResNet in 2016 achieved about 97% (top-5) accuracy on the ImageNet challenge, which has 1,000 classes. I think that's near human accuracy.


It DID take a research team and ~ 5 years to get there? :)


Others have pointed out that this problem has not been solved in the general case.

More importantly, the progress that has been made in recent years actually builds very heavily on work done since the early 1990s, so not only is it not complete, but what has been achieved took a great deal longer than 5 years.


Well, because we had to invent the massively parallel GPU in between those times. Essentially work that would take an entire supercomputer cluster in the 90s can be done on my desktop with 4 high end GPUs stuck in it.

Now that we are in the range of having the correct hardware, the whole "it's taking decades" issue will go away.


GPGPU was definitely part of the success of recent years, but there was also a lot of experimentation and hard work carried out e.g. on CNN designs. Lots of trial and error. That took a lot of time. There have been fundamental changes in the structure and training of NNs that have helped bring the step-change in success.


There was a previous discussion about this nearly 2 years ago on HN:

https://news.ycombinator.com/item?id=10239401


Interesting, it shows how things can move quicker than expected, even when judged by enlightened people.

Is there a way to know the date comic 1425 was released?


There's an army of Machine Learning researchers out there. Arguably a single team would have taken much longer, even on a narrow task.


24th September 2014


Programmers often have Scotty syndrome and pad their time estimates to look like geniuses when they solve ahead of schedule.


> to look like geniuses when they solve ahead of schedule

No, it is because estimating software tasks is difficult, the penalty for underestimating is that people think you are dishonest/flakey, and there isn't anywhere to get an education in how to do it well. The default advice given to junior engineers is therefore: "take your intuition and triple it." I hate that this is the state of the industry. My interactions around estimation over the past 5 years since uni have literally made me feel nauseated and near fainting on multiple occasions. I would love for Joel or Klamezius or Uncle Bob or someone else to fix it and produce a good course on how to create estimates.


There isn't a way. It's an issue that blights everyone in the industry.

Probably the best you're going to get is the book "Software Estimation: Demystifying the Black Art".

Even applying those techniques you get it wrong.

Most experienced software companies have adopted agile, and accept reductions in scope to meet deadlines as something that happens.


Agreed, agile seems the only way, but it does indeed require experienced managers. A lecturer once pointed out that business/normal people always expect some kind of point estimate; they are never satisfied with a distribution or an interval. Personally, I would say the situation is even sadder than that: the point estimates are always taken at the extreme values, whichever suits the person wanting the estimate more, never the average value.

Of course, all this leads to bad blood between techies and business side: how long will it take? -> probably about 3 weeks, but this requires using a library we haven't used before, so in the worst case even 2 months -> what? so long? get it done in 4 days, this is required the next week -> no, that's not really possible -> make it happen -> it happens and it either sucks when it's delivered at all, so the deadline gets extended anyway to iron out all the bugs or it causes lots of problems in the future.


For very long projects, I have seen much delay because of feature creep.

"OK you have implemented it as requested, but finally the customer does not like it, it needs to be slightly different. Can you do it quickly?"

Sometimes it is easy to adapt, sometimes next to impossible.


Or when you allow for hilarious false positives/negatives. Sometimes birds are birds, sometimes they are cats and cats are birds, sometimes they are dogs. Everything is possible with the right training set and machine learning.



By the looks of http://explainxkcd.com/1425 it was September 2014.


First copy in the Wayback Machine is Oct 2, 2014. https://web.archive.org/web/20141101000000*/https://xkcd.com...


24 Sep 2014 (hover over the title in https://xkcd.com/archive/). So, less than 5 years ago!


It's funny that this also seems to work for humans.

When the outcome is bad/not what humanity wants, we get/give a negative response and hope the outcome next time will be better.


The second part quite accurately describes most of the generalization techniques. Especially when it comes to deep learning.


For webcomics and ML, see this footnote: http://p.migdal.pl/2017/04/30/teaching-deep-learning.html#fn...

"It made a few episodes of webcomics obsolete: xkcd: Tasks (totally, by Park or Bird?), xkcd: Game AI) (partially, by AlphaGo), PHD Comics: If TV Science was more like REAL Science (not exactly, but still it’s cool, by LapSRN)"


This thread is hilarious.

ML is powering most or even all self-driving-car efforts underway; it powers online translation services, numerous vision projects and speech recognition, besides winning competitions by meeting or exceeding human performance on the same data.

I'm just as allergic to those that hype some technology as I am to those that will snarkily discard something with arguments that have already been laid to rest, in some cases multiple years ago.

Xkcd is fun, but it isn't necessarily prescient, nor does it have to be accurate; when this cartoon was published the writing was already on the wall, and that's 3 years ago. It's fine to be skeptical about new technology, but before you start criticizing it, make sure that you have at least a rough idea of where things stand, lest you end up looking foolish.

Sure, ML is abused and if we're not careful we will see another AI winter because of silly hype and ascribing near magical properties to ML. But at the same time snark, condescension and a-priori dismissal of what is most likely the biggest landslide in computing since the smartphone is - especially on a site that deals with both hacking and novelty - something that I would not expect.

Compared to the HN love for the next JS framework or language fad this attitude is surprising to say the least.


Do you think it's connected in any way to https://www.wired.com/2013/02/big-data-means-big-errors-peop... ???

If not, what is your interpretation of the XKCD cartoon?


Couldn't a DNN be trained to inspect other DNNs and generate human-readable explanations of how they work? In the fewest words, with maximum poetry.


Brilliant idea! Why not try it out yourself and see how it goes?


don't know anything about DNNs!


Can we load all XKCDs into a neural net (LSTM, or something), and train it to "dream" new ones?

Like the neural nets which generated nonsense but realistic-looking C++ code.
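
It's very doable as a character-level language model. A rough PyTorch sketch below; the transcript file name and all hyperparameters are made up, and without a lot more data and training the "dreamed" comics will be mostly gibberish:

    import torch
    import torch.nn as nn

    corpus = open("xkcd_transcripts.txt").read()            # hypothetical dump of all transcripts
    chars = sorted(set(corpus))
    stoi = {c: i for i, c in enumerate(chars)}
    data = torch.tensor([stoi[c] for c in corpus])

    class CharLSTM(nn.Module):
        def __init__(self, vocab, hidden=256):
            super().__init__()
            self.embed = nn.Embedding(vocab, hidden)
            self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
            self.head = nn.Linear(hidden, vocab)

        def forward(self, x, state=None):
            h, state = self.lstm(self.embed(x), state)
            return self.head(h), state

    model = CharLSTM(len(chars))
    opt = torch.optim.Adam(model.parameters(), lr=3e-3)
    seq_len = 128

    for step in range(1000):                                # train to predict the next character
        i = torch.randint(0, len(data) - seq_len - 1, (1,)).item()
        x = data[i:i + seq_len].unsqueeze(0)                # a random slice of the corpus
        y = data[i + 1:i + seq_len + 1].unsqueeze(0)        # the same slice shifted by one
        logits, _ = model(x)
        loss = nn.functional.cross_entropy(logits.view(-1, len(chars)), y.view(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()
    # sampling ("dreaming") loop omitted: feed a seed character, sample from the softmax, repeat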



