Richard Feynman and the Connection Machine (1989) (longnow.org)
165 points by pmoriarty on March 1, 2017 | 61 comments



Both of a friend's parents worked on the Manhattan Project. Feynman was a character.

I was fortunate to be able to write code for version 1 of the Connection Machine (that was the SIMD device) and I would prototype code on my Mac in Star Lisp. Good times that I am grateful for.


Write more about this. A lot of people will find it very interesting.


+1


Please, go on. I've been fascinated with Connection Machines ever since I saw one at a university auction, which also had plenty of SGI gear and NeXT slabs as well.


The CM-1 had fast I/O throughput and all processors executed the same instructions (SIMD: single instruction, multiple data streams).


Waiting for the long version of your experiences. Thanks for sharing.


Me too, at the Naval Research Laboratory. Did you use C*? (This was a version of C designed for the CM, with parallel operators.)


A nice essay. This paragraph stood out for me:

"By the end of that summer of 1983, Richard had completed his analysis of the behavior of the router, and much to our surprise and amusement, he presented his answer in the form of a set of partial differential equations. To a physicist this may seem natural, but to a computer designer, treating a set of boolean circuits as a continuous, differentiable system is a bit strange. Feynman’s router equations were in terms of variables representing continuous quantities such as “the average number of 1 bits in a message address.” I was much more accustomed to seeing analysis in terms of inductive proof and case analysis than taking the derivative of “the number of 1’s” with respect to time. Our discrete analysis said we needed seven buffers per chip; Feynman’s equations suggested that we only needed five. We decided to play it safe and ignore Feynman."


Feynman is great, very amusing and obviously brilliant. But I've also worked with folks like him before who end up as fish out of water in fields they don't know the conventions for and can't be bothered to learn.

Lots of time is wasted trying to figure out how to get the rest of the company to communicate with the lone genius, and it's obvious that they're smart enough to go away for a week and learn enough of the field they're working in to try to use the language and vocabulary of that field to be minimally effective.

I've been on the receiving end of these kinds of analyses, and the result is that they seem to exist purely to showcase how smart the individual is and to provide no other meaningful input to the effort. In this case the engineers ignored Feynman, did their own analysis anyway, and followed their own conclusion... the subtext here is that multiple people were not getting along with Feynman's way of doing things.

He turned out right in the end of course, because Feynman, but there's lots of people who think they're Feynman and aren't and it's hard to tell the difference sometimes.


> He turned out right in the end of course, because Feynman, but there's lots of people who think they're Feynman and aren't and it's hard to tell the difference sometimes.

Well he had a Nobel prize in physics on his hands for starters, that's one way to tell the difference. Granted, no one's infallible, but it's a good predictor. Not that it would have spared him from fizzbuzz on the interview these days..


> Not that it would have spared him from fizzbuzz on the interview these days..

I'd hope an interviewer would view an applicant's fizzbuzz as a pleasant surprise if it came with a bit of casual mathematical proof.


> Well he had a Nobel prize in physics on his hands for starters, that's one way to tell the difference.

With the preface that I know that a higher calibre of comment is expected from me in this community, it's still the truth: I lol'd.


This is a great point. However, I would generalize a bit and say the problem sometimes goes beyond ego: it arises whenever someone's technical aptitude is greater than their ability to communicate. This is a more common problem than just the smartest guy in the room, and at some point you either trust each other or agree to disagree.

That said, there is a kind of individual far worse than the brilliant teammate who's hard to understand: people whose ability to listen is worse than their technical aptitude. Just one of them in a meeting will destroy all progress and disrupt any ideas that don't mesh with their preexisting thoughts.


On the flip side, it's also frustrating to know you're right about something, but because your co-workers don't have the same education as you they completely ignore you.


They are both real problems. Anecdotally, I've seen a lot more cases of specialists failing to communicate effectively than of non-specialists ignoring well-expressed advice. It's kind of endemic among techies, unfortunately.


Well, actually, in the next few paragraphs they do end up trusting Feynman's analysis and going with 5. And it worked.


I think you and the parent are agreeing. :) They didn't know what to make of Feynman's analysis, because it used techniques they weren't familiar with. It was clearly an approximation; were the error terms really ignorable?

They only used 5 when it turned out they couldn't manufacture 7 and had no other choice. So they only trusted his analysis out of desperation / wishful thinking, not out of objective reasoning.


But the parent comment is using this as an example to complain about experts from different fields wasting people's time or whatever. It's a terrible example to use. Feynman brought new insights that never would have occurred to computer scientists. And of course they were cautious at first. But they ended up trusting it enough to go through with manufacturing it. And he was right in the end.

Additionally, the parent comment complains about people like Feynman not being able to communicate. But there is nothing about that in the story; it goes on and on about how Feynman was a great communicator and explained his ideas clearly.


> Lots of time is wasted trying to figure out how to get the rest of the company to communicate with the lone genius

Feynman himself disdained this sort of lone-genius puffery. He was critical of this in Murray Gell-Mann. I don't think Feynman was a loner at all. He had people he liked and could relate to, and other people he didn't like or relate to, just like anyone.

> I've been on the receiving end of these kinds of analyses, and the result is that they seem to exist purely to showcase how smart the individual is and to provide no other meaningful input to the effort.

Bingo! Always look at the cost/benefit to the group. Everyone who seems like this on the surface tells themselves that they are striking a blow for the truth. But what is the overall effect? Did they primarily bring the group's perception of the truth closer to reality, or did they mostly establish themselves as the Alpha nerd? It's usually both, but what is the mix? Is the undertone one of kindness, or one of cruelty?

> there's lots of people who think they're Feynman and aren't and it's hard to tell the difference sometimes

The same goes for Steve Jobs in the Bay Area.


You're making a big assumption that he could have achieved the same result using the techniques that failed for the rest of the team.


Feynman really liked continuous solutions. It's too bad he wasn't around for the deep learning era. Or for bufferbloat.

The "average number of 1 bits in a message address" makes me think of routing in the NCube. Like the Connection Machine, the NCube was a big array of small CPUs. Bigger CPUs than the CM, though, and running independent programs. The NCube had 2^N CPUs, up to 1024, and each was connected to N neighboring CPUs in N dimensions. Each CPU had an N-bit ID, representing its position in the binary N-dimensional array. The connections between CPUs were thus between ones that differed in address by exactly 1 bit.

So routing worked by taking the current CPU ID and XORing it with the destination ID. The 1 bits in the result represented possible paths to neighbor nodes on which the packet could be sent. Any path would work, and all possible paths are the same length. At the next node, there would be one fewer 1 bit in the difference. When there were no 1 bits left, the packet had arrived.
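In code, the scheme looks roughly like this (a minimal sketch of the routing just described; the function name and the habit of always taking the lowest set bit are mine, since any set bit works):

    def route(src, dst):
        # Node IDs are N-bit integers; nodes whose IDs differ in exactly one
        # bit are neighbors. The XOR of the current and destination IDs has a
        # 1 in every dimension still to be crossed, and crossing any of them
        # is a step on a shortest path.
        path = [src]
        cur = src
        while cur != dst:
            diff = cur ^ dst                        # 1 bits = dimensions left to traverse
            dim = (diff & -diff).bit_length() - 1   # lowest set bit (arbitrary choice)
            cur ^= 1 << dim                         # hop to the neighbor across that dimension
            path.append(cur)
        return path

    # Example in a 4-dimensional hypercube (16 nodes):
    print(route(0b0101, 0b1010))   # [5, 4, 6, 2, 10] -- four hops, one per differing bit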

Nodes need packet buffers. How much buffering is required? That's probably what Feynman was working on.

The number of 1 bits is a measure of distance to destination. The average number of 1 bits is a measure of network traffic. As a discrete problem, this is a mess. Feynman converted it to a continuous flow problem, for which there's known theory, some of which Feynman had developed. He'd done a lot of hydrodynamics work at Los Alamos.

Van Jacobson, who started as a physicist, also saw network congestion that way.


Thanks for that explanation of Connection Machine routing. Damn clever! Puts me in mind of Gray code.


I've always wondered what Feynman's proof looked like. Is it (or something comparable) available somewhere online?


This is an application of mean-field approximation. See the O'Reilly book Data Analysis with Open Source Tools by Philipp Janert, pages 175-178.


For me, the best sentence in the essay was "Nothing made him angrier than making something simple sound complicated".


Feynman has many interesting stories about playing about with all sorts of stuff from picking locks to the Manhattan project. His book, "Surely you're joking, Mr. Feynman!" is just a fantastic read: https://www.amazon.com/Surely-Feynman-Adventures-Curious-Cha...


The sequel, "What Do You Care What Other People Think?": Further Adventures of a Curious Character [1] is also excellent, and includes his account of serving on the commission that investigated the Challenger disaster.

[1] https://www.amazon.com/What-Care-Other-People-Think/dp/03933... or part of the buy it together bundle in wimagguc's link


Funny, but I get the impression from that book's narrative of the Challenger disaster that Feynman was in the role of someone else's "useful genius."


"This was a typical Richard Feynman explanation. On the one hand, it infuriated the experts who had worked on the problem because it neglected to even mention all of the clever problems that they had solved. On the other hand, it delighted the listeners since they could walk away from it with a real understanding of the phenomenon and how it was connected to physical reality."

I've never seen this article before, and I love how the author managed to leave me delighted in the same way, feeling much smarter than I actually am, especially about theoretical physics.


A reprint of it is in the book Feynman and Computation, edited by Anthony J.G. Hey. You can find it on Amazon or at your local library.


I came across this before and found the section "An Algorithm For Logarithms" interesting.

As pointed out (in the second paragraph of that section), Feynman's observation about representing a number as a product of terms of the form $1 + 2^{-k}$ reduces the problem of estimating log to computing those values k which appear in the product. (Btw, the article incorrectly claims the representation is unique.)

There's an obvious linear-time algorithm for finding a sequence of values k, but I don't think that would really perform so well compared to, say, Newton-Raphson.

So I wonder why this was a good approach. Perhaps there's a smart way to estimate the k or perhaps it just happened to fit the constraints under which they were working.

Anyone have some insights?


Newton-Raphson doesn't buy you much for logarithms, because the iteration itself involves either exp(x) or log(x)[1].

If you don't have a real multiplier[2], then the algorithm described is somewhat attractive because multiplying by 1/(1+2^-k) has a nice series expansion. You can do it using only shifts and adds (and the number of terms you need is O(1/k), so it falls off reasonably quickly), and you get the part you need to compute the next `k` early, so there's no serial dependency. Actually finding k is just a count-leading-zeros operation, which is cheap enough.
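To make that concrete, here is a rough sketch of the reduction (my own illustrative version; plain Python floats stand in for the fixed-point arithmetic, where the division by (1+2^-k) would be the shift-and-add step and the next k would come from a count-leading-zeros of x-1):

    import math

    # Table of log(1 + 2^-k); on a machine like the CM this would be a small
    # set of precomputed constants.
    LOG_TABLE = [math.log(1.0 + 2.0 ** -k) for k in range(1, 33)]

    def log_shift_add(x):
        # Approximate log(x) for x in [1, 2) by peeling off factors (1 + 2^-k):
        # whenever x still exceeds (1 + 2^-k), divide it out and add the
        # tabulated log(1 + 2^-k). After the loop x is within about 2^-32 of 1,
        # so log(x) ~= x - 1 serves as the final correction.
        assert 1.0 <= x < 2.0
        result = 0.0
        for k in range(1, 33):
            factor = 1.0 + 2.0 ** -k
            if x >= factor:
                x /= factor
                result += LOG_TABLE[k - 1]
        return result + (x - 1.0)

    print(log_shift_add(1.5), math.log(1.5))   # the two should agree very closely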

Computers today pretty much all have real multipliers, so no one does this. Instead we reduce to a range like [1/sqrt(2),sqrt(2)], and then either use a minimax approximation on that interval (if the accuracy requirement is reasonably low) or further reduce by looking up a value of r close to x for which we have a pre-computed 1/r and log(r) and take advantage of:

    log(x) = log(r * 1/r * x) = log(r) + log(1/r * x)
this allows one to achieve high-accuracy results if r is chosen so that one of 1/r and log(r) is exact and the other is unusually close to the exact value[3], or stored in extended precision.

[1] If you have a fast approximate exp and log you can make use of it in an approximate N-R iteration, but it still doesn't buy you much. In practice no one does this; we compute exp and log directly, and they're among the fastest functions in the math library.

[2] Meaning either you don't have one at all or that it's much, much slower than addition, as was common in the 1980s and earlier.

[3] This trick is called "Gal's Accurate Tables" (https://en.wikipedia.org/wiki/Gal's_accurate_tables), but as with most things named for someone, it was independently invented multiple times, and likely dates back decades before the credited inventor.


Thanks for these remarks Stephen.

IIUC you're speculating that the relevant constraint they faced was that multiplication was very burdensome. I agree; it seems plausible this was the reason they found this an attractive approach.

Also agreed about Newton-Raphson; that was a thinko on my part. It would be an unusual situation if NR were the right tool for estimating log.


It might be relevant that the CM-1's processors were tiny: they had a 1-bit word size.

Maybe this was meant to be a fast rough approximation? 1 + 2^-m + 2^-n ~= (1 + 2^-m)(1 + 2^-n). So check each bit in a fixed-point int and add a factor from a table for each 1 bit found? Then run a few rounds of Newton iteration at the end if you want it a little less rough? This goes well with the SIMD nature of the machine -- they broadcast single-bit ops one by one to all processors in parallel -- but maybe it's terrible numerically, I haven't checked.
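Something like this, maybe (a quick sketch of that bit-by-bit table lookup; plain Python floats standing in for the fixed-point bit tests, and the numbers are just illustrative):

    import math

    def rough_log(x, frac_bits=16):
        # Rough log(x) for x in [1, 2): treat each set bit of the fraction as a
        # factor (1 + 2^-k) and sum the tabulated log(1 + 2^-k) values.
        # Ignoring the cross terms means it systematically overestimates a bit.
        frac = int((x - 1.0) * (1 << frac_bits))     # fixed-point fraction bits of x
        total = 0.0
        for k in range(1, frac_bits + 1):
            if frac & (1 << (frac_bits - k)):        # bit k of the fraction is set
                total += math.log(1.0 + 2.0 ** -k)
        return total

    print(rough_log(1.3), math.log(1.3))   # in the right ballpark, but noticeably high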


Thanks, this is a good point: we have no idea how accurate they wished to be.


I came across this a long time ago, but read it again, and laughed out loud several times. It's a good read.

It's also amusing to think of how "small" science was only just a few decades ago. I guess that's what exponential growth is.. (plus, I guess, survivor bias) Here we have one of the greatest physicists of the 20th century working at a startup for a bunch of wet-eared kids from MIT. But oh, looking at the Wikipedia page of Danny Hillis (co-founder of Thinking Machines, author of the article), his thesis advisors were (drumroll) Marvin Minsky, Gerry Sussman, Claude Shannon. Not exactly nobodies either..


> I came across this a long time ago, but read it again, and laughed out loud several times. It's a good read.

I've read it before as well, but read it again recently due to this HN post (before it moved up in the stack, sometime last week).

While I always find it an amazing and amusing recollection of the period, at the end I always find a bit of "smoke" in my eyes, knowing that we have lost one of the greats of our age, and that I will never be able to say to him personally how much I have enjoyed reading and listening to his words.

Fortunately, though, we do still have those - and they seem to continue to inspire people - which is really something.


> I came across this a long time ago

Maybe because it was discussed 24 times on HN already, as observable when clicking "past."

I like Feynman stories nevertheless, even to re-read.


I love this part:

"Feynman had a proposed solution to the anisotropy problem which he attempted (without success) to work out in detail. His notion was that the underlying automata, rather than being connected in a regular lattice like a grid or a pattern of hexagons, might be randomly connected. Waves propagating through this medium would, on the average, propagate at the same rate in every direction."


A counter point to this narrative can be found here: http://www.inc.com/magazine/19950915/2622.html or here: http://thedailywtf.com/articles/Thinking-Machines

In the light of these two stories, one can speculate that Feynman was mostly hired by TM to help with PR, not to produce actual work.


I wonder what Feynman would be working on if he were a programmer today.


He wouldn't; he was careful to avoid that ever since the 1940s:

http://www.goodreads.com/quotes/325051-well-mr-frankel-who-s...

I guess you haven't read the book, take a look.


It’s too bad that the conversion from LaTeX markup to HTML was… nonexistent, leaving us with such things as “100\%” and math formulas in raw LaTeX forms.


Cellular automata-- a dead end.


Cellular automata are just a subclass of graph evolution algorithms constrained to a regular grid composed of homogeneous nodes. Neural networks are also graph evolution algorithms, and logical programs can be modeled this way as well.

Cellular automata are just a toy branch of a tree that is thriving like crazy.


Cellular automata are also the most natural fit for a huge bank of single-bit processors connected in a hypercube topology. There's very little software you need to write to turn that particular hardware into a cellular automaton simulator. It is a good first program for that machine.
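As a toy illustration of how small the per-cell program is (plain Python; a serial loop standing in for the identical rule every one-bit processor would apply in lockstep):

    def life_step(grid):
        # One Game of Life step on a toroidal grid. The whole "program" is a
        # neighbor count plus one boolean rule per cell.
        rows, cols = len(grid), len(grid[0])
        nxt = [[0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                n = sum(grid[(r + dr) % rows][(c + dc) % cols]
                        for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                        if (dr, dc) != (0, 0))
                nxt[r][c] = 1 if n == 3 or (grid[r][c] == 1 and n == 2) else 0
        return nxt

    # A glider on a 6x6 torus:
    g = [[0] * 6 for _ in range(6)]
    for r, c in [(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)]:
        g[r][c] = 1
    g = life_step(g)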


Exactly. A toy. I had Fredkin for introductory computer science. He was probably my favorite professor, however I was a physics major and didn't go further in coding. He would rant about the game of life. I was sad about this, even as an 18-year-old.


I just visited the long now foundation's SF office yesterday. Very cool stuff.


Someone please explain how to do this. It's been a lifelong mystery that has always fascinated me:

> By the end of that summer of 1983, Richard had completed his analysis of the behavior of the router, and much to our surprise and amusement, he presented his answer in the form of a set of partial differential equations. To a physicist this may seem natural, but to a computer designer, treating a set of boolean circuits as a continuous, differentiable system is a bit strange. Feynman’s router equations were in terms of variables representing continuous quantities such as “the average number of 1 bits in a message address.” I was much more accustomed to seeing analysis in terms of inductive proof and case analysis than taking the derivative of “the number of 1’s” with respect to time. Our discrete analysis said we needed seven buffers per chip; Feynman’s equations suggested that we only needed five. We decided to play it safe and ignore Feynman.

How is it possible to use PDEs to model boolean circuits?


Just today I was reminded of this paragraph when we were thinking about a completely different discrete optimization problem, with the expectation that there must be some way to solve it analytically instead of numerically.

But in the case of the CM router, “the average number of 1 bits in a message address” is a surprisingly useful quantity, because in the most straightforward implementation of hypercube routing the number of one bits in the address is equal to the number of links the message must traverse. The algorithm is essentially this: find an address bit set to one, clear it, and send the message down the link with the same number as that bit's position; if the address is all zeros, the message has reached its destination. (The same thing can be implemented without changing the address seen on the wire, by having each router XOR its node ID with the address in the message to get the same bit vector, but I believe most implementations actually modify the message address in each router.)


I think the "with respect to time" and "number of bits" part is the continuum. You can then think of your communication as a stream and a rate. Their computer had a million cores in its original design, but each could only talk with ~20 nearest neighbors. Weirdly, information might begin to mimic simple fluid dynamics in this environment.

I suspect that's why Feynman could build those equations as well as discover that they only needed the 5 buffers. Though I can fully sympathize with the program manager in not wanting to reason in terms of differentials.
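To make that concrete, here is a toy of what such a fluid-style model looks like. These are not Feynman's actual router equations (the essay doesn't give them), and the arrival/service rates below are made-up placeholders; the point is only the shape of the reasoning, a derivative of an expected count with respect to time:

    def peak_occupancy(arrival_rate, service_rate, t_end=50.0, dt=0.01):
        # Integrate db/dt = arrival(t) - service(b), treating the (discrete)
        # buffer occupancy b as a continuous quantity, and report its peak.
        b, t, peak = 0.0, 0.0, 0.0
        while t < t_end:
            b = max(0.0, b + (arrival_rate(t) - service_rate(b)) * dt)
            peak = max(peak, b)
            t += dt
        return peak

    burst = lambda t: 3.0 if 5.0 <= t <= 15.0 else 0.5   # hypothetical traffic burst
    serve = lambda b: min(b, 2.0)                         # link forwards at most 2 per unit time
    print(peak_occupancy(burst, serve))                   # the peak hints at how much buffering you'd need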


There aren't enough details here to say, but there are a lot of touch points between discrete mathematics and calculus. If you read Knuth's AOCP you'll see a lot of other examples where things like integrals are used in discrete math problems...


Have a really large number of circuits. It's the same as how you can use PDEs to model atoms. If you have enough atoms in the collection, you don't have to think about individual atoms.


The same way that you can use continuous Fourier transforms to model the discrete one. Or, for that matter, the way we use digital simulations to model all sorts of things.


> How is it possible to use PDEs to model boolean circuits?

Be Richard Feynman?


My guess is that his calculations were akin to using a hash table to sort routing information and analyze the eviction patterns and probabilities over time?


I was curious about that too. This bit:

> representing continuous quantities such as “the average number of 1 bits in a message address.”

suggested to me that maybe he recast it as a system dynamics problem? I'm a little out of my depth here, though.


Iirc, he used STATISTICAL DEs to model data flow.


Good but old: 1989. Should be stated in the title.


Feynman died almost 30 years ago ... I think it's pretty safe to assume it's not recent news, even without the (1989)


I don't think anyone would assume it's a news story. The title presents it as an essay, and new essays get written about historical details.



