
Ugh. This again.

http://www.scottaaronson.com/papers/npcomplete.pdf

"this totally scales in polynomial time... but we lose statistical significance as n increases"

"this totally scales in polynomial time... but scales exponentially in the number of proteins needed"




The linked article is a little over-hyped, but the paper authors themselves acknowledge that their system is simply a really cool example of parallel computing and doesn't solve or avoid the P?=NP problem. From the paper:

"...it is inherent to combinatorial and NP-complete problems (assuming P! = NP) that the exploration of the entire solution space requires the use of exponentially increasing amounts of some resource, such as time, space, or material. In the present case this fundamental requirement manifests itself in the number of agents needed, which grows exponentially with 2N. Effectively we are trading the need of time for the need of molecular mass."

They could have implemented many different parallelizable algorithms, so the choice to solve a case of subset sum was probably for PR/the awesome factor (not to be underestimated as an academic motivation).
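
A quick back-of-the-envelope sketch (plain Python brute force, nothing like the authors' molecular setup) shows where the 2^N comes from: an exhaustive subset-sum search has to consider every subset, and the number of subsets doubles with each added element.

    from itertools import combinations

    # Brute-force subset sum: every one of the 2**len(nums) subsets is a
    # candidate, which is the same solution space the paper's agents explore
    # in parallel (trading time for molecular mass).
    def subset_sum_bruteforce(nums, target):
        for r in range(len(nums) + 1):
            for combo in combinations(nums, r):
                if sum(combo) == target:
                    return combo
        return None

    print(subset_sum_bruteforce([2, 5, 9], 16))  # -> (2, 5, 9)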


That's a cool paper. It touches on a question I've had before about how much of our algorithmic reasoning is predicated on the implementation of CPUs and our current tech stack.

For example, I understand that it's been proved that sorting can at best be done in O(n*log(n)), which I guess is related to the number of 1-to-1 comparisons that need to be done.

However, how does sorting, say, the physical length of sticks come into this? If you stand them all in a bundle you can easily select out the tallest, and then the next tallest, etc, by e.g., lowering a sticky horizontal plate.

It's beyond my cleverness to think about how that could be implemented in a useful way for a computer, but could we prove that it's not possible? Could we have custom-designed units for specialized operations that take advantage of differences between the digital and analog worlds?


I'm going to provide an avenue for thinking about why the stick sorting example is problematic.

Before starting, you have to specify the size of the plate / the sorting machine. Let's say it supports 10,000 sticks.

I'm going to give you 20,000 sticks.

Does your machine still solve the problem in O(n)? Nope, it's back to O(n log n).

The issue with complexity theory is that it's for _arbitrarily large N_, and solutions that work intuitively for small N often run into walls, especially physical solutions.

EDIT:

BTW, I had to stop and think for a bit on that one. This question made me smile; it reminded me of why I enjoyed school. Great question: the answer isn't obvious, and it elucidates something about the nature of complexity and how it can disagree with intuition.
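
For concreteness, here's a rough Python sketch of the capacity argument (the 10,000-stick machine is just modelled as a sorted() call on each batch): once n exceeds what the plate holds, you sort batches and merge them, and the merge is where the log factor sneaks back in.

    import heapq

    CAPACITY = 10_000  # assumed size of the hypothetical sticky-plate machine

    def machine_sort(batch):
        # Stand-in for one run of the physical machine; it can only hold
        # CAPACITY sticks at a time.
        assert len(batch) <= CAPACITY
        return sorted(batch)

    def sort_with_machine(sticks):
        batches = [machine_sort(sticks[i:i + CAPACITY])
                   for i in range(0, len(sticks), CAPACITY)]
        # The k-way merge of the pre-sorted batches reintroduces a log factor.
        return list(heapq.merge(*batches))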


I don't think this is the right objection to the sorting machine beating O(n log n). Just as any real sorting machine is limited in size, every real computer is limited as well. Just as the machine can't hold arbitrarily many (or arbitrarily large) sticks, your computer doesn't have enough memory to hold arbitrarily many (or arbitrarily large) numbers. Complexity theory is predicated on idealized computers (Turing machines with infinitely long tapes, or register machines able to hold infinitely large numbers).


But isn't this just radix sort, effectively? That's linear-time, no problem. There's no fundamental rule saying that sorting has to be O(n lg n); it's only sorting-by-comparison that needs O(n lg n) comparisons.

Though I guess one might point out that, without direct (and infinitely precise!) comparisons, eventually you'll have enough sticks that you won't really be able to tell which is the largest.
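
For reference, a minimal LSD radix sort sketch (assuming non-negative integer keys); it does no pairwise comparisons and runs in O(n*k) for k digits:

    def radix_sort(nums):
        # LSD radix sort for non-negative integers: one stable bucket pass
        # per decimal digit, so O(n * k) for k digits and no comparisons.
        if not nums:
            return []
        exp = 1
        while max(nums) // exp > 0:
            buckets = [[] for _ in range(10)]
            for x in nums:
                buckets[(x // exp) % 10].append(x)
            nums = [x for bucket in buckets for x in bucket]
            exp *= 10
        return nums

    print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))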


You can build an arbitrarily large sticky plate in linear time, before using the plate. It doesn't add a log(n) factor.


Well, I don't think that's true.

Eventually, the plate will be higher density than the material available at the point of its construction, so it will need to take material from outside the plate's current area to fill it.

As the plate extends further into lower-density regions, it will have to extend its materials gathering exercises further and further away from the actual construction site, so they'll take longer and longer to get there.

Even if the plate is very low density indeed, eventually you'll run out of planet and have to start dismantling the moon - which is quite a long way away - unless the plate has a lower density than the volume of space the moon encloses.


> However, how does sorting, say, the physical length of sticks come into this? If you stand them all in a bundle you can easily select out the tallest, and then the next tallest, etc, by e.g., lowering a sticky horizontal plate.

But here you are only counting the plate-lowering and longest-stick-removal operations, of which there are linearly many. The actual number of comparisons you do in this example is still very much quadratic: in every step, you have to check every stick to see whether or not it touches the plate; you've just found a very fast way to do each check (though each pass is still linear in the number of sticks remaining).
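
A toy simulation makes the count explicit (assuming "does it touch the plate?" amounts to comparing a stick against the current tallest): it's just selection sort, roughly n^2/2 checks for n sticks.

    def sticky_plate_sort(sticks):
        # Each plate lowering picks off the tallest remaining stick, which
        # means scanning every stick still standing: selection sort,
        # about n^2 / 2 "does it touch the plate?" checks in total.
        remaining = list(sticks)
        checks = 0
        out = []
        while remaining:
            tallest = remaining[0]
            for s in remaining[1:]:
                checks += 1
                if s > tallest:
                    tallest = s
            remaining.remove(tallest)
            out.append(tallest)
        return out, checks

    print(sticky_plate_sort([3, 1, 4, 1, 5, 9, 2, 6]))  # 28 checks for n = 8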


Note 1: Comparison sorting is O(n log n).

In domains where the range of values is smaller than the number of elements (so there are duplicate values), sorting can be faster.

https://en.wikipedia.org/wiki/Radix_sort

Note 2: for arbitrarily large sticks, is "lowering a sticky horizontal plate" a constant-cost operation? Or is it a factor of K (the difference between the longest and shortest stick)?
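
To illustrate Note 1, a counting-sort sketch (assuming small non-negative integer keys): when the values fall in a range 0..K that's small relative to n, sorting costs O(n + K) with no comparisons at all.

    def counting_sort(nums, max_value):
        # Tally each value, then read the tallies back out in order:
        # O(n + K) for values in 0..max_value, no comparisons needed.
        counts = [0] * (max_value + 1)
        for x in nums:
            counts[x] += 1
        out = []
        for value, count in enumerate(counts):
            out.extend([value] * count)
        return out

    print(counting_sort([3, 1, 3, 0, 2, 1, 1], max_value=3))  # [0, 1, 1, 1, 2, 3, 3]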


O(n log n) is the bound for a serial version of the algorithm. A head that can read or write one value (stick) at a time. Parallel sorts achieve better bounds. Your sticky horizontal plate is essentially N parallel reads, each scanning through values, continually asking if it's reached its stick.


This is called spaghetti sort. The analysis on Wikipedia should help.



