Hacker News new | past | comments | ask | show | jobs | submit login
Counterintuitive Properties of High Dimensional Space (marckhoury.github.io)
401 points by marckhoury on March 5, 2018 | hide | past | favorite | 64 comments



Just today on numberphile channel they showed how to pass circle through smaller square hole by bending it in higher (3rd) dimension: https://m.youtube.com/watch?v=AvFNCNOyZeE


There's a (sort of) intuitive way of understanding why things like these are possible without even seeing it visually.

How can you make two solid 2D squares occupy the space? They can't travel through each other, so...? Easy. Lift one of the squares into the 3rd Dimension, move it to the same X,Y coords, then drop it on the other (think of putting stacking 2 pieces of paper).

Similarly, you can have two 3D objects occupy the same space by "lifting" one into the 4th dimension, moving it to the same X,Y,Z coords as the other, and then "dropping it". You can "cheat" by thinking of the 4th dimension as time - one object travels through time, places itself at the same position of where the other object is, then returns. But as I said, that's "cheating".


I do similar mental exercise with higher dimensions, but for me with easier situation - escaping from certain area without crossing boundaries.

1 dimensional - an endless line with 2 spots marking boundaries to some subpart X. If you're inside, you need to enter 2nd dimension to escape that subpart X without crossing boundaries (and ie come back to original dimension).

2 dimensional - circle, you're inside, you need to enter 3rd dimension to escape without touching the circle.

3 dimensional - sphere, entering 4th dimension to escape without touching the sphere.

I can't go with my simpleton mind much further but it should scale indefinitely :)


The did a numberphile video on this exact topic a few months ago: https://www.youtube.com/watch?v=mceaM2_zQd8


The video you linked to has nothing at all to do with the parent one. You probably meant to reply to the topic, rather than the comment you actually replied to.


I suspect you're wrong, and that the connection between video and parent was that they were both Numberphile videos. An expanded comment might have read something like:

"Not only is today's Numberphile video potentially relevant, Numberphile actually directly addressed the article's exact topic several months ago."


That short little video gave me a lot of pleasure!


Anyone interested in practical consequences of these counterintuitive properties should look up https://en.wikipedia.org/wiki/Curse_of_dimensionality for how it impacts things like machine learning where we naturally are working with many dimensions.


The behavior the article is, to me, way more bizarre than the curse of dimensionality.

It's tempting to think of data sets as "point clouds". This article is a reality check for me: you can't safely apply intuition about 2- and 3-d point clouds to higher dimensional data. I suspect that this explains why methods like tSNE seem to produce unstable results depending on the parameters [0]. The notion of a "neighbor" in high dimensions is just not what I think it is.

I suppose the same is true for high-dimensional cost surfaces. Gradient descent is often described as "like walking down a hill". But without a deep understanding of high-dimensional geometry, I'm not at all confident that I know what a 4-, 10-, or 1000-dimensional hill looks like.

The lesson: Be skeptical of my own geometric intuition unless it is firmly backed by math.

[0]: https://distill.pub/2016/misread-tsne/


Indeed. I find https://www.thestar.com/news/insight/2016/01/16/when-us-air-... to be a good cautionary tale on how our intuition about people being close to average is misleading - nobody is. And nobody is particularly like anyone else, either.

On a thousand dimensional hill, my intuition is that it locally looks like a low dimensional hill, along axes that you can find through techniques like Principal Components Analysis. This has yet to mislead me. On the other hand, my pure math background was a long time ago, and I have not explored machine learning in any real depth...


But what this article makes me think is that even "low" dimensions can't be trusted, as long as it's greater than 3.


indeed ! the 'neural networks for pattern recognition' book (bishop) has exercises in chapter-01 on the same subject.


from the article: The volume of the unit d-sphere goes to 0 as d grows! A high dimensional unit sphere encloses almost no volume!

Maybe I'm nitpicking, but this interpretation is not accurate IMO. Volume of N-dimensional sphere is measured is different units than that of (N-1)-dimensional sphere. E.g. one is m^5, and another is m^4. Comparing values measured in different units is not the best idea. It would be better worded as : the ratio of volume of N-dimensional sphere to the volume of same-dimensional cube approaches zero when N goes to infinity. But this is not a counter-intuitive statement.


A more precise, unit-respecting result is to plot the ratio of the unit sphere's volume to the unit cube's volume. This is a dimensionless quantity, so it makes sense to compare them, but happily the numeric values are unchanged because the unit cube always has volume 1 m^N in dimension N.


The ‘pithy’ version of this idea is that most of a high-dimensional orange is the peel.

It’s from this excellent article by Pedro Domingos: https://homes.cs.washington.edu/~pedrod/papers/cacm12.pdf


This is actually slightly different issue.

Regardless of the dimension, the volume of n-ball is V_n(r) = c_n * r^n, where r is radius, and c_n is a constant depending on n. The grandparent observation is that surprisingly enough, c_n goes (rather quickly) to 0 as n goes to infinity, so for unit n-ball, that is, a ball of radius 1, the volume is exactly c_n, which is very small. This is surprising to us, because in familiar case of n = 3, c_3 = 4/3 pi, which is moderately large.

As for the mass being concentrated around the peel, suppose we have an orange of radius R+e, and its peel has thickness of e. Then, the ratio of volume of the peel to the volume of the whole orange is exactly:

(V_n(R+e) - V_n(R))/V_n(R+e) = 1 - V_n(R)/V_n(R+e) = 1 - c_n R^n / c_n (R+e)^n = 1 - R^n/(R+e)^n = 1 - (R/(R+e))^n

Now, since R/(R+e) < 1, for large n, (R/(R+e))^n will be very small, and so the ratio of the volume of the peel to the volume of the whole orange will be 1 minus something very small, so close to 1.

Note that the peel argument works just as well with "square" oranges -- they also have most of their mass concentrated around the peel, but contrary to round oranges, their whole mass does not go to 0 as the dimension increases. To see that, note that the volume of n-dimensional square with side of R is exactly R^n, and if you do the above computation, it's exactly the same (note that the constant c_n cancelled out anyway).

In this sense, your comment and the grandparent ones are about two different phenomenons -- grandparent is talking specificly about the geometry of the sphere in L_2 norm, while you are talking generally about n-dimensional volumes.


That's what I thought reading that too. In general I don't think it's a good idea for maths people to just forget about units ("because it is for physics")


Speaking about math: I'm not even sure what the meaning of polynomial is. E.g. x^2+3x+1. Each component is measured in different units. Is it intuitive?


But such polynomials do turn up in physics, e.g. accelerated motion: x(t) = 1/2 a t^2 + v t + x0. The trick is that the coefficients aren't unitless either.


Such a polynomial only makes sense when x is a dimensionless quantity, or alternatively when the coefficients have appropriate dimensions to compensate.


The definition of a polynomial has nothing to do with units of measurement. The coefficients come from a ring and a ring has nothing g to do with units of measurement.


That’s fine as a pure mathematics perspective, but in physics dimensions come into play often, and one often has polynomials whose domain and coefficients come “tagged” with particular dimensions.


OP did say, “speaking about math...”. The operations on polynomials and dealing with polynomials doesn’t have anything to do with units of measurement. It may be the case that when used in some areas and in some contexts that the units matter but I think it clouds issues to bring them up.

The intuition for operating with polynomials is best obtained by not worrying about units.


I think we’re not really disagreeing about anything. In a pure math context, x would be, as you say, a quantity in which dimensions play no role.


As he said, x is part of a ring. In physics the multiplication of two elements from the same ring end up on another ring (meters^2 for example) in which, for example, you cannot do addition with elements from the original ring (meters)


The mathematical definition of a ring precludes this from happening. By definition, in a ring when you multiply 2 elements from the ring you get another element of the ring. Perhaps you are referring tensor products (geometric algebra for physicists).


I admit that I have no intuition of a product of 2 dimensionless quantities :)


That would just be like a double-rescaling. Like "a meter is now twice as long as double a normal meter."


In abstract algebra, polynomials are just tuples (a_n, ... a_0) with addition (simple elementwise) and multiplication (more involved) defined. The +s and xs in the "x²+3x+1" notation are just syntax without semantic relevance; in particular x is not a variable and the substitution a ↦ P(a), a ∈ 𝗦 for any a and 𝗦 is not defined. This is because most interesting things about polynomials can be reasoned about without assuming anything about what x is.


Yes. But I've learned from bitter experience that mathematicians don't believe in units. At least not in their gut. To them, it's all just numbers.

(Even more infuriating is that physicists do believe in units, except when writing code).


Would it be counter intuitive though, that the ratio increases until dimension 5 (I think) then starts decreasing?


It's a matter of one's individual intuition, but said ratio is a non-linear function, and non-linear functions most often are non-monotonous.


Except exponential, log, log-linear, sqrt, the time or space complexity of pretty much every problem in CS...


And all cumulative distribution functions, x^k where k is odd, all functions F(x) = integral(f(x') dx', -inf, x) where f(x) is continuous..


Exactly. The article is really bad and serves only to confuse those with no experience in high-dimensional mathematics.


This was the first application of the Monte Carlo numerical method I've done in college. Take an N-dimensional cube with volume = 1. Inscribe an N-sphere in it. Calculate the volume of the N-sphere as a function of N.

I was too lazy to do the strict proof, so I sprayed it with lots of Monte Carlo bullets. It's like a page of code in any language. It turns out, as the article says, the volume of the N-sphere keeps getting smaller and smaller as N increases. In higher dimensions, there's a lot more volume in the corners of the N-cube. It did seem like the N-sphere was shrinking down to nothing in spaces with lots of dimensions.

Seems obvious after you read the article and look at the diagrams, but back then I had to think about it for a while. I thought my implementation was wrong somehow, but eventually I realized what was going on. Pretty amazing stuff.


John D. Cook has a couple of blog posts about non-intuitiveness of high-dimension geometry [1][2] and this article [3] expands on the very observation that you described.

Quote: "In his article “An Adventure in the Nth Dimension,” Brian Hayes explores how in high dimensions, balls have surprisingly little volume. As the dimension n increases, the volume of a ball of radius 1 increases until n = 5. Then for larger n the volume steadily decreases."

[1] https://www.johndcook.com/blog/2017/07/13/concentration_of_m...

[2] https://www.johndcook.com/blog/2017/07/19/corners-stick-out-...

[3] https://www.johndcook.com/blog/2012/10/23/dimension-5-isnt-s...



If you're interested in related problems, try improving on the performance of the wfg algorithm for computing the volume enclosed by a set of n-dimensional points. I spent a good 3 months convinced that I could, but I couldn't. The reason I couldn't is that I thought I could reason by analogy with 3D space, and it turns out that this is foolhardy.


"Sparse Distributed Memory" is a lovely book by Pentti Kanerva in which he hypothesizes that high dimensional binary vectors (hypervectors) is key to understanding human memory and cognition. To a large part Kanerva is concerned with constructing the theory for a feasible implementation of hypervector storage and arithmetics in modern hardware. (While the book is some decades old the main strokes and ideas still hold relevant.) As a fun fact, and quite telling of what kind of book it is, Douglas Hofstadter has written the foreword.

The book is quite hard to get hold of, but here is a recent talk: https://www.youtube.com/watch?v=zUCoxhExe0o


An interesting property of word vectors (which are usually 300-600 dimensional vectors) is that most are quasi-orthogonal. That means when you sum them up, they compose well representing all the meanings of the component parts, and strangely, multi-sense words such as "bank" contain all the senses overlapped, yet distinct.

Another interesting property is that high dimensional space has many shortcuts, or that at any point there are many paths. It's like a kaleidoscope with infinite reflections or like a mirror house. Or it's like any point has many close neighbours which can, paradoxically, be far apart between them.


> An interesting property of word vectors (which are usually 300-600 dimensional vectors) is that most are quasi-orthogonal.

I think it would be more interesting if this wasn't the case. The set of all possible English words is <200,000, with probably 10% of those being in common use. Given the small set, large number of dimensions, and the nature of language, it seems likely that non-random word vectors would tend towards orthogonality.

I'm assuming you mean that, "I will run with Bob" and "I will jog with Stacy" are not orthogonal, because they convey a very similar message, but are orthogonal to, "Man, that was a good beer."


Could this be one of the reasons why there are so many conspiracy theories these days? That is, with so many dimensions of data available, one can find close connections between any two things?


A conspiracy theorist will see a lack of evidence to the contrary as evidence of the affirmative. I.e., aliens exist because government officials let us in to the military base so we can see for ourselves there are no aliens there.

Whereas a reasonable person needs evidence of affirmative to believe something. I.e., news that NASA probes have discovered bacteria in subterranean wells on Mars.

The former way of thinking can be used to "prove" just about anything. So yeah, in some sense, most data points helps fuel more conspiracies because certain people are conditioned to believe anything.


Great article! I've been thinking about this on and off recently, as I wonder if we might be having intuitive issues when it comes to gradient descent optimization.

3 Blue,1 Brown had a video recently that kicked off my head scratching and is a great complement to your article: https://www.youtube.com/watch?v=zwAD6dRSVyI


Thank you! It's funny you should mention that because I've been thinking about continuous optimization a lot lately.

That's a great video, I really like his slider method for understanding the coordinates. Thanks for linking it!


The sphere drawing in figure 7... People seem to miss this subtelty, the spikeness is an artifact of the coordinate system. Rotate the basis vectors and the spikes move with them.


"You may be used to using the word “circle” in two dimensions and “sphere” in three dimensions. However, in higher dimensions we generally just use the word sphere, or d-sphere when the dimension of the sphere is not clear from context. With this terminology, a circle is also called a 1-sphere, for a 1-dimensional sphere. A standard sphere in three dimensions is called a 2-sphere, and so on. "

So, what is a 2-dimensional Sphere called if the 3-dimensional one is a 2-sphere?


The convention is to name things for their inherent dimensionalitly, not the space they live in.

A sphere in three dimensional space is a two dimensional object, in the sense that it's a surface on which the points can be described by two coordinates (say longitude and latitude).

Similarly a point on a circle can be described by a single coordinate (distance around the circle).

So the 1-sphere is a circle, and normally lives in 2 dimensional space, while the ordinary sphere is called a 2-sphere, and lives in three dimensional space.


If you talk about the "volume of a sphere" (like the article does), then a sphere embedded in 3 dimensions should be 3 dimensional. This would usually be called a ball and not a sphere, though.

Note that analogously, a cube in 3 dimensional space is always considered to be 3 dimensional.


Right, the 2-sphere is the surface of the 3-ball. I think "volume of a sphere" is okay though. It just means the volume enclosed.

>a cube in 3 dimensional space is always considered to be 3 dimensional.

So "cubes" are taken to include their interior. That makes sense I suppose. I wonder what the surface of a cube is called. A "box" maybe?


Things that are named for their inherent dimensionality regardless of the dimensionality of the space they are contained in are others, such as points (0), lines (1), planes (2), general manifolds, vector spaces, etc. Spheres (more properly balls) and cubes (or hypercubes) are understood to have the same number of dimensions as the space they are contained in.


I think it should be “in three-dimensional space”. The sphere is two dimensional.


Richard Hamming did a neat lecture on N-Dimensional Space, in the context of engineering: https://www.youtube.com/watch?v=uU_Q2a0S0zI


One of the first chapters of The Art of Science and Engineering (Hamming) covers this beautifully.


A simple observation that makes these properties a bit less counterintuitive is that the diameter of an n-dimensional unit cube (Euclidian distance between two opposite corners) is √n while the diameter of a unit sphere is always 1. So as the number of dimensions grows, the diameter of the unit cube can become arbitrarily large and the corners of the unit cube move further and further away from the unit sphere.


I love Matt Parker's take on the fourth dimension: see for instance this one-hour long but very enjoyable talk for general audiences at the Royal Institution: https://www.youtube.com/watch?v=1wAaI_6b9JE (Things to See and Hear in the Fourth Dimension)

If you know German, then you might like this talk as well: https://www.youtube.com/watch?v=d19zmiVBLS8 (The curious world of four-dimensional geometry)


Great 3blue1brown video to accompany this – Thinking Visually about Higher Dimensions:

https://www.youtube.com/watch?v=zwAD6dRSVyI


Why do we extrapolate 4d based on 3D volume, and not surface area? For 2D->3D we use area. Why would we then use volume in 3D->4D when surface area is an option?


The space that is enclosed by a unit sphere in dimension 1 is the interval (-1, 1). In physics this is measured in meters. This is referred to as length. The space enclosed by the unit sphere in dimension 2 is measured in meters squared. This is referred to as area. In dimension 3 we call it volume and measure it in meters cubed.

In dimensions 4 through infinity we quickly come to a problem. Do we come up with unique names to refer to amount of space enclosed by a unit sphere? Mathematicians have decided to just use the word volume. We rely on context to make it clear what the dimension is. Similarly we use the word n-sphere to refer to the collection of points in n-dimensional space that are exactly 1 unit from the origin.

Surface area of an n-sphere is used and studied.


In my applied mathematics education the highest-dimensional unit of measurement of a given dimensionality was referred to generally as ‘content’: length is the content of 1D objects, area is the content of 2D objects, volume is the content of 3D objects, hypervolume is the content of 4D objects, and henceforth you just start numbering them...


We're using volume the whole time. Area is just the two-dimensional equivalent of volume. The two-dimensional equivalent of surface area would be perimeter, which is relatively rarely used.


The more general term for ’area’ and ’volume’ in higher dimensions is ’content’: area is the content of a 2D item, volume is the content of a 3D item, hypervolume is the content of a 4D item, et cetera.


Same reason we don't extrapolate 2D->3D based on circumference. You need all the information of the (n)D thing, plus a new length, to get to the (n+1)D thing.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: