Yes, the typical point is far from the origin (but unless there is spherical symmetry the only thing those points have in common is being far from the origin). What do you gain by excluding the origin from the typical set?
Edit: I'm not sure if your comment contradicts or complements what I wrote. As I said, taking a hypersphere of radius 1, the "inner" hypersphere of radius 0.5 accounts for just 1/2^n of the hypervolume. Would you say that the "outer" shell is a neighbourhood? How is that defined? What is the advantage of doing so instead of using the whole (only marginally bigger) hypersphere?
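To put numbers on the 1/2^n figure, here is a minimal sketch. It relies only on the fact that the volume of an n-dimensional ball scales as r^n; the function name `inner_fraction` is made up for illustration.

```python
def inner_fraction(n, r=0.5):
    """Fraction of the unit n-ball's volume lying within radius r of the centre.

    Ball volume scales as radius**n, so the ratio of the inner ball
    to the unit ball is simply r**n.
    """
    return r ** n


for n in (1, 2, 10, 100):
    print(f"n={n:>3}: inner fraction = {inner_fraction(n):.3e}")
```

Even at n = 10 the inner ball holds less than 0.1% of the volume, which is why the outer shell and the whole hypersphere are almost the same set in volume terms.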
There's some information theory that goes over my head, but I think the point is that you can approximate E[f(x)] with an integral over the typical set and expect good results, for non-pathological functions f.
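A rough way to check that claim numerically, as a sketch under stated assumptions: the target is a standard Gaussian in 100 dimensions, f(x) = |x|^2, and the "typical set" is taken to be the shell 8 <= |x| <= 12. All three are illustrative choices and none of them comes from the paper.

```python
import math
import random


def sample_radii(dim, n_samples, seed=0):
    """Draw n_samples points from a standard Gaussian in `dim` dimensions
    and return the distance of each point from the origin."""
    rng = random.Random(seed)
    return [
        math.sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(dim)))
        for _ in range(n_samples)
    ]


radii = sample_radii(dim=100, n_samples=2000)

# Illustrative shell around sqrt(dim) = 10 (an assumed choice of "typical set").
in_shell = [r for r in radii if 8.0 <= r <= 12.0]
frac_in_shell = len(in_shell) / len(radii)

# Monte Carlo estimate of E[|x|^2] using all samples, and using only the
# contribution from samples inside the shell.
full_estimate = sum(r * r for r in radii) / len(radii)
shell_estimate = sum(r * r for r in in_shell) / len(radii)

print("fraction of samples in shell:", frac_in_shell)
print("smallest sampled radius:", min(radii))
print("E[|x|^2], all samples:", full_estimate)
print("E[|x|^2], shell only:", shell_estimate)
```

Under these assumptions nearly all samples land in the shell and none come close to the origin, so dropping everything outside the shell barely changes the estimate.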
Defined in this way, you could of course leave the origin in the typical set. But then there would be another, smaller, typical set that would do the job just as well.
Thanks. I'm sure the formalism makes sense if properly developed and can lead to useful approximations. It's just that the "conceptual introduction" in that paper makes no sense to me... I understand it's just a simplification, but simplifying too much removes the essence of things.
> A grid of length N distributed uniformly in a D-dimensional parameter space requires N^D points and hence N^D evaluations of the integrand. Unless N is incredibly large, however, it is unlikely that any of these points will intersect the narrow typical set, and the exponentially-growing cost of averaging over the grid yields worse and worse approximations to expectations.
In fact, any point within the hypersphere is as good as (or better than) any point in that narrow typical set, which does not include the mode. The problem is not missing the "narrow typical set"; it is missing the whole hypersphere (because the volume of the hypersphere is small). I fail to see where in that introduction the fact that the typical set excludes the mode of the distribution makes any difference.
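A minimal sketch of how small the hypersphere is, assuming the grid covers the cube [-1, 1]^n that encloses the unit ball (the enclosing cube is an assumption here; the quoted passage does not specify the grid's extent). It uses the standard formula pi^(n/2) / Gamma(n/2 + 1) for the volume of the unit n-ball; `ball_fraction` is a hypothetical helper name.

```python
import math


def ball_fraction(n):
    """Fraction of the cube [-1, 1]^n occupied by the inscribed unit n-ball."""
    ball_volume = math.pi ** (n / 2) / math.gamma(n / 2 + 1)
    cube_volume = 2.0 ** n
    return ball_volume / cube_volume


for n in (2, 3, 10, 20):
    print(f"n={n:>2}: ball / cube = {ball_fraction(n):.3e}")
```

A uniform grid places roughly this fraction of its points inside the ball at all, regardless of how the ball is then split into an inner part and an outer shell.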