Hacker News new | past | comments | ask | show | jobs | submit login

I don't think "arbitrary linear algebra operations" is a valid critique. If you understand PCA as "take the SVD of the data", then the operations seem arbitrary. But if you understand it as, "construct a low-rank approximation in the L2 sense to the data, or its covariance", then it's not.

Also, I don't think that the (very legitimate) "dimensional" critique of PCA applies here. The units on the coordinates of the representation are the same: the presence or absence of that prime factor.

To the original question: my suspicion is that PCA might pull out the even numbers (first PC) and the divisible by 3 numbers (second PC), because these two factors may explain the most variability in the underlying vector representation. If it did, that would be pretty intuitive, although not as interesting.

---

Edited to add: Suspicion turned out to be true. For the first 2000 integers, the top 6 PCs turned out to correspond to the first 6 primes (2, 3, 5, 7, 11, 13).

Plot at: https://imgur.com/a/qi2Sx5u?

  function [nums,pcs]=pca_prime(nMax,nPC)

  nums = zeros(nMax, nMax);
  for k = 2:nMax,
    nums(k,factor(k)) = 1; % vector representation of "k"
  end;
  % 2:end because don't care about 1 as a "prime"
  pcs = pca(nums(2:end,:), 'NumComponents', nPC);
--

  [nums,pcs]=pca_prime(2000,10); % "svd" would work too
  plot(pcs(:,1:6)); % first 6 PCs



If you think of the covariance matrix, entry i,j for i ≠ j will be

   floor(n / (p[i]*p[j])) / n - floor(n / p[i]) * floor(n/p[j]) / n^2
and the ith diagonal entry will be

   floor(n / p[i]) / n - ( floor(n / p[i]) / n )^2
for n large, you approximately get a diagonal matrix with diagonal entries / eigenvalues 1/p[i] - 1/p[i]^2.


Smart observation. Another way to say it is that, for distinct primes p1 and p2, the events “p1 divides n”, and “p2 divides n”, are approximately statistically independent. So you get a near-diagonal covariance with entries as you wrote.


> If you understand PCA as "take the SVD of the data", then the operations seem arbitrary. But if you understand it as, "construct a low-rank approximation in the L2 sense to the data, or its covariance", then it's not.

Those are the same thing. Is your first case just a reader who has no idea what the SVD is?


Yes, they are both descriptions of the same thing. I'm trying to say that PCA does have a justification. It's not just an "arbitrary linear algebra operation", although the application of the SVD algorithm to perform PCA can be presented that way.


Are you saying the kth principal component is equal to the kth prime number?


Yes. See the plot: for example, PC #1 is essentially a 0/1 vector with all its weight (1 in this case) placed on the "2", which is the representation of the prime number 2 in the scheme used in the OP.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: