Technically, it is possible to use statistics as "proof" with N=30, but that is stretching it a bit, IMHO.
For example, claiming that the number of reports per year corresponds to the amount of code added, on the grounds that both are "somewhat linear", is not very solid. I could just as well claim that the number of reports per year is "somewhat exponential" and conclude that it does not correspond to the number of lines of code added.
This does not make the overall point any less true; it is just that the foundation, the numbers, is too thin to draw any grand conclusions from.
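To illustrate (a minimal sketch in Python, with made-up numbers rather than the article's actual data): with ~30 noisy yearly counts, a linear fit and an exponential fit can both come out looking "somewhat" right, so the shape of the curve alone does not settle which model the data supports.

    # Hypothetical data: 30 yearly report counts that grow roughly linearly,
    # plus noise. Both a linear and an exponential model fit it "well".
    import numpy as np
    from scipy.optimize import curve_fit

    rng = np.random.default_rng(0)
    years = np.arange(30)                               # N=30 observations
    reports = 10 + 3 * years + rng.normal(0, 8, 30)     # made-up counts

    def linear(x, a, b):
        return a * x + b

    def exponential(x, a, b, c):
        return a * np.exp(b * x) + c

    for name, model, p0 in [("linear", linear, (1.0, 1.0)),
                            ("exponential", exponential, (1.0, 0.1, 1.0))]:
        params, _ = curve_fit(model, years, reports, p0=p0, maxfev=10000)
        residuals = reports - model(years, *params)
        r_squared = 1 - residuals.var() / reports.var()
        print(f"{name}: R^2 = {r_squared:.3f}")

    # Both fits typically report a high R^2 on a sample this small, which is
    # why "somewhat linear" vs. "somewhat exponential" is not a solid
    # distinction at N=30.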
Might I humbly suggest that anybody serious about this issue read (sadly, the late) Manny Lehman's "FEAST" publications? He attempts to quantitatively model software evolution, which includes complexity, errors of omission (limitations of domain model), errors of commission ("bugs"), etc. It is fascinating reading. I remember many "Aha!" moments when seeing the graphs. It also contains many quantitatively-derived principles one can operate by, some of which underlie pg's "beating the averages" argument.
His Wikipedia page is here: https://en.wikipedia.org/wiki/Manny_Lehman_%28computer_scien..., and the FEAST publications are here: http://www.eis.mdx.ac.uk/staffpages/mml/feast2/papers.html