Scant Evidence of Power Laws Found in Real-World Networks (quantamagazine.org)
207 points by yarapavan on Feb 16, 2018 | hide | past | favorite | 47 comments



In grad school scale-free networks were the soup du jour and my advisor was hammering on me to show that the human metabolic networks I studied had that property. "Guaranteed paper in Nature, get on it!", he exhorted. I sensed folks were not using the right statistical tests to show that distributions were scale-free (hint: making a log-log plot and running a linear regression to get a slope does not give you the exponent of the power law if there is no power law in the first place) and found a few notes on homepages saying this was the case, but no publications (this was 2006 or so).

There seems to be an attitude in the physical sciences that math is there as sort of a glaze or condiment you can throw on top of bad data to make it palatable. I didn't make many friends in the biology department by telling them that the pet model they'd based their careers on was not bolstered by $trendy_math_analysis but actually weakened by it. People seemed less interested in truth and more interested in appearance and publication prestige.


Speaking as a statistically-impaired person, how do you determine if you have a power law?


It's difficult to tell the difference between log-normal and a power-law. Best is to use theory to argue the distinction, explaining the generative process, rather than relying on some lame empirical evidence. Your priors will overwhelm almost everything.


MLE, then Vuong's test against alternatives. The linked paper is just Vuong's test against alternatives.
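
A minimal sketch of that recipe, under several assumptions that are not in the comment: the data are continuous, the lower cutoff x_min is known rather than estimated, and the only alternative model is a shifted exponential. The linked paper works with discrete degree sequences and more alternatives, so treat this as an illustration only. All function names here are hypothetical.

```python
import math
import random


def sample_power_law(n, alpha, x_min, rng):
    """Draw n continuous power-law samples via inverse-transform sampling."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def fit_power_law(xs, x_min):
    """Maximum-likelihood exponent for p(x) ~ x^-alpha on x >= x_min."""
    tail = [x for x in xs if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    return 1.0 + len(tail) / log_sum


def loglik_power_law(x, alpha, x_min):
    """Log-density of the continuous power law at x."""
    return math.log((alpha - 1.0) / x_min) - alpha * math.log(x / x_min)


def loglik_exponential(x, lam, x_min):
    """Log-density of an exponential shifted to start at x_min (assumed alternative)."""
    return math.log(lam) - lam * (x - x_min)


def vuong_statistic(xs, x_min):
    """Normalized log-likelihood ratio; positive values favor the power law."""
    tail = [x for x in xs if x >= x_min]
    n = len(tail)
    alpha = fit_power_law(tail, x_min)
    lam = 1.0 / (sum(tail) / n - x_min)  # MLE of the shifted exponential rate
    diffs = [
        loglik_power_law(x, alpha, x_min) - loglik_exponential(x, lam, x_min)
        for x in tail
    ]
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / n
    return math.sqrt(n) * mean / math.sqrt(var)


rng = random.Random(42)
xs = sample_power_law(5000, 2.5, 1.0, rng)
print("estimated exponent:", fit_power_law(xs, 1.0))
print("Vuong-style statistic:", vuong_statistic(xs, 1.0))
```

The statistic is compared against a standard normal: a large positive value favors the power law, a large negative value favors the alternative, and a value near zero means the data cannot tell the two apart.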


Might want to try that again...


Trends and hype happen; it's human nature. Biology is and will continue to be an exciting, under-explored frontier for years to come. Math will only help. Humanity needs people like you who recognize the underlying issues and can move things forward.


Always a good time to re-read Cosma Shalizi's classic "So You Think You Have a Power Law — Well Isn't That Special?" (http://bactra.org/weblog/491.html)


See also Mar's Law as taken from Akin's Laws of Spacecraft Design.[1]

    Everything is linear if plotted log-log with a fat magic marker.
[1] http://spacecraft.ssl.umd.edu/old_site/academics/akins_laws....


> that I have long had a thing about just how unsound many of the claims for the presence of power law distributions in real data are, especially those made by theoretical physicists, who, with some honorable exceptions, learn nothing about data analysis. (I certainly didn't.)

Statistics really needs to be a fundamental skill taught in all sciences (or any field involved in research/studies) as 101 courses as math and writing are.

It's amazing how much time/energy is wasted in research, across a wide swath of fields, because the ultimate analysis lacks a strong fundamental grasp of statistics and data analysis.

I want to pull my hair out every time I read a Wikipedia article documenting the historical back/forth of a particular social science... where the status quo of knowledge is repeatedly discredited based on a biased manipulation or simply a poor grasp of analyzing complex multi-faceted data.


I completely agree. You can be a genius in your field, but mess things up when it comes to basic statistics. It can be surprisingly easy to make major errors. Analyzing your own data is a fundamental skill that not only affects how you present your own research, but how you learn from and perform your research.


I love when facts contradict socially propagated truth, and we actually change our minds as a consequence. That update mechanism is one of the biggest sources of faith in humanity I can think of.


I find it rather more disturbing - that people rationalize explanations for various discovered "facts", and as our understanding changes and the "facts" change, just as easily come up with new rationalizations.

It exposes rationalization as story-telling.


Sure, but these two points of data make a pretty nice line: [My assumption is that] people (at large) did not update on socially propagated facts. Now they do. Maybe tomorrow it'll be further improvement.

(Or, maybe it's not that people are improving, but that the social transmission now exists and people haven't changed...?)


However, this can lead to 'malware' updates being installed through seemingly trustworthy channels.


> That update mechanism

is like gradient descent, for human culture.


More like memetic evolution. It has no attachment to local maxima.


Any time I hear a very socially useful truth proposed, I get very wary. History is littered with ideas that were proclaimed to be true mostly because it would be very useful if they were true. It's certainly no guarantee of illegitimacy, but if an idea comes along that basically says "hey, you know how your society likes to do X? Turns out that's the best!" I immediately enter high-skepticism mode. Everything from leaded gasoline to eugenics to the Industrial Revolution's views on the dangers of masturbation can be laid at the feet of people who so desperately wanted something to be true that they skimped on rigor and ended up leaving unimaginable suffering in their wake. When the doctor (I can never recall his name!) suggested other doctors wash their hands between performing autopsies and delivering babies, he was insulted and rejected, as the idea that doctors who wanted desperately to help their patients were responsible for the astronomically high rate of death of both mothers and infants was offensive. This is the opposite side of that coin.


That's effectively Appeal to Consequences fallacy, though in both its usual (false rejection of an inconvenient truth) and inverse (false support of an appealing falsehood) forms.

That's a useful counter-bias to have, though not an infallible one.


Ignaz Semmelweis


This would be tremendously comforting. After reading the book 'Linked', I have been worried. It presents a possibility that I hope no one is immoral enough to pursue. If its claims about how social networks cluster, and about how both important and fragile the most critical edges in those networks are in terms of enabling the society-wide spread of information/viewpoints/etc, are all true... it opens a very dangerous door.

In the book, they claim that something like 'Six Degrees of Kevin Bacon' works not because of those heavily-connected nodes, but primarily because of nodes which bridge mostly-disconnected clusters. So a member of a biker gang who plays bridge with his elderly aunt's knitting circle on Sundays would be an example of that. Ideas can flow from the group of bikers to the group of elderly knitters pretty much only through him. And almost by necessity, those links are weak. Once they are broken, it immediately becomes extremely difficult for information to travel from one of those groups to the other.

So conjecture that there were an organization that had the ability to observe the social network of people's communications. Conjecture that they also felt that they had a mandate to protect the status quo, at least on the largest scale, and to do what was within their power to prevent things like widespread social unrest, formation of disruptive political movements, etc. If they had the ability to interfere with those communication networks even in a very mild way, they could effect the most successful and quiet oppression in history. By bouncing a few emails, dropping a few packets here and there, communication across these weak links would break down pretty easily. And once mostly-connected clusters talk primarily amongst themselves, it becomes fundamentally impossible for widespread social change to occur.

Now this is a very 'blind' approach, of course. You don't get to pick which ideas get isolated and which are permitted to spread. But, you do gain a guarantee that even if an idea is very powerful, it can never spread far enough fast enough to gain widespread acceptance. Sure there might be "large groups" that get very loud about it... but they wouldn't have an 'inside man' to introduce the idea in ways acceptable to the mostly-disconnected group, so it simply wouldn't spread. I've been wondering for several years now if this kind of manipulation would leave telltale fingerprints.


This is not new, at least for biological networks. For example, see this paper by Lima-Mendez and van Helden: The powerful law of the power law and other myths in network biology: http://pubs.rsc.org/en/Content/ArticleLanding/2009/MB/b90868... and several blog posts by Lior Pachter: https://liorpachter.wordpress.com/2014/02/10/the-network-non...


For someone unfamiliar with power laws and scale in networks, that introduction has me totally bamboozled with multiple repeating negatives.

A scale-free network is one that follows power laws, which manifests as having specific hubs that are much more interconnected than others.

A random network does not follow power laws. Hubs and edges are distributed relatively evenly.

Is that right?


The article is a bit vague on these so I'll take a stab:

Random networks are generated randomly, i.e. drawn from some distribution. An example would be the Erdős–Rényi (ER) network, where each edge between a pair of nodes is drawn independently from a Bernoulli distribution. The simplest way to characterize a random network is by its degree distribution. ER networks hence have degree distributions that follow the binomial distribution (Poisson as N -> infinity with Np -> const). Since the edge probabilities are rather simple and drawn independently, one can take the ER network as a sort of "null model" of random networks.
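
As a rough illustration of the ER construction described above, here is a small sketch using only the standard library. The function name and the parameter values (n = 400, p = 0.05) are made up for the example.

```python
import random


def erdos_renyi_degrees(n, p, rng):
    """Return the degree of each node in a G(n, p) random graph."""
    degrees = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:  # independent Bernoulli trial per node pair
                degrees[i] += 1
                degrees[j] += 1
    return degrees


rng = random.Random(0)
n, p = 400, 0.05
degrees = erdos_renyi_degrees(n, p, rng)
mean_degree = sum(degrees) / n

# Binomial(n - 1, p) predicts a mean degree of (n - 1) * p.
print("observed mean degree:", mean_degree)
print("predicted mean degree:", (n - 1) * p)
print("largest degree:", max(degrees))
```

Degrees in such a graph stay close to the mean, so there are no hubs, which is the contrast with the scale-free case.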

Scale-free networks, on the other hand, do not follow this degree distribution. Rather, they follow a power-law distribution p(k) ~ k^-a. A process to explain this is that the network is generated by preferential attachment: new nodes are more likely to connect to nodes which are highly connected in the first place. Most of the discussion and the following arguments on this are in the article.
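
A hedged sketch of preferential attachment, in the style of the Barabási–Albert growth model: each new node links to m existing nodes chosen with probability proportional to their current degree. The starting core, the function name, and the values n = 2000, m = 2 are assumptions made for the example.

```python
import random


def preferential_attachment_degrees(n, m, rng):
    """Grow a graph to n nodes and return the final degree of each node."""
    # Start from a small fully connected core of m + 1 nodes.
    degrees = [m] * (m + 1)
    # Each node appears in this list once per edge endpoint, so a uniform
    # draw from it is a degree-proportional draw over nodes.
    endpoints = [i for i in range(m + 1) for _ in range(m)]
    for new_node in range(m + 1, n):
        targets = set()
        while len(targets) < m:  # pick m distinct existing nodes
            targets.add(rng.choice(endpoints))
        degrees.append(m)
        for t in targets:
            degrees[t] += 1
            endpoints.extend((t, new_node))
    return degrees


pa_degrees = preferential_attachment_degrees(2000, 2, random.Random(0))
print("mean degree:", sum(pa_degrees) / len(pa_degrees))
print("largest degree:", max(pa_degrees))
```

Early nodes keep collecting links, so a few hubs end up with degrees far above the mean.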


Scale-free: distribution of connections follows power law across hubs. Other type: it doesn't


But "it doesn't" can include things that are distributed with either fatter or thinner tails?

And if it is normally distributed below some range, and then power law above that, it also wouldn't be scale free under this paper?


For the first question, yes. "Not power law" means anything else.

But I am quite confused about what you mean by power laws "above" or "below" some range.


It seems that the reason power laws are important is the frequency of arbitrarily large values, and the implications those have on data processing.

Under a definition that strict it doesn't sound useful: imagine the most common degree / # of connections is 10 and 5 is less popular: isn't that already not a power law? Regardless of how the distribution behaves for degree 100, 1000, etc.

It seems like under many models you have some unimportant distribution of low-degree connections and then close to a power law distribution if you consider above some threshold: in terms of algorithm selection and design under those conditions the dominating factor seems like it can typically still be just the frequency of large degrees which can still be close to a power law and rejected as one by this paper, right?


Yeah, I sort of agree that practically power laws are pretty useful, even if they are approximations. I think what the paper is trying to say, though, is that the same people (including myself) who use power laws to generate these models often say that these processes are actually power-law processes, when in fact they may not be. Which is surprising to me (Zipf's law and all of that).


I think Metcalfe's law is a bit overstated. Logic seems to suggest it starts out close to exponential but eventually becomes more like log n. Because each participant only has X amount of friends, that they derive benefit from.

For network of one-to-many broadcasts it may be more like n * k where k is the number of active broadcasters.


See Odlyzko-Tilly, "A Refutation of Metcalfe's Law".

https://www.dtc.umn.edu/~odlyzko/doc/metcalfe.pdf
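
For a sense of scale, here is a tiny sketch comparing the n^2 growth of Metcalfe's law with the n log(n) growth argued for in the Odlyzko-Tilly paper. Constant factors are dropped and the function names are hypothetical; only the growth rates are being compared.

```python
import math


def metcalfe_value(n):
    """Metcalfe's law: network value grows like n^2."""
    return n * n


def odlyzko_tilly_value(n):
    """Odlyzko-Tilly estimate: network value grows like n * log(n)."""
    return n * math.log(n)


for n in (10, 1_000, 100_000):
    ratio = metcalfe_value(n) / odlyzko_tilly_value(n)
    print(f"n={n}: n^2 exceeds n*log(n) by a factor of {ratio:.1f}")
```

The gap between the two estimates is n / log(n), so it keeps widening as the network grows.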


A classic from Lior Pachter (not mentioned in the article) from 2014: https://liorpachter.wordpress.com/2014/02/10/the-network-non...


What are the implications here for internet service providers and the big social networking companies?


For social networking companies, I believe that it means targeting influencers to take advantage of their networks' coverage is much less powerful than first thought. An advertiser won't expect to be able to steer dollars towards just a few of the truly deeply connected influencers and gain outsized rewards; instead, they'll have to target lots of influencers, reducing the ROI of the project.


Even more than that, the structure of these networks implies that targeting influencers may be ineffective. There's an interview [0] with Duncan Watts discussing this.

[0] Is the Tipping Point Toast? https://www.fastcompany.com/641124/tipping-point-toast


It’s an absolute game changer!!!


Now, why do so many systems seem to follow log-normal??


A central claim in modern network science is that real-world networks are typically "scale free," meaning that the fraction of nodes with degree k follows a power law, decaying like k^−α, often with 2 < α < 3. However, empirical evidence for this belief derives from a relatively small number of real-world networks. We test the universality of scale-free structure by applying state-of-the-art statistical tools to a large corpus of nearly 1000 network data sets drawn from social, biological, technological, and informational sources. We fit the power-law model to each degree distribution, test its statistical plausibility, and compare it via a likelihood ratio test to alternative, non-scale-free models, e.g., the log-normal. Across domains, we find that scale-free networks are rare, with only 4% exhibiting the strongest-possible evidence of scale-free structure and 52% exhibiting the weakest-possible evidence. Furthermore, evidence of scale-free structure is not uniformly distributed across sources: social networks are at best weakly scale free, while a handful of technological and biological networks can be called strongly scale free. These results undermine the universality of scale-free networks and reveal that real-world networks exhibit a rich structural diversity that will likely require new ideas and mechanisms to explain.

Paper referred: https://arxiv.org/abs/1801.03400


It reminds me a bit of how Romans used to build their ships by exactly scaling up everything.

However, it is now known that, for larger ships, the proportions should be different.


I've always been a fan of Haldane's treatment of the matter: http://irl.cs.ucla.edu/papers/right-size.html


Same goes for RC airplanes, but in reverse of course. It has to do with turbulence and Reynolds numbers.



Many thanks. I wish there were a Chrome extension or something to find the actual underlying paper from popsci articles.


It's literally the first link in the first words of the first sentence of the article. And in this case you'd miss the commentary by experts in the field who were not on the paper, including Mason Porter, Steve Strogatz, and others. Klarreich is one of the few respectable science writers and adds concrete value in her writing.


Sure; I'm not calling this article out in particular, just bemoaning the general state of science in journalism and that they generally bury the original paper so far down the page.


Thanks, I've wanted this too, and have a new fun project.


Quick thought from the off-field: Could those nets be modeled as scale-free meta nets of scale-free sub nets? Seeing the forest for the trees kindalike.


“Power laws” are the golden ratio of network science.



