Hacker News new | past | comments | ask | show | jobs | submit login
Graph Theory: Part III (Facebook) (20bits.com)
56 points by motters on Aug 22, 2011 | hide | past | favorite | 10 comments



Gremlin makes these type of graph algorithms easy (https://github.com/tinkerpop/gremlin/wiki):

  g = new Neo4jGraph('/tmp/neo4j')

  // calculate basic collaborative filtering for vertex 1
  m = [:]
  g.v(1).out('likes').in('likes').out('likes').groupCount(m)
  m.sort{a,b -> a.value <=> b.value}

  // calculate the primary eigenvector (eigenvector  centrality) of a graph
  m = [:]; c = 0;
  g.V.out.groupCount(m).loop(2){c++ < 1000}
  m.sort{a,b -> a.value <=> b.value}
And Bulbs allows you to run Gremlin scripts from Python (http://bulbflow.com/).


Measuring influence via this eigenvalue method seems to make sense. However, there is ambiguity. First, you have to decide which eigenvalue to pick. Do you choose based on its magnitude? Even when you choose one somehow, what if the dimension of the eigenspace isn’t 1? Unfortunately, there are no links to clarify this.


Under mild conditions of the undirected graph (basically that it is not bipartite and that it is connected) one can use the Perron-Frobenius theorem. So, to remove the ambiguity you are concerned with consider that you want an x s.t. all the entries in x are positive. Then PF guarantees that the largest eigenvalue of A has multiplicity one and its associated eigenspace is one-dimensional. PF also guarantees that the eigenvector associated with the largest eigenvalue has positive entries and is the only eigenvector with positive entries. So, you want the largest eigenvalue (which is positive) and the unique positive valued eigenvector associated with it. Finally, to obtain this eigenvector/value pair one can use the power method which this author recommends.


Thanks for the explanation! My only concern now is the validity of the meaning of the principal eigenvalue: "λ determines how much influence people share with each other through their connections. If λ is small then the CEO has a lot of influence, if it is large then he has little." It seems to depend on λ>1 or λ<1. Also, has this method been applied in practice, if you know?


I don't understand the interpretation of the principal eigenvalue either. Perhaps there is a more suitable interpretation in the directed case, but I'm not sure of that either.

I think this method is generally known as eigenvector centrality, that is to say, the entries in the vector x are generally known as eigenvector centralities. I think this method is quite popular, but I do not know who uses it or how often.


Interesting. The article is a little unclear about why not any lambda will work, which is what motivates the eigenvalue problem.

In case others are/were likewise confused: assume you've been given the influence of all but one of the vertices in the graph. You want to assign the last vertex an influence. What lambda do you pick? You might think you could pick any lambda and arbitrarily set the influence decay across edges.

On the other hand, we want a consistent measure of influence. After you assign that last vertex some influence, you can now recalculate the influence of any other node using your equation. If the recalculated value is different, you picked the wrong lambda.

The eigenvalue problem formalizes the problem of finding a consistent lambda before assigning an influence to any of the vertices.


Topological properties of an individual (such as network degree or betweeness centrality) are not necessarily predictive of influence. The author acknowledge this and suggests observation of peer response to messages sent by FB on behalf of the user. What is really necessary here to determine influence is a randomized trial. See for example the methods described here: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1564856 (full disclosure - I am a co-author of this work and of some upcoming work that describes techniques of identifying influential and susceptible users in social networks).


However, I don't think it is possible to built a proper graph from the API as you cannot obtain the number of friends that your friends have. Or... ?


Does anyone have a cache of part 1 and 2? Seem like good reading material.


His direct links from part 3 to parts 1 and 2 are broken. However if you click on "More Articles" (http://20bits.com/category/articles/) on his homepage the links there work fine.

http://20bits.com/articles/graph-theory-part-i-introduction/

http://20bits.com/articles/graph-theory-part-ii-linear-algeb...




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: