Hacker News
Show HN: Little Ball of Fur – A NetworkX extension library for graph sampling (github.com/benedekrozemberczki)
16 points by benitorosenberg on May 12, 2020 | 4 comments



Can you give some examples of when this would be useful? What's distinctive about sampling methods beyond, say, picking a random node and all of its neighbors? What problem does that solve?


The reason for not doing that is the bias that such sampling introduces.
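
One way to read the bias point, as a minimal sketch rather than anything taken from the library: if you pick a seed node uniformly at random and keep it plus all of its neighbors, a node ends up in the sample whenever the seed is the node itself or one of its neighbors. On a simple undirected graph with n nodes that happens with probability (degree + 1) / n, so hubs are heavily over-represented. The function name and the toy star graph below are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def ego_inclusion_probability(adjacency):
    """Exact chance that each node lands in a 'random node plus its neighbors' sample.

    Assumes a simple undirected graph (no self-loops) given as {node: set_of_neighbors},
    and a seed node chosen uniformly at random.
    """
    n = len(adjacency)
    # A node is included when the seed is the node itself or any one of its neighbors.
    return {
        node: Fraction(len(neighbors) + 1, n)
        for node, neighbors in adjacency.items()
    }


# Hypothetical toy graph: one hub (node 0) joined to 9 leaves.
star = {0: set(range(1, 10))}
for leaf in range(1, 10):
    star[leaf] = {0}

probabilities = ego_inclusion_probability(star)
print(probabilities[0])  # hub:  1
print(probabilities[1])  # leaf: 1/5
```

The hub appears in every such sample while each leaf appears in only one sample out of five, so statistics computed on the sample (degree distribution, for instance) can look quite different from those of the full graph.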

We are writing a paper on this, but the main point is that sampling lets you achieve these two things with minimal degradation in classification performance:

1. Speeding up node embedding and classification.
2. Speeding up whole graph embedding and classification.
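
To make the speed-up idea concrete, here is a rough sketch of one common sampling strategy, a random walk that collects nodes until a budget is reached, after which the downstream embedding would run on the smaller induced subgraph. This is a plain-Python illustration under stated assumptions (connected undirected graph, adjacency stored as a dict of sets); the function name, the seed, and the toy ring graph are hypothetical, and it is not the library's implementation.

```python
import random


def random_walk_sample(adjacency, number_of_nodes, seed=42):
    """Return the subgraph induced by the first `number_of_nodes` distinct nodes
    visited by a simple random walk.

    Assumes a connected undirected graph given as {node: set_of_neighbors};
    on a disconnected graph the walk could fail to reach the budget.
    """
    if number_of_nodes > len(adjacency):
        raise ValueError("Cannot sample more nodes than the graph contains.")
    rng = random.Random(seed)
    current = rng.choice(sorted(adjacency))
    visited = {current}
    while len(visited) < number_of_nodes:
        # Step to a uniformly chosen neighbor of the current node.
        current = rng.choice(sorted(adjacency[current]))
        visited.add(current)
    # Keep only the edges whose endpoints were both visited.
    return {node: adjacency[node] & visited for node in visited}


# Hypothetical toy graph: a ring of 20 nodes.
ring = {i: {(i - 1) % 20, (i + 1) % 20} for i in range(20)}

sample = random_walk_sample(ring, number_of_nodes=8)
print(len(sample), "of", len(ring), "nodes kept")
```

Any embedding or classification step that scales with the number of nodes or edges then has less work to do, which is where the speed-up would come from.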


Can you speak a little more about how those work? I understand word embeddings conceptually. And I can imagine using a similar process to embed the arbitrary data stored in a graph. Embedding an entire graph makes less sense to me, unless 'entire graph' means a subgraph of the general population.

I do social network stuff occasionally. If I hypothetically could create an embedding representation of everyone, I could imagine it might be useful to, say, t-SNE it all as opposed to using a force layout for viz. Or maybe use it as a fairly black-box prediction input? I'm wondering if I'm missing something more obvious here.


Whole graph embedding means that you have a lot of smaller graphs (e.g. molecules, transactions, threads) and you want to classify them. We created this package, which covers these methods:

https://github.com/benedekrozemberczki/karateclub
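
As a toy illustration of what "one vector per graph" can look like, the sketch below maps each small graph to a normalized degree histogram. This is a deliberately simple stand-in and not one of the methods in the package; the function name, the `max_degree` cutoff, and the two example graphs are assumptions made for the example.

```python
def degree_histogram_embedding(adjacency, max_degree=4):
    """Map a whole graph to a fixed-length vector: the fraction of nodes
    having degree 0, 1, ..., max_degree (larger degrees share the last bin).

    Assumes a non-empty undirected graph given as {node: set_of_neighbors}.
    """
    counts = [0] * (max_degree + 1)
    for neighbors in adjacency.values():
        counts[min(len(neighbors), max_degree)] += 1
    total = len(adjacency)
    return [count / total for count in counts]


# Two hypothetical four-node graphs with different shapes.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}

print(degree_histogram_embedding(path))  # [0.0, 0.5, 0.5, 0.0, 0.0]
print(degree_histogram_embedding(star))  # [0.0, 0.75, 0.0, 0.25, 0.0]
```

Each graph, whatever its size, becomes a vector of the same length, so a collection of molecules or threads can be fed to any ordinary classifier.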



