Map of Reddit (anvaka.github.io)
729 points by penneyd on May 12, 2022 | 119 comments



We built a map like this at reddit a long time ago. The methodology was pretty straightforward -- we looked at subreddits that had the same links submitted and upvoted. We used the map to power the "similar subreddits" feature. Unfortunately it suffered from a lot of spam and things like getting linked to very NSFW subreddits, and we didn't have the manpower to fix it or curate it, so the feature died.


Sad to hear this wasn't able to make it into GA.

But the link-relationship methodology is interesting (similar to something like PageRank through backlinks).

But it's not the methodology I'd have initially gravitated towards.

My first instinct would be to relate subreddits by subscription overlap. That should group them by shared user interests, and it may also have alleviated some of the spam issues.

Though it would have been interesting to see both approaches merged together.
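
For what it's worth, here is a minimal sketch of what "subscription overlap" could mean in practice: count, for each pair of subreddits, how many users subscribe to both. The user names, subscription lists, and the function name `shared_subscriber_counts` are made up for illustration; this is not how reddit's feature was built.

```python
from collections import Counter
from itertools import combinations


def shared_subscriber_counts(subscriptions):
    """Count, for every pair of subreddits, how many users subscribe to both."""
    pair_counts = Counter()
    for subs in subscriptions.values():
        # Sort so each unordered pair is always stored under the same key.
        for a, b in combinations(sorted(set(subs)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


# Made-up subscription lists, purely for illustration.
subscriptions = {
    "alice": ["Steam", "pcgaming", "vancouver"],
    "bob": ["Steam", "pcgaming"],
    "carol": ["Steam", "pcgaming", "Portal"],
}
counts = shared_subscriber_counts(subscriptions)
for pair, n in counts.most_common():
    print(pair, n)
```

Raw counts like these favor very large subreddits, so in practice they would probably need to be normalized by subreddit size before being used for ranking.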


That would not produce anything close to this. Is the goal to find similar subreddits, or to find other, non-similar subreddits the person may like? The map by OP is grouped by categories, which is quite a bit different from just interest. Even for a recommendation system, I don't think it quite works here because of how extremely wide reddit is. Something like Steam or Spotify can use it, but reddit has everything from porn to cities to games. Just because I love Portal and I'm from Vancouver doesn't mean someone else who likes Portal will care about Vancouver, or vice versa.


Yes, but the majority of /r/Steam users are not subscribed to /r/Vancouver (or whatever the Vancouver subreddit is). I'd wager a guess there is a much more significant overlap with related subreddits such as /r/pcgaming.


I think this is how Last.fm works (and it works quite well!)

The weirdness in disparate interests is smoothed out by having a large sample size.

I'm trying to find details of the algorithm. In the meantime, here's an interview with the inventor of AudioScrobbler, which merged with Last.fm to provide its recommendations system: https://www.wired.com/2012/11/richard-jones-scrobbling


> That would not produce anything close to this

Well, have you read the methodology used by this map?

> Each dot is a subreddit. Two dots within the same cluster are usually close to each other if multiple users frequently leave comments on both subreddits.

So it's not exactly about subscribers, but it's the same idea, which proves your refutation wrong.


We may have tried other methodologies as well, I honestly don't remember. I feel like subscription overlap was something we at least talked about, but maybe not.


GA?


My guess is general audience or general availability.


How about making an icon next to every single post that links to a summary page of every post of that URL, ever?

I've been a reddit user for 15 years.

I have a fairly good memory. So many reposts from even years ago.

Fine if nobody had seen such a post before, but an indicator of just how frequently a URL has been reposted would be useful, though antithetical to $model.


If you're on old reddit, you can click the "Other Discussions" button to see every time the URL has been submitted. You can also just go to reddit.com/duplicates/$linkID

Or you can go to reddit.com/$URL (<- They hate it when I tell people this because it's a feature that I wrote 15 years ago as a URL rewrite in the load balancer that they have to maintain as they change load balancers)

Fun fact: That feature exists because I made reddit's co-founder Steve write it for me in exchange for a place to sleep.
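
A tiny sketch of building the two lookup URLs described above. The `https://` scheme, the function name, and the example link ID are assumptions for illustration; only the `reddit.com/duplicates/$linkID` and `reddit.com/$URL` shapes come from the comment itself.

```python
def other_discussions_urls(link_id, submitted_url):
    """Build the two 'other discussions' lookup URLs for a submission."""
    return {
        # Pattern from the comment: reddit.com/duplicates/$linkID
        "duplicates": f"https://reddit.com/duplicates/{link_id}",
        # Pattern from the comment: reddit.com/$URL
        "by_url": f"https://reddit.com/{submitted_url}",
    }


# "abc123" is a hypothetical link ID, not a real submission.
urls = other_discussions_urls("abc123", "https://anvaka.github.io/map-of-reddit/")
print(urls["duplicates"])
print(urls["by_url"])
```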


Thanks for that feature, I love using it to see discussion of the same content in different communities. It really helps make reddit feel like the front page of the internet, with multiple communities commenting on what's going on every day. If there were one thing I wish were improved about it, it would be canonicalization, so that m.wikipedia.com and wikipedia.com articles are connected, or youtube.com and youtu.be links.
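
To make the canonicalization idea concrete, here is a rough sketch that maps mobile and short-link variants onto one form before comparing URLs. The rules (dropping `www` and `m` host labels, forcing `https`, rewriting `youtu.be/ID` to `youtube.com/watch?v=ID`) are heuristics chosen for illustration, not anything reddit is known to do, and `canonicalize` is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlsplit, urlunsplit


def canonicalize(url):
    """Map a few common URL variants onto a single comparable form."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    path, query = parts.path, parts.query

    # Heuristic: "www" and "m" labels usually point at the same content.
    host = ".".join(label for label in host.split(".") if label not in ("www", "m"))

    if host == "youtu.be":
        # Short link: the video ID is the path itself.
        video_id = path.lstrip("/")
        host, path, query = "youtube.com", "/watch", f"v={video_id}"
    elif host == "youtube.com" and path == "/watch":
        # Keep only the video ID; drop timestamps, playlists, etc.
        video_id = parse_qs(query).get("v", [""])[0]
        query = f"v={video_id}"

    # Assumption: treat http and https as the same resource.
    return urlunsplit(("https", host, path.rstrip("/") or "/", query, ""))


# "abc123" is a placeholder video ID.
print(canonicalize("https://youtu.be/abc123"))
print(canonicalize("https://www.youtube.com/watch?v=abc123&t=10"))
print(canonicalize("http://en.m.wikipedia.org/wiki/Reddit"))
```

A real implementation would need a much longer rule list (tracking parameters, AMP pages, other short-link services), which is presumably part of why this is hard to maintain.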


Super familiar, and thank you for that...

Sorry that I can't stand new reddit.

I think that there should be a visual indicator maybe from green - yellow - red background or something based on the frequency of reposts etc...

I consume reddit for new stuff...

After being a user for literally 15 years. I am ready to delete my account.


Shame that Musk didn’t opt to buy Reddit instead. The only tech company more mismanaged than Twitter over the last decade was Reddit. Not to mention their blatant bias and abuse of power.


> I processed 176,178,986 unique comments that redditors left in years 2020 - 2021 and computed Jaccard Similarity between subreddits.

> Each dot on the map is subreddit. Two dots within the same cluster are usually close to each other if multiple users frequently leave comments on both subreddits.

More detail from the repo: https://github.com/anvaka/map-of-reddit
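
For anyone unfamiliar with it, Jaccard similarity is the size of the intersection of two sets divided by the size of their union. A minimal sketch over sets of commenters follows; the user IDs and set contents are invented, and this is not the author's actual pipeline (see the repo for that).

```python
def jaccard(a, b):
    """Jaccard similarity: |A intersect B| / |A union B|, or 0.0 if both are empty."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Invented commenter sets, keyed by subreddit name.
commenters = {
    "pcgaming": {"u1", "u2", "u3", "u4"},
    "Steam": {"u2", "u3", "u4", "u5"},
    "vancouver": {"u4", "u6"},
}

steam_pcgaming = jaccard(commenters["Steam"], commenters["pcgaming"])
steam_vancouver = jaccard(commenters["Steam"], commenters["vancouver"])
print(steam_pcgaming, steam_vancouver)
```

Because the score is divided by the union, a single shared user between two otherwise unrelated subreddits contributes little, which is the "smoothing" effect discussed upthread.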


The large "Asia" region contains, besides Asian topics (the right half of the region):

- Language learning communities

- Latin America (except Brazil which is in the RPG region because of r/TibiaMMO)

- Italy, Spain and Portugal (the latter is located between China and Japan for mysterious low-dimensionality representation reasons)

Other European countries with funny locations:

- Germany in the soccer region (as Gary Lineker once famously said "Football is a simple game: 22 men chase a ball for 90 minutes and, in the end, the Germans always win")

- France in the Canada region (Quebec strong)

- The Netherlands in the EDM region


Hungary is with the strategy games.


And HongKong is under Wtf.. makes sense I guess.


>Portugal (the latter is located between China and Japan for mysterious low-dimensionality representation reasons)

could that have a real world reason given that Macau used to be a Portuguese colony as well as Brazil hosting the largest Japanese diaspora population in the world?


Unlikely, because virtually nobody in Macau speaks Portuguese anymore, and Reddit is not well known in Japan. Various Japan-related subreddits are very popular, but they're populated by English speakers living in Japan, travelling to Japan, etc.


Seems to be pure chance. r/brasil and others are in a very different place in the graph, and none of the subreddits related to r/portugal are connected to any of the Japanese/Chinese subreddits in the graph.


Cats contains the plant related reddits, and is tied to them via r/CatsAndPlants (fair enough). It visibly looks like plants are also connected via r/michaelbaygifs, but that is only because of really odd placement for r/UnexpectedThugLife.

It would be nice if the shown links were not restricted to a single "country", as I bet r/dogsandplants (which is part of Cats) is tied to a bunch of dog reddits, but those are not shown.


There’s actually a subreddit for this map, but I can’t find it in the map itself (too meta?): https://www.reddit.com/r/MofR/ Also surprised how big Overwatch is relative to other games.

Bonus - My roommate works at Amazon and works part time with Andrei in some capacity (don’t know the full details), but anyway he has mentioned multiple times how cool and out of his way helpful Andrei is.

I bring this up because when someone of any notoriety is nice I think it’s really cool. I’ve met some ‘big’ tech people who definitely weren’t!


The subreddit is linked if you click _improve this map_ at the bottom right.


As a Cambridge resident, I appreciate /r/Boston and /r/CambridgeMA being placed in the "Survival" category!


Baltimore is as well, which is understandable. I'm surprised it's so far from idiotsInCars.


I've had another one of their sites [1] open in a tab for several months now. Whenever I find a new to me subreddit I find interesting I look it up on this site to see what else is in the vicinity topic wise.

[1] https://anvaka.github.io/sayit


Sweet, I just found the https://old.reddit.com/r/bevy subreddit for the Rust based game engine.


I recognize this URL because I’ve been using this:

https://anvaka.github.io/sayit/

It’s a searchable and visual graph of subreddits, mindmap style.


That's super cool :)


Careful exploring 'Australia'


I thought 'Australia' would be bigger, honestly.


To elaborate, the australia-like island consists of very NSFW subreddits.

And yes, one should be careful with porn, consuming it in a bad way can lead to harmful coping mechanisms or problems with sex; seek help if you feel that you’re not in control.


There is no good way to “consume” porn. At best it’s pathetic, and it can be a whole lot worse, causing actual brain damage by rewiring reward circuits. It’s similar to alcohol in that respect: there is no safe dose.

I don’t say this to upset anyone, but anyone who is upset will benefit greatly from curbing the tendency toward denial and honestly asking “Am I an addict?”


All of this is of course backed by sound science right?


> At best it’s pathetic

This seems a personal opinion and unhelpful. Others might say avoiding porn is pathetic; I think this topic should be discussed without name calling.

I like the comparison to alcohol. I don’t drink but I wouldn’t shame those who do (unless they have a compulsion or they drink to avoid problems).


> This seems a personal opinion and unhelpful. Others might say avoiding porn is pathetic

No. Being a pathetic wanker who beats it to videos of things he lacks the agency to experience personally is in fact objectively shameful.


Oh, so it's one of those types of personal opinion.


Why is rule34 on the main island instead of in "Australia"?


Brazil is apparently a suburb of RPG.


New Zealand is great too.


Original Show HN from author a year ago:

https://news.ycombinator.com/item?id=26624879


I'm happy and amused that KerbalSpaceProgram, NASA, SpaceX and Astronomy are placed under "Travel" and "Finance".

Maybe that's where they should be.


You can learn about managing finances in KSP Career mode I guess...


Coincidentally by building fully automated tourist ships...


Why do rule 34 and most of the furry porn not fall onto porn island?


I guess the furries don't feel the need to create alt accounts for their porn


yeah, makes sense since so much anime content (or at least, a significantly larger proportion compared to western animation) already includes pretty explicit content. The line between something like Kill La Kill and Kill La Kill R34 is already razor thin.

part of it is also community. From a quick glance I recognize that r/hentai is in anime island instead of R34. But the moderation of r/hentai has (or used to have) many moderators from other anime subs, and some regular posters will alternate between r/hentai, r/manga, or r/animemes. That can play a part in which communities decide to form and comingle.


I feel like moderators should receive a weighted vote when identifying connections between subreddits. Two subs with identical moderator lists must be pretty similar right?


That doesn’t make sense when you realize mods uphold arbitrary politics of a sub, often even more than they are just janitors.


Same reason why there's a "gay island" I guess?


Linux under programming next to science and graphic design.

Microsoft and Windows 10 in netsec(!?) along with Facebook.

Google and Apple occupying a sea of banal self referential subs and consumption. Amazon is there, too just super small.

Very interesting.


Considering how terrible Windows' netsec has been historically, that's entirely unsurprising. When you want to scream for help into the void, you might hit up both those places at once.


I guess that applies to Linux and graphic design as well!


It's surprising how often I was already visiting a clique of subreddits. Even more surprising is how much I was missing out on similar subreddits. This visualization is really a discovery tool I needed.


Jojo* appears to be a country of refugees from the Anime region, who've settled in the general gaming region.

*Hopefully before anyone searches it on the map -- it appears that r/jojo was taken by some singer/actress. I was referring to the anime "Jojo's bizarre adventure," which has a capital of "r/shitpostcrusaders."


I wonder how the broader categories were selected. I thought it could be the largest/most connected node on each cluster but that doesn't seem to be the case.


I feel like these dots are so small, you have to zoom in and out constantly to go over the map... would it be possible to now take those dots and turn them into blocks that together make up 100% of the category they currently belong to? Like counties within a state


That's a nice idea. I do wonder if it could create some border problems, where there is a need to link to some non-neighbor states, but not through other states that would otherwise be linked/shared neighbors.

(I guess unless you wanted to render little overflights by airplanes or something)


Back in 2016 I did a similar approach to calculate related users: one using graph similarity (https://minimaxir.com/2016/05/reddit-graph/ , albeit the embeds broke) and another using jaccard similarity (like this viz) with a different approach to visualization which IMO turned out easier to interpret than a graph-based approach (https://minimaxir.com/2016/06/reddit-related-subreddits/ )


There's lots that's interesting to me about this but one is how links provided by subreddits themselves might or might not reflect actual related topics. There's a couple of subs I'm familiar with that have relatively large related subreddits I wasn't aware of before, that aren't linked to or mentioned. Maybe in some cases there's political histories I'm unaware of?

The new reddit design is so problematic in so many ways. So many of the "related subreddits" sidebars (all?) are just eliminated in the redesigned site.


Great map. I suggest to paint active subreddits as bright circles and stale ones as dim circles, so the map would look like Europe viewed from satellite at night.


My methodology for discovery/linking subreddits when I did a big crawl was to look at the subreddit description and find links to other subreddits. I think that was less prone to the issues that other commenters are reporting.
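
A minimal sketch of that kind of extraction, assuming subreddit mentions look like `r/name` or `/r/name` in the description text. The regular expression, the function name, and the sample description are illustrative assumptions, not the crawler that was actually used.

```python
import re

# Match "r/name" or "/r/name" when not glued to a preceding word character.
SUBREDDIT_LINK = re.compile(r"(?<![A-Za-z0-9_])/?r/([A-Za-z0-9_]+)")


def linked_subreddits(description, own_name=""):
    """Return subreddit names mentioned in a description, in order, without repeats."""
    found = []
    seen = {own_name.lower()}
    for name in SUBREDDIT_LINK.findall(description):
        if name.lower() not in seen:
            seen.add(name.lower())
            found.append(name)
    return found


# Invented sidebar text for illustration.
description = (
    "Related: /r/pcgaming, r/Steam and https://www.reddit.com/r/linux_gaming. "
    "Rules also apply in /r/pcgaming."
)
links = linked_subreddits(description, own_name="gaming")
print(links)
```

Each returned name can then become a directed edge from the crawled subreddit to the one it mentions.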


Thanks to /r/quebec, /r/france finds itself in... Canada ?


Halifax is in Nova Scotia, which is in Canada, which is on Earth, which is in Canada.

https://youtu.be/oz88kJSdT6Y


I love this and specifically street view, thank you for making it :)


Oh, nice, I didn't even realise street view existed!

For anybody else who didn't realise - click on a circle to pop up a little preview of that subreddit, then click 'street view'. You can navigate in street view by clicking to bring up a crosshair, then using WSAD.


This is the metaverse, I mean, a metaverse, and maybe a VR 3D equivalent of this is the way to go? Curated and weighted as the user wants, not how the advertisers want...

I feel like in a social environment, not finding everything you want immediately is a feature, not a bug, like getting off in a part of town you don't know and talking to a person you didn't expect. In hindsight, "11 year old explore the city me" never left - still fun 30+ years later.



Can this map be better understood as groupings of users? It seems that a lot of the categories that are polyphyletic so to speak - programming grouped with science, Italy and Spain between China and Japan - make sense if you just think of them as overlapping groups of users.

More specifically, the groupings would be user accounts (this distinction is important for understanding Porn Island).


From the info section - "Two dots within the same cluster are usually close to each other if multiple users frequently leave comments on both subreddits."


Just awesome. Recently I was trying to figure out how I could explore new communities on reddit in a more "natural" way. Well done.


"Large Subreddits" seems kind of lazy.


I don't think so. The big subs have their own culture that's not as shared with the smaller subs. It's part of the reason why (and corollary to) subs get worse when they get larger


/r/politics makes a lot of sense.


how are the edges determined? i saw relative proximity is user interaction overlap, but there are discrete edges too if you click on a subreddit.

i'm curious because i would like to go another step and see how many hops it takes to get from A->B
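
If the discrete edges were available as an adjacency list, hop counts could be computed with a breadth-first search. The graph below is invented for illustration and does not reflect the map's real edges; `hops` is a hypothetical helper.

```python
from collections import deque


def hops(graph, start, goal):
    """Return the fewest edges between start and goal, or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == goal:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


# Invented adjacency list; the real edges would come from the map's data.
graph = {
    "KerbalSpaceProgram": ["spacex", "nasa"],
    "spacex": ["KerbalSpaceProgram", "nasa"],
    "nasa": ["KerbalSpaceProgram", "spacex", "Astronomy"],
    "Astronomy": ["nasa"],
    "vancouver": [],
}
print(hops(graph, "KerbalSpaceProgram", "Astronomy"))
print(hops(graph, "KerbalSpaceProgram", "vancouver"))
```

Since the map reportedly cuts edges that cross category boundaries, hop counts computed from the displayed edges alone would only work within a single "country".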


I had no idea that reddit consists to such a large extent of porn and related content.


Before NSFW content was removed from /r/all a few years back, it was much more obvious. Once you scrolled 8-10 pages down your main feed, it was almost exclusively porn.

Happy for the change, it made the website more usable than it already was.


if you ever look at the /u/AutoModerator comments it's still very clear


A few years back I saw something similar where subreddits were presented like a tree, based on topics (and popularity, I think). Made it easy to drill down to interesting niche subreddits.

Haven't been able to find it since.


Who knew that science and teaching were a subset of programming?

Feels like something fundamental is wrong with the methodology. It appears they gave up on the large subreddits, perhaps because they were linked to everything.


I don't know why but my brain interpreted this as "Map of GitHub" and I was rather surprised that there was a GitHub repository for just GIFs and I hadn't known about it.


Incredibly interesting chart, or whatever the presentation is called.


I would go with “map”


charted territories


Territorial chart?


Low-dimensional embedding.


I learned you can click on the nodes to see relationships FWIW


Browsing this as a map was entertaining, and I don’t even use Reddit. Maybe this is a good application of the “metaverse”. Spatial association between communities.


Just as a personal opinion, it feels to me like having a 'large subreddits' category is kind of a cheat.


Black screen on Firefox Android.


Also Firefox (97.2.0 (Build #2015863827)) on Android, for me it's working.


Works for me on Firefox Nightly, Android and desktop versions.


Same on Firefox Windows.


Curious how the data was collected (and when), some subreddits shown are now banned.


reddit actively shadowbans comments and all kinds of user created content.

It was really surprising to me that they'd be shadowbanning comments I would make to reply to other people in already buried (controversial) threads.


How are the larger sections named? Hacking isn't in the hacking section.


The category names are often a little off. "Running" is more like endurance/cardio athletics, for example.

The methodology is cutting off edges that cross between categories too, which in some cases I think is giving a distorted idea of how the subreddits are organized.


Should be a laaaarge overlapping country reading "gonewild"... :)


How to zoom out?.. (I've finally found that double click will zoom in)


Doesn't include quarantined subreddits, many of which are very active


The bulk of the groups are labeled, but the sections of "NSFW Island", I'll call it, aren't. Is there an explanation of the grouping? Some of the groups make sense (celebs, indian-themed) but I can't make heads or tails of the others.


The country names for "Australia" are (in no particular order, with lists of some of the larger circles, especially ones that suggest other clusters bundled into the country)

- WorldPacks (WorldPacks, NSFWPublic, DiscordNudes as some of the larger ones)

- Ebony (Ebony, GoneWildColor, OnlyFans101, TeenBeauties)

- Sommer Ray (SommerRay, MegNutt, CardiB)

- gonewild (gonewild, realgirls, OnOff, AsiansGoneWild)

- NSFW (NSFW, Celebs, Porn, Boobies)

- Blowjob (Blowjob, painal, creampies, PornVids)

- NSFW Glamour (abelladanger, Miakhalifa, RileyReid, PornStarHQ)

- Perky (Perky, NSFWBox, CamOrgasm, ForgotToPullOut)

- Glamour (KatyPerry, EmmaWatson, ArianaGrande)

- Onmww (Onmww, obsf, gilf, ErinAshford)

- grool (Grool, Anal, squirting, simps)

- milf (milf, Cuckold, Amateurs, GWCouples)

- GoneWildPlus (GoneWildPlus, BBWGW, PerkyChubby)

- feet (feet, sexsells, bdsm, gonewildaudio)

- NSFW India (IndianGoneWild, RepressedGoneWild, ArabPorn)

- texas nsfw (TexansGoneWild, TexasSwingers, TexasCuckoldCommunity)

The names are mostly just the names of one of the larger or more centrally connected subreddits in the country, and what some of them include seems pretty arbitrary.

And you can probably find "New Zealand"'s "Gay" and "Traps" countries pretty easily, and have no trouble telling them apart.


Could we say that this is a proxy for the internet as a whole?


Surprised to not see /r/superstonk in the map


Uhm, so all of the German subs are now in soccer?!


Reddit has become a cesspool of anti free speech.


impressed by how well the layout is done, and by the smoothness of zooming and scrolling


Starcraft 2 in RPG


Is there a version of this that doesn't require javascript?


Where is porn? I've certainly seen a lot of it on Reddit.


Go south, young man


No categories for science??


Science has a large cluster in North Programming, and a cluster in Eastern Finance. The search is pretty good.


It appears above Programming when you start zooming in. Unintuitive order to put things, but it's there.


That whole big Australia-like continent at the bottom is "for science"...


it is for senses*


Hmm, so actually the internet is not just for porn


Apparently hackernews is in netsec, huh.



