We built a map like this at reddit a long time ago. The methodology was pretty straightforward -- we looked at subreddits that had the same links submitted and upvoted. We used the map to power the "similar subreddits" feature. Unfortunately it suffered from a lot of spam and things like getting linked to very NSFW subreddits, and we didn't have the manpower to fix it or curate it, so the feature died.
But the link-relationship methodology is interesting (similar to something like PageRank through backlinks).
But it's not the methodology I'd have initially gravitated towards.
My first instinct would be relations based on subscription overlap. This seems like it should group subreddits by their users' shared interests. It might also have alleviated some of the spam issues.
Though it would have been interesting to see both approaches merged together.
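For what it's worth, here is a minimal sketch of what merging the two signals could look like. Everything in it is an assumption for illustration (the function names, the `alpha` weight, the toy data); it is not how reddit's feature worked.

```python
def jaccard(a, b):
    """Size of the intersection over size of the union; 0.0 if both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def blended_similarity(subs_a, subs_b, links_a, links_b, alpha=0.5):
    """Blend subscriber overlap with shared-link overlap.

    alpha is a hypothetical tuning knob: 1.0 uses only subscribers,
    0.0 uses only submitted links.
    """
    return alpha * jaccard(subs_a, subs_b) + (1 - alpha) * jaccard(links_a, links_b)


# Toy data, made up for the example.
score = blended_similarity(
    subs_a={"u1", "u2", "u3"}, subs_b={"u2", "u3", "u4"},  # subscriber overlap 2/4
    links_a={"x"}, links_b={"y"},                          # no shared links
)
```

A low `alpha` would lean on the link signal (and inherit its spam problem), while a high one leans on subscriptions.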
That would not produce anything close to this. Is the goal to find similar subreddits, or to find other non-similar subreddits the person may like? The map by OP is grouped by categories, which is quite a bit different than just interest. Even for a recommendation system, I don't think it quite works here because of how extremely wide reddit is. Something like Steam or Spotify can use it, but reddit has everything from porn to cities to games. Just because I love Portal and I'm from Vancouver doesn't mean someone else who likes Portal will care about Vancouver, or vice versa.
Yes, but the majority of /r/Steam users are not subscribed to /r/Vancouver (or whatever the Vancouver subreddit is). I'd wager a guess there is a much more significant overlap with related subreddits such as /r/pcgaming.
I think this is how Last.fm works (and it works quite well!)
The weirdness in disparate interests is smoothed out by having a large sample size.
I'm trying to find details of the algorithm. In the meantime, here's an interview with the inventor of AudioScrobbler, which merged with Last.fm to provide its recommendations system: https://www.wired.com/2012/11/richard-jones-scrobbling
Well, have you read the methodology used by this map?
> Each dot is a subreddit. Two dots within the same cluster are usually close to each other if multiple users frequently leave comments on both subreddits.
So it's not exactly about subscribers, but it's the same idea, which proves your refutation wrong.
We may have tried other methodologies as well, I honestly don't remember. I feel like subscription overlap was something we at least talked about, but maybe not.
If you're on old reddit, you can click the "Other Discussions" button to see every time the URL has been submitted. You can also just go to reddit.com/duplicates/$linkID
Or you can go to reddit.com/$URL (<- They hate it when I tell people this because it's a feature that I wrote 15 years ago as a URL rewrite in the load balancer that they have to maintain as they change load balancers)
Fun fact: That feature exists because I made reddit's co-founder Steve write it for me in exchange for a place to sleep.
Thanks for that feature, I love using it to see discussion of the same content in different communities. It really helps make reddit feel like the front page of the internet, with multiple communities commenting on what's going on every day. If there was one thing I wish were improved about it, it would be canonicalization, so that m.wikipedia.org and wikipedia.org articles are connected, or youtube.com and youtu.be links.
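A rough sketch of the kind of canonicalization I mean, using only the standard library. The rules here (dropping `m.` and `www.` host labels, folding youtu.be into a youtube.com watch URL) are my own assumptions about what "the same link" should mean, not anything reddit does.

```python
from urllib.parse import urlsplit, parse_qs


def canonical_url(url):
    """Map a few equivalent URL spellings to one canonical form (naive sketch)."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    path = parts.path

    # youtu.be/<id> and youtube.com/watch?v=<id> point at the same video.
    if host == "youtu.be":
        return f"https://www.youtube.com/watch?v={path.lstrip('/')}"
    if host in ("youtube.com", "www.youtube.com", "m.youtube.com") and path == "/watch":
        ids = parse_qs(parts.query).get("v")
        if ids:
            return f"https://www.youtube.com/watch?v={ids[0]}"

    # Assumption: an "m" or "www" host label never changes the content.
    host = ".".join(label for label in host.split(".") if label not in ("m", "www"))
    query = f"?{parts.query}" if parts.query else ""
    return f"https://{host}{path}{query}"


short = canonical_url("https://youtu.be/abc123")
mobile = canonical_url("https://en.m.wikipedia.org/wiki/Reddit")
```

Duplicate lookup would then compare `canonical_url(a) == canonical_url(b)` instead of the raw strings. A real version would need a much longer rule list.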
Shame that Musk didn’t opt to buy Reddit instead. The only tech company more mismanaged than Twitter over the last decade was Reddit. Not to mention their blatant bias and abuse of power.
> I processed 176,178,986 unique comments that redditors left in years 2020 - 2021 and computed Jaccard Similarity between subreddits.
> Each dot on the map is subreddit. Two dots within the same cluster are usually close to each other if multiple users frequently leave comments on both subreddits.
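For anyone unfamiliar with Jaccard similarity, here is a minimal sketch of how it could be computed from (user, subreddit) comment pairs. The function names and toy data are made up, and this is not necessarily how the map's author implemented it.

```python
from collections import defaultdict
from itertools import combinations


def jaccard(a, b):
    """|A intersect B| / |A union B|; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def subreddit_similarity(comments):
    """comments: iterable of (user, subreddit) pairs, one per comment."""
    commenters = defaultdict(set)
    for user, subreddit in comments:
        commenters[subreddit].add(user)
    # Score every unordered pair of subreddits by commenter overlap.
    return {
        (s1, s2): jaccard(commenters[s1], commenters[s2])
        for s1, s2 in combinations(sorted(commenters), 2)
    }


# Toy data, invented for the example.
comments = [
    ("alice", "r/cats"), ("alice", "r/CatsAndPlants"),
    ("bob", "r/cats"), ("bob", "r/CatsAndPlants"),
    ("carol", "r/cats"), ("dave", "r/pcgaming"),
]
sims = subreddit_similarity(comments)
```

Here r/cats and r/CatsAndPlants share two of three distinct commenters, so they score 2/3, while r/pcgaming scores 0 against both. Comparing all pairs is quadratic in the number of subreddits, so a real run over 176 million comments would need something smarter.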
The large "Asia" region contains, besides Asian topics (the right half of the region):
- Language learning communities
- Latin America (except Brazil which is in the RPG region because of r/TibiaMMO)
- Italy, Spain and Portugal (the latter is located between China and Japan for mysterious low-dimensionality representation reasons)
Other European countries with funny locations:
- Germany in the soccer region (as Gary Lineker once famously said "Football is a simple game: 22 men chase a ball for 90 minutes and, in the end, the Germans always win")
>Portugal (the latter is located between China and Japan for mysterious low-dimensionality representation reasons)
Could that have a real-world reason, given that Macau used to be a Portuguese colony and Brazil hosts the largest Japanese diaspora population in the world?
Unlikely, because virtually nobody in Macau speaks Portuguese anymore, and Reddit is not well known in Japan. Various Japan-related subreddits are very popular, but they're populated by English speakers living in Japan, travelling to Japan, etc.
Seems to be pure chance. r/brasil and others are in a very different place in the graph, and none of the subreddits related to r/portugal are connected to any of the Japanese/Chinese subreddits in the graph.
Cats contains the plant-related subreddits, and is tied to them via r/CatsAndPlants (fair enough). It visibly looks like plants are also connected via r/michaelbaygifs, but that is only because of the really odd placement of r/UnexpectedThugLife.
It would be nice if the shown links were not restricted to a single "country", as I bet r/dogsandplants (which is part of Cats) is tied to a bunch of dog reddits, but those are not shown.
There’s actually a subreddit for this map, but I can’t find it in the map itself (too meta?): https://www.reddit.com/r/MofR/
Also surprised how big Overwatch is relative to other games.
Bonus - My roommate works at Amazon and works part time with Andrei in some capacity (don’t know the full details), but anyway he has mentioned multiple times how cool and out of his way helpful Andrei is.
I bring this up because when someone of any notoriety is nice I think it’s really cool. I’ve met some ‘big’ tech people who definitely weren’t!
I've had another one of their sites [1] open in a tab for several months now. Whenever I find a new-to-me subreddit that I find interesting, I look it up on this site to see what else is in the vicinity topic-wise.
To elaborate, the Australia-like island consists of very NSFW subreddits.
And yes, one should be careful with porn, consuming it in a bad way can lead to harmful coping mechanisms or problems with sex; seek help if you feel that you’re not in control.
There is no good way to “consume” porn. At best it’s pathetic, and it can be a whole lot worse, causing actual brain damage by rewiring reward circuits. It’s similar to alcohol in that respect: there is no safe dose.
I don’t say this to upset anyone, but anyone who is upset will benefit greatly from curbing the tendency toward denial and honestly asking “Am I an addict?”
Yeah, makes sense, since so much anime content (or at least a significantly larger proportion compared to western animation) already includes pretty explicit content. The line between something like Kill la Kill and Kill la Kill R34 is already razor thin.
Part of it is also community. From a quick glance I recognize that r/hentai is in anime island instead of R34. But the moderation of r/hentai has (or used to have) many moderators from other anime subs, and some regular posters will alternate between r/hentai, r/manga, or r/animemes. That can play a part in which communities decide to form and commingle.
I feel like moderators should receive a weighted vote when identifying connections between subreddits. Two subs with identical moderator lists must be pretty similar right?
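One way to sketch that idea is a weighted Jaccard, where each account carries a weight and moderators carry a bigger one. The weight of 5 and the helper names are arbitrary assumptions, just to show the shape of it.

```python
def member_weights(commenters, moderators, mod_weight=5.0):
    """Give ordinary commenters weight 1.0 and moderators mod_weight (hypothetical value)."""
    weights = {user: 1.0 for user in commenters}
    weights.update({mod: mod_weight for mod in moderators})
    return weights


def weighted_jaccard(weights_a, weights_b):
    """Sum of per-user minimum weights over sum of per-user maximum weights."""
    users = weights_a.keys() | weights_b.keys()
    shared = sum(min(weights_a.get(u, 0.0), weights_b.get(u, 0.0)) for u in users)
    total = sum(max(weights_a.get(u, 0.0), weights_b.get(u, 0.0)) for u in users)
    return shared / total if total else 0.0


# Two made-up subs with one shared commenter and one shared moderator.
sub_a = member_weights({"u1", "u2"}, {"mod1"})
sub_b = member_weights({"u2", "u3"}, {"mod1"})
boosted = weighted_jaccard(sub_a, sub_b)
plain = weighted_jaccard(
    member_weights({"u1", "u2"}, {"mod1"}, mod_weight=1.0),
    member_weights({"u2", "u3"}, {"mod1"}, mod_weight=1.0),
)
```

With every account weighted 1.0 this reduces to ordinary Jaccard (0.5 on the toy data); giving the shared moderator weight 5 raises it to 0.75.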
Considering how terrible Windows' netsec has been historically, that's entirely unsurprising. When you want to scream for help into the void, you might hit up both those places at once.
It's surprising how often I was already visiting a clique of subreddits. Even more surprising is how much I was missing out on similar subreddits. This visualization is really a discovery tool I needed.
Jojo* appears to be a country of refugees from the Anime region, who've settled in the general gaming region.
*Hopefully before anyone searches it on the map -- it appears that r/jojo was taken by some singer/actress. I was referring to the anime "JoJo's Bizarre Adventure," which has a capital of "r/shitpostcrusaders."
I wonder how the broader categories were selected. I thought it could be the largest/most connected node on each cluster but that doesn't seem to be the case.
I feel like these dots are so small, you have to zoom in and out constantly to go over the map... would it be possible to now take those dots and turn them into blocks that together make up 100% of the category they currently belong to? Like counties within a state
That's a nice idea. I do wonder if it could create some border problems, where there is a need to link to some non-neighbor states, but not through other states that would otherwise be linked/shared neighbors.
(I guess unless you wanted to render little overflights by airplanes or something)
There's a lot that's interesting to me about this, but one thing is how links provided by subreddits themselves might or might not reflect actual related topics. There are a couple of subs I'm familiar with that have relatively large related subreddits I wasn't aware of before, which aren't linked to or mentioned. Maybe in some cases there are political histories I'm unaware of?
The new reddit design is so problematic in so many ways. So many of the "related subreddits" sidebars (all?) are just eliminated in the redesigned site.
Great map. I suggest painting active subreddits as bright circles and stale ones as dim circles, so the map would look like Europe viewed from a satellite at night.
My methodology for discovery/linking subreddits when I did a big crawl was to look at the subreddit description and find links to other subreddits. I think that was less prone to the issues that other commenters are reporting.
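Roughly, the extraction step can be sketched like this. The regex and the 2 to 21 character name limit are my assumptions about what a subreddit mention looks like, not a copy of the crawler I used.

```python
import re

# Matches "r/name" or "/r/name", including inside reddit.com URLs,
# but not when the "r" is the tail of a longer word.
SUBREDDIT_LINK = re.compile(r"(?<![A-Za-z0-9_])/?r/([A-Za-z0-9_]{2,21})")


def linked_subreddits(description, own_name=None):
    """Return the lowercased, de-duplicated subreddit names mentioned in a description."""
    found = {name.lower() for name in SUBREDDIT_LINK.findall(description)}
    if own_name:
        found.discard(own_name.lower())  # ignore self-links
    return sorted(found)


# Made-up sidebar text for illustration.
description = (
    "Related: /r/pcgaming, r/Steam and https://www.reddit.com/r/Games/. "
    "We are r/gaming."
)
links = linked_subreddits(description, own_name="gaming")
```

Each (subreddit, linked subreddit) pair then becomes a directed edge in the graph.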
Oh, nice, I didn't even realise street view existed!
For anybody else who didn't realise - click on a circle to pop up a little preview of that subreddit, then click 'street view'. You can navigate in street view by clicking to bring up a crosshair, then using WSAD.
This is the metaverse, I mean, a metaverse, and maybe a VR 3D equivalent of this is the way to go? Curated and weighted as the user wants, not how the advertisers want...
I feel like in a social environment, the fact that you don't find everything you want immediately is a feature, not a bug, like getting off in a part of town you don't know and talking to a person you didn't expect. In hindsight, the 11-year-old "explore the city" me never left - still fun 30+ years later.
Can this map be better understood as groupings of users? It seems that a lot of the categories are polyphyletic, so to speak (programming grouped with science, Italy and Spain between China and Japan), and they make sense if you just think of them as overlapping groups of users.
More specifically, the groupings would be user accounts (this distinction is important for understanding Porn Island).
From the info section - "Two dots within the same cluster are usually close to each other if multiple users frequently leave comments on both subreddits."
I don't think so. The big subs have their own culture that's not as shared with the smaller subs. It's part of the reason why (and corollary to) subs get worse when they get larger
Before NSFW content was removed from /r/all a few years back, it was much more obvious. Once you scrolled 8-10 pages down your main feed it was almost exclusively porn.
Happy for the change; the porn made the website even more unusable than it already was.
A few years back I saw something similar where subreddits were presented like a tree, based on topics (and popularity, I think). Made it easy to drill down to interesting niche subreddits.
Who knew that science and teaching were a subset of programming?
Feels like something fundamental is wrong with the methodology. It appears they gave up on the large subreddits, perhaps because they were linked to everything.
I don't know why but my brain interpreted this as "Map of GitHub" and I was rather surprised that there was a GitHub repository for just GIFs and I hadn't known about it.
Browsing this as a map was entertaining, and I don’t even use Reddit.
Maybe this is a good application of the “metaverse”. Spatial association between communities.
The category names are often a little off. "Running" is more like endurance/cardio athletics, for example.
The methodology is cutting off edges that cross between categories too, which in some cases I think gives a distorted idea of how the subreddits are organized.
The bulk of the groups are labeled, but the sections of "NSFW Island", I'll call it, aren't. Is there an explanation of the grouping? Some of the groups make sense (celebs, indian-themed) but I can't make heads or tails of the others.
The country names for "Australia" are (in no particular order, with lists of some of the larger circles, especially ones that suggest other clusters bundled into the country):
- WorldPacks (WorldPacks, NSFWPublic, DiscordNudes as some of the larger ones)
The names are mostly just the names of one of the larger or more centrally connected subreddits in the country, and what some of them include seems pretty arbitrary.
And you can probably find "New Zealand"'s "Gay" and "Traps" countries pretty easily, and have no trouble telling them apart.