> The per-page time is actually quite high; often over 5 minutes on average for a lot of my stories, even when it's a big crowd. I suspect some subtle misreading of Google Analytics on my part, or else people might actually be reading the whole thing carefully!?
That might be because a lot of people do what I do: keep the article open while browsing comments on HN or Reddit, and switch back and forth between the two tabs while writing my own comment. In fact, I'm doing it right now. I don't know how Google Analytics would measure time spent on a page when the page sits idle in a background tab for the most part but is brought to the foreground from time to time.
[Edit] Things get even more complicated if you open 5 articles and 5 comment pages at the same time, and take 30+ minutes to go through all of them.
I tend to have one window for recreational browsing, and at the start of the day I will open HN and run down the front page, middle-clicking the link and the comments page for anything that looks interesting or highly upvoted.
This gives me a queue of stuff I can read in free time throughout the day but I imagine it really does skew the analytics.
This is why I wrote my own datapoint gathering script (I started it while at Techcrunch, curious about who scrolls down to comments, etc.).
It measures body focus, number of times unfocused, total focus time, scroll position, scroll rate and a bunch of other things.
It isn't completely working cross-browser yet, but it is close enough that, for Safari and Chrome visitors, I can see how far down the page they are scrolling and how long the page stays in focus.
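The poster hasn't published the script yet, but the counters it describes (focus count, unfocus count, scroll depth) can be sketched roughly like this. All names here are hypothetical; in a browser you would feed the methods from `window` "focus", "blur" and "scroll" events, while here the inputs are plain method calls so the bookkeeping can be exercised anywhere.

```javascript
// Hypothetical sketch of the engagement counters such a script might keep.
class EngagementStats {
  constructor() {
    this.focusCount = 0;        // times the page gained focus
    this.unfocusCount = 0;      // times the page lost focus
    this.maxScrollFraction = 0; // deepest point seen: 0 = top, 1 = bottom
  }
  onFocus() { this.focusCount += 1; }
  onBlur() { this.unfocusCount += 1; }
  onScroll(scrollTop, viewportHeight, pageHeight) {
    // Fraction of the page the bottom of the viewport has reached.
    const fraction = Math.min(1, (scrollTop + viewportHeight) / pageHeight);
    if (fraction > this.maxScrollFraction) this.maxScrollFraction = fraction;
  }
}
```

In a real page you'd wire it up with `window.addEventListener("focus", () => stats.onFocus())` and so on, then beacon the totals back before unload.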
No doubt, this is part of the reason this noscript-using reader stopped visiting TC. Call me a tinfoil-hat guy, but I simply would not consent to the collection of such data. It's ironic that I've learned so much from your analysis of data collection techniques elsewhere.
I have been meaning to clean it up and post it to my GitHub; I will do it this weekend. If you like, email me (in profile) and I'll ping you as soon as it is up.
A news site here in Australia seems to know when the tab is visible and auto-plays the video content for a story. I assume it's using these JavaScript events too.
But that can still lead to inaccurate measurements of "time on page" if the user switches back and forth between tabs. Do you only measure the periods during which the tab is active (and if so, how granular is it?), or do you simply subtract the first active timestamp from the last active timestamp?
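The first approach in that question — counting only the periods the tab is active — can be sketched as a small accumulator. This is an illustrative sketch, not anyone's actual script: in a real page you would call `tracker.setActive(!document.hidden)` from a `"visibilitychange"` listener; here the clock is injected so the interval logic can run anywhere.

```javascript
// Accumulates foreground time only; background intervals are ignored.
class ActiveTimeTracker {
  constructor(now = () => Date.now()) {
    this.now = now;          // injectable clock for testing
    this.activeSince = null; // timestamp when the tab last became active
    this.totalMs = 0;        // accumulated foreground milliseconds
  }
  setActive(isActive) {
    const t = this.now();
    if (isActive && this.activeSince === null) {
      this.activeSince = t;                     // foreground interval opens
    } else if (!isActive && this.activeSince !== null) {
      this.totalMs += t - this.activeSince;     // foreground interval closes
      this.activeSince = null;
    }
  }
  total() {
    // Include the currently open interval, if any.
    return this.activeSince === null
      ? this.totalMs
      : this.totalMs + (this.now() - this.activeSince);
  }
}
```

The alternative ("last active minus first active") would credit the whole background gap as reading time, which is exactly the HN-in-another-tab distortion being discussed.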
There is something systemically flawed with Reddit's moderator system which precludes it from ever being an unqualified success as a social news aggregator.
The primary issue is the selection process for moderation. To become a moderator for a subreddit, you need to have thought of the name of the subreddit. That's it. That single act of naming grants a user absolute power within that domain, along with the ability to grant any other user the exact same powers.
The person who created the "programming" subreddit has no qualifications, no resume by which to judge their aptitude for the moderation job, and no process exists by which to vet newly added moderators.
A moderator can, for any or no reason, decide to activate the "spam filter" on any submitted article, removing it from public voting and view. This is the only tool Reddit moderators are given to modify their respective domains, and when it is used for reasons other than spam, it "teaches" the filter to remove non-spam results. This is the cause of the "broken" spam filter on Reddit: moderator abuse.
I think you hit the nail on the head with your last point, though I wouldn't necessarily call it moderator abuse. The actions that moderators can take on reddit were built on a laissez faire model of moderation, where anything goes except blatant spam. This obviously doesn't work very well for moderators who choose to run their subreddits in a different manner, and this happens to be most of the big subreddits, because laissez faire moderation just doesn't work on a large scale.
For example, the sole /r/truereddit moderator believes in a laissez faire moderation style, and the subreddit is slowly getting worse and worse as the userbase increases. Every time a /r/truereddit post makes it to the front page, a virtual swarm of idiocy surrounds the entire subreddit for a few days.
But that's the only way the moderation system is designed to work. There's no way to remove/approve content without training the filter. If someone submits a codinghorror post to /r/programming that is completely off-topic, the moderators either have to remove it, effectively warning the filter that codinghorror might be spam, or hope that the userbase does the right thing and votes it down. Usually, they won't, especially the people who are voting from their customized front page, not paying attention to what subreddit something is in.
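The feedback loop described above — where every removal, whatever the reason, nudges the filter toward treating a legitimate domain as spam — can be illustrated with a toy per-domain score. Reddit's real filter is not public, so this is only a model of the dynamic, with invented names throughout.

```javascript
// Toy model of the mis-training problem: every moderator removal pushes
// a domain's score toward "spam", regardless of why the post was removed.
const scores = new Map(); // domain -> score; below 0 means "filter it"

function recordRemoval(domain) {
  scores.set(domain, (scores.get(domain) ?? 0) - 1);
}
function recordApproval(domain) {
  scores.set(domain, (scores.get(domain) ?? 0) + 1);
}
function isFiltered(domain) {
  return (scores.get(domain) ?? 0) < 0;
}
```

In this model, removing two merely off-topic codinghorror posts is indistinguishable from removing two spam posts, so future on-topic submissions from the same domain get filtered — the exact bind the comment describes.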
It's broken, and the admins have never really expressed any concerns about this fact.
One of the core problems with even talking about the issue is the general disdain Reddit users have for one another, or to be more accurate, the disdain individuals have for the collective. The idea that content can be "better" or "worse", and this can be judged by an individual to the exclusion of the group, completely circumvents the entire concept of Reddit - crowdsourcing news.
>The idea that content can be "better" or "worse", and this can be judged by an individual to the exclusion of the group [...]
This happens in every growing online community. As the community grows, the new members don't necessarily have the same interests as the original ones. This happened at digg, reddit, and now even HN has posts about declining quality, which is another way of saying content can be "better".
Weird. As it turns out, the feature that I was complaining about reddit not having (not being able to add/remove without training the filter) was added an hour after I made this comment.
I know that most, if not all of the earlier admins are on HN. Reddit was originally funded by ycombinator. It wouldn't be too much of a stretch to think that the new guys do as well.
As far as self promotion, there isn't a whole lot of consensus on it. The community aspect of the site is somewhat at odds with the link aggregator model. If bloggers submit all of their content, pretty much automatically, and don't contribute to the community in other ways, that doesn't really tend to go over so well. As a result, people who create content are expected to invest time and energy into becoming part of the community so they can accurately gauge how their content will be received.
It's not just supposed to be a place to dump all your links.
/r/programming is more biased towards the link aggregator model than the rest of the site, but the community aspect is still there. Someone mentioned a 10:1 ratio of other content to your own; in /r/programming, I usually say that 50/50 is fine.
Personally, I dislike how the moderator system for subreddits uses an oldest-first hierarchy, where older moderators can always remove newer ones. Hence, a subreddit has no such thing as equal moderators. There is always one "on top", so to speak.
> A comment on HN is going to be mature and reasoned; often expanding or exploring technical issues raised
That's what I love about HN: the comments here are of very high quality.
It's something that I've noticed with my own posts: well-reasoned comments attract a lot of positive karma, even if you're taking a controversial stance on an issue.
At least 50% of the time I don't even read the article, I just read the comments. I tend to be more interested in the conversation the posts generate than the posts themselves.
I would venture to guess I read only the comments, and not the actual post, 90% of the time. The discussion (or the subject) will occasionally lead me to read the article in depth, but more often than not the discourse here is more enlightening overall. I can easily say HN is the only community where my behavior follows this model.
> Nobody actually follows the links in tweets though; click-through is often in the low digits per tweet
Tweet click-through data is obscured in Google Analytics because anyone who clicks from a mobile app shows up as a "direct" hit because there is no referring URL. There's some speculation and evidence[1] that this, at least in some cases, shows up as "Mozilla Compatible Agent".
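The reason those clicks vanish into "direct" is mechanical: analytics classifies a hit by its referrer, and mobile app webviews often send none. A minimal sketch of that classification logic (the function name is invented; real analytics products are far more elaborate):

```javascript
// Classify a hit by its referrer, the way the comment describes:
// no referrer means the visit is lumped into "direct", which is how
// tweet clicks from mobile apps disappear from the Twitter bucket.
function classifySource(referrer) {
  if (!referrer) return "direct"; // empty or missing Referer header
  try {
    return new URL(referrer).hostname; // e.g. "t.co" for Twitter's shortener
  } catch {
    return "direct"; // unparseable referrer, treat as direct
  }
}
```

So a click from the Twitter website shows up under "t.co", while the very same link tapped in a mobile app arrives with no referrer and is counted as direct traffic.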
Yesterday, I posted an article on HackerNews. It's received 62k views in total so far. HackerNews was the site that kicked off all of the view counts, and the article quickly jumped to the #1 spot.
After it reached the top spot, I thought I would also post it on Reddit. It didn't get a single upvote. It's the third time that a story of mine has made it to the front page of HackerNews and then not received a single visitor from Reddit. In the future I don't think it's worth bothering.
Ultimately, 100% of the traffic that I received would not have happened if it weren't for the upvotes on HackerNews, because those upvotes got everyone tweeting about it. It received 900 tweets in total.
Would you care to elaborate on this? Are you referring more to the removal of posts that have made it through the filter, or the lack of approval of posts that are stuck in the filter?
I think we do a pretty good job. No choice is as clear cut as people would like it to be (especially when it is their own content on the line). I've heard feedback from people who think we do an amazing job at moderating, keeping the ever-growing cesspool from leaking into the sacred realm of /r/programming, and I've heard feedback from people who think we're the devil, and that we're censoring the will of the people, and somebody call the ACLU right now!!!
But any community is going to get feedback from those two extremes. That's just how people react to moderation. The fact that there seems to be pretty equal feedback from both extremes tells me that we must either be doing something right, or everything completely wrong.
That first one was submitted to /r/programming by a user who has a really bad history of incessantly spamming his/her own blog and nothing else. The user has 2 comments in 4 years, and only 2 submissions which are not self promotion (one of which is yours).
I couldn't find a trace of the other two, though looking at your account (or what I think is your account) I am seeing a slew of filtered posts, some of which are on topic, some of which are not. I just approved a couple of your most recent on-topic posts, however I can tell you that you will need to keep a good ratio between your own site(s) and other good content for the spam filter to go easy on you. It filters really heavily against people who submit the same domain a large part of the time, and we (moderators) tend to stick to that as well.
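The heuristic the moderator describes — filtering heavily against users who submit one domain "a large part of the time" — amounts to measuring how dominated a user's history is by a single domain. A hypothetical version (the real filter's weighting is not public, and this function name is invented):

```javascript
// Share of a user's submissions coming from their single most-submitted
// domain: 1.0 means every submission is the same domain, 0.1 would be a
// user spreading ten submissions across ten different domains.
function dominantDomainRatio(domains) {
  const counts = new Map();
  for (const d of domains) {
    counts.set(d, (counts.get(d) ?? 0) + 1);
  }
  return Math.max(...counts.values()) / domains.length;
}
```

Under a heuristic like this, the "10:1 other content to your own" guideline keeps the ratio near 0.1, well clear of whatever threshold trips the filter, while a blogger who only ever submits their own site sits at 1.0.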
I will comment on your other points in a little while.
EDIT: Okay, now for the rest.
I think the warm fuzzy memories you have of proggit are akin to the warm fuzzy memories a lot of people have of reddit as a whole 3 or 4 years ago. But with a larger userbase comes a lot of noise and idiocy. In my opinion, /r/programming was one of the first subreddits to take real action against quality decay. There was a period of time, after the heyday, but while /r/programming was still a default, when rage comics and advice animals were rampant. Maybe not quite as rampant as the rest of the site, but it was pretty bad. There were lots of people who were not programmers submitting and commenting, so lots of stuff ended up being off topic.
I think the quality has improved immensely since then, and I think this is in part due to strict moderation and the disabling of self posts. I will allow that it certainly isn't as great as reddit as a whole was in its heyday, but it's getting better.
There's not much we can do about the comments. We can't moderate comments for stupidity (because that would be pretty messed up). The only thing I've ever moderated comments for is spam and personal info.
A reboot, unless it's in the form of a different sub, is probably not going to happen, but I'd certainly be interested in hearing any ideas for improving the subreddit in its current form.
> It filters really heavily against people who submit the same domain a large part of the time, and we (moderators) tend to stick to that as well.
Is that done on a per-subreddit basis, or a whole-site basis? If I submit mostly my own content to proggit but mostly other people's content to other subreddits, will the proggit filter hate me?
I believe that all the spam prevention measures are done on a per-subreddit basis, meaning that if you only submit to other subreddits, you will look like a new user to the proggit spam filter once you start submitting. A couple users have been plagued by the exact scenario you pointed out, where they had a ton of other sources, but they only posted their own content to /r/programming, which is problematic.
And on your second note, the idea is not that you go seeking out content to mix in with your own. If you happen to come across a cool article, and it hasn't been submitted to proggit yet, just contribute it. This is probably part of the community vs content aspect you mentioned in a different comment, but sheer size of the site means that you cannot really ignore the community aspect.
I could probably get in trouble for saying this, but the reddit userbase, as a whole, is not smart enough to filter out what is good and bad. If a bunch of bloggers just auto submit every blog article they write, we will have tons of crappy, ad-ridden, off-topic posts being voted to the top. It still happens from time to time. HN might be smart enough to distinguish the good from the bad, but reddit surely is not. So we have to be at least a little restrictive.
I'm not saying your content falls into those categories, but we can't really give special treatment to people we like or are familiar with. There is one guy in the /r/programming approved submitters list, and he is a very noteworthy programmer who, for whatever reason, could not get any of his posts through the spam filter. He is the one exception. For the most part, everyone is lumped into the same bucket.
So when your posts get caught in the spam filter, just send us modmail. Sometimes we'll get them, sometimes we won't. That's just how it is. I saw earlier that you said you sometimes delete them if they've been filtered. This may be a bad idea. I cannot say for sure, because the admins don't really disclose this stuff, but deleting and resubmitting the same link has been known to hurt a user's standing with the spam filter, so just deleting it may do the same thing, I'm not sure. And there's really no reason to delete it. The only thing it will clutter up is the spam filter, which is going to be cluttered no matter what.
Anyway, I hope this has been somewhat useful to you. Sorry about the massive walls of text. Let me know if you have any other questions or concerns.
Haha. Well, to be completely honest, I was recruited as a moderator for /r/programming because of a novelty account I ran that essentially bitched about any submission that could remotely be construed as not being related to programming, so I, and the mod who added me, are pretty strict about what belongs in the sub and what doesn't.
Looking over your blog though, it seems like most of the stuff you write about is fine. We'll only have a huge problem with it if you start indiscriminately submitting every post on your blog, relevant to programming or not. If you're not sure, just submit anyway. One of the admins just added a feature yesterday that allows us to remove posts that are off topic without training the spam filter (long overdue), so if we do decide that something isn't right for the subreddit, it won't harm your standing with the filter.
The point of using incognito is to be unauthenticated. Otherwise, wouldn't the post always be visible to the submitter, regardless of whether or not it is in the spam filter?
The post will show up for the submitter on their userpage and in their submitted links panel, however it will not show up in the new queue, not even for mods. This may be different for shadowbanned users, but I don't think it is.
Actually, I'm almost certain it's the same for shadowbanned users, because we got two modmail messages today from shadowbanned users asking to have their posts taken out of the spam filter because they can't see them in the new queue.
FWIW, it's currently in /r/programming's spam filter, and it won't be coming out because it's not about programming. Sorry to be all stackoverflow about it, but that's how we operate.
Reddit's spam filter is completely broken with no apparent activity from the current group of admins towards fixing it.
It seems that generally if you submit something to reddit, it will get caught in the spam filter, then you have to message a moderator to request that it get approved.
[Again, it seems] that you're not so much voting on an assortment of links, you're voting on an assortment of links that moderators have deemed worthy.
Yeah, I have always been puzzled about that. I will post extremely relevant content to various sub-reddits and they never show up in new/hot/whatever. It's like insta-spammed somehow. I don't even bother messaging the admin.
This one was legitimately in the spam filter, however if it had not been, one of us probably would have taken it down for being off topic.
EDIT: Right now that submission has 5 points, from people who must have clicked through on the blog post, and one comment complaining about how it's not about programming, also from someone who must have clicked through from the post.
As an amateur, beginner programmer, I'm curious why you don't use something like Jekyll+S3 rather than tumblr. The dude that pulls my espresso shots uses tumblr...
Despite wishing they'd hire people to add features http://williamedwardscoder.tumblr.com/post/18002362007/tumbl... (note the comment there from someone using Google Pages + Jekyll), I'm generally happy enough not to invest time or money in self-hosting or paid hosting.
No mention of Techmeme? I'll add that the best way of getting picked up by other bloggers is first Techmeme (not the largest audience but definitely the most influential in tech) and second HN.
I've had blog posts pick up their traffic peaks 2-3 days after they were posted, as the route was techmeme -> other tech blogs -> mainstream media.
+ Attach yourself to a story cluster if a relevant one is already there by linking to a source post
+ `tip @techmeme <url>` on twitter
+ Message the editors on twitter
+ Make sure your feed validates, can be autodiscovered etc.
+ Techmeme can discover stories based on retweet counts, so make sure the same canonical URL is being retweeted and favorited from the post itself, from twitter, from disqus, etc.
+ Post interesting things frequently enough that you become a source that is crawled by Techmeme
Good advice. I wondered why/how I got included on Techmeme before. I would assume it was based on retweet counts.
TM takes hints from a lot of sources, including HN. It has 30k+ sources that it directly crawls itself as well, and uses a lot of different hints to work out what is popular. The biggest hint for a big story is inbound links from other blogs that are also sources, and from there things like popularity on social media.
I have never submitted my own posts to DZone, but they have been submitted by others. It isn't in my top 10 referrers. My impression is that most of the audience there is Java developers looking for technical posts.
Being the first to submit it to delicious is also a good idea, with good tags. Whoever submits it first sets the prevailing title for future re-posts, so it is a good idea to be first and set your own title before somebody bastardizes it (this is the reason I submit my own posts to HN).
Submitting to Google+ is also a good idea since it is a very good way to get the Google crawler to visit the page and index it.
My own routine after I publish a post that I think may be interesting to others is HN, twitter, delicious and Google+.
I must admit I'm nauseated by the number of references to reddit on HN. I think reddit is powerful and fills a void, but the few times I've visited, it's been straight to cmd+W. Twitter is also useful on occasion, but simply not my "cup of tea." These sites feel like eternal September, and make me consider suicide (not to over-dramatize...).
I think to a certain extent, 3 or 4 years ago Reddit was very much like HN is now. When I first joined Reddit (2007) there was generally a lot of reasoned discussion, and the submissions were very much programmer-oriented in a way that's long since passed.
I'd imagine (although I suppose I have no real evidence for this) that the people who talk about Reddit and HN in the same context were active on Reddit a few years ago.
A lot of us discovered reddit when Paul Graham mentioned it in an essay.
It was pretty cool in the beginning, but then it started tending towards Ron Paul, and various political stories rather than stuff that was actually interesting. And thus, HN...
Ron Paul is still ok. I can tolerate lolcats even. It's the deterioration of comments to YouTube level that puts me off. Or just downvotes without explanation. I neither want to read nor write on reddit these days... and I signed up on reddit when Paul Graham mentioned it, 6 years ago.
It's nice to get confirmation! I did once get a lot of comment upvotes lambasting the poor quality of comments on the science reddit, but frankly I think the sheer size of the reddit community will make it very, very hard to return to how it was. The culture that seems to be pushed by a subset of (mostly non-technical) reddit users is largely incompatible with the culture that I (and I assume others) are looking for in a social news site.
There are of course some areas of reddit which maintain a largely engaged user base willing to make in depth and interesting comments, but they tend to be rare, hard to find and often suffer from the mainstream reddit culture being dragged in by more casual visitors.
Perhaps we have to simply resign ourselves to a migration between social news platforms every few years.
Personally, I will never use reddit, no matter the subreddits, or whatever. Usenet and IRC are new again; killfiles and /ignore are powerful, and user-focused.
My experience with HN has been much better than with Reddit. On reddit, even good posts get downvoted a lot of the time.
The one thing I like about hacker news is that your posts keep a minimum of 1 point even if others don't like them, whereas on reddit you can fall to 0 points, which can be discouraging.
A lot of downvotes on reddit are actually inserted by the system, accompanied by the same number of upvotes to balance them out. This is supposed to be some sort of anti-spam anti-troll measure, though I'm not sure how that even works.
Anyway, the upshot is that you never know the true number of downvotes you got. This can be discouraging for people who don't know about computer-generated phantom votes.
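Taken at face value, the fuzzing scheme described above is easy to state in code: add the same number of phantom votes to both sides, so the net score is untouched but the displayed tallies no longer reveal the true counts. This is only a sketch of the described behavior, not reddit's actual implementation, and the function name is invented.

```javascript
// Vote fuzzing as described: n phantom upvotes and n phantom downvotes
// leave the net score intact while obscuring the real up/down split.
function fuzzVotes(realUps, realDowns, phantoms) {
  const ups = realUps + phantoms;
  const downs = realDowns + phantoms;
  return { ups, downs, score: ups - downs }; // score === realUps - realDowns
}
```

So a post with 10 real upvotes and 2 real downvotes might display 17 up / 9 down, which is why the downvote count you see can be discouraging without reflecting real disapproval.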