Medium's Technology Stack (medium.com/medium-eng)
77 points by dankohn1 on Oct 25, 2015 | 39 comments



2.6 Millennia of "reading time".

I just love these made-up PR metrics that can make any audience look big or any business seem successful.

Really, if you write a post about infrastructure, you shouldn't lead with a metric that says absolutely nothing about your site's load. 25 million unique readers per month is more fitting, but still quite lacking (too big a timeframe and no way to infer how many pageviews there are).


The point is that we don't really care about the number of uniques or page views. What we actually look at internally on a day to day basis is the amount of time people spend reading and other engagement metrics.


Whether someone spent a second or a year with the page open is irrelevant to how many servers you need to have running. The number of pageviews or requests per second is what matters. Since this is an article about infrastructure, nobody cares about engagement time.
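
To make the arithmetic concrete, here is a rough sketch of turning a monthly pageview count into an average request rate. The article gives no pageview figure, so the number in the usage note is a made-up placeholder, and `average_rps` and its parameters are hypothetical names, not anything Medium uses.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rps(monthly_pageviews, requests_per_pageview=1.0, days_in_month=30):
    """Average requests per second implied by a monthly pageview count.

    requests_per_pageview is an assumed fan-out (API calls, assets, etc.).
    This is a mean, so real peak load will be higher.
    """
    total_requests = monthly_pageviews * requests_per_pageview
    return total_requests / (days_in_month * SECONDS_PER_DAY)
```

For example, `average_rps(100_000_000)` works out to roughly 39 requests per second for a hypothetical 100 million pageviews a month, which is the kind of number that actually tells you something about server count.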


Ha ha... I totally agree. But marketing always claws its way into posts like these. You rarely get an objective post from a company blog.


In past lives I’ve raced snowboards, jumped out of planes, and lived in the jungle

I'm so glad the author added that line, signalling his credibility. I'm even more impressed by Medium's stack when I know the guy responsible for it skydives. The coolness of this dude is overwhelming me now.


While I don't particularly like your tone, I found the inclusion of that entire paragraph surprising. The paragraph immediately before it states:

> Where the quality of the idea matters, not the author’s qualifications.

It is then followed by the paragraph you talk about, specifically meant to showcase the author's qualifications.


I don't want to be completely defined by my work, but at the same time didn't want to be overly verbose talking about myself.

Also, the original interview had this section as multiple questions; it was collapsed into a single paragraph during editing.


I hope this doesn't come off as negative, I'm honestly curious: Why are there so many Medium stories on HN? There seems to be at least one story a day on the front page if not more. They usually have very few comments and the content often seems a little bit of a mismatch from what normally gets on the front page.

I'm happy to ignore the stories and skip over them, but to me that's not the point. Is someone gaming the system? Is Medium an HN subsidiary of some sort? Has anyone done an analysis of how these types of stories get onto HN in the first place? Are Medium stories really so interesting that there's at least one worth posting on the front page of HN every day?


medium.com stories are penalized by default on HN, as are most sites that produce solid articles amid a lot of fluff. (If it were only fluff, we'd ban the site.) The penalty can get lifted by moderators or by software.

You may be underestimating the sheer quantity of articles people post from there (dozens a day), as well as the diversity of topics in them. I took a quick look and the 6 most successful medium.com posts from yesterday were all technical. (You can check this for yourself by clicking on the domain name next to the story title, which shows you all the posts from that domain.)


Have you considered expanding the penalization software for content aggregators (such as Medium), so that it would treat a combination of the user and the domain as the key, rather than the domain name alone?
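
A minimal sketch of what I mean, assuming the first path segment identifies the user or publication. Nothing here reflects how HN's software actually works: the penalty tables, the values in them, and the function names are all hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical rank multipliers; 1.0 means "no penalty".
DOMAIN_PENALTY = {"medium.com": 0.5}
PUBLICATION_PENALTY = {("medium.com", "medium-eng"): 1.0}


def penalty_key(url):
    """Return (domain, first path segment) for a story URL."""
    parsed = urlparse(url)
    domain = parsed.netloc.lower()
    if domain.startswith("www."):
        domain = domain[4:]
    segments = [s for s in parsed.path.split("/") if s]
    user = segments[0] if segments else ""
    return (domain, user)


def rank_multiplier(url):
    """Prefer a (domain, user) entry; fall back to the domain-wide penalty."""
    key = penalty_key(url)
    if key in PUBLICATION_PENALTY:
        return PUBLICATION_PENALTY[key]
    return DOMAIN_PENALTY.get(key[0], 1.0)
```

With these made-up tables, a post under `medium.com/medium-eng` would escape the default penalty while other medium.com posts would still get it.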


Yes. We haven't had a chance to think through the implications yet, but it would be good for the more substantive publications that put out articles on medium.com.


I don't think there has to be a conspiracy. Medium hosts a lot of high-quality long-form content, which is a natural fit for the demographic here, is all.

I like a lot of the Medium stuff that's posted, and it's one of the nicest in-browser reading experiences around.


> Why are there so many Medium stories on HN?

I once tried writing a story on Medium, and their editor is pretty amazing. In addition, stories on Medium look nice, better than a significant share of the themes for WordPress and friends. I'd assume that some people also publish on Medium hoping for some extra traffic driven by Medium.

So Medium could be just summed up as 'it just works'.


There are a lot of sour grapes in this thread about performance and made-up metrics, but let's take them at their word:

For a site as seemingly simple as Medium, it may be surprising how much complexity is behind the scenes. It’s just a blog, right? You could probably knock something out using Rails in a couple of days. :)

This hints that there is a lot more functionality being put up behind the scenes in terms of datamining and analysis and whatnot than is apparent from the simple "display an article", which indeed is a trivial Sinatra or Express (not even Rails!) app.

If you're not the customer, etc. etc.


While I agree with the overall thrust of your comment -- that playing armchair architect is sort of proving the author's point -- the post doesn't really even attempt to dispel the notion that their stack is overkill. And, honestly, to me, that would make for the more informative and interesting article -- don't just tell me the package you use for bloom filters, tell me why you need a network daemon for bloom filters!
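
For context, an in-process Bloom filter is only a few lines, which is why the daemon needs explaining. This is a minimal sketch with arbitrary sizes and a hashing scheme I picked for illustration, not whatever package or service Medium actually runs.

```python
import hashlib


class BloomFilter:
    """Tiny in-memory Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )
```

The interesting part of the article would be what pushes you from something like this to a shared network service: filter size, sharing across processes, persistence, and so on.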


Sorry it came across as "sour grapes". The spec for the post was: describe the components of your stack, why you use them, and the challenges you've faced.

In terms of "display an article", I'd encourage you to think about what that entails for a platform like ours. It's actually a pretty interesting problem space: near-WYSIWYG editing across 3 platforms, post model vs. HTML (hint: we don't store HTML), operational transforms, version history, typographic treatments, copy/paste normalization, ingestion API, etc. It's not rocket science, but it's deeper than you might think at first blush.


"Sour grapes" was in response to sibling comments here, not the original article!

The additional stuff you mentioned here--all of the annoying fiddly details for editing and whatnot--is exactly what should've been used to justify the stack.

The problem is that all of the extra stack baggage frankly looks really baroque, even with the trickiness of the UX you just hinted at.


Thanks for clarifying and yes, lots of interesting stories to be told around the edges of this post. Some are already written and linked. For example, I'd recommend Why ContentEditable is Terrible by Nick. https://medium.com/medium-eng/why-contenteditable-is-terribl...


On this topic, I would like to get a book recommendation on how to build a full-stack product like this. Something more technical, with war stories.

My background: I'm a very strong iOS engineer with decent Python and JS skills. However, my backend skills are limited to building a Flask app with custom endpoints that talks to a single SQL instance. I'm completely oblivious to memcache, load balancers, and the different AWS services. My plan is (surprise, surprise) to quit my day job next year to pursue my own project, and I would like to gather enough best practices and understanding of a full-stack app.

Thank you!


"A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over with a working simple system." -- John Gall, 1975

If you are building a new service, and doubly so if you are new to web applications, you want to build your system with a minimal set of moving pieces. Nginx as a webserver, Flask or Django for your application, and MySQL or Postgres for your database will get you very, very far, and will likely remain the core of your stack.

Again, you can go very far with a simple stack. Heck, I've served a very dynamic website to front-page-of-Wired traffic off a single small physical server, PHP/MySQL with no caching. (Caching is great though!)

Avoid as the plague, Docker, unless you know why you needed it. Docker adds complexity, and unless your setup is large enough that the benefits outweigh the costs, your life has gotten worse, not better. And you've wasted a lot of time. Repeatable server build setups are great though: Ansible or even a good shell script.

Application/server monitoring is a good thing. Datadog and New Relic are good choices here.

Although you will one day hit a wall, scaling to bigger hardware is tremendously easier (and probably cheaper) than building a big distributed system. Don't underestimate just how much more powerful a full, real, physical server can be than a $20/month cloud server.

When you do need to scale out, listen to your application and scale out just what you need to.


>Avoid as the plague, Docker, unless you know why you needed it.

Just wanted to add something here. Docker is great when you don't want to do setup work for a service, or install unwanted software on your machine when you know it will only be used by that app. For example, if your app uses Redis, you can start a Redis instance with 2 docker commands rather than downloading and installing it on your system.

You do not always need to dockerise your apps; you can use Docker containers for dependencies too. I find it pretty nice.


This is almost a self-parody of an overbuilt system. Be aware that a lot of stuff you'll read war stories on is hugely overdesigned, especially in the cloud, because it's basically like playing engineering legos.

When someone else is paying for it, it's a huge amount of fun. If you're paying for it yourself, eliminate as many layers of the 'stack' as you can get away with.


You might get some more full-stack web understanding out of this talk: https://www.youtube.com/watch?v=8uxQOzKi3_0

It explains some of the steps between "web app and database" and a crazy interconnected diagram of various acronyms.


> We like the type-safety without the verbosity and JVM tuning of Java.

I've never worked on large-scale services, but I'm a fan of articles and talks on scaling. Is there any reason why I often read about JVM tuning, but never come across V8 tuning, or Go runtime tuning or CPython tuning?


This was mostly a hat tip to a scarred past. This is a post about Go's GC tuning philosophy: https://blog.golang.org/go15gc


Can't speak for V8/Go/CPython, but the JVM has a lot of knobs to turn (quick look at enterprise instances I've been responsible for: max/initial heap size, max/initial permgen size, max/initial newgen size, enable/disable dump on OOM, enable/disable GC logging, enable/disable compaction), some of which also depend on the implementation (e.g. the IBM JVM used to have a different default GC policy which favoured fewer but longer collections, and some of the parameters behaved subtly differently from Sun's).


Or .NET runtime tuning.


> The Stack That Helped Medium Drive 2.6 Millennia of Reading Time

> For a site as seemingly simple as Medium, it may be surprising how much complexity is behind the scenes. It’s just a blog, right? You could probably knock something out using Rails in a couple of days. :)

> In past lives I’ve raced snowboards, jumped out of planes, and lived in the jungle.

Looking past the annoying, "look how cool we are" tone, I do appreciate their sharing what they use.


Don't think it counts as looking past when you still feel the need to announce specific sentences you don't like.


As I said above, this was from a bio question that got elided during editing. I didn't want to be completely defined by my work, but at the same time didn't want to be overly verbose talking about myself.

Glad you took something from the post though.


And 3 millennia of load time.

EDIT: “On the web we want to stay close to the metal.”


I realize they called this out in the article as "snark", but it really is just a blog engine, and its monthly web traffic is a small fraction of what wordpress.com gets (which is being served on one of the most craptacular platforms that exists).

They in fact could have solved this problem with Rails or even a more full-fledged CMS like BrowserCMS.


Or Apache serving flat files.


Not even Apache! Just their editor with a DB backend rendering content to S3 with a CDN in front of it.


TLDR;

Node.js and Go on the backend, with DynamoDB

clojure on the front end with their own SPA framework.




You're right, I read closure but for some reason I typed clojure. My mistake.


A commenter on another recent submission pointed out the sudden surfacing of tech-stack blog posts that mention Datadog in their stack. This appears to be another of those posts. Someone at Datadog is being clever.



