
> we can assume that Instagram was written using Objective-C and a combination of other things like UIKit.

How can they know the internal infrastructure but have to assume the app language?

Edit: So the entire piece is taken almost verbatim from a couple of old articles on the Instagram engineering blog.

It might as well just redirect to: https://instagram-engineering.com/what-powers-instagram-hund...

This is against the guidelines:

> Please submit the original source. If a post reports on something found on another site, submit the latter.




Instagram engineers themselves wrote a bit about their backend infrastructure. One of the more important topics was how they shard the data [0], which this blog post also links to.

[0]: https://instagram-engineering.com/sharding-ids-at-instagram-...
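For anyone who doesn't want to click through, here is a minimal sketch of the ID layout as I understand it from that post: 41 bits of millisecond timestamp, 13 bits of logical shard ID, and 10 bits of a per-shard sequence. The epoch constant, the shard count, and the function names are my own assumptions for illustration, not Instagram's code (theirs runs inside PostgreSQL).

```python
# Assumed layout: [41 bits ms since custom epoch][13 bits logical shard][10 bits sequence]
EPOCH_MS = 1_293_840_000_000   # hypothetical custom epoch: 2011-01-01 00:00:00 UTC
NUM_LOGICAL_SHARDS = 2000      # hypothetical count; must fit in 13 bits (< 8192)


def make_id(now_ms, user_id, sequence):
    """Pack a time-sortable 64-bit ID for a row owned by user_id."""
    shard_id = user_id % NUM_LOGICAL_SHARDS   # which logical shard owns this user
    elapsed = now_ms - EPOCH_MS               # milliseconds since the custom epoch
    return (elapsed << 23) | (shard_id << 10) | (sequence % 1024)


def split_id(value):
    """Unpack an ID into (elapsed_ms, shard_id, sequence)."""
    return value >> 23, (value >> 10) & 0x1FFF, value & 0x3FF


example = make_id(EPOCH_MS + 1_387_263_000, user_id=31341, sequence=5001)
print(split_id(example))
```

Because the timestamp sits in the high bits, IDs generated later compare greater, and the shard can be read back out of any ID without a lookup.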


If it wasn't just an image site the potential for hotspotting would be insane!

Data size isn't a bad thing anymore, since storage prices have dropped exponentially since the inception of Instagram.

I am positive they would have used a more modern technology if it had been available back then.

Fantastic read though.


The potential for hotspotting decreases with the number of inserts per second. For example, if you only did 1 insert per second and timed it right, you could land all of those inserts on one server, but that would be unlikely to overload it.

It's virtually impossible for anyone to hotspot in a meaningful way with this system.


Nah, this will totally be an issue. You can be on the extreme end of replication or the extreme end of sharding and still experience performance problems. Sharding is more likely to hotspot, depending on where the hot data is concentrated.

The solution in most cases is a simple database that acts as a pointer database (user -> user's db), with the entry generated when the user is created.

From here you create some simple cold storage models (if the user isn't active) and some warm models that scale out: if the user's db grows, it shards and replicates for more read access. But the last thing you want is to slow replication or to have one DB that can't move to balance resource utilization. Some newer DB tech does this without even sweating the deets.
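A rough sketch of the pointer-database idea, assuming an in-memory dict stands in for the directory table. The class name, method names, and the "pick the least-loaded db" placement rule are all hypothetical, just to show that moving a user only means rewriting one pointer.

```python
class UserDirectory:
    """Hypothetical pointer table: user_id -> name of the database holding that user."""

    def __init__(self, databases):
        self._databases = list(databases)
        self._location = {}                          # user_id -> database name
        self._load = {name: 0 for name in self._databases}

    def create_user(self, user_id):
        # The pointer is written once, when the user is created.
        if user_id in self._location:
            return self._location[user_id]
        target = min(self._databases, key=lambda name: self._load[name])
        self._location[user_id] = target
        self._load[target] += 1
        return target

    def lookup(self, user_id):
        return self._location[user_id]

    def move_user(self, user_id, new_db):
        # Rebalancing (e.g. a hot or growing user) only updates the pointer here;
        # copying the user's rows is assumed to happen elsewhere.
        old_db = self._location[user_id]
        self._load[old_db] -= 1
        self._load[new_db] += 1
        self._location[user_id] = new_db


directory = UserDirectory(["db-a", "db-b"])
first = directory.create_user(1)
second = directory.create_user(2)
directory.move_user(1, "db-b")
third = directory.create_user(3)
```

The trade-off is an extra lookup per request, which is usually cheap to cache since the pointer rarely changes.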


It IS an original source. It's not just a repost or a report on another older article. It's a reworking of those articles.


I suspect they mean it’s a secondary source, not a primary one.


I wonder how they horizontally scaled shards. If they had 2k logical shards, they probably had far fewer real/physical shards, so a single database was holding many sharding keys. When a new physical shard gets added, the data needs to somehow be replicated. That is only true if reading is the problem. If only a relatively short time period of data is hot, you can probably just move the logical shard IDs over to the new physical shard without moving existing data. This requires keeping track of when each physical shard became active.
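To make that last idea concrete, here is a minimal sketch of time-versioned routing, assuming each logical shard keeps a history of (activation time, physical host) and rows are located by their creation time. The class, host names, and timestamps are hypothetical; I have no idea whether Instagram did anything like this.

```python
import bisect


class ShardRouter:
    """Hypothetical router: logical shard + row creation time -> physical host."""

    def __init__(self, num_logical, initial_hosts):
        # Per logical shard: parallel, time-ordered lists of activation times and hosts.
        self._times = [[0] for _ in range(num_logical)]
        self._hosts = [[initial_hosts[i % len(initial_hosts)]] for i in range(num_logical)]

    def reassign(self, logical, host, active_from):
        """Send rows created at or after active_from to a new host; old rows stay put."""
        if active_from <= self._times[logical][-1]:
            raise ValueError("activation times must be strictly increasing")
        self._times[logical].append(active_from)
        self._hosts[logical].append(host)

    def host_for(self, logical, created_at):
        # Find the latest assignment whose activation time is <= created_at.
        idx = bisect.bisect_right(self._times[logical], created_at) - 1
        return self._hosts[logical][idx]


router = ShardRouter(num_logical=4, initial_hosts=["pg1", "pg2"])
router.reassign(logical=3, host="pg3", active_from=100)
```

Since the IDs in the linked scheme embed both the timestamp and the logical shard, a router like this could in principle be driven from the ID alone.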


> How can they know the internal infrastructure but have to assume the app language?

At that time that was the only solution that would have made sense given the achieved behaviour, performance AND development effort.

I did spend a lot of time with Cordova (PhoneGap) and all the other HTML5 app thingies for iOS at the time.

Not sure why that particular and, in my opinion, pretty obvious choice bothers you that much. That is very much the reason why they didn't even bother releasing it for Android until almost two years later.


They had a different person/team working on the front end and/or they don't remember?


[flagged]


Enough with the "rules" just because you don't find it novel enough.

The author is on HN and says "Just my own brain reading through old talks and articles from Instagram engineering and Excalidraw for the diagrams. I did my best to put together all the info I learned from them into a comprehensive and simple manner".

You can take it up with them.


If it was compiled from more than one article, it becomes an original article. "The post should contain only novel ideas and info" is an arbitrary requirement and not the meaning of that rule.

Though I would add that if the author were taking anything verbatim, it should be highlighted as a quote with the original source. (edit: reading more, the author has already done that.)


[flagged]


Not OP. No anger in his words. Please don't make OP feel inadequate for expressing themselves clearly.


They sure seem bothered a lot by something trivial (god forbid the post which was NOT made for HN anyway quoted some original sources and didn't go into the detail they'd like it to).

Somebody took the effort of compiling an article from several sources, and we're throwing the rulebook at them.


The author is not the submitter. The rulebook is being thrown for submitting, not for writing.


Unless you're a mind reader you are way out of line with this comment.


Looks like those WordPress bots have finally figured out HN as well.


[flagged]


Ah, but that's just the sort of thing an AI WordPress bot would say.


Looks like someone may have been using ChatGPT to produce that post.


Author here. No ChatGPT was used.

Just my own brain reading through old talks and articles from Instagram engineering and Excalidraw for the diagrams.

I did my best to put together all the info I learned from them into a comprehensive and simple manner.


Actually, ChatGPT is very useful for fixing your writing.

Sometimes when a paragraph I write reads a little too harsh to the ear, I ask ChatGPT to rewrite it. It's still my original thought.

It's really effective, but I tend to tone it down a bit to sound like myself since the output can be too formal, dry, and "academic".



