Okay. I was at Amazon too in that time frame, and I don't recall the Sun to Linux transition to be all that big a deal.
Amazon was always at that time facing challenges on every front, the dot-com bubble breaking etc. There was a large build out of distribution centres around this time that was very expensive.
This was just one more of the challenges they faced. I may not have been senior enough to be privy to all the details, but I don't think this particular migration brought the company to near bankruptcy. That's a stretch.
I think the big migration was the database, and it took time, and Oracle was having some difficulties with running on Linux. But we were already using Linux boxes on our desktops since 1999. Webservers had switched to Linux around 1999. And migrating the core code base, which was mostly C/C++ with Java starting to appear, to Linux was mostly a challenge for the infrastructure teams and one of those routine large projects that every company has. I don't recall anybody talking of bankruptcy for this reason.
That series of posts should be seen in the context of someone who was working on a Solaris to Linux migration (like EVERY company at the time) trying to claim credit for creating cloud computing. This is NOT the origin story for EC2, which started as a skunkworks project in Cape Town.
EC2 wasn't the first AWS service though it is the most fundamental one.
AWS (IaaS) was started after Andy Jassy, who was Chief of Staff or Technical Assistant to the CEO at the time (2003/4), came up with a different vision for "Web Services" [0] to solve web-scale infrastructure problems every single project at Amazon then faced (and so, couldn't ship fast). That is, until Andy reinvented it, "Web Services" was simply limited to access to Amazon's catalog of SKUs [1][2]. As (may be) part of "Web Services" they also ran Target, AOL (shop@AOL), and toys-r-us' e-commerce stores [2]. Mechanical Turk showed up in 2004/5, S3 followed early 2006, whilst SQS, EC2 launched mid 2006 [3], and SimpleDB late 2007.
SQS I believe started in 2004, but in a very limited capacity. I think that MTurk also started in 2004.
Andy Jassy is the one that came up with the vision for AWS. S3 was called S4 at the beginning! There was a big vision on the payments part, but I think S3 + EC2 took off faster than anyone expected, and I credit these two as the main reason why (IMHO) AWS has been so incredibly successful.
I had the privilege to work and interact with Andy Jassy in my past (I was at AWS 2008-2014 and I met him and sat with him numerous times, despite not being based in Seattle), and I can say that I'm pretty sure I've never seen a leader as focused, as effective, as visionary as Andy.
The "2 pizzas team" structure at Amazon and therefore at AWS was also very conducive to running many experiments and failing fast.
It wasn't linked to DARPA, but it was developed as an experimental project by an autonomous team in the South African office led by Chris Pinkham, and initially Amazon were quite secretive about it.
"""The designation "skunk works" or "skunkworks" is widely used in business, engineering, and technical fields to describe a group within an organization given a high degree of autonomy and unhampered by bureaucracy, with the task of working on advanced or secret projects. """
Amazon EC2 was developed mostly by a team in Cape Town, South Africa led by Chris Pinkham. Pinkham provided the initial architecture guidance for EC2 and then built the team and led the development of the project along with Willem van Biljon.
Porting code from SPARC to x86 is much easier than the other way around. Most x86 instructions allow unaligned reads/writes, whereas SPARC will SIGBUS on unaligned access (at least on the processors from the late 1990s). The SPARC memory model is also more lax than x86, so x86 hides more concurrency bugs than x86 does.
By default yes, though I worked on a codebase which was ported from M68k to SPARC which had a lot of misaligned memory access. You could opt into having "packed" structs with #pragma pack and avoid the runtime error by compiling all the code with --misalign.
As I recall, this had a performance penalty since memory reads/writes are now byte-by-byte. Also it could make code which assumed atomic reads/writes of 32 bit quantities go awry.
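For anyone porting code like this today, the usual portable way around the alignment trap is to copy the bytes out with memcpy instead of casting a pointer. A minimal sketch in C; the helper names read_u32_unaligned and write_u32_unaligned are made up for illustration, and values are in host byte order:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value from an address that may not be
 * 4-byte aligned. Casting such an address to uint32_t * and dereferencing
 * it is undefined behaviour in C and is the kind of access that raises
 * SIGBUS on strict-alignment CPUs; memcpy has no alignment requirement. */
static uint32_t read_u32_unaligned(const unsigned char *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: the matching store. */
static void write_u32_unaligned(unsigned char *p, uint32_t v)
{
    assert(p != NULL);
    memcpy(p, &v, sizeof v);
}
```

Calling write_u32_unaligned(buf + 1, x) followed by read_u32_unaligned(buf + 1) round-trips x regardless of how buf is aligned. Like the byte-by-byte approach described above, this gives no atomicity guarantee for the 32-bit quantity.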
On the other hand, on Sun machines you could free() the same pointer twice, but on Linux, it would crash. This led to some small effort during early ports.
Instead of "Sun machines", I think you mean Solaris (or SunOS). Sun made both SPARC and x86 machines. You can run Linux on x86 machines. For a bit, I ran Solaris 7 on my x86 desktop.
This double-free behavior was a part of the memory allocator implementation, not a feature of the processor. Linking (or LD_PRELOAD'ing) against another memory allocator could have gotten you the other behavior on either platform.
Note that even with the Solaris memory allocator not corrupting state or abort()ing if you double-freed a pointer, if a second thread in the same process had called malloc() and gotten that same block of memory in the middle of your double-free, then a later malloc() could also get a pointer to that same block of memory, and you'd unintentionally have two unrelated pointers to the same memory block, almost certainly resulting in memory corruption. One of my first projects at my first job was fixing a bug that turned out to be memory corruption caused by a double-free. Continuing to apparently run fine despite a very serious and hard to debug bug isn't generally considered a good feature.
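One defensive habit that behaves the same on any allocator relies on the fact that free(NULL) is a no-op in standard C. A minimal sketch; the macro FREE_AND_CLEAR and the function release_twice_is_safe are hypothetical names for illustration, and this only protects the one pointer variable, not copies of the pointer held elsewhere:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical macro: free the block and null the variable, so a second
 * release through the same variable becomes free(NULL). */
#define FREE_AND_CLEAR(p) do { free(p); (p) = NULL; } while (0)

/* Usage sketch: returns 1 if the pointer ends up NULL after two releases,
 * 0 if the allocation itself failed. */
static int release_twice_is_safe(void)
{
    int *counter = malloc(sizeof *counter);
    if (counter == NULL)
        return 0;
    *counter = 42;
    FREE_AND_CLEAR(counter);
    assert(counter == NULL);
    FREE_AND_CLEAR(counter);   /* free(NULL): defined to do nothing */
    return counter == NULL;
}
```

It does nothing for the cross-thread scenario above, where another pointer to the same block exists.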
While the SPARC architecture allows for a very relaxed memory model, I believe all SPARC implementations have been TSO (i.e. exactly the same as x86) for a long time (and IIRC even non-TSO SPARCs had mode bits to force system-wide TSO).
> I believe all SPARC implementations have been TSO (i.e. exactly the same as x86) for a long time (and IIRC even non-TSO SPARCs had mode bits to force system-wide TSO).
The internet does indicate that that seems to have been the default, though it was possible to enable more relaxed models starting from v8 (PSO) and v9 (RMO).
Most of the relaxation seems to have been dropped in the meantime:
> SPARC M7 supports only TSO, with the exception that certain ASI accesses (such as block loads and stores) may operate under RMO. […] SPARC M7 ignores the value set in this field and always operates under TSO.
GP may have confused SPARC and Alpha, the latter having (in)famously relaxed approach to memory ordering (uniquely reordering even dependent loads e.g. if you load a pointer then read the value at that pointer you may get stale data).
I seem to have confused which of the 3 memory ordering states was the default and supported on all models. I'm well aware of the Alpha.
> e.g. if you load a pointer then read the value at that pointer you may get stale data
For anyone thinking that sounds horrifying, a single core doing the writes and read won't observe anything out of the ordinary.
The designers of the Alpha thought "Okay, we can simplify and speed up the cache coherency logic if the "happens before" temporal dependency introduced by a memory fence between two writes won't make any guarantees about two reads on another core, unless there's also a "happens before" temporal dependency introduced by a memory fence between the two reads". This sounds totally reasonable, particularly at a time when lockfree data structures weren't as popular. Reads and writes done while holding a mutex work just like any other processor: the memory fence involved in acquiring a mutex guarantees that any reads made while holding the mutex will see all writes made before the previous release operation on that mutex.
The difficulty with the Alpha memory model comes mostly in writing lockfree data structures, where the safety comes from modifying some data structure while no other threads can see it, then atomically changing a pointer (usually with a load-locked/store-conditional or compare-and-swap) to make that data structure available to other threads. On most architectures, you only need a memory fence right before (or as part of) the write operation that changes the pointer, and this happens-before guarantee will be seen by all normal read operations. On Alpha, you also need a memory fence between reading that pointer and chasing it. In C, loading the pointer and chasing it happen in a single expression *p, so you need to load the global pointer into a local variable, execute a memory fence, and then dereference the pointer (or else use inline assembly; C11 atomics came out way, way after the DEC Alpha AXP).
Lockfree data structures became more popular well after the Alpha was released, so it's understandable why the Alpha designers were willing to make these optimizations that don't affect code that correctly uses mutexes to protect global mutable state.
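With C11 atomics, which postdate the Alpha, the publish-then-chase pattern described above can be sketched roughly like this. The struct config payload and the names g_current, publish and read_limit are invented for illustration; an acquire load is used on the reader side as one simple way to get the read-side ordering, not as a claim about how any particular kernel or library did it:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct config {              /* hypothetical payload */
    int version;
    int limit;
};

/* Shared pointer. It has static storage, so it starts out as NULL. */
static _Atomic(struct config *) g_current;

/* Writer: fill in the object while it is still private, then publish it.
 * The release store orders the field writes before the pointer write. */
static void publish(struct config *c, int version, int limit)
{
    assert(c != NULL);
    c->version = version;
    c->limit = limit;
    atomic_store_explicit(&g_current, c, memory_order_release);
}

/* Reader: load the pointer into a local first, then dereference the local.
 * The acquire load provides the ordering between the pointer load and the
 * dependent field load. Returns -1 if nothing has been published yet. */
static int read_limit(void)
{
    struct config *c = atomic_load_explicit(&g_current, memory_order_acquire);
    return c ? c->limit : -1;
}
```

Before any publish() call read_limit() returns -1; afterwards it returns the published limit, and a reader on another thread can never see the pointer without also seeing the fields written before it.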
> The SPARC memory model is also more lax than x86, so x86 hides more concurrency bugs than x86 does.
I think you mistyped this sentence. Did you mean x86 is more lax, so SPARC hides more concurrency bugs? Am I correct that more concurrency bugs would be hidden in a less lax architecture?
> Am I correct that more concurrency bugs would be hidden in a less lax architecture?
No, a more strict (less lax) memory model gives the processor less freedom to re-order memory operations, meaning that missing memory fences (explicit ordering) have potentially less effect vs. more lax memory models.
The SPARC memory model makes fewer guarantees (less strict), allowing the processor more freedom in ordering memory operations, potentially getting more performance. (There's a reason aarch64 went with a more lax memory model than x86, despite being designed decades later.) The downside is that bugs with missing memory fences are more likely to show up.
This has been in the news again lately with Apple Silicon. The ARM architecture has a weaker memory model than x86 in that it does not normally provide "total store ordering". Under that memory model, if thread A executes (from an initially zeroed heap):
a = 1; b = 1;
then thread B can safely execute:
if (b == 1) assert(a == 1);
if (a == 0) assert(b == 0);
x86 provides this guarantee, but ARM does not -- thread B might see the b=1 write before the a=1 write.
Apple Silicon has a runtime toggle (https://old.reddit.com/r/hardware/comments/i0mido/apple_sili...) to provide for that behaviour, which greatly improves performance of translated x86 code (i.e. the translator does not need to insert precautionary memory barriers).
Even on x86, you can't make this assertion without any synchronization primitives (mutexes, etc.). Without synchronization, the a = 1; b = 1; can run between the (a == 0) and assert(b == 0).
Ah, of course you're right. I added that line as a bit of an afterthought and meant for it to be atomic, but of course it isn't. Unthinking parallelism is pitch black, and you are likely to be eaten by a grue.
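For completeness, the first check from the example (b == 1 implies a == 1) can be written so that it holds on ARM as well as x86 by making b an atomic flag with release/acquire ordering. A minimal sketch using C11 atomics; writer and reader are illustrative names standing in for thread A and thread B, and the second check (a == 0 implies b == 0) is deliberately left out because it is racy on any architecture without further synchronization:

```c
#include <assert.h>
#include <stdatomic.h>

static int a;            /* ordinary data, written before the flag */
static atomic_int b;     /* flag; zero-initialised because it is static */

/* Thread A: write the data, then set the flag with release ordering so the
 * write to a cannot become visible after the write to b. */
static void writer(void)
{
    a = 1;
    atomic_store_explicit(&b, 1, memory_order_release);
}

/* Thread B: only look at a after observing the flag with acquire ordering.
 * If the flag is not set yet, nothing is concluded about a. */
static void reader(void)
{
    if (atomic_load_explicit(&b, memory_order_acquire) == 1)
        assert(a == 1);
}
```

On x86 the release store and acquire load compile to ordinary moves, which is why the plain version appears to work there; on weaker models the compiler has to emit ordered instructions or barriers.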
No, a strict concurrency model means things are more likely to be consistently sequenced; that is, you'll encounter fewer actual concurrency artifacts. If this model is made more lax, you might discover that you had UB which the previous model was not exploiting.
You're more likely to luck into the behaviour you want with x86 than a more relaxed memory model. I think the Alpha takes the pip for the latter, but I was only something like 3 months old when the IP was sold to Intel let alone in widespread use, so I could be wrong.
Yes, the DEC Alpha AXP was a beast of a chip family. The Alpha design team made nearly as few guarantees as possible in order to leave nearly as much room for optimization as possible. The Alpha's lax memory model provided the least-common denominator upon which the Java memory model is based. A stronger Java memory model would have forced a JVM on the Alpha to use a lot more atomic operations.
All processors (or at least all processors I'm aware of) will make a core observe its own writes in the order they appear in the machine code. That is, a core by itself will be unable to determine if the processor performs out-of-order memory operations. If the machine code says Core A makes writes A0 and then A1, it will always appear to Core A that A0 happened before A1. As far as I know, all processors also ensure that all processors will agree to a single globally consistent observed ordering of all atomic reads and atomic writes. (I can't imagine what atomic reads and writes would even mean if they didn't at least mean this.)
On top of the basic minimum guarantees, x86 and x86_64 (as well as some SPARC implementations, etc.) have a Total Store Ordering memory model: if Core A makes write A0 followed by A1, and Core B makes write B0 followed by B1, the two cores may disagree about whether A0 or B0 happened first, but they will always agree that A0 happened before A1 and B0 before B1, even if none of the writes are atomic.
In a more relaxed memory model like the SPARC specification or Aarch64 specification (and I think RISC-V), if the machine code says Core A makes write A0 before A1, Core B might see A1, but not yet see A0, unless A0 was an atomic write. If Core B can see a given atomic write from Core A, it's also guaranteed that Core B can see all writes (atomic and non-atomic) that Core A thinks it made before that atomic write.
With the DEC Alpha, the hardware designers left themselves almost the maximum amount of flexibility that made any semantic sense: if Core B makes an atomic read, then that read (and any reads coming after it in machine code order) is guaranteed to see the latest atomic write from Core A, and all writes that came before that atomic write in machine code order. On the Alpha, you can think of it as all of the cores having unordered output buffers and unordered input buffers, where atomic writes flush the core's output buffer and atomic reads flush the input buffer. All other guarantees are off. (Note that even under this very lax memory model, as long as a mutex acquisition involves an atomic read and an atomic write, and a mutex release involves an atomic write, you'll still get correct behavior if you protect all reads and writes of shared mutable state with mutexes. A reader's atomic read in mutex acquisition guarantees that all reads while holding the mutex will see all writes made before another thread released the mutex.) This might be slightly wrong, but it's roughly what I remember of the Alpha memory model.
The thing that confused some programmers with the Alpha is that with most memory models, if one thread makes a ton of atomic writes, and another thread makes a ton of non-atomic reads, the reading thread will still never see the writes in a different order than what the writer thought it wrote. There's no such guarantee on Alpha.
On a side note, the Alpha team was also pretty brutal about only allowing instructions that were easy for compilers to generate and showed a performance improvement in simulations on some meaningful benchmark. The first generation of the Alphas didn't even have single-byte loads or stores and relied on compilers to perform single-byte operations by bit manipulation on 32-bit and 64-bit loads and stores.
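To make the "single-byte operations by bit manipulation" point concrete, here is the mask-and-shift idea in plain C. This is only an illustration of the technique on a 32-bit word, not actual Alpha compiler output; insert_byte and extract_byte are made-up names:

```c
#include <assert.h>
#include <stdint.h>

/* Replace byte `index` (0 = least significant) of a 32-bit word using only
 * whole-word operations: clear the old byte with a mask, OR in the new one. */
static uint32_t insert_byte(uint32_t word, unsigned index, uint8_t value)
{
    assert(index < 4);
    unsigned shift = index * 8u;
    uint32_t mask = (uint32_t)0xFFu << shift;
    return (word & ~mask) | ((uint32_t)value << shift);
}

/* Extract byte `index` (0 = least significant) from a 32-bit word. */
static uint8_t extract_byte(uint32_t word, unsigned index)
{
    assert(index < 4);
    return (uint8_t)(word >> (index * 8u));
}
```

A byte store then becomes: load the containing word, insert_byte, store the word back. That read-modify-write sequence is not atomic with respect to the neighbouring bytes, which matters if another core is writing an adjacent byte at the same time.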
Many of the Alpha design people went on to the AMD K6 III (first AMD chips to give Intel a run for their money in the desktop gaming market), the PA Semi PWRficient (acqui-hired by Apple to start their A-series / Apple Silicon team), AMD Ryzen, etc.
When I bought my first computer in the fall of 1997, the fastest Intel desktop processors were 300 MHz PIIs. DEC Alphas at the time were running at 500 MHz, and had more instructions per clock, particularly in the floating point unit. The Cray T3E supercomputer used DEC Alphas for good reason.
> On top of the basic minimum guarantees, x86 and x86_64 (as well as some SPARC implementations, etc.) have a Total Store Ordering memory model: if Core A makes write A0 followed by A1, and Core B makes write B0 followed by B1, the two cores may disagree about whether A0 or B0 happened first, but they will always agree that A0 happened before A1 and B0 before B1, even if none of the writes are atomic.
> In a more relaxed memory model like the SPARC specification
AFAIK SPARC has always used TSO by default, and while v8 and v9 introduced relaxed memory modes (opt-in), these have been dropped from recent models e.g. M7 is back to essentially TSO-only. While it is backwards-compatible and supports the various instructions and fields, it ignores them and always runs in TSO.
IIRC SunOS/Solaris always used TSO, but Linux originally used RMO, however they switched to TSO once chips that only supported TSO appeared on the market.
I remember early versions of Java not running terribly well on Linux. It ran, but performance was bad. I'm thinking the 1.2 and 1.3 versions. I worked for a startup and did some Java and Oracle work around that time, moving apps from Sun to Linux based demo systems.
Can confirm. Around 2000 - 2001, on RedHat 6 or 7.something Sun's 1.2 JVM was basically unusable: slow and extremely buggy. IBM's JVM (HotJava? - it's all so long ago I can't remember what it was called) was much better: it was a bit faster and it didn't fall over for a pastime. The 1.3 releases were better but still not much to shout about. I can't remember whether the 1.3 Sun JVMs were good enough to allow us to run our software with them at the time.
(Topically, at around this time, we were targeting Solaris 6.x and 7.x based systems.)
Java on Linux was interpreted until 1.3.1/1.4 when the VM, named ExactVM, was replaced by HotSpot which includes a JIT.
At that time, I remember deciding to prototype a B2B store in Java and then rewrite it in C++ once I knew what I was doing. I never rewrote that software because Java became fast enough overnight.
Makes sense. Sun peaked during the dot-com boom. I worked at several different startups, both full time and as side gigs, and they all had Suns. I remember Enterprise 3500's and later 250 and 450's being popular around that time.
After the crash, you had to basically give that stuff away. I knew a guy who had 10 Sun boxes (all less than 5 years old) at home because a local company wrote them off. I had a couple of low end Ultras (an Ultra 2 and Ultra 10) but sold one of them off. The Ultra 2 I got used for pennies on the dollar at an auction. I think it was like $200 for what was a $15K machine years earlier.
Sun was so far ahead of its time. Solaris was a solid OS, but once Linux became "good enough", somewhere in the '99 to '03 timeframe, depending on industry, it was over for them.
To add: this is Red Hat Linux 6 or 7, not Red Hat Enterprise Linux - I fondly remember RHL 7.3 in 2002 as the last RedHat I liked; updates were still through RHN or Red Carpet (don’t remember if yum was already used).
Yes, exactly, when RedHat was still RedHat as it were: I haven't used either Fedora or RHEL since that division took place. Tbh that's more about circumstance than coming at it with any particular axe to grind but, these days, I can't see much of a reason to choose either of them over Ubuntu. Not for the projects I'm involved with, anyway.
I'm not surprised. Early Sparc chips weren't much to write home about. I had a Sparc 10 at home for a while. It made a great X11 desktop. The later UltraSparc systems, like the Enterprise series, were powerful, supported tons of CPUs, memory, and IO but single threaded CPU performance still wasn't stellar.
At one startup, we had like 10 people logged in to a E3500 for Java builds. The thing had 512 megs of RAM, huge for the time, but with 10 people compiling and running java apps with a min heap of 64 megs, you got into swap pretty quick. It had these enormous disk arrays attached. I think it was 2x Sun Storage arrays with 16 disks each, each one being like 20 gigs or something, for a total of 640 gigs before the RAID. We were also running a database instance on the single box (Sybase, I think), which didn't help. A lot of people started running their apps on their Windows desktops.
This was pretty early, late 98 or early 99. We basically had built our own app server on top of Apache JServ, which predated Tomcat.
I worked there for a few years in the early 2000s and it seemed like such an unremarkable company that wasn't going to go anywhere. The stock was always around $20 and whenever it went up, Bezos usually said something that investors didn't like and drove down the stock price (free shipping, search inside the book, etc).
Then there was the Segway deal where I amused myself by checking the orders database to see how many they sold (because the numbers were so small)
If you told me back then that the stock was going to $3000 some day, I would have stuck around for longer.
This was my feeling in 2010 when I accepted an offer. It was a much bigger company by then, but upon joining it still felt stodgy and like any other big corp. I wonder if it still feels that way if you're there (I left in 2013).
It definitely is, it actually made me look back fondly at my time at Oracle if that says anything lol. The fact that it remains an aspirational job (gotta grind leetcode for FANG!) makes me smirk
It is so much easier to spot things in hindsight. We can all see the evidence that supports the conclusion.
Let's take Uber. Their core product is a social media app with mapping, tracking and billing features. It has burnt through an incredible amount of money trying to kill competitors, employing 20,000 or so people directly and claiming that a self-driving electric fleet will someday make them insanely profitable.
Market cap is up almost 80% in the last year.
Who's the idiot? The investor who believes that Uber will be insanely profitable in N years, or the investor who doesn't want any more exposure to Uber than as part of a whole-market index?
It's all easy looking backwards. If you went through the dot-com boom, bust, and the chaos in between, not so much. If it was easy, everyone would've bought Apple and Microsoft stock in the 80's and just let it ride.
I remember meeting some Amazon developers at OOPSLA around 2000 and being pretty unimpressed. Like they told me they didn't really know what they were doing there and were just getting out of the office. Then, 2 years later, they started hiring the best, and the rest is history.
I think hiring top-quality, motivated engineers doesn't just give you good software, it also provides all the idea generation that you're going to need to enter businesses you never thought about before.
The big problem with Linux at the time was large files, and memory addressing limitations of 32bit architectures. x86_64 was still too new and only introduced in 1999. There were other issues as well, that in the intervening 20 years have been more than addressed. Today's Linux is of course far superior to any OS that existed at the time.
All Unix-like systems had exactly the same issues with large file support on 32b architectures. The solution was standardized in 1996 and adopted by most relevant OSes relatively quickly. Another issue is that support for large files has to be opt-in for user space code and this opt-in has to be same for all modules that share anything whose ABI depends on sizeof(off_t).
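On glibc the opt-in mentioned above is the _FILE_OFFSET_BITS feature-test macro, which has to be defined before any system header is included. A small sketch of what that looks like in a source file; off_t_is_64_bit is a made-up helper, and the snippet assumes a glibc-style toolchain:

```c
#define _FILE_OFFSET_BITS 64   /* must come before every system header */

#include <assert.h>
#include <sys/types.h>

/* With the opt-in above, off_t is 64 bits wide even when compiling for a
 * 32-bit target; without it, 32-bit glibc targets get a 32-bit off_t.
 * Returns 1 if this translation unit sees a 64-bit off_t. */
static int off_t_is_64_bit(void)
{
    assert(sizeof(off_t) >= 4);
    return sizeof(off_t) == 8;
}
```

The ABI hazard is that any struct or function signature containing off_t has a different layout depending on whether the macro was set, so a library built without it and an application built with it will silently disagree.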
Server software that needs larger address space than what is conceivable size of main memory became somewhat common only after mass adoption of amd64. In fact, amd64 is first 64bit architecture where it is common to have 64-bit only userspace, essentially all "64b" unix vendors in 90's shipped systems with most of userspace built in 32b mode (IIRC this includes even technically 64b-only OSF/1 AXP^W^WTru64, which while running on CPU that really does not have 32b mode had 32b ABI in the style of linux's x32 ABI).
OP might have been thinking of Itanium (ia-64), a different 64 bit PC architecture from Intel. It was shockingly expensive
There was certainly development on ia64 in the kernel in 1999 under Project Trillian [1][2], although the first chip wasn't released commercially until 2001
Alternatively they may have been thinking about PAE, which was released in 1999 and allowed Linux kernels to address up to 64GB on a 32-bit processor [2]
PAE was also used for Microsoft Windows 2000 Server Datacenter Edition. Server machines at the time capable of holding 64GB RAM were beasts, IIRC on the order of hundreds of thousands of dollars. A desktop machine with 512MB memory was considered way more than you needed. I used a Toshiba Tecra 9000 laptop with 256MB for mostly Java and C++ development and never felt like I was anywhere near maxing it out.
PAE might not have met Amazon's needs for holding their catalog or whatever. If they were trying to directly address a data structure that was larger than 4GB, they would have needed some sort of trickery as PAE was usually implemented kernel side to provide separate 4GB address spaces to individual processes with more than 4GB total physical backing.
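The numbers fall straight out of the address widths: the original PAE widened physical addresses to 36 bits while each process kept a 32-bit virtual address space. A tiny sketch of the arithmetic in C; addressable_bytes is a made-up helper and the 36-bit figure is background knowledge rather than something stated in this thread:

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes reachable with an address of the given width. */
static uint64_t addressable_bytes(unsigned address_bits)
{
    assert(address_bits < 64);
    return (uint64_t)1 << address_bits;
}
```

addressable_bytes(32) is 4 GiB, the most a single process could map at once, and addressable_bytes(36) is 64 GiB, the most physical RAM the kernel could hand out across all processes.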
From replies to that thread, it looks like there are multiple errors.
One reply is from one of the original engineers working on EC2, who said this swapover had nothing to do with AWS and its origins, and another reply is from a former distinguished engineer stating that they swapped from Compaq/Digital Tru64 Alpha servers, not Sun.
Sometimes there are a lot of people that experience part of the story that there can be multiple versions of the truth, each seemingly different, but all still fitting in with a larger narrative.
In 2012 I joined Amazon in a small data analytics org within the supply chain. We were tasked with reducing split shipments: multiple-item orders that could not ship out in a single box. The top reason for splitting a shipment was that we simply didn't have the right combination of items in a single fulfillment center. By the time I had gotten there, my manager had already successfully implemented the first transshipment network, moving items from one warehouse to another so they could ship out with the right combination of items and reduce a split shipment. But by then we were reaching diminishing returns on transshipment, and splits were still rising nationwide, and I was given the chance to provide analysis and new insights.
I probably ran a million analyses that year, but one of the most salient was that the major reason for splitting shipments was that we sold so few of one of the items that it didn't make sense to keep more than one or two in stock nationwide. Those types of tail items were typically stocked in one of three fulfillment centers, the largest of the fulfillment centers in three regions of the US. And the number of orders getting split shipments between those three fulfillment centers was massive...more than enough to cover the cost of daily cargo plane transshipments between the three fulfillment centers.
I ended up leaving that team and moving on, eventually leaving Amazon completely. Right about the time I left, Amazon made a huge announcement: Amazon Prime Air (eventually renamed Amazon Air to distinguish it from the drone delivery idea). The press releases made it sound like they were launching it to deliver packages nationwide, but a quick call to my former colleague confirmed that upon launch, the only items they were moving were unpackaged inventory ... cargo plane transshipment, as my analyses had pushed for a few years earlier. Since then, they've expanded to actually moving customer shipments, but the service was initially entirely justified based on an analysis of split shipments I had done years earlier.
I say this because all humans like telling stories about themselves and the oversized contribution they had on something that made it big. It's cool, and it makes us feel important. But I was hardly the only person that worked on that project. There were likely dozens of others who were pushing for cargo planes for different reasons and maybe the transshipment story was enough to tip it over the edge. There were many more who bought into a crazy idea really early and came to own it, at least far more than someone who did a throwaway analysis years earlier. And they all probably have different versions of the same story. And they can all be true simultaneously, at least true enough to matter. Because none of us have a perfect recollection of the past...we all have a history written into our minds that is colored by the experience that we lived.
> Sometimes there are a lot of people that experience part of the story that there can be multiple versions of the truth, each seemingly different, but all still fitting in with a larger narrative.
This also applies in families, in retelling of historical events.
In addition, the capability of the story teller to enthrall can cause their version of the story to be weighted more strongly than others.
This can lead to a situation where the story told by the better “showman” gets an outsized credence.
Beware one person’s recollection, and recall that contemporaneous notes are the next best thing to a recording because it is that hard to remember what happened.
If your goal is influence, you must be at least as good of a story teller as someone else talking about something you have a memory of.
No doubt Solaris on SPARC was more reliable (particularly on higher-performance boxes) than Linux on x86 at the time, but tons of decent-sized hosting-type companies were running on Linux in the late 90s, and I worked at one. Obviously at Amazon's scale this was a massive migration, and giving up the support of Sun would be hard, but, hell, Google was launched on Linux at the time too.
That said, I was supporting Sparc Solaris through the late 2000s and the Oracle workloads were the last thing to move to Linux most places. The sheer power of the Sun boxes, along with the single-throat-to-choke was fairly unbeatable. No hardware vendor to look down their nose at you for using a "hobbyist" OS and imply that the problem must be at the Operating System level.
And, of course, you couldn't really replace those huge Sun Fire 6800 type servers with Linux.
Probably audience reach, I'm guessing tweeting the information directly as opposed to a link to a blog results in more engagement.
As for why Twitter specifically it's because that's where their followers are, it will get the most views presumably vs. other platforms or mediums.
FWIW I personally enjoy tweetstorms, I'm not sure why. I think the break in between each tweet is nice, it's like reading a series of messages rather than a blog post and I find myself more engaged. It also means the author has to be fairly concise
This is largely why I do it. I have half a million twitter followers, and while I can write it on my blog and then tweet a link, it is nowhere near as effective as just publishing the content directly on Twitter.
FWIW, I used to despise it when people did this (post massive threads of messages), but frankly I think they fixed the UI for this years ago, and just don't understand the complaints anymore: you click the link, and can scroll through messages as if it were an article.
Between it auto-flowing the messages and the longer tweet length of 280 characters, I honestly don't see any advantage to those "thread readers" people keep linking to... they don't even have larger information density, which is ridiculous. (Hell... they frankly seem to make the text harder to read!)
(And yeah: I carefully write my tons of text on Twitter such that every single message is a carefully enough crafted thought that they could just about stand alone, which I think is actually good for the content.)
> frankly I think they fixed the UI for this years ago, and just don't understand the complaints anymore
"fixed the UI" is a big overstatement - they improved the UI from utterly horrible to barely usable (for this purpose).
> you click the link, and can scroll through messages as if it were an article.
Imagine if someone wrote an article by creating a phpBB forum thread and posting every couple of sentences as a post on the thread. That's what this is like, for the rest of us - since you have a half million followers, your brain is probably used to filtering out all the crap and parsing the posts quickly. But the vast majority of us don't use Twitter as much or at all, and this adds a mental tax for all of us.
That's why the thread readers are so successful - you're not the target audience of them, but most of the rest of the Internet users are.
I never use Twitter but I have an account and the app installed on my iPhone. This particular story displayed in an extremely easy to read format for me on my phone. Was very similar to reading a blog page.
Wow, it is SO weird to see the name "saurik" replying to my comment.
I remember in middle school jailbreaking my iPod touch to install all kinds of cool tweaks from Cydia, and I'm sure that exposure led me to continue messing around with electronics.
Thank you for all you did for the community, I attribute a lot of my interest in computers and coding to the early jailbreak days.
> you click the link, and can scroll through messages as if it were an article.
No, you cannot. Oftentimes the posts are not in order. You have to expand multiple times, and if you are unlucky you get pulled in another direction and miss half the story.
Unreadable isn't hyperbole. Maybe you have to have an account for it to be usable?
Note: the posted twitter-thread is the first I've seen that actually works for me. Not sure why that is.
> I honestly don't see any advantage to those "thread readers"
Do you typically read Twitter in an app or via the web?
I find it annoying enough to read Twitter on the web - that I don't. Part of that is the fact that Twitter spends sizeable screenspace nagging me to log in - which presumably isn't a problem if you have an account.
I too find this type of posting horrible, and generally avoid opening these. Part of my problem is specifically with the Twitter UI and slow load times though, so when I do feel like it's worth opening, I use a Nitter redirect addon in my browser. Loading via Nitter instances is much faster for me, and works without even needing JavaScript.
The most addictive games are those with a small marginal commitment of playing another game. It's incredibly easy to rack up hours on such games, yet if you asked people if they were prepared to commit hours to the game they'd be likely to say no.
Twitter is even worse. Not only does it not make the time commitment clear up front, the tweets are short enough that you read them automatically. Then you're hooked into the doom scroll as you keep automatically reading one after another.
As a writer it means you don't need to bother with structure and other classical writing skills to keep your reader interested. You can just write disconnected chunks of text and rely on twitter to keep them interested.
There's a lot of revisionist history around what AWS allowed.
RackShack and other companies offered pretty cheap (started at $99/month IIRC) dedicated Linux servers which loads of startups used before AWS. There were also various virtual machine providers in the market.
These articles make it sound like every company before AWS had to build their own datacenter or at least buy a million dollars worth of Sun hardware to put a website online.
IBM mainframe virtual machines say hello => "It is directly based on technology and concepts dating back to the 1960s"
https://en.wikipedia.org/wiki/Z/VM
The biggest thing that AWS offered in the beginning was extreme proximity to the backbone. Their services have grown by the dozens, and other hosting services have followed suit, but they remain (as far as I know) one of the closest points [ EDIT: without co-location ] you can put a VM to the backbone.
Right out of school (early 2000s), I did network modelling consulting. My main client bought a Sun Fire V1280 box (12 CPUs, and the server room guys hated me for the amount of power it drew/heat it put out) to run network simulations. As I remember, the two main advantages of the SPARC and POWER systems sold at that time over x86 were (1) reliability and (2) memory / I/O bandwidth.
I tried to look up who first said something along the lines of "A supercomputer is a tool for turning a compute-bound problem into an I/O-bound problem" and all I could find was an ArsTechnica article referencing it as an old joke. I thought it was an old joke by someone of the stature of Cray or Amdahl, but maybe that's apocryphal.
In any case, our network simulations attempted to perform a min-cut (using number of packets as the cost metric) to partition simulation nodes among threads, run one thread per core, and pin a thread to each core. With a network simulation, though, as you can imagine, the problem was usually bound by memory bandwidth.
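To make the partitioning idea concrete, here is a minimal sketch, not the simulator's actual code. It assumes a tiny, made-up traffic graph (`packets`), brute-forces a balanced two-way split that minimizes the packets crossing the cut, and shows how a worker could be pinned to a core on Linux with `os.sched_setaffinity`. All names are hypothetical, and a real simulator would use a proper partitioning heuristic rather than brute force.

```python
import itertools
import os

def cut_cost(packets, part_a):
    """Total packets exchanged between part_a and the remaining nodes."""
    return sum(count for (u, v), count in packets.items()
               if (u in part_a) != (v in part_a))

def best_bisection(nodes, packets):
    """Brute-force the balanced two-way split with the cheapest cut.

    Only feasible for tiny graphs; it stands in for a real partitioner.
    """
    nodes = sorted(nodes)
    half = len(nodes) // 2
    best = None
    for combo in itertools.combinations(nodes, half):
        part_a = set(combo)
        cost = cut_cost(packets, part_a)
        if best is None or cost < best[0]:
            best = (cost, part_a, set(nodes) - part_a)
    return best

def pin_current_thread(core):
    """Pin the calling thread to one core (Linux only; no-op elsewhere)."""
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, {core})

# Hypothetical traffic: packets exchanged between pairs of simulated nodes.
packets = {("a", "b"): 90, ("c", "d"): 80, ("b", "c"): 5, ("a", "d"): 3}
cost, left, right = best_bisection("abcd", packets)
```

With this toy input the chatty pairs (a, b) and (c, d) end up on the same side, so only the light b-c and a-d traffic crosses between threads.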
Also, the client was a defense contractor, and the 400k+ USD box was an artefact of "use it or lose it" budgeting. If they didn't spend all of their budget by December 31st, the group's budget would be reduced the following year. In their case, the high cost of the box was a feature.
What is a bit funny is after fighting for $1,000 during the year, some budget manager calls in a panic near year end - how fast can you spend $150K - on anything you want. Turns out - as Amazon has noticed with cheap stuff, whoever has stuff in stock and can ship quick gets the deal - price totally irrelevant.
There was a while during the stimulus spending where politicians were I guess getting heat for not spending stimulus money fast enough, so the pressure to spend got pushed down big time. Every layer was trying to push spending out as fast as they could.
For those not familiar with this area and wondering why a higher-level agency would push spending rather than try to save money:
A lot of govt budgets between appropriation and final vendors are based on % of allocated spending. So a state agency will take 10%, a local agency will take 10%, and then it ends up in the hands of whoever does the work. Everyone above has built their budgets on all the money being spent. They've staffed for that.
So if you show up and spend 50% of your allocation, now they only get 50% of their 10% and can be in a real jam, particularly if they have already spent their entire expected amount or committed to spend it (staff have been hired). It can be hard to re-budget amounts between vendors / recipients so the pressure on whoever didn't spend all their money can be intense, because it's messing up a whole chain of other budgets.
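The arithmetic can be sketched as follows. This is only an illustration under assumed numbers: a hypothetical $1,000,000 allocation, a single agency layer taking 10%, and a recipient that spends half. Integer percentages are used to keep the sums exact.

```python
def overhead_shortfall(allocation, overhead_percent, spent_percent):
    """Return (expected, actual, shortfall) overhead for one agency layer.

    The agency budgets for its cut of the full allocation, but only
    collects its cut of what is actually spent.
    """
    expected = allocation * overhead_percent // 100
    actual = expected * spent_percent // 100
    return expected, actual, expected - actual

# Hypothetical figures: $1,000,000 allocated, 10% overhead, 50% spent.
expected, actual, shortfall = overhead_shortfall(1_000_000, 10, 50)
```

Here the agency staffed for $100,000 of overhead but collects only $50,000, which is the hole the pressure to spend is meant to avoid.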
Well this was an interesting thread to see pop up.
Hi, I'm Zack, I was a Sr. Unix Administrator at AMZN in 1999 and worked alongside very talented folks like snovotny, lamont, yvette, grabatin, gborle, jamison, craig, and so many others. The responses from Benjamin Black and Peter Voshall are correct.
We definitely were on DEC Tru64 at this time. Sun was running the big databases like ACB. I recall worries that obidos might not be able to build in a few months' time, so the engineer with the little corgi (Erik?) spiked out a Linux build and then it was Linux pizza boxes as quickly as we could build them.
We built CMF and other things to handle mass provisioning but it was chaotic for quite a while. I don't recall anyone talking about what would later become AWS in those days.
He is wrong and is corrected later on in that tweet. At that time, AMZN was mostly DEC Digital Unix. The DNS and mail servers were Linux in 97. AMZN started with SUNW (pre 97), but switched to Digital Unix because it was 64bit and could fit the catalog into RAM.
No. The entire fleet was Compaq/Digital Tru64 Alpha servers in 2000 (and '99, and '98). Amazon did use Sun servers in the earliest days but a bad experience with Sun support caused us to switch vendors.
Well, like every internet company in 99, there were Sun servers. There was a lone Sun workstation that printed some of the shipping docs in LaTeX. I believe that was left by Paul Barton Davis. By early 97, the website (Netscape) and database (Oracle) ran on DEC Alpha hardware. Peter is wrong about switching to Digital Unix because Sun had bad support. The switch happened for 64bit reasons.
There was almost a 24 hour outage of amazon.com because Digital Unix's AdvFS kept eating the Oracle db files. Lots of crappy operating systems in those days.
I worked at a company that thought they had bought their way to reliability with Sun, Oracle, and NetApp but we had a three-day-long outage when some internal NetApp kernel thing overflowed on filers with more than 2^31 inodes. Later the same company had a giant dataloss outage when the hot spare Veritas head, an expensive Sun machine, decided it was the master while the old master also thought so, and together they trashed the entire Oracle database.
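For readers wondering why 2^31 is such a common breaking point: the sketch below shows what happens when a count is stored in a signed 32-bit integer. It is only an illustration of that general failure mode; the actual NetApp bug is not described here, and assuming it was a signed counter is a guess.

```python
import ctypes

def as_int32(value):
    """Interpret value as a signed 32-bit integer, wrapping the way
    two's-complement hardware does."""
    return ctypes.c_int32(value).value

last_ok = as_int32(2**31 - 1)   # largest value a signed 32-bit field holds
wrapped = as_int32(2**31)       # one more and it wraps to a negative number
```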
Both hardware and software in those days were, largely, steaming piles of shit. I have a feeling that many people who express nostalgia for those old Unix systems were probably 4 years old at the relevant time. Actually living through it wasn't any fun. Linux saved us from all that.
My fingers still habitually run `sync` when they're idling because of my innumerable experiences with filesystem corruption and data loss on Linux during the 1990s. There were just too many bugs that caused corruption (memory or disk) or crashes under heavy load or niche cases, and your best bet at preserving your data was to minimize the window when the disk could be in a stale or, especially, inconsistent state. ext3, which implemented a journal and (modulo bugs) guaranteed constant consistent disk state, didn't come until 2001. XFS was ported to Linux also in 2001, though it was extremely unreliable (on Linux, at least) for several more years.
Of course, if you were mostly only serving read-only data via HTTP or FTP, or otherwise running typical 90s websites (Perl CGI, PHP, etc, with intrinsically resilient write patterns[1]), then Linux rocked. Mostly because of ergonomics and accessibility (cost, complexity); and the toolchain and development environment (GNU userland, distribution binary packages, etc) were the bigger reasons for that. Travails with commercial corporate software weren't very common because it was uncommon for vendors to port products to Linux and uncommon for people to bother running them, especially in scenarios where traditional Unix systems were used.
[1] Using something like GDBM was begging for unrecoverable corruption. Ironically, MySQL was fairly stable given the nature of usage patterns back then and their interaction with Linux' weak spots.
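The `sync` habit has a per-file equivalent that programs can use themselves. A minimal sketch, assuming a hypothetical helper `durable_write` and a throwaway temp file: flush the userspace buffer, then ask the kernel to push the file's data to disk with `os.fsync`. (On Unix, `os.sync()` is the whole-system counterpart of running `sync`.)

```python
import os
import tempfile

def durable_write(path, data):
    """Write bytes to path and ask the kernel to flush them to disk."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # push Python's buffer to the OS
        os.fsync(f.fileno())   # ask the OS to push its cache to the device

# Hypothetical file used only for illustration.
path = os.path.join(tempfile.mkdtemp(), "notes.txt")
durable_write(path, b"saved\n")
```

This shrinks the window in which a crash leaves stale data on disk; it does not protect against the filesystem bugs described above.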
Multiple folks on Twitter hinted at inaccuracies in Dan Rose’s recollection of events at Amazon. In fact, when you mentioned Paul Davis, I realized I was looking through the comments to see him [1] point out these inaccuracies since he is known to hang out here on HN.
We had 2-3 Sun boxes at Amazon Germany in '99/2000, but I'll be honest, it was a pet project by the local IT director. Even having a different shell on those annoyed me. Compaq/DEC Alpha was used for customer service, fulfillment etc.
It didn't use LaTeX anymore by the time I left in early 96. I had already templated the PDF that LaTeX used to generate, and written C code to instantiate the template for a given shipment.
Cool. I don't think I understood that aspect back then. I was tasked to look at converting the Sun box sitting in the sodo DC to something else. I logged in and found LaTeX but didn't understand how it all fit together.
Switching to Linux in 2000 for something as busy as Amazon.com was a really ballsy move. I remember fighting kernel panics and hangs under load on RH7 with Java workloads. This was pre-RMAP VM, and I vaguely remember the JVM was able to trigger Linux VM scalability issues due to the large number of anonymous page mappings.
For all the money they were charging, Sun really did put a lot of engineering effort into Solaris and SPARC.
We (the Linux kernel community) put a lot of work into scalability back then. One time we delayed a Red Hat release several weeks due to a persistent bit flip on an NFS filesystem on one of our stress tests that we thought was a kernel bug. It ultimately turned out to be a broken Cisco switch that was happily fixing up the CRC and checksums on the NFS packets it was bit flipping.
Of course. Sun was ahead of the curve in OS scalability and SMP hardware departments and for some time Solaris was still the best OS to run databases and Java.
The Sparc2k was a beast. I got one in 1998 that (among other tasks) ran the IRC bot for our channel with a 4GB text quotes database to chain off of. ("Shut up, lucy"). It usually ran at a load average of 4 to 12 per CPU. I had just over 1TB of disk array on it, in 4 dozen full-height, 5 inch 23GB SCSI drives.
I sank about $10k into that system in eBay parts. At the time there just wasn't any way to get such bandwidth out of x86, but used gear was cheap, if you could afford the time, sweat and blood to keep it running.
I used to work at a shop that had hundreds of X4100's running x86 Solaris 10. It was a pleasure to work those hosts. Solaris had some really cool stuff like ZFS, dtrace, Zones/Containers, SMF, etc.
And now you read the release notes with dread as you witness more and more functionality being dropped because they choose not to maintain it. Luckily there is FreeBSD.
Yeah it’s a shame. If Sun had made Solaris (both SPARC/x86) free sooner, they might have prevented the mass exodus of companies moving off to save money.
In retrospect, they should have gone down the Sun386i path back in the day, instead of changing gears and spending $$$ building up the SPARC architecture and ecosystem.
I remember reading Sun’s financial statements at the time and bragging about huge 50% margins and thinking that can’t last. In some alternate timeline, Sun would’ve lowered their prices, stayed in the game, and we’d all have rock solid Sparc laptops by now.
If you read the other DE's response in the tweets, it was DEC/Compaq Alphas. I think it's kind of insane to have gone from 64b back to 32b (and the DE claims it caused big issues). Seems like API alphas running Linux would have been a better play until Opteron made it clear amd64 had a lot of runway.
Linux wasn't shocking or daring back then. Literally everyone could see that proprietary unixes (and their locked-in hardware) were doomed. "Attack of the killer micros" had already succeeded. Exactly which micro was still somewhat unsettled.
It depends upon your environment; it wasn't clear to many people.
I thought that proprietary unix was doomed when I started an internship at a bank's IT ops department in 2005. The seasoned Solaris admins I worked with had commissioned some Linux servers the year before in a non-prod setting and had written it off as a toy...
In 2005 Nokia was still using HP-UX as our main UNIX deployment platform, and doing a Solaris port of NetAct cluster software; GNU/Linux was indeed still seen as a toy.
I worked at ETRADE around the same time and led the migration from Sun to Linux on the software side; switching everything to Apache was once a radical idea. We switched to IBM machines and not HP. A Sun server cost around $250k, and the equivalent Linux server was $5k and outperformed the Sun machines. It was still considered risky by CTOs.
During the dotcom boom there was huge competition for ordering Sun machines and it was difficult to get enough; we were competing with eBay to get our hands on machines.
During the crash, we went back to eBay as a seller and couldn't offload all the old Sun machines.
The last comment stated that Amazon recently spent a lot of effort ripping out Oracle. I would be interested in knowing how that was done and how much better things got afterwards.
One way things got better is they just migrated to pre-existing, externally available AWS services for many applications. Being able to sell them to Amazon teams at internal cost, getting performance and scale benefits and being able to use AWS primitives for orchestration is a major win.
Fairly certain they're alluding to Amazon's company-wide effort to get off of Oracle databases, with most teams opting to use DynamoDB instead. AFAIK it took about 4 years.
Source: interned / worked at Amazon for a bit a few years back
In at least a few orgs I've seen the switch was to RDS (usually postgres) instead of dynamodb because they couldn't justify an entire redesign of the schema at the same time as the switch. It was a huge effort with a lot of late nights especially at the very end of the cutover.
They definitely didn't replace oracle with dynamo, those are very different beasts and would require complete redesigns of the migrated service. RDS sounds much more appropriate/realistic.
Part of the reason for opting towards dynamodb was its essentially unlimited horizontal scaling. I know at least 1 team had scale that was just too large for traditional relational DBs, so dynamodb it was.
The other part was a large amount of sql usage was essentially just as a key/value store anyway, and dynamodb was vastly cheaper.
I agree that RDS is much more akin to oracle, but in my (extremely) limited exposure to teams doing the migration, the majority were going to dynamodb. As another poster replied, it was clearly a different story in different orgs.
I don't know how much things got better from a performance standpoint - Oracle is pretty good on that front - but they built their own Aurora database, adding all the missing features to it and then migrating to it. (There are a bunch of blog posts on Aurora; look it up.) Knowing what I know about Oracle licensing, it must have gone pretty well on the money front!
> Jeff started to think - we have all this excess server capacity for 46 weeks/year, why not rent it out to other companies?
If Amazon rented out excess capacity after Nov/Dec, what did they do for next year's Nov/Dec peak? Did they build up more capacity throughout the year and not rent it out?
AWS is in their 1970's IBM phase. Riding high, wealthy, good reputation, not much real competition. In 10 years they will be knocked on their heels as they fail to adjust to a new industry change.
My suspicion is that eventually AWS will become the "mainframes" of tech, and a new software+hardware paradigm will make all their services obsolete and their hardware too expensive. My money's on the return of store-and-forward communications. Once your watch has as much power and storage as a SunFire T2000, and every IoT device can store terabytes of data on gigabit networks, who the hell needs redundant servers?
AWS is not the end of history. The hardware architecture still follows the PC lineage far too closely, which is completely insane, and now that electricity rather than hardware costs predominate, the calculus has shifted tons.
"I was at Amzn in 2000 when the internet bubble popped. Capital markets dried up & we were burning $1B/yr."
"Amzn came within a few quarters of going bankrupt around this time."
"Amzn nearly died in 2000-2003."
Google's current CFO, Ruth Porat, saved Amazon during the dot-com crash [1]: she advised them to go to Europe to raise money, otherwise they would face insolvency.
"Sun servers were the most reliable so all internet co's used them back then, even though Sun's proprietary stack was expensive & sticky."
A few things in the article start to rhyme a bit with where NVIDIA is at today. An expensive, proprietary, sticky stack. Purchased in large quantities by startups...
An acquaintance who was at Sun and now NVIDIA tells me "Please don't be Sun... please don't be Sun!"
Granted, NVIDIA is also in a great position in the gaming market, so they look pretty diversified and safe to me.
> even though Sun's proprietary stack was expensive & sticky
What is he referring to?
I'm in the semiconductor industry and in the 1990's everyone had a Sun workstation on their desk. We had large 16 CPU Sun servers in the closet for bigger jobs. Since everything was either command line or X11 it was easy to set your DISPLAY variable and display things remotely.
All of the EDA (Electronic Design Automation) tools ran on Sun / Solaris. Some of these tools from Cadence and Synopsys have list prices of over $1 million per license.
Those big Sun monitors with DB13W3 connections were expensive. We started putting Linux / x86 boxes with 21" monitors on engineers' desks and running programs remotely from the big Suns. We even started stacking the old desktop Sun workstations in the closet and running stuff remotely from them because people liked the bigger PC monitors.
By 2000 the EDA software companies were recompiling their software for Linux. I talked to a few and they said it was a pretty easy recompile. x86 was getting to be faster and so much cheaper.
We added a couple of -sun and -linux flags to the CAD wrappers. Everything was still connected over NFS with the same shell and Perl scripts. Honestly the transition was pretty seamless. When AMD released their first 64-bit chips we bought a lot of them and retired our final Suns that were only around for >4GB jobs.
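For anyone curious what such a wrapper flag looks like, here is a rough sketch. The original wrappers were shell and Perl scripts; this Python version is purely illustrative, and the install paths, the `cadwrap` name, and the default platform are all assumptions. The idea is simply that one flag picks which platform's binaries to run from the shared NFS tree.

```python
import argparse

# Hypothetical per-platform install roots on a shared NFS mount.
TOOL_ROOTS = {
    "sun": "/tools/sparc-solaris/bin",
    "linux": "/tools/x86-linux/bin",
}

def build_command(argv):
    """Turn wrapper arguments into the platform-specific command line."""
    parser = argparse.ArgumentParser(prog="cadwrap")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-sun", dest="platform",
                       action="store_const", const="sun")
    group.add_argument("-linux", dest="platform",
                       action="store_const", const="linux")
    parser.set_defaults(platform="sun")  # assumed default: the legacy hosts
    parser.add_argument("tool")
    parser.add_argument("tool_args", nargs=argparse.REMAINDER)
    args = parser.parse_args(argv)
    return [f"{TOOL_ROOTS[args.platform]}/{args.tool}", *args.tool_args]

cmd = build_command(["-linux", "simulate", "design.v"])
default_cmd = build_command(["simulate"])
```

Because everything else (scripts, home directories, project files) stays the same over NFS, switching platforms is just a matter of which binary path the wrapper resolves.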
Interesting that this thread seems to attribute a lot of the early AWS decisions directly to Jeff Bezos when everything I've read before indicates it was a collaborative effort including Andy Jassy. https://techcrunch.com/2016/07/02/andy-jassys-brief-history-...
Until you need them, none. They sort of become self evident and you'll know when you're in the market for them. For now, don't waste time on premature optimization.
Another story from the same author from a few months ago, where he recounts yearly Xmas spikes in the early AMZN years when everyone, including Bezos, was packing boxes to meet the demand:
? Why should a business in 2000 have to build its own datacenter?
But in 2020, most small/medium sized businesses could be run on two-ish beefy boxes. With a Cloudflare, or similar, firewall there simply is no need to build your own or rent a datacenter.
I remember Amazon hiring a lot of systems research people in 05-06. They were very keen on Xen, judging by the people they were hiring. I wonder if they still use Xen post their Citrix acquisition.
EC2 was famously built on top of Xen, but as of 2017, new instance types are based on Nitro, which uses parts of KVM instead. Brendan Gregg wrote up a good history of this [1] (there may be an updated version somewhere, too).
I'd argue they went a bit early. Pre x64, so they would have been moving from 64 to 32 bit. And Linux, at the time, had a variety of issues at scale with kernel panics, leaks, etc.
They were all but dead (AFAIK) by the time I was born, so the old Unixes and RISCs all have a ghostly feel to them to me at least. It's quite strange reading many now-ageing compiler texts talking about 386 as a wacky ISA soon to die out.
I really want an Alpha box to play with, partly out of curiosity but partly to feel like a real man programming with next to no memory consistency.
Although they didn't help themselves, it does seem like the probably-ending dominance of x86 has hurt the industry up till now.
> the old Unixes and RISCs all have a ghostly feel to them to me at least. It's quite strange reading many now-ageing compiler texts talking about 386 as a wacky ISA soon to die out.
Illumos and the BSDs seem to be doing okay, and ARM is huge. The x86 family didn't die out but it is wacky.
> I really want an Alpha box to play with, partly out of curiosity but partly to feel like a real man programming with next to no memory consistency.