Very cool indeed - though the Boston Viridis might be an easier way to get lots of ARM nodes in a rack.
As the platform is in Cambridge, I should remind everyone of the Cambridge MakeSpace which has just officially opened, and would be a great place to help build that kind of chassis.
I think the Boston Viridis falls into the "commercial blade server" category, and as suggested in the conclusion, that is how it should be done if you were not using existing dev boards.
Is that an invite to show me round Cambridge MakeSpace? ;-) I was thinking about the open evening on Tuesday, but it is very inconvenient.
* They're not nearly as fast as an Intel server CPU. If you have a task that can be easily split across any number of machines (see the sketch below) this may not matter, but not every case is like that. For many tasks it's just more pleasant to have 10 fast machines than 50 slow ones.
* They're only 32-bit, and 4GB of RAM is pretty small for a server these days. AArch64 will address that whenever those chips become available and cheap.
* The low-CPU, low-RAM tasks that today's ARM servers are well suited for are also easily handled by virtualized x86 in enterprise environments.
I'm a big ARM fan -- I've got 6 or 7 ARM machines I play with from time to time on my home network -- so I hope they make a credible run at the server space.
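To make the first point concrete, here's a minimal Go sketch of the kind of embarrassingly parallel workload where worker count matters more than per-core speed; `processJob` is a made-up placeholder, not any real benchmark:

```go
package main

import (
	"fmt"
	"sync"
)

// processJob stands in for any CPU-bound unit of work that has no
// dependency on other jobs (hashing, transcoding a chunk, etc.).
func processJob(job int) int {
	return job * job // placeholder computation
}

func main() {
	jobs := make(chan int)
	results := make(chan int)
	var wg sync.WaitGroup

	// With work like this, 50 slow ARM cores and 10 fast x86 cores
	// are largely interchangeable: throughput scales with worker count.
	const workers = 50
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for job := range jobs {
				results <- processJob(job)
			}
		}()
	}

	go func() {
		for i := 0; i < 1000; i++ {
			jobs <- i
		}
		close(jobs)
		wg.Wait()
		close(results)
	}()

	sum := 0
	for r := range results {
		sum += r
	}
	fmt.Println("sum of results:", sum)
}
```

If each worker were a separate ARM board behind a job queue instead of a goroutine, the structure would be the same; the point is only that workloads shaped like this shard trivially, and many don't.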
What other people have already mentioned: they perform much worse than x86/amd64 currently, and a lot of ARM cores really suck these days when you want them to do CPU-intensive stuff: my i.MX6 and Nvidia Tegra overheat without additional cooling. And look at the Raspberry Pi: the Ethernet-controller/USB-hub combo eats about as much power as the CPU!
Then there's the state of "board support packages": in contrast to x86, you cannot run a Linux kernel as-is; rather, it has to be patched for that particular board (in most cases). The kernel has to be supplied at compile time with a lot of information that is commonly provided by the BIOS on your typical PC. Furthermore, a lot of the peripherals present inside those tiny ARM chips need their drivers added.
The Raspberry Pi, again, excels here due to its popularity and enthusiastic community: you can basically get distributions with modern (now: 3.8.x) kernels ready to install. But if you are stuck on the i.MX6, you get to use a patched 3.0.x kernel (+2000 patches); with Nvidia Tegra it's at 3.1.x. [At least that was my experience with industrial boards up to a few weeks ago.]
And those kernels tend to break in interesting ways as soon as you try to add your specific features, like getting a certain PCIe card to run with the i.MX6, or fiddling with the i2c buses (on which the power-management chips sit)... and they are just much, much less well tested than the mainline kernels you are used to on "normal" PCs, or even than the other, more mature mainline architectures such as PowerPC (does anyone still use SPARC/m68k/...?).
{Side note: that's also the main pain for the independent Android ports, such as CyanogenMod. They normally get stuck on missing kernel patches, source code, etc., even though it's just "an ARM" chip.}
Of course, this is not at all a problem if you are happy with the stock BSP kernel/distribution and just want to compile node.js or mplayer for your board.
Virtualization removed much of the appeal of running a small, power-efficient physical server. If you want a low-powered server, you just take a slice of a much bigger one. And if you don't need to run it 24/7, you can simply stop it and "spin it up" later.
Also, many VPS slices give you a guaranteed amount of compute power but let you use more when the other tenants are idle (burstable RAM allocations too), whereas a dedicated blade trying to compete with a Linode slice will have hard limits. That doesn't apply to all use cases, but it's a fairly common one.
"Powerful" is pretty relative: there are lots of tasks where even a modern "low-end" Intel chip will spank any ARM device. And "open" is misleading to downright wrong here too. ARM will license their designs to virtually anyone, but that doesn't mean they are open by most commonly accepted definitions of the word as it relates to technology. ARM is, after all, an IP company.
Having said all of that, almost all of my hobbyist non-day-job programming involves writing Go for ARM devices and I love the platform, but it isn't always a clear winner.
As for "not everything running on ARM": the compiler situation is pretty good; where things tend to fall short is kernel support for specific ARM SoCs. Until fairly recently most ARM SoC manufacturers had their own kernel forks, with pushes back to mainline few and far between. This has been changing a lot post Linux 3.7, thanks to Linus laying down the law on them a bit, but kernel compatibility and hardware drivers are generally still the biggest stumbling blocks (this is why most alternate Linux-based phone OSes are going the route of being compatible with Android device drivers rather than standard Linux device drivers).
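To illustrate how painless the compiler side is, here's a minimal sketch; the file name is arbitrary, and it assumes your Go toolchain was built with ARM support:

```go
// hello_arm.go -- trivial program to sanity-check a cross-compile.
//
// Build for a 32-bit ARMv7 board (e.g. an i.MX6) using Go's standard
// GOOS/GOARCH/GOARM environment variables:
//   GOOS=linux GOARCH=arm GOARM=7 go build hello_arm.go
// then copy the binary to the board and run it.
package main

import (
	"fmt"
	"runtime"
)

func main() {
	fmt.Printf("running on %s/%s\n", runtime.GOOS, runtime.GOARCH)
}
```

Userland code like this mostly avoids the BSP and kernel-fork pain described above; it's drivers and kernel features where you hit the wall.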
So, we recently did the cost calculation of whether it was cheaper (based on an application benchmark) to use a bunch of different ARM processors vs. the Intel Sandybridge-E line. Based on our TCO numbers, Intel outperformed ARM by quite a bit.
Intel's power numbers per core were only slightly above ARM's, but when it came to overall infrastructure, Intel won on total price.
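For anyone wanting to run that kind of comparison themselves, here's the rough shape of the calculation as a Go sketch; every number below is a made-up placeholder, not an actual benchmark figure:

```go
package main

import "fmt"

// node describes one server option. All figures here are
// hypothetical placeholders; plug in your own benchmark numbers.
type node struct {
	name       string
	capexUSD   float64 // purchase price per node
	watts      float64 // measured draw under load
	throughput float64 // app benchmark: requests/sec per node
}

// tcoPerThroughput returns dollars per request/sec over the
// amortization period, including power.
func tcoPerThroughput(n node, years, usdPerKWh float64) float64 {
	hours := years * 365 * 24
	powerCost := n.watts / 1000 * hours * usdPerKWh
	return (n.capexUSD + powerCost) / n.throughput
}

func main() {
	intel := node{"Sandy Bridge-E box (hypothetical)", 2500, 350, 10000}
	arm := node{"ARM board, per node (hypothetical)", 100, 10, 150}

	for _, n := range []node{intel, arm} {
		fmt.Printf("%-35s $%.2f per req/s over 3y\n",
			n.name, tcoPerThroughput(n, 3, 0.10))
	}
}
```

With placeholder numbers like these, the Intel box comes out ahead per unit of throughput despite drawing far more power, which is the same shape of result described above; rack space, switches, and admin overhead only widen the gap.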
Excellent question and some great answers here. Most ARM designs are SoCs targeted at a fairly integrated space (like a phone or tablet), so getting server peripherals on them is painful. A good example is the RasPi, which does basically everything through USB, and attaching storage via USB isn't really great.
That said, there is growing interest in disaggregation (think of it as component servers), and that brings some interesting changes to the space. Reasonably soon we could see an ARM SoC with a couple of nice network ports on it, and a "rack" where part of the rack was a shared storage pool and the rest was a swarm of compute. In applications with a high data-to-compute ratio (think digging through multi-petabyte data sets), the channel-bandwidth benefit of many replicated compute nodes gives them an advantage over the larger 'server' type nodes. Fun times to design distributed architectures.
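To put rough numbers on that channel-bandwidth argument, here's a back-of-the-envelope sketch; the link speeds and node count are illustrative, not from any real product:

```go
package main

import "fmt"

// scanDays estimates the time to scan a data set of the given size,
// limited purely by network channel bandwidth.
func scanDays(petabytes, gbps float64) float64 {
	bits := petabytes * 8e15       // 1 PB = 8e15 bits
	seconds := bits / (gbps * 1e9) // Gbit/s -> bit/s
	return seconds / 86400
}

func main() {
	// One big server on a single 10 Gbit/s link vs. a swarm of
	// 50 small ARM nodes with 1 Gbit/s each (50 Gbit/s aggregate).
	fmt.Printf("one big server, 10 Gbit/s:   %.1f days/PB\n", scanDays(1, 10))
	fmt.Printf("50 ARM nodes, 1 Gbit/s each: %.1f days/PB\n", scanDays(1, 50))
}
```

Even though each small node is far slower, the swarm's 5x aggregate channel bandwidth wins for scan-heavy work against a shared storage pool.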
Some Linux vendors have started offering support for various "older" ARM chips, but I doubt they will even try to support them all: there are too many, and most aren't powerful enough to be useful as servers, so there's no point.
The Cortex-A15 is probably the best supported right now. However, I think you'll really start seeing unified support across all distros and tools with the 64-bit ARMv8 architecture and the chips built on it; everyone is working on supporting ARMv8 from day one. So you'll have to wait a bit longer for that (a year+).
Intel's x86 chips are much faster (though less power-efficient) because their process technology is more advanced, letting them fit in more transistors. They aren't about to move to a reduced instruction set architecture because they have already invested so much in x86 and are so far in the lead that there's little pressure to change. They've got the best talent and an enormous amount of money, which lets them optimize a non-optimal architecture to beat the rest. It's not clear how long they can keep this up.