I'm not OP but I've done this stuff on and off for two decades and I thought I'd toss out some answers.
About 20 years ago I got started using a dozen or so surplus Pentium75 desktops and by combining the working parts and best/largest parts I ended up with eight or so nodes with a reasonable amount of memory and local storage.
A write-up would have been pointless; it was pretty much standard, like you'd read in any Linux magazine of the era.
I used one of the PBS options for batching and some homemade stuff to rip audio CDs to mp3 in a distributed manner, and completely unproductively fooled around with PVM and MPI. In the old days PVM was stable and easy to use, although MPI was predicted to be the wave of the future. In fact today PVM is pretty much dead and everyone uses MPI or the ever-popular NIH homemade stuff. "Back in the old days" we did things like spawn off ray tracing jobs semi-manually with bash scripts and Perl scripts. You'd write Perl that output POV-Ray scene files given a frame number, then queue up like 120 ray tracing jobs, then stack the frames into video, and that's how you got a minute of bad ray-traced animated movies. Not being much of an artist, my animated ray tracing was limited to bad physics demos like a mirrored ball bouncing. As most other cluster ops do, I calculated a lot of "X+1" and "Is X a prime" to test. Like most screwing around, honestly I just turned a lot of kWh into heat.
The wife acceptance factor was incredibly low, although admittedly I'm still married, to the same awesome woman even. Eventually I got rid of all the machines, but for a year or two it was a lot of fun. I thought the noise level was bad... then a few years later I replaced all the boxes with Dell towers that sounded like jet engines. Then came surplus IBM 1U rackmount P4 servers around 2010-ish, which did a pretty awesome job of running LXC before containerization became "cool". Portability, ha ha ha: the size of a closet in the middle of my home office, and it looks like an organized disaster zone. Really looking forward to building a Pi cluster to screw around with that'll fit in a shoebox with room to spare. OMG I could build a toy cluster so small I could drop it into my desk drawer when I'm not playing... Oh, if I had more time this summer; I'm feeling the urge to build another cluster...
Power doesn't matter as long as the circuit breaker doesn't pop. At one point I figured I was burning 1100 watts using the Dell towers (and yes, the room does get warm), which implies running 24x7 would cost me $1100/yr (which is very cheap compared to a training class or many other hobbies, but I digress). However, I suspect I never powered up that cluster for more than maybe 300 hours. I spent 10x more money on a then-new-technology 100 Mbit Ethernet switch than I did on the power the system used. Back in the dinosaur era we mostly used hubs, which were unswitched, one big broadcast region; probably sounds very weird to modern young kids.
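For a rough sense of those numbers, here is a minimal Python sketch. It assumes the $1100/yr figure corresponds to 1100 W running 24x7, which works out to roughly $0.11 per kWh; that rate is inferred from the two figures, not stated anywhere, and the function names are made up for illustration.

```python
# Back-of-the-envelope power cost for a hobby cluster.
# Assumption: the electricity rate is inferred from "1100 W, 24x7, $1100/yr".

HOURS_PER_YEAR = 24 * 365  # 8760 hours, ignoring leap years


def energy_kwh(watts: float, hours: float) -> float:
    """Energy used, in kWh, by a constant load of `watts` over `hours`."""
    return watts / 1000.0 * hours


def implied_rate(annual_cost: float, watts: float) -> float:
    """Price per kWh implied by an annual cost for a constant 24x7 load."""
    return annual_cost / energy_kwh(watts, HOURS_PER_YEAR)


def running_cost(watts: float, hours: float, rate_per_kwh: float) -> float:
    """Cost of running a constant load for `hours` at a given rate."""
    return energy_kwh(watts, hours) * rate_per_kwh


if __name__ == "__main__":
    rate = implied_rate(annual_cost=1100.0, watts=1100.0)
    print(f"implied rate: ${rate:.3f}/kWh")
    print(f"300 hours at 1100 W: ${running_cost(1100.0, 300.0, rate):.2f}")
```

Under those assumptions, 300 hours at 1100 W is 330 kWh, which comes to somewhere under $40 in total.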
I can't even begin to list all the stuff I learned. The jump from admin'ing one unix box to admin'ing a cluster is almost as big as the jump from being an appliance user to being a cluster admin.
What I didn't learn was the scalability problems of having 10000 nodes. Generally having fewer than ten, I did a lot of setup by hand, although I did automate all the OS-level stuff with homemade scripts, Puppet, etc. A lot of cluster admin is related to the "fun" of physical hardware support for 10000 individual boxes, and I missed out on that. Also, running 24x7, how do you handle dusty air filters or whatever at that scale for day-to-day operation? Power failure recovery must be non-trivial.
It's hard to say "well I'd never have gotten job X except for..." because I might have anyway. As a hobby it's been tangentially tied in with my day job on and off since at least the turn of the century.
Edit: I just had an idea and detached this post from https://news.ycombinator.com/item?id=12149396 so more people would get a chance to see it. (It originally appeared in an off-topic subthread.) Maybe the first time we've pruned a comment from its original parent because it was so good.
Similar experience here except I started on my high school's network ~1996. Built two physical clusters commercially in the last few years. Personally these days I mostly use virtualization to emulate entire clusters, as it's not worth building them physically until you really need the power!
I like how it was called a "Beowulf cluster" back then; it's been a while since I read that name. We had a 160-node Beowulf cluster (it already had USB support, no floppy drives) for OpenMP and Cilk jobs; it was amazing. https://en.wikipedia.org/wiki/Beowulf_cluster
https://pi-hole.net/ is my favorite Pi software; it's a network-wide DNS server for ad blocking, which I supplement with a bunch of privacy/security lists from https://github.com/StevenBlack/hosts. 5% of my traffic was blocked today, which is heaps considering it's counting mountains of requests to e.g. Dropbox's API.
I really like it too but after extended testing I found it was incompatible with NoScript's Application Boundaries Enforcer because pi-hole replaces ad requests with an empty HTML 'page', rather than simply dropping or blocking ads like a browser ad-blocker would.
I don't know enough about how this works to figure out how to work around it so I reverted to blocking ads at the router with a hosts file. NoScript provides too many other benefits so I had to keep it and stop using pi-hole.
Do you actually use ABE specifically? That part can be turned off (NoScript > Options > Advanced : ABE and then uncheck the enable checkbox) without stopping any of the other aspects from working.
I'm not a dev but looking very quickly at pihole it seems it serves an index.html file (https://pi-hole.net/faq/is-it-possible-to-change-the-blank-p...) in place of ad content, you might make that a link to /dev/null or just make it a blank file, and see if that works? I suspect it won't as the ABE is checking for the correct origin of content and pihole is spoofing the origin.
I use ABE only because it's enabled by default and it seems like a sensible thing to do. At the time I tested pi-hole, reverting to router hosts blocking seemed easier so I left it at that. Maybe I'll look at it again and see if I can selectively disable ABE for the pi-hole's IP address which should solve the issue.
Hmm, I think ABE works back-to-front of what you want, eg http://stackoverflow.com/questions/20111530/noscript-abe-all.... You let a remote page have access to local data, so you'd need access rules for all the external pages that are adblocked to let them have access to the local content that pihole wants to stuff the ad spaces with.
#ABE rules
# ALL matches any URI
# accept lets it happen, sandbox and anon probably would be fine too
# INCLUSION is any sub-page level part, like images and such, can specify further which inclusions
# LOCAL is any local address including those in noscript.ABE.localExtras (see about:config);
Site ALL
Accept INCLUSION from LOCAL
# Deny # probably only needs that if you're not prepending already made rules??
That might negate much of ABE though as you're letting external sites reference anything that they find on the local domain.
For me the big difference is management - pihole has a nice interface and there are a bunch of browser extensions compatible with it, you can exclude all of their lists and start from scratch.
I'm still hoping for someone to build an SBC like the RPi that takes PoE for power. USB is convenient for general use, sure, but for something like this cluster it would be ideal to have a single wire in and out.
Unfortunately it's probably a chicken-and-egg thing, where PoE is too expensive until a lot of boards support it, and boards won't support it because it's too expensive.
I'm not sure that cost is the biggest problem. PoE splitters with USB ports can be had for ~$20. Though, relative to an RPi, I guess that is somewhat expensive.
A quick ebay search shows some that are ~$5, so I'd imagine it wouldn't be all that hard for an experienced board developer to get the price lower at higher production rates.
Here's another one from a brand you may have heard of, and it does multiple voltages too. Only $15. You can even get it as a kit if you don't have a PoE switch.
Those cheap eBay items you reference are likely just sending power over an unused pair. Actual standardized PoE has all sorts of other cool perks. I haven't studied it heavily but I wouldn't say they're quite the same as the little dongles with a plug off the side.
I'm glad to hear someone else is interested in a PoE-powered board. I thought about making a Pi HAT that did PoE extraction, but you'd still have to use a short Cat5 jumper for data.
Cheap maybe, but if you really use Raspberry Pis a lot, you will find that the Pi isn't that robust after all. I installed 20 of them two years ago, and so far I've replaced at least 4 of them. Can't comment on the recent models though.
I've been hosting http://www.pidramble.com/ on either a cluster of 5 Pi 2s (now 3s), or a single Pi when the cluster is on a road trip, since July last year.
The only downtime has been a result of power or Internet outages at home.
Anecdotal, but I now have some 36 different Raspberry Pis at my house, and while I've had two (cheap) MicroSD card failures, I haven't had an issue with any of the Pis themselves.
As long as you use quality 2A power supplies (1A for older Pis) and quality MicroSD cards, I don't think you should worry too much about reliability, and if you have a cluster with redundancy... That's kind of the point ;)
I had a lot of trouble with corrupt cards. Took care of it by keeping the file system on a USB stick, and using the SD card only for booting into the OS on the stick.
Basically, you format the drive as ext4, then copy the filesystem over with rsync. Then edit cmdline.txt on the SD card to point to root=/dev/sda2. It's rather more problematic if you have multiple USB drives, and there's an RPi forum message somewhere with a lot more details...
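The cmdline.txt edit amounts to swapping the root= parameter. Here is a minimal Python sketch of that step, assuming cmdline.txt is a single line of space-separated kernel parameters; the function name and the sample line are hypothetical, and /dev/sda2 is just the device mentioned above, so check what your stick actually shows up as.

```python
# Rewrite the root= kernel parameter in the contents of cmdline.txt.
# Assumption: the file is one line of space-separated parameters.


def set_root_device(cmdline: str, new_root: str) -> str:
    """Return `cmdline` with its root= parameter pointing at `new_root`.

    If there is no root= parameter yet, one is appended.
    """
    tokens = cmdline.split()
    replaced = False
    result = []
    for token in tokens:
        if token.startswith("root="):
            result.append(f"root={new_root}")
            replaced = True
        else:
            result.append(token)
    if not replaced:
        result.append(f"root={new_root}")
    # Joining with single spaces keeps everything on one line.
    return " ".join(result)


if __name__ == "__main__":
    # Hypothetical example line, not copied from a real install.
    original = "console=tty1 root=/dev/mmcblk0p2 rootfstype=ext4 rootwait"
    print(set_root_device(original, "/dev/sda2"))
```

Keep a copy of the original cmdline.txt around, so you can put it back from another machine if the Pi stops booting.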
The Pi alternatives can have better specs at an even more attractive price point[0]. The Pi itself benefits from its accessibility; if that is not required then pick one of those.
I got a pair of Odroids based on price/performance to act as NTP servers and have since removed them from service.
They both had the same MAC address on the NIC, which was not impressive, and more importantly they had a consistent outbound Ethernet error rate of about 0.5%. Not ideal for a service utilizing UDP.
I also got an Odroid XU4 as a replacement for a SheevaPlug I was using as a home server. I was hoping to use the HDMI out into my TV and use it for browsing and streaming media. Unfortunately none of the distros produced by Hardkernel fully work. There is always something broken in each distro. It's a complete pain. I don't know why they can't just produce a distro that has all of the hardware working at install time.
My Odroid X2 and another X3 don't have a proper MAC; it's stored in /etc/smsc95xx_mac_addr.
Now that I look, there are some receive errors reported by ifconfig on the X2, but I've never investigated them. The X3 is on a different network, and reports zero errors. (I don't know how much data either has transmitted, the counter will have wrapped as it is only 32 bit.)
But, I wouldn't recommend them to anyone not thoroughly familiar with Linux. Compared to the Raspberry Pi, there's not much documentation, and the kernel is pretty old.
Like others, I have a stack of Pi systems doing file storage, DNS ad blocking, webservers, home control monitoring, XBMC, etc. A clean mount would be cool.
Actually there has been a bit of movement on this board - they did some initial cost analysis and it came out way too expensive ($1000 a board!) so they are going back to the drawing board.
Thanks for the update. I've poked at the twitter feed, but I missed where it says the boards would be $1000 each. (Using Google Translate sometimes doesn't go well).
Maybe when the new Compute Module comes out, someone will come up with a way to put a stack of them on the same board.
Very cool board, love how they get all of that crammed into such a small footprint.
But it has two problems. 1) You still need to do all the cables to connect them to the outside world. 2) Allwinner support for weird boards is often a problem. I've gotten a few of them in the "wow, look at all the features" category, only to find that the drivers and support underneath have real problems.
The Pi / Beaglebone ecosystems have one key thing: lots of very smart people are working on the hardware/software interface level. This means that things just work. The Pi team has multiple person-decades of software improvements.
Instead of always stacking these, with shelves and/or individual standoffs between the boards, wouldn't it be cheaper and easier to just hang them, using some longer horizontal metal rods going through all the boards' mounting holes?
I have a Pi Zero-based 8 node MPI cluster (4 Zeroes, 1 Pi2) that I set up using 8086 Consulting's Clusterhat. In total it gets about 454 Gigaflops, or roughly the speed of a Pentium 4, or maybe two Raspberry Pi 2s.
It's my 3rd Beowulf, but the first that fits in my hand and runs over USB. I use it mostly for exploring fractal space and attempting (but usually failing) to approximate pi.
Both problems are well suited to doing something in the slow lane. We know pi to trillions of places, so there's little benefit in a fast machine to calculate it to billions. It's more a great exercise in learning problem-solving approaches.
So far I've managed to get a working Monte Carlo based approximation across the nodes, but it's a terrible approach to approximating pi to anything more than a few places. I've also had some luck with implementing Ramanujan-type formulae, and am still working on implementing Chudnovsky in distributed form within the bounds of the Zero's RAM limitations.
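For anyone who hasn't seen the Monte Carlo approach, here is a minimal single-machine Python sketch of the idea, not the actual cluster code. Each "node" is simulated by a seeded generator that counts random points landing inside the unit quarter circle, and the sum at the end stands in for what an MPI reduce would do; the function names, node count, and sample counts are all assumptions for illustration.

```python
# Monte Carlo estimate of pi, split into per-node chunks.
# Assumption: nodes are simulated in one process; on a real cluster each
# rank would run count_hits() itself and the hits would be summed by a reduce.
import random


def count_hits(samples: int, seed: int) -> int:
    """Count random points in the unit square that fall inside the
    quarter circle x^2 + y^2 <= 1."""
    rng = random.Random(seed)  # separate seeded stream per node
    hits = 0
    for _ in range(samples):
        x = rng.random()
        y = rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(nodes: int, samples_per_node: int) -> float:
    """Combine per-node hit counts into a single estimate of pi."""
    total_hits = sum(count_hits(samples_per_node, seed=rank)
                     for rank in range(nodes))
    # The quarter circle covers pi/4 of the unit square.
    return 4.0 * total_hits / (nodes * samples_per_node)


if __name__ == "__main__":
    print(estimate_pi(nodes=5, samples_per_node=20000))
```

The error shrinks roughly with the square root of the sample count, so each extra decimal place costs about 100x more samples, which is why it only ever gets you a few places.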
On the fractal side I've only really had time to look at Mandelbrot, but I'm looking forward to using it to render animations for Julia sets and to explore Hénon attractors.
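Mandelbrot splits across nodes very naturally because every pixel is independent. A minimal Python sketch, assuming the usual escape-time algorithm and an interleaved row split (rank r takes rows r, r + nodes, r + 2*nodes, ...); the function names, image size, and view window are made up for illustration and this is not the actual cluster code.

```python
# Escape-time Mandelbrot with rows dealt out to nodes round-robin.
# Assumption: ranks are simulated by plain function calls in one process.


def escape_time(c: complex, max_iter: int = 50) -> int:
    """Iterations before z -> z*z + c leaves |z| <= 2 (max_iter if never)."""
    z = 0j
    for i in range(max_iter):
        if abs(z) > 2.0:
            return i
        z = z * z + c
    return max_iter


def rows_for_rank(rank: int, nodes: int, height: int) -> range:
    """Interleaved split, so slow rows near the set get spread around."""
    return range(rank, height, nodes)


def render_rows(rank: int, nodes: int, width: int = 60, height: int = 24,
                max_iter: int = 50) -> dict:
    """Compute this rank's share of the image as {row: [escape times]}."""
    rows = {}
    for row in rows_for_rank(rank, nodes, height):
        im = 1.2 - 2.4 * row / (height - 1)        # imaginary axis, top down
        line = []
        for col in range(width):
            re = -2.2 + 3.2 * col / (width - 1)    # real axis, left to right
            line.append(escape_time(complex(re, im), max_iter))
        rows[row] = line
    return rows


if __name__ == "__main__":
    nodes = 4
    image = {}
    for rank in range(nodes):          # a real cluster would gather these
        image.update(render_rows(rank, nodes))
    for row in sorted(image):
        print("".join("#" if n == 50 else "." for n in image[row]))
```

Interleaving rows rather than handing out contiguous bands is a cheap way to balance load, since rows crossing the set take far more iterations than rows out at the edges.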
>One big thing is to make sure atime is disabled, a massively brain dead feature inherited from 1980s Unix that writes to the disk every time you read from a file.
That's not a fair criticism of the atime feature. There are lots of use cases where it is desirable.
It's not an off the shelf thing, but it wouldn't be that difficult to build a board which takes power from a laptop adapter and spits out the right current to hook up via the GPIO.
As far as computing power goes it's for sure useless, but tack on some kind of wiring harness/breakout board for the GPIO pins and a redundant power supply and you might have a nice cheap platform to build a control system around.
Not that weird considering Amazon prices fluctuate daily (even throughout the day e.g. higher prices during lunch hours). Newegg likely does this as well.
Also note that his prices are listed per unit, and the products are not always sold "per [rpi] unit". For example, a 7 port USB hub powers 7 RPi units, so unit price = (item price) / 7.
There's nothing wrong with conversation pieces. Many stories here are, and funnily enough they tend to lead to good conversation. May there be more of them!
Granted the author is not doing anything with it, but it is still a fun experiment. It can very easily be used to run a simple zookeeper/Kafka cluster or try out deployment automation etc.
Hyper-V doesn't support ARM as far as I know. Also, do you know of a good Xen on ARM host? (honestly asking, would be cool to find one for testing cross-compiled builds)
Sorry, I think you misunderstood? I meant hosting provider. If you have to go buy your own hardware then you may as well do precisely what the OP just did.