My Raspberry Pi cluster (erratasec.com)
131 points by hardmath123 on July 22, 2016 | 68 comments



I'm not OP but I've done this stuff on and off for two decades and I thought I'd toss out some answers.

About 20 years ago I got started with a dozen or so surplus Pentium 75 desktops. By combining the working parts and the best/largest parts, I ended up with eight or so nodes with a reasonable amount of memory and local storage.

A write-up would have been pointless; it was pretty much the standard setup you'd read about in any Linux magazine of the era.

I used one of the PBS options for batching, plus some homemade stuff to rip audio CDs to MP3 in a distributed manner, and completely unproductively fooled around with PVM and MPI. In the old days PVM was stable and easy to use, although MPI was predicted to be the wave of the future. In fact, today PVM is pretty much dead and everyone uses MPI or the ever-popular NIH homemade stuff. "Back in the old days" we did things like spawn off ray tracing jobs semi-manually with bash and Perl scripts. You'd write Perl that output POV-Ray scene files given a frame number, queue up something like 120 ray tracing jobs, and then stack the rendered frames into video, and that's how you got a minute of bad ray-traced animation. Not being much of an artist, my animated ray tracing was limited to bad physics demos like a bouncing mirrored ball. As most other cluster ops do, I calculated a lot of "X+1" and "Is X a prime" to test. Like most screwing around, honestly, I just turned a lot of kWh into heat.
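The per-frame scene generation described above can be sketched roughly like this, in Python rather than the original Perl. This is only an illustration: the scene contents, the bounce formula, the file names, and every function name are assumptions, and the POV-Ray text has not been tested against a real install.

```python
import math

FRAMES = 120  # one scene file (and one batch job) per frame

# Hypothetical scene: a mirrored ball over a checkered floor.
# BALL_Y is a placeholder that gets replaced per frame.
SCENE_TEMPLATE = """\
camera { location <0, 2, -8> look_at <0, 1.5, 0> }
light_source { <5, 8, -5> color rgb <1, 1, 1> }
plane { y, 0 pigment { checker color rgb <1, 1, 1> color rgb <0.1, 0.1, 0.1> } }
sphere { <0, BALL_Y, 0>, 0.5 pigment { color rgb <1, 1, 1> } finish { reflection 0.9 } }
"""


def ball_height(frame, frames=FRAMES, bounces=4, peak=2.0, radius=0.5):
    """Height of the ball centre: a crude |sin| bounce, not real physics."""
    phase = math.pi * bounces * frame / frames
    return radius + peak * abs(math.sin(phase))


def scene_for_frame(frame):
    """Return the scene text for one frame number."""
    return SCENE_TEMPLATE.replace("BALL_Y", f"{ball_height(frame):.4f}")


def all_scenes(frames=FRAMES):
    """Map a file name such as frame_007.pov to its scene text."""
    return {f"frame_{n:03d}.pov": scene_for_frame(n) for n in range(frames)}


scenes = all_scenes()
```

Each entry in `scenes` could then be written to disk and handed to the batch system as its own job, which is what makes the work easy to spread over a handful of slow nodes.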

The wife acceptance factor was incredibly low, although admittedly I'm still married, to the same awesome woman even. Eventually I got rid of all the machines, but for a year or two it was a lot of fun. I thought the noise level was bad... then a few years later I replaced all the boxes with Dell towers that sounded like jet engines. Then came surplus IBM 1U rackmount P4 servers around 2010-ish, which did a pretty awesome job of running LXC before containerization became "cool". Portability? Ha ha ha: it's the size of a closet in the middle of my home office and looks like an organized disaster zone. Really looking forward to building a Pi cluster to screw around with that'll fit in a shoebox with room to spare. OMG I could build a toy cluster so small I could drop it into my desk drawer when I'm not playing... Oh, if I had more time this summer... I'm feeling the urge to build another cluster...

Power doesn't matter as long as the circuit breaker doesn't pop. At one point I figured I was burning 1100 watts using the Dell towers (and yes, the room does get warm), which implies running 24x7 would cost me $1100/yr (which is very cheap compared to a training class or many other hobbies, but I digress). However, I suspect I never powered up that cluster for more than maybe 300 hours. I spent 10x more money on a then-new-technology 100 meg Ethernet switch than I did on the power the system used. Back in the dinosaur era we mostly used hubs, which were unswitched, one big broadcast region; that probably sounds very weird to modern young kids.
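To make that arithmetic concrete, here is a small Python sketch. The electricity rate is not stated anywhere above; it is backed out from the 1100 W and $1100/yr figures, so treat it as an assumption. The function names are made up for illustration.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def energy_kwh(watts, hours):
    """Energy used by a constant load, in kilowatt-hours."""
    return watts * hours / 1000.0


def implied_rate(annual_cost, watts):
    """Price per kWh implied by a 24x7 annual cost at a given load."""
    return annual_cost / energy_kwh(watts, HOURS_PER_YEAR)


def running_cost(watts, hours, rate_per_kwh):
    """Cost of running the load for a number of hours."""
    return energy_kwh(watts, hours) * rate_per_kwh


rate = implied_rate(1100.0, 1100.0)             # roughly $0.11 per kWh
actual = running_cost(1100.0, 300.0, rate)      # roughly $38 for 300 hours
print(f"implied rate: ${rate:.3f}/kWh, 300 h cost: ${actual:.2f}")
```

Under those assumptions, 300 hours of runtime comes to about $38 of electricity, which squares with the switch costing roughly ten times more than the power.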

I can't even begin to list all the stuff I learned. The jump from admin'ing one unix box to admin'ing a cluster is almost as big as the jump from being an appliance user to being a cluster admin.

What I didn't learn was the scalability problems of having 10000 nodes. Generally having fewer than ten, I did a lot of setup by hand, although I did automate all the OS-level stuff with homemade scripts, puppet, etc. A lot of cluster admin is related to the "fun" of physical hardware support for 10000 individual boxes, and I missed out on that. Also, running 24x7, how do you handle dusty air filters or whatever at that scale for day-to-day operation? Power failure recovery must be non-trivial.

It's hard to say "well I'd never have gotten job X except for..." because I might have anyway. As a hobby it's been tangentially tied in with my day job on and off since at least the turn of the century.


What a great comment. Thanks!

Edit: I just had an idea and detached this post from https://news.ycombinator.com/item?id=12149396 so more people would get a chance to see it. (It originally appeared in an off-topic subthread.) Maybe the first time we've pruned a comment from its original parent because it was so good.


Similar experience here except I started on my high school's network ~1996. Built two physical clusters commercially in the last few years. Personally these days I mostly use virtualization to emulate entire clusters, as it's not worth building them physically until you really need the power!


I did the exact same thing (a cluster to compress MP3s), only I had a paying customer for the job, which (handsomely) paid for the work and the cluster.

The boxes are long gone but the pictures remain:

http://www.clustercompute.com/


I like how it was called a "Beowulf cluster" back then; it's been a while since I read that name. We had a 160-node Beowulf cluster (it already had USB support, no floppy drives) for OpenMP and Cilk jobs, and it was amazing. https://en.wikipedia.org/wiki/Beowulf_cluster


https://pi-hole.net/ is my favorite Pi software. It's a network-wide DNS server for ad blocking, which I supplement with a bunch of privacy/security lists from https://github.com/StevenBlack/hosts. 5% of my traffic was blocked today, which is heaps considering it's counting mountains of requests to e.g. Dropbox's API.

It even ran powered straight off my router's USB port, although I just run it in a VM now: https://github.com/benlowry/pihole-extended-hosts
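For anyone curious what those hosts lists look like to a DNS-level blocker, here is a minimal Python sketch. It is not pi-hole's actual code. It assumes the common hosts format where each blocked name is mapped to 0.0.0.0 or 127.0.0.1, and the sample domains and function names are made up for illustration.

```python
SINKHOLE_ADDRESSES = {"0.0.0.0", "127.0.0.1"}
LOCAL_NAMES = {"localhost", "localhost.localdomain", "broadcasthost"}


def parse_blocklist(hosts_text):
    """Collect the host names that a hosts-format list maps to a sinkhole."""
    blocked = set()
    for raw_line in hosts_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        address, *names = line.split()
        if address not in SINKHOLE_ADDRESSES:
            continue
        for name in names:
            name = name.lower()
            if name not in LOCAL_NAMES:  # keep real loopback entries usable
                blocked.add(name)
    return blocked


def is_blocked(query_name, blocked):
    """Exact-match lookup, ignoring case and a trailing dot."""
    return query_name.lower().rstrip(".") in blocked


# Hypothetical entries, only for demonstration.
SAMPLE = """\
# sample list
127.0.0.1 localhost
0.0.0.0 ads.example.com
0.0.0.0 tracker.example.net  # inline comment
"""

blocked = parse_blocklist(SAMPLE)
print(is_blocked("ADS.example.com.", blocked))
```

A real resolver would answer blocked names with a sinkhole address and forward everything else upstream; this sketch only covers the list handling.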


I really like it too but after extended testing I found it was incompatible with NoScript's Application Boundaries Enforcer because pi-hole replaces ad requests with an empty HTML 'page', rather than simply dropping or blocking ads like a browser ad-blocker would.

I don't know enough about how this works to figure out how to work around it so I reverted to blocking ads at the router with a hosts file. NoScript provides too many other benefits so I had to keep it and stop using pi-hole.


Do you actually use ABE specifically? That part can be turned off (NoScript > Options > Advanced : ABE, then uncheck the enable checkbox) without stopping any of the other aspects from working.

I'm not a dev but looking very quickly at pihole it seems it serves an index.html file (https://pi-hole.net/faq/is-it-possible-to-change-the-blank-p...) in place of ad content, you might make that a link to /dev/null or just make it a blank file, and see if that works? I suspect it won't as the ABE is checking for the correct origin of content and pihole is spoofing the origin.


I use ABE only because it's enabled by default and it seems like a sensible thing to do. At the time I tested pi-hole, reverting to router hosts blocking seemed easier so I left it at that. Maybe I'll look at it again and see if I can selectively disable ABE for the pi-hole's IP address which should solve the issue.


Hmm, I think ABE works back-to-front of what you want, eg http://stackoverflow.com/questions/20111530/noscript-abe-all.... You let a remote page have access to local data, so you'd need access rules for all the external pages that are adblocked to let them have access to the local content that pihole wants to stuff the ad spaces with.

This might work, using https://noscript.net/abe/abe_rules.pdf, but I'd ask at the NoScript forums. The rules cascade so it would need to go at the top.

    #ABE rules
    # ALL matches any URI
    # accept lets it happen, sandbox and anon probably would be fine too
    # INCLUSION is any sub-page level part, like images and such, can specify further which inclusions
    # LOCAL is any local address including those in noscript.ABE.localExtras (see about:config);
    Site ALL
    Accept INCLUSION from LOCAL
    # Deny # probably only needs that if you're not prepending already made rules??
That might negate much of ABE though as you're letting external sites reference anything that they find on the local domain.

Let me know if it works!


For me the big difference is management - pihole has a nice interface and there are a bunch of browser extensions compatible with it, you can exclude all of their lists and start from scratch.


Good use of RPi. Far better than a browser-based ad blocker. Curious how popular projects like pi-hole have been? Any idea?


I'm still hoping for someone to build a SBC like the RPi that takes POE for power. USB is convenient for general use, sure, but for something like this cluster it would be ideal to have a single wire in and out.

Unfortunately it's probably a chicken-and-egg thing, where POE is too expensive until a lot of boards support it, and boards won't support it because it's too expensive.


I'm not sure that cost is the biggest problem. PoE splitters with USB ports can be had for ~$20. Though, relative to an RPi, I guess that is somewhat expensive.

Ex: https://www.amazon.com/gp/aw/d/B019BLMWY0 (note, I haven't actually used this product).

A quick ebay search shows some that are ~$5, so I'd imagine it wouldn't be all that hard for an experienced board developer to get the price lower at higher production rates.


Here's another one, from a brand you may have heard of, and it does multiple voltages too. Only $15. You can even get it as a kit if you don't have a PoE switch.

https://www.amazon.com/gp/aw/d/B003CFATQK/ref=mp_s_a_1_1?ie=...


Those cheap eBay items you reference are likely just sending power over an unused pair. Actual standardized PoE has all sorts of other cool perks. I haven't studied it heavily but I wouldn't say they're quite the same as the little dongles with a plug off the side.


You can now get a PoE adapter for the Raspberry Pi: https://www.pi-supply.com/product/pi-poe-switch-hat-power-ov...


I'm glad to hear someone else is interested in a PoE-powered board. I thought about making a Pi HAT that did PoE extraction, but you'd still have to use a short Cat5 jumper for data.


Cheap maybe, but if you really use the Raspberry Pi a lot, you will find that the Pi isn't that robust after all. I installed 20 of them two years ago and have had to replace at least 4 of them since. Can't comment on the recent models though.


I've been hosting http://www.pidramble.com/ on either a cluster of 5 Pi 2s (now 3s), or a single Pi when the cluster is on a road trip, since July last year.

The only downtime has been a result of power or Internet outages at home.

Anecdotal, but I now have some 36 different Raspberry Pis at my house, and while I've had two (cheap) MicroSD card failures, I haven't had an issue with any of the Pis themselves.

As long as you use quality 2A power supplies (1A for older Pis) and quality MicroSD cards, I don't think you should worry too much about reliability, and if you have a cluster with redundancy... That's kind of the point ;)


The Pis I installed are all first version, which uses SD card. I agree that things may improve quite a lot in recent iterations.


All Raspberry Pi versions (except the compute module) use SD cards.


Raspberry Pi 2 and 3 use microsd cards.


I had a lot of trouble with corrupt cards. Took care of it by keeping the file system in a USB stick, and using the SD card only for booting into the OS on the stick.


How did you get that working? I haven't had much luck with USB sticks so far. (not really new to Linux, but I'm very new to the Pi.)


Basically, you format the drive as ext4, then copy the filesystem over with rsync. Then edit cmdline.txt on the SD card to point to root=/dev/sda2. It's rather more problematic if you have multiple USB drives, and there's an RPi forum message somewhere with a lot more details...
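The cmdline.txt edit described above amounts to swapping one token on a single line. A minimal Python sketch of just that step follows; the sample line is hypothetical (a real cmdline.txt will differ), and the function name is made up. Back up the original file before trying anything like this.

```python
def set_root_device(cmdline, device="/dev/sda2"):
    """Return the kernel command line with its root= token pointing at device."""
    tokens = cmdline.split()
    new_tokens = []
    replaced = False
    for token in tokens:
        if token.startswith("root="):
            new_tokens.append(f"root={device}")
            replaced = True
        else:
            new_tokens.append(token)
    if not replaced:
        # No root= entry at all: append one rather than silently doing nothing.
        new_tokens.append(f"root={device}")
    return " ".join(new_tokens)


# Hypothetical example line, not taken from any particular image.
example = "console=tty1 root=/dev/mmcblk0p2 rootfstype=ext4 rootwait"
print(set_root_device(example))
```

Note that /dev/sda2 depends on enumeration order, which is one reason multiple USB drives make this more fragile.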


The Pi alternatives can have better specs at an even more attractive price point[0]. The Pi itself benefits from its accessibility; if that is not required, then pick one of those.

[0]http://ameridroid.com/products/odroid-c1


I got a pair of ODROIDs based on price/performance to act as NTP servers and have since removed them from service.

They both had the same MAC address on the NIC, which was not impressive, and more importantly had about a 0.5% (consistent) outbound Ethernet error rate. Not ideal for a service utilizing UDP.

I now use Pi's, which work flawlessly.


I also got an ODROID XU4, as a replacement for a SheevaPlug I was using as a home server. I was hoping to use the HDMI out into my TV and use it for browsing and streaming media. Unfortunately none of the distros produced by Hardkernel fully work. There is always something broken in each distro. It's a complete pain. I don't know why they can't just produce a distro that has all of the hardware working at install time.


My Odroid X2 and another X3 don't have a proper MAC, it's stored in /etc/smsc95xx_mac_addr.

Now that I look, there are some receive errors reported by ifconfig on the X2, but I've never investigated them. The X3 is on a different network, and reports zero errors. (I don't know how much data either has transmitted, the counter will have wrapped as it is only 32 bit.)

But, I wouldn't recommend them to anyone not thoroughly familiar with Linux. Compared to the Raspberry Pi, there's not much documentation, and the kernel is pretty old.


Noted. A new version of the board is due soon that might be improved, but that remains to be seen.


I want one of these Pi Zero Motherboards, but there has been no update by the company since January

http://hackaday.com/2016/01/25/raspberry-pi-zero-cluster-pac...

Like others, I have a stack of Pi systems doing file storage, DNS ad blocking, webservers, home control monitoring, XBMC, etc. A clean mount would be cool.


Actually there has been a bit of movement on this board: they did some initial cost analysis and it came out way too expensive ($1000 a board!), so they are going back to the drawing board.

You can follow along here:

https://twitter.com/9_ties https://twitter.com/IdeinInc


Thanks for the update. I've poked at the twitter feed, but I missed where it says the boards would be $1000 each. (Using Google Translate sometimes doesn't go well).

Maybe when the new Compute Module comes out, someone will come up with a way to put a stack of them on the same board.



Very cool board, love how they get all of that crammed into such a small footprint.

But it has two problems: 1) you still need to run all the cables to connect them to the outside world; 2) Allwinner support for weird boards is often a problem. I've gotten a few of them thinking "wow, look at all the features", only to find that the drivers and support underneath have real problems.

The Pi / Beaglebone ecosystems have one key thing, lots of very smart people are working on the hardware/software interface level. This means that things just work. The Pi team has multiple person-decades of software improvements.


Instead of always stacking these, with shelves and/or individual standoffs between the boards, wouldn't it be cheaper and easier to just hang them, using some longer horizontal metal rods going through all the boards' mounting holes?


In the old days you had to stack them because no 2D linear arrangement allowed access to the important ports (Ethernet and whatever).

Nowadays with the Pi 3 you can run your cluster on WiFi (kills performance, but performance was never the point anyway...)

So with Pi 3 hardware it's possible to get a length of wood 2x4 and screw the Pi array to the 2x4.

Of course if you want to do stuff with the hardware other than crunch bytes you'll need the stacking anyway to access all the ports.

I'm pleased with the article. Just last week I was looking at the GeauxRobot stacking hardware. It's nice hardware, although around twice the cost.


I have a Pi Zero-based 8 node MPI cluster (4 Zeroes, 1 Pi2) that I set up using 8086 Consulting's Clusterhat [1]. In total it gets about 454 Gigaflops, or roughly the speed of a Pentium 4, or maybe two Raspberry Pi 2s.

It's my 3rd Beowulf, but the first that fits in my hand and runs over USB. I use it mostly for exploring fractal space and attempting (but usually failing) to approximate pi.

Both problems are well suited to doing something in the slow lane. We know Pi to trillions of places, so there's little benefit in a fast machine to calculate it to billions. It's more a great exercise learning problem solving approaches.

So far I've managed to get a working Monte Carlo based approximation running across the nodes, but it's a terrible approach to approximating pi to anything more than a few places. I've also had some luck with implementing Ramanujan-type formulae, and am still working on implementing Chudnovsky in distributed form within the bounds of the Zero's RAM limitations.

On the fractal side I've only really had time to look at Mandelbrot, but I'm looking forward to using it to render animations for Julia sets and to explore Hénon attractors.
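For readers who haven't seen it, the Monte Carlo approach mentioned above can be sketched like this in Python. This is not the cluster's actual code: a plain loop stands in for the MPI ranks, the per-node seeding and the function names are assumptions, and summing the hit counts plays the role of a reduce step.

```python
import random


def count_hits(samples, seed):
    """One node's share: count random points that land inside the unit quarter circle."""
    rng = random.Random(seed)  # separate generator per node
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(nodes, samples_per_node):
    """Combine every node's hit count into one estimate (the reduce step)."""
    total_hits = sum(count_hits(samples_per_node, seed=rank) for rank in range(nodes))
    return 4.0 * total_hits / (nodes * samples_per_node)


estimate = estimate_pi(nodes=5, samples_per_node=100_000)
print(estimate)
```

The statistical error shrinks only with the square root of the total sample count, so each extra decimal digit costs about 100 times more samples. That is why it parallelizes trivially and still stalls after a few places.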

[1] - http://clusterhat.com/


>One big thing is to make sure atime is disabled, a massively brain dead feature inherited from 1980s Unix that writes to the disk every time you read from a file.

That's not a fair criticism of the atime feature. There are lots of use cases where it is desirable.


I would be very happy if I could plug a laptop charger into an RPi, and much happier if 2 RPis could share the same laptop charger.


It's not an off the shelf thing, but it wouldn't be that difficult to build a board which takes power from a laptop adapter and spits out the right current to hook up via the GPIO.


It is an Ethernet switch not a hub. The author should look into running something like a rocks cluster.


Nice. Need to see more hardware posts here. Please though, improve the depth of the language.


As far as computing power goes it's for sure useless, but tack on some kind of wiring harness/breakout board for the GPIO pins and a redundant power supply and you might have a nice cheap platform to build a control system around.


SETI@home style / GRID-computing could be used to run jobs on numerous idle Raspberry PIs (the location doesn't matter aka no cluster setup needed).


This kind of stuff is what Erlang lives for. Perhaps that might be something to consider? But then again, running a couple of VMs is way better.


Check out http://nerves-project.org, an Elixir framework for Raspberry Pi and BeagleBone Black.


Weird that none of the prices he lists are the same when I go to amazon or newegg.


Not that weird considering Amazon prices fluctuate daily (even throughout the day e.g. higher prices during lunch hours). Newegg likely does this as well.

Also note that his prices are listed per unit, and the products are not always sold "per [rpi] unit". For example, a 7 port USB hub powers 7 RPi units, so unit price = (item price) / 7.
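As a tiny Python illustration of that per-Pi pricing, with a made-up hub price (the $21 figure is hypothetical, not from the article):

```python
def price_per_pi(item_price, pis_served):
    """Share of a shared item's price attributed to each Pi it serves."""
    if pis_served <= 0:
        raise ValueError("pis_served must be positive")
    return item_price / pis_served


hub_share = price_per_pi(21.00, 7)  # hypothetical $21 hub with 7 ports
print(f"${hub_share:.2f} per Pi")
```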


Ah, that makes sense. I was thinking they were per-unit, not per rpi. They add up better that way.


Alternatively titled: "How to build a desk toy that functions better as a conversation piece than a practical replacement for a Hyper-V/Xen lab"


Please don't post snarky dismissals to HN.

There's nothing wrong with conversation pieces. Many stories here are, and funnily enough they tend to lead to good conversation. May there be more of them!


It is not a cluster per the configuration in the blog post, I was just kindly offering an alternate, correct title.

Real Pi clusters get posted every month or so, there is more required than an albeit cute rack case and improvised small PDU and switch.



Granted the author is not doing anything with it, but it is still a fun experiment. It can very easily be used to run a simple zookeeper/Kafka cluster or try out deployment automation etc.


you can spin up 10 containers easily on your linux box. it's like having 10 different systems.


What is your point?


Alternatively: "How to build a low cost Linux cluster simulator using Pi's" :)


Or just spin up 7 Xen VMs with no material cost. :)


Hyper-V doesn't support ARM as far as I know. Also, do you know of a good Xen on ARM host? (honestly asking, would be cool to find one for testing cross-compiled builds)


I have been a happy customer of Scaleway for some time. Their C1 server is ARM-based.



Sorry, I think you misunderstood? I meant hosting provider. If you have to go buy your own hardware then you may as well do precisely what the OP just did.


Scaleway like the other guy said. Yes, I misunderstood. Also the Pi is not a supported Xen host to my knowledge.


Let's see your project.


7 full-size PCs stacked on top of each other with a cheap switch and a power bar.


Cool.

* Have you done a write up?

* What software do you run to make it a cluster?

* What problems are you running on it?

* What have you learned by building it?

* What problems did you have?



