
Semi-related: I have a huge number of BC-250s [0]. Now that ETH mining is over, I'm looking for something interesting to do with them. Not looking to sell them (though that might be possible at quantity); I'd rather work with you to run something on them. They iPXE boot Ubuntu. GigE. No onboard storage, but 16 GB of RAM. Easily tuned for performance.

Thoughts?

[0] https://www.techspot.com/news/93980-14800-asrock-mining-rig-...




Contribute to the different volunteer computing projects: https://en.wikipedia.org/wiki/List_of_volunteer_computing_pr...


Good suggestion, I reached out to Folding@home. Let's see what happens.


GIMPS!


How well did the economics of this kind of operation end up working out? These seem like a fairly recent development, so they really wouldn't have had much time (say, the 500 days cited) to reach profitability.

It would be interesting to see how the GPU driver side of this works. If they boot Ubuntu, what kind of GPU driver is required to run the GPU? Is it compatible with the open-source amdgpu driver?

In any case, these would work rather well for some kind of VPS server hosting or maybe more like dedicated server hosting, given the density/form factor. That is assuming the driver situation doesn't preclude a choice of operating system...


They run standard Ubuntu 20.04 and can be upgraded to 22 or whatever else comes along.

Standard AMD Ubuntu driver (21.50.2.50002, but that can be upgraded as well). I heavily modified the AMD packaging down to just the necessary files because these iPXE boot (sadly, it's still around ~60 MB).

The bigger issue is that they don't have any onboard persistent storage (could be added, but the speed is limited to about 500mbit/s) and they are only gigE.

Running strictly from memory, they are also prone to memory corruption. Odd, I know, but I see it at the scale we operate. Thus, they need to be treated as interruptible machines. Reboot to running is about 60s.

So, quite a few limitations, but still good hardware, if we can find a good workload for them.
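To make the "interruptible" part concrete, this is roughly how I think about dispatching work to them (a simplified sketch; run_job_on_node and the ssh command are just hypothetical stand-ins for however the work actually gets shipped to a blade):

    import subprocess
    import time

    REBOOT_WAIT_S = 90   # ~60s reboot plus some margin
    MAX_ATTEMPTS = 5

    def run_job_on_node(node, job_cmd, timeout_s=600):
        """Hypothetical dispatcher: run a command on one blade over ssh.

        Returns True on success, False if the node hung, dropped ssh, or the
        job died -- all of which get treated the same way.
        """
        try:
            result = subprocess.run(
                ["ssh", "-o", "ConnectTimeout=10", node, job_cmd],
                timeout=timeout_s,
            )
            return result.returncode == 0
        except (subprocess.TimeoutExpired, OSError):
            return False

    def run_interruptible(node, job_cmd):
        """Retry a job, assuming the node can silently wedge at any time."""
        for attempt in range(1, MAX_ATTEMPTS + 1):
            if run_job_on_node(node, job_cmd):
                return True
            # Node is suspect: power-cycle it (however your PDU does that)
            # and wait out the ~60 second reboot before trying again.
            print(f"{node}: attempt {attempt} failed, rebooting and retrying")
            time.sleep(REBOOT_WAIT_S)
        return False

The practical consequence is that any state a job cares about has to live off-blade; the boards themselves are best treated as stateless workers.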


Is this memory corruption you speak of silent, or simply fatal?

This could be a significant problem if the workload requires some form of integrity, since the hardware could be quietly introducing errors into otherwise normal-looking computations.

I remember having this issue with AMD cards when mining too, where it was common to undervolt and overclock the memory. I wonder if any of those tuning tools work here, and if it would be possible to underclock the memory to increase its stability.
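If the standard amdgpu overdrive interface is even exposed on these APUs (a big if: I haven't checked, and it normally needs amdgpu.ppfeaturemask enabling overdrive at boot), underclocking the memory could in principle be scripted along these lines. Treat the sysfs paths and the "m 1 <MHz>" command syntax as assumptions rather than anything verified on a BC-250:

    # Sketch only: assumes the amdgpu overdrive sysfs interface exists on this
    # part and accepts the Navi-style "m 1 <MHz>" memory-clock command, neither
    # of which I've verified on a BC-250.
    DEVICE = "/sys/class/drm/card0/device"

    def set_max_mem_clock(mhz: int) -> None:
        # Overdrive writes require the manual performance level.
        with open(f"{DEVICE}/power_dpm_force_performance_level", "w") as f:
            f.write("manual")
        with open(f"{DEVICE}/pp_od_clk_voltage", "w") as f:
            f.write(f"m 1 {mhz}\n")  # set the upper memory clock state
        with open(f"{DEVICE}/pp_od_clk_voltage", "w") as f:
            f.write("c\n")           # commit the change

    set_max_mem_clock(1500)  # hypothetical target; start conservative and test

Whether that actually buys back stability at this density is another question.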

Either way, this echoes the sentiment I've generally had about hardware intended for mining, including the bitcoin-branded 2000-watt power supplies built with bottom-of-the-barrel parts. Most hardware built for mining was built with exactly one purpose in mind and has significant warts when you try to repurpose it. The constraints and requirements of cryptomining are really quite different from those of most modern IT systems.


Silent. It'll be things like you can't ssh into the box any more, or you log in and can't reboot it. It's likely due to ethash mining, which is heavily RAM-bound, combined with the voltage/clock settings. Luckily, it is easy to change those settings to build in more stability. I have a process that auto-tunes the machines for known instabilities... but the weird silent RAM corruption cases are much harder to detect.
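For the silent ones, about the best I've come up with is a canary on each blade: fill a chunk of RAM with a known pattern, periodically re-verify it, and pull the node from rotation when it disagrees. A stripped-down sketch of the idea (not our actual tooling):

    import hashlib
    import os
    import sys
    import time

    CANARY_MB = 512          # RAM dedicated to the canary buffer
    CHECK_INTERVAL_S = 300   # re-verify every 5 minutes

    def main():
        buf = bytearray(os.urandom(CANARY_MB * 1024 * 1024))
        expected = hashlib.sha256(buf).hexdigest()
        while True:
            time.sleep(CHECK_INTERVAL_S)
            if hashlib.sha256(buf).hexdigest() != expected:
                # The pattern changed underneath us: flag the node so the
                # scheduler stops sending it work.
                print("canary mismatch: probable silent memory corruption")
                sys.exit(1)

    if __name__ == "__main__":
        main()

It only catches flips inside the buffer it owns, so it's more of a smoke detector than real coverage.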

You're totally right that mining hardware was mostly single-purpose, especially at large scale. Those PSUs did the job, but yes, they were generally hand-soldered in China and prone to do weird things.

It certainly puts a damper on what can be done with them now that the merge has happened, but I'd like to keep trying to find uses!


I wonder if these have any chance of running TensorFlow or other ML applications. The problem, again, would be that there is no local storage, so the 4 GB Stable Diffusion model might be a bit much, but once loaded it might work well for that kind of non-critical application.
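Something like the following is what I had in mind, assuming a ROCm build of PyTorch even works on this APU (completely unverified) and using tmpfs as the model cache since there's no disk; the model ID and paths are just illustrative:

    # Sketch under big assumptions: ROCm-enabled PyTorch that supports this
    # RDNA1 APU, and enough of the 16 GB left over once the weights are loaded.
    import os

    # No local disk, so keep the Hugging Face cache in RAM-backed tmpfs.
    os.environ["HF_HOME"] = "/dev/shm/hf-cache"

    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",  # ~4 GB of weights pulled over the network
        torch_dtype=torch.float16,
    )
    pipe = pipe.to("cuda")  # ROCm builds of PyTorch still expose the device as "cuda"

    image = pipe("a rack of repurposed mining blades, oil painting").images[0]
    image.save("/dev/shm/out.png")

Every cold start would re-download the weights over GigE, so given the interruptible-machine caveat above, anything serving from these would want a warm pool.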

I think one of the reasons GPU memory corruption may cause the system to freeze is that the GPU and main memory are unified on APUs, which would probably explain the machines sometimes being hard to log in to or use.


It is effectively this GPU with RDNA1: https://en.wikipedia.org/wiki/Radeon_RX_5000_series

Yes, shared memory is definitely the cause.


> Running strictly from memory, they are also prone to memory corruption. Odd, I know, but I see it at the scale we operate. Thus, they need to be treated as interruptible machines. Reboot to running is about 60s.

This would be an instant dealbreaker for me. To quote the inimitable Sweet Brown, _ain't nobody got time for that_. [1]

1: https://en.wikipedia.org/wiki/Ain't_Nobody_Got_Time_for_That


Do these support graphics of any kind? Can you run a test with Vulkan? Can the boards run Windows and correctly start DirectX?


It is effectively this GPU with RDNA1: https://en.wikipedia.org/wiki/Radeon_RX_5000_series

I don't know about Windows; at this scale, I doubt it would be easy to iPXE boot it onto this many blades over GigE.


Stable Diffusion hosting


Seems to be AMD cards, which have way less support in the ecosystem.



