This is the story of how I bought enterprise-grade AI hardware designed for liquid-cooled server racks that was converted to air cooling, and then back again, survived multiple near-disasters (including GPUs reporting temperatures of 16 million degrees), and ended up with a desktop that can run 235B parameter models at home. It’s a tale of questionable decisions, creative problem-solving, and what happens when you try to turn datacenter equipment into a daily driver.
# Tell the driver to completely ignore the NVLINK and it should allow the GPUs to initialise independently over PCIe !!!! This took a week of work to find, thanks Reddit!
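For context, a comment like this would normally sit directly above a kernel-module option in a modprobe configuration file. A minimal sketch, assuming the setting in question is the NVIDIA driver's `NVreg_NvLinkDisable` module parameter; the excerpt above does not show the actual line, and the filename here is hypothetical:

```
# /etc/modprobe.d/nvidia-nvlink.conf  (hypothetical filename)
# Assumption: NVreg_NvLinkDisable=1 asks the NVIDIA kernel module to skip
# NVLink initialisation, leaving each GPU to come up over PCIe on its own.
options nvidia NVreg_NvLinkDisable=1
```

Module options of this kind usually only take effect once the driver is reloaded or the machine is rebooted.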
I needed this info, thanks for putting it up. Can this really be an issue for every data center?
> why didn't he just fit the two H100s into a better desktop box?
I expect because they were no longer in the sort of condition to sell as new machines? They were clearly well used, and selling them "as seen" carries the lowest reputational risk when offloading them.
There also weren't H100s available to scavenge. GH200 puts the Grace CPU and H100 GPU on a big module with a custom form factor and connectors, so the only viable route for using those GPUs was to keep all the electronics together and build a suitable case and cooling system around them. There wasn't any way to adapt any of this for use in an ordinary EATX case or with a different CPU, because the GPUs weren't PCIe add-in cards.
At that pricing I honestly thought they fell off a truck. Even well-used H100s go for more than that entire system. In the US an RTX 6000 Ada is already close in price.
These are on a custom board from Nvidia, so it's not possible to separate them. I think the seller usually gets H100s and fits them into a custom case, with a PCIe adapter for the server GPUs.
This thing was too unwieldy to make into a desktop (you can see how much effort it took), and was in pretty bad condition. I think he just wanted to get rid of it without having to deal with returns. I took a bet on it, and was lucky it paid off.
We build these desktops from Nvidia servers we buy from reputable manufacturers like Pegatron, Gigabyte, ASRock Rack, and many more.
H100 PCIe and GH200 are two very different things. The advantages of Grace Hopper are much higher interconnect speeds and bandwidth, and lower power consumption.
True, which is why I said “might”. Even in the US. I only have to call ahead if I want smaller bills - $20 and $100 they usually have plenty of unless it’s a tiny branch.
Cash deposits or withdrawals over $10k in the US will be reported to the Treasury but if you don’t do them often it won’t raise a big flag, and the bank doesn’t care what you do with it. Treasury only cares if they think you are trying to evade taxes.
In Germany, where large items are often purchased with cash, it would be unremarkable if you did it several times a year.
I've had a bit of practice, but I don't have the right gear for this level of soldering. It took maybe an hour to solder in 2 components, after many failed attempts. Persistence beats intelligence?