An NVIDIA DGX Spark is $4000. Pair that with a relatively cheap second box to run GitLab in the corner and you would have a pretty good local AI inference setup. (You'd probably have to write a nontrivial amount of software to get your setup where you want it.)
The local models are right on the edge of being really useful. There's a tipping point where accuracy is high enough that getting things done is easy, versus models getting continuously stuck. We're in the neighborhood.
Alternatively, just run GitLab locally and use one of the many APIs; those are much more stable than GitHub. Honestly, just get yourself a Claude subscription.
The DGX Spark is not good for inference, though: it's very bandwidth limited, around the same as a lower-end MacBook Pro. You're much better off with Apple silicon for performance and memory size at the moment, but I'd recommend holding off until the M5 Max comes out early in the year, as the M5 has vastly superior performance to any other Apple silicon chip thanks to its matmul instruction set.
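To put a rough number on "bandwidth limited": a common back-of-the-envelope estimate is that generating each token has to stream the model weights from memory once, so decode speed tops out near memory bandwidth divided by model size. A minimal sketch of that arithmetic follows; the bandwidth and model-size figures are placeholders I picked for illustration, not measurements of any particular machine.

```python
def decode_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Rough ceiling on decode speed, assuming each generated token
    requires reading all model weights from memory exactly once."""
    if model_size_gb <= 0:
        raise ValueError("model_size_gb must be positive")
    return bandwidth_gb_s / model_size_gb


# Hypothetical machines: the names and GB/s values are assumptions for illustration.
machines = {
    "box_a": 273.0,
    "box_b": 546.0,
}

# Hypothetical model footprint in GB (weights only, after quantization).
model_size_gb = 40.0

for name, bandwidth in machines.items():
    ceiling = decode_tokens_per_second(bandwidth, model_size_gb)
    print(f"{name}: about {ceiling:.1f} tokens/s at best")
```

This ignores KV-cache reads, batching, and prompt processing (which is compute-bound rather than bandwidth-bound), so treat it as an upper bound for comparing machines, not a benchmark.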
Oof, I was already considering an upgrade from the M1 but was hoping I couldn't be convinced to go for the top of the line. Is the performance jump from the M# -> M# Max chips that substantial?
The main jump is from anything to M5, not simply because it's the latest but because it has matmul instructions similar to a CUDA GPU, which fixes the slow prompt processing on all previous-generation Apple Silicon chips.
I can't say I'm not tempted looking at the Spark, I could probably save some cash on heating my house with that thing. Though yeah unless there's some good software already built around a similar LLM workflow I could use it'd probably be wasted on me, or spend its time desperately trying to pay for itself with crypto mining.
Adding Claude to my rotation is starting to look like the option with the least amount of building the universe from scratch. I have to imagine it can be used in a similar or identical workflow to the Copilot one where it can create PRs and make adjustments in response to feedback etc.
>Though yeah unless there's some good software already built around a similar LLM workflow I could use it'd probably be wasted on me, or spend its time desperately trying to pay for itself with crypto mining.
A big part of my success using LLMs to build software is building the tools to use LLMs and the LLMs making that tool building easy (and possible).
I tried this for a little while and couldn't really get passionate about it; I have too many other backlogged projects that I was eager to tear into with LLMs and I got impatient. That was a while ago though and the ROI for building my own tools has probably gotten a lot more attractive.
I started building my own tool set because I was doing too many projects with LLMs and getting frustrated. There was a very real need for organization and tooling to get repetitive, meaningless tasks out of the way and to keep all of my projects organized so I could see what was going on.