I have always been in favor of changing the definition of incorporation to ensure that over time ownership transfers slowly but increasingly to the employees of the corporate entity. How that would work, though, would require detailed thought by experts more knowledgeable than I :)
It's so nice to see this echoed somewhere. This has been what I've been calling them for a while, but it doesn't seem to be the dominant view. Which is a shame, because it is a seriously accurate one.
The benefit of cloud has always been that it allows the company to trade capex for opex. From an engineering perspective, you gain scalability at the cost of complexity, but this is a secondary effect compared to the capex/opex tradeoff.
Hetzner is also a cloud. You avoid buying hardware; you rent it instead. You can rent either VMs or dedicated servers, but in both cases you own nothing.
How are you guys spinning up VMs, specifically Windows VMs, so quickly? I used to use VirtualBox back in the day, but that was a pain and required a manual Windows OS install.
I'm a few years out of the loop, and would love a quick point in the right direction : )
A lot of the world has moved on from VirtualBox to primarily QEMU+KVM and, to some extent, Xen, usually with some higher-level tool on top. Some of these are packages you can run on your existing OS and some are distributions with a built-in hypervisor for people who use VMs as part of their primary workflows. If you just want a quick-and-easy one-off Windows VM and to move on, check out quickemu.
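For what it's worth, a minimal sketch of that path wrapped in Python (assuming quickget/quickemu are installed and on PATH; the exact name of the generated .conf file may differ on your setup):

```python
import subprocess

# Sketch: fetch Windows installer media and boot a one-off VM via quickemu's CLI tools.
# Assumes quickget/quickemu are installed; the generated config filename may differ.
subprocess.run(["quickget", "windows", "11"], check=True)            # download media + write a .conf
subprocess.run(["quickemu", "--vm", "windows-11.conf"], check=True)  # boot the VM from that config
```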
Not sure about Windows, but I solved it for myself with a basic provisioning script (could also be an Ansible playbook) that installs everything on a fresh Linux VM in a few minutes. For macOS, there is Tart, a VM tool that works well on arm64 (very little overhead compared to alternatives). It could also be a rented cloud VM in a nearby location with low latency. Being a Neovim user also helped: I didn't have to worry about file sync when editing.
For coding I normally run Linux VMs, but Windows should be doable as well. If you do a fresh install every time then sure, it takes a lot of time, but if you keep the install around in VirtualBox then it's almost as fast as rebooting a computer.
Also, you can spin up an EC2/Azure/Google VM pretty easily too. I do this frequently and it only costs a few bucks. Often it's more convenient to have it in the data center anyway.
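For anyone who hasn't done it, a minimal boto3 sketch of the EC2 route (the AMI ID, key pair name, and instance type below are placeholders, not recommendations):

```python
import boto3

# Sketch: launch a throwaway EC2 VM. Substitute your own AMI, key pair, and region.
ec2 = boto3.client("ec2", region_name="us-east-1")

resp = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder AMI (pick a Windows or Linux image)
    InstanceType="t3.medium",
    KeyName="my-keypair",             # placeholder key pair name
    MinCount=1,
    MaxCount=1,
)
instance_id = resp["Instances"][0]["InstanceId"]
print("Launched", instance_id)

# Terminate it when you're done so it really does only cost a few bucks:
# ec2.terminate_instances(InstanceIds=[instance_id])
```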
Yesterday I used ChatGPT to transform a CSV file: move around a couple of columns, add a few new ones. Very large file.
At first glance it got them all right. But when I really looked through the data, for 3 of the cells it had clearly just made up new numbers. I found the first one by accident; finding the remaining two took longer than it would have taken to modify the file from scratch myself.
Watching my coworkers blindly trust output like this is concerning.
After we fix all the simple specious reasoning of stuff like Alexander-the-Great and agree to outsource certain problems to appropriate tools, the high-dimensional analogs of stuff like Datasaurus [0] and Simpson's paradox [1] etc. are still going to be a thing. But we'll be so disconnected from the representation of the problems that we're trying to solve that we won't even be aware of the possibility of any danger, much less able to actually spot it.
My take-away re: chain-of-thought specifically is this. If the answer to "LLMs can't reason" is "use more LLMs", and then the answer to problems with that is to run the same process in parallel N times and vote/retry/etc, it just feels like a scam aimed at burning through more tokens.
Hopefully chain-of-code[2] is better in that it's at least trying to force LLMs into emulating a more deterministic abstract machine instead of rolling dice. Trying to eliminate things like code, formal representations, and explicit world-models in favor of implicit representations and inscrutable oracles might be good business, but it's bad engineering.
> it just feels like a scam aimed at burning through more tokens.
I have a growing tin-foil-hat theory that the business model of LLMs is the same as the 1-900 psychic numbers of old.
For just 25¢, 1-900-PSYCHIC will solve all your problems in just 5 minutes! Still need help?! No problem! We'll keep working with you for only 10¢ a minute until you're happy!
To me the problem is: if a piece of information is not well represented in the training data, the LLM will always tend towards bad token predictions related to that information. I think the next big thing in LLMs could be figuring out how to tell whether a token was just a "fill in" or guess versus a well-predicted token. That way you could have some sort of governor that kills a response if it is getting too guessy, or at least provides some other indication that the provided tokens are likely hallucinated.
Maybe there is some way to do it based on the geometry of how the neural net activated for a token, or some other more statistics-based approach; idk, I'm not an expert. (A crude sketch of the statistics-based flavor is below.)
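Not an expert either, but here is a very rough sketch of the thresholding idea, assuming the sampler exposes per-token log-probabilities. The cutoffs are made up for illustration, and low probability is of course not the same thing as hallucination:

```python
import math

# Rough sketch of a "guessiness governor": flag (or abort) a response when too many
# tokens were sampled with low probability. Assumes you can get per-token
# log-probabilities from the model; the thresholds are arbitrary illustrations.
LOW_PROB = 0.2          # a token chosen with < 20% probability counts as a "guess"
MAX_GUESS_FRACTION = 0.15

def too_guessy(token_logprobs: list[float]) -> bool:
    """Return True if the fraction of low-probability tokens exceeds the budget."""
    if not token_logprobs:
        return False
    guesses = sum(1 for lp in token_logprobs if math.exp(lp) < LOW_PROB)
    return guesses / len(token_logprobs) > MAX_GUESS_FRACTION

# Toy example: mostly confident tokens with a couple of long-shot picks.
logprobs = [-0.05, -0.1, -2.5, -0.2, -3.0, -0.1]
print(too_guessy(logprobs))  # True for this toy input (2 of 6 tokens are "guesses")
```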
A related topic you might want to look into here is nucleus sampling. Similar to temperature but also different... it's been surprising to me that people don't talk about it more often, and that lots of systems won't expose the knobs for it.
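For anyone unfamiliar, a minimal numpy sketch of what nucleus (top-p) sampling does: keep only the smallest set of highest-probability tokens whose cumulative probability reaches p, renormalize, and sample from that set.

```python
import numpy as np

def nucleus_sample(probs: np.ndarray, top_p: float = 0.9, rng=None) -> int:
    """Sample a token index using nucleus (top-p) sampling.

    Keeps the smallest set of highest-probability tokens whose cumulative
    probability is >= top_p, renormalizes, and samples from that set.
    """
    rng = rng or np.random.default_rng()
    order = np.argsort(probs)[::-1]                   # indices sorted by descending probability
    sorted_probs = probs[order]
    cumulative = np.cumsum(sorted_probs)
    cutoff = np.searchsorted(cumulative, top_p) + 1   # how many tokens make up the nucleus
    nucleus = sorted_probs[:cutoff] / sorted_probs[:cutoff].sum()
    return int(order[rng.choice(cutoff, p=nucleus)])

# Toy example: four tokens; with top_p=0.9 the rarest token is excluded entirely.
probs = np.array([0.5, 0.3, 0.15, 0.05])
print(nucleus_sample(probs, top_p=0.9))
```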
Yup, I always take editing suggestions and implement them manually, then re-feed the edited version back in for new suggestions if needed. Never let it edit your stuff directly: the risk of stealth random errors sneaking in is too great.
Just because every competent human we know would edit ONLY the specified parts, or move only the specified columns with a cut/paste operation (or similar deterministically reliable operation), does not mean an LLM will do the same; in fact, it seems to prefer to regenerate everything on the fly. NO, just NO.
I don't mean to be rude, but this sounds like user error. I don't understand why anyone would use an LLM for this - or at least, why you would let the LLM perform the transformation.
If I were trying to do something like this, I would ask the LLM to write a Python script and validate the output by running it against the first handful of rows (like `head -n 10 thing.csv | python transform-csv.py`); a rough sketch of such a script is below.
There are times when statistical / stochastic output is useful. There are other times when you want deterministic output. A transformation on a CSV is the latter.
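To make that concrete, here's a sketch of the kind of script you might have the LLM write; the column names ("price", "qty", "total") are made up for illustration, not from the parent's file:

```python
import csv
import sys

# Sketch of a deterministic CSV transform: reorder a couple of columns and add a
# derived one. Swap the placeholder column names for whatever the real file contains.
def transform(reader: csv.DictReader, writer: csv.DictWriter) -> None:
    writer.writeheader()
    for row in reader:
        row["total"] = str(float(row["price"]) * float(row["qty"]))  # new derived column
        writer.writerow(row)

if __name__ == "__main__":
    reader = csv.DictReader(sys.stdin)
    # Put qty/price first, keep the remaining columns in order, append the new one.
    fields = ["qty", "price"] + [f for f in reader.fieldnames if f not in ("qty", "price")] + ["total"]
    writer = csv.DictWriter(sys.stdout, fieldnames=fields)
    transform(reader, writer)
```

Preview it exactly as suggested above (`head -n 10 thing.csv | python transform-csv.py`) before letting it loose on the very large file.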
It wouldn't be a difficult situation if these guys were ethical shops from the get-go, but they aren't; they're trying to staple minimally required ethics on afterwards, and it shows.
To play devil’s advocate, what ethical safeguards are OpenAI responsible for that they have failed to implement?
This is a wild and difficult to understand technology, even for the people building it. And their safeguards are constantly evolving.
I think you’re attributing to malice what should be attributed to people commercializing a novel technology that is, frankly, being exploited by users.
This is prime HR-style lying. The response is: problem statement; claim that reality is the opposite of the problem statement, with no justification given, despite obvious evidence to the contrary; statement that if reality doesn't match their claim, the worker is at fault. End of statement.
> While some worry AI will dehumanize the hiring process, we believe the opposite.
Look at the language Coinbase uses. Only their view is a "belief." The opposing view is a "worry." Others are motivated by fear. Only holy Coinbase is motivated by love!
This is, of course, doublethink. We all know that removing humans from the hiring process is, by definition, dehumanizing.
Coinbase's article would have been more palatable if it were truthful:
> Some believe AI will dehumanize the hiring process. We agree, and we're SO excited about that! I mean, we aren't in this business to make friends. We're in it to make cold, hard cash. And the less we have to interact with boring, messy human beings along the way, the better! If you're cold, calculating and transactional like us, sign on the dotted line, and let's make some dough!
But if they were that truthful, fun, and straightforward, they'd probably be more social, and they wouldn't have this dehumanizing hiring process to begin with.
The fact that a communist dictatorship declares itself to be a benevolent people's paradise doesn't change the brutal reality one bit. And unlike living under a communist dictatorship, we don't have to accept it. I will strongly vote for those who make this shit illegal.
SWIM worked as a PM at a company that decided to redo their UI. They ran into an issue on internal rollout, where they discovered their support team had for years been doing SQL injection through a specific form in the UI in order to run reports on the company's database. They had to stop the rollout and productionize the support team's (very valid) use cases in order to remove the SQL-injectable form.
I think you've drawn the wrong conclusions from the history of the web.
The web started out idealistic, and became what it did because of underregulated market forces.
The same thing will happen to ai.
First, a cool new technology that is a bit dubious. Then a consolidation, even if or while local models proliferate. Then degraded quality as utility is replaced with monetization of responses, except with an LLM you won't have the ability to either block the ads or assess the honesty of the response.
Not the commenter, but saying the market was unregulated does not imply that a regulated market would solve it. But I also agree that unregulated market forces are the best way to describe what happened to the internet.