Hacker News | benatkin's comments

To save a click, it's just a fancy front end for Whisper plus a weaker CPU-only model. It has a demo video that seems impressive, but the speech is careful to sound casual while having no meaningful flaws that would cause it to mess up. If you want to make a speech-to-speech tool, which is what this post asks about, it would make more sense to go straight to Whisper.

I use it, sponsor it, and did a small PR. One of its goals is to be the most “forkable” starting point, if I recall. But yes, it's just voice input. It’s meaningfully better than the Mac dictation for me.

You can use the GPU too. I have to admit the app is very easy to use and super convenient. Kudos to the creator.

Yes, and with GPU, it's Whisper, which has been mentioned elsewhere in this article's comments. I mean that handy.computer provides the other option as a fallback for those who can't or don't want to use the GPU.

I'm going to propose a law for these AI orchestration systems based on Greenspun's Tenth Law:

> Any sufficiently complicated AI orchestration system contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Gas Town.


Isn't it the other way around? Gas Town is an ad hoc, informally specified, bug-ridden, slow implementation of other AI orchestration systems.

That statement is a bit early, no?

To me, this sounds more monolithic than containers. I think I'd like something less monolithic. However, those who like monorepos might be quicker to develop an interest in this. I could of course use containers within the MicroVM, which is what I really want anyway, because I want lighter-weight containers than MicroVMs for sandboxes.

While looking into giving Fly another shot as a cloud provider, even though I think it's still pretty much a commodity for me, I found an issue in Google: I searched for "fly.io sao paolo" and the title of the first result on fly.io is "Regiones · Fly Docs", translated from English to Spanish. While I find the translation in titles on Google annoying, I haven't often seen the characters messed up like this. I reproduced this in Incognito at this URL: https://www.google.com/search?hl=es&q=fly.io%20sao%20paolo



It also shows that it isn't perfectly organized, that it isn't an ideal model for knowledge aggregation. If it's ideal for it to be globally consistent, then it doesn't have that. If it's ideal for it to be adapted to different cultures, then it doesn't have that either, because the divisions are based only on language. However, Wikipedia really is an amazing place, and it should continue to be preserved and improved.

Found yours while searching for "llm"

I propose the name tokables for the compressed data produced by this. A play on tokens and how wild it is.

please pass the tokables to the left hand side

People's writing is influenced by what they read, so such a strong objection to someone suggesting that an LLM might have been involved in the text of a blog post won't fly with me.

Shrug-emoji. I copyedited this post. I get that people don't have a lot of my writing to go off of on HN, it's a real problem I have.

I wondered if maybe it was about Vienna


I wanted to say that I think it's overrated in terms of its position on HN, but rather than criticize side issues of it, which often point to something being a weak article in general, I probably should have just said exactly what I don't like about it as a whole. So I'll do that.

I think the headline is problematic because it suggests the raw photos aren't very good and thus need processing. However, the raw data isn't something the camera makers intend to be put forth as a photo; the data is intended to be processed right from the start. The data can of course be presented as images, but those serve as visualizations of the data rather than as the source image or photo. Wikipedia does it a lot more justice: https://en.wikipedia.org/wiki/Raw_image_format If articles like OP's catch on, camera makers might be incentivized to game the sensors so their output makes more sense to the general public, and that would be inefficient, so the proper context should be given, which this "unprocessed photo" article doesn't do in my opinion.


> I think the headline is problematic because it suggests the raw photos aren't very good and thus need processing

That’s not how I read either the headline or the article at all. I read it as “this is a ‘raw photo’ fresh off your camera sensor, and this is everything your camera does behind the scenes to make that into something that we as humans recognize as a photo of something.” No judgements or implications that the raw photo is somehow wrong and something manufacturers should eliminate or “game”.

