It's not just the difference in price that would compensate you, it is also the funding rate.

Other than that, futures tend to be more liquid and capital efficient.


SDXL has been outclassed for a while, especially since Flux came out.

Subjective. Most in creative industries regularly still use SDXL.

Once Z-image base comes out and some real fine-tuning can be done, I think it has a chance of replacing SDXL in that role.


I don't think that's fair. SDXL is crap at composition. It's really good with LoRAs to stylize/inpaint though.

Source?

Most of the people I know doing local AI prefer SDXL to Flux. Lots of people are still using SDXL, even today.

Flux has largely been met with a collective yawn.

The only thing Flux had going for it was photorealism and prompt adherence. But the skin and jaws of the humans it generated looked weird, it was difficult to fine tune, and the licensing was weird. Furthermore, Flux never had good aesthetics. It always felt plain.

Nobody doing anime or cartoons used Flux. SDXL continues to shine here. People doing photoreal kept using Midjourney.


> it was difficult to fine tune

Yep. It's pretty difficult to fine tune, mostly because it's a distilled model. You can fine tune it a little bit, but it will quickly collapse and start producing garbage, even though fundamentally it should have been an easier architecture to fine-tune compared to SDXL (since it uses the much more modern flow matching paradigm).

I think that's probably the reason why we never really got any good anime Flux models (at least not as good as they were for SDXL). You just don't have enough leeway to be able to train the model for long enough to make the model great for a domain it's currently suboptimal for without completely collapsing it.


> It's pretty difficult to fine tune, mostly because it's a distilled model.

What about being distilled makes it harder to fine-tune?


AFAIK a big part of it is that they distilled the guidance into the model.

I'm going to simplify all of this a lot so please bear with me, but normally the equation to denoise an image would look something like this:

    pos = model(latent, positive_prompt_emb)    # prediction conditioned on the (positive) prompt
    neg = model(latent, negative_prompt_emb)    # prediction for the negative/empty prompt
    next_latent = latent + dt * (neg + cfg_scale * (pos - neg))  # CFG-combined update

So what this does - you trigger the model once with a negative prompt (which can be empty) to get the "starting point" for the prediction, and then you run the model again with a positive prompt to get the direction in which you want to go, and then you combine them.

So, for example, let's assume your positive prompt is "dog", and your negative prompt is empty. Triggering the model with your empty prompt will generate a "neutral" latent, and then you nudge it in the direction of your positive prompt, in the direction of a "dog". And you do this for 20 steps, and you get an image of a dog.

Now, for Flux the equation looks like this:

    next_latent = latent + dt * model(latent, positive_prompt_emb)  # single call; the guidance is baked into the weights

The guidance here was distilled into the model. It's cheaper to do inference with, but now we can't really train the model too much without destroying this embedded guidance (the model will just forget it and collapse).

There's also an issue of training dynamics. We don't know exactly how they trained their models, so it's impossible for us to jerry-rig our training runs in a similar way. And if you don't match the original training dynamics when finetuning, it also negatively affects the model.

So you might ask here - what if we just train the model for a really long time - will it be able to recover? And the answer is - yes, but at this point most of the original model will essentially be overwritten. People actually did this for Flux Schnell, but you need way more resources to pull it off and the results can be disappointing: https://huggingface.co/lodestones/Chroma


Thanks for the extended reply, very illuminating. So the core issue is how they distilled it, i.e. that they "baked in the offset" so to speak.

I did try Chroma and I was quite disappointed, what I got out looked nowhere near as good as what was advertised. Now I have a better understanding why.


How much would it cost the community to pretrain something with a more modern architecture?

Assuming it was carefully done in stages (more compute) to make sure no mistakes are made?

I suppose we won't need to with the Chinese gifting so much open source recently?


> How much would it cost the community to pretrain something with a more modern architecture?

Quite a lot. Search for "Chroma" (which was a partial-ish retraining of Flux Schnell) or Pony (which was a partial-ish retraining of SDXL). You're probably looking at a cost of at least tens of thousands or even hundreds of thousands of dollars. Even bigger SDXL community finetunes like bigASP cost thousands.

And it's not only the compute that's the issue. You also need a ton of data. You need a big dataset, with millions of images, and you need it cleaned, filtered, and labeled.

And of course you need someone who knows what they're doing. Training these state-of-the-art models takes quite a bit of skill, especially since a lot of it is pretty much a black art.


> Search for "Chroma" (which was a partial-ish retraining of Flux Schnell)

Chroma is not simply a "partial-ish" retraining of Schnell, it's a retraining of Schnell after rearchitecting part of the model (replacing a 3.3B-parameter portion of the model with a 250M-parameter replacement with a different architecture).

> You're probably looking at a cost of at least tens of thousands or even hundred of thousands of dollars.

For reference here, Chroma involved 105,000 hours of H100 GPU time [0]. From a quick search, $2/hr seems to be about the low end of pricing for H100 time, so hundreds of thousands of dollars seems right for that model, and a base model trained from scratch would probably cost even more.
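
Spelling out that arithmetic (the $2/hr rate is the assumption above, not a quoted price):

    # Rough cost of Chroma's reported training compute.
    # 105,000 H100-hours is from the linked post; $2/hr is an assumed rental rate.
    h100_hours = 105_000
    usd_per_hour = 2.0
    print(f"~${h100_hours * usd_per_hour:,.0f} in GPU rental")  # prints: ~$210,000 in GPU rental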

[0] https://www.reddit.com/r/StableDiffusion/comments/1mxwr4e/up...


Do they include uptime guarantees in any contracts?

See sections 1.1 and 1.2; you get some credit back when they fail to deliver.

https://www.cloudflare.com/business-sla/


Definitely! In 2020, I reported an XSS vulnerability in GitLab using the onLoad attribute to run arbitrary JavaScript, and I was able to perform user actions without requiring any user interaction. For some reason it took them months to fix it after I reported it to them.

Are you referring to the new reasoning version of Kimi K2?


Is there a way to set a minimum release age globally for my pnpm installation? I was only able to find a way to set it for each individual project.


Did you try putting it in your global config file?

Windows: ~/AppData/Local/pnpm/config/rc

macOS: ~/Library/Preferences/pnpm/rc

Linux: ~/.config/pnpm/rc


I think it's a `pnpm-workspace.yaml` setting, for now, but PNPM has been pretty aggressive with expanding this feature set [1].

[1] https://pnpm.io/supply-chain-security
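
For reference, the workspace-level version looks roughly like the sketch below; the field name and minute units are my reading of the supply-chain-security page linked above, so double-check them against your pnpm version:

    # pnpm-workspace.yaml (sketch - verify against the pnpm docs linked above)
    minimumReleaseAge: 1440   # only install versions published at least 24 hours ago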


All of the best open source LLMs right now use mixture-of-experts, which is a form of sparsity. They only use a small fraction of their parameters to process any given token.

Examples:
- GPT OSS 120b
- Kimi K2
- DeepSeek R1
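
To make "only use a small fraction of their parameters" concrete, here's a minimal top-k routing sketch (toy code, not taken from any of the models above):

    import numpy as np

    # Toy mixture-of-experts layer: only the k highest-scoring experts run,
    # so most of the layer's parameters are never touched for a given token.
    def moe_layer(x, router_w, experts, k=2):
        scores = router_w @ x                      # one routing score per expert
        top = np.argsort(scores)[-k:]              # indices of the k best experts
        weights = np.exp(scores[top] - scores[top].max())
        weights /= weights.sum()                   # softmax over the chosen experts
        return sum(w * experts[i](x) for w, i in zip(weights, top))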


Mixture of experts sparsity is very different from weight sparsity. In a mixture of experts, all weights are nonzero, but only a small fraction get used on each input. On the other hand, weight sparsity means only very few weights are nonzero, but every weight is used on every input. Of course, the two techniques can also be combined.


Correct. I was more focused on giving an example of sparsity being useful in general, because the comment I was replying to didn't specifically mention which kind of sparsity.

For weight sparsity, I know the BitNet 1.58 paper has some claims of improved performance by restricting weights to be either -1, 0, or 1, eliminating the need for multiplying by the weights, and allowing the weights with a value of 0 to be ignored entirely.
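
As a toy illustration of that point (my own example, not code from the paper): with weights restricted to -1, 0, or 1, a matrix-vector product reduces to additions and subtractions, and the zero weights can be skipped entirely.

    import numpy as np

    # Ternary (BitNet-1.58-style) matvec: no multiplications by weights,
    # and zero-valued weights contribute nothing at all.
    def ternary_matvec(w, x):
        out = np.zeros(w.shape[0])
        for i, row in enumerate(w):
            out[i] = x[row == 1].sum() - x[row == -1].sum()
        return out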

Another kind of sparsity, while we're on the topic, is activation sparsity. I think there was an Nvidia paper that used a modified ReLU activation function to set more of the model's activations to 0.


“Useful” does not mean “better”. It just means “we could not do dense”. All modern state-of-the-art models use dense layers (both weights and inputs). Quantization is also used to make models smaller and faster, but never better in terms of quality.

Based on all examples I’ve seen so far in this thread it’s clear there’s no evidence that sparse models actually work better than dense models.


Yes, mixture of experts is basically structured activation sparsity. You could imagine concatenating the expert matrices into a huge block matrix and multiplying by an input vector where only the coefficients corresponding to activated experts are nonzero.

From that perspective, it's disappointing that the paper only enforces modest amounts of activation sparsity, since holding the maximum number of nonzero coefficients constant while growing the number of dimensions seems like a plausible avenue to increase representational capacity without correspondingly higher computation cost.
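
Here's a tiny numerical sketch of that block-matrix view (toy sizes, my own example): zeroing out the inactive experts' slots in a stacked input gives the same result as only running the selected experts.

    import numpy as np

    d, n_experts = 4, 3
    experts = [np.random.randn(d, d) for _ in range(n_experts)]
    x = np.random.randn(d)
    active = [2]                                  # pretend the router picked expert 2

    # Dense view: one big block matrix times a mostly-zero stacked input.
    big = np.concatenate(experts, axis=1)         # shape (d, n_experts * d)
    stacked = np.zeros(n_experts * d)
    for e in active:
        stacked[e * d:(e + 1) * d] = x            # only the active experts' slots are nonzero

    # Sparse view: only evaluate the active experts.
    assert np.allclose(big @ stacked, sum(experts[e] @ x for e in active))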


I once discovered and reported a vulnerability in Nextcloud's web client that was due to them including an outdated version of a JavaScript-based PDF viewer. I always wondered why they couldn't just use the browser's PDF viewer. I made $100, which was a large amount to me as a 16-year-old at the time.

Here is a blog post I wrote at the time about the vulnerability (CVE-2020-8155): https://tripplyons.com/blog/nextcloud-bug-bounty


I recently needed to show a PDF file inside a div in my app. All I wanted was to show it and make it scrollable. The file comes from a fetch() with authorization headers.

I could not find a way to do this without pdf.js.


The HTML <object> tag can show a PDF file by default. Just fetch it and pass the source there.

What is the problem with that exactly in your case?


I think it can't do that on iOS? Don't know if that is the relevant thing in the choice being discussed though. Not sure about Android.


This made me try it once more, and I got something to work with some Blobs, resource URLs, sanitization, and iframes.

So I guess it is possible.


Yeah, blobs seem like the right way to do it.


There does not seem to be a way to configure anything though. It looks quite bad with the default zoom level and the toolbar…


https://www.npmjs.com/package/pdfobject works well as a wrapper around the <object> tag. No mobile support though.


Right now, I'm using Grok Code Fast 1 for coding tasks. I've found it to be many times faster and cheaper than Claude 4.5 or GPT 5 Codex while still being quite capable.

The "code-supernova-1-million" stealth model was more capable, but it is unfortunately no longer available through any of providers. Hopefully it gets publicly released soon. My guess is that it is an updated code model from XAI/Grok.

Having to switch models every few weeks isn't the most fun, but I can't turn down the productivity that comes with it. I wish I could just go back to Neovim, but I can't justify the lower productivity.


Be careful with vanity address generators. A cryptocurrency market maker once lost around $160,000,000 in a vanity Ethereum address because the generator they used was only seeded with 32 bits of entropy.

https://www.forbes.com/sites/jeffkauflin/2022/09/20/profanit...
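
To see why 32 bits is hopeless, here's a quick back-of-the-envelope (the key-derivation rate is my assumption, not a figure from the article):

    # With only 32 bits of seed entropy, the entire space of possible
    # private keys can be regenerated by brute force very quickly.
    seed_space = 2 ** 32                  # ~4.3 billion possible seeds
    keys_per_second = 1_000_000           # assumed derivation rate on one machine
    hours = seed_space / keys_per_second / 3600
    print(f"{seed_space:,} seeds -> ~{hours:.1f} hours to regenerate every key")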


Indeed, be careful with anything that involves secret bits.

This tool uses proper crypto/rand initialisation of the starting key https://github.com/AlexanderYastrebov/onion-vanity-address/b...

Check out my other vanity generators (they all use crypto/rand):

https://github.com/AlexanderYastrebov/wireguard-vanity-key

https://github.com/AlexanderYastrebov/age-vanity-keygen

https://github.com/AlexanderYastrebov/ethereum-vanity-addres...

