
Has anyone prepared a comparison to Mixtral 8x22B? (Life sure moves fast.)


The comparison with Mixtral 8x22B is in the official post.


Where? I only see comparisons to Mistral 7B and Mistral Medium, which are totally different models.


https://ai.meta.com/blog/meta-llama-3/ has it about a third of the way down. It's a little bit better on every benchmark than Mixtral 8x22B (according to Meta).


Oh cool! But at the cost of twice the VRAM and only 1/8th of the context, I suppose?


Llama 3 70B takes half the VRAM of Mixtral 8x22B, but it needs almost twice the FLOPS/bandwidth. Yes, Llama's context is smaller, although that should be fixable in the near future. Another difference is that Llama is English-focused while Mixtral is more multilingual.
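
A rough back-of-envelope sketch in Python of where those ratios come from. The parameter counts are approximate public figures treated as assumptions here: ~70B params, all active, for Llama 3 70B; ~141B total with ~39B active per token (2 of 8 experts) for Mixtral 8x22B.

    # Back-of-envelope: dense model vs. mixture-of-experts model.
    # Weight VRAM scales with total parameters; per-token compute scales
    # with the parameters actually activated for each token.

    def weight_vram_gb(total_params_b, bytes_per_param=2):
        # Weight memory in GB at fp16 (2 bytes/param), ignoring KV cache and overhead.
        return total_params_b * bytes_per_param

    def gflops_per_token(active_params_b):
        # Rough forward-pass cost: ~2 FLOPs per active parameter per token.
        return 2 * active_params_b

    models = {
        "Llama 3 70B (dense)": {"total": 70,  "active": 70},   # assumed counts
        "Mixtral 8x22B (MoE)": {"total": 141, "active": 39},   # assumed counts
    }

    for name, p in models.items():
        print(f"{name}: ~{weight_vram_gb(p['total']):.0f} GB weights (fp16), "
              f"~{gflops_per_token(p['active']):.0f} GFLOPs/token")

Under those assumptions this prints roughly 140 GB vs. 282 GB of weights and ~140 vs. ~78 GFLOPs per token, which is where "half the VRAM, almost twice the compute" comes from.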


also curious how it compares to WizardLM 2 8x22B



