I mostly agree, partly because I can't tell (even though I'm paying $20/month) which tier of Gemini I'm getting. I know most people won't care, but since they won't tell me, I'm going to assume I'm getting Gemini Flash from six months ago, and I'm not paying $20/month for that. I suspect that if they were honest about which model you're getting, people wouldn't pay for it.
They're making the mistake of optimizing for general case users (who don't care about model version) when they need to attract power users so that they can find product-market fit.
There's a vanilla 2TB plan for $10/month, the plan with 2TB and Gemini is $20/month. You could say that Gemini is $10/month then, but if you don't actually need 2TB or the other benefits then you're effectively paying $20/month because you can't unbundle Gemini from everything else.
Wait, where's this $10/month plan? I see it listed at https://one.google.com/about, but I can't actually choose it.
Edit: I found the answer on Reddit, as usual. You have to go to your plan settings at one.google.com/settings; only there does the cheaper plan appear. That's the only way to downgrade.
I have Gemini Advanced but I can't tell which model it is, yeah. Same problem. Whatever it is, it's useless. GPT-2 level. Can't do anything reasonable.
Claude 3.5 Sonnet and ChatGPT-4o are roughly the same for me, with the former edging ahead, and then way out there is this shoddy Google product that's worse than Llama-3 running on my own laptop. Even Llama-3 is better at remembering what's going on.
Fortunately, this time I managed to look it up: I have it for free because I have Google One 5TB, apparently. And there's no way to pay more for a better tier, so that's what I have. It's so bad that when Claude and ChatGPT run out of messages for me, I just use local Llama rather than touch Gemini. Atrocious product.
In my experience, the latest experimental model is a bit better than the latest Claude/ChatGPT at creativity, but a little worse at general reasoning. They're still mostly comparable and certainly of the same generation.
Where it truly stands out is the 2M context window. That's game-changing for things like analyzing publications and books.
Yeah, in practice, for the tasks I give it: a high hallucination rate, a small usable context window, and it frequently refuses to act and suggests Googling instead. If the other options didn't exist, it could be useful, but as it stands it's about as useful as GPT-2, because neither of them clears the threshold of usefulness.
I'm sure some benchmarks are decent but when Google finally shutters the chatbot I'll be glad because then I won't constantly be wondering if I'm paying for it.
It's a shame because Google's AI features otherwise are incredible. Google Photos has fantastic facial recognition, and I can search it with descriptions of photos and it finds them. Their keyboard is pretty good. But Gemini Advanced is better off not existing. If it's the same team, I suppose they can't keep making hits. If it's a different team, then they're two orders of magnitude less capable.
It doesn't actually work. I pasted in a House Resolution, asked it a question, and it immediately balked and told me to Google it. I tried Claude and it just worked. That's the thing about Gemini: it has great stats on paper, but it doesn't work. With Claude I could then ask about a specific section and compare against the actual text. With Gemini it just doesn't do it at all.
This feels a lot like when people would tell me how the HP laptop had more gigahertz and stuff and it would inevitably suck compared to a Mac.
The output from an LLM is like the path a marble takes across a surface shaped by its training data, and answers to "why" questions just continue the marble's path. You may get good-sounding answers to your why questions, but they are just that: good-sounding answers, not the real reasons, because LLMs lack the ability to introspect their own thinking. Note: humans don't have this ability either, unless they use formal step-by-step reasoning.
I pay for Gemini Advanced and it's much better than GPT-2, I think. I often run the same query through Gemini and GPT-4, and it's a toss-up which is better; each sometimes gets right what the other gets wrong.
But recently I asked Gemini "Bill Clinton is famous for his charisma and ability to talk with people. Did he ever share tips on how he does it?" and Gemini responded with some generic "can't talk about politics" answer, which was a real turn-off.