
"it's useless. GPT-2 level" Really?

In my experience they are at a similar level for most tasks.



In my experience, the latest experimental model is a bit better than the latest Claude/ChatGPT at creativity, but a little worse at general reasoning. They're still mostly comparable and certainly of the same generation.

Where it truly stands out is the 2M context window. That's game-changing for things like analyzing publications and books.


Yeah, in practice, for the tasks I give it: high hallucination rate, low context window, and it frequently refuses to act and suggests Googling instead. If the other guys didn't exist it could be useful, but as it stands it's as useful as GPT-2, because neither of them hits the threshold of usefulness.

I'm sure some benchmarks are decent, but when Google finally shutters the chatbot I'll be glad, because then I won't constantly be wondering whether I'm paying for it.

It's a shame, because Google's other AI features are incredible. Google Photos has fantastic facial recognition, and I can search it with descriptions of photos and it finds them. Their keyboard is pretty good. But Gemini Advanced is better off not existing. If it's the same team, I suppose they can't keep making hits. If it's a different team, then they're two orders of magnitude less capable.


> Low context window.

Gemini Advanced has a 1M context window. If that's low, I'm not sure anything else on the market will satisfy you.


It doesn't actually work. I pasted in a House Resolution and asked it a question, and it immediately spazzed and told me to Google it. I used Claude and it just worked. That's the thing about Gemini: it has a lot of stats, but it doesn't work. With Claude I could then ask about a specific section and look at the actual text. With Gemini it just doesn't do it at all.

This feels a lot like when people would tell me how the HP laptop had more gigahertz and stuff and it would inevitably suck compared to a Mac.


I used the Gemini Pro API to sort my folders, and it misclassified some files. When I asked why, it said it was done to "promote diversity in my folders".

...

This is very lame (and the sad part is that it's a real story).
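For what it's worth, that kind of LLM-driven folder sorting can be sketched roughly like this. This is a hypothetical sketch, not the commenter's actual script: the prompt wording, category names, and the injected `ask` callable (standing in for a real Gemini/Claude API call) are all assumptions. The point is to parse the model's reply into a plan and sanity-check it before moving anything, which would catch misfiles like the one described above.

```python
# Sketch of LLM-based file classification: build one prompt for a
# batch of file names, ask the model for one "filename -> folder"
# line per file, and parse the reply. `ask` abstracts the actual
# API call so the parsing/validation logic works without a network.

def build_prompt(file_names, categories):
    return (
        "Classify each file into exactly one of these folders: "
        + ", ".join(categories)
        + ".\nReply with one 'filename -> folder' line per file.\n"
        + "\n".join(file_names)
    )

def parse_reply(reply):
    plan = {}
    for line in reply.splitlines():
        if "->" in line:
            name, _, folder = line.partition("->")
            plan[name.strip()] = folder.strip()
    return plan

def classify_files(file_names, categories, ask):
    reply = ask(build_prompt(file_names, categories))
    plan = parse_reply(reply)
    # Flag anything the model skipped or invented, instead of
    # blindly moving files based on the raw reply.
    unknown = {n: f for n, f in plan.items() if f not in categories}
    missing = [n for n in file_names if n not in plan]
    return plan, unknown, missing
```

With a real client, `ask` would wrap the API call; the validation step is the part that matters, since the model's reply is not guaranteed to cover every file or stick to the allowed folders.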


The output from an LLM is like the path of a marble rolling across a surface shaped by its training data, and answers to "why" questions just continue the marble's path. You may get good-sounding answers to your "why" questions, but they are only that: good-sounding answers, not the real reasons, because LLMs lack the ability to introspect on their own thinking. Note: humans do not have this ability either, unless they use formal step-by-step reasoning.



