While it's unarguably fun, I can't believe the effort required to become actually good at vim (as opposed to just using a few of the easier features) will ever pay itself off. (These days, at least - if you spend your days editing code in a terminal over a dialup connection, then it's absolutely worth it!)
Maybe there are scenarios where the busywork of text editing really is on your critical path, but even as a fluent coder who uses some verbose languages at times (my current project is C++ and IEC Structured Text, does it get any more blabby?) I still spend far more of my time looking at, and thinking about, code than I do actually typing. Any extraneous cognitive load just takes focus away from what I'm actually meant to be doing.
For me, it is definitely worth the effort. But I didn't try to learn it all at once, so it didn't feel like that much of an effort. I got good enough at first, then as I kept using it and found something I was doing repeatedly, I'd figure out the easier "vim way" to do it. Over the years all those little improvements have made me able to edit code far more quickly than I could possibly do with another editor.
I agree that we spend far more time thinking about code than we do editing it, but being fluent in vim means that when I have the code in my mind, or I decide how I want to refactor something, I can get those ideas out with very little effort. If you reduce the friction of translating thoughts into code, it means you can spend more time thinking about the code instead of futzing about with an editor. Once you get good with vim, it _reduces_ your cognitive load. I know that from the outside all these tricks and tips seem like random gibberish, and impossible to remember, but when you live inside the vim bubble it really does make sense. You find the tricks that fit your mind and your work the best. They quickly become muscle memory and you don't think about the keys at all.
That said, it _is_ a cult. I've seen cool features in other editors, and I've tried to use them, but it always feels like I'm typing with mittens on my hands. And I come crawling back to vim.
Did you take the time to learn vim? Just start using it. With only a small subset of commands you can be pretty damn efficient. The fact that you can compose actions means that you don't need to remember every command - you can just create them yourself. After a very short while you will "get" it. You can say to yourself "I want to change the next three words" and immediately do it by pressing "c3w". Or you can move around with your mouse. The former is much faster.
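To make the "compose actions" point concrete, here's a toy Python sketch of the operator + count + motion idea. It is not how vim is implemented, and every name in it (`run`, `MOTIONS`, `OPERATORS`, and so on) is made up for illustration. It also simplifies heavily: words are just runs of non-whitespace, the text is a single line, and I use "d" rather than "c" because real vim treats "cw" as a special case.

```python
import re

# Hypothetical toy model: a command is <operator><count><motion>.
# Motions only compute a target position; operators only act on a range.

def word_forward(text, pos, count):
    """Toy 'w' motion: advance to the start of the next word, count times."""
    pattern = re.compile(r"\S*\s*")  # rest of current word, then whitespace
    for _ in range(count):
        pos = pattern.match(text, pos).end()
    return pos

def line_end(text, pos, count):
    """Toy '$' motion: jump to the end of the (single-line) text."""
    return len(text)

def delete(text, start, end):
    """Toy 'd' operator: remove the range."""
    return text[:start] + text[end:]

def upper(text, start, end):
    """Toy 'gU' operator: uppercase the range."""
    return text[:start] + text[start:end].upper() + text[end:]

MOTIONS = {"w": word_forward, "$": line_end}
OPERATORS = {"d": delete, "gU": upper}

def run(text, pos, command):
    """Apply a command such as 'd3w' or 'gU2w' at cursor position pos."""
    m = re.fullmatch(r"(d|gU)(\d*)(w|\$)", command)
    if m is None:
        raise ValueError(f"unsupported command: {command!r}")
    op, count, motion = m.groups()
    end = MOTIONS[motion](text, pos, int(count or 1))
    return OPERATORS[op](text, pos, end)

line = "one two three four five"
print(run(line, 0, "d3w"))   # four five
print(run(line, 0, "gU2w"))  # ONE TWO three four five
```

Two operators and two motions already give four commands, and each new motion you learn works with every operator you already know. That is why a small subset goes a long way.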
BTW, I rarely use escape. I am using Spacemacs with evil-mode, and the default "fd" is perfect for me. It's much faster than using ESC.
Oh, I've been using it for years and I'd consider myself an adequate vim user. I'm quite comfortable in it, just using the basics - but given the small amount of time I spend using it these days, the added investment to learn more seems redundant.
It's not really about adding up time saved, it's about staying in the flow. I think you can probably achieve that by being really good at almost any editor, but vim probably does it a little better than most and it has the advantage that it's everywhere (including IDEs).