I always mocked out 3rd party services in my tests. I've never actually had a problem with a third party changing their API; that's the whole point of a versioned API anyway. I think when people talk about e2e tests, it's more about testing only the integration between contracts that you own.
It’s valid to say you’re e2e testing your system, just not e2e testing the “full system”.
This is why classifying tests into e2e, integration, and unit can cause confusion. I try to encourage people to avoid bucketing and instead say "this test should be more integration style than it currently is" or "this test should be more isolated than it currently is". At the end of the day, all testing mocks out the user, and factors that are part of the real-world system you care about, like old web browsers, may not be simulated in your test. So the way to get "real" e2e verification is probably monitoring real users, if you consider the user to be part of your "system".
I’ve run into this a few times, with an upstream package breaking and the breakage showing up in my tests. These days I try to avoid mocking as much as possible in tests.
One thing I've done is add the ability to run tests in both a "mocked" and a "real" mode. The mocked mode is fast and can be run constantly; the real mode is much slower, but tests the actual service. In most cases it's not much extra effort to set up, and I've caught bugs where my mocked version made false assumptions, didn't cover some edge case, or whatnot.
That said, I too avoid mocks unless there's a specific good reason to add one.
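If it helps, here's a minimal sketch of that kind of mocked/real switch using pytest. The payments service, the two client classes, and the USE_REAL_SERVICE variable are all hypothetical names for illustration, not anything from a real codebase:

    import os
    import pytest

    class FakePaymentsClient:
        """Fast in-memory stand-in; no network involved."""
        def charge(self, amount_cents: int) -> str:
            return "fake-charge-id"

    class RealPaymentsClient:
        """Talks to the actual third-party service; slow, needs credentials."""
        def charge(self, amount_cents: int) -> str:
            raise NotImplementedError("the real API call would go here")

    @pytest.fixture
    def payments():
        # The same tests run against either implementation; flipping the env
        # var exercises the real service and catches wrong assumptions in the fake.
        if os.environ.get("USE_REAL_SERVICE") == "1":
            return RealPaymentsClient()
        return FakePaymentsClient()

    def test_charge_returns_an_id(payments):
        assert payments.charge(500) != ""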
I like the idea of testpoints in code that can be switched on or off, a concept originally from the hardware side. Extending testpoints so they can switch between different test implementations is a useful generalization.
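Roughly what I have in mind (a sketch, not a library): a named seam that production code calls through, which tests can rewire or leave alone. Every name here (testpoint, set_testpoint, fetch_exchange_rate) is made up for illustration:

    from typing import Callable, Dict

    _testpoints: Dict[str, Callable] = {}

    def testpoint(name: str, default: Callable) -> Callable:
        """Return the override registered for `name`, or the default behaviour."""
        return _testpoints.get(name, default)

    def set_testpoint(name: str, impl: Callable) -> None:
        _testpoints[name] = impl

    def clear_testpoint(name: str) -> None:
        _testpoints.pop(name, None)

    # Production code routes the dependency through the testpoint.
    def fetch_exchange_rate(currency: str) -> float:
        fetch = testpoint("exchange_rate", default=_real_fetch_exchange_rate)
        return fetch(currency)

    def _real_fetch_exchange_rate(currency: str) -> float:
        raise NotImplementedError("the real HTTP call would go here")

    # A test switches the point to a fake, or leaves it alone to hit the real thing.
    set_testpoint("exchange_rate", lambda currency: 1.0)
    assert fetch_exchange_rate("EUR") == 1.0
    clear_testpoint("exchange_rate")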