If ruff & uv have proved anything, it's that a tool that's effortless, a net-positive and fast will get adopted.

New typecheckers don't need to be perfect. They need to be good enough, easy to integrate, and have low false positives. Sure, they will improve with time, but if it feels like a pain then no one will pick it up. Python users are famously averse to tools that slow down their dev cycles, even if it means better long-term stability.

BasedPyright is popular because it comes built-in with Cursor and disappears into the background. I have a positive bias towards Astral figuring it out given their track record. But, none of these tools have reached the point of effortlessness just yet.



> New typecheckers don't need to be perfect. They need to be good enough, easy to integrate, and have low false positives. Sure, they will improve with time, but if it feels like a pain then no one will pick it up. Python users are famously averse to tools that slow down their dev cycles, even if it means better long-term stability.

I really don't agree. Sure, they don't need to be perfect, but keep in mind many codebases have already standardized on something like (based)pyright or mypy, so there's a migration cost. If your analysis has a lot of false positives or misses a lot of what those type checkers catch, then there's little incentive to migrate. Sure, ty and pyrefly are much faster, but at the end of the day speed is only one consideration for a type checker.


Think of it this way. There are 2 groups. Group 1 has avoided typecheckers because they're a PITA, and Group 2 has configured mypy/pyright despite the devx pain. Group 1 is a lot bigger than Group 2. Group 2 is more lucrative per unit than Group 1.

With enough time, ty and pyrefly will approach perfection. If they're easy enough to use today, Group 1 should be able to adopt them without any extra pain (some typechecker is better than no typechecker). This gives them momentum. In a couple of years, ty/pyrefly may finally be better than mypy/pyright. Then Group 2 can start their ports.

This way, no one misses out. Group 2 still gets their perfect typechecker, just not immediately. But in that time, Group 1 is getting familiar with using typecheckers, and their sheer size helps build institutional momentum towards typecheckers as an essential piece of any Python dev flow.

If A ('a certain class of Python problems is permanently solved by typecheckers') and B ('every Python user has some typechecker') both become true, that opens a lot of doors. Today, B is a harder problem than A. I'm guessing that compiled/JIT Python will be the next frontier once Python typing is solved. Wide typechecker adoption is a blocking requirement for that door to be opened.


No, I'm pretty sure they need to be perfect. If the tool's telling me bad information about types, first I'm going to lose a stupid amount of time debugging based on that wrong info, and then swear off the tool forever after I realize it was the tool that was wrong.


Type checking in Python involves guessing. Take a program like

x = [3]

Should the type checker guess that the type of x is list[Any], or list[int], or list[Literal[3]], or something else? Libraries you depend on will pose more difficult versions of this question.
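To make the ambiguity concrete, here is a minimal sketch (assuming Python 3.11+ so typing.reveal_type exists at runtime; the exact wording of the output varies by checker). Current mypy and pyright typically settle on list[int] here, but each of the other readings is defensible:

    from typing import Any, Literal, reveal_type

    x = [3]
    reveal_type(x)  # mypy and pyright typically report list[int], but that's a design choice

    # All of these are defensible readings of the same assignment:
    permissive: list[Any] = [3]       # accepts later appends of anything
    middle: list[int] = [3]           # the usual compromise
    precise: list[Literal[3]] = [3]   # the strictest possible inference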

So it's not possible to expect the tool to be "perfect", whatever that means - usually people think of it as meaning it allows all code that they think is idiomatic and reasonable, and disallows all code that could raise a TypeError at runtime. You can only expect the tool to be reasonably good and perhaps to have an opinionated design that matches the way you would like to write Python.


Arguably it's the job of a linter and not a type checker, but if your code just assigns random things to random variables that never get used again, I hope something would point that out before it passes code review and gets merged to main, ideally before it's even sent for review.

Ignoring the question of why there's some random assignment to x in the first place, where are its type hints? Those were added starting with Python 3.5 via PEP 484 just over 10 years ago and have been added to since then. If our goal is maintainable code at scale, the first thing I'd expect of a type checker is for it to communicate with the user, stating that the variable in file foo.py on line 43 is missing a type hint and couldn't be guessed with confidence, and that it's just gonna guess it's a list[int]. For bonus points, the tool could run in interactive mode and ask the user directly. But even without a hint, assuming the variable actually is used later, how does it get used?
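For what it's worth, a one-line hint already removes the guesswork in the toy case above (a sketch, not any particular checker's required style):

    from typing import Literal

    # foo.py (hypothetical): an explicit annotation tells the checker which reading was meant.
    x: list[int] = [3]           # no element type left to infer
    y: list[Literal[3]] = [3]    # or insist on the strictest reading explicitly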

It seems like you'd just start with the strictest interpretation, list[Literal[3]], and relax it as the variable gets used. If it gets fed into a function that doesn't have hints and can't be introspected for whatever reason, relax it to list[int] and then further to list[Any] as necessary. Then, depending on the mode the tool is configured to run in, print nothing, or a warning, or error out, or ask the user what to do if running interactively. The config options could be strict, tolerant, and loose. Or maybe enforcing, permissive, and loose. Whatever color we want this bike shed to be.
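A sketch of that start-strict-then-relax idea; the function names are made up, the comments describe the proposed scheme rather than any real checker, and real checkers differ on how far they widen:

    def takes_ints(values: list[int]) -> int:
        return sum(values)

    def untyped_sink(values):   # no hints and nothing to introspect
        return values

    x = [3]              # strictest reading: list[Literal[3]]
    x.append(4)          # under the scheme above, this use forces at least list[int]
    takes_ints(x)        # still consistent with list[int]
    untyped_sink(x)      # unannotated callee: relax further, effectively list[Any]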

More advanced tools let the user configure individual issues and their corresponding severity levels, but that may be considered too many options, overwhelming users who then don't use the tool at all.

As far as perfect typing goes, I mean, yeah, ideally: after satisfying the type checker in strict mode, meaning no types had to be guessed because every variable and other location missing a needed annotation was reported, or even annotated interactively, or, if we're real fancy, automatically with a comment (IMO tools should feel free to edit code directly, as a non-default option), the program could not unexpectedly throw an uncaught TypeError. That's a lot to ask of a dynamically but strongly typed language, so I'd settle for it never being utterly wrong.

What that means is: if the type checker says a function returns a string, but out of the 200 calls to that function it throws an error saying ten of those call sites expect it to return a dict, then this hypothetical type checker had better be right about that. Nothing is worse than losing hours rooting around in the code looking for something that isn't actually there (or an errant semicolon, but this isn't C).
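Roughly that scenario in code (hypothetical names); the point is that if a checker flags the second call site, the diagnostic had better correspond to a real bug:

    def load_config() -> str:          # the checker believes this returns a string
        return "debug=true"

    text: str = load_config()          # the ~190 call sites like this are fine

    settings: dict[str, str] = load_config()   # flagged: str is not dict[str, str]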


> Ignoring the question of why there's some random assignment to x in the first place, where are its type hints? Those were added starting with Python 3.5 via PEP 484 just over 10 years ago and have been added to since then.

The code is in a third party library which doesn't have comprehensive type hints, and perhaps has type expectations so complicated that they can't be expressed in the Python type system.

Even if you enforce type hints on every line in your internal code, you're going to be relying at some point on libraries which are reliable but poorly type hinted. If you're not using that vast ecosystem of libraries, Python probably wasn't the best choice.


Also, in the past, many tools were dropped by users because of too many false positives. Tools like Astree and RV-Match gained adoption by having few or no false positives.


If typechecking is tongue in cheek, why even bother?


If car crashes still kill people, why wear safety belts?


I don't think that's an accurate metaphor. Seatbelts are expressly a runtime solution to car crashes, whereas Python's type checking is a) only done during development (not even during build), and b) completely reliant on third-party tools.
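Point (a) in concrete terms, as a minimal sketch: annotations are ignored when the program runs, so only an external checker ever sees the mismatch.

    def double(n: int) -> int:
        return n * 2

    print(double("ha"))   # runs happily and prints "haha"; only mypy/pyright/ty/etc. would object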

If you're looking for a car analogy, I would suggest comparing Python type checking to installing speed cameras on the factory floor.


The analogy would be mandatory inspections of certain components or manufacturing processes, to check that rules and specs are being followed during assembly.


That's a bad analogy. A better one: should we check the brakes (compile time) before we start the car and drive down the highway (runtime)?

At that point you either did check the brakes or you did not. If not, you're out of luck if the brakes in fact don't work.


Or you test the brakes as you stop to see that everything is clear when exiting your driveway onto the street.


This sounds like the typical "it's good enough" argument. Say we test the brakes manually nine times out of ten. That means we get really hard-to-diagnose bugs because "this did work when I tested locally", and users get more runtime errors because of half-assed development.

Type systems should be treated like maths: it's either correct or it's not. In the end, a type system basically is just that, maths behind the scenes, more specifically a branch of maths called category theory.


Because "all or nothing" is a bad idea.


How so? We have multiple languages with static types, and most people know the insane benefit you get from compile-time guardrails.



