I think dynamic analysis is incredibly powerful and criminally underused in IDEs and other dev tools.
An idea occurred to me about 6 months ago and has been fermenting in my mind since then: what if (e.g.) a Python VM had a mode where it recorded the type info of every identifier as it executed the code and persisted this info in a standard format? Later, when you open a .py file in an IDE, all the type info for the objects defined or named in the file is pulled from the persistent record, crunched by the IDE, and used to present a top-notch dev experience.
The traditional Achilles' heel of static analysis and type systems is Turing completeness. Traditional answers range from trying and giving up (Java's Object or Kotlin's Any?), to not bothering to try in the first place (Ruby, Python, etc.), to very cleverly restricting what you can say so that you never or rarely run into the embarrassing problems (Haskell, Rust, ...), to whatever the fuck C++'s type system is. The type-profiling approach suggests another answer entirely: what if we just execute the damn thing without regard for types, like we already do for Python and the like, but record everything that happens, so that later static analysis can do a whole ton of things it can't do from program text alone? You can have Turing-complete types that way; you just can't have them immediately (as soon as you write the code) or completely (as there are always execution paths that aren't visited, which can change the types of things you think you know, e.g. x = 1 ; if VERY_SPECIFIC_RARE_CONDITION : x = "Hello").
You can have incredibly specific and fine-grained types, like "Dict[String->int] WHERE 'foo' in Dict and Dict['bar'] == 42", which is the peculiar subset of all string-int dictionaries that satisfy the WHERE clause. All of this would be "profiled" automatically from the runtime; you're already executing the code for free anyway. Essentially, type-checking and inference become a never-halting computation amortized over all executions of a program, producing incremental results along the way.
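To make the idea concrete, here is a minimal, hypothetical sketch of such a recording mode, using CPython's sys.settrace hook. The names (observed, profile_types, example) are mine, purely for illustration; a real VM-level implementation would record far more, far faster.

```python
import sys
from collections import defaultdict

# Hypothetical sketch: record which concrete types each local variable
# held during real executions, keyed by (function name, variable name).
observed = defaultdict(set)

def _tracer(frame, event, arg):
    if event == "line":
        for name, value in frame.f_locals.items():
            observed[(frame.f_code.co_name, name)].add(type(value).__name__)
    return _tracer

def profile_types(fn, *args, **kwargs):
    """Run fn with type recording switched on, then switch it off."""
    sys.settrace(_tracer)
    try:
        return fn(*args, **kwargs)
    finally:
        sys.settrace(None)

def example(flag):
    x = 1
    if flag:
        x = "Hello"
    return x

profile_types(example, False)
profile_types(example, True)
# observed[("example", "x")] now contains both 'int' and 'str'
```

The "rare condition" caveat from above shows up directly: until some run takes the flag=True path, the record for x would claim it only ever holds an int.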
I have ahead of me some opportunity to at least have a go at this idea, but I'm not completely free to pursue it (others can veto the whole thing), and I'm not sure I have all the angles or the prerequisite knowledge necessary to dive in and make something that matters. If any of the good folks at JetBrains or Visual Studio or similar orgs are reading this: please steal this idea and make it far better than I can, or at least pass it to others if you don't have the time.
This is how JavaScript IntelliSense in Visual Studio used to work, except that the program was executed "behind the scenes" using a trimmed-down VM that could execute without side effects or infinite loops. It was eventually abandoned due to poor performance, predictability, and stability.
The problem is this dilemma: if you have to wait for a "real" execution of a program, then very reasonable expectations like "I can see a local variable I just declared" don't work. If you try to fake-execute a program, you have problems like figuring out what to do with side-effecting calls, loops, and other control-flow issues.
Trying to reconcile a previous type snapshot with an arbitrary set of program edits was tried by an early version of TypeScript and wholly abandoned because it's extremely difficult to get right, and any wrongness quickly propagates and corrupts the entire state. The Flow team is still trying this approach and is having a very hard time with it, from what I can tell.
Something very close to this can be done, and in fact it's already done by static analysis. Static analysis doesn't have a Turing-completeness problem per se: because of the halting problem it can't provide complete answers, but it can provide sound ones. That is, static analysis can provide all possible runtime types for any given (e.g. Python) expression. This can be accomplished by Abstract Interpretation, or by Constraint Analysis and Dataflow Analysis (kCFA), which is in a way similar to what you suggest of running the program in a profiler, except that it runs the program in an abstract value domain. With static analysis you will get some imprecision (false positives) but no false negatives, so it can effectively be very useful for a developer in an IDE. The precision (amount of FPs) and performance are mutually dependent, but good performance and reasonable (read: useful) precision can be achieved, although it's a non-trivial engineering problem.
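For readers unfamiliar with the technique, here is a toy, hypothetical sketch of abstract interpretation over a type domain: instead of executing a tiny expression AST on concrete values, we "execute" it on sets of type names, joining both branches of a conditional since the condition's value is unknown. The AST encoding and function name are invented for illustration.

```python
# Hypothetical sketch: abstractly evaluate a tiny expression AST in a
# type domain (sets of type names) instead of the concrete value domain.

def abstract_eval(expr, env):
    kind = expr[0]
    if kind == "const":
        return {type(expr[1]).__name__}        # e.g. {"int"}
    if kind == "var":
        return env[expr[1]]
    if kind == "add":
        left = abstract_eval(expr[1], env)
        right = abstract_eval(expr[2], env)
        out = set()
        if "int" in left and "int" in right:
            out.add("int")                     # int + int -> int
        if "str" in left and "str" in right:
            out.add("str")                     # str + str -> str
        return out
    if kind == "if":
        # Condition unknown: soundly join the types of both branches.
        return abstract_eval(expr[1], env) | abstract_eval(expr[2], env)
    raise ValueError(kind)

# x = 1 if <unknown> else "Hello"  -->  soundly {"int", "str"}
x_types = abstract_eval(("if", ("const", 1), ("const", "Hello")), {})
```

The join at the "if" node is exactly where the soundness-for-precision trade happens: no runtime type is ever missed, but the answer may be wider than any single run would show.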
Additionally, some programs cannot be easily and/or quickly executed so as to cover all possible paths, so parts of the program will remain uncovered by the profiler. That is one place where static analysis becomes very powerful, because it can cover the whole program much faster (in linear time, making largish precision tradeoffs, but still yielding a usable result).
Indeed, Abstract Interpretation is powerful and beautiful as heck. Type systems can be seen as just a special case of general Abstract Interpretation, where the types are the abstract values computed for each expression. The problem is that it requires ingenuity to devise abstract value systems that don't degrade into useless generality as unknown branches in the code 'execute'. Even types only succeed as much as they do because they require extensive and invasive (language-design-changing as well as program-changing) cooperation from both the language designers/implementers and the language's users; without this cooperation you leave open very wide gaps. Symbolic Execution is another, much more powerful technique but, predictably, also very inefficient in general.
My way of looking at this is just frustration at the misallocation of resources. Dynamic programs already have tons of typing information available, just not at the right time and place. Your brain spends an awful lot of time "executing" the code of a dynamic language while writing it, just like the language runtime, only far more slowly and with a sky-high probability of error. So all I'm saying is: why this immense suffering, when the information your brain is in dire need of is already right there in the interpreter's data structures, just in opaque and execution-temporary forms? It strikes me as incredibly obvious to try to bring that typing information from the runtime into persistent, transparent records available at dev time.
Abstract Interpretation and Symbolic Execution are (fancy) tools in programming language theory's toolbox that can help us, but the simplest possible thing to try first is to make use of the already-available information that is just lying around elsewhere. To make a somewhat forced analogy: if we had a food crisis at hand, we could try fancy solutions like genetic engineering of super crops or hydroponic farming, but the dumbest possible solution to try first would be to simply bring in food from other places where it's plentiful. Typing info in dynamic languages is very plentiful; it's just not there when we need it.
The problem is what I mentioned at the end of my other comment -- complex programs will get inputs from web APIs, the console, and databases, so it's not possible to have complete runtime coverage of all paths in the interpreter in the general case.
"Principles of Program Analysis" covers the subject very well. It's not an easy or beginner book because it's very formal, but it is very complete and has plenty of examples.
The idea of runtime type feedback was originally explored by the SELF group for optimization [0], and is currently used by JS runtimes like V8. The obstacle your vision faces today is that program editors are almost all completely separate from the languages they are used for, and thus such communication with the runtime is enormously painful and complex. Time to return to the truly integrated development environments of Smalltalk/Lisp with modern niceties like gradual types and multicore processors, methinks!
I developed a dynamic analysis/IDE tool for old IBM systems. I was told by VCs that IT management never spends significant money on programmer productivity. I think they're mostly right. Maybe because no one knows how to actually measure it.
I think if you have to run a whole program to understand a small part of it, you've already lost. The most valuable tools are a REPL that can execute units of your code, and a language that enforces purity and immutability so that local reasoning is more likely to be sufficient.
> The traditional Achilles' heel of static analysis and type systems is Turing completeness. Traditional answers range from trying and giving up (Java's Object or Kotlin's Any?), to not bothering to try in the first place (Ruby, Python, etc.), to very cleverly restricting what you can say so that you never or rarely run into the embarrassing problems (Haskell, Rust, ...), to whatever the fuck C++'s type system is.
This stanza was maybe not your primary point, but it was beautifully written, and multiple programmer friends of every variety mentioned have been cackling at it.
There have been attempts at what you describe before. I can specifically point to work done in Ruby by my PhD advisor using the exact profiling approach, and then static typing from that: http://www.cs.tufts.edu/~jfoster/papers/cs-tr-4935.pdf
> you're already executing the code for free anyway
Based on my experience working on a similar domain of type systems for Ruby (though not the exact approach you describe), this turns out to be the ultimate bottleneck. If you are instrumenting everything, code execution is very slow. A practical approach here is to abstract values in the interpreter (like representing all whole numbers as Int). However, this would eliminate the specific cases where you can track "Dict[String->int] WHERE 'foo' in Dict and Dict['bar'] == 42". You could get some mileage out of singleton types, but there are still limitations on running arbitrary queries: how do you record a profile and run queries on open file or network handles later? How do you reconcile side effects between two program execution profiles? It is a tradeoff between how much information you can record in a profile and the cost of recording it.
There is definitely some scope here for longer-term studies that I have not seen yet. Is recording type information (or other facts from profiling) over the long term enough to cover all paths through the program? If so, as this discussion is about maintaining code long term, does it help developers refactor and maintain code as a code base undergoes bitrot and then gets minor updates? There is a gap between industry, which faces this problem but usually doesn't invest in such studies, and academia, which usually invests in such studies but doesn't have the same changing requirements as an industrial codebase.
https://github.com/instagram/MonkeyType can perform the call logging and can export a static typing file which is used by mypy, but also by e.g. PyCharm. It doesn't expose such fine-grained types, but you could build that based on the logged data.
This is only feasible if the program takes no input, or a very limited set. Once you open it up to arbitrary input, no single run (or even a large set of runs) can capture everything the program might be expected to handle.
How does the type profiler know that the variable that only contained values like "123" or "456" was handling identifiers, not numbers?
>This is only feasible if the program takes no input, or a very limited set.
One of the insights that people making tools for dynamic languages discover over and over again is that most use of dynamic features is highly static and constrained. In general, yes, a Python program can just do eval(input("enter some python code to execute :) \n>>")), but people mostly don't do this. People use extremely dynamic and extremely flexible constructs and features in very constrained ways. This is like the observation that most utterances in human languages are highly constrained and highly specific; even among syntactically valid and semantically meaningful sentences, not all utterances are equally probable, and some are so improbable as to be essentially irrelevant. People who try to memorize the dictionary never learn the language, because the vast majority of the dictionary is useless and mostly unused, and even the words that are used are only used in a subset of their possible meanings.
>Once you open it up to arbitrary input, no single run (or even a large set of runs) can capture everything the program might be expected to handle.
Anything is better than nothing, right? If your program keeps executing with some set of types over and over again (and it will, because no program is infinitely generic; the human brains that wrote the code can't reason over infinity in general), wouldn't it be better to record this and make it available at static write-time?
Human brains are finite, so how do we reason over the "infinite" types that every Python program theoretically deals with? We don't! Like I said, most dynamic features are an illusion; there is a very finite set of uses that we have in mind for them. Here is an experiment you might try: the next time you write in a dynamic language, try to observe yourself thinking about the code. In the vast majority of cases, you will find that your brain already has a very specific type in mind for each variable (or else how could you do anything? Even printing the thing requires assuming it has a __repr__ method that doesn't fail).
>How does the type profiler know that the variable that only contained values like "123" or "456" was handling identifiers, not numbers?
It doesn't. I think you misunderstood the idea a little: the type profiler makes no attempt whatsoever at discerning the "meaning" of the data pointed to by variables; it will only record that your variable held strings during runtime. If the number of string values the variable held was small enough, it might attempt to list them, like "Str WHERE Str in ["123","456"]". If the number of values was larger than some threshold but some predicate held for them consistently, it can also use that, i.e. "Str WHERE is_numeric(Str)". If a string variable was always tested against a regex before every use, it will notice that and include the regex in the type. Nothing smarter than this is attempted: just the info your VM or interpreter already knows, recorded instead of thrown away after each execution.
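The listing-vs-predicate rule described above is mechanical enough to sketch. The function name, threshold, and predicate set here are invented for illustration; a real profiler would draw its predicates from checks the program itself performed.

```python
# Hypothetical sketch of the "no smarts" summarization rule: list the
# values if there are few; otherwise fall back to a predicate that held
# for every observed value; otherwise just record the type.

MAX_LISTED = 10  # illustrative threshold, not from any real tool

def summarize_strings(values):
    distinct = set(values)
    if len(distinct) <= MAX_LISTED:
        return f"Str WHERE Str in {sorted(distinct)}"
    if all(v.isdigit() for v in distinct):
        return "Str WHERE is_numeric(Str)"
    return "Str"

print(summarize_strings(["123", "456"]))
# -> Str WHERE Str in ['123', '456']
```

With hundreds of distinct numeric-looking values, the same variable would instead be summarized as "Str WHERE is_numeric(Str)", which is exactly as much (and as little) as the runtime actually witnessed.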
The profiler will not and cannot attempt to understand any "meaning" behind the data, nor does it need to in order to be useful; it's just a (dynamic) type system. No current type system, static or otherwise, attempts to say "'123' is numeric, would you like to make it an int?"; that would be painful, absurd in most cases I can think of, and misguided in general.
> wouldn't it be better to record this and make it available at static write-time
I think you misunderstand my position. It's better for the creator to simply specify it when they write the code - i.e. static typing.
> the type profiler makes no attempt whatsoever at discerning the "meaning" of the data
Your examples are waaay beyond what I need or expect. What I need is for the program to recognize that, for a number, 123 and 456 are valid, but "abc" will never be. Conversely, for an identifier, no matter how many runs use values like 123, someday someone might provide the value "abc" and that's ok. Also, any code that attempts to sum up a collection of identifiers should not be runnable, even if the identifiers in question all happen to be numbers.
This is something that static typing provides, and no amount of profiling will ever be able to divine.
> In the vast majority of cases, you will find that your brain already has a very specific type in mind for each variable
Sure, for the code I write. But I've seen plenty of code written by juniors where, upon inspection, it was completely inscrutable whether a given parameter expects an integer, a string, a brick wall or a banana.
Which is all to say that my default position is unchanged: dynamic typing is unhelpful for anything larger than small, single-purpose scripts.
> No current type system, static or otherwise, attempts to say "'123' is a numeric, would you like to make it an int ?"
SQLite's column type affinity will in fact do this: if you tell SQLite that a column is an INTEGER, it will turn '123' into 123 but will happily take a string like "widget" and just store it.
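This is easy to see from Python's stdlib sqlite3 module, a small demo of the affinity behavior described above:

```python
import sqlite3

# SQLite's column type affinity: a column declared INTEGER coerces the
# string '123' to the integer 123, but stores 'widget' as text as-is.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (x INTEGER)")
con.execute("INSERT INTO t VALUES ('123'), ('widget')")
rows = con.execute("SELECT x, typeof(x) FROM t").fetchall()
print(rows)  # [(123, 'integer'), ('widget', 'text')]
```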
I also wanted to add that your thoughts on this subject are well-stated and align with some work I've been patiently chipping away at, in the intersection of gradual typing and unit testing. I'll have more to say on that subject someday...
It would seem like that's getting the types too late to be useful, but in practice most code goes into production in stages, so you could start getting this probabilistic typing data from code that has rolled out in a limited way, for example features that are hidden from most users.