> I can no longer concieve of writing software without using test driven develop...

skohan · on Aug 4, 2021

I almost see static typing as an alternative to exhaustive testing. If you're working on a large project written in JS, you need a strong test suite to have any confidence you code will be robust at runtime.

As you say, with ADT's and null safety, you can be a lot more judicious about testing to get to the same level of confidence.

When working with Rust or Swift, I'm honestly usually getting away with some integration tests, and whichever unit tests came about organically when a particular section was particularly well suited to TDD. But "code coverage" is a much less pressing concern.

lpapez · on Aug 4, 2021

I would say this is the case even in C++ world. When modern stuff is used correctly (etc. optional, variant, constexpr + static_assert), most of the tests you will ever need are integration tests, basic smoke tests or feature-documenting-tests.

missblit · on Aug 4, 2021

Also fuzz tests! I cannot stress enough how important fuzz tests are if your security model solely relies on your C++ being memory safe (a lot of code should be sandboxed as well but that's another discussion...).

Even when using modern stuff correctly on a mature codebase fuzz testing will find bugs. So if you don't fuzz test and someone evil has your binary then they'll fuzz test for you...

zarzavat · on Aug 4, 2021

If you are writing a new large project in JS without TypeScript, you are doing it wrong period. Bare JS is a cute language for personal websites but it's not feasible for anything more than a few thousand SLOC.

Zababa · on Aug 4, 2021

I'd personally reach for something else than TypeScript. TS tries to type the existing JS world which makes it really big and complex.

lolinder · on Aug 4, 2021

The advantage TypeScript has for front-end over something that compiles to JS is that you pretty much know what your code will compile to: with a few exceptions, you just strip out the type annotations. Knowing this, you can accurately predict things like performance.

In contrast, a compiles-to-JS language can produce code that behaves unexpectedly in its non-functional requirements, and in the worst cases crashes where it shouldn't. That will be less of an issue as compilers get better, and WASM should help a lot, but in the current ecosystem TypeScript produces more predictably good results.

Zababa · on Aug 4, 2021

That may be true in theory, but I find that people write radically different code in JS and in TS. TS code looks a lot more like C# compared to regular JS code, people use a lot of classes over plain functions and data structures.

nawgz · on Aug 4, 2021

TypeScript is not really big and complex, it's just a type system. The only real pain with TypeScript vs. a language with native static types is that TypeScript requires a bit of extra work to fiddle with the compiler, otherwise when it comes to the code it's flexible but can be used in a very simple manner.

Zababa · on Aug 4, 2021

> TypeScript is not really big and complex, it's just a type system.

It's a very complex type system though.

nawgz · on Aug 4, 2021

Can you say some ways it is more complex than other type systems? I think type systems are inherently complex.

From my perspective, I feel user-facing complexity is quite low in TypeScript in some key ways. You can write your type declarations anywhere, you can import and export them in a familiar JS-module way, you don't need to explicitly type anything in particular so you can be anywhere on the spectrum from very rigorous types to no types at all / `any` everywhere, and finally it has really good error messages so you rarely are stuck with a type mismatch that makes no sense.

dimitrios1 · on Aug 4, 2021

Tell that to the mountain of "few thousand line SLOC" javascript projects that existed and continue to exist before Anders Hejlsberg embarked on his mission to C#-ify javascript. I am not opposed to adding strong type checking to JavaScript, but Typescript ain't it.

paavohtl · on Aug 4, 2021

TypeScript has nothing to do with C#, apart from its creator. The structural type system with explicit nulls, sum types and very powerful generics makes it feel much closer to the ML family of languages than C#, despite using the C-like JS syntax. The teams I've used it in have never really used it to build complex OOP hierarchies or even used classes for anything other than React components, though there are probably many who do.

zarzavat · on Aug 5, 2021

Both are true. The original TypeScript v1 was a lot more like C#-in-JS and was very class-oriented, however over the years the type system has grown into a mature ML-like structural type system as a side effect of trying to type the wide variety of dynamic JS out there.

Anybody who last looked at TypeScript a few years ago and dismissed it needs to have another look.

morgancmartin · on Aug 4, 2021

I agree that TypeScript leaves something to be desired, but it is far and away a better alternative to pure JS. TS may not be total, but it comes close enough to get the job done for the dynamic environment it's meant to target assuming operators competent enough to lean on what it offers.

I should also note, I've attempted OOP in TS and it wasn't pretty. It is much more suited to a functional style IME, although I will concede I have seen some open source projects that use it in a Java-esque OOP style presumably to good effect. Personally it wasn't my cup of tea.

k_ · on Aug 4, 2021

> I am not opposed to adding strong type checking to JavaScript, but Typescript ain't it.

If you are curious about this, you might be interested in checking out Haxe [0]. It has static typing, type inference, pattern matching [1] (maybe not on par with ML languages, but still good), and can transpile to multiple targets [2] including JS.

[0]: https://haxe.org [1]: https://haxe.org/manual/lf-pattern-matching.html [2]: https://haxe.org/documentation/introduction/compiler-targets...

xvilka · on Aug 4, 2021

There are also notable Rescript [1], Melange [2], Reason [3], PureScript[4].

[1] https://rescript-lang.org/

[2] https://github.com/melange-re/melange

[3] https://reasonml.github.io/

[4] https://www.purescript.org/

wing-_-nuts · on Aug 4, 2021

>Typescript ain't it.

Expand please?

jamil7 · on Aug 4, 2021

I think what the parent is getting at that Typescript doesn't try to give you the kind of guarantees that something like Rust, Swift, Elm etc. can. They are upfront about this though and it's stated here in the design goals as a non-goal https://github.com/Microsoft/TypeScript/wiki/TypeScript-Desi...

bwship · on Aug 4, 2021

Actually Typescript is it.

claytongulick · on Aug 4, 2021

If you're wasting your time with needless datatypes you are doing it wrong period.

See what I did there?

pydry · on Aug 4, 2021

Static typing is an alternative to invariants in dynamic languages, not testing.

Testing can't really substitute for types. They're two diametrically opposed approaches to impeding bugs.

ashtonkem · on Aug 4, 2021

They are not diametrically opposed, but rather they’re non-overlapping with a strong type system.

In a dynamic language you end up writing a number of tests that would be covered by a static type system and writing a number of tests that would not be covered by that. Going from dynamic to static typing is a replacement for some of the former, and none of the latter.

pydry · on Aug 4, 2021

>In a dynamic language you end up writing a number of tests that would be covered by a static type system

Not once have I done this. I've written plenty of invariants that would have been covered by a static type system but no tests.

As far as I'm concerned if you had to write an extra test to "cover for a dynamic type system" that test is a bad smell that should be deleted. A test validates that a method raises an exception when being fed a null is pointless if it's just testing a one line assertion that does the same thing. It's a waste of keystrokes. It doesnt catch bugs.

There may be some exceptions to this rule but they would be under very obscure circumstances, I think.

camdenreslink · on Aug 4, 2021

One reason to have tests like that, is to cover regressions. Say somebody changes that method in the future and it doesn’t raise an exception anymore when passed a null (but it is still supposed to, the change introduced a bug).

If you have that “pointless” test, you’ll find out about it. If not, you’ll probably only encounter it at runtime with a program that is subtly wrong.

pydry · on Aug 4, 2021

No, you won't. If somebody removes that line they had a reason. That reason will make them remove the test that makes it fail as well.

Whether it's a good or a bad decision is another matter and not something the test will help with.

Tests that exist to check for the presence of a single line of code are deadweight.

shadowgovt · on Aug 4, 2021

This story doesn't match with my observation programming.

Failure to sanity-check a null almost never stemmed from someone deleting the non-null assertion intentionally. It stemmed from someone rewiring the control flow in such a way as to bypass the non-null assertion without intending to (such as early-returning because they think they can shortcut the computation, but the shortcut isn't actually valid if the argument in question is null).

The tests have value precisely because they aren't in the regular control flow of the unit under test and will fail when the code flow is changed and the author didn't realize the change modified the expected behavior of the function.

dragonwriter · on Aug 4, 2021

> If somebody removes that line they had a reason.

That reason will be one of two things: (1) They replaced it with something they thought would handle the same functionality (along with possibly some other functionality) in a beter way, or (2) The requirement changed.

> That reason will make them remove the test that makes it fail as well.

If it was #2, sure (and in a test-first workflow, they would take out the test first). In case #1, they won’t take out the test, and if their implementation is wrong, the test will let them know.

> Tests that exist to check for the presence of a single line of code are deadweight.

Unless you have a very weird framework, tests don’t check the existence of lines of code, they check the behavior of units. The fact that the first-pass implementation in the same iteration in which the test was added is trivial doesn’t mean the function isn’t part of the contract that needs verified, and doesn’t mean that no one will ever change the intended implementation to a less direct one.

duped · on Aug 4, 2021

They're deadweight until someone pushes a change that makes some assumption about that line of code the author didn't mention were invalid.

camdenreslink · on Aug 4, 2021

It’s possible that the code was modified in a way that bypasses the null-check. It isn’t necessarily true that anybody removed code.

Unit tests that test the contract of a method (if you put in this argument, then you’ll get this return or side effect) are a good idea in both dynamic and static languages.

tablespoon · on Aug 4, 2021

>> In a dynamic language you end up writing a number of tests that would be covered by a static type system

> There may be some exceptions to this rule but they would be under very obscure circumstances, I think.

Maybe it depends highly on the specific dynamic and static languages under comparison. My main experience with dynamic languages has been with Groovy (all inherited codebases), and it's an annoyingly frequent occurrence that I've had to deal with PROD issues for things that would have been compile errors in Java.

I believe my favorite is when someone used a String for the counter in a for loop, and it worked...until one day the counter had to go past 9. IIRC, Groovy would convert the String to a char, increment it, then convert that into an Integer. That actually works for 0-9, but incrementing "9" in this way gives you ":" which would then get a NumberFormatException in the last conversion.

kd0amg · on Aug 4, 2021

> I've written plenty of invariants that would have been covered by a static type system but no tests.

Do these invariants have some language-level support?

dragonwriter · on Aug 4, 2021

> A test validates that a method raises an exception when being fed a null is pointless if it’s just testing a one line assertion that does the same thing. It’s a waste of keystrokes. It doesnt catch bugs.

It catches the bug that gets created when someone does a later refactor, replaces the assertion and some other code with a call to another function that they think throws an exception on null as well as doing some computation needed in the current function, but they are wrong.

Tests validating the contract of the unit are useful even when the trivially pass in the first-pass implementation, because implementations evolve, and the thing people will forget to add tests for with changes are the pre-existing, unmodified expected behavior.

(Also, for workflows where tests are written before the first-pass implementation.)

bwship · on Aug 4, 2021

Yea, I mean the problem is. If you have a function that expects a number parameter. In Typescript, the compiler won't let you do this someFunction('a'), so you never have to worry about handling this. But, if it is in javascript, then someone can call the function like that. So in javascript, you might have a test that ensures that the function handles this gracefully. You obviously don't have to have that test, but you will have your app blow up later down the line, if someone adds a call like that. So, testing does occur at times for these types of things. You might not have done it ever, but there are certainly devs who have, myself included.

ninkendo · on Aug 4, 2021

How do you express invariants in dynamic languages? (In a way that doesn’t require test coverage to check those invariants?) I thought the only way would be to write some assertions in the prod code, and then write the unit tests to cover them… but you seem to be implying that you can “do” invariants in dynamic languages in a way that doesn’t require testing.

pydry · on Aug 4, 2021

>In a way that doesn’t require test coverage

They obviously work best when "combined with test coverage" but then again, I dont consider test coverage to be "optional" in any language. This idea that "if it compiles it works" that some haskellers seem to believe, for instance, is complete horseshit.

If somebody argues that static typing works great because without tests it catches more bugs than dynamic typing does when you also dont write tests, it's notionally true, yes, but theyre telling you more about themselves than they are about types. Namely that they're not in the habit of writing tests.

If you've got enough of the right kind of testing coverage to validate an app written in a statically typed language you will have enough to trigger invariants in a dynamically typed langauge.

I have successfully had invariants usefully triggered in prod and by manual testers in big balls of mud without test coverage (failing fast isnt just for test environments) but I considered that to be a form of emergency programming CPR and damage limitation that isnt a substitute for fixing your testing "diet".

shadowgovt · on Aug 4, 2021

Agreed; "if it compiles it works" tends to tilt too far in the other direction.

(... I once had a professor who believed that phrase as nearly a point of religion, and his project work was maddening. No documentation on any functions either, because "the types are self-documenting." Then you'd get hit with a 'division' function that takes (number,number) but he put the denominator as the first argument. Ugh.)

ragnese · on Aug 4, 2021

I'm not an "expert" at dynamic languages, but I've written small microservices in Clojure and Elixir, data crunching code in Python, and your typical junk in PHP and JavaScript.

Clojure does have spec (https://clojure.org/about/spec) which is pretty cool and declarative. And, of course, your IDE can help you when writing code that obviously violates a spec declaration. But that's not fundamentally different from an IDE parsing the type-hints of Python, JavaScript, or PHP. It is more powerful than the aforementioned type-hints, though.

At the end of the day, it blurs the line between static typing and dynamic typing. IMO it should be considered static typing because it serves that same purpose, works at "write time" to help the programmers, and can probably absolve you of writing your own type-checking tests.

In some sense, it is all just asserts. Even statically typed languages have unsafe type casting, which can trigger an "assert" at runtime.

eru · on Aug 4, 2021

I guess you can sort-of use design-by-contract and call it an alternative?

qwerty456127 · on Aug 4, 2021

> I can no longer concieve of writing software without using test driven development. I can't imagine not having a comprehensive suite of unit tests to back up my development.

I don't understand this. This makes a lot of sense when you make complex enterprise products depending on unstable 3rd-party libraries but for personal project this just inflates the time you need to get the thing done. I just code exactly what I need and check if it works and can't imagine dedicating time to writing additional layer of code to automate this checking. Even when I code pure functions which are a lot easier to test.

Sharlin · on Aug 4, 2021

Lately I've been writing some 3D graphics code in Rust for fun. Almost twenty years ago, in highschool, I used to write similar code in C++, without relying on any sort of automated testing – and damn, with the amount of math involved, it's so nice to just write and run some rudimentary sanity-checking tests as you code! I used to rely on constantly building and running some interactive test app and from its behavior try to figure out whether your matrix calculations have a sign error somewhere or not.

A solid type system helps a lot in making refactoring a less fearsome endeavour, but a reasonable test coverage on top of static checking makes it even more so.

Also, automatic perf tests that keep track of regressions and improvements are just awesome.

qwerty456127 · on Aug 4, 2021

Sure, this also makes sense when it is about math. But what I write mostly is about managing files on the disk, querying databases, processing text and XML and also some web frontend. Occasionally even communicating via a COM port. And when I need math I do it with NumPy. These tasks seem rather easy to implement and rather hard to test.

checker · on Aug 4, 2021

You can do some mocking to validate your error handling, definitely check it out.

qwerty456127 · on Aug 4, 2021

Mocking sounds like a lot of work. Am I wrong?

kragen · on Aug 4, 2021

Sometimes mocking, even to check error paths, isn't a lot of work. It depends.

A lot of times designing your interfaces to support mock objects more easily improves them in other ways; if you pass in a file object instead of a filename, for example, not only does it become easier to pass in a mock object that raises an error, but it also becomes easier to pass in a GzipFile object or an HTTPResponse object, and presto, your function now handles compressed data and data stored on a web server.

Also, though, monkeypatching stuff in tests isn't hard in dynamic languages. Here's an example from https://echohack.medium.com/python-unit-testing-injecting-ex...:

    with patch.object(requests, "get") as get_mock:
        with nose.tools.assert_raises(socket.error):
            get_mock.side_effect = socket.error
            inject_exception.call_api("http://example.com")

ratww · on Aug 4, 2021

Depends on how you write your code.

If your functions have lots of hidden dependencies and side-effects, it’s hard to test.

If you split concerns properly and keep your glue/IO code separated from the decision-making/business-rules/logic, mocking is quite trivial, and there are advantages other than just testability.

Check out this talk to see examples of that in action: https://www.destroyallsoftware.com/talks/boundaries

BackBlast · on Aug 4, 2021

I've been on both sides, heavy testing and little to no testing. IMHO it really depends on the system use case.

Tests have advantages, but they also hinder your agility by increasing the costs of the initial version and of making major changes.

Hard to update? Good tests! Well understood problem sphere and product? Good tests! High cost of failure? Good tests! MVP with no actual current users? No TDD. Exploring the product space and update cost is low? No TDD.

brooklinmyers · on Aug 4, 2021

It's not necessary for everything perhaps, but the feedback of an automated test is often faster than manual testing. So it ends up saving time.

chmod775 · on Aug 4, 2021

> I don't understand this. This makes a lot of sense when you make complex enterprise products depending on unstable 3rd-party libraries but for personal project this just inflates the time you need to get the thing done.

I hate tests with a passion, but sometimes I'll write hundreds even for a tiny personal project - when that tiny personal project is going to be at the core of another thing and I want to make absolutely sure that it is working correctly in every way.

Like that time I wrote a process manager for node.js applications because I realized PM2 is a buggy inconsistent mess with race conditions all the way down. Gotta make sure mine works better than that...

Zababa · on Aug 4, 2021

You can see Robert Martin's view on this, even for small and personal projects, here: https://blog.cleancoder.com/uncle-bob/2020/05/27/ReplDrivenD...

pydry · on Aug 4, 2021

I always found him to be a little too zealous about TDD. The circumstances in which it works well are far more limited than he argues.

nodejs_rulez_1 · on Aug 4, 2021

Beware: he makes money or used to make money on TDD training.

kragen · on Aug 4, 2021

In one of my projects the "additional layer of code to automate this checking" looks like this:

    def ok(a, b): assert a == b, (a, b)

    from .util import ok

    ok(Parse(b'.25').do(0, real), (3, ('real', 0.25)))
    ok(Parse(b'1.5').do(0, real), (3, ('real', 1.5)))
    ok(Parse(b'-1.5').do(0, real), (4, ('real', -1.5)))
    ok(Parse(b'+1.5').do(0, real), (4, ('real', +1.5)))

If this results in an exception at import time, I sometimes have to comment out one or more of these tests while I fix parts of the bug.

I mean, I do do some manual testing. Sometimes it's easier to bang on a function and decide whether the output looks right than to figure out what the results should be from first principles. But it doesn't take very many repetitions of a manual test of something like the above before it's the manual testing inflating the time needed, not adding another ok() line.

This is especially true for debugging. Adding automated tests is often a faster way to debug than to add print statements or to step through things in a debugger. Sometimes I take the tests out afterwards, but more commonly I leave them in. And with things like Hypothesis and QuickCheck, tests can have a much higher strength-to-weight ratio than simple tests like the above.

It's also a breath of fresh air when I'm trying to understand what some code is supposed to do, or how to call it, and I come across a one-line test like the above.

Speaking of how to call things, sometimes I do test-driven development so that my internal APIs don't suck. And sometimes I don't and wish I had. There are a couple of lines of code in the same project that say this:

    array = Thunk(lambda: drop_ws(b'[') + Any(pdf_obj) + drop_ws(b']'))
    array.xform = lambda d: ('array', d[0][1])

And, you know, that's just shitty, shitty code, because that was a shitty way to design that interface. And maybe if I'd written that code test-first, instead of just charging ahead madly implementing the first thing that came to mind, I would have realized that when I was writing the test, before writing the implementation. Now, if I want the interface to suck less, I have 700 lines of crap to refactor to the new interface, which probably means either throwing it away and starting from scratch, or incrementally refactoring it while maintaining both interfaces for a day or two. And that's inflating the time I need to get the thing done.

It would have been less work to fix it at design time, the way my friend Aaron taught me: first, write some application code as if the ideal library existed to write it with; then, write the library code that makes it run. And that's just so much easier when that application code is just a unit test.

However, as with most things, I disagree with Robert Martin's point of view on this. I can easily conceive of writing software without TDD, or even without tests. In fact, I do it frequently. Tests are often valuable, and TDD is sometimes valuable, even in weekend personal projects. But they aren't always worth the cost.

brightball · on Aug 4, 2021

> To make up for it in a dynamic language, you end up having to reimplement what a type checker would do for you with no guarantee you didn't miss something or make a mistake, with the added cost of e.g. polluting your projects with noisy defensive code that double checks the types of parameters, adding extra tests, and bolting on linting, analysis and IDE tools to help.

This is entirely dependent on the type of application that you're dealing with.

A web application backed by a database, especially CRUD style, acts as little more than a pass through layer. You're getting everything as a string from the browser, storing it in the database with a SQL string, usually transformed by an ORM layer. In that style of application, the types are maintained in the database itself.

Having a language with strong static typing in that middle layer between the browser and the database only serves to add a huge amount of headaches and extra friction, especially when you're not even using each variable that comes through on its way to the database.

The only thing you're doing in that situation is adding an extra set of type checks that you have to keep in sync between the database and the application layer for virtually no benefit.

duped · on Aug 4, 2021

Types have never given me a headache. They've only prevented me from writing bugs and made classes of tests redundant.

In your example, the database and the application layer must be kept in sync regardless of whether you use static typing or not. The benefit of static typing is that it can be enforced before you push to production or write a test for congruency.

brightball · on Aug 4, 2021

> kept in sync regardless of whether you use static typing or not

This is handled automatically by a large number of ORM tools in dynamic languages (Ruby, PHP, etc). It's a non-issue.

I've spent significantly more time in my career debugging issues in statically typed languages used as an app server in front of a database than I have in dynamically typed languages.

You're doing some processing work in the language? Calculations, ML, etc? Static typing is the obvious way to go.

If you're just sticking things in a database you're going to end up wasting a lot of people's time (and subsequently, the money that goes to pay them). In 20 years, I've never seen it be a good choice but I've seen a lot of people married to the delusion that it is while other people are getting real work done.

lolinder · on Aug 4, 2021

> You're getting everything as a string from the browser

This may be one of the differences of requirements that causes disagreement here. I can see how this might work if your app:

* Loads data from the client

* Later displays that data, unchanged, to the client

In that case, treating everything as a string is great, because you never need to use it as anything else.

In my case, I've never actually worked on an app that didn't need to do more than just display the data at some point. In those cases, I don't want to get everything as a string from the browser. My data is often not merely a string, and treating it as such means I have to deal with stupid questions like whether "0" == 0 == "null".

In my work, I have spent many, many hours repairing issues that could have been avoided if the client and server were both statically typed (and on the flip side, in services that are statically typed, rarely run into those same problems). I'm sure that this can be remedied by just "doing it better", but I'd rather operate from a foundation where doing it right is easier than doing it wrong.

Instead of strings, I want my apps to have a statically typed contract between the client and the server that specifies what endpoints are available and the data structures they accept and return. Ideally this is in a shared class/type definition file, but two definitions kept in sync can work as well if they don't share a language.

The common argument against statically typing your network requests at each end is that you're not resilient against unexpected values and/or version differences, but:

a) It's trivial to tell any JSON parser to ignore unknown values, so adding new optional fields to the request is never a problem.

b) If your app ever uses the data received from the client, version differences will eventually break your dynamic code, too. You'll just only be told so at runtime, potentially only after loading the bad data from the db.

If you ever use your app's data for anything other than rendering a template, you have to make breaking changes to your data structures very carefully, and using a strong layer of static types on both client and server helps to ensure that you consider all the breaking changes and either avoid them altogether or somehow mitigate them.

brightball · on Aug 4, 2021

This is taking it a step further though. Now you're talking about static typing in the client, the server and the database.

I agree that this is a different scenario. If you're able to have the client control the input to ensure only the correct types are passed in there may be some additional value there to keep things more sane.

However, how much of the data that you're dealing with in the client on the server and in the database are you really working with in the server itself? 1% of the variables? 2%?

Generally any heavy work will be handled in either the database itself or a background job just to reduce wait time for the client.

And if that's the case, you're spending the time defining types for the client, server and database, keeping them in sync, having to change them in each place and creating this exercise for the other 98-99% of variables passed in and out of your server just to gain the static benefit for the 1-2% where you're actually working with it in the server on the web request.

It's significantly more work with questionable benefits IMO.

lolinder · on Aug 4, 2021

I guess that's true if you're dealing with an SPA. If your logic is primarily on the client, then the server doesn't benefit much from types. If your logic is primarily on the server, then your client doesn't. If it's a balance, then both do.

capableweb · on Aug 4, 2021

> if you haven't coded in a language like OCaml, Haskell, ML etc

Same argument could be made the other way, if you haven't used Smalltalk, Clojure (and other languages that are strongly dynamic), you shouldn't discount them.

reader_mode · on Aug 4, 2021

I've coded in Clojure enough to know that there are cases where it completely sucks (mostly when you have to do imperative low level code). I used to have a link to a standard library implementation of channels that was downright hideous to read as an example of this (and that code transformed to Java is actually easier to follow).

Clojure has some cases where it's insane how elegant you can make the solution, but frankly static languages with good type systems and tooling come close enough but don't have the scaling downsides.

capableweb · on Aug 4, 2021

> I've coded in Clojure enough to know that there are cases where it completely sucks

Yeah, agree. But this is hardly surprising. Learn enough about ANY language on this planet and you'll come to the same conclusion.

The beauty of Clojure is that it mostly provides you with tooling and other facilities that leads to elegant and more importantly simple code.

bjourne · on Aug 4, 2021

You can find hideous Haskell code too. Especially in code for which lazy evaluation doesn't work and you need to coax the runtime into computing stuff in the right order.

acomar · on Aug 4, 2021

a rebuild of Haskell that got rid of the dynamic exception system and put laziness/strictness in the type system as an effect is my dream.

skohan · on Aug 4, 2021

Once upon a time I was writing a lot of Objective-C, and back then I found it really cool and fun to embrace the potential of dynamic dispatch. But when I did get bit, the thing that was painful about it is that runtime failures often occurred way far away from the source of the issue, so debugging was often extremely painful.

I'm, not sure if Smalltalk and Clojure do this better, but since I've gotten into static languages, I basically haven't looked back, and I haven't missed the dynamism much at all.

capableweb · on Aug 4, 2021

You're making exactly the same point as seanwilson did above. "I have used this dynamic language and it was worse than this strong static type language" while not even tried something like Smalltalk or Clojure.

Yes, languages that are truly dynamic (ships with a REPL, has a instant change->run this snippet workflow, can redefine anything in the runtime AT runtime) does work a lot better than Python which is basically comparing the type systems of Java and Haskell.

Give it a try. Worst that can happen is that you now have tried and learned something different, I'm sure it'll give you some knowledge that'll be helpful in any programming language :)

skohan · on Aug 4, 2021

I thought Objective-C was basically smalltalk, but maybe I'm mistaken

capableweb · on Aug 4, 2021

My comparison was more with the developer experience. In Smalltalk you live inside your program, doing edits as you go along and with dynamic introspection. In Objective-C you would edit your program from the outside, and then build it after doing changes, maybe introspect it from the outside.

Pharo is a nice implementation of Smalltalk and they have a nice page that describes the top features of the language. Take a look and see if it's different from the developer experience you had with Objective-C: https://archive.is/uAWX5 (linking to a archive via archive.is as some images couldn't load)

blacktriangle · on Aug 4, 2021

On the surface yes, however in order to keep the "-C" side Objective-C lost all of smalltalks dynamic REPL development experience in exchange for the traditional compile/run/repeat cycle.

tester89 · on Aug 4, 2021

Objective-C adopted some syntactic "quirks" from smalltalk, but they are not similar at a semantic level.

skohan · on Aug 4, 2021

What are the key differences? My understanding was that both languages are heavily built around message passing between objects and dynamic dispatch.

gpderetta · on Aug 4, 2021

In particular, similarly to Smalltalk and differently for example from C++, Objective-C does method dispatching from empty interfaces and has does-not-understand functionality.

Everything Is An Object in Smalltalk, and of course this can't be true in Objective-C due to its mixed heritage.

andrekandre · on Aug 4, 2021

in smalltalk you are always running inside of a live system, so if you made a mistake its immediate that it happened right then and there, but in obj-c its a code-build-run environment so the place you changed and the state you have to progress to to exercise that change are far away and you might not catch it until runtime / in the field.

DaiPlusPlus · on Aug 4, 2021

> strongly dynamic

Que?

hultner · on Aug 4, 2021

A strongly typed dynamic language will raise an type error if you for instance try to add two incompatible types, for instance in python

  Python 3.9.0 (default, Oct 10 2020, 11:40:52)
  [Clang 11.0.3 (clang-1103.0.32.62)] on darwin
  Type "help", "copyright", "credits" or "license" for more information.
  >>> 1+"1"
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  TypeError: unsupported operand type(s) for +: 
  'int' and 'str'

While a weakly typed dynamic language like JS will allow for it

  Welcome to Node.js v16.0.0.
  Type ".help" for more information.
  > 1 + "1"
  '11'

dragonwriter · on Aug 4, 2021

If the language allows it the types are (ipso facto) not incompatible; “weak typing” is more a subjective statement about how well a language lines up with the speakers mental model of what types should be compatible than an objective one about falsifiable properties of the type system.

gpderetta · on Aug 4, 2021

Agree. Weakly typed is so loosely defined and subjective to be a completely useless adjective. It basically a synonym for 'language I dislike'.

goatlover · on Aug 4, 2021

How about some languages allow implicit type conversions, and others do not.

iso8859-1 · on Aug 4, 2021

Is Scala weakly typed then because it has implicits?

goatlover · on Aug 4, 2021

I don't know, but JS and PHP implicit conversions are widely acknowledged to cause plenty of issues, which Python and Ruby do not have since you have much fewer cases of implicit conversion.

DaiPlusPlus · on Aug 4, 2021

I contend there is no such thing as "weakly-typed" because it suggests it exists in opposition to strongly-typed systems - but instead I feel "weakly-typing" should be considered the absence of type-correctness checking by either the compiler nor the runtime - and that the term "weakly typed" and related should not be used to refer to languages with plenty of implicit conversion between types, because in those languages (like JS) the implicit conversions are very, very, well-defined and predictable ahead of time - TypeScript models JavaScript's implicit type conversions accurately, too.

JavaScript's more forgiving, and often surprising, implicit conversions are still rigidly-defined .

Remember that implicit conversions should never (ostensibly) throw any exception or cause any errors because implicit conversion should always succeed because it should only be used to represent an obvious and safe narrowing-conversion. A narrowing-conversion necessarily involves data-loss, but it's safe because all of the data needed by the rest of the program is being provably retained.

lolinder · on Aug 4, 2021

> I contend there is no such thing as "weakly-typed" because it suggests it exists in opposition to strongly-typed systems - but instead I feel "weakly-typing" should be considered the absence of type-correctness checking by either the compiler nor the runtime

This doesn't feel like a very useful definition. What languages would fall under the category of "weakly typed" by your definition? I can only think of C, and even that is only true for a subset of the language (when you're messing about with pointers).

> Remember that implicit conversions should never (ostensibly) throw any exception or cause any errors because implicit conversion should always succeed because it should only be used to represent an obvious and safe narrowing-conversion.

    > const fun = (x, y) => (x + y) / y
    > fun(1, 2)
    1.5
    > fun("1", 2)
    6

No exception? Yes. No error? Nope, implicit conversion absolutely just caused an error. If I got that "1" from an input field and forgot to cast it to an int, I want to be told at the soonest possible juncture, I don't want to rely on me noticing that the result doesn't look right. As it is, "weak" feels like a good word for the types here. They don't hold their form well.

pydry · on Aug 4, 2021

Weakly typed / strongly typed are not really useful terms - there are too many separate definitions, all of which are pretty much equally valid.

blacktriangle · on Aug 4, 2021

I thought we were all agreed that the strong - weak type axis was to what degree the language automatically converts types?

igouy · on Aug 4, 2021

'Indeed, the phrases are not only poorly defined, they are also wrong, because the problem is not with the “strength” of the type checker but rather with the nature of the run-time system that backs them. The phrases are even more wrong because they fail to account for whether or not a theorem backs the type system.'

29.5 Types Versus Safety

"Programming and Programming Languages"

https://papl.cs.brown.edu/2020/safety-soundness.html#%28part...

pydry · on Aug 5, 2021

That's one of the 7 definitions listed on wikipedia.

I think it's more germane to refer to that as weak implicit type casting.

enricozb · on Aug 4, 2021

  class A(random.choice([B, C]):
    ...

rphln · on Aug 4, 2021

I don't know whether to feel disgusted or fascinated by this. I find the concept of prototype-based inheritance quite elegant, but this is a bit too much insane.

professoretc · on Aug 4, 2021

There was an experimental language called Cecil that had "conditional inheritance". You could say things like

    class Rectangle:
        var w,h

    class Square(Rectangle if w == h):
        ...

You could then overload functions/methods on both `Square` and `Rectangle` and it would call the correct overload depending on the runtime values of `w` and `h`.

dgb23 · on Aug 4, 2021

They are dynamic to a degree that facilitates interactivity and expression on a different level than one might be used to. Late binding, REPL, expression based, building up code while it is running without losing state etc.

DaiPlusPlus · on Aug 4, 2021

But none of those things you mentioned are dependent on a dynamic type system: on the contrary, plenty of statically-typed languages offer late-binding, expressions (certainly enough to implement a REPL) - I know C# certainly has all of those things.

The last thing: being able to edit program code, add new types/members/functionality and so-on while it's running... that's not really a language limitation nor typing-system limitation either: that's just good tooling that is able to hot-swap executable code after the process has already started. Visual Studio has had "Edit & Continue" in debugging sessions for almost 3 decades now, and the runtime build system used by ASP.NET means that source code and raw aspx/cshtml views can still be FTP'd up to production and just work, because .NET supports runtime compilation.

------------

For ordinary business software, where the software's focus is on data-modelling and data-transforming-processes, so all these programs basically just pass giant aggregates of business data around, so strong-typing is essential to ensure we don't lose any properties/fields/values as these aggregate objects travel through the software-proper. In this situation there is absolutely no "expressiveness" that's could be gained by simply giving up on type-correctness and instead rely on unit tests (and hope you have 100% test coverage).

simiones · on Aug 4, 2021

> that's not really a language limitation nor typing-system limitation either: that's just good tooling that is able to hot-swap executable code after the process has already started.

It's more complex than that. The language absolutely has to define these operations for them to make any sense. Modifying some code inside a function is the easy case, and supported by many environments. Modifying types is much more difficult, and usually not supported (.NET and JVM do not support any kind of type modification - neither field nor function). I'm not sure if the C++ debugger can handle this either. Common Lisp can do it, and it actually defines how existing objects behave if their type changes.

capableweb · on Aug 4, 2021

> For ordinary business software, where the software's focus is on data-modelling and data-transforming-processes, so all these programs basically just pass giant aggregates of business data around, so strong-typing is essential to ensure we don't lose any properties/fields/values as these aggregate objects travel through the software-proper.

There is also ways of dealing with this in dynamic languages. For example, Clojure has clojure.spec that achieves the same results, but without relying on strong typing.

In the end, it's all tradeoffs. There is no "one solution" that will be perfect for everything.

Zababa · on Aug 4, 2021

REPL and expression-based are not limited to dynamic languages. REPL-oriented programming as it's understood by Lisp/Clojure users might be, I'm not sure.

cutler · on Aug 4, 2021

Only Lisps have a reader so I'm not sure the term REPL is accurate with other languages. Interactive console != REPL.

draegtun · on Aug 17, 2021

Rebol / Red also have a reader though in its terminology it could be called a "loader". See LOAD function - http://rebol.com/docs/words/wload.html

Zababa · on Aug 4, 2021

Could you expand on that? I've searched a bit but the Lisp read doesn't sound too different from what other languages do.

capableweb · on Aug 4, 2021

This is probably the best blog post I've seen on the topic:

https://vvvvalvalval.github.io/posts/what-makes-a-good-repl....

> If you're not familiar with Clojure, you may be surprised that I describe the REPL as Clojure's most differentiating feature: after all, most industrial programming languages come with REPLs or 'shells' these days (including Python, Ruby, Javascript, PHP, Scala, Haskell, ...). However, I've never managed to reproduced the productive REPL workflow I had in Clojure with those languages; the truth is that not all REPLs are created equal.

Basically, the "REPL" you usually call a REPL is in fact a shell, and not a REPL as normally used in lisp languages.

kbelder · on Aug 4, 2021

I think FORTH has something very similar to the lisp REPL, if I'm understanding it properly. I know little of lisp, but everything I've read makes it seem that it and FORTH are almost evil twins of each other. Opposite but equal.

Zababa · on Aug 5, 2021

Thanks, from what I understand the difference I made between having a REPL and being able to do REPL-driven development is the one you make between a shell and a REPL.

simiones · on Aug 4, 2021

Unlike most languages, Lisp defines how strings can be transformed to Lisp ASTs (s-expressions). In Scheme (which doesn't allow reader macros) this is a safe operation: you can read untrusted input without worry.

By contrast, most other languages only have eval() - which takes a string and parses and executes it as code. In Lisp, eval() takes a Lisp list object and executes that.

draegtun · on Aug 17, 2021

Ditto for Rebol & Red.

For eg.

  >> code: load "foreach n [1 2 3 4] [print n * 2]"
  == [foreach n [1 2 3 4] [print n * 2]
  ]

  >> type? code
  == block!

  >> do code
  2
  4
  6
  8

cageface · on Aug 4, 2021

Typescript's structural typing plus the "any" escape hatch feels like a pretty good sweet spot to me. 99% of my Typescript code is very strict and I rarely make use of any but it's sometimes nice to have.

Writing correct, maintainable software is hard. You need both tests and typechecking.

xupybd · on Aug 4, 2021

And in languages like F# ( I assume the same for other ML like languages) the type system get out of your way.

Java can put you off with all the boiler plate. That's not a requirement of static typing.

majjgepolja · on Aug 4, 2021

Modern java is pretty nice though. I tried MLs and Haskells but was put off by not-so-mature tooling. (I know some of you do very well with it but I am used to some hand holding). With an IDE like intelliJ, modern java is pretty good. When I use python even with a linter/LSP many errors go undetected, whereas Java is much strict about this. (IIRC Java 9+ has local variable type inference, streams and lambdas were introduced in Java 8. It's not perfect but much expressive than old java.

Zababa · on Aug 4, 2021

Modern Java is getting really close to MLs, with records, pattern matching, sealed classes, lambdas, type inference, streams, optionals. I don't know if the ecosystem has adopted that way of developing though.

gorjusborg · on Aug 4, 2021

When project loom finishes, the JVM should have lightweight threads, tail call elimination and continuations, too.

DaiPlusPlus · on Aug 4, 2021

F# still has rather crippling limitations imposed on it by its inherited .NET CLR type system (for example F# union types must have their own hierarchy: you can't have an F# union with a POCO from another assembly).

F# skirts-around some of the limitations with just-in-time design-time type generation ( https://docs.microsoft.com/en-us/dotnet/fsharp/tutorials/typ... ) though Microsoft pitches it as a "better T4" for F# specifically, I think it's far more powerful.

The way things are looking, C# is set up get some form of higher-kinded-types in the next 3-5 years (hurrah!) - but from what Mads and the design-team have said it seems like they'll be implemented without significant changes to the CLR's type-system - so be prepared for everything to be implemented as glorified structs (and don't expect struct inheritance either...) with excessive amounts of implicit-conversion.

throw868788 · on Aug 5, 2021

All languages have their issues though, and they all make different tradeoffs. After using a number of languages compared to say C#/Java/Scala/Ocaml/etc I don't find F#'s limitations that crippling. In fact I think F# when you look at the swathe of languages with a big enough library ecosystem it makes a lot less tradeoffs than other languages in its class. At least for me it seems to be a good balance between simplicity, conciseness, expressiveness and performance - in some ways it seems nicer to code in and bring teams up in than some other functional languages I've seen. I've seen many teams adopt and then go back to Java from Clojure and Scala for example due to the complexity of the code that arises from people being too smart with the language.

The feature you state (F# unions) - that simplification has some positives as well. I can exhaustive pattern match for example. I can always use composition/pattern matching/active patterns to mix in other assemblies types cases. Is it really limiting that it doesn't do this? Would changing this mean more complexity and make other language features harder to deliver? All features introduce some additional complexity, and it more than linear typically. F# also has some interesting patterns that were either introduced first there en masse (e.g Async) or are easier to use there.

I'm a fan of static checking, and conciseness with performance close to the platform it is hosted on and is easy to reason about. I'm also a fan of easy to read and clean code that isn't "scary" to look at. F# seems to strike that balance well compared to other JIT functional languages, at least from where I’m sitting. Too much complexity and you start scaring people away and/or too much abstraction in the codebase starts to occur trading off maintainability.

xupybd · on Aug 4, 2021

Yeah it's not perfect. I might need to look into another functional language. I find it a good gateway. I can use familiar tooling and lean on C# libraries.

DaiPlusPlus · on Aug 4, 2021

Haskell recently got Linear Types baked in officially within the past year, so that's definitely worth taking a look-at.

Zababa · on Aug 4, 2021

What are the use cases for linear types?

noduerme · on Aug 4, 2021

I miss AS3 compile-time type checking. That was a variation of ECMA4. The closest we have now in the browser is Typescript. And even though it's admittedly pretty elegant with union types and utility types and mixins, some of what it produces is exactly the "noisy defensive code" you're talking about, for the simple reason that it's an elegant language that has to compile down to a worthless untyped jargon like javascript. It's a miracle that the web runs at all on JS.

thaumasiotes · on Aug 4, 2021

> some of what it produces is exactly the "noisy defensive code" you're talking about, for the simple reason that it's an elegant language that has to compile down to a worthless untyped jargon like javascript.

I don't understand. All languages, without exception, must compile down to a "worthless untyped jargon". Only assembly can execute.

noduerme · on Aug 4, 2021

I miscommunicated that, slightly. What I meant was that you still have to write defensive type checking code in Typescript, because it's only decoration, and you can't rely on the underlying JS to throw type errors either at compile time or runtime.

Javascript is a garbage layer of broken standards, a worst of breed language that shouldn't have to exist between a strict typed higher level language and lower level bytecode, but everything on the web now needs to be built on top of it, compile to it first, and run through one of a few JS engines. You can never build a full fledged language on top of something that has to go through JS... so TS, coffescript, et al are always going to be pseudo typed but not truly solid or permanent solutions.

Ultimately I think ECMA standards will finally move toward strict types in JS, and in a decade more we'll have something that for all intents and purposes is exactly what Flash was in 2009.

Zababa · on Aug 4, 2021

> you can't rely on the underlying JS to throw type errors either at compile time or runtime.

Sure, but if your typed language is sound you shouldn't have type errors at runtime. Not having type at runtime is not a problem, type erasure is nothing new and works really well in OCaml.

> You can never build a full fledged language on top of something that has to go through JS... so TS, coffescript, et al are always going to be pseudo typed but not truly solid or permanent solutions.

Rescript and Elm are really solid, OCaml can compile to JS too, Scala and F# too. Granted, only Elm has been built on JS at first, others were adapted (Rescript is like half and half), but they're all solid.

> Ultimately I think ECMA standards will finally move toward strict types in JS, and in a decade more we'll have something that for all intents and purposes is exactly what Flash was in 2009.

I think the idea here is to offer WASM as a compilation target. Trying to gradually type JS leads to really complex systems (TS types) because people did all sorts of crazy things with JS. I wouldn't like this to be the base on which we build other languages, or the web in general.

cutler · on Aug 4, 2021

Clojurescript gives you a lot more than TS but with different trade-offs. I'll take immutability, a real REPL, clojure.spec and 600 functions over static types any day.

throw868788 · on Aug 5, 2021

Why not both? Seen that work quite well.

skohan · on Aug 4, 2021

At the risk of being pedantic, only machine code can execute.

DaiPlusPlus · on Aug 4, 2021

> some of what it produces is exactly the "noisy defensive code" you're talking about

TypeScript, by design, does not produce nor introduce any "noisy defensive code": the point of TypeScript was that type-safety in JavaScript can be achieved without any runtime cost.

The only significant code-gen that TypeScript does do by itself is stuff relating to ECMAScript modules (e.g. wrapping and repackaging your TypeScript module for other module systems).

...with the exception of when you take a modern TypeScript compiler and specifically instruct it to target ES3 (lol) or perhaps some older variants of ES5 - at which point the code `tsc` generates is still so minimal you'll still need to bring-your-own polyfill/monkeypatch libraries. This is stuff like reimplementing the spread-operator or `for(of)`, and so on.

Lapsa · on Aug 4, 2021

With all due respect, in my book - TypeScript is mostly useless, vendor lock-in driven bunch of crap. INTELLISENSE111!!!11ELEVEN

noduerme · on Aug 4, 2021

Wha. I still write everything in Eclipse. Neon. And tsc somewhere under v3, compiling back to es5 in some cases to support old safari. Works great for me... lovely language, and zero lock-in.

cerved · on Aug 4, 2021

huh? typescript doesn't compile into type checks at runtime

the web runs on JavaScript because it's extremely forgiving and therefore stable. there can be lots of errors and things keep chugging along.

DaiPlusPlus · on Aug 4, 2021

To be fair, when writing TypeScript you often end-up needing to write loads of type-guard functions, and there are plenty of popular TypeScript extensions that generate those type-guard functions for you, e.g. https://github.com/rhys-vdw/ts-auto-guard

smolder · on Aug 4, 2021

I don't think there's any additional need for type guards when writing TS vs JS. Whether you need to validate some JSON input or not should be independent. If you can do without checking it in JS, you can likely do without checking it in TS, since all the type stuff is stripped away on transpile anyway.

DaiPlusPlus · on Aug 4, 2021

For checking the type of untrusted input, like a JSON response, then yes, type-check functions are equally useful in both TypeScript and JS worlds.

...and when types are within TypeScript's sphere-of-influence the compiler is largely able to deduce types without needing assertions or type-checks.

...or so you think! But there's two problems:

1. TypeScript's type system is unsound: so there are times when the compiler will let you do some (for example) contravariant array mutation, even though that's almost essentially guaranteed to crash at runtime if the rest of the system makes incorrect assumptions about the type-variance of the array/collection. So having the type-guard function available to assert to tsc that whatever array operation you performed is safe, you wouldn't need to do that in pure JS.

2. Type-guards are essential for when you're working with values that pass through an interface (because that necessarily means some loss of typing information has occurred), so the type-guard function is needed to recover that static information - and while this approach does also apply to JS, in some situations where if your code is the factory of subclasses that pass through an interface (e.g. so `MyComplicatedSubclassForRockAndRoll` appears as `MusicSubclass`) then in JS a blind "reinterpret-cast" of the object is fine - but TypeScript really doesn't want you doing that and again, having a type-guard function is just to keep the compiler happy - and even if it is possible to mathematically prove that the `object` type-cast is correct (because theoretically static analysis can see on both sides of an interface) but TypeScript isn't perfect and there are cases where we really do know better than the compiler, but we don't want to just make it unsafe and untyped - instead we want to keep the typing and we want to keep the compiler happy, but doing it this way (a type-guard) is not only easier than suppressing warnings and ignoring your sense of guilt but they're invariably genuinely useful a few months down the line when someone made a change that the type-system spotted and prevented from compiling (and so going into production), whereas it would be very easy to (for example) unintentionally introduce vulnerabilities because even with near-100% test coverage, because adding brand new functionality to a project (even if it said functionality doesn't work) it won't cause any tests to fail (at least, until tests for the new functionality are completed).

noduerme · on Aug 4, 2021

Thank you so much for expanding on what you knew I was trying to say about introducing additional type guards. Yes for casting from JSON you'd still need to inform JS to treat something as a string or float or int before using it, but with overloads and utility types in TS it becomes much more interesting and subtle to keep the compiler happy, and that is what ultimately leads to highly maintainable code, i.e., zero problems at checkout. And zero use of "any". The reason "any" exists and is a coding horror is because JS and the kind of code people write without type checks is a horror.

cerved · on Aug 4, 2021

idk about loads, it depends on how everything is setup. Most of this is ajax calls to a backend you control and develop. Making sure the types are coherent between frontend and backend is not a big deal. Generate interfaces based on the types returned by each endpoint.

If it's an external API, do the same thing except it probably needs a more manual approach.

This is not something that's in anyway unique to typescript. The difference is JavaScript is designed to fail gracefully using coercion whereas most other languages and runtimes abort and throw exceptions.

mpweiher · on Aug 4, 2021

> you have to pollute your projects with noisy defensive code that double checks the type of everything coupled with extra manually written tests, with no guarantee you didn't miss something.

A common misconception.

In tests, you test that values are correct, the types take care of themselves.

Types being correct does not imply values being correct, so you will have to write those tests anyhow, although you maybe be fooled by the safyness of static typing[1] that you don't have to.

Values being correct does imply that the types are correct, so if your tests pass, then your types are also obviously correct. (If a==3 then I don't have to check if a is of type pink elephant, it obviously isn't).

So that's the in-principle argument. And it turns out that this is completely supported in practice as well: neither the defensive code nor the extra type-based tests that you propose must be the result actually occur in dynamic code-bases with any frequency.

That said, static types certainly have their uses, particularly in being machine-checked documentation and also for producing faster code without going through crazy amounts of extra effort.

[1] https://blog.metaobject.com/2014/06/the-safyness-of-static-t...

seanwilson · on Aug 4, 2021

> A common misconception.

> In tests, you test that values are correct, the types take care of themselves.

How can you be sure your function e.g. returns an integer for all values and never undefined for some values? Unit tests cannot check this exhaustively unlike type checkers.

> Types being correct does not imply values being correct, so you will have to write those tests anyhow, although you maybe be fooled by the safyness of static typing[1] that you don't have to.

As I said, types compliment tests, they don't replace them but types do reduce some of the burden on writing tests for you.

To nitpick: type checking can prove values are correct when the type system is powerful enough (i.e. dependent types) but this isn't practical enough for mainstream yet.

orangecat · on Aug 4, 2021

Unit tests cannot check this exhaustively unlike type checkers.

Right, which is actually an argument against the claim that you have to write more tests with dynamic typing.

kaba0 · on Aug 4, 2021

I agree with you, except for the nitpick. Unfortunately due to formal verification not scaling, dependent types will likely not be the solution :/

chriswarbo · on Aug 4, 2021

Dependent types are already very useful. If we're willing to avoid formal verification, then we don't need a full correctness proof for our application; in which case, there's no need to avoid dependent types just because they don't scale. They still allow us to prove correctness for parts of an application (e.g. a tricky, performance-sensitive algorithm), which is better than nothing.

In particular, we can use dependent types to place much tighter constraints on critical pieces of code, but 'cut short' a full correctness proof by simply punting to an error case when it's convenient.

For example, say we need to manipulate the Nth element of a collection. We can express indices with a type 'Fin X', whose values are equivalent to 1, 2, ..., X, e.g.

    f1 : (c: List Foo) -> (n: Fin (length c)) -> BAR

Within this 'f1' function, we know that the index 'n' will appear in the list 'c' (as a corollary, we also know that 'c' is non-empty, since empty lists have no indices!); this lets us specify the intended behaviour more precisely, lets us satisfy the requirements of other functions we may want to call, lets the compiler spot more problems for us, perhaps avoids the need for certain bounds checks, etc.

Yet this places a burden on anyone who calls this 'f1' function: they need to prove that the index appears within the list (and, in doing so, prove that the list is non-empty). This might be very difficult in the general case, yet a full correctness proof would require it.

However, we don't need a full correctness proof; so we can just check if we're in bounds, and return an error if not:

    f2 : List Foo -> Int -> Either String BAR
    f2 c i = case mkFin (length c) i of
               Just i  -> Right (f1 c i)
               Nothing -> Left ("Index " + show i + " not in list of length " + show (length c))

      where mkFin : (n: Nat) -> (i: Int) -> Maybe (Fin n)
            mkFin Z     _         = Nothing
            mkFin _     i | i < 0 = Nothing
            mkFin (S n) 0         = Just fZ
            mkFin (S n) i         = map fS (mkFin n (i - 1))

Calling a function like 'f2' may end up hitting the error case, so this doesn't prove our entire application is correct; but we can still prove all sorts of things about 'f1', and hence have a lot of confidence in what happens whenever this initial error branch is avoided. Also, anticipating an error up-front is usually preferable to hitting one half-way through processing.

Dependent types are really useful for propagating error conditions 'backwards' through the code like this. When practical, I much prefer this to propagating error conditions 'forwards', by returning Maybe, etc. Most applications start with some sort of parsing step, which is a great place to perform such checks, if we can manage to propagate them back that far. Otherwise, we just handle these error cases just like we would any others (dependent types or not).

chriswarbo · on Aug 4, 2021

> Types being correct does not imply values being correct, so you will have to write those tests anyhow, although you maybe be fooled by the safyness of static typing[1] that you don't have to.

Urgh, that reference is such a shallow, straw-man facade of static typing. This is especially evident with its link to https://www.sans.org/top25-software-errors/ and claim that those "don't look like type errors to me", a list which includes:

- 16 even says 'type' in the title: "Unrestricted Upload of File with Dangerous Type"! Phrased as a type error, this would be "savePicture: Required argument of type 'Either[PNG, JPEG]', given 'Array[Byte]' instead"

- 2, 3, 6, 11, 18 are all injection attacks, which are classic type errors (AKA "stringly typed programming"). Phrased as a type error, these would be "+: Cannot append 'Foo' with 'FormInput'", where 'Foo' might be 'SQL', 'ShellCommand', etc.

- NULL pointer dereference is Tony Hoare's 'billion dollar mistake', which as type error would be "Expected 'Foo', given 'null'"

- integer over/underflow, buffer over/underflow, out-of-bounds access, etc. can all be mitigated with types like 'Maybe', 'Try', etc. at the cost of a bounds/flag check. If we don't want runtime checks, we can instead track memory usage bounds in the types, and have the compiler check that we're allocating enough.

So many of these problems are avoidable with types, even before we go into fancy stuff like dependent types, linear types, etc. In particular, most of the burden to getting this right is on languages and libraries, not application developers.

For example, if a database library can only be queried by an 'SQL' type, and the only way to construct that type is via literals and a 'withParam[T]: SQL -> T -> SQL' function, then application developers are unable to introduce injection attacks; the library forces them to do the right thing. Instead we get 'dbQuery: String -> ...', 'runShell: String -> ...', 'response.withBody: String -> ...', etc. and draw the conclusion that types can't help :(

simiones · on Aug 4, 2021

The problem is that often getting rid of these type-unsafe idioms requires much more sophisticated types OR much more tedious code than expected. For example, a language without NULL references must rely on something like Optional/Maybe. Optional requires generic types (kind-1 types) to be usable, and it also requires some kind of pattern matching or monad comprehensions - otherwise, code using Optional is neither type safe nor readable.

The SQL case is much more complex. SQL is such a non-standardized language in practice, and such a complex one, that trying to offer a generic SQL library that presents an AST as the API for even the most popular RDBMSs quickly gets complex. So, you end up with convenient string->SQL converters, and then people stick untrusted input in those strings. Not to mention, your fancy AST library still needs to spit out SQL text to talk to most DBs, so you end up needing to sanitize identifiers in your own SQL->string conversion.

mpweiher · on Aug 4, 2021

Right. The other thing is that in the parent, "static type checking" got shortened to "types", which in turn got confused with "modelling".

The problem with stringly-typing is the model, which is SQL as strings. Statically type-checking that all your strings are, in fact, strings does not buy you anything whatsoever in this scenario.

On the other hand, if you have an actual model of the SQL, then it also doesn't matter whether that model is checked at runtime (dynamic typing) or at compile-time (static typing). In either case it will not allow injection attacks, if implemented correctly.

And of course the actual place where nastiness happens is the user input, and I haven't yet seen a place that can statically typecheck users. ;-). So, as you point out, you will need to do dynamic checking and sanitising of input.

chriswarbo · on Aug 4, 2021

> And of course the actual place where nastiness happens is the user input, and I haven't yet seen a place that can statically typecheck users. ;-)

User input has type `ByteString`. Not only can we check that statically, we absolutely should; and we should enforce that type, to reject any code which assumes otherwise.

Nastiness happens when developers treat user input as something other than ByteString; e.g. they might try appending it to a fragment of HTML (XSS); or wrap it in quotation marks and send it to a database (SQL injection).

We don't need to 'statically typecheck users'; we do need to statically check that `myApp` has type `ByteString -> Foo`, so we can avoid ever executing implementations which actually have type `HTMLFragment -> Foo` (XSS), or `SQLFragment -> Foo` (SQL injection), or whatever.

seanwilson · on Aug 4, 2021

> On the other hand, if you have an actual model of the SQL, then it also doesn't matter whether that model is checked at runtime (dynamic typing) or at compile-time (static typing). In either case it will not allow injection attacks, if implemented correctly.

It's possible to implement anything correctly with dynamic types but the point is static typing makes it easier to do so. Static types will exhaustively check for you that certain errors aren't possible (vs tests that check a finite number of cases) + compile-time (vs at runtime when the error actually occurs on particular inputs).

> And of course the actual place where nastiness happens is the user input, and I haven't yet seen a place that can statically typecheck users. ;-). So, as you point out, you will need to do dynamic checking and sanitising of input.

It sounds like you're talking about dynamic (type) checking here when you really mean regular runtime behaviour? You absolutely can at compile-time make sure your program is going to respond in a sensible way to problematic user inputs at runtime e.g. the compiler verifies that either the user input string value will get turned into a `valid email` value at runtime or the user is asked to try again. I wouldn't call the condition that checks if the string is a valid email a dynamic type check.

I don't see how user input is different or special from any other value either e.g. it's not like you can predict what exact values are going to come from files, random number generators or from complex calculations during runtime, you just care that they're in the range you expect.

mpweiher · on Aug 4, 2021

> It sounds like you're talking about dynamic (type) checking here when you really mean regular runtime behaviour?

No. Quite the opposite. People here confuse "static typing" with "checking stuff", even such obviously dynamic things as "checking outside input".

> You absolutely can at compile-time make sure your program is going to respond in a sensible way to problematic user inputs at runtime

Right. Sort of. But that is just normal code. Code that absolutely, definitely has to be executed at runtime and at no other time. Because that is when the actual users of your system are going to input the actual data. Not while you're compiling stuff. Your users (and the rest of the world you interact with) aren't there while you're compiling. And your compiler isn't around when your users are inputting data.

> the compiler verifies that either the user input string value will get turned into a `valid email` value at runtime

No, the compiler cannot verify user input, because the compiler isn't running when the user is inputting data. You can implement a model that has the concept of "valid e-mail", and you can implement your system such that it converts input data such as strings into your structured model at the edges and then only deals with "valid e-mail" objects internally. But implementing such a model has nothing whatsoever to do with whether you use a statically type-checked language or a dynamically type-checked language.

It also turns out that such systems tend to be really, really bad. What you actually want to do is keep the user input exactly as it was input, for auditing purposes if nothing else, and build your structures as an enrichment on top of that basic data. Because your idea of what constitutes a "valid" e-mail is almost certainly wrong, and it's better if the system can at least represent data even if it doesn't completely understand it, rather than destroy user data.

seanwilson · on Aug 4, 2021

> > the compiler verifies that either the user input string value will get turned into a `valid email` value at runtime or the user is asked to try again.

> No, the compiler cannot verify user input, because the compiler isn't running when the user is inputting data.

I'm not following why you thought I meant that. It's like you're arguing it's impossible to make any compile-time guarantees because you don't know what exact literal inputs your program will be dealing with at runtime when the field of https://en.wikipedia.org/wiki/Formal_verification exists. E.g. at compile time you can prove a sorting function gives the correct output for all possible inputs without having to run the code - you don't know in advance what the input is going to be (which is normal when coding), but you still know the output will be correct with respect to the function specification.

simiones · on Aug 5, 2021

There are two ways to have the compiler ensure that you produce safe SQL for any user input:

1. Define SQL in your type system, and force programmer to specify for any piece of user input they want to use in a query what its semantics are supposed to be. This is the SQL DSL approach, such as the short lived LINQ-to-SQL or Haskell's Selda. ORMs also do something similar.

2. Enforce that any string sent to the DB passes through some kind of checker that enures certain properties hold for that SQL. The checker will have to understand all of the semantics of SQL, just like in 1.

There are many libraries that go through path 1, but don't support the full capabilities of SQL (usually they support a tiny subset), even for a single DB.

mpweiher · on Aug 5, 2021

No, I am not saying that you cannot make any compile-time guarantees. I am saying that these have little to nothing to do with the actual securing against SQL injection attacks, and that the idea that static types help there is simply an unsupported assumption (and a circular argument, see parallel post).

mpweiher · on Aug 5, 2021

> It's possible to implement anything correctly with dynamic types but the point is static typing makes it easier to do so

This is where the argument became circular and we can basically stop.

The parent claim (by chriswarbo) was that, for example, SQL injection attacks were, in fact, incontrovertible proof that SQL injection attacks "are classic type errors". That simply isn't true, at least not in the sense of static type checking (and only made true, as I mentioned above, by conflating "static type", "type" and "model" into one incoherent gooey mush).

chriswarbo's incorrect claim was in response to my referenced article, The Safyness of Static Typing [1], where I (as a prelude to my actual point) look at the evidence for the claim that "static types make it easier to implement something correctly".

There isn't any.

Or to be precise, there isn't any that passes any statistical or other standards. What little evidence there is goes both ways and has, at best, tiny effect sizes.

And yes, that article is somewhat old, but the evidentiary situation has not changed, despite further attempts to make the claim that static typing is provably better. Claims that were resoundingly debunked.

So that's the background. With this background we have the idea that SQL injection attacks are somehow evidence for the problems with dynamic typing, which they are not. I can have a statically checked string type and have exactly the same SQL injection attacks, and in principle, checking code needs to run at runtime against the dynamic input it is represented with. Which I think we agree on:

> It's possible to implement anything correctly with dynamic types

but apparently chriswarbo does not. So that brings us to the circularity of the argument:

> but the point is static typing makes it easier to do so

This hasn't been shown. It was claimed. And it was claimed that SQL injection somehow proves this. Which it doesn't, because we agree that what prevents the attack is the code that runs, the model that executes. It may be that such code is easier to write with static types, but that hasn't been shown, it is just claimed and the claim repeated in support of the claim. Circle.

[1] https://blog.metaobject.com/2014/06/the-safyness-of-static-t...

seanwilson · on Aug 5, 2021

> Or to be precise, there isn't any that passes any statistical or other standards. What little evidence there is goes both ways and has, at best, tiny effect sizes.

You mean the link from [1] to "An experiment about static and dynamic type systems" based on "an empirical study with 49 [undergraduate] subjects that studies the impact of a static type system for the development of a parser over 27 hours working time.". I think studies like this are more distracting here than useful (low skill level, not large scale enough, too much noise, toy example). The article here sums up a lot of my thoughts on this: https://news.ycombinator.com/item?id=27892615

And do you really need a study that proves e.g. (to pick a simpler example to summarise) a language that makes it impossible for null deferences to happen at compile-time is going to have less null dereference errors than one that lets those errors happen at runtime? It's like insisting for an empirical study that 2 + 2 = 4.

> It may be that such code is easier to write with static types, but that hasn't been shown,

The typed SQL example might look something like:

   function getUserInput(): string

   function createSanitisedString(s: string): SanitisedString
   function createSafeSqlQuery(s: SanitisedString): SqlQuery

For the code `createSafeSqlQuery(getUserInput())`, a static type checker would stop the entire program from starting with a type error pinpointing the exact line where the unsanitised data is coming from. With a dynamic type checker:

1. At best, the code will fail only after the user input is received by createSafeSqlQuery during runtime and you won't know where the input originated from.

2. At worst, the coder forgets to add a check like `typeof s === SanitisedString` or a call to `createSanitisedString` in `createSafeSqlQuery` and creates an injection attack.

The static type checker clearly wins here for me for safety and ease of implementing correctly. I don't need a study to know that compile-time errors are better than production runtime errors, that automated and exhaustive checks of correctness are better than the coder having to remember to add manual checks, and that it's better to know the exact line the unsanitised input came from over only knowing it comes from somewhere.

What languages have you used that have strong static type systems that you've written large programs in? Have you tried OCaml or Haskell for example?

simiones · on Aug 5, 2021

> function createSanitisedString(s: string): SanitisedString > function createSafeSqlQuery(s: SanitisedString): SqlQuery

The problem with the whole argument is that these functions are not actually enough to work with SQL, since they don't allow us to create dynamic SQL from safe strings.

Here are some ideas of why we can' use the functions you proposed:

  user_filter = raw_user_input_col_name ++ " LIKE ?"
  createSafeSQLQuery("SELECT * FROM my_table WHERE " ++ user_filter ++ ";") //type error

  sanitizedQuery = createSanitisedString("SELECT * FROM my_table WHERE" ++ user_filter ++ ";") //should return error or quote everything
  createSafeSQLQuery("SELECT * FROM my_table WHERE" ++ createSanitisedString(user_filter) ++ ";") //filter will be wrong, type error

Let's now add something to allow us to achieve our goal:

  function asSanitizedString(s: string) : SanitizedString {
    return s
  }
  user_filter = createSanitizedString(raw_user_input_col_name ) ++ asSanitizedString(" LIKE ?")
  query = asSanitizedString("SELECT * FROM my_table WHERE") ++ user_filter ++ asSanitizedString(";")
  createSafeSqlQuery(query) //works, nice
  createSafeSqlQuery(asSanitizedString("SELECT * FROM my_table WHERE " ++ user_col_name ++ "LIKE ?;")) //oops, this also works

You can achieve all of this with a non-string API: you can have an SQL DSL or just an SQL AST library; or you can use an ORM. But either way, you can't fix it without modeling the entirety of SQL into your type system (or as much of SQL as you are willing to support).

If you don't believe me, go looking for a library that allows arbitrary strings as SQL, but statically ensures user input can't be used to construct an SQL query. I don't know of any one.

seanwilson · on Aug 5, 2021

> The problem with the whole argument is that these functions are not actually enough to work with SQL

My bad for being unclear but I meant you would be sanitising something like the user entering "tea cups" into a shop search input form and searching your shop product database with that. I didn't mean the user would be entering SQL queries.

> But either way, you can't fix it without modeling the entirety of SQL into your type system

Using the type system to check correctness sounds good to me.

This is getting way too into the weeds about SQL anyway. Null dereference checks are a less distracting example to focus on for instance. As long as you can encode it into the type system, all my points are still relevant.

simiones · on Aug 6, 2021

> My bad for being unclear but I meant you would be sanitising something like the user entering "tea cups" into a shop search input form and searching your shop product database with that. I didn't mean the user would be entering SQL queries.

No, I understood that. But my point is that the programmer is going to be composing SQL queries, and they will be doing it based on user input. The SQL API will have a very hard time distinguishing which parts it receives are trusted programmer input and which parts are untrusted user input.

> Using the type system to check correctness sounds good to me.

Sure, but SQL is a huge language with significant differences between any 2 major DB implementations (and that depends on SQL server configurations - e.g. in MySQL you can specify via config file whether "a" is a string or an identifier). I have never seen a full SQL DSL or AST used in any SQL client library, in any language: it's just too much work.

mpweiher · on Aug 5, 2021

>You mean the link from [1] to "An experiment about static and dynamic type systems"

No. I mean, yes, that is one study, but there are a lot more. They all come out essentially the same.

Danluu did a an overview article a while ago:

https://danluu.com/empirical-pl/

Note that the A large scale study of programming languages and code quality in github paper, the one that makes some of the strongest arguments for safety of the bunch, was the one that was later completely taken apart at OOPSLA 2019. Its findings, which also had very small effect sizes and statistics significance, are completely invalid.

> do you really need a study that proves ... less null references errors

1. Languages don't have null errors, programs do.

2. I don't need a study to show that a program has less than or equal null errors in a language with null safety, because the program in a language without might still have zero. If you're going to make a logical claim, then let's stick to what's logically provable. If you're going to hand-wave...well you might as well use a dynamically typed language ;-)

3. I do need a study to show that such differences matter at all. All software has bugs, if this a significant source of bugs, then a mechanism to make me not have them might matter.

4. I do need a study to show that the net effect is positive. For example, the famous Caper Jones study showed that dynamic languages such as Smalltalk and Objective-C were significantly more productive than statically typed languages, including Haskell, C++ and Java. Studies also have long shown that bugs scale linearly with code size, a very strong correlation. So let's say null-references are 1% of you total bugs, but you now write twice the amount of code, that means a move to a null-checked language will significantly increase your total bug count, despite eliminating a certain class of bugs.

(In fact other, older studies showed that type errors accounted for 10% of bugs, so if you can save just 10% of code using a dynamically typed language, you're ahead in terms of bugs).

Of course, I can also not require such studies and make an engineering judgement. And this is fine, we do it all the time because very little in software has been demonstrated much at all. But you then need to be aware that this is a subjective judgement call, and others may reasonably come to a different conclusion.

And being aware of this makes for much, much better engineering judgement, IMHO.

igouy · on Aug 5, 2021

> … but you now write twice the amount of code…

We could take some JavaScript programs —

https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

— and try to transliterate them into Dart —

https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

— and see how much or how little boiler plate is required when a language supports null safety, local type inference, implicit-dynamic: false ?

seanwilson · on Aug 5, 2021

> 1. Languages don't have null errors, programs do.

> 2. I don't need a study to show that a program has less than or equal null errors in a language with null safety, because the program in a language without might still have zero. If you're going to make a logical claim, then let's stick to what's logically provable. If you're going to hand-wave...well you might as well use a dynamically typed language ;-)

I think now you're just nitpicking and being uncharitable for the sake of it instead of responding to the overall point of my message I took care to write.

> Studies also have long shown that bugs scale linearly with code size, a very strong correlation. So let's say null-references are 1% of you total bugs, but you now write twice the amount of code, that means a move to a null-checked language will significantly increase your total bug count, despite eliminating a certain class of bugs.

> (In fact other, older studies showed that type errors accounted for 10% of bugs, so if you can save just 10% of code using a dynamically typed language, you're ahead in terms of bugs).

Most software engineering studies have so many confounding factors even in isolation that combing results from multiple studies like this is nonsensical and misleading. This reads like brilliant satire if I'm honest.

> And being aware of this makes for much, much better engineering judgement, IMHO.

You didn't reply to the question about if you've written large projects with strong static type systems like OCaml or Haskell. If the answer is no, they're worth learning for more awareness instead of relying on likely very limited and/or flawed empirical studies in my opinion.

igouy · on Aug 5, 2021

> … famous Caper Jones study…

Have you ever considered the validity of that study: which purports to compare programming languages, apparently without considering obvious differences in available programming tools?

Even back in the late '80s, those Smalltalk implementations provided a very integrated development environment — writing Smalltalk in a simple text editor really isn't the same ;-)

chriswarbo · on Aug 5, 2021

> > It's possible to implement anything correctly with dynamic types but the point is static typing makes it easier to do so

> This is where the argument became circular and we can basically stop.

I think any appeal to "easier" is inherently subjective. I've certainly tried to avoid making any such claims. Instead, I've mostly tried to argue that static types can forbid certain programs from ever running, and we can use that to forbid things like vulnerable SQL queries.

> > It's possible to implement anything correctly with dynamic types

> but apparently chriswarbo does not.

I never said any such thing; besides which, it's trivially the case that anything implemented with static types could also be implemented with dynamic types, since static types only exist at compile time (they are "erased" during compilation), and hence every running program is dynamically typed.

In fact, I don't think I've said anything about correct implementations at all. My point is that static types can forbid incorrect implementations. When it comes to security, that is much more important; i.e. I would much rather bang my head against a screen filled with compiler errors, than expose some insecure program to the harsh reality of the Internet.

> The parent claim (by chriswarbo) was that, for example, SQL injection attacks were, in fact, incontrovertible proof that SQL injection attacks "are classic type errors".

I've not claimed that (in fact, I'm stuggling to parse what that sentence might even mean). I claim that SQL injection attacks are "classic type errors" in the sense that:

- It's a widespread problem, easily encountered, commonly warned about, and most developers have probably done it at some point.

- It's a class of error that can be entirely ruled out at the language/API level, through use of static types. Similar to how segfaults can be entirely ruled out at the language level, using managed memory like in Python/JS/etc.

- Due to the above, it's a common example used to illustrate the usefulness of static types in blog posts, articles, etc. One which sticks in my mind is "A type-based solution to the “strings problem”: a fitting end to XSS and SQL-injection holes?" https://blog.moertel.com/posts/2006-10-18-a-type-based-solut...

- Due to the above, it's considered "folk wisdom" (or a "folk theorem") in the relevant fields (e.g. Programming Language Theory, Formal Methods, etc.)

Other examples of "classic type errors" might include:

- Mixing up integers with strings-of-digits, e.g. 'print("You were born around " + (time.now.year - input("How old are you?"))'

- Forgetting to 'unwrap' some list element, e.g. 'open(listdir("config/")).read()' instead of 'listdir("config/").map(f => open(f).read())'

- Mixing up units of measure, e.g. 'distance = speed + time' instead of 'distance = speed * time'

Basically any common mix-up between values/components in a program, where the computer could help up to spot the mix-up (perhaps entirely automatically, if type inference is involved). In the case of SQL injection, the mix-up is between 'string of bytes' and 'SQL query'.

> And yes, that article is somewhat old, but the evidentiary situation has not changed, despite further attempts to make the claim that static typing is provably better. Claims that were resoundingly debunked.

I very much appreciate when researchers attempt to ground things in a bit more empiricism! Unfortunately those particular studies just aren't looking at the sorts of things that I find relevant; in particular, those studies (and that linked blog post, and many of the comments here), seem overly-preoccupied with mundane trivialities like "string", or "int", or "SQLColumnName".

Personally, I'm much more interested in how types can help me:

- Avoid leaking secret information https://link.springer.com/chapter/10.1007/978-3-030-17138-4_...

- Guarantee big-O resource usage https://twanvl.nl/blog/agda/sorting#a-monad-for-keeping-trac...

- Guarantee the correct order of operations https://docs.idris-lang.org/en/latest/st/machines.html

- Prevent AI agents from losing capabilities http://chriswarbo.net/projects/powerplay

mpweiher · on Aug 15, 2021

[SQL injection]

> It's a class of error that can be entirely ruled out at the language/API level, through use of static types.

Repeating this claim doesn't make it true.

Once again: this has nothing to do with static vs. dynamic types, and everything to do with modelling.

To make this clear, let's compare a dynamically and a statically type-checked version of this with the model of SQL as just strings.

Both the dynamically and the statically type-checked version of this program will be susceptible to SQL injection attacks. The statically type-checked version will verify at compile time that all the values are indeed strings.

Now let's compare a dynamically and a statically checked program with a proper model for the SQL, not strings.

Both of these will not be susceptible to SQL injection attacks.

It has nothing to do with static vs. dynamic, and everything with how you model the problem.

chriswarbo · on Aug 4, 2021

> Optional requires generic types (kind-1 types) to be usable

My gut tells me that's not quite right (for some reasonable definition of 'usable'), since we can always build elimination forms into the language (after all, 'null' must be built in, and few take objection to building in elimination forms like if/then/else).

> it also requires some kind of pattern matching or monad comprehensions - otherwise, code using Optional is neither type safe nor readable.

I really like comprehensions in Python, but hardly ever use them elsewhere. In fact, when I do use them in Haskell and Scala, I usually find myself having to refactor them into calls to `map`/`join`/etc. soon after, to have more fine-grained scoping or somesuch.

> trying to offer a generic SQL library that presents an AST as the API for even the most popular RDBMSs quickly gets complex

That's orthogonal to anything I said. Represent SQL using arrays of bytes in memory if you like; offer a string-like interface for constructing and manipulating them if you like; just make sure to distinguish them from other string-like types, in a way that's statically enforceable, and where the only API to convert a 'String' to an 'SQL' is the escaping function.

Note that I don't particularly care if an SQL value is valid; it would be nice to statically enforce that, but you're right that (vendor-supplied) SQL is pretty complex in its own right. What's much more important to get right is that SQL is not user-generated.

simiones · on Aug 4, 2021

> My gut tells me that's not quite right (for some reasonable definition of 'usable'), since we can always build elimination forms into the language (after all, 'null' must be built in, and few take objection to building in elimination forms like if/then/else).

True, you could go the Go array route and have Optional be a special type that is generic, without otherwise supporting generic data structures.

> I really like comprehensions in Python, but hardly ever use them elsewhere.

I think code like possiblyMissing.map(value => code) introduces lots of unpleasant imbrication, and breaking out functions for every possibly-missing value also seems to me to lead to bad readability. I suppose this may be a matter of taste, or it may be related to other language features.

> just make sure to distinguish them from other string-like types, in a way that's statically enforceable, and where the only API to convert a 'String' to an 'SQL' is the escaping function.

I think this is the part that doesn't really work, because, in order to escape a piece of user input, you need to understand what it's meant to be. For example, a common (unsafe) idiom is:

  sqlFormatString = "SELECT %s FROM %s WHERE %s;";
  filterFormatString = "%s LIKE ?";
  filterString = sprintf(filterFormatString, userInputColumnName);
  sqlQuery = sprintf(sqlFormatString, columnName, tableName, filterString);
  preparedQuery = dbImpl.prepare(sqlQuery, userInputValue);

Which of course does the right thing for userInputValue; may or may not be doing the right thing with columnName and tableName, depending on the source; and does the really wrong thing with userInputColumn.

Now, you could replace this with a type safe dbImpl.prepare(), which only takes a NotUserGeneratedSQL type. But now you need to have some way to build a NotUserGeneratedSQL.

One option is to go all in on an SQL AST library, where you could build the query above like

  query = selectQuery();
  query.columnNames = [ escapeColName(columnName) ];
  query.from = escapeTableName(tableName); 
  queryFilter = equalityComparison();
  queryFilter.left = escapeColName(userInputColumn);
  queryFilter.right = preparedArg();
  query.filter = queryFilter;

  preparedQuery = dbImpl.prepare(query, userInputValue)

But this, as I said, gets tedious for the user and complex for the implementer.

However, I don't think there is any alternative that can actually enforce validation of user input in constructed queries. You can help users think about it by forcing them to use some kind of string -> NotUserGeneratedSQL function, but this function can't actually be written to reject the example I gave initially. A function that creates an empty minimal NotUserGeneratedSQL and offers other functions to build it up to a useful query will either accept strings (making it possible to introduce SQL injection) or have to accept structured SQL, making it fall into the AST problem.

Of course, the problem of NotUserGeneratedSQL is easy if you don't want to support any kind of dynamic query, other than prepared statements. But if you want to dynamically generate queries based on user input (e.g. user chooses which columns they want to see, which columns they want to filter by etc.), then I don't think it's possible to statically ensure that SQL injection is not possible without an ORM or SQL AST API.

Edit: if you believe otherwise, please show me a type-safe, SQL-injection-safe library in any language that isn't an ORM/EDSL or a straight up SQL AST manipulator. I am quite certain none exists, but I would be happy to be proven wrong.

chriswarbo · on Aug 5, 2021

What I had in mind was something along these lines:

    allowedColumns: Map[UserInput, SQL] = {
      "name": "name",
      "age": "age",
      ...
    }

    strict: SQL = strictComparison? "=" : "LIKE"
    
    query: Maybe[SQL] = allowedColumns
      .get(userCol)
      .map(col => "SELECT " + col + " FROM tbl WHERE tbl.foo " + strict + addParameter(":?", userFoo))

In particular:

- We can write literals which look like strings but have type SQL

- We can append fragments of SQL together (potentially making it invalid; oh well)

- We can include user input via parameterised queries, in locations where arbitrary strings/ints/etc. are allowed

- Anything 'structural', like identifiers, choice of comparison operations, building up sub-expressions, etc. must be done programatically, using the above features. In this case we select the column name (written as a static, literal SQL value) by looking up the user's input in a map. We also allow choosing between the type of comparison to use (again, both are SQL literals).

It seems to me that requirements like 'user chooses which columns they want to see, which columns they want to filter by etc.' is fundamentally incompatible with a safe, string-like representation. Instead, the options are:

- Fully representing the structure of the language. This results in an AST approach, which allows safe dynamic queries. I think any alternative, like tracking the offset of each delimiter in a string, etc. will turn out to be equivalent to maintaining an AST.

- A "flat", string-like representation, whose dynamism is limited to choosing between some combination of pre-supplied fragments. This is what I've shown above. This is safe, but the 'dynamism' is inherently limited up-front (i.e. it's overly conservative).

- A "flat", string-like representation, which has unrestricted dynamism, but hence is also inherently unsafe (i.e. it's overly liberal).

simiones · on Aug 6, 2021

> - We can write literals which look like strings but have type SQL

What I'm not clear is: what prevents me from accidentally/stupidly doing:

  filter : SQL = "WHERE " + userInputCol + " = ?"

Is this special string handling some compiler magic that distinguishes literal strings from string variables? If so then I think that in a language that supports something like this you can indeed make a safe library. The main downside is that you need to work entirely with compile-time constructs - e.g. you can't use something like printf to take a compile-time format string and turn it runtime into a query; and you can't take queries from a separate file, they must be in source code. But these may be acceptable trade-offs.

Do you know of any library that implements this?

chriswarbo · on Aug 7, 2021

> Is this special string handling some compiler magic that distinguishes literal strings from string variables?

Ah, maybe I should have made it clearer that I was overloading the double-quote syntax, so we can write:

    "foo": String
    "bar": SQL
    "baz": UserInput
    "quux": Shell
    etc.

I was also relying on type inference to figure out which is which, and on '+' returning the same type as both arguments, e.g.

    +: String  -> String  -> String
    +: SQL     -> SQL     -> SQL
    +: Int     -> Int     -> Int
    +: Float   -> Float   -> Float
    +: List[T] -> List[T] -> List[T]
    etc.

This way, we see how your example fails to typecheck:

    // Code as written
    filter : SQL = "WHERE " + userInputCol + " = ?"

    // Right-hand-side must have type SQL, to match left-hand-side
    "WHERE " + userInputCol + " = ?": SQL

    // Resolving order-of-operations of the two '+' operations
    ("WHERE " + userInputCol) + " = ?" : SQL
  
    // Arguments to outer '+' have same type as return value, which is SQL
    "WHERE " + userInputCol : SQL
    " = ?" : SQL

    // Arguments to inner '+' have same type as return value, which is SQL
    "WHERE " : SQL
    userInputCol : SQL

We've inferred that userInputCol must have type SQL, so it will fail for String/UserInput/whatever.

> The main downside is that you need to work entirely with compile-time constructs - e.g. you can't use something like printf to take a compile-time format string and turn it runtime into a query; and you can't take queries from a separate file, they must be in source code.

Yep, although macros could help with that sort of thing, e.g. Haskell's quasiquotation https://wiki.haskell.org/Quasiquotation

A couple of Google hits for 'haskell quasiquote sql':

https://hackage.haskell.org/package/postgresql-simple-0.6.2/...

https://hackage.haskell.org/package/postgresql-query

> Do you know of any library that implements this?

Not completely. De-coupling double-quoted literal syntax from a single String type can be done with Haskell's OverloadedStrings feature, but that relies on a function 'fromString : String -> t', which is what we're trying to avoid https://hackage.haskell.org/package/base-4.6.0.1/docs/Data-S...

Scala's custom interpolators are similar, but they rely on a function from 'StringContext -> t', and StringContext is easily created from a String ( https://www.scala-lang.org/api/current/scala/StringContext.h... )

To ensure safety, we would need some way to encapsulate the underlying fromString/StringContext implementation to prevent it being called by anything other than the literal-expansion at compile-time.

Of course, if we're willing to use macros then it's pretty easy, like those quasiquote examples above.

Haskell's module system is famously mediocre, so it might be possible to do this overloading + encapsulation with Idris https://idris2.readthedocs.io/en/latest/reference/overloaded...

(Of course, Idris also has an incredibly expressive type system, and a very powerful macro system, AKA "elaborator reflection", so it can definitely be done; but I haven't figured out the cleanest way)

alerighi · on Aug 4, 2021

And there are a ton of other type errors that you cannot catch. For example take a method that wants a string in a particular format, you pass a string in another format and it breaks. A method that wants an integer in a range. An array of maximum N elements. And we can go on forever.

So what you do? I find code that uses excessive type checks too verbose, you end up doing things only to make the compiler happy, even if you know that a particular condition will never happen (e.g. a variable that can be null but given a particular condition you know that it cannot be because it was initialized earlier, but the compiler is not smart enough to know).

So to me using static types makes sense where types are used by the compiler to produce faster code (embedded contexts, basically, of course I don't use Python on a microcontroller) and as a linting of the code (Python type annotations or Typescript).

igouy · on Aug 4, 2021

> For example take a method that wants a string in a particular format, you pass a string in another format and it breaks.

"SafeStrings: Representing Strings as Structured Data"

https://arxiv.org/abs/1904.11254

Bosque

https://github.com/microsoft/BosqueLanguage/blob/master/docs...

chriswarbo · on Aug 5, 2021

> For example take a method that wants a string in a particular format, you pass a string in another format and it breaks.

This sentence betrays a confusion about what types are, and how to use them. In particular, you seem to be mixing up two possible scenarios.

In the first scenario, we have a method which "wants a string" (i.e. its input type is 'String'), and we can "pass a string" (i.e. call it with an argument of type 'String'). This code passes the type-checker, since it implements the specification we gave: accept 'String'. Perhaps there are higher-level concepts like "formats" involved, and the resulting application doesn't work as intended; but that's not the fault of the types. More specifically: typecheckers verify that code implements its specification; but they do not validate that our specification captures the behaviour we want.

In the second scenario, we have a method which "wants a string in a particular format" (i.e. its input type is 'Format[A]'), and we "pass a string in another format" (i.e. call it with an argument of type 'Format[B]'). That is a type error, and the typechecker will spot that for us; perhaps pointing us to the exact source of the problem, with a helpful message about what's wrong and possible solutions. We do not get a runnable executable, since there is no meaningful way to interpret the inconsistent statements we have written.

tl;dr if you want a typechecker to spot problems, don't tell it that everything's fine!

If you want the safety and machine-assistance of such 'Format' types, but also want the convenience of writing them as strings, there are several ways to do it, e.g.

- Polymorphism http://okmij.org/ftp/typed-formatting/FPrintScan.html#print-...

- Macros https://www.cse.iitk.ac.in/users/ppk/teaching/cs653/notes/le...

- Dependent types https://gist.github.com/chrisdone/672efcd784528b7d0b7e17ad9c...

> A method that wants an integer in a range. An array of maximum N elements. And we can go on forever.

You're making the same mix-up with all of these. Fundamentatlly: does your method accept a 'Foo' argument (in which case, why complain that it accepts 'Foo' arguments??!) or does it accept a 'FooWithRestriction[R]' argument (in which case, use that type; what's the problem?)

Again, there are multiple ways to actually write these sorts of types, depending on preferences and other constraints (e.g. the language we're using). Some examples:

- https://hackage.haskell.org/package/fin

- https://ucsd-progsys.github.io/liquidhaskell-blog

Your "integer in a range" is usually called 'Fin n', an array with exactly N elements is usually called 'Vector n t', etc. I don't think there's an "array of maximum N elements" type in any stdlib I've come across, but it would be trivial enough to define, e.g.

    data ArrayOfAtMost Nat (t: Type) where
      Nil  : {n: Nat} -> ArrayOfAtMost n t
      Cons : {n: Nat} -> (x: t) -> (a: ArrayOfAtMost n t) -> ArrayOfAtMost (S n) t

This is exactly the same as the usual 'Vector' definition (which is a classic "hello world" example for dependent types), except a Nil vector has size zero, whilst a Nil 'ArrayOfAtMost' can have any 'maximum size' we like (specified by its argument 'n'; the {braces} tell the compiler to infer its value from the result type). We use 'Cons' to prepend elements as usual, e.g. to represent an array of maximum size 8, containing 5 elements [a,b,c,d,e] of type 'Foo', we could write:

    myArray: ArrayOfAtMost 8 Foo
    myArray = Cons a (Cons b (Cons c (Cons d (Cons e (Nil 3)))))

Note that (a) we can write helper functions which make this nicer to write, and (b) just because the code looks like a linked-list, that doesn't actually dictate how we represent it in memory (we could use an array; or we could optimise it away completely https://en.wikipedia.org/wiki/Deforestation_(computer_scienc... )

I actually wrote a blog post many years ago which goes through various types which have a similar 'Nil and Cons' structure http://chriswarbo.net/blog/2014-12-04-Nat_like_types.html

Also note that we don't actually need to write such constraints directly, since we can often read them in from some external specification instead (known as a "type provider"), e.g. a database schema https://studylib.net/doc/18619182

> you end up doing things only to make the compiler happy, even if you know that a particular condition will never happen

You might know it, but the compiler doesn't; you can either prove it to the compiler, which is what types are for; or you can just tell the compiler to assume it, via an "escape hatch". For example, in any language which allows non-termination (including Idris, which lets us turn off the termination checker on a case-by-case basis), we can write an infinite loop which has a completely polymorphic type, e.g.

    -- Haskell version
    anything: a
    anything = anything

    -- Dependent type version
    anything: (t: Type) -> t
    anything t = anything t

We can use this to write a value of any type the compiler might require. Even Coq, which forbids infinite loops (except for co-induction, which is a whole other topic ;) ) we can tell the compiler to assume anything we like by adding it as an axiom.

> So to me using static types makes sense where types are used by the compiler to produce faster code (embedded contexts, basically, of course I don't use Python on a microcontroller) and as a linting of the code (Python type annotations or Typescript).

Static types certainly give a compiler more information to work with; although whether the resulting language is faster or not is largely orthogonal (e.g. V8 is a very fast JS interpreter; Typed Racket not so much).

As for merely using a type system for linting, that's ignoring most of the benefits. No wonder your code is full of inappropriate types like "string" and "int"! I highly recommend you expand your assumptions about what's possible, and take a look at something like Type Driven Development (e.g. https://www.idris-lang.org )

mpweiher · on Aug 4, 2021

> "Unrestricted Upload of File with Dangerous Type"

You are confusing "type" with "static type". This is user input. You are not going to statically typecheck your users.

(See other comment for stringly-typing).

chriswarbo · on Aug 4, 2021

> You are confusing "type" with "static type". This is user input. You are not going to statically typecheck your users.

You are confusing "type" with "not a function type".

Firstly, I absolutely want to check, statically, that my user input is `ByteString`, or `String`, or some more distinguished newtype/AnyVal wrapper around those. If that doesn't pass the type checker, I would be very worried about ever starting such an executable!

Other than that, I want to check, statically, that my functions conform to their type signatures. When I write `parsePNG userInput: Maybe PNG`, not only do I want to make damn sure that `userInput: ByteString` is definitely a `ByteString`; I also want to make damn sure that `parsePNG: ByteString -> Maybe PNG` is definitely a (partial) function from `ByteString` to `Maybe PNG`. If I can't guarantee, without executing the program, that `parsePNG` can only return `Maybe PNG` values, then all bets are off regarding downstream invariants, and that's not a good situation to be in w.r.t. security vulnerabilities.

mpweiher · on Aug 4, 2021

> I absolutely want to check, statically, that my user input is `ByteString`, or `String`

My condolences, you must have horribly unreliable user widget/user input libraries. My TextFields always deliver a "String" to me (unless I've set them up to deliver numbers). I don't need to check that statically.

> You are confusing "type" with "not a function type".

No I am not.

You don't seem to understand what I meant with "You are not going to statically type check your users".

chriswarbo · on Aug 7, 2021

> My condolences, you must have horribly unreliable user widget/user input libraries

The libraries seem pretty reliable; but I'm not (exactly the same as in the SQL example, BTW).

For example, I don't think I've ever managed to write a Python3 program which hasn't crashed at some point due to me mixing-up string/bytes; of course, testing doesn't usually find such problems, even with Hypothesis, since I make exactly the same mistakes when I'm writing the tests. I've also written a whole bunch of Python3 programs which spend ages crunching numbers, only to output '<map object at 0x10199c2e0>' instead of printing a list.

I make exactly the same mistakes when writing Haskell and Scala; but the computer tells me that I've made a mistake, and how to fix it, before the program ever gets a chance to run.

> You don't seem to understand what I meant with "You are not going to statically type check your users".

That might be the case. I can't elaborate on what you meant (since I may have misunderstood)

mpweiher · on Aug 15, 2021

> Python3 program which hasn't crashed at some point due to me mixing-up string/bytes

Not sure what that even is, but I'd point out that dynamically-typed ≠ Python. In particular, the "hash languages" (Python, Ruby, JavaScript) seem to make it particularly easy to make a hash of things, but not due to the dynamic typing.

I've never had that problem with NSString and NSData. They are quite distinct.

> Python3 ... crunching numbers ... map instead of list

??

Why would you output a map when you are crunching numbers?

To get a "list", for example, I'd use an NSArray. It seems tricky to get an NSDictionary instead, since you have to explicitly ask for an NSDictionary in order to get one.

gpderetta · on Aug 4, 2021

Unless you can exhaustively check your function for its whole domain, you really can't . Your function could indeed return pink elephants when passed unicorns.

dragonwriter · on Aug 4, 2021

> Unless you can exhaustively check your function for its whole domain, you really can't.

If I generate values within a function that are arguments to a constructor call, and the make the constructor call and return the result with no further manipulation, I can—without exhaustive, or even any—testing be quite confident that the function will return the type for which the constructor so called is the constructor any time it returns, and fail otherwise.

seanwilson · on Aug 4, 2021

> and return the result with no further manipulation, I can—without exhaustive, or even any—testing be quite confident

But when you start having loops, conditions, arrays, nullable references etc. you can't be so confident any more unlike with proper type checking where you can be fully confident.

gpderetta · on Aug 4, 2021

right, but it is not the unit test that is giving you that confidence and the unit test will not help you if the function is later modified.

What you have, after manual inspection of the code, is some sort of informal and not written down correctness proof. A machine tested proof would be better and types help with that.

ideamotor · on Aug 4, 2021

I would like to use static types, since dynamic types defeat the point by requiring so many checks. I’m still typing a lot. But in my field, I just don’t think it makes sense (data analysis), and frankly it would create a lot of work that would not make me more employable. Maybe some day. Haskell does interest me, but I work in teams that won’t know it.

kpmah · on Aug 4, 2021

I'm a type enthusiast, but dynamic languages make a lot of sense for small programs that either give you an immediate answer or fail to run (i.e. not long running, few branches).

In those cases, the difference between a compile-time check and a run-time check is much smaller.

pydry · on Aug 4, 2021

If you have a decent test suite the difference also narrows considerably.

Zababa · on Aug 4, 2021

I think that might be an extension that dynamic languages are good when you want to quickly write glue code, for example Rails. On the other hand, when your quick glue code becomes a very important batch job, gets more complex, or your Rails application becomes really big, you might want strict types. It's also hard as everything starts small, and it can be hard to know what will get big or not.

blacktriangle · on Aug 4, 2021

As somebody who works with Rails and likes dynamic languages like Clojure, I would assert Rails scaling issues are not related to language dynamism, they are related to Rails making incredibly poor design decisions.

chriswarbo · on Aug 4, 2021

I would say data analysis is actually very well suited to static types. In particular, static types allow us to be much more specific than e.g. string/int/float (AKA "stringly typed programming"). We can use more specific types like 'count', 'total', 'average', 'dollars', 'distance', 'index', etc. all of which are numeric, but we want to avoid mixing up (e.g. counts can be added, but a count should not be added to a distance).

I also find 'phantom types' incredibly useful. For example, we might have a database or key/value store, where the keys are represented as strings and the values can be strings, integers, floats, booleans, etc. We can 'label' each key with a "phantom type" corresponding to the values. This way, the compiler knows exactly what types are expected, we only have to handle the expected type, we don't need any boilerplate to wrap/unwrap the database results, etc.:

    -- Haskell syntax
    data DBKey t = DBKey String

    dbLookup :: DBKey t -> Maybe t
    dbLookup (DBKey key) = ...

    name = DBKey @String "name"
    age  = DBKey @Int    "age"

    // Scala syntax
    final case class DBKey[T](override val toString: String)

    def dbLookup[T](key: DBKey[T]): Try[T] = ...

    val name = DBKey[String]("name")
    val age  = DBKey[Int   ]("age" )

These are called "phantom types" since they don't actually affect anything about how the code runs; in particular, the definition of 'DBKey' requires a type ('t' or 'T') but doesn't actually use it for anything (it just stores a String). We could get rid of these 'labels' and the code would run identically; or we could have the compiler infer all these labels (defaulting to some uninformative type like 'unit' if it's ambiguous). However, the fact that we've explicitly labelled 'name' with String, and 'age' with Int, constrains the rest of our code to treat them as such, otherwise we'll get a compiler error. For example, if we try to look up a name then our result handler must accept a String, and nothing else; if we provide a default age it must be an Int; etc.

Of course, we can combine such 'labels' with more meaningful types; so that 'name' looks up a 'Username'; 'age' looks up an 'Age', or a 'Nat', or a 'Duration'; or whatever's convenient. It can also be combined with other type system features like implicits/ad-hoc-/subtype-polymorphism, e.g. to decode JSON as its read from the DB, etc.

I've used the example above in real Scala applications when accessing a key/value store. I've also used phantom types (again in Scala) to help me manipulate Array[Byte] values: in that case the bytes could either be 'Plaintext' or 'Encrypted', they could represent either a 'Payload' or a 'Key', and each 'Key' could be intended for either 'Encryption' or 'Signing'. Under the hood these were all just arrays of bytes, and the runtime would treat them as such (using Scala's 'AnyVal' to avoid actually wrapping values up), but these stronger types made things much clearer and safer.

For example, we can decrypt and verify a message which carries its own 'data keys' (encrypted with a 'master key'):

    def decrypt[T](
      key : Bytes[Plaintext, Key[Encryption]],
      data: Bytes[Encrypted, T],
    ): Try[Bytes[Plaintext, T]] = ...

    def validated[T](
      key : Bytes[Plaintext, Key[Signing]],
      data: Bytes[Plaintext, T],
    ): Try[Bytes[Plaintext, T]] = ...

    def decryptMessage[T](
      master : Bytes[Plaintext, Key[Encryption]],
      dataEnc: Bytes[Encrypted, Key[Encryption]],
      dataSig: Bytes[Encrypted, Key[Signing   ]],
      message: Bytes[Encrypted, T              ],
    ): Try[Bytes[Plaintext, T]] = for {
      encKey <- decrypt(master, dataEnc)
      sigKey <- decrypt(master, dataSig)
      raw    <- decrypt(encKey, message)      
      valid  <- validated(sigKey, raw)
    } yield valid

The compiler makes sure we never try to use an encrypted key, we never try to decrypt plaintext, we can't decrypt with a signing key or a payload, we can't validate signatures with an encryption key or a payload, our result type must match the message type, etc. After checking, these types are "erased" to give an implementation of type '(Array[Byte], Array[Byte], Array[Byte], Array[Byte]) => Array[Byte]', which we could have written directly, but I certainly wouldn't trust myself to do so!

bmitc · on Aug 4, 2021

Check out F#.

roenxi · on Aug 4, 2021

> ... Java, C and C++ have much weaker checks of correctness...

It is confusing that people talk about C in terms of being a typed language. The types are an annotation so the compiler knows how much memory to allocate / read. All the side benefits of having to spell that out are incidental.

That sort of typing is a distraction. A programmer does not care that they are passing 4 bytes as the first argument. That isn't really trying to squeeze anything useful out of a type system. They care that they are passing something recognised as an ID, not that it has 4 bytes.

Using C/C++ as an example of a typed language doesn't say much. The guarantees are more a culture of how to use the language than in the language itself.

kaba0 · on Aug 4, 2021

With the same argument, why put C++ into the same bracket? It has much much more type safety than C.

edflsafoiewq · on Aug 4, 2021

  struct ID { uint32_t val; };

  void check_id(struct ID id);

kamaal · on Aug 4, 2021

>>why would I give up getting the computer to automatically check for me at compile time that I'm passing the right number of parameters to a function

I never understood this, how is it possible that a developer calling a function doesn't test if they are calling it right. Once they test it, how does the additional compiler test matter?

Unless of course your calling function is mutating the value of a variable across so many types that it's impossible to realistically test it. It would then make sense to use a type to prevent the variable from being something it shouldn't.

But most people, don't have that situation most of the time. Even in a big application.

May be that explains why something like Python has such wide adoption.

kaba0 · on Aug 4, 2021

How many ways can you call a single function? Like, it may be called from a loop, with a quickly changing variable that may change type here or there (especially with weak typing), and you really can’t just say that “I tested it with this one data and it worked”.

kamaal · on Aug 5, 2021

Most people generally use one variable for one purpose.

Of the all the constraints I've faced while developing software, reusing variables for many purposes in one code block isn't one of them.

majjgepolja · on Aug 4, 2021

The main advantage of typing is, for me, IDE showing errors as you type and better autocomplete, and they serve as documentation also (Of course you should read docs but types are helpful as quick reminder.)

Maybe I am just more absent minded than most dynamic programmers, but live IDE diagnostics and autocomplete are very important for me.

kamaal · on Aug 5, 2021

Don't we have support for most of this in language plugins for Python in vscode et al, already?

majjgepolja · on Aug 7, 2021

It's not as accurate or satisfactory in my experience. For example I get errors after running in python, which could be detected in statically typed languages while typing in IDE. Autocomplete is also less than optimal. As for autocomplete intellij IDEs > LSPs for typed languages > Untyped languages IME.

dragonwriter · on Aug 4, 2021

> I’m with the other posters too about the article content, if you haven’t coded in a language like OCaml, Haskell, ML etc. you really don’t know enough about what strong static types offer to discount them.

And if you have, you probably know that its awesome but for most purposes doesn’t compensate for the weak ecosystems of the languages that have it, compared either to poorer (higher cost to satisfy for lower benefits) statically typed languages or dynamically typed languages with or without available optional static type checking.

brooklinmyers · on Aug 4, 2021

I joined a team recently without a solid test suite or testing practices. Really excited to (gently) spread the gospel of TDD (joking). I have the same thought process though, without tests to back me up I feel exposed.

Less opinionated/experienced with strong type systems though, I do appreciate the flexibility of being able to hack things together to see if they work - then improving upon the solution. strong types seem to make that harder.

justsomeuser · on Aug 4, 2021

I think strong types actually make this easier.

I often build systems, then break them up to take the working parts out, then put a subset back together.

The type system is like magnetic Lego that lets you snap pieces together.

With a dynamic system it is really difficult to break things apart and then put them back together (without ever running the code, and with 100% accuracy and immediate feedback).

_3u10 · on Aug 4, 2021

It really depends on the team size and a number of other factors.

Dynamic languages can be very productive for small teams.

I definitely liked the productivity of F#, but the ecosystem was lacking. Often in dynamic languages the ecosystem more than makes up for the lack of compile time checks.

omegalulw · on Aug 5, 2021

Also add Agda to your list. It's a good entry point to functional programming!

mabelgoodrich · on Aug 5, 2021

This is really wonderful information and I would love to read more informative posts like this. Being a digital marketing expert at this https://flirtymania.com/de/p site and help people with cam chat room opportunity, I need to know some basic graphic designing tasks as well as programming language part! Thanks for sharing this post with us.

zzzeek · on Aug 4, 2021

> if you haven't coded in a language like OCaml, Haskell, ML etc. you really don't know enough about what strong static types offer to discount them. Java, C and C++ have much weaker checks of correctness and lack type inference + pattern matching that make strong types pleasant to work with.

Those are all functional languages, so this argument seems to be the "functional is better than OO" argument, typing is not the primary factor (but of course to the degree that FP allows static typing to be more intuitive, that's a plus). What examples of static typing with OO or procedural languages that are only a "pleasure to work with" can we look at ?

klysm · on Aug 4, 2021

I think functional languages historically have much better type systems but recently, more procedural languages have taken a lot of those innovations and integrated them (e.g. typescript, rust)

kragen · on Aug 4, 2021

The simply-typed lambda-calculus [01940] predated Abadi and Cardelli's object-calculus or sigma-calculus [01996] by 56 years, and both of those have trouble with side effects; the kind of non-sharing thread-safety guarantees Rust's type system can provide for mutable data structures stem from Jean-Yves Girard's "linear logic" [01987] and Wadler's "linear types" [01990], although Reynolds published a POPL paper in 01978 that we can recognize in retrospect as having invented a Rust-like affine typing system.

I've never been able to make heads or tails of Girard's paper but http://pauillac.inria.fr/~fpottier/slides/fpottier-2007-05-l... is an introductory divulgation of some of the pre-Rust history of this stuff. It makes me think I should read Wadler's 01990 paper.

So I think functional languages kind of had a head start, but also in a sense they have an easier problem to solve. ML's type system doesn't even have subtyping, much less mutation and resource leak safety.

tome · on Aug 6, 2021

Thanks, that presentation is absolutely fascinating!

Zababa · on Aug 4, 2021

> What examples of static typing with OO or procedural languages that are only a "pleasure to work with" can we look at ?

OCaml and Rust would fit the bill. You can do procedural and OO in both. You could add Scala to that. I think the conclusion is not that "functional is better than OO" but "they both have their strength and are not mutually exclusive".

cb321 · on Aug 4, 2021

You could add Nim [1] to that list. Many people say it has "the ergonomics of Python but the strong static typing of C++/Rust."

[1] https://nim-lang.org

lolinder · on Aug 4, 2021

In additional to what others have mentioned (OCaml has OOP), Kotlin is very good. It's not as expressive as ML-family languages, but sealed classes + null safety + type inference makes for a very fluid, almost dynamic programming experience that still has strong guarantees.

bmitc · on Aug 4, 2021

OCaml and F# are functional first multi paradigm languages. They’re perfectly fine being used for OOP.

edem · on Aug 4, 2021

And you also have the problem of writing that type checker in a dynamic language, so you also have no way to check whether the checker itself works according to your specification.