
> Write a test, which proves for all real numbers, that y = 2 * x.

Tests by definition do not prove any for-all property; they only check correctness on concrete values. Formal methods are great for proving properties over all values: for your example, one could easily write a logical formula and hand it off to an SMT solver, and there is plenty of prior work on synthesis from such specifications.
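To make the distinction concrete, here is a minimal Ruby sketch (hypothetical `double` function, not from the thread) of why a pointwise test cannot establish a for-all property:

```ruby
# A deliberately wrong implementation that still passes a concrete test:
# it is only correct at the single point the test happens to check.
def double(x)
  x == 3 ? 6 : 0
end

raise "test failed" unless double(3) == 6  # the concrete test passes...
# ...yet the universal property "for all x, double(x) == 2 * x" is false:
# double(5) returns 0, not 10.
```

An SMT solver given the formula directly would instead refute this candidate by producing such a counterexample.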

> Tests can't fully specify code - this effort is doomed to be incomplete.

RbSyn addresses a different problem: it is difficult to formally specify the behavior of an application written in a high-level language that handles HTTP requests and accesses a database. The incompleteness of tests is an advantage here--it keeps the specification tractable (these are tests programmers would have written anyway) while constraining the search space enough for the synthesizer to give a solution. In practice, it turns out these tests are good enough to synthesize methods from a lot of production grade Ruby on Rails web apps (https://arxiv.org/abs/2102.13183).
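As a rough illustration of the tests-as-specification idea (hypothetical method and data, not RbSyn's actual API or benchmarks):

```ruby
# The specification: a unit test the programmer would have written anyway.
def test_unread_count
  posts = [{ read: false }, { read: true }, { read: false }]
  unread_count(posts) == 2
end

# The kind of small method a synthesizer can find by searching over
# compositions of library calls until the test above passes.
def unread_count(posts)
  posts.count { |p| !p[:read] }
end

raise "spec not satisfied" unless test_unread_count
```

The test is incomplete as a specification, but it prunes the search space enough to make enumeration of candidate method bodies feasible.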



> while constraining the search space enough for the synthesizer to give a solution.

Keyword being "a". Best solution? Most easily maintained? I don't know about that.

> In practice, it turns out these tests are good enough to synthesize methods from a lot of production grade Ruby on Rails web apps

Yeah, for some very basic stuff, you might have invented an overly complicated way of building boilerplate code.

Your argument is that the tests being incomplete makes the problem tractable, and that this is an "advantage" somehow - I don't think it is. The majority of "useful" tests are either stand-ins for what the language should have given you anyway (proper type checking), or they find and fix known bugs, e.g. the intersection of three features.

They rarely properly specify functionality; usually the best way to do that is to write the actual code. At which point, to properly specify your functionality in "test" form, you must write the actual code as a test. This doesn't provide much benefit, beyond perhaps having automated one side of a counterproductive process. You do save some programmer time, but writing the code never was the hard part...


> Best solution? Most easily maintained?

We apply Occam's razor here: for a given problem, a smaller program that solves the problem is the preferred solution. It is not a very good metric, and it is fine to disagree with it. But getting the "best" solution requires one to define what "best" is, and there is no such agreed-upon definition for code.
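The "smaller is preferred" rule can be made precise as a ranking over candidates that all pass the tests. A toy Ruby sketch (assumption: size measured by whitespace-separated token count; a real synthesizer would typically rank by AST node count):

```ruby
# Two behaviorally equivalent candidate programs that both satisfy the tests.
candidates = [
  "xs.select { |x| x.odd? }.length",
  "xs.count(&:odd?)"
]

# Occam's-razor-style ranking: pick the syntactically smallest candidate.
best = candidates.min_by { |src| src.scan(/\S+/).length }
# best is the shorter "xs.count(&:odd?)" form.
```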

> The majority of "useful" tests are either stand-ins for what the language should have given you anyway (proper type checking), or they find and fix known bugs, e.g. the intersection of three features.

Finding and fixing known bugs is one aspect of it. You are not accounting for the entire philosophy of test-driven development, where one specifies functionality as tests upfront, and later writes code to pass those tests. Anyhow, we have found that large projects like Discourse [1], GitLab [2], and Diaspora [3] have tests that specify code behavior well enough for RbSyn to synthesize the corresponding methods. Very few of their tests are type checks, bug fixes, or intersections of three features.

[1]: https://github.com/discourse/discourse [2]: https://github.com/gitlabhq/gitlabhq [3]: https://github.com/diaspora/diaspora


> We apply Occam's razor here.

Occam's razor is ... flawed. "Lazy" if I'm not trying to be polite. It's a tool you use to keep from going mad, but it doesn't provide correct answers, just _sometimes_ statistically correct ones. Sure, fine--you admit it's not a very good metric, and I do disagree here.

> You are not accounting for the entire philosophy of test-driven development, where one specifies functionality as tests upfront, and later writes code to pass those tests.

Some of that philosophy is flawed too.

Your anecdotes are interesting; I would like to see a comparison between the original programs and your synthesized versions.

But back to the philosophy bit - if you specify your functionality in your tests, you are still writing the code, just in test form. It doesn't matter whether you translate the test back to code manually or automatically; you're still employing some metalanguage to code your program. Saying "if this, do that" in a test and then writing the corresponding if/else branch is not impressive, or useful.
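The point about the test mirroring the code can be made concrete. In this hypothetical Ruby sketch, the test and the implementation encode the same conditional, just in different clothes:

```ruby
# The "specification": one assertion per branch of the desired behavior.
def test_shipping
  shipping(100) == 0 && shipping(20) == 5
end

# The implementation restates the test's branches as an if/else.
def shipping(total)
  if total >= 50
    0
  else
    5
  end
end

raise "tests failed" unless test_shipping
```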

To me, the philosophy of a test is to verify functionality, not to completely specify it. If you go backwards from tests to code, even when that step is made very easy, you've either already written your program in test form, or you've produced some swiss cheese code.



