Show HN: Program Synthesis for Ruby

eatonphil · on April 13, 2022

> Given a method specification in the form of tests, type and effect annotations with RDL, this synthesizes a Ruby function that will pass the tests. This reduces programmer effort to just writing tests that specify the function behavior and the computer writing the function implementation for you.

That's really neat! But doesn't this just mean that instead of writing your code and logic in Ruby you're writing it in RDL?

Also, it would be great to see some examples in the README or in an examples folder. It's kind of hard for me to tell exactly what I need to write and what to call to generate the Ruby.

sankha93 · on April 13, 2022

> But doesn't this just mean that instead of writing your code and logic in Ruby you're writing it in RDL?

RDL allows the specification of types and effect labels only. You still write the tests in standard Ruby and synthesized code should satisfy the logic as checked by the tests.

> Also, it would be great to see some examples in the README or in an examples folder. It's kind of hard for me to tell exactly what I need to write and what to call to generate the Ruby.

Sorry for the sparse documentation, I plan to improve that. I have added an example to the readme. Other examples can be found in `test/benchmark` folder.

evolve2k · on April 13, 2022

It would be wonderful if you could embed a short video that explains the concept and shows some practical examples.

What it can do now. What are the limits. What’s planned next? Are questions we have as first time visitors.

sankha93 · on April 13, 2022

Thanks for the feedback! I just uploaded a video now that demos RbSyn in action: https://www.youtube.com/watch?v=yYAOK5S8lRY

The concepts, capabilities, and limitations of RbSyn are explained in this talk: https://www.pldi21.org/poster_pldi.124.html with full details in the accompanying PLDI paper: https://arxiv.org/abs/2102.13183

I will link these from the readme soon.

smaudet · on April 13, 2022

> You still write the tests in standard Ruby and synthesized code should satisfy the logic as checked by the tests.

I think its important that the code still be user-modifiable:

Take this function, y = 2 * x.

Write a test, which proves for all real numbers, that y = 2 * x.

Can't do it, can you?

Tests can't fully specify code - this effort is doomed to be incomplete.

sankha93 · on April 14, 2022

> Write a test, which proves for all real numbers, that y = 2 * x.

Tests by definition do not prove any for all property. They just check on correctness on concrete values. Formal properties are great for proving properties on values, i.e., for your example one could easily write a logical formula and hand it off to a SMT solver for synthesis and there is enough prior work on that.

> Tests can't fully specify code - this effort is doomed to be incomplete.

RbSyn addresses a different problem: it is difficult to formally specify the behavior of an application in a high-level language that handles HTTP requests, accesses the database. Incompleteness of tests are an advantage here--it allows the specification to be tractable (these are tests programmers would have written anyway), while constraining the search space enough for the synthesizer to give a solution. In practice, it turns out these tests are good enough to synthesize methods from a lot of production grade Ruby on Rails web apps (https://arxiv.org/abs/2102.13183).

smaudet · on April 14, 2022

> while constraining the search space enough for the synthesizer to give a solution.

Keyword being "a". Best solution? Most easily maintained? I don't know about that.

> In practice, it turns out these tests are good enough to synthesize methods from a lot of production grade Ruby on Rails web apps

Yeah, for some very basic stuff, you might have invented an overly complicated way of building boilerplate code.

Your argument is that the tests being incomplete make it tractable and this is an "advantage" somehow - I don't think this is such an advantage. The majority of "useful" tests either are stand-ins for what the language should have given you anyways (proper type checking), or they find and fix known bugs, e.g. the intersection of three features.

They rarely properly specify functionality, usually the best way to do this is writing the actual code. At which point, to properly specify your functionality in "test" form, you must write the actual code as a test. This doesn't provide much benefit, outside of having perhaps automated one side of a counterproductive process. You do save some programmer time, but writing the code is and was never the hard part...

sankha93 · on April 14, 2022

> Best solution? Most easily maintained?

We apply the Occam's razor here. For a given problem, a smaller program that solves the problem is a preferred solution. It is not a very good metric, and it is fine to disagree with it. But getting a "best" solution, requires one to define what "best" is, and there is no such agreed definition for code.

> The majority of "useful" tests either are stand-ins for what the language should have given you anyways (proper type checking), or they find and fix known bugs, e.g. the intersection of three features.

Find and fix known bugs is one aspect of it. You are not accounting the entire philosophy of test driven development, where one specifies functionality as tests upfront, and later writes code to pass those tests. Anyhow, we have found large projects like Discourse [1], Gitlab [2], Diaspora [3] have tests that specify code behavior well enough so that RbSyn can synthesize them. Very few tests are type-checking, bug fixing and intersection of 3 features.

[1]: https://github.com/discourse/discourse [2]: https://github.com/gitlabhq/gitlabhq [3]: https://github.com/diaspora/diaspora

smaudet · on April 15, 2022

> We apply the Occam's razor here.

Occam's razor is ... flawed. "Lazy" if I'm not trying to be polite. Its a tool you use to keep from going mad, but it doesn't provide correct answers, just _sometimes_ statistically correct ones. Sure fine, you admit its not a very good metric, I do disagree here.

> You are not accounting the entire philosophy of test driven development, where one specifies functionality as tests upfront, and later writes code to pass those tests.

Some of that philosophy is flawed too.

Your anecdotes are interesting, I would like to see a comparison of the difference between the programs and your synthesized versions.

But back to the philosophy bit - if you specify your functionality in your tests, you are still writing the code in test form. It doesn't matter if you translate the test back to code manually or automatically, you're still employing some meta language to code your program. Saying "if this do that" in a test and then writing code with an if/else loop is not impressive, or useful.

To me, the philosophy of the test is to verify functionality, not to completely specify it. Going backwards to code, and just making it very easy to do so, you've either written your program in test form already, or you've produced some swiss cheese code.

bingo-bongo · on April 13, 2022

> But doesn't this just mean that instead of writing your code and logic in Ruby you're writing it in RDL?

..and write a test for it..? :|

noneeeed · on April 13, 2022

Interesting. Reminds me of my time at university using something called Progol. You gave it positive and negative cases and it's algorithm would (very slowly) try and come up with Prolog code that fit it.

It was an interesting idea but in reality we spent so much time having to spoon-feed it more and more carefully constructed examples that it just seemed pointless.

Not to say that this will suffer from the same issue, but I do wonder if the benefit will really be significant enough to make it worth it.

sankha93 · on April 13, 2022

Indeed, there is rich prior work on example based program synthesis. Often these use inductive logic programming or aided by SMT solvers if these examples can be lifted to solver level values (think integers or booleans, or data types made from those).

However, not all program values can be easily lifted to solvers. So the goal with RbSyn is to allow the programmer to write the same tests they would have written anyway to check their program correctness, without any extra fiddling. As tests subsume examples, we expect more program behavior can be specified than just using simple examples.

> I do wonder if the benefit will really be significant enough to make it worth it.

I have often pondered on what makes a program synthesis tool useful. I expect writing code as art as much as it is science, and people like to express code in the way they model system behavior in their head. In that regard, I do not think program synthesis tools will enable you to automate away large parts of code, but I do think it will automate mundane parts of a program; like writing utility methods, or filling in a partial program when enough information can be gathered from surrounding context (such as the arguments you write for a substring function).

willlma · on April 14, 2022

I'd love to give this a try next time I'm working on a Ruby codebase. In the meantime, I'm curious to learn more about programming by example. I found [this Microsoft paper on the subject](https://www.microsoft.com/en-us/research/wp-content/uploads/...) and have started reading it, but I'm curious if there are any reading materials you would recommend.

Specifically, I've been working on a activity tracking app that can take plain text and create structured data from it. For example

> I played tennis with Kevin and won 6-2

would turn into

```

type: tennis

opponent: Kevin

my_score: 6

their_score: 2

outcome: win

```

As you can imagine, when it comes to natural language there are many variations of I won (outcome: win), he/she/they won (outcome: loss), I was beat, etc which means that my inputs are not nearly as structured as the examples used in the paper I'm reading. Given that I'm not trying to cover every use case in the English language, which I assume would require some form of NLP, but rather just a subset that a single user would input, do you think think it would be possible to solve this with PBE techniques?

sankha93 · on April 14, 2022

Natural language is ambiguous, so I do not think any programming by example methods will work well here. I think if you are working with a limited vocabulary in the set of sentences you want to parse, it may be fine to roll out a handmade solution. For general english, there is work in intent identification in NLP that can give you structured intents from raw english text, which can then be lifted to make programs as a secondary step.

jasfi · on April 14, 2022

You might be interested in LxAGI (https://lxagi.com) as it aims to solve this problem.

pmontra · on April 13, 2022

I'm writing a Rails API app (for a customer) that is a client to Alexa's Skill Management API and is a server for a backend app. I have tests but I'm afraid that without a 100% coverage of all corner cases I'll be disappointed by some results and it could be difficult to manually fix the code generated by the program. Am I overly pessimistic?

sankha93 · on April 14, 2022

That is a fair concern. The tool only guarantees correctness up to the level checked by the tests you provide. So if all corner cases are not covered, RbSyn will generate some program that passes the tests, but might not pass unspecified corner cases. I suspect the missing cases will be clear with a manual audit of the synthesized code, and you can add update the synthesized code manually or add requisite tests.

giraffe_lady · on April 14, 2022

> I suspect the missing cases will be clear with a manual audit of the synthesized code

We might just have really different brains but "read the code and figure out everything it doesn't do" is one of the hardest possible tasks for me?

sankha93 · on April 14, 2022

It is a hard task, for sure! But without any tests to convey your intent as a programmer, RbSyn also has no way of knowing what you intended a method's behavior to be.

The running thread here is if you would have written a method and some tests to check that you indeed wrote the method correctly, RbSyn will automate the task of writing the method for you. In other words, it is just like test driven development, where the writing the code part is automated.

By manual audit, I meant you can verify if you need more tests to capture your intended behavior or just take a shorter route and update the synthesized code directly without any new tests for the newly added behavior. One could argue the latter is bad programming practice.