No, you cannot say that every project is impossible to provide clear specificati...

kragen · on Jan 28, 2012

If you write your very clear specification down in a formal language, you can simply execute it:

    print(($ARGV[0] + $ARGV[1] + $ARGV[2]) . "asdf1234");

The reason you can provide an accurate estimate for how much longer it will take to develop is that, by providing that clear specification, you've already done the development. All that remains is the trivial step of writing it down and invoking a compiler or interpreter.

> A more complicated, but "useful" example, would be to implement a logging server according to a REST API.

If the REST API in question is well-defined, it's because there's already an implementation of it. Just install that implementation and use it, and you'll be done in half an hour. Are you not sure if you can use the existing implementation? Well, then your project might take 45 minutes, or it might take two weeks. Suddenly you have two orders of magnitude of uncertainty in your estimate. Maybe you know you can't just use the existing implementation; how much of it can you reuse? Is there another piece of software out there that implements the same API? You may be able to get it to work for you in an hour and a half. Or you may be better off writing stuff from scratch.

Then, either with the off-the-shelf software or the software you wrote from scratch, you may encounter a killer performance bug — which could take you two weeks to resolve. Or you may not.

Maybe you think only the not-very-experienced software developer would consider using off-the-shelf software, or encounter a critical performance bug that could take weeks of work to resolve. If that's what you think, I suspect you're the one who's not very experienced!

andreyvit · on Jan 29, 2012

> Are you not sure if you can use the existing implementation? Well, then your project might take 45 minutes, or it might take two weeks. Suddenly you have two orders of magnitude of uncertainty in your estimate. Maybe you know you can't just use the existing implementation; how much of it can you reuse?

This. One of the biggest sources of uncertainty is not knowing how much of the existing code will work well enough for you.

yummyfajitas · on Jan 29, 2012

> If you write your very clear specification down in a formal language, you can simply execute it...

This isn't necessarily true. Two examples:

    for x, y in sorted(lst), assert(x < y => index(x, sorted(lst)) < index(y, sorted(lst)) )

    assert( sqrt(x) * sqrt(x) == x )

In more practical terms, at my company we've hired someone to build scrapers for us. They have a fixed interface - scraper.getBrand, scraper.getPrice, etc. The clear specification is that on 100 examples, the scraper needs to agree with humans viewing the webpage (i.e., scraper.getPrice(page) == whatever a human viewing the page says it is).
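A minimal sketch of how that acceptance criterion could be checked mechanically. Only `getBrand` and `getPrice` come from the interface described above; `LabeledPage`, `ToyScraper`, `acceptance_failures`, and the page markup are hypothetical names made up for illustration, not the actual system.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class LabeledPage:
    """One acceptance example: raw page text plus what a human read off it."""
    html: str
    human_brand: str
    human_price: str


class ToyScraper:
    """Hypothetical scraper for a made-up page layout (assumption)."""

    def getBrand(self, page: str) -> Optional[str]:
        m = re.search(r'<span class="brand">(.*?)</span>', page)
        return m.group(1) if m else None

    def getPrice(self, page: str) -> Optional[str]:
        m = re.search(r"\$(\d+\.\d{2})", page)
        return m.group(1) if m else None


def acceptance_failures(scraper, examples: List[LabeledPage]) -> List[Tuple[int, str]]:
    """Return (example index, method name) for every disagreement with the human label."""
    failures = []
    for i, ex in enumerate(examples):
        if scraper.getBrand(ex.html) != ex.human_brand:
            failures.append((i, "getBrand"))
        if scraper.getPrice(ex.html) != ex.human_price:
            failures.append((i, "getPrice"))
    return failures


examples = [
    LabeledPage('<span class="brand">Acme</span> now $19.99', "Acme", "19.99"),
    LabeledPage('<span class="brand">Zed</span> price: 5 dollars', "Zed", "5.00"),
]
print(acceptance_failures(ToyScraper(), examples))
```

The check itself is a few lines and easy to estimate. The scraper that has to pass it on 100 real pages is where the uncertainty lives.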

kragen · on Jan 29, 2012

Insightful. A few comments:

1. It's true that those specifications are clear (although you might also want to specify that multiset(sorted(lst)) == multiset(lst), and sqrt should be allowed an error of half an ulp and perhaps required to be nonnegative and restricted in its domain to nonnegative numbers). But they are not necessarily specifications that are easy to estimate, either. (I should have said "your very clear, easy-to-estimate specification", since writing specifications like the ones you have above is clearly not doing the development work.)

2. It is at least theoretically possible to automate the finding of programs that satisfy specifications like your two examples above. Given the knowledge that λx.x*x is strictly monotonic on nonnegative numbers, for example, you can apply a general binary chop routine to compute sqrt(x) in O(numberofbits) iterations.

3. For your realistic example, it seems very likely that you could write software to build the scrapers rather than hiring a person. http://lists.canonical.org/pipermail/kragen-hacks/2004-Janua... represents a prototype solution to this problem from 2004, but without the knowledge of AI algorithms I have now, and of course with much less computational power than is available today.
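Points 1 and 2 can be sketched in a few lines of Python. This is an illustration under assumptions, not a production routine: `meets_sort_spec` and `bisect_sqrt` are hypothetical names, the bisection works on ordinary floats, and it does not claim the half-ulp accuracy mentioned in point 1.

```python
import math
from collections import Counter


def meets_sort_spec(lst, out):
    """Check the sort spec plus the multiset condition from point 1."""
    ordered = all(a <= b for a, b in zip(out, out[1:]))
    return ordered and Counter(out) == Counter(lst)


def bisect_sqrt(x: float) -> float:
    """Binary chop for sqrt(x), using only that m*m is monotonic for m >= 0."""
    if x < 0:
        raise ValueError("sqrt is specified only for nonnegative numbers")
    if x == 0.0:
        return 0.0
    lo, hi = 0.0, max(1.0, x)  # sqrt(x) lies somewhere in [0, max(1, x)]
    while True:
        mid = (lo + hi) / 2
        if mid == lo or mid == hi:  # no float left strictly between lo and hi
            break
        if mid * mid <= x:
            lo = mid  # mid is not too large, so move the lower bound up
        else:
            hi = mid  # mid overshoots, so move the upper bound down
    return lo


# Compare against the library routine on a few inputs.
for x in (0.25, 2.0, 9.0, 1e10):
    r = bisect_sqrt(x)
    print(x, r, math.isclose(r, math.sqrt(x), rel_tol=1e-12))

print(meets_sort_spec([3, 1, 2], sorted([3, 1, 2])))
print(meets_sort_spec([3, 1, 2], [1, 2]))  # ordered, but an element is missing
```

The second `meets_sort_spec` call shows why the multiset condition matters: an output that silently drops elements still passes the ordering check on its own.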

extension · on Jan 29, 2012

True, but if your spec only contains the verification, then it's not helping you estimate the time to a solution.

nradov · on Jan 29, 2012

Your "clear" specification is actually quite vague. Can the numbers be integers, floating point, imaginary, or irrational? Are the numbers limited in range or precision? Are there particular requirements for performance, scalability, and hardware resource utilization?

You may say I'm being pedantic but these are exactly the type of issues which cause inaccurate estimates.

arijo · on Jan 28, 2012

What does "implement a logging server according to a REST API" mean? Have you ever coded in your life?

jerf · on Jan 28, 2012

It means that three weeks into your project you realize you fucked up when you required a full HTTP transaction for every log entry, however beautiful the API may have looked during the specification phase; that not even the really fast transactions-per-second web servers are fast enough; and that, already one week over your carefully crafted perfect time estimate, you have to throw everything out and start over again.

Yeah, I'm not a big believer in accurate time estimates either. Routinely accurate time estimates imply that you're doing something wrong. If your work is routine enough that you can routinely accurately estimate it, you're missing an automation opportunity somewhere. Probably a big one.