I find it useful to consciously separate input data structures from intermediate data structures (the accidental complexity). I try to structure my code so it doesn't rely on intermediate data structures; instead, it knows how to recompute them. When this works it can be very pleasing: just input data structures, with caching all over the place.
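Here's a minimal sketch of what I mean, in Python for illustration (the data and names are all made up): the input is the only structure the code stores, and every derived structure is just a memoized function over it.

    import functools

    # Input data structure: the only ground truth the code stores.
    POSTS = [
        {"author": "alice", "words": 120},
        {"author": "bob", "words": 80},
        {"author": "alice", "words": 40},
    ]

    # An "intermediate data structure" that's never stored explicitly:
    # word counts per author. The code knows how to recompute it, and
    # the cache makes recomputation cheap.
    @functools.lru_cache(maxsize=None)
    def words_by_author(author):
        return sum(p["words"] for p in POSTS if p["author"] == author)

    print(words_by_author("alice"))  # 160; computed once, served from cache after

The derived data is always safe to throw away (words_by_author.cache_clear()), because the code can rebuild it from the input.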
Sometimes I think I'm chasing pure functional programming, but in a more pragmatic form and separated from type-checking.
That's very interesting. We're doing something similar. I'm curious as to how you reconcile your "intermediate data structures" with one of the principles in the OP, that of minimizing the transformations you have to do on your data in the first place. The latter is a profound insight that I am slowly digesting. One thing it throws out the door, for example, is layered architectures. Not a small deal! Yet it makes sense to me, because my experience with layered architectures has been that the more nicely modular and well-defined you make each layer, the more bloated and nasty the mappings between layers become.
"Sometimes I think I'm chasing pure functional programming"
No question this is more suited to FP than OO.
Edit: this really is a rich subject. It's interesting that a lot of this discourse is coming out of the game dev world, because that's a section of the software universe which is relatively free of pseudo-technical bullshit (probably because it's so ruthlessly competitive and the demands on the apps are so high).
I usually don't think about performance. nostrademons sounds right: "minimize transformations when you productionize." I don't have experience with productionizing.
So far when I've found myself scaling one piece of my pipeline up, I do one of two things: 1) I switch key pieces from arc to scheme or C. 2) I add a periodic precompute stage to reduce cache misses.
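A hypothetical sketch of option 2, again in Python (the names and the period are invented); the point is just that an expensive result gets rebuilt on a timer, so readers always hit a warm, fully built snapshot:

    import threading
    import time

    _summary = None  # precomputed result; readers only ever see a finished snapshot

    def expensive_summary():
        time.sleep(1)  # stand-in for a slow recomputation over the input data
        return {"built_at": time.time()}

    def precompute_loop(period_secs=60):
        global _summary
        while True:
            _summary = expensive_summary()  # atomic rebind: no partial state is visible
            time.sleep(period_secs)

    # Run the precompute stage in the background for the life of the process.
    threading.Thread(target=precompute_loop, daemon=True).start()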
Like pg said somewhere (update: http://ycombinator.com/newsnews.html, /15 Jan), the goal isn't optimizing but keeping performance mediocre as you scale up.
Update: After reading http://news.ycombinator.com/item?id=1005145 I see 'productionize' isn't a big-bang, stop-the-world step. By the new definition I find I do rewrite things for performance fairly often.
The image in my head now: a ball of mud (http://www.laputan.org/mud) with rewrites as layers. Older layers that have proven themselves harden and fossilize as subsequent rewrites focus more on their performance without changing semantics. But even they aren't immune to the occasional tectonic upheaval.
I was wrong then. We're not doing something similar :)
"Like pg said somewhere (update: http://ycombinator.com/newsnews.html, /15 Jan), the goal isn't optimizing but keeping performance mediocre as you scale up."
Where did he use the word, or the concept, "mediocre"?
I found that minimizing transformations on your data is a principle you apply when you productionize. For most of the development cycle, you want to keep things as debuggable as possible (at the possible expense of performance), and intermediate data products + debugging hooks are a good way to do this.
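To make that concrete, here's one way to read "intermediate data products + debugging hooks", sketched in Python (the stages and file names are invented): each stage of the pipeline can dump its output somewhere inspectable, which costs performance but lets you debug any step in isolation.

    import json

    def run_pipeline(records, debug=False):
        # Stage 1: filter out bad records.
        cleaned = [r for r in records if r.get("ok")]
        if debug:  # debugging hook: persist the intermediate data product
            with open("stage1_cleaned.json", "w") as f:
                json.dump(cleaned, f, indent=2)

        # Stage 2: derive a score from each cleaned record.
        scored = [dict(r, score=len(r)) for r in cleaned]
        if debug:
            with open("stage2_scored.json", "w") as f:
                json.dump(scored, f, indent=2)

        return scored

Minimizing transformations later means collapsing those stages; while you're still developing, the intermediate files are what tell you which stage broke.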
This brings up a much bigger question of when to productionize, though. Most programs are never actually "done", but at some point you have to release to the public and hopefully get millions of users. You need to make the performance/maintainability tradeoff sometime. The later you push it off, the more productive you can be in the critical early stages, and the better a product you can bring to market. But if you push it off too long, you miss the market window entirely and don't get the benefit of user feedback.
But these are fundamental design issues. You can't change fundamental design when you "productionize"; coming up with that design and implementing it is the development cycle.
Productionize usually means "rewrite". I think that software engineers in general have become too averse to rewriting code; as long as you do it with the same team that wrote the prototype, it's often a good idea to throw away everything and start from scratch.
The development cycle for me is much more about collecting requirements than coming up with a design that satisfies those requirements. That's what iterative design is about - you try something out, see if it works for the user, see what other features are really necessary for it to work for the user, and then adjust as necessary. Once you know exactly what the software should do, coming up with a design that does it is fairly easy.
My current project is nearing its 3rd complete rewrite since September, plus nearly daily changes that rip out large bits of functionality and re-do them some other way.
I try not to have a 'fundamental design'. If you rely wholly on caching, you have no intermediate data structures, and code becomes easier to change in dramatic ways. This is the ideal I've been striving for.
Check out arc's defmemo macro. Given the ability to memoize (or cache) function invocations, changing your data structures can become simply a matter of refactoring your function boundaries and deciding which ones perform caching.
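For a rough Python analogue (I'm paraphrasing Arc rather than quoting it; the names below are made up):

    import functools

    def defmemo(f):
        # Rough equivalent of Arc's defmemo: cache the function's result
        # per argument list (arguments must be hashable), so callers can't
        # tell a recomputed value from a stored one.
        return functools.lru_cache(maxsize=None)(f)

    @defmemo
    def canonical_name(raw):
        return raw.strip().lower()  # stand-in for an expensive derivation

Moving the decorator from one function to another changes which intermediate results get materialized without changing what the code computes - that's what makes refactoring those boundaries cheap.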