
Rant:

I'm afraid I have to side with the author against the "relational set theory purists." I actually don't care much about styles of ORMs, and I don't even care that much about having to write SQL, either. I can do, and have done, both. For me, both work equally well, or rather equally poorly.

I loathe relational databases because every time I've used one, it has eaten a vast amount of time just making the schema and queries fast enough. Yes, I know what third normal form is, and I start my database designs with it. I know how and when to denormalize to improve join performance. I know how to read a query plan and create the necessary indices. I've even dug through low-level IO statistics to figure out where a slow query wastes time on unnecessary IO. Guess what: that work sucks. I've used Sybase, considered a "real" RDBMS, and I've used MySQL, considered a "crappy" RDBMS. Both have sucked about equally, just in slightly different ways.

I am an application and system-level programmer. I want to be able to tell my app: "store this data for future retrieval." I want the store to be instantaneous. I want the future retrieval to be instantaneous. Period. As a major bonus, I want to be able to distribute the store across multiple machines, as transparently to my application as possible. I also don't want to risk losing my data when one server loses its disks, so I want at least semi-transparent replication. ACID semantics are, obviously, a big plus. No RDBMS gives me this flexibility in designing an app. Oracle RAC, supposedly the cat's meow of scalable RDBMSes, does read/write load balancing across nodes, but it still relies on a single centralized SAN store on the back end.

To add insult to injury, the promise of arbitrary queries against the data just doesn't pan out. Yes, I can run an arbitrary query. Overnight. Adding that query to an application with a non-trivial schema and a large dataset usually requires so much indexing and query-optimization work that I might as well write the equivalent retrieval code against a Berkeley DB store by hand. I hate going through my queries and using trial and error to figure out where a subselect will outperform a join, and I can't afford to hire a bunch of DBAs, who also happen to know set and relational theory, to do this work for me.
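For concreteness, the hand-written Berkeley DB retrieval I mean is little more than a key lookup. A minimal sketch against Berkeley DB Java Edition's com.sleepycat.je API, with the environment path, database name, and key invented for illustration:

    import com.sleepycat.je.*;

    import java.io.File;
    import java.nio.charset.StandardCharsets;

    public class PlainStore {
        public static void main(String[] args) throws Exception {
            // Open (or create) the environment and database; the "schema" is just bytes.
            EnvironmentConfig envConfig = new EnvironmentConfig();
            envConfig.setAllowCreate(true);
            Environment env = new Environment(new File("/var/data/appstore"), envConfig);

            DatabaseConfig dbConfig = new DatabaseConfig();
            dbConfig.setAllowCreate(true);
            Database db = env.openDatabase(null, "users", dbConfig);

            // The whole "query": one key lookup. No planner, no EXPLAIN, no tuning.
            DatabaseEntry key = new DatabaseEntry("user:42".getBytes(StandardCharsets.UTF_8));
            DatabaseEntry value = new DatabaseEntry();
            if (db.get(null, key, value, LockMode.DEFAULT) == OperationStatus.SUCCESS) {
                byte[] payload = value.getData(); // deserialize however the app likes
            }

            db.close();
            env.close();
        }
    }

The cost of that lookup is a couple of B-tree page reads, every time, regardless of how the rest of the schema grows.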



It seems the Prevayler or HAppS style of data storage and retrieval is what you and I want (and perhaps, in their hearts, every other non-DB-specialist as well).

Basically, you just define data structures in your language like you would any other piece of data, and they are silently serialized in the background into an efficient (I would hope) ACID datastore; see the sketch after the links below.

http://www.prevayler.org

http://happs.org
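For a flavor of it, here's a minimal sketch along the lines of Prevayler's documented API (the GuestBook system and AddEntry transaction are invented for illustration, and exact signatures differ a bit between Prevayler versions):

    import org.prevayler.Prevayler;
    import org.prevayler.PrevaylerFactory;
    import org.prevayler.Transaction;

    import java.io.Serializable;
    import java.util.ArrayList;
    import java.util.Date;
    import java.util.List;

    // The "database" is just an ordinary object graph.
    class GuestBook implements Serializable {
        final List<String> entries = new ArrayList<String>();
    }

    // Every write is a serializable command; Prevayler journals it to disk
    // before applying it, which is where the durability comes from.
    class AddEntry implements Transaction {
        private final String entry;
        AddEntry(String entry) { this.entry = entry; }
        public void executeOn(Object system, Date when) {
            ((GuestBook) system).entries.add(entry);
        }
    }

    public class PrevalenceDemo {
        public static void main(String[] args) throws Exception {
            Prevayler prevayler = PrevaylerFactory.createPrevayler(new GuestBook(), "journal-dir");
            prevayler.execute(new AddEntry("hello"));                 // journaled, then applied
            GuestBook book = (GuestBook) prevayler.prevalentSystem(); // reads are plain method calls
            System.out.println(book.entries);
        }
    }

Reads never touch the disk at all; on restart, Prevayler replays the journal (plus the last snapshot) to rebuild the object graph.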


You sound like a guy who knows his databases, and I freely admit I don't, but here's what I use to keep my life simple:

I work in Java, and so I have the benefit of a stack trace wherever I want one. So in the class that deals with SQL queries I keep a HashMap with stack traces as keys and total execution times as values, roughly like the sketch below. Whenever an application feels sluggish, I just optimize the top 3-5 queries, which usually means nothing more than "explain" and "alter table add key", plus a cache when it's serious. This frees me from any other thought about optimisation. I don't even add indexes by default, on the grounds that they always slow down inserts but don't always speed things up.
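A minimal sketch of that bookkeeping, assuming the queries all go through one wrapper class (the names here are invented for illustration):

    import java.util.HashMap;
    import java.util.Map;

    // Accumulates total query time per call site, keyed by the stack trace
    // that issued the query.
    public class QueryProfiler {
        private final Map<String, Long> totalNanosByTrace = new HashMap<>();

        public synchronized void record(long elapsedNanos) {
            // Flatten the current stack trace into a string key for this call site.
            StringBuilder sb = new StringBuilder();
            for (StackTraceElement frame : Thread.currentThread().getStackTrace()) {
                sb.append(frame).append('\n');
            }
            totalNanosByTrace.merge(sb.toString(), elapsedNanos, Long::sum);
        }

        // Print call sites worst-first; the top 3-5 are the ones to EXPLAIN
        // and, if the plan shows a full scan, ALTER TABLE ... ADD KEY.
        public synchronized void report() {
            totalNanosByTrace.entrySet().stream()
                    .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
                    .forEach(e -> System.out.printf("%d ms total at:%n%s%n",
                            e.getValue() / 1_000_000, e.getKey()));
        }
    }

Each query site brackets its JDBC call with System.nanoTime() and passes the difference to record(); report() then tells you which call sites to look at first.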



