A couple of years ago I inherited an old, badly written web app doing up to 800 queries per page. It was slow and hammering the server.

Being read-heavy, the app was an ideal candidate for output caching. But there were no hooks in the admin code for adding cache invalidation, and I wasn't going to crawl through tens of thousands of badly written lines adding invalidation calls by hand, because I would have missed some.

So I hooked into the database layer, parsed the SQL queries and extracted the table names. Each read-heavy page on the frontend was tagged in the cache with the names of the tables it read data from. In the backend, I'd collect the table names from all the write queries and then clear every cache entry tagged with any of those table names.
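To make the idea concrete, here is a rough Python sketch of it. This is not the original code: the names (`extract_tables`, `TaggedCache`, `QueryHook`) are made up for illustration, the cache is just an in-memory dict, and the regex is a deliberately naive stand-in for real SQL parsing that will trip over subqueries, comma joins and string literals.

```python
import re

# Naive: grab the identifier that follows FROM, JOIN, INTO or UPDATE.
# Assumption for this sketch only; a real SQL parser would be more robust.
TABLE_RE = re.compile(r"\b(?:FROM|JOIN|INTO|UPDATE)\s+`?(\w+)`?", re.IGNORECASE)
WRITE_RE = re.compile(r"^\s*(?:INSERT|UPDATE|DELETE|REPLACE)\b", re.IGNORECASE)


def extract_tables(sql):
    """Return the set of (lower-cased) table names mentioned in a query."""
    return {name.lower() for name in TABLE_RE.findall(sql)}


def is_write(sql):
    """True if the statement starts with a data-modifying keyword."""
    return WRITE_RE.match(sql) is not None


class TaggedCache:
    """Hypothetical in-memory output cache keyed by page, tagged by table."""

    def __init__(self):
        self._pages = {}  # cache key -> rendered output
        self._tags = {}   # cache key -> set of table names the page read

    def get(self, key):
        return self._pages.get(key)

    def set(self, key, output, tables):
        self._pages[key] = output
        self._tags[key] = set(tables)

    def invalidate(self, tables):
        """Drop every entry tagged with at least one of the given tables."""
        tables = set(tables)
        stale = [key for key, tags in self._tags.items() if tags & tables]
        for key in stale:
            del self._pages[key]
            del self._tags[key]
        return stale


class QueryHook:
    """Stand-in for the database-layer hook: sees every query that runs."""

    def __init__(self, cache):
        self.cache = cache
        self.tables_read = set()  # tables touched while rendering a page

    def on_query(self, sql):
        tables = extract_tables(sql)
        if is_write(sql):
            # Backend write: clear everything tagged with these tables.
            self.cache.invalidate(tables)
        else:
            # Frontend read: remember the tables to tag the page with.
            self.tables_read |= tables
```

On the frontend you would render the page with the hook active and then call something like `cache.set(url, html, hook.tables_read)`. In the backend, every `INSERT`, `UPDATE` or `DELETE` passing through `on_query` clears the matching pages, with no changes to the admin code itself.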

Because it worked at the table level rather than the row level, it cleared more cached data than was strictly needed, but it was simple and effective.

It worked really well but never went into production; in the end we forced a rewrite of the app. One day I'd like to revisit the idea.