"Thrudb is a set of simple services built on top of the Apache Thrift framework that provides indexing and document storage services for building and scaling websites. Its purpose is to offer web developers flexible, fast and easy-to-use services that can enhance or replace traditional data storage and access layers."
I actually stumbled onto Thrudb a few months ago, after we had already written v1. We also talked to some people at Disrupt who are doing similar things with search and Thrift.
It seems like a strange thing to provide as SaaS. Unless you're hosted in the same datacenter, the network latency would surely make its speed irrelevant. They mention they're "working on a way for developers to run ThriftDB locally", which might make it worth looking into. I could certainly see it being useful for some things, but as SaaS it wouldn't provide enough benefit to justify making calls over the web.
Good point. We decided to make it available as a (free) cloud service just so we could get hacker feedback as quickly as possible. If we had waited until we had a version that was easy to install, it would have taken us a lot more time. If people like it, the plan is to open source it.
I agree that getting it ready for "easy" install takes a lot more work than completing the coding. I wonder what feature, or combination of features, is the differentiator here? Maybe it's the speed of search, the loose document schemas combined with free-text search, or the REST API? Some more examples and benchmarks would be interesting, when you find the time :)
The thing that excites us the most is the flexible schema because that can cut down on development time dramatically. The REST API is also optimized for developer happiness.
We're working on some examples and should have them ready soon.
Benchmarking is a good idea but tricky because speed depends on the complexity of the data and the query itself. If you have any recommendations for benchmarks please let me know.
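One way to deal with that dependence is to time each class of query separately and report percentiles rather than a single number. Here is a minimal sketch of that idea in Python. The `run_query` callable and the query labels are placeholders I made up for illustration, not part of any ThriftDB API; you would plug in whatever client call you actually use.

```python
import math
import statistics
import time
from typing import Callable, Dict, List


def benchmark(run_query: Callable[[str], object],
              queries: Dict[str, List[str]],
              repeats: int = 5) -> Dict[str, Dict[str, float]]:
    """Time each query class separately; report latency stats in milliseconds.

    run_query is a hypothetical callable that executes one query string.
    queries maps a label (e.g. "simple", "complex") to a list of query strings.
    """
    results: Dict[str, Dict[str, float]] = {}
    for label, query_list in queries.items():
        samples: List[float] = []
        for query in query_list:
            for _ in range(repeats):
                start = time.perf_counter()
                run_query(query)
                samples.append((time.perf_counter() - start) * 1000.0)
        if not samples:
            continue  # nothing to summarize for an empty query class
        samples.sort()
        # Nearest-rank 95th percentile over the sorted samples.
        p95_index = max(0, math.ceil(0.95 * len(samples)) - 1)
        results[label] = {
            "n": len(samples),
            "median_ms": statistics.median(samples),
            "p95_ms": samples[p95_index],
            "max_ms": samples[-1],
        }
    return results
```

Calling `benchmark(my_client_call, {"simple": [...], "complex": [...]})` returns one summary per label, so simple and complex queries are never averaged together.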
In the future we have plans to add machine learning features to optimize relevancy algorithms automatically but that's still a ways off.
I did almost exactly this last fall, except I used JSON and JSON Schema instead of Thrift. I called it hummingbird db. I submitted it to YC, but all I got was an email saying it wasn't that interesting.
Would love to hear more about hummingbird db. Email me at andres@octopart.com. We chose Thrift because the schema was flexible and that was the biggest problem for us at Octopart.
It sounds like heaps of people are trying to do very similar things. Just this week I've been hacking on a Rails engine/plugin that lets you define your models with mongo_mapper and then request them as JSON Schema. It's not quite complete, but JSON Schema gives you interfaces for your RESTful APIs.
If your solution is really so fast, then you must be running benchmarks continuously. How else would you know if you are improving, and whether you are actually fast or just faster than [a tree | clouds | a ricecorn]?
So either you lie about your performance or you choose to purposefully hide your incredibly well performing benchmarks.
That is a great question - we actually didn't consider using Elasticsearch. We went with Solr because we use it for Octopart and are experienced with it, which made developing ThriftDB easier. We're evaluating other options now and will have a look.
Oh cool, so now I can host my app in one data center and have it make DB calls across the open internet to another DB server! But wait, there's more! It's over a stateless protocol: HTTP, with really poor multiplexing/pipelining support.
Latency is a feature, right? Like "slow your roll, cowboy, let's not have a heart attack here".
"Thrudb is a set of simple services built on top of the Apache Thrift framework that provides indexing and document storage services for building and scaling websites. Its purpose is to offer web developers flexible, fast and easy-to-use services that can enhance or replace traditional data storage and access layers."
No longer under development though.