We used a combination of Kafka + HBase + Phoenix (http://phoenix.apache.org/) for a similar purpose. It takes some effort to set up the initial HBase cluster, but once you've done it manually once and automated it with Ansible/systemd, it's pretty robust in operation.
All our development was around a query engine using plain JDBC/SQL to talk to HBase via Phoenix. Scaling is as simple as adding a node to the cluster.
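For a rough idea of what "plain JDBC/SQL via Phoenix" looks like, here's a minimal sketch in Java. The ZooKeeper quorum in the connection URL, and the table and column names, are all made up for illustration; the query itself is ordinary SQL that Phoenix compiles down to HBase scans:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class PhoenixQueryExample {
        public static void main(String[] args) throws Exception {
            // Phoenix thick-driver URL format: jdbc:phoenix:<zookeeper quorum>
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk1,zk2,zk3")) {
                // Hypothetical table: one week of one series, bounded by timestamp
                String sql = "SELECT metric_name, ts, value FROM METRICS "
                           + "WHERE metric_name = ? AND ts BETWEEN ? AND ?";
                try (PreparedStatement ps = conn.prepareStatement(sql)) {
                    ps.setString(1, "cpu.load");
                    ps.setTimestamp(2, java.sql.Timestamp.valueOf("2015-01-01 00:00:00"));
                    ps.setTimestamp(3, java.sql.Timestamp.valueOf("2015-01-08 00:00:00"));
                    try (ResultSet rs = ps.executeQuery()) {
                        while (rs.next()) {
                            System.out.printf("%s %s %f%n",
                                rs.getString(1), rs.getTimestamp(2), rs.getDouble(3));
                        }
                    }
                }
            }
        }
    }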
That's interesting. What are query times like? Say, for a single series, querying a week of data at a five-minute interval: how many seconds would it take?
We didn't rely on it, or on the ordering of messages received from Kafka. Timestamps and transaction IDs were generated by the client app / Kafka publisher and were part of the message put into the Kafka topic. When we consumed a message with one of the parallel Kafka consumers and saved a row in an HBase table, that original timestamp + transaction ID became part of the rowkey string, the other parts being attributes we wanted to index (secondary indices are supported in HBase/Phoenix, but we didn't use them much; the composite rowkey is effectively the index). Then when querying, HBase works as a parallel scanning machine and can do a time-range scan + filtering + aggregation very fast. There's a sketch of the idea below.
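To make the rowkey scheme concrete, here's a rough sketch of how it might be expressed in Phoenix DDL plus the per-message upsert a consumer would run. All names (EVENTS, metric_name, txn_id, etc.) are invented; the point is that the composite PRIMARY KEY becomes the HBase rowkey, so a leading attribute plus the client-generated timestamp gives cheap range scans, and ordering of Kafka delivery doesn't matter:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.Statement;

    public class PhoenixIngestExample {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk1,zk2,zk3")) {
                // Composite PRIMARY KEY = HBase rowkey: indexed attribute first,
                // then producer-generated timestamp and transaction ID for uniqueness.
                try (Statement st = conn.createStatement()) {
                    st.execute(
                        "CREATE TABLE IF NOT EXISTS EVENTS ("
                      + "  metric_name VARCHAR NOT NULL,"
                      + "  ts TIMESTAMP NOT NULL,"
                      + "  txn_id VARCHAR NOT NULL,"
                      + "  value DOUBLE,"
                      + "  CONSTRAINT pk PRIMARY KEY (metric_name, ts, txn_id))");
                }
                // Each parallel Kafka consumer would run an UPSERT like this per
                // message; ts/txn_id come from the producer, not from arrival order.
                try (PreparedStatement ps = conn.prepareStatement(
                        "UPSERT INTO EVENTS (metric_name, ts, txn_id, value) VALUES (?, ?, ?, ?)")) {
                    ps.setString(1, "cpu.load");
                    ps.setTimestamp(2, new java.sql.Timestamp(System.currentTimeMillis()));
                    ps.setString(3, "txn-0001");
                    ps.setDouble(4, 0.42);
                    ps.executeUpdate();
                }
                conn.commit(); // Phoenix batches mutations until commit
            }
        }
    }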
On a separate note, we didn't use joins even though they're supported in Phoenix; the data was completely denormalized into one big table.