
We used a combination of Kafka + HBase + Phoenix (http://phoenix.apache.org/) for a similar purpose. It takes some effort to set up the initial HBase cluster, but once you do it manually once and automate it with Ansible/systemd, it's pretty robust in operation.

All our development was around a query engine using plain JDBC/SQL to talk to HBase via Phoenix. Scaling is as simple as adding a node to the cluster.
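For context, the "plain SQL" here is ordinary-looking queries that Phoenix compiles into HBase scans. A hedged sketch of what such a query might look like, assuming a hypothetical METRICS table whose columns (series, ts, val) are invented for illustration:

```sql
-- Hypothetical Phoenix query: the series name and time range map onto
-- the table's composite primary key, so this compiles to a range scan.
SELECT ts, val
FROM METRICS
WHERE series = 'cpu.load'
  AND ts >= TO_TIMESTAMP('2020-01-01 00:00:00')
  AND ts <  TO_TIMESTAMP('2020-01-08 00:00:00')
ORDER BY ts;
```

Because the leading key columns appear in the WHERE clause, Phoenix can bound the scan instead of reading the whole table.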



That's interesting. What are query times like? Say, for a single series, querying a week of data at a five-minute interval, how many seconds would it take?


Does Kafka have timestamps? I didn't see any when I looked, but I was working with an older client version & didn't get far into it.


We didn't rely on it, or on the ordering of messages received from Kafka. Timestamps and transaction IDs were generated by the client app/Kafka publisher and were part of the message put into the Kafka topic. When we consume that message with one of the parallel Kafka consumers and save a row in an HBase table, the original timestamp + transaction ID becomes part of the rowkey string, the other parts being attributes that we wanted to index (secondary indices are supported in HBase/Phoenix, but we didn't use them much; basically, the composite rowkey is the index). Then, when querying, HBase works as a parallel scanning machine and can do a time-range scan + filtering + aggregation very fast.
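The rowkey scheme described above can be sketched as follows. This is a hedged illustration, not the poster's actual code: the field order, separator, and padding widths are assumptions, and the sorted list stands in for HBase's key-ordered storage.

```python
from bisect import bisect_left, bisect_right

def make_rowkey(attribute: str, ts_millis: int, txn_id: str) -> str:
    # Composite rowkey: indexed attribute first, then a zero-padded
    # epoch-millis timestamp, then the transaction ID. Zero-padding
    # keeps lexicographic order equal to numeric time order, which is
    # what makes time-range scans cheap.
    return f"{attribute}|{ts_millis:013d}|{txn_id}"

# HBase stores rows sorted by rowkey; model that with a sorted list.
rows = sorted(
    make_rowkey(attr, ts, txn)
    for attr, ts, txn in [
        ("sensor-a", 1_600_000_000_000, "tx1"),
        ("sensor-a", 1_600_000_060_000, "tx2"),
        ("sensor-a", 1_600_000_120_000, "tx3"),
        ("sensor-b", 1_600_000_060_000, "tx4"),
    ]
)

def time_range_scan(attribute: str, start_ms: int, end_ms: int) -> list[str]:
    # A range scan is just two binary searches over the sorted keys,
    # analogous to an HBase scan with start/stop rowkeys.
    lo = bisect_left(rows, f"{attribute}|{start_ms:013d}|")
    hi = bisect_right(rows, f"{attribute}|{end_ms:013d}|\xff")
    return rows[lo:hi]
```

Calling `time_range_scan("sensor-a", 1_600_000_000_000, 1_600_000_060_000)` touches only the keys inside the requested window, never the whole key space.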

On a separate note, we didn't use joins even though they're supported in Phoenix; the data was completely denormalized into one big table.
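A minimal sketch of that denormalization idea, with invented table and field names: instead of joining event rows against a reference table at query time, the publisher embeds the reference attributes into every row it writes.

```python
# Hypothetical reference data that would otherwise live in a separate
# table and require a join at query time.
devices = {"dev-1": {"site": "fra", "model": "m3"}}

def denormalize(event: dict) -> dict:
    # Copy the reference attributes into the event itself, producing
    # one wide row that can be queried without any join.
    wide = dict(event)
    wide.update(devices[event["device_id"]])
    return wide

row = denormalize({"device_id": "dev-1", "ts": 1_600_000_000_000, "value": 42})
```

The trade-off is the usual one: writes duplicate the reference attributes into every row, and reads in return never pay for a join.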



