drill-dev mailing list archives

From Brian O'Neill <b...@alumni.brown.edu>
Subject Getting plugged in... (Cassandra and Drill?)
Date Mon, 21 Jan 2013 04:40:40 GMT
(Sorry for the cross-post; I didn't know which list was appropriate for this question.)

Last week, Brad Anderson came up and presented at the PhillyDB meetup.
http://www.slideshare.net/boorad/phillydb-talk-beyond-batch

He gave us an overview of Drill, and I'm curious...

Presently, we heavily use Storm + Cassandra.
http://brianoneill.blogspot.com/2012/08/a-big-data-trifecta-storm-kafka-and.html

We treat CRUD operations as events. Within Storm, we then calculate aggregate counts of the
entities flowing through the system along various dimensions. That works well, but we still
need an ad hoc reporting capability and a way to report on data in the system that is no
longer active (historical).
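
To make the counting concrete, here is a minimal sketch in plain Python of the kind of
per-dimension aggregation described above. It is not our Storm topology; the event fields
(op, entity_type, state) and the function name are assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical CRUD events; the field names are assumptions for illustration only.
events = [
    {"op": "create", "entity_type": "provider", "state": "PA"},
    {"op": "update", "entity_type": "provider", "state": "PA"},
    {"op": "create", "entity_type": "facility", "state": "NJ"},
]


def count_by_dimensions(events, dimensions):
    """Count events per (dimension, value) pair, e.g. counts[("state", "PA")]."""
    counts = Counter()
    for event in events:
        for dim in dimensions:
            # Skip dimensions that a given event does not carry.
            if dim in event:
                counts[(dim, event[dim])] += 1
    return counts


counts = count_by_dimensions(events, ["op", "entity_type", "state"])
```

Counts like these are fixed in advance by the choice of dimensions, which is why they
do not cover ad hoc questions over historical data.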

Would it be possible to use the Drill engine against a Cassandra backend?
If so, what would that involve? (Implementing some API?)

I assume that performance would be terrible unless the data is stored in the columnar
format described in the Dremel paper. Is that accurate? Does anyone know whether anyone has
attempted to translate that format to Cassandra?
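
As a rough illustration of what I mean by columnar, the sketch below stripes flat JSON
records into one list per field, so a query touching one field scans only that column.
This is a simplification: it is not the Dremel encoding (which handles nested and repeated
fields with repetition and definition levels), and the sample records and the to_columns
name are assumptions for illustration.

```python
import json


def to_columns(json_rows):
    """Stripe flat JSON records into one list of values per field.

    Missing fields are stored as None so every column has one entry per record.
    """
    records = [json.loads(row) for row in json_rows]
    fields = sorted({key for record in records for key in record})
    return {field: [record.get(field) for record in records] for field in fields}


# Hypothetical entities stored as JSON strings, one per row.
rows = [
    '{"id": 1, "state": "PA", "type": "provider"}',
    '{"id": 2, "state": "NJ"}',
]

columns = to_columns(rows)

# An ad hoc count now reads a single column instead of parsing every record.
pa_count = sum(1 for state in columns["state"] if state == "PA")
```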

Regardless, I'm very interested in getting involved, and I'm no stranger to getting my hands
dirty. Let me know if you can provide any direction. (Our entities are currently stored as
JSON in Cassandra.)

-brian

-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile:215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/

