hbase-user mailing list archives

From Andrew Nguyen <andrew-lists-hb...@ucsfcti.org>
Subject Modeling column families
Date Fri, 23 Apr 2010 23:46:23 GMT
Hello all,

I am currently in the process of researching and learning about HBase (along with other column
stores) as a potential solution for storing large amounts of physiologic data. In both Cassandra
and HBase, it seems that column families need to be created administratively; however, Cassandra
would require an additional server restart.

That said, the patient physiology that I'm looking to store is basically time-series data.
We are pulling in A/D counts (or values converted to their physical units) for various physiologic
parameters for patients in the intensive-care environment.  So, my first inclination
was to model it as follows:

One single column family for "physiology"

Each row key is of the form "PatientName-PhysiologicParameter" and each column name is the
timestamp of the reading.

So, say patient Bob is in the ICU and his arterial blood pressure, heart rate, and intracranial
pressure are currently being monitored.  This would result in the row keys:


The column names would be, "2010-04-23 16:43:44" and so on...
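
To make the layout above concrete, here is a rough sketch in plain Python. It is not HBase client code: the nested dict only stands in for the single "physiology" column family, the helper names (row_key, column_name, put_reading) and the parameter name "HeartRate" are hypothetical, and the values are made up for illustration.

```python
from datetime import datetime

def row_key(patient: str, parameter: str) -> str:
    """Build a 'PatientName-PhysiologicParameter' row key (hypothetical helper)."""
    return f"{patient}-{parameter}"

def column_name(ts: datetime) -> str:
    """Format a reading's timestamp as the column name, e.g. '2010-04-23 16:43:44'."""
    return ts.strftime("%Y-%m-%d %H:%M:%S")

# In-memory stand-in for the "physiology" column family:
# {row key: {column name: value}}
physiology = {}

def put_reading(patient: str, parameter: str, ts: datetime, value: float) -> None:
    """Store one reading under its row key and timestamp column."""
    physiology.setdefault(row_key(patient, parameter), {})[column_name(ts)] = value

# Illustrative values only, not real patient data.
put_reading("Bob", "HeartRate", datetime(2010, 4, 23, 16, 43, 44), 72)
put_reading("Bob", "HeartRate", datetime(2010, 4, 23, 16, 43, 45), 73)
```

With this shape, each patient/parameter pair is one wide row that grows by one column per reading.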

Is this a reasonable way to accomplish this?  The bulk of the queries would be something like:

Give me all blood pressures for Bob between two dates
Give me all blood pressures and intracranial pressures for Bob from <date> until present
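
The two queries above could be sketched roughly as follows, again in plain Python rather than against HBase itself. The sketch assumes column names in the fixed-width "YYYY-MM-DD HH:MM:SS" form, which sort lexicographically in time order; the function names, the parameter names, the far-future default used for "until present", and the sample values are all hypothetical.

```python
from bisect import bisect_left, bisect_right

def readings_between(row, start, end):
    """Return (column, value) pairs with start <= column <= end.

    Assumes `row` maps 'YYYY-MM-DD HH:MM:SS' column names to values.
    """
    cols = sorted(row)                 # time order, thanks to the fixed-width format
    lo = bisect_left(cols, start)      # first column >= start
    hi = bisect_right(cols, end)       # one past the last column <= end
    return [(c, row[c]) for c in cols[lo:hi]]

def query(table, patient, parameters, start, end="9999-12-31 23:59:59"):
    """Fetch several parameters for one patient.

    The default `end` is a stand-in for 'until present'.
    """
    return {
        p: readings_between(table.get(f"{patient}-{p}", {}), start, end)
        for p in parameters
    }

# Made-up sample rows, keyed as 'PatientName-PhysiologicParameter'.
table = {
    "Bob-ArterialBloodPressure": {
        "2010-04-23 16:43:44": 93,
        "2010-04-23 16:43:45": 94,
        "2010-04-24 08:00:00": 90,
    },
    "Bob-IntracranialPressure": {"2010-04-23 16:43:44": 11},
}

# "All blood pressures and intracranial pressures for Bob from <date> until present"
result = query(
    table, "Bob",
    ["ArterialBloodPressure", "IntracranialPressure"],
    "2010-04-23 16:43:45",
)
```

Each parameter becomes one row lookup plus a slice over sorted column names, so both query shapes stay within a single patient's rows.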

In other words, the queries will be very patient-centric, or patient-physiologic parameter-centric.
