kudu-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Sand Stone <sand.m.st...@gmail.com>
Subject Re: Partition and Split rows
Date Thu, 12 May 2016 22:37:43 GMT
Cool, I will study it.

On Thu, May 12, 2016 at 2:17 PM, Dan Burkert <dan@cloudera.com> wrote:

>
>
> On Thu, May 12, 2016 at 2:04 PM, Sand Stone <sand.m.stone@gmail.com>
> wrote:
>
> >Instead, take advantage of the index capability of Primary Keys.
>> Currently I did make the "5-min" field a part of the primary key as well.
>> I am most likely overdoing it. I will play around with the schema and use
>> cases around it.
>>
>
> Definitely take a look at the data model in Kudu TS, it has extremely
> efficient scan semantics (all scans retrieve only the necessary data by
> using the primary key, no client *or* server side filtering), and it works
> with arbitrarily large time range partitions.
>
>
>> >since each tablet server should only have on the order of 10-20 tablets.
>> How does this 10-20 heuristics come out? Is it based on certain machine
>> profile? Or some default parameters in the code/config?
>>
>
> 10-20 isn't a hard and fast rule, and it is very dependent on the dataset
> and the machine size.  We routinely run tablet servers with 300+ tablets,
> so it's definitely not a hard limitation.  My intention was to stress that
> instead of pursuing fine grained partitions as a method of limiting the
> size of scans, instead take advantage of the primary key indexing that Kudu
> provides, just as you might in a traditional relational database.
> Partitions are better suited for ensuring that inserts and large scans can
> be parallelized across multiple machines.
>
> - Dan
>
>
>>
>> On Thu, May 12, 2016 at 1:45 PM, Dan Burkert <dan@cloudera.com> wrote:
>>
>>>
>>> On Thu, May 12, 2016 at 11:39 AM, Sand Stone <sand.m.stone@gmail.com>
>>> wrote:
>>>
>>> I don't know how Kudu load balance the data across the tablet servers.
>>>>
>>>
>>> Individual tablets are replicated and balanced across all available
>>> tablet servers, for more on that see
>>> http://getkudu.io/docs/schema_design.html#data-distribution.
>>>
>>>
>>>
>>>> For example, do I need to pre-calculate every day, a list of 5 minutes
>>>> apart timestamps at table creation? [assume I have to create a new table
>>>> every day].
>>>>
>>>
>>> If you wish to range partition on the time column, then yes, currently
>>> you must specify the splits upfront during table creation (but this will
>>> change with the non-covering range partitions work).
>>>
>>>
>>>>
>>>> My hope, with the additional 5-min column, and use it as the range
>>>> partition column, is that so I could spread the data evenly across the
>>>> tablet servers.
>>>>
>>>
>>> I don't think this is meaningfully different than range partitioning on
>>> the full time column with splits every 5 minutes.
>>>
>>>
>>>> Also, since 5-min interval data are always colocated together, the read
>>>> query could be efficient too.
>>>>
>>>
>>> Data colocation is a function of the partitioning and indexing.  As I
>>> mentioned before, if you have timestamp as part of your primary key then
>>> you can guarantee that scans specifying a time range are efficient. Overall
>>> it sounds like you are attempting to get fast scans by creating many fine
>>> grained partitions, as you might with Parquet.  This won't be an efficient
>>> strategy in Kudu, since each tablet server should only have on the order of
>>> 10-20 tablets.  Instead, take advantage of the index capability of Primary
>>> Keys.
>>>
>>> - Dan
>>>
>>>
>>>> On Thu, May 12, 2016 at 11:13 AM, Dan Burkert <dan@cloudera.com> wrote:
>>>>
>>>>> Forgot to add the PK specification to the CREATE TABLE, it should have
>>>>> read as follows:
>>>>>
>>>>> CREATE TABLE metrics (metric STRING, time TIMESTAMP, value DOUBLE)
>>>>> PRIMARY KEY (metric, time);
>>>>>
>>>>> - Dan
>>>>>
>>>>>
>>>>> On Thu, May 12, 2016 at 11:12 AM, Dan Burkert <dan@cloudera.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> On Thu, May 12, 2016 at 11:05 AM, Sand Stone <sand.m.stone@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> > Is the requirement to pre-aggregate by time window?
>>>>>>> No, I am thinking to create a column say, "minute". It's basically
>>>>>>> the minute field of the timestamp column(even round to 5-min
bucket
>>>>>>> depending on the needs). So it's a computed column being filled
in on data
>>>>>>> ingestion. My goal is that this field would help with data filtering
at
>>>>>>> read/query time, say select certain projection at minute 10-15,
to speed up
>>>>>>> the read queries.
>>>>>>>
>>>>>>
>>>>>> In many cases, Kudu can do his for you without having to add special
>>>>>> columns.  The requirements are that the timestamp is part of the
primary
>>>>>> key, and any columns that come before the timestamp in the primary
key (if
>>>>>> it's a compound PK), have equality predicates.  So for instance,
if you
>>>>>> create a table such as:
>>>>>>
>>>>>> CREATE TABLE metrics (metric STRING, time TIMESTAMP, value DOUBLE);
>>>>>>
>>>>>> then queries such as
>>>>>>
>>>>>> SELECT time, value FROM metrics WHERE metric = "my-metric" AND time
>
>>>>>> 2016-05-01T00:00 AND time < 2016-05-01T00:05
>>>>>>
>>>>>> Then only the data for that 5 minute time window will be read from
>>>>>> disk.  If the query didn't have the equality predicate on the 'metric'
>>>>>> column, then it would do a much bigger scan + filter operation. 
If you
>>>>>> want more background on how this is achieved, check out the partition
>>>>>> pruning design doc:
>>>>>> https://github.com/apache/incubator-kudu/blob/master/docs/design-docs/scan-optimization-partition-pruning.md
>>>>>> .
>>>>>>
>>>>>> - Dan
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Thanks for the info., I will follow them.
>>>>>>>
>>>>>>> On Thu, May 12, 2016 at 10:50 AM, Dan Burkert <dan@cloudera.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hey Sand,
>>>>>>>>
>>>>>>>> Sorry for the delayed response.  I'm not quite following
your use
>>>>>>>> case.  Is the requirement to pre-aggregate by time window?
I don't think
>>>>>>>> Kudu can help you directly with that (nothing built in),
but you could
>>>>>>>> always create a separate table to store the pre-aggregated
values.  As far
>>>>>>>> as applying functions to do row splits, that is an interesting
idea, but I
>>>>>>>> think once Kudu has support for range bounds (the non-covering
range
>>>>>>>> partition design doc linked above), you can simply create
the bounds where
>>>>>>>> the function would have put them.  For example, if you want
a partition for
>>>>>>>> every five minutes, you can create the bounds accordingly.
>>>>>>>>
>>>>>>>> Earlier this week I gave a talk on timeseries in Kudu, I've
>>>>>>>> included some slides that may be interesting to you.  Additionally,
you may
>>>>>>>> want to check out https://github.com/danburkert/kudu-ts,
it's a
>>>>>>>> very young  (not feature complete) metrics layer on top of
Kudu, it may
>>>>>>>> give you some ideas.
>>>>>>>>
>>>>>>>> - Dan
>>>>>>>>
>>>>>>>> On Sat, May 7, 2016 at 1:28 PM, Sand Stone <sand.m.stone@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Thanks for sharing, Dan. The diagrams explained clearly
how the
>>>>>>>>> current system works.
>>>>>>>>>
>>>>>>>>> As for things in my mind. Take the schema of
>>>>>>>>> <host,metric,time,...>, say, I am interested in
data for the past 5 mins,
>>>>>>>>> 10 mins, etc. Or, aggregate at 5 mins interval for the
past 3 days, 7 days,
>>>>>>>>> ... Looks like I need to introduce a special 5-min bar
column, use that
>>>>>>>>> column to do range partition to spread data across the
tablet servers so
>>>>>>>>> that I could leverage parallel filtering.
>>>>>>>>>
>>>>>>>>> The cost of this extra column (INT8) is not ideal but
not too bad
>>>>>>>>> either (storage cost wise, compression should do wonders).
So I am thinking
>>>>>>>>> whether it would be better to take "functions" as row
split instead of only
>>>>>>>>> constants. Of course if business requires to drop down
to 1-min bar, the
>>>>>>>>> data has to be re-sharded again. So a more cost effective
way of doing this
>>>>>>>>> on a production cluster would be good.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sat, May 7, 2016 at 8:50 AM, Dan Burkert <dan@cloudera.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Sand,
>>>>>>>>>>
>>>>>>>>>> I've been working on some diagrams to help explain
some of the
>>>>>>>>>> more advanced partitioning types, it's attached.
  Still pretty rough at
>>>>>>>>>> this point, but the goal is to clean it up and move
it into the Kudu
>>>>>>>>>> documentation proper.  I'm interested to hear what
kind of time series you
>>>>>>>>>> are interested in Kudu for.  I'm tasked with improving
Kudu for time
>>>>>>>>>> series, you can follow progress here
>>>>>>>>>> <https://issues.apache.org/jira/browse/KUDU-1306>.
If you have
>>>>>>>>>> any additional ideas I'd love to hear them.  You
may also be interested in
>>>>>>>>>> a small project that a JD and I have been working
on in the past week to
>>>>>>>>>> build an OpenTSDB style store on top of Kudu, you
can find it
>>>>>>>>>> here <https://github.com/danburkert/kudu-ts>.
 Still quite
>>>>>>>>>> feature limited at this point.
>>>>>>>>>>
>>>>>>>>>> - Dan
>>>>>>>>>>
>>>>>>>>>> On Fri, May 6, 2016 at 4:51 PM, Sand Stone <
>>>>>>>>>> sand.m.stone@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thanks. Will read.
>>>>>>>>>>>
>>>>>>>>>>> Given that I am researching time series data,
row locality is
>>>>>>>>>>> crucial :-)
>>>>>>>>>>>
>>>>>>>>>>> On Fri, May 6, 2016 at 3:57 PM, Jean-Daniel Cryans
<
>>>>>>>>>>> jdcryans@apache.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> We do have non-covering range partitions
coming in the next few
>>>>>>>>>>>> months, here's the design (in review):
>>>>>>>>>>>> http://gerrit.cloudera.org:8080/#/c/2772/9/docs/design-docs/non-covering-range-partitions.md
>>>>>>>>>>>>
>>>>>>>>>>>> The "Background & Motivation" section
should give you a good
>>>>>>>>>>>> idea of why I'm mentioning this.
>>>>>>>>>>>>
>>>>>>>>>>>> Meanwhile, if you don't need row locality,
using hash
>>>>>>>>>>>> partitioning could be good enough.
>>>>>>>>>>>>
>>>>>>>>>>>> J-D
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, May 6, 2016 at 3:53 PM, Sand Stone
<
>>>>>>>>>>>> sand.m.stone@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Makes sense.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yeah it would be cool if users could
specify/control the split
>>>>>>>>>>>>> rows after the table is created. Now,
I have to "think ahead" to pre-create
>>>>>>>>>>>>> the range buckets.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, May 6, 2016 at 3:49 PM, Jean-Daniel
Cryans <
>>>>>>>>>>>>> jdcryans@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> You will only get 1 tablet and no
data distribution, which is
>>>>>>>>>>>>>> bad.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> That's also how HBase works, but
it will split regions as you
>>>>>>>>>>>>>> insert data and eventually you'll
get some data distribution even if it
>>>>>>>>>>>>>> doesn't start in an ideal situation.
Tablet splitting will come later for
>>>>>>>>>>>>>> Kudu.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> J-D
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, May 6, 2016 at 3:42 PM, Sand
Stone <
>>>>>>>>>>>>>> sand.m.stone@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> One more questions, how does
the range partition work if I
>>>>>>>>>>>>>>> don't specify the split rows?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, May 6, 2016 at 3:37 PM,
Sand Stone <
>>>>>>>>>>>>>>> sand.m.stone@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks, Misty. The "advanced"
impala example helped.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I was just reading the Java
API,CreateTableOptions.java,
>>>>>>>>>>>>>>>> it's unclear how the range
partition column names associated with the
>>>>>>>>>>>>>>>> partial rows params in the
addSplitRow API.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, May 6, 2016 at 3:08
PM, Misty Stanley-Jones <
>>>>>>>>>>>>>>>> mstanleyjones@cloudera.com>
wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Sand,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Please have a look at
>>>>>>>>>>>>>>>>> http://getkudu.io/docs/kudu_impala_integration.html#partitioning_tables
>>>>>>>>>>>>>>>>> and see if it is helpful
to you.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Misty
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Fri, May 6, 2016 at
2:00 PM, Sand Stone <
>>>>>>>>>>>>>>>>> sand.m.stone@gmail.com>
wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi, I am new to Kudu.
I wonder how the split rows work. I
>>>>>>>>>>>>>>>>>> know from some docs,
this is currently for pre-creation the table. I am
>>>>>>>>>>>>>>>>>> researching how to
partition (hash+range) some time series test data.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Is there an example?
or notes somewhere I could read
>>>>>>>>>>>>>>>>>> upon.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks much.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Mime
View raw message