drill-dev mailing list archives

From Asaf Mesika <asaf.mes...@gmail.com>
Subject Re: Dremel and Google Analytics
Date Thu, 15 Nov 2012 19:13:22 GMT
But we're talking about more than 30 dimensions, some with very high
cardinality, in every available order. That's a huge storage penalty
to pay.
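
A rough sketch of the arithmetic behind that claim. The 30 dimensions and the drill-down depth of 5 come from this thread; the function names are made up for illustration, and this counts distinct pre-aggregated tables only, not their row counts (high-cardinality dimensions would make each table larger still).

```python
import math


def ordered_drilldowns(num_dimensions: int, max_depth: int) -> int:
    """Count drill-down paths of depth 1..max_depth where order matters.

    Hypothetical helper: each path is an ordered choice of distinct dimensions.
    """
    return sum(math.perm(num_dimensions, depth) for depth in range(1, max_depth + 1))


def unordered_drilldowns(num_dimensions: int, max_depth: int) -> int:
    """Count dimension sets of size 1..max_depth where order is ignored."""
    return sum(math.comb(num_dimensions, depth) for depth in range(1, max_depth + 1))


if __name__ == "__main__":
    # Assumed figures: 30 dimensions, drill-down up to 5 levels deep.
    print(ordered_drilldowns(30, 5))    # 17783700
    print(unordered_drilldowns(30, 5))  # 174436
```

Even if order is ignored, that is on the order of 10^5 separate aggregates per site, before accounting for cardinality.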

Sent from my iPhone

On 15 Nov 2012, at 07:47, Xun Zhou <shawn.x.zhou@gmail.com> wrote:

> Why don't they pre-aggregate only the standard report set, and compute
> the 'custom report' at runtime based on a column store, say
> Bigtable? As you said, users select only 5 dimensions at a time
> in a custom report. IMHO, 'column families' in Bigtable can help to scan
> less data in practice.
>
> On Wed, Nov 14, 2012 at 1:25 AM, Asaf Mesika <asaf.mesika@gmail.com> wrote:
>> Interesting.
>> Analytics offers drilling down up to 5 dimensions deep, with your choice of
>> dimensions out of a few tens. That's quite a lot of combinations for them to
>> pre-aggregate, so it seems they would pay a heavy storage penalty for such
>> pre-calculation.
>> Regarding large data sets: when you are using the app you are focused on one
>> domain, so the data set is only as large as that site's traffic. As I
>> understand it, they have 20k-50k machines, so I thought they could disperse
>> the data across them and run Dremel on top of it. They could optimize by
>> doing some first-level aggregations along all sorts of dimensions, and then
>> run Dremel on top of that, which would make the data set smaller by at
>> least 10x.
>>
>> Asaf
>>
>> On 13 Nov 2012, at 17:51, David Gruzman <david@bigdatacraft.com> wrote:
>>
>>> As far as I know, it is not. It relies on heavy sampling and pre-calculation.
>>> If you process large data sets, the result of the aggregation will also be
>>> large, which is something Dremel is not intended to support. It is designed
>>> to build a small derivative over a large data set.
>>> David
>>>
>>> On Tue, Nov 13, 2012 at 5:36 PM, Mesika, Asaf <asaf.mesika@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> Do you know if Google Analytics is powered by Dremel?
>>>>
>>>> Thanks,
>>>>
>>>> Asaf
>>
