drill-dev mailing list archives

From Philip Haynes <philip.hay...@virtualnation.com.au>
Subject Re: What language are you going to use to develop drill?
Date Tue, 02 Oct 2012 00:30:00 GMT
Not sure I entirely agree.

I could have a set of log files from a number of application servers.
Whilst the records are read only, the files are continuously appended to.

Taking a snapshot of the log files and preprocessing them each time you want
to run a set of queries is one design option; incremental update is the other.

The query systems I have built have moved away from the former to the
latter due to the cost and time associated with full pre-processing. After a
not insignificant amount of pain as the system scaled beyond 100M records, we
moved to more stream-oriented designs.
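
As a rough illustration of the incremental option (a minimal sketch, not code
from any of the systems mentioned here), a reader can remember a byte offset
per log file and hand only newly appended, complete records to the query layer.
The class name, the method name and the newline-delimited record format are all
assumptions made for the example.

```python
import os
import tempfile


class IncrementalLogReader:
    """Hypothetical helper: returns only records appended since the last read."""

    def __init__(self):
        # path -> byte offset of the first unread byte in that file
        self.offsets = {}

    def read_new_records(self, path):
        start = self.offsets.get(path, 0)
        records = []
        with open(path, "rb") as f:
            f.seek(start)
            for line in f:
                if not line.endswith(b"\n"):
                    # Partial record still being written; pick it up next time.
                    break
                records.append(line.rstrip(b"\n").decode("utf-8"))
                start += len(line)
        self.offsets[path] = start
        return records


def demo():
    """Append to a log in three steps and read incrementally after each."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "app.log")
        reader = IncrementalLogReader()
        with open(path, "ab") as f:
            f.write(b"a\nb\n")
        first = reader.read_new_records(path)
        with open(path, "ab") as f:
            f.write(b"c\npartial")
        second = reader.read_new_records(path)
        with open(path, "ab") as f:
            f.write(b" done\n")
        third = reader.read_new_records(path)
        return first, second, third
```

Each call touches only the bytes written since the previous call, so the cost
tracks the append rate rather than the total size of the log, which is the
property a full snapshot-and-preprocess pass lacks.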


Not going to die in a ditch over this one, however, as implicit in my note
is the fix for when people start having problems processing larger datasets.

Cheers,
Philip

On 2/10/12 9:31 AM, "Dmitriy Ryaboy" <dvryaboy@gmail.com> wrote:

>On Mon, Oct 1, 2012 at 3:35 AM, Philip Haynes <
>philip.haynes@virtualnation.com.au> wrote:
>
>> I actually said "transactional consistency to atomic clock accuracy"
>> rather than "global transactions" -
>> there is a difference.  The key assumption to scope is whether Dremel
>> tables are mutable or not.
>> Whilst BigQuery tables are immutable, it isn't clear (to me anyway) that
>> this is true for
>> Dremel.
>
>
>Philip, the first sentence of the Dremel paper reads:
>"Dremel is a scalable, interactive ad-hoc query system for analysis
>of *read-only* nested data".
>
>So I think that part is fairly clear.
>
>-D


