drill-user mailing list archives

From Vitalii Diravka <vita...@apache.org>
Subject Re: Query on tables
Date Tue, 25 Dec 2018 19:16:39 GMT
Hello!

Can you filter out the newly arriving data and run CTAS only on that data?
That would avoid reprocessing data that has already been loaded.
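
A rough sketch of that idea, under assumptions not stated in this thread:
the source lands in per-batch folders on S3 (for example one per day), and
each new batch is written by its own CTAS into a subdirectory of the table
directory on EFS. The helper name and the names `dfs.efs`, `s3.root`,
`events` and `incoming/...` are hypothetical placeholders, not taken from
the original setup.

```python
def build_incremental_ctas(target_ws, table, batch, source_ws, source_path):
    """Build a Drill CTAS statement that loads one new batch only.

    The batch is written to `<table>/<batch>` so that earlier batches
    stay untouched. All names passed in are hypothetical examples.
    """
    # Backticks delimit identifiers in Drill, so reject them in path parts.
    for part in (table, batch, source_path):
        if "`" in part:
            raise ValueError(f"backtick not allowed in {part!r}")
    return (
        f"CREATE TABLE {target_ws}.`{table}/{batch}` AS "
        f"SELECT * FROM {source_ws}.`{source_path}`"
    )


# Example: load only the folder that arrived on 2018-12-25.
stmt = build_incremental_ctas(
    "dfs.efs", "events", "2018-12-25", "s3.root", "incoming/2018-12-25"
)
print(stmt)
```

Since Drill treats a directory as a table, a query against the parent
directory (here `events`) should then see every batch loaded so far. Please
verify this against your own storage plugin configuration.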

Kind regards
Vitalii


On Mon, Dec 24, 2018 at 6:53 PM Idea Everywhere <mailtonskiran@gmail.com>
wrote:

> Hi Team,
>
> My current situation:
> I have Apache Drill installed on a 3-node cluster of AWS EC2 instances
> (m4.4xlarge). My source data comes from an S3 bucket.
> I want Drill to read that data from S3 and create tables from it (using
> CTAS), with the table data stored on AWS EFS mounted on the EC2
> instances mentioned above, so that users can read the data from those
> tables.
> Tables and partitioned tables have been created so far.
>
> Questions:
> 1. When a table is created, Drill reads the data from the source and
> writes the table along with that data (i.e., if the original source is
> 10GB, the table stored in the file system is comparable in size). My
> question is: if the source keeps growing, how does the new data get into
> the CTAS tables or the CTAS PARTITION BY tables, so that queries return
> the latest results?
>
> Kind Regards
> Kiran
>
>
