drill-dev mailing list archives

From Jim Scott <jsc...@maprtech.com>
Subject Re: select from table with options
Date Wed, 21 Oct 2015 13:10:18 GMT
My initial reaction to the table function idea was that it sounded kind of
sketchy. But given Julian's elaboration and description, it sounds like a
great idea.

From a user perspective this is easy to understand and flexible. I see this
table function model as effectively a hint for how to handle the data, and I
think others will see it that way too.

+1

On Tue, Oct 20, 2015 at 1:32 PM, Julian Hyde <jhyde@apache.org> wrote:

> +1 to use table functions
>
> In Calcite (and I presume Drill) a “table function” may actually function
> more like a (Lisp) macro. The function gets called at prepare time to yield
> a RelNode (say a TableScan). So a table function is every bit as efficient
> as using a table, but it allows extra parameters.
>
> If the table function has a lot of parameters it might be nice to support
> named parameters:
>
> select * from table(delimitedFile(path => '/path/to/something.psv',
> delimiter => '|'));
>
> Named parameters are in the SQL standard but are not supported by
> Calcite’s parser currently. Parameters can be specified in any order, and
> those not specified have a default value.
>
> Julian
>
>
> > On Oct 19, 2015, at 5:18 PM, Ted Dunning <ted.dunning@gmail.com> wrote:
> >
> > Wouldn't a table function be a better option?
> >
> > Something like this perhaps?
> >
> > select * from
> > delimitedFile(dfs.`default`.`/path/to/file/something.psv`, '|')
> >
> > ?
> >
> > Or how about fake-o parameters that the delimited record scanner knows how
> > to push down into the scanning of the data? That would look like this:
> >
> > select * from
> > dfs.`default`.`/path/to/file/something.psv`
> > where magicFieldDelimiter = '|';
> >
> >
> >
> > On Mon, Oct 19, 2015 at 2:28 PM, Julien Le Dem <julien@dremio.com> wrote:
> >
> >> I'm looking into passing information on how to interpret a file through the
> >> select clause in Drill.
> >> Something along the lines of:
> >> select * from
> >> dfs.`default`.`/path/to/file/something.psv?type=text&delimiter=|`;
> >> (In this example, we want to specify a specific delimiter, but that would
> >> apply to any *type* of format)
> >>
> >> This would allow reading a file without having to centrally configure
> >> formats: https://drill.apache.org/docs/querying-plain-text-files/
> >> It also makes it easier to try reading an existing file.
> >> Typically once the user has found the proper settings, they would update
> >> the central configuration.
> >>
> >> thoughts?
> >>
> >> --
> >> Julien
> >>
>
>
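
As a rough illustration of the idea quoted above, here is a small Python
sketch that models a table function as something evaluated once at prepare
time, returning a description of the scan rather than any rows. It is not
Calcite's or Drill's API: the names TableScan, delimited_file, and file_type,
and the default values, are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableScan:
    """Hypothetical stand-in for the plan node a table function yields."""
    path: str
    file_type: str
    delimiter: str


def delimited_file(path: str, *, file_type: str = "text",
                   delimiter: str = ",") -> TableScan:
    """Hypothetical table function: runs once during planning, not per row."""
    if len(delimiter) != 1:
        raise ValueError("delimiter must be a single character")
    return TableScan(path=path, file_type=file_type, delimiter=delimiter)


# Named parameters may appear in any order; omitted ones take their defaults.
scan = delimited_file(delimiter="|", path="/path/to/something.psv")
print(scan)
```

The point of the sketch is that the options travel with the query as ordinary
arguments, so the resulting scan carries the same information a centrally
configured format would, without requiring the configuration step first.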


-- 
*Jim Scott*
Director, Enterprise Strategy & Architecture
+1 (347) 746-9281
@kingmesal <https://twitter.com/kingmesal>

