drill-user mailing list archives

From Rafael Jaimes III <rafjai...@gmail.com>
Subject Re: REST data source?
Date Wed, 01 Apr 2020 01:39:32 GMT
Either a text description of the parse path or specifying the class
with the message parser could work.
I think the latter would be better if it were as simple as dropping the
JAR in 3rdparty after Drill is already built.
That way we can just continually add parsers ad-hoc.

An example JSON response has about 4 top-level fields,
2 of which have many sub-fields.
For example, a field could be nested 3 levels deep and look like this:

Characteristic:
   Color:
      Color name: "Red"
      Confidence: 100
   Physical:
      Size: 405
      Confidence: 95

As you can imagine, it would be difficult to flatten this because of
repeated sub-field names like "Confidence".
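
One way around the name collision is to key each leaf by its full path
instead of its bare field name. Below is a minimal sketch in Python,
assuming the response parses into nested dicts shaped like the
Characteristic example above. The flatten helper, the "." separator and
the sample record are illustrative only and are not part of Drill or of
the API in question.

```python
def flatten(node, prefix="", sep="."):
    """Flatten nested dicts into a single dict keyed by the full path."""
    flat = {}
    for key, value in node.items():
        # Build the path-qualified name, e.g. "Characteristic.Color.Confidence".
        path = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into sub-fields, carrying the path so far.
            flat.update(flatten(value, path, sep))
        else:
            flat[path] = value
    return flat


# Sample record mirroring the example above (made-up structure, not real data).
record = {
    "Characteristic": {
        "Color": {"Color name": "Red", "Confidence": 100},
        "Physical": {"Size": 405, "Confidence": 95},
    }
}

row = flatten(record)
for name, value in row.items():
    print(name, "=", value)
```

With path-qualified names the two "Confidence" values land in separate
columns, at the cost of long column names. This sketch does not handle
arrays of sub-records, which would need their own rule.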

I don't think it would be easily exportable to CSV.
At least for me, a pandas DataFrame is the ultimate destination for all
of this, and pandas doesn't handle nested fields well either.
I'll have to handle some of the parsing on my end.

Filter pushdown would be huge and much desired.
Our other end-users are accustomed to using SQL in that manner, and the
REST API we use fully supports AND, OR, BETWEEN, =, <, >, etc. (I can
get a full list if you're interested).
For example, I think "between" is written as a ",". Converting the SQL
statement into the URL format would be awesome and would help streamline
querying across data sources.
This is one of the main reasons why we're so interested in Drill.
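
To make the idea concrete, here is a rough sketch in Python of the kind of
translation meant here, turning simple predicates into URL query
parameters. Everything in it is an assumption: the to_query helper, the
field names, the field=value form for "=", and the low,high form for
BETWEEN (based only on the guess above that "between" is a ","). It is
not Drill's pushdown mechanism and not the actual URL syntax of the API.

```python
from urllib.parse import urlencode


def to_query(filters):
    """Translate (field, op, value) triples into a URL query string.

    Hypothetical encoding, for illustration only:
      "="       -> field=value
      "between" -> field=low,high
    """
    params = []
    for field, op, value in filters:
        if op == "=":
            params.append((field, str(value)))
        elif op == "between":
            low, high = value
            params.append((field, f"{low},{high}"))
        else:
            # Operators this sketch does not cover would stay in the SQL engine.
            raise ValueError(f"unsupported operator: {op}")
    # safe="," keeps the comma literal instead of percent-encoding it.
    return urlencode(params, safe=",")


# Roughly: WHERE color = 'Red' AND size BETWEEN 400 AND 410
query = to_query([("color", "=", "Red"), ("size", "between", (400, 410))])
print(query)
```

Multiple triples are joined with "&", which stands in for AND here. OR and
the comparison operators would need whatever syntax the API really uses.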


Thanks,

Rafael
