drill-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Abhishek Girish <abhishek.gir...@gmail.com>
Subject Re: Trying to wrap my head around using WHERE with flatten
Date Fri, 27 Mar 2015 18:01:47 GMT
Possibly related to DRILL-2208
<https://issues.apache.org/jira/browse/DRILL-2208>

On Fri, Mar 27, 2015 at 10:49 AM, Christopher Matta <cmatta@mapr.com> wrote:

> I’m using the yelp academic data set <
> https://www.yelp.com/academic_dataset>
> for my testing purposes.
>
> Say I want to list all businesses in the “Pets” category, this query
> doesn’t work:
>
> 0: jdbc:drill:zk=sen11:5181,sen12:5181> select `name`,
> flatten(`categories`) from
> maprfs.cmatta.`yelp/yelp_academic_dataset_business.json` WHERE
> flatten(`categories`) = 'Pets' limit 10;
> Query failed: Query stopped., Failure while trying to materialize
> incoming schema.  Errors:
>
> Error in expression at index -1.  Error: Missing function
> implementation: [flatten(VARCHAR-REPEATED)].  Full expression:
> --UNKNOWN EXPRESSION--.. [ daf9f62a-f39b-4745-a0e3-0160875f5cb7 on
> se-node13.se.lab:31010 ]
>
> Error: exception while executing query: Failure while executing query.
> (state=,code=0)
>
> But this query with a sub-select does work:
>
> 0: jdbc:drill:zk=sen11:5181,sen12:5181> select flat.`name`,
> flat.`categories` from (select `name`, flatten(`categories`) as
> categories from
> maprfs.cmatta.`yelp/yelp_academic_dataset_business.json`) as flat
> WHERE lower(flat.`categories`) = 'pets' limit 10;
> +------------+------------+
> |    name    | categories |
> +------------+------------+
> | Loving Hands Pet Care | Pets       |
> | Amec Mid-City Animal Hospital | Pets       |
> | PetSmart   | Pets       |
> | A Dog's Life Photography | Pets       |
> | Goober Pet Direct | Pets       |
> | Pet Planet | Pets       |
> | All-Star Animal Hospital | Pets       |
> | Team Canine, Inc | Pets       |
> | Foothills Pet Resort | Pets       |
> | Petco      | Pets       |
> +------------+------------+
> 10 rows selected (0.141 seconds)
>
> Now that I can pick out which businesses are in the “Pets” category, I may
> want to get the average star rating and order them by the number of
> reviews. However, using the above technique in an aggregation query doesn’t
> seem to work:
>
> 0: jdbc:drill:zk=sen11:5181,sen12:5181> select
> . . . . . . . . . . . . . . . . . . . >        a.`name`,
> . . . . . . . . . . . . . . . . . . . >        count(b.`text`),
> . . . . . . . . . . . . . . . . . . . >        avg(b.`stars`)
> . . . . . . . . . . . . . . . . . . . >    FROM (
> . . . . . . . . . . . . . . . . . . . >      SELECT `business_id`,
> `name`, flatten(`categories`) as `category`
> . . . . . . . . . . . . . . . . . . . >      FROM
> maprfs.`cmatta`.`yelp/yelp_academic_dataset_business.json`) a
> . . . . . . . . . . . . . . . . . . . >    JOIN
> maprfs.`cmatta`.`yelp/yelp_academic_dataset_review.json` b
> . . . . . . . . . . . . . . . . . . . >        ON a.`business_id` =
> b.`business_id`
> . . . . . . . . . . . . . . . . . . . >    WHERE a.`category` = 'Pets'
> . . . . . . . . . . . . . . . . . . . >    GROUP BY a.`name`
> . . . . . . . . . . . . . . . . . . . >    ORDER BY count(*) DESC
> . . . . . . . . . . . . . . . . . . . >    LIMIT 10;
> Query failed: Query failed: Failure while running fragment., Failure
> while trying to materialize incoming schema.  Errors:
>
> Error in expression at index 2.  Error: Missing function
> implementation: [flatten(VARCHAR-REPEATED)].  Full expression: null..
> [ c28a76ef-ef20-426a-a47e-908e9b68b875 on se-node13.se.lab:31010 ]
> [ c28a76ef-ef20-426a-a47e-908e9b68b875 on se-node13.se.lab:31010 ]
>
> Error: exception while executing query: Failure while executing query.
> (state=,code=0)
>
> What does *Error: Missing function implementation:
> [flatten(VARCHAR-REPEATED)]* in this context? Am I going about this the
> wrong way?
>
> Chris Matta
> cmatta@mapr.com
> 215-701-3146
> ​
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message