drill-user mailing list archives

From "Nikos R. Katsipoulakis" <nick.kat...@gmail.com>
Subject Re: Additional information on JSON_SUB_SCAN operator and access to query profiles not from the Web UI
Date Fri, 20 Jan 2017 00:10:59 GMT
Hello again,

Thank you Parth for your suggestions! I will try to follow your
instructions and do something like you suggested on the server.
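
For anyone following along, here is a minimal sketch of the workspace
setup described in the quoted reply below. It is Python that only prints
a JSON fragment for the "workspaces" section of the dfs storage plugin,
plus a sample query. The workspace name `profiles`, the path
`/mnt/drill/log/profile`, and the column names in the query are
hypothetical placeholders, and `defaultInputFormat` is set on the
assumption that the profile files do not carry a `.json` extension.

```python
import json

# Hypothetical values: adjust the workspace name and the shared (e.g. NFS) path.
WORKSPACE_NAME = "profiles"
PROFILE_DIR = "/mnt/drill/log/profile"


def profile_workspace(location: str) -> dict:
    """Build one entry for the "workspaces" section of a dfs storage plugin."""
    return {
        "location": location,
        "writable": False,
        # Assumption: profile files lack a .json extension, so ask Drill
        # to read files in this workspace as JSON by default.
        "defaultInputFormat": "json",
    }


def sample_query(workspace: str, file_name: str) -> str:
    """SQL reading one profile file; the column names are assumptions."""
    return (
        "SELECT t.`query`, t.`start`, t.`end` "
        f"FROM dfs.{workspace}.`{file_name}` t"
    )


if __name__ == "__main__":
    print(json.dumps({WORKSPACE_NAME: profile_workspace(PROFILE_DIR)}, indent=2))
    print(sample_query(WORKSPACE_NAME, "<profile file name>"))
```

The printed fragment would be pasted into the storage plugin
configuration by hand; nothing here talks to a running drillbit.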

In addition, I noticed something odd: when I execute a query from Drill's
console (terminal), I get an execution time of, say, X. When I look at the
execution profile for the same query in the Web Console, it reports an
execution time Y, which is always less than X. My understanding is that the
profiler's timer leaves out some operations that the console's timer
includes. Why does this happen? Could the execution times reported in
Drill's console include additional per-query startup costs (query parsing,
evaluation, optimization, etc.)? If so, can I get an exact breakdown of the
time spent on a query?
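
To make the question concrete, here is a rough sketch of the breakdown I
can already compute offline from a saved profile file. The field names
(`start`, `end`, `fragmentProfile`, `minorFragmentProfile`,
`operatorProfile`, `operatorType`, `setupNanos`, `processNanos`,
`waitNanos`) are my assumptions about the profile's JSON layout and should
be checked against an actual file. It shows only what the profile itself
accounts for, not where the difference between X and Y goes.

```python
import json
from collections import defaultdict


def summarize_profile(profile: dict) -> dict:
    """Return wall-clock time and per-operator nanosecond totals."""
    # Assumption: "start" and "end" are epoch timestamps in milliseconds.
    wall_ms = profile["end"] - profile["start"]

    per_operator = defaultdict(
        lambda: {"setup_ns": 0, "process_ns": 0, "wait_ns": 0}
    )
    # Assumed nesting: major fragments -> minor fragments -> operators.
    for major in profile.get("fragmentProfile", []):
        for minor in major.get("minorFragmentProfile", []):
            for op in minor.get("operatorProfile", []):
                totals = per_operator[op.get("operatorType")]
                totals["setup_ns"] += op.get("setupNanos", 0)
                totals["process_ns"] += op.get("processNanos", 0)
                totals["wait_ns"] += op.get("waitNanos", 0)
    return {"wall_ms": wall_ms, "operators": dict(per_operator)}


def summarize_file(path: str) -> dict:
    """Load one profile file from disk and summarize it."""
    with open(path, encoding="utf-8") as fh:
        return summarize_profile(json.load(fh))
```

The per-operator totals are summed across fragments that may run in
parallel, so they are not expected to add up to the wall-clock time.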

Thank you,
Nikos

On Thu, Jan 19, 2017 at 5:48 PM, Parth Chandra <pchandra@mapr.com> wrote:

> JSON_SUB_SCAN is the JSON reader. It uses Jackson to do the actual parsing
> and converts the data into Drill's internal value vector format.
> TEXT_SUB_SCAN is the corresponding operator for CSV.
>
> If the Drill system has access to the /log/profile directory then you can,
> in fact, use Drill to query the JSON in the query profile. You might want
> to set up an NFS location for the query profiles, so that the directory is
> visible to all drillbits. Then simply create a new workspace pointing to
> the directory. You will be able to read the profiles like any other JSON
> file.
>
> ________________________________
> From: Nikos R. Katsipoulakis <nick.katsip@gmail.com>
> Sent: Wednesday, January 11, 2017 7:37:30 AM
> To: user@drill.apache.org
> Subject: Additional information on JSON_SUB_SCAN operator and access to
> query profiles not from the Web UI
>
> Hello all,
>
> I am a new user of Apache Drill and I am in the process of better
> understanding its internals. To that end, I have two questions, for which I
> was unable to find more information online.
>
> First, when I execute an EXPLAIN command for a query that gets its data
> from JSON files, I see a physical operator named JSON_SUB_SCAN. What
> exactly does that operator do? Is it only used for parsing (extracting)
> fields from JSON data, or does it perform additional processing? As far as
> I know, Drill uses the Jackson Streaming API for extracting JSON data. Is
> that still true? Finally, what is the equivalent operator for CSV files?
>
> Second, I need to access query profiles from a server that is behind a
> firewall. Therefore, accessing the URL of that machine on port 8047 is a
> headache (since I have to submit a ticket to IT Support). My question is
> whether I can access the query profiles in any other way, for example from
> sqlline or through the log/profile files created while executing queries.
>
> Thank you and Kind Regards,
>
> --
> Nikos R. Katsipoulakis,
> Department of Computer Science
> University of Pittsburgh
>



-- 
Nikos R. Katsipoulakis,
Department of Computer Science
University of Pittsburgh
