drill-user mailing list archives

From Abhishek Girish <agir...@mapr.com>
Subject Re: [newbie]: how to query HDFS
Date Fri, 22 May 2015 15:49:52 GMT
I tried out Tomar's steps on MapR and it was pretty straightforward.

I have Drill installed on one cluster. The only change I made was to add a
new storage plug-in "dfs2" (duplicating the default dfs). I edited the
connection string and changed "maprfs:///" to "maprfs://<IP>". When I
connected to Drill via sqlline (no changes here), I was able to access the
remote file system by simply using the full path of the file, prefixed
with "dfs2.".
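
As a rough sketch of what that looks like in sqlline (the file path below is
a hypothetical placeholder, not the one I actually used, and "dfs2" is just
the name I gave the duplicated plug-in):

```sql
-- Hypothetical path; substitute the full path of a file on the remote file system.
-- The path is wrapped in back ticks and prefixed with the plug-in name.
SELECT * FROM dfs2.`/path/to/file.parquet` LIMIT 5;
```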

For HDFS, I'm assuming the steps are similar, except for the connection
string (hdfs://<IP>:<port>).

Also, as Andries mentioned, there looks to be a typo in your query - that
could very well be the issue.
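
For reference, this is your query from step 2 with the path closed by a back
tick instead of a single quote. I haven't run it against your setup, so treat
it as a sketch of the syntax fix only:

```sql
-- Same query as in step 2, with a closing back tick after test.par.
select * from hdfs.root.`/tmp/test.par` limit 5;
```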

Thanks,
Abhishek

On Fri, May 22, 2015 at 8:27 AM, Andries Engelbrecht <
aengelbrecht@maprtech.com> wrote:

> In step2 you need to have a back tick ` at the end of par and not a single
> quote ‘ .
>
> It has been mentioned that it may not work to just point to the name node
> on a remote cluster. I have not tried it, but suspect there may be various
> issues with the HDFS plug-in and how you are trying to use it.
>
> Perhaps if you can explain why you are trying to do this, there may be
> other alternatives to explore.
>
> What Hadoop distro are you using?
>
> —Andries
>
>
> On May 22, 2015, at 8:17 AM, Alan Miller <Alan.Miller@synopsys.com> wrote:
>
> > Thanks Shiran,
> >
> > I tried that but get the same error (see below).
> >
> > Also, strangely I couldn't create the hdfs plugin in one step by using
> the same
> > config as the "dfs" plugin and changing the connection string. The UI
> says Invalid JSON...
> > I had to create the hdfs plugin in 2 steps. First using the same config
> as the dfs plugin.
> > Then I updated the hdfs config by changing the connection string.
> >
> > After adding the hdfs plugin with the same config as dfs (but different
> connection
> > string ("connection": "hdfs://10.10.10.10:9000/",) I tried this
> >
> > 1. Copied the file from node1 to remote HDFS
> >    [alan@node1 drill]$ hdfs dfs -fs hdfs://10.10.10.10:9000/
> -copyFromLocal ~/test.par /tmp
> >    [alan@node1 drill]$ hdfs dfs -fs hdfs://10.10.10.10:9000/ -ls
> /tmp/test.par
> >    -rw-r--r--   1 alan supergroup    4947359 2015-05-22 08:09
> /tmp/test.par
> >
> > 2. From drill on node1
> >    [alan@node1 drill]$ /opt/drill/bin/drill-localhost
> >    apache drill 1.0.0
> >    "json ain't no thang"
> >    0: jdbc:drill:drillbit=localhost> use hdfs;
> >    +-------+-----------------------------------+
> >    |  ok   |              summary              |
> >    +-------+-----------------------------------+
> >    | true  | Default schema changed to [hdfs]  |
> >    +-------+-----------------------------------+
> >    1 row selected (0.422 seconds)
> >    0: jdbc:drill:drillbit=localhost> select * from
> hdfs.root.`/tmp/test.par' limit 5;
> >    Error: PARSE ERROR: Lexical error at line 1, column 55.  Encountered:
> <EOF> after : "`/tmp/test.par\' limit 5"
> >
> >    [Error Id: 1f793d84-62be-4145-bfcf-2ec3da9cb021 on
> node1.mycompany.com:31010] (state=,code=0)
> >    0: jdbc:drill:drillbit=localhost>
> >    0: jdbc:drill:drillbit=localhost> !quit
> >    Closing:
> org.apache.drill.jdbc.DrillJdbc41Factory$DrillJdbc41Connection
> >
> > Alan
>
>


-- 

Abhishek Girish

Senior Software Engineer

(408) 476-9209

<http://www.mapr.com/>
