drill-user mailing list archives

From Jason Altekruse <altekruseja...@gmail.com>
Subject Re: Unable to query data from hdfs
Date Wed, 08 Apr 2015 21:19:37 GMT
Hi Latha,

Unfortunately, the mailing list does not support attachments. Could you
upload the file to a file-sharing service and share a link? If the file is
under 20 MB, you should be able to file a JIRA issue and attach it there if
you don't have another host available.

-Jason

On Wed, Apr 8, 2015 at 2:06 PM, Sivasubramaniam, Latha <
Latha.Sivasubramaniam@aspect.com> wrote:

>  Ramana,
>
>
>
> Please find attached dservices.tar file.
>
>
>
> Thanks for your help.
>
>
>
> -Latha
>
>
>
> *From:* Sivasubramaniam, Latha
> *Sent:* Wednesday, April 08, 2015 1:33 PM
>
> *To:* 'user@drill.apache.org'
> *Subject:* RE: Unable to query data from hdfs
>
>
>
> Thanks for all the responses.
>
>
>
> Once I renamed the files within the directories to have a .csv extension,
> it worked. So it looks like an extension is required for the csv format. It
> would be nice if Drill did not accept “null” in the extensions setting.
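
A minimal sketch of the rename step described above, assuming the files sit
in HDFS, that their paths contain no whitespace, and that `hdfs dfs -mv` is
an acceptable way to rename them. The function name plan_csv_renames is
hypothetical. It only prints the commands so they can be reviewed before
anything is changed. Depending on the Drill version, setting
defaultInputFormat on the workspace may be an alternative to renaming; that
has not been verified against this setup.

```shell
# Print (do not run) the hdfs commands that would add a .csv extension
# to every path whose file name has no extension yet.
plan_csv_renames() {
  for path in "$@"; do
    case "${path##*/}" in
      *.*) ;;  # file name already contains a dot: leave it alone
      *) printf 'hdfs dfs -mv %s %s.csv\n' "$path" "$path" ;;
    esac
  done
}

plan_csv_renames /DIM_Agents /test.csv
# prints: hdfs dfs -mv /DIM_Agents /DIM_Agents.csv
```

Piping the printed lines to `sh` would perform the renames once they look
right.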
>
>
>
> For the next step of my proof of concept, I am trying to access Parquet
> files. I have Parquet files (tables) that were created with Impala, and I
> am assuming I should be able to access them through Drill as well.
>
>
>
> My Parquet tables are under /user/hive/warehouse, as listed below:
>
>
>
>
>
> [root@rtr-poc-imp1 sample-data]# hdfs dfs -ls /user/hive/warehouse
>
> Found 19 items
>
> drwxrwxrwt   - impala hive          0 2015-03-31 16:00
> /user/hive/warehouse/dim_agent_status_parq
>
> drwxrwxrwt   - impala hive          0 2015-03-31 16:00
> /user/hive/warehouse/dim_agent_status_reasons_parq
>
> drwxrwxrwt   - impala hive          0 2015-03-27 12:27
> /user/hive/warehouse/dim_agents_parquet
>
> drwxrwxrwt   - impala hive          0 2015-03-31 16:00
> /user/hive/warehouse/dim_call_action_reasons_parq
>
> drwxrwxrwt   - impala hive          0 2015-03-31 14:09
> /user/hive/warehouse/dim_call_actions_parq
>
> drwxrwxrwt   - impala hive          0 2015-03-31 13:54
> /user/hive/warehouse/dim_call_types_parq
>
> drwxrwxrwt   - impala hive          0 2015-03-31 15:59
> /user/hive/warehouse/dim_dispositions_parq
>
> drwxrwxrwt   - impala hive          0 2015-03-31 15:20
> /user/hive/warehouse/dim_resource_groups_parq
>
> drwxrwxrwt   - impala hive          0 2015-03-31 13:33
> /user/hive/warehouse/dim_services_parq
>
> drwxrwxrwt   - impala hive          0 2015-03-31 14:00
> /user/hive/warehouse/dim_sites_parq
>
> drwxrwxrwt   - impala hive          0 2015-03-31 15:25
> /user/hive/warehouse/dim_workgroups_parq
>
> drwxrwxrwx   - root   hive          0 2015-04-08 14:36
> /user/hive/warehouse/dservices
>
> drwxrwxrwt   - impala hive          0 2015-03-27 11:48
> /user/hive/warehouse/edwpoc.db
>
> drwxrwxrwt   - impala hive          0 2015-03-31 12:47
> /user/hive/warehouse/fact_agent_activity_detail_12m_partparq
>
> drwxrwxrwt   - impala hive          0 2015-03-30 13:03
> /user/hive/warehouse/fact_contact_detail_12m_partparq
>
> drwxrwxrwt   - impala hive          0 2015-03-27 13:36
> /user/hive/warehouse/fact_contact_detail_partparq
>
> -rw-r--r--   3 root   hive        455 2015-04-08 14:55
> /user/hive/warehouse/region.parq
>
> drwxrwxrwt   - impala hive          0 2015-03-25 22:29
> /user/hive/warehouse/sample_07
>
> drwxrwxrwt   - impala hive          0 2015-03-25 22:29
> /user/hive/warehouse/sample_08
>
>
>
> Example listing from one of the directories:
>
>
>
> hdfs dfs -ls /user/hive/warehouse/dim_services_parq
>
> Found 3 items
>
> -rw-r--r--   3 impala hive      55121 2015-03-31 13:33
> /user/hive/warehouse/dim_services_parq/4645c4221dafa337-250888c6ac1de29b_1376355963_data.0.parq
>
> -rw-r--r--   3 impala hive      71075 2015-03-31 13:33
> /user/hive/warehouse/dim_services_parq/4645c4221dafa337-250888c6ac1de29c_2123191845_data.0.parq
>
> drwxrwxrwt   - impala hive          0 2015-03-31 13:33
> /user/hive/warehouse/dim_services_parq/_impala_insert_staging
>
> [root@rtr-poc-imp1 sample-data]#
>
>
>
> There is nothing under the Impala staging directory; it is used primarily
> during insert operations.
>
>
>
> I copied the dim_services_parq directory to dservices; the listing of the
> dservices directory is below.
>
>
>
> [root@rtr-poc-imp1 sample-data]#  hdfs dfs -ls
> /user/hive/warehouse/dservices
>
> Found 2 items
>
> -rwxrwxrwx   3 root hive      55121 2015-04-08 14:12
> /user/hive/warehouse/dservices/service0.parquet
>
> -rwxrwxrwx   3 root hive      71075 2015-04-08 14:12
> /user/hive/warehouse/dservices/service1.parquet
>
>
>
> When I query that directory, I get the error below:
>
>
>
> select * from hdfs.drillpoc.`/dservices`;
>
> Query failed: RemoteRpcException: Failure while running fragment.,
> java.lang.UnsupportedOperationException [
> cfca83ec-986a-43c0-a967-5aee102401dd on rtr-poc-imp2.labs.aspect.com:31010
> ]
>
> [ cfca83ec-986a-43c0-a967-5aee102401dd on
> rtr-poc-imp2.labs.aspect.com:31010 ]
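
One cheap check before looking further into the
UnsupportedOperationException is to confirm that the copied files are
intact Parquet files. A plain Parquet file begins and ends with the 4-byte
magic string PAR1. The sketch below assumes a file has first been fetched
to local disk (for example with `hdfs dfs -get`); the function name
is_parquet is hypothetical. Passing this check does not mean Drill can read
every encoding or type that Impala wrote; it only rules out a damaged copy.

```shell
# Return success if the local file starts and ends with the Parquet
# magic bytes "PAR1".
is_parquet() {
  [ "$(head -c 4 "$1")" = "PAR1" ] && [ "$(tail -c 4 "$1")" = "PAR1" ]
}

# Example use, assuming service0.parquet was fetched to the current directory:
#   hdfs dfs -get /user/hive/warehouse/dservices/service0.parquet .
#   is_parquet service0.parquet && echo "looks like Parquet"
```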
>
>
>
> I also copied the Drill sample Parquet file region.parquet to the same
> location, and that works fine, as shown below.
>
>
>
> select * from hdfs.drillpoc.`region.parq`;
>
> +-------------+-------------+----------------------+
> | R_REGIONKEY | R_NAME      | R_COMMENT            |
> +-------------+-------------+----------------------+
> | 0           | AFRICA      | lar deposits. blithe |
> | 1           | AMERICA     | hs use ironic, even  |
> | 2           | ASIA        | ges. thinly even pin |
> | 3           | EUROPE      | ly final courts cajo |
> | 4           | MIDDLE EAST | uickly special accou |
> +-------------+-------------+----------------------+
>
> 5 rows selected (0.122 seconds)
>
>
>
> From what I have read so far, a Parquet file created by Impala should be
> like any other Parquet file, so there should not be a problem. If this does
> not work, I will need to convert all my text-format tables to Parquet and
> access them with Drill. Is there a utility to do that?
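
On the conversion question: Drill itself can write Parquet through
CREATE TABLE AS (CTAS) when the session option store.format is set to
parquet. The sketch below only prints the SQL so it can be pasted into
sqlline. It assumes the target workspace is writable; the function name
ctas_to_parquet, the target table name test_parquet, and the column names
id and name are hypothetical, chosen to match the two-column test.csv shown
later in this thread.

```shell
# Print Drill SQL that rewrites a delimited text table as Parquet via CTAS.
# $1: source table, $2: target table (both already quoted for Drill).
ctas_to_parquet() {
  src=$1
  dst=$2
  cat <<EOF
ALTER SESSION SET \`store.format\` = 'parquet';
CREATE TABLE $dst AS
SELECT CAST(columns[0] AS INT) AS id, columns[1] AS name
FROM $src;
EOF
}

ctas_to_parquet 'hdfs.root.`/test.csv`' 'hdfs.drillpoc.`test_parquet`'
```

The column list would need to be adapted for each table, since Drill
exposes delimited text without a header as a single columns array.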
>
>
>
> Thanks for all the help.
>
> Latha
>
> *From:* Sivasubramaniam, Latha
> *Sent:* Wednesday, April 08, 2015 8:00 AM
> *To:* 'user@drill.apache.org'
> *Subject:* RE: Unable to query data from hdfs
>
>
>
> Hi,
>
>
>
> Thanks for your responses. Even though I had already issued ‘use hdfs’, the
> query only worked when I fully qualified the file name. But I am still not
> able to access files without a .csv extension.
>
>
>
> I changed
>
> "csv": {
>   "type": "text",
>   "extensions": [
>     "csv"
>   ],
>   "delimiter": ","
> }
>
> to
>
> "csv": {
>   "type": "text",
>   "extensions": null,
>   "delimiter": ","
> }
>
>
>
> Then I tried to access the HDFS file ‘DIM_Agents’, but I got the same error.
> With null extensions I could not access ‘test.csv’ either. Once I reverted
> to the original csv format description I could access test.csv again, but I
> cannot access the other files with either format description.
>
>
>
> Below is what I tried. Is the ‘_’ (underscore) in the file name a problem?
> All my HDFS files are in text format.
>
>
>
> 0: jdbc:drill:zk=rtr-poc-imp1:2181> select * from hdfs.root.`/test.csv`;
>
> +------------+------------+
>
> |  columns   |    dir0    |
>
> +------------+------------+
>
> | ["1","Latha"] | root       |
>
> | ["2","Roshan"] | root       |
>
> +------------+------------+
>
> 2 rows selected (0.276 seconds)
>
> 0: jdbc:drill:zk=rtr-poc-imp1:2181> select * from hdfs.root.`/DIM_Agents`;
>
> Query failed: SqlValidatorException: Table 'hdfs.root./DIM_Agents' not
> found
>
>
>
> Error: exception while executing query: Failure while executing query.
> (state=,code=0)
>
> 0: jdbc:drill:zk=rtr-poc-imp1:2181> select * from hdfs.root.`/DIM_Agents`;
>
> Query failed: SqlValidatorException: Table 'hdfs.root./DIM_Agents' not
> found
>
>
>
> Error: exception while executing query: Failure while executing query.
> (state=,code=0)
>
> 0: jdbc:drill:zk=rtr-poc-imp1:2181> select * from hdfs.root.`/test.csv`;
>
> Query failed: SqlValidatorException: Table 'hdfs.root./test.csv' not found
>
>
>
> Error: exception while executing query: Failure while executing query.
> (state=,code=0)
>
> 0: jdbc:drill:zk=rtr-poc-imp1:2181> select * from hdfs.root.`/DIM_Agents`;
>
> Query failed: SqlValidatorException: Table 'hdfs.root./DIM_Agents' not
> found
>
>
>
> Error: exception while executing query: Failure while executing query.
> (state=,code=0)
>
> 0: jdbc:drill:zk=rtr-poc-imp1:2181> select * from hdfs.root.`/test.csv`;
>
> Query failed: SqlValidatorException: Table 'hdfs.root./test.csv' not found
>
>
>
> Error: exception while executing query: Failure while executing query.
> (state=,code=0)
>
> 0: jdbc:drill:zk=rtr-poc-imp1:2181> select * from hdfs.root.`/test.csv`;
>
> +------------+------------+
>
> |  columns   |    dir0    |
>
> +------------+------------+
>
> | ["1","Latha"] | root       |
>
> | ["2","Roshan"] | root       |
>
> +------------+------------+
>
> 2 rows selected (0.112 seconds)
>
>
>
> Appreciate your help.
>
>
>
> Thanks,
>
> Latha
>
