Ramana,

Please find attached dservices.tar file. Thanks for your help.

-Latha

From: Sivasubramaniam, Latha
Sent: Wednesday, April 08, 2015 1:33 PM
To: 'user@drill.apache.org'
Subject: RE: Unable to query data from hdfs

Thanks for all the responses. Once I renamed files within directories to have extensions .csv, then it worked. So looks like for csv format, having extension is a must. It would be nice, if it does not allow "null" in the extension description.

Now in the next step of my proof of concept, I am trying to access parquet files. I have parquet files(tables) created for the tables using impala, I am assuming that I should be able to access those files via drill as well. My parquet tables are placed under /user/hive/warehouse, like listed below here

[root@rtr-poc-imp1 sample-data]# hdfs dfs -ls /user/hive/warehouse
Found 19 items
drwxrwxrwt - impala hive 0 2015-03-31 16:00 /user/hive/warehouse/dim_agent_status_parq
drwxrwxrwt - impala hive 0 2015-03-31 16:00 /user/hive/warehouse/dim_agent_status_reasons_parq
drwxrwxrwt - impala hive 0 2015-03-27 12:27 /user/hive/warehouse/dim_agents_parquet
drwxrwxrwt - impala hive 0 2015-03-31 16:00 /user/hive/warehouse/dim_call_action_reasons_parq
drwxrwxrwt - impala hive 0 2015-03-31 14:09 /user/hive/warehouse/dim_call_actions_parq
drwxrwxrwt - impala hive 0 2015-03-31 13:54 /user/hive/warehouse/dim_call_types_parq
drwxrwxrwt - impala hive 0 2015-03-31 15:59 /user/hive/warehouse/dim_dispositions_parq
drwxrwxrwt - impala hive 0 2015-03-31 15:20 /user/hive/warehouse/dim_resource_groups_parq
drwxrwxrwt - impala hive 0 2015-03-31 13:33 /user/hive/warehouse/dim_services_parq
drwxrwxrwt - impala hive 0 2015-03-31 14:00 /user/hive/warehouse/dim_sites_parq
drwxrwxrwt - impala hive 0 2015-03-31 15:25 /user/hive/warehouse/dim_workgroups_parq
drwxrwxrwx - root hive 0 2015-04-08 14:36 /user/hive/warehouse/dservices
drwxrwxrwt - impala hive 0 2015-03-27 11:48 /user/hive/warehouse/edwpoc.db
drwxrwxrwt - impala hive 0 2015-03-31 12:47 /user/hive/warehouse/fact_agent_activity_detail_12m_partparq
drwxrwxrwt - impala hive 0 2015-03-30 13:03 /user/hive/warehouse/fact_contact_detail_12m_partparq
drwxrwxrwt - impala hive 0 2015-03-27 13:36 /user/hive/warehouse/fact_contact_detail_partparq
-rw-r--r-- 3 root hive 455 2015-04-08 14:55 /user/hive/warehouse/region.parq
drwxrwxrwt - impala hive 0 2015-03-25 22:29 /user/hive/warehouse/sample_07
drwxrwxrwt - impala hive 0 2015-03-25 22:29 /user/hive/warehouse/sample_08

example listing from one of the directory

hdfs dfs -ls /user/hive/warehouse/dim_services_parq
Found 3 items
-rw-r--r-- 3 impala hive 55121 2015-03-31 13:33 /user/hive/warehouse/dim_services_parq/4645c4221dafa337-250888c6ac1de29b_1376355963_data.0.parq
-rw-r--r-- 3 impala hive 71075 2015-03-31 13:33 /user/hive/warehouse/dim_services_parq/4645c4221dafa337-250888c6ac1de29c_2123191845_data.0.parq
drwxrwxrwt - impala hive 0 2015-03-31 13:33 /user/hive/warehouse/dim_services_parq/_impala_insert_staging
[root@rtr-poc-imp1 sample-data]#

There is nothing under impala staging directory, this is primarily used when insert operation is performed. I copied dim_services_parq directory to dservices and below is the listing of dservices directory.

[root@rtr-poc-imp1 sample-data]# hdfs dfs -ls /user/hive/warehouse/dservices
Found 2 items
-rwxrwxrwx 3 root hive 55121 2015-04-08 14:12 /user/hive/warehouse/dservices/service0.parquet
-rwxrwxrwx 3 root hive 71075 2015-04-08 14:12 /user/hive/warehouse/dservices/service1.parquet

Now when I try, I get the below error

select * from hdfs.drillpoc.`/dservices`;
Query failed: RemoteRpcException: Failure while running fragment., java.lang.UnsupportedOperationException [ cfca83ec-986a-43c0-a967-5aee102401dd on rtr-poc-imp2.labs.aspect.com:31010 ]
[ cfca83ec-986a-43c0-a967-5aee102401dd on rtr-poc-imp2.labs.aspect.com:31010 ]

I also copied the drill sample parquet file region.parquet to the same location and that works fine like below.

select * from hdfs.drillpoc.`region.parq`;
+-------------+------------+------------+
| R_REGIONKEY | R_NAME | R_COMMENT |
+-------------+------------+------------+
| 0 | AFRICA | lar deposits. blithe |
| 1 | AMERICA | hs use ironic, even |
| 2 | ASIA | ges. thinly even pin |
| 3 | EUROPE | ly final courts cajo |
| 4 | MIDDLE EAST | uickly special accou |
+-------------+------------+------------+
5 rows selected (0.122 seconds)

From what I have read so far, an impala-created parquet file should be like any other parquet file, so there should not be a problem. If this does not work, I need to convert all my tables in text format to parquet format and access them with drill. Is there any utility to do that?

Thanks for all the help.
Latha

From: Sivasubramaniam, Latha
Sent: Wednesday, April 08, 2015 8:00 AM
To: 'user@drill.apache.org'
Subject: RE: Unable to query data from hdfs

Hi,

Thanks for your responses. Even though I had done use hdfs, it worked only when I fully qualified the file name. But I am not able to access files without the .csv extension. I modified

"csv": {
  "type": "text",
  "extensions": [ "csv" ],
  "delimiter": ","

to

"csv": {
  "type": "text",
  "extensions": null,
  "delimiter": ","

and tried to access hdfs file 'DIM_Agents', and I get the same error. With null extensions, I can't access 'test.csv' either. Once I reverted the csv format description, I could access test.csv again, but I cannot access other files with either of the format descriptions. Below is what I tried. Is '_' (underscore) a problem in the file name? All my hdfs files are in text format.

0: jdbc:drill:zk=rtr-poc-imp1:2181> select * from hdfs.root.`/test.csv`;
+------------+------------+
| columns | dir0 |
+------------+------------+
| ["1","Latha"] | root |
| ["2","Roshan"] | root |
+------------+------------+
2 rows selected (0.276 seconds)
0: jdbc:drill:zk=rtr-poc-imp1:2181> select * from hdfs.root.`/DIM_Agents`;
Query failed: SqlValidatorException: Table 'hdfs.root./DIM_Agents' not found
Error: exception while executing query: Failure while executing query. (state=,code=0)
0: jdbc:drill:zk=rtr-poc-imp1:2181> select * from hdfs.root.`/DIM_Agents`;
Query failed: SqlValidatorException: Table 'hdfs.root./DIM_Agents' not found
Error: exception while executing query: Failure while executing query. (state=,code=0)
0: jdbc:drill:zk=rtr-poc-imp1:2181> select * from hdfs.root.`/test.csv`;
Query failed: SqlValidatorException: Table 'hdfs.root./test.csv' not found
Error: exception while executing query: Failure while executing query. (state=,code=0)
0: jdbc:drill:zk=rtr-poc-imp1:2181> select * from hdfs.root.`/DIM_Agents`;
Query failed: SqlValidatorException: Table 'hdfs.root./DIM_Agents' not found
Error: exception while executing query: Failure while executing query. (state=,code=0)
0: jdbc:drill:zk=rtr-poc-imp1:2181> select * from hdfs.root.`/test.csv`;
Query failed: SqlValidatorException: Table 'hdfs.root./test.csv' not found
Error: exception while executing query: Failure while executing query. (state=,code=0)
0: jdbc:drill:zk=rtr-poc-imp1:2181> select * from hdfs.root.`/test.csv`;
+------------+------------+
| columns | dir0 |
+------------+------------+
| ["1","Latha"] | root |
| ["2","Roshan"] | root |
+------------+------------+
2 rows selected (0.112 seconds)

Appreciate your help.

Thanks,
Latha
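
For reference, the workaround reported at the top of this thread (renaming the text files so they carry a .csv extension) could be planned with a small script along the following lines. This is only a sketch under stated assumptions: the function names plan_csv_renames and to_hdfs_commands and the example paths under /user/root are hypothetical and do not come from the thread. The script only prints `hdfs dfs -mv` command lines for review; it does not run them.

```python
import posixpath
import shlex


def plan_csv_renames(paths, extension=".csv"):
    """Return (source, destination) pairs for paths whose file name has no extension."""
    renames = []
    for path in paths:
        name = posixpath.basename(path)
        # splitext yields an empty extension for names such as "DIM_Agents".
        _, ext = posixpath.splitext(name)
        if ext == "":
            renames.append((path, path + extension))
    return renames


def to_hdfs_commands(renames):
    """Format each rename as an `hdfs dfs -mv` command line (not executed here)."""
    return [
        "hdfs dfs -mv {} {}".format(shlex.quote(src), shlex.quote(dst))
        for src, dst in renames
    ]


if __name__ == "__main__":
    # Hypothetical paths; substitute the real file locations before use.
    example_paths = ["/user/root/DIM_Agents", "/user/root/test.csv"]
    for command in to_hdfs_commands(plan_csv_renames(example_paths)):
        print(command)
```

Files that already have any extension are left alone, so test.csv is skipped and only DIM_Agents is scheduled for a rename. Dot-files and directory paths are not treated specially in this sketch, so the printed commands should be checked before they are run.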