spark-issues mailing list archives

From "Sean Owen (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (SPARK-15347) Problem select empty ORC table
Date Mon, 16 May 2016 15:05:12 GMT

     [ https://issues.apache.org/jira/browse/SPARK-15347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-15347.
-------------------------------
       Resolution: Duplicate
    Fix Version/s:     (was: 1.6.0)

Please have a look through JIRA first and read https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark

> Problem select empty ORC table
> ------------------------------
>
>                 Key: SPARK-15347
>                 URL: https://issues.apache.org/jira/browse/SPARK-15347
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 1.6.1
>         Environment: Hadoop 2.7.1.2.4.2.0-258
> Subversion git@github.com:hortonworks/hadoop.git -r 13debf893a605e8a88df18a7d8d214f571e05289
> Compiled by jenkins on 2016-04-25T05:46Z
> Compiled with protoc 2.5.0
> From source with checksum 2a2d95f05ec6c3ac547ed58cab713ac
> This command was run using /usr/hdp/2.4.2.0-258/hadoop/hadoop-common-2.7.1.2.4.2.0-258.jar
>            Reporter: Pedro Prado
>
> Error when I select from an empty ORC table
>     [pprado@hadoop-m ~]$ beeline -u jdbc:hive2://
>     WARNING: Use "yarn jar" to launch YARN applications.
>     Connecting to jdbc:hive2://
>     Connected to: Apache Hive (version 1.2.1000.2.4.2.0-258)
>     Driver: Hive JDBC (version 1.2.1000.2.4.2.0-258)
>     Transaction isolation: TRANSACTION_REPEATABLE_READ
>     Beeline version 1.2.1000.2.4.2.0-258 by Apache Hive
> On beeline => create table my_test (id int, name String) stored as orc;
> On beeline => select * from my_test;
>     16/05/13 18:18:57 [main]: ERROR hdfs.KeyProviderCache: Could not find uri with key [dfs.encryption.key.provider.uri] to create a keyProvider !!
>     OK
>     +-------------+---------------+--+
>     | my_test.id | my_test.name |
>     +-------------+---------------+--+
>     +-------------+---------------+--+
>     No rows selected (1.227 seconds)
> Hive is OK!
> Now, when I run pyspark:
>     Welcome to
>     SPARK version 1.6.1
>     Using Python version 2.6.6 (r266:84292, Jul 23 2015 15:22:56)
>     SparkContext available as sc, HiveContext available as sqlContext.
> PySpark => sqlContext.sql("select * from my_test")
>     16/05/13 18:33:41 INFO ParseDriver: Parsing command: select * from my_test
>     16/05/13 18:33:41 INFO ParseDriver: Parse Completed
>     Traceback (most recent call last):
>     File "", line 1, in
>     File "/usr/hdp/2.4.2.0-258/spark/python/pyspark/sql/context.py", line 580, in sql
>     return DataFrame(self.ssql_ctx.sql(sqlQuery), self)
>     File "/usr/hdp/2.4.2.0-258/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 813, in __call__
>     File "/usr/hdp/2.4.2.0-258/spark/python/pyspark/sql/utils.py", line 53, in deco
>     raise IllegalArgumentException(s.split(': ', 1)[1], stackTrace)
>     pyspark.sql.utils.IllegalArgumentException: u'orcFileOperator: path hdfs://hadoop-m.c.sva-0001.internal:8020/apps/hive/warehouse/my_test does not have valid orc files matching the pattern'
> When I create a Parquet table instead, everything works; I do not have this problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

