spark-dev mailing list archives

From Chang Ya-Hsuan <sumti...@gmail.com>
Subject Re: Failed to generate predicate Error when using dropna
Date Wed, 09 Dec 2015 04:21:36 GMT
https://issues.apache.org/jira/browse/SPARK-12231

This is my first time creating a JIRA ticket.
Is this ticket appropriate?
Thanks.

On Tue, Dec 8, 2015 at 9:59 PM, Reynold Xin <rxin@databricks.com> wrote:

> Can you create a JIRA ticket for this? Thanks.
>
>
> On Tue, Dec 8, 2015 at 5:25 PM, Chang Ya-Hsuan <sumtiogo@gmail.com> wrote:
>
>> spark version: spark-1.5.2-bin-hadoop2.6
>> python version: 2.7.9
>> os: ubuntu 14.04
>>
>> Code to reproduce the error:
>>
>> # write.py
>>
>> import pyspark
>> sc = pyspark.SparkContext()
>> sqlc = pyspark.SQLContext(sc)
>> df = sqlc.range(10)
>> df1 = df.withColumn('a', df['id'] * 2)
>> df1.write.partitionBy('id').parquet('./data')
>>
>>
>> # read.py
>>
>> import pyspark
>> sc = pyspark.SparkContext()
>> sqlc = pyspark.SQLContext(sc)
>> df2 = sqlc.read.parquet('./data')
>> df2.dropna().count()
>>
>>
>> $ spark-submit write.py
>> $ spark-submit read.py
>>
>> # error message
>>
>> 15/12/08 17:20:34 ERROR Filter: Failed to generate predicate, fallback to
>> interpreted org.apache.spark.sql.catalyst.errors.package$TreeNodeException:
>> Binding attribute, tree: a#0L
>> ...
>>
>> If I write the data without partitionBy, the error does not occur.
>> Any suggestions?
>> Thanks!
>>
>> --
>> -- 張雅軒
>>
>
>


-- 
-- 張雅軒
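
Background on why partitionBy matters in the report above: with partitionBy('id'), the data is written into Hive-style directories such as ./data/id=3/, and the 'id' column is recovered from the directory name on read instead of being stored inside the Parquet files. The sketch below is plain Python with no Spark dependency; the helper name partition_values and the example file name are hypothetical and only illustrate the path convention, not Spark's actual partition discovery code or the cause of the error.

```python
import re


def partition_values(path):
    """Return {column: value} parsed from Hive-style 'col=value' path segments.

    Hypothetical helper for illustration; Spark performs its own
    partition discovery internally.
    """
    values = {}
    for segment in path.split('/'):
        match = re.match(r'^([^=]+)=(.*)$', segment)
        if match:
            values[match.group(1)] = match.group(2)
    return values


# Path of the shape that partitionBy('id') produces; the file name is made up.
example = './data/id=3/part-r-00000.parquet'
print(partition_values(example))  # {'id': '3'}
```

The value comes back as a string here. Spark, by default, also infers a type for partition values, which this sketch does not attempt.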
