spark-issues mailing list archives

From "Paul Praet (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-26128) filter breaks input_file_name
Date Tue, 20 Nov 2018 13:02:00 GMT
Paul Praet created SPARK-26128:
----------------------------------

             Summary: filter breaks input_file_name
                 Key: SPARK-26128
                 URL: https://issues.apache.org/jira/browse/SPARK-26128
             Project: Spark
          Issue Type: Bug
          Components: Spark Shell
    Affects Versions: 2.3.2
            Reporter: Paul Praet


Without a filter, input_file_name returns the source file path for each row:
{code:java}
scala> spark.read.parquet("/tmp/newparquet").select(input_file_name).show(5,false)
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
|input_file_name()                                                                                                                                     |
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
|file:///tmp/newparquet/parquet-5-PT6H/junit/data/tenant=NA/year=2017/month=201704/day=20170406/hour=2017040618/data.eu-west-1b.290.PT6H.FINAL.parquet|
|file:///tmp/newparquet/parquet-5-PT6H/junit/data/tenant=NA/year=2017/month=201704/day=20170406/hour=2017040618/data.eu-west-1b.290.PT6H.FINAL.parquet|
|file:///tmp/newparquet/parquet-5-PT6H/junit/data/tenant=NA/year=2017/month=201704/day=20170406/hour=2017040618/data.eu-west-1b.290.PT6H.FINAL.parquet|
|file:///tmp/newparquet/parquet-5-PT6H/junit/data/tenant=NA/year=2017/month=201704/day=20170406/hour=2017040618/data.eu-west-1b.290.PT6H.FINAL.parquet|
|file:///tmp/newparquet/parquet-5-PT6H/junit/data/tenant=NA/year=2017/month=201704/day=20170406/hour=2017040618/data.eu-west-1b.290.PT6H.FINAL.parquet|
+-----------------------------------------------------------------------------------------------------------------------------------------------------+

{code}
After adding a filter, input_file_name() returns empty strings:
{code:java}
scala> spark.read.parquet("/tmp/newparquet").where("key.station='XYZ'").select(input_file_name()).show(5,false)
+-----------------+
|input_file_name()|
+-----------------+
|                 |
|                 |
|                 |
|                 |
|                 |
+-----------------+

{code}


