spark-issues mailing list archives

From "Hyukjin Kwon (JIRA)" <>
Subject [jira] [Created] (SPARK-18667) input_file_name function does not work with UDF
Date Thu, 01 Dec 2016 07:01:06 GMT
Hyukjin Kwon created SPARK-18667:

             Summary: input_file_name function does not work with UDF
                 Key: SPARK-18667
             Project: Spark
          Issue Type: Bug
          Components: PySpark
            Reporter: Hyukjin Kwon

{{input_file_name()}} returns an empty string instead of the file name when it is used
as the input to a UDF in PySpark.

With the data below:

{"a": 1}

and the code below:

from pyspark.sql.functions import *
from pyspark.sql.types import *

def filename(path):
    return path

sourceFile = udf(filename, StringType())
spark.read.json("tmp.json").select(sourceFile(input_file_name())).show()

prints as below:

|                           |

but selecting {{input_file_name()}} directly, without wrapping it in the UDF, prints the file name correctly as below:

|   input_file_name()|

This seems to be a PySpark-specific issue.