spark-issues mailing list archives

From "Hyukjin Kwon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-19729) Strange behaviour with reading csv with schema into dataframe
Date Mon, 27 Feb 2017 13:08:45 GMT

    [ https://issues.apache.org/jira/browse/SPARK-19729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15885758#comment-15885758 ]

Hyukjin Kwon commented on SPARK-19729:
--------------------------------------

Sorry, I am a bit confused by this report. This is what I get:

{code}
scala> Seq("var1,var2,,").toDF().write.text("/tmp/testcsv")

scala> val df = spark.read.csv("/tmp/testcsv")
df: org.apache.spark.sql.DataFrame = [_c0: string, _c1: string ... 2 more fields]

scala> df.show()
+----+----+----+----+
| _c0| _c1| _c2| _c3|
+----+----+----+----+
|var1|var2|null|null|
+----+----+----+----+

scala> val row = df.first()
row: org.apache.spark.sql.Row = [var1,var2,null,null]

scala> row.size
res19: Int = 4

scala> row.fieldIndex("_c2")
res20: Int = 2

scala> row.getAs[String]("_c2")
res21: String = null

scala> row.get(2)
res22: Any = null

scala> print(row)
[var1,var2,null,null]
{code}

Could you tell me which of these results looks like an issue to you?
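
The session above lets Spark infer the columns, whereas the report reads the file with an explicit schema. For comparison, here is a minimal sketch of the same checks with a schema applied. It assumes the field names first, second, third and fourth with non-nullable string types, as listed in the issue description; the object name, the application name and the path /tmp/testcsv_schema are made up for illustration. It is not the reporter's actual code, and no output is shown because this has not been run against 2.0.1.

{code}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Hypothetical stand-alone reproduction; names and paths are illustrative only.
object CsvWithSchemaCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("SPARK-19729-check")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Write the single example line from the report as a text file.
    val path = "/tmp/testcsv_schema"
    Seq("var1,var2,,").toDF().write.mode("overwrite").text(path)

    // Schema as described in the report: four string fields, nullable = false.
    val schema = StructType(Seq(
      StructField("first", StringType, nullable = false),
      StructField("second", StringType, nullable = false),
      StructField("third", StringType, nullable = false),
      StructField("fourth", StringType, nullable = false)
    ))

    // Read the file back with the explicit schema instead of inferred columns.
    val df = spark.read.schema(schema).csv(path)
    val row = df.first()

    // The same accessors the report lists, applied to the third field.
    println(row.size)
    println(row.fieldIndex("third"))
    println(row.getAs[String]("third"))
    println(row.get(2))
    println(row)

    spark.stop()
  }
}
{code}

If the values printed for the third field differ from the null shown in the session above, posting that output together with the exact code used would make the problem much easier to pin down.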

> Strange behaviour with reading csv with schema into dataframe
> -------------------------------------------------------------
>
>                 Key: SPARK-19729
>                 URL: https://issues.apache.org/jira/browse/SPARK-19729
>             Project: Spark
>          Issue Type: Bug
>          Components: Java API, SQL
>    Affects Versions: 2.0.1
>            Reporter: Mazen Melouk
>
> I have the following schema
> [{first,string_type,false}
> ,{second,string_type,false}
> ,{third,string_type,false}
> ,{fourth,string_type,false}]
> Example lines:
> var1,var2,,
> when accessing the row I get the following
> row.size =4
> row.fieldIndex(third_string)=2
> row.getAs(third_string)=var2
> row.get(2)=var2
> print(row)= var1,var2
> Any idea why the null values are missing?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


