spark-user mailing list archives

From Jacek Laskowski <ja...@japila.pl>
Subject [SQL] Why does spark.read.csv.cache give me a WARN about cache but not text?!
Date Wed, 17 Aug 2016 01:05:17 GMT
Hi,

Can anyone explain why the second call to
spark.read.csv("people.csv").cache.show logs a WARN while the second
call to spark.read.text("people.csv").cache.show does not?
It happens in 2.0 and in today's build.

scala> sc.version
res5: String = 2.1.0-SNAPSHOT

scala> spark.read.csv("people.csv").cache.show
+---------+---------+-------+----+
|      _c0|      _c1|    _c2| _c3|
+---------+---------+-------+----+
|kolumna 1|kolumna 2|kolumn3|size|
|    Jacek| Warszawa| Polska|  40|
+---------+---------+-------+----+

scala> spark.read.csv("people.csv").cache.show
16/08/16 18:01:52 WARN CacheManager: Asked to cache already cached data.
+---------+---------+-------+----+
|      _c0|      _c1|    _c2| _c3|
+---------+---------+-------+----+
|kolumna 1|kolumna 2|kolumn3|size|
|    Jacek| Warszawa| Polska|  40|
+---------+---------+-------+----+

scala> spark.read.text("people.csv").cache.show
+--------------------+
|               value|
+--------------------+
|kolumna 1,kolumna...|
|Jacek,Warszawa,Po...|
+--------------------+

scala> spark.read.text("people.csv").cache.show
+--------------------+
|               value|
+--------------------+
|kolumna 1,kolumna...|
|Jacek,Warszawa,Po...|
+--------------------+
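
One way to narrow this down might be to compare the analyzed plans of two separately built DataFrames. This is only a sketch and rests on an assumption not confirmed in this thread: that CacheManager decides whether data is "already cached" by calling sameResult on the analyzed logical plans. The value names (csv1, csv2, text1, text2, csvSame, textSame) are made up for illustration.

```scala
// Sketch for spark-shell (assumes `spark` is in scope and people.csv is in the working directory).
// Build each DataFrame twice, the same way the transcript above does.
val csv1 = spark.read.csv("people.csv")
val csv2 = spark.read.csv("people.csv")
val text1 = spark.read.text("people.csv")
val text2 = spark.read.text("people.csv")

// Assumption: CacheManager matches cached data by calling sameResult on analyzed plans.
val csvSame = csv1.queryExecution.analyzed.sameResult(csv2.queryExecution.analyzed)
val textSame = text1.queryExecution.analyzed.sameResult(text2.queryExecution.analyzed)
println(s"csv plans sameResult:  $csvSame")
println(s"text plans sameResult: $textSame")

// Drop everything cached so far so later experiments start clean.
spark.catalog.clearCache()
```

If the two printed values differ, that would be consistent with the WARN showing up for csv but not for text. It would still not explain why the plans compare differently.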

Regards,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
