Try running explain on each of these. My guess would be that caching is
broken in some cases.
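
A minimal sketch of that check, assuming the same spark-shell session (where `spark` is predefined) and the same people.csv as in the quoted message. The value names `csv` and `text` are only illustrative, and the plan node names mentioned in the comments are what a cached Dataset usually shows, not output taken from this thread:

```scala
// Run inside spark-shell; `spark` is the SparkSession the shell provides.
val csv  = spark.read.csv("people.csv")
val text = spark.read.text("people.csv")

// Print the parsed, analyzed, optimized and physical plans before caching.
csv.explain(true)
text.explain(true)

// Mark both for caching and print the plans again. A Dataset that the
// CacheManager recognizes as cached would normally show an in-memory
// relation / in-memory scan node in its plans.
csv.cache().explain(true)
text.cache().explain(true)

// Build the same reads a second time and compare their plans with the ones
// above; a difference between the csv and text cases may hint at why only
// one of them triggers the "already cached" warning.
spark.read.csv("people.csv").explain(true)
spark.read.text("people.csv").explain(true)
```

Comparing the plans printed for the csv and text variants side by side is the point here; the sketch does not assume which of the two behaves correctly.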
On Tue, Aug 16, 2016 at 6:05 PM, Jacek Laskowski <jacek@japila.pl> wrote:
> Hi,
>
> Can anyone explain why spark.read.csv("people.csv").cache.show ends up
> with a WARN while spark.read.text("people.csv").cache.show does not?
> It happens in 2.0 and today's build.
>
> scala> sc.version
> res5: String = 2.1.0-SNAPSHOT
>
> scala> spark.read.csv("people.csv").cache.show
> +---------+---------+-------+----+
> |      _c0|      _c1|    _c2| _c3|
> +---------+---------+-------+----+
> |kolumna 1|kolumna 2|kolumn3|size|
> |    Jacek| Warszawa| Polska|  40|
> +---------+---------+-------+----+
>
> scala> spark.read.csv("people.csv").cache.show
> 16/08/16 18:01:52 WARN CacheManager: Asked to cache already cached data.
> +---------+---------+-------+----+
> |      _c0|      _c1|    _c2| _c3|
> +---------+---------+-------+----+
> |kolumna 1|kolumna 2|kolumn3|size|
> |    Jacek| Warszawa| Polska|  40|
> +---------+---------+-------+----+
>
> scala> spark.read.text("people.csv").cache.show
> +--------------------+
> |               value|
> +--------------------+
> |kolumna 1,kolumna...|
> |Jacek,Warszawa,Po...|
> +--------------------+
>
> scala> spark.read.text("people.csv").cache.show
> +--------------------+
> |               value|
> +--------------------+
> |kolumna 1,kolumna...|
> |Jacek,Warszawa,Po...|
> +--------------------+
>
> Regards,
> Jacek Laskowski
> ----
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski