spark-user mailing list archives

From Cheng Lian <lian.cs....@gmail.com>
Subject Re: Help required on exercise Data Exploration using Spark SQL
Date Thu, 16 Oct 2014 15:11:36 GMT
Hi Neeraj,

The Spark Summit 2014 tutorial uses Spark 1.0. I guess you're using 
Spark 1.1? Parquet support has been polished quite a bit since then, which 
also changed the string representation of the query plan. The columns in 
your output match the expected ones, so this output should be OK :)
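
A minimal sketch of that comparison, in plain Scala with no Spark required, so it can be pasted into spark-shell or the Scala REPL. It assumes the number after `#` is only a per-session expression ID and that a trailing `L` marks a long column, which is how Spark SQL prints attributes as far as I know. The names `Attr` and `normalize` are hypothetical helpers, not part of any Spark API.

```scala
// Attribute strings copied from the two query plans quoted in this thread.
val got      = Seq("id#5", "title#6", "modified#7L", "text#8", "username#9")
val expected = Seq("id#0", "title#1", "modified#2L", "text#3", "username#4")

// Matches e.g. "modified#7L": column name, numeric expression ID, optional type suffix.
val Attr = """(\w+)#\d+(\w*)""".r

// Keep the column name and type suffix, drop the expression ID (assumed to be
// an arbitrary per-session counter that carries no schema information).
def normalize(attr: String): (String, String) = attr match {
  case Attr(name, suffix) => (name, suffix)
  case other              => (other, "")
}

// The two plans describe the same columns if they agree once the IDs are ignored.
val sameSchema = got.map(normalize) == expected.map(normalize)
println(sameSchema) // prints true
```

With the IDs stripped, both plans list the same five columns in the same order, so the remaining differences (the file path, the Hadoop configuration and the SQLContext reference) are only extra detail in how the newer version prints the relation.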

Cheng

On 10/16/14 10:45 PM, neeraj wrote:
> Hi,
>
> I'm working through the exercise "Data Exploration using Spark SQL" from Spark Summit
> 2014. While running the command "val wikiData =
> sqlContext.parquetFile("data/wiki_parquet")", I'm getting the following
> output, which doesn't match the expected output.
>
> *Output I'm getting*:
>   val wikiData1 =
> sqlContext.parquetFile("/data/wiki_parquet/part-r-1.parquet")
> 14/10/16 19:26:49 INFO parquet.ParquetTypesConverter: Falling back to schema
> conversion from Parquet types; result: ArrayBuffer(id#5, title#6,
> modified#7L, text#8, username#9)
> wikiData1: org.apache.spark.sql.SchemaRDD =
> SchemaRDD[1] at RDD at SchemaRDD.scala:103
> == Query Plan ==
> == Physical Plan ==
> ParquetTableScan [id#5,title#6,modified#7L,text#8,username#9],
> (ParquetRelation /data/wiki_parquet/part-r-1.parquet, Some(Configuration:
> core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml,
> yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml),
> org.apache.spark.sql.SQLContext@27a5dac0, []), []
>
> *Expected Output*:
> wikiData: org.apache.spark.sql.SchemaRDD =
> SchemaRDD[0] at RDD at SchemaRDD.scala:98
> == Query Plan ==
> ParquetTableScan [id#0,title#1,modified#2L,text#3,username#4],
> (ParquetRelation data/wiki_parquet), []
>
> Please help me identify the possible issue.
>
> I'm using the pre-built package of Spark with Hadoop 2.4.
>
> Please let me know if more information is required.
>
> Regards,
> Neeraj
>
>
>
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Help-required-on-exercise-Data-Exploratin-using-Spark-SQL-tp16569.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
> For additional commands, e-mail: user-help@spark.apache.org
>

