spark-user mailing list archives

From neeraj <>
Subject Help required on exercise Data Exploration using Spark SQL
Date Thu, 16 Oct 2014 14:45:03 GMT

I'm working through the exercise "Data Exploration Using Spark SQL" from Spark
Summit 2014. When I run the command "val wikiData =
sqlContext.parquetFile("data/wiki_parquet")", I get the following
output, which doesn't match the expected output.

*Output I'm getting*:
 val wikiData1 =
14/10/16 19:26:49 INFO parquet.ParquetTypesConverter: Falling back to schema
conversion from Parquet types; result: ArrayBuffer(id#5, title#6,
modified#7L, text#8, username#9)
wikiData1: org.apache.spark.sql.SchemaRDD =
SchemaRDD[1] at RDD at SchemaRDD.scala:103
== Query Plan ==
== Physical Plan ==
ParquetTableScan [id#5,title#6,modified#7L,text#8,username#9],
(ParquetRelation /data/wiki_parquet/part-r-1.parquet, Some(Configuration:
core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml,
yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml),
org.apache.spark.sql.SQLContext@27a5dac0, []), []

*Expected Output*:
wikiData: org.apache.spark.sql.SchemaRDD = 
SchemaRDD[0] at RDD at SchemaRDD.scala:98
== Query Plan ==
ParquetTableScan [id#0,title#1,modified#2L,text#3,username#4],
(ParquetRelation data/wiki_parquet), []

Please help me identify the possible issue.

I'm using the pre-built package of Spark for Hadoop 2.4.

Please let me know if more information is required.

