spark-user mailing list archives

From gen tang <gen.tan...@gmail.com>
Subject Re: Loading JSON dataset with Spark Mllib
Date Sun, 15 Feb 2015 23:55:16 GMT
Hi,

You can use sqlCtx.jsonFile(), which loads a text file containing one
JSON object per line as a SchemaRDD.
Alternatively, use sc.textFile() to load the file into an RDD of strings
and then pass it to sqlCtx.jsonRDD(), which loads an RDD holding one JSON
object per string as a SchemaRDD.
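
A minimal sketch of both options in Python, assuming a Spark 1.x PySpark
session in which sc (SparkContext) and sqlCtx (SQLContext) already exist.
The helper names, the sample records, and the file name points.json are
made up for illustration; only jsonFile(), textFile() and jsonRDD() come
from the description above. The standard json module is used here only to
write the sample file, not to load it into Spark.

```python
import json
import os
import tempfile


def write_json_lines(records, path):
    """Write one JSON object per line, the layout jsonFile() expects."""
    with open(path, "w") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")


def load_with_json_file(sqlCtx, path):
    # Option 1: read the file directly into a SchemaRDD.
    return sqlCtx.jsonFile(path)


def load_with_json_rdd(sc, sqlCtx, path):
    # Option 2: read the lines as an RDD of strings, then parse them as JSON.
    lines = sc.textFile(path)
    return sqlCtx.jsonRDD(lines)


# Hypothetical sample data: a label and two features per record.
records = [
    {"label": 1.0, "f1": 0.5, "f2": 2.0},
    {"label": 0.0, "f1": 1.5, "f2": 0.1},
]
path = os.path.join(tempfile.mkdtemp(), "points.json")
write_json_lines(records, path)
```

With a running session, load_with_json_file(sqlCtx, path) or
load_with_json_rdd(sc, sqlCtx, path) should each return a SchemaRDD. A
SchemaRDD is itself an RDD of rows, so it can be used inside an
application and not only interactively.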

Hope this helps.
Cheers
Gen


On Mon, Feb 16, 2015 at 12:39 AM, pankaj channe <pankajc007@gmail.com>
wrote:

> Hi,
>
> I am new to Spark and am planning to write a machine learning application
> with Spark MLlib. My dataset is in JSON format. Is it possible to load data
> into Spark without using any external JSON libraries? I have explored the
> option of Spark SQL, but I believe that is only for interactive use or
> loading data into Hive tables.
>
> Thanks,
> Pankaj
>
