Hi Saikat,

You may be using the wrong mailing list for this question; it belongs on the Spark user list.

If you want to make a single string, use:
rdd.collect.mkString("\n")

Be careful: collect pulls all of the data onto the driver, so a large file can exhaust its memory!
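
For what it's worth, here is a minimal sketch of the whole thing, assuming the job is launched with spark-submit (so the master comes from there) and that the path from your mail points at a single, reasonably small file. The object name ReadWholeFile and the helper joinContents are made up for illustration. Note that wholeTextFiles gives you (path, content) pairs, and that foreach returns Unit, which is why assigning the foreach line to a variable never yields a string.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ReadWholeFile {
  // Hypothetical helper: keep only the content (second element) of each
  // (path, content) pair and join the contents with newlines.
  def joinContents(files: Array[(String, String)]): String =
    files.map(_._2).mkString("\n")

  def main(args: Array[String]): Unit = {
    // Assumption: the app name is a placeholder; the master is supplied by spark-submit.
    val conf = new SparkConf().setAppName("read-whole-file")
    val sc = new SparkContext(conf)
    try {
      // wholeTextFiles returns an RDD[(String, String)] of (path, content) pairs.
      val rdd = sc.wholeTextFiles("hdfs://nameservice1/user/me/test.txt")

      // collect() ships every pair to the driver, so this only suits small inputs.
      val json: String = joinContents(rdd.collect())

      println(json)
    } finally {
      sc.stop()
    }
  }
}
```

In spark-shell, where sc already exists, the two lines inside the try block that define rdd and json should be enough on their own (with joinContents replaced by .map(_._2).mkString("\n")).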

Cheers,
Jonathan


On Fri, 19 May 2017, 05:21 Saikat Kanjilal, <sxk1969@hotmail.com> wrote:

One additional point: the following line

rdd.collect.foreach(t=>println(t._2))

prints nothing when I assign it to a Scala string, even when I add toString at the end. This doesn't seem like it should be out of the ordinary, but I could be wrong.


From: Saikat Kanjilal <sxk1969@hotmail.com>
Sent: Thursday, May 18, 2017 8:18 PM
To: dev@spark.apache.org
Subject: Spark madness
 

Hi Devs,

I need to read a JSON file from HDFS and turn it into a Scala string. I have dug around for documentation on how to do this and found this:


http://stackoverflow.com/questions/30445263/how-to-read-whole-file-in-one-string

"How to read whole [HDFS] file in one string [in Spark, to use as sql]": e.g. // Put file to hdfs from edge-node's shell... hdfs dfs -put


The following two lines of code don't seem to do the job:


val rdd = sc.wholeTextFiles("hdfs://nameservice1/user/me/test.txt")
rdd.collect.foreach(t=>println(t._2))


I have tried to assign the result of the second line to a Scala string, but that doesn't seem to work. I would really appreciate some insight into how to do this.


Thanks in advance.