I was trying out the Spark XML library, but I keep getting errors when it infers the schema. It looks like it cannot infer the schema from single-line XML data.
-------- Original message --------
From: Hyukjin Kwon <firstname.lastname@example.org>
Date:21/08/2016 15:40 (GMT+05:30)
To: Jörn Franke <email@example.com>
Cc: Diwakar Dhanuskodi <firstname.lastname@example.org>, Felix Cheung <email@example.com>, user <firstname.lastname@example.org>
Subject: Re: Best way to read XML data from RDD
Spark XML library can take RDD as source.
val df = new XmlReader()
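For reference, a complete call might look like the sketch below. It assumes the spark-xml package (com.databricks.spark.xml) is on the classpath and that its XmlReader exposes withRowTag and xmlRdd as in the releases I have used. The row tag "book" and the names XmlFromRdd and toDataFrame are hypothetical, chosen only for illustration.

```scala
import com.databricks.spark.xml.XmlReader
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SQLContext}

object XmlFromRdd {
  // Builds a DataFrame from an RDD whose elements are XML strings.
  // "book" is a hypothetical row tag; use the element that wraps one record in your data.
  def toDataFrame(sqlContext: SQLContext, records: RDD[String]): DataFrame =
    new XmlReader()
      .withRowTag("book")
      .xmlRdd(sqlContext, records)
}
```

The schema is inferred from the records here, so malformed or unexpectedly shaped records would surface as errors at this step.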
If performance is critical, I would also recommend paying attention to how often the parser is created and destroyed.
If the parser is not serializable, you can create it once per partition inside mapPartitions.
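A minimal sketch of the per-partition idea, assuming the parser is the JDK's DocumentBuilder (which is not serializable) and that each record is one complete XML document. The names PerPartitionParsing and rootTags are hypothetical, and returning the root tag name merely stands in for whatever extraction you actually need.

```scala
import java.io.StringReader
import javax.xml.parsers.DocumentBuilderFactory
import org.xml.sax.InputSource

object PerPartitionParsing {
  // Handles all records of one partition with a single DocumentBuilder,
  // so the parser is created once per partition instead of once per record.
  def rootTags(records: Iterator[String]): Iterator[String] = {
    val builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    records.map { xml =>
      val doc = builder.parse(new InputSource(new StringReader(xml)))
      doc.getDocumentElement.getTagName
    }
  }
}
```

With Spark this would be called as rdd.mapPartitions(PerPartitionParsing.rootTags), assuming rdd is an RDD[String]. Because the builder is created inside the function, it never has to be shipped from the driver to the executors.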
I hope this is helpful.