In Spark 2.1 we've added a from_json function that I think will do what you want.
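
For illustration, here is a minimal sketch of how `from_json` might be applied to a string column. The column name `json`, the sample rows, and the schema fields `name` and `age` are assumptions made up for this example, not taken from the question; adjust them to your data.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types.{IntegerType, StringType, StructType}

object FromJsonSketch {
  def main(args: Array[String]): Unit = {
    // Local session for experimentation only; a real job would get its
    // master and app name from the cluster configuration.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("from_json sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical input: one string column holding JSON documents.
    val df = Seq(
      """{"name":"a","age":1}""",
      """{"name":"b","age":2}"""
    ).toDF("json")

    // from_json needs the schema up front (assumed fields for this sketch).
    val schema = new StructType()
      .add("name", StringType)
      .add("age", IntegerType)

    // Parse the JSON string into a struct, then flatten it into columns.
    val parsed = df
      .select(from_json(col("json"), schema).as("data"))
      .select("data.*")

    parsed.show()
    spark.stop()
  }
}
```

Because the schema is passed explicitly, Spark does not have to scan the data to infer it, which may matter on very large inputs.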

This seems to work:

import org.apache.spark.sql._
// Assumes `df` is a DataFrame with a single string column of JSON text.
val rdd = df.rdd.map { case Row(j: String) => j }
However, I wonder whether there is any inefficiency here, since I have to apply this function to billions of rows.