spark-user mailing list archives

From Nathan Kronenfeld <nkronenf...@uncharted.software>
Subject CSV conversion
Date Wed, 26 Oct 2016 16:11:00 GMT
We are finally converting from Spark 1.6 to Spark 2.0, and are finding one
barrier we can't get past.

In the past, we converted CSV RDDs (not files) to DataFrames using the
Databricks spark-csv library, creating a CsvParser and calling
parser.csvRdd.
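
For readers unfamiliar with that older pattern, here is a minimal sketch of what it might have looked like on Spark 1.6 with spark-csv 1.x on the classpath. The object name, the sample lines, and the choice of the header and schema-inference options are assumptions made for illustration; they are not taken from the original code.

```scala
// Sketch only: assumes Spark 1.6.x and com.databricks:spark-csv 1.x as dependencies.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SQLContext}
import com.databricks.spark.csv.CsvParser

object CsvRddToDataFrame {  // hypothetical name
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("csv-rdd-example").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Hypothetical data: an RDD of CSV lines that is already in memory, not a file.
    val lines: RDD[String] = sc.parallelize(Seq("id,name", "1,alpha", "2,beta"))

    // Configure the parser (illustrative options), then parse the RDD directly.
    val parser = new CsvParser()
      .withUseHeader(true)     // treat the first line as column names
      .withInferSchema(true)   // infer column types from the data
    val df: DataFrame = parser.csvRdd(sqlContext, lines)

    df.printSchema()
    df.show()
    sc.stop()
  }
}
```

The point of interest is that csvRdd takes an RDD[String] rather than a path, which is the entry point the message reports as missing after the upgrade.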

The current incarnation of spark-csv seems to expose only a CSV file format,
and the only entry points we can find are for reading files.

What is the modern pattern for converting an already-read RDD of CSV lines
into a DataFrame?

Thanks,
                    Nathan Kronenfeld
                    Uncharted Software
