spark-dev mailing list archives

From "Eskilson,Aleksander" <Alek.Eskil...@Cerner.com>
Subject CSV Support in SparkR
Date Tue, 02 Jun 2015 18:52:47 GMT
Are there any plans to provide first-class support for CSV files as one of the loadable
file types in SparkR? Databricks’ spark-csv API [1] supports SQL, Python, and Java/Scala,
and implements most of the arguments of R’s read.table API [2], but currently there is no
way to load CSV data in SparkR (1.4.0) other than separating the header from the data, loading
the data into an RDD, splitting each line on the delimiter, and then converting the result to a
SparkR DataFrame using a vector of the column names gathered from the header.
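
To make the manual workaround concrete, here is a rough sketch of the steps in plain Python (not SparkR, and not using Spark at all). The function name parse_csv_lines and the sample data are hypothetical, chosen only for illustration; a real SparkR version would perform the same steps on an RDD.

```python
def parse_csv_lines(lines, delimiter=","):
    """Mimic the manual workaround: split off the header, then split each row."""
    # Drop blank lines so a trailing newline does not produce an empty row.
    lines = [ln for ln in lines if ln.strip()]

    # Step 1: separate the header from the data.
    header, data = lines[0], lines[1:]

    # Step 2: collect the column names from the header.
    columns = header.split(delimiter)

    # Step 3: split every data line on the delimiter.
    # Note: a naive split does not handle quoted fields or embedded delimiters.
    rows = [ln.split(delimiter) for ln in data]

    # Step 4: attach column names to each row (the "data frame" step).
    return [dict(zip(columns, row)) for row in rows]


# Hypothetical sample input, standing in for the lines of a CSV file.
sample = ["name,age", "alice,30", "bob,25"]
records = parse_csv_lines(sample)
print(records)
```

Every field comes back as a string, and quoting is not handled, which is the kind of detail a dedicated CSV reader would otherwise take care of.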

Regards,
Alek Eskilson

[1] -- https://github.com/databricks/spark-csv
[2] -- http://www.inside-r.org/r-doc/utils/read.table
