spark-user mailing list archives

From daze5112 <david.zeel...@ato.gov.au>
Subject reading a csv dynamically
Date Thu, 22 Jan 2015 00:25:27 GMT
Hi all, I'm currently reading a CSV file which has the following format:
(String, Double, Double, Double, Double, Double)
and can map it without problems using:

    val dataRDD = sc.textFile("file.csv").
      map(_.split(",")).
      map(a => (Array(a(0)), Array(a(1).toDouble, a(2).toDouble), a(3),
                Array(a(4).toDouble, a(5).toDouble)))

What I would like to handle is that the input file may have a different
number of fields, e.g. an extra double that needs to go into the first
array of doubles:

(String, Double, Double, Double, Double, Double, Double)

which would change my map to:
    val dataRDD = sc.textFile("file.csv").
      map(_.split(",")).
      map(a => (Array(a(0)), Array(a(1).toDouble, a(2).toDouble,
                a(3).toDouble), a(4), Array(a(5).toDouble, a(6).toDouble)))

Is there a way I can make this map more dynamic, i.e. create vals such as:

    val Array_1 = 3
    val Array_2 = 2

and then use these to pick up the values for array 1, which we know should
contain 3 values, and say: give me a(1) through a(3)?
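
Something like the following untested sketch is what I have in mind, using
Array.slice with the vals above (assuming the row layout is always the key
string, then Array_1 doubles, then a single field, then Array_2 doubles):

    val dataRDD = sc.textFile("file.csv").
      map(_.split(",")).
      map { a =>
        // index of the lone field sitting between the two double arrays
        val mid = 1 + Array_1
        (Array(a(0)),
          a.slice(1, 1 + Array_1).map(_.toDouble),            // a(1) .. a(Array_1)
          a(mid),
          a.slice(mid + 1, mid + 1 + Array_2).map(_.toDouble)) // the trailing doubles
      }

With Array_1 = 2 and Array_2 = 2 this reproduces the six-field map above,
and with Array_1 = 3 it matches the seven-field one.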

Thanks in advance.




