spark-user mailing list archives

From Aaron Davidson <ilike...@gmail.com>
Subject Re: Shouldn't the UNION of SchemaRDDs produce SchemaRDD ?
Date Sun, 30 Mar 2014 18:08:06 GMT
Looks like there is a "unionAll" function on SchemaRDD which will do what
you want. The contract of RDD#union is unfortunately too general to allow
it to return a SchemaRDD without downcasting.
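
To illustrate why the general union cannot hand back the more specific type, here is a minimal sketch in plain Scala. It is a toy model, not Spark code: MiniRDD, MiniSchemaRDD, orderByAge and UnionDemo are hypothetical names made up for this example, and Person is redefined locally so the snippet stands alone. The point is only that a method declared on the base class must accept any base-class argument, so its static return type is the base class, while a method declared on the subclass can insist on a schema-carrying argument and return the subclass.

```scala
// Toy model only: MiniRDD and MiniSchemaRDD are hypothetical stand-ins,
// not the real Spark classes.
class MiniRDD[T](val rows: Seq[T]) {
  // Declared on the base class and accepts any MiniRDD[T], so the
  // static return type can only be the base class.
  def union(other: MiniRDD[T]): MiniRDD[T] = new MiniRDD(rows ++ other.rows)
}

final case class Person(name: String, age: Int)

class MiniSchemaRDD(people: Seq[Person]) extends MiniRDD[Person](people) {
  // Schema-aware operation that exists only on the subclass.
  def orderByAge: MiniSchemaRDD = new MiniSchemaRDD(rows.sortBy(_.age))

  // Accepts only another MiniSchemaRDD, so it can safely return one.
  def unionAll(other: MiniSchemaRDD): MiniSchemaRDD =
    new MiniSchemaRDD(rows ++ other.rows)
}

object UnionDemo {
  val younger = new MiniSchemaRDD(Seq(Person("Ann", 9)))
  val older   = new MiniSchemaRDD(Seq(Person("Bob", 30), Person("Cy", 25)))

  // Static type is MiniRDD[Person]; orderByAge is not reachable here
  // without a downcast.
  val viaUnion: MiniRDD[Person] = younger.union(older)

  // Static type is MiniSchemaRDD, so the schema-aware call compiles.
  val viaUnionAll: MiniSchemaRDD = younger.unionAll(older).orderByAge
}
```

Applied to the quoted code, this suggests replacing `youngerThanTeans.union(olderThanTeans)` with `youngerThanTeans.unionAll(olderThanTeans)`, after which orderBy should be available on the result. I have not run that against the poster's setup.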


On Sun, Mar 30, 2014 at 7:56 AM, Manoj Samel <manojsameltech@gmail.com> wrote:

> Hi,
>
> I am trying Spark SQL, based on the example in the docs ...
>
> ....
>
> val people = sc.textFile("/data/spark/examples/src/main/resources/people.txt")
>   .map(_.split(","))
>   .map(p => Person(p(0), p(1).trim.toInt))
>
>
> val olderThanTeans = people.where('age > 19)
> val youngerThanTeans = people.where('age < 13)
> val nonTeans = youngerThanTeans.union(olderThanTeans)
>
> I can call orderBy('age) on the first two (which are SchemaRDDs) but not on
> the third. nonTeans is a UnionRDD, which does not support orderBy. This
> seems different from SQL behavior, where the union of two SQL queries is
> itself a SQL query with the same functionality ...
>
> Not clear why the union of 2 SchemaRDDs does not produce a SchemaRDD ....
>
>
> Thanks,
>
>
>
