spark-user mailing list archives

From Cheng Lian <lian.cs....@gmail.com>
Subject Re: Scala Spark SQL row object Ordinal Method Call Aliasing
Date Tue, 20 Jan 2015 19:49:32 GMT
I once worked on a named row feature but haven’t had time to finish 
it. It looks like this:

|sql("...").named.map { row: NamedRow =>
   row[Int]('key) -> row[String]('value)
}
|

Basically, the |named| method generates a field-name-to-ordinal map for 
each RDD partition. This map is then shared by all |NamedRow| 
instances within a partition. Not exactly what you want, but it might be 
helpful.
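
To make the idea concrete, here is a rough, Spark-free sketch of the 
mechanism described above. Everything in it is hypothetical: |NamedRow|, 
|NamedRowSketch| and the |named| helper are illustrative stand-ins, not 
Spark API, and a row is modelled as a plain |IndexedSeq[Any]| rather 
than a Spark |Row|. It uses |Symbol("key")| in place of the |'key| 
literal so that it also compiles on newer Scala versions.

```scala
// Hypothetical stand-in for the unfinished NamedRow feature; not Spark API.
final class NamedRow(values: IndexedSeq[Any], ordinals: Map[Symbol, Int]) {
  // Look up the ordinal by field name, then cast the value to the requested type.
  // Throws NoSuchElementException for an unknown field name.
  def apply[T](name: Symbol): T = values(ordinals(name)).asInstanceOf[T]
}

object NamedRowSketch {
  // Build the name-to-ordinal map once per partition and share that single
  // map instance with every NamedRow created for the partition.
  def named(fieldNames: Seq[Symbol],
            partition: Iterator[IndexedSeq[Any]]): Iterator[NamedRow] = {
    val ordinals: Map[Symbol, Int] = fieldNames.zipWithIndex.toMap
    partition.map(values => new NamedRow(values, ordinals))
  }
}
```

With a real RDD, something like |mapPartitions| would presumably be the 
place to call such a helper, so that the map is built once per partition 
rather than once per row.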

Cheng

On 1/20/15 3:39 AM, Night Wolf wrote:

> In Spark SQL we have |Row| objects which contain a list of fields that 
> make up a row. A |Row| has ordinal accessors such 
> as |.getInt(0)| or |.getString(2)|.
>
> Say ordinal 0 = ID and ordinal 1 = Name. It becomes hard to remember 
> what ordinal is what, making the code confusing.
>
> Say for example I have the following code
>
> |def doStuff(row: Row) = {
>    // extract some items from the row into a tuple
>    (row.getInt(0), row.getString(1)) // tuple of ID, Name
> }|
>
> The question becomes how could I create aliases for these fields in a 
> Row object?
>
> I was thinking I could create methods which take an implicit Row object:
>
> |def id(implicit row: Row) = row.getInt(0)
> def name(implicit row: Row) = row.getString(1)|
>
> I could then rewrite the above as:
>
> |def doStuff(implicit row: Row) = {
>    // extract some items from the row into a tuple
>    (id, name) // tuple of ID, Name
> }|
>
> Is there a better/neater approach?
>
>
> Cheers,
>
> ~NW
>
​
