spark-user mailing list archives

From Sunita Arvind <sunitarv...@gmail.com>
Subject Re: Scala Spark SQL row object Ordinal Method Call Aliasing
Date Tue, 20 Jan 2015 14:48:50 GMT
The below is not exactly a solution to your question, but this is what we
are doing. We do end up calling row.getString() the first time, but we
immediately pass each row through a map function that aligns it to either a
case class or a StructType. Then we register the result as a table and use
just column names. The Spark SQL wiki has good examples of this. It looks
easier to manage to me than your solution below.
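
A minimal sketch of the flow described above (ordinal access once, then
column names only). It assumes the Spark 2.x+ API (SparkSession, toDF,
createOrReplaceTempView), which postdates this thread; on the 1.x releases
the equivalent entry points were SQLContext and registerTempTable. The
Person case class, the "people" table name, and the sample rows are
hypothetical and only serve as illustration.

```scala
import org.apache.spark.sql.{Row, SparkSession}

// Hypothetical schema: ordinal 0 = ID, ordinal 1 = Name.
final case class Person(id: Int, name: String)

object RowToCaseClass {
  // The only place where ordinals appear.
  def toPerson(row: Row): Person = Person(row.getInt(0), row.getString(1))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("row-to-case-class")
      .getOrCreate()
    import spark.implicits._

    // Stand-in for whatever RDD[Row] the job actually starts from.
    val raw = spark.sparkContext.parallelize(Seq(Row(1, "Alice"), Row(2, "Bob")))

    // Align each Row to the case class, then register the result as a table.
    val people = raw.map(toPerson).toDF()
    people.createOrReplaceTempView("people")

    // From here on, only column names are used.
    val names = spark.sql("SELECT name FROM people WHERE id = 2")
      .collect()
      .map(_.getString(0))
    println(names.mkString(", "))

    spark.stop()
  }
}
```

The ordinal-to-name mapping lives in one function, so adding or reordering
columns means touching toPerson only.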

I agree with you that when there are a lot of columns, calling
row.getString() even once is not convenient.

Regards

Sunita

On Tuesday, January 20, 2015, Night Wolf <nightwolfzor@gmail.com> wrote:

> In Spark SQL we have Row objects which contain a list of fields that make
> up a row. A Row has ordinal accessors such as .getInt(0) or .getString(2).
>
> Say ordinal 0 = ID and ordinal 1 = Name. It becomes hard to remember which
> ordinal is which, making the code confusing.
>
> Say for example I have the following code
>
> def doStuff(row: Row) = {
>   // extract some items from the row into a tuple
>   (row.getInt(0), row.getString(1)) // tuple of ID, Name
> }
>
> The question becomes how could I create aliases for these fields in a Row
> object?
>
> I was thinking I could create methods which take an implicit Row object;
>
> def id(implicit row: Row) = row.getInt(0)
> def name(implicit row: Row) = row.getString(1)
>
> I could then rewrite the above as;
>
> def doStuff(implicit row: Row) = {
>   // extract some items from the row into a tuple
>   (id, name) // tuple of ID, Name
> }
>
> Is there a better/neater approach?
>
>
> Cheers,
>
> ~NW
>
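
For reference, the aliasing idea from the quoted message can be written as
a compilable sketch. It assumes only org.apache.spark.sql.Row on the
classpath. The object names ImplicitParamAliases, RowSyntax, and AliasDemo,
the PersonRow wrapper, and the sample values are hypothetical. The second
object is one possible variant, not something proposed in the thread: an
implicit class lets the aliases read as row.id and row.name without
declaring the Row itself implicit.

```scala
import org.apache.spark.sql.Row

// Variant 1: the approach from the quoted message, with explicit result types.
object ImplicitParamAliases {
  def id(implicit row: Row): Int = row.getInt(0)
  def name(implicit row: Row): String = row.getString(1)

  // The implicit Row is threaded through to id and name.
  def doStuff(implicit row: Row): (Int, String) = (id, name)
}

// Variant 2 (an assumption, not from the thread): an implicit wrapper class.
object RowSyntax {
  implicit class PersonRow(val row: Row) extends AnyVal {
    def id: Int = row.getInt(0)
    def name: String = row.getString(1)
  }
}

object AliasDemo {
  def main(args: Array[String]): Unit = {
    val row = Row(42, "example")

    // An implicit parameter can also be passed explicitly.
    println(ImplicitParamAliases.doStuff(row))

    // With the wrapper in scope, the aliases look like ordinary fields.
    import RowSyntax._
    println((row.id, row.name))
  }
}
```

Both variants still hard-code the ordinals in one place. Neither checks
that the Row actually has the expected layout, so a mismatched row fails
at runtime rather than at compile time.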
