spark-user mailing list archives

From Michael Armbrust <mich...@databricks.com>
Subject Re: How to retrieve the value from sql.row by column name
Date Mon, 16 Feb 2015 18:46:43 GMT
For efficiency, Row objects don't carry the schema, so you can't get a
column by name directly. I usually do a select followed by pattern
matching, something like the following:

caper.select('ran_id).map { case Row(ranId: String) => }
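
To make the idea concrete without a cluster, here is a minimal plain-Scala sketch of the same pattern. It does not use Spark: SimpleRow, RowByName, columnIndex and the sample values are all hypothetical stand-ins invented for illustration; only the column names come from the schema quoted below. It assumes the schema is kept once beside the data, so a name is resolved to a position a single time and rows are then read by position or by pattern matching.

```scala
// Plain-Scala sketch, no Spark dependency. SimpleRow is a hypothetical
// stand-in for a schema-less row; it is NOT org.apache.spark.sql.Row.
final case class SimpleRow(values: Any*)

object RowByName {
  // The schema is stored once, next to the data set, not inside each row.
  val schema: Vector[String] = Vector("ran_id", "age", "APPTSTAT")

  // Made-up sample data, for illustration only.
  val rows: Seq[SimpleRow] = Seq(
    SimpleRow("r-001", 34, "KEPT"),
    SimpleRow("r-002", 51, "CANC")
  )

  // Resolve a column name to its position; fails fast on unknown names.
  def columnIndex(name: String): Int = {
    val i = schema.indexOf(name)
    require(i >= 0, s"unknown column: $name")
    i
  }

  // "select": project every row down to the named columns, in order.
  def select(names: String*): Seq[SimpleRow] = {
    val idx = names.map(columnIndex)
    rows.map(r => SimpleRow(idx.map(r.values): _*))
  }

  // Select one column, then pattern match on the one-field row.
  val ranIds: Seq[String] =
    select("ran_id").map { case SimpleRow(ranId: String) => ranId }

  // Key each full row by ran_id: look the index up once, reuse it per row.
  private val ranIdIdx = columnIndex("ran_id")
  val kv: Seq[(String, SimpleRow)] =
    rows.map(r => (r.values(ranIdIdx).toString, r))
}
```

The second form is closer to the (ran_id, row) pairs asked for in the question: the name lookup happens once outside the per-row function, so each row only pays for a positional access.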

On Mon, Feb 16, 2015 at 8:54 AM, Eric Bell <eric@ericjbell.com> wrote:

> Is it possible to reference a column from a SchemaRDD using the column's
> name instead of its number?
>
> For example, let's say I've created a SchemaRDD from an avro file:
>
> val sqlContext = new SQLContext(sc)
> import sqlContext._
> val caper=sqlContext.avroFile("hdfs://localhost:9000/sma/raw_avro/caper")
> caper.registerTempTable("caper")
>
> scala> caper
> res20: org.apache.spark.sql.SchemaRDD = SchemaRDD[0] at RDD at SchemaRDD.scala:108
> == Query Plan ==
> == Physical Plan ==
> PhysicalRDD [ADMDISP#0,age#1,AMBSURG#2,apptdt_skew#3,APPTSTAT#4,
> APPTTYPE#5,ASSGNDUR#6,CANCSTAT#7,CAPERSTAT#8,COMPLAINT#9,CPT_1#10,
> CPT_10#11,CPT_11#12,CPT_12#13,CPT_13#14,CPT_2#15,CPT_3#16,CPT_4#17,
> CPT_5#18,CPT_6#19,CPT_7#20,CPT_8#21,CPT_9#22,CPTDX_1#23,
> CPTDX_10#24,CPTDX_11#25,CPTDX_12#26,CPTDX_13#27,CPTDX_2#28,
> CPTDX_3#29,CPTDX_4#30,CPTDX_5#31,CPTDX_6#32,CPTDX_7#33,
> CPTDX_8#34,CPTDX_9#35,CPTMOD1_1#36,CPTMOD1_10#37,
> CPTMOD1_11#38,CPTMOD1_12#39,CPTMOD1_13#40,CPTMOD1_2#41,CPTMOD1_3#42,
> CPTMOD1_4#43,CPTMOD1_5#44,CPTMOD1_6#45,CPTMOD1_7#46,
> CPTMOD1_8#47,CPTMOD1_9#48,CPTMOD2_1#49,CPTMOD2_10#50,
> CPTMOD2_11#51,CPTMOD2_12#52,CPTMOD2_13#53,CPTMOD2_2#54,
> CPTMOD2_3#55,CPTMOD2_4#56,CPTMOD...
> scala>
>
> Now I want to access fields, and of course the normal thing to do is to
> use a field name, not a field number.
>
> scala> val kv = caper.map(r => (r.ran_id, r))
> <console>:23: error: value ran_id is not a member of org.apache.spark.sql.Row
>        val kv = caper.map(r => (r.ran_id, r))
>
> How do I do this?
>
