spark-user mailing list archives

From Davies Liu <dav...@databricks.com>
Subject Re: SparkSQL nested dictionaries
Date Mon, 08 Jun 2015 18:10:02 GMT
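I think it works in Python. Values in a map column are looked up with getField or with square brackets, not with attribute access: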

```
>>> df = sqlContext.createDataFrame([(1, {'a': 1})])
>>> df.printSchema()
root
 |-- _1: long (nullable = true)
 |-- _2: map (nullable = true)
 |    |-- key: string
 |    |-- value: long (valueContainsNull = true)

>>> df.select(df._2.getField('a')).show()
+-----+
|_2[a]|
+-----+
|    1|
+-----+

>>> df.select(df._2['a']).show()
+-----+
|_2[a]|
+-----+
|    1|
+-----+
```
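For the nested case in the original question (a map whose values are themselves maps), the bracket lookup can probably be chained, one key per level. Below is a minimal sketch of that idea, not something tested against the original parquet file. The helper name nested_lookup, the demo function, the column names and the sample row are all made up for illustration, and the demo assumes a newer Spark with SparkSession instead of the sqlContext used above.

```python
from functools import reduce
import operator


def nested_lookup(column, *keys):
    """Chain bracket lookups: column[k1][k2]...

    Hypothetical helper. It only relies on __getitem__, so it accepts a
    PySpark Column (producing a nested map lookup expression) as well as
    a plain Python dict.
    """
    return reduce(operator.getitem, keys, column)


def demo():
    """Illustrative only: needs a local Spark installation to run."""
    # Imported here so the helper above stays usable without Spark.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[1]").getOrCreate()

    # Made-up row: "test" is inferred as map<string, map<string, long>>.
    df = spark.createDataFrame([(1, {"key2": {"a": 1}})], ["id", "test"])

    # Same as df.test["key2"]["a"].
    df.select(nested_lookup(df.test, "key2", "a").alias("value")).show()

    # The SQL form uses the same bracket syntax on the map column.
    df.createOrReplaceTempView("t")
    spark.sql("SELECT test['key2']['a'] AS value FROM t").show()

    spark.stop()
```

Calling demo() should show the inner value for key "a" twice, once through the DataFrame API and once through SQL. Attribute access such as df.test.key2 is avoided here because it is what raised the AttributeError in the original message.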

On Mon, Jun 8, 2015 at 6:00 AM, mrm <maria@skimlinks.com> wrote:
> Hi,
>
> Is it possible to query a data structure that is a dictionary within a
> dictionary?
>
> I have a parquet file with a structure:
> test
> |____key1: {key_string: val_int}
> |____key2: {key_string: val_int}
>
> if I try to do:
>  parquetFile.test
>  --> Column<test>
>
>  parquetFile.test.key2
>  --> AttributeError: 'Column' object has no attribute 'key2'
>
> Similarly, if I try to do a SQL query, it throws this error:
>
> org.apache.spark.sql.AnalysisException: GetField is not valid on fields of
> type MapType(StringType,MapType(StringType,IntegerType,true),true);
>
> Is this at all possible with the Python API in Spark SQL?
>
> Thanks,
> Maria
>
>
>
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/SparkSQL-nested-dictionaries-tp23207.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
> For additional commands, e-mail: user-help@spark.apache.org
>
