spark-issues mailing list archives

From "Dongjoon Hyun (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-15031) Use SparkSession in Scala/Python/Java example.
Date Wed, 04 May 2016 17:59:13 GMT

     [ https://issues.apache.org/jira/browse/SPARK-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-15031:
----------------------------------
    Description: 
This PR aims to update the Scala/Python/Java examples by replacing `SQLContext` with the newly
added `SparkSession`.

- Use *SparkSession Builder Pattern* in 154 files (Scala 55, Java 52, Python 47).
- Add `getConf` to the Python `SparkContext` class: python/pyspark/context.py
- Replace *SQLContext Singleton Pattern* with *SparkSession Singleton Pattern*:
  - `SqlNetworkWordCount.scala`
  - `JavaSqlNetworkWordCount.java`
  - `sql_network_wordcount.py`
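
The two patterns named above can be sketched roughly as follows. This is a minimal illustration,
not the code from the PR: the helper names `get_session_instance` and `build_spark_session` and
the default application name are hypothetical, and `build_spark_session` assumes PySpark 2.0 or
later is installed.

```python
# Module-level slot holding the lazily created session (None until first use).
_session_singleton = None


def get_session_instance(factory):
    """Return the process-wide session, calling factory() only on the first request."""
    global _session_singleton
    if _session_singleton is None:
        _session_singleton = factory()
    return _session_singleton


def build_spark_session(app_name="SingletonSketch"):
    """Create (or reuse) a SparkSession through the builder pattern.

    PySpark is imported lazily so the singleton helper above can be used
    and tested without Spark installed. The app name is a placeholder.
    """
    from pyspark.sql import SparkSession

    return SparkSession.builder.appName(app_name).getOrCreate()
```

A streaming job would typically call `get_session_instance(build_spark_session)` inside each
batch callback, so that every batch reuses one session instead of constructing a new one.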

Now, `SQLContext` is used only in the R examples and the following two Python examples. These
Python examples are left untouched in this PR because they already fail due to an unknown issue.
- `simple_params_example.py`
- `aft_survival_regression.py`

  was:
This PR aims to update the Scala/Python/Java examples by replacing `SQLContext` with the newly
added `SparkSession`. For this, two new `SparkSession` constructors are added, and the following
examples are also fixed.

**sql.py**
{code}
-    people = sqlContext.jsonFile(path)
+    people = sqlContext.read.json(path)
-    people.registerAsTable("people")
+    people.registerTempTable("people")
{code}

**dataframe_example.py**
{code}
- features = df.select("features").map(lambda r: r.features)
+ features = df.select("features").rdd.map(lambda r: r.features)
{code}

Note that the following examples are left untouched in this PR because they fail due to an unknown issue.

- `simple_params_example.py`
- `aft_survival_regression.py`


> Use SparkSession in Scala/Python/Java example.
> ----------------------------------------------
>
>                 Key: SPARK-15031
>                 URL: https://issues.apache.org/jira/browse/SPARK-15031
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Examples
>            Reporter: Dongjoon Hyun
>            Assignee: Dongjoon Hyun



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
