spark-dev mailing list archives

From Cheng Lian <lian.cs....@gmail.com>
Subject Re: Spark 2.0 Dataset Documentation
Date Sat, 18 Jun 2016 05:12:06 GMT
Hey Pedro,

The SQL programming guide is being updated. Here's the PR, which is not merged 
yet: https://github.com/apache/spark/pull/13592

Cheng

On 6/17/16 9:13 PM, Pedro Rodriguez wrote:
> Hi All,
>
> At my workplace we are starting to use Datasets in 1.6.1, and even more 
> so with Spark 2.0, in place of DataFrames. I looked at the 1.6.1 
> documentation and then the 2.0 documentation, and it looks like not much 
> time has been spent writing a Dataset guide/tutorial.
>
> Preview Docs: 
> https://home.apache.org/~pwendell/spark-releases/spark-2.0.0-preview-docs/sql-programming-guide.html#creating-datasets
> Spark master docs: 
> https://github.com/apache/spark/blob/master/docs/sql-programming-guide.md
>
> I would like to spend the time to contribute an improvement to those 
> docs with more in-depth examples of creating and using Datasets (e.g. 
> using $ to select columns). Is this of value, and if so, what should my 
> next step be to get this going (create a JIRA, etc.)?
>
> -- 
> Pedro Rodriguez
> PhD Student in Distributed Machine Learning | CU Boulder
> R&D Data Science Intern at Oracle Data Cloud
> UC Berkeley AMPLab Alumni
>
> ski.rodriguez@gmail.com | pedrorodriguez.io | 909-353-4423
> Github: github.com/EntilZha | LinkedIn: 
> https://www.linkedin.com/in/pedrorodriguezscience
>

