spark-user mailing list archives

From Sean Owen <>
Subject Re: Difference between Data set and Data Frame in Spark 2
Date Thu, 01 Sep 2016 14:26:02 GMT
Here's my paraphrase:

Datasets are really the new RDDs. They have a similar nature
(container of strongly-typed objects) but bring some optimizations via
Encoders for common types.
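
To make that concrete, here is a minimal sketch, assuming Spark 2.x on
the classpath and a local master. The `Person` case class and the
sample rows are made up purely for illustration.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Hypothetical record type, used only for this illustration.
case class Person(name: String, age: Int)

object DatasetSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("dataset-sketch")
      .master("local[*]")
      .getOrCreate()

    // Brings implicit Encoders for case classes and common types into scope.
    import spark.implicits._

    val people: Dataset[Person] =
      Seq(Person("Ann", 34), Person("Bob", 19)).toDS()

    // Typed, RDD-style transformations: the lambdas receive Person objects,
    // so field names and types are checked at compile time.
    val adultNames: Dataset[String] =
      people.filter(_.age >= 21).map(_.name)

    adultNames.show()
    spark.stop()
  }
}
```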

DataFrames are different from RDDs and Datasets; they neither replace
them nor are replaced by them. They're fundamentally for tabular data,
not arbitrary objects, and thus support SQL-like operations that only
make sense on tabular data.
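
A rough sketch of the tabular side, again assuming Spark 2.x and a
local master. The column names and rows are invented for the example.
(In the Spark 2 Scala API, `DataFrame` is a type alias for
`Dataset[Row]`, so columns are resolved by name at run time rather than
checked by the compiler.)

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.avg

object DataFrameSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("dataframe-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Made-up tabular data: rows with named columns, not domain objects.
    val df: DataFrame = Seq(
      ("Ann", "NY", 34),
      ("Bob", "NY", 19),
      ("Cy", "SF", 40)
    ).toDF("name", "city", "age")

    // Relational operations expressed against column names.
    df.groupBy("city").agg(avg("age").as("avg_age")).show()

    // The same data can be queried with plain SQL via a temporary view.
    df.createOrReplaceTempView("people")
    spark.sql(
      "SELECT city, COUNT(*) AS n FROM people WHERE age >= 21 GROUP BY city"
    ).show()

    spark.stop()
  }
}
```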

On Thu, Sep 1, 2016 at 3:17 PM, Ashok Kumar
<> wrote:
> Hi,
> What are the practical differences between the new Dataset in Spark 2 and
> the existing DataFrame?
> Has Dataset replaced DataFrame, and what advantages does it have if I use
> Dataset instead of DataFrame?
> Thanks
