spark-dev mailing list archives

From Ryan Blue <rb...@netflix.com.INVALID>
Subject Re: Any reason for not exposing internalCreateDataFrame or isStreaming beyond sql package?
Date Thu, 22 Mar 2018 18:45:16 GMT
Jayesh,

We're working on a new API for building sources, DataSourceV2. That API
allows you to produce UnsafeRow, and we are very likely going to change
that to InternalRow (SPARK-23325). There's an experimental version in the
latest 2.3.0 release if you'd like to try it out.
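
For reference, here is a minimal sketch of what the batch read path looks
like against the 2.3.0 experimental interfaces (ReadSupport,
DataSourceReader, DataReaderFactory, DataReader). The class names below are
made up for illustration, and these interfaces have been evolving between
releases, so treat it as a rough outline rather than a drop-in
implementation:

import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.sources.v2.DataSourceOptions;
import org.apache.spark.sql.sources.v2.DataSourceV2;
import org.apache.spark.sql.sources.v2.ReadSupport;
import org.apache.spark.sql.sources.v2.reader.DataReader;
import org.apache.spark.sql.sources.v2.reader.DataReaderFactory;
import org.apache.spark.sql.sources.v2.reader.DataSourceReader;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

// SimpleSource, SimpleReader, and SimpleFactory are hypothetical names used
// for illustration only; they are not part of Spark or Iceberg.
public class SimpleSource implements DataSourceV2, ReadSupport {

  @Override
  public DataSourceReader createReader(DataSourceOptions options) {
    return new SimpleReader();
  }

  static class SimpleReader implements DataSourceReader {
    @Override
    public StructType readSchema() {
      // a single bigint column named "id"
      return new StructType().add("id", DataTypes.LongType);
    }

    @Override
    public List<DataReaderFactory<Row>> createDataReaderFactories() {
      // one factory per partition; here, a single partition producing ids 0..9
      return Arrays.<DataReaderFactory<Row>>asList(new SimpleFactory(0, 10));
    }
  }

  static class SimpleFactory implements DataReaderFactory<Row> {
    private final long start;
    private final long end;

    SimpleFactory(long start, long end) {
      this.start = start;
      this.end = end;
    }

    @Override
    public DataReader<Row> createDataReader() {
      // runs on the executor; iterates this partition's range
      return new DataReader<Row>() {
        private long current = start - 1;

        @Override
        public boolean next() {
          current += 1;
          return current < end;
        }

        @Override
        public Row get() {
          return RowFactory.create(current);
        }

        @Override
        public void close() {
          // nothing to release in this sketch
        }
      };
    }
  }
}

You would load it by passing the class name to spark.read().format(...).
The streaming mix-ins (MicroBatchReadSupport / ContinuousReadSupport, if I
remember the names right) follow the same shape, and SupportsScanUnsafeRow
lets the reader hand back UnsafeRow directly.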

Here's an example implementation from the Iceberg table format:
https://github.com/Netflix/iceberg/blob/master/spark/src/main/java/com/netflix/iceberg/spark/source/Reader.java

rb

On Thu, Mar 22, 2018 at 7:24 AM, Thakrar, Jayesh <
jthakrar@conversantmedia.com> wrote:

> Because these are not exposed in the usual API, it's not possible (or at
> least difficult) to create custom structured streaming sources.
>
>
>
> Consequently, one has to create streaming sources in packages under
> org.apache.spark.sql.
>
>
>
> Any pointers or info is greatly appreciated.
>



-- 
Ryan Blue
Software Engineer
Netflix
