spark-user mailing list archives

From Peter Liu <peter.p...@gmail.com>
Subject Re: read binary files (for stream reader) / spark 2.3
Date Mon, 09 Sep 2019 14:07:47 GMT
Hello experts,

I have one additional question: how can I read binary files into a stream
reader object? (The goal is to get the data into a Kafka server.)

I looked into the DataStreamReader API (
https://jaceklaskowski.gitbooks.io/spark-structured-streaming/spark-sql-streaming-DataStreamReader.html#option)
and other Google results, but didn't find an option for binary files.

Any help would be very much appreciated!
(Thanks again for Ilya's helpful information below; it works fine on the
sparkContext object.)

Regards,

Peter
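
One possible workaround, sketched under stated assumptions rather than offered as a confirmed answer from the thread: as far as I know, the Spark 2.3 DataStreamReader has no built-in binary-file format (a `binaryFile` source only arrived in Spark 3.0). Since `sparkContext.binaryFiles` already works here, the files can be read as a batch and written to Kafka with the batch Kafka writer, which expects `key` and `value` columns. The function names, input directory, broker address, and topic below are hypothetical placeholders, and the spark-sql-kafka-0-10 package is assumed to be on the classpath.

```python
from typing import Tuple


def to_kafka_record(path: str, content: bytes) -> Tuple[bytearray, bytearray]:
    """Map one (path, bytes) pair from binaryFiles to a Kafka (key, value) pair.

    bytearray is used because PySpark's BinaryType accepts it in every version.
    """
    return bytearray(path.encode("utf-8")), bytearray(content)


def publish_binary_files(input_dir: str, brokers: str, topic: str) -> None:
    """Read every file under input_dir and write one Kafka message per file.

    Hypothetical helper: this is a batch job, not a readStream pipeline.
    """
    # Imported here so the pure helper above can be used without Spark installed.
    from pyspark.sql import SparkSession
    from pyspark.sql.types import BinaryType, StructField, StructType

    spark = SparkSession.builder.appName("binary-files-to-kafka").getOrCreate()

    schema = StructType([
        StructField("key", BinaryType(), True),     # file path, UTF-8 encoded
        StructField("value", BinaryType(), False),  # raw file content
    ])

    # In PySpark, binaryFiles yields (path, content-as-bytes) pairs.
    records = spark.sparkContext.binaryFiles(input_dir).map(
        lambda kv: to_kafka_record(kv[0], kv[1])
    )

    (spark.createDataFrame(records, schema)
        .write
        .format("kafka")
        .option("kafka.bootstrap.servers", brokers)
        .option("topic", topic)
        .save())


# Example call (placeholder values, requires a running Kafka broker):
# publish_binary_files("hdfs:///data/blobs", "localhost:9092", "binary-files")
```

Each file becomes a single Kafka message, so files larger than the broker's message size limit (about 1 MB by default) would need either a raised limit or chunking before the write.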


On Thu, Sep 5, 2019 at 3:09 PM Ilya Matiach <ilmat@microsoft.com> wrote:

> Hi Peter,
>
> You can use the ImageSchema.readImages API in Spark 2.3 for reading images:
>
> https://databricks.com/blog/2018/12/10/introducing-built-in-image-data-source-in-apache-spark-2-4.html
>
>
> https://blogs.technet.microsoft.com/machinelearning/2018/03/05/image-data-support-in-apache-spark/
>
> https://spark.apache.org/docs/2.3.0/api/scala/index.html#org.apache.spark.ml.image.ImageSchema$
>
>
>
> There’s also a Spark package for Spark versions older than 2.3:
>
> https://github.com/Microsoft/spark-images
>
>
>
> Thank you, Ilya
>
> *From:* Peter Liu <peter.pliu@gmail.com>
> *Sent:* Thursday, September 5, 2019 2:13 PM
> *To:* dev <dev@spark.apache.org>; User <user@spark.apache.org>
> *Subject:* Re: read image or binary files / spark 2.3
>
>
>
> Hello experts,
>
>
>
> I have a quick question: which API allows me to read image files or binary
> files (for SparkSession.readStream) from a local/Hadoop file system in
> Spark 2.3?
>
>
>
> I have been browsing the following documentation and googling, but
> didn't find a good example or reference:
>
>
>
> https://spark.apache.org/docs/2.3.0/streaming-programming-guide.html
>
>
> https://spark.apache.org/docs/2.3.0/api/scala/index.html#org.apache.spark.package
>
>
>
> any hint/help would be very much appreciated!
>
>
>
> thanks!
>
>
>
> Peter
>
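
For the image-reading suggestion in the quoted reply, a minimal PySpark sketch, assuming Spark 2.3, where the reader is exposed as `ImageSchema.readImages` in `pyspark.ml.image`. It returns a DataFrame with a single `image` struct column (origin, height, width, nChannels, mode, data). The helper names `pixel_at` and `first_image_summary` are hypothetical, and `pixel_at` assumes the row-wise pixel layout described in the ImageSchema documentation (BGR order for 3-channel images).

```python
from typing import Optional, Tuple


def pixel_at(data: bytes, width: int, n_channels: int,
             row: int, col: int) -> Tuple[int, ...]:
    """Return the channel values of one pixel from row-wise image bytes."""
    start = (row * width + col) * n_channels
    return tuple(data[start:start + n_channels])


def first_image_summary(image_dir: str) -> Optional[tuple]:
    """Load images from image_dir and describe the first one.

    Hypothetical helper; returns None when no image could be decoded.
    """
    # Imported here so pixel_at can be used without Spark installed.
    from pyspark.ml.image import ImageSchema
    from pyspark.sql import SparkSession

    # readImages needs an active session/context.
    SparkSession.builder.appName("read-images").getOrCreate()

    # dropImageFailures=True skips files that cannot be decoded as images.
    images = ImageSchema.readImages(image_dir, dropImageFailures=True)

    row = images.select(
        "image.origin", "image.height", "image.width",
        "image.nChannels", "image.data",
    ).first()
    if row is None:
        return None

    top_left = pixel_at(bytes(row["data"]), row["width"], row["nChannels"], 0, 0)
    return row["origin"], row["height"], row["width"], top_left


# Example call (placeholder path):
# print(first_image_summary("hdfs:///data/images"))
```

On Spark 2.4 and later, the built-in image data source described in the first linked blog post (`spark.read.format("image")`) produces the same `image` column.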
