spark-issues mailing list archives

From "bluejoe (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SPARK-22936) providing HttpStreamSource and HttpStreamSink
Date Wed, 03 Jan 2018 13:32:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-22936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309220#comment-16309220 ]

bluejoe edited comment on SPARK-22936 at 1/3/18 1:31 PM:
---------------------------------------------------------

The latest spark-http-stream artifact has been released to the Maven Central repository and
is free to use in any Java/Scala project.

I would greatly appreciate it if spark-http-stream were listed as a third-party Source/Sink in
the Spark documentation.


was (Author: bluejoe):
The latest spark-http-stream artifact has been released to the Maven Central repository and
is free to use in any Java/Scala project.

I would very much appreciate spark-http-stream being listed as a third-party Source/Sink
in the Spark documentation.

> providing HttpStreamSource and HttpStreamSink
> ---------------------------------------------
>
>                 Key: SPARK-22936
>                 URL: https://issues.apache.org/jira/browse/SPARK-22936
>             Project: Spark
>          Issue Type: New Feature
>          Components: Structured Streaming
>    Affects Versions: 2.1.0
>            Reporter: bluejoe
>
> Hi, in my project I built spark-http-stream, which is now available at https://github.com/bluejoe2008/spark-http-stream.
> I would like to know whether it is useful to others and whether it could be integrated into Spark.
> spark-http-stream transfers Spark structured streams over the HTTP protocol. Unlike TCP streams,
> Kafka streams, and HDFS file streams, HTTP streams often flow across distributed big data centers
> on the Web. This feature is very helpful for building global data processing pipelines across
> different data centers (scientific research institutes, for example) that own separate data
> sets.
> The following code shows how to load messages from an HttpStreamSource:
> ```
> // Build a streaming read of "topic-1" from the HTTP servlet endpoint.
> val lines = spark.readStream.format(classOf[HttpStreamSourceProvider].getName)
> 	.option("httpServletUrl", "http://localhost:8080/xxxx")
> 	.option("topic", "topic-1")
> 	.option("includesTimestamp", "true")
> 	.load()
> ```



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


