spark-user mailing list archives

From Priyank Shrivastava <priy...@asperasoft.com>
Subject [Structured Streaming] Avoiding multiple streaming queries
Date Tue, 13 Feb 2018 01:54:13 GMT
I have a structured streaming query that sinks to Kafka.  The query has
complex aggregation logic.


I would like to sink the output DF of this query to multiple Kafka topics,
each partitioned on a different ‘key’ column.  I don’t want a separate
Kafka sink for each topic, because that would mean running multiple
streaming queries, one per Kafka topic, and each of them would repeat the
complex aggregation.
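
To make the desired output concrete, here is a minimal sketch in plain
Python (not Spark code) of the fan-out I have in mind: every aggregated row
goes to each topic, keyed by that topic's own column.  The topic names, the
column names and the fan_out helper are made up for illustration only.

```python
import json

# Hypothetical mapping: topic name -> column used as the Kafka message key.
TOPIC_KEYS = {
    "by_user": "user_id",
    "by_device": "device_id",
}


def fan_out(row, topic_keys=TOPIC_KEYS):
    """Turn one aggregated row into one (topic, key, value) record per topic."""
    # The value is the same serialized row for every topic; only the key differs.
    value = json.dumps(row, sort_keys=True)
    return [
        (topic, str(row[key_col]), value)
        for topic, key_col in topic_keys.items()
    ]


# Example aggregated row (invented data).
row = {"user_id": "u1", "device_id": "d9", "total": 42}
for record in fan_out(row):
    print(record)
```

In the real job the aggregation would run once and this fan-out would
happen on its output, which is what I am hoping a single streaming query
can do.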


Questions:

1.  Is there a way to output the results of a structured streaming query to
multiple Kafka topics, each with a different key column, without having
to execute multiple streaming queries?


2.  If not, would it be efficient to cascade the queries, so that the first
query does the complex aggregation and writes its output to Kafka, and the
other queries just read that output and write it to their own topics, thus
avoiding doing the complex aggregation again?


Thanks in advance for any help.


Priyank
