spark-dev mailing list archives

From Cody Koeninger <c...@koeninger.org>
Subject Re: [Structured Streaming] Kafka group.id is fixed
Date Mon, 19 Nov 2018 21:07:38 GMT
Anastasios, it looks like you have already identified the two lines that
need to change: the string interpolation that depends on
UUID.randomUUID and metadataPath.hashCode.

I'd factor that out into a function that returns the group id.  That
function would also need to take the "parameters" variable (the map of
user-provided options) and look for a prefix for the group id,
defaulting to the current behavior.
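
Roughly, something like the sketch below. This is only an illustration,
not the actual patch: the option key "groupIdPrefix", the object name
GroupIdSketch, and the exact signature are all assumptions made up for
the example. Only the default prefix and the UUID/hashCode suffix come
from the existing code.

```scala
import java.util.UUID

object GroupIdSketch {
  // Hypothetical option key (an assumption; the thread does not name one).
  val GroupIdPrefixOption = "groupIdPrefix"

  // Prefix used by the current hard-coded behavior.
  val DefaultPrefix = "spark-kafka-source"

  // Builds the consumer group id from the user-provided options,
  // falling back to the current prefix when no prefix option is set.
  def uniqueGroupId(parameters: Map[String, String], metadataPath: String): String = {
    val prefix = parameters
      .collectFirst { case (k, v) if k.equalsIgnoreCase(GroupIdPrefixOption) => v }
      .getOrElse(DefaultPrefix)
    // Same unique suffix as today: random UUID plus the metadata path hash.
    s"$prefix-${UUID.randomUUID}-${metadataPath.hashCode}"
  }
}
```

With no prefix option in the map this produces the same
spark-kafka-source-* ids as today, so existing jobs would be unaffected.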

If you have questions, feel free to ping me on the jira, or get as far
as you can and submit a PR for more discussion.

On Mon, Nov 19, 2018 at 2:38 PM Anastasios Zouzias <zouzias@gmail.com> wrote:
>
> Hi Tom,
>
> I initiated an issue here: https://issues.apache.org/jira/browse/SPARK-26121
>
> Feel free to edit/update the ticket. If someone familiar with the codebase has any suggestions
> on the proper way of fixing this, I could work on it.
>
> Best,
> Anastasios
>
> On Mon, Nov 19, 2018 at 4:31 PM Tom Graves <tgraves_cs@yahoo.com> wrote:
>>
>> This makes sense to me; I was going to propose something similar in order to be
>> able to use the Kafka ACLs more effectively as well. Can you file a JIRA for it?
>>
>> Tom
>>
>> On Friday, November 9, 2018, 2:26:12 AM CST, Anastasios Zouzias <zouzias@gmail.com> wrote:
>>
>>
>> Hi all,
>>
>> I ran into the following situation with Spark Structured Streaming (SS) using Kafka.
>>
>> In a project that I work on, there is already a secured Kafka setup where ops can
>> issue an SSL certificate per "group.id", which must be predefined (or, hopefully, only
>> its prefix must be predefined).
>>
>> On the other hand, Spark SS fixes the group.id to
>>
>> val uniqueGroupId = s"spark-kafka-source-${UUID.randomUUID}-${metadataPath.hashCode}"
>>
>> see, e.g.,
>>
>> https://github.com/apache/spark/blob/v2.4.0/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala#L124
>>
>> I guess the Spark developers had a good reason to fix it, but is it possible to make
>> the prefix of the above uniqueGroupId ("spark-kafka-source") configurable? If so, I could
>> prepare a PR for it.
>>
>> The rationale is that we do not want all Spark jobs to use the same certificate for
>> group ids of the form spark-kafka-source-*.
>>
>>
>> Best regards,
>> Anastasios Zouzias
>
>
>
> --
> -- Anastasios Zouzias
