spark-dev mailing list archives

From Jungtaek Lim <kabh...@gmail.com>
Subject Design review of SPARK-28594
Date Sun, 25 Aug 2019 22:34:22 GMT
Hi devs,

I have been working on the design of SPARK-28594 [1] (though I originally
came to it through different requests), and the design doc is now
available [2].

Let me describe SPARK-28594 briefly: a single, ever-growing event log file
per application has been a major issue for streaming applications, because
the event log just keeps growing for as long as the application is running,
and lots of issues arise from that. The only viable workaround has been
disabling the event log, which is not easily acceptable. Stopping the
application and rerunning it would be another approach, but it sounds
really odd to stop the application because of the event log. SPARK-28594
makes it possible to roll the event log files and compact old ones without
losing the ability to replay the whole log.

While I'll break the issue down into subtasks and start with the easier
ones, in parallel I'd like to ask for a review of the design, to get better
ideas and find possible defects in it.

Please note that the doc is intended to describe the detailed changes
(closer to implementation details) and is not an SPIP, because I don't feel
this improvement needs to go through the SPIP process: the change would not
be huge, and the proposal is orthogonal to the current feature. Please let
me know if that's not the case and the SPIP process is necessary.

Thanks,
Jungtaek Lim (HeartSaVioR)

1. https://issues.apache.org/jira/browse/SPARK-28594
2. https://docs.google.com/document/d/12bdCC4nA58uveRxpeo8k7kGOI2NRTXmXyBOweSi4YcY/edit?usp=sharing
