I am able to reproduce the same behavior as STORM-2329 using a single-worker HDFS writer with clock skew. This isn't the same mechanism my users see (Mechanism level: Failed to find any Kerberos tgt, versus Mechanism level: Clock skew too great (37) - PROCESS_TGS in my simulation), but it allows me to correct the top-level IOException effectively. The repro is on Storm 1.0.x, with storm-hdfs built from master about a month ago.
to its own try/catch and then fail tuples/throw a runtime exception in order to force a complete initialization of the bolt? Users who hit this today restart their topology and have immediate success. I believe we should only see an IOException when the writer is null and we go to HDFS for a new one, so we won't be restarting the bolt unnecessarily.
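
To make the proposal concrete, here is a rough sketch of the control flow I have in mind. It is not the storm-hdfs code: Writer, WriterFactory and SketchBolt are hypothetical stand-ins, and tuples are plain strings so the example runs without Storm or HDFS. The handling of an ordinary write failure (fail the tuple, keep the bolt) is also just an assumption for illustration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class WriterInitSketch {
    // Hypothetical stand-ins for the HDFS writer and the code that creates it.
    interface Writer { void write(String tuple) throws IOException; }
    interface WriterFactory { Writer create() throws IOException; }

    static class SketchBolt {
        private final WriterFactory factory;
        private Writer writer;
        final List<String> acked = new ArrayList<>();
        final List<String> failed = new ArrayList<>();

        SketchBolt(WriterFactory factory) { this.factory = factory; }

        void execute(String tuple) {
            if (writer == null) {
                // Writer creation gets its own try/catch: an IOException here
                // means we could not get a new writer (e.g. a Kerberos failure).
                try {
                    writer = factory.create();
                } catch (IOException e) {
                    failed.add(tuple);
                    // Rethrow unchecked so the bolt is fully reinitialized.
                    throw new RuntimeException("Could not obtain a writer", e);
                }
            }
            try {
                writer.write(tuple);
                acked.add(tuple);
            } catch (IOException e) {
                // Assumption: a plain write failure only fails the tuple.
                failed.add(tuple);
            }
        }
    }

    public static void main(String[] args) {
        // Simulate writer creation failing the way my clock-skew repro does.
        SketchBolt bolt = new SketchBolt(() -> {
            throw new IOException("Clock skew too great (37) - PROCESS_TGS");
        });
        try {
            bolt.execute("t1");
        } catch (RuntimeException e) {
            System.out.println("bolt would restart: " + e.getCause().getMessage());
        }
        System.out.println("failed tuples: " + bolt.failed);
    }
}
```

The point of the split is that only the writer-creation path escalates to a runtime exception, so an existing, healthy writer never triggers a bolt restart.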