incubator-s4-user mailing list archives

From Matthieu Morel <mmo...@apache.org>
Subject Re: checkpoint problem
Date Tue, 26 Mar 2013 08:39:43 GMT
Issues are reported through the JIRA bug tracking system here: https://issues.apache.org/jira/browse/S4

You have to create an account (no special permissions needed, if I remember correctly), then file
a ticket for the S4 project.

That's a great way to start contributing to the project!

Thanks,

Matthieu 

On Mar 26, 2013, at 09:28 , Dingyu Yang wrote:

> Yes, when execution reaches futureSerializedState.get(1000, TimeUnit.MILLISECONDS), I get the error mentioned previously.
> My program sets the checkpointing frequency as follows:
>                 wordSumPE.setCheckpointingConfig(new CheckpointingConfig.Builder(CheckpointingMode.TIME).frequency(20).timeUnit(TimeUnit.SECONDS).build());
> 
> The adapter sends very little: just a few test words (10 words).
> So I think the problem is in futureSerializedState.
> I am not familiar with the JIRA system. Also, can I become a contributor to S4?
> Thank you !
> dingyu
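
A note on the futureSerializedState.get(1000, TimeUnit.MILLISECONDS) call mentioned above: Future.get rethrows any exception raised inside the submitted task, wrapped in an ExecutionException, so a failure reported at that line may originate in the task rather than in the future itself. Below is a minimal sketch using only java.util.concurrent. The names FutureGetDemo, failingSerializeTask and causeOfFailure are hypothetical stand-ins for illustration and are not S4 code.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class FutureGetDemo {

    // Hypothetical stand-in for a serialization task that fails while running.
    static Callable<byte[]> failingSerializeTask() {
        return () -> {
            throw new IllegalStateException("serialization failed");
        };
    }

    // Runs the task on a worker thread and returns the exception it threw,
    // or null if the task completed normally within the timeout.
    static Throwable causeOfFailure(Callable<byte[]> task) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<byte[]> future = pool.submit(task);
            future.get(1000, TimeUnit.MILLISECONDS);
            return null;
        } catch (ExecutionException e) {
            // The task's own exception is available as the cause.
            return e.getCause();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (TimeoutException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        Throwable cause = causeOfFailure(failingSerializeTask());
        System.out.println("Task failed with: " + cause);
    }
}
```

Read this way, the ExecutionException in the reported stack trace points at its "Caused by" sections for the underlying failure.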
> 
> 
> 2013/3/26 Matthieu Morel <mmorel@apache.org>
> Thanks for the feedback.
> 
> When you write "cannot pass", what do you mean? Is the exception that you reported logged while the program continues? Or something else?
> 
> Besides, the standard tests that we run for the release pass and show that checkpointing works. The problem might be related to the speed of checkpointing and of sending events. Note that it might not be necessary to checkpoint for every single event: checkpointing every n events (n relatively small) and losing at worst n-1 events per PE in case of failure might be OK.
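
The "checkpoint every n events" idea above can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the S4 implementation: EveryNCheckpointer and its methods are hypothetical names, and the "checkpoint" is just an in-memory copy of a word-count map.

```java
import java.util.HashMap;
import java.util.Map;

class EveryNCheckpointer {
    private final int n;                       // checkpoint interval, in events
    private int eventsSinceCheckpoint = 0;     // events not yet covered by a checkpoint
    private final Map<String, Integer> liveCounts = new HashMap<>();
    private Map<String, Integer> lastCheckpoint = new HashMap<>();

    EveryNCheckpointer(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        this.n = n;
    }

    // Updates the live state and takes a checkpoint on every n-th event.
    void onEvent(String word) {
        liveCounts.merge(word, 1, Integer::sum);
        eventsSinceCheckpoint++;
        if (eventsSinceCheckpoint == n) {
            lastCheckpoint = new HashMap<>(liveCounts);
            eventsSinceCheckpoint = 0;
        }
    }

    // Number of events that would be lost if a failure happened right now.
    // This value is always between 0 and n-1.
    int eventsLostOnFailure() {
        return eventsSinceCheckpoint;
    }

    // State that would be restored after a failure.
    Map<String, Integer> recover() {
        return new HashMap<>(lastCheckpoint);
    }

    public static void main(String[] args) {
        EveryNCheckpointer pe = new EveryNCheckpointer(3);
        for (int i = 0; i < 10; i++) {
            pe.onEvent("word");
        }
        System.out.println("lost on failure: " + pe.eventsLostOnFailure());
        System.out.println("recovered state: " + pe.recover());
    }
}
```

With n = 3 and 10 events, checkpoints are taken after events 3, 6 and 9, so a failure at that point would lose only the tenth event.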
> 
> It would be good to know under which conditions exactly you encounter the issue, i.e. the frequency of checkpointing and the frequency of events sent/received. Filing a bug on our JIRA system would be the best way to follow that conversation.
> 
> Thanks and regards,
> 
> Matthieu 
> 
> 
> 
> 
> On Mar 26, 2013, at 08:56 , Dingyu Yang wrote:
> 
>> Hi Matthieu,
>> I debugged the program and still have this problem.
>> I found the problem while debugging at SaveStateTask.run, in futureSerializedState.get(1000, TimeUnit.MILLISECONDS).
>> It cannot pass here. I don't know what the problem is, even though I have just one PE instance. Is the problem in my program or in S4?
>> Are you able to checkpoint?
>> 
>> Waiting for your answer!
>> 
>> 
>> 2013/3/26 Matthieu Morel <mmorel@apache.org>
>> This looks like a bug, from a race condition in the serializer.
>> 
>> Can you file a bug? Also, are you able to reproduce it systematically?
>> 
>> Thanks,
>> 
>> Matthieu
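
To illustrate the kind of failure suspected above: a ConcurrentModificationException is thrown when a collection is structurally modified while an iterator is walking it. In the reported trace the iteration happens inside Kryo's CollectionSerializer, and the innermost field listed in the serialization trace is the class loader's "classes" collection, which may well change while the application runs. The sketch below reproduces the mechanism deterministically in a single thread (the in-loop add stands in for a second thread) and shows one common mitigation, iterating over a snapshot. ComodificationDemo and its methods are hypothetical names, and this is neither S4 nor Kryo code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

class ComodificationDemo {

    // Walks a list the way a collection serializer would, while the list is
    // modified mid-walk. Returns true if the iterator detected the change.
    static boolean failsWhenModifiedDuringIteration() {
        List<String> workers = new ArrayList<>(Arrays.asList("w1", "w2", "w3"));
        try {
            for (String w : workers) {
                if (w.equals("w1")) {
                    workers.add("w4"); // stands in for a concurrent writer
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Iterates over a copy instead, so changes to the original list do not
    // affect the walk. Returns the number of elements visited.
    static int iterateSnapshot() {
        List<String> workers = new ArrayList<>(Arrays.asList("w1", "w2", "w3"));
        int visited = 0;
        for (String w : new ArrayList<>(workers)) {
            if (w.equals("w1")) {
                workers.add("w4");
            }
            visited++;
        }
        return visited;
    }

    public static void main(String[] args) {
        System.out.println("direct iteration failed: " + failsWhenModifiedDuringIteration());
        System.out.println("snapshot elements visited: " + iterateSnapshot());
    }
}
```

With real threads the copy itself would also need to be taken safely (for example under a lock), which this single-threaded sketch does not cover.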
>> 
>> On Mar 23, 2013, at 07:33 , Dingyu Yang wrote:
>> 
>> > Hi all,
>> > I ran a checkpointing example and ran into a problem.
>> > The version is S4 0.6 RC3.
>> > ./s4 deploy -a=example.wordcountApp -c=testCluster1 -appName=wordApp -p=s4.checkpointing.filesystem.storageRootPath=/home/tmp/s4checkpoint -emc=org.apache.s4.core.ft.FileSystemBackendCheckpointingModule
>> >
>> > Then I get this error:
>> > 14:21:50.251 [Checkpointing-storage-0] WARN  org.apache.s4.core.ft.SaveStateTask - Cannot save checkpoint : [PROTO_ID];[KEY] --> [example.WordSumPE];[./s4]
>> > java.util.concurrent.ExecutionException: com.esotericsoftware.kryo.KryoException: java.util.ConcurrentModificationException
>> > Serialization trace:
>> > classes (sun.misc.Launcher$AppClassLoader)
>> > contextClassLoader (java.lang.Thread)
>> > thread (java.util.concurrent.ThreadPoolExecutor$Worker)
>> > workers (java.util.concurrent.ThreadPoolExecutor)
>> > fetchingThreadPool (org.apache.s4.core.ft.SafeKeeper)
>> > checkpointingFramework (example.wordcountApp)
>> > app (org.apache.s4.core.Stream)
>> > downStream (example.WordSumPE)
>> >     at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:232) ~[na:1.6.0_22]
>> >     at java.util.concurrent.FutureTask.get(FutureTask.java:91) ~[na:1.6.0_22]
>> >     at org.apache.s4.core.ft.SaveStateTask.run(SaveStateTask.java:66) ~[bin/:na]
>> >     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [na:1.6.0_22]
>> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [na:1.6.0_22]
>> >     at java.lang.Thread.run(Thread.java:662) [na:1.6.0_22]
>> > Caused by: com.esotericsoftware.kryo.KryoException: java.util.ConcurrentModificationException
>> > Serialization trace:
>> > classes (sun.misc.Launcher$AppClassLoader)
>> > contextClassLoader (java.lang.Thread)
>> > thread (java.util.concurrent.ThreadPoolExecutor$Worker)
>> > workers (java.util.concurrent.ThreadPoolExecutor)
>> > fetchingThreadPool (org.apache.s4.core.ft.SafeKeeper)
>> > checkpointingFramework (example.wordcountApp)
>> > app (org.apache.s4.core.Stream)
>> > downStream (example.WordSumPE)
>> >     at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.write(FieldSerializer.java:585) ~[kryo-2.20.jar:na]
>> >     at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:213) ~[kryo-2.20.jar:na]
>> >     at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:504) ~[kryo-2.20.jar:na]
>> >     at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.write(FieldSerializer.java:564) ~[kryo-2.20.jar:na]
>> >     at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:213) ~[kryo-2.20.jar:na]
>> >     at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:504) ~[kryo-2.20.jar:na]
>> >     at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.write(FieldSerializer.java:564) ~[kryo-2.20.jar:na]
>> >     at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:213) ~[kryo-2.20.jar:na]
>> >     at com.esotericsoftware.kryo.Kryo.writeObjectOrNull(Kryo.java:552) ~[kryo-2.20.jar:na]
>> >     at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:68) ~[kryo-2.20.jar:na]
>> >     at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:18) ~[kryo-2.20.jar:na]
>> >     at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:504) ~[kryo-2.20.jar:na]
>> >     at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.write(FieldSerializer.java:564) ~[kryo-2.20.jar:na]
>> >     at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:213) ~[kryo-2.20.jar:na]
>> >     at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:504) ~[kryo-2.20.jar:na]
>> >     at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.write(FieldSerializer.java:564) ~[kryo-2.20.jar:na]
>> >     at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:213) ~[kryo-2.20.jar:na]
>> >     at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:504) ~[kryo-2.20.jar:na]
>> >     at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.write(FieldSerializer.java:564) ~[kryo-2.20.jar:na]
>> >     at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:213) ~[kryo-2.20.jar:na]
>> >     at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:504) ~[kryo-2.20.jar:na]
>> >     at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.write(FieldSerializer.java:564) ~[kryo-2.20.jar:na]
>> >     at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:213) ~[kryo-2.20.jar:na]
>> >     at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:504) ~[kryo-2.20.jar:na]
>> >     at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.write(FieldSerializer.java:564) ~[kryo-2.20.jar:na]
>> >     at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:213) ~[kryo-2.20.jar:na]
>> >     at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:571) ~[kryo-2.20.jar:na]
>> >     at org.apache.s4.comm.serialize.KryoSerDeser.serialize(KryoSerDeser.java:91) ~[bin/:na]
>> >     at org.apache.s4.core.ProcessingElement.serializeState(ProcessingElement.java:802) ~[bin/:na]
>> >     at org.apache.s4.core.ft.SerializeTask.call(SerializeTask.java:42) ~[bin/:na]
>> >     at org.apache.s4.core.ft.SerializeTask.call(SerializeTask.java:1) ~[bin/:na]
>> >     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) ~[na:1.6.0_22]
>> >     at java.util.concurrent.FutureTask.run(FutureTask.java:138) ~[na:1.6.0_22]
>> >     ... 3 common frames omitted
>> > Caused by: java.util.ConcurrentModificationException: null
>> >     at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372) ~[na:1.6.0_22]
>> >     at java.util.AbstractList$Itr.next(AbstractList.java:343) ~[na:1.6.0_22]
>> >     at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:74) ~[kryo-2.20.jar:na]
>> >     at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:18) ~[kryo-2.20.jar:na]
>> >     at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:504) ~[kryo-2.20.jar:na]
>> >     at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.write(FieldSerializer.java:564) ~[kryo-2.20.jar:na]
>> >     ... 35 common frames omitted
>> >
>> >
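
The serialization trace above starts at the downStream field of example.WordSumPE and follows references through the stream and the application down to a thread pool and a class loader. One possible way to keep such wiring out of a checkpoint, which is not confirmed anywhere in this thread, is to mark fields that are not part of the PE state as transient. The sketch below shows the effect using standard Java serialization only. It assumes the serializer in use also skips transient fields, which is Kryo's FieldSerializer default as far as I know. TransientFieldDemo and WordSum are hypothetical names, not the poster's classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientFieldDemo {

    // Hypothetical stand-in for a PE: count is state, downStream is wiring.
    static class WordSum implements Serializable {
        private static final long serialVersionUID = 1L;
        int count;
        // Not serializable, but skipped during serialization because it is transient.
        transient Object downStream;
    }

    // Serializes the object to bytes and reads it back.
    static WordSum roundTrip(WordSum pe) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(pe);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (WordSum) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        WordSum pe = new WordSum();
        pe.count = 5;
        pe.downStream = new Object(); // would fail to serialize if not transient
        WordSum restored = roundTrip(pe);
        System.out.println("count=" + restored.count + ", downStream=" + restored.downStream);
    }
}
```

After recovery the transient field is null, so it would have to be re-wired by whatever code restores the PE.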
>> 
>> 
> 
> 

