metron-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Casey Stella <ceste...@gmail.com>
Subject Re: Build failing
Date Mon, 23 Jan 2017 21:16:31 GMT
One more thing, just for posterity here, it always freezes at 6 records
written to HDFS.  That's the reason I thought it was a flushing issue.

On Mon, Jan 23, 2017 at 3:38 PM, Casey Stella <cestella@gmail.com> wrote:

> Ok, so now I'm concerned that this isn't a fluke.  Here's an excerpt from
> the failing logs on travis for my PR with substantially longer timeouts (
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/194575474/log.txt)
>
> Running org.apache.metron.solr.integration.SolrIndexingIntegrationTest
> 0 vs 10 vs 0
> Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-1485200689038.json
> 10 vs 10 vs 6
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 317.056 sec <<<
FAILURE!
> test(org.apache.metron.solr.integration.SolrIndexingIntegrationTest)  Time elapsed: 316.949
sec  <<< ERROR!
> java.lang.RuntimeException: Took too long to complete: 300783 > 300000
> 	at org.apache.metron.integration.ComponentRunner.process(ComponentRunner.java:131)
> 	at org.apache.metron.indexing.integration.IndexingIntegrationTest.test(IndexingIntegrationTest.java:173)
> 	
>
> I'm getting the impression that this isn't the timeout and we have a mystery on our hands.
 Each of those lines "10 vs 10 vs 6" happen 15 seconds apart.  That line means that it read
10 entries from kafka, 10 entries from the indexed data and 6 entries from HDFS.  It's that
6 entries that is the problem.   Also of note, this does not seem to happen to me locally
AND it's not consistent on Travis.  Given all that I'd say that it's a problem with the HDFS
Writer not getting flushed, but I verified that it is indeed flushed per message.
>
>
> Anyway, tl;dr we have a mystery unit test bug that isn't deterministic wrt the unit tests
and may or may not manifest itself outside of the unit tests.  So, yeah, I'll be looking at
it, but would appreciate others taking a gander too.
>
>
> Casey
>
>
> On Mon, Jan 23, 2017 at 2:09 PM, Casey Stella <cestella@gmail.com> wrote:
>
>> Yeah, I adjusted the timeout on the indexing integration tests as part of
>> https://github.com/apache/incubator-metron/pull/420 which I'll merge in
>> today.
>>
>> On Mon, Jan 23, 2017 at 2:01 PM, Zeolla@GMail.com <zeolla@gmail.com>
>> wrote:
>>
>>> Okay, now we have back to back failures, and it looks like it may have
>>> been
>>> a timeout issue?
>>>  `test(org.apache.metron.solr.integration.SolrIndexingIntegrationTest):
>>> Took too long to complete: 150582 > 150000`, more details below:
>>>
>>> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed:
>>> 166.167 sec <<< FAILURE!
>>>
>>> test(org.apache.metron.solr.integration.SolrIndexingIntegrationTest)
>>> Time elapsed: 166.071 sec  <<< ERROR!
>>>
>>> java.lang.RuntimeException: Took too long to complete: 150582 > 150000
>>>
>>>         at org.apache.metron.integration.ComponentRunner.process(Compon
>>> entRunner.java:131)
>>>
>>>         at org.apache.metron.indexing.integration.IndexingIntegrationTe
>>> st.test(IndexingIntegrationTest.java:173)
>>>
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAcce
>>> ssorImpl.java:62)
>>>
>>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
>>> thodAccessorImpl.java:43)
>>>
>>>         at java.lang.reflect.Method.invoke(Method.java:483)
>>>
>>>         at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(
>>> FrameworkMethod.java:50)
>>>
>>>         at org.junit.internal.runners.model.ReflectiveCallable.run(Refl
>>> ectiveCallable.java:12)
>>>
>>>         at org.junit.runners.model.FrameworkMethod.invokeExplosively(Fr
>>> ameworkMethod.java:47)
>>>
>>>         at org.junit.internal.runners.statements.InvokeMethod.evaluate(
>>> InvokeMethod.java:17)
>>>
>>>         at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>>>
>>>         at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit
>>> 4ClassRunner.java:78)
>>>
>>>         at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit
>>> 4ClassRunner.java:57)
>>>
>>>         at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>>>
>>>         at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:
>>> 71)
>>>
>>>         at org.junit.runners.ParentRunner.runChildren(ParentRunner.java
>>> :288)
>>>
>>>         at org.junit.runners.ParentRunner.access$000(ParentRunner.java:
>>> 58)
>>>
>>>         at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:
>>> 268)
>>>
>>>         at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>>>
>>>         at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUni
>>> t4Provider.java:252)
>>>
>>>         at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestS
>>> et(JUnit4Provider.java:141)
>>>
>>>         at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit
>>> 4Provider.java:112)
>>>
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAcce
>>> ssorImpl.java:62)
>>>
>>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
>>> thodAccessorImpl.java:43)
>>>
>>>         at java.lang.reflect.Method.invoke(Method.java:483)
>>>
>>>         at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodW
>>> ithArray(ReflectionUtils.java:189)
>>>
>>>         at org.apache.maven.surefire.booter.ProviderFactory$ProviderPro
>>> xy.invoke(ProviderFactory.java:165)
>>>
>>>         at org.apache.maven.surefire.booter.ProviderFactory.invokeProvi
>>> der(ProviderFactory.java:85)
>>>
>>>         at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInPro
>>> cess(ForkedBooter.java:115)
>>>
>>>         at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBoo
>>> ter.java:75)
>>>
>>>
>>> Jon
>>>
>>> On Thu, Jan 19, 2017 at 2:49 PM Zeolla@GMail.com <zeolla@gmail.com>
>>> wrote:
>>>
>>> > The build has been showing as failing
>>> > <https://github.com/apache/incubator-metron> for a little while now.
>>> I
>>> > know we recently updated the language around Merge Requirements
>>> > <https://cwiki.apache.org/confluence/pages/viewpage.action?p
>>> ageId=61332235>,
>>> > but if I recall properly our current issue is simply a Travis CI
>>> overload
>>> > issue.  Is there a way we can update the wiki doc to account for
>>> situations
>>> > like this?
>>> >
>>> > Jon
>>> > --
>>> >
>>> > Jon
>>> >
>>> > Sent from my mobile device
>>> >
>>> --
>>>
>>> Jon
>>>
>>> Sent from my mobile device
>>>
>>
>>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message