metron-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Justin Leet <justinjl...@gmail.com>
Subject Re: Build failing
Date Tue, 24 Jan 2017 12:57:19 GMT
I definitely agree that this isn't a fluke.

Do we have a Jira for this?  If not, I can create one and I would like to
propose that part of that ticket is adding logging.  Right now, I'm
concerned we don't have enough info from the Travis builds to be able to
(easily) debug failure or reproduce locally.

Justin

On Mon, Jan 23, 2017 at 4:16 PM, Casey Stella <cestella@gmail.com> wrote:

> One more thing, just for posterity here, it always freezes at 6 records
> written to HDFS.  That's the reason I thought it was a flushing issue.
>
> On Mon, Jan 23, 2017 at 3:38 PM, Casey Stella <cestella@gmail.com> wrote:
>
> > Ok, so now I'm concerned that this isn't a fluke.  Here's an excerpt from
> > the failing logs on travis for my PR with substantially longer timeouts (
> > https://s3.amazonaws.com/archive.travis-ci.org/jobs/194575474/log.txt)
> >
> > Running org.apache.metron.solr.integration.SolrIndexingIntegrationTest
> > 0 vs 10 vs 0
> > Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-
> 1485200689038.json
> > 10 vs 10 vs 6
> > Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-
> 1485200689038.json
> > 10 vs 10 vs 6
> > Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-
> 1485200689038.json
> > 10 vs 10 vs 6
> > Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-
> 1485200689038.json
> > 10 vs 10 vs 6
> > Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-
> 1485200689038.json
> > 10 vs 10 vs 6
> > Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-
> 1485200689038.json
> > 10 vs 10 vs 6
> > Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-
> 1485200689038.json
> > 10 vs 10 vs 6
> > Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-
> 1485200689038.json
> > 10 vs 10 vs 6
> > Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-
> 1485200689038.json
> > 10 vs 10 vs 6
> > Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-
> 1485200689038.json
> > 10 vs 10 vs 6
> > Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-
> 1485200689038.json
> > 10 vs 10 vs 6
> > Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-
> 1485200689038.json
> > 10 vs 10 vs 6
> > Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-
> 1485200689038.json
> > 10 vs 10 vs 6
> > Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-
> 1485200689038.json
> > 10 vs 10 vs 6
> > Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-
> 1485200689038.json
> > 10 vs 10 vs 6
> > Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-
> 1485200689038.json
> > 10 vs 10 vs 6
> > Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-
> 1485200689038.json
> > 10 vs 10 vs 6
> > Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-
> 1485200689038.json
> > 10 vs 10 vs 6
> > Processed target/indexingIntegrationTest/hdfs/test/enrichment-null-0-0-
> 1485200689038.json
> > 10 vs 10 vs 6
> > Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 317.056
> sec <<< FAILURE!
> > test(org.apache.metron.solr.integration.SolrIndexingIntegrationTest)
> Time elapsed: 316.949 sec  <<< ERROR!
> > java.lang.RuntimeException: Took too long to complete: 300783 > 300000
> >       at org.apache.metron.integration.ComponentRunner.process(
> ComponentRunner.java:131)
> >       at org.apache.metron.indexing.integration.
> IndexingIntegrationTest.test(IndexingIntegrationTest.java:173)
> >
> >
> > I'm getting the impression that this isn't the timeout and we have a
> mystery on our hands.  Each of those lines "10 vs 10 vs 6" happen 15
> seconds apart.  That line means that it read 10 entries from kafka, 10
> entries from the indexed data and 6 entries from HDFS.  It's that 6 entries
> that is the problem.   Also of note, this does not seem to happen to me
> locally AND it's not consistent on Travis.  Given all that I'd say that
> it's a problem with the HDFS Writer not getting flushed, but I verified
> that it is indeed flushed per message.
> >
> >
> > Anyway, tl;dr we have a mystery unit test bug that isn't deterministic
> wrt the unit tests and may or may not manifest itself outside of the unit
> tests.  So, yeah, I'll be looking at it, but would appreciate others taking
> a gander too.
> >
> >
> > Casey
> >
> >
> > On Mon, Jan 23, 2017 at 2:09 PM, Casey Stella <cestella@gmail.com>
> wrote:
> >
> >> Yeah, I adjusted the timeout on the indexing integration tests as part
> of
> >> https://github.com/apache/incubator-metron/pull/420 which I'll merge in
> >> today.
> >>
> >> On Mon, Jan 23, 2017 at 2:01 PM, Zeolla@GMail.com <zeolla@gmail.com>
> >> wrote:
> >>
> >>> Okay, now we have back to back failures, and it looks like it may have
> >>> been
> >>> a timeout issue?
> >>>  `test(org.apache.metron.solr.integration.
> SolrIndexingIntegrationTest):
> >>> Took too long to complete: 150582 > 150000`, more details below:
> >>>
> >>> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed:
> >>> 166.167 sec <<< FAILURE!
> >>>
> >>> test(org.apache.metron.solr.integration.SolrIndexingIntegrationTest)
> >>> Time elapsed: 166.071 sec  <<< ERROR!
> >>>
> >>> java.lang.RuntimeException: Took too long to complete: 150582 > 150000
> >>>
> >>>         at org.apache.metron.integration.
> ComponentRunner.process(Compon
> >>> entRunner.java:131)
> >>>
> >>>         at org.apache.metron.indexing.integration.
> IndexingIntegrationTe
> >>> st.test(IndexingIntegrationTest.java:173)
> >>>
> >>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>>
> >>>         at sun.reflect.NativeMethodAccessorImpl.
> invoke(NativeMethodAcce
> >>> ssorImpl.java:62)
> >>>
> >>>         at sun.reflect.DelegatingMethodAccessorImpl.
> invoke(DelegatingMe
> >>> thodAccessorImpl.java:43)
> >>>
> >>>         at java.lang.reflect.Method.invoke(Method.java:483)
> >>>
> >>>         at org.junit.runners.model.FrameworkMethod$1.
> runReflectiveCall(
> >>> FrameworkMethod.java:50)
> >>>
> >>>         at org.junit.internal.runners.model.ReflectiveCallable.run(
> Refl
> >>> ectiveCallable.java:12)
> >>>
> >>>         at org.junit.runners.model.FrameworkMethod.
> invokeExplosively(Fr
> >>> ameworkMethod.java:47)
> >>>
> >>>         at org.junit.internal.runners.statements.InvokeMethod.
> evaluate(
> >>> InvokeMethod.java:17)
> >>>
> >>>         at org.junit.runners.ParentRunner.runLeaf(
> ParentRunner.java:325)
> >>>
> >>>         at org.junit.runners.BlockJUnit4ClassRunner.
> runChild(BlockJUnit
> >>> 4ClassRunner.java:78)
> >>>
> >>>         at org.junit.runners.BlockJUnit4ClassRunner.
> runChild(BlockJUnit
> >>> 4ClassRunner.java:57)
> >>>
> >>>         at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> >>>
> >>>         at org.junit.runners.ParentRunner$1.schedule(
> ParentRunner.java:
> >>> 71)
> >>>
> >>>         at org.junit.runners.ParentRunner.runChildren(
> ParentRunner.java
> >>> :288)
> >>>
> >>>         at org.junit.runners.ParentRunner.access$000(
> ParentRunner.java:
> >>> 58)
> >>>
> >>>         at org.junit.runners.ParentRunner$2.evaluate(
> ParentRunner.java:
> >>> 268)
> >>>
> >>>         at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> >>>
> >>>         at org.apache.maven.surefire.junit4.JUnit4Provider.execute(
> JUni
> >>> t4Provider.java:252)
> >>>
> >>>         at org.apache.maven.surefire.junit4.JUnit4Provider.
> executeTestS
> >>> et(JUnit4Provider.java:141)
> >>>
> >>>         at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(
> JUnit
> >>> 4Provider.java:112)
> >>>
> >>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>>
> >>>         at sun.reflect.NativeMethodAccessorImpl.
> invoke(NativeMethodAcce
> >>> ssorImpl.java:62)
> >>>
> >>>         at sun.reflect.DelegatingMethodAccessorImpl.
> invoke(DelegatingMe
> >>> thodAccessorImpl.java:43)
> >>>
> >>>         at java.lang.reflect.Method.invoke(Method.java:483)
> >>>
> >>>         at org.apache.maven.surefire.util.ReflectionUtils.
> invokeMethodW
> >>> ithArray(ReflectionUtils.java:189)
> >>>
> >>>         at org.apache.maven.surefire.booter.ProviderFactory$
> ProviderPro
> >>> xy.invoke(ProviderFactory.java:165)
> >>>
> >>>         at org.apache.maven.surefire.booter.ProviderFactory.
> invokeProvi
> >>> der(ProviderFactory.java:85)
> >>>
> >>>         at org.apache.maven.surefire.booter.ForkedBooter.
> runSuitesInPro
> >>> cess(ForkedBooter.java:115)
> >>>
> >>>         at org.apache.maven.surefire.booter.ForkedBooter.main(
> ForkedBoo
> >>> ter.java:75)
> >>>
> >>>
> >>> Jon
> >>>
> >>> On Thu, Jan 19, 2017 at 2:49 PM Zeolla@GMail.com <zeolla@gmail.com>
> >>> wrote:
> >>>
> >>> > The build has been showing as failing
> >>> > <https://github.com/apache/incubator-metron> for a little while
now.
> >>> I
> >>> > know we recently updated the language around Merge Requirements
> >>> > <https://cwiki.apache.org/confluence/pages/viewpage.action?p
> >>> ageId=61332235>,
> >>> > but if I recall properly our current issue is simply a Travis CI
> >>> overload
> >>> > issue.  Is there a way we can update the wiki doc to account for
> >>> situations
> >>> > like this?
> >>> >
> >>> > Jon
> >>> > --
> >>> >
> >>> > Jon
> >>> >
> >>> > Sent from my mobile device
> >>> >
> >>> --
> >>>
> >>> Jon
> >>>
> >>> Sent from my mobile device
> >>>
> >>
> >>
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message