spark-dev mailing list archives

From Patrick Wendell <pwend...@gmail.com>
Subject Re: Unit tests in < 5 minutes
Date Fri, 08 Aug 2014 22:47:43 GMT
Josh - that was actually fixed recently (we just bind to a random port
when running tests).
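
For anyone unfamiliar with the trick, here is a minimal sketch of the general idea in Python. This is not Spark's actual code, and the helper name bind_to_free_port is made up for illustration. Asking the OS for port 0 hands each caller whatever port is currently free, so concurrent processes do not fight over a fixed default port.

```python
import socket


def bind_to_free_port(host="127.0.0.1"):
    """Bind a listening TCP socket to a free port chosen by the OS.

    Hypothetical helper for illustration only; not part of Spark.
    Returns the open socket and the port number it was given.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))  # port 0 asks the OS to pick any unused port
    sock.listen(1)
    return sock, sock.getsockname()[1]


if __name__ == "__main__":
    # Two "test servers" started side by side each get their own port.
    first, first_port = bind_to_free_port()
    second, second_port = bind_to_free_port()
    print("bound to ports", first_port, "and", second_port)
    first.close()
    second.close()
```

Because the first socket is still open when the second one binds, the two end up on distinct ports, which is the property the tests rely on.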

On Fri, Aug 8, 2014 at 12:00 PM, Josh Rosen <rosenville@gmail.com> wrote:
> One simple optimization might be to disable the application web UI in tests
> that don't need it.  When running tests on my local machine while also
> running another Spark shell, I've noticed that the test logs fill up with
> errors when the web UI attempts to bind to the default port, fails, and
> tries a higher one.
>
> - Josh
>
> On August 8, 2014 at 11:54:24 AM, Patrick Wendell (pwendell@gmail.com)
> wrote:
>
> I dug into this a bit a while ago. I think that if someone sat down and
> profiled the tests, we could likely find some things to optimize.
> In particular, there may be overhead in starting up a local Spark
> context that could be minimized, which would speed up all the tests. Also,
> there are some tests (especially in Streaming) that take really long,
> like 60 seconds for a single test (see some of the new Flume tests).
> These could almost certainly be optimized.
>
> I think 5 minutes might be out of reach, but something like a 2X
> improvement might be possible and would be very valuable if
> accomplished.
>
> - Patrick
>
> On Fri, Aug 8, 2014 at 11:24 AM, Matei Zaharia <matei.zaharia@gmail.com>
> wrote:
>> Just as a note, when you're developing stuff, you can use "test-only" in
>> sbt, or the equivalent feature in Maven, to run just some of the tests. This
>> is what I do; I don't wait for Jenkins to run things. 90% of the time, if it
>> passes the tests that I know could break stuff, it will pass all of Jenkins.
>>
>> Jenkins should always be doing all the integration tests, so I don't think
>> it will become *that* much shorter in the long run, though it can certainly
>> be improved.
>>
>> Matei
>>
>> On August 8, 2014 at 10:20:35 AM, Nicolas Liochon (nkeywal@gmail.com)
>> wrote:
>>
>> FWIW, when we did this work in HBase, we categorized the tests. Some
>> tests can share a single JVM, while others need to be isolated in
>> their own JVM. Even so, Surefire can still run them in parallel by
>> starting and stopping several JVMs.
>>
>> Nicolas
>>
>>
>> On Fri, Aug 8, 2014 at 7:10 PM, Reynold Xin <rxin@databricks.com> wrote:
>>
>>> ScalaTest actually has built-in support for parallelization. We can use
>>> that.
>>>
>>> The main challenge is to make sure all the test suites work correctly
>>> when running alongside each other in parallel.
>>>
>>>
>>> On Fri, Aug 8, 2014 at 9:47 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>>>
>>> > How about using the parallel execution feature of maven-surefire-plugin
>>> > (assuming all the tests were made parallel friendly)?
>>> >
>>> > http://maven.apache.org/surefire/maven-surefire-plugin/examples/fork-options-and-parallel-execution.html
>>> >
>>> > Cheers
>>> >
>>> >
>>> > On Fri, Aug 8, 2014 at 9:14 AM, Sean Owen <sowen@cloudera.com> wrote:
>>> >
>>> > > A common approach is to separate unit tests from integration tests.
>>> > > Maven has support for this distinction. I'm not sure it helps a lot
>>> > > though, since it only helps you to not run integration tests all the
>>> > > time. But lots of Spark tests are integration-test-like and are
>>> > > important to run to know a change works.
>>> > >
>>> > > I haven't heard of a plugin to run different test suites remotely on
>>> > > many machines, but I would not be surprised if it exists.
>>> > >
>>> > > The Jenkins servers aren't CPU-bound as far as I can tell. It's that
>>> > > the tests spend a lot of time waiting for bits to start up or
>>> > > complete. That implies the existing tests could be sped up by just
>>> > > running in parallel locally. I recall someone recently proposed this?
>>> > >
>>> > > And I think the problem with that is simply that some of the tests
>>> > > collide with each other, by opening up the same port at the same time
>>> > > for example. I know that kind of problem is being attacked even right
>>> > > now. But if all the tests were made parallel friendly, I imagine
>>> > > parallelism could be enabled and speed up builds greatly without any
>>> > > remote machines.
>>> > >
>>> > >
>>> > > On Fri, Aug 8, 2014 at 5:01 PM, Nicholas Chammas
>>> > > <nicholas.chammas@gmail.com> wrote:
>>> > > > Howdy,
>>> > > >
>>> > > > Do we think it's both feasible and worthwhile to invest in getting our
>>> > > > unit tests to finish in under 5 minutes (or something similarly brief)
>>> > > > when run by Jenkins?
>>> > > >
>>> > > > Unit tests currently seem to take anywhere from 30 min to 2 hours. As
>>> > > > people add more tests, I imagine this time will only grow. I think it
>>> > > > would be better for both contributors and reviewers if they didn't have
>>> > > > to wait so long for test results; PR reviews would be shorter, if
>>> > > > nothing else.
>>> > > >
>>> > > > I don't know how this is normally done, but maybe it wouldn't be too
>>> > > > much work to get a test cycle to feel lighter.
>>> > > >
>>> > > > Most unit tests are independent and can be run concurrently, right?
>>> > > > Would it make sense to build a given patch on many servers at once and
>>> > > > send disjoint sets of unit tests to each?
>>> > > >
>>> > > > I'd be interested in working on something like that if possible (and
>>> > > > sensible).
>>> > > >
>>> > > > Nick
>>> > >
>>> > > ---------------------------------------------------------------------
>>> > > To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>> > > For additional commands, e-mail: dev-help@spark.apache.org
>>> > >
>>> > >
>>> >
>>>

