spark-dev mailing list archives

From Emre Sevinc <emre.sev...@gmail.com>
Subject Re: Speeding up Spark build during development
Date Mon, 04 May 2015 08:50:12 GMT
Just to give you an example:

When I was trying to make a small change only to the Streaming component of
Spark, first I built and installed the whole Spark project (this took about
15 minutes on my 4-core, 4 GB RAM laptop). Then, after having changed files
only in Streaming, I ran something like (in the top-level directory):

   mvn --projects streaming/ -DskipTests package

and then

   mvn --projects assembly/ -DskipTests install


This was much faster than rebuilding all of Spark from scratch, because
Maven was building only one component of Spark, in my case Streaming.
I think you can use a very similar approach.
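
The two steps above can be wrapped in a small helper. This is only a sketch:
the names rebuild_module, run, MVN and DRY_RUN are made up for illustration
and are not part of Spark's build. It assumes the whole project was already
built and installed once, and that it is run from the top-level directory.

```shell
#!/bin/sh
# Sketch only: rebuild one Spark module, then the assembly.
# Assumes a one-time full build and install has already been done.

# Hypothetical knob: which Maven to call (defaults to plain mvn).
MVN="${MVN:-mvn}"

# Run a command, or just print it when DRY_RUN is set (hypothetical flag).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# rebuild_module MODULE: package the given module, then install the assembly.
rebuild_module() {
    module="${1%/}"    # accept "streaming" or "streaming/"
    run "$MVN" --projects "$module/" -DskipTests package &&
    run "$MVN" --projects assembly/ -DskipTests install
}
```

After sourcing this, `rebuild_module streaming` runs the same two commands
as above, and setting DRY_RUN=1 beforehand only prints them. Setting
MVN=build/mvn should pick up the wrapper script mentioned further down in
this thread.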

--
Emre Sevinç



On Mon, May 4, 2015 at 10:44 AM, Pramod Biligiri <pramodbiligiri@gmail.com>
wrote:

> No, I just need to build one project at a time. Right now SparkSql.
>
> Pramod
>
> On Mon, May 4, 2015 at 12:09 AM, Emre Sevinc <emre.sevinc@gmail.com>
> wrote:
>
>> Hello Pramod,
>>
>> Do you need to build the whole project every time? Generally you don't.
>> For example, when I was changing files that belong only to Spark
>> Streaming, I built only the streaming module and then the assembly (after
>> having built and installed the whole project, but that was done only once).
>> This was much faster than rebuilding the whole of Spark every time.
>>
>> --
>> Emre Sevinç
>>
>> On Mon, May 4, 2015 at 9:01 AM, Pramod Biligiri <pramodbiligiri@gmail.com>
>> wrote:
>>
>>> Using the inbuilt maven and zinc it takes around 10 minutes for each
>>> build.
>>> Is that reasonable?
>>> My Maven opts look like this:
>>> $ echo $MAVEN_OPTS
>>> -Xmx12000m -XX:MaxPermSize=2048m
>>>
>>> I'm running it as build/mvn -DskipTests package
>>>
>>> Should I be tweaking my Zinc/Nailgun config?
>>>
>>> Pramod
>>>
>>> On Sun, May 3, 2015 at 3:40 PM, Mark Hamstra <mark@clearstorydata.com>
>>> wrote:
>>>
>>> >
>>> > https://spark.apache.org/docs/latest/building-spark.html#building-with-buildmvn
>>> >
>>> > On Sun, May 3, 2015 at 2:54 PM, Pramod Biligiri <pramodbiligiri@gmail.com>
>>> > wrote:
>>> >
>>> >> This is great. I didn't know about the mvn script in the build directory.
>>> >>
>>> >> Pramod
>>> >>
>>> >> On Fri, May 1, 2015 at 9:51 AM, York, Brennon <Brennon.York@capitalone.com>
>>> >> wrote:
>>> >>
>>> >> > Following what Ted said, if you leverage the `mvn` from within the
>>> >> > `build/` directory of Spark you'll get zinc for free which should help
>>> >> > speed up build times.
>>> >> >
>>> >> > On 5/1/15, 9:45 AM, "Ted Yu" <yuzhihong@gmail.com> wrote:
>>> >> >
>>> >> > >Pramod:
>>> >> > >Please remember to run Zinc so that the build is faster.
>>> >> > >
>>> >> > >Cheers
>>> >> > >
>>> >> > >On Fri, May 1, 2015 at 9:36 AM, Ulanov, Alexander
>>> >> > ><alexander.ulanov@hp.com>
>>> >> > >wrote:
>>> >> > >
>>> >> > >> Hi Pramod,
>>> >> > >>
>>> >> > >> For cluster-like tests you might want to use the same code as in mllib's
>>> >> > >> LocalClusterSparkContext. You can rebuild only the package that you change
>>> >> > >> and then run this main class.
>>> >> > >>
>>> >> > >> Best regards, Alexander
>>> >> > >>
>>> >> > >> -----Original Message-----
>>> >> > >> From: Pramod Biligiri [mailto:pramodbiligiri@gmail.com]
>>> >> > >> Sent: Friday, May 01, 2015 1:46 AM
>>> >> > >> To: dev@spark.apache.org
>>> >> > >> Subject: Speeding up Spark build during development
>>> >> > >>
>>> >> > >> Hi,
>>> >> > >> I'm making some small changes to the Spark codebase and trying it out on a
>>> >> > >> cluster. I was wondering if there's a faster way to build than running the
>>> >> > >> package target each time.
>>> >> > >> Currently I'm using: mvn -DskipTests  package
>>> >> > >>
>>> >> > >> All the nodes have the same filesystem mounted at the same mount point.
>>> >> > >>
>>> >> > >> Pramod
>>> >> > >>
>>> >> >
>>> >> >
>>> >>
>>> >
>>> >
>>>
>>
>>
>>
>> --
>> Emre Sevinc
>>
>
>


-- 
Emre Sevinc
