systemml-dev mailing list archives

From Mike Dusenberry <dusenberr...@gmail.com>
Subject Re: Update Spark Configuration to improve SystemML performance
Date Sat, 29 Jul 2017 01:40:42 GMT
Awesome!  We certainly welcome new contributors!  Just to highlight a few
areas, in terms of ML/DL tasks, SYSTEMML-540 [1] covers all of our work
related to deep learning, SYSTEMML-618 [2] covers our work within that to
create a DML deep learning library ("nn"), SYSTEMML-1479 [3] covers work to
be able to ingest Caffe models for distributed training/prediction, and
more generally, the "Algorithms" tag [4] covers tasks related to any DML
algorithm work.

[1]: https://issues.apache.org/jira/browse/SYSTEMML-540
[2]: https://issues.apache.org/jira/browse/SYSTEMML-618
[3]: https://issues.apache.org/jira/browse/SYSTEMML-1479
[4]: https://issues.apache.org/jira/issues/?jql=project%20%3D%20SYSTEMML%20AND%20component%20%3D%20Algorithms%20AND%20resolution%20%3D%20Unresolved%20ORDER%20BY%20due%20ASC%2C%20priority%20DESC%2C%20created%20ASC


Please let us know either here or on the JIRA issues if you have any
questions!


- Mike

--

Mike Dusenberry
GitHub: github.com/dusenberrymw
LinkedIn: linkedin.com/in/mikedusenberry

On Tue, Jul 25, 2017 at 11:14 AM, arijit chakraborty <akc14@hotmail.com>
wrote:

> I’ll also pick one issue and try to solve it.
>
> And it is really a very friendly group!
>
> Thanks!
> Arijit
>
> From: Janardhan Pulivarthi <janardhan.pulivarthi@gmail.com>
> Sent: Tuesday, July 25, 2017 10:53 PM
> To: himanshu.mohan78@gmail.com; dev@systemml.apache.org
> Subject: Re: Update Spark Configuration to improve SystemML performance
>
> Hi Himanshu!
>
> We are glad you are here. SystemML is a very friendly community.
>
> To get started:
> 1. See the components of SystemML at this link:
> https://issues.apache.org/jira/projects/SYSTEMML?selectedItem=com.atlassian.jira.jira-projects-plugin:components-page
>
> 2. Through this link you can select any component that interests you,
> browse through the corresponding issues, and find which one(s) of them
> fascinate you.
>
> 3. If you need any guidance on how to proceed on a particular issue,
> notify the other committers on the mailing list or in the JIRA itself.
>
> Thanks a lot,
>
> Cheers,
> Janardhan
>
> On Tue, Jul 25, 2017 at 8:36 PM, Himanshu Mohan <
> himanshu.mohan78@gmail.com>
> wrote:
>
> > I am also interested in doing some real life hands on work in SystemML
> >
> > Thanks and Regards
> > Himanshu
> >
> > > On 25-Jul-2017, at 2:57 PM, arijit chakraborty <akc14@hotmail.com>
> > > wrote:
> > >
> > > Hi Matthias,
> > >
> > >
> > > Thanks for your mail. I'm attaching the server configurations again.
> > > I'm also adding your personal email id, just to be doubly sure you can
> > > see the images. Pardon me for that. I have since improved the setup
> > > further, so that I can now run the code at the same speed as R (around
> > > 40 mins). But the setup I'm sharing is the older one. So most probably
> > > the performance of my code was dependent on the Spark configuration. I
> > > would appreciate your help on that.
> > >
> > >
> > > Also, I'm currently working mainly on CNNs, and I have decent
> > > programming experience in Python and R. But I would request you to
> > > share with me a project that is among the lowest-priority ones. This
> > > will help me get accustomed to the project setup without worrying
> > > about timelines.
> > >
> > >
> > > Thank you!
> > >
> > > Arijit
> > > <cluster performance.png>
> > > <cluster specs.png>
> > > <cores.png>
> > > From: Matthias Boehm <mboehm7@googlemail.com>
> > > Sent: Tuesday, July 25, 2017 2:10:52 PM
> > > To: dev@systemml.apache.org
> > > Subject: Re: Update Spark Configuration to improve SystemML performance
> > >
> > > Great to hear that - we welcome additional contributions. Just let us
> > > know which area you're most interested in (e.g., algorithms, APIs,
> > > optimizer, runtime, etc.) and we could identify a couple of tasks to
> > > get you started.
> > >
> > > Regarding the performance numbers, I am not able to see the details.
> > > Also, could you share which operation was causing the large GC
> > > overhead? Maybe we can improve the runtime for that specific scenario.
> > > Thanks.
> > >
> > > Regards,
> > > Matthias
> > >
> > > On Mon, Jul 24, 2017 at 12:17 PM, arijit chakraborty
> > > <akc14@hotmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > >
> > > > I tried tuning the Spark configuration file to improve SystemML
> > > > performance. Even after much tuning, the R code runs in 40 mins, but
> > > > SystemML takes 2.2 hours. Please find the Spark configuration
> > > > screenshots. Please let me know if I'm making some mistake in tuning
> > > > the Spark configuration. One problem we could rectify is the garbage
> > > > collection (GC) time error. Now it's completely gone. That was one
> > > > major bottleneck which was making the code extremely slow.
> > > >
> > > >
> > > > I'm working on a local system and created a standalone Spark setup,
> > > > with master and workers. The following are the details:
> > > >
> > > > I also want to know whether it is possible to get involved with
> > > > SystemML development. My project is almost on the verge of completion
> > > > and I learned a lot from all of you. And I really liked this project.
> > > > So I want to contribute more fruitfully to it.
> > > >
> > > >
> > > > Thank you!
> > > >
> > > > Arijit
> > > >
> >
>
>
