hama-user mailing list archives

From: tog <guillaume.all...@gmail.com>
Subject: Re: Hama status /
Date: Thu, 04 Jun 2009 04:26:33 GMT
Edward,

Here is what I would like to benchmark:

M.v / (||M||.||v||)

where M is an m x n matrix that is 80% sparse (likewise for the vector),
with m = 10^6 and n = 5 x 10^5.

M and v can be initialized as a random sparse matrix and vector.
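
To make that concrete, here is a minimal single-machine sketch in plain Java
(not a Hama job) of the quantity I want to measure, at a much smaller size.
Note that I am assuming ||M|| is the Frobenius norm and ||v|| the 2-norm:

import java.util.HashMap;
import java.util.Map;
import java.util.Random;

public class SparseBenchSketch {
  public static void main(String[] args) {
    int m = 1000, n = 500;  // small stand-ins for m = 10^6, n = 5 x 10^5
    double density = 0.2;   // 80% sparse => about 20% of entries are non-zero
    Random rnd = new Random(42);

    // Random sparse matrix: row index -> (column index -> value)
    Map<Integer, Map<Integer, Double>> M =
        new HashMap<Integer, Map<Integer, Double>>();
    double normM = 0.0;     // assumption: ||M|| is the Frobenius norm
    for (int i = 0; i < m; i++) {
      Map<Integer, Double> row = new HashMap<Integer, Double>();
      for (int j = 0; j < n; j++) {
        if (rnd.nextDouble() < density) {
          double x = rnd.nextDouble();
          row.put(j, x);
          normM += x * x;
        }
      }
      M.put(i, row);
    }
    normM = Math.sqrt(normM);

    // Random sparse vector (stored densely here for simplicity) and its 2-norm
    double[] v = new double[n];
    double normV = 0.0;
    for (int j = 0; j < n; j++) {
      if (rnd.nextDouble() < density) {
        v[j] = rnd.nextDouble();
        normV += v[j] * v[j];
      }
    }
    normV = Math.sqrt(normV);

    // result = M.v / (||M|| . ||v||)
    double[] result = new double[m];
    for (Map.Entry<Integer, Map<Integer, Double>> e : M.entrySet()) {
      double dot = 0.0;
      for (Map.Entry<Integer, Double> cell : e.getValue().entrySet()) {
        dot += cell.getValue() * v[cell.getKey()];
      }
      result[e.getKey()] = dot / (normM * normV);
    }
    System.out.println("result[0] = " + result[0]);
  }
}

At the real sizes (m = 10^6, n = 5 x 10^5, ~20% non-zero) the matrix is far
too large to hold in memory on one machine, which is why I am asking about
Hama.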

Do you think this can be done with Hama as it is now?

Thanks

On Thu, Jun 4, 2009 at 10:24 AM, Edward J. Yoon <edwardyoon@apache.org> wrote:

> Yes, the goal is to handle really huge matrices: for example, matrix
> operations for large-scale statistical processing, or matrix
> decomposition of huge web link graphs/social graphs.
>
> Those were tests on 5 and 10 nodes. In the future, I'll try them
> on a thousand nodes.
>
> On Thu, Jun 4, 2009 at 1:19 AM, tog <guillaume.alleon@gmail.com> wrote:
> > Hi Edward,
> >
> > I had a look at the benchmarks ...
> > Well, a 5000 x 5000 dense matrix multiply takes roughly 30 seconds on my
> > laptop, and I have been doing out-of-core parallel matrix factorization
> > and solve with dense systems up to 350000, so I guess it is probably at
> > least for larger matrices that Hama could be interesting.
> > Do you plan to do such tests with really huge matrices?
> > Otherwise, what is your business case?
> >
> > Cheers
> > Guillaume
> >
> > On Wed, Jun 3, 2009 at 6:14 PM, Edward J. Yoon <edwardyoon@apache.org> wrote:
> >
> >> FYI, I ran some benchmarks -
> >> http://wiki.apache.org/hama/PerformanceEvaluation
> >>
> >> If you need any help, please let us know.
> >>
> >> Thanks.
> >>
> >> On Wed, Jun 3, 2009 at 6:55 PM, tog <guillaume.alleon@gmail.com> wrote:
> >> > Yes, I understand the difference between MPI and Hadoop - I was using
> >> > MPI before it actually existed :)
> >> > But as you phrased it, I had the impression that Hama was working on a
> >> > 1-node/core cluster!
> >> >
> >> > Regards
> >> > Guillaume
> >> >
> >> > On Wed, Jun 3, 2009 at 5:44 PM, Edward J. Yoon <edwardyoon@apache.org> wrote:
> >> >
> >> >> Hi,
> >> >>
> >> >> There are some differences between Map/Reduce and MPI programming. MPI
> >> >> is designed for fast parallel computing using network communication on
> >> >> a small cluster. Since MPI relies on network communication, the network
> >> >> cost grows linearly as the number of nodes increases. Map/Reduce, on
> >> >> the other hand, is designed for distributed processing by connecting
> >> >> many commodity computers together. Therefore, the algorithms should
> >> >> avoid large amounts of communication for best performance, and the key
> >> >> to that is the 'sequential process'.
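
If I follow, a communication-free map step of this kind, applied to my M.v
case, might look roughly like the hypothetical sketch below. It uses the
plain Hadoop MapReduce API rather than Hama's actual code, and both the
"rowIndex<TAB>col:val ..." input layout and the loadVectorSideFile() helper
are my own assumptions:

import java.io.IOException;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical mapper for M.v: each input line is one sparse row of M,
// "rowIndex<TAB>col:val col:val ...". Every map task handles its rows
// independently, so no communication between tasks is needed - which is
// the point above about avoiding network cost.
public class SparseRowTimesVectorMapper
    extends Mapper<LongWritable, Text, LongWritable, DoubleWritable> {

  private double[] v;

  @Override
  protected void setup(Context context) {
    // Assumption: v fits in memory and is shipped to every task as a side
    // file (e.g. via the distributed cache); loading it is omitted here.
    v = loadVectorSideFile(context);
  }

  @Override
  protected void map(LongWritable offset, Text line, Context context)
      throws IOException, InterruptedException {
    String[] parts = line.toString().split("\t");
    long row = Long.parseLong(parts[0]);
    double dot = 0.0;
    for (String entry : parts[1].split(" ")) {   // sparse "col:val" pairs
      String[] cv = entry.split(":");
      dot += Double.parseDouble(cv[1]) * v[Integer.parseInt(cv[0])];
    }
    context.write(new LongWritable(row), new DoubleWritable(dot));
  }

  // Hypothetical helper, not a real Hadoop or Hama API.
  private double[] loadVectorSideFile(Context context) {
    throw new UnsupportedOperationException("sketch only");
  }
}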
> >> >>
> >> >> Thanks.
> >> >>
> >> >> On Wed, Jun 3, 2009 at 6:07 PM, tog <guillaume.alleon@gmail.com> wrote:
> >> >> > Hi Edward
> >> >> >
> >> >> > I have a test to do which is basically sparse mat-vec multiplication
> >> >> > and mat norm computation, so that should be possible with Hama in its
> >> >> > current state, I guess.
> >> >> > What do you mean by "sequentially executed"?
> >> >> >
> >> >> > Cheers
> >> >> > Guillaume
> >> >> >
> >> >> > On Wed, Jun 3, 2009 at 5:00 PM, Edward J. Yoon <edwardyoon@apache.org> wrote:
> >> >> >
> >> >> >> Hi,
> >> >> >>
> >> >> >> Currently, the basic matrix operations are implemented based on the
> >> >> >> map/reduce programming model: for example, the matrix get/set methods,
> >> >> >> the matrix norms, matrix-matrix multiplication/addition, and matrix
> >> >> >> transpose. In the near future, SVD, eigenvalue decomposition, and some
> >> >> >> graph algorithms will be implemented. All the operations are executed
> >> >> >> sequentially.
> >> >> >>
> >> >> >> Thanks.
> >> >> >>
> >> >> >> On Wed, Jun 3, 2009 at 5:45 PM, tog <guillaume.alleon@gmail.com> wrote:
> >> >> >> >
> >> >> >> > Hi,
> >> >> >> >
> >> >> >> > I would like to know: what is the status of Hama?
> >> >> >> > What am I able to do with it?
> >> >> >> > What are the future directions?
> >> >> >> >
> >> >> >> > Cheers
> >> >> >> > Guillaume
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> --
> >> >> >> Best Regards, Edward J. Yoon @ NHN, corp.
> >> >> >> edwardyoon@apache.org
> >> >> >> http://blog.udanax.org
> >> >> >>
> >> >> >
> >> >> >
> >> >> >
> >> >> > --
> >> >> >
> >> >> > PGP KeyID: 1024D/47172155
> >> >> > FingerPrint: C739 8B3C 5ABF 127F CCFA  5835 F673 370B 4717 2155
> >> >> >
> >> >> > http://cheztog.blogspot.com
> >> >> >
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Best Regards, Edward J. Yoon @ NHN, corp.
> >> >> edwardyoon@apache.org
> >> >> http://blog.udanax.org
> >> >>
> >> >
> >> >
> >> >
> >> > --
> >> >
> >> > PGP KeyID: 1024D/47172155
> >> > FingerPrint: C739 8B3C 5ABF 127F CCFA  5835 F673 370B 4717 2155
> >> >
> >> > http://cheztog.blogspot.com
> >> >
> >>
> >>
> >>
> >> --
> >> Best Regards, Edward J. Yoon @ NHN, corp.
> >> edwardyoon@apache.org
> >> http://blog.udanax.org
> >>
> >
> >
> >
> > --
> >
> > PGP KeyID: 1024D/47172155
> > FingerPrint: C739 8B3C 5ABF 127F CCFA  5835 F673 370B 4717 2155
> >
> > http://cheztog.blogspot.com
> >
>
>
>
> --
> Best Regards, Edward J. Yoon @ NHN, corp.
> edwardyoon@apache.org
> http://blog.udanax.org
>



-- 

PGP KeyID: 1024D/47172155
FingerPrint: C739 8B3C 5ABF 127F CCFA  5835 F673 370B 4717 2155

http://cheztog.blogspot.com
