mahout-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: ArrayIndexOutOfBoundsException SparseMatrix
Date Mon, 10 Sep 2012 14:07:19 GMT
Multi-threading at the cell level is unlikely to help.

Multi-threading at the row level might help.

I would recommend that you use a thread pool executor and feed the rows
into the pool.  Each task then owns exactly one row, so you won't need
locks and you will make full use of your cores.

The basic code would look roughly like this:

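        // imports: java.util.List, java.util.concurrent.*,
        //          com.google.common.collect.Lists, org.apache.mahout.math.MatrixSlice
        List<Callable<Double>> tasks = Lists.newArrayList();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        // data is the matrix; iterating over it yields one MatrixSlice per row
        for (final MatrixSlice split : data) {
            tasks.add(new Callable<Double>() {
                @Override
                public Double call() {
                    // initialize row given by split.vector() here.  The index is split.index()
                    // return some interesting aggregate of the row, for example its sum
                    return split.vector().zSum();
                }
            });
        }
        // invokeAll blocks until every task has finished and may throw InterruptedException
        List<Future<Double>> results = pool.invokeAll(tasks);
        pool.shutdown();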
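
For anyone who wants to try the row-level idea without Mahout on the
classpath, here is a minimal self-contained sketch.  It is only an
illustration under stated assumptions: plain double[][] arrays stand in for
matrixD and matrixK, the kernel exp(-d*d / epsilon) is taken from Pedro's
original message, and the names RowParallelFill and fill are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class RowParallelFill {
    // Hypothetical helper: fills k[i][j] = exp(-d[i][j]^2 / epsilon) with one
    // task per row and returns the sum of each filled row.
    static double[] fill(final double[][] d, final double[][] k,
                         final double epsilon, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Double>> tasks = new ArrayList<>();
            for (int row = 0; row < d.length; row++) {
                final int i = row;
                tasks.add(() -> {
                    // Only this task writes to k[i], so no lock is needed.
                    double sum = 0;
                    for (int j = 0; j < d[i].length; j++) {
                        k[i][j] = Math.exp(-(d[i][j] * d[i][j]) / epsilon);
                        sum += k[i][j];
                    }
                    return sum;
                });
            }
            // invokeAll waits for all tasks; Future.get() makes their writes visible here.
            List<Future<Double>> results = pool.invokeAll(tasks);
            double[] rowSums = new double[results.size()];
            for (int i = 0; i < rowSums.length; i++) {
                rowSums[i] = results.get(i).get();
            }
            return rowSums;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        double[][] d = {{0.0, 1.0}, {1.0, 0.0}};
        double[][] k = new double[2][2];
        double[] sums = fill(d, k, 1.0, 2);
        System.out.println(k[0][1] + " " + sums[0]);
    }
}
```

The point of the design is that each row array has exactly one writer.
Whether the same holds for a given sparse matrix class depends on whether
its rows share internal state, so treat that as something to check rather
than a given.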


On Mon, Sep 10, 2012 at 12:40 AM, PEDRO MANUEL JIMENEZ RODRIGUEZ <
pmjimenez1983@hotmail.com> wrote:

>
> Hi Ted,
> I'm just playing around with the code, trying to get better performance.
> Calculating each value doesn't take long, but I'm trying to speed up
> filling the matrix. Do you think that using multithreading is not
> going to improve it?
> Thanks for your reply.
> Pedro.
>
> > From: ted.dunning@gmail.com
> > Date: Sun, 9 Sep 2012 12:07:49 -0700
> > Subject: Re: ArrayIndexOutOfBoundsException SparseMatrix
> > To: user@mahout.apache.org
> >
> > You are using lots of threads but the sparse matrix structure is not
> > thread safe.  Setting a value in the SparseMatrix causes mutation to
> > internal data structures.
> >
> > If you can have each thread do all the updates for a single row, that
> > would be much better.  Another option is to synchronize on the matrix
> > where you call set.  Another option is to synchronize on the row
> > somehow.  Synchronizing on the row might give you the highest
> > performance, but would be slightly trickier to arrange and it would
> > still include lots of uncontended lock overhead.
> >
> > Is multi-threading really what you want?  Why?  Does each value take a
> > long time to compute?
> >
> > Can you use a lockless container to collect all of the values to insert
> > and insert them using a single thread?  This last is probably the
> > fastest of all of the options.
> >
> > On Sun, Sep 9, 2012 at 11:29 AM, PEDRO MANUEL JIMENEZ RODRIGUEZ <
> > pmjimenez1983@hotmail.com> wrote:
> >
> > >
> > > Hi all,
> > >
> > > I'm trying to set all values of a SparseMatrix structure using multiple
> > > threads but I'm getting an "ArrayIndexOutOfBoundsException" even
> > > when the access indexes are correct. In fact, when I substitute a double
> > > array for the SparseMatrix structure I don't get any error.
> > >
> > > Does anyone have any idea what the problem could be?
> > >
> > > Error:
> > >      matrixK.set(
> > >                         index,
> > >                         j,
> > >                         Math.exp((-1 * (matrixD.get(index, j) *
> > >                                 matrixD.get(index, j))) / epsilon));
> > >
> > >
> > > No error:
> > >         matrixK[index][j] = Math.exp((-1 * (matrixD.get(index, j) *
> > >                         matrixD.get(index, j))) / epsilon);
> > >
> > > Thanks a lot.
> > >
> > > Pedro.
> > >
>
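
The suggestion earlier in the thread, to collect values in a lockless
container and insert them from a single thread, could look roughly like the
sketch below.  Everything here is an assumption made for illustration: a
nested HashMap stands in for the non-thread-safe SparseMatrix, a plain
double[][] stands in for matrixD, and the names CollectThenInsert, Cell and
build are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class CollectThenInsert {
    // One computed matrix entry: (row, col, value).
    static final class Cell {
        final int row;
        final int col;
        final double value;

        Cell(int row, int col, double value) {
            this.row = row;
            this.col = col;
            this.value = value;
        }
    }

    // Worker threads compute values and append them to a lock-free queue;
    // the calling thread alone then writes into the non-thread-safe map.
    static Map<Integer, Map<Integer, Double>> build(final double[][] d,
                                                    final double epsilon,
                                                    int threads) throws InterruptedException {
        final ConcurrentLinkedQueue<Cell> cells = new ConcurrentLinkedQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int row = 0; row < d.length; row++) {
            final int i = row;
            pool.execute(() -> {
                for (int j = 0; j < d[i].length; j++) {
                    cells.add(new Cell(i, j, Math.exp(-(d[i][j] * d[i][j]) / epsilon)));
                }
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("workers did not finish in time");
        }
        // Single-threaded insert phase: the only code that mutates the result.
        Map<Integer, Map<Integer, Double>> sparse = new HashMap<>();
        for (Cell c : cells) {
            sparse.computeIfAbsent(c.row, r -> new HashMap<>()).put(c.col, c.value);
        }
        return sparse;
    }

    public static void main(String[] args) throws InterruptedException {
        double[][] d = {{0.0, 1.0}, {1.0, 0.0}};
        System.out.println(build(d, 1.0, 2));
    }
}
```

With a real sparse matrix the insert loop would call set(row, col, value)
instead of writing to the map.  Since the matrix is only ever touched by one
thread, its internal structures cannot be corrupted by concurrent updates.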
