mahout-user mailing list archives

From Jake Mannix <jake.man...@gmail.com>
Subject Re: understanding vectors
Date Mon, 14 Jan 2013 22:08:07 GMT
Oh I'm sorry Koert, I totally misunderstood you: you're saying it's *fine*
in trunk, which matches what Ted and I were thinking: we'd seen this bug
before, and fixed it.

In which case the answer is: we probably should have a unit test in trunk
to verify we don't get a regression on this, but I think we probably
already do.

For your use case, I would go with trunk, as it works as expected there
(and in fact, this is usually the case: trunk is usually pretty far ahead
of the last release, except right after a release, since we don't have
releases that often [we wish this weren't true, but it is what it is]).
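
For anyone who wants to pin down the contract Ted states further down the
thread (iterator() returns all values, iterateNonZero() is allowed to skip
zeros), here is a minimal sketch of the check such a regression test might
make. It deliberately does not use Mahout: ToySequentialSparseVector and
IteratorContractCheck are hypothetical stand-ins invented for illustration,
so this shows the expected behavior only, not Mahout's implementation.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a sequential-access sparse vector; NOT Mahout code.
// Only explicitly set entries are stored, kept sorted by index.
final class ToySequentialSparseVector(val size: Int) {
  private val stored = mutable.TreeMap.empty[Int, Double]

  def set(index: Int, value: Double): Unit = {
    require(index >= 0 && index < size, s"index $index out of range")
    stored(index) = value
  }

  // Contract for iterator(): one (index, value) pair for every index in
  // 0 until size, reporting 0.0 for entries that were never set.
  def iterator: Iterator[(Int, Double)] =
    Iterator.range(0, size).map(i => (i, stored.getOrElse(i, 0.0)))

  // Contract for iterateNonZero(): zeros may be skipped, so only the
  // stored, nonzero entries are visited.
  def iterateNonZero: Iterator[(Int, Double)] =
    stored.iterator.filter { case (_, v) => v != 0.0 }
}

object IteratorContractCheck {
  def main(args: Array[String]): Unit = {
    val y = new ToySequentialSparseVector(5)
    y.set(3, 1.0)

    val all = y.iterator.toList
    // The behavior reported for 0.7 stopped at index 3; a full iteration
    // must also visit the trailing zero at index 4.
    assert(all.map(_._1) == List(0, 1, 2, 3, 4))
    assert(all.last == ((4, 0.0)))
    assert(y.iterateNonZero.toList == List((3, 1.0)))

    println(all.mkString(" "))
  }
}
```

The same assertions, pointed at the real SequentialAccessSparseVector, are
roughly what a regression test in trunk would need to cover.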


On Mon, Jan 14, 2013 at 1:54 PM, Ted Dunning <ted.dunning@gmail.com> wrote:

> Any word on how it works with trunk?
>
> On Mon, Jan 14, 2013 at 1:41 PM, Koert Kuipers <koert@tresata.com> wrote:
>
> > i probably wasn't clear before: i can not reproduce bug in trunk.
> >
> > On Mon, Jan 14, 2013 at 4:40 PM, Koert Kuipers <koert@tresata.com> wrote:
> >
> > > ok i can write unit test for trunk, but it will succeed, since this is
> > > fixed in trunk. ok?
> > >
> > > On Mon, Jan 14, 2013 at 4:35 PM, Jake Mannix <jake.mannix@gmail.com> wrote:
> > >
> > >> We don't have a way to add the code to 0.7, it would just go in trunk.
> > >>
> > >>
> > >> On Mon, Jan 14, 2013 at 1:19 PM, Koert Kuipers <koert@tresata.com> wrote:
> > >>
> > >> > since trunk behaves properly, where do you want unit test? in version 0.7?
> > >> >
> > >> > On Mon, Jan 14, 2013 at 3:59 PM, Jake Mannix <jake.mannix@gmail.com> wrote:
> > >> >
> > >> > > I could have sworn we've seen and fixed this bug before, but I guess
> > >> > > not. Let's get a (failing) unit test in there, and then it should be
> > >> > > an easy fix.
> > >> > >
> > >> > > I think nobody runs into this because it's really rare to want to
> > >> > > iterate over all of the entries in a sparse vector - iterateNonZero()
> > >> > > is used almost exclusively for these.
> > >> > >
> > >> > >
> > >> > > On Mon, Jan 14, 2013 at 12:51 PM, Koert Kuipers <koert@tresata.com> wrote:
> > >> > >
> > >> > > > not the same behavior in trunk. SequentialAccessSparseVector.iterator
> > >> > > > seems to behave properly there.
> > >> > > >
> > >> > > > On Mon, Jan 14, 2013 at 3:07 PM, Koert Kuipers <koert@tresata.com> wrote:
> > >> > > >
> > >> > > > > sorry yes i meant 0.7
> > >> > > > >
> > >> > > > > On Mon, Jan 14, 2013 at 12:58 PM, Jake Mannix <jake.mannix@gmail.com> wrote:
> > >> > > > >
> > >> > > > >> I think you mean 0.7, right?  Can you see if you get the same
> > >> > > > >> behavior in svn trunk?
> > >> > > > >>
> > >> > > > >>
> > >> > > > >> On Mon, Jan 14, 2013 at 9:21 AM, Koert Kuipers <koert@tresata.com> wrote:
> > >> > > > >>
> > >> > > > >> > i am using version mahout 7.0
> > >> > > > >> >
> > >> > > > >> > On Mon, Jan 14, 2013 at 10:15 AM, Ted Dunning <ted.dunning@gmail.com> wrote:
> > >> > > > >> >
> > >> > > > >> > > Which version are you using?
> > >> > > > >> > >
> > >> > > > >> > > (this misbehavior sounds familiar)
> > >> > > > >> > >
> > >> > > > >> > > iterator() should return all values.
> > >> > > > >> > >
> > >> > > > >> > > iterateNonZero() is allowed to skip zeros.
> > >> > > > >> > >
> > >> > > > >> > > On Mon, Jan 14, 2013 at 6:11 AM, Koert Kuipers <koert@tresata.com> wrote:
> > >> > > > >> > >
> > >> > > > >> > > > i am looking at the iterators for DenseVector, RandomAccessSparseVector
> > >> > > > >> > > > and SequentialAccessSparseVector.
> > >> > > > >> > > >
> > >> > > > >> > > > for both DenseVector and RandomAccessSparseVector the iterator seems to
> > >> > > > >> > > > return all values, including the missing zero values.
> > >> > > > >> > > > for SequentialAccessSparseVector the iterator also returns all values,
> > >> > > > >> > > > but only up to the last non-missing value!
> > >> > > > >> > > >
> > >> > > > >> > > > is this by design? what is a vector iterator supposed to return exactly?
> > >> > > > >> > > > i can't see a logical pattern/consistency. see examples below.
> > >> > > > >> > > > thanks! koert
> > >> > > > >> > > >
> > >> > > > >> > > > scala> val x = new org.apache.mahout.math.RandomAccessSparseVector(5)
> > >> > > > >> > > > x: org.apache.mahout.math.RandomAccessSparseVector = {}
> > >> > > > >> > > >
> > >> > > > >> > > > scala> x.set(3, 1.0)
> > >> > > > >> > > >
> > >> > > > >> > > > scala> for (item <- x.iterator.asScala) println((item.index, item.get))
> > >> > > > >> > > > (0,0.0)
> > >> > > > >> > > > (1,0.0)
> > >> > > > >> > > > (2,0.0)
> > >> > > > >> > > > (3,1.0)
> > >> > > > >> > > > (4,0.0)
> > >> > > > >> > > >
> > >> > > > >> > > > scala> val y = new org.apache.mahout.math.SequentialAccessSparseVector(5)
> > >> > > > >> > > > y: org.apache.mahout.math.SequentialAccessSparseVector = {}
> > >> > > > >> > > >
> > >> > > > >> > > > scala> y.set(3, 1.0)
> > >> > > > >> > > >
> > >> > > > >> > > > scala> for (item <- y.iterator.asScala) println((item.index, item.get))
> > >> > > > >> > > > (0,0.0)
> > >> > > > >> > > > (1,0.0)
> > >> > > > >> > > > (2,0.0)
> > >> > > > >> > > > (3,1.0)
> > >> > > > >> > > >
> > >> > > > >> > >
> > >> > > > >> >
> > >> > > > >>
> > >> > > > >>
> > >> > > > >>
> > >> > > > >> --
> > >> > > > >>
> > >> > > > >>   -jake
> > >> > > > >>
> > >> > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> > >
> > >> > >
> > >> > > --
> > >> > >
> > >> > >   -jake
> > >> > >
> > >> >
> > >>
> > >>
> > >>
> > >> --
> > >>
> > >>   -jake
> > >>
> > >
> > >
> >
>



-- 

  -jake
