spark-dev mailing list archives

From David Hall <d...@cs.berkeley.edu>
Subject Re: mllib vector templates
Date Mon, 05 May 2014 23:05:49 GMT
On Mon, May 5, 2014 at 3:40 PM, DB Tsai <dbtsai@stanford.edu> wrote:

> David,
>
> Could we use Int, Long, Float as the data feature spaces, and Double for
> optimizer?
>

Yes. Breeze doesn't allow operations on mixed types, so you'd need to
convert the Double vectors to Float first if you wanted to take, e.g., a
dot product with the weights vector.
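
As a rough illustration of that conversion step, here is a minimal sketch
in plain Scala. It deliberately uses bare arrays rather than Breeze types,
and the object and method names (MixedPrecision, toFloats, dot) are made up
for this example; they are not Breeze or MLlib APIs.

```scala
// Hypothetical helpers; plain arrays stand in for Breeze vectors.
object MixedPrecision {

  /** Narrow Double weights to Float so both operands share one element type. */
  def toFloats(weights: Array[Double]): Array[Float] = weights.map(_.toFloat)

  /** Dot product over two Float arrays of equal length. */
  def dot(a: Array[Float], b: Array[Float]): Float = {
    require(a.length == b.length, "vectors must have the same length")
    var sum = 0.0f
    var i = 0
    while (i < a.length) {
      sum += a(i) * b(i)
      i += 1
    }
    sum
  }
}
```

With Float features and Double weights from the optimizer, the call would
look like MixedPrecision.dot(features, MixedPrecision.toFloats(weights)).
Narrowing loses precision, so widening the features to Double instead may
be preferable when accuracy matters more than memory.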

You might also be interested in FeatureVector, which is just a wrapper
around Array[Int] that emulates an indicator vector. It supports dot
products, axpy, etc.
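
To make the indicator-vector idea concrete, here is a small sketch of how
such a wrapper could work. This is not the Breeze FeatureVector source; the
class name IndicatorVector and its methods are assumptions for
illustration, and it assumes the stored indices are distinct and in range.

```scala
// Hypothetical sketch of an indicator vector backed by Array[Int].
// Each stored index is treated as a position whose value is 1.0.
final class IndicatorVector(val activeIndices: Array[Int]) {

  /** Dot product with a dense vector: sum the weights at the active indices. */
  def dot(weights: Array[Double]): Double = {
    var sum = 0.0
    var i = 0
    while (i < activeIndices.length) {
      sum += weights(activeIndices(i))
      i += 1
    }
    sum
  }

  /** axpy (y += a * x): for an indicator x this adds `a` at each active index. */
  def axpyInto(a: Double, y: Array[Double]): Unit = {
    var i = 0
    while (i < activeIndices.length) {
      y(activeIndices(i)) += a
      i += 1
    }
  }
}
```

Because only the indices are stored, neither operation multiplies by the
feature values, and memory use scales with the number of active features
rather than the vector dimension.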

-- David


>
>
> Sincerely,
>
> DB Tsai
> -------------------------------------------------------
> My Blog: https://www.dbtsai.com
> LinkedIn: https://www.linkedin.com/in/dbtsai
>
>
> On Mon, May 5, 2014 at 3:06 PM, David Hall <dlwh@cs.berkeley.edu> wrote:
>
> > Lbfgs and other optimizers would not work immediately, as they require
> > vector spaces over double. Otherwise it should work.
> > On May 5, 2014 3:03 PM, "DB Tsai" <dbtsai@stanford.edu> wrote:
> >
> > > Breeze could take any type (Int, Long, Double, and Float) in the matrix
> > > template.
> > >
> > >
> > > Sincerely,
> > >
> > > DB Tsai
> > > -------------------------------------------------------
> > > My Blog: https://www.dbtsai.com
> > > LinkedIn: https://www.linkedin.com/in/dbtsai
> > >
> > >
> > > On Mon, May 5, 2014 at 2:56 PM, Debasish Das
> > > <debasish.das83@gmail.com> wrote:
> > >
> > > > Is this a Breeze issue, or can Breeze take templates on float /
> > > > double?
> > > >
> > > > If Breeze can take templates, then it is a minor fix for
> > > > Vectors.scala, right?
> > > >
> > > > Thanks.
> > > > Deb
> > > >
> > > >
> > > > On Mon, May 5, 2014 at 2:45 PM, DB Tsai <dbtsai@stanford.edu> wrote:
> > > >
> > > > > +1. It would be nice if we could use different types in Vector.
> > > > >
> > > > >
> > > > > Sincerely,
> > > > >
> > > > > DB Tsai
> > > > > -------------------------------------------------------
> > > > > My Blog: https://www.dbtsai.com
> > > > > LinkedIn: https://www.linkedin.com/in/dbtsai
> > > > >
> > > > >
> > > > > On Mon, May 5, 2014 at 2:41 PM, Debasish Das
> > > > > <debasish.das83@gmail.com> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > Why is the mllib Vector using Double as the default?
> > > > > >
> > > > > > /**
> > > > > >  * Represents a numeric vector, whose index type is Int and value
> > > > > >  * type is Double.
> > > > > >  */
> > > > > > trait Vector extends Serializable {
> > > > > >
> > > > > >   /**
> > > > > >    * Size of the vector.
> > > > > >    */
> > > > > >   def size: Int
> > > > > >
> > > > > >   /**
> > > > > >    * Converts the instance to a double array.
> > > > > >    */
> > > > > >   def toArray: Array[Double]
> > > > > >
> > > > > > Don't we need a template on float/double? This will give us
> > > > > > memory savings...
> > > > > >
> > > > > > Thanks.
> > > > > >
> > > > > > Deb
> > > > > >
> > > > >
> > > >
> > >
> >
>
