mahout-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Was the Vector hierarchy ever Serializable?
Date Fri, 25 Jul 2014 22:52:23 GMT

There is a problem with using the Java serialization framework: the
serialization that we do looks at the properties of the vectors rather than
their class. That means the serialization isn't round-trip safe.
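
A minimal sketch of the "extra plumbing" discussed in the quoted thread below, under loud assumptions: SimpleVector and SimpleVectorWritable are hypothetical stand-ins (not Mahout classes) for a non-Serializable vector and a Hadoop-style writable with write(DataOutput) / readFields(DataInput), and the wire format shown is made up for illustration. The idea is a Serializable holder whose writeObject/readObject hooks delegate to the writable, which works because ObjectOutputStream and ObjectInputStream implement DataOutput and DataInput.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Tiny demo entry point: serialize a holder and read it back.
final class SerializableVectorDemo {
    public static void main(String[] args) {
        SimpleVector original = new SimpleVector(new double[] {1.0, 2.5, -3.0});
        SerializableVectorHolder copy =
            SerializableVectorHolder.roundTrip(new SerializableVectorHolder(original));
        System.out.println("elements read back: " + copy.get().values.length);
    }
}

// Hypothetical stand-in for a non-Serializable vector such as DenseVector.
final class SimpleVector {
    final double[] values;
    SimpleVector(double[] values) { this.values = values.clone(); }
}

// Hypothetical stand-in for a *Writable helper. The format (length, then
// doubles) is an assumption for this sketch, not Mahout's actual encoding.
final class SimpleVectorWritable {
    private SimpleVector vector;
    SimpleVectorWritable() {}
    SimpleVectorWritable(SimpleVector vector) { this.vector = vector; }
    SimpleVector get() { return vector; }

    void write(DataOutput out) throws IOException {
        out.writeInt(vector.values.length);
        for (double v : vector.values) out.writeDouble(v);
    }

    void readFields(DataInput in) throws IOException {
        double[] values = new double[in.readInt()];
        for (int i = 0; i < values.length; i++) values[i] = in.readDouble();
        vector = new SimpleVector(values);
    }
}

// Serializable holder: the vector is transient, so default serialization
// skips it and the custom hooks delegate to the writable instead.
final class SerializableVectorHolder implements Serializable {
    private static final long serialVersionUID = 1L;
    private transient SimpleVector vector;

    SerializableVectorHolder(SimpleVector vector) { this.vector = vector; }
    SimpleVector get() { return vector; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        new SimpleVectorWritable(vector).write(out);
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        SimpleVectorWritable writable = new SimpleVectorWritable();
        writable.readFields(in);
        vector = writable.get();
    }

    // Helper for trying the holder in memory; wraps checked exceptions.
    static SerializableVectorHolder roundTrip(SerializableVectorHolder holder) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(holder);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SerializableVectorHolder) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With the real classes, the holder would presumably wrap a Vector and delegate to VectorWritable in the same way. Given the caveat above, the vector read back may not have the same concrete class as the one written, so calling code is probably safest depending only on the Vector interface.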


Sent from my iPhone

> On Jul 25, 2014, at 14:33, Anand Avati <avati@gluster.org> wrote:
> 
> Ivan,
> 
> Yes, you will need some extra (trivial) plumbing, but the meat of efficient
> serialization/deserialization is in those helper classes.
> 
> Thanks
> 
> 
>> On Fri, Jul 25, 2014 at 2:26 PM, Ivan Brusic <ivan@brusic.com> wrote:
>> 
>> Thanks for the quick response. VectorWritable looks like exactly what I
>> need, but it doesn't extend Vector, so there needs to be work done on my
>> part for deeper serialization.
>> 
>> Cheers,
>> 
>> Ivan
>> 
>> 
>>> On Fri, Jul 25, 2014 at 2:13 PM, Anand Avati <avati@gluster.org> wrote:
>>> 
>>> I don't think Vector and Matrix were ever declared Serializable. Please
>>> look at the VectorWritable and MatrixWritable classes in the mrlegacy
>>> module. Both the Spark bindings and H2O bindings use these *Writable
>>> classes for shipping matrices and vectors over the wire. You can even look at
>>> https://github.com/avati/mahout/blob/MAHOUT-1500/h2o/src/main/java/org/apache/mahout/h2obindings/drm/H2OBCast.java
>>> as a reference for how to do it.
>>> 
>>> Thanks
>>> 
>>> 
>>>> On Fri, Jul 25, 2014 at 2:04 PM, Ivan Brusic <ivan@brusic.com> wrote:
>>>> 
>>>> I am in the midst of upgrading our Mahout library in order to take
>>>> advantage of all the excellent recent additions.
>>>> 
>>>> As far as I can tell, the library was based off a snapshot of 0.5. The
>>>> code does not use any of the Mahout algorithms, just a few of the data
>>>> structures such as DenseVector. The existing code builds a Java object
>>>> which is then serialized and distributed. After upgrading to 0.9, I
>>>> noticed I was no longer able to deserialize objects since DenseVector is
>>>> not Serializable. After inspecting the old jar, it seems like
>>>> AbstractVector was declared Serializable.
>>>> 
>>>> So either someone at my company added serialization to the Mahout
>>>> classes or they were Serializable at some point. I am assuming the
>>>> former. Is this the case? I looked at the commits and at no point was
>>>> anything Serializable.
>>>> 
>>>> Since the classes are not Serializable and no longer inherit from
>>>> Writable, is there an existing strategy to output Mahout structures?
>>>> Would hate to write wrapper classes or once again modify the source.
>>>> 
>>>> Cheers,
>>>> 
>>>> Ivan
>> 
