samoa-dev mailing list archives

From Matthieu Morel <mmo...@apache.org>
Subject Re: New Instances
Date Mon, 19 Jan 2015 17:17:29 GMT
Is there a pull request somewhere?

One thing I want, for example, is to set the number of classes for an
instance dynamically, as we discover those classes in the stream.
Hopefully the new instances will allow that.
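
A minimal sketch of that idea, in Java. DynamicClassLabels is a
hypothetical helper and not an existing SAMOA or MOA class; it only
illustrates growing the set of class labels as they appear in the stream.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of SAMOA or MOA: the number of classes
// grows whenever a previously unseen label arrives.
class DynamicClassLabels {
    private final Map<String, Integer> indexByLabel = new HashMap<>();
    private final List<String> labels = new ArrayList<>();

    // Returns the index of the label, registering it first if it is new.
    int indexOf(String label) {
        Integer index = indexByLabel.get(label);
        if (index == null) {
            index = labels.size();
            indexByLabel.put(label, index);
            labels.add(label);
        }
        return index;
    }

    // Number of distinct class labels seen so far.
    int numClasses() {
        return labels.size();
    }

    // Label registered at the given index.
    String label(int index) {
        return labels.get(index);
    }
}
```

A learner would call indexOf(label) on every labelled instance and resize
its per-class statistics whenever numClasses() grows.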

Thanks,

Matthieu

On Wed, Jan 14, 2015 at 2:34 AM, Albert Bifet <abifet@waikato.ac.nz> wrote:
> Thanks Gianmarco,
>
> 1/ Range records which attributes are inputs and which are outputs.
> Each instance has an InstancesHeader field that contains an
> AttributesInformation object.
>
> 2/ If there is no metadata, then all attributes are numeric, right?
> This seems reasonable.
>
> - InstancesHeader contains an InstanceInformation object. We may use
> InstanceInformation instead of InstancesHeader.
>
> - Yes, AttributesInformation can be modified at runtime, by adding
> attributes and attribute values.
>
> Cheers,
>
> Albert
>
> On Tue, Jan 13, 2015 at 9:18 PM, Gianmarco De Francisci Morales <
> gdfm@apache.org> wrote:
>
>> Thanks Albert.
>>
>> I have a couple of questions.
>>
>> 1/ how do we distinguish between input and output attributes?
>> In particular, let's take as an example the default single-label
>> classification.
>> I guess that is the role of Range.
>> However, do we have to serialize it with every instance we send?
>>
>> 2/ to distinguish between numeric and categorical we need some metadata,
>> which I guess goes into InstancesHeader.
>> I am fine with keeping it also for compatibility with MOA, and we might use
>> it if we have access to it.
>> However, I would prefer algorithms not to rely on it, and consider the
>> presence of metadata optional.
>>
>> Some other points:
>> - what's the difference between InstanceInformation and InstancesHeader?
>> - can the AttributesInformation be modified at runtime? Or is it statically
>> set for the whole duration of the algorithm?
>>
>> Cheers,
>>
>> --
>> Gianmarco
>>
>> On 10 January 2015 at 04:26, Albert Bifet <abifet@apache.org> wrote:
>>
>> > Hi all,
>> >
>> > This is a short explanation of the new instances of SAMOA.
>> >
>> >
>> > https://github.com/abifet/moa/tree/master/moa/src/main/java/com/yahoo/labs/samoa/instances
>> >
>> > Instances will be much simpler than the current implementation. They
>> > can be dense or sparse, and they contain only one array (or two for
>> > sparse) with all the attribute values. In the current implementation
>> > we have two arrays, one for input values and another for output values.
>> >
>> > There are two main changes:
>> >
>> > 1/ All instances are going to be multi-label, which means they have
>> > input and output attributes, and we can access their values with
>> > getInputValue(i) and getOutputValue(i).
>> >
>> > 2/ Attributes are numeric by default, so we only keep information
>> > about discrete attributes (their values). For example, if we have one
>> > million numeric attributes, we will not need to store attribute
>> > information for any of them.
>> >
>> > Basically, we have:
>> >
>> > - Instance: interface
>> > - MultiLabelInstance: interface (empty interface that extends Instance)
>> > - InstanceImpl implements MultiLabelInstance: implementation of
>> > Instance. Contains
>> >     - InstanceData
>> >     - InstancesHeader
>> > - DenseInstance extends InstanceImpl
>> > - SparseInstance extends InstanceImpl
>> >
>> > - Instances: a list of instances and an InstanceInformation object
>> > - InstancesHeader extends Instances
>> >
>> > - InstanceData: interface
>> > - DenseInstanceData implements InstanceData
>> > - SparseInstanceData implements InstanceData
>> >
>> > - InstanceInformation contains the name, the attribute information and
>> > the attributes to predict.
>> > - AttributesInformation contains two lists of Attributes (indices and
>> > values) for non-numeric attributes. Attributes are numeric by
>> > default.
>> > - Range: attributes to predict
>> >
>> > Cheers,
>> >
>> > Albert
>> >
>>
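
The single-array layout from the original message can be sketched as
follows. SingleArrayInstance is a hypothetical class written for
illustration and is not the SAMOA InstanceImpl; the outputIndices array is
an assumption standing in for the role the thread assigns to Range.

```java
// Hypothetical sketch, not the SAMOA classes: one dense array holds all
// attribute values, and a list of output indices marks what to predict.
class SingleArrayInstance {
    private final double[] values;
    private final int[] outputIndices;
    private final int[] inputIndices;

    // Assumes outputIndices are distinct and within the bounds of values.
    SingleArrayInstance(double[] values, int[] outputIndices) {
        this.values = values.clone();
        this.outputIndices = outputIndices.clone();
        boolean[] isOutput = new boolean[values.length];
        for (int index : outputIndices) {
            isOutput[index] = true;
        }
        // Every attribute that is not an output is treated as an input.
        this.inputIndices = new int[values.length - outputIndices.length];
        int next = 0;
        for (int i = 0; i < values.length; i++) {
            if (!isOutput[i]) {
                this.inputIndices[next++] = i;
            }
        }
    }

    double getInputValue(int i) {
        return values[inputIndices[i]];
    }

    double getOutputValue(int i) {
        return values[outputIndices[i]];
    }

    int numInputAttributes() {
        return inputIndices.length;
    }

    int numOutputAttributes() {
        return outputIndices.length;
    }
}
```

In this sketch the input/output split lives outside the value array, so it
could be shared by all instances of a stream rather than sent with each one.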

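The "numeric by default" rule discussed in the thread can be illustrated
in the same way. SparseAttributeInfo is a hypothetical class, not the
SAMOA AttributesInformation; it keeps metadata only for nominal attributes
and lets values be added at runtime.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: metadata is stored only for nominal attributes,
// so any attribute without an entry is treated as numeric.
class SparseAttributeInfo {
    private final Map<Integer, List<String>> nominalValues = new HashMap<>();

    // Adds a value to a nominal attribute, creating its entry if needed.
    void addValue(int attributeIndex, String value) {
        List<String> values = nominalValues.get(attributeIndex);
        if (values == null) {
            values = new ArrayList<>();
            nominalValues.put(attributeIndex, values);
        }
        if (!values.contains(value)) {
            values.add(value);
        }
    }

    // Attributes without stored metadata are numeric by default.
    boolean isNumeric(int attributeIndex) {
        return !nominalValues.containsKey(attributeIndex);
    }

    // Known values of a nominal attribute; empty for numeric attributes.
    List<String> valuesOf(int attributeIndex) {
        List<String> values = nominalValues.get(attributeIndex);
        if (values == null) {
            return Collections.emptyList();
        }
        return Collections.unmodifiableList(values);
    }

    // Number of attributes for which metadata is actually stored.
    int storedEntries() {
        return nominalValues.size();
    }
}
```

With one million numeric attributes and a single nominal one, this sketch
would hold exactly one map entry.
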