mahout-user mailing list archives

From Angel Luis Scull <ascu...@facinf.uho.edu.cu>
Subject Re: Using clustering output for classification
Date Tue, 06 May 2014 15:02:29 GMT
I will check it, thanks.
On 06/05/14 09:32, Ted Dunning wrote:
> I think Peng is right.  It might help to amplify a bit.
>
> The idea is that in addition to the other predictor variables that you
> have, there is also one predictor variable per cluster.  Whichever cluster
> is closest to the training example is turned on.
>
> On Wikipedia, the term used is "one hot" encoding.
>
> http://en.wikipedia.org/wiki/One-hot
>
> On Tue, May 6, 2014 at 4:02 AM, Peng Zhang <pzhang.xjtu@gmail.com> wrote:
>
>> Angel,
>>
>> I think Ted means that each example falls into exactly one cluster. If you
>> have k clusters, each example gets one of the encodings 1, 2, …, k.
>>
>> On May 6, 2014, at 5:27 AM, Angel Luis Scull <ascullp@facinf.uho.edu.cu>
>> wrote:
>>
>>> What do you mean by "get a 1 of n encoding..."?
>>>
>>> On 05/05/14 16:59, Ted Dunning wrote:
>>>> In theory, what you need to do is take your training data for your
>>>> classifier and run your clustering to get a 1 of n encoding of the
>>>> cluster for each example in the training data.
>>>>
>>>> Then train the classifier using original and new features.
>>>>
>>>> Does that help?  I have a simple demo of the process in R that I do if
>>>> that would help.
>>>>
>>>> On Mon, May 5, 2014 at 5:53 PM, Angel Luis Scull
>>>> <ascullp@facinf.uho.edu.cu> wrote:
>>>>
>>>>> Hello to all
>>>>>
>>>>> I have a document dataset to which I applied k-means and obtained
>>>>> clusters. Now I want to use the association between the vectors and
>>>>> the clusters as input for a classification algorithm.
>>>>>
>>>>> How can I achieve that?
>>>>>
>>>>> thanks in advance
>>>>>
>>
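
The approach described in the thread (assign each training example to its closest cluster, encode that assignment as one indicator variable per cluster, and append it to the original features) can be sketched roughly as follows. This is an illustrative Python sketch only, not the R demo mentioned above and not Mahout code. The function names, the toy centroids, and the toy examples are all hypothetical, and Euclidean distance is assumed as the closeness measure. In practice the centroids would come from your own k-means run.

```python
import math

def nearest_cluster(x, centroids):
    """Return the index of the centroid closest to x (Euclidean distance assumed)."""
    distances = [math.dist(x, c) for c in centroids]
    return distances.index(min(distances))

def one_hot(index, k):
    """Return a length-k list with 1.0 at position `index` and 0.0 elsewhere."""
    return [1.0 if i == index else 0.0 for i in range(k)]

def augment(examples, centroids):
    """Append the one-hot cluster encoding to each example's original features."""
    k = len(centroids)
    return [list(x) + one_hot(nearest_cluster(x, centroids), k) for x in examples]

# Hypothetical toy data: two 2-D centroids and three training examples.
centroids = [(0.0, 0.0), (10.0, 10.0)]
examples = [(1.0, 0.5), (9.0, 11.0), (0.2, 0.1)]

# Each augmented row holds the original features followed by k indicator values.
augmented = augment(examples, centroids)
for row in augmented:
    print(row)
```

The augmented rows (original features plus the k cluster indicators) are what would then be passed, together with the class labels, to whichever classifier you choose. At prediction time, new examples need the same encoding, using the same centroids.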

