spark-user mailing list archives

From Sean Owen <so...@cloudera.com>
Subject Re: MLlib: word2vec - words vectors into feature vector
Date Fri, 07 Oct 2016 08:06:22 GMT
It's just the average of the word vectors, for all words in the text.
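
To make the averaging concrete, here is a minimal sketch in plain Python. It is not Spark's code; the function name, the `dim` parameter, and the toy vectors are made up for illustration. The handling of words with no vector is an assumption: they add nothing to the sum but still count toward the divisor.

```python
from typing import Dict, List, Sequence


def average_word_vectors(
    words: Sequence[str],
    word_vectors: Dict[str, List[float]],
    dim: int,
) -> List[float]:
    """Return the element-wise mean of the vectors for `words`."""
    if not words:
        # Empty text: return an all-zero feature vector.
        return [0.0] * dim
    total = [0.0] * dim
    for word in words:
        vec = word_vectors.get(word)
        if vec is None:
            continue  # assumption: unknown words add nothing to the sum
        for i, value in enumerate(vec):
            total[i] += value
    # Assumption: divide by the number of words in the text,
    # including any that had no vector.
    return [component / len(words) for component in total]


# Toy 2-dimensional vectors, invented for this example only.
toy_vectors = {
    "Hi": [1.0, 3.0],
    "I": [3.0, 1.0],
    "heard": [2.0, 2.0],
}
feature = average_word_vectors(["Hi", "I", "heard"], toy_vectors, dim=2)
print(feature)  # [2.0, 2.0]
```

Each row of text ends up with one vector of the same dimension as the word vectors, which is why the RESULT column in the question has a single vector per row.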

On Fri, Oct 7, 2016 at 9:04 AM kaching <waved@o2.pl> wrote:

> Hi. How exactly does the MLlib implementation of word2vec convert word
> vectors into one feature vector per row?
>
>            TEXT
> [Hi, I, heard, ab..]
> [I, wish, Java, c..]
> [Logistic, regres.]
>
>              | word2vec
>
>              V
>
> WORD          VECTOR
> heard         [0.14950960874557...|
> are           [-0.1639076173305...|
> neat          [0.13949351012706...|
> classes       [0.03703496977686...|
> I             [-0.0189154129475...|
> regression    [0.15298652648925...|
> Logistic      [-0.1270201653242...|
> Spark         [-0.0535793155431...|
> could         [0.12216471135616...|
> use           [0.08246973901987...|
> Hi            [0.16548289358615...|
> models        [-0.0568316541612...|
> case          [0.11626788973808...|
> about         [-0.1500445008277...|
> Java          [-0.0407485179603...|
> wish          [0.11882393807172...|
>
>                  | HOW?
>
>                  V
>
> TEXT                      RESULT
> [Hi, I, heard, ab...]     [0.01849065460264...|
> [I, wish, Java, c...]     [0.05958533100783...|
> [Logistic, regres...]     [-0.0110558800399...|
>
> Is there a way to change this default method?
