lucene-java-user mailing list archives

From Koji Sekiguchi <k...@r.email.ne.jp>
Subject Re: [ANN] word2vec for Lucene
Date Fri, 21 Nov 2014 01:11:40 GMT
Hi Joseph,

Thank you for asking. If you want to do it interactively,
it won't work well in practice because training takes several minutes.

If batch processing is acceptable, the feature can be implemented,
but I haven't done it yet. I have an open ticket for that:

accept filter query
https://github.com/kojisekig/word2vec-lucene/issues/2

Thanks,

Koji

(2014/11/21 8:22), Joseph Obernberger wrote:
> Hi Koji - is it possible to execute word2vec on a subset of documents from
> Solr?  -  ie could I run a query, get back the top n results and pass only
> those to word2vec?
> Will this work with Solr Cloud?
>
> Thank you!
>
> -Joe
>
> On Thu, Nov 20, 2014 at 12:18 PM, Paul Libbrecht <paul@hoplahup.net> wrote:
>
>> As far as I could tell, word2vec seems more mathematical, which is rather
>> nice.
>> At least I see more transparent math in the web-page.
>> Maybe this helps a bit?
>>
>> SemanticVectors has always been rather pleasant for the LSI/LSA-like approach,
>> but precisely this is mathematically opaque.
>> Maybe it's more a question of presentation.
>>
>> Paul
>>
>>
>> On 20 nov. 2014, at 16:24, Koji Sekiguchi <koji@r.email.ne.jp> wrote:
>>
>>> Hi Paul,
>>>
>>> I cannot compare it to SemanticVectors as I don't know SemanticVectors.
>>> But word vectors that are produced by word2vec have interesting properties.
>>>
>>> Here is the description of the original word2vec web site:
>>>
>>> https://code.google.com/p/word2vec/#Interesting_properties_of_the_word_vectors
>>>
>>> Interesting properties of the word vectors
>>> It was recently shown that the word vectors capture many linguistic
>>> regularities, for example vector operations
>>> vector('Paris') - vector('France') + vector('Italy') results in a vector
>>> that is very close to vector('Rome'), and
>>> vector('king') - vector('man') + vector('woman') is close to vector('queen')
>>>
>>> Thanks,
>>>
>>> Koji
>>>
>>>
>>> (2014/11/20 20:01), Paul Libbrecht wrote:
>>>> Hello Koji,
>>>>
>>>> how would you compare that to SemanticVectors?
>>>>
>>>> paul
>>>>
>>>> On 20 nov. 2014, at 10:10, Koji Sekiguchi <koji@r.email.ne.jp> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> It's my pleasure to share that I have an interesting tool
>>>>> "word2vec for Lucene" available at
>>>>> https://github.com/kojisekig/word2vec-lucene .
>>>>>
>>>>> As you can imagine, you can use "word2vec for Lucene" to extract
>>>>> word vectors from Lucene index.
>>>>>
>>>>> Thank you,
>>>>>
>>>>> Koji
>>>>> --
>>>>> http://soleami.com/blog/comparing-document-classification-functions-of-lucene-and-mahout.html
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>
>>>>
>>>
>>>
>>> --
>>> http://soleami.com/blog/comparing-document-classification-functions-of-lucene-and-mahout.html
>>>
>>>
>>
>>
>


-- 
http://soleami.com/blog/comparing-document-classification-functions-of-lucene-and-mahout.html
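
The vector arithmetic quoted in this thread from the word2vec site (vector('king') - vector('man') + vector('woman') is close to vector('queen')) can be illustrated with a small sketch. This is not code from word2vec or from "word2vec for Lucene": the 3-dimensional vectors, the VECTORS table, and the cosine and analogy helpers are all invented here for illustration. Real word vectors are learned from a corpus and usually have far more dimensions.

```python
import math

# Toy vectors made up for this example (an assumption, not real word2vec output).
VECTORS = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.5, 0.5, 0.5],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c, vectors=VECTORS):
    """Return the word closest to vector(a) - vector(b) + vector(c).

    The three input words are excluded from the candidates, which is the
    usual convention when evaluating this kind of analogy.
    """
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


print(analogy("king", "man", "woman"))  # prints "queen" for these toy vectors
```

With vectors trained on a real corpus, the same nearest-neighbour search by cosine similarity would run over the whole vocabulary rather than five hand-picked words.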


