lucene-solr-user mailing list archives

From Martin Rödig ...@shi-gmbh.com>
Subject AW: Structured fields and termVectors
Date Tue, 17 May 2011 06:29:32 GMT
Hello,

I think you can use a single field that is both stored and indexed. At index time you can use a KeywordTokenizer
and a filter to reduce the value to the path (without the "M"). The value that is displayed (the stored field) is always
the original value, that is, the value as it came in: it is stored before the tokenizer
and filters run. The indexed values (and term vectors) are the terms produced after the tokenizer
and filters, so the term vectors contain only the reduced path.
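
As a rough illustration of that split between stored and indexed values, here is a small Python sketch. It is not Solr code: the function names and the regular expression are hypothetical, and it assumes each path line is one value of a multi-valued field and that the status marker is a single capital letter followed by whitespace.

```python
import re

# Hypothetical stand-ins for an index-time analysis chain; this is not Solr's API.
# Assumption: the status marker is one capital letter followed by whitespace.
STATUS_PREFIX = re.compile(r"^\s*[A-Z]\s+")


def keyword_tokenize(value):
    """Emit the whole field value as a single token, as a keyword tokenizer would."""
    return [value]


def strip_status(tokens):
    """Filter step: drop the leading status letter so only the path remains."""
    return [STATUS_PREFIX.sub("", token) for token in tokens]


def index_field(raw_values):
    """Return (stored, indexed) for one multi-valued field."""
    # Stored values are the untouched input, kept for display.
    stored = list(raw_values)
    # Indexed terms (and so the term vectors) come out of the analysis chain.
    indexed = [
        term
        for value in raw_values
        for term in strip_status(keyword_tokenize(value))
    ]
    return stored, indexed


stored, indexed = index_field([
    "   M /trunk/home/.Aquamacs/Preferences.el",
    "   M /trunk/www/wynton-start-page.html",
])
```

Here `stored` still carries the "M" for display, while `indexed` holds only the paths, which is what a term-vector-based comparison would see.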
I hope this helps.

Best regards
M.Sc. Dipl.-Inf. (FH) Martin Rödig
 
SHI Elektronische Medien GmbH
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
CURRENT - NEW - AVAILABLE NOW
Solr/Lucene training, 28 - 30 June in Zurich
Vienna 19 - 21 July 2011 | Munich 27 - 29 September and 15 - 17 November 2011

As the first certified training partner of Lucid Imagination in
Germany, Austria and Switzerland, SHI now offers
German-language Solr training courses.
Further information: www.shi-gmbh.com/services/solr-training
Note: The number of places is limited!
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Postal address: Watzmannstr. 23, 86316 Friedberg
Visiting address: Curt-Frenzel-Str. 12, 86167 Augsburg

Internet: http://www.shi-gmbh.com
Registration court: Augsburg, HRB 17382
Managing director: Peter Spiske
Tax number: 103/137/30412

-----Original Message-----
From: Jack Repenning [mailto:jrepenning@collab.net]
Sent: Tuesday, 17 May 2011 03:30
To: solr-user@lucene.apache.org
Subject: Structured fields and termVectors

How does MoreLikeThis use termVectors?

My documents (full sample at the bottom) frequently include lines more or less like this

   M /trunk/home/.Aquamacs/Preferences.el

I want to "MoreLikeThis" based on the full path, but not the "M". But what I actually display
as a search result should include "M" (should look pretty much like the sample, below).

If I define a field to include that whole line, I can certainly search in ways that skip the
"M", but how do I control the termVector and MoreLikeThis?  I think the answer is not to termVector
the line as shown, but rather to index these lines twice, once whole (which is also copyFielded
into the display text), and a second time with just the path (and termVectors="true"). Which
is OK, but since these lines will represent most of my data, double-indexing seems to double
my storage, which is ... oh, well ... not entirely optimal.

So is there some way I can index the full line, once, with "M" and path, and tell the termVector
to include the whole path and nothing but the path?



-==-
Jack Repenning
Technologist
Codesion Business Unit
CollabNet, Inc.
8000 Marina Boulevard, Suite 600
Brisbane, California 94005
office: +1 650.228.2562
twitter: http://twitter.com/jrep



------------------------------------------------------------------------
r3580 | jack | 2011-04-26 13:55:46 -0700 (Tue, 26 Apr 2011) | 1 line
Changed paths:
   M /trunk/home/.Aquamacs
   M /trunk/home/.Aquamacs/Preferences.el
   M /trunk/www/wynton-start-page.html

simplify the hijack of Aquamacs prefs storage, aufl
------------------------------------------------------------------------

