lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Very basic questions: Indexing text
Date Tue, 29 Jun 2010 00:22:56 GMT
Try adding &hl.fl=text
to specify your highlight field. I don't understand why you're only
getting the ID field back, though. Do note that the highlighting
section comes after the docs in the response and is related to
them by ID.
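
As a rough sketch of the suggestion above, the request could be built like this. The helper name solr_hl_url is hypothetical, the host and port are taken from the quoted message, and the field name "text" is an assumption about the schema (highlighting only works on a stored field). The query term is not URL-encoded, so this only suits simple single-word queries.

```shell
# Hypothetical helper: print a select URL that enables highlighting
# on a single field. Assumes Solr at localhost:8983 (as in the quoted
# message) and a stored field, "text" by default.
solr_hl_url() {
  query=$1
  field=${2:-text}
  printf 'http://localhost:8983/solr/select?q=%s&fl=id,score&hl=true&hl.fl=%s\n' \
    "$query" "$field"
}

# Usage (needs a running Solr instance):
#   curl "$(solr_hl_url testing text)"
```

If it works, the snippets should show up in a separate "highlighting" section after the result docs, keyed by document ID, rather than inside each doc element.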

Try a (non-highlighting) query of just *:* to verify that you're
pointing at the index you think you are. It's possible that
you've modified a different index with SolrJ than the one your
web server is pointing at.
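
One way to run that check, sketched under the assumption that Solr is at the default example URL from the quoted message. The helper name solr_count_url is made up for illustration; rows=0 just keeps the response small so the numFound value is easy to spot.

```shell
# Hypothetical helper: print a match-all query URL for a given Solr
# base URL. rows=0 suppresses the docs so only the total count
# (numFound) comes back.
solr_count_url() {
  base=${1:-http://localhost:8983/solr}
  printf '%s/select?q=*:*&rows=0\n' "$base"
}

# Usage (needs a running Solr instance):
#   curl "$(solr_count_url)"
```

If numFound doesn't match the number of documents you added, the web server is probably reading a different index (or hasn't seen the changes yet).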

Also, SOLR has no way of knowing you've modified your index
with SolrJ, so it may not automatically reopen an
IndexReader, and your recent changes may not be visible
until you force the SOLR reader to reopen.
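
One common way to ask Solr to open a new searcher is to send it a commit. A minimal sketch, assuming the default example URL from the quoted message and the standard XML update handler at /update; the function name solr_commit is hypothetical, and whether this picks up externally written changes in your setup is something to verify rather than take as given.

```shell
# Hypothetical helper: POST an XML <commit/> message to the update
# handler of the given Solr base URL (default: the example instance).
solr_commit() {
  base=${1:-http://localhost:8983/solr}
  curl -s "$base/update" -H 'Content-Type: text/xml' --data-binary '<commit/>'
}

# Usage (needs a running Solr instance):
#   solr_commit
```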

HTH
Erick

On Mon, Jun 28, 2010 at 6:49 PM, Peter Spam <pspam@mac.com> wrote:

> On Jun 28, 2010, at 2:00 PM, Ahmet Arslan wrote:
>
> >> 1) I can get my docs in the index, but when I search, it
> >> returns the entire document.  I'd love to have it only
> >> return the line (or two) around the search term.
> >
> > Solr can generate Google-like snippets as you describe.
> > http://wiki.apache.org/solr/HighlightingParameters
>
> Here's how I commit my documents:
>
> J=0;
> for i in `find . -name \*.txt`; do
>        (( J++ ))
>        curl "http://localhost:8983/solr/update/extract?literal.id=doc$J" \
>             -F "myfile=@$i";
> done;
>
> echo "------------- Committing"
> curl "http://localhost:8983/solr/update/extract?commit=true"
>
>
> Then, I try to query using
> http://localhost:8983/solr/select?rows=10&start=0&fl=*,score&hl=true&q=testing
> but I only get back the document ID rather than the snippet:
>
> <doc>
> <float name="score">0.05030759</float>
> <arr name="content_type">
> <str>text/plain</str>
> </arr>
> <str name="id">doc16</str>
> </doc>
>
> I'm using the schema.xml from the "lucid imagination: Indexing text and
> html files" tutorial.
>
>
>
> -Pete
>
