mahout-user mailing list archives

From Frank Scholten <fr...@frankscholten.nl>
Subject Re: lucene2seq error: field does not exist in the index
Date Wed, 16 Apr 2014 17:52:45 GMT
Hi Terry,

What happens when you make the 'body' field indexed in your schema?

LuceneIndexHelper checks the field using an IndexSearcher, so it might be
that the field has to be indexed as well as stored. That would be a bug,
because lucene2seq is designed to load stored fields.
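
To make the test concrete, here is a minimal sketch of the schema change I
mean. It assumes the 'body' definition stays exactly as you posted it apart
from the indexed attribute, and you would need to reindex your documents
before the change shows up in the index:

```xml
<!-- 'body' as posted, with only indexed flipped to "true" for this test -->
<field name="body" type="string" stored="true" indexed="true"/>
```

If '-f body' works after that, it would point to the field check rather than
to your setup.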

Cheers,

Frank


On Fri, Apr 11, 2014 at 5:33 AM, Terry Blankers <terry@amritanet.com> wrote:

> Hi All, I'm very new to lucene2seq, so I'm not sure if it's just user
> error, but I'm seeing some unexpected behavior when running lucene2seq
> against my Solr index (4.7.1). I've tried both 0.9 and the trunk build of
> Mahout. (BTW, I have been able to run the Reuters example successfully as
> a test baseline.)
>
>
> Here's the command I'm running:
>
>    $MAHOUT_HOME/bin/mahout lucene2seq \
>      -i /home/ec2-user/solr/solr-data/solrindex/index \
>      -o solr/sequence -id key_sha1hex -f body \
>      -xm sequential -q topics:diabetes -n 500
>
>
> Excerpts from my solr schema:
>
>    <field name="content" type="text" stored="false" indexed="true" multiValued="true"/>
>    <field name="body" type="string" stored="true" indexed="false"/>
>
>    <!-- Use the indexed/un-stored "content" field for searching -->
>    <copyField source="body" dest="content" />
>    <!-- field for the QueryParser to use when an explicit fieldname is absent -->
>    <defaultSearchField>content</defaultSearchField>
>
>
>
> When I use SolrAdmin and specify fl=body, the search handler returns the
> 'body' field with data as expected. Yet I get the following error when
> running lucene2seq with '-f body':
>
>    IllegalArgumentException: Field 'body' does not exist in the index
>
>
>
> And if I specify '-f content', lucene2seq runs without errors or warnings,
> but seqdumper output shows no values for any key:
>
>    Key class: class org.apache.hadoop.io.Text Value Class: class org.apache.hadoop.io.Text
>    Key: 96C4C76CF9D7449C724CA77CB8F650EAFD33E31C: Value:
>    Key: D6842B81B8D09733B50BEDB4767C2A5C49E43B20: Value:
>    Key: 61CB95FEE2C6BF0AC6E8A1F7738338CA36F42264: Value:
>    Key: 0F9903B72A7C9F0373A5171403B3AAEB291B16E1: Value:
>
>
> Can anyone give me any suggestions as to how to track down what might be
> happening here?
>
> Many thanks,
>
> Terry
