lucene-solr-user mailing list archives

From Alexey Serba <ase...@gmail.com>
Subject Re: DataImportHandler dynamic fields clarification
Date Wed, 13 Oct 2010 20:37:51 GMT
Harry, could you please file a JIRA issue for this? I'll address it in
a patch. I fixed a related issue (SOLR-2102), and I think this one is
pretty similar.

> Interesting, I was under the impression that case does not matter.
>
> From http://wiki.apache.org/solr/DataImportHandler#A_shorter_data-config :
> "It is possible to totally avoid the field entries in entities if the names
> of the fields are same (case does not matter) as those in Solr schema"
>
Yeah, case does not matter only for explicit mappings of SQL columns to
Solr fields. The reason is that DIH populates its hash map for
case-insensitive matching only for explicit mappings.
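
To illustrate the point, here is a rough Python model of that lookup.
It is not DIH source code: the function name, the "title" field and the
two tables are made up for the example, and I'm assuming the wildcard
is compared with the column name exactly as the driver returns it.

```python
from fnmatch import fnmatchcase

# Hypothetical model of the behaviour described above, NOT actual DIH code.
EXPLICIT_FIELDS = ["id", "title"]   # assumed explicitly declared field names
DYNAMIC_PATTERNS = ["column_*"]     # dynamicField wildcard from this thread

# Lower-cased lookup table, built only from the explicit names.
LOWER_NAME_TO_FIELD = {name.lower(): name for name in EXPLICIT_FIELDS}


def resolve_field(column):
    """Return the target field name for a result-set column, or None."""
    # Explicit names: matched case-insensitively through the hash map.
    field = LOWER_NAME_TO_FIELD.get(column.lower())
    if field is not None:
        return field
    # Dynamic patterns: case-sensitive comparison, no lower-cased table.
    for pattern in DYNAMIC_PATTERNS:
        if fnmatchcase(column, pattern):
            return column
    return None


if __name__ == "__main__":
    for col in ("TITLE", "COLUMN_1", "column_1"):
        print(col, "->", resolve_field(col))
```

In this model "TITLE" resolves to "title", the upper-case "COLUMN_1"
that Oracle returns matches nothing, and "column_1" matches the
wildcard.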

You can also work around the upper-case column names in Oracle by
aliasing each column with a quoted lower-case name in the SQL query:
=========================
data-config.xml
<entity name="item" query="select column_1 as &quot;column_1&quot;,
column_100 as &quot;column_100&quot; from wide_table">
</entity>

schema.xml
<dynamicField name="column_*"  type="string"  indexed="true"  stored="true"
multiValued="true" />
=========================

HTH,
Alexey


On Thu, Sep 30, 2010 at 9:10 PM, harrysmith <harrysmithwla@gmail.com> wrote:
>
>>
>>Two things. First, are your DB columns uppercase? That would affect the output.
>>
>>
>
> Interesting, I was under the impression that case does not matter.
>
> From http://wiki.apache.org/solr/DataImportHandler#A_shorter_data-config :
> "It is possible to totally avoid the field entries in entities if the names
> of the fields are same (case does not matter) as those in Solr schema"
>
> I confirmed that matching the schema.xml field case to the database table is
> needed for dynamic fields, and the wiki statement above is incorrect, or at
> the very least confusing, possibly a bug.
>
> My database is Oracle 10g and the column names have been created in all
> uppercase in the database.
>
> In Oracle:
> Table name: wide_table
> Column names: COLUMN_1 ... COLUMN_100 (yes, uppercase)
>
> Please see following scenarios and results I found:
>
> data-config.xml
> <entity name="item" query="select column_1,column_100 from wide_table">
> <field column="column_100" name="id"/>
> </entity>
>
> schema.xml
> <dynamicField name="column_*"  type="string"  indexed="true"  stored="true"
> multiValued="true" />
>
> Result:
> Nothing Imported
>
> =========
>
> data-config.xml
> <entity name="item" query="select COLUMN_1,COLUMN_100 from wide_table">
> <field column="column_100" name="id"/>
> </entity>
>
> schema.xml
> <dynamicField name="column_*"  type="string"  indexed="true"  stored="true"
> multiValued="true" />
>
> Result:
> Note query column names changed to uppercase.
> Nothing Imported
>
> =========
>
>
> data-config.xml
> <entity name="item" query="select column_1,column_100 from wide_table">
> <field column="COLUMN_100" name="id"/>
> </entity>
>
> schema.xml
> <dynamicField name="column_*"  type="string"  indexed="true"  stored="true"
> multiValued="true" />
>
> Result:
> Note ONLY the field entry was changed to caps
>
> All records imported, with only COLUMN_100 id field.
>
> ============
>
> data-config.xml
> <entity name="item" query="select column_1,column_100 from wide_table">
> <field column="COLUMN_100" name="id"/>
> </entity>
>
> schema.xml
> <dynamicField name="COLUMN_*"  type="string"  indexed="true"  stored="true"
> multiValued="true" />
>
> Result:
> Note BOTH the field entry in data-config.xml and the dynamicField
> wildcard in schema.xml were changed to caps
>
> All records imported, with all fields specified. This is the behavior
> desired.
>
> =============
>
>>
>>Second, what does your db-data-config.xml look like?
>>
>>
>
> The relevant data-config.xml is as follows:
>
> <document name="">
> <entity name="item" query="select COLUMN_1,COLUMN_100 from wide_table">
>  <field column="COLUMN_100" name="id"/>
> </entity>
> </document>
>
> Ideally, I would rather have the query be 'select * from wide_table' with
> the fields being dynamically matched by the column name from the
> dynamicField wildcard from the schema.xml.
>
> <dynamicField name="COLUMN_*"  type="string"  indexed="true" stored="true"/>
>
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/DataImportHandler-dynamic-fields-clarification-tp1606159p1609578.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
