lucene-solr-user mailing list archives

From Alexey Serba <ase...@gmail.com>
Subject Re: DIH and multivariable fields problems
Date Tue, 10 Aug 2010 17:39:52 GMT
> Have others successfully imported dynamic multivalued fields in a
> child entity using the DataImportHandler via the child entity returning
> multiple records through a RDBMS?
Yes, it's working ok with static fields.

I didn't even know it was possible to use variables in field names
("dynamic" names) in a DIH configuration. This use case is quite
unusual.

> This is increasingly looking like a bug. To recap, I am trying to use
> the DIH to import multivalued dynamic fields and using a variable to name
> that field.
I'm not an expert in the DIH source code, but there seems to be special
handling of "dynamic" field names that skips field type detection (and
with it the multiValued attribute). Specifically, there is a conditional
jump ("continue") over the field type detection code when the field name
is "dynamic" (see DataImporter:initEntity). I guess the reason for this
behavior is that you can't determine the field type from a dynamic field
name ("${variable}_s") at that point (configuration parsing). I wonder
whether it's possible to determine field types at runtime instead (once
the actual field name, e.g. "title_s", is resolved).
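
To illustrate what I mean, here is a rough model of that behavior. It is
written in Python, not taken from the DIH Java source, and every name in it
(SCHEMA, init_fields, add_value, import_rows) is made up for illustration.
It only assumes that a field whose multiValued flag was never detected ends
up keeping a single value, which matches the symptom reported below.

```python
import fnmatch
import re

# Hypothetical stand-in for schema.xml: field pattern -> multiValued flag.
SCHEMA = {"*_s": True, "id": False}


def lookup_multivalued(field_name):
    """Return the multiValued flag for a concrete field name, or None."""
    for pattern, multi in SCHEMA.items():
        if fnmatch.fnmatchcase(field_name, pattern):
            return multi
    return None


def init_fields(field_names):
    """Config-parse-time pass: detect multiValued for each configured field."""
    info = {}
    for name in field_names:
        if "${" in name:
            # The name still contains a variable, so there is nothing to
            # look up yet: skip detection (the "continue" mentioned above).
            info[name] = None
            continue
        info[name] = lookup_multivalued(name)
    return info


def add_value(doc, resolved_name, value, multivalued):
    """Append when multiValued is known to be true, else keep the first value."""
    if multivalued:
        doc.setdefault(resolved_name, []).append(value)
    elif resolved_name not in doc:
        doc[resolved_name] = [value]


def import_rows(config_name, rows, variables):
    """Import child-entity rows into one document under config_name."""
    info = init_fields([config_name])
    doc = {}
    for row in rows:
        # Resolve ${...} placeholders only now, at import time.
        resolved = re.sub(r"\$\{([^}]+)\}",
                          lambda m: variables[m.group(1)], config_name)
        add_value(doc, resolved, row, info[config_name])
    return doc
```

With a constant name such as "metadata_record_s" all rows are kept; with
"${terms.CORE_DESC_TERM}_s" the flag is never detected and only the first
row survives in this model.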

I encountered a similar problem with implicit sql_column <-> solr_field
mapping using SqlEntityProcessor, i.e. when you select some columns
and do not explicitly list all of them as field entries in your
configuration. In this case field type detection doesn't work either.
I think that moving type detection to runtime would solve
that problem as well. Am I missing something obvious that prevents us
from doing field type detection at runtime?
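
To make the runtime idea concrete, a minimal sketch, again in Python with
made-up names (SCHEMA, lookup_multivalued, add_row) rather than real DIH
code: the schema lookup happens per row, on the already resolved field name,
so it would not matter whether the name came from a variable or from an
implicitly mapped SQL column.

```python
import fnmatch

# Hypothetical stand-in for schema.xml: field pattern -> multiValued flag.
SCHEMA = {"*_s": True, "id": False}


def lookup_multivalued(field_name):
    """Look up multiValued for a concrete, already resolved field name."""
    for pattern, multi in SCHEMA.items():
        if fnmatch.fnmatchcase(field_name, pattern):
            return multi
    return False


def add_row(doc, row):
    """Add one row (resolved field name -> value) to the document."""
    for name, value in row.items():
        if lookup_multivalued(name):
            # Multivalued field: accumulate every value.
            doc.setdefault(name, []).append(value)
        else:
            # Single-valued field: keep the first value seen.
            doc.setdefault(name, value)
    return doc
```

Feeding three rows for "Relation_s" through add_row keeps all three values,
while "id" stays single-valued.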

Alex

On Tue, Aug 10, 2010 at 4:20 AM, harrysmith <harrysmithwla@gmail.com> wrote:
>
> This is increasingly looking like a bug. To recap, I am trying to use
> the DIH to import multivalued dynamic fields and using a variable to name
> that field.
>
> Upon further testing, the multivalued import works fine with a
> static/constant name, but only keeps the first record when naming the field
> dynamically. See below for relevant snips.
>
> From schema.xml :
> <dynamicField name="*_s"  type="string"  indexed="true"  stored="true"
> multiValued="true" />
>
> From data-config.xml :
>
> <entity name="terms" query="select distinct CORE_DESC_TERM from metadata
> where item_id=${item.DIVID_PK}">
> <entity name="metadata" query="select * from metadata where
> item_id=${item.DIVID_PK} AND core_desc_term='${terms.CORE_DESC_TERM}'" >
> <field name="metadata_record_s" column="TEXT_VALUE" />
> </entity>
> </entity>
>
> ....
> Produces the following. Note that all 3 records that should be returned
> are present when the field name is a constant.
>
> <result name="response" numFound="1" start="0">
>  <doc>
>   <str name="id">9892962</str>
>   <arr name="metadata_record_s">
>    <str>record 1</str>
>    <str>record 2</str>
>    <str>record 3</str>
>    <str>Polygraph Newsletter Title</str>
>   </arr>
>   <arr name="title">
>    <str>Polygraph Newsletter Title</str>
>   </arr>
>  </doc>
> </result>
>
> ===
>
> Now, changing the field name to a variable..., note only the first record is
> retained for the 'Relation_s' field -- there should be 3 records.
>
> <field name="metadata_record_s" column="TEXT_VALUE" />
> becomes
> <field name="${terms.CORE_DESC_TERM}_s" column="TEXT_VALUE" />
>
> produces the following:
> <result name="response" numFound="1" start="0">
>  <doc>
>   <arr name="Relation_s">
>    <str>record 1</str>
>   </arr>
>   <arr name="Title_s">
>    <str>Polygraph Newsletter Title</str>
>   </arr>
>   <str name="id">9892962</str>
>   <arr name="title">
>    <str>Polygraph Newsletter Title</str>
>   </arr>
>  </doc>
> </result>
>
> Only the first record is retained. There was also another post (which
> received no replies) in the archive that reported the same issue. The DIH
> debug logs do show 3 records correctly being returned, so somehow these are
> not getting added.
>
>
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/DIH-and-multivariable-fields-problems-tp1032893p1065244.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
