lucene-solr-user mailing list archives

From Yasufumi Mizoguchi <yasufumi0...@gmail.com>
Subject Re: Solr Import
Date Tue, 25 Sep 2018 02:18:50 GMT
Hi,

I do not have a good answer for No. 1, but the answer to No. 2 is clear.

> 2. Delta indexing of xml file.
> We would be provided with an xml file, which would be imported into Solr
> using full-import during the first import. Subsequently we would be
> provided with the changes made to the xml file (as a delta file), and I
> would need to import just those changes into Solr using delta-import.
> When I click on delta-import, I do not see any update in the Solr response.
> Please guide us on how we can achieve delta-import for an *xml* file.

Solr itself cannot detect which documents have changed since the last import
operation.
Delta import is therefore supported only by SqlEntityProcessor, where an
appropriate SQL query can identify the changed rows.

From the Solr Reference Guide:
> For incremental imports and change detection. Only the SqlEntityProcessor
> supports delta imports.
(https://lucene.apache.org/solr/guide/7_4/uploading-structured-data-store-data-with-the-data-import-handler.html#uploading-structured-data-store-data-with-the-data-import-handler)

So, if you want to use delta import, you should load the data into a
relational database and index it with SqlEntityProcessor.
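
As a rough, untested sketch of what that could look like: assume the feed has
been loaded into a hypothetical `product` table (one row per sku, with `sku`,
`store`, and `last_modified` columns) in a MySQL database named `feed`. The
table, columns, database, and credentials here are all placeholders, not
something taken from your setup. A data-config.xml along these lines would let
delta-import pick up only the rows changed since the last run:

```xml
<dataConfig>
  <!-- Assumption: MySQL reachable on localhost; driver, URL, user and
       password are placeholders to be replaced with your own. -->
  <dataSource driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/feed"
              user="solr" password="changeme"/>
  <document>
    <!-- query:            used by full-import.
         deltaQuery:       returns the primary keys of rows changed since
                           the last import.
         deltaImportQuery: fetches each changed row by its primary key. -->
    <entity name="product" pk="sku"
            query="SELECT sku, store FROM product"
            deltaQuery="SELECT sku FROM product
                        WHERE last_modified &gt; '${dataimporter.last_index_time}'"
            deltaImportQuery="SELECT sku, store FROM product
                              WHERE sku = '${dataimporter.delta.sku}'">
      <field column="sku" name="sku"/>
      <field column="store" name="store_s"/>
    </entity>
  </document>
</dataConfig>
```

With a config like this, running the handler with command=full-import uses
`query`, and command=delta-import uses `deltaQuery` and `deltaImportQuery`.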

Thanks,
Yasufumi

On Mon, Sep 24, 2018 at 3:48, Sebastian Aswin <saswin@fossil.com> wrote:

> Hi Experts,
> Good Day!
>
> We have Solr 7.4 installed on our premises and we are planning to index
> an xml file. I am using the data import handler to do the indexing,
> but I have a few queries about it.
> 1.  Within a doc tag there are multiple store elements, but the Solr
> response contains only one *store value*. With the structure below, Solr
> does not accept the xml; when I changed the xml structure, I was able to
> import the xml file into Solr using the post tool and got *all* the values
> of store, comma separated.
>
> *Snippet of the xml import file. *
>
> <ProductFeedRetailToSOLR>
>  <doc>
>   <sku>FS4120</sku>
>   <store>MCUS</store>
>  </doc>
>  <doc>
>   <sku>FS4122</sku>
>   <store>MCIN</store>
>   <store>MFAU</store>
>   <store>MCUS</store>
>  </doc>
>  <doc>
>   <sku>FS4123</sku>
>   <store>MFAU</store>
>  </doc>
> </ProductFeedRetailToSOLR>
>
> *Snippet of the data-config.xml *
>
>   <entity name="f" processor="FileListEntityProcessor"
>           fileName="ProductFeed20180924-001434-719.xml$" recursive="true"
>           rootEntity="false" dataSource="null"
>           transformer="DateFormatTransformer" baseDir="/dataimport/ISR">
>
>       <!-- this processor extracts content using Xpath from each file found -->
>       <entity name="nested" processor="XPathEntityProcessor"
>               forEach="/ProductFeedRetailToSOLR   |  ProductFeedRetailToSOLR/doc | /metadata"
>               url="${f.fileAbsolutePath}" >
>               <field column="sku" xpath="/ProductFeedRetailToSOLR/doc/sku"/>
>               <field column="store_s" xpath="/MT_ProductFeed/doc/store"/>
>     </entity>
>
> What changes need to be made to the data-config.xml so that the response
> is similar to the output we get when using the post script, that is, so
> that the Solr response for each document contains *all the values of
> store*, comma separated?
>
>
> 2. Delta indexing of xml file.
> We would be provided with an xml file, which would be imported into Solr
> using full-import during the first import. Subsequently we would be
> provided with the changes made to the xml file (as a delta file), and I
> would need to import just those changes into Solr using delta-import.
> When I click on delta-import, I do not see any update in the Solr response.
> Please guide us on how we can achieve delta-import for an *xml* file.
>
> Thanks for the time and advice in advance.
>
> --
> Regards,
> Ashwin
>
