lucene-solr-user mailing list archives

From Mysurf Mail <stammail...@gmail.com>
Subject Solr - Delta Query Via Full Import
Date Tue, 02 Jul 2013 07:34:27 GMT
I am using DIH (DataImportHandler) to fetch rows from a database into Solr.
I have many 1:n relations, and the import is only feasible when I use caching
(which is very fast). Therefore I add the following attributes to my inner entity:

processor="CachedSqlEntityProcessor" cacheKey="" cacheLookup=""

Everything works well and quickly. (The n tables are queried first, then the
main entity.)

Now I want to configure the delta import, and it is not working.

I know that, by the standard approach
<http://wiki.apache.org/solr/DataImportHandler#Delta-Import_Example>,
I need to define the following attributes:

   1. query - the initial query, used for the full import
   2. deltaQuery - the keys of the rows that were changed
   3. deltaImportQuery - fetches the data for the rows that were changed
   4. parentDeltaQuery - the keys of the parent entity whose rows in the
   current entity have changed

(2-4 are only used in delta imports.)
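
For reference, the four attributes might fit together as in this minimal sketch. The `item` and `feature` tables and their `id`, `item_id`, `description` and `last_modified` columns are hypothetical and are not part of my schema:

```xml
<!-- Hypothetical parent entity; table and column names are illustrative only. -->
<entity name="item" pk="id"
        query="select * from item"
        deltaQuery="select id from item
                    where last_modified > '${dataimporter.last_index_time}'"
        deltaImportQuery="select * from item where id = '${dih.delta.id}'">
    <!-- Hypothetical child entity: parentDeltaQuery maps changed child rows
         back to the keys of the parent rows that must be re-indexed. -->
    <entity name="feature" pk="item_id"
            query="select description from feature where item_id = '${item.id}'"
            deltaQuery="select item_id from feature
                        where last_modified > '${dataimporter.last_index_time}'"
            parentDeltaQuery="select id from item where id = ${feature.item_id}"/>
</entity>
```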

I have also seen a hack in the
documentation <http://wiki.apache.org/solr/DataImportHandler#Delta-Import_Example>
showing that you can do a delta import via full import.
So instead of defining the attributes query, deltaImportQuery and deltaQuery,
I can define only query and call full-import instead of delta-import.
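
As I understand the hack, a single query covers both cases. A minimal sketch, assuming a hypothetical `item` table with a `last_modified` column (neither is from my schema):

```xml
<!-- Hypothetical entity. With clean=false the first condition is false, so only
     rows modified since the last import are selected. With any other value of
     clean the first condition is true and every row is selected. -->
<entity name="item" pk="id"
        query="select * from item
               where '${dataimporter.request.clean}' != 'false'
                  or last_modified > '${dataimporter.last_index_time}'"/>
```

The import is then triggered with command=full-import&clean=false on the DIH request handler.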

Problem: only the first query (the main entity) is executed when I run the
full import without clean.

Here is part of my configuration in data-config.xml (I have left
deltaImportQuery in place, although I only call full-import):

<entity name="PackageVersion" pk="PackageVersionId"
        query="select ....
                from [dbo].[Package] Package
                inner join [dbo].[PackageVersion] PackageVersion
                on Package.Id = PackageVersion.PackageId
                Where '${dataimporter.request.clean}' != 'false'
                    OR Package.LastModificationTime > '${dataimporter.last_index_time}'
                    OR PackageVersion.Timestamp > '${dataimporter.last_index_time}'"
        deltaImportQuery="select ...
                from [dbo].[Package] Package
                inner join [dbo].[PackageVersion] PackageVersion
                on Package.Id = PackageVersion.PackageId
                Where '${dataimporter.request.clean}' != 'false'
                    OR Package.LastModificationTime > '${dataimporter.last_index_time}'
                    OR PackageVersion.Timestamp > '${dataimporter.last_index_time}' and
                    ID=='${dih.delta.id}'">
    <entity name="PackageTag" pk="ResourceId"
            processor="CachedSqlEntityProcessor" cacheKey="ResourceId"
            cacheLookup="PackageVersion.PackageId"
            query=  "SELECT ResourceId,[Text] PackageTag
                        from [dbo].[Tag] Tag
                        Where '${dataimporter.request.clean}' = 'true'
                        OR Tag.TimeStamp > '${dataimporter.last_index_time}'"
            parentDeltaQuery="select PackageVersion.PackageVersionId
                              from [dbo].[Package] Package
                              inner join [dbo].[PackageVersion] PackageVersion
                              ON Package.Id = PackageVersion.PackageId
                              where Package.Id=${PackageTag.ResourceId}">
    </entity>
</entity>
