lucene-solr-user mailing list archives

From michael8 <mich...@saracatech.com>
Subject Re: dih.last_index_time - exacty what time is this capturing?
Date Sun, 11 Oct 2009 16:16:28 GMT

Thanks for your clarification, Shalin.

Given your explanation, would you agree that there is still a small window
(however small it may be) in which some documents could be missed by the
next delta using dih.last_index_time if the data source adds or updates
documents very frequently?  I.e., between the time the SQL finishes executing
and the time Solr receives the data and starts indexing, some new/updated
documents may have been written to the DB with timestamps slightly before the
last_index_time that is captured when indexing starts.
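
If such a window does exist, one possible guard is to have the delta query
look back slightly further than the recorded last_index_time and accept a few
re-imported rows in exchange. The Java sketch below shows only the timestamp
arithmetic. The class name DeltaCutoff, the method cutoff, and the five-minute
overlap are hypothetical choices for illustration and are not part of DIH; the
timestamp format is the one that appears in dataimport.properties.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Solr or DIH.
class DeltaCutoff {
    // Format of the last_index_time value as stored in dataimport.properties.
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Returns last_index_time moved back by the given overlap, so that rows
    // stamped shortly before the recorded time are picked up again.
    static String cutoff(String lastIndexTime, Duration overlap) {
        return LocalDateTime.parse(lastIndexTime, FMT).minus(overlap).format(FMT);
    }

    public static void main(String[] args) {
        // The five-minute overlap is an assumed value; tune it to the data source.
        System.out.println(cutoff("2009-10-09 12:58:10", Duration.ofMinutes(5)));
        // prints 2009-10-09 12:53:10
    }
}
```

Rows that fall inside the overlap would be fetched twice, which is harmless
when the schema defines a uniqueKey.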

Michael


Shalin Shekhar Mangar wrote:
> 
> On Sat, Oct 10, 2009 at 1:42 AM, michael8 <michael@saracatech.com> wrote:
> 
>>
>> Hi,
>>
>> Does anyone know exactly when the dih.last_index_time in
>> dataimport.properties is captured?  E.g. at the start of issuing the SQL to
>> the data source, at the end of executing the SQL that fetches the list of
>> IDs changed since the last index, or at the end of indexing all changed/new
>> documents?  The name seems to imply 'end of indexing all changed/new docs',
>> but I just want to be sure.
>>
>>
> last_index_time is set to the current date/time before the actual indexing
> starts. The rationale is not to miss any documents. If we set
> last_index_time after the indexing completed, we could lose the rows
> inserted/modified after the query of the previous import. In the current
> setup, some documents may get re-imported, but because most users have a
> uniqueKey, it is not a big problem.
> 
> 
>> Also, I noticed a discrepancy between the commented time string and the
>> actual last_index_time value.  Is the commented time (#) the time the file
>> was written, vs. the actual last index time?
>>
>> #Fri Oct 09 13:01:57 PDT 2009
>> item.last_index_time=2009-10-09 12\:58\:10
>> last_index_time=2009-10-09 12\:58\:10
>>
>>
> The commented time is the time at which the property file was written. This
> is automatically added by Java's Properties class.
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 
> 
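
The behaviour Shalin describes for the commented timestamp can be reproduced
with plain java.util.Properties, independent of Solr. The sketch below is only
an illustration: the class name PropsComment and the method render are
hypothetical, and it does not show how DIH itself writes the file. It stores
one property and returns the text that Properties.store produces, which starts
with a '#' line holding the time of writing and escapes the colons in the
value, as seen in dataimport.properties.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.util.Properties;

// Hypothetical demo class, not part of Solr or DIH.
class PropsComment {
    // Serializes a single last_index_time property the way
    // java.util.Properties writes any .properties file.
    static String render(String lastIndexTime) throws IOException {
        Properties props = new Properties();
        props.setProperty("last_index_time", lastIndexTime);
        StringWriter out = new StringWriter();
        // A null comment means no user comment line; store() still writes
        // a '#' line containing the current date and time.
        props.store(out, null);
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // First line: '#' followed by the time of writing.
        // Second line: last_index_time=2009-10-09 12\:58\:10
        System.out.print(render("2009-10-09 12:58:10"));
    }
}
```

The '#' line reflects when store() ran, so it is expected to be later than the
last_index_time value recorded before indexing began.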
