lucene-java-user mailing list archives

From Valentin Popov <valentin...@gmail.com>
Subject Re: 500 millions document for loop.
Date Thu, 12 Nov 2015 17:36:17 GMT
We are using 4.10.4, and moving to 5.x is not possible right now.

Thanks! 
> On 12 Nov 2015, at 19:47, Anton Zenkov <azenkov@crimsonhexagon.com> wrote:
> 
> Which version of Lucene are you using?
> 
> 
> On Thu, Nov 12, 2015 at 11:39 AM, Valentin Popov <valentin.po@gmail.com>
> wrote:
> 
>> Hello everyone.
>> 
>> We have ~10 indexes holding 500M documents. Each document has an «archive date»
>> and a «to» address, and one of our tasks is to calculate statistics on «to» for the
>> last year. Right now we search on archive_date:(current_date - 1 year)
>> and paginate the results at 50k records per page. The bottleneck of that approach
>> is pagination: it takes too long, and even on a powerful server a full run takes
>> ~20 days.
>> 
>> I ran an experiment with a CSV file: I put 200M records in it and parsed it with
>> the same algorithm we use for the statistics, and it took a few hours to run.
>> 
>> Is it possible to somehow just iterate quickly through the Lucene documents without
>> search and pagination? Or to somehow speed up the traversal?
>> 
>> Thanks
>> 
>> Regards,
>> Valentin.
>> 
>> 
>> 
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>> 
>> 


Best regards,
Valentin Popov
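
For reference, the CSV experiment described in the quoted message (one streaming pass that counts «to» addresses newer than a cutoff date) could look roughly like the sketch below. The class name ToAddressStats, the method countRecipients, and the two-column "ISO date,to address" line layout are assumptions made for illustration; the original message does not describe the file format or the statistics code.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: not taken from the original message.
final class ToAddressStats {

    // Counts how often each «to» address occurs among records whose
    // archive date is on or after the cutoff. Each input line is assumed
    // to look like "2015-06-01,someone@example.org" (an assumed layout).
    static Map<String, Long> countRecipients(Iterable<String> lines, LocalDate cutoff) {
        Map<String, Long> counts = new HashMap<>();
        for (String line : lines) {
            int comma = line.indexOf(',');
            if (comma < 0) {
                continue; // skip lines that do not have both columns
            }
            LocalDate archived = LocalDate.parse(line.substring(0, comma).trim());
            if (archived.isBefore(cutoff)) {
                continue; // older than the reporting window
            }
            String to = line.substring(comma + 1).trim();
            counts.merge(to, 1L, Long::sum); // add 1 to this address's count
        }
        return counts;
    }
}
```

Because each record is read once and only the per-address counters stay in memory, the cost grows linearly with the number of records, which may be part of why the CSV run finished in hours. For a real file, the lines could be supplied lazily, for example with Files.lines(path)::iterator, instead of an in-memory list.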







