lucene-java-user mailing list archives

From Anton Zenkov <azen...@crimsonhexagon.com>
Subject Re: 500 millions document for loop.
Date Thu, 12 Nov 2015 16:47:45 GMT
Which version of Lucene are you using?


On Thu, Nov 12, 2015 at 11:39 AM, Valentin Popov <valentin.po@gmail.com>
wrote:

> Hello everyone,
>
> We have ~10 indexes for 500M documents. Each document has an «archive
> date» and a «to» address. One of our tasks is to calculate statistics on
> «to» for the last year. Right now we search on
> archive_date:(current_date - 1 year) and paginate the results at 50k
> records per page. The bottleneck of this approach is pagination: it takes
> too long. Even on a powerful server the job takes ~20 days to run, which
> is far too long.
>
> As an experiment, I put 200M records into a CSV file and parsed it with
> the same algorithm we use for the statistics; it took a few hours to run.
>
> Is it possible to somehow iterate quickly through the Lucene documents
> without search and pagination? Or to somehow speed up the traversal?
>
> Thanks
>
> Regards,
> Valentin.
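
For reference, the CSV experiment in the quoted message amounts to one streaming pass that keeps a counter per «to» address. Below is a minimal Java sketch of such a pass, under stated assumptions: the class `ToAddressStats`, the `Message` type, and the method `countRecipientsSince` are hypothetical names made up for illustration and are not part of Lucene. How the records are read out of the index is deliberately left open, since that depends on the Lucene version in use.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class ToAddressStats {

    /** Hypothetical record shape: only the two fields the statistics need. */
    static final class Message {
        final LocalDate archiveDate;
        final String to;

        Message(LocalDate archiveDate, String to) {
            this.archiveDate = archiveDate;
            this.to = to;
        }
    }

    /**
     * Counts messages per «to» address in a single pass.
     * Messages archived before the cutoff date are skipped;
     * messages archived on the cutoff date itself are counted.
     */
    static Map<String, Long> countRecipientsSince(Iterator<Message> messages,
                                                  LocalDate cutoff) {
        Map<String, Long> counts = new HashMap<>();
        while (messages.hasNext()) {
            Message m = messages.next();
            if (!m.archiveDate.isBefore(cutoff)) {
                // Add 1 to the counter for this address, creating it if absent.
                counts.merge(m.to, 1L, Long::sum);
            }
        }
        return counts;
    }
}
```

Each record is visited once and only the per-address counters stay in memory, so the cost grows linearly with the number of records. The iterator could be backed by a CSV reader or by any other sequential source of records.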
