lucene-java-user mailing list archives

From "Stephen GRAY" <stephen.g...@immi.gov.au>
Subject RE: Retrieving values for a NumericDocValuesField [SEC=UNOFFICIAL]
Date Thu, 24 Oct 2013 23:21:00 GMT
UNOFFICIAL

Hi Adrien,

Thanks for your help, I'll try that.

Regards,
Steve

Stephen Gray
Java Developer
Border Midrange Systems Support
Department of Immigration and Border Protection
Phone: (02) 6223 9207
Mobile: 0419 885 959


-----Original Message-----
From: Adrien Grand [mailto:jpountz@gmail.com]
Sent: Thursday, 24 October 2013 6:19 PM
To: java-user@lucene.apache.org
Subject: Re: Retrieving values for a NumericDocValuesField [SEC=UNOFFICIAL]

Hi Stephen,

On Thu, Oct 24, 2013 at 1:18 AM, Stephen GRAY <stephen.gray@immi.gov.au> wrote:
> I actually need to loop through a large number of documents (50,000 - 100,000) calculating
> a number of statistics (min, max, sum) so I really need the most efficient/fastest solution
> available. It sounds like it would be best to just store the data in a stored field.

I see. For that many documents, doc values are actually the right thing to use. Sorry if I
put you on the wrong track; I was assuming you were only going to collect values from a few
documents.

In your case the best option would be to group your doc ids by the segment they belong to
and then, for each segment, get a per-segment NumericDocValues instance and aggregate your
statistics from it.
This is better than using MultiDocValues, because MultiDocValues needs to binary-search for
the appropriate segment on every lookup.
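
The grouping step described above can be sketched in plain Java. This is only an
illustration of the idea, not Lucene code: Segment, SegmentValues and Stats are
hypothetical stand-ins (for an index leaf with its docBase, a per-segment
NumericDocValues, and the statistics holder respectively), and the sketch assumes the
segments are listed in doc-id order and cover the doc-id space without gaps. Sorting the
global doc ids once lets a single forward pass assign each id to its segment, so no
per-document binary search is needed.

```java
import java.util.Arrays;

final class PerSegmentStats {

    /** Hypothetical stand-in for a per-segment NumericDocValues: value by segment-local doc id. */
    interface SegmentValues {
        long get(int localDocId);
    }

    /** Hypothetical stand-in for one index segment (a "leaf"). */
    static final class Segment {
        final int docBase;          // global doc id of the segment's first document
        final int maxDoc;           // number of documents in the segment
        final SegmentValues values; // per-segment values for the numeric field

        Segment(int docBase, int maxDoc, SegmentValues values) {
            this.docBase = docBase;
            this.maxDoc = maxDoc;
            this.values = values;
        }
    }

    /** Running min, max, sum and count. */
    static final class Stats {
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        long sum = 0;
        int count = 0;

        void add(long v) {
            min = Math.min(min, v);
            max = Math.max(max, v);
            sum += v;
            count++;
        }
    }

    /**
     * Aggregates values for the given global doc ids, visiting each segment once.
     * Assumes segments are ordered by docBase and contiguous starting at doc id 0.
     */
    static Stats aggregate(Segment[] segments, int[] globalDocIds) {
        int[] sorted = globalDocIds.clone();
        Arrays.sort(sorted); // sorted ids fall into segments in order

        Stats stats = new Stats();
        int i = 0;
        for (Segment segment : segments) {
            int end = segment.docBase + segment.maxDoc; // exclusive upper bound
            while (i < sorted.length && sorted[i] < end) {
                // Convert the global doc id to a segment-local one before the lookup.
                stats.add(segment.values.get(sorted[i] - segment.docBase));
                i++;
            }
        }
        return stats;
    }
}
```

In a real index the segments would come from the reader's leaves, and the lookup would go
through whatever per-segment doc values accessor your Lucene version provides; the exact
classes differ between releases, so check the javadocs for the version you are on.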

--
Adrien
