lucene-dev mailing list archives

From "Chris Harris (JIRA)" <>
Subject [jira] Commented: (LUCENE-2232) Use VShort to encode positions
Date Mon, 01 Feb 2010 18:32:18 GMT


Chris Harris commented on LUCENE-2232:

I have a little bit of sampling profiling data from YourKit that may be relevant. (Paul encouraged
me to post anyway.) Note that the queries submitted were not limited to those requiring PRX
data, although some of them (30%? 40%?) did. This data is _without_ applying this LUCENE-2232
patch. YourKit was set to time and .read with wall clock

1. I replayed about 1000 queries taken from our user query logs on a test system that uses
rotating drives, without first submitting any battery of warmup queries.

     IndexInput.readVInt() <----------

I looked at the time spent in the marked call to IndexInput.readVInt(). 93% of the time in
this readVInt() was spent in I/O, leaving a maximum of 7% that could theoretically be wasted
on the CPU decoding VInts.

2. I profiled one of our live Solr servers that uses SSD drives, after the system had warmed
up a bit. Here is the resulting profiling data, with times relative to SegmentTermPositions.readDeltaPosition():

SegmentTermPositions.readDeltaPosition() - 100%
  IndexInput.readVInt - 100%
    BufferedIndexInput.readByte - 69%
      BufferedIndexInput.refill - 69%
        SimpleFSDirectory$SimpleFSIndexInput.readInternal - 69%
 - 55%
 - 14%

Here we have a healthier 31% of the time that could potentially be sped up by this patch.
It partly depends on how much the patch would increase I/O, though. (I guess the hope is that
it wouldn't increase I/O by too crazy an amount if your documents are above a certain size.)
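
To make the 7% and 31% figures a bit more concrete, here is a rough sketch in Java. It is not code from the patch, and the class and method names (PositionDecodeSketch, decodeVInt, maxSpeedup) are made up for illustration. decodeVInt shows the per-byte work a VInt decoder does under the usual VInt layout (7 data bits per byte, low-order group first, high bit set on every byte except the last). maxSpeedup is a plain Amdahl-style bound: it assumes, optimistically, that all of the non-I/O time could be eliminated and that I/O time stays the same.

```java
/** Hypothetical sketch; names are illustrative and not part of the Lucene API. */
final class PositionDecodeSketch {

    /**
     * Decodes one VInt starting at buf[offset]: 7 data bits per byte,
     * low-order group first, high bit set on every byte except the last.
     */
    static int decodeVInt(byte[] buf, int offset) {
        byte b = buf[offset++];
        int value = b & 0x7F;
        // Keep reading while the continuation bit is set.
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = buf[offset++];
            value |= (b & 0x7F) << shift;
        }
        return value;
    }

    /**
     * Amdahl-style upper bound on the overall speedup of the profiled method
     * if the non-I/O fraction of its time were reduced to zero.
     * Assumption: I/O time itself does not change.
     */
    static double maxSpeedup(double nonIoFraction) {
        return 1.0 / (1.0 - nonIoFraction);
    }

    public static void main(String[] args) {
        byte[] twoBytes = {(byte) 0x80, 0x01}; // two-byte encoding of 128
        System.out.println("decoded: " + decodeVInt(twoBytes, 0));
        System.out.println("rotating drives bound: " + maxSpeedup(0.07));
        System.out.println("SSD bound: " + maxSpeedup(0.31));
    }
}
```

Under those assumptions the bound is about 1.08x for the rotating-drive run and about 1.45x for the SSD run. Any extra I/O caused by a larger proximity file would lower both.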

> Use VShort to encode positions
> ------------------------------
>                 Key: LUCENE-2232
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Paul Elschot
>         Attachments: LUCENE-2232-nonbackwards.patch, LUCENE-2232-nonbackwards.patch
> Improve decoding speed for typical case of two bytes for a delta position at the cost
> of increasing the size of the proximity file.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
