lucene-dev mailing list archives

From "zhanlijun (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (LUCENE-6037) PendingTerm cannot be cast to PendingBlock
Date Sun, 02 Nov 2014 11:30:34 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-6037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193740#comment-14193740 ]

zhanlijun edited comment on LUCENE-6037 at 11/2/14 11:29 AM:
-------------------------------------------------------------

My changes to the lucene-spatial module are unrelated to the bug, because the bug also happens
when I use the stock (unmodified) lucene-spatial module.

The lucene-spatial module is widely used in mobile internet applications in China. My use case
is to calculate the distance between the user and every POI in a city. However, when a city has
more than 100000 POIs, Lucene's distance calculation becomes very slow (more than 10 ms).
Lucene uses spatial4j's HaversineRAD to calculate the distance. I ran a test on my computer
(2.9GHz Intel Core i7, 8GB memory):

POI num  |  time
50000    |  7ms
100000   |  14ms
1000000  |  144ms

I simplified the distance calculation formula. The simplification greatly improves
computational efficiency while keeping enough precision for practical use.
Here are the test results:

test point pair                    | disSimplify (meters) | distHaversineRAD (meters) | diff (meters)
(39.941, 116.45) (39.94, 116.451)  | 140.0242769266660    | 140.02851671981400        | 0.0
(39.96, 116.45) (39.94, 116.40)    | 4804.113098854450    | 4804.421153907680         | 0.3
(39.96, 116.45) (39.94, 117.30)    | 72438.90919479560    | 72444.54071519510         | 5.6
(39.26, 115.25) (41.04, 117.30)    | 263516.676171262     | 263508.55921886700        | 8.1

POI num  |  time
50000    |  0.1
100000   |  0.3
1000000  |  4
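
The comment does not show the simplified formula itself. As a rough illustration only, the
sketch below compares a plain haversine implementation with the equirectangular approximation,
which is one common way to simplify short-range distance calculations and is not necessarily
what disSimplify does. The class name, method names, and Earth radius are assumptions made for
this example; they are not taken from Lucene or spatial4j.

```java
public final class DistanceSketch {
    // Assumed mean Earth radius in meters; the comment does not say which radius was used.
    public static final double EARTH_RADIUS_M = 6_371_008.8;

    /** Great-circle distance in meters using the haversine formula. Inputs are in degrees. */
    public static double haversineMeters(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double sinLat = Math.sin((phi2 - phi1) / 2);
        double sinLon = Math.sin(Math.toRadians(lon2 - lon1) / 2);
        double a = sinLat * sinLat + Math.cos(phi1) * Math.cos(phi2) * sinLon * sinLon;
        // Clamp to 1.0 to guard against rounding slightly above the asin domain.
        return 2 * EARTH_RADIUS_M * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    /**
     * Equirectangular approximation (hypothetical stand-in for the simplified formula).
     * Treats the area between the two points as flat, so it needs one cosine and one
     * square root instead of several trigonometric calls. Inputs are in degrees.
     */
    public static double equirectangularMeters(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        // Scale the longitude difference by the cosine of the mean latitude.
        double x = Math.toRadians(lon2 - lon1) * Math.cos((phi1 + phi2) / 2);
        double y = phi2 - phi1;
        return EARTH_RADIUS_M * Math.sqrt(x * x + y * y);
    }

    public static void main(String[] args) {
        // The four point pairs from the table above: {lat1, lon1, lat2, lon2}.
        double[][] pairs = {
            {39.941, 116.45, 39.94, 116.451},
            {39.96, 116.45, 39.94, 116.40},
            {39.96, 116.45, 39.94, 117.30},
            {39.26, 115.25, 41.04, 117.30},
        };
        for (double[] p : pairs) {
            double approx = equirectangularMeters(p[0], p[1], p[2], p[3]);
            double exact = haversineMeters(p[0], p[1], p[2], p[3]);
            System.out.printf("approx=%.3f m  haversine=%.3f m  diff=%.3f m%n",
                    approx, exact, Math.abs(approx - exact));
        }
    }
}
```

Because the radius and the approximation are assumptions, the printed values are not expected
to match the table digit for digit. The error of a flat-patch approximation grows with the
distance between the points, so it suits city-scale POI distances better than long ranges.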


> PendingTerm cannot be cast to PendingBlock
> ------------------------------------------
>
>                 Key: LUCENE-6037
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6037
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/codecs
>    Affects Versions: 4.3.1
>         Environment: ubuntu 64bit
>            Reporter: zhanlijun
>            Priority: Critical
>             Fix For: 4.3.1
>
>
> The error is as follows:
> java.lang.ClassCastException: org.apache.lucene.codecs.BlockTreeTermsWriter$PendingTerm cannot be cast to org.apache.lucene.codecs.BlockTreeTermsWriter$PendingBlock
>         at org.apache.lucene.codecs.BlockTreeTermsWriter$TermsWriter.finish(BlockTreeTermsWriter.java:1014)
>         at org.apache.lucene.index.FreqProxTermsWriterPerField.flush(FreqProxTermsWriterPerField.java:553)
>         at org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:85)
>         at org.apache.lucene.index.TermsHash.flush(TermsHash.java:116)
>         at org.apache.lucene.index.DocInverter.flush(DocInverter.java:53)
>         at org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:81)
>         at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:493)
>         at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:480)
>         at org.apache.lucene.index.DocumentsWriter.postUpdate(DocumentsWriter.java:378)
>         at org.apache.lucene.index.DocumentsWriter.updateDocuments(DocumentsWriter.java:413)
>         at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1283)
>         at org.apache.lucene.index.IndexWriter.addDocuments(IndexWriter.java:1243)
>         at org.apache.lucene.index.IndexWriter.addDocuments(IndexWriter.java:1228)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

