lucene-dev mailing list archives

From "Shinichiro Abe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-6889) Highlight using multiple threads
Date Wed, 12 Jul 2017 18:59:00 GMT

    [ https://issues.apache.org/jira/browse/SOLR-6889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16084502#comment-16084502 ]

Shinichiro Abe commented on SOLR-6889:
--------------------------------------

I agree. But could highlighting be made much faster by parallelizing the work behind
FieldOffsetStrategy#getOffsetsEnums(), especially for the OffsetSource.ANALYSIS strategy,
i.e. the storeOffsetsWithPositions = false case, in which the user can choose which fields
to highlight after indexing? At the time, I assumed that the text analysis work done by the
standard highlighter could be parallelized, borrowing the idea behind facet.threads. That
said, I have seen a benchmark in which UH (the unified highlighter) with offsetSource=ANALYSIS
is already much faster than the standard highlighter.
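
To make the idea above concrete, here is a minimal sketch of running per-field
analysis on a fixed thread pool, in the spirit of facet.threads. It does not call
any Lucene or Solr API: the class name, findOffsets(), and offsetsPerField() are
hypothetical, and a plain substring scan stands in for real text analysis.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PerFieldAnalysisSketch {

    // Hypothetical stand-in for re-analyzing one field's text to find match offsets.
    static List<Integer> findOffsets(String text, String term) {
        List<Integer> offsets = new ArrayList<>();
        for (int i = text.indexOf(term); i >= 0; i = text.indexOf(term, i + 1)) {
            offsets.add(i);
        }
        return offsets;
    }

    // Submits one task per field, then collects the results in field order.
    static Map<String, List<Integer>> offsetsPerField(
            Map<String, String> fields, String term, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Map<String, Future<List<Integer>>> pending = new LinkedHashMap<>();
            for (Map.Entry<String, String> e : fields.entrySet()) {
                Callable<List<Integer>> task = () -> findOffsets(e.getValue(), term);
                pending.put(e.getKey(), pool.submit(task));
            }
            Map<String, List<Integer>> result = new LinkedHashMap<>();
            for (Map.Entry<String, Future<List<Integer>>> e : pending.entrySet()) {
                result.put(e.getKey(), e.getValue().get());
            }
            return result;
        } catch (InterruptedException | ExecutionException ex) {
            throw new IllegalStateException("per-field analysis failed", ex);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("title", "solr solr");
        fields.put("body", "no match");
        System.out.println(offsetsPerField(fields, "solr", 2));
    }
}
```

Whether this pays off would depend on how many fields are highlighted per document
and how expensive analysis is compared with the cost of handing tasks to the pool.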

> Highlight using multiple threads
> --------------------------------
>
>                 Key: SOLR-6889
>                 URL: https://issues.apache.org/jira/browse/SOLR-6889
>             Project: Solr
>          Issue Type: Improvement
>    Affects Versions: 6.0
>            Reporter: Shinichiro Abe
>         Attachments: SOLR-6889.patch
>
>
> I think we could improve search performance a little by using Stream.parallel().forEach(), which is processor-aware via the fork/join framework under the hood.
> It would especially help the for-loops over the docList, e.g. debugging and highlighting.
> This improvement seems likely to be effective in environments with many CPUs.
> My test conditions:
> 1. Core i5 (2 cores, 4 threads), standalone Solr.
> 2. q=日本&debug=true&hl=true, other parameters are [here|https://github.com/anond2/simplesearch/blob/master/conf/solrconfig.xml#L836].
> 3. 7171 hits / 12000 docs (taken from a ja.wikipedia dump)
> 4. Compared to trunk, parallel streams are a little faster.
> My query execution results (QTime, in ms):
> {noformat}
> == rows=10 ==
>     trunk  patch 
> 1st 236    146
> 2nd 179    100
> 3rd 79     72
> 4th 75     53
> 5th 91     80
> == rows=50 ==
>     trunk  patch 
> 1st 485    325
> 2nd 225    243
> 3rd 199    151
> 4th 168    127
> 5th 149    118
> == rows=100 ==
>     trunk  patch 
> 1st 948    607
> 2nd 472    390
> 3rd 237    201
> 4th 256    200
> 5th 224    178
> == rows=500 ==
>     trunk  patch 
> 1st 3248   2826
> 2nd 1545   1067
> 3rd 1563   801
> 4th 1551   816
> 5th 1452   777
> {noformat}
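
The approach described in the issue can be sketched roughly as follows. This is
not the attached patch: processDoc() is a hypothetical stand-in for per-document
work such as highlighting or debug explanations, and the doc ids are made up.
Because forEach() on a parallel stream does not preserve encounter order, the
sketch stores results in a concurrent map keyed by doc id.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ParallelDocListSketch {

    // Hypothetical stand-in for per-document work (highlighting, debug explain).
    static String processDoc(int docId) {
        return "snippet-for-" + docId;
    }

    // Sequential baseline: a plain for-loop over the doc ids.
    static Map<Integer, String> sequential(List<Integer> docIds) {
        Map<Integer, String> out = new ConcurrentHashMap<>();
        for (int id : docIds) {
            out.put(id, processDoc(id));
        }
        return out;
    }

    // Parallel variant: the same work, spread over the common fork/join pool.
    static Map<Integer, String> parallel(List<Integer> docIds) {
        Map<Integer, String> out = new ConcurrentHashMap<>();
        docIds.stream().parallel().forEach(id -> out.put(id, processDoc(id)));
        return out;
    }

    public static void main(String[] args) {
        List<Integer> docIds = IntStream.range(0, 100).boxed().collect(Collectors.toList());
        // Both variants should produce the same per-document results.
        System.out.println(sequential(docIds).equals(parallel(docIds)));
    }
}
```

The per-document work must be thread-safe for this to be valid, and any gain
depends on that work being heavy enough to outweigh the cost of splitting it,
which fits the larger differences reported above for higher rows values.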



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

