jackrabbit-oak-issues mailing list archives

From "Chetan Mehrotra (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (OAK-2853) Use default codec for fulltext index
Date Tue, 28 Jul 2015 13:07:04 GMT

    [ https://issues.apache.org/jira/browse/OAK-2853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14644310#comment-14644310 ]

Chetan Mehrotra edited comment on OAK-2853 at 7/28/15 1:06 PM:
---------------------------------------------------------------

Updated the benchmark to enable CopyOnRead. With that, the following numbers are seen:

{noformat}
# FullTextSearchTest               C     min     10%     50%     90%     max       N
Oak-Tar                            1      67      69      73      98     203     749   // CopyOnRead enabled
Oak-Tar                            1    1397    1413    1480    1751    1908      40   // CopyOnRead disabled
Oak-Tar                            1     476     485     508     575     663     115   // CopyOnRead enabled, Compression On
{noformat}

After revisiting the numbers, it appears that enabling compression adds overhead. Things might
improve with Lucene 5.x (LUCENE-5914). [Others|http://stegard.net/2015/05/performance-of-stored-field-compression-in-lucene-4-1/]
have observed this as well.
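
To put the overhead in proportion, here is a quick sketch that compares the 50% (median) column of the three rows above. The numbers are copied from the table; the dictionary keys and the {{slowdown}} helper are names made up for this illustration, and the unit is assumed to be whatever the benchmark reports per run.

```python
# Median (50% column) values copied from the FullTextSearchTest table above.
# Assumption: lower is better and all three rows use the same unit.
medians = {
    "copy_on_read": 73,               # CopyOnRead enabled
    "copy_on_read_compression": 508,  # CopyOnRead enabled, Compression On
    "no_copy_on_read": 1480,          # CopyOnRead disabled
}


def slowdown(case, baseline="copy_on_read"):
    """Return how many times larger the median of `case` is than that of `baseline`."""
    return medians[case] / medians[baseline]


if __name__ == "__main__":
    for case in ("copy_on_read_compression", "no_copy_on_read"):
        print(f"{case}: {slowdown(case):.1f}x the CopyOnRead median")
```

On these figures, turning compression on costs roughly a factor of seven at the median even with CopyOnRead enabled, although it is still well ahead of running without CopyOnRead.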

So it does not yet make sense to enable compression by default.

/cc [~teofili] [~alex.parvulescu] [~tmueller]. Unless you think otherwise, I will close this
issue as Won't Fix and keep the status quo.



was (Author: chetanm):
Updated the benchmark to enable CopyOnRead. With that following numbers are seen

{noformat}
Oak-Tar                            1      67      69      73      98     203     749   // CopyOnRead enabled
Oak-Tar                            1    1397    1413    1480    1751    1908      40   // CopyOnRead disabled
{noformat}



> Use default codec for fulltext index
> ------------------------------------
>
>                 Key: OAK-2853
>                 URL: https://issues.apache.org/jira/browse/OAK-2853
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: lucene
>            Reporter: Chetan Mehrotra
>            Assignee: Chetan Mehrotra
>            Priority: Minor
>             Fix For: 1.3.4
>
>         Attachments: OAK-2853.patch
>
>
> Currently OakCodec is used by default if full text indexing is enabled for an index. OakCodec disables compression; this was done because performance issues were observed around the 1.0 release (see OAK-1737).
> Post 1.0 we introduced CopyOnRead, which should provide better performance even with compression enabled. We should revisit the use of OakCodec by default to see whether the default codec gives comparable performance, and hence the benefit of a smaller index size.
> Changing the default would require a change in the index format version, as this change would not be compatible with the existing default.
> Note that one can still change the codec by setting the {{codec}} value in the index config to a codec name such as {{Lucene46}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
