hive-issues mailing list archives

From "David Mollitor (Jira)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-21636) Performance cost when using replaceAll() vs replace()
Date Thu, 11 Jun 2020 17:07:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-21636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133421#comment-17133421 ]

David Mollitor commented on HIVE-21636:
---------------------------------------

This issue can also lead to subtle bugs if the string being searched for happens to contain
regex metacharacters, because replaceAll() interprets them as a pattern rather than as literal text.
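
To illustrate the kind of bug meant here, the following is a minimal sketch. The class name, method names, and sample strings are hypothetical and are not taken from the Hive code base. A search string containing '.' is compiled as a regex by replaceAll(), so it also matches text the caller probably did not intend, while replace() matches only the literal characters:

```java
public class ReplaceDemo {
    // replaceAll() compiles "a.b" as a regex, so '.' matches any character.
    static String viaReplaceAll(String s) {
        return s.replaceAll("a.b", "X");
    }

    // replace() treats "a.b" as a literal character sequence.
    static String viaReplace(String s) {
        return s.replace("a.b", "X");
    }

    public static void main(String[] args) {
        String input = "a.b axb"; // hypothetical sample input
        System.out.println(viaReplaceAll(input)); // prints "X X"
        System.out.println(viaReplace(input));    // prints "X axb"
    }
}
```

If a regex-based call really is needed with a literal search string, Pattern.quote() can be used to escape it.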

> Performance cost when using replaceAll() vs replace() 
> ------------------------------------------------------
>
>                 Key: HIVE-21636
>                 URL: https://issues.apache.org/jira/browse/HIVE-21636
>             Project: Hive
>          Issue Type: Improvement
>          Components: Accumulo Storage Handler, HCatalog, Vectorization
>            Reporter: bd2019us
>            Assignee: bd2019us
>            Priority: Trivial
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>         Attachments: HVIE-21636.patch
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Use String.replace() instead of String.replaceAll() when the search string is not a regex.
> replace() does not incur the extra regex compilation overhead when a non-regex string
> is used, so changing replaceAll() to replace() in these cases removes that overhead.
> Affected files:
> # accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/predicate/compare/StringCompare.java
> # hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputCommitterContainer.java
> # vector-code-gen/src/org/apache/hadoop/hive/tools/GenVectorCode.java
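
The change described in the quoted issue can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual patch: the class, the helper names, and the sample strings are hypothetical. The String Javadoc describes str.replaceAll(regex, repl) as equivalent to Pattern.compile(regex).matcher(str).replaceAll(repl), so a pattern is compiled on every call. For a search string without regex metacharacters, replace() gives the same result without going through that path. The actual cost difference depends on the JDK version and is not measured here.

```java
import java.util.regex.Pattern;

public class LiteralReplace {
    // Equivalent of s.replaceAll(target, replacement) per the String Javadoc:
    // the target is compiled as a regex on each call.
    static String withRegex(String s, String target, String replacement) {
        return Pattern.compile(target).matcher(s).replaceAll(replacement);
    }

    // Literal replacement: the target is treated as plain text.
    static String withLiteral(String s, String target, String replacement) {
        return s.replace(target, replacement);
    }

    public static void main(String[] args) {
        String input = "col_a,col_b"; // hypothetical sample input
        // "," has no special meaning in a regex, so both calls agree.
        System.out.println(withRegex(input, ",", "|"));   // prints "col_a|col_b"
        System.out.println(withLiteral(input, ",", "|")); // prints "col_a|col_b"
    }
}
```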



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
