hive-issues mailing list archives

From "Gopal V (JIRA)" <>
Subject [jira] [Commented] (HIVE-17994) Vectorization: Serialization bottlenecked on irrelevant hashmap lookup
Date Wed, 20 Dec 2017 18:25:00 GMT


Gopal V commented on HIVE-17994:

[~teddy.choi]: this looks like a threading-only issue. My guess is that the hash memory
is interleaved across NUMA zones and gets moved around for some reason when so many threads
read it. AFAIK hash lookups are just reads, so they shouldn't trigger NUMA migrations, but
there might be something else going on.

> Vectorization: Serialization bottlenecked on irrelevant hashmap lookup
> ----------------------------------------------------------------------
>                 Key: HIVE-17994
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Gopal V
>            Assignee: Matt McCline
>            Priority: Minor
>         Attachments: HIVE-17994.01.patch, HIVE-17994.02.patch, HIVE-17994.03.patch, HIVE-17994.04.patch, HIVE-17994.05.patch, HIVE-17994.06.patch, vec-serialize-hashmap.png
> On machines with slower NUMA, the hashmap lookup for TypeInfo::getPrimitiveCategory is
> the slowest part of the vectorized serialization loops. The static object references run hot,
> and the NUMA access speeds penalize half the threads.
> This lookup is done for every column of every row, even though vectorization guarantees that
> the type cannot change.
> !vec-serialize-hashmap.png!
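
To illustrate the per-column, per-row lookup described in the issue, here is a minimal Java sketch of hoisting a type-category lookup out of the row loop. It is not Hive code and does not claim to reflect the attached patches: the class, the `Category` enum, the `TYPE_NAME_TO_CATEGORY` map, and the `width` helper are all hypothetical stand-ins, and the lookup counter exists only to make the difference visible.

```java
import java.util.HashMap;
import java.util.Map;

class HoistedCategoryLookup {
    // Hypothetical stand-in for a primitive category enum (not Hive's).
    enum Category { INT, LONG, STRING }

    // Hypothetical shared static map; stands in for the hot static lookup.
    static final Map<String, Category> TYPE_NAME_TO_CATEGORY = new HashMap<>();
    static {
        TYPE_NAME_TO_CATEGORY.put("int", Category.INT);
        TYPE_NAME_TO_CATEGORY.put("bigint", Category.LONG);
        TYPE_NAME_TO_CATEGORY.put("string", Category.STRING);
    }

    // Counts map lookups so the two variants can be compared.
    static int lookups = 0;

    static Category lookup(String typeName) {
        lookups++;
        return TYPE_NAME_TO_CATEGORY.get(typeName);
    }

    // Illustrative fixed widths only; STRING is treated as 0 bytes here.
    static int width(Category category) {
        switch (category) {
            case INT: return 4;
            case LONG: return 8;
            default: return 0;
        }
    }

    // Variant 1: resolve the category for every column of every row.
    static long serializePerRowLookup(String[] columnTypes, int rows) {
        long bytes = 0;
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < columnTypes.length; c++) {
                bytes += width(lookup(columnTypes[c]));
            }
        }
        return bytes;
    }

    // Variant 2: resolve each column's category once, then reuse it for
    // every row, relying on the column types being fixed for the batch.
    static long serializeHoisted(String[] columnTypes, int rows) {
        Category[] categories = new Category[columnTypes.length];
        for (int c = 0; c < columnTypes.length; c++) {
            categories[c] = lookup(columnTypes[c]);
        }
        long bytes = 0;
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < categories.length; c++) {
                bytes += width(categories[c]);
            }
        }
        return bytes;
    }

    public static void main(String[] args) {
        String[] types = {"int", "bigint", "string"};
        lookups = 0;
        serializePerRowLookup(types, 1024);
        System.out.println("per-row lookups: " + lookups);
        lookups = 0;
        serializeHoisted(types, 1024);
        System.out.println("hoisted lookups: " + lookups);
    }
}
```

Both variants compute the same result; the hoisted one touches the shared map once per column instead of once per column per row, so the threads read a thread-local array in the hot loop rather than a shared static structure.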

This message was sent by Atlassian JIRA
