trafodion-codereview mailing list archives

From zellerh <...@git.apache.org>
Subject [GitHub] incubator-trafodion pull request #637: [TRAFODION-2138] Hive scan on wide ta...
Date Wed, 03 Aug 2016 21:08:46 GMT
Github user zellerh commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/637#discussion_r73421305
  
    --- Diff: core/sql/generator/GenRelScan.cpp ---
    @@ -1182,6 +1181,15 @@ if (hTabStats->isOrcFile())
     
       UInt32 rangeTailIOSize = (UInt32)
           CmpCommon::getDefaultNumeric(HDFS_IO_RANGE_TAIL);
    +  if (rangeTailIOSize == 0) 
    +    {
    +      rangeTailIOSize = getTableDesc()->getNATable()->getRecordLength() +
    --- End diff --
    
    This record length is what the compiler thinks; the real length may be more or less. Hive
probably makes it more complex: some fields like date/time/timestamp are strings in Hive but
encoded binaries in Trafodion, and character sets may differ in some cases (e.g. GBK). I don't
have a better solution than what the code does, but say we have this:
create table t(a char(20000) not null, b timestamp(6) not null, c timestamp(6) not null).
Trafodion will make the record length 20022. In reality, a row may look like "20k chars..., 2015-01-01
00:00:00.000, 2016-01-01 00:00:00.000". That's 20049 characters. Maybe the solution is to add
the 16K to the record length, not take the max?
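
    The suggestion above (add a fixed cushion to the record length instead of taking the max)
could be sketched like this. This is a hypothetical illustration, not the actual patch: the
function name, the 16 KB cushion value, and the `recordLength` parameter (standing in for
`getTableDesc()->getNATable()->getRecordLength()`) are assumptions for the example.

    ```cpp
    #include <cstdint>
    #include <cassert>

    // Sketch: pick the HDFS range-tail I/O size. A configured (nonzero)
    // HDFS_IO_RANGE_TAIL value wins; otherwise fall back to the compiler's
    // record length PLUS a cushion, since the on-disk Hive text row can be
    // longer than the compiler's estimate (timestamps stored as strings,
    // charset differences, delimiters, etc.).
    uint32_t computeRangeTailIOSize(uint32_t configured, uint32_t recordLength)
    {
      const uint32_t cushion = 16 * 1024;  // assumed 16 KB safety margin
      if (configured != 0)
        return configured;                 // explicit setting is honored
      return recordLength + cushion;       // additive, not max(recordLength, cushion)
    }

    int main()
    {
      // char(20000) + two timestamps: compiler record length 20022
      assert(computeRangeTailIOSize(0, 20022) == 20022u + 16 * 1024);
      // a nonzero configured value is returned unchanged
      assert(computeRangeTailIOSize(4096, 20022) == 4096u);
      return 0;
    }
    ```

    The additive form keeps a margin even when the compiler underestimates the row, which
`max(recordLength, 16K)` would not for wide rows like the 20022-byte example.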


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
