phoenix-dev mailing list archives

From "Afshin Moazami (JIRA)" <j...@apache.org>
Subject [jira] [Created] (PHOENIX-2521) Index rows are not updated when the index key updated using bulk loader
Date Sun, 13 Dec 2015 14:53:46 GMT
Afshin Moazami created PHOENIX-2521:
---------------------------------------

             Summary: Index rows are not updated when the index key updated using bulk loader

                 Key: PHOENIX-2521
                 URL: https://issues.apache.org/jira/browse/PHOENIX-2521
             Project: Phoenix
          Issue Type: Bug
    Affects Versions: 4.5.2
            Reporter: Afshin Moazami


I found out that the MapReduce CSV bulk load tool doesn't behave the same as UPSERTs. Is this
by design or a bug?

Here are the queries for creating the table and the index:

{code} CREATE TABLE mySchema.mainTable (
id varchar NOT NULL,
name varchar,
address varchar
CONSTRAINT pk PRIMARY KEY (id)); {code}


{code} CREATE INDEX myIndex 
ON mySchema.mainTable  (name, id) 
INCLUDE (address); {code}

If I execute two UPSERTs where the second one updates the name (which is the leading column of
the index key), everything works fine: the record is updated in both the main table and the index table.

{code} UPSERT INTO mySchema.mainTable (id, name, address) values ('1', 'john', 'Montreal');{code}
{code}UPSERT INTO mySchema.mainTable (id, name, address) values ('1', 'jack', 'Montreal');{code}

{code}SELECT /*+ INDEX(mySchema.mainTable myIndex) */ * from mySchema.mainTable where name = 'jack'; {code}  ==> one record
{code}SELECT /*+ INDEX(mySchema.mainTable myIndex) */ * from mySchema.mainTable where name = 'john'; {code}  ==> zero records
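
The UPSERT result above can be sketched with a toy in-memory model in Python. This is not Phoenix code: the dictionaries and the upsert and select_by_name helpers are hypothetical names used only for illustration, and the only thing taken from the report is that the index row is keyed by (name, id). The point is that when the indexed column changes, the old index row has to be removed as well as the new one written.

```python
# Toy in-memory model (not Phoenix code). The index row key is (name, id),
# mirroring: CREATE INDEX myIndex ON mySchema.mainTable (name, id) INCLUDE (address)
main_table = {}   # id -> (name, address)
index_table = {}  # (name, id) -> address


def upsert(row_id, name, address):
    """Write the row and keep the index in step with it."""
    old = main_table.get(row_id)
    if old is not None and old[0] != name:
        # The indexed column changed, so the old index row is removed.
        del index_table[(old[0], row_id)]
    main_table[row_id] = (name, address)
    index_table[(name, row_id)] = address


def select_by_name(name):
    """Look rows up through the index, like the hinted SELECTs."""
    return [(row_id, n, addr)
            for (n, row_id), addr in index_table.items() if n == name]


upsert("1", "john", "Montreal")
upsert("1", "jack", "Montreal")
print(len(select_by_name("jack")))  # 1
print(len(select_by_name("john")))  # 0
```
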

But if I load the data into the main table using org.apache.phoenix.mapreduce.CsvBulkLoadTool,
it behaves differently. The main table is updated, but the new record is appended
to the index table and the old index row is not removed:

HADOOP_CLASSPATH=/usr/lib/hbase/lib/hbase-protocol-1.1.2.jar:/etc/hbase/conf hadoop jar  /usr/lib/hbase/phoenix-4.5.2-HBase-1.1-bin/phoenix-4.5.2-HBase-1.1-client.jar
org.apache.phoenix.mapreduce.CsvBulkLoadTool -d',' -s mySchema -t mainTable -i /tmp/input.txt


input.txt:
2,tomas,montreal
2,george,montreal

(I have tried it both with/without -it and got the same result)

{code}SELECT /*+ INDEX(mySchema.mainTable myIndex) */ * from mySchema.mainTable where name = 'tomas'; {code} ==> one record (the stale index row)

{code}SELECT /*+ INDEX(mySchema.mainTable myIndex) */ * from mySchema.mainTable where name = 'george'; {code} ==> one record
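
The bulk load result can be reproduced with a toy in-memory model in Python. This is not Phoenix code, and it rests on an assumption consistent with the observed behavior: that the bulk path derives the main-table row and the index row from each CSV line independently, without reading the row's previous state. The bulk_load and stale_index_rows helpers are hypothetical names used only for illustration.

```python
# Toy in-memory model (not Phoenix code) of the behavior reported here.
main_table = {}   # id -> (name, address)
index_table = {}  # (name, id) -> address


def bulk_load(csv_lines):
    """Assumed bulk path: write cells per line, never reading prior state."""
    for line in csv_lines:
        row_id, name, address = line.split(",")
        main_table[row_id] = (name, address)   # same row key: overwritten
        index_table[(name, row_id)] = address  # new index key: added alongside


def stale_index_rows():
    """Index rows whose name no longer matches the main table."""
    return sorted(
        key for key in index_table
        if main_table.get(key[1], (None,))[0] != key[0]
    )


bulk_load(["2,tomas,montreal", "2,george,montreal"])
print(main_table["2"])     # ('george', 'montreal')
print(stale_index_rows())  # [('tomas', '2')]
```

A check like stale_index_rows corresponds to comparing what the index returns for a name with what the main table currently holds for that id.
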



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
