phoenix-dev mailing list archives

From "Eli Levine (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-898) Extend PhoenixHBaseStorage to specify upsert columns
Date Tue, 08 Apr 2014 20:08:15 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963370#comment-13963370 ]

Eli Levine commented on PHOENIX-898:
------------------------------------

Pushed the patch to 3.0 and 4.0.  [~jviolettedsiq], any chance you can rebase onto the latest
master and submit a new patch?  I'm getting a number of non-trivial conflicts when applying the
patch to master.  Thanks!

> Extend PhoenixHBaseStorage to specify upsert columns
> ----------------------------------------------------
>
>                 Key: PHOENIX-898
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-898
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 3.0.0
>            Reporter: James Violette
>             Fix For: 3.0.0
>
>         Attachments: PHOENIX_898_1.patch
>
>
> We have a Phoenix table with data from multiple sources.  We would like to write a Pig
> script that upserts only the data associated with a feed, leaving other data alone.  The current
> PhoenixHBaseStorage automatically upserts all columns in a table.
> Given this table schema as an example:
> CREATE TABLE IF NOT EXISTS MYSCHEMA.MYTABLE
>  (NAME VARCHAR NOT NULL
>   ,D.INFO VARCHAR
>   ,D.D1 DOUBLE
>   ,D.I1 INTEGER
>   ,D.C1 VARCHAR
>  CONSTRAINT pk PRIMARY KEY (NAME));
> Assuming 'A' is loaded into Pig, the current syntax upserts all columns into MYSCHEMA.MYTABLE:
> STORE A into 'hbase://MYSCHEMA.MYTABLE' using
>     org.apache.phoenix.pig.PhoenixHBaseStorage('localhost','-batchSize 5000');
> We could specify the upsert columns after the table name in the hbase:// URL.
> This column-based example is equivalent to the full-table upsert:
> STORE A into 'hbase://MYSCHEMA.MYTABLE/NAME,D.INFO,D.D1,D.I1,D.C1' using
>     org.apache.phoenix.pig.PhoenixHBaseStorage('localhost','-batchSize 5000');
> This column-based example upserts only three of the five columns:
> STORE A into 'hbase://MYSCHEMA.MYTABLE/NAME,D.INFO,D.I1' using
>     org.apache.phoenix.pig.PhoenixHBaseStorage('localhost','-batchSize 5000');
> This change would touch:
> PhoenixHBaseStorage.setStoreLocation - parse the columns
> PhoenixPigConfiguration.configure - add an optional column list parameter
> PhoenixPigConfiguration.setup - create the upsert statement and the column metadata list
> The rest of the code should work as-is.
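
The parsing and statement-building steps proposed above could look roughly like the sketch
below.  It is only an illustration, not the code in PHOENIX_898_1.patch: the class
UpsertColumnsSketch and its members Target, parseLocation and buildUpsert are hypothetical
names, and the sketch assumes the column list is a plain comma-separated string after the
first '/' in the location.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper for illustration; not part of Phoenix or of the attached patch. */
final class UpsertColumnsSketch {
    private static final String SCHEME = "hbase://";

    /** Table name plus the optional column list taken from a store location. */
    static final class Target {
        final String table;
        final List<String> columns; // empty means no columns were given in the location

        Target(String table, List<String> columns) {
            this.table = table;
            this.columns = columns;
        }
    }

    /** Splits 'hbase://TABLE[/COL1,COL2,...]' into a table name and a column list. */
    static Target parseLocation(String location) {
        if (!location.startsWith(SCHEME)) {
            throw new IllegalArgumentException("Expected an hbase:// location: " + location);
        }
        String rest = location.substring(SCHEME.length());
        int slash = rest.indexOf('/');
        String table = slash < 0 ? rest : rest.substring(0, slash);
        if (table.isEmpty()) {
            throw new IllegalArgumentException("Missing table name: " + location);
        }
        List<String> columns = new ArrayList<>();
        if (slash >= 0) {
            for (String part : rest.substring(slash + 1).split(",")) {
                String name = part.trim();
                if (!name.isEmpty()) {
                    columns.add(name);
                }
            }
        }
        return new Target(table, Collections.unmodifiableList(columns));
    }

    /** Builds a parameterized UPSERT for an explicit, non-empty column list. */
    static String buildUpsert(String table, List<String> columns) {
        if (columns.isEmpty()) {
            // With no explicit list, real code would have to read the table's column metadata.
            throw new IllegalArgumentException("An explicit column list is required");
        }
        String placeholders = String.join(", ", Collections.nCopies(columns.size(), "?"));
        return "UPSERT INTO " + table
                + " (" + String.join(", ", columns) + ")"
                + " VALUES (" + placeholders + ")";
    }

    public static void main(String[] args) {
        Target t = parseLocation("hbase://MYSCHEMA.MYTABLE/NAME,D.INFO,D.I1");
        // Prints: UPSERT INTO MYSCHEMA.MYTABLE (NAME, D.INFO, D.I1) VALUES (?, ?, ?)
        System.out.println(buildUpsert(t.table, t.columns));
    }
}
```

A location without a column list, such as 'hbase://MYSCHEMA.MYTABLE', parses to an empty list
in this sketch, which a caller could treat as "upsert all columns" to keep the current
behaviour.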



--
This message was sent by Atlassian JIRA
(v6.2#6252)
