phoenix-dev mailing list archives

From "Chinmay Kulkarni (Jira)" <j...@apache.org>
Subject [jira] [Updated] (PHOENIX-6153) Table Map Reduce job after a Snapshot based job fails with CorruptedSnapshotException
Date Mon, 28 Sep 2020 18:59:00 GMT

     [ https://issues.apache.org/jira/browse/PHOENIX-6153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Chinmay Kulkarni updated PHOENIX-6153:
--------------------------------------
    Fix Version/s: 5.1.0

> Table Map Reduce job after a Snapshot based job fails with CorruptedSnapshotException
> -------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-6153
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6153
>             Project: Phoenix
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 4.15.0, 4.14.3, master
>            Reporter: Saksham Gangwar
>            Assignee: Saksham Gangwar
>            Priority: Major
>             Fix For: 5.1.0, 4.16.0
>
>         Attachments: PHOENIX-6153.master.v1.patch, PHOENIX-6153.master.v2.patch, PHOENIX-6153.master.v3.patch,
PHOENIX-6153.master.v4.patch, PHOENIX-6153.master.v5.patch
>
>
> For the different MR job requests that reach [MapReduceParallelScanGrouper getRegionBoundaries|https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65], we currently rely on a configuration that is shared among jobs to figure out the snapshot name.
> Example job sequence: the first two jobs run over snapshots and the third job runs over a regular table.
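> As a minimal, self-contained sketch of the failure mode (the property string below is only a placeholder standing in for PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY, and the class is purely illustrative), a Configuration instance shared across jobs keeps whatever snapshot name the previous job set:
> {code:java}
> import org.apache.hadoop.conf.Configuration;
>
> public class SharedConfLeakSketch {
>     // Placeholder key; the real constant is PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY.
>     private static final String SNAPSHOT_NAME_KEY = "phoenix.mapreduce.snapshot.name";
>
>     public static void main(String[] args) {
>         Configuration shared = new Configuration(false);
>
>         // Job 1 is snapshot based and sets the key.
>         shared.set(SNAPSHOT_NAME_KEY, "ABC_TABLE_1");
>         System.out.println("Job 1 sees snapshot: " + shared.get(SNAPSHOT_NAME_KEY));
>
>         // Job 2 is snapshot based and overwrites the key.
>         shared.set(SNAPSHOT_NAME_KEY, "ABC_TABLE_2");
>         System.out.println("Job 2 sees snapshot: " + shared.get(SNAPSHOT_NAME_KEY));
>
>         // Job 3 runs over a plain table and never sets the key, yet it still sees
>         // ABC_TABLE_2, so the scan grouper tries to read that snapshot and fails.
>         System.out.println("Job 3 sees snapshot: " + shared.get(SNAPSHOT_NAME_KEY));
>     }
> }
> {code}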
> Printing hashcodes of the relevant objects when entering [https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65]:
> *Job 1:* (over snapshot of  *ABC_TABLE_1* and is successful)
> context.getConnection(): 521093916
>  ConnectionQueryServices: 1772519705
>  *Configuration conf: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_1*
>  
> *Job 2:* (over snapshot of *ABC_TABLE_2* and is successful)
> context.getConnection(): 1928017473
>  ConnectionQueryServices: 961279422
>  *Configuration conf: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
>  
> *Job 3:* (over the table *ABC_TABLE_3*; fails with CorruptedSnapshotException even though the job has nothing to do with snapshots)
> context.getConnection(): 28889670
>  ConnectionQueryServices: 424389847
>  *Configuration conf: 813285994*
>      conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): *ABC_TABLE_2*
>  
> The exception we get:
>  [2020:08:18 20:56:17.409] [MigrationRetryPoller-Executor-1] [ERROR] [c.s.hgrate.mapreduce.MapReduceImpl]
- Error submitting M/R job for Job 3
>  java.lang.RuntimeException: org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException:
Couldn't read snapshot info from:hdfs://.../hbase/.hbase-snapshot/ABC_TABLE_2_1597687413477/.snapshotinfo
>  at org.apache.phoenix.iterate.MapReduceParallelScanGrouper.getRegionBoundaries(MapReduceParallelScanGrouper.java:81)
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.iterate.BaseResultIterators.getRegionBoundaries(BaseResultIterators.java:541)
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:893)
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:641)
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.iterate.BaseResultIterators.<init>(BaseResultIterators.java:511)
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.iterate.ParallelIterators.<init>(ParallelIterators.java:62)
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:278) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:367) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:218) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:213) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.mapreduce.PhoenixInputFormat.setupParallelScansWithScanGrouper(PhoenixInputFormat.java:252)
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.mapreduce.PhoenixInputFormat.setupParallelScansFromQueryPlan(PhoenixInputFormat.java:235)
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.mapreduce.PhoenixInputFormat.generateSplits(PhoenixInputFormat.java:94)
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.phoenix.mapreduce.PhoenixInputFormat.getSplits(PhoenixInputFormat.java:89)
~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
>  at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:301) ~[hadoop-mapreduce-client-core-2.7.7-sfdc-1.0.18.jar:2.7.7-sfdc-1.0.18]
>  at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:318) ~[hadoop-mapreduce-client-core-2.7.7-sfdc-1.0.18.jar:2.7.7-sfdc-1.0.18]
>  at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:196)
~[hadoop-mapreduce-client-core-2.7.7-sfdc-1.0.18.jar:2.7.7-sfdc-1.0.18]
>  at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290) ~[hadoop-mapreduce-client-core-2.7.7-sfdc-1.0.18.jar:2.7.7-sfdc-1.0.18]
>  at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287) ~[hadoop-mapreduce-client-core-2.7.7-sfdc-1.0.18.jar:2.7.7-sfdc-1.0.18]
>  at java.security.AccessController.doPrivileged(Native Method) ~[na:1.8.0_172]
>  at javax.security.auth.Subject.doAs(Subject.java:422) ~[na:1.8.0_172]
>   
>  
>  Change Required:
> 1. When the snapshot name is set in a shared configuration, we also need a mechanism to remove it when a subsequent job is not snapshot based:
> [https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/mapreduce/PhoenixInputFormat.java#L210]
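> A minimal sketch of the direction of that change, not the actual patch (the helper class below is hypothetical, and the PhoenixConfigurationUtil import path is assumed): when a job is configured without a snapshot, clear any snapshot name left behind by an earlier job that shared the same Configuration:
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil;
>
> // Hypothetical helper illustrating the idea; the real fix belongs in Phoenix's
> // MR setup path (e.g. around PhoenixInputFormat / the MR job configuration code).
> public final class SnapshotConfigCleaner {
>     private SnapshotConfigCleaner() {}
>
>     public static void configureSnapshot(Configuration conf, String snapshotName) {
>         if (snapshotName != null) {
>             conf.set(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY, snapshotName);
>         } else {
>             // Job is not snapshot based: remove any stale name from a previous job.
>             conf.unset(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY);
>         }
>     }
> }
> {code}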
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
