falcon-dev mailing list archives

From "Satish Mittal" <satish.mit...@apache.org>
Subject Re: Review Request 18626: FALCON-284: Hcatalog based feed retention doesn't work when partition filter spans across multiple partition keys
Date Wed, 02 Apr 2014 10:21:52 GMT

This is an automatically generated e-mail. To reply, visit:

(Updated April 2, 2014, 10:21 a.m.)

Review request for Falcon and Srikanth Sundarrajan.


Attaching updated patch with review comments incorporated.

Repository: falcon-git


When an HCatalog-based feed is scheduled in Falcon, retention looks only at the first partition
key that matches one of the date patterns yyyy | MM | dd | HH | mm. As a result, it computes
a partition filter that contains only one of these patterns. However, if the HCatalog table is
defined so that the date spans multiple partition keys (year/month/day/hour/minute),
feed retention doesn't delete any partitions that are more granular than the first level (year).
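
To illustrate the problem, here is a minimal sketch of how a retention filter could cover a
date that spans several partition keys. It is not the code from the attached patch: the class
PartitionFilterSketch, the method buildFilter, and the partition key names are hypothetical,
and the filter syntax is only assumed to resemble what the catalog accepts. The idea is a
lexicographic comparison: a partition is older than the cutoff if its year is smaller, or the
year is equal and the month is smaller, and so on for each finer key.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Falcon's actual implementation.
class PartitionFilterSketch {

    // Builds a filter that selects partitions strictly older than the cutoff.
    // datePartitions maps partition key -> date pattern, ordered coarsest first
    // (for example year -> yyyy, month -> MM, day -> dd).
    static String buildFilter(Map<String, String> datePartitions, LocalDateTime cutoff) {
        List<String> clauses = new ArrayList<>();
        // Accumulates "key = 'value' and " for every coarser key already visited.
        StringBuilder equalPrefix = new StringBuilder();
        for (Map.Entry<String, String> entry : datePartitions.entrySet()) {
            String key = entry.getKey();
            String value = cutoff.format(DateTimeFormatter.ofPattern(entry.getValue()));
            // All coarser keys equal the cutoff, and this key is smaller.
            clauses.add("(" + equalPrefix + key + " < '" + value + "')");
            equalPrefix.append(key).append(" = '").append(value).append("' and ");
        }
        return String.join(" or ", clauses);
    }

    public static void main(String[] args) {
        Map<String, String> keys = new LinkedHashMap<>();
        keys.put("year", "yyyy");
        keys.put("month", "MM");
        keys.put("day", "dd");
        System.out.println(buildFilter(keys, LocalDateTime.of(2014, 4, 2, 0, 0)));
    }
}
```

A filter built from the year key alone would reduce to the first clause, which selects nothing
while the cutoff falls in the same year as the data. The additional clauses are what let
partitions at month and day granularity qualify for deletion.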

Diffs (updated)

  common/src/main/java/org/apache/falcon/catalog/AbstractCatalogService.java fc9c3b1 
  common/src/main/java/org/apache/falcon/catalog/HiveCatalogService.java 3c3660e 
  common/src/main/java/org/apache/falcon/entity/common/FeedDataPath.java 4031e14 
  retention/src/main/java/org/apache/falcon/retention/FeedEvictor.java a8db52e 
  webapp/src/test/java/org/apache/falcon/catalog/HiveCatalogServiceIT.java fd004a1 
  webapp/src/test/java/org/apache/falcon/lifecycle/TableStorageFeedEvictorIT.java 770780e

Diff: https://reviews.apache.org/r/18626/diff/


- Added new integration tests in TableStorageFeedEvictorIT.java to test retention for an HCatalog
feed where the date consists of multiple partition columns (year/month/day).
- Verified the retention behavior on a test cluster with an HCatalog-based feed partitioned
by year/month/day/hour/minute/country.


Satish Mittal
