crunch-dev mailing list archives

From "Josh Wills"
Subject Review Request: CRUNCH-128: Enable pipeline stages to depend on files being created on the filesystem.
Date Tue, 11 Dec 2012 07:26:05 GMT

Review request for crunch and Gabriel Reid.


This involves updating the PCollectionImpl class to track any SourceTarget instances that must
exist before any Target that depends on this PCollectionImpl can be created, and updating the
MSCRPlanner to check for this information and build jobs that incorporate these dependencies.
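
To make the idea concrete, here is a minimal, self-contained sketch of file-based stage ordering. It is not the code in this patch and does not use the Crunch API: the Stage class, the plan method, and the paths are all hypothetical stand-ins, assuming only that each stage can declare which paths it writes and which paths must exist before it starts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class DependencySketch {

    /** Hypothetical stand-in for a pipeline stage; not a Crunch class. */
    static final class Stage {
        final String name;
        final Set<String> outputs = new LinkedHashSet<>();       // paths this stage writes
        final Set<String> prerequisites = new LinkedHashSet<>(); // paths that must exist first

        Stage(String name) { this.name = name; }

        Stage produces(String path) { outputs.add(path); return this; }

        Stage requires(String path) { prerequisites.add(path); return this; }
    }

    /** Returns stage names ordered so that a path's producer runs before any stage requiring it. */
    static List<String> plan(List<Stage> stages) {
        Map<String, Stage> producerOf = new HashMap<>();
        for (Stage s : stages) {
            for (String path : s.outputs) {
                producerOf.put(path, s);
            }
        }
        List<String> order = new ArrayList<>();
        Set<Stage> done = new HashSet<>();
        Set<Stage> inProgress = new HashSet<>();
        for (Stage s : stages) {
            visit(s, producerOf, done, inProgress, order);
        }
        return order;
    }

    /** Depth-first visit: schedule the producers of all prerequisites, then the stage itself. */
    private static void visit(Stage s, Map<String, Stage> producerOf,
                              Set<Stage> done, Set<Stage> inProgress, List<String> order) {
        if (done.contains(s)) {
            return;
        }
        if (!inProgress.add(s)) {
            throw new IllegalStateException("Cyclic file dependency at stage " + s.name);
        }
        for (String path : s.prerequisites) {
            Stage producer = producerOf.get(path);
            // A path with no producer in this plan is assumed to exist already.
            if (producer != null) {
                visit(producer, producerOf, done, inProgress, order);
            }
        }
        inProgress.remove(s);
        done.add(s);
        order.add(s.name);
    }

    public static void main(String[] args) {
        // Hypothetical stages and path, loosely modeled on a map-side join
        // that needs its small table written out before it can run.
        Stage join = new Stage("mapside-join").requires("/tmp/small-table");
        Stage materialize = new Stage("materialize-small").produces("/tmp/small-table");
        System.out.println(plan(Arrays.asList(join, materialize)));
        // prints [materialize-small, mapside-join]
    }
}
```

The sketch orders whole stages with a depth-first topological sort; how the real planner groups stages into jobs is outside its scope.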

This isn't the prettiest implementation of this idea, but I think it'll turn out to be a useful
thing to have.

This addresses bug CRUNCH-128.


  crunch/src/it/java/org/apache/crunch/lib/join/ 297680e 
  crunch/src/main/java/org/apache/crunch/ bcf8727 
  crunch/src/main/java/org/apache/crunch/impl/mem/ 77c41ce 
  crunch/src/main/java/org/apache/crunch/impl/mr/ 60950f3 
  crunch/src/main/java/org/apache/crunch/impl/mr/collect/ f0d8187 
  crunch/src/main/java/org/apache/crunch/impl/mr/plan/ 7fe2809 
  crunch/src/main/java/org/apache/crunch/io/ 95c90aa 
  crunch/src/main/java/org/apache/crunch/lib/join/ 0ca1ab3 
  crunch/src/main/java/org/apache/crunch/materialize/ 3830616 



Updated the mapside join IT to use the new code and fixed the in-memory impl to work properly.


Josh Wills
