crunch-dev mailing list archives

From Ben Juhn
Subject Processing splittable inputs
Date Fri, 26 Feb 2016 01:15:06 GMT
Hello there,

I haven’t been able to get Crunch to split inputs across multiple mappers. Currently it
gives me one mapper per text file, even though each file is 1 GB. I’ve tried setting
split.maxsize both on the command line and in the DoFn implementation:

public void configure(Configuration conf) {
    // 67108864 bytes = 64 MB
    conf.set("crunch.combine.file.size", "67108864");
    conf.set("mapreduce.input.fileinputformat.split.maxsize", "67108864");
    conf.set("mapreduce.input.fileinputformat.split.minsize", "67108864");
}

Any suggestions?
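
For reference, here is a rough sketch of the arithmetic behind the mapper count I was expecting. It is not Crunch code. It assumes the input format honors the min/max split settings in the way Hadoop's FileInputFormat is generally described as doing (split size = max(minSize, min(maxSize, blockSize)), with a 10% slop on the final split), and that the text files are uncompressed and therefore splittable. The class name, method names, and the 128 MB block size are hypothetical, chosen only for illustration.

```java
// Sketch only: mirrors the split-size arithmetic usually described for
// Hadoop's FileInputFormat. Names and the block size are assumptions.
class SplitEstimate {
    // The final split may run up to 10% over the target size.
    static final double SPLIT_SLOP = 1.1;

    // Clamp the block size between the configured min and max split sizes.
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Count the splits produced for one splittable file of the given length.
    static long countSplits(long fileLength, long splitSize) {
        long splits = 0;
        long remaining = fileLength;
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits++;
            remaining -= splitSize;
        }
        if (remaining != 0) {
            splits++; // whatever is left becomes the final split
        }
        return splits;
    }

    public static void main(String[] args) {
        long fileLength = 1L << 30;           // one 1 GB input file
        long blockSize = 128L * 1024 * 1024;  // assumed HDFS block size
        long configured = 67108864L;          // min and max split size from above
        long splitSize = computeSplitSize(blockSize, configured, configured);
        System.out.println(splitSize + " bytes per split, "
                + countSplits(fileLength, splitSize) + " splits per file");
    }
}
```

Under those assumptions each 1 GB file should yield 16 splits of 64 MB, so 16 mappers per file rather than the single mapper I am seeing.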

