Yeah, I suspect the Source-property approach is the right thing here.

On Fri, Feb 26, 2016 at 3:37 PM, Micah Whitacre wrote:

> Where are you trying to specify them? Inside a DoFn? Prior to
> constructing the MRPipeline?
>
> I'd suggest trying either:
> 1. Setting those values on the initial Configuration object you pass to the
>    MRPipeline
> 2. Setting them as Source specific properties[1] on the source itself.
>
> The latter approach might be better if you are reading a lot of different
> sources into your pipeline and don't want to affect them all.
>
> [1] -
> http://crunch.apache.org/apidocs/0.12.0/org/apache/crunch/Source.html#inputConf(java.lang.String,%20java.lang.String)
>
> On Fri, Feb 26, 2016 at 5:17 PM, Ben Juhn wrote:
>
> > The data isn’t compressed. The parameters aren’t showing up in the job
> > configuration either.
> >
> > > On Feb 25, 2016, at 5:15 PM, Ben Juhn wrote:
> > >
> > > Hello there,
> > >
> > > I haven’t been able to get crunch to split inputs into multiple
> > > mappers. Currently it’s giving me one mapper per text file, even though
> > > they’re 1GB each. I’ve tried supplying split.maxsize on the command line
> > > and in the DoFn implementation:
> > >
> > >     @Override
> > >     public void configure(Configuration conf) {
> > >         conf.set("crunch.combine.file.size", "67108864");
> > >         conf.set("mapreduce.input.fileinputformat.split.maxsize", "67108864");
> > >         conf.set("mapreduce.input.fileinputformat.split.minsize", "67108864");
> > >     }
> > >
> > > Any suggestions?
> > >
> > > Thanks,
> > > Ben
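
As a sanity check on the numbers quoted above, here is a rough sketch of how many splits the 64 MB settings would be expected to yield per file once they actually reach the job configuration. It assumes the split-size rule of Hadoop's stock FileInputFormat, max(minSize, min(maxSize, blockSize)), with its 10% slop on the last split. The class and method names are made up for illustration, the 128 MB block size is an assumption, and "1GB" is taken to mean exactly 1 GiB. Crunch's own combine-file handling (crunch.combine.file.size) may change the final count, so treat this as an estimate only.

```java
// Sketch only: mirrors the split-size rule of Hadoop's stock FileInputFormat.
// SplitEstimate and its methods are hypothetical names, not Crunch or Hadoop APIs.
final class SplitEstimate {
    // The last split is allowed to run up to 10% over the target size.
    private static final double SPLIT_SLOP = 1.1;

    /** splitSize = max(minSize, min(maxSize, blockSize)) */
    static long splitSize(long minSize, long maxSize, long blockSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    /** Number of splits for one splittable (uncompressed) file of the given length. */
    static int countSplits(long fileLength, long splitSize) {
        int splits = 0;
        long remaining = fileLength;
        // Carve off full-size splits while more than 1.1 split sizes remain.
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits++;
            remaining -= splitSize;
        }
        // Whatever is left (at most 1.1 split sizes) becomes the final split.
        if (remaining > 0) {
            splits++;
        }
        return splits;
    }

    public static void main(String[] args) {
        long oneGiB = 1024L * 1024 * 1024;          // assumption: "1GB" means 1 GiB
        long sixtyFourMiB = 67108864L;              // min/max split size from the thread
        long assumedBlockSize = 128L * 1024 * 1024; // assumption: HDFS block size

        long size = splitSize(sixtyFourMiB, sixtyFourMiB, assumedBlockSize);
        System.out.println("split size: " + size);
        System.out.println("splits per 1 GiB file: " + countSplits(oneGiB, size));
    }
}
```

Under those assumptions each 1 GiB file should come out as 16 splits, so seeing a single mapper per file is consistent with the settings never reaching the input format rather than with the values themselves being wrong.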