sqoop-dev mailing list archives

From "Raghav Gautam" <raghavgau...@gmail.com>
Subject Re: Review Request 13035: SQOOP-744: log4j configuration for generated mapreduce job
Date Sun, 18 Aug 2013 22:48:38 GMT


> On Aug. 14, 2013, 4:57 p.m., Jarek Cecho wrote:
> > execution/mapreduce/src/main/resources/META-INF/log4j.properties, lines 20-23
> > <https://reviews.apache.org/r/13035/diff/2/?file=330782#file330782line20>
> >
> >     I've tried the patch on a real cluster and got the following output (please accept my apologies for the really long text):
> >     
> >     Task Logs: 'attempt_201308141631_0001_m_000001_0'
> >     
> >     
> >     stdout logs
> >     
> >     
> >     stderr logs
> >     2660 [OutputFormatLoader-consumer] INFO  org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor  - SqoopOutputFormatLoadExecutor consumer thread is starting
> >     2748 [OutputFormatLoader-consumer] INFO  org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor  - Running loader class org.apache.sqoop.job.etl.HdfsTextImportLoader
> >     2752 [main] INFO  org.apache.sqoop.job.mr.SqoopMapper  - Starting progress service
> >     2777 [main] INFO  org.apache.sqoop.job.mr.SqoopMapper  - Running extractor class org.apache.sqoop.connector.jdbc.GenericJdbcImportExtractor
> >     2782 [pool-2-thread-1] DEBUG org.apache.sqoop.job.mr.ProgressRunnable  - Auto-progress thread reporting progress
> >     3969 [main] INFO  org.apache.sqoop.connector.jdbc.GenericJdbcImportExtractor  - Using query: SELECT * FROM text WHERE 100001 <= id AND id < 200001
> >     32122 [main] INFO  org.apache.sqoop.job.mr.SqoopMapper  - Extractor has finished
> >     32129 [main] INFO  org.apache.sqoop.job.mr.SqoopMapper  - Stopping progress service
> >     32136 [main] INFO  org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor  - SqoopOutputFormatLoadExecutor::SqoopRecordWriter is about to be closed
> >     34002 [OutputFormatLoader-consumer] INFO  org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor  - Loader has finished
> >     34002 [main] INFO  org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor  - SqoopOutputFormatLoadExecutor::SqoopRecordWriter is closed
> >     
> >     
> >     syslog logs
> >     2013-08-14 16:47:17,950 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
> >     2013-08-14 16:47:19,451 WARN org.apache.hadoop.conf.Configuration: session.id is deprecated. Instead, use dfs.metrics.session-id
> >     2013-08-14 16:47:19,452 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
> >     2013-08-14 16:47:20,160 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
> >     2013-08-14 16:47:20,172 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@549b6220
> >     2013-08-14 16:47:20,585 INFO org.apache.hadoop.mapred.MapTask: Processing split: org.apache.sqoop.job.mr.SqoopSplit@23de4dd8
> >     2013-08-14 16:47:20,610 INFO org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor: SqoopOutputFormatLoadExecutor consumer thread is starting
> >     2013-08-14 16:47:20,698 INFO org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor: Running loader class org.apache.sqoop.job.etl.HdfsTextImportLoader
> >     2013-08-14 16:47:20,702 INFO org.apache.sqoop.job.mr.SqoopMapper: Starting progress service
> >     2013-08-14 16:47:20,727 INFO org.apache.sqoop.job.mr.SqoopMapper: Running extractor class org.apache.sqoop.connector.jdbc.GenericJdbcImportExtractor
> >     2013-08-14 16:47:20,732 DEBUG org.apache.sqoop.job.mr.ProgressRunnable: Auto-progress thread reporting progress
> >     2013-08-14 16:47:21,919 INFO org.apache.sqoop.connector.jdbc.GenericJdbcImportExtractor: Using query: SELECT * FROM text WHERE 100001 <= id AND id < 200001
> >     2013-08-14 16:47:50,072 INFO org.apache.sqoop.job.mr.SqoopMapper: Extractor has finished
> >     2013-08-14 16:47:50,079 INFO org.apache.sqoop.job.mr.SqoopMapper: Stopping progress service
> >     2013-08-14 16:47:50,086 INFO org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor: SqoopOutputFormatLoadExecutor::SqoopRecordWriter is about to be closed
> >     2013-08-14 16:47:51,952 INFO org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor: Loader has finished
> >     2013-08-14 16:47:51,952 INFO org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor: SqoopOutputFormatLoadExecutor::SqoopRecordWriter is closed
> >     2013-08-14 16:47:51,952 INFO org.apache.hadoop.mapred.Task: Task:attempt_201308141631_0001_m_000001_0 is done. And is in the process of commiting
> >     2013-08-14 16:47:53,201 INFO org.apache.hadoop.mapred.Task: Task attempt_201308141631_0001_m_000001_0 is allowed to commit now
> >     2013-08-14 16:47:53,282 INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Saved output of task 'attempt_201308141631_0001_m_000001_0' to /user/root/text
> >     2013-08-14 16:47:53,289 INFO org.apache.hadoop.mapred.Task: Task 'attempt_201308141631_0001_m_000001_0' done.
> >     2013-08-14 16:47:53,295 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> >     
> >     It seems that all the messages are printed out twice, once into the syslog and
> >     once to the error log. As our configuration is configuring only the error log, I'm
> >     assuming that a different log4j configuration is being loaded as well (I would expect
> >     one from Hadoop for the MR related log lines). Considering that we do have all logs
> >     in the normal syslog, I'm wondering if the JIRA is still valid. What do you think, Raghav?

Syslog is where all the logging goes, and I think this is controlled by $HADOOP_CONF_DIR/log4j.properties.
The issue is that syslog also contains log lines that are completely unrelated to our Sqoop jobs
(the Hadoop task messages above, for example). And since $HADOOP_CONF_DIR/log4j.properties is
not under Sqoop's control, there is not much we can do there.

The patch allows a Sqoop job to have its own log4j.properties. This lets it define its
own appender and conversion pattern and print Sqoop's logs to stderr exclusively. This would
be useful in situations where we need to pull these logs from Hadoop and show them to the
user to help with debugging.
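
To make the idea concrete, here is a rough sketch of what a job-local log4j.properties could look like. This is not the file from the patch: the appender name "sqoopStderr", the DEBUG level and the conversion pattern are assumptions chosen only for illustration (the pattern is picked to resemble the stderr output quoted above).

```properties
# Illustrative sketch only, not the contents of the patch under review.
# "sqoopStderr" is a made-up appender name.

# Send everything logged under org.apache.sqoop to the stderr appender.
log4j.logger.org.apache.sqoop=DEBUG, sqoopStderr

# Console appender that writes to the task's stderr instead of stdout.
log4j.appender.sqoopStderr=org.apache.log4j.ConsoleAppender
log4j.appender.sqoopStderr.Target=System.err

# Layout: elapsed millis, thread, level, logger name, NDC, message.
log4j.appender.sqoopStderr.layout=org.apache.log4j.PatternLayout
log4j.appender.sqoopStderr.layout.ConversionPattern=%r [%t] %-5p %c %x - %m%n
```

With something along these lines, only loggers under org.apache.sqoop are attached to the stderr appender, so the Hadoop task messages stay out of the stderr log.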


- Raghav


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13035/#review25193
-----------------------------------------------------------


On July 30, 2013, 12:37 p.m., Raghav Gautam wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/13035/
> -----------------------------------------------------------
> 
> (Updated July 30, 2013, 12:37 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-744
>     https://issues.apache.org/jira/browse/SQOOP-744
> 
> 
> Repository: sqoop-sqoop2
> 
> 
> Description
> -------
> 
> Adding log4j.properties for the generated job.
> 
> 
> Diffs
> -----
> 
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/ConfigurationUtils.java f5f6d8e
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopMapper.java 59cf391
>   execution/mapreduce/src/main/java/org/apache/sqoop/job/mr/SqoopReducer.java b31161c
>   execution/mapreduce/src/main/resources/META-INF/log4j.properties PRE-CREATION
> 
> Diff: https://reviews.apache.org/r/13035/diff/
> 
> 
> Testing
> -------
> 
> Manually tested.
> 
> 
> Thanks,
> 
> Raghav Gautam
> 
>

