sqoop-dev mailing list archives

From Zoltán Tóth-Czifra <gphi...@vipmail.hu>
Subject Re: Review Request: SQOOP-604 Easy throttling feature for MySQL exports
Date Fri, 28 Sep 2012 10:27:17 GMT


> On Sept. 28, 2012, 9:59 a.m., Abhijeet Gaikwad wrote:
> > src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java, line 329
> > <https://reviews.apache.org/r/7135/diff/1/?file=155911#file155911line329>
> >
> >     What happens when MYSQL_CHECKPOINT_SLEEP_KEY is greater than mapred.task.timeout?
> >     
> >     If the job is killed, we need to handle the scenario.

That's a good point! 

Given that the default value of mapred.task.timeout is 600000 ms (10 minutes), I consider
this very unlikely: the ideal value of the new config key is on the order of a few hundred
milliseconds. However, in some extreme cases (or when the feature is clearly misused) the
sleep could exceed the timeout, so this case needs to be handled.

Do you have any suggestions? For example, capping sqoop.mysql.export.sleep.ms at the value
of mapred.task.timeout?
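
To make the capping idea concrete, here is a minimal sketch. It is not part of the patch: the class and method names are hypothetical, the "half of the timeout" margin is an arbitrary assumption, and it treats a timeout of 0 as "no timeout", which is how Hadoop documents mapred.task.timeout.

```java
/** Hypothetical helper, not part of the SQOOP-604 patch. */
public final class SleepClamp {
    private SleepClamp() {}

    /**
     * Returns a sleep duration that stays below the task timeout.
     *
     * @param requestedMs   value read from sqoop.mysql.export.sleep.ms
     * @param taskTimeoutMs value read from mapred.task.timeout (0 or less = disabled)
     */
    public static long clampSleepMs(long requestedMs, long taskTimeoutMs) {
        if (requestedMs <= 0) {
            return 0; // negative or zero means "do not sleep"
        }
        if (taskTimeoutMs <= 0) {
            return requestedMs; // timeout disabled, nothing to cap against
        }
        // Assumed safety margin: never sleep for more than half the timeout.
        long ceiling = taskTimeoutMs / 2;
        return Math.min(requestedMs, ceiling);
    }
}
```

With the default timeout, a sane value such as 500 passes through unchanged, while an extreme value such as 900000 would be reduced to 300000.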


- Zoltán


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7135/#review12019
-----------------------------------------------------------


On Sept. 27, 2012, 3:47 p.m., Zoltán Tóth-Czifra wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7135/
> -----------------------------------------------------------
> 
> (Updated Sept. 27, 2012, 3:47 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Description
> -------
> 
> Code review for SQOOP-604, see https://issues.apache.org/jira/browse/SQOOP-604
> 
> The solution in short: it builds on the existing "checkpoint" feature of direct
> (--direct) MySQL exports, where the export process is restarted every N bytes written,
> and extends it with a new config value that simply makes the thread sleep for X
> milliseconds at each checkpoint. With a low enough byte-count limit this can be a simple
> yet powerful throttling mechanism.
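
The mechanism described above can be sketched roughly as follows. This is an illustration only, not the code in MySQLExportMapper: the class and method names are hypothetical, and the exact checkpoint condition used by Sqoop may differ.

```java
/** Hypothetical illustration of "sleep at every checkpoint" throttling. */
public final class CheckpointThrottle {
    private final long checkpointBytes; // e.g. sqoop.mysql.export.checkpoint.bytes
    private final long sleepMs;         // e.g. sqoop.mysql.export.sleep.ms
    private long bytesSinceCheckpoint = 0;

    public CheckpointThrottle(long checkpointBytes, long sleepMs) {
        this.checkpointBytes = checkpointBytes;
        this.sleepMs = sleepMs;
    }

    /**
     * Records that n bytes were written. Returns true if a checkpoint was
     * reached, in which case the calling thread has already been paused.
     */
    public boolean recordWrite(long n) {
        bytesSinceCheckpoint += n;
        if (checkpointBytes <= 0 || bytesSinceCheckpoint < checkpointBytes) {
            return false;
        }
        bytesSinceCheckpoint = 0;
        // A real mapper would restart the export process at this point.
        if (sleepMs > 0) {
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
            }
        }
        return true;
    }
}
```

The smaller the checkpoint size, the more often the pause is taken, which is why the byte limit and the sleep time together determine the effective throughput.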
> 
> 
> Diffs
> -----
> 
>   src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java a4e8b88 
> 
> Diff: https://reviews.apache.org/r/7135/diff/
> 
> 
> Testing
> -------
> 
> Executing with different settings of sqoop.mysql.export.checkpoint.bytes and sqoop.mysql.export.sleep.ms:
> 
> 33554432B / 0ms: Transferred 4.7579 MB in 8.7175 seconds (558.8826 KB/sec)
> 102400B / 500ms: Transferred 4.7579 MB in 35.7794 seconds (136.1698 KB/sec)
> 51200B / 500ms: Transferred 4.758 MB in 57.8675 seconds (84.1959 KB/sec)
> 51200B / 250ms: Transferred 4.7579 MB in 35.0293 seconds (139.0854 KB/sec)
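
As a rough sanity check on these numbers, the expected duration can be estimated as the unthrottled baseline plus one sleep per checkpoint. The sketch below assumes a single mapper, that "MB" means 2^20 bytes (so 4.7579 MB is about 4989020 bytes), and that restarting the export process costs nothing; all names are hypothetical.

```java
/** Hypothetical back-of-the-envelope estimate of throttled export time. */
public final class ThrottleEstimate {
    private ThrottleEstimate() {}

    /** Number of checkpoints hit while writing totalBytes. */
    public static long checkpoints(long totalBytes, long checkpointBytes) {
        return checkpointBytes <= 0 ? 0 : totalBytes / checkpointBytes;
    }

    /** Baseline duration plus one sleep of sleepMs per checkpoint. */
    public static double estimatedSeconds(double baselineSeconds, long totalBytes,
                                          long checkpointBytes, long sleepMs) {
        return baselineSeconds
                + checkpoints(totalBytes, checkpointBytes) * (sleepMs / 1000.0);
    }
}
```

Using the 8.7175-second run as the baseline, this gives about 32.7 s for 102400B / 500ms, 57.2 s for 51200B / 500ms and 33.0 s for 51200B / 250ms, against the measured 35.8 s, 57.9 s and 35.0 s. The remaining gap is plausibly the cost of restarting the export process at each checkpoint, which the estimate ignores.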
> 
> I have not added unit tests yet; because the change involves calling Thread.sleep, I
> find it difficult to test. Unfortunately there is no "machine" or "environment" object
> that could be injected into these classes as a mock to take care of time-related fixtures.
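
One possible way around the testing difficulty is a small injectable seam for sleeping. The sketch below is an assumption about how that could look, not something that exists in Sqoop: the Sleeper interface, the recording fake and the pauseAtCheckpoints method are all hypothetical names.

```java
/** Hypothetical seam that lets tests replace Thread.sleep with a fake. */
public final class InjectableSleep {
    private InjectableSleep() {}

    public interface Sleeper {
        void sleep(long millis) throws InterruptedException;
    }

    /** Production implementation: delegates to Thread.sleep. */
    public static final Sleeper REAL = Thread::sleep;

    /** Test double: records the requested sleeps without waiting. */
    public static final class RecordingSleeper implements Sleeper {
        private long totalMs = 0;
        private int calls = 0;

        @Override
        public void sleep(long millis) {
            totalMs += millis;
            calls++;
        }

        public long totalMs() { return totalMs; }
        public int calls() { return calls; }
    }

    /** Sleeps once per full checkpoint and returns the number of checkpoints. */
    public static int pauseAtCheckpoints(long totalBytes, long checkpointBytes,
                                         long sleepMs, Sleeper sleeper) {
        if (checkpointBytes <= 0) {
            return 0;
        }
        int checkpoints = (int) (totalBytes / checkpointBytes);
        try {
            for (int i = 0; i < checkpoints; i++) {
                sleeper.sleep(sleepMs);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
        }
        return checkpoints;
    }
}
```

A unit test could then pass a RecordingSleeper and assert on the number and total length of the requested sleeps, without the test itself taking any wall-clock time.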
> 
> 
> Thanks,
> 
> Zoltán Tóth-Czifra
> 
>

