sqoop-dev mailing list archives

From "Gwen Shapira (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-2151) Sqoop2: Sqoop mapreduce job gets into deadlock when loader throws an exception
Date Wed, 13 May 2015 15:19:00 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-2151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542084#comment-14542084 ]

Gwen Shapira commented on SQOOP-2151:
-------------------------------------

I went over the patch with Ted. I agree that it resolves an issue, and I will commit it shortly.

We don't know if it resolves Jarcec's specific problem since we did not manage to reproduce
it.

I'll also open two follow-up JIRAs:
1. Add tests for the situation where the mapper throws an exception.
2. Move from the current simple producer-consumer model with a single in-flight record to
an actual buffer.
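
To make the second follow-up concrete, here is a rough sketch of what a buffered hand-off could look like. It is not the attached patch and not existing Sqoop code: the class name BufferedHandoff, its methods, the buffer capacity of 2 and the 50 ms timeout are all made up for illustration. The idea is only that the producer uses a bounded queue and a timed offer, so a consumer that has died is noticed instead of being waited on forever.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: BufferedHandoff and its methods are illustrative names,
// not classes or methods from the Sqoop code base or the attached patch.
public class BufferedHandoff {
    private static final Object END = new Object();   // end-of-data marker
    private final BlockingQueue<Object> buffer;
    private final AtomicReference<Throwable> failure = new AtomicReference<>();

    public BufferedHandoff(int capacity) {
        this.buffer = new ArrayBlockingQueue<>(capacity);
    }

    // Producer side (the record writer in this analogy).
    public void write(Object record) throws InterruptedException { put(record); }

    public void close() throws InterruptedException { put(END); }

    // Consumer side (the loader thread): next record, or null at end of data.
    public Object read() throws InterruptedException {
        Object item = buffer.take();
        return item == END ? null : item;
    }

    // Called by the consumer when it dies, so the producer stops waiting.
    public void fail(Throwable t) { failure.compareAndSet(null, t); }

    private void put(Object item) throws InterruptedException {
        // Check for a consumer failure before every attempt, and use a timed
        // offer so a full buffer with a dead consumer cannot block forever.
        do {
            Throwable t = failure.get();
            if (t != null) throw new IllegalStateException("consumer failed", t);
        } while (!buffer.offer(item, 50, TimeUnit.MILLISECONDS));
    }

    // Small driver: returns "completed:<n>" or "failed:<message>".
    public static String runDemo(boolean consumerFails) throws InterruptedException {
        BufferedHandoff handoff = new BufferedHandoff(2);
        List<Object> loaded = Collections.synchronizedList(new ArrayList<>());
        Thread consumer = new Thread(() -> {
            try {
                if (consumerFails) throw new RuntimeException("boom");
                for (Object r = handoff.read(); r != null; r = handoff.read()) loaded.add(r);
            } catch (Throwable t) {
                handoff.fail(t);   // report the failure instead of dying silently
            }
        }, "consumer");
        consumer.start();
        try {
            for (int i = 0; i < 5; i++) handoff.write(i);
            handoff.close();
            consumer.join();
            return "completed:" + loaded.size();
        } catch (IllegalStateException e) {
            consumer.join();
            return "failed:" + e.getCause().getMessage();
        }
    }
}
```

In this sketch a failing consumer surfaces on the producer side as an IllegalStateException rather than a hang. A real version would also need to re-check the failure after the final join, since the consumer could still fail after the end marker has been queued.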

> Sqoop2: Sqoop mapreduce job gets into deadlock when loader throws an exception
> ------------------------------------------------------------------------------
>
>                 Key: SQOOP-2151
>                 URL: https://issues.apache.org/jira/browse/SQOOP-2151
>             Project: Sqoop
>          Issue Type: Bug
>    Affects Versions: 1.99.5
>            Reporter: Jarek Jarcec Cecho
>            Assignee: Ted Malaska
>            Priority: Blocker
>             Fix For: 2.0.0
>
>         Attachments: SQOOP-2151.patch
>
>
> I'm working on Kite integration tests and I've noticed that there is a certain case where the Sqoop mapreduce job gets into a deadlock.
> I got there by running a Kite job after upgrading to Kite 1.0 but before fixing the temporary data set problem covered by SQOOP-2150. Here is the log output from the mapper:
> {code}
> 2015-02-28 09:14:50,994 [OutputFormatLoader-consumer] INFO  org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor  - SqoopOutputFormatLoadExecutor consumer thread is starting
> 2015-02-28 09:14:51,021 [OutputFormatLoader-consumer] INFO  org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor  - Running loader class org.apache.sqoop.connector.kite.KiteLoader
> 2015-02-28 09:14:51,025 [main] INFO  org.apache.sqoop.job.mr.SqoopMapper  - Starting progress service
> 2015-02-28 09:14:51,030 [main] INFO  org.apache.sqoop.job.mr.SqoopMapper  - Running extractor class org.apache.sqoop.connector.jdbc.GenericJdbcExtractor
> 2015-02-28 09:14:51,306 [main] INFO  org.apache.sqoop.connector.jdbc.GenericJdbcExtractor  - Using query: SELECT * FROM FROMRDBMSTOKITETEST WHERE 1 <= "id" AND "id" <= 4
> 2015-02-28 09:14:51,627 [OutputFormatLoader-consumer] ERROR org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor  - Error while loading data out of MR job.
> org.kitesdk.data.ValidationException: Dataset name temp_9975e79a-7e5d-493a-b6d4-646f3452a51f is not alphanumeric (plus '_')
> 	at org.kitesdk.data.ValidationException.check(ValidationException.java:55)
> 	at org.kitesdk.data.spi.Compatibility.checkDatasetName(Compatibility.java:103)
> 	at org.kitesdk.data.spi.Compatibility.check(Compatibility.java:66)
> 	at org.kitesdk.data.spi.filesystem.FileSystemMetadataProvider.create(FileSystemMetadataProvider.java:209)
> 	at org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository.create(FileSystemDatasetRepository.java:137)
> 	at org.kitesdk.data.Datasets.create(Datasets.java:239)
> 	at org.kitesdk.data.Datasets.create(Datasets.java:307)
> 	at org.kitesdk.data.Datasets.create(Datasets.java:335)
> 	at org.apache.sqoop.connector.kite.KiteDatasetExecutor.createDataset(KiteDatasetExecutor.java:67)
> 	at org.apache.sqoop.connector.kite.KiteLoader.getExecutor(KiteLoader.java:51)
> 	at org.apache.sqoop.connector.kite.KiteLoader.load(KiteLoader.java:61)
> 	at org.apache.sqoop.connector.kite.KiteLoader.load(KiteLoader.java:36)
> 	at org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor$ConsumerThread.run(SqoopOutputFormatLoadExecutor.java:250)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> 2015-02-28 09:14:51,633 [main] INFO  org.apache.sqoop.job.mr.SqoopMapper  - Stopping progress service
> 2015-02-28 09:14:51,634 [main] INFO  org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor  - SqoopOutputFormatLoadExecutor::SqoopRecordWriter is about to be closed
> {code}
> But the mapper never finished. Here is the relevant jstack:
> {code}
> "main" #1 prio=5 os_prio=31 tid=0x00007fedf180a800 nid=0xc07 waiting on condition [0x000000010b3f2000]
>    java.lang.Thread.State: WAITING (parking)
> 	at sun.misc.Unsafe.park(Native Method)
> 	- parking to wait for  <0x0000000127399b50> (a java.util.concurrent.Semaphore$FairSync)
> 	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
> 	at java.util.concurrent.Semaphore.acquire(Semaphore.java:312)
> 	at org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor$SqoopRecordWriter.close(SqoopOutputFormatLoadExecutor.java:113)
> 	at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:667)
> 	at org.apache.hadoop.mapred.MapTask.closeQuietly(MapTask.java:2012)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:794)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> 	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> 	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> {code}
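
The jstack above shows the main thread parked in Semaphore.acquire() inside SqoopRecordWriter.close(), after the consumer thread has already logged its exception. A stripped-down sketch of that kind of hang follows. It is an assumption about the general pattern, not a copy of SqoopOutputFormatLoadExecutor or of the attached patch, and the class and method names (HandoffDeadlockSketch, closeReturns) are invented for the example. The consumer fails before handing a permit back; the closing side then waits for a permit that never arrives unless the release happens in a finally block.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: names are illustrative and do not come from Sqoop.
public class HandoffDeadlockSketch {

    // Returns true if the closing side obtains its permit, false if it would
    // have waited forever (a timed tryAcquire stands in for acquire() so the
    // demo itself terminates).
    public static boolean closeReturns(boolean releaseInFinally) throws InterruptedException {
        // Fair semaphore with no permits: the closer needs one from the consumer.
        Semaphore free = new Semaphore(0, true);

        Thread consumer = new Thread(() -> {
            try {
                // Stand-in for a loader that throws before consuming anything.
                throw new IllegalStateException("loader failed");
            } catch (RuntimeException e) {
                // The failure is only recorded here; nothing wakes the closer.
            } finally {
                // The permit is handed back even on failure only if asked to.
                if (releaseInFinally) free.release();
            }
        }, "consumer");
        consumer.start();
        consumer.join();

        // Stand-in for the acquire() call seen in close() in the jstack.
        return free.tryAcquire(200, TimeUnit.MILLISECONDS);
    }
}
```

With releaseInFinally set to false the closing side never gets a permit, which matches the parked main thread in the jstack; with it set to true the close proceeds. Whether the real fix takes this shape or another (for example, checking the consumer's Future before waiting) is not something this sketch claims.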



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
