From: "Chao Shi (JIRA)"
To: crunch-dev@incubator.apache.org
Date: Tue, 5 Mar 2013 17:02:15 +0000 (UTC)
Subject: [jira] [Commented] (CRUNCH-172) Refine synchronization mechanism in CrunchJobControl

    [ https://issues.apache.org/jira/browse/CRUNCH-172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13593586#comment-13593586 ]

Chao Shi commented on CRUNCH-172:
---------------------------------

I've been working on a patch. There are a few small issues left; after solving them, I will post it here.

> Refine synchronization mechanism in CrunchJobControl
> ----------------------------------------------------
>
>                 Key: CRUNCH-172
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-172
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>   Affects Versions: 0.6.0
>            Reporter: Chao Shi
>            Assignee: Josh Wills
>
> Currently CrunchJobControl uses a runnerState to synchronize its background loop with client calls (e.g. stop()). This is not sufficient: Jenkins has reported a failure since CRUNCH-156 was checked in.
> MRExecutor does the following in its monitorLoop:
> {code}
> Thread controlThread = new Thread(control);
> controlThread.start();
> while (killSignal.getCount() > 0 && !control.allFinished()) {
>   killSignal.await(1, TimeUnit.SECONDS);
> }
> control.stop();
> {code}
> And this is how CrunchJobControl works:
> {code}
> public void stop() {
>   this.runnerState = ThreadState.STOPPING;
> }
> public void run() {
>   this.runnerState = ThreadState.RUNNING;
>   while (true) {
>     ...
>   }
> }
> {code}
> So stop() can be called before run() has started in the other thread. In that case run() overwrites STOPPING with RUNNING and the stop request is lost. MRExecutor then assumes everything has stopped and starts its cleanup work, while CrunchJobControl continues to submit new jobs. Because the cleanup has already been done, the newly submitted jobs fail with FileNotFound.
> I think a solution is to remove the background thread in CrunchJobControl and let MRExecutor call it periodically.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
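
To illustrate the fix proposed in the description (drop the background thread and let the executor drive the job control from its own loop), here is a minimal sketch. It is not the actual patch: PollingJobControl, pollOnce(), submittedJobs() and MonitorLoopSketch are hypothetical names, jobs are modeled as plain strings, and a job is assumed to finish as soon as it is submitted. Because polling and stop() happen on the same thread, a stop request cannot be overwritten by a run() that starts late.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for CrunchJobControl: no run() loop, no background thread.
class PollingJobControl {
  private final Queue<String> pending = new ArrayDeque<>();
  private final List<String> submitted = new ArrayList<>();
  private boolean stopped = false;

  PollingJobControl(List<String> jobs) {
    pending.addAll(jobs);
  }

  // Submits at most one pending job per call; a no-op once stop() has been called.
  synchronized void pollOnce() {
    if (stopped) {
      return;
    }
    String next = pending.poll();
    if (next != null) {
      submitted.add(next);
    }
  }

  // Simplification for this sketch: a job counts as finished once it is submitted.
  synchronized boolean allFinished() {
    return pending.isEmpty();
  }

  synchronized void stop() {
    stopped = true;
  }

  // Returns a copy so callers cannot mutate internal state.
  synchronized List<String> submittedJobs() {
    return new ArrayList<>(submitted);
  }
}

class MonitorLoopSketch {
  // Mirrors the monitorLoop from the description, but polls the control directly
  // instead of starting a Thread for it.
  static List<String> runMonitorLoop(PollingJobControl control,
                                     CountDownLatch killSignal,
                                     long pollMillis) {
    try {
      while (killSignal.getCount() > 0 && !control.allFinished()) {
        control.pollOnce();
        killSignal.await(pollMillis, TimeUnit.MILLISECONDS);
      }
    } catch (InterruptedException e) {
      // Restore the interrupt flag and fall through to stop().
      Thread.currentThread().interrupt();
    }
    control.stop();
    // Cleanup would go here; no further submissions can happen after stop().
    return control.submittedJobs();
  }

  public static void main(String[] args) {
    PollingJobControl control = new PollingJobControl(Arrays.asList("job-1", "job-2"));
    System.out.println(runMonitorLoop(control, new CountDownLatch(1), 10));
  }
}
```

The trade-off in this sketch is that job submission is only as responsive as the polling interval, which is the same one-second granularity the existing monitorLoop already uses for the kill signal.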