Date: Thu, 13 Oct 2016 14:26:21 +0000 (UTC)
From: "Hudson (JIRA)"
To: dev@sqoop.apache.org
Reply-To: dev@sqoop.apache.org
Subject: [jira] [Commented] (SQOOP-2986) Add validation check for --hive-import and --incremental lastmodified


    [ https://issues.apache.org/jira/browse/SQOOP-2986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15572070#comment-15572070 ]

Hudson commented on SQOOP-2986:
-------------------------------

SUCCESS: Integrated in Jenkins build Sqoop-hadoop200 #1067 (See [https://builds.apache.org/job/Sqoop-hadoop200/1067/])
SQOOP-2986: Add validation check for --hive-import and --incremental (maugli:
[https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=14754342d3a9bd6e146b9628b2e103ff30f310d8])
* (add) src/test/org/apache/sqoop/tool/ImportToolValidateOptionsTest.java
* (edit) src/java/org/apache/sqoop/tool/BaseSqoopTool.java


> Add validation check for --hive-import and --incremental lastmodified
> ---------------------------------------------------------------------
>
>                 Key: SQOOP-2986
>                 URL: https://issues.apache.org/jira/browse/SQOOP-2986
>             Project: Sqoop
>          Issue Type: Bug
>    Affects Versions: 1.4.6
>            Reporter: Szabolcs Vasas
>            Assignee: Szabolcs Vasas
>             Fix For: 1.4.7
>
>         Attachments: SQOOP-2986.patch, SQOOP-2986.patch
>
>
> Sqoop import with the --hive-import and --incremental lastmodified options is not supported. However, the application still runs with these parameters and produces unexpected results: the output can contain duplicate rows.
> Steps to reproduce the issue:
> 1) Create the necessary table, for example in MySQL:
> CREATE TABLE `Employees` (
>   `id` int(11) NOT NULL,
>   `name` varchar(45) DEFAULT NULL,
>   `salary` varchar(45) DEFAULT NULL,
>   `change_date` datetime DEFAULT NULL,
>   PRIMARY KEY (`id`)
> ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
> INSERT INTO `Employees` (`id`,`name`,`salary`,`change_date`) VALUES (1,'employee1','1000',now());
> INSERT INTO `Employees` (`id`,`name`,`salary`,`change_date`) VALUES (2,'employee2','2000',now());
> INSERT INTO `Employees` (`id`,`name`,`salary`,`change_date`) VALUES (3,'employee3','3000',now());
> INSERT INTO `Employees` (`id`,`name`,`salary`,`change_date`) VALUES (4,'employee4','4000',now());
> INSERT INTO `Employees` (`id`,`name`,`salary`,`change_date`) VALUES (5,'employee5','5000',now());
> 2) Import the table to Hive:
> sudo -u hdfs sqoop import --connect jdbc:mysql://servername:3306/sqoop --username sqoop --password sqoop --table Employees --num-mappers 1 --hive-import --hive-table Employees
> 3) Update some rows in MySQL:
> UPDATE Employees SET salary=1010, change_date=now() where id=1;
> UPDATE Employees SET salary=2010, change_date=now() where id=2;
> 4) Execute the incremental import command:
> sudo -u hdfs sqoop import --verbose --connect jdbc:mysql://servername:3306/sqoop --username sqoop --password sqoop --table Employees --incremental lastmodified --check-column change_date --merge-key id --num-mappers 1 --hive-import --hive-table Employees --last-value "last_timestamp"
> 5) As a result, the employees with ids 1 and 2 are not updated; instead, duplicate rows appear in the Hive table.
> The task is to introduce a fail-fast validation that makes the Sqoop import fail when it is submitted with both the --hive-import and --incremental lastmodified options.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
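
The fail-fast validation requested in the issue could look roughly like the sketch below. It is a minimal illustration, not the code committed to BaseSqoopTool.java: the class name HiveIncrementalValidationSketch, the helper valueOf, the InvalidOptionsException type, and the error message are all hypothetical, and the sketch scans a raw argument list rather than Sqoop's parsed options object. It rejects only the combination named in the issue (--hive-import together with --incremental lastmodified).

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: class, method, and exception names here are
// hypothetical and are not taken from Sqoop's BaseSqoopTool.
class HiveIncrementalValidationSketch {

    // Hypothetical exception type used to signal a rejected option combination.
    static class InvalidOptionsException extends IllegalArgumentException {
        InvalidOptionsException(String message) {
            super(message);
        }
    }

    // Returns the token that follows `flag`, or null if the flag is absent
    // or is the last token on the command line.
    static String valueOf(List<String> args, String flag) {
        int i = args.indexOf(flag);
        return (i >= 0 && i + 1 < args.size()) ? args.get(i + 1) : null;
    }

    // Fail fast: reject the unsupported combination before any job is submitted.
    static void validate(List<String> args) {
        boolean hiveImport = args.contains("--hive-import");
        boolean lastModified = "lastmodified".equals(valueOf(args, "--incremental"));
        if (hiveImport && lastModified) {
            throw new InvalidOptionsException(
                "--incremental lastmodified cannot be used together with --hive-import");
        }
    }

    public static void main(String[] argv) {
        // Mirrors the option combination from step 4 of the issue description.
        List<String> step4 = Arrays.asList(
            "import", "--table", "Employees",
            "--incremental", "lastmodified", "--check-column", "change_date",
            "--merge-key", "id", "--hive-import", "--hive-table", "Employees");
        try {
            validate(step4);
            System.out.println("accepted");
        } catch (InvalidOptionsException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Running the check before any import work starts means the command from step 4 stops with an error instead of silently writing duplicate rows into the Hive table. Either option on its own passes the check in this sketch.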