Date: Sat, 15 Dec 2012 02:06:16 +0000 (UTC)
From: "Hari Shreedharan (JIRA)"
To: dev@sqoop.apache.org
Reply-To: dev@sqoop.apache.org
Subject: [jira] [Comment Edited] (SQOOP-761) HDFSTextExportExtractor loses lines around partition boundaries

    [ https://issues.apache.org/jira/browse/SQOOP-761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532894#comment-13532894 ]

Hari Shreedharan edited comment on SQOOP-761 at 12/15/12 2:06 AM:
------------------------------------------------------------------

Uncommented unit tests. All unit tests pass now. Fixed multiple issues:
* HdfsTextExportExtractor was using FileSystem.getPos() method - which was causing missing data in uncompressed files - so now explicitly calculate the size.
* SequenceFile was reading data multiple times, since HdfsSequenceExportExtractor was reading a file till the end of a file, ignoring the end parameter.
* Added forking to the unit tests, and added more memory.
* Uncommented the tests in TestHdfsExtract

> HDFSTextExportExtractor loses lines around partition boundaries
> ---------------------------------------------------------------
>
>                 Key: SQOOP-761
>                 URL: https://issues.apache.org/jira/browse/SQOOP-761
>             Project: Sqoop
>          Issue Type: Bug
>            Reporter: Hari Shreedharan
>            Priority: Blocker
>             Fix For: 1.99.1
>
>         Attachments: SQOOP-761-missingdata.txt, SQOOP-761.patch
>
>
> Blocker for 1.99 release

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
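
For readers following the partition-boundary discussion in the comment above, here is a minimal sketch of the idea of tracking the read position explicitly and honoring the end offset of a partition. It is not the code in SQOOP-761.patch: the class name SplitLineReader and its methods are hypothetical, the input is an in-memory byte array rather than an HDFS stream, and the ownership rule (a non-first partition skips the line in progress; a partition keeps reading while a line starts at or before its end offset) is an assumed convention chosen for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not Sqoop's HdfsTextExportExtractor.
public class SplitLineReader {

    /**
     * Returns the lines owned by the partition [start, end) of data.
     * The position is counted by hand from the bytes consumed, instead of
     * being queried from a (possibly buffered) stream.
     */
    public static List<String> readSplit(byte[] data, long start, long end) {
        List<String> lines = new ArrayList<>();
        int pos = (int) start;
        if (start != 0) {
            // Assumed convention: the previous partition reads the line that
            // is in progress at 'start', so skip ahead to the next line start.
            pos = nextLineStart(data, pos);
        }
        // Stop once a line would start past 'end'; never run to end of file.
        while (pos <= end && pos < data.length) {
            int next = nextLineStart(data, pos);
            boolean hasNewline = next > pos && data[next - 1] == '\n';
            int lineEnd = hasNewline ? next - 1 : next;
            lines.add(new String(data, pos, lineEnd - pos, StandardCharsets.UTF_8));
            pos = next; // explicit position bookkeeping
        }
        return lines;
    }

    // Index of the first byte after the next '\n' at or after 'from',
    // or data.length if there is no further newline.
    private static int nextLineStart(byte[] data, int from) {
        int i = from;
        while (i < data.length && data[i] != '\n') {
            i++;
        }
        return Math.min(i + 1, data.length);
    }
}
```

Under this convention, reading adjacent partitions of the same data and concatenating the results should yield every line exactly once, which is the property the bug report says was violated (lines lost in the text extractor, data read multiple times in the sequence-file extractor).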