nifi-users mailing list archives

From James Wing <jvw...@gmail.com>
Subject Re: ListS3 Processor Error
Date Wed, 13 Dec 2017 17:55:23 GMT
For ListS3, you will want to separate those in the Bucket and Prefix properties.
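
As a rough illustration of that split (a minimal sketch, not NiFi code; `split_s3_path` is a hypothetical helper named here only for the example), a combined "bucket/folder" value can be separated into the two values that the Bucket and Prefix properties expect. The trailing "/" on the prefix is an assumption based on the usual S3 folder-style key convention:

```python
from typing import Tuple


def split_s3_path(path: str) -> Tuple[str, str]:
    """Split a combined 'bucket/folder' value into (bucket, prefix).

    Hypothetical helper for illustration only; it is not part of NiFi
    or the AWS SDK.
    """
    # Drop an optional s3:// scheme and any leading slashes.
    if path.startswith("s3://"):
        path = path[len("s3://"):]
    path = path.lstrip("/")

    # Everything before the first "/" is the bucket; the rest is the prefix.
    bucket, _, prefix = path.partition("/")

    # Assumption: a folder-style prefix ends with "/" so that it only
    # matches keys inside that folder.
    if prefix and not prefix.endswith("/"):
        prefix += "/"
    return bucket, prefix


bucket, prefix = split_s3_path("part-d-prescription-drug/unstructured")
print("Bucket:", bucket)
print("Prefix:", prefix)
```

With those values, the matching AWS CLI check would be `aws s3 ls s3://part-d-prescription-drug/unstructured/`.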

> On Dec 13, 2017, at 9:34 AM, Aruna Sankaralingam <Aruna.Sankaralingam@Cormac-Corp.com> wrote:
> 
> James,
>  
> “part-d-prescription-drug” is the main folder in S3 and “unstructured” is the subfolder inside the main folder.
>  
> From: James Wing [mailto:jvwing@gmail.com] 
> Sent: Wednesday, December 13, 2017 1:34 AM
> To: users@nifi.apache.org
> Subject: Re: ListS3 Processor Error
>  
> Are you able to list the bucket with the AWS CLI (aws s3 ls)?  It can be helpful to compare behavior between NiFi and the AWS CLI, especially if you can run both from the same machine, with the same permissions, and with bucket and prefix settings as similar as you can manage.
> 
> In the screenshot above, the bucket is shown as "part-d-prescription-drug/unstructured", which looks unusual to me.  Is the bucket "part-d-prescription-drug" and the prefix "unstructured/"?
> 
> Thanks,
> 
> James
>  
> On Tue, Dec 12, 2017 at 7:34 AM, Aruna Sankaralingam <Aruna.Sankaralingam@cormac-corp.com> wrote:
> Joe,
>  
> No, I don’t have anything in between AWS and NiFi.
> NiFi is installed on one of the EC2 instances in AWS, in the N. Virginia region.
> S3 is also in the N. Virginia region.
>  
> From: Joe Witt [mailto:joe.witt@gmail.com] 
> Sent: Monday, December 11, 2017 1:28 PM
> To: users@nifi.apache.org
> Subject: Re: ListS3 Processor Error
>  
> The XML response is truncated for some reason, as implied by the following. Do you have any devices/software/systems/proxies in between your NiFi and the Amazon service?  Are you able to manually issue the request and get the response you expect?
>  
> 2017-12-11 18:01:02,875 ERROR [Timer-Driven Process Thread-6] org.apache.nifi.processors.aws.s3.ListS3 ListS3[id=37d7226e-0160-1000-6049-d4c489cd32f3] ListS3[id=37d7226e-0160-1000-6049-d4c489cd32f3] failed to process session due to com.amazonaws.SdkClientException: Failed to parse XML document with handler class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler: {}
> com.amazonaws.SdkClientException: Failed to parse XML document with handler class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler
>             at com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser.parseXmlInputStream(XmlResponsesSaxParser.java:156)
>             at com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser.parseListBucketObjectsResponse(XmlResponsesSaxParser.java:298)
>             at com.amazonaws.services.s3.model.transform.Unmarshallers$ListObjectsUnmarshaller.unmarshall(Unmarshallers.java:70)
>             at com.amazonaws.services.s3.model.transform.Unmarshallers$ListObjectsUnmarshaller.unmarshall(Unmarshallers.java:59)
>             at com.amazonaws.services.s3.internal.S3XmlResponseHandler.handle(S3XmlResponseHandler.java:62)
>             at com.amazonaws.services.s3.internal.S3XmlResponseHandler.handle(S3XmlResponseHandler.java:31)
>             at com.amazonaws.http.response.AwsResponseHandlerAdapter.handle(AwsResponseHandlerAdapter.java:70)
>             at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleResponse(AmazonHttpClient.java:1444)
>             at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1151)
>             at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:964)
>             at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:676)
>             at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:650)
>             at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:633)
>             at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$300(AmazonHttpClient.java:601)
>             at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:583)
>             at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:447)
>             at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4137)
>             at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4079)
>             at com.amazonaws.services.s3.AmazonS3Client.listObjects(AmazonS3Client.java:819)
>             at org.apache.nifi.processors.aws.s3.ListS3$S3ObjectBucketLister.listVersions(ListS3.java:314)
>             at org.apache.nifi.processors.aws.s3.ListS3.onTrigger(ListS3.java:208)
>             at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
>             at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1119)
>             at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:147)
>             at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
>             at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:128)
>             at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>             at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>             at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>             at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>             at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>             at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>             at java.lang.Thread.run(Thread.java:748)
> Caused by: org.xml.sax.SAXParseException: Premature end of file.
>             at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:203)
>             at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:177)
>             at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:400)
>             at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:327)
>             at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1472)
>             at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:1014)
>             at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602)
>             at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112)
>             at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:505)
>             at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:841)
>             at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:770)
>             at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
>             at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
>             at com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser.parseXmlInputStream(XmlResponsesSaxParser.java:142)
>             ... 32 common frames omitted
>  
>  
> On Mon, Dec 11, 2017 at 1:07 PM, Aruna Sankaralingam <Aruna.Sankaralingam@cormac-corp.com> wrote:
> Attached my nifi-app.log. Could you please let me know what went wrong?
>  
> From: Joe Witt [mailto:joe.witt@gmail.com] 
> Sent: Friday, December 08, 2017 4:04 PM
> To: users@nifi.apache.org
> Subject: Re: ListS3 Processor Error
>  
> Here is an example I found for another processor
>  
>   https://mail-archives.apache.org/mod_mbox/nifi-dev/201509.mbox/%3CCAFddr26AEVqnoQ=mWr7DSNDFVrr9NuYy9GCcXg=4FYyCQAbbuw@mail.gmail.com%3E
>  
> Thanks
>  
> On Fri, Dec 8, 2017 at 4:02 PM, Aruna Sankaralingam <Aruna.Sankaralingam@cormac-corp.com> wrote:
> Joe,
> Could you please let me know how to turn on the debug logging?
>  
> From: Joe Witt [mailto:joe.witt@gmail.com] 
> Sent: Friday, December 08, 2017 3:59 PM
> To: users@nifi.apache.org
> Subject: Re: ListS3 Processor Error
>  
> What version of NiFi?
>  
> Looks like either a classpath/classloader issue OR the Amazon client library cannot parse the response it is getting back...
>  
> The logs/nifi-app.log should have the full stack trace.  If not, you can turn on debug logging for that processor and perhaps then it will.
>  
> Thanks
>  
> On Fri, Dec 8, 2017 at 3:56 PM, Aruna Sankaralingam <Aruna.Sankaralingam@cormac-corp.com> wrote:
> I am trying to get a PDF file from S3 and load it into Elasticsearch. The ListS3 processor is giving me this error. Could someone please let me know where I am going wrong?
>  
> 20:52:25 UTC
> ERROR
> 37d7226e-0160-1000-6049-d4c489cd32f3
> ListS3[id=37d7226e-0160-1000-6049-d4c489cd32f3] ListS3[id=37d7226e-0160-1000-6049-d4c489cd32f3] failed to process session due to com.amazonaws.SdkClientException: Failed to parse XML document with handler class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler: Failed to parse XML document with handler class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler
> 20:52:25 UTC
> WARNING
> 37d7226e-0160-1000-6049-d4c489cd32f3
> ListS3[id=37d7226e-0160-1000-6049-d4c489cd32f3] Processor Administratively Yielded for 1 sec due to processing failure
> 20:52:26 UTC
> ERROR
> 37d7226e-0160-1000-6049-d4c489cd32f3
> ListS3[id=37d7226e-0160-1000-6049-d4c489cd32f3] ListS3[id=37d7226e-0160-1000-6049-d4c489cd32f3] failed to process due to com.amazonaws.SdkClientException: Failed to parse XML document with handler class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler; rolling back session: Failed to parse XML document with handler class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler
> 20:52:26 UTC
> ERROR
> 37d7226e-0160-1000-6049-d4c489cd32f3
> ListS3[id=37d7226e-0160-1000-6049-d4c489cd32f3] ListS3[id=37d7226e-0160-1000-6049-d4c489cd32f3] failed to process session due to com.amazonaws.SdkClientException: Failed to parse XML document with handler class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler: Failed to parse XML document with handler class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler
> 20:52:26 UTC
> WARNING
> 37d7226e-0160-1000-6049-d4c489cd32f3
> ListS3[id=37d7226e-0160-1000-6049-d4c489cd32f3] Processor Administratively Yielded for 1 sec due to processing failure
>  
> <image001.png>
