nifi-dev mailing list archives

From trixpan <...@git.apache.org>
Subject [GitHub] nifi pull request: NIFI-1899 - Introduce ExtractEmailAttachments processor
Date Wed, 01 Jun 2016 13:18:31 GMT
Github user trixpan commented on the pull request:

    https://github.com/apache/nifi/pull/483
  
    @joewitt - Thanks for the comments. Will fix the typos.
    
    The Apache Commons Email API decodes the attachments automatically (which is the reason
I used the API in the first place).
    
    e.g.,
    
    ![image](https://cloud.githubusercontent.com/assets/3108527/15710053/8fee0b20-284b-11e6-928c-df2912e393f3.png)
    
    clicking "download" will result in NiFi gracefully sending the logo.gif file, with the
correct filename, ready for consumption. 
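
    To illustrate the automatic decoding described above, here is a minimal sketch. It is not the Apache Commons Email API; it uses Python's standard `email` module as a stand-in, on the assumption that it handles base64 transfer encoding the same way for this purpose. The addresses, the sample bytes, and the `extract_attachments` helper are hypothetical.

```python
from email import message_from_bytes
from email.message import EmailMessage

# Build a small message with a binary attachment. The library encodes the
# attachment as base64 on the wire. The bytes are a stand-in for logo.gif.
original = bytes(range(16))
msg = EmailMessage()
msg["From"] = "sender@example.com"        # hypothetical address
msg["To"] = "receiver@example.com"        # hypothetical address
msg["Subject"] = "logo attached"
msg.set_content("See the attached logo.")
msg.add_attachment(original, maintype="image", subtype="gif", filename="logo.gif")


def extract_attachments(raw: bytes):
    """Return (filename, decoded_bytes) for each attachment in a raw message."""
    parsed = message_from_bytes(raw)
    result = []
    for part in parsed.walk():
        if part.get_content_disposition() == "attachment":
            # decode=True reverses the Content-Transfer-Encoding (base64 here),
            # so the caller receives the original bytes, not base64 text.
            result.append((part.get_filename(), part.get_payload(decode=True)))
    return result


attachments = extract_attachments(msg.as_bytes())
```

    In this sketch each child would carry the decoded bytes together with the original filename, which is the behaviour the screenshot shows.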
    
    Would you prefer the child flowfiles to contain the decoded blobs, or to keep the data
in base64?
    
    I know we have the Base64Encode processor but I wonder if it is a good choice to keep
the data in base64. I may be projecting my personal choice but I would imagine one of the
main reasons people would decode an attachment is to do something with the decoded data itself?
    
    Regarding additional attributes, I agree and will have a look at what extra details may
be valuable.
    
    I noticed, for example, that the parent flowfile UUID attribute isn't there, and this is
something that IMHO would be particularly valuable for those looking to use NiFi as an interface
between SMTP and an email archival database running on Mongo or ElasticSearch.
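
    As a rough sketch of what carrying the parent UUID on each child could look like (plain Python dictionaries standing in for flowfile attributes; the attribute name `email.attachment.parent.uuid` and the `child_attributes` helper are hypothetical, not something the processor currently emits):

```python
import uuid


def child_attributes(parent_attrs: dict, attachment_filename: str) -> dict:
    """Build the attribute map for a child flowfile holding one attachment."""
    return {
        # Each child gets its own identifier.
        "uuid": str(uuid.uuid4()),
        "filename": attachment_filename,
        # Hypothetical attribute name linking the child back to the source email.
        "email.attachment.parent.uuid": parent_attrs["uuid"],
    }


parent = {"uuid": str(uuid.uuid4()), "filename": "message.eml"}
child = child_attributes(parent, "logo.gif")
```

    With such an attribute, an archival store could join every attachment back to the message it came from.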
    
    Regarding timing information, I am wondering whether we should split ListenSMTP into
ListenSMTP and ParseEmail.
    
    This way the DFM can decide between:
    
    1. If an MTA is in use (e.g. because message volumes are higher than ListenSMTP can support,
or for other reasons), consider saving messages to files (i.e. Maildir format)
    
    1a. ListFile -> FetchFile -> ParseEmail (timing related attributes are added) ->
ExtractEmailAttachments (if desired).
    
    2. NiFi acts as the MTA
    2a. ListenSMTP (timing related attributes are added) -> ExtractEmailAttachments (if
desired)
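
    The ParseEmail step in option 1a could be sketched roughly as follows. This is only an illustration using Python's standard `email` module, not NiFi code; the attribute names, the sample message, and the `parse_email_attributes` helper are all hypothetical.

```python
from email import message_from_bytes
from email.utils import parsedate_to_datetime

# A tiny message as it might be read from a Maildir file (sample data).
RAW = (
    b"From: sender@example.com\r\n"
    b"To: receiver@example.com\r\n"
    b"Subject: hello\r\n"
    b"Date: Wed, 01 Jun 2016 13:18:31 +0000\r\n"
    b"\r\n"
    b"body\r\n"
)


def parse_email_attributes(raw: bytes) -> dict:
    """Derive header and timing attributes from a raw RFC 822 message."""
    msg = message_from_bytes(raw)
    sent = parsedate_to_datetime(msg["Date"])
    return {
        # Hypothetical attribute names, chosen for illustration only.
        "email.headers.from": msg["From"],
        "email.headers.subject": msg["Subject"],
        "email.headers.sent_date": sent.isoformat(),
    }


attrs = parse_email_attributes(RAW)
```

    Keeping this in its own step would let both flows end up with the same attributes regardless of how the message arrived.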
    
    
    We could naturally add the email file parsing attributes to ExtractEmailAttachments (the
API already has this information, I just haven't added it), but I fear the name would be misleading
(maybe I am being extra pessimistic about this, but I wouldn't automatically assume ExtractEmailAttachments
would do those things).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
