Clojure libraries (or any JARs) can be used by the supported scripting languages. However, Clojure itself is not yet supported by the NiFi scripting processors: there were issues with the Clojure ScriptEngine bridge, so it was left off the original list. If there is interest in adding Clojure, I can write up an improvement Jira with the initial findings.

Regards,
Matt


On Feb 20, 2016, at 2:18 PM, Russell Whitaker <russell.whitaker@gmail.com> wrote:

Don't forget Clojure as well. 

Russell Whitaker

On Feb 20, 2016, at 7:44 AM, Matt Burgess <mattyb149@gmail.com> wrote:

I have a blog post on how to do this in NiFi with a Groovy script in the ExecuteScript processor (new in 0.5.0), using PDFBox instead of Tika:


Jython is also supported but can't yet use Java libraries (it uses Jython scripts/modules instead). The other languages (Groovy, Lua, JavaScript, JRuby) can use Java libraries like Tika and PDFBox.
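
For the ExecuteStreamCommand route from the original question (regular CPython outside NiFi, not Jython inside ExecuteScript), here is a rough sketch of a wrapper script. It assumes tika-python is installed and relies on ExecuteStreamCommand piping the flow file content to the command's stdin and turning the command's stdout into the new flow file content; as far as I know there is no placeholder argument that expands to the content. The names build_result and main are made up for illustration, and the output layout (a JSON object with "metadata" and "content") is just one possible choice.

```python
import json
import sys


def build_result(parsed):
    """Reduce a Tika-style result dict to metadata and stripped text."""
    return {
        "metadata": parsed.get("metadata") or {},
        "content": (parsed.get("content") or "").strip(),
    }


def main(stdin=None, stdout=None, parse=None):
    """Read a document from stdin, parse it, write JSON to stdout."""
    if stdin is None:
        # ExecuteStreamCommand sends the flow file content here.
        stdin = sys.stdin.buffer
    if stdout is None:
        # Whatever is written here becomes the output flow file content.
        stdout = sys.stdout
    if parse is None:
        # Imported lazily so the helpers work without tika-python installed.
        # Assumption: tika-python's parser.from_buffer accepts the raw bytes.
        from tika import parser
        parse = parser.from_buffer

    parsed = parse(stdin.read())
    json.dump(build_result(parsed), stdout)
```

Saved as a script that ends with a call to main(), this could be wired up with Command Path set to the Python interpreter and Command Arguments set to the script path (for example a hypothetical /opt/nifi/scripts/extract_pdf.py). The parse parameter only exists so the logic can be tried without a running Tika server.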

Regards,
Matt


On Feb 20, 2016, at 10:31 AM, Ralf Meier <news@cht3.com> wrote:

Hi Everybody, 

I’m new to NiFi and want to find out whether it is possible to extract content and metadata from PDFs using a library like Tika.
My first idea was to use the following processors:
- GetFile (Watch a specific Folder)
- IdentifyMimeType (Identify if the file is of type application/pdf)
- RouteOnAttribute (If it is a pdf)
- ExecuteStreamCommand:
I changed the following settings:
Command Arguments: {flowfilw_contents}
Command Path: tika-python parse all
I use the Python Tika wrapper, tika-python (https://github.com/chrismattmann/tika-python).

But it is not working.
Does anybody have an idea how to use Tika to extract the content and the metadata using NiFi, or what I’m doing wrong?

Thanks for your help.
BR 
Ralf