Just tested in my CentOS VM; it worked like a charm without Hadoop. I'll open a Jira bug on PutParquet, since it doesn't seem to run on Windows.
Still not sure what I can do. Converting our production Windows NiFi install to Docker would be a major effort. 
Has anyone heard of a Parquet writer tool I can download and call from NiFi?
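
In case it helps anyone else, here's the kind of standalone script I have in mind: a few lines of Python using pandas and pyarrow, neither of which needs Hadoop, that NiFi could call through ExecuteStreamCommand. This is an untested sketch; the script name, compression choice, and argument handling are just placeholders:

    import sys

    import pandas as pd  # assumes: pip install pandas pyarrow

    def main():
        if len(sys.argv) != 2:
            sys.exit("usage: csv_to_parquet.py <output.parquet>")
        # ExecuteStreamCommand pipes the flow file content (the CSV) to stdin.
        df = pd.read_csv(sys.stdin)
        # pandas hands off to pyarrow, which writes Parquet natively, no Hadoop involved.
        df.to_parquet(sys.argv[1], compression="snappy")

    if __name__ == "__main__":
        main()

The idea would be to point ExecuteStreamCommand at the script and build the output path from flow file attributes in its Command Arguments property, so each CSV flow file gets converted as it passes through.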

On Wed, Aug 15, 2018 at 12:01 PM, Mike Thomsen <mikerthomsen@gmail.com> wrote:
> Mike, that's a good tip. I'll test that, but unfortunately, I've already committed to Windows.

You can run both Docker and the standard NiFi Docker image on Windows.

On Wed, Aug 15, 2018 at 2:52 PM scott <tcots8888@gmail.com> wrote:
Mike, that's a good tip. I'll test that, but unfortunately, I've already committed to Windows. 
What about a script? Is there some tool you know of that can just be called by NiFi to convert an input CSV file to a Parquet file?

On Wed, Aug 15, 2018 at 8:32 AM, Mike Thomsen <mikerthomsen@gmail.com> wrote:
Scott,

You can also try Docker on Windows. Something like this should work:

docker run -d --name nifi-test -v C:/nifi_temp:/opt/data_output -p 8080:8080 apache/nifi:latest

I don't have Windows either, but Docker seems to work fine for my colleagues who have to use it on Windows. That command should bridge C:\nifi_temp on the host to /opt/data_output in the container and map localhost:8080 to port 8080 in the container, so you don't have to mess with a Hadoop client just to try out some Parquet stuff.

Mike

On Wed, Aug 15, 2018 at 11:20 AM scott <tcots8888@gmail.com> wrote:
Thanks Bryan. I'll give the Hadoop client a try. 

On Wed, Aug 15, 2018 at 7:51 AM, Bryan Bende <bbende@gmail.com> wrote:
I think there is a good chance that installing the Hadoop client would
solve the issue, but I can't say for sure since I don't have a Windows
machine to test.

The processor depends on the Apache Parquet Java client library, which
in turn depends on the Apache Hadoop client [1], and the Hadoop client
has a limitation on Windows where it requires an additional native
binary, winutils.exe.

[1] https://github.com/apache/parquet-mr/blob/master/parquet-avro/pom.xml#L62-L65



On Wed, Aug 15, 2018 at 10:16 AM, scott <tcots8888@gmail.com> wrote:
> If I install a Hadoop client on my NiFi host, would I be able to get
> past this error?
> I don't understand why this processor depends on Hadoop. Other projects
> like Drill and Spark can write Parquet files without such a dependency.
>
> On Tue, Aug 14, 2018 at 2:58 PM, Juan Pablo Gardella
> <gardellajuanpablo@gmail.com> wrote:
>>
>> It's a warning. You can ignore that.
>>
>> On Tue, 14 Aug 2018 at 18:53 Bryan Bende <bbende@gmail.com> wrote:
>>>
>>> Scott,
>>>
>>> Sorry, I did not realize the Hadoop client would be looking for this
>>> winutils.exe when running on Windows.
>>>
>>> On Linux and macOS you don't need anything external installed outside
>>> of NiFi, so I wasn't expecting this.
>>>
>>> Not sure if there is any other good option here regarding Parquet.
>>>
>>> Thanks,
>>>
>>> Bryan
>>>
>>>
>>> On Tue, Aug 14, 2018 at 5:31 PM, scott <tcots8888@gmail.com> wrote:
>>> > Hi Bryan,
>>> > I'm fine if I have to trick the API, but don't I still need Hadoop
>>> > installed somewhere? After creating the core-site.xml as you
>>> > described, I get the following errors:
>>> >
>>> > Failed to locate the winutils binary in the hadoop binary path
>>> > IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries
>>> > Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>>> > Failed to write due to java.io.IOException: No FileSystem for scheme
>>> >
>>> > BTW, I'm using NiFi version 1.5
>>> >
>>> > Thanks,
>>> > Scott
>>> >
>>> >
>>> > On Tue, Aug 14, 2018 at 12:44 PM, Bryan Bende <bbende@gmail.com> wrote:
>>> >>
>>> >> Scott,
>>> >>
>>> >> Unfortunately the Parquet API itself is tied to the Hadoop Filesystem
>>> >> object, which is why NiFi can't read and write Parquet directly
>>> >> to/from flow files (i.e. the library doesn't provide a way to
>>> >> read/write from plain Java input and output streams).
>>> >>
>>> >> The best you can do is trick the Hadoop API into using the local
>>> >> file-system by creating a core-site.xml with the following:
>>> >>
>>> >> <configuration>
>>> >>     <property>
>>> >>         <name>fs.defaultFS</name>
>>> >>         <value>file:///</value>
>>> >>     </property>
>>> >> </configuration>
>>> >>
>>> >> That will make PutParquet or FetchParquet work with your local
>>> >> file-system.
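>>> >>
>>> >> With that core-site.xml in place, the PutParquet configuration should
>>> >> be roughly the following (property names from memory, so double-check
>>> >> them against your version's docs; the paths are placeholders):
>>> >>
>>> >>     Hadoop Configuration Resources: /path/to/core-site.xml
>>> >>     Directory: /path/to/output/dir
>>> >>     Record Reader: a CSVReader configured for your schema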
>>> >>
>>> >> Thanks,
>>> >>
>>> >> Bryan
>>> >>
>>> >>
>>> >> On Tue, Aug 14, 2018 at 3:22 PM, scott <tcots8888@gmail.com> wrote:
>>> >> > Hello NiFi community,
>>> >> > Is there a simple way to read CSV files and write them out as
>>> >> > Parquet files without Hadoop? I run NiFi on Windows and don't have
>>> >> > access to a Hadoop environment. I'm trying to write the output of
>>> >> > my ETL in a compressed and still queryable format. Is there
>>> >> > something I should be using instead of Parquet?
>>> >> >
>>> >> > Thanks for your time,
>>> >> > Scott
>>> >
>>> >
>
>