nifi-users mailing list archives

From Mandeep Gill <mand...@nstack.com>
Subject Nulls in input data throwing exceptions when using QueryRecord
Date Wed, 07 Nov 2018 11:54:14 GMT
Hi,

We're hitting a couple of issues with nulls when using QueryRecord on both
NiFi 1.7.1 and 1.8.0.

Things work as expected for strings. However, for other primitive types
defined by the Avro schema, such as boolean, long, and double, null values
in the input data aren't converted to NULLs within the SQL engine / Calcite.
Instead they appear to remain Java null values and throw NPEs when we
attempt to use them within a query or simply return them in the output.
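
For illustration only, here is a minimal Java sketch of the kind of failure described above: a null held in a boxed Boolean throws a NullPointerException as soon as something treats it as a primitive boolean. The class and method names are hypothetical, and this is not NiFi or Calcite code; whether this is the actual mechanism inside QueryRecord is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: hypothetical names, not NiFi or Calcite code.
class NullUnboxingSketch {

    // Hypothetical accessor that treats the column as a primitive boolean.
    static boolean readAsPrimitive(Map<String, Object> row, String field) {
        Boolean boxed = (Boolean) row.get(field);
        // Auto-unboxing: throws NullPointerException when boxed is null.
        return boxed;
    }

    // Null-aware variant: keeps the boxed type so a missing value stays null.
    static Boolean readAsNullable(Map<String, Object> row, String field) {
        return (Boolean) row.get(field);
    }

    public static void main(String[] args) {
        // Second record from the sample data: both fields are null.
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("str_test", null);
        row.put("bool_test", null);

        System.out.println(readAsNullable(row, "bool_test"));
        try {
            readAsPrimitive(row, "bool_test");
        } catch (NullPointerException e) {
            System.out.println("NPE when unboxing a null bool_test");
        }
    }
}
```

A String column has no primitive counterpart, so it never goes through this unboxing step, which would be consistent with strings behaving correctly.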

To give some examples, take the following record data and schema (tested
using both JSON and Avro record readers/writers):

[
  { "str_test" : "hello1", "bool_test" : true },
  { "str_test" : null, "bool_test" : null }
]

{
  "type": "record",
  "name": "schema",
  "fields": [
    {
      "name": "str_test",
      "type": [ "string", "null" ],
      "default": null
    },
    {
      "name": "bool_test",
      "type": [ "boolean", "null" ],
      "default": null
    }
  ]
}

The following queries return an empty result set:

select 'res' as res from FLOWFILE where bool_test IS NULL
select 'res' as res from FLOWFILE where bool_test IS UNKNOWN

and the query below returns a result set of count 2:

select 'res' from FLOWFILE where bool_test IS NOT NULL

The query below works as expected, suggesting things work fine for strings:

select 'res' as res from FLOWFILE where str_test IS NULL

However, the following query throws a NullPointerException (see [1]) when
trying to convert the null to a boolean within the output writer:

select * from FLOWFILE where bool_test IS NOT NULL

The null values for these types seem to be treated as distinct from the NULLs
within the SQL engine, as the following query returns an empty result set:

select 'res' as res from FLOWFILE where CAST(NULL as boolean) IS DISTINCT FROM bool_test

and the following query gives a RuntimeException (see [2]):

select (COALESCE(bool_test, TRUE)) as res from flowfile
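
As a reference point, the sketch below models what standard SQL null semantics would give for the bool_test column of the two sample records. It is a plain-Java model with hypothetical names, not Calcite's implementation. Under these semantics IS NULL and IS NOT NULL should each match one record, CAST(NULL as boolean) IS DISTINCT FROM bool_test should match only the record where bool_test is true, and COALESCE(bool_test, TRUE) should yield true for both records.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Plain-Java model of SQL null handling for a nullable BOOLEAN column.
// Hypothetical helper names; this is not Calcite code.
class SqlNullSemanticsSketch {

    // bool_test values from the two sample records: true and null.
    static List<Boolean> sampleColumn() {
        return Arrays.asList(Boolean.TRUE, null);
    }

    // WHERE col IS NULL (for a boolean column, IS UNKNOWN behaves the same).
    static long countIsNull(List<Boolean> col) {
        long n = 0;
        for (Boolean v : col) {
            if (v == null) n++;
        }
        return n;
    }

    // WHERE col IS NOT NULL
    static long countIsNotNull(List<Boolean> col) {
        return col.size() - countIsNull(col);
    }

    // a IS DISTINCT FROM b: a null-safe inequality in which two NULLs are not distinct.
    static boolean isDistinctFrom(Boolean a, Boolean b) {
        return !Objects.equals(a, b);
    }

    // WHERE CAST(NULL AS BOOLEAN) IS DISTINCT FROM col
    static long countDistinctFromNull(List<Boolean> col) {
        long n = 0;
        for (Boolean v : col) {
            if (isDistinctFrom(null, v)) n++;
        }
        return n;
    }

    // SELECT COALESCE(col, fallback)
    static List<Boolean> coalesceAll(List<Boolean> col, boolean fallback) {
        List<Boolean> out = new ArrayList<>();
        for (Boolean v : col) {
            out.add(v != null ? v : fallback);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Boolean> col = sampleColumn();
        System.out.println("IS NULL rows: " + countIsNull(col));
        System.out.println("IS NOT NULL rows: " + countIsNotNull(col));
        System.out.println("IS DISTINCT FROM NULL rows: " + countDistinctFromNull(col));
        System.out.println("COALESCE(col, TRUE): " + coalesceAll(col, true));
    }
}
```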

Given all this, we're unable to make use of datasets with nulls. Are nulls
only supported for strings, or is there perhaps something we're doing wrong
in our setup/config? One thing we've noticed is that running a simple
"SELECT * from FLOWFILE" returns a nullable type for strings in the output
Avro schema but not for other primitives, even if they were nullable in the
input schema, which could be related.

Cheers,
Mandeep

[1] org.apache.nifi.processor.exception.ProcessException: IOException thrown from QueryRecord[id=43ee29ff-0166-1000-28bd-06dd07c1425d]: java.io.IOException: org.apache.avro.file.DataFileWriter$AppendWriteException: java.lang.NullPointerException: null of boolean in field bool_test of org.apache.nifi.nifiRecord
at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2667)
at org.apache.nifi.processors.standard.QueryRecord.onTrigger(QueryRecord.java:309)
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: org.apache.avro.file.DataFileWriter$AppendWriteException: java.lang.NullPointerException: null of boolean in field bool_test of org.apache.nifi.nifiRecord
at org.apache.nifi.processors.standard.QueryRecord$1.process(QueryRecord.java:327)
at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2648)
... 12 common frames omitted
Caused by: org.apache.avro.file.DataFileWriter$AppendWriteException: java.lang.NullPointerException: null of boolean in field bool_test of org.apache.nifi.nifiRecord
at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:308)
at org.apache.nifi.avro.WriteAvroResultWithSchema.writeRecord(WriteAvroResultWithSchema.java:61)
at org.apache.nifi.serialization.AbstractRecordSetWriter.write(AbstractRecordSetWriter.java:59)
at org.apache.nifi.serialization.AbstractRecordSetWriter.write(AbstractRecordSetWriter.java:52)
at org.apache.nifi.processors.standard.QueryRecord$1.process(QueryRecord.java:324)
... 13 common frames omitted
Caused by: java.lang.NullPointerException: null of boolean in field bool_test of org.apache.nifi.nifiRecord
at org.apache.avro.generic.GenericDatumWriter.npe(GenericDatumWriter.java:132)
at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:126)
at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:60)
at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:302)
... 17 common frames omitted
Caused by: java.lang.NullPointerException: null
at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:121)
at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
at org.apache.avro.generic.GenericDatumWriter.writeField(GenericDatumWriter.java:153)
at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:143)
at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:105)
... 20 common frames omitted


[2] org.apache.nifi.processor.exception.ProcessException: IOException thrown from QueryRecord[id=43ee29ff-0166-1000-28bd-06dd07c1425d]: java.io.IOException: java.lang.RuntimeException: Cannot convert null to boolean
at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2667)
at org.apache.nifi.processors.standard.QueryRecord.onTrigger(QueryRecord.java:309)
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: java.lang.RuntimeException: Cannot convert null to boolean
at org.apache.nifi.processors.standard.QueryRecord$1.process(QueryRecord.java:327)
at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2648)
... 12 common frames omitted
Caused by: java.lang.RuntimeException: Cannot convert null to boolean
at org.apache.calcite.runtime.SqlFunctions.cannotConvert(SqlFunctions.java:1460)
at org.apache.calcite.runtime.SqlFunctions.toBoolean(SqlFunctions.java:1483)
at Baz$1$1.current(Unknown Source)
at org.apache.calcite.linq4j.Linq4j$EnumeratorIterator.next(Linq4j.java:684)
at org.apache.calcite.avatica.util.IteratorCursor.next(IteratorCursor.java:46)
at org.apache.calcite.avatica.AvaticaResultSet.next(AvaticaResultSet.java:217)
at org.apache.nifi.serialization.record.ResultSetRecordSet.next(ResultSetRecordSet.java:84)
at org.apache.nifi.serialization.AbstractRecordSetWriter.write(AbstractRecordSetWriter.java:51)
at org.apache.nifi.processors.standard.QueryRecord$1.process(QueryRecord.java:324)
... 13 common frames omitted

-- 

Mandeep Gill

nstack.com <http://www.nstack.com/> / +44 7961822575
