pig-user mailing list archives

From Koji Noguchi <knogu...@oath.com.INVALID>
Subject Re: byte array to long cast inside foreach
Date Fri, 09 Nov 2018 15:19:34 GMT
Hi Manoj,

It looks like you are hitting
https://issues.apache.org/jira/browse/PIG-3938
(fixed only in 0.17).
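If upgrading to 0.17 is not an option, one possible workaround (a sketch, not verified against your build) is to declare count1 and count2 as chararray in the load schema. Then CONCAT receives and returns chararray, and the later cast to long is a plain string-to-long conversion that never has to find a LoadCaster for a UDF-produced bytearray:

```pig
A = load 'cast_simple.txt' using PigStorage(',')
        as (id:int, name:chararray, count1:chararray, count2:chararray);
G = GROUP A BY name;
B = foreach G {
        -- concat_count is chararray here, not bytearray
        L = FOREACH A GENERATE CONCAT(count1, count2) as concat_count;
        M = FOREACH L GENERATE (long)concat_count as casted_concat_count;
        N = FOREACH M GENERATE casted_concat_count - 1, casted_concat_count;
        GENERATE N;
};
dump B;
```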

Koji


On Fri, Nov 9, 2018 at 2:51 AM Manoj Narayanan <manoj.narayanan@gmail.com>
wrote:

> I am trying to cast a bytearray to a long value inside a FOREACH. I
> understand that in order for a bytearray to be cast to long, there needs
> to be some sort of LoadCaster available. I assumed that a standard UDF like
> CONCAT would have that available. Is this expected to work or fail?
> I appreciate any help you can provide.
>
> Here is my script.
>
> *$ cat cast_bytearray_udf_cast.pig*
>
> A = load 'cast_simple.txt' using PigStorage(',') as (id:int,
> name:chararray, count1:bytearray, count2:bytearray);
> G = GROUP A BY name;
> B = foreach G {
>         L = FOREACH A GENERATE CONCAT(count1, count2) as concat_count;
>         M = FOREACH L GENERATE (long)concat_count as casted_concat_count;
>         N = FOREACH M GENERATE casted_concat_count - 1, casted_concat_count;
>         GENERATE N;
> };
> dump B;
>
> *$ cat cast_simple.txt*
>
> 1,cat,1234,134
> 2,cat,1342,213
> 3,dog,1343,331
>
> I am getting the exception below:
>
>
> java.lang.Exception: org.apache.pig.backend.executionengine.ExecException:
> ERROR 0: Exception while executing (Name: N: New For Each(false,f.
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0:
> Exception while executing (Name: N: New For Each(false,false)[bag].
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:314)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:257)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNextDataBag(PhysicalOperator.java:411)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:566)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.PORelationToExprProject.getNextDataBag(PORelation)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:335)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:405)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:322)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:465)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.processOnePackageOutput(PigGenericMapRedu)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:413)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:262)
>         at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
>         at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR
> 1075: Received a bytearray from the UDF or Union from two different L.
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNextLong(POCast.java:640)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:349)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:405)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:322)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:305)
>         ... 20 more
>
> *$ pig -version*
>
> Apache Pig version 0.16.0.2.6.2.0-205 (rUnversioned directory)
> compiled Aug 26 2017, 09:34:39
>
> Thanks
> Manoj Narayanan
>
