spark-user mailing list archives

From Alex <siri8...@gmail.com>
Subject Roadblock -- stuck for 10 days :( how come same hive udf giving different results in spark and hive
Date Tue, 31 Jan 2017 14:34:17 GMT
Hi All,

I am trying to run a Hive UDF in Spark SQL, and the same query returns
different rows in Hive and in Spark.

My UDF query looks something like this:

select col1,col2,col3, sum(col4) col4, sum(col5) col5,Group_name
from
(select inline(myudf('cons1',record))
from table1) test group by col1,col2,col3;

The results match up to this point: if I run only the subquery below, Hive
and Spark give the same output.

(select inline(myudf('cons1',record))
from table1) test group by col1,col2,col3;

But if I run the entire script, it gives different output in Hive and in
Spark:


select col1,col2,col3, sum(col4) col4, sum(col5) col5,Group_name
from
(select inline(myudf('cons1',record))
from table1) test group by col1,col2,col3;

How come? :(
