spark-dev mailing list archives

From "FangFang Chen" <lulynn_2015_sp...@163.com>
Subject Re: Re: Spark SQL and Hive give different results with the same SQL
Date Wed, 20 Apr 2016 12:25:59 GMT
I found that Spark SQL loses precision and appears to handle the data as integers under some rounding rule. The following data was obtained via the Hive shell and via Spark SQL, running the same SQL against the same Hive table:
Hive:
0.4
0.5
1.8
0.4
0.49
1.5
Spark sql:
1
2
2
The rule seems to be: when the fractional part is < 0.5 it rounds down to 0, and when the fractional part is >= 0.5 it rounds up to 1.
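The rule inferred above is ordinary round-half-up to zero decimal places. A minimal sketch with Python's `decimal` module, illustrating the inferred behavior only (this is not Spark's actual code path):

```python
from decimal import Decimal, ROUND_HALF_UP

def spark_like_round(x: str) -> int:
    # Hypothetical reproduction of the observed rule:
    # fractional part < 0.5 rounds down, >= 0.5 rounds up.
    return int(Decimal(x).quantize(Decimal("1"), rounding=ROUND_HALF_UP))

print(spark_like_round("0.4"))   # 0
print(spark_like_round("0.5"))   # 1
print(spark_like_round("1.8"))   # 2
```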


Is this a bug or a configuration issue? Please give some suggestions. Thanks.


Sent from NetEase Mail Master
On 2016-04-20 18:45, FangFang Chen wrote:
The output is:
Spark SQL: 6828127
Hive: 6980574.1269


Sent from NetEase Mail Master
On 2016-04-20 18:06, FangFang Chen wrote:
Hi all,
Please give some suggestions. Thanks


With the following identical SQL, Spark SQL and Hive give different results. The SQL sums a decimal(38,18) column:
Select sum(column) from table;
where column is defined as decimal(38,18).
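For comparison, exact decimal arithmetic should preserve all 18 fractional digits of such a sum. A hedged sketch with Python's `decimal` module (the column values below are made up for illustration; decimal(38,18) means 38 total digits of precision with 18 after the point):

```python
from decimal import Decimal, getcontext

# decimal(38,18) allows 38 significant digits; widen the context to match.
getcontext().prec = 38

# Hypothetical column values with 18 fractional digits.
values = [Decimal("1234.500000000000000001"),
          Decimal("0.499999999999999999")]

total = sum(values)
print(total)  # 1235.000000000000000000 -- fractional digits preserved exactly
```

An engine that first truncated or rounded each value (or the sum) to an integer would instead lose the fractional part entirely, which matches the mismatch reported above.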


Spark version: 1.5.3
Hive version: 2.0.0


Sent from NetEase Mail Master




