spark-issues mailing list archives

From "Tim Preece (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-12555) Build Failure on 1.6
Date Tue, 29 Dec 2015 10:59:49 GMT

    [ https://issues.apache.org/jira/browse/SPARK-12555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15073795#comment-15073795 ]

Tim Preece commented on SPARK-12555:
------------------------------------

Analysis shows that this test fails because of data corruption.

The UnsafeRow is laid out as (string, int) while the schema says (int, string), presumably
because the test involves reordering of columns.

Subsequently, when joining (string, int) + (string), the code incorrectly patches the int value
with the offset change intended for the first string.

This data corruption occurs on ALL platforms, and the offset part of the first string is always
incorrect. On Big Endian platforms the value of the integer is also corrupted. This is simply
due to the location of the 4-byte integer within the 8-byte UnsafeRow slot.
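
The endianness effect can be illustrated with a small standalone model (a sketch in Python,
not Spark code). It assumes, as a simplification, that a 4-byte int is written at the start
of a zeroed 8-byte slot, and that the join patches a variable-length slot by adding a delta
to the upper 32 bits of the slot read as a 64-bit word. The helper names and the delta of 8
are hypothetical; 8 is chosen because it turns 1 into 9, which matches the failing result
quoted below.

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF


def write_int(value: int, byte_order: str) -> bytes:
    """Store a 4-byte int at the start of a zeroed 8-byte slot (assumed layout)."""
    return struct.pack(byte_order + "i", value) + b"\x00" * 4


def read_int(slot: bytes, byte_order: str) -> int:
    """Read the 4-byte int back from the start of the slot."""
    (value,) = struct.unpack(byte_order + "i", slot[:4])
    return value


def patch_offset(slot: bytes, delta: int, byte_order: str) -> bytes:
    """Treat the slot as a 64-bit (offset << 32 | size) word and shift the offset by delta."""
    (word,) = struct.unpack(byte_order + "Q", slot)
    word = (word + (delta << 32)) & MASK64
    return struct.pack(byte_order + "Q", word)


def int_after_wrong_patch(value: int, delta: int, byte_order: str) -> int:
    """Apply the string-offset patch to a slot that actually holds an int."""
    slot = write_int(value, byte_order)
    return read_int(patch_offset(slot, delta, byte_order), byte_order)


# "<" is little endian, ">" is big endian in the struct module.
little = int_after_wrong_patch(1, 8, "<")
big = int_after_wrong_patch(1, 8, ">")
print("little endian:", little)
print("big endian:", big)
```

In this model the little-endian int keeps its value because the patch lands in the four
bytes the int does not occupy, while on big endian the patched bytes are the same bytes
that hold the int.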

> Build Failure on 1.6
> --------------------
>
>                 Key: SPARK-12555
>                 URL: https://issues.apache.org/jira/browse/SPARK-12555
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.6.0
>         Environment: ALL platforms (although the test only explicitly fails on Big Endian platforms).
>            Reporter: Tim Preece
>            Priority: Blocker
>
> org.apache.spark.sql.DatasetAggregatorSuite
> - typed aggregation: class input with reordering *** FAILED ***
>   Results do not match for query:
>   == Parsed Logical Plan ==
>   Aggregate [value#748], [value#748,(ClassInputAgg$(b#650,a#651),mode=Complete,isDistinct=false) AS ClassInputAgg$(b,a)#762]
>   +- AppendColumns <function1>, class[a[0]: int, b[0]: string], class[value[0]: string], [value#748]
>      +- Project [one AS b#650,1 AS a#651]
>         +- OneRowRelation$
>   
>   == Analyzed Logical Plan ==
>   value: string, ClassInputAgg$(b,a): int
>   Aggregate [value#748], [value#748,(ClassInputAgg$(b#650,a#651),mode=Complete,isDistinct=false) AS ClassInputAgg$(b,a)#762]
>   +- AppendColumns <function1>, class[a[0]: int, b[0]: string], class[value[0]: string], [value#748]
>      +- Project [one AS b#650,1 AS a#651]
>         +- OneRowRelation$
>   
>   == Optimized Logical Plan ==
>   Aggregate [value#748], [value#748,(ClassInputAgg$(b#650,a#651),mode=Complete,isDistinct=false) AS ClassInputAgg$(b,a)#762]
>   +- AppendColumns <function1>, class[a[0]: int, b[0]: string], class[value[0]: string], [value#748]
>      +- Project [one AS b#650,1 AS a#651]
>         +- OneRowRelation$
>   
>   == Physical Plan ==
>   TungstenAggregate(key=[value#748], functions=[(ClassInputAgg$(b#650,a#651),mode=Final,isDistinct=false)], output=[value#748,ClassInputAgg$(b,a)#762])
>   +- TungstenExchange hashpartitioning(value#748,5), None
>      +- TungstenAggregate(key=[value#748], functions=[(ClassInputAgg$(b#650,a#651),mode=Partial,isDistinct=false)], output=[value#748,value#758])
>         +- !AppendColumns <function1>, class[a[0]: int, b[0]: string], class[value[0]: string], [value#748]
>            +- Project [one AS b#650,1 AS a#651]
>               +- Scan OneRowRelation[]
>   == Results ==
>   !== Correct Answer - 1 ==   == Spark Answer - 1 ==
>   ![one,1]                    [one,9] (QueryTest.scala:127)


