spark-user mailing list archives

From Wush Wu <>
Subject Different behaviour of DateType in SparkSQL between 1.2 and 1.3
Date Fri, 27 Mar 2015 01:30:12 GMT
Dear all,

I am trying to upgrade Spark from 1.2 to 1.3 and switch the existing code
that creates a SchemaRDD over to the DataFrame API.

After testing, I noticed that the following behaviour has changed:

import java.sql.Date
import com.bridgewell.SparkTestUtils
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.types.{DataType, DateType}
import org.junit.runner.RunWith
import org.scalatest.matchers.ShouldMatchers
import org.scalatest.junit.JUnitRunner
import scala.reflect.ClassTag
import scala.reflect.runtime.universe.TypeTag

case class TestDateClass(x : Date)

class DataFrameDateSuite extends SparkTestUtils with ShouldMatchers {

  sparkTest("Validate Date") {

    def test[TestClass <: Product : ClassTag : TypeTag, T : ClassTag]
        (params : Seq[T], tp : DataType, f : (T => TestClass)): Unit = {

      val hc = new HiveContext(sc)
      val rdd : RDD[TestClass] = sc.parallelize(params).map(f(_))
      // code that works for spark-1.2
      // hc.registerRDDTable(hc.createSchemaRDD(rdd), "test")
      // spark-1.3 equivalent
      hc.createDataFrame(rdd).registerTempTable("test")
      val row = hc.sql("SELECT * FROM test").first
      row(0).asInstanceOf[Date] shouldEqual params.head
    }

    test[TestDateClass, Date](Array(new Date(86400)), DateType,
      (d : Date) => new TestDateClass(d))
  }
}
The above test passes in Spark 1.2 but fails now. If I print getTime() from
the two Date objects (before saving to Hive and after loading from Hive),
the values are 86400 and -28800000.
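For reference, -28800000 ms is exactly midnight of the epoch day in a UTC+8
zone, which would be consistent with the value being truncated to day
granularity and then read back as local midnight. A minimal sketch of that
arithmetic in plain Java (the zone Asia/Taipei and the truncate-then-localize
behaviour are my assumptions, not confirmed Spark internals):

```java
import java.util.TimeZone;

public class DateTruncationDemo {
    public static void main(String[] args) {
        // new java.sql.Date(86400) is 1970-01-01T00:01:26.4Z, i.e. day 0.
        long millis = 86400L;
        long msPerDay = 24L * 3600 * 1000;
        long days = Math.floorDiv(millis, msPerDay); // -> 0 (time-of-day dropped)

        // Reading the day back as local midnight in a UTC+8 zone shifts it
        // 8 hours before the UTC epoch:
        TimeZone tz = TimeZone.getTimeZone("Asia/Taipei");
        long localMidnight = days * msPerDay - tz.getOffset(days * msPerDay);
        System.out.println(localMidnight); // prints -28800000
    }
}
```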

Am I misusing the API? Or is this a bug?

