spark-issues mailing list archives

From "Maxim Gekk (Jira)" <j...@apache.org>
Subject [jira] [Created] (SPARK-29328) Incorrect calculation mean seconds per month
Date Wed, 02 Oct 2019 10:20:00 GMT
Maxim Gekk created SPARK-29328:
----------------------------------

             Summary: Incorrect calculation mean seconds per month
                 Key: SPARK-29328
                 URL: https://issues.apache.org/jira/browse/SPARK-29328
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.4.4
            Reporter: Maxim Gekk


The existing implementation assumes 31 days per month, or 372 days per year, which is far from
the correct value. Spark uses the proleptic Gregorian calendar by default (SPARK-26651), in
which the average year is 365.2425 days long: https://en.wikipedia.org/wiki/Gregorian_calendar.
The calculation needs to be fixed in at least 3 places:
- GroupStateImpl.scala:167:    val millisPerMonth = TimeUnit.MICROSECONDS.toMillis(CalendarInterval.MICROS_PER_DAY) * 31
- EventTimeWatermark.scala:32:    val millisPerMonth = TimeUnit.MICROSECONDS.toMillis(CalendarInterval.MICROS_PER_DAY) * 31
- DateTimeUtils.scala:610:    val secondsInMonth = DAYS.toSeconds(31)
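
For illustration only, a minimal sketch of how a mean month length could be derived from the
365.2425-day average year mentioned above. The object and constant names (MeanMonth,
SecondsPerMonth, and so on) are hypothetical and are not taken from the Spark code base; this
is not necessarily the fix that will be applied. It uses the 400-year Gregorian cycle
(146097 days) so that everything stays in exact integer arithmetic.

```scala
import java.util.concurrent.TimeUnit

// Hypothetical helper, not part of Spark: mean month length in the
// proleptic Gregorian calendar, derived from the 400-year cycle.
object MeanMonth {
  // A 400-year Gregorian cycle has 97 leap years: 400 * 365 + 97 = 146097 days,
  // which is where the 365.2425-day average year comes from.
  val DaysPer400Years: Long = 400L * 365L + 97L
  val MonthsPer400Years: Long = 400L * 12L

  // Mean seconds per month; the division is exact for these values.
  val SecondsPerMonth: Long =
    TimeUnit.DAYS.toSeconds(DaysPer400Years) / MonthsPer400Years
  val MillisPerMonth: Long = TimeUnit.SECONDS.toMillis(SecondsPerMonth)

  // The 31-day assumption quoted in the list above, kept for comparison.
  val SecondsPer31DayMonth: Long = TimeUnit.DAYS.toSeconds(31)
}
```

Under these assumptions the 31-day constant overestimates the mean month by
MeanMonth.SecondsPer31DayMonth - MeanMonth.SecondsPerMonth seconds, which is roughly half a day.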



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

