spark-dev mailing list archives

From Tianhua huang <huangtianhua...@gmail.com>
Subject Re: Ask for ARM CI for spark
Date Thu, 18 Jul 2019 03:12:22 GMT
Thanks for your reply.

About the first problem, we didn't find any other cause in the logs; the
tests just timed out waiting for the executors to come up. After increasing
the timeout from 10000 ms to 30000 ms (even 20000 ms is enough) in
https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/SparkContextSuite.scala#L764

https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/SparkContextSuite.scala#L792
the tests passed and more than one executor came up. We are not sure whether
this is related to the flavor of our aarch64 instance, which is currently
8C8G (8 vCPUs, 8 GB RAM); we may try a bigger flavor later. If anyone has
other suggestions, please contact me, thank you.
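For context, the kind of wait involved can be sketched as a simple polling
loop. This is a hypothetical simplification with made-up names, not Spark's
actual TestUtils.waitUntilExecutorsUp implementation:

```scala
object PollWait {
  // Hypothetical polling wait: retry `condition` until it holds or
  // `timeoutMs` elapses. A slower machine (or a smaller instance
  // flavor) simply needs a larger `timeoutMs` for the check to pass.
  def waitUntil(condition: () => Boolean, timeoutMs: Long, pollMs: Long = 100): Boolean = {
    val deadline = System.nanoTime() + timeoutMs * 1000000L
    while (System.nanoTime() < deadline) {
      if (condition()) return true
      Thread.sleep(pollMs)
    }
    condition() // one last check at the deadline
  }

  def main(args: Array[String]): Unit = {
    val start = System.nanoTime()
    // Condition becomes true after ~300 ms, mimicking slow executor startup.
    val ok = waitUntil(() => System.nanoTime() - start > 300000000L, timeoutMs = 10000)
    println(ok)
  }
}
```

With this shape, raising the timeout only changes how long a slow platform is
given, not what is being asserted, which is why the tests pass unchanged once
the deadline is long enough.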

About the second problem, I proposed a pull request to apache/spark:
https://github.com/apache/spark/pull/25186. If you have time, would you
please help review it? Thank you very much.
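To illustrate the bit-pattern difference behind that pull request, here is a
minimal standalone sketch (the object name is made up; the aarch64 value
2143289344 = 0x7FC00000 is from the linked scala-lang thread). The key point
is that floatToRawIntBits preserves the hardware's NaN encoding, while
floatToIntBits canonicalizes every NaN, so the latter is platform-independent:

```scala
object NaNBits {
  def main(args: Array[String]): Unit = {
    // Use a variable so the division happens at run time on the
    // executing platform rather than being constant-folded by scalac.
    val zero = 0.0f
    val nan = zero / zero

    // Raw bits depend on the hardware's default NaN: 0x7FC00000
    // (2143289344) on aarch64, but a different pattern on x86-64.
    println(java.lang.Float.floatToRawIntBits(nan))

    // floatToIntBits collapses all NaNs to the canonical 0x7FC00000,
    // matching floatToRawIntBits(Float.NaN) on every platform.
    println(java.lang.Float.floatToIntBits(nan) ==
      java.lang.Float.floatToRawIntBits(Float.NaN))
  }
}
```

The second comparison prints true everywhere, which suggests tests that only
care about "is this NaN" can compare canonicalized bits instead of raw bits.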

On Wed, Jul 17, 2019 at 8:37 PM Sean Owen <srowen@gmail.com> wrote:

> On Wed, Jul 17, 2019 at 6:28 AM Tianhua huang <huangtianhua223@gmail.com>
> wrote:
> > Two failed with 'Can't find 1 executors before 10000
> milliseconds elapsed' (see below). When we increased the timeout, the tests
> passed, so we wonder whether the timeout can be increased. I also have
> another question about
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/TestUtils.scala#L285,
> why is the comparison not >=? According to the function's comment, it
> should be >=.
> >
>
> I think it's ">" because the driver is also an executor, but I'm not 100%
> sure. In any event, it passes in general.
> These errors typically mean the executor didn't start successfully for
> some other reason, which may be in the logs.
>
> > The other two failed with '2143289344 equaled 2143289344'. This is
> because the value of floatToRawIntBits(0.0f/0.0f) on the aarch64 platform
> is 2143289344, which equals floatToRawIntBits(Float.NaN). About this I
> sent an email to jdk-dev and opened a topic on the Scala community forum,
> https://users.scala-lang.org/t/the-value-of-floattorawintbits-0-0f-0-0f-is-different-on-x86-64-and-aarch64-platforms/4845,
> and filed https://github.com/scala/bug/issues/11632. I thought it was an
> issue in the JDK or Scala, but after discussion it appears to be
> platform-dependent, so the following asserts seem inappropriate:
> https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWindowFunctionsSuite.scala#L704-L705
> and
> https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala#L732-L733
>
> These tests could special-case execution on ARM, like you'll see some
> tests handle big-endian architectures.
>
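The special-casing Sean suggests could hinge on the JVM's os.arch system
property. A minimal sketch, assuming OpenJDK reports "aarch64" on 64-bit ARM
Linux; the property name is standard Java API, but the guard itself is
illustrative, not existing Spark code:

```scala
object ArchCheck {
  // "os.arch" is a standard JVM system property; on 64-bit ARM Linux
  // OpenJDK reports "aarch64". A test could relax raw NaN bit-pattern
  // asserts under this guard, analogous to big-endian special cases.
  val isAarch64: Boolean = System.getProperty("os.arch") == "aarch64"

  def main(args: Array[String]): Unit = {
    println(s"os.arch = ${System.getProperty("os.arch")}, aarch64 = $isAarch64")
  }
}
```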
