spark-issues mailing list archives

From "Shixiong Zhu (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-4824) Join should use `Iterator` rather than `Iterable`
Date Thu, 11 Dec 2014 02:57:12 GMT
Shixiong Zhu created SPARK-4824:
-----------------------------------

             Summary: Join should use `Iterator` rather than `Iterable`
                 Key: SPARK-4824
                 URL: https://issues.apache.org/jira/browse/SPARK-4824
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
            Reporter: Shixiong Zhu


In Scala, `map` and `flatMap` on a strict `Iterable` (such as a `Seq`) run eagerly and copy
all of the results into a new collection, whereas on an `Iterator` they are lazy. For example,
{code}
  val iterable = Seq(1, 2, 3).map(v => {
    println(v)
    v
  })
  println("Iterable map done")

  val iterator = Seq(1, 2, 3).iterator.map(v => {
    println(v)
    v
  })
  println("Iterator map done")
{code}
prints
{code}
1
2
3
Iterable map done
Iterator map done
{code}
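The iterator's `map` never runs here because the iterator is never consumed, so it prints
nothing. Join should therefore use `Iterator` rather than `Iterable` to reduce the memory it consumes.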
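
As a rough illustration of why this matters for join, here is a minimal sketch in plain Scala
(no Spark dependency). The object `JoinSketch` and the values `left` and `right` are hypothetical
stand-ins for the two value groups of a single key; this is not the actual Spark code or patch.
{code}
object JoinSketch {
  // Hypothetical stand-ins for the values grouped under one key on each side.
  val left: Iterable[Int] = Seq(1, 2, 3)
  val right: Iterable[String] = Seq("a", "b")

  // Strict: the for-comprehension desugars to flatMap/map on the Seq,
  // so every (v, w) pair is built and stored before anything is returned.
  def strictPairs: Iterable[(Int, String)] =
    for (v <- left; w <- right) yield (v, w)

  // Lazy: the same comprehension over iterators produces each pair only
  // when the consumer asks for it; no intermediate collection is built.
  def lazyPairs: Iterator[(Int, String)] =
    for (v <- left.iterator; w <- right.iterator) yield (v, w)

  def main(args: Array[String]): Unit = {
    println(strictPairs.size)          // 6: all pairs were materialized
    println(lazyPairs.take(2).toList)  // List((1,a), (1,b)): only two pairs were produced
  }
}
{code}
With large groups on both sides, the strict version holds the whole cross product for a key in
memory at once, while the iterator version only needs the pair currently being consumed.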

Found by [~johannes.simon]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
