From: Corey Nolet
Date: Mon, 29 Dec 2014 22:53:42 -0500
Subject: Cached RDD
To: user

If I have 2 RDDs which depend on the same RDD, like the following:

val rdd1 = ...
val rdd2 = rdd1.groupBy()...
val rdd3 = rdd1.groupBy()...

If I don't cache rdd1, will its lineage be calculated twice (once for rdd2 and once for rdd3)?
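One way to check this empirically is a small local experiment. The following is only a sketch under stated assumptions, not something from the message above: it assumes Spark 2.x or later (for SparkSession and longAccumulator), a local master, and made-up names (CachedRddCheck, evaluations, the modulo grouping keys). It counts how often the function that builds rdd1 runs when rdd2 and rdd3 are each materialized by their own action.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical helper object; the names and the grouping keys are illustrative only.
object CachedRddCheck {

  // Runs one action on rdd2 and one on rdd3, then returns how many times
  // the map function that builds rdd1 was invoked across both jobs.
  def evaluations(spark: SparkSession, cacheRdd1: Boolean): Long = {
    val sc = spark.sparkContext

    // Incremented once per element each time rdd1 is actually computed.
    val counter = sc.longAccumulator("rdd1 evaluations")

    val rdd1 = sc.parallelize(1 to 100, 4).map { x => counter.add(1L); x }
    if (cacheRdd1) rdd1.cache()

    // Two independent shuffles over the same parent RDD.
    val rdd2 = rdd1.groupBy(_ % 2)
    val rdd3 = rdd1.groupBy(_ % 3)

    // Each count() is a separate job that walks back up the lineage.
    rdd2.count()
    rdd3.count()

    if (cacheRdd1) rdd1.unpersist()
    counter.sum
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("cached-rdd-check")
      .getOrCreate()

    println(s"without cache: ${evaluations(spark, cacheRdd1 = false)}")
    println(s"with cache:    ${evaluations(spark, cacheRdd1 = true)}")

    spark.stop()
  }
}
```

In a local run with no task retries, the uncached case should report 200 evaluations (rdd1 computed once per job) and the cached case 100, provided the cached partitions fit in memory. Accumulator updates made inside transformations can be applied more than once if tasks are re-executed, so treat the counter as a diagnostic rather than an exact measure on a real cluster.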