From: hemant singh
Date: Thu, 7 Feb 2019 10:22:18 +0530
Subject: Re: Spark DataFrame/DataSet Wide Transformations
To: Faiz Chachiya
Cc: user

The same concept applies to DataFrames as to RDDs with respect to
transformations. Both are distributed datasets.

Thanks

On Thu, Feb 7, 2019 at 8:51 AM Faiz Chachiya wrote:

> Hello Team,
>
> With RDDs it is pretty clear which operations result in wide
> transformations, and there are also options available to find out the
> parent dependencies.
>
> I have been struggling to do the same with DataFrame/DataSet. I need your
> help in finding out which operations may lead to wide transformations
> (like OrderBy) and whether there is a way to find out the parent
> dependencies.
>
> One way to find out the parent dependencies is to convert the DF/DS
> to an RDD and invoke dependencies on it.
>
> I hope my question is clear and would request your help with it.
>
> Thanks,
> Faiz
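
The approach mentioned in the question (converting the DF/DS to an RDD and inspecting its dependencies) could be sketched roughly as below. This is only a minimal illustration, assuming Spark SQL is on the classpath and a local master is acceptable. The object name LineageCheck, the helper countShuffles, and the column name bucket are hypothetical and not part of Spark. The helper walks the lineage recursively, because the RDD returned by a DataFrame usually has a narrow dependency at the top, with any shuffle dependency further down.

```scala
import org.apache.spark.ShuffleDependency
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Hypothetical helper object; the names here are illustrative, not part of Spark.
object LineageCheck {

  // Walk the RDD lineage and count shuffle (wide) dependencies.
  // An ancestor reachable through several paths is counted once per path.
  def countShuffles(rdd: RDD[_]): Int =
    rdd.dependencies.map { dep =>
      val here = dep match {
        case _: ShuffleDependency[_, _, _] => 1 // wide dependency
        case _                             => 0 // narrow dependency
      }
      here + countShuffles(dep.rdd)
    }.sum

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("lineage-check")
      .getOrCreate()

    // Four input partitions, so that a global sort has to repartition the data.
    val base = spark.range(0, 100, 1, 4).withColumn("bucket", col("id") % 10)
    val narrow = base.filter(col("id") % 2 === 0) // no shuffle expected
    val wide = base.orderBy(col("bucket"))        // global sort, shuffle expected

    // The physical plan shows an Exchange operator where a shuffle happens.
    wide.explain()
    // The RDD-level lineage of the same query.
    println(wide.rdd.toDebugString)

    println(s"narrow: ${countShuffles(narrow.rdd)} shuffle dependencies")
    println(s"wide:   ${countShuffles(wide.rdd)} shuffle dependencies")

    spark.stop()
  }
}
```

For a quick check without converting to an RDD, looking for Exchange operators in the output of explain() is usually enough. The recursive walk is mainly useful when the dependency objects themselves are needed.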