The data in the FCST DataFrame looks like this:
Product  fcst_qty
A        100
B        50
The Sales DataFrame has data like this:
Order#  Item#  Sales qty
101     A      10
101     B      5
102     A      5
102     B      10
I want to update the FCST DF data, matching on Product = Item#, so the resulting FCST DF
should contain:
Product  fcst_qty
A        85
B        35
Hope it helps
If I join the data between the 2 DFs (on Product and Item#), I will get a cartesian-style
join (one row per matching sales line), and my result will not be what I want.
Thanks for your help
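
One possible way around the row multiplication is to bring the sales data to the forecast grain first: sum the sales quantity per item, then left-join that one-row-per-item result to the forecast. The sketch below runs the query through Python's built-in sqlite3 module only so that it is self-contained. The table and column names (fcst, sales, order_no, item, sales_qty, sold_qty) are assumptions made for illustration. The SELECT uses only standard SQL, so it would presumably also run through spark.sql() once both DataFrames are registered as temporary views, but that has not been verified here.

```python
import sqlite3

# In-memory database holding the sample rows from the message above.
# Table and column names are hypothetical stand-ins for the two DataFrames.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fcst  (product TEXT, fcst_qty INTEGER);
    CREATE TABLE sales (order_no INTEGER, item TEXT, sales_qty INTEGER);
    INSERT INTO fcst  VALUES ('A', 100), ('B', 50);
    INSERT INTO sales VALUES (101, 'A', 10), (101, 'B', 5),
                             (102, 'A', 5),  (102, 'B', 10);
""")

# Aggregate sales to one row per item BEFORE joining, so each forecast
# row matches at most one sales row. COALESCE keeps the forecast unchanged
# for products that have no sales at all.
query = """
    SELECT f.product,
           f.fcst_qty - COALESCE(s.sold_qty, 0) AS fcst_qty
    FROM fcst f
    LEFT OUTER JOIN (
        SELECT item, SUM(sales_qty) AS sold_qty
        FROM sales
        GROUP BY item
    ) s ON f.product = s.item
    ORDER BY f.product
"""
updated_fcst = conn.execute(query).fetchall()
print(updated_fcst)  # [('A', 85), ('B', 35)]
```

Because Spark DataFrames are immutable, the outcome of such a query is a new DataFrame that replaces the old forecast rather than an in-place update.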
From: Mike Metzger [mailto:mike@flexiblecreations.com]
Sent: Friday, August 26, 2016 2:12 PM
To: Subhajit Purkayastha <spurkaya@p3si.net>
Cc: user @spark <user@spark.apache.org>
Subject: Re: Spark 2.0 - Insert/Update to a DataFrame
Without seeing exactly what you were wanting to accomplish, it's hard to say. A join is still
probably the method I'd suggest, using something like:
select (FCST.quantity - SO.quantity) as quantity,
       <other needed columns>
from FCST
LEFT OUTER JOIN
SO ON FCST.productid = SO.productid
WHERE
<conditions>
with specifics depending on the layout and what language you're using.
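
To make the template concrete, here is a sketch of how a row-level join of this shape behaves on the sample rows quoted at the top of this thread, where the sales side has several rows per product. It uses Python's built-in sqlite3 module purely so the example runs on its own. The table and column names (fcst, sales, order_no, item, sales_qty) are assumptions, not names from the actual DataFrames.

```python
import sqlite3

# Sample data from the thread, loaded into hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fcst  (product TEXT, fcst_qty INTEGER);
    CREATE TABLE sales (order_no INTEGER, item TEXT, sales_qty INTEGER);
    INSERT INTO fcst  VALUES ('A', 100), ('B', 50);
    INSERT INTO sales VALUES (101, 'A', 10), (101, 'B', 5),
                             (102, 'A', 5),  (102, 'B', 10);
""")

# Join forecast rows directly to individual sales order lines.
rows = conn.execute("""
    SELECT f.product, f.fcst_qty - s.sales_qty AS fcst_qty
    FROM fcst f
    LEFT OUTER JOIN sales s ON f.product = s.item
    ORDER BY f.product, s.order_no
""").fetchall()

# One output row per sales line, not one per product.
print(rows)  # [('A', 90), ('A', 95), ('B', 45), ('B', 40)]
```

Each forecast row is repeated once per matching order line, and each copy subtracts only that single line's quantity, so the join needs the sales side reduced to one row per product to give a single adjusted forecast.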
Thanks
Mike
On Fri, Aug 26, 2016 at 3:29 PM, Subhajit Purkayastha <spurkaya@p3si.net> wrote:
Mike,
The grains of the DataFrames are different.
I need to reduce the forecast qty (which is in the FCST DF) based on the sales qty (coming
from the sales order DF).
Hope it helps
Subhajit
From: Mike Metzger [mailto:mike@flexiblecreations.com]
Sent: Friday, August 26, 2016 1:13 PM
To: Subhajit Purkayastha <spurkaya@p3si.net>
Cc: user @spark <user@spark.apache.org>
Subject: Re: Spark 2.0 - Insert/Update to a DataFrame
Without seeing the makeup of the DataFrames or what your logic is for updating them, I'd
suggest doing a join of the Forecast DF with the appropriate columns from the SalesOrder DF.
Mike
On Fri, Aug 26, 2016 at 11:53 AM, Subhajit Purkayastha <spurkaya@p3si.net> wrote:
I am using Spark 2.0 and have 2 DataFrames, SalesOrder and Forecast. I need to update the Forecast
DataFrame record(s) based on the SalesOrder DF records. What is the best way to achieve this
functionality?