spark-user mailing list archives

From kant kodali <>
Subject Re: is there a way to persist the lineages generated by spark?
Date Fri, 07 Apr 2017 06:08:37 GMT
Yes, lineage that is actually replayable is what we need for the validation
process, so that we can answer questions like how a system arrived at state S
at time T. I guess a good analogy is event sourcing.
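
To make the event-sourcing analogy concrete, here is a minimal sketch in plain
Python (not Spark). It assumes every transformation is deterministic and can be
named, so the log of applied steps can be persisted and replayed up to any time
T. The names LineageLog and TRANSFORMS are hypothetical, made up for this
illustration; none of this is a Spark API.

```python
import json

# Hypothetical registry of named, deterministic transformations.
# Replay only works if each step can be looked up by name and re-applied.
TRANSFORMS = {
    "add": lambda state, amount: state + amount,
    "multiply": lambda state, factor: state * factor,
}


class LineageLog:
    """Append-only log of transformation events (illustrative only)."""

    def __init__(self, events=None):
        self.events = list(events or [])

    def record(self, timestamp, op, arg):
        """Append one event; reject operations that could not be replayed."""
        if op not in TRANSFORMS:
            raise ValueError(f"unknown transformation: {op}")
        self.events.append({"t": timestamp, "op": op, "arg": arg})

    def dumps(self):
        """Serialize the log so it can be persisted outside the process."""
        return json.dumps(self.events)

    @classmethod
    def loads(cls, text):
        """Rebuild a log from its persisted JSON form."""
        return cls(json.loads(text))

    def replay(self, initial_state, until=None):
        """Re-apply events in time order, stopping after time `until`."""
        state = initial_state
        for event in sorted(self.events, key=lambda e: e["t"]):
            if until is not None and event["t"] > until:
                break
            state = TRANSFORMS[event["op"]](state, event["arg"])
        return state


log = LineageLog()
log.record(1, "add", 5)
log.record(2, "multiply", 3)
log.record(3, "add", -4)

# Persist, restore, then ask how the system arrived at its state at T=2.
restored = LineageLog.loads(log.dumps())
state_at_t2 = restored.replay(0, until=2)
final_state = restored.replay(0)
```

Spark's own RDD.toDebugString describes an RDD's lineage as text, but that
description is for inspection and is not something you can replay the way this
log can be.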

On Thu, Apr 6, 2017 at 10:30 PM, Jörn Franke <> wrote:

> I do think this is the right way; you will have to test with test
> data, verifying that the actual output of the calculation matches the
> expected output. Even if the logical plan is correct, your calculation
> might not be. E.g., there can be bugs in Spark or in the UI, or (as happens
> very often) the client describes a calculation, but in the end the
> description is wrong.
> > On 4. Apr 2017, at 05:19, kant kodali <> wrote:
> >
> > Hi All,
> >
> > I am wondering if there is a way to persist the lineages that Spark
> > generates underneath. Some of our clients want us to prove that the
> > result of a computation shown on a dashboard is correct. If we could
> > show the lineage of transformations executed to arrive at the result,
> > that could be the Q.E.D. moment, but I am not sure this is even
> > possible with Spark.
> >
> > Thanks,
> > kant