spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Arush Kharbanda <ar...@sigmoidanalytics.com>
Subject Re: Checking Data Integrity in Spark
Date Fri, 27 Mar 2015 13:35:37 GMT
Its not possible to configure Spark to do checks based on xmls. You would
need to write jobs to do the validations you need.

On Fri, Mar 27, 2015 at 5:13 PM, Sathish Kumaran Vairavelu <
vsathishkumaran@gmail.com> wrote:

> Hello,
>
> I want to check if there is any way to check the data integrity of the
> data files. The use case is perform data integrity check on large files
> 100+ columns and reject records (write it another file) that does not meet
> criteria's (such as NOT NULL, date format, etc). Since there are lot of
> columns/integrity rules we should able to data integrity check through
> configurations (like xml, json, etc); Please share your thoughts..
>
>
> Thanks
>
> Sathish
>



-- 

[image: Sigmoid Analytics] <http://htmlsig.com/www.sigmoidanalytics.com>

*Arush Kharbanda* || Technical Teamlead

arush@sigmoidanalytics.com || www.sigmoidanalytics.com

Mime
View raw message