spark-user mailing list archives

From Sathish Kumaran Vairavelu <>
Subject Checking Data Integrity in Spark
Date Fri, 27 Mar 2015 11:43:42 GMT

Is there a way to check the integrity of data files in Spark? The use case is
to run data integrity checks on large files with 100+ columns and reject
records that do not meet criteria such as NOT NULL, date format, etc., writing
the rejected records to another file. Since there are a lot of columns and
integrity rules, we should be able to drive the integrity checks through
configuration (XML, JSON, etc.). Please share your thoughts.
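
To make the configuration-driven check described above concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the JSON rule layout, the column names, and the helper names `load_rules`, `violations`, and `split_records` are hypothetical and do not come from Spark or any existing library. It only shows the shape of the idea: rules live in a JSON document, and each record is either accepted or rejected with a list of reasons.

```python
import json
from datetime import datetime

# Hypothetical rule configuration: one entry per column to validate.
# The keys "not_null" and "date_format" are invented for this sketch.
RULES_JSON = """
[
  {"column": "id",         "not_null": true},
  {"column": "name",       "not_null": true},
  {"column": "birth_date", "date_format": "%Y-%m-%d"}
]
"""


def load_rules(text):
    """Parse the JSON rule configuration into a list of dicts."""
    return json.loads(text)


def violations(record, rules):
    """Return a list of rule violations for one record (a dict of strings)."""
    problems = []
    for rule in rules:
        col = rule["column"]
        value = record.get(col)
        missing = value is None or value == ""
        if rule.get("not_null") and missing:
            problems.append(f"{col}: NOT NULL violated")
        fmt = rule.get("date_format")
        # Only check the format when a value is present; emptiness is
        # the job of the NOT NULL rule.
        if fmt and not missing:
            try:
                datetime.strptime(value, fmt)
            except (ValueError, TypeError):
                problems.append(f"{col}: expected date format {fmt}")
    return problems


def split_records(records, rules):
    """Split records into accepted ones and (record, reasons) rejects."""
    accepted, rejected = [], []
    for rec in records:
        problems = violations(rec, rules)
        if problems:
            rejected.append((rec, problems))
        else:
            accepted.append(rec)
    return accepted, rejected


RULES = load_rules(RULES_JSON)
SAMPLE = [
    {"id": "1", "name": "Ann", "birth_date": "1990-04-12"},
    {"id": "", "name": "Bob", "birth_date": "1985-01-30"},
    {"id": "3", "name": "Cy", "birth_date": "30/01/1985"},
]
ACCEPTED, REJECTED = split_records(SAMPLE, RULES)
```

In a Spark job, a function like `violations` could presumably be applied per record, for example with `rdd.filter(lambda r: not violations(r, rules))` for the accepted rows and the complementary filter for the rejects, each saved with `saveAsTextFile`. That wiring is untested here and is only one possible design.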


