spark-user mailing list archives

From Manjunath Shetty H
Subject Saving Spark run stats and run watermark
Date Wed, 18 Mar 2020 11:03:05 GMT
Hi All,

We want to save the stats of each Spark batch run (start time, end time, run ID, etc.) and a
watermark (the last processed timestamp from the external data source).
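
To make the requirement concrete, here is a rough sketch of the kind of record we have in mind, one per run. The class name, field names, and sample values are hypothetical and only for illustration; it says nothing about where the record should be stored, which is the actual question.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class BatchRunRecord:
    """One record per Spark batch run (all names here are hypothetical)."""
    run_id: str
    start_time: datetime
    end_time: datetime
    watermark: datetime  # last processed timestamp from the external source


def next_window_start(previous: BatchRunRecord) -> datetime:
    """The next run would resume reading from the previous run's watermark."""
    return previous.watermark


# Made-up sample values, only to show the shape of the record.
record = BatchRunRecord(
    run_id="run-0001",
    start_time=datetime(2020, 3, 18, 10, 0, tzinfo=timezone.utc),
    end_time=datetime(2020, 3, 18, 10, 5, tzinfo=timezone.utc),
    watermark=datetime(2020, 3, 18, 9, 59, 30, tzinfo=timezone.utc),
)

print(asdict(record))
```

So each run writes a single small row, and the next run only needs to read back the latest watermark.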

We have tried Hive JDBC, but it is very slow because of the MR jobs it triggers. We can't save
to normal Hive tables either, as that would create lots of small files in HDFS.

What is the recommended way to do this? Any pointers would be helpful.

Thanks and regards
