spark-user mailing list archives

From TEST ONE <>
Subject Merging two avro RDD/DataFrames
Date Tue, 29 Sep 2015 00:00:24 GMT
I have a daily update of modified users (~100s of records), output as Avro by an ETL job.
I need to find the corresponding existing members in a master Avro file
(~100,000s of records) and merge each update into its match. The merge
involves combining the ‘profiles’ Map<String,String> of the two matching records.
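
To make the intended merge concrete, here is a minimal sketch in plain Python (no Spark). Several details are assumptions for illustration only and are not stated above: that records are matched on a key field named `id`, that values from the daily update win when both records have the same profile key, and that updates with no match in the master are appended as new records. The function names are hypothetical.

```python
from typing import Dict, List

Record = Dict[str, object]


def merge_profiles(master: Record, update: Record) -> Record:
    """Return a copy of `master` whose 'profiles' map also contains the
    entries from `update`. Assumption: update values win on key conflicts."""
    merged = dict(master)
    profiles = dict(master.get("profiles") or {})
    profiles.update(update.get("profiles") or {})
    merged["profiles"] = profiles
    return merged


def merge_daily_update(master_records: List[Record],
                       update_records: List[Record],
                       key: str = "id") -> List[Record]:
    """Merge a small list of updates into a larger master list.
    Assumption: `key` ("id" by default) identifies matching records."""
    # Index the small side (~100s of updates) by key for fast lookup.
    updates_by_key = {u[key]: u for u in update_records}
    matched = set()
    result = []
    for rec in master_records:
        upd = updates_by_key.get(rec[key])
        if upd is None:
            # No update for this member: keep the master record as-is.
            result.append(rec)
        else:
            result.append(merge_profiles(rec, upd))
            matched.add(rec[key])
    # Assumption: updates without a master match are added as new records.
    for k, upd in updates_by_key.items():
        if k not in matched:
            result.append(upd)
    return result


master = [
    {"id": 1, "profiles": {"a": "x", "b": "y"}},
    {"id": 2, "profiles": {"c": "z"}},
]
updates = [{"id": 1, "profiles": {"b": "new", "d": "w"}}]
merged = merge_daily_update(master, updates)
```

In Spark, a per-record function like `merge_profiles` could plausibly be applied after joining the two datasets on the key, for example with a left outer join of the master against the updates; whether that is the best fit for this workload is the open question.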

What would be the recommended pattern to handle record merging with Spark?


