spark-user mailing list archives

From Lian Jiang <>
Subject structured streaming handling validation and json flattening
Date Sat, 09 Feb 2019 19:25:39 GMT

We have a Structured Streaming job that converts JSON into Parquet files. We
want to validate the JSON records. If a record is not valid, we want
to log a message and skip writing it to Parquet. The JSON also contains
nested JSON, which we want to flatten into separate Parquet files
using the same streaming job. My questions are:

1. How can we validate the JSON records in a Structured Streaming job?
2. How can we flatten the nested JSON in a Structured Streaming job?
3. Is it possible to use a single Structured Streaming job to validate the JSON,
convert it into one Parquet output, and convert the nested JSON into other Parquet outputs?
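
To make the three requirements concrete, the per-record behavior we are after can be sketched in plain Python. This is not Spark code and not a proposed solution, only an illustration of the intended semantics. The field names `id` and `items`, the assumption that the nested JSON is an array of objects, and the helper names `validate`, `split_record`, and `process` are all hypothetical.

```python
import json
import logging

logger = logging.getLogger("json_validation")

# Hypothetical schema: every record needs an "id" and a nested "items" array.
REQUIRED_FIELDS = ("id", "items")


def validate(raw):
    """Return the parsed record, or None (after logging) if it is invalid."""
    try:
        record = json.loads(raw)
    except (json.JSONDecodeError, TypeError) as exc:
        logger.warning("skipping malformed JSON: %s", exc)
        return None
    if not isinstance(record, dict):
        logger.warning("skipping non-object JSON record")
        return None
    missing = [f for f in REQUIRED_FIELDS if f not in record]
    if missing:
        logger.warning("skipping record, missing fields: %s", missing)
        return None
    items = record["items"]
    if not isinstance(items, list) or not all(isinstance(i, dict) for i in items):
        logger.warning("skipping record, 'items' is not an array of objects")
        return None
    return record


def split_record(record):
    """Split a valid record into one parent row and its flattened child rows."""
    parent = {k: v for k, v in record.items() if k != "items"}
    children = [{"parent_id": record["id"], **item} for item in record["items"]]
    return parent, children


def process(raw_records):
    """Validate each raw record; collect parent rows and child rows separately."""
    parents, children = [], []
    for raw in raw_records:
        record = validate(raw)
        if record is None:
            continue  # invalid records are logged and never written
        parent, kids = split_record(record)
        parents.append(parent)
        children.extend(kids)
    return parents, children
```

Here `parents` stands in for the rows of the main Parquet output and `children` for the rows of a second Parquet output that holds the flattened nested JSON. The question is how to get the same behavior from one Structured Streaming job.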

I think unstructured streaming can achieve these goals, but Structured
Streaming is what the Spark community recommends.

Appreciate your feedback!
