Message view | « Date » · « Thread » |
---|---|
From | Aakash Basu <aakash.spark....@gmail.com> |
Subject | Avoiding collect but use foreach |
Date | Fri, 01 Feb 2019 07:37:18 GMT |
Hi, This: *to_list = [list(row) for row in df.collect()]* gives: [[5, 1, 1, 1, 2, 1, 3, 1, 1, 0], [5, 4, 4, 5, 7, 10, 3, 2, 1, 0], [3, 1, 1, 1, 2, 2, 3, 1, 1, 0], [6, 8, 8, 1, 3, 4, 3, 7, 1, 0], [4, 1, 1, 3, 2, 1, 3, 1, 1, 0]] I want to avoid the collect operation but still convert the DataFrame to a Python list of lists, just as above, for downstream operations. Is there a way to do this, perhaps with code that performs better than collect? Thanks, Aakash. | |
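One possible direction for the question above, offered as a sketch rather than a definitive answer: any Python list of lists has to live on the driver, so the rows must reach the driver somehow. PySpark's `DataFrame.toLocalIterator()` streams rows to the driver one partition at a time instead of materializing everything at once as `collect()` does, which may lower peak driver memory when rows are consumed incrementally. The helper names `rows_to_lists` and `in_batches` below are hypothetical, and `sample_rows` is a plain-Python stand-in for `df.toLocalIterator()` so the example runs without a Spark session.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Sequence


def rows_to_lists(rows: Iterable[Sequence]) -> Iterator[List]:
    """Lazily convert each row (any sequence, such as a pyspark Row) to a list."""
    for row in rows:
        yield list(row)


def in_batches(rows: Iterable[Sequence], size: int) -> Iterator[List[List]]:
    """Yield lists holding at most `size` converted rows at a time."""
    converted = rows_to_lists(rows)
    while True:
        batch = list(islice(converted, size))
        if not batch:
            return
        yield batch


# Assumption: stand-in data taken from the post. With Spark, one would pass
# df.toLocalIterator() here instead of this in-memory list of tuples.
sample_rows = [
    (5, 1, 1, 1, 2, 1, 3, 1, 1, 0),
    (5, 4, 4, 5, 7, 10, 3, 2, 1, 0),
    (3, 1, 1, 1, 2, 2, 3, 1, 1, 0),
]

# Full list of lists: same shape as the collect() version in the post.
to_list = list(rows_to_lists(sample_rows))

# Batched consumption: only `size` rows are held as lists at any one time.
batches = list(in_batches(sample_rows, 2))
```

If the downstream code truly needs the whole list in memory at once, building it from `toLocalIterator()` ends with the same data on the driver as `collect()`; the benefit appears mainly when the rows can be processed in batches, or when the per-row work can stay on the executors.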