spark-user mailing list archives

From Eric Ho
Subject How to do nested for-each loops across RDDs ?
Date Mon, 15 Aug 2016 20:15:30 GMT
I have nested for-each loops like this:

  for each a in A do:
    for each b in B do:
      append b to some list if b 'matches' a in some fashion.

Each element in A or B is some complex structure like:
  some complex JSON,
  some number
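
For concreteness, the loops above could be sketched in plain Python roughly as follows. This is only an illustration under assumptions: each element is taken to be a (JSON-like dict, number) pair, and `matches` is a hypothetical placeholder predicate (here it compares a made-up "key" field); the real matching rule is whatever the application needs.

```python
# Hypothetical sample data: each element is (JSON-like dict, number).
A = [({"key": "x"}, 1), ({"key": "y"}, 2)]
B = [({"key": "x"}, 10), ({"key": "z"}, 20), ({"key": "y"}, 30)]


def matches(a, b):
    # Placeholder predicate (an assumption): two elements match
    # when their JSON parts carry the same "key" value.
    return a[0]["key"] == b[0]["key"]


def nested_match(A, B):
    # Compare every element of A with every element of B and
    # collect the elements of B that match.
    matched = []
    for a in A:
        for b in B:
            if matches(a, b):
                matched.append(b)
    return matched


result = nested_match(A, B)
```

With this sample data, `result` holds the two elements of B whose "key" also appears in A.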

Question: if A and B were represented as RDDs (e.g. RDD(A) and RDD(B)), how
would my code look?
Are there any RDD operators that would allow me to loop through both RDDs like
the above procedural code?
I can't find any RDD operators or code fragments that would allow me
to do this.

The thing is: by the time I composed RDD(A), this RDD would have to contain
elements of array B as well as array A.
The same argument applies to RDD(B).

Any pointers much appreciated.



-eric ho
