spark-user mailing list archives

From Jörn Franke <>
Subject Re: Cartesian join on RDDs taking too much time
Date Wed, 25 May 2016 11:19:16 GMT

Alternatively, depending on the exact use case, you may employ Solr on Hadoop for text analytics.

> On 25 May 2016, at 12:57, Priya Ch <> wrote:
> Let's say I have RDD A of strings {"hi","bye","ch"} and another RDD B of
> strings {"padma","hihi","chch","priya"}. For every string in RDD A, I need
> to check the matches found in RDD B. For example, for the string "hi" I have
> to check the matches against all strings in RDD B, which means I need to generate every
> possible combination r
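
For reference, the matching described in the quoted message can be sketched in plain Python, without Spark. This assumes that "match" means substring containment ("hi" occurs inside "hihi"), which the thread does not confirm. The function name substring_matches is hypothetical and only illustrates what the cartesian product has to compute.

```python
from itertools import product


def substring_matches(needles, candidates):
    """Return (needle, candidate) pairs where needle occurs inside candidate.

    Assumption: a "match" is substring containment. Every needle is compared
    against every candidate, which is the same work a cartesian join performs.
    """
    return [(n, c) for n, c in product(needles, candidates) if n in c]


a = ["hi", "bye", "ch"]
b = ["padma", "hihi", "chch", "priya"]

matches = substring_matches(a, b)
print(matches)
```

With 3 strings in A and 4 in B this performs 12 comparisons. The count grows as the product of the two sizes, which is likely why the cartesian join becomes slow on large RDDs.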
