spark-user mailing list archives

From Nick Chammas
Subject Using Spark to crack passwords
Date Thu, 12 Jun 2014 00:24:40 GMT
Spark is obviously well-suited to crunching massive amounts of data. How
about crunching massive amounts of numbers?

A few years ago I put together a little demo for some co-workers to
demonstrate the dangers of using SHA1 to hash and store
passwords. Part of the demo included a live brute-forcing of hashes to show
how SHA1's speed made it unsuitable for hashing passwords.

I think it would be cool to redo the demo, but utilize the power of a
cluster managed by Spark to crunch through hashes even faster.

But how would you do that with Spark (if at all)?

I'm guessing you would create an RDD that somehow defined the search space
you're going to go through, and then partition it to divide the work up
equally amongst the cluster's cores. Does that sound right?
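
To make that guess concrete, here is a rough PySpark sketch of the idea, not a
tested cluster job. It assumes a fixed-length, lowercase-only password space,
and every name in it (ALPHABET, LENGTH, candidate, split_ranges, search_range,
crack) is made up for illustration. The search space is never materialized:
each candidate is derived from an integer index, and only the (start, end)
index ranges are shipped to the workers, one range per partition.

```python
import hashlib
import string

ALPHABET = string.ascii_lowercase  # assumption: lowercase letters only
LENGTH = 4                         # assumption: fixed password length


def candidate(index, alphabet=ALPHABET, length=LENGTH):
    """Map an integer in [0, len(alphabet)**length) to a unique candidate string."""
    chars = []
    for _ in range(length):
        index, rem = divmod(index, len(alphabet))
        chars.append(alphabet[rem])
    return "".join(reversed(chars))


def split_ranges(total, parts):
    """Split [0, total) into `parts` contiguous (start, end) ranges of near-equal size."""
    step, extra = divmod(total, parts)
    ranges, start = [], 0
    for i in range(parts):
        end = start + step + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


def search_range(bounds, target_hex):
    """Brute-force one slice of the search space; return the candidates that match."""
    start, end = bounds
    matches = []
    for i in range(start, end):
        guess = candidate(i)
        if hashlib.sha1(guess.encode("utf-8")).hexdigest() == target_hex:
            matches.append(guess)
    return matches


def crack(sc, target_hex, parts):
    """Spread the slices over an existing SparkContext `sc`, one slice per partition."""
    ranges = split_ranges(len(ALPHABET) ** LENGTH, parts)
    return (sc.parallelize(ranges, len(ranges))
              .flatMap(lambda r: search_range(r, target_hex))
              .collect())
```

With a SparkContext in hand, something like
crack(sc, hashlib.sha1(b"wxyz").hexdigest(), parts=sc.defaultParallelism)
should hand each core one slice of the 26**4 candidates. Shipping index ranges
rather than the candidates themselves keeps the driver's memory use flat no
matter how large the search space gets.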

I wonder if others have already used Spark for computationally-intensive
workloads like this, as opposed to just data-intensive ones.

