spark-user mailing list archives

From Spico Florin <spicoflo...@gmail.com>
Subject MLlib usage on Spark Streaming
Date Mon, 16 Feb 2015 14:39:39 GMT
Hello!
  I'm a newbie to Spark and I have the following use case:
1. A client sends the following data every 100 ms:
  {uniqueId, timestamp, measure1, measure2 }
2. Every 30 seconds I would like to correlate the data collected in the
window with a predefined pattern (a vector of doubles) for each given key.
The predefined pattern has 300 records, which matches the 300 readings a
client sends in 30 seconds at one reading per 100 ms. The data should also
be sorted by timestamp.
3. When the correlation is greater than a predefined threshold (e.g. 0.9) I
would like to emit a new message containing {uniqueId,
doubleCorrelationValue}
4. For the correlation I would like to use MLlib
5. As a programming language I would like to use Java 7.
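
To make steps 2 and 3 concrete, here is a rough sketch of the per-key work in plain Java 7, with no Spark dependency. Several things in it are assumptions rather than part of the requirements: the names WindowCorrelation, Reading, pearson and correlateWindow are made up for illustration, only measure1 is correlated against the pattern (the list does not say which measure should be used), and the correlation is a hand-written Pearson coefficient standing in for whatever MLlib provides. In a real job this logic would presumably run once per uniqueId on each 30-second window.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper class; not part of Spark or MLlib.
final class WindowCorrelation {

    /** One client reading: {uniqueId, timestamp, measure1, measure2}. */
    static final class Reading {
        final String uniqueId;
        final long timestamp;
        final double measure1;
        final double measure2;

        Reading(String uniqueId, long timestamp, double measure1, double measure2) {
            this.uniqueId = uniqueId;
            this.timestamp = timestamp;
            this.measure1 = measure1;
            this.measure2 = measure2;
        }
    }

    /** Pearson correlation of two equal-length series (NaN if a series is constant). */
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length == 0) {
            throw new IllegalArgumentException("series must be non-empty and of equal length");
        }
        int n = x.length;
        double meanX = 0.0, meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;
        double cov = 0.0, varX = 0.0, varY = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        return cov / Math.sqrt(varX * varY);
    }

    /**
     * Sorts one key's window by timestamp, correlates measure1 (an assumption)
     * with the pattern, and returns the value only if it exceeds the threshold.
     * Returns null when nothing should be emitted.
     */
    static Double correlateWindow(List<Reading> window, double[] pattern, double threshold) {
        List<Reading> sorted = new ArrayList<Reading>(window);
        Collections.sort(sorted, new Comparator<Reading>() {
            @Override
            public int compare(Reading a, Reading b) {
                return Long.compare(a.timestamp, b.timestamp);
            }
        });
        double[] series = new double[sorted.size()];
        for (int i = 0; i < series.length; i++) {
            series[i] = sorted.get(i).measure1;
        }
        double r = pearson(series, pattern);
        return r > threshold ? Double.valueOf(r) : null;
    }
}
```

A caller would pair the returned value with the key to build the {uniqueId, doubleCorrelationValue} message from step 3. The sketch throws if a window does not contain exactly as many readings as the pattern, so late or missing readings would need separate handling.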

Can you please give me some suggestions on how to create the skeleton for
the above scenario?

Thanks.
 Regards,
 Florin
