spark-user mailing list archives

From Mich Talebzadeh
Subject Financial fraud detection using streaming RDBMS data into Spark & Hbase
Date Thu, 15 Dec 2016 23:27:33 GMT
I am not talking about Credit Card fraud etc.

In complex fraud cases like the one at UBS, the rogue trader manipulated
the figures over a period of time. Although there is a lot of talk about
using elaborate set-ups to predict unusual behaviour etc, in my opinion it
all boils down to how the figures are manipulated at the database level.

Bottom line: money has to leave the account and be paid out to be real
money, so to speak. In most cases this type of fraud happens because someone
is pretty familiar with the front office and settlement work but ignorant of
how a transactional database works.

In short, a transactional DB only keeps the latest state of each row.
However, it is now possible to get the DML data out of the transaction log
by using replication technologies (I am not talking about simple CDC). If
we start sending the replicated transaction log (as SQL statements) out of
the database for all updates and store it in Hbase, then one can go through
the Hbase data with Spark.
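
To make the idea concrete, here is a minimal sketch of the kind of pass one
could run over the replicated change history. It is plain Python rather than
Spark so that it is self-contained, and everything in it is an assumption for
illustration: the ChangeEvent fields, the "trades" table, the user names and
the min_updates threshold are hypothetical, not part of any real schema.

```python
from collections import defaultdict
from typing import NamedTuple


class ChangeEvent(NamedTuple):
    """One replicated DML event (hypothetical shape, for illustration only)."""
    table: str
    pk: str
    op: str          # "INSERT", "UPDATE" or "DELETE"
    amount: float    # value of the column of interest after the change
    commit_ts: int   # commit time in epoch milliseconds
    user: str


def revised_records(events, min_updates=2):
    """Map (table, pk) to its amounts in commit order, for every record
    that was updated at least min_updates times after being inserted."""
    history = defaultdict(list)
    for ev in sorted(events, key=lambda e: e.commit_ts):
        history[(ev.table, ev.pk)].append(ev)

    flagged = {}
    for key, evs in history.items():
        update_count = sum(1 for e in evs if e.op == "UPDATE")
        if update_count >= min_updates:
            flagged[key] = [e.amount for e in evs]
    return flagged


events = [
    ChangeEvent("trades", "T1", "INSERT", 100.0, 1000, "booker"),
    ChangeEvent("trades", "T2", "INSERT", 500.0, 1500, "booker"),
    ChangeEvent("trades", "T1", "UPDATE", 250.0, 2000, "trader_x"),
    ChangeEvent("trades", "T2", "UPDATE", 505.0, 2500, "ops"),
    ChangeEvent("trades", "T1", "UPDATE", 90.0, 3000, "trader_x"),
]

flagged = revised_records(events)
for key, amounts in flagged.items():
    print(key, amounts)
```

The latest-state view of the database would only show the final amount for
T1. The replayed history shows that the figure was revised twice after
booking, which is the kind of trail worth a closer look. At scale the same
grouping would presumably be expressed as a group-by on the record key in
Spark.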

In the past this was prohibitive with database auditing, as it carried a
heavy price in RDBMS performance. However, with Big Data this can be done
through time series/immutable inserts: every change, including an update,
is written as a new row in Hbase, flagged with its operation type and time
stamped.
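
As a rough sketch of what "immutable inserts" could look like, the snippet
below uses an in-memory dict as a stand-in for an HBase table. The row key
layout (table|primary key|zero-padded commit time), the ChangeStore class
and its method names are all hypothetical choices made for illustration.
The one HBase fact relied on is that rows are sorted lexicographically by
row key, which is why the timestamp is zero-padded.

```python
def row_key(table, pk, commit_ts_ms):
    """Build a key whose lexicographic order matches commit order for a
    given record (timestamp is zero-padded to a fixed width)."""
    return f"{table}|{pk}|{commit_ts_ms:013d}"


class ChangeStore:
    """Append-only store: a dict standing in for an HBase table."""

    def __init__(self):
        self._rows = {}

    def append(self, table, pk, op, values, commit_ts_ms):
        """Write one change as a brand-new row; never overwrite history."""
        key = row_key(table, pk, commit_ts_ms)
        if key in self._rows:
            raise ValueError(f"row {key} already exists; history is immutable")
        self._rows[key] = {"op": op, "commit_ts": commit_ts_ms, **values}
        return key

    def history(self, table, pk):
        """Return all versions of one record, oldest first (a prefix scan)."""
        prefix = f"{table}|{pk}|"
        return [self._rows[k] for k in sorted(self._rows) if k.startswith(prefix)]


store = ChangeStore()
store.append("trades", "T1", "INSERT", {"amount": 100.0}, 1481844453000)
store.append("trades", "T1", "UPDATE", {"amount": 250.0}, 1481844460000)
store.append("trades", "T10", "INSERT", {"amount": 7.0}, 1481844455000)

versions = store.history("trades", "T1")
for v in versions:
    print(v["commit_ts"], v["op"], v["amount"])
```

The update does not replace the original row. Both versions of T1 remain
readable in commit order, while the source RDBMS only has to ship its log
and does not pay for auditing itself.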

I thought about it a bit and I think it can provide results far quicker
and better than searching for a needle in a haystack, as is done these days
with the stuff promoted by various companies.

Your thoughts?

Dr Mich Talebzadeh

*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.
