james-server-dev mailing list archives

From "Chris Means" <cme...@intfar.com>
Subject First pass Token Counter mailet for ANTI SPAM mailet
Date Mon, 19 Aug 2002 04:11:19 GMT
Hi,

This is my first attempt at developing a mailet...so if I've made a mistake
about how best to implement this...or if I've just done something dumb in my
code (cos I'm no Java guru either) please let me know.

I'm following through on a posting I saw on /. about using word
occurrence statistics to filter out SPAM from legit messages.
Here's the original article: http://www.paulgraham.com/spam.html

I saw this as a two part development.

Part 1:
  Routines for building good/bad word-token statistics.

Part 2:
  Using the statistics to route or flag new messages as SPAM or not.
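
To make the two parts concrete, here is a rough sketch of what the
statistics might look like in memory. It is not the attached mailet:
the TokenStats class, its method names and the tokenizer delimiters are
all made up for illustration, and the probability step only
approximates the per-token formula from the article linked above
(good counts doubled, tokens seen fewer than 5 times scored 0.4,
results clamped to 0.01..0.99).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

/** Hypothetical helper (not the attached mailet): in-memory good/bad token counts. */
class TokenStats {
    private final Map<String, Integer> good = new HashMap<>();
    private final Map<String, Integer> bad = new HashMap<>();
    private int goodMessages = 0;
    private int badMessages = 0;

    /** Part 1: count every token of one message body into the good or bad table. */
    void train(String body, boolean isSpam) {
        Map<String, Integer> table = isSpam ? bad : good;
        if (isSpam) {
            badMessages++;
        } else {
            goodMessages++;
        }
        // Assumed tokenizer: split on whitespace and common punctuation, lower-case.
        StringTokenizer st = new StringTokenizer(body, " \t\r\n.,;:!?\"()<>[]");
        while (st.hasMoreTokens()) {
            table.merge(st.nextToken().toLowerCase(), 1, Integer::sum);
        }
    }

    int goodCount(String token) {
        return good.getOrDefault(token, 0);
    }

    int badCount(String token) {
        return bad.getOrDefault(token, 0);
    }

    /** Part 2: rough per-token spam probability, approximating the linked article. */
    double spamProbability(String token) {
        int g = 2 * goodCount(token);   // good counts are doubled to bias against false positives
        int b = badCount(token);
        if (g + b < 5 || goodMessages == 0 || badMessages == 0) {
            return 0.4;                 // too little evidence: treat as mildly innocent
        }
        double goodRatio = Math.min(1.0, (double) g / goodMessages);
        double badRatio = Math.min(1.0, (double) b / badMessages);
        double p = badRatio / (goodRatio + badRatio);
        return Math.max(0.01, Math.min(0.99, p));
    }
}
```

A mailet would presumably call train() from its service method for
messages already known to be good or bad, and Part 2 would combine
spamProbability() over the tokens of each new message.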

Attached is my first pass at the code for Part 1.

As I decided to get familiar with JDBC with James at the same time, I've
coded this to use JDBC as the repository...that may not be the best approach
as it introduces a time lag at startup (as it loads the existing
words/occurrences) and at shutdown, as it persists the new statistics back
into the database.

Let me know what you guys think of my approach...etc.

P.S.  Hopefully, there's something I don't understand about how to develop
under James.  I'm using JBuilder 4 to compile my code, then I've got to
update the James.bar (which JBuilder doesn't recognize as a jar repository)
with the new class file, then restart James.  I realize there's probably no
easy way around restarting James, but it would be nice to skip updating the
.bar all the time...is there a way to do this?

Thanks.

-Chris
