ctakes-dev mailing list archives

From Alexander Measure <ameas...@gmail.com>
Subject Sharing trained models while protecting confidentiality
Date Sat, 18 May 2013 19:43:20 GMT
In my day job I train text classifiers that are useful for a wide variety
of health surveillance tasks. However, the data used to train these
classifiers cannot be shared because of confidentiality protections. I would
like to make the trained models available to others, just as cTAKES does,
but I'm not sure how. Can you tell me how cTAKES does it, or point me to
resources that might be useful?

My models tend to be regularized logistic regression models trained on
bag-of-words features. I suspect that I can get some protection by
hashing every feature into a fixed-size space first, but if there's a
different, well-established approach out there, I'd rather use that.
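
A minimal sketch of the hashing idea described above, assuming whitespace tokenization and SHA-256 as the hash. The names `N_BUCKETS`, `hash_token`, and `hashed_bag_of_words` are hypothetical and chosen only for illustration. Each token is mapped to an index in a fixed-size space, so a shared model would carry one weight per index rather than a vocabulary of the original words. Note that anyone who applies the model has to compute the same hash, so they could also hash a list of candidate words and compare; this obscures the vocabulary but should not be read as an established confidentiality guarantee.

```python
import hashlib
from collections import Counter

# Hypothetical size of the fixed feature space (an assumption, not from the message).
N_BUCKETS = 2 ** 18


def hash_token(token: str, n_buckets: int = N_BUCKETS) -> int:
    """Map a token to a bucket index in [0, n_buckets).

    SHA-256 is used instead of Python's built-in hash() because the built-in
    string hash is randomized per process, so indices would not be reproducible.
    """
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_buckets


def hashed_bag_of_words(text: str, n_buckets: int = N_BUCKETS) -> dict:
    """Return a sparse {bucket_index: count} vector for a piece of text.

    Tokenization is a plain lowercase whitespace split, purely for illustration.
    Distinct tokens that collide in the same bucket have their counts added.
    """
    counts = Counter(hash_token(tok, n_buckets) for tok in text.lower().split())
    return dict(counts)
```

The resulting sparse vectors could be fed to any logistic regression trainer; a smaller `n_buckets` produces more collisions, which hides more of the vocabulary at some cost in accuracy.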

Alex Measure
