spark-user mailing list archives

From evanzamir <>
Subject How to add custom steps to Pipeline models?
Date Fri, 12 Aug 2016 16:19:42 GMT
I'm building an LDA Pipeline that currently has four steps: Tokenizer,
StopWordsRemover, CountVectorizer, and LDA. I would like to add more steps,
for example stemming and lemmatization, and also a step that emits both
1-grams and 2-grams (which I believe the default NGram class does not
support). Is there a way to add these steps? In sklearn, you can create
classes with fit() and transform() methods, and that should be enough. Is
that true in Spark ML as well (or something similar)?
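
To make the question concrete, here is a minimal pure-Python sketch of the
sklearn-style convention I have in mind. The class name UniBigramStep is
hypothetical (it is not part of sklearn or Spark), and the sketch assumes
each document is already a list of tokens:

```python
class UniBigramStep:
    """Hypothetical sklearn-style step that emits 1-grams and 2-grams together."""

    def fit(self, docs):
        # N-gram generation has nothing to learn; return self so calls can chain.
        return self

    def transform(self, docs):
        # docs is a list of token lists; each output row holds the original
        # tokens (1-grams) followed by the space-joined adjacent pairs (2-grams).
        out = []
        for tokens in docs:
            bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
            out.append(list(tokens) + bigrams)
        return out


docs = [["spark", "ml", "pipeline"]]
print(UniBigramStep().fit(docs).transform(docs))
# prints [['spark', 'ml', 'pipeline', 'spark ml', 'ml pipeline']]
```

What I'm after is the Spark ML counterpart of a class like this, one that
could sit between StopWordsRemover and CountVectorizer in the Pipeline.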
