I understand persistence for PySpark ML pipelines is already present in 2.0, and further improvements are being made for 2.1 (e.g. SPARK-13786).

I’m having trouble, though, persisting a pipeline that includes a custom Transformer (see SPARK-17025). It appears that there is a magic _to_java() method that I need to implement.

Is the intention that developers implementing custom Transformers would also specify how they should be persisted, or are there ideas about how to make this automatic? I searched JIRA, but I’m not sure whether I missed an issue that already addresses this problem.

Nick