spark-user mailing list archives

From jinhong lu <lujinho...@gmail.com>
Subject how to construct parameter for model.transform() from datafile
Date Mon, 13 Mar 2017 08:31:17 GMT
Hi, all:

I have the following training data:

	0 31607:17
	0 111905:36
	0 109:3 506:41 1509:1 2106:4 5309:1 7209:5 8406:1 27108:1 27709:1 30209:8 36109:20 41408:1 42309:1 46509:1 47709:5 57809:1 58009:1 58709:2 112109:4 123305:48 142509:1
	0 407:14 2905:2 5209:2 6509:2 6909:2 14509:2 18507:10
	0 604:3 3505:9 6401:3 6503:2 6505:3 7809:8 10509:3 12109:3 15207:19 31607:19
	0 19109:7 29705:4 123305:32
	0 15309:1 43005:1 108509:1
	1 604:1 6401:1 6503:1 15207:4 31607:40
	0 1807:19
	0 301:14 501:1 1502:14 2507:12 123305:4
	0 607:14 19109:460 123305:448
	0 5406:14 7209:4 10509:3 19109:6 24706:10 26106:4 31409:1 123305:48 128209:1
	1 1606:1 2306:3 3905:19 4408:3 4506:8 8707:3 19109:50 24809:1 26509:2 27709:2 56509:8 122705:62 123305:31 124005:2

Then I train the model with Spark:

	import org.apache.spark.ml.classification.NaiveBayes
	import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
	import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
	import org.apache.spark.sql.SparkSession

	val spark = SparkSession.builder.appName("NaiveBayesExample").getOrCreate()
	val data = spark.read.format("libsvm").load("/tmp/ljhn1829/aplus/training_data3")
	val Array(trainingData, testData) = data.randomSplit(Array(0.7, 0.3), seed = 1234L)
	//val model = new NaiveBayes().fit(trainingData)
	val model = new NaiveBayes().setThresholds(Array(10.0,1.0)).fit(trainingData)
	val predictions = model.transform(testData)
	predictions.show()


OK, I have trained my model with the code above, but how can I use it to predict the classification
of other data like this:

	ID1	509:2 5102:4 25909:1 31709:4 121905:19
	ID2	800201:1
	ID3	116005:4
	ID4	800201:1
	ID5	19109:1  21708:1 23208:1 49809:1 88609:1
	ID6	800201:1
	ID7	43505:7 106405:7

I know I can use the transform() method, but how do I construct the parameter for the transform()
method?





Thanks,
lujinhong
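
One possible way to build the input for transform(), as a minimal sketch rather than a definitive answer: transform() takes a DataFrame with a vector column named "features" (the default featuresCol), so each "ID idx:val ..." line can be parsed into an ID plus a sparse vector. The object name ScoreNewData and the helpers parseLine and toFeatures are hypothetical names introduced only for illustration. The sketch assumes the fields are separated by whitespace and that the indices are 1-based as in the libsvm training file, which Spark's libsvm reader shifts to 0-based. It also assumes that features with an index outside the model's feature space (800201 may be one, if the full training file never contains it) can simply be dropped, since the model has no parameters for them.

```scala
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical helper object, not part of Spark.
object ScoreNewData {

  // Parse "ID idx:val idx:val ..." into (id, 0-based sorted indices, values).
  // Assumes 1-based indices, matching the libsvm training file.
  def parseLine(line: String): (String, Array[Int], Array[Double]) = {
    val parts = line.trim.split("\\s+")
    val pairs = parts.tail.map { tok =>
      val Array(i, v) = tok.split(":")
      (i.toInt - 1, v.toDouble)
    }.sortBy(_._1)
    (parts.head, pairs.map(_._1), pairs.map(_._2))
  }

  // Build a DataFrame with columns "id" and "features".
  // numFeatures must equal the vector size the model was trained on.
  def toFeatures(spark: SparkSession, path: String, numFeatures: Int): DataFrame = {
    import spark.implicits._
    spark.read.textFile(path)
      .filter(_.trim.nonEmpty)
      .map { line =>
        val (id, idx, vals) = parseLine(line)
        // Assumption: drop features the model has never seen.
        val kept = idx.zip(vals).filter { case (i, _) => i >= 0 && i < numFeatures }
        (id, Vectors.sparse(numFeatures, kept.map(_._1), kept.map(_._2)))
      }
      .toDF("id", "features")
  }
}
```

With the model from the code above, this could be used as `val newData = ScoreNewData.toFeatures(spark, "/path/to/new_data", model.numFeatures)` followed by `model.transform(newData).select("id", "prediction").show()`, where the path is a placeholder. If the unseen indices should count, an alternative is to reload the training data with the libsvm reader's "numFeatures" option set large enough and retrain.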

