Hello,
I'm reading the documentation on Multinomial Logistic Regression (
https://apache.github.io/incubator-systemml/algorithms-classification.html#usage)
for the Scala API. It says:
val model = lr.fit(X_train_df)
val prediction = model.transform(X_test_df)
The "Arguments" section below it says:
X: Location (on HDFS) to read the input matrix of feature vectors; each row
constitutes one feature vector.
Y: Location to read the input one-column matrix of category labels that
correspond to feature vectors in X. Note the following:...
The explanation of the arguments seems to correspond to the Hadoop and Spark
API rather than to the Scala API.
Could someone please advise what the specifications of `X_train_df` and
`X_test_df` are? Are they the same as specified in the Python API? i.e.:
# X_train, y_train and X_test can be NumPy matrices or Pandas
# DataFrame or SciPy Sparse Matrix
y_test = logistic.fit(X_train, y_train).predict(X_test)
# df_train is DataFrame that contains two columns: "features" (of type
# Vector) and "label". df_test is a DataFrame that contains the column
# "features"
The explanation of the arguments for Python/Scala seems to be missing for other
algorithms, too.
Thanks a lot,
Ethan