spark-issues mailing list archives

From "LoicH (Jira)" <j...@apache.org>
Subject [jira] [Created] (SPARK-32146) ValueError when loading a PipelineModel on a personal computer
Date Wed, 01 Jul 2020 08:26:00 GMT
LoicH created SPARK-32146:
-----------------------------

             Summary: ValueError when loading a PipelineModel on a personal computer
                 Key: SPARK-32146
                 URL: https://issues.apache.org/jira/browse/SPARK-32146
             Project: Spark
          Issue Type: Bug
          Components: ML
    Affects Versions: 2.4.5
         Environment: * OS: Windows
 * SparkSession: spark = SparkSession.builder.appName("annonces_organiques").getOrCreate()
            Reporter: LoicH


I have a PipelineModel saved on my computer that I can't load using {{PipelineModel.load(path)}}.

When I run my code on a Databricks cluster, it works. There, {{path}} is the path to my model
saved on DBFS, accessible via a mount point: {{path = "/dbfs/path/to/my/model"}}.

However, on my machine, calling {{PipelineModel.load("C:\\Users\\path\\to\\my\\model")}} throws
a {{ValueError("RDD is empty")}}.
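
One variation that may be worth trying is to pass the local path as an explicit {{file:///}} URI instead of a bare Windows path. The sketch below only builds such a URI with the standard library; the helper name {{to_file_uri}} is hypothetical, and the report does not establish whether this changes the outcome.

```python
from pathlib import PureWindowsPath


def to_file_uri(windows_path):
    """Convert an absolute Windows path into a file:/// URI string."""
    # PureWindowsPath parses Windows-style paths on any operating system.
    return PureWindowsPath(windows_path).as_uri()


# The path is the placeholder from the report, not a real location.
model_uri = to_file_uri(r"C:\Users\path\to\my\model")
print(model_uri)

# Untested assumption: the URI could then be passed to the loader, e.g.
#   from pyspark.ml import PipelineModel
#   model = PipelineModel.load(model_uri)
```

If the error persists with the URI form, the path syntax is probably not the cause.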

Here is how the model is saved on my computer:

{{\---model
    +---metadata
    |       part-00000
    |       _SUCCESS
    |
    \---stages
        +---0_CountVectorizer_b92625354bf7
        |   +---data
        |   |       part-00000-tid-9156766819779394023-5cf6aecb-8959-48b3-be24-65bfa0543465-62-1-c000.snappy.parquet
        |   |       _committed_9156766819779394023
        |   |       _started_9156766819779394023
        |   |       _SUCCESS
        |   |
        |   \---metadata
        |           part-00000
        |           _SUCCESS
        |
        \---1_LinearSVC_108fa01daf43
            +---data
            |       part-00000-tid-4403060754466700849-27841dd9-de88-4015-9dfa-7854c2a15f15-65-1-c000.snappy.parquet
            |       _committed_4403060754466700849
            |       _started_4403060754466700849
            |       _SUCCESS
            |
            \---metadata
                    part-00000
                    _SUCCESS}}

(I just downloaded the model from my DataLake to my computer)
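
Since the model was copied by hand, a possible first check is whether every {{metadata}} directory in the layout above still contains a non-empty {{part-*}} file. This is a minimal sketch that assumes the error comes from reading a metadata file that is missing or empty after the download, which the report does not confirm. The function names {{check_metadata}} and {{find_problems}} are hypothetical.

```python
import os


def check_metadata(model_dir):
    """Map each 'metadata' directory under model_dir to its part files and sizes."""
    report = {}
    for root, _dirs, files in os.walk(model_dir):
        if os.path.basename(root) == "metadata":
            parts = sorted(f for f in files if f.startswith("part-"))
            report[root] = [
                (name, os.path.getsize(os.path.join(root, name))) for name in parts
            ]
    return report


def find_problems(report):
    """Return metadata directories that have no part file or only empty ones."""
    problems = []
    for directory, parts in report.items():
        if not parts or all(size == 0 for _name, size in parts):
            problems.append(directory)
    return sorted(problems)


if __name__ == "__main__":
    # Placeholder path taken from the report; replace with the real location.
    report = check_metadata(r"C:\Users\path\to\my\model")
    if not report:
        print("no metadata directories found")
    for directory in find_problems(report):
        print("missing or empty metadata part file in:", directory)
```

An empty result from {{find_problems}} would suggest the metadata files survived the download and the cause lies elsewhere.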

How can I load this model when running my code locally?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

