beam-commits mailing list archives

From "Akash Patel (JIRA)" <j...@apache.org>
Subject [jira] [Created] (BEAM-3403) Ingesting json file ValidationError: Expected type <type 'unicode'>
Date Wed, 03 Jan 2018 09:05:00 GMT
Akash Patel created BEAM-3403:
---------------------------------

             Summary: Ingesting json file ValidationError: Expected type <type 'unicode'>
                 Key: BEAM-3403
                 URL: https://issues.apache.org/jira/browse/BEAM-3403
             Project: Beam
          Issue Type: Bug
          Components: sdk-py-core
    Affects Versions: 2.2.0
            Reporter: Akash Patel
            Assignee: Ahmet Altay


Reading a JSON file from a GCS file pattern with Beam Python SDK 2.2.0 on Dataflow yields the
following warning:

{code:bash}
Retry with exponential backoff: waiting for 4.21317187833 seconds before retrying report_completion_status because we caught exception: ValidationError: Expected type <type 'unicode'> for field name, found s05-s34-reify20-process-msecs (type <class 'apache_beam.utils.counters.CounterName'>)
Traceback for above exception (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/utils/retry.py", line 175, in wrapper
    return fun(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 491, in report_completion_status
    exception_details=exception_details)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 299, in report_status
    work_executor=self._work_executor)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/workerapiclient.py", line 316, in report_status
    append_counter(work_item_status, counter, tentative=not completed)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/workerapiclient.py", line 43, in append_counter
    status_object, counter.name, kind, counter.accumulator, setter)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/workerapiclient.py", line 95, in append_counter_update
    add_unstructured_name_and_kind(metric_update, metric_name, kind)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/workerapiclient.py", line 63, in add_unstructured_name_and_kind
    metric_update.nameAndKind.name = metric_name
  File "/usr/local/lib/python2.7/dist-packages/apitools/base/protorpclite/messages.py", line 973, in __setattr__
    object.__setattr__(self, name, value)
  File "/usr/local/lib/python2.7/dist-packages/apitools/base/protorpclite/messages.py", line 1299, in __set__
    value = self.validate(value)
  File "/usr/local/lib/python2.7/dist-packages/apitools/base/protorpclite/messages.py", line 1406, in validate
    return self.__validate(value, self.validate_element)
  File "/usr/local/lib/python2.7/dist-packages/apitools/base/protorpclite/messages.py", line 1364, in __validate
    return validate_element(value)
  File "/usr/local/lib/python2.7/dist-packages/apitools/base/protorpclite/messages.py", line 1549, in validate_element
    return super(StringField, self).validate_element(value)
  File "/usr/local/lib/python2.7/dist-packages/apitools/base/protorpclite/messages.py", line 1346, in validate_element
    (self.type, name, value, type(value)))
{code}

The job does not fail; instead, it gets stuck trying to read the file. The warning above
is logged on every retry of the read.

However, running the job with Beam Python SDK 2.1.1 works fine.
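
For illustration, the type mismatch shown in the traceback can be reproduced in isolation. The sketch below is written for Python 3 and uses neither the real apitools nor the real Beam classes: StringField, NameAndKind, CounterName, and set_name are simplified, hypothetical stand-ins that only mimic the shape of the failure (a structured counter-name object assigned to a field that accepts only text). Whether coercing the name with str() is the appropriate fix in the SDK or the worker is an assumption, not something this report confirms.

```python
class ValidationError(Exception):
    """Raised when a value has the wrong type for a field."""


class StringField(object):
    """Hypothetical stand-in for a type-checked string field on a message."""

    def __init__(self, name):
        self.name = name

    def __set__(self, instance, value):
        # Reject anything that is not already a text string.
        if not isinstance(value, str):
            raise ValidationError(
                "Expected type %s for field %s, found %s (type %s)"
                % (str, self.name, value, type(value)))
        instance.__dict__[self.name] = value

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return instance.__dict__.get(self.name)


class NameAndKind(object):
    """Hypothetical message with a single validated string field."""
    name = StringField("name")


class CounterName(object):
    """Hypothetical structured counter name that merely prints like a string."""

    def __init__(self, step, metric):
        self.step = step
        self.metric = metric

    def __str__(self):
        return "%s-%s" % (self.step, self.metric)


def set_name(target, counter_name, coerce):
    """Assign the counter name, optionally coercing it to text first."""
    target.name = str(counter_name) if coerce else counter_name
    return target.name
```

Calling set_name(NameAndKind(), CounterName("s05-s34-reify20", "process-msecs"), coerce=False) raises ValidationError in this sketch, while the same call with coerce=True stores the plain text name. The object prints like a string in the log message, which is why the reported value looks valid even though its type is rejected.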




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
