airflow-dev mailing list archives

From Ash Berlin-Taylor <>
Subject Re: Operators that poll vs Sensors
Date Tue, 05 Sep 2017 15:35:10 GMT
The primary difference between those cases and the other Sensors is that the sensors I've seen
(EMR Job Flow, S3 Key) don't do anything _other_ than the sensing task, whereas the tasks
you linked to also perform some other action; it's just that they wait until that operation
is complete before returning.

Additionally, my understanding is that Sensors are just an API/Python class-level convention
that makes no difference to the scheduler; i.e. this is roughly what the BaseSensorOperator class does:

def execute(self, context):
  started_at =
  while not self.poke(context):
    if ( - started_at).total_seconds() > self.timeout:
      if self.soft_fail:
        raise AirflowSkipException('Snap. Time is OUT.')
      else:
        raise AirflowSensorTimeout('Snap. Time is OUT.')
    sleep(self.poke_interval)"Success criteria met. Exiting.")

i.e. there's not much difference in effect between an operator that loops and sleeps itself
and one that is a Sensor.
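To make that equivalence concrete, here is a minimal standalone sketch. None of these classes are real Airflow classes (FakeClient, ApiReadySensor and ApiPollOperator are illustrative names only); the point is just that a Sensor's poke() loop and an operator that polls inline do the same thing at runtime:

```python
import time


class FakeClient:
    """Stand-in for an external API: reports 'done' after a few checks."""
    def __init__(self, done_after=3):
        self.calls = 0
        self.done_after = done_after

    def is_done(self):
        self.calls += 1
        return self.calls >= self.done_after


class ApiReadySensor:
    """Sensor style: implement only poke(); the base class supplies the loop."""
    def __init__(self, client, poke_interval=0.01):
        self.client = client
        self.poke_interval = poke_interval

    def poke(self, context):
        return self.client.is_done()

    def execute(self, context):
        # Mirrors the BaseSensorOperator loop shown above (timeout omitted).
        while not self.poke(context):
            time.sleep(self.poke_interval)


class ApiPollOperator:
    """Operator style: submit the work, then poll inline until complete."""
    def __init__(self, client, poke_interval=0.01):
        self.client = client
        self.poke_interval = poke_interval

    def execute(self, context):
        # ... submit the job here, then block until it finishes ...
        while not self.client.is_done():
            time.sleep(self.poke_interval)


if __name__ == "__main__":
    for cls in (ApiReadySensor, ApiPollOperator):
        client = FakeClient(done_after=3)
        cls(client).execute(context={})
        print(cls.__name__, "finished after", client.calls, "checks")
```

Either way the scheduler sees a single long-running task occupying a worker slot for the whole wait; the Sensor form mainly buys the shared timeout/soft_fail handling.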


> On 5 Sep 2017, at 16:14, Richard Baron Penman <> wrote:
> Hello,
> I noticed some operators in contrib (ECS, databricks, dataproc) submit
> their task and then poll until complete:
> Would they be better designed as Sensors?
> I ask because I wrote a Sensor for an API and am wondering whether there was
> an advantage to the Operator polling approach.
> Richard
