When Airflow executes a task, it writes the task's output to a log file. If Airflow cannot find or connect to that log file, it fails the task. Although the batch-ingest Airflow application always provides Airflow's scheduler with a log file, the scheduler may still fail to find it.

Diagnosis

Occasionally, an Airflow task will fail and Airflow will claim that the log file for that task does not exist:

Log file missing
*** Log file does not exist: opt/airflow/logs/ingest/sis_academic_minor.sis__academic_minor__create/2021-04-19T00:00:00:00/2.log
...
Invalid URL 'http://:8793/log/ingest/sis_academic_minor.sis__academic_minor__create/2021-04-19T00:00:00:00/2.log': No host supplied
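
The path in the message identifies the affected task. The sketch below pulls the identifiers out of a "Log file does not exist" line. It assumes that batch-ingest uses Airflow's default log layout (dag_id/task_id/execution_date/try_number.log); the names MissingLog and parse_missing_log are hypothetical helpers for illustration, not part of Airflow.

```python
from typing import NamedTuple, Optional


class MissingLog(NamedTuple):
    """Identifiers recovered from a 'Log file does not exist' message."""
    dag_id: str
    task_id: str
    execution_date: str  # kept as the raw string from the log path
    try_number: int


def parse_missing_log(line: str) -> Optional[MissingLog]:
    """Return the task identifiers in `line`, or None if it is not a match.

    Assumption: the last four path components are
    dag_id/task_id/execution_date/try_number.log (Airflow's default layout).
    """
    marker = "Log file does not exist:"
    if marker not in line:
        return None
    path = line.split(marker, 1)[1].strip()
    parts = path.split("/")
    if len(parts) < 4 or not parts[-1].endswith(".log"):
        return None
    dag_id, task_id, execution_date, filename = parts[-4:]
    try:
        try_number = int(filename[: -len(".log")])
    except ValueError:
        return None
    return MissingLog(dag_id, task_id, execution_date, try_number)


if __name__ == "__main__":
    message = (
        "*** Log file does not exist: opt/airflow/logs/ingest/"
        "sis_academic_minor.sis__academic_minor__create/"
        "2021-04-19T00:00:00:00/2.log"
    )
    print(parse_missing_log(message))
```

For the message shown above, this yields the DAG "ingest", the task "sis_academic_minor.sis__academic_minor__create", and try number 2. The execution date is returned exactly as it appears in the path.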

Root cause

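The problem is a known bug in Airflow, and we are waiting for the Airflow team to fix it. The empty host in the URL above ('http://:8793/...') indicates that Airflow had no worker hostname recorded for the task instance when it tried to fetch the log from the worker.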

Solution

At present, the only way to resolve this problem is to clear the failed task and its downstream tasks. Clearing resets their state, which tells the Airflow scheduler to execute them again.
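
Clearing can be done from the Airflow UI or from the command line. As a minimal sketch, the helper below assembles the command for clearing one task and everything downstream of it. It assumes the Airflow 2.x CLI ("airflow tasks clear" with the --task-regex, --start-date, --end-date, --downstream and --yes options); on Airflow 1.10 the subcommand is "airflow clear" and the long options are spelled with underscores. The function name build_clear_command and the date values are illustrative only.

```python
import re
import shlex
from typing import List


def build_clear_command(
    dag_id: str, task_id: str, start_date: str, end_date: str
) -> List[str]:
    """Build an Airflow 2.x CLI command that clears one task and its
    downstream tasks for DAG runs between start_date and end_date.

    Assumption: Airflow 2.x option names. The dates are passed through
    unchanged, so they must be in a form the Airflow CLI accepts.
    """
    # Anchor and escape the task id so the regex matches only this task.
    task_regex = "^" + re.escape(task_id) + "$"
    return [
        "airflow", "tasks", "clear",
        "--task-regex", task_regex,
        "--start-date", start_date,
        "--end-date", end_date,
        "--downstream",  # also clear tasks that depend on this one
        "--yes",         # skip the interactive confirmation prompt
        dag_id,
    ]


if __name__ == "__main__":
    command = build_clear_command(
        dag_id="ingest",
        task_id="sis_academic_minor.sis__academic_minor__create",
        start_date="2021-04-19",
        end_date="2021-04-19",
    )
    # Print a shell-safe command line for review; nothing is executed here.
    print(" ".join(shlex.quote(part) for part in command))
```

The sketch only prints the command so that it can be reviewed before being run on a host where the Airflow CLI is configured. Because --yes suppresses the confirmation prompt, check the DAG id, task id, and date range first.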


