The context data ETL database used by the batch-ingest application can run out of memory if its instance is provisioned with too little RAM.
Diagnosis
An Airflow task in the batch-ingest application occasionally fails with the following error:
psycopg2.OperationalError: SSL SYSCALL error: EOF detected
You may also see:
psycopg2.OperationalError: FATAL: the database system is in recovery mode FATAL: the database system is in recovery mode
Root cause
These errors occur when the database that the batch-ingest application uses to transform UDP data runs out of memory and goes into recovery mode. While the database is recovering, the batch-ingest application cannot reach it, so the task fails.
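As a rough illustration of how the two symptoms above could be recognized in a task log, here is a minimal Python sketch. The function name looks_like_db_oom is hypothetical and not part of the batch-ingest application. Matching one of these strings only suggests the memory problem described here; the same messages can appear whenever the database connection drops or the server restarts for another reason.

```python
# Substrings copied from the two error messages quoted in the Diagnosis section.
OOM_SYMPTOMS = (
    "SSL SYSCALL error: EOF detected",
    "the database system is in recovery mode",
)


def looks_like_db_oom(error_message: str) -> bool:
    """Return True if the message contains one of the known symptoms.

    Hypothetical helper: a match is a hint, not proof, that the
    database ran out of memory.
    """
    return any(symptom in error_message for symptom in OOM_SYMPTOMS)


# Example: classify the text of an exception raised by a failed task.
message = "psycopg2.OperationalError: SSL SYSCALL error: EOF detected"
print(looks_like_db_oom(message))
```

If the check matches, confirm the cause by looking at the memory usage of the database instance before resizing it.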
Solution
The solution has two parts. First, increase the memory available to the database instance. Second, re-run the failed tasks by clearing them.
1. Increase RAM available to the database
Edit the Cloud SQL instance settings in the Google Cloud console and increase the RAM allocated to the instance.
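If you prefer the command line to the console, the same change can likely be made with gcloud sql instances patch. The Python sketch below only builds and prints that command; it does not run it. The instance name batch-ingest-etl-db and the sizes are placeholders, not values from your deployment, and the sketch assumes an instance that uses a custom machine type, where --cpu and --memory are set together. Changing the machine size normally restarts the instance, so plan for a short outage.

```python
import shlex


def build_patch_command(instance: str, cpu: int, memory_gib: int) -> list[str]:
    """Build a gcloud command that resizes a Cloud SQL instance.

    Hypothetical helper. Assumes a custom machine type, for which
    gcloud expects --cpu and --memory to be given together.
    """
    return [
        "gcloud", "sql", "instances", "patch", instance,
        f"--cpu={cpu}",
        f"--memory={memory_gib}GiB",
    ]


# Placeholder instance name and sizes; substitute your own.
cmd = build_patch_command("batch-ingest-etl-db", cpu=4, memory_gib=16)
print(shlex.join(cmd))
```

Review the printed command against your instance's current settings before running it in a shell.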
2. Clear any failed tasks
Once the Cloud SQL instance has enough RAM, clear the failed tasks in Airflow so that they run again.
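Failed tasks can be cleared from the Airflow UI or from the Airflow command line. The Python sketch below builds an airflow tasks clear command restricted to failed task instances in a date range; it assumes the Airflow 2.x CLI. The DAG id batch_ingest and the dates are placeholders, since this article does not name the DAG.

```python
import shlex


def build_clear_command(dag_id: str, start_date: str, end_date: str) -> list[str]:
    """Build an Airflow 2.x command that clears only failed task instances.

    Hypothetical helper; the DAG id and dates are supplied by the caller.
    """
    return [
        "airflow", "tasks", "clear", dag_id,
        "--only-failed",             # leave successful task instances alone
        "--start-date", start_date,
        "--end-date", end_date,
        "--yes",                     # skip the interactive confirmation
    ]


# Placeholder DAG id and date range; substitute your own.
cmd = build_clear_command("batch_ingest", "2024-01-01", "2024-01-02")
print(shlex.join(cmd))
```

Cleared task instances are picked up again by the scheduler, so no separate trigger should be needed.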