Summary
The main indication of the resolution was following log in MongoDB Telemetry primary replica
Impact
Trigger
PagerDuty - Grafana Alerts of MongoDB Telemetry 0, Replication lag
Detection
20:00 - immediately after the incident started
Root Causes
Remediation
Change the settings of MongoDB to HA. Reduce requirements on consistency on the MongoDB Telemetry.