DNS Update Race: Delays Allow Old Plan to Undo Recent Changes

A recent incident involving a race condition among DNS Enactors has highlighted potential vulnerabilities in the system’s update process. Typically, DNS Enactors efficiently update DNS states by applying the most current plans to service endpoints. This process usually runs smoothly, but the recent event demonstrated how delays in one Enactor can interact adversely with others.

In this instance, one DNS Enactor was facing significant delays while attempting to apply updates, causing it to repeatedly retry its transactions on several endpoints. Concurrently, the DNS Planner continued to produce newer plans, and another DNS Enactor began applying one of these plans swiftly across all endpoints. The overlap in timing between these operations exposed the race condition.

As the second Enactor finished updating its endpoints, it triggered a clean-up process designed to remove outdated plans. While this was occurring, the delayed Enactor applied its older plan to the regional DDB endpoint, overwriting the newer one. The check that ensures a plan is newer than any previously applied plan is made at the start of plan application. Because of the delays, its result was stale by the time the write actually happened, so nothing stopped the outdated plan from replacing the latest one.

The clean-up process then deleted the older plan, which by this point was the active plan for the regional endpoint. That deletion immediately removed all IP addresses associated with the endpoint. With its active plan gone, the system was left in an inconsistent state, and manual intervention was ultimately required to restore normal operations.
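The sequence described above can be replayed deterministically in a few lines. This is only an illustrative sketch, not the actual Enactor code: the `Endpoint` class, the `is_newer` and `resolve` helpers, the plan numbers, and the IP addresses (taken from the documentation range) are all hypothetical.

```python
class Endpoint:
    """Hypothetical stand-in for the DNS state of one endpoint."""

    def __init__(self):
        self.active_plan = 0  # generation number of the plan currently applied


def is_newer(plan_id, endpoint):
    """The freshness check: is this plan newer than the one already applied?"""
    return plan_id > endpoint.active_plan


def resolve(endpoint, plans):
    """IP addresses served for the endpoint; empty if its active plan is gone."""
    return plans.get(endpoint.active_plan, [])


def replay_race():
    # Plan 1 is the old plan, plan 2 is the newer one (made-up data).
    plans = {1: ["192.0.2.1"], 2: ["192.0.2.2", "192.0.2.3"]}
    endpoint = Endpoint()

    # 1. The slow Enactor runs its check for plan 1, then stalls.
    slow_check_passed = is_newer(1, endpoint)

    # 2. The fast Enactor checks and applies plan 2 in the meantime.
    if is_newer(2, endpoint):
        endpoint.active_plan = 2
    ips_after_fast = resolve(endpoint, plans)

    # 3. The slow Enactor resumes and acts on its stale check result.
    if slow_check_passed:
        endpoint.active_plan = 1

    # 4. The fast Enactor's clean-up deletes every plan older than plan 2,
    #    without noticing that plan 1 has just become active again.
    for plan_id in [p for p in plans if p < 2]:
        del plans[plan_id]

    return ips_after_fast, endpoint.active_plan, resolve(endpoint, plans)
```

In this sketch the endpoint resolves correctly after step 2, but ends up pointing at a deleted plan after step 4, so `resolve` returns an empty list of addresses.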

This incident is a reminder of how much concurrent systems depend on robust error handling and synchronization when independent components update shared state. A freshness check that is separated in time from the write it guards can go stale, and clean-up logic that does not confirm a plan is no longer in use can delete live state.
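One general way to close this kind of check-then-act gap is to make the freshness check and the write a single atomic step, and to have clean-up skip whatever plan is currently active. The sketch below illustrates that idea under stated assumptions and is not a description of any fix actually deployed: `GuardedEndpoint` and its methods are hypothetical, and the in-process `threading.Lock` merely stands in for whatever conditional write or transaction a real distributed store would provide.

```python
import threading


class GuardedEndpoint:
    """Hypothetical endpoint whose check and write happen atomically."""

    def __init__(self):
        self._lock = threading.Lock()
        self.active_plan = 0  # generation number of the plan currently applied

    def apply_if_newer(self, plan_id):
        """Apply plan_id only if it is still newer at the moment of the write."""
        with self._lock:
            if plan_id <= self.active_plan:
                return False  # rejected: a newer plan is already in place
            self.active_plan = plan_id
            return True

    def cleanup(self, plans, older_than):
        """Delete plans older than older_than, but never the active plan."""
        with self._lock:
            stale = [p for p in plans
                     if p < older_than and p != self.active_plan]
            for plan_id in stale:
                del plans[plan_id]
            return stale


# Usage: the delayed Enactor's late write of plan 1 is rejected.
plans = {1: ["192.0.2.1"], 2: ["192.0.2.2"]}
endpoint = GuardedEndpoint()
endpoint.apply_if_newer(2)            # fast Enactor applies plan 2
late_write = endpoint.apply_if_newer(1)  # slow Enactor arrives afterwards
deleted = endpoint.cleanup(plans, older_than=2)
```

Because the comparison happens under the same lock as the assignment, a delayed caller cannot act on a result that has gone stale in between, and clean-up can only remove plans the endpoint is not using.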
