On February 20, 2026, Cloudflare, a leading web infrastructure and security company, experienced a significant six-hour global service outage. The disruption hit customers using its Bring Your Own IP (BYOIP) services especially hard, rendering numerous websites and applications inaccessible.
Incident Overview
The outage commenced at 17:48 UTC and persisted for over six hours, during which Cloudflare unintentionally withdrew customer Border Gateway Protocol (BGP) routes from the internet. This action affected approximately 25% of all BYOIP prefixes globally, leading to widespread service disruptions. Users encountered HTTP 403 errors when attempting to access services, including the 1.1.1.1 public recursive DNS resolver.
Root Cause Analysis
The incident was traced to an internal configuration error in Cloudflare’s Addressing API. An automated cleanup task, designed to replace the manual removal process for BYOIP prefixes, contained a bug: it issued an API query in which the `pending_delete` flag was never assigned, so the filter was effectively disabled and the query returned every BYOIP prefix rather than only those slated for removal. The system queued all of them for deletion, and approximately 1,100 BYOIP prefixes and their associated service bindings were deleted before the process was manually halted.
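To make the failure mode concrete, here is a minimal Python sketch (not Cloudflare’s actual Addressing API code) of how a cleanup query whose `pending_delete` flag is never assigned can fall back to "no filter" and return every prefix. All names in the sketch are illustrative.

```python
# Illustrative sketch only -- not Cloudflare's actual Addressing API code.
# It shows how an optional filter that is never assigned can silently
# select every record instead of the intended subset.
from dataclasses import dataclass
from typing import Optional

@dataclass
class ByoipPrefix:
    cidr: str
    pending_delete: bool  # True only for prefixes a customer asked to remove

def select_prefixes_for_cleanup(prefixes, pending_delete: Optional[bool] = None):
    """Return prefixes matching the filter; None means 'no filter'."""
    if pending_delete is None:
        # Bug pattern: an unassigned flag disables the filter entirely,
        # so *every* prefix is returned and later queued for deletion.
        return list(prefixes)
    return [p for p in prefixes if p.pending_delete == pending_delete]

prefixes = [
    ByoipPrefix("198.51.100.0/24", pending_delete=False),
    ByoipPrefix("203.0.113.0/24", pending_delete=True),
]

# The cleanup task intended to pass pending_delete=True but never set it:
to_delete = select_prefixes_for_cleanup(prefixes)        # returns both prefixes
intended  = select_prefixes_for_cleanup(prefixes, True)   # returns only the flagged one
print([p.cidr for p in to_delete], [p.cidr for p in intended])
```

Running the sketch shows both prefixes selected when the flag is omitted, mirroring how the cleanup task swept up prefixes that were never marked for removal.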
Impact on Services
The outage had a cascading effect on multiple core products and services:
– Core CDN and Security Services: Traffic failed to route through Cloudflare, resulting in connection timeouts for numerous websites.
– Spectrum: Applications operating on BYOIP were unable to proxy traffic, leading to complete service failures.
– Dedicated Egress: Users leveraging BYOIP or Dedicated IPs could not send outbound traffic to their destinations.
– Magic Transit: End users connecting to protected applications experienced complete connection failures and timeouts.
Recovery Efforts
Restoration efforts were complex due to the varied impact on customer prefixes. While some users could self-remediate by re-advertising their prefixes via the Cloudflare dashboard, approximately 300 prefixes required manual restoration. Engineers had to push global configuration updates to reapply settings across the edge network, prolonging the recovery process.
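For customers who could self-remediate, the fix amounted to asking Cloudflare to advertise the prefix again. The snippet below is a minimal sketch of what such a re-advertisement call might look like; the endpoint path, payload, and environment variable names are assumptions for illustration, and the current Cloudflare Addressing API documentation should be treated as authoritative.

```python
# Minimal sketch of re-advertising a BYOIP prefix via an API call. The endpoint
# path and payload below are assumptions for illustration; consult the current
# Cloudflare Addressing API documentation before relying on them.
import os
import requests

ACCOUNT_ID = os.environ["CF_ACCOUNT_ID"]  # assumed environment variables
PREFIX_ID = os.environ["CF_PREFIX_ID"]
API_TOKEN = os.environ["CF_API_TOKEN"]

url = (
    "https://api.cloudflare.com/client/v4/"
    f"accounts/{ACCOUNT_ID}/addressing/prefixes/{PREFIX_ID}/bgp/status"
)
resp = requests.patch(
    url,
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    json={"advertised": True},  # request that the prefix be advertised again
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```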
Preventive Measures
In response to the incident, Cloudflare is implementing several critical architectural changes under its Code Orange resilience initiative:
– Standardizing API Schema: To ensure unset or ambiguous flags cannot be misinterpreted, preventing the kind of flag-handling error that led to the unintended deletion of prefixes.
– Implementing Circuit Breakers: To detect and halt abnormally fast BGP prefix deletions, mitigating potential widespread disruptions (a simplified sketch follows this list).
– Establishing Health-Mediated Operational State Snapshots: To separate customer configurations from production rollouts, ensuring stability during updates.
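As a rough illustration of the circuit-breaker idea, the following sketch trips a breaker when prefix deletions exceed a threshold within a sliding time window. The threshold, window, and class names are assumptions; Cloudflare has not published its implementation.

```python
# Illustrative sketch of a rate-based circuit breaker, assuming a simple
# sliding-window threshold; Cloudflare's actual implementation is not public.
import time
from collections import deque

class DeletionCircuitBreaker:
    """Trips when more than `max_deletions` occur within `window_seconds`."""

    def __init__(self, max_deletions: int = 10, window_seconds: float = 60.0):
        self.max_deletions = max_deletions
        self.window_seconds = window_seconds
        self.events: deque[float] = deque()
        self.tripped = False

    def record_deletion(self) -> bool:
        """Record one prefix deletion; return False if the breaker has tripped."""
        now = time.monotonic()
        self.events.append(now)
        # Drop events that fell outside the sliding window.
        while self.events and now - self.events[0] > self.window_seconds:
            self.events.popleft()
        if len(self.events) > self.max_deletions:
            self.tripped = True
        return not self.tripped

breaker = DeletionCircuitBreaker(max_deletions=10, window_seconds=60.0)
for prefix in ["198.51.100.0/24"] * 12:  # simulated burst of deletions
    if not breaker.record_deletion():
        print(f"Circuit breaker tripped; halting before deleting {prefix}")
        break
```

The design choice here is deliberately coarse: any burst of deletions faster than the configured rate pauses the job for human review, trading a little automation speed for a hard cap on blast radius.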
Conclusion
This incident underscores the critical role of robust internal processes and safeguards in maintaining internet infrastructure stability. Cloudflare’s proactive steps toward enhancing system resilience aim to prevent similar occurrences in the future, ensuring reliable service for its global customer base.
Twitter Post:
Cloudflare’s 6-hour global outage on Feb 20, 2026, disrupted numerous services. Learn about the cause and the steps taken to prevent future incidents. #CloudflareOutage #InternetSecurity #BYOIP
Focus Key Phrase:
Cloudflare global service outage
Category: Security News