Characterization of Failures in an IP Backbone

Athina Markopoulou †, Gianluca Iannaccone ‡, Supratik Bhattacharyya §, Chen-Nee Chuah ¶, Christophe Diot ‡

† EE Department, Stanford Univ., CA, USA
§ Sprint ATL, Burlingame, CA, USA
¶ ECE Department, UC Davis, CA, USA
‡ Intel Research, Cambridge, UK

Abstract: We analyze IS-IS routing updates from Sprint’s IP network to characterize failures that affect IP connectivity. Failures are first classified based on probable causes such as maintenance activities, router-related problems, and optical layer problems. Key temporal and spatial characteristics of each class are analyzed and, when appropriate, parameterized using well-known distributions. Our results indicate that 20% of all failures are due to planned maintenance activities. Of the unplanned failures, almost 30% are shared by multiple links and can be attributed to router-related and optical equipment-related problems, while 70% affect a single link at a time. Our classification of failures according to different causes reveals the nature and extent of failures in today’s IP backbones. Furthermore, our characterization of the different classes can be used to develop a probabilistic failure model, which is important for various traffic engineering problems.

I. INTRODUCTION

The core of the Internet consists of several large networks (often referred to as backbones) that provide transit services to the rest of the Internet. These backbone networks are usually well-engineered and adequately provisioned, leading to very low packet losses and negligible queuing delays. This robust network design is one of the reasons why the occurrence and impact of failures in these networks have received little attention. The lack of failure data from operational networks has further limited the investigation of failures in IP backbones. However, such failures occur almost every day, and an in-depth understanding of their properties and impact is extremely valuable to Internet Service Providers (ISPs).

In this paper, we address this deficiency by analyzing failure data collected from Sprint’s operational IP backbone. The Sprint network uses an IP-level restoration approach for safeguarding against failures, with no protection mechanisms in the underlying optical fiber infrastructure. Therefore, problems with any component at or below the IP layer (e.g., router hardware/software failures, fiber cuts, malfunctioning of optical equipment, protocol misconfigurations) manifest themselves as the loss of connectivity between two directly connected routers, which we refer to as an IP link failure. IS-IS is the protocol used for routing traffic inside the Sprint network. When an IP link fails, IS-IS automatically recomputes alternate routes around the failed link, if such routes exist. The Sprint network has a highly meshed topology ...

This work was conducted when the authors were affiliated (or in collaboration) with Sprint ATL. Email addresses: A. Markopoulou ...
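The rerouting behavior described in the introduction, where IS-IS recomputes routes around a failed IP link if an alternate route exists, can be illustrated with a minimal sketch. IS-IS is a link-state protocol that runs a shortest-path-first computation over the links currently up; the code below mimics only that step with Dijkstra's algorithm. The four-router topology, the link metrics, and the names `shortest_path` and `LINKS` are hypothetical and are not taken from the Sprint network or from the paper.

```python
import heapq

def shortest_path(links, src, dst, failed=frozenset()):
    """Return (cost, path) from src to dst, or None if dst is unreachable.

    links:  dict mapping frozenset({a, b}) -> metric (bidirectional IP links).
    failed: set of frozenset({a, b}) links to treat as down.
    """
    # Build adjacency lists from the links that are still up.
    adj = {}
    for link, metric in links.items():
        if link in failed:
            continue
        a, b = tuple(link)
        adj.setdefault(a, []).append((b, metric))
        adj.setdefault(b, []).append((a, metric))

    # Dijkstra's algorithm: always expand the cheapest known node next.
    heap = [(0, src, [src])]
    visited = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nbr, metric in adj.get(node, []):
            if nbr not in visited:
                heapq.heappush(heap, (cost + metric, nbr, path + [nbr]))
    return None  # no route exists around the failure

# Hypothetical topology (assumption): two disjoint paths between A and D.
LINKS = {
    frozenset({"A", "B"}): 10,
    frozenset({"B", "D"}): 10,
    frozenset({"A", "C"}): 15,
    frozenset({"C", "D"}): 15,
}

before = shortest_path(LINKS, "A", "D")
after = shortest_path(LINKS, "A", "D", failed={frozenset({"A", "B"})})
print(before)  # route while all links are up
print(after)   # route after the A-B link fails
```

In this toy topology, failing the A-B link moves traffic from the path through B to the longer path through C. Failing both A-B and A-C leaves A isolated, so the function returns `None`, which corresponds to the "if such routes exist" caveat in the text.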
This note was uploaded on 12/08/2011 for the course CS 525 taught by Professor Gupta during the Spring '08 term at University of Illinois, Urbana Champaign.