Network Recovery

Protection and Restoration of Optical, SONET-SDH, IP, and MPLS


  • Jean-Philippe Vasseur, M.S. in Computer Science, Distinguished Engineer at Cisco Systems
  • Mario Pickavet, Ghent University, Gent, Belgium
  • Piet Demeester, Ghent University, Gent, Belgium

Network Recovery is the first book to provide detailed information on protecting and restoring communication networks, and it sets a sky-high standard for any that may follow. Inside, you’ll learn specific techniques that work at each layer of the networking hierarchy—including optical, SONET-SDH, IP, and MPLS—as well as multi-layer escalation strategies that offer the highest level of protection. The authors begin with an incisive introduction to the issues that define the field of network protection and restoration, and as the book progresses they explain everything you need to know about the relevant protocols, providing theoretical analyses wherever appropriate. If you work for a network-dependent organization, large or small, you’ll want to keep Network Recovery within reach at all times.


Intended audience: networking professionals in medium to large corporations (including government and military) and in telecom companies, such as network engineers and consultants, network managers, systems engineers, protocol designers, network architects and designers, service providers, and equipment vendors.


Book information

  • Published: July 2004
  • ISBN: 978-0-12-715051-2

"If one desires to learn as extensive and still evolving a field as network recovery, he or she will be interested in a book prepared by authors related to two somewhat separate worlds: industry and academia. ... This combination of knowledge gives an excellent overview of hot topics related to communications network resilience.... The advantage of this book is related to the fact that each chapter can be read separately, since the authors briefly repeat the most important ideas as necessary. The other benefit is that the current state of the development level of some techniques is signaled. To sum up, the book gives the reader a deep insight into 'how it works.' ... Thanks to this fact, the book can be recommended to everybody interested in network recovery, from layperson to experienced designer who would like to learn about the latest solutions."
-- IEEE Communications Magazine, Book Reviews, July 2005. Reviewer: Piotr Cholda

"This is the right book at the right time for anyone in the telecommunications business, or anyone who is dependent on the services provided by the telecommunications business that would like to understand the new Internet that is rapidly becoming the common reality."
-- From the Foreword by Scott Bradner, Senior Technical Consultant and University Technology Security Officer at Harvard University

"This book provides a welcome overview of the many techniques applied to protect and recover data paths in IP and MPLS networks as well as in SONET/SDH and optical transport networks. The analysis and case studies of MPLS Traffic Engineering recovery mechanisms will be particularly useful to operators intending to deploy MPLS protection within their networks."
-- Adrian Farrel, Old Dog Consulting & Co-Chair of the IETF CCAMP Working Group

"I recommend this book to anyone responsible for the design, implementation and management of any-sized communications network."
-- David Cooper, Global Crossing

"Writing a didactic and up-to-date book on protection and restoration techniques in new generation WDM networks is a real challenge. This requires two distinct fields of expertise in new generation IP/MPLS networks and in advanced optical transmission systems respectively. For the first time, such specialists have joined their efforts to make a book that gives readers a unique opportunity for understanding the details behind the main characteristics and challenges of multi-layer survivability."
-- Maurice Gagnaire, ENST, École Nationale Supérieure des Télécommunications

Table of Contents

Chapter 1: Introduction
  1.1 Communications networks today
    1.1.1 Fundamental networking concepts
    1.1.2 Layered network representation
    1.1.3 Network planes
  1.2 Network reliability
    1.2.1 Definitions
    1.2.2 Which failures can occur?
    1.2.3 Reliability requirements for various users and services
    1.2.4 Measures to increase reliability
  1.3 Different phases in a recovery process
    1.3.1 Recovery cycle
    1.3.2 Reversion cycle
  1.4 Performance of recovery mechanisms: criteria
    1.4.1 Scope of failure coverage
    1.4.2 Recovery time
    1.4.3 Backup capacity requirements
    1.4.4 Guaranteed bandwidth
    1.4.5 Reordering and duplication
    1.4.6 Additive latency and jitter
    1.4.7 State overhead
    1.4.8 Scalability
    1.4.9 Signaling requirements
    1.4.10 Stability
    1.4.11 Notion of recovery class
  1.5 Classification of single-layer recovery mechanisms
    1.5.1 Backup capacity: dedicated versus shared
    1.5.2 Recovery paths: pre-planned versus dynamic
    1.5.3 Protection versus restoration
    1.5.4 Global versus local recovery
    1.5.5 Control of recovery mechanisms
    1.5.6 Ring networks versus mesh networks
    1.5.7 Connection-oriented versus connectionless
    1.5.8 Revertive versus non-revertive mode
  1.6 Multi-layer recovery
  1.7 Conclusion

Chapter 2: SONET-SDH
  2.1 Introduction: transmission networks
    2.1.1 Transmission Networks
    2.1.2 Management of (Transmission) Networks
    2.1.3 Structuring/Modeling Transmission Networks
    2.1.4 Summarizing conclusions
  2.2 SDH and SONET Networks
    2.2.1 Introduction
    2.2.2 Structure of SDH networks
    2.2.3 SDH frame structure: overhead bytes relevant for network recovery
    2.2.4 SDH Network Elements
    2.2.5 Summarizing conclusion
    2.2.6 Differences between SONET/SDH
  2.3 Operational aspects
    2.3.1 Fault management processes
    2.3.2 Fault detection and propagation inside a network element
    2.3.3 Fault propagation and notification on a network level
    2.3.4 Automatic Protection Switching (APS) protocol
  2.4 Ring protection
    2.4.1 Multiplex Section Shared Protection Ring (MS-SP Ring)
    2.4.2 Multiplex Section Dedicated Protection Ring (MS-DP Ring)
    2.4.3 Sub-Network Connection Protection Ring (SNCP Ring)
    2.4.4 Ring Interconnection
    2.4.5 Summarizing conclusions
    2.4.6 Difference between SONET and SDH
  2.5 Linear Protection
    2.5.1 Multiplex Section Protection (MSP)
    2.5.2 Path protection
    2.5.3 Summarizing conclusions
  2.6 Restoration
    2.6.1 Protection versus restoration
    2.6.2 Summarizing conclusions
  2.7 Case study
    2.7.1 Assumptions: network scenario, node configurations, and protection strategies
    2.7.3 Proposed network design and evaluation process
    2.7.4 Cost comparison for different protection strategies
    2.7.5 Summarizing conclusions
  2.8 Summary
  2.9 Recommended reference work and research-related topics

Chapter 3: Optical Networks
  3.1 Evolution of the optical network layer
    3.1.1 Wavelength Division Multiplexing in the point-to-point optical network layer
    3.1.2 An optical networking layer with optical nodes
    3.1.3 An optical network layer organized in rings
    3.1.4 Meshed optical networks
    3.1.5 Adding flexibility to the optical network layer
  3.2 The Optical Transport Network
    3.2.1 Architectural aspects and structure of the optical transport network
    3.2.2 Structure of the Optical Transport Module
    3.2.3 Overview of the standardization work on the Optical Transport Network
  3.3 Fault detection and propagation
    3.3.1 The optical network overhead
    3.3.2 Defects in the optical transport network
    3.3.3 OTN maintenance signals and alarm suppression
  3.4 Recovery in optical networks
    3.4.1 Recovery at the optical layer?
    3.4.2 Standardization work on recovery in the optical transport network
    3.4.3 Shared Risk Group
  3.5 Recovery mechanisms in ring-based optical networks
    3.5.1 Multiplex Section Protection in ring-based optical networks
    3.5.2 Optical channel protection in ring-based optical networks
    3.5.3 OMS versus OCh based approach
    3.5.4 Shared versus dedicated approach
    3.5.5 Interconnection of rings
  3.6 Recovery mechanisms in mesh-based optical networks
    3.6.1 Protection versus restoration
  3.7 Ring-based versus mesh-based recovery schemes
  3.8 Availability
    3.8.1 Availability calculations
    3.8.2 Availability: some observations
  3.9 Some recent trends in research
    3.9.1 p-cycles
    3.9.2 Meta-mesh recovery technique
    3.9.3 Flexible optical networks
  3.10 Summary

Chapter 4: IP Routing
  4.1 IP routing protocols
    4.1.1 Introduction
    4.1.2 Distance vector routing protocol overview
    4.1.3 Link State routing protocol overview
    4.1.4 IP routing: a local versus global restoration mechanism?
  4.2 Analysis of the IP recovery cycle
    4.2.1 Fault detection and characterization
    4.2.2 Hold-off timer
    4.2.3 Fault notification time
    4.2.4 Computation of the routing table
    4.2.5 An example of IP rerouting upon link failure
  4.3 Failure profile and fault detection
    4.3.1 Failure profiles
    4.3.2 Failure detection
    4.3.3 Failure characterization
    4.3.4 Analysis of the various failure types and their impact on traffic
  4.4 Dampening algorithms
  4.5 FIS propagation (LSA origination and flooding)
    4.5.1 LSA origination process
    4.5.2 LSA flooding process
    4.5.3 Time estimate for the LSA origination and flooding process
  4.6 Route computation
    4.6.1 Shortest path computation
    4.6.2 The Dijkstra algorithm
    4.6.3 Shortest path computation triggers
    4.6.4 Routing Information Base (RIB) update
  4.7 Temporary loops during network states changes
    4.7.1 Temporary loops in the case of a node or a link failure
    4.7.2 Temporary loops caused by a restored network element
  4.8 Load balancing
  4.9 QOS guarantees during failure
  4.10 Non Stop Forwarding: an example with OSPF
  4.11 A case study with IS-IS
  4.12 Summary
  4.13 Algorithm complexity
  4.14 Incremental SPF
  4.15 Interaction between fast IGP convergence and NSF
  4.16 Research related topics

Chapter 5: MPLS Traffic Engineering
  5.1 MPLS Traffic Engineering refresher
    5.1.1 Traffic Engineering in data networks
    5.1.2 Terminology
    5.1.3 MPLS Traffic Engineering components
    5.1.4 Notion of preemption in MPLS Traffic Engineering
    5.1.5 Motivations for deploying MPLS Traffic Engineering
  5.2 Analysis of the recovery cycle
    5.2.1 Fault detection time
    5.2.2 Hold-off timer
    5.2.3 Fault notification time
    5.2.4 Recovery operation time
    5.2.5 Traffic recovery time
  5.3 MPLS Traffic Engineering global default restoration
    5.3.1 Fault Signal Indication
    5.3.2 Mode of Operation
    5.3.3 Recovery Time
  5.4 MPLS Traffic engineering global path protection
    5.4.1 Mode of operation
    5.4.2 Recovery time
  5.5 MPLS Traffic Engineering local protection
    5.5.1 Terminology
    5.5.2 Principles of local protection recovery techniques
    5.5.3 Local Protection-"One to one backup"
    5.5.4 Local Protection-"Facility backup"
    5.5.5 Properties of a Traffic Engineering LSP
    5.5.6 Notification of "Tunnel locally repaired"
    5.5.7 Signaling extensions for MPLS Traffic Engineering local protection
    5.5.8 Two strategies for deploying MPLS Traffic Engineering for fast recovery
  5.6 Another MPLS Traffic Engineering recovery alternative
  5.7 Load balancing
  5.8 Comparison of global protection and local protection
    5.8.1 Recovery time
    5.8.2 Scalability
    5.8.3 Bandwidth sharing capability
    5.8.4 Summary
  5.9 Revertive versus non revertive modes
    5.9.1 MPLS Traffic Engineering default global restoration
    5.9.2 MPLS Traffic Engineering global path protection
    5.9.3 MPLS Traffic Engineering Local protection
  5.10 Failure profiles and fault detection
    5.10.1 MPLS-specific failure detection hello based protocol
    5.10.2 Requirements for an accurate failure type characterization
    5.10.3 Analysis of the various failure types and their impact on traffic forwarding
  5.11 Case Studies
    5.11.1 Case Study 1
    5.11.2 Case Study 2
    5.11.3 Case Study 3
  5.12 Standardization
  5.13 Summary
  5.14 RSVP signaling extensions for MPLS TE local protection
    5.14.1 SESSION-ATTRIBUTE object
    5.14.2 FAST REROUTE object
    5.14.3 DETOUR object
    5.14.4 Route Record Object (RRO)
    5.14.5 Signaling a protected Traffic Engineering LSP with a set of constraints
    5.14.6 Identification of a signaled TE LSP
    5.14.7 Signaling with Facility backup
    5.14.8 Signaling with one-to-one backup
    5.14.9 Detour merging
  5.15 Backup path computation
    5.15.1 Introduction
    5.15.2 Requirements for strict QoS guarantees during failure
    5.15.3 Network design considerations
    5.15.4 Notion of bandwidth sharing between backup paths
    5.15.5 Backup path computation – MPLS TE global path protection
    5.15.6 Backup path computation – MPLS TE Fast Reroute Facility Backup
    5.15.7 Backup tunnel path computation with MPLS TE Fast Reroute One-to-One Backup
    5.15.8 Summary
  5.16 Research related topics

Chapter 6: Multi-Layer Networks
  6.1 ASON / GMPLS networks
    6.1.1 The ASON/ASTN framework
    6.1.2 Protocols for implementing a distributed control plane
    6.1.3 Overview of control plane architectures (overlay, peer, augmented)
  6.2 Generic multi-layer recovery approaches
    6.2.1 Why multi-layer recovery?
    6.2.2 Single-layer recovery schemes in multi-layer networks
    6.2.3 Static multi-layer recovery schemes
    6.2.4 Dynamic multi-layer recovery
    6.2.5 Summary
  6.3 Case studies
    6.3.1 Case study 1: Optical restoration and MPLS Traffic Engineering Fast Reroute
    6.3.2 Case study 2: SONET-SDH protection and IP routing
    6.3.3 Case study 3: MPLS Traffic Engineering Fast Reroute (Link Protection) and IP Rerouting Fast convergence
  6.4 Conclusion
  6.5 References