Carrier Ethernet deployment faces challenges, opportunities

by Ralph Santitoro

Ethernet has been adopted by service providers of all types to enable new services and simplify network infrastructure. There are, however, a number of implementation issues Carrier Ethernet must overcome before it is as ubiquitous as today’s T1, SONET/SDH, and Frame Relay services.

Standards, technologies, and products are being created and continue to evolve to meet these challenges. For now, however, service providers must take an evolutionary approach to deploying Ethernet services. They can deploy an infrastructure immediately to enjoy the benefits of Carrier Ethernet technologies but must realise that the technology is still maturing.

Today, three Carrier Ethernet fundamentals are well defined and articulated by the Metro Ethernet Forum, namely, the Ethernet user network interface (UNI), Ethernet virtual connections (EVCs), and Ethernet service definitions following a standard service definitions framework. However, there are several key areas that require further specification and standardisation, as well as service provider cooperation.

Service operations, administration, and maintenance (OAM) and the Ethernet network-to-network interface (NNI) are two key areas now undergoing further definition. These developments are absolutely essential to provide service/network fault isolation and troubleshooting, service-level agreements (SLAs), and internetworking peering agreements to offer end-to-end services with the ubiquity that TDM and Frame Relay services offer today.

Network operators use service OAM to monitor the status and performance of Ethernet services. This includes monitoring the operational status of network elements or network connections to determine the overall health of the network as well as measuring service performance-level objectives to determine if SLAs are being met. Ethernet service OAM must be provided end-to-end, beginning and ending with a UNI at the customer premises and across an EVC that will often traverse two or more service-provider NNI boundaries.

Service OAM for SONET, PDH (T1/T3), and Frame Relay services is very well defined and has been widely deployed for years. The standardised service OAM provided by these networks has enabled service providers to minimise their operational expenses (opex) in troubleshooting network or service outages using a common set of tools and processes.

By contrast, Ethernet service OAM is just coming to fruition for Carrier Ethernet networks and services. Fortunately, there is a rich history on this topic, so standards development organisations can focus their efforts on defining the protocols and architectures needed to implement service OAM specifically using Ethernet technologies. Two critical areas of service OAM are fault management and performance monitoring.

Troubleshooting network issues can often be quite challenging. The layering of protocols and services on top of a transport network only increases the complexity of fault isolation.

With transport networks, faults across the network as well as between network segments are isolated through indications of signal loss and by using tools to perform loopbacks. For example, if a T1 circuit downstream has a fault, the network will indicate the fault via an alarm. The network operator can troubleshoot a T1 at the customer premises by initiating a remote loopback to the customer premises router to determine if the fault is in the service provider’s network. T1 troubleshooting is well defined and understood with widely available tools. The requirements for fault management in Ethernet networks and services are essentially the same. However, the tools, technologies, and protocols are different.

With Carrier Ethernet networks, fault management can be broken down into two areas: link fault management (LFM) and connectivity fault management (CFM).

The IEEE 802.3ah standard defines LFM for Ethernet links or segments between NNIs or for access network connections between the UNI at the customer premises and the service provider’s central office, collocation facility, or point of presence (see figure). The standard defines fault detection for the link as well as critical events. The standard also defines remote loopbacks that are critical to troubleshoot and investigate faulty connections before sending out a network technician at a cost of several hundred dollars in opex. When performing loopbacks, metrics such as frame delay, throughput, and frame loss can be calculated to determine the overall quality of the connection. The objective of LFM is to provide the service provider’s network operations center (NOC) with visibility and troubleshooting capability all the way to a particular Ethernet port on the device providing the UNI or NNI functionality.

IEEE 802.3ah can provide LFM for Ethernet services that are delivered over any underlying transport technology, including Ethernet over copper (EoCu), active fibre (EoF), passive fibre (EPON), copper PDH circuits (EoPDH), and SONET (EoS). However, not all networking equipment supports the relatively new 802.3ah standard. Therefore, comparable methods available from the transport network technology are often used as an alternative to achieve similar results, e.g., DS1/DS3 loopback capability for EoPDH services or terminal and facilities loopbacks for SONET circuits providing EoS services.

Meanwhile, IEEE 802.1ag and ITU-T Y.1731 collectively define CFM, which allows service providers to manage each subscriber’s end-to-end Ethernet service interconnecting two or more locations (UNIs). The service connectivity is logically represented by an EVC. At each subscriber location, the UNI may have one or more service (EVC) instances. Therefore, CFM is performed for each EVC and, if the EVC supports more than one class of service (CoS), CFM is performed per CoS. CFM enables the service provider to determine the operational status of an EVC (or CoS per EVC) and defines the troubleshooting tools to isolate the failure.

CFM provides four fundamental capabilities:

Ethernet continuity checking: Just as an electrician uses an ohmmeter to test an electrical circuit’s continuity to find faults in the wire, the continuity checking capability for Ethernet networks enables the NOC to determine if an entire EVC or segments of an EVC at intermediate points in the network are operational. Continuity check messages also serve as “heartbeats” that can indicate an EVC failure and inform the network to switch the EVC to a backup path.
Ethernet link trace: Like the familiar IP traceroute function, this capability enables the NOC to determine where a failure has occurred at each hop across all or part of an EVC.
Ethernet loopback: Like the familiar ICMP ping function, this capability enables the NOC to determine if a destination is reachable or not by pinging the UNI (management endpoint of the EVC) or some intermediate management point in the network such as an internal or external NNI.
Ethernet alarm indication signal (AIS): Like the familiar AIS function for DS1 circuits, the discovery of a connectivity fault by part of the network generates AIS messages to notify other devices of the existence of the fault.

These four capabilities provide the basic tools for fault isolation and troubleshooting Ethernet networks.
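The heartbeat role of continuity check messages can be illustrated with a small Python sketch. It is not drawn from any vendor implementation: the class name, the `select_path` helper, and the path labels are hypothetical, and the 3.5-interval loss threshold is an assumption (the multiplier commonly associated with IEEE 802.1ag) that should be treated as tunable.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: continuity is considered lost once no heartbeat has arrived
# for 3.5 expected intervals. Treat the multiplier as a tunable parameter.
LOSS_MULTIPLIER = 3.5


@dataclass
class ContinuityMonitor:
    """Tracks heartbeat arrivals for one EVC (hypothetical helper)."""
    evc_id: str
    interval_s: float                    # expected spacing between heartbeats
    last_seen_s: Optional[float] = None  # arrival time of the latest heartbeat

    def on_heartbeat(self, now_s: float) -> None:
        """Record the arrival time of a continuity check message."""
        self.last_seen_s = now_s

    def is_up(self, now_s: float) -> bool:
        """Return True while heartbeats are arriving within the threshold."""
        if self.last_seen_s is None:
            return False  # nothing received yet, so continuity is unproven
        silence_s = now_s - self.last_seen_s
        return silence_s <= LOSS_MULTIPLIER * self.interval_s


def select_path(monitor: ContinuityMonitor, now_s: float,
                primary: str = "primary", backup: str = "backup") -> str:
    """Pick the backup path once the monitor reports a loss of continuity."""
    return primary if monitor.is_up(now_s) else backup


if __name__ == "__main__":
    monitor = ContinuityMonitor(evc_id="evc-1", interval_s=1.0)
    monitor.on_heartbeat(10.0)
    print(select_path(monitor, 12.0))  # still within the threshold
    print(select_path(monitor, 20.0))  # heartbeats have stopped
```

Keeping the monitor per EVC (or per CoS within an EVC) mirrors the way CFM is described above; a real deployment would rely on the equipment's own CFM implementation rather than application code.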

Service performance monitoring enables service providers to offer their subscribers an SLA that can be verified by measurements of several critical service performance parameters. While the measurement of, and definitions for, these metrics vary by service provider, they generally encompass the following:

Frame delay (FD).
Frame delay variation (FDV, often referred to as jitter).
Frame loss ratio (FLR).
Service availability.

These critical service performance parameters are defined for each EVC (or each CoS per EVC) and can be calculated or statistically approximated within a given timeframe, e.g., 5-minute, 15-minute, or 24-hour intervals, during which they must be compliant with the service level objectives defined in the SLA.
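As a rough illustration of how such parameters might be derived for one measurement interval, the Python sketch below summarises a set of measurement frames. Because definitions vary by service provider, the choices here are assumptions: FD is taken as the mean delay of received frames, FDV as the mean absolute difference between consecutive delays, and FLR as lost frames divided by sent frames. The function and class names are hypothetical, and service availability is left out for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class IntervalReport:
    """Summary of one measurement interval (hypothetical structure)."""
    frame_delay_ms: Optional[float]            # None if nothing was received
    frame_delay_variation_ms: Optional[float]  # None if nothing was received
    frame_loss_ratio: float                    # 0.0 to 1.0


def summarise(sent: int, delays_ms: List[float]) -> IntervalReport:
    """Compute FD, FDV, and FLR from the delays of the frames that arrived."""
    if sent <= 0:
        raise ValueError("at least one measurement frame must be sent")
    received = len(delays_ms)
    if received > sent:
        raise ValueError("cannot receive more frames than were sent")

    flr = (sent - received) / sent
    if received == 0:
        return IntervalReport(None, None, flr)

    fd = sum(delays_ms) / received
    if received < 2:
        fdv = 0.0  # variation is undefined for one sample; report zero
    else:
        diffs = [abs(b - a) for a, b in zip(delays_ms, delays_ms[1:])]
        fdv = sum(diffs) / len(diffs)
    return IntervalReport(fd, fdv, flr)


def meets_slo(report: IntervalReport, max_fd_ms: float,
              max_fdv_ms: float, max_flr: float) -> bool:
    """Check an interval against service level objectives from the SLA."""
    if report.frame_delay_ms is None or report.frame_delay_variation_ms is None:
        return False  # no frames arrived, so the objectives cannot be met
    return (report.frame_delay_ms <= max_fd_ms
            and report.frame_delay_variation_ms <= max_fdv_ms
            and report.frame_loss_ratio <= max_flr)


if __name__ == "__main__":
    report = summarise(sent=4, delays_ms=[10.0, 12.0, 11.0])
    print(report)
    print(meets_slo(report, max_fd_ms=20.0, max_fdv_ms=2.0, max_flr=0.001))
```

The same summary would be computed separately for each EVC, or for each CoS within an EVC, and for each reporting interval.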

One of the challenges with delivering an SLA is that service providers often must peer with one or more service providers to offer an end-to-end service. This makes the measurement of the service performance parameters such as FD or FLR more complex and challenging for a number of reasons:

Measurement of the parameters requires service OAM measurement frames to be transmitted across service provider domains through an NNI that may not allow such management traffic to pass.
If the subscriber’s physical connection to the network is through a peering network operator’s facilities (copper, fibre, or wireless), the NOC can’t access the subscriber’s UNI without placing a device at the customer premises to participate in the measurement and monitoring of the service performance parameters as well as other UNI functionality. Such a device, referred to as an Ethernet network interface device (NID), is somewhat analogous to the T1 channel service unit that provided the T1 service provider demarcation point at the customer premises. However, the NID provides additional functionality required for an Ethernet, packet-based network service.
The peering network operator may not be able to ensure minimum consistent service performance to meet the service provider’s SLA requirements. For example, if the service provider requires an FLR of 0.1% and can assure a 0.08% FLR within its network but the peering network operator can only assure a 0.04% FLR, then the combined loss across both networks is roughly 0.12% and the 0.1% FLR cannot be achieved.
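The arithmetic behind the FLR example can be checked with a few lines of Python. The sketch assumes that frame loss in each operator's network is independent, so the end-to-end delivery probability is the product of the per-network delivery probabilities; the function name is hypothetical.

```python
from typing import Iterable


def end_to_end_flr(segment_flrs: Iterable[float]) -> float:
    """Combine per-network frame loss ratios, assuming independent loss."""
    delivered = 1.0
    for flr in segment_flrs:
        if not 0.0 <= flr <= 1.0:
            raise ValueError("each FLR must be between 0 and 1")
        delivered *= 1.0 - flr  # probability a frame survives this network
    return 1.0 - delivered


if __name__ == "__main__":
    target = 0.001                               # 0.1% FLR required by the SLA
    combined = end_to_end_flr([0.0008, 0.0004])  # 0.08% and 0.04%
    print(f"combined FLR: {combined:.4%}")
    print("SLA achievable:", combined <= target)
```

For small loss ratios the result is close to the simple sum of the two figures, roughly 0.12%, which exceeds the 0.1% target.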

Service providers, at some point, need to establish a peering agreement with one or more service providers in order to reach a subscriber site that cannot be reached directly from their network facilities. For PDH circuits (T1 and T3) and SONET, well defined standards, technologies, and processes enable service providers to quickly establish peering agreements with good, well known troubleshooting tools for service OAM.

Carrier Ethernet, on the other hand, is challenged because of the lack of standardisation of an Ethernet NNI. Furthermore, the NNI capabilities of different service providers’ equipment and service offerings can vary significantly, creating interoperability challenges that delay service delivery (and service revenue). Because of these issues, service providers must create a custom peering agreement for each subscriber, which can take months and adds to opex. This often limits a given service provider’s Ethernet service coverage to markets in which they own the network facilities.

Given the demand for Ethernet services and their importance to product offerings, several national and global service providers have begun collaborating in the Metro Ethernet Forum to solve the Ethernet NNI challenge for both retail and wholesale Ethernet services. Work has begun to develop a standardised NNI technical specification and address other nontechnical business requirements for NNI peering agreements. An example of some technical functions and capabilities that need to be considered at the NNI can be found in the table.

Carrier Ethernet is clearly the technology of choice for next-generation networks and is evolving to meet the needs of service providers to achieve widespread service deployment. Service OAM and the Ethernet NNI are two critical areas that require comprehensive approaches to achieve this objective. To address these challenges, significant developments are underway to provide optimal service and network fault management and internetworking peering agreements that will lead to Ethernet services ubiquity with the ability to provide high-value SLAs.

Ralph Santitoro is the director of Carrier Ethernet solutions at Turin Networks (www.turinnetworks.com) and chairman of the Metro Ethernet Forum’s Web Marketing Committee. He can be reached at [email protected].