*******************************************************************
ATM Forum Document Number: ATM_Forum/96-0180
*******************************************************************
Title: Scope For ATM Forum's Performance Benchmarking Work Item
*******************************************************************
Abstract: This contribution discusses the scope of the performance
benchmarking work item in the ATM Forum's test working group. It also
presents an update on the performance metrics proposed in our
December 1995 contribution.
*******************************************************************
Source: Raj Jain, Bhavana Nagendra, and Gojko Babic
        The Ohio State University
        Department of CIS
        Columbus, OH 43210-1277
        Phone: 614-292-3989, Fax: 614-292-2911, Email: Jain@ACM.Org

The presentation of this contribution at the ATM Forum is sponsored
by NASA.
*******************************************************************
Date: February, 1996, Los Angeles
*******************************************************************
Distribution: ATM Forum Technical Working Group Members (AF-TEST)
*******************************************************************
Notice: This contribution has been prepared to assist the ATM Forum.
It is offered to the Forum as a basis for discussion and is not a
binding proposal on the part of any of the contributing
organizations. The statements are subject to change in form and
content after further study. Specifically, the contributors reserve
the right to add to, amend, or modify the statements contained
herein.
*******************************************************************

SUMMARY OF DECEMBER 1995 DISCUSSION:
-----------------------------------

At the October 1995 meeting of AF-TEST, it was agreed that
performance benchmarking is essential and that, instead of forming a
separate "birds of a feather (BOF)" group, AF-TEST would schedule
presentations on performance issues. As a result, three presentations
were made at the December 1995 meeting. During the presentations,
some issues were raised about what exactly the scope of the ATM
Forum's work in this area should be. Five different views were
expressed by five different people. They were all asked to write up
and present their view of the scope at the February meeting. This
contribution fulfills that commitment on our part. Many of the ideas
presented here are enhancements of ideas presented at the two earlier
meetings.

SCOPE OF ATM FORUM's WORK ON PERFORMANCE BENCHMARKING:
-----------------------------------------------------

Performance benchmarking is concerned with the user-perceived
performance of ATM technology. For the success of ATM technology, it
is important that the performance of existing and new applications be
better than that on other competing networking technologies. In other
words, the goodness of ATM will be measured not by cell-level
performance but by frame-level performance and the performance
perceived at higher layers.

Most of the Quality of Service (QoS) metrics, such as cell transfer
delay (CTD), cell delay variation (CDV), cell loss ratio (CLR), and
so on, may or may not be reflected directly in the performance
perceived by the user. For example, when comparing two switches, if
one gives a CLR of 0.1% and a frame loss ratio of 0.1% while the
other gives a CLR of 1% but a frame loss ratio of only 0.05%, the
second switch will be considered superior by many users. The ATM
Forum and the ITU have standardized the definitions of QoS metrics.
We need to do the same for higher-level performance metrics.
Without a standard definition, each vendor will use its own
definition of common metrics such as throughput and latency,
resulting in confusion in the marketplace. Avoiding such confusion
will help buyers and eventually lead to better sales and to the
success of ATM technology.

GOALS OF THE ATM FORUM WORK:
---------------------------

a. The ATM Forum should define higher-level performance metrics that
   will help a user compare various ATM equipment (and possibly
   non-ATM equipment) in terms of performance.

b. The metrics should be independent of switch or NIC architecture.
   The same metrics should apply to all architectures.

c. The metrics should help users predict the performance of their
   applications or design their network configurations to meet their
   required performance.

d. The ATM Forum should develop a precise methodology for measuring
   these metrics. The methodology includes a set of configurations
   and traffic patterns. This will allow vendors as well as users to
   conduct their own measurements and come up with comparable
   results.

e. The key goal of this effort is to enhance the marketability of
   ATM technology and equipment. Any other extension of the above
   that helps in achieving that goal can be added later to this list.

f. The benchmarking should eventually cover all classes of service.
   Many past performance measurements concentrated on the CBR
   service. We need to extend those to real-time VBR, non-real-time
   VBR, ABR, and UBR. This may be phased such that the most important
   service classes are covered first and less important ones are
   added later.

g. The metrics and methodology for different service classes can be
   different.

h. The benchmarking should cover as many protocol stacks as possible.
   For example, data traffic may use the UBR or ABR service class.
   Some ATM networks (switches) may offer one or both classes. The
   user may care more about the application throughput than about the
   underlying mechanism used. Performance should, therefore, be
   measured over several alternative protocol stacks.

i. The benchmarking work should include the performance of network
   management and connection setup, along with normal data transfer.

NON-GOALS OF THE ATM FORUM WORK:
-------------------------------

a. The ATM Forum is not responsible for conducting any measurements.
   This is similar to other tests such as conformance testing: tests
   are defined, but not conducted, by the standards bodies.

b. The ATM Forum is not responsible for certifying any measurements.
   Again, this is no different from conformance testing.
   Certification has legal issues. Merely defining metrics and
   methodologies has no legal consequences over and above what the
   ATM Forum is already doing.

c. The ATM Forum is not responsible for setting particular
   performance thresholds such that equipment below those thresholds
   is called "unsatisfactory." For example, whether a switch that
   loses 50% of packets is good or bad may depend upon the
   applications and the cost. Users and designers should be free to
   make their own cost-performance tradeoffs; setting such thresholds
   inhibits those tradeoffs. For example, suppose a packet delivery
   threshold were set at, say, 99%. This would prevent manufacturers
   from making low-cost switches that may be good enough for many
   applications. Generally, users have the flexibility to design
   their applications so that they get satisfactory performance in
   spite of lower-grade equipment (for example, by using forward
   error correction or retransmission in the case of packet errors
   and losses).
In other words, the ATM Forum should not set any requirements that
prevent vendors from reducing cost by reducing performance.

As another example of the above argument, consider the problem of
setting a delay value for ATM switches. Let us say a delay of 30 ms
is set as the standard. Switch manufacturers would then be compelled
to manufacture switches with that delay value. But why should they be
prevented from manufacturing switches with various delays, depending
on applications and requirements? For some applications a switch with
a delay of 40 ms might suffice. Such switches are cheaper and need
not be precluded from the market. At the same time, manufacturers
should continue to invest in better switches (with lower delays).
Binding applications and manufacturers to prescribed parametric
values would hurt competition, bring in legal issues, and slow
progress toward better technology. The same argument holds for the
other metrics of a switch, such as throughput and latency.

AN EXAMPLE PROPOSAL:
-------------------

The metrics, methodologies, and traffic patterns discussed below are
presented as a starting point for discussion. They are enhancements
of those in our December 1995 contribution. During the December 1995
meeting, a number of good suggestions were made, and we have tried to
incorporate them here. Since this is a new endeavor, it is limited in
several respects. This particular proposal concentrates on data
traffic (the ABR and UBR service classes), since that is expected to
be the bulk of the traffic on ATM networks initially. Other service
classes will be added later.

TRAFFIC PATTERNS:
----------------

We define two types of traffic based on the application's response to
network congestion:

a. Open-loop traffic
b. Closed-loop traffic

Case a) With open-loop traffic, the application does not reduce its
load when the network performance degrades in terms of throughput or
delay. Periodically occurring events generally lead to such traffic
patterns.

Case b) With closed-loop traffic, the application does slow down when
the network response is slower. In many client-server applications,
clients will not generate new requests if the previous requests have
not been served. TCP/IP, which is expected to be a big part of the
ATM market at least initially, is an example of a closed-loop
application. If the network performance degrades and TCP packets are
delayed excessively or lost, TCP will reduce its window and, as a
result, its load on the network. UDP traffic is an example of
open-loop traffic. The following figure shows some of the application
layer protocols that run on UDP and TCP, respectively.

+------+
|  NFS |
+--+---+
   |
+--+---+ +------+ +------+ +------+   +------+ +------+ +------+ +------+
|  RPC | |  NDS | | SNMP | |BOOTP |   |Telnet| | SMTP | | XWin | |  FTP |
+--+---+ +--+---+ +--+---+ +--+---+   +--+---+ +--+---+ +--+---+ +--+---+
   |        |        |        |          |        |        |        |
   +--------+--------+---+----+          +--------+---+----+--------+
                         |                            |
                      +--+---+                     +--+---+
                      |  UDP |                     |  TCP |
                      +--+---+                     +--+---+
                         |                            |
                   +-----+----------------------------+-----+
                   |                   IP                   |
                   +----------------------------------------+
                                  .  .  .

          Figure 1 - The protocol stack above UDP and TCP
          ------------------------------------------------

One reason for differentiating between open-loop and closed-loop
traffic patterns is that the ATM layer has to provide proper resource
control for open-loop traffic. Closed-loop traffic can live with
looser controls. For example, TCP can work over UBR or ABR; it can
work even under high loss conditions.
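To make the distinction concrete, here is a small sketch (in Python,
our own illustration rather than part of the proposal) of open-loop
and closed-loop load generation; the network_delay() model below is
an assumed toy function, not a measured network characteristic.

    # Sketch contrasting open-loop and closed-loop traffic generation.
    # Simulated time; network_delay() is an assumed toy model in which
    # the response time grows as the offered load approaches capacity.

    def network_delay(load):
        return 0.001 / max(1e-6, 1.0 - min(load, 0.999))

    def open_loop_frames(rate_per_sec, duration_s):
        # Open loop: frames are generated at a fixed rate regardless of
        # how the network is performing (e.g., UDP, periodic sources).
        return int(rate_per_sec * duration_s)

    def closed_loop_frames(duration_s, load):
        # Closed loop: a new request is issued only after the previous
        # response arrives (request-response clients; TCP's window
        # behaves similarly in spirit).
        t, sent = 0.0, 0
        while t < duration_s:
            t += network_delay(load)   # wait for the response first
            sent += 1
        return sent

    if __name__ == "__main__":
        print("open loop:             ", open_loop_frames(1000, 1.0))
        print("closed loop, 50% load: ", closed_loop_frames(1.0, 0.50))
        print("closed loop, 99% load: ", closed_loop_frames(1.0, 0.99))

Under this toy model the closed-loop source automatically slows from
about 500 requests per second at 50% load to about 10 per second at
99% load, while the open-loop source keeps offering the same 1000
frames regardless of network conditions.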
AT WHICH LAYER SHOULD THE PERFORMANCE BE MEASURED?
--------------------------------------------------

The performance can be measured at several layers above the ATM
layer, for example, at the network layer (e.g., IP), the transport
layer (e.g., TCP), or the application layer (e.g., FTP). At each
layer, several alternative stacks are possible. For example, IP can
use "Classical IP over ATM" (RFC 1577) or "LAN Emulation (LANE)." As
shown in Figure 2, performance could be measured at any of three
layers: AAL5, RFC 1577/LANE, and IP.

  +-------------------+
  |    USER LEVEL     |
  | APPLICATION (FTP) |
  |--------+----------|
  |  TCP   |   UDP    |
  |--------+----------|<---+
  |        IP         |    |
  |-----------+-------|<---+-- User perceived performance
  | RFC 1577  | LANE  |    |
  |-----------+-------|<---+
  |       AAL5        |
  |-------------------|
  |   ABR   |   UBR   |
  |        ATM        |
  |-------------------|
  |        PHY        |
  +-------------------+

    Figure 2 - Examples of measurement alternatives
    ------------------------------------------------

At the AAL5 layer, one can measure ATM performance but cannot compare
technologies. At the LANE/RFC 1577 layer or at the IP layer,
different technologies can be compared.

TEST CONFIGURATIONS:
-------------------

We propose considering the following two configurations. These will
be used in defining the metrics. The hosts are connected by an ATM
cloud, which can be a single switch or a collection of switches.

Configuration A: N inputs and 1 output

  +------+
  |HOST1L|------+
  +------+      |
                |     +-------+
  +------+      +-----(  ATM  )      +------+
  |HOST2L|------------( CLOUD )------| HOST |
  +------+      +-----(       )      +------+
     .          |     +-------+
     .          |
     .          |
  +------+      |
  |HOSTNL|------+
  +------+

  Figure 3 - A configuration with N inputs and a single output
  ------------------------------------------------------------

Configuration B: N inputs and N outputs

  +------+                                +------+
  |HOST1L|------+                  +------|HOST1R|
  +------+      |                  |      +------+
                |     +-------+    |
  +------+      +-----(  ATM  )----+      +------+
  |HOST2L|------------( CLOUD )-----------|HOST2R|
  +------+      +-----(       )----+      +------+
     .          |     +-------+    |         .
     .          |                  |         .
     .          |                  |         .
  +------+      |                  |      +------+
  |HOSTNL|------+                  +------|HOSTNR|
  +------+                                +------+

   Figure 4 - A configuration with N inputs and N outputs
   ------------------------------------------------------

Here are illustrations of tests that can be performed using the above
configurations.

For Configuration A, the load is increased symmetrically on the N
input ports and the output is measured. This configuration represents
an overloaded condition, since N inputs flow toward a single output.
Such a condition would result in lower throughput, increased frame
loss, a smaller back-to-back burst size, higher latency, etc.
Fairness can also be measured, i.e., whether the switch discriminates
among the sources.

For Configuration B, the traffic can be sent in the following four
ways:

 i)   HostiL sends all of its traffic to HostiR, i = 1, 2, ..., N.
 ii)  HostiL sends 1/N of its traffic to each HostjR,
      j = 1, 2, ..., N, for i = 1, 2, ..., N.
 iii) Same as i), but with bidirectional traffic.
 iv)  Same as ii), but with bidirectional traffic.

N needs to be determined; the degree of overload depends on the
number of sources. The load is increased symmetrically on all ports
and measured at the corresponding outputs. This configuration can
also be used to measure the fairness of the switch.
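As a concrete illustration of traffic patterns i) and ii) for
Configuration B, the following sketch builds the source-to-destination
load matrix (our own illustration; the normalized load unit of 1.0 per
source and the function names are assumptions, not part of the
proposal).

    # Sketch: load matrices for Configuration B, patterns i) and ii).
    # Loads are normalized so that 1.0 is the full offered load of one
    # source host (an assumed unit for illustration).

    def pattern_i(n):
        # Pattern i: HostiL sends all of its traffic to HostiR.
        return [[1.0 if i == j else 0.0 for j in range(n)]
                for i in range(n)]

    def pattern_ii(n):
        # Pattern ii: HostiL spreads its traffic equally over all HostjR.
        return [[1.0 / n for _ in range(n)] for _ in range(n)]

    if __name__ == "__main__":
        n = 4                      # number of hosts per side (to be decided)
        for name, matrix in (("i", pattern_i(n)), ("ii", pattern_ii(n))):
            print("Pattern", name)
            for row in matrix:     # row = one source HostiL
                print("  ", ["%.2f" % x for x in row])
            # Column sums show the load converging on each output port.
            print("   load per output:",
                  [sum(matrix[i][j] for i in range(n)) for j in range(n)])

In both patterns every output port sees the same total load, which is
what makes this configuration convenient for fairness measurements.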
PERFORMANCE METRICS:
-------------------

We propose that the metrics be grouped as follows:

- General metrics
- Protocol-stack specific metrics
- Traffic management metrics
- Network management metrics

General Performance Metrics: These metrics apply to most ATM networks
and are not protocol specific. The tests for these metrics
effectively characterize the basic features of the switch.

Protocol-Stack Specific Metrics: These metrics apply to particular
protocol stacks and need only be measured and tested if particular
protocols are being used. Examples of such protocol stacks are RFC
1577 and LANE, as discussed earlier.

Traffic Management Metrics: These measure the ability of the switches
to avoid overload and to efficiently and fairly resolve contention
among the various VCs when there is overload.

Network Management Metrics: These metrics are defined to characterize
how the switch responds to network management requests.

Some of the discussion below is drawn from RFC 1242 [RFC1242] and the
related Internet Draft [Bradner], and is a modification of [Jain,
Nagendra]. We are, of course, open to comments, suggestions, and
discussion on tailoring these metrics and configurations.

GENERAL PERFORMANCE METRICS
---------------------------

1. Throughput

The throughput can be measured for the UBR case (open loop) and the
ABR case (closed loop).

For UBR, throughput is defined as the maximum rate at which none of
the frames are dropped by the ATM switch. Essentially, we are looking
for the highest load at which the switch still behaves perfectly,
that is, forwards 100% of the offered frames. Data traffic is passed
through the switch, and the frames transmitted by the switch are
counted. If the input and output counts are the same, the load is
increased and the test is conducted again. The throughput is the
highest load at which the count of the output frames equals the count
of the input frames. A graph of output count vs. input count can be
plotted. Alternatively, the load can be kept constant and the frame
size varied to study its effect on the throughput. A model graph of
output count vs. input count is shown below; point X defines the
throughput without loss.

        ^
        |               #    #    #
        |            #
 OUTPUT |           #
 COUNT X|- - - - - #
        |         #|
        |        # |
        |       #  |  <---- 0% loss
        |      #   |
        |     #    |
        |    #     |
        |   #      |
        +----------+------------------------->
                   X
                LOAD (INPUT COUNT)

  Figure 5 - Graph of output count vs load (input count) for UBR
  --------------------------------------------------------------

Throughput can be expressed in bits/sec, frames/sec, or cells/sec.
Cells/sec is not a good unit, since cells carry significant overhead
at the ATM layer, whereas the overhead at AAL5 is relatively low.
Bits/sec is preferred, because frames/sec depends on the frame size,
which is a variable. Bits/sec and frames/sec are related by the
following equation:

  Throughput (bits/sec) = Throughput (frames/sec)
                          * Average frame size (bits)
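The UBR test procedure above can be summarized by the following
sketch (our own illustration; the step-wise search and the simulated
send_frames() hook, with an assumed 100 Mbps switch capacity, stand
in for the real traffic generator/analyzer).

    # Sketch of the UBR throughput search: raise the offered load until
    # the output frame count first falls below the input frame count.

    SWITCH_CAPACITY_MBPS = 100.0      # assumed, for the simulation only

    def send_frames(load_mbps, frame_size_bytes, duration_s):
        # Simulated trial; returns (input_count, output_count).
        sent = int(load_mbps * 1e6 * duration_s / (8 * frame_size_bytes))
        out_mbps = min(load_mbps, SWITCH_CAPACITY_MBPS)
        received = int(out_mbps * 1e6 * duration_s / (8 * frame_size_bytes))
        return sent, received

    def ubr_throughput(frame_size_bytes, max_rate_mbps,
                       step_mbps=1.0, duration_s=10):
        # Highest offered load (Mbps) at which no frames are lost.
        best, load = 0.0, step_mbps
        while load <= max_rate_mbps:
            sent, received = send_frames(load, frame_size_bytes, duration_s)
            if received < sent:       # first load level with loss: stop
                break
            best = load               # no loss at this load; remember it
            load += step_mbps
        return best

    if __name__ == "__main__":
        mbps = ubr_throughput(frame_size_bytes=1518, max_rate_mbps=155.0)
        # Throughput (frames/sec) = Throughput (bits/sec) / frame size (bits)
        frames = mbps * 1e6 / (8 * 1518)
        print("%.0f Mbps = %.0f frames/sec" % (mbps, frames))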
For ABR, we propose two throughput definitions:

o throughput without loss
o throughput after the congestion mechanism is triggered

The definitions and tests are explained below. Data traffic is passed
through the switch from the sources, and the frames transmitted by
the switch are counted. All frames are of the same size (the frame
size is open to discussion). If the input and output counts are the
same, the load is increased and the test is conducted again. The
throughput without loss is the highest load at which the count of the
output frames equals the count of the input frames.

When the load is increased beyond a certain point, the congestion
mechanism is activated and warns the sources to decrease their rate.
The system will stabilize at some point (meaning that the input count
and the output count are again identical), and that load defines the
throughput after the congestion mechanism is triggered.
Alternatively, the load can be kept constant and the frame size
varied to study its effect on the throughput.

The throughput in Configuration A equals, or is close to, the
capacity of the sink. Note that a well-behaved switch would accept an
equal load from all sources without giving preference to any one
source.

2. Latency

We use the following table to define the beginning (at the input) and
the end (at the output) of the time interval over which latency is
measured. Latency can be defined in four ways, depending on the time
interval considered. We wish to deviate from the definition of
latency used in [Jain, Nagendra].

  +---+-----------------------------++-----------------------------+
  |   |          ON INPUT           ||          ON OUTPUT          |
  +---+--------------+--------------++--------------+--------------+
  |SL#|  FIRST BIT   |   LAST BIT   ||  FIRST BIT   |   LAST BIT   |
  +---+--------------+--------------++--------------+--------------+
  | 1 |      X       |              ||      X       |              |
  +---+--------------+--------------++--------------+--------------+
  | 2 |              |      X       ||      X       |              |
  +---+--------------+--------------++--------------+--------------+
  | 3 |      X       |              ||              |      X       |
  +---+--------------+--------------++--------------+--------------+
  | 4 |              |      X       ||              |      X       |
  +---+--------------+--------------++--------------+--------------+

    Table 1 - Four ways of defining the time interval for latency
    --------------------------------------------------------------

Definitions 1 and 2 in the above table are not appropriate, since the
user is concerned with the whole frame, and when the first bit
appears at the output the complete frame has not yet been received.
Definitions 3 and 4 are appropriate for the user, but in case 3 the
latency depends on the message length. Definition 4 therefore appears
to be a good measure of switch latency. Hence we define latency as
follows:

  The time interval starting when the last bit of the input frame is
  transmitted and ending when the last bit of the output frame is
  received by the host.

This definition is valid for all types of devices, both cut-through
and store-and-forward, and the measured value cannot become negative
for cut-through devices (as it could with a first-bit-out
definition). This helps in treating all devices uniformly, without
being concerned about their internal architecture.

Latency depends on the load. Hence it should be measured at two
extreme loads:

o zero load
o the throughput load

Other loads may also be considered. The time at which the frame is
fully transmitted is recorded (timestamp A). The receiver logic in
the test equipment should be able to recognize the tag information in
the frame stream and record the time at which the entire tagged frame
was received (timestamp B).

  Latency = Timestamp B - Timestamp A

The reporting format would be the load and the resulting latency for
each frame size.
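As an illustration of definition 4, a minimal sketch of the latency
computation and reporting follows (the timestamp hooks and the report
format are our own assumptions, not part of a tester specification).

    # Sketch of the latency computation under definition 4 (last bit
    # in, last bit out).  Timestamps are assumed to come from the test
    # equipment's tagged-frame logic.

    def latency_seconds(timestamp_a, timestamp_b):
        # timestamp_a: time the last bit of the tagged frame was sent
        # timestamp_b: time the last bit of the tagged frame was received
        # Latency = Timestamp B - Timestamp A; never negative, even for
        # cut-through devices.
        return timestamp_b - timestamp_a

    def report_latency(frame_size_bytes, load_fraction, samples):
        # samples: list of (timestamp_a, timestamp_b) pairs per tagged frame
        values = [latency_seconds(a, b) for a, b in samples]
        print("frame size %d bytes, load %3.0f%%: average latency %.6f s"
              % (frame_size_bytes, 100 * load_fraction,
                 sum(values) / len(values)))

    if __name__ == "__main__":
        # Example with made-up timestamps (seconds).
        report_latency(1518, 0.50,
                       [(0.000000, 0.000450), (1.000000, 1.000470)])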
3. Frame loss rate

The frame loss rate is the percentage of frames that should have been
forwarded by the switch under steady-state traffic but were not
forwarded due to lack of resources. The frame loss rate is an
interesting metric only under open-loop (UBR) conditions, since under
closed-loop conditions the network warns the sources of potential
losses (the congestion mechanism). When the congestion mechanism is
activated, frame loss is possible, but it is not constant. This
measurement reports the performance of the switch in an overloaded
state. The device might lose frames that contain routing information,
and this may further reduce the performance, as more frames need to
be retransmitted. The frame errors could be CRC errors and/or cell
termination errors.

  Frame loss rate = 100% * (input_count - output_count) / input_count

Configuration A) The first trial should be run at the load that
corresponds to 100% of the maximum rate for the frame size from the N
sources. The load is progressively decreased until there are two
successive trials with no frame loss.

Configuration B) The switch receives traffic from N sources
simultaneously at the maximum rate for the frame size. The output is
measured at the N outputs. The load is progressively decreased until
there are two successive trials with no frame loss.

The results of the frame loss test should be reported as a graph of
percent loss vs. load.

4. Back-to-Back Burst Size

Fixed-length frames are presented at a rate such that there is the
minimum legal separation between frames, over a short-to-medium
period of time, starting from an idle state. This test determines the
buffering capability of the ATM switch under test. NFS, remote disk
backup systems such as rdump, and remote tape access systems can be
configured so that a single request results in a block of data, as
large as 64K octets, being returned. The length of the frame is to be
decided.

Bursts of frames with minimum inter-frame gaps are sent to the switch
from the sources, and the number of frames forwarded by the switch to
the single host is counted. If there are no losses and the congestion
mechanism is not triggered (in the ABR case), the length of the burst
is increased and the test is rerun. The back-to-back burst size is
the longest burst that the device will handle without the loss of any
frames. It measures the extent of data buffering in the switch.
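A minimal sketch of the frame loss rate computation and the
back-to-back burst size search follows (our own illustration; the
simulated send_burst() hook with an assumed buffer of 1000 frames and
the simple linear search are assumptions, not prescribed methods).

    # Sketch of the frame loss rate formula and the back-to-back burst
    # test.  send_burst() simulates a switch that can absorb at most
    # BUFFER_FRAMES back-to-back frames so the sketch runs on its own.

    BUFFER_FRAMES = 1000              # assumed buffering, for simulation

    def frame_loss_rate(input_count, output_count):
        # Frame loss rate = 100% * (input_count - output_count) / input_count
        return 100.0 * (input_count - output_count) / input_count

    def send_burst(burst_len, frame_size_bytes):
        # Number of frames forwarded out of a back-to-back burst.
        return min(burst_len, BUFFER_FRAMES)

    def back_to_back_burst_size(frame_size_bytes, max_burst):
        # Longest burst forwarded without loss (simple linear search).
        longest = 0
        for burst_len in range(1, max_burst + 1):
            if send_burst(burst_len, frame_size_bytes) < burst_len:
                break                 # loss occurred; previous length was it
            longest = burst_len
        return longest

    if __name__ == "__main__":
        print("loss rate: %.2f%%" % frame_loss_rate(10000, 9950))
        print("burst size:", back_to_back_burst_size(64, 5000), "frames")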
5. Call establishment time

This is the time taken by the calling party to set up a connection
with the destination. For short-duration VCs, the call establishment
time is an important part of the user-perceived performance. The time
between the submission of a "call request" and the reception of the
corresponding "ready indication" is defined as the call establishment
time. The call establishment time is measured at zero load and at the
load corresponding to the throughput. Other loads may also be
considered.

TRAFFIC MANAGEMENT METRICS:
--------------------------

1. Load Control Latency: A set of VCs is established. After the
system reaches steady state, the load on one VC is suddenly
increased, and the time for the system to reach steady state again is
measured. Similarly, when the load is decreased, the time to reach
steady state is measured.

2. Burst Throughput: Frames are sent with differing burst (frame
burst) sizes, and the steady-state throughput is measured. Depending
upon the underlying service class (UBR, ABR), the bursty performance
may differ from the steady-state performance. This is particularly
important for request-response (client-server) applications.

3. Throughput in the Presence of Higher Priority Traffic: The
throughput of ABR traffic is measured when a VBR VC shares the path
with the data traffic. The characteristics of the VBR traffic need to
be clearly specified.

4. Fairness: Fairness can be measured for both of the configurations
in Figures 3 and 4. In Configuration A, N sources are connected to a
single host through the switch. The load is increased symmetrically
on the N input ports and the output is measured. The switch might cut
off one host and allow traffic only from the remaining hosts; such
discrimination among sources can be studied as a fairness issue. In
Configuration B, each of the N hosts on the input side sends traffic
to either one host or all N hosts on the output side through the
switch. The load is increased symmetrically on the N ports and
measured at the corresponding outputs. If the traffic on all the
lines is not equal, the switch is partial and the fairness criterion
has been violated.

NETWORK MANAGEMENT METRICS:
--------------------------

[To be discussed]

APPLICATION SPECIFIC PERFORMANCE METRICS:
----------------------------------------

[To be discussed]

BIBLIOGRAPHY:
------------

[Bradner] Scott Bradner, "Benchmarking Methodology for Network
Interconnect Devices," Internet Draft.

[Mandeville] Robert Mandeville, European Network Laboratories, Data
Comm Magazine, March 1995, p. 69.

[Wakid] Wakid et al., "Architectures for BISDN Networks: A
Performance Study," Advanced Systems Division, National Institute of
Standards and Technology, (301) 975-4855,
http://www.hpcc.gov/blue94/section.4.7.html

[LANQuest] "ATM Cell Congestion Loss Across Switch (CCLAS) Throughput
Analysis," LANQuest Labs, (408) 894-1000.

[RFC1577] M. Laubach, "Classical IP and ARP over ATM," RFC 1577,
January 1994.

[SNCI] Scott Bradner, "The 1995 Ethernet to ATM Evaluation," SNCI.

[Mier] Mier and Smithers, "ATM to the Desktop," Product Testing,
Communications Week, September 25, 1995.

[RFC1242] Scott Bradner, "Benchmarking Terminology for Network
Interconnection Devices," RFC 1242, July 1991.

[Rowe] Martin Rowe, "Wealth of ATM Testers Answers Most Needs," Test
and Measurement World, September 1995, p. 55.

[Krivda] Cheryl D. Krivda, "Analyzing ATM Adapter Performance: The
Real-World Meaning of Benchmarks,"
http://www.efficient.com/dox/EM.html

[Jain] Raj Jain, "Performance Benchmarking BOF," AF-ALL/95-1347,
October 1995.

[Jain, Nagendra] Raj Jain and Bhavana Nagendra, "Performance
Benchmarking of ATM Switches," AF-TEST/95-1662, December 1995.

Note: All our past ATM Forum contributions and presentations are
available on-line at http://www.cse.wustl.edu/~jain/