************************************************************************
ATM Forum Document Number: ATM_Forum/97-0426.
************************************************************************
Title: Proposed modifications to Performance Testing Baseline:
Throughput and Latency Metrics
************************************************************************
Abstract: This revised text of the baseline includes better descriptions of test configurations and measurement procedures for the throughput and latency sections of the baseline documents. New text for Appendix A on MIMO latency is also included.
************************************************************************
Source:
Gojko Babic, Arjan Durresi, Raj Jain, Justin Dolske, Shabbir Shahpurwala
The Ohio State University
Department of CIS
Columbus, OH 43210-1277
Phone: 614-292-3989, Fax: 614-292-2911, Email: Jain@ACM.Org

The presentation of this contribution at the ATM Forum is sponsored by NASA.
************************************************************************
Date: April 1997
************************************************************************
Distribution: ATM Forum Technical Working Group Members (AF-TEST, AF-TM)
************************************************************************
Notice: This contribution has been prepared to assist the ATM Forum. It is offered to the Forum as a basis for discussion and is not a binding proposal on the part of any of the contributing organizations. The statements are subject to change in form and content after further study. Specifically, the contributors reserve the right to add to, amend or modify the statements contained herein.
************************************************************************

A postscript version of this contribution with several essential figures has been uploaded to the ATM Forum server incoming directory. It will shortly be moved to the appropriate contributions directory.
It is also available on our web page:
ftp://netlab.wustl.edu/pub/jain/atmf/atm97-0426.ps (Postscript) and
ftp://netlab.wustl.edu/pub/jain/atmf/atm97-0426.zip (PKzipped Postscript)

An earlier version of this contribution was presented at the February 1997 meeting of the ATM Forum. This re

3.1. Throughput

3.1.1. Definitions

There are three frame-level throughput metrics that are of interest to a user:

* Lossless throughput - It is the maximum rate at which none of the offered frames is dropped by the SUT.
* Peak throughput - It is the maximum rate at which the SUT operates regardless of frames dropped. The maximum rate can actually occur when the loss is not zero.
* Full-load throughput - It is the rate at which the SUT operates when the input links are loaded at 100% of their capacity.

A model graph of throughput vs. input rate is shown in Figure 3.1. Level X defines the lossless throughput, level Y defines the peak throughput and level Z defines the full-load throughput.

[Figure 3.1: Peak, lossless and full-load throughput]

The lossless throughput is the highest load at which the count of the output frames equals the count of the input frames. The peak throughput is the maximum throughput that can be achieved in spite of the losses. The full-load throughput is the throughput of the system at 100% load on input links. Note that the peak throughput may equal the lossless throughput in some cases.

Only frames that are received completely without errors are included in the frame-level throughput computation. Partial frames and frames with CRC errors are not included.

3.1.2. Units

Throughput should be expressed in effective bits/sec, counting only bits from frames and excluding the overhead introduced by the ATM technology and transmission systems.

This is preferred over specifying it in frames/sec or cells/sec. Frames/sec requires specifying the frame size.
The throughput values in frames/sec at various frame sizes cannot be compared without first being converted into bits/sec. Cells/sec is not a good unit for frame-level performance since the cells are not seen by the user.

3.1.3. Statistical Variations

There is no need to obtain more than one sample for any of the three frame-level throughput metrics. Consequently, there is no need to calculate the means or standard deviations of throughputs.

3.1.4. Measurement Procedures

Before starting measurements, a number of VCCs (or VPCs), henceforth referred to as "foreground VCCs", are established through the SUT. Foreground VCCs are used to transfer only the traffic whose performance is measured. That traffic is referred to as the foreground traffic. The characteristics of foreground traffic are specified in Section 3.1.5.

The tests can be conducted under two conditions:

* without background traffic;
* with background traffic.

Procedure without background traffic

The procedure to measure throughput in this case includes a number of test runs. A test run starts with the traffic being sent at a given input rate over the foreground VCCs with early packet discard disabled (if this feature is available in the SUT and can be turned off). The average cell transfer delay is constantly monitored. A test run ends and the foreground traffic is stopped when the average cell transfer delay has not significantly changed (not more than 5%) during a period of at least 5 minutes.

During the test run period, the total number of frames sent to the SUT and the total number of frames received from the SUT are recorded. The throughput (output rate) is computed based on the duration of a test run and the number of received frames.

If the input frame count and the output frame count are the same, then the input rate is increased and the test is conducted again.

The lossless throughput is the highest throughput at which the count of the output frames equals the count of the input frames.
The input rate is then increased even further (with early packet discard enabled, if available). Although some frames will be lost, the throughput may increase until it reaches the peak throughput value. After this point, any further increase in the input rate will result in a decrease in the throughput.

The input rate is finally increased to 100% of the link input rates and the full-load throughput is recorded.

Procedure with background traffic

Measurements of throughput with background traffic are under study.

3.1.5. Foreground Traffic

Foreground traffic is specified by the type of foreground VCCs, connection configuration, service class, arrival patterns, frame length and input rate.

Foreground VCCs can be permanent or switched, virtual path or virtual channel connections, established between ports on the same network module on the switch, or between ports on different network modules, or between ports on different switching fabrics.

A system with n ports can be tested for the following connection configurations:

* n-to-n straight,
* n-to-(n-1) full cross,
* n-to-m partial cross, 1 <= m <= n-1,
* k-to-1, 1 Output rate and compression of gaps (Case 3c).

Case 1a: Input Rate = Output Rate, No Change in Gaps

[Figure A.3]

In both scenarios, the pattern of gaps on input is purposely made different from the pattern of gaps on output. This illustrates the point that it is the total gap that matters, and not the location of the gaps within the test frame. In the given scenarios, the total number of gaps is 2 cells on both input and output.

In this case, the switch delay D is given by:

D = First bit latency = Last bit latency

Here, we have:

* FIFO latency = D, therefore FIFO latency is correct.
* LILO latency = D, therefore LILO latency is correct.
* Input rate = Output rate, therefore
  FILO latency - Frame input time = D and
  MIMO latency = min {LILO latency, FILO latency - Frame input time} = min {D, D} = D
* MIMO latency is correct.
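The comparisons made in the cases of this appendix can be checked numerically. The following Python sketch is illustrative only. It assumes that the four bit-level timestamps of a test frame are known, and it uses the expression applied throughout these cases: MIMO latency = min {LILO latency, FILO latency - Frame input time x Input rate / Output rate}. The names FrameTimes and latencies are hypothetical, the numbers are assumed, and times are in arbitrary units (here, cell times).

```python
from dataclasses import dataclass


@dataclass
class FrameTimes:
    """Bit-level timestamps of one test frame (hypothetical container)."""
    first_in: float   # first bit enters the SUT
    last_in: float    # last bit enters the SUT
    first_out: float  # first bit leaves the SUT
    last_out: float   # last bit leaves the SUT


def latencies(t, input_rate, output_rate):
    """Return FIFO, LILO, FILO and MIMO latency for one frame."""
    fifo = t.first_out - t.first_in
    lilo = t.last_out - t.last_in
    filo = t.last_out - t.first_in
    frame_input_time = t.last_in - t.first_in
    # Frame input time scaled to the output rate, as in the appendix cases.
    scaled_input_time = frame_input_time * input_rate / output_rate
    mimo = min(lilo, filo - scaled_input_time)
    return {"FIFO": fifo, "LILO": lilo, "FILO": filo, "MIMO": mimo}


# Case 1a (assumed numbers): equal rates, delay D = 3, gaps unchanged.
case_1a = latencies(FrameTimes(0.0, 7.0, 3.0, 10.0), 1.0, 1.0)

# Case 1b (assumed numbers): equal rates, two extra gap cells on output,
# so D = first bit latency (1) + time of additional gaps (2) = 3.
case_1b = latencies(FrameTimes(0.0, 7.0, 1.0, 10.0), 1.0, 1.0)

print(case_1a)
print(case_1b)
```

With these assumed numbers, FIFO, LILO and MIMO latency all equal 3 in Case 1a, while in Case 1b FIFO latency drops to 1 and LILO and MIMO latency remain 3.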
Case 1b: Input Rate = Output Rate, Expansion of Gaps

A zero-delay switch with expansion of gaps is not possible. Therefore, only a non-zero delay switch is shown in Figure A.4.

In this case, the switch delay D is given by:

D = Last bit latency = First bit latency + Time of additional gaps on output

[Figure A.4]

Here, we have:

* FIFO latency < D, therefore FIFO latency is incorrect. FIFO latency does not reflect expansion of gaps. It remains the same even when there is a large expansion.
* LILO latency = D, therefore LILO latency is correct.
* Input rate = Output rate, therefore
  FILO latency - Frame input time = D and
  MIMO latency = min {LILO latency, FILO latency - Frame input time} = min {D, D} = D
* MIMO latency is correct.

Case 1c: Input Rate = Output Rate, Compression of Gaps

In this case, shown in Figure A.5, the switch delay D is given by:

D = Last bit latency = First bit latency - Time of additional gaps on input

Here, we have:

* FIFO latency > D, therefore FIFO latency is incorrect. It does not reflect compression of gaps.
* LILO latency = D, therefore LILO latency is correct.
* Input rate = Output rate, therefore
  FILO latency - Frame input time = D and
  MIMO latency = min {LILO latency, FILO latency - Frame input time} = min {D, D} = D
* MIMO latency is correct.

[Figure A.5]

Case 2a: Input Rate < Output Rate, No Change in Gaps

In this case, shown in Figure A.6, the switch delay D is given by:

D = Last bit latency

Here, we have:

* FIFO latency > D, therefore FIFO latency is incorrect. FIFO latency varies when the output rate is changed, even though the switch (and its delay) is otherwise unchanged. So, FIFO latency does not correctly represent the switch latency.
* LILO latency = D, therefore LILO latency is correct.
* Input rate < Output rate, therefore
  FILO latency - Frame input time x Input rate / Output rate = M > D and
  MIMO latency = min {LILO latency, M} = D
* MIMO latency is correct.
If idle cells are considered part of the test frame, then this case, as well as all other cases of "no change in gaps", becomes the same as if the frame is contiguous. It is obvious that FIFO latency is equally incorrect for continuous frames.

[Figure A.6]

Case 2b: Input Rate < Output Rate, Expansion of Gaps

In this case, shown in Figure A.7, the switch delay D is given by:

D = Last bit latency

[Figure A.7]

Here, we have:

* FIFO latency is incorrect because it varies when the output rate (or delay) in the switch changes, with nothing else changed.
* It should be noted that in this case, with a given input rate and a given number of gaps on input, it is possible to produce scenarios with an appropriate output rate and an appropriate number of gaps on output such that FIFO latency > D, FIFO latency < D or even FIFO latency = D, all without changing switch characteristics.
* LILO latency = D, therefore LILO latency is correct.
* Input rate < Output rate, therefore
  FILO latency - Frame input time x Input rate / Output rate = M > D
* MIMO latency = min {LILO latency, M} = D, therefore MIMO latency is correct.

Case 2c: Input Rate < Output Rate, Compression of Gaps

In this case, shown in Figure A.8, the switch delay D is given by:

D = Last bit latency

[Figure A.8]

Here, we have:

* FIFO latency > D, therefore FIFO latency is incorrect. Note that FIFO latency is affected by changing the output rate and/or the number of gaps on the output while the switch (and its delay) is unchanged.
* LILO latency = D, therefore LILO latency is correct.
* Input rate < Output rate, therefore
  FILO latency - Frame input time x Input rate / Output rate = M > D and
  MIMO latency = min {LILO latency, M} = D
* MIMO latency is correct.

Case 3a: Input Rate > Output Rate, No Change in Gaps

In this case, shown in Figure A.9, the switch delay D is given by:

D = First bit latency

[Figure A.9]

Here, we have:

* FIFO latency = D, therefore FIFO latency is correct.
* LILO latency > D, therefore LILO latency is incorrect. Note that LILO latency may change when the output rate is changed, even though the switch is otherwise unchanged.
* FILO latency - Frame input time x Input rate / Output rate = D
* MIMO latency = min {LILO latency, D} = D
* MIMO latency is correct.

As indicated earlier, this case, as well as the other cases with no change in gaps, can be viewed as a case with continuous frames. It is obvious that LILO latency is equally incorrect for continuous frames.

Case 3b: Input Rate > Output Rate, Expansion of Gaps

Note that a zero-delay switch with expansion of gaps is not possible. Therefore, only the non-zero delay scenario is shown in Figure A.10.

[Figure A.10]

In this case, the switch delay D is given by:

D = First bit latency + Time of additional gaps on output

Here, we have:

* FIFO latency < D, therefore FIFO latency is incorrect. It does not reflect expansion of gaps. Note that FIFO latency may even be zero (the case of a zero delay for the first bit) for a nonzero-latency frame.
* LILO latency > D, therefore LILO latency is incorrect. It should be noted that while LILO latency correctly accounts for the time of additional gaps, it is incorrectly influenced by changes of the output rate.
* FILO latency - Frame input time x Input rate / Output rate = D, therefore
  MIMO latency = min {LILO latency, D} = D
* MIMO latency is correct.

Case 3c: Input Rate > Output Rate, Compression of Gaps

Only in this case, besides the scenarios with a zero-delay switch and a non-zero (positive) delay switch, is it also possible to have a scenario with a speed-up (negative delay) switch. Here a switch can reduce the delay of a frame by removing several gaps. Such switches are called "speedup-delay" switches. One such case is shown in Figure A.11.c. A speedup-delay switch effectively has a negative delay.
In this case, the switch delay D is given by:

D = First bit latency - Time of missing gaps on output

Three situations corresponding to the three scenarios above can be distinguished:

* a zero-delay switch, where:
  First bit latency = Time of missing gaps on output
* a positive-delay switch, where:
  First bit latency > Time of missing gaps on output
* a speedup-delay switch or a negative-delay switch, where:
  First bit latency < Time of missing gaps on output

Here, we have:

* FIFO latency > D, therefore FIFO latency is incorrect. It does not reflect compression of gaps.
* LILO latency > D, therefore LILO latency is incorrect. While LILO latency correctly accounts for the time of missing gaps, it is incorrectly influenced by changes of the output rate.
* FILO latency - Frame input time x Input rate / Output rate = D, therefore
  MIMO latency = min {LILO latency, D} = D
* MIMO latency is correct.

[Figure A.11]

A.3 MIMO latency calculation based on cell level data

Contemporary ATM monitors provide measurement data at the cell level. Since the definition of MIMO latency uses bit-level data, this section explains how to calculate MIMO latency using data at the cell level.

The standard definitions of two cell-level performance metrics that are important for MIMO latency are:

* cell transfer delay (CTD), defined as the time from when a cell begins leaving the ATM monitor until it finishes arriving at the ATM monitor, i.e. the time between the first bit out and the last bit in.
* cell inter-arrival time, defined as the time between the arrival of the last bit of the first cell and the last bit of the second cell.

It appears that CTD values obtained by ATM monitors always include some system overhead. For example, the measured cell transfer delay for the case of a closed loop on an ATM monitor is usually larger than the theoretical value for the cell transmit time (the time needed to transmit one cell over a link of a given rate) plus any propagation delay.
The discrepancy can be attributed to delays internal to the monitor and to its time resolution. That discrepancy is called the monitor overhead, and it can be calculated as the difference between the measured cell transfer delay over a closed loop on the ATM monitor and the theoretical value for the cell transmit time.

On the other hand, it appears that inter-arrival times measured by ATM monitors are very accurate, so corrections for cell inter-arrival time values are not necessary.

The procedure for MIMO latency calculation depends upon the relative values of the input and output link rates. There are two cases to consider:

* Input link rate <= Output link rate
* Input link rate >= Output link rate

MIMO latency calculation: Input link rate <= Output link rate

In cases when the input link rate is less than or equal to the output link rate:

MIMO latency = LILO latency

From Figure A.12, it can be observed that:

LILO latency = Last cell's transfer delay - Last cell's input transmit time

where:

* the cell input transmit time = the time to transmit one cell into the input link
  = 53 bytes x 8 bits/byte / Input link rate in bps

To account for the overhead in the ATM monitor, the following adjustment to the LILO latency expression has to be made:

LILO latency = Last cell's transfer delay - (Last cell's input transmit time + Monitor overhead)

Thus, to calculate MIMO latency when the input link rate is less than or equal to the output link rate, it is sufficient to measure the transfer delay of the last cell of a frame.

[Figure A.12]

MIMO latency calculation: Input link rate >= Output link rate

In cases where the input link rate is greater than or equal to the output link rate:

MIMO latency = FILO latency - NFOT

NFOT can be calculated as discussed in Section 3.2.1, while FILO latency has to be obtained from measurements.
From Figure A.13, it can be observed that:

FILO latency = FIFO latency + Frame output time

Also, it can be observed that:

FIFO latency = First cell's transfer delay - (First cell's output transmit time + Monitor overhead)

Frame output time = First cell to last cell inter-arrival time + Last cell's output transmit time

where:

* the cell output transmit time = the time to transmit one cell into the output link
  = 53 bytes x 8 bits/byte / Output link rate in bps

If measurements of cell inter-arrival times are accurate, there is no need for any correction of the frame output time (FOLO) expression due to the monitor overhead.

Thus, to calculate MIMO latency when the input link rate is greater than or equal to the output link rate, it is necessary to measure the first cell's transfer delay and the inter-arrival time between the first cell and the last cell of a frame.

[Figure A.13]
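The two cell-level procedures of Section A.3 can be summarized in a short Python sketch. It is illustrative only and not part of the baseline text. The function and parameter names are hypothetical, all times are assumed to be in seconds and rates in bps, and NFOT is taken as an input because its calculation (Section 3.2.1) is not reproduced here.

```python
CELL_BITS = 53 * 8  # one ATM cell: 53 bytes of 8 bits


def cell_transmit_time(link_rate_bps):
    """Time to transmit one cell over a link of the given rate."""
    return CELL_BITS / link_rate_bps


def monitor_overhead(loopback_ctd, link_rate_bps):
    """Measured closed-loop CTD minus the theoretical cell transmit time."""
    return loopback_ctd - cell_transmit_time(link_rate_bps)


def mimo_latency_from_cells(input_rate, output_rate, overhead,
                            first_cell_ctd, last_cell_ctd,
                            first_to_last_interarrival, nfot):
    """MIMO latency from cell-level monitor data (hypothetical helper)."""
    if input_rate <= output_rate:
        # MIMO latency = LILO latency, corrected for monitor overhead.
        return last_cell_ctd - (cell_transmit_time(input_rate) + overhead)
    # Input rate > output rate: MIMO latency = FILO latency - NFOT.
    out_time = cell_transmit_time(output_rate)
    fifo = first_cell_ctd - (out_time + overhead)
    frame_output_time = first_to_last_interarrival + out_time
    filo = fifo + frame_output_time
    return filo - nfot


# Assumed example values, chosen only to exercise both branches.
slow_to_fast = mimo_latency_from_cells(
    input_rate=424e6, output_rate=424e6, overhead=0.5e-6,
    first_cell_ctd=0.0, last_cell_ctd=11e-6,
    first_to_last_interarrival=0.0, nfot=0.0)

fast_to_slow = mimo_latency_from_cells(
    input_rate=424e6, output_rate=212e6, overhead=0.5e-6,
    first_cell_ctd=5e-6, last_cell_ctd=0.0,
    first_to_last_interarrival=12e-6, nfot=14e-6)

print(slow_to_fast, fast_to_slow)
```

When the two link rates are equal, both expressions of Section A.3 apply. The sketch uses the LILO-based one in that case because it needs only the last cell's transfer delay.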