*************************************************************************
        
        ATM Forum Document Number: ATM_Forum/97-0426.
        
        ************************************************************************
        
        Title: Proposed modifications to Performance Testing Baseline: 
        Throughput and Latency Metrics 
        
        ************************************************************************ 
        
        Abstract: This revised text of the baseline includes better 
        descriptions of test configurations and measurement procedures for 
        throughput and latency sections of the baseline documents.  New 
        text for Appendix A on MIMO latency is also included.  
        
        ************************************************************************
        
        Source: 
        
        Gojko Babic, Arjan Durresi, Raj Jain, Justin Dolske, Shabbir Shahpurwala
        The Ohio State University 
        Department of CIS Columbus, OH 43210-1277 
        Phone: 614-292-3989, Fax: 614-292-2911, Email: Jain@ACM.Org
        
        The presentation of this contribution at the ATM Forum is sponsored by NASA.
        
        ************************************************************************
        
        Date: April 1997
        ************************************************************************
        
        Distribution: ATM Forum Technical Working Group Members (AF-TEST, AF-TM)

        ************************************************************************        
        
        Notice:
        This contribution has been prepared to assist the ATM Forum. It is 
        offered to the Forum as a basis for discussion and is not a binding 
        proposal on the part of any of the contributing organizations.  The 
        statements are subject to change in form and content after further 
        study.  Specifically, the contributors reserve the right to add to, 
        amend or modify the statements contained herein.  
        
        ************************************************************************ 
        
        A postscript version of this contribution with several essential 
        figures has been uploaded to the ATM Forum server incoming 
        directory.  It will shortly be moved to the appropriate 
        contributions directory.  It is also available on our web page: 
        
        ftp://netlab.wustl.edu/pub/jain/atmf/atm97-0426.ps 
        (Postscript) 
        
        and 
        
        ftp://netlab.wustl.edu/pub/jain/atmf/atm97-0426.zip (PKzipped 
        Postscript) 
        
        
        An earlier version of this contribution was presented at the 
        February 1997 meeting of the ATM Forum. This re 
        
        
        
        3.1. Throughput 
        
        
        
        
        
        3.1.1. Definitions  
        
        
        
        There are three frame-level throughput metrics that are of interest 
        to a user: 
        
        * Lossless throughput - It is the maximum rate at which none of the 
        offered frames is dropped by the SUT.  
        
        * Peak throughput - It is the maximum rate at which the SUT 
        operates regardless of frames dropped.  The maximum rate can 
        actually occur when the loss is not zero.  
        
        * Full-load throughput - It is the rate at which the SUT operates 
        when the input links are loaded at 100% of their capacity.  
        
        
        A model graph of throughput vs.  input rate is shown in Figure 3.1. 
        Level X defines the lossless throughput, level Y defines the peak 
        throughput and level Z defines the full-load throughput.  
        
        
        
        
        
        
        [Figure 3.1: Peak, lossless and full-load throughput] 
        
        
        
        The lossless throughput is the highest load at which the count of 
        the output frames equals the count of the input frames.  The peak 
        throughput is the maximum throughput that can be achieved in spite 
        of the losses.  The full-load throughput is the throughput of the 
        system at 100% load on input links.  Note that the peak throughput 
        may equal the lossless throughput in some cases.  
        
        
        Only frames that are received completely without errors are 
        included in frame-level throughput computation.  Partial frames and 
        frames with CRC errors are not included.  
        
        
        
        
        3.1.2. Units 
        
        
        
        Throughput should be expressed in the effective bits/sec, counting 
        only bits from frames excluding the overhead introduced by the ATM 
        technology and transmission systems.  
        
        
        This is preferred over specifying it in frames/sec or cells/sec.  
        Frames/sec requires specifying the frame size.  The throughput 
        values in frames/sec at various frame sizes cannot be compared 
        without first being converted into bits/sec.  Cells/sec is not a 
        good unit for frame-level performance since the cells aren't seen 
        by the user.  
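
        As a minimal illustration of why bits/sec is the comparable unit, 
        the sketch below converts frames/sec measured at two frame sizes 
        into effective bits/sec.  The function name and the sample rates 
        are illustrative assumptions, not values from this baseline.  

```python
def effective_bps(frames_per_sec: float, frame_size_bytes: int) -> float:
    """Effective throughput in bits/sec, counting only bits from frames."""
    return frames_per_sec * frame_size_bytes * 8


# Two hypothetical measurements that cannot be compared in frames/sec.
small = effective_bps(frames_per_sec=100_000, frame_size_bytes=64)
large = effective_bps(frames_per_sec=5_000, frame_size_bytes=1518)
print(small, large)
```

        Although the first measurement has twenty times the frame rate, 
        the second carries more effective bits per second.  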
        
        
        
        
        3.1.3. Statistical Variations 
        
        
        
        There is no need for obtaining more than one sample for any of the 
        three frame-level throughput metrics.  Consequently, there is no 
        need for calculation of the means and/or standard deviations of 
        throughputs.  
        
        
        
        
        3.1.4. Measurement Procedures 
        
        
        
        Before starting measurements, a number of VCCs (or VPCs), 
        henceforth referred to as "foreground VCCs", are established 
        through the SUT. Foreground VCCs are used to transfer only the 
        traffic whose performance is measured.  That traffic is referred to 
        as the foreground traffic.  Characteristics of the foreground 
        traffic are specified in Section 3.1.5.  
        
        
        The tests can be conducted under two conditions: 
        
        * without background traffic; 
        
        * with background traffic; 
        
        
        
        Procedure without background traffic 
        
        
        
        The procedure to measure throughput in this case includes a number 
        of test runs.  A test run starts with the traffic being sent at a 
        given input rate over the foreground VCCs with early packet discard 
        disabled (if this feature is available in the SUT and can be turned 
        off).  The average cell transfer delay is constantly monitored.  A 
        test run ends and the foreground traffic is stopped when the 
        average cell transfer delay has not significantly changed (not more 
        than 5%) during a period of at least 5 minutes.  
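
        The end-of-run criterion can be sketched as follows.  This is one 
        possible reading, assuming that "not more than 5%" is measured 
        relative to the first average-delay sample of the most recent 
        5-minute window; the function name, the sample format and that 
        interpretation are assumptions.  

```python
def delay_is_stable(samples, window_s=300.0, tolerance=0.05):
    """samples: list of (time_s, avg_ctd) pairs in increasing time order.

    Returns True when the samples span at least `window_s` seconds and every
    average-CTD value in the most recent `window_s` seconds lies within
    `tolerance` (relative) of the first value in that window.
    """
    if not samples or samples[-1][0] - samples[0][0] < window_s:
        return False
    end = samples[-1][0]
    window = [ctd for t, ctd in samples if t >= end - window_s]
    ref = window[0]
    return all(abs(ctd - ref) <= tolerance * ref for ctd in window)


# Hypothetical average-CTD readings taken once per minute.
steady = [(0, 10.0), (60, 10.1), (120, 10.2), (180, 10.1), (240, 10.0), (300, 10.3)]
drifting = steady[:-1] + [(300, 12.0)]
```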
        
        
        During the test run period, the total number of frames sent to the 
        SUT and the total number of frames received from the SUT are 
        recorded.  The throughput (output rate) is computed based on the 
        duration of a test run and the number of received frames.  
        
        
        If the input frame count and the output frame count are the same 
        then the input rate is increased and the test is conducted again.  
        
        
        The lossless throughput is the highest throughput at which the 
        count of the output frames equals the count of the input frames.  
        
        
        The input rate is then increased even further (with early packet 
        discard enabled, if available).  Although some frames will be lost, 
        the throughput may increase until it reaches the peak throughput 
        value.  After this point, any further increase in the input rate 
        will result in a decrease in the throughput.  
        
        
        The input rate is finally increased to 100% of the link input rates 
        and the full-load throughput is recorded.  
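
        The procedure above can be summarized in a short sketch that 
        derives the three metrics from recorded test runs.  The record 
        layout, the function name and the sample figures are hypothetical; 
        only the rules (equal frame counts for lossless, maximum over all 
        runs for peak, the 100% run for full-load) come from the text.  

```python
from dataclasses import dataclass


@dataclass
class Run:
    input_rate_bps: int     # offered load, effective bits/sec
    frames_sent: int
    frames_received: int    # complete, error-free frames only
    frame_size_bytes: int
    duration_s: float

    @property
    def throughput_bps(self) -> float:
        # Output rate from the received frame count and the run duration.
        return self.frames_received * self.frame_size_bytes * 8 / self.duration_s


def summarize(runs, full_load_bps):
    """Return (lossless, peak, full-load) throughput in bits/sec."""
    lossless = max((r.throughput_bps for r in runs
                    if r.frames_received == r.frames_sent), default=0.0)
    peak = max(r.throughput_bps for r in runs)
    full = next(r.throughput_bps for r in runs
                if r.input_rate_bps == full_load_bps)
    return lossless, peak, full


# Hypothetical 300-second runs with 1000-byte frames.
runs = [
    Run(40_000_000, 1_500_000, 1_500_000, 1000, 300),
    Run(80_000_000, 3_000_000, 3_000_000, 1000, 300),
    Run(100_000_000, 3_750_000, 3_600_000, 1000, 300),
    Run(120_000_000, 4_500_000, 3_300_000, 1000, 300),
]
lossless, peak, full = summarize(runs, full_load_bps=120_000_000)
```

        In this made-up data set the peak throughput occurs at a load where 
        frames are already being lost, and the full-load throughput is 
        lower than the peak.  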
        
        
        Procedure with background traffic 
        
        
        
        Measurements of throughput with background traffic are under 
        study.  
        
        
        
        
        
        3.1.5. Foreground Traffic 
        
        
        
        Foreground traffic is specified by the type of foreground VCCs, 
        connection configuration, service class, arrival patterns, frame 
        length and input rate.  
        
        
        Foreground VCCs can be permanent or switched, virtual path or 
        virtual channel connections, established between ports on the same 
        network module on the switch, or between ports on different network 
        modules, or between ports on different switching fabrics.  
        
        
        A system with n ports can be tested for the following connection 
        configurations: 
        

        * n-to-n straight, 
        
        * n-to-(n-1) full cross,  
        
        * n-to-m partial cross, 1 <= m <= n-1,  
        
        * k-to-1, 1<k<n, 
        
        * 1-to-(n-1).  
        
        
        
        Different connection configurations are illustrated in Figure 3.2, 
        where each configuration includes one ATM switch with four ports, 
        with their input components shown on the left and their output 
        components shown on the right.  
        
        
        In the case of n-to-n straight, input from each port exits on 
        exactly one other port.  This represents almost no path 
        interference among the foreground VCCs. There are n foreground 
        VCCs. See Figure 3.2a.  
        
        
        In the case of n-to-(n-1) full cross, input from each port is 
        divided equally to exit on each of the other (n-1) ports.  This 
        represents intense competition for the switching fabric by the 
        foreground VCCs. There are n x (n-1) foreground VCCs. See Figure 
        3.2b.  
        
        
        In the case of n-to-m partial cross, input from each port is 
        divided equally to exit on m other ports (1 <= m <= n-1).  This 
        represents partial competition for the switching fabric by the 
        foreground VCCs. There are n x m foreground VCCs as shown in Figure 
        3.2c. Note that n-to-n straight and n-to-(n-1) full cross are 
        special cases of n-to-m partial cross with m=1 and m=n-1, 
        respectively.  
        
        
        In the case of k-to-1, input from k (1 < k < n) ports is destined 
        to one output port.  This stresses the output port logic.  There 
        are k foreground VCCs as shown in Figure 3.2d.  
        
        
        In the case of 1-to-(n-1), all foreground frames input on the one 
        designated port are multicast to all other (n-1) ports.  This tests 
        the multicast performance of the switch.  There is only one 
        (multicast) foreground VCC as shown in Figure 3.2e.  
        
        
        Use of the 1-to-(n-1) connection configuration for the foreground 
        traffic is under study.  
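
        The VCC counts of the unicast configurations can be checked with a 
        small sketch.  The text fixes only how many output ports each input 
        uses, not which ones; the cyclic choice of output ports below, the 
        default output port for k-to-1 and the function names are 
        assumptions made for illustration.  

```python
def partial_cross_vccs(n: int, m: int):
    """List (input_port, output_port) pairs for an n-to-m partial cross.

    Assumption: input port i exits on the next m ports in cyclic order.
    m = 1 gives n-to-n straight and m = n-1 gives n-to-(n-1) full cross.
    """
    if not 1 <= m <= n - 1:
        raise ValueError("m must satisfy 1 <= m <= n-1")
    return [(i, (i + j) % n) for i in range(n) for j in range(1, m + 1)]


def k_to_1_vccs(n: int, k: int, out_port: int = 0):
    """List (input_port, output_port) pairs for k inputs sharing one output."""
    if not 1 < k < n:
        raise ValueError("k must satisfy 1 < k < n")
    inputs = [p for p in range(n) if p != out_port][:k]
    return [(p, out_port) for p in inputs]


straight = partial_cross_vccs(4, 1)      # n VCCs
full_cross = partial_cross_vccs(4, 3)    # n x (n-1) VCCs
three_to_one = k_to_1_vccs(4, 3)         # k VCCs
```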
        
        
        The following service classes, arrival patterns and frame lengths 
        for foreground traffic are used for testing: 
        
        * UBR service class: Traffic consists of equally spaced frames of 
        fixed length.  Measurements are performed at AAL payload size of 64 
        B, 1518 B, 9188 B and 64 kB.  Variable length frames and other 
        arrival patterns (e.g. self-similar) are under study.  
        
        * ABR service class is under study.  
        
        The required input rate of foreground traffic is obtained by 
        loading each link by the same fraction of its input rate.  In this 
        way, the input rate of foreground traffic can also be referred to 
        as a fraction (percentage) of input link rates.  The maximum 
        foreground load (MFL) is defined as the sum of rates of all links 
        in the maximum possible switch configuration.  Input rate of the 
        foreground traffic is expressed in the effective bits/sec, counting 
        only bits from frames, excluding the overhead introduced by  the 
        ATM technology and transmission systems.  
        
        
        3.1.6. Background Traffic 
        
        
        
        Higher priority traffic (like VBR or CBR) can act as background 
        traffic for experiments.  Further details of measurements with 
        background traffic using multiple service classes simultaneously 
        are under study.  Until then, all testing will be done without any 
        background traffic.  
        
        
        
        
        3.1.7. Guidelines For Scaleable Test Configurations 
        
        
        
        It is obvious that testing larger systems, e.g., switches with 
        a larger number of ports, could require very extensive (and 
        expensive) measurement equipment.  Hence, we introduce scaleable 
        test configurations for throughput measurements that require only 
        one ATM monitor with one generator/analyzer pair.  Figure 3.3 
        presents a simple test configuration for an ATM switch with eight 
        ports in an 8-to-8 straight connection configuration.  Figure 3.4 
        presents a test configuration with the same switch in an 8-to-2 
        partial cross connection configuration.  The former configuration 
        emulates 8 foreground VCCs, while the latter emulates 16 foreground 
        VCCs.  
        
        
        
        In both test configurations, there is one link between the ATM 
        monitor and the switch.  The other seven ports have external 
        loopbacks.  A loopback on a given port causes the frames 
        transmitted over the output of the port to be received by the input 
        of the same port.  
        
        
        The test configurations in Figure 3.3 and Figure 3.4 assume two 
        network modules in the switch, with switch ports P0-P3 in one 
        network module and switch ports P4-P7 in the other network 
        module.  Foreground VCCs are always established from a port in one 
        network module to a port in the other network module.  These 
        connection configurations could be more demanding on the SUT than 
        the cases where each VCC uses ports in the same network module.  An 
        even more demanding case could be when foreground VCCs use 
        different fabrics of a multi-fabric switch.  
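
        One way to picture a loopback configuration for the 8-to-8 straight 
        case is as a chain of VCCs that alternates between the two network 
        modules.  The sketch below builds such a chain.  The port ordering 
        is an assumption and is not necessarily the one drawn in Figure 
        3.3; the function name is hypothetical.  

```python
def straight_loopback_chain(ports_a, ports_b):
    """Chain n-to-n straight VCCs through two equally sized network modules.

    ports_a[0] is assumed to be the port linked to the ATM monitor; every
    other port is assumed to have an external loopback, so the output of one
    VCC re-enters the switch as the input of the next VCC.
    """
    # Interleave the modules so that consecutive ports are in different modules.
    order = [port for pair in zip(ports_a, ports_b) for port in pair]
    return [(order[i], order[(i + 1) % len(order)]) for i in range(len(order))]


module_a = ["P0", "P1", "P2", "P3"]
module_b = ["P4", "P5", "P6", "P7"]
vccs = straight_loopback_chain(module_a, module_b)
for src, dst in vccs:
    print(f"{src} -> {dst}")
```

        Each port appears once as a VCC input and once as a VCC output, and 
        the last VCC returns the traffic to the monitor port.  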
        
        
        Approaches similar to those in Figure 3.3 and Figure 3.4 can be 
        used for n-to-(n-1) full cross and other types of n-to-m partial 
        cross connection configurations, as well as for larger switches.  
        Guidelines to set up scaleable test configurations for the k-to-1 
        connection configuration are under study.  
        
        
        It should be noted that in the proposed test configurations, 
        because of loopbacks, only permanent VCCs or VPCs can be 
        established.  
        
        
        It should also be realized that in the test configurations with 
        loopbacks, if the link rates are not all identical, it is not 
        possible to generate foreground traffic equal to the MFL. The 
        maximum foreground traffic load for an n-port switch in those cases 
        equals n x (lowest link rate).  Only when all link rates are 
        identical is it possible to reach the MFL level.  If the link rates 
        are not all identical and the MFL level needs to be reached, it is 
        necessary to have more than one generator/analyzer pair.  
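
        The gap between the MFL and the load reachable with loopbacks can 
        be illustrated with a short sketch.  The function names and the 
        port rates are hypothetical.  

```python
def max_foreground_load(link_rates_bps):
    """MFL: sum of the rates of all links."""
    return sum(link_rates_bps)


def max_loopback_load(link_rates_bps):
    """Highest foreground load reachable when traffic is chained through
    loopbacks: every link is limited to the lowest link rate."""
    return len(link_rates_bps) * min(link_rates_bps)


# Hypothetical 8-port switch: six slower ports and two faster ports.
rates = [100_000_000] * 6 + [400_000_000] * 2
mfl = max_foreground_load(rates)
reachable = max_loopback_load(rates)
```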
        
        
        
        
        3.1.8. Reporting results 
        
        
        
        Results should include a detailed description of the SUT, such as 
        the number of ports, rate of each port, number of ports per network 
        module, number of network modules, number of network modules per 
        fabric, number of fabrics, maximum foreground load (MFL), software 
        version, and any other relevant information.  
        
        
        Values for the lossless throughput, the peak throughput with 
        corresponding input load, and the full-load throughput with 
        corresponding input load (if different from MFL) are reported along 
        with foreground (and background, if any) traffic characteristics.  
        
        
        The foreground traffic characteristics and their possible values 
        are listed below: 
        
        * type of foreground VCCs: permanent virtual path connections, 
        switched virtual path connections, permanent virtual channel 
        connections, switched virtual channel connections; 
        
        * foreground VCCs established: between ports inside a network 
        module, between ports on different network modules, between ports 
        on different fabrics, some combination of previous cases; 
        
        * connection configuration: n-to-n straight, n-to-(n-1) full cross, 
        n-to-m partial cross with m = 2, 3, 4, ..., n-1, k-to-1 with k = 2, 
        3, 4, 5, 6, ...; 
        
        * service class: UBR, ABR; 
        
        * arrival patterns: equally spaced frames, self-similar, random; 
        
        * frame length: 64 B, 1518 B, 9188 B or 64 kB, variable; 
        
        
        
        Values in bold indicate traffic characteristics for which 
        measurement tests must be performed and for which throughput values 
        must be reported.  
        
        
        
        
        3.2. Frame Latency 
        
        
        
        
        
        3.2.1. Definition 
        
        
        
        MIMO latency (Message-In Message-Out) is a general definition of 
        the latency that applies to an ATM switch or a group of ATM 
        switches.  It is defined as follows: 
        
        
        MIMO latency = min {LILO latency, FILO latency - NFOT} 
        
        
        
        where: 
        
        * LILO latency = Time between the last-bit entry and the last-bit 
        exit 
        
        * FILO latency = Time between the first-bit entry and the last-bit 
        exit  
        
        * NFOT = Nominal frame output time 
        
        
        
        The nominal frame output time is defined as: 
        
        
        
        NFOT = Frame input time x Input link rate / Output link rate  
        
        where: 
        
        * Frame input time = Time between the first-bit entry and the 
        last-bit entry  
        
        
        
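        The following is an equivalent definition for MIMO latency: 
        
        
        MIMO latency = LILO latency,         if input link rate <= output link rate 
        
        MIMO latency = FILO latency - NFOT,  if input link rate >= output link rate 
        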
        
        It should be noted that when the input link rate is equal to the 
        output link rate:  

        
        
        MIMO latency = LILO latency = FILO latency - NFOT 
        
        
        
        The MIMO latency is a general definition that applies even when the 
        frames are discontinuous at the input and/or output or when the 
        input and output rates are different.  
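
        The definition can be exercised with a minimal sketch.  The 
        function names and the sample figures (arbitrary time and rate 
        units) are illustrative assumptions.  

```python
def nominal_frame_output_time(frame_input_time, in_rate, out_rate):
    """NFOT = Frame input time x Input link rate / Output link rate."""
    return frame_input_time * in_rate / out_rate


def mimo_latency(lilo, filo, frame_input_time, in_rate, out_rate):
    """MIMO latency = min {LILO latency, FILO latency - NFOT}."""
    nfot = nominal_frame_output_time(frame_input_time, in_rate, out_rate)
    return min(lilo, filo - nfot)


# Hypothetical frame with a frame input time of 100 time units.
equal_rates = mimo_latency(lilo=30.0, filo=130.0, frame_input_time=100.0,
                           in_rate=1.0, out_rate=1.0)
slow_output = mimo_latency(lilo=130.0, filo=230.0, frame_input_time=100.0,
                           in_rate=2.0, out_rate=1.0)
fast_output = mimo_latency(lilo=30.0, filo=130.0, frame_input_time=100.0,
                           in_rate=1.0, out_rate=2.0)
```

        In the slow-output case the LILO latency is inflated by the longer 
        output time of the frame, and the FILO latency - NFOT term removes 
        that effect.  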
        
        
        To measure MIMO latency for a given frame, the times of occurrence 
        of the following three events need to be recorded: 
        
        * First-bit of the frame enters into the SUT, 
        
        * Last-bit of the frame enters into the SUT, 
        
        * Last-bit of the frame exits from the SUT.  
        
        
        
        The time between the first and the third events is FILO latency 
        and the time between the second and third events is LILO latency.  
        The time between the first and the second events is the frame 
        input time.  
        
        
        NFOT can be calculated from the cell pattern of the test frame on 
        input and the rates of the input and output links.  The cell 
        pattern comprises the number of cells of the test frame and, 
        between the first cell and the last cell during input transmission 
        of the test frame, the duration of idle intervals, if any, and/or 
        the number of cells from other frames, if any.  Note that for 
        contiguous frames on input: 
        
        
        Frame input time = Frame Size / Input link rate 
        
        
        
        and then it follows: 
        
        
        
        NFOT = Frame Size / Output link rate 
        
        
        
        Substituting LILO latency, FILO latency and NFOT in the MIMO 
        latency formula would give the frame latency of the SUT.  
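
        For a non-contiguous frame, NFOT can be sketched from the cell 
        pattern as follows.  The sketch assumes that the input link 
        carries fixed-length cell slots and that idle intervals are whole 
        slots; the pattern encoding, the function names and the numbers 
        are hypothetical.  

```python
def frame_input_time(pattern: str, cell_time_in: float) -> float:
    """Time between first-bit entry and last-bit entry of the test frame.

    pattern: one character per cell slot on the input link, where 'F' is a
    cell of the test frame and any other character is an idle slot or a cell
    of another frame.
    """
    first = pattern.index("F")
    last = pattern.rindex("F")
    return (last - first + 1) * cell_time_in


def nfot_from_pattern(pattern, cell_time_in, in_rate, out_rate):
    """NFOT = Frame input time x Input link rate / Output link rate."""
    return frame_input_time(pattern, cell_time_in) * in_rate / out_rate


contiguous = frame_input_time("FFF", cell_time_in=4.0)
interleaved = frame_input_time("F.F.F", cell_time_in=4.0)
nfot = nfot_from_pattern("F.F.F", cell_time_in=4.0, in_rate=2.0, out_rate=1.0)
```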
        
        
        Appendix A (Section A.2.) presents an explanation of MIMO latency 
        and its justification.  
        
        
        
        
        3.2.2. Frame Delay and Cell Level Data 
        
        
        
        Contemporary ATM monitors provide measurement data only at the cell 
        level, e.g., cell transfer delay (CTD) and cell inter-arrival 
        time.  This data is sufficient to calculate MIMO frame latency as 
        follows.  
        
        
        If the input link rate is less than or equal to the output link 
        rate, then: 
        
        
        
        MIMO latency = 
        Last cell's transfer delay - (Last cell's input transmit time + Monitor overhead)
        
        
        where: 
        
        * the cell transfer delay is the time from the moment a cell begins 
        leaving the ATM test system until it has finished arriving back at 
        the ATM test system, i.e., the time between the first bit out and 
        the last bit in; 
        
        * the cell input transmit time is the time to transmit one cell 
        into the input link.  It can be easily calculated; 
        
        * the monitor overhead is the overhead introduced by the ATM 
        monitor when measuring CTD and it is usually non-zero.  It can be 
        calculated as the difference between the cell transfer delay 
        measured with a closed loop on the ATM monitor and the theoretical 
        value of the cell transmit time plus any propagation delay.  
        
        
        Thus, to calculate MIMO latency when the input link rate is less 
        than or equal to the output link rate, it is sufficient to measure 
        the transfer delay of the last cell of a frame.  
        
        
        If the input link rate is greater than or equal to the output link 
        rate, then:  
        
        
        
        MIMO latency = FIFO latency + Frame output time - NFOT 
        
        
        
        where: 
        
        * FIFO latency 
          = Time between the first-bit entry and the first-bit exit 
          = First cell's transfer delay - (First cell's output transmit 
            time + Monitor overhead); 
        
        * Frame output time 
          = Time between the first-bit exit and the last-bit exit 
          = First cell to last cell inter-arrival time + Last cell's 
            output transmit time; 
        
        * the cell output transmit time is the time to transmit one cell 
        into the output link.  It can be easily calculated; 
        
        * the cell inter-arrival time is the time between arrival of the 
        first bit of the first cell and the first bit of the last cell.  
        
        
        Thus, to calculate MIMO latency when the input link rate is greater 
        than or equal to the output link rate, it is necessary to measure 
        the first cell transfer delay and the inter-arrival time between 
        the first cell and the last cell of a frame.  
        
        
        Appendix A (Section A.3.) presents derivations of expressions for 
        MIMO latency calculation based on cell level data.  
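
        The two cell-level expressions can be written as a pair of small 
        functions.  This is a sketch only: the function names are 
        hypothetical and the numbers describe a made-up contiguous 3-cell 
        frame on equal-rate links, where both expressions should agree.  

```python
def mimo_from_last_cell(last_cell_ctd, cell_time_in, monitor_overhead):
    """Case: input link rate <= output link rate."""
    return last_cell_ctd - (cell_time_in + monitor_overhead)


def mimo_from_first_cell(first_cell_ctd, first_to_last_interarrival,
                         cell_time_out, monitor_overhead, nfot):
    """Case: input link rate >= output link rate."""
    fifo = first_cell_ctd - (cell_time_out + monitor_overhead)
    frame_output_time = first_to_last_interarrival + cell_time_out
    return fifo + frame_output_time - nfot


# Made-up example: cell time 4 on both links, monitor overhead 1,
# 3 contiguous cells, so NFOT = frame input time = 12.
via_last = mimo_from_last_cell(last_cell_ctd=15.0, cell_time_in=4.0,
                               monitor_overhead=1.0)
via_first = mimo_from_first_cell(first_cell_ctd=15.0,
                                 first_to_last_interarrival=8.0,
                                 cell_time_out=4.0, monitor_overhead=1.0,
                                 nfot=12.0)
```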
        
        
        
        
        3.2.3. Units 
        
        
        
        The latency should be specified in microseconds.  
        
        
        
        
        
        3.2.4. Statistical Variations  
        
        
        
        For the given foreground traffic and background traffic, the 
        times and/or delays needed for MIMO latency calculation are 
        recorded for p frames, according to the procedures described in 
        Section 3.2.5. Here p is a parameter and its default (and minimum) 
        value is 100.  
        
        
        Let Mi be the MIMO latency of the ith frame.  Note that MIMO 
        latency is considered to be infinite for lost or corrupted frames.  
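        The mean and standard error of the measurement are computed as 
        follows: 
        
        
        Mean = (M1 + M2 + ... + Mp) / p 
        
        Standard deviation = sqrt( Sum over i of (Mi - Mean)^2 / (p - 1) ) 
        
        Standard error = Standard deviation / sqrt(p) 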
        
        
        
        
        Given the mean and the standard error, the users can compute a 
        100(1-a)-percent confidence interval as follows: 
        
        
        100(1-a)-percent confidence interval = 
                 (mean - z x standard error, mean + z x standard error)
        
        
        Here, z is the (1-a/2)-quantile of the unit normal variate.  For 
        commonly used confidence levels, the quantile values are as 
        follows: 
        
        Confidence level    a       z
        90%                 0.10    1.645
        95%                 0.05    1.960
        99%                 0.01    2.576
        
        
        
        The value of p can be chosen differently from its default value to 
        obtain the desired confidence level.  
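
        The computation can be sketched with the Python standard library.  
        The sketch assumes that no frame in the sample was lost or 
        corrupted (all latencies are finite) and uses the sample standard 
        deviation with p-1 in the denominator; the function name and the 
        latency values are hypothetical.  

```python
import math
from statistics import NormalDist, mean, stdev


def latency_confidence_interval(latencies, confidence=0.95):
    """Return (mean, standard_error, (low, high)) for per-frame MIMO latencies."""
    p = len(latencies)
    m = mean(latencies)
    std_err = stdev(latencies) / math.sqrt(p)
    # z is the (1 - a/2)-quantile of the unit normal variate, a = 1 - confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return m, std_err, (m - z * std_err, m + z * std_err)


# Hypothetical latencies for p = 100 frames.
sample = [10.0, 12.0, 11.0, 13.0] * 25
m, std_err, (low, high) = latency_confidence_interval(sample)
```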
        
        
        
        
        3.2.5. Measurement Procedures 
        
        
        
        For MIMO latency measurements, it is first necessary to establish 
        one VCC (or VPC) used only by foreground traffic, and a number of 
        VCCs or VPCs used only by background traffic.  Then, the background 
        traffic is generated.  Characteristics of the background traffic 
        are described in Section 3.2.7. When the flow of the background 
        traffic has been established, the foreground traffic is generated.  
        Characteristics of the foreground traffic are specified in Section 
        3.2.6. After the steady state flow of foreground traffic has been 
        reached, the times and/or delays needed for MIMO latency 
        calculation are recorded for p consecutive frames from the 
        foreground traffic, while the flow of background traffic continues 
        uninterrupted.  The entire procedure is referred to as one 
        measurement run.  
        
        
        
        
        
        
        
        
        3.2.6. Foreground traffic 
        
        
        
        MIMO latency depends upon several characteristics of foreground 
        traffic.  These include the type of foreground VCC, service class, 
        arrival patterns, frame length, and input rate.  
        
        
        The foreground VCC can be a permanent or switched, virtual path or 
        virtual channel connection, established between ports on the same 
        network module of the switch, or between ports on different network 
        modules, or between ports on different switching fabrics.  
        
        
        For the UBR service class, the foreground traffic consists of 
        equally spaced frames of fixed length.  Measurements are performed 
        on AAL payload sizes of 64 B, 1518 B, 9188 B and 64 kB.  Variable 
        length frames and other arrival patterns (e.g. self-similar) are 
        under study.  ABR service class is also under study.  
        
        
        Input rate of foreground traffic is expressed in the effective 
        bits/sec, counting only bits from AAL payload excluding the 
        overhead introduced by the ATM technology and transmission 
        systems.  
        
        
        The first measurement run is performed at the lowest possible 
        foreground input rate (for the given test equipment).  For later 
        measurement runs, the foreground load is increased up to the point 
        when losses in the traffic occur or up to the full foreground load 
        (FFL). FFL is equal to the lesser of the input and the output link 
        rates used by the foreground VCC. Suggested input rates for the 
        foreground traffic are: 0.5, 0.75, 0.875, 0.9375, 0.96875, ..., 
        i.e., 1 - 2^(-k), k = 1, 2, 3, 4, 5, ..., of FFL. 
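
        The suggested load steps can be generated as in the following 
        sketch; the function names and the link rates are hypothetical.  

```python
def full_foreground_load(input_link_rate, output_link_rate):
    """FFL: the lesser of the input and output link rates of the foreground VCC."""
    return min(input_link_rate, output_link_rate)


def suggested_input_rates(ffl, steps):
    """Input rates (1 - 2^(-k)) x FFL for k = 1, 2, ..., steps."""
    return [(1 - 2 ** -k) * ffl for k in range(1, steps + 1)]


ffl = full_foreground_load(input_link_rate=100.0, output_link_rate=400.0)
rates = suggested_input_rates(ffl, steps=5)
print(rates)
```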
        
        3.2.7. Background Traffic 
        
        
        
        Background traffic characteristics that affect frame latency are 
        the type of background VCCs, connection configuration, service 
        class, arrival patterns (if applicable), frame length (if 
        applicable) and input rate.  
        
        
        Like the foreground VCC, background VCCs can be permanent or 
        switched, virtual path or channel connections, established between 
        ports on the same network module on the switch, or between ports on 
        different network modules, or between ports on different switching 
        fabrics.  To avoid interference on the traffic generator/analyzer 
        equipment, background VCCs are established in such a way that they 
        do not use the input link or the output link of the foreground VCC 
        in the same direction.  
        
        
        For a SUT with w ports, the background traffic can use (w-2) ports, 
        not used by the foreground traffic, for both input and output.  The 
        port with the input link of the foreground traffic can be used as 
        an output port for the background traffic.  Similarly, the output 
        port of the foreground traffic can be used as an input port for the 
        background traffic.  Overall, background traffic can use an 
        equivalent of n=w-1 ports.  The maximum background load (MBL) is 
        defined as the sum of rates of all links, except the one used as 
        the input link for the foreground traffic, in the maximum possible 
        switch configuration.  
        
        
        A SUT with w (=n+1) ports is measured for the following background 
        traffic connection configurations: 
        
        * n-to-n straight, with n background VCCs (Figure 3.2a); 
        
        * n-to-(n-1) full cross, with n x (n-1) background VCCs (Figure 
        3.2b); 
        
        * n-to-m partial cross, 1 <= m <= n-1, with n x m background VCCs 
        (Figure 3.2c); 
        
        * 1-to-(n-1), with one (multicast) background VCC (Figure 3.2e).  
        
        
        
        Use of the 1-to-(n-1) connection configuration for the background 
        traffic is under study.  
        
        
        The following service classes, arrival patterns (if applicable) and 
        frame lengths (if applicable) are used for the background traffic: 
        
        * UBR service class: Traffic consists of equally spaced frames of 
        fixed length.  Measurements are performed at AAL payload size of 64 
        B, 1518 B, 9188 B and 64 kB.  This is a case of bursty background 
        traffic with priority equal to or lower than that of the foreground 
        traffic.  Variable length frames and other arrival patterns (e.g. 
        self-similar) are for further study.  
        
        * CBR service class: Traffic consists of a contiguous stream of 
        cells at a given rate.  This is a case of non-bursty background 
        traffic with priority higher than that of the foreground traffic.  
        
        * VBR and ABR service classes are under study.  
        
        
        
        Input rate of the background traffic is expressed in the effective 
        bits/sec, counting only bits from frames excluding the overhead 
        introduced by the ATM technology and transmission systems.  
        
        
        In the cases of n-to-n straight, n-to-(n-1) full cross and n-to-m 
        partial cross connection configurations, measurements are performed 
        at input rates of 0, 0.5, 0.75, 0.875, 0.9375, 0.96875, ... (i.e., 
        1 - 2^(-k), k = 0, 1, 2, 3, 4, 5, ...) of MBL. The required traffic 
        load is obtained by loading each input link by the same fraction of 
        its input rate.  In this way, the input rate of background traffic 
        can also be expressed as a fraction (percentage) of input link 
        rates.  
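
        A sketch of how the MBL and the per-link background loads might be 
        tabulated is given below.  The port names, the link rates and the 
        function name are hypothetical.  

```python
def background_load_plan(link_rates, fg_input_port, levels=6):
    """Return (MBL, plan) for the background traffic.

    link_rates: mapping of port name to link rate.  The input link of the
    foreground traffic is excluded.  plan is a list of
    (fraction, {port: load}) entries for fractions 1 - 2^(-k), k = 0, 1, ...
    """
    bg_links = {p: r for p, r in link_rates.items() if p != fg_input_port}
    mbl = sum(bg_links.values())
    plan = []
    for k in range(levels):
        fraction = 1 - 2 ** -k
        # Every background input link is loaded by the same fraction.
        plan.append((fraction, {p: fraction * r for p, r in bg_links.items()}))
    return mbl, plan


# Hypothetical 8-port switch (w = 8) with identical link rates.
link_rates = {f"P{i}": 100.0 for i in range(8)}
mbl, plan = background_load_plan(link_rates, fg_input_port="P0")
```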
        
        
        
        
        3.2.8. Guidelines For Scaleable Test Configurations 
        
        
        
        Scaleable test configurations for MIMO latency measurements require 
        only one ATM test system with two generator/analyzer pairs.  Figure 
        3.5 presents the test configuration with an ATM switch with eight 
        ports (w=8). There are two links between the ATM monitor and the 
        switch, and they are used in one direction by the background 
        traffic and in the other direction by the foreground traffic, as 
        indicated.  The other six (w-2) ports of the switch are used only 
        by the background traffic and they have external loopbacks.  A 
        loopback on a given port causes the frames transmitted over the 
        output of the port to be received by the input of the same port.  
        
        
        Figure 3.5: A scaleable test configuration for measurements of MIMO 
        latency using only two generator/analyzer pairs with an 8-port 
        switch and a 7-to-7 straight configuration for background traffic 
        
        
        Figure 3.5 shows a 7-to-7 straight connection configuration for the 
        background traffic.  The n-to-(n-1) full cross configuration and 
        the n-to-m partial cross configurations can also be similarly 
        implemented.  
        
        
        The test configuration shown assumes two network modules in the 
        switch with ports P0-P3 in one network module and ports P4-P7 in 
        the other network module.  Here, the foreground VCC and 
        background VCCs are established between ports in different network 
        modules.  
        
        
        It should be noted that in the proposed test configurations, 
        because of loopbacks, only permanent VCCs or VPCs can be 
        established.  
        
        
        It should also be realized that in test configurations, if all link 
        rates are not identical, it is not possible to generate background 
        traffic (without losses) equal to MBL. The maximum background 
        traffic input rate in those cases equals (n-1) x lowest link rate.  
        Only in the case where all link rates are identical is it possible 
        to obtain MBL level without losses in background traffic.  
        
        
        If the link rates are different, it is possible to obtain MBL in 
        the n-to-n straight case, but background traffic will have losses.  
        In this case, the foreground traffic should use the lowest rate 
        port in the switch as the input, while the highest rate port in the 
        switch should be used as the output.  The background traffic enters 
        the SUT through the highest rate port and passes successively 
        through ports of decreasing speeds.  At the end, the background 
        traffic exits the switch through the lowest rate port.  
        
        
        
        
        
        3.2.9. Reporting results 
        
        
        
        Reported results should include detailed description of the SUT, 
        such as the number of ports, rate of each port, number of ports per 
        network module, number of network modules, number of network 
        modules per fabric, number of fabrics, the software version and any 
        other relevant information.  
        
        
        Values of the mean and the standard error of MIMO latency are 
        reported along with values of foreground and background traffic 
        characteristics for each measurement run.  
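
        A minimal sketch of how the two reported statistics could be 
        computed from repeated latency samples is given below.  It assumes 
        the usual definition of standard error (sample standard deviation 
        divided by the square root of the number of samples); the function 
        name and the sample values are hypothetical.  

```python
import math
import statistics


def mean_and_standard_error(samples):
    """Return (mean, standard error of the mean) for latency samples.

    Assumption: standard error = sample standard deviation / sqrt(n).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("at least two samples are needed")
    mean = statistics.fmean(samples)
    std_err = statistics.stdev(samples) / math.sqrt(n)
    return mean, std_err


if __name__ == "__main__":
    # Hypothetical MIMO latency samples in microseconds.
    mean, std_err = mean_and_standard_error([10.0, 12.0, 11.0, 13.0])
    print(f"mean = {mean:.3f} us, standard error = {std_err:.3f} us")
```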
        
        
        The list of foreground and background traffic characteristics and 
        their possible values are now provided: 
        
        
        Foreground traffic: 
        
        * type of foreground VCC: permanent virtual path connection, 
        switched virtual path connection, permanent virtual channel 
        connection, switched virtual channel connection; 
        
        * foreground VCC established: between ports inside a network 
        module, between ports on different network modules, between ports 
        on different switching fabrics; 
        
        * service class: UBR, ABR; 
        
        * arrival patterns: equally spaced frames, self-similar, random; 
        
        * frame length: 64 B, 1518 B, 9188 B or 64 kB, variable; 
        
        * full foreground load (FFL); 
        
        * input rate: the lowest rate possible for the given test 
        equipment, and 0.5, 0.75, 0.875, 0.9375, 0.96875, ... (i.e., 1 - 
        2^(-k), k = 1, 2, 3, 4, 5, ...) of FFL.  
        
        
        Background traffic: 
        
        * type of background VCCs: permanent virtual path connections, 
        switched virtual path connections, permanent virtual channel 
        connections, switched virtual channel connections; 
        
        * background VCCs established: between ports inside a network 
        module, between ports on different network modules, between ports 
        on different switching fabrics, some combination of previous cases; 
        
        * connection configuration: n-to-n straight, n-to-(n-1) full cross, 
        n-to-m partial cross with m = 2, 3, 4, ..., n-1; 
        
        * service class: UBR, CBR, ABR, VBR; 
        
        * arrival patterns (when applicable): equally spaced frames, 
        self-similar, random; 
        
        * frame length (when applicable): 64 B, 1518 B, 9188 B, 64 kB, 
        variable; 
        
        * maximum background load (MBL); 
        
        * input rate: 0, 0.5, 0.75, 0.875, 0.9375, 0.96875, ... (i.e., 1 - 
        2^(-k), k = 0, 1, 2, 3, 4, 5, ...) of MBL.  
        
        
        Values in bold indicate traffic characteristics for which 
        measurement tests must be performed and for which MIMO latency 
        values must be reported.  
        
        
        Appendix A: MIMO Latency 
        
        A.1. Introduction 
        
        In the case of a single bit, the latency is generally defined as 
        the time between the instant the bit enters the system and the 
        instant the bit exits from the system.  For an illustration of the 
        single bit case, see Figure A.1.  
        
        
        [Figure A.1: Latency for a single bit]  
        
        
        
        For multi-bit frames, the usual way to define a frame latency is to 
        use one of the following four definitions: 
        
        * FIFO latency: Time between the first-bit entry and the first-bit 
        exit  
        
        * LILO latency: Time between the last-bit entry and the last-bit 
        exit  
        
        * FILO latency: Time between the first-bit entry and the last-bit 
        exit  
        
        * LIFO latency: Time between the last-bit entry and the first-bit 
        exit 
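
        The four definitions above reduce to differences between four 
        timestamps.  The following sketch makes this explicit; the class 
        and function names and the example timestamps are assumptions 
        introduced only for illustration.  

```python
from dataclasses import dataclass


@dataclass
class FrameTimes:
    """Entry and exit instants of a frame (any consistent time unit)."""
    first_in: float   # first bit enters the system
    last_in: float    # last bit enters the system
    first_out: float  # first bit exits the system
    last_out: float   # last bit exits the system


def frame_latencies(t):
    """Return the four classical frame latencies as a dictionary."""
    return {
        "FIFO": t.first_out - t.first_in,
        "LILO": t.last_out - t.last_in,
        "FILO": t.last_out - t.first_in,
        "LIFO": t.first_out - t.last_in,
    }


if __name__ == "__main__":
    # Hypothetical timestamps: the first bit exits before the last bit enters.
    example = FrameTimes(first_in=0.0, last_in=4.0, first_out=1.0, last_out=6.0)
    print(frame_latencies(example))
```

        With these example timestamps, LIFO latency comes out negative, 
        because the first bit exits before the last bit has entered.  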
        
        
        Unfortunately, none of the above four metrics apply to an ATM 
        network (or switch) since: 
        
        * an ATM switch does cell-switching, i.e., it transmits a received 
        cell of any frame without waiting for any other cells of that frame 
        to arrive and  
        
        * the frames are not always sent or received contiguously, i.e., 
        there may be idle periods, idle cells or cells of other frames 
        between cells of a test frame on input and/or on output.  
        
        
        In the rest of this appendix, it is assumed that the duration of 
        any idle period (between cells of a test frame) is always an 
        integral number of cell times.  Thus, such periods can be viewed as 
        sequences of one or more idle cells.  This assumption makes further 
        presentation easier without any loss of generality.  
        
        
        Both idle cells and cells of other frames between cells of a test 
        frame are shown as gaps.  If input and output rates are different 
        then the duration of each gap on input is different from the 
        duration of each gap on output.  
        
        
        Figure A.2 illustrates different latencies of an ATM switch 
        (network) with the input link rate higher than the output link rate 
        for a test frame consisting of 3 cells.  Note the different gaps on 
        input and output.  On input, there are two gaps after the first 
        cell of the frame, followed by two remaining cells of the frame.  
        On output, there is only one gap after the first cell and then two 
        gaps between the second and the third cell of the frame.  
        
        
        
        [Figure A.2: Latency metrics] 
        
        
        
        Figure A.2 does not show LIFO latency, because in the illustrated 
        case, the first bit (cell) exits before the last bit (cell) 
        enters.  Consequently, LIFO latency is negative.  Because the frame 
        clearly experiences some positive delay, LIFO latency is not a good 
        indicator of the switch latency.  For this reason, LIFO latency 
        will not be considered further.  
        
        
        Note that FILO latency can be computed from LILO latency given the 
        frame input time: 
        
        
        FILO latency = LILO latency + Frame input time  
        
        
        
        It is clear that LILO is a preferred metric in this case, since it 
        is independent of the frame input time, while FILO would be 
        different for each frame input time.  For this reason FILO is not a 
        good measure of switch latency.  
        
        
        In the next section, we justify the MIMO latency definition as 
        defined in Section 3.2.1. We systematically consider all possible 
        cases comparing FIFO latency, LILO latency and MIMO latency; and we 
        show that MIMO latency is the correct metric in all cases, whereas 
        other metrics apply to some cases but give incorrect results in 
        others.  The last section of this appendix shows how to calculate 
        MIMO latency based on cell level data.  
        
        
        A.2. MIMO latency justification 
        
        
        
        In this section, we consider only cases where a test frame is 
        discontinuous on both input and output, i.e.  cases with gaps 
        between the cells of a test frame.  It should be noted that cases 
        with contiguous frames on input and/or output are special cases of 
        discontinuous frames with no gaps.  
        
        
        Depending upon the number of gaps on input and output, we have 
        three possibilities: 
        
        * No change in gaps: The number of gaps on output is the same as 
        that on input.  
        
        * Expansion of gaps: The number of gaps on output is larger than 
        that on input.  
        
        * Compression of gaps: The number of gaps on output is less than 
        that on input.  
        
        
        
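        Combining these three possibilities with the three possible 
        relations between the input rate and the output rate (equal, lower 
        or higher) gives nine cases.  The nine cases and the applicability 
        of the three metrics (FIFO latency, LILO latency and MIMO latency) 
        to those cases are shown in Table A.1.  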
        
        
        
        
        [Table A.1: Applicability of Various Latency Definitions]  
        
        
        For each case we present a scenario similar to one in Figure A.2, 
        but with simplified labeling.  Each case includes one scenario with 
        a test frame exercising a nonzero (positive) latency and (if 
        possible) another scenario with a test frame exercising a 
        zero-latency.  We refer to a switch with positive frame latency as 
        a non-zero (positive) delay switch and to a switch with a zero 
        frame latency as a zero delay switch.  The cases with a zero-delay 
        switch are especially useful to verify the validity of a latency 
        definition, because the switch delay is known in advance (equal to zero).  
        
        
        It should be noted that it is actually possible to have a negative 
        frame latency and we refer to such a switch as a speed-up (negative 
        delay) switch.  That scenario is only possible in the case of Input 
        rate > Output rate and compression of gaps (Case 3c).  
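
        The comparison carried out in the cases below can be sketched 
        numerically.  The sketch assumes the MIMO latency expression used 
        throughout the case analysis, min {LILO latency, FILO latency - 
        Frame input time x Input rate / Output rate}, and an idealized 
        zero-delay switch carrying a contiguous frame.  All function names 
        and numeric values are assumptions made for illustration.  

```python
def mimo_latency(first_in, last_in, last_out, input_rate, output_rate):
    """MIMO latency = min{LILO, FILO - frame input time * in_rate / out_rate}."""
    frame_input_time = last_in - first_in
    lilo = last_out - last_in
    filo = last_out - first_in
    nfot = frame_input_time * input_rate / output_rate
    return min(lilo, filo - nfot)


def zero_delay_contiguous(frame_input_time, input_rate, output_rate):
    """Timestamps of a contiguous frame through an idealized zero-delay switch.

    Assumption: with a slower (or equal) input the last bit is not delayed;
    with a faster input the first bit is not delayed.
    """
    frame_output_time = frame_input_time * input_rate / output_rate
    first_in, last_in = 0.0, frame_input_time
    if input_rate <= output_rate:
        last_out = last_in
        first_out = last_out - frame_output_time
    else:
        first_out = first_in
        last_out = first_out + frame_output_time
    return first_in, last_in, first_out, last_out


if __name__ == "__main__":
    # (input rate, output rate) in arbitrary but consistent units.
    for in_rate, out_rate in [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0)]:
        fi, li, fo, lo = zero_delay_contiguous(4.0, in_rate, out_rate)
        fifo, lilo = fo - fi, lo - li
        mimo = mimo_latency(fi, li, lo, in_rate, out_rate)
        print(f"in={in_rate} out={out_rate} FIFO={fifo} LILO={lilo} MIMO={mimo}")
```

        For these zero-delay scenarios, FIFO latency is nonzero when the 
        input is slower, LILO latency is nonzero when the input is faster, 
        and the MIMO expression yields zero in all three.  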
        
        
        Case 1a: Input rate = Output rate, No Change in Gaps 
        
        [Figure A.3] 
        
        
        In both scenarios, the pattern of gaps on input is made purposely 
        different from the pattern of gaps on output.  This is just to 
        illustrate the point that it is the total gap that matters, and not 
        the locations of the gaps within the test frame.  In the given 
        scenarios, the total gap is 2 cell times on both input and output.  
        
        
        In this case, the switch delay D is given by: 
        
        
        
        D = First bit latency = Last bit latency 
        
        
        
        Here, we have: 
        
        * FIFO latency = D, therefore FIFO latency is correct.  
        
        * LILO latency = D, therefore LILO latency is correct.  
        
        * Input rate = Output rate and FILO latency - Frame input time = D, 
        therefore: 
        
        MIMO latency = min {LILO latency, FILO latency - Frame input time} 
                     = min {D, D} = D 
        
        * MIMO latency is correct.  
        
        
        Case 1b: Input Rate = Output Rate, Expansion of Gaps 
        
        
        
        A zero-delay switch with expansion of gaps is not possible.  
        Therefore, only a non-zero delay switch is shown in Figure A.4.  
        
        
        In this case, the switch delay D is given by: 
        
        
        
        D = Last bit latency = First bit latency + Time of additional gaps 
        on output 
        
        
        [Figure A.4] 
        
        
        
        Here, we have: 
        
        * FIFO latency < D, therefore FIFO latency is incorrect; FIFO 
        latency does not reflect expansion of gaps.  It remains the same 
        even when there is a large expansion.  
        
        * LILO latency = D, therefore LILO latency is correct.  
        
        * Input rate = Output rate and FILO latency - Frame input time = D, 
        therefore: 
        
        MIMO latency = min {LILO latency, FILO latency - Frame input time} 
                     = min {D, D} = D 
        
        * MIMO latency is correct.  
        
        
        
        
        
        Case 1c: Input Rate = Output Rate, Compression of Gaps 
        
        
        
        In this case, shown in Figure A.5, the switch delay D is given by: 
        
        
        
        D = Last bit latency = First bit latency - Time of additional gaps 
        on input 
        
        
        
        Here, we have: 
        
        * FIFO latency > D, therefore FIFO latency is incorrect because it 
        does not reflect compression of gaps.  
        
        * LILO latency = D, therefore LILO latency is correct.  
        
        * Input rate = Output rate and FILO latency - Frame input time = D, 
        therefore: 
        
        MIMO latency = min {LILO latency, FILO latency - Frame input time} 
                     = min {D, D} = D 
        
        * MIMO latency is correct.  
        
        
        [Figure A.5] 
        
        Case 2a: Input Rate < Output Rate, No change in Gaps 
        
        
        
        In this case, shown in Figure A.6, the switch delay D is given by: 
        
        
        
        D = Last bit latency 
        
        
        
        Here, we have: 
        
        * FIFO latency > D, therefore FIFO latency is incorrect; FIFO 
        latency varies when the output rate is changed, even though the 
        switch (and its delay) is otherwise unchanged.  So, FIFO latency 
        does not correctly represent the switch latency.  
        
        * LILO latency = D, therefore LILO latency is correct.  
        
        * Input rate < Output rate, therefore: 
        
        FILO latency - Frame input time x Input rate / Output rate = M > D  
        
        MIMO latency = min {LILO latency, M} = D 
        
        * MIMO latency is correct.  
        
        
        
        If idle cells are considered part of the test frame, then this 
        case, as well as all other cases of "no change in gaps", becomes 
        the same as the case of a contiguous frame.  It is obvious that 
        FIFO latency is equally incorrect for contiguous frames.  
        
        [Figure A.6] 
        
        
        
        
        
        Case 2b: Input Rate < Output Rate, Expansion of Gaps 
        
        
        
        In this case, shown in Figure A.7, the switch delay D is given by: 
        
        
        
        D = Last bit latency 
        
        
        [Figure A.7] 
        
        
        
        Here, we have: 
        
        * FIFO latency is incorrect because it varies as the output rate 
        changes, even though the switch (and its delay) is otherwise 
        unchanged.  
        
        * It should be noted that in this case, with a given input rate and 
        a given number of gaps on input, it is possible to produce 
        scenarios with an appropriate output rate and an appropriate number 
        of gaps on output such that FIFO latency > D, FIFO latency < D or 
        even FIFO latency = D, all without changing switch 
        characteristics.  
        
        * LILO latency = D, therefore LILO latency is correct.  
        
        * Input rate < Output rate, therefore: 
        
        FILO latency - Frame input time x Input rate / Output rate = M > D  
        
        MIMO latency = min {LILO latency, M} = D 
        
        * MIMO latency is correct.  
        
        
        
        
        
        Case 2c: Input Rate < Output Rate, Compression of Gaps 
        
        
        
        In this case, shown in Figure A.8, the switch delay D is given by: 
        
        
        
        D = Last bit latency 
        
        
        [Figure A.8] 
        
        
        
        Here we have: 
        
        * FIFO latency > D, therefore FIFO latency is incorrect.  Note that 
        FIFO latency is affected by changing the output rate and/or the 
        number of gaps on the output while the switch (and its delay) is 
        unchanged.  
        
        * LILO latency = D, therefore LILO latency is correct.  
        
        * Input rate < Output rate, therefore: 
        
        FILO latency - Frame input time x Input rate / Output rate = M > D  
        
        MIMO latency = min {LILO latency, M} = D  
        
        * MIMO latency is correct.  
        
        
        
        
        
        Case 3a: Input Rate > Output Rate, No Change in Gaps 
        
        
        
        In this case, shown in Figure A.9, the switch delay D is given by: 
        
        
        
        D = First bit latency 
        
        
        [Figure A.9] 
        
        
        
        Here, we have: 
        
        * FIFO latency = D, therefore FIFO latency is correct.  
        
        * LILO latency > D, therefore LILO latency is incorrect.  Note that 
        LILO latency may change when the output rate is changed, without 
        changing the switch otherwise.  
        
        * FILO latency - Frame input time x Input rate / Output rate = D, 
        therefore: 
        
        MIMO latency = min {LILO latency, D} = D 
        
        * MIMO latency is correct.  
        
        
        
        As indicated earlier, this case, as well as other cases with no 
        change in gaps, can be viewed as a case with contiguous frames.  It 
        is obvious that LILO latency is equally incorrect for contiguous 
        frames.  
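
        As a worked illustration with assumed numbers (not taken from the 
        figures), consider a contiguous frame with frame input time T = 3 
        cell times on input, an input rate R_in equal to twice the output 
        rate R_out, and a zero-delay switch, so that the first bit latency 
        D is 0.  The symbols T, R_in and R_out are introduced here only for 
        this example.  

```latex
\begin{align*}
\text{Frame output time} &= T \cdot \frac{R_{in}}{R_{out}} = 3 \cdot 2 = 6 \\
\text{FIFO latency} &= D = 0 \\
\text{FILO latency} &= D + \text{Frame output time} = 6 \\
\text{LILO latency} &= \text{FILO latency} - T = 6 - 3 = 3 > D \\
\text{MIMO latency} &= \min\{\text{LILO latency},\
    \text{FILO latency} - T \cdot R_{in}/R_{out}\}
    = \min\{3,\ 6 - 6\} = 0 = D
\end{align*}
```

        LILO latency reports 3 cell times for a switch that adds no delay, 
        and that value would change with the output rate alone, whereas 
        FIFO latency and MIMO latency both give D.  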
        
        
        
        
        Case 3b: Input Rate > Output Rate, Expansion of Gaps 
        
        
        
        Note that a zero-delay switch with expansion of gaps is not 
        possible.  Therefore, only the non-zero delay scenario is shown in 
        Figure A.10.  
        
        
        [Figure A.10] 
        
        
        
        In this case, the switch delay D is given by: 
        
        
        
        D = First bit latency + Time of additional gaps on output 
        
        
        
        Here we have: 
        
        * FIFO latency < D, therefore FIFO latency is incorrect because it 
        does not reflect expansion of gaps.  Note that FIFO latency may 
        even be zero (the case of a zero delay for the first bit) for a 
        nonzero-latency frame.  
        
        * LILO latency > D, therefore LILO latency is incorrect.  It should 
        be noted that while LILO latency correctly accounts for the time of 
        additional gaps, it is incorrectly influenced by changes of output 
        rate.  
        
        * FILO latency - Frame input time x Input rate / Output rate = D, 
        therefore: 
        
        MIMO latency = min {LILO latency, D} = D  
        
        * MIMO latency is correct.  
        
        
        
        
        
        Case 3c: Input Rate > Output Rate, Compression of Gaps 
        
        
        
        Only in this case, besides scenarios with a zero-delay switch and a 
        non-zero (positive) delay switch, is it also possible to have a 
        scenario with a speed-up (negative delay) switch.  
        
        
        In this case, it is possible to have a switch that reduces the 
        delay of a frame by removing several gaps.  Such switches are 
        called "speedup-delay" switches.  One such case is shown in Figure 
        A.11.c. A speedup-delay switch effectively has a negative delay.  
        
        
        In this case, the switch delay D is given by: 
        
        
        
        D = First bit latency - Time of missing gaps on output 
        
        
        
        Three situations corresponding to three scenarios above can be 
        distinguished: 
        
        * a zero-delay switch, where:  
        
        First bit latency = Time of missing gaps on output 
        
        * a positive-delay switch, where:  
        
        First bit latency > Time of missing gaps on output 
        
        * a speedup-delay switch or a negative-delay switch, where:  
        
        First bit latency < Time of missing gaps on output 
        
        
        
        Here, we have: 
        
        * FIFO latency > D, therefore FIFO latency is incorrect; it does 
        not reflect compression of gaps.  
        
        * LILO latency > D, therefore LILO latency is incorrect; while LILO 
        latency correctly accounts for the time of missing gaps, it is 
        incorrectly influenced by changes of output rate.  
        
        * FILO latency - Frame input time x Input rate / Output rate = D, 
        therefore: 
        
        MIMO latency = min {LILO latency, D} = D  
        
        * MIMO latency is correct.  
        
        
        [Figure A.11] 
        
        
        
        
        
        A.3. MIMO latency calculation based on cell level data 
        
        
        
        Contemporary ATM monitors provide measurement data at the cell 
        level.  Considering that the definition of MIMO latency uses bit 
        level data, in this section we explain how to calculate MIMO 
        latency using data at the cell level.  
        
        
        Standard definitions of two cell level performance metrics, which 
        are of importance for MIMO latency are: 
        
        
        * cell transfer delay (CTD), defined as the time from the instant a 
        cell begins leaving the ATM monitor to the instant it finishes 
        arriving back at the ATM monitor, i.e., the time between the first 
        bit out and the last bit in.  
        
        * cell inter-arrival time, defined as the time between arrival of 
        the last bit of the first cell and the last bit of the second 
        cell.  
        
        
        It appears that CTD values obtained by ATM monitors always include 
        some system overhead.  For example, the measured cell transfer 
        delay for the case of closed loop on an ATM monitor is usually 
        larger than the theoretical value for the cell transmit time (a 
        time needed to transmit one cell over a link of given rate) plus 
        any propagation delay.  The discrepancy can be attributed to delays 
        internal to the monitor and its time resolution.  That discrepancy 
        is called the monitor overhead, and it can be calculated  as the 
        difference between the measured cell transfer delay over a closed 
        loop on the ATM monitor and the theoretical value for the cell 
        transmit time.  
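
        The monitor overhead calculation described above can be sketched 
        as follows.  The function names, the optional propagation delay 
        parameter and the example values are assumptions for illustration; 
        an actual ATM monitor may report its measurements differently.  

```python
ATM_CELL_BITS = 53 * 8  # one ATM cell: 53 bytes of 8 bits


def cell_transmit_time(link_rate_bps):
    """Theoretical time (seconds) to transmit one cell over the link."""
    return ATM_CELL_BITS / link_rate_bps


def monitor_overhead(loopback_ctd_s, link_rate_bps, propagation_delay_s=0.0):
    """Measured closed-loop CTD minus the theoretical cell transmit time.

    propagation_delay_s is an assumed optional correction for the loop cable.
    """
    return loopback_ctd_s - cell_transmit_time(link_rate_bps) - propagation_delay_s


if __name__ == "__main__":
    # Hypothetical values: 100 Mbit/s link, 6.0 us measured loopback CTD.
    overhead = monitor_overhead(6.0e-6, 100e6)
    print(f"monitor overhead = {overhead * 1e6:.2f} us")
```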
        
        
        On the other hand, it appears that inter-arrival times measured by 
        ATM monitors are very accurate, so corrections for cell 
        inter-arrival time values are not necessary.  
        
        
        The procedure for MIMO latency calculation depends upon the 
        relative values of input and output link rates.  There are two 
        cases to consider: 
        
        * Input link rate <= Output link rate 
        
        * Input link rate >= Output link rate 
        
        
        
        
        
        MIMO latency calculation: Input link rate <= Output link rate 
        
        
        
        In cases when the input link rate is less than or equal to the 
        output link rate: 
        
        
        MIMO latency = LILO latency 
        
        
        
        From Figure A.12, it can be observed that: 
        
        
        
        LILO latency = Last cell's transfer delay - Last cell's input 
        transmit time 
        
        
        
        where: 
        
        * the cell input transmit time = the time to transmit one cell into 
        the input link 
                       = (53 B x 8 bits/B) / Input link rate in bps 
        
        
        
        To account for the overhead in the ATM monitor, the following 
        adjustment in LILO latency expression has to be made: 
        
        
        LILO latency = 
        Last cell's transfer delay - (Last cell's input transmit time + Monitor overhead)
        
        
        Thus, to calculate MIMO latency when the input link rate is less 
        than or equal to the output link rate, it is sufficient to measure 
        the last cell's transfer delay of a frame.  
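
        A minimal sketch of this calculation is given below.  The function 
        name and the example numbers are hypothetical, and the monitor 
        overhead is assumed to have been determined separately from a 
        closed-loop measurement.  

```python
ATM_CELL_BITS = 53 * 8  # one ATM cell: 53 bytes of 8 bits


def mimo_latency_input_not_faster(last_cell_ctd_s, input_link_rate_bps,
                                  monitor_overhead_s=0.0):
    """MIMO latency (= LILO latency) when input link rate <= output link rate.

    LILO = last cell's transfer delay
           - (last cell's input transmit time + monitor overhead)
    """
    input_cell_time = ATM_CELL_BITS / input_link_rate_bps
    return last_cell_ctd_s - (input_cell_time + monitor_overhead_s)


if __name__ == "__main__":
    # Hypothetical values: 20 us CTD of the last cell, 100 Mbit/s input link,
    # 1.76 us monitor overhead.
    latency = mimo_latency_input_not_faster(20.0e-6, 100e6, 1.76e-6)
    print(f"MIMO latency = {latency * 1e6:.2f} us")
```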
        
        
        [Figure A.12] 
        
        
        MIMO latency calculation: Input link rate >= Output link rate 
        
        
        
        In cases where the input link rate is greater than or equal to the 
        output link rate: 
        
                     MIMO latency = FILO latency - NFOT
        
        
        
        NFOT can be calculated as discussed in Section 3.2.1, while 
        FILO latency has to be obtained from cell level measurements.  
        
        
        From Figure A.13, it can be observed that: 
        
        
        
        FILO latency = FIFO latency + Frame output time 
        
        
        
        Also, it can be observed that: 
        
        
        
        FIFO latency = First cell's transfer delay - 
                       (First cell's output transmit time + Monitor overhead) 
        
        Frame output time = First cell to last cell inter-arrival time + 
                            Last cell's output transmit time 
        
        where: 
        
        * the cell output transmit time = the time to transmit one cell 
        into the output link 
                       = (53 B x 8 bits/B) / Output link rate in bps 
        
        
        
        If measurements of cell inter-arrival times are accurate, there is 
        no need for any corrections in the frame output time expression due 
        to the monitor overhead.  
        
        
        Thus, to calculate MIMO latency when the input link rate is greater 
        than or equal to the output link rate, it is necessary to measure 
        the first cell's transfer delay and the inter-arrival time between 
        the first cell and the last cell of a frame.  
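
        The steps above can be combined into the following sketch.  It 
        assumes that NFOT equals Frame input time x Input rate / Output 
        rate, as in the case analysis of this appendix, and that the frame 
        input time is known from the traffic generator.  The function name 
        and the example values are hypothetical.  

```python
ATM_CELL_BITS = 53 * 8  # one ATM cell: 53 bytes of 8 bits


def mimo_latency_input_not_slower(first_cell_ctd_s, first_to_last_interarrival_s,
                                  frame_input_time_s, input_link_rate_bps,
                                  output_link_rate_bps, monitor_overhead_s=0.0):
    """MIMO latency (= FILO latency - NFOT) when input rate >= output rate."""
    if input_link_rate_bps < output_link_rate_bps:
        raise ValueError("this procedure applies only when input rate >= output rate")
    output_cell_time = ATM_CELL_BITS / output_link_rate_bps
    # FIFO latency, corrected for the monitor overhead.
    fifo = first_cell_ctd_s - (output_cell_time + monitor_overhead_s)
    # Frame output time from the (accurate) inter-arrival measurement.
    frame_output_time = first_to_last_interarrival_s + output_cell_time
    filo = fifo + frame_output_time
    # Assumed NFOT: frame input time scaled by the rate ratio.
    nfot = frame_input_time_s * input_link_rate_bps / output_link_rate_bps
    return filo - nfot


if __name__ == "__main__":
    # Hypothetical contiguous 3-cell frame, 200 Mbit/s in, 100 Mbit/s out.
    latency = mimo_latency_input_not_slower(
        first_cell_ctd_s=16.0e-6,
        first_to_last_interarrival_s=8.48e-6,
        frame_input_time_s=6.36e-6,
        input_link_rate_bps=200e6,
        output_link_rate_bps=100e6,
        monitor_overhead_s=1.76e-6,
    )
    print(f"MIMO latency = {latency * 1e6:.2f} us")
```

        In this example the output is contiguous, so the frame output time 
        equals NFOT and the result reduces to the FIFO latency.  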
        
        
        
        [Figure A.13]