************************************************************************

ATM Forum Document Number: ATM Forum/94-0883

************************************************************************

Title:  The OSU Scheme for Congestion Avoidance
        using Explicit Rate Indication

************************************************************************
Abstract:

An explicit rate indication scheme for congestion avoidance in
ATM networks is proposed. The sources monitor their load and
provide the information periodically to the switches. The
switches, in turn, compute the load level and ask the sources to
adjust their rates up or down.  The scheme achieves high link
utilization, low delay, fair allocation of rates among contending
sources, provides quick convergence and works for bursty traffic.

************************************************************************

Source:

Raj Jain, Shiv Kalyanaraman and Ram Viswanathan
The Ohio State University
Department of CIS
Columbus, OH 43210-1277
Phone: 614-292-3989, Fax: 614-292-2911, Email: Jain@ACM.Org

************************************************************************

Date:   September 26-29, 1994

************************************************************************

Distribution: ATM Forum Technical Working Group Members (Traffic Management)

************************************************************************

Notice: This contribution has been prepared to assist the ATM
Forum.  It is offered to the Forum as a basis for discussion and
is not a binding proposal on the part of any of the contributing
organizations.  The statements are subject to change in form and
content after further study.  Specifically, the contributors
reserve the right to add to, amend or modify the statements
contained herein.

************************************************************************

INTRODUCTION

This scheme developed at The Ohio State University (OSU) is also
an explicit rate indication scheme similar to the MIT scheme
[1,2]. However, it does not necessarily require the switches to
remember the rates of all VCs.  Thus, the minimal storage
requirements as well as the computational complexity becomes
order one, O(1), that is, the computation or storage does not
change as the number of VCs is changed. Also, it uses the exact
overload as measured at the switch to determine the allowed rate.

FEATURES OF THE OSU SCHEME

The OSU scheme has the following desirable features:

1. It provides high throughput. The bottleneck utilization can be
made close to 90-99%. The target utilization band (TUB) is
actually a parameter set by the switch owner.

2. The oscillations are bounded. Once the system enters the TUB,
it stays in the TUB trying to achieve fairness

3. The delays are minimum. The steady state queue lengths are
close to 1 resulting in minimum possible delay.

4. The OSU scheme is a congestion avoidance scheme in the sense
that the scheme provides high throughput and low delay. Also, the
network operating point does not become suboptimal as the more
memory is added to the switches.

5. The actual overload measured at the switch is used. Thus, any
unused capacity, which is not used by sources to whom it has been
allocated becomes available for other sources.

6. The scheme WORKS for bursty traffic.

7. The number of parameters has been kept small. The only
parameters are the target utilization band and the load averaging
interval.

8. The scheme is not very sensitive to parameter values. Slight
mistuning of these parameters does not cause instability in the
network.

9. The parameters are easy to set. Both target utilization band
and averaging intervals have intutitive meaning and can be easily
set by unskilled network operators.

10. The scheme requires only order one O(1) computation. In the
most basic form of the scheme, only the rate of the current VCs
is used and so the computation does not increase as the number of
VCs is increased.

11. Bipolar feedback is used. The switches can increase or
decrease the rate.  Additional round-trip for increase is
avoided.

12. Fairness is achieved without any per-VC scheduling, such as,
round-robin or fair-queueing.

13. A backward congestion notification (BECN) option is provided.
It is not required for proper operation but helps in some cases.

14. A precision fairness computation option, in which, the rates
of all sources are used in computing the feedback, in a manner
similar to that in the MIT scheme has also been designed. Again,
this is not required for proper operation but minimized
oscillations.

THE OSU SCHEME

In the OSU scheme, the sources monitor their average load and
periodically send control cells that contain the load
information. The switches monitor their own load and use it in
combination with the information provided in the control cells,
compute a factor by which the sources should go up or down. Like
the MIT Scheme, the control cell is returned by the destination
to the source, which then adjusts its rate as instructed by the
network. The key difference betwen the OSU scheme and the other
explicit rate schemes is the way the load is measured and rate
adjustment factor is computed.

The Control Cell (RM Cell) contains the following fields:

1. Transmitted cell rate (TCR). This is the inverse of the inter-cell
   transmission time.
2. The Offered Average Cell Rate (OCR) as measured at the source
3. Rate Adjustment Factor (initially 0)
4. Averaging interval (initially 0)
5. The direction of feedback (backward/forward)
6. Timestamp containing the time at which the control cell was generated
   at the source

The last two fields are used in the backward congestion
notification option and need not be present if that option is not
used. Other fields are explained later in this sections.

THE SOURCE ALGORITHM:

The source algorithm consists of three components:
1. How often to send control cells
2. How to measure the offered average cell rate
3. How to respond to the feedback received from the network

These three questions are answered in the next three subsections.

CONTROL CELL SENDING ALGORITHM:

The control cells are sent periodically every T interval.
Although it could be done by the cell count, using interval
allows the scheme to work on networks with widely varying link
speeds.  The network manager sets the averaging interval
parameter for each switch. The maximum of the averaging interval
along a path is returned in the control cell. This is the
interval that the source uses to send the control cells.

During an idle interval, no control cells are sent. If the source
measures the OCR to be zero, then one control cell is sent,
subsequent control cells are sent only after the rate becomes
non-zero.

MEASURING THE OFFERED AVERAGE LOAD

Unlike any other scheme proposed so far, each source also
measures its own load. The measurment is done over the same
averaging interval that is used for sending the control cells.
Notice that there are two separate parameters: transmitted cell
rate and offered average cell rate. The first is the
instantaneous cell rate during burst transmissions. The cells are
sent equally spaced in time. The inter-cell time is computed
based on the transmitted cell rate. However, the source may be
idle in between the bursts and so the average cell rate is
different from the transmitted cell rate. This average is called
the offered average cell rate and is also included in the cell.
Notice that TCR is a control variable (like the knob on a faucet)
while the OCR is a measured quantity (like a meter on a pipe).

Normally the OCR should be less than the TCR, except when the TCR
has just been reduced. In such cases, the the maximum of current
TCR and OCR is put in the TCR field.
 In other words,

                  TCR in Cell <- max{TCR, OCR}

RESPONDING TO NETWORK FEEDBACK

The control cells returned from the network contain a "load
adjustment factor" along with the TCR. The current TCR may be
different from that in the cell.  The source computes a new TCR
by dividing the TCR in the cell by the load adjustment factor in
the cell:

                TCR in the Cell
New TCR = ----------------------------------
          Load Adjustment Factor in the Cell

If the load adjustment factor is more than one, the network is
asking the source to decrease. If the new TCR is less than the
current TCR, the source sets its TCR to the new TCR value.
However, if the new TCR is more than current TCR, the source is
already operating below the network's requested rate and there is
no need make any adjustments.

Similarly, if the load adjustment factor is less than one, the
network is permitting the source to increase. If the current TCR
is below the new TCR, the source increases its rate to the new
value. However, if the current TCR is above the new TCR, the new
value is ignored and no adjustment is done.

THE SWITCH ALGORITHM

The switch algorithm consists of the following components:

1. How to measure the available capacity
2. How to achieve efficiency
3. How to achieve fairness

These issues and others arising from these are discussed next.

MEASURING THE CURRENT LOAD:

This consists of simply counting the number of cells received
during a fixed averaging interval. The interval is set by the
network manager. Based on the known capacity of the link, the
switch can compute the load level and determine whether it is
overloaded or underloaded.

Since running a link at full load generally results in large
queues, it is best to target the link utilization at close to but
not quite at 100%.  To achieve this the network manager selects a
target utilization, say 90%. Whenever the input rate is more than
90% of the nominal capacity, the link is said to be overloaded
and whenever the utilization is less than 90%, the link is said
to be underloaded.  The link cell rate when the network is
operating at the target utilization is computed:

                   Target Utilization X Link bandwidth in Mbps
Target Cell Rate = --------------------------------------------
                             Cell size in bits

The current load level is then given by:

                     Number of cells received during the averaging interval
Current Load level = ------------------------------------------------------
                            Target Cell Rate X Averaging Interval

ACHIEVING EFFICIENCY

To achieve efficiency, all we need is to replace the load
adjustment factor in each control cell by the maximum of the the
current load level and the load adjustment value already in the
cell.

Load Adjustment Factor = max(Load Adjustment Factor in the cell,
                             Current Load Level in this Switch)

This simple algorithm is sufficient to bring the network to
efficient operation within the next round trip. However, the
allocation of the available bandwidth among contending VCs may
not be fair. To achieve fairness we need to make use of the other
information in the control cells as discussed later.

COUNTING THE NUMBER OF ACTIVE SOURCES:

Like the MIT scheme, the switches in our scheme may also remember
the rates declared by various sources and use them in computing
the fair share. However, there are two differences. First, the
rates declared by the sources are "Offered Average Cell Rates
(OCRs)" and not the desired cell rates, which may or may not be
related to the actual rates. Secondly, in the simplest version of
our scheme rates of all sources are not required. All we need is
the number of active sources, which can be counted either by
counting the number of sources with non-zero OCRs or by marking a
bit in the VC table whenever a cell from a VC is seen. The bits
are counted at the end of each averaging interval and are cleared
at the beginning of each interval.

ACHIEVING FAIRNESS: The TUB Algorithm

In resource allocation, the top priority is to bring the network
to efficient operation. Once the network is operating close to
the target utilization, we need to take steps to achieve
fairness. The network manager declares a target utilization band
(TUB), say, 90+-9% or 81% to 99%. Whenever the link utilization
is in TUB, the link is said to be operating efficiently. As will
be seen later, it is better to express TUB in the U(1+- Delta)
format, where U is the target utilization level. For example,
90+-9% is expressed as 90(1+- 0.1)%.

Given the number of active sources, the fair share is computed as
follows:

                 Target Cell Rate
Fair Share = ------------------------
             Number of Active Sources

To achieve fairness, we treat the underloading and overloading
sources differently. Underloading sources for our scheme are
those sources that are using less than the fair share. While
overloading sources are those that are using more than the fair
share.

If the current load level is z, the underloading sources are
treated as if the load level is z/(1+Delta) and the overloading
sources are treated as if the load level is z/(1-Delta).  Here
Delta is the half-width of the TUB. We call this "the TUB
algorithm."

If the OCR in the control cell is less than the fair share, the
load adjustment factor in the cell is changed as follows:

Load Adjustment Factor = max(Load Adjustment Factor in the cell, z/(1+Delta))

On the other hand, if the OCR in the control cell is more than
the fair share, the load adjustment factor in the cell is
adjusted as follows:

Load Adjustment Factor = max(Load Adjustment Factor in the cell, z/(1-Delta))

We have proven that this algorithm guarantees that the system
consistently moves towards more fair operation. Also, once inside
the TUB, the network remains in the TUB unless the number of
sources or their load pattern changes. In other words, TUB is a
``closed'' operating region.  These statements are true for any
value of Delta less than 0.5.

If Delta is small, as is usually the case, division by 1+Delta is
approximately equivalent to a multiplication by 1-Delta and vice
versa.

THE DESTINATION ALGORITHM:

The destination simply returns all control cells back to the
source.


SIMULATION RESULTS:

We have done extensive simulation testing of the scheme [3]. The
results will be presented partly in this forum meeting and then
in the November meeting.

OTHER OPTIONS:

The basic scheme as described above is sufficient to bring the
network to optimal and fair operation under all circumstances.
However, the performance can be improved by a number of
extensions.  These extensions will be the subject of a future ATM
Forum contribution.

REFERENCES

[1] A. Charny, D. Clark, and R. Jain, "Congestion Control with
Explicit Rate Indication," ATM Forum Contribution 94-0692, July
1994.

[2] Anna Charny, "An Algorithm for Rate  Allocation  in  a Cell-
Switching Network with Feedback", MIT TR-601, May 1994.

[3] R. Jain, S. Kalyanraman, and R. Viswanathan, "The OSU Scheme
for Congestion Avoidance in ATM Networks Using Explicit Rate
Indication," The Ohio State University, Department of CIS,
Technical report, under preparation.