IP Performance Metrics WG (IPPM)

Tuesday, August 25 at 0900-1000

Chairs:		Guy Almes <almes@internet2.edu>
        		Will Leland <wel@bellcore.com>

AGENDA:

1. Status report on I-Ds and RFC progress (G. Almes, W. Leland) (5 minutes)
2. Presentation on error bars (M. Zekauskas) (15 minutes)  This short presentation 
summarizes the new material added to I-Ds <draft-ietf-ippm-delay-04.txt> and 
<draft-ietf-ippm-loss-04.txt>.
3. Discussion of error bars and confidence intervals (30 minutes)
4. Future directions for IPPM (G. Almes, W. Leland) (10 minutes)
5. A brief overview and discussion, if time permits.

Will Leland opened the WG meeting with a review of the agenda followed by a brief 
report on the WG's document status:
- RFC 2330: Framework for IP Performance Metrics
- Connectivity: WG final call
- One-way Packet Loss, One-way Packet Delay:
  - Discussed at April IETF, revised
  - Calibration error material added
- Delay Variation: revised
- Bulk Transfer: stalled
- Loss Patterns: no consensus

Matt Zekauskas then took the floor.  He reviewed the changes that had been made to 
the Drafts: added "Calibration" to the "Errors" section, removed the path parameter, and 
created a new "Reporting the Metric" section.

The goal of this effort is "... to be able to compare metrics from different implementations."  
To that end, one needs to be able to find and remove systematic errors and then to 
classify/characterize the random error.  We want to be able to report a singleton as a 
value +/- a "calibration error bar", with 95% confidence.  The idea is to ensure the metrics 
aren't dominated by the error.  As currently defined for One-way Delay, the error is hard 
to analyze, so we need to look for a simpler way.

We can characterize errors in measurement as measured = true + systematic + random.  
With suitable experience, we can remove the systematic component and characterize 
the random error in a given measurement; that is, reported = measured - systematic, 
which yields true = reported - random.
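
As a rough sketch of the arithmetic (a hypothetical Python fragment with invented 
numbers, not text from the drafts):

    # Error model: measured = true + systematic + random.
    # The systematic component is found by calibration and removed; the
    # remaining random error is bounded by the calibration error bar.
    measured_delay = 12.40   # ms, as observed by the instrument (illustrative)
    systematic_err = 0.25    # ms, known instrument offset from calibration
    error_bar_95   = 0.10    # ms, bound on the random error at 95% confidence

    reported = measured_delay - systematic_err
    # The true delay is then reported +/- error_bar_95, with 95% confidence:
    print("one-way delay = %.2f ms +/- %.2f ms (95%%)" % (reported, error_bar_95))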

As to the question of why 95% versus some other interval: we must choose some specific 
value to allow comparison between measurements.  And, as experience suggests, 
"...[f]or user-level implementation, 95% [is] tight enough to exclude outliers."  This 
value is therefore the specific proposal for IPPM confidence bounds. It is important to note 
that the calibration error is a property of the measurement instrument - in effect, "How 
accurate is the yardstick?" - and is not a property of a series of measurements.
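
One way to picture "calibrating the yardstick" (an assumed procedure sketched in 
Python, not taken from the drafts): measure a path of known delay repeatedly in the lab, 
take the mean offset as the systematic error, and take roughly the 95th percentile of the 
remaining absolute errors as the calibration error bar.

    # Hypothetical lab calibration: repeated measurements of a known delay
    # under simulated load (all numbers invented for illustration).
    known_delay = 10.00                      # ms, reference path delay
    samples = [10.27, 10.22, 10.31, 10.24,   # ms, instrument readings
               10.26, 10.29, 10.23, 10.35]

    errors = [s - known_delay for s in samples]
    systematic = sum(errors) / len(errors)         # mean offset to remove
    residuals = sorted(abs(e - systematic) for e in errors)
    idx = min(len(residuals) - 1, int(0.95 * len(residuals)))
    error_bar_95 = residuals[idx]                  # crude 95th percentile
    print("systematic = %.3f ms, 95%% error bar = %.3f ms"
          % (systematic, error_bar_95))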

The problem of how to treat losses resolves into three possible causes, the first two of 
which are forms of false loss and are addressed:
- False loss due to threshold value
- False loss due to resource limits on measurement instrument (e.g., buffers)
- Packets truly lost, but reported as finite (not considered).

The first is addressed by choosing a threshold value large enough that false loss is not a 
problem.  We report the probability of false loss, which depends on the local network and 
measurement instrument loads.
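
A minimal sketch of how such a threshold would be applied when classifying singletons 
(hypothetical Python; the threshold value is invented):

    # Sketch of applying a loss threshold when classifying a singleton
    # (assumed logic, not taken from the drafts).
    LOSS_THRESHOLD = 5.0   # seconds; chosen large enough that a packet
                           # still in flight is very unlikely to be
                           # declared lost

    def classify(delay_seconds):
        """Return the finite delay, or None for a packet counted as lost."""
        if delay_seconds is None or delay_seconds > LOSS_THRESHOLD:
            return None           # counted as lost (possibly a false loss)
        return delay_seconds      # finite one-way delay singleton

    print(classify(0.042))   # 0.042: normal finite delay
    print(classify(7.3))     # None: arrived after the threshold (false loss)
    print(classify(None))    # None: truly lost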

Some potentially open issues - really, some things to think about (a toy illustration 
follows this list):
- How does an error bar relate to percentile statistics on the metrics?
- False loss could make a large (e.g., 90th) percentile infinite.
- Due to dispersion of values, low-to-middle percentiles don't change much.
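
A toy illustration of the percentile point (invented numbers, hypothetical Python):

    # Toy illustration: with 20% of singletons recorded as lost (treated as
    # infinite delay), the 90th percentile blows up while the median of the
    # values barely moves.
    finite = [10.1, 10.1, 10.2, 10.2, 10.3, 10.3, 10.4, 10.5]   # ms
    lost   = [float("inf")] * 2

    values = sorted(finite + lost)
    median = values[len(values) // 2]
    p90 = values[int(0.9 * len(values))]
    print("median = %s ms, 90th percentile = %s ms" % (median, p90))
    # -> median = 10.3 ms, 90th percentile = inf ms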

Following Matt's presentation, the issues raised were discussed further in a Q&A session.  Each 
topic is summarized below:

- How are calibration error bars assigned?  By calibrating in a lab under similar 
conditions - that is, under simulated load.  The error bars are a property of the 
singleton, not of the stream.

- The chairs welcome more comment on this - it seems to be the only open issue and, 
while all would like to get the draft out, no important contributions should be missed.

- The question of clipping/discarding data was raised.  This is not the case - no data 
is discarded; rather, what is being attempted is to clip the amount of uncertainty - 
how much error is being introduced into the measurement.  Again, what is being 
reported is a set of singletons, not a stream.

- Concern was expressed about multiple standards efforts in the area of metrics: for 
example, the IPPM work overlaps with the new Draft ITU-T Recommendation 
I.35IP: "Internet Protocol Data Communication Service - IP Packet Transfer and 
Availability Performance Parameters".  Manufacturers value having a single 
standard; they are concerned that multiple standards may be inconsistent. Vern 
Paxson, the IPPM Liaison with the T1A1 and ITU efforts in this area, explained 
that presentations had been made by the IPPM to them prior to this Draft 
Recommendation, but that there are differences in scope and intention between 
the ITU and IPPM documents; the Chairs agreed that the critical issue is for the 
standards to be consistent where they overlap but that a wider scope in the ITU 
work is not a problem per se. Vern will look into the current situation and report to 
the IPPM mailing list.

- The idea was raised of a document, to be prepared by the WG, that quantified the 
metrics being developed - establishing what are good or bad values.  Guy Almes 
replied that the WG's Area Directors had cautioned against just this - against any 
sort of interpretation of the metrics.  However, a customer might very well use 
these metrics as tools in dealing with their providers on performance and reliability.  
The metrics will give the customer and provider a clearly understood, common set 
of measures; the two parties "only" have to agree on the values to be associated 
with them.

- Will Leland pointed out that the WG should document experience as it is valuable 
to capture and pass on, perhaps as a "Best Practices" document. 

Following the Q&A, Will Leland led a brief discussion of Future Directions for the WG.  
He pointed out that the Charter is badly out of date and needs to be revised.  He also 
raised the question of the best process for stringent scrutiny of metrics.  Finally, 
several metrics have been proposed either in a session or on the mailing list, but have 
elicited little response.  He said "Orlando or Else" - that is, if the WG is to consider a 
proposed metric, it must have some debate behind it before the next IETF, 7-11 
December in Orlando. 

Will also revisited the question of how to coordinate our work with that of other 
standards bodies. He will try to get more resources applied before Orlando, and invited 
volunteers.

A speaker pointed out how the publication by various labs of router "benchmark" 
figures had caused the manufacturers to take notice and pay attention.  The speaker 
hoped the WG's work would have the same effect.