IP Performance Metrics WG (IPPM)

Tuesday, August 25 at 0900-1000

Chairs:		Guy Almes <almes@internet2.edu>
        		Will Leland <wel@bellcore.com>

AGENDA:

1. Status report on I-Ds and RFC progress (G. Almes, W. Leland) (5 minutes)
2. Presentation on error bars (M. Zekauskas) (15 minutes)  This short presentation 
summarizes the new material added to I-Ds <draft-ietf-ippm-delay-04.txt> and 
<draft-ietf-ippm-loss-04.txt>.
3. Discussion of error bars and confidence intervals (30 minutes)
4. Future directions for IPPM (G. Almes, W. Leland) (10 minutes)
5. A brief overview and discussion, if time permits.

Will Leland opened the WG meeting with a review of the agenda followed by a brief 
report on the WG's document status:
- RFC 2330: Framework for IP Performance Metrics
- Connectivity: WG final call
- One-way Packet Loss, One-way Packet Delay:
  - Discussed at April IETF, revised
  - Calibration error material added
- Delay Variation: revised
- Bulk Transfer: stalled
- Loss Patterns: no consensus

Matt Zekauskas then took the floor.  He reviewed the changes that had been made to 
the Drafts: added "Calibration" to the "Errors" section, removed the path parameter, and 
created a new "Reporting the Metric" section.

The goal of this effort is "... to be able to compare metrics from different implementations."  
To that end, one needs to be able to find and remove systematic errors and then to 
classify/characterize the random error.  We want to be able to report a singleton as a 
value +/- a "calibration error bar", with 95% confidence.  The idea is to ensure the metrics 
aren't dominated by the error.  As currently defined for One-way Delay, the error is hard 
to analyze, so we need to look for a simpler way.

We can characterize errors in measurement as measured = true + systematic + random.  
With suitable experience, we can remove the systematic component and characterize 
the random error in a given measurement; that is, reported = measured - systematic, 
which yields true = reported - random.
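
As a rough sketch of the arithmetic (a hypothetical Python fragment with invented 
numbers, not text from the drafts):

    # Error model: measured = true + systematic + random.
    # The systematic component is found by calibration and removed; the
    # remaining random error is bounded by the calibration error bar.
    measured_delay = 12.40   # ms, as observed by the instrument (illustrative)
    systematic_err = 0.25    # ms, known instrument offset from calibration
    error_bar_95   = 0.10    # ms, bound on the random error at 95% confidence

    reported = measured_delay - systematic_err
    # The true delay is then reported +/- error_bar_95, with 95% confidence:
    print("one-way delay = %.2f ms +/- %.2f ms (95%%)" % (reported, error_bar_95))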

As to the question of why 95% versus some other interval: we must choose some specific 
value to allow comparison between measurements.  And, as experience suggests, 
"...[f]or user-level implementation, 95% [is] tight enough to exclude outliers."  This 
value is therefore the specific proposal for IPPM confidence bounds. It is important to note 
that the calibration error is a property of the measurement instrument - in effect, "How 
accurate is the yardstick?" - and is not a property of a series of measurements.
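
One way to picture "calibrating the yardstick" (an assumed procedure sketched in 
Python, not taken from the drafts): measure a path of known delay repeatedly in the lab, 
take the mean offset as the systematic error, and take roughly the 95th percentile of the 
remaining absolute errors as the calibration error bar.

    # Hypothetical lab calibration: repeated measurements of a known delay
    # under simulated load (all numbers invented for illustration).
    known_delay = 10.00                      # ms, reference path delay
    samples = [10.27, 10.22, 10.31, 10.24,   # ms, instrument readings
               10.26, 10.29, 10.23, 10.35]

    errors = [s - known_delay for s in samples]
    systematic = sum(errors) / len(errors)         # mean offset to remove
    residuals = sorted(abs(e - systematic) for e in errors)
    idx = min(len(residuals) - 1, int(0.95 * len(residuals)))
    error_bar_95 = residuals[idx]                  # crude 95th percentile
    print("systematic = %.3f ms, 95%% error bar = %.3f ms"
          % (systematic, error_bar_95))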

The problem of how to treat losses resolves into three possible causes, the first two of 
which are forms of false loss and are addressed:
- False loss due to threshold value
- False loss due to resource limits on measurement instrument (e.g., buffers)
- Packets truly lost, but reported as finite (not considered).

The first is addressed by choosing a threshold value large enough that false loss is not a 
problem.  We report the probability of false loss, which depends on the local network and 
measurement instrument loads.
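
A minimal sketch of how such a threshold would be applied when classifying singletons 
(hypothetical Python; the threshold value is invented):

    # Sketch of applying a loss threshold when classifying a singleton
    # (assumed logic, not taken from the drafts).
    LOSS_THRESHOLD = 5.0   # seconds; chosen large enough that a packet
                           # still in flight is very unlikely to be
                           # declared lost

    def classify(delay_seconds):
        """Return the finite delay, or None for a packet counted as lost."""
        if delay_seconds is None or delay_seconds > LOSS_THRESHOLD:
            return None           # counted as lost (possibly a false loss)
        return delay_seconds      # finite one-way delay singleton

    print(classify(0.042))   # 0.042: normal finite delay
    print(classify(7.3))     # None: arrived after the threshold (false loss)
    print(classify(None))    # None: truly lost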

Some potentially open issues - really, some things to think about (a toy illustration 
follows this list):
- How does an error bar relate to percentile statistics on the metrics?
- False loss could make a large (e.g., 90th) percentile infinite.
- Due to dispersion of values, low-to-middle percentiles don't change much.
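
A toy illustration of the percentile point (invented numbers, hypothetical Python):

    # Toy illustration: with 20% of singletons recorded as lost (treated as
    # infinite delay), the 90th percentile blows up while the median of the
    # values barely moves.
    finite = [10.1, 10.1, 10.2, 10.2, 10.3, 10.3, 10.4, 10.5]   # ms
    lost   = [float("inf")] * 2

    values = sorted(finite + lost)
    median = values[len(values) // 2]
    p90 = values[int(0.9 * len(values))]
    print("median = %s ms, 90th percentile = %s ms" % (median, p90))
    # -> median = 10.3 ms, 90th percentile = inf ms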

Following Matt's presentation, the issues raised were discussed further in a Q&A session.  Each 
topic is summarized below:

- How are calibration error bars assigned?  By calibrating in a lab under similar 
conditions - that is, under simulated load.  The error bars are a property of the 
singleton, not of the stream.

- The chairs welcome more comment on this - it seems to be the only open issue and, 
while all would like to get the draft out, no important contributions should be missed.

- The question of clipping/discarding data was raised.  This is not the case - no data 
is discarded; rather, what is being attempted is to clip the amount of uncertainty - 
how much error is being introduced into the measurement.  Again, what is being 
reported is a set of singletons, not a stream.

- Concern was expressed about multiple standards efforts in the area of metrics: for 
example, the IPPM work overlaps with the new Draft ITU-T Recommendation 
I.35IP: "Internet Protocol Data Communication Service - IP Packet Transfer and 
Availability Performance Parameters".  Manufacturers value having a single 
standard; they are concerned that multiple standards may be inconsistent. Vern 
Paxson, the IPPM Liaison with the T1A1 and ITU efforts in this area, explained 
that presentations had been made by the IPPM to them prior to this Draft 
Recommendation, but that there are differences in scope and intention between 
the ITU and IPPM documents; the Chairs agreed that the critical issue is for the 
standards to be consistent where they overlap but that a wider scope in the ITU 
work is not a problem per se. Vern will look into the current situation and report to 
the IPPM mailing list.

- The idea was raised of a document, to be prepared by the WG, that quantified the 
metrics being developed - establishing what are good or bad values.  Guy Almes 
replied that the WG's Area Directors had cautioned against just this - against any 
sort of interpretation of the metrics.  However, a customer might very well use 
these metrics as tools in dealing with their providers on performance and reliability.  
The metrics will give the customer and provider a clearly understood, common set 
of measures; the two parties "only" have to agree on the values to be associated 
with them.

- Will Leland pointed out that the WG should document experience as it is valuable 
to capture and pass on, perhaps as a "Best Practices" document. 

Following the Q&A, Will Leland led a brief discussion of Future Directions for the WG.  
He pointed out that the Charter is badly out of date and needs to be revised.  He also 
raised the question of the best process for stringent scrutiny of metrics.  Finally, 
several metrics have been proposed either in a session or on the mailing list, but have 
elicited little response.  He said "Orlando or Else" - that is, if the WG is to consider a 
proposed metric, it must have some debate behind it before the next IETF, 7-11 
December in Orlando. 

Will also revisited the question of how to coordinate our work with that of other 
standards bodies. He will try to get more resources applied before Orlando, and invited 
volunteers.

A speaker pointed out how the publication by various labs of router "benchmark" 
figures had caused the manufacturers to take notice and pay attention.  The speaker 
hoped the WG's work would have the same effect.