Reported by Steve Casner/USC-ISI

Minutes of the Audio/Video Transport Working Group (AVT)

The AVT Working Group met during three separate sessions.  The first
session began with presentations of candidate protocols for real-time
audio/video transport, followed by a lively discussion of the
differences among the candidates and the underlying questions implied by
those differences.  The discussion resumed in the second session and
part of the third, followed by live demonstrations of experimental
packet audio and video programs.

As part of the second IETF "audiocast", live audio and video from all
three sessions were transmitted via UDP and IP multicast to participants
at a number of locations around the world.  At least two remote
participants made multiple contributions to the Working Group
discussion.

1.  Presentations of Candidate Protocols

Steve Casner began with a quick review of the descriptions of the
Network Voice Protocol (NVP-II) data packet format and the first-cut
strawman protocol from the San Diego meeting, then presented a
second-cut strawman based on the discussions in San Diego.  The data
packet header contains the following fields:


   o Timestamp (16 bits of seconds + 16-bit fraction)
   o Packet Sequence Number (16 bits)
   o Flow Identifier (8 bits)
   o Options Length (8 bits)
   o Options
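
As a rough illustration of the field list above, the fixed part of the
strawman header could be packed and parsed as follows.  This is only a
sketch: the field order, network byte order, and an Options Length
counted in octets are assumptions not stated in these Minutes, and the
function names are hypothetical.

```python
import struct

# Fixed part of the second-cut strawman header, in the order listed above.
# ASSUMPTIONS (not from the Minutes): network byte order, fields laid out
# in list order, and Options Length counted in octets.
_FIXED = struct.Struct("!HHHBB")  # 2 + 2 + 2 + 1 + 1 = 8 octets


def pack_strawman(seconds, fraction, seq, flow_id, options=b""):
    """Build the header: timestamp (16.16), sequence, flow id, options."""
    if len(options) > 0xFF:
        raise ValueError("options do not fit an 8-bit length field")
    fixed = _FIXED.pack(seconds & 0xFFFF, fraction & 0xFFFF,
                        seq & 0xFFFF, flow_id & 0xFF, len(options))
    return fixed + options


def unpack_strawman(packet):
    """Split a packet into header fields, options, and media payload."""
    seconds, fraction, seq, flow_id, opt_len = _FIXED.unpack_from(packet)
    body = _FIXED.size
    return {
        "seconds": seconds,
        "fraction": fraction,
        "seq": seq,
        "flow_id": flow_id,
        "options": packet[body:body + opt_len],
        "payload": packet[body + opt_len:],
    }
```

Under these assumptions the fixed header occupies 8 octets before any
options.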


Since Van Jacobson could not attend, Steve also described the protocol
used by the vat audio program, based on a protocol description sent by
Van to the rem-conf list.  The data packet header format is:


   o Protocol Version (2 bits)
   o Number of Site Identifiers to follow header (6 bits)
   o Start-of-Talkspurt Flag (1 bit)
   o Audio Format/Encoding (5 bits)
   o Conference Identifier (16 bits)
   o Timestamp (32-bit audio sample counter)
   o Site Identifiers (0 to 63; 32 bits each)
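
The bit-level layout of the vat header might be handled as in the
sketch below.  The fields listed above add up to 62 bits before the
site identifiers; this sketch assumes two unused bits pad the second
octet between the talkspurt flag and the format field, which these
Minutes do not state.  Byte order and the function names are likewise
assumptions.

```python
import struct


def pack_vat(version, talkspurt, fmt, conf_id, timestamp, site_ids=()):
    """Build a vat-style header followed by 0..63 site identifiers."""
    if not 0 <= version < 4:
        raise ValueError("version must fit in 2 bits")
    if not 0 <= fmt < 32:
        raise ValueError("format must fit in 5 bits")
    if len(site_ids) > 63:
        raise ValueError("at most 63 site identifiers")
    octet0 = (version << 6) | len(site_ids)       # version + site id count
    octet1 = (0x80 if talkspurt else 0) | fmt     # flag, 2 pad bits, format
    head = struct.pack("!BBHI", octet0, octet1,
                       conf_id & 0xFFFF, timestamp & 0xFFFFFFFF)
    return head + b"".join(struct.pack("!I", s) for s in site_ids)


def unpack_vat(packet):
    """Parse the header and return its fields plus the audio payload."""
    octet0, octet1, conf_id, timestamp = struct.unpack_from("!BBHI", packet)
    nsid = octet0 & 0x3F
    site_ids = [struct.unpack_from("!I", packet, 8 + 4 * i)[0]
                for i in range(nsid)]
    return {
        "version": octet0 >> 6,
        "talkspurt": bool(octet1 & 0x80),
        "format": octet1 & 0x1F,
        "conf_id": conf_id,
        "timestamp": timestamp,
        "site_ids": site_ids,
        "payload": packet[8 + 4 * nsid:],
    }
```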


Both of these data packet formats depend on a session/control protocol
to carry information that is not required in every data packet.  Henning
Schulzrinne described the extensions to the vat session protocol used in
his NEVOT audio program, in particular the periodic transmission of the
sender's state (the current time and how many samples have been
transmitted) to enable measurement of loss at the receiver.
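
One way a receiver might use such sender reports to estimate loss is
sketched below.  The report representation (a pair of sender time and
cumulative samples transmitted) and the function name are hypothetical;
the actual NEVOT message format is not given in these Minutes.

```python
def loss_fraction(prev_report, curr_report, received_between):
    """Estimate the fraction of samples lost between two sender reports.

    Each report is a hypothetical (sender_time, samples_sent) pair, where
    samples_sent is the sender's cumulative count.  received_between is
    the number of samples the receiver actually got in that interval.
    """
    sent = curr_report[1] - prev_report[1]
    if sent <= 0:
        return 0.0  # nothing was transmitted, so nothing could be lost
    lost = max(sent - received_between, 0)
    return lost / sent
```

Because the sender counts only samples it actually transmitted,
silence suppression does not register as loss in this estimate.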

Simon Hackett gave an impromptu overview of his Multimedia Data Switch
(MMDS) application and protocol.  For purposes of experimentation, Simon
chose to use large headers including a variety of fields to make the
data self-describing.  He also continues to send packet headers during
silence as a keep-alive, but just omits the data to reduce the
bandwidth.

See section 5 below for references on these protocols.

2.  Discussion of Protocol Differences

The goal of the discussion was to identify the issues that must be
resolved in order to produce a draft protocol.  The primary ones were:


   o Timestamp format, media sample clock or real time
   o Sequence number versus start-of-talkspurt flag
   o What multiplexing is required beyond address+port
   o Whether or not to indicate encoding format in data packets


The first two issues underlie a key question for the Working Group,
namely whether we should define one real-time transport protocol or
multiple application-specific protocols.  The rough consensus was for
the former, but this may conflict with ease of implementation.

The Working Group discussed timestamp formats at the last meeting and
this one, but the issue is still not finally decided.  For purposes of
synchronization among multiple media sources, the only practical means
is to relate all streams to real time (synchronized time of day).  This
would be simplified if the timestamps are in real time, but the
implementation of audio buffering is much easier with an audio sample
clock timestamp.  The timestamp format could be converted either at the
sender or receiver; what's needed is a detailed analysis of the
tradeoffs.
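
The conversion between the two timestamp formats can be sketched as
follows, assuming the converter knows one reference pair (a sample
count and the real time at which it was taken) and the sample rate.
The 8000 Hz default, the 32-bit wraparound handling, and the function
names are assumptions made for illustration.

```python
def samples_to_realtime(sample_ts, ref_samples, ref_time, rate=8000):
    """Map a 32-bit sample-counter timestamp to real time in seconds.

    ref_samples/ref_time is a known correspondence between the sample
    clock and real time; rate is the (assumed) sampling rate in Hz.
    """
    delta = (sample_ts - ref_samples) & 0xFFFFFFFF  # tolerate counter wrap
    return ref_time + delta / rate


def realtime_to_fixed(seconds):
    """Encode real time as 16 bits of seconds + 16-bit fraction."""
    return int(round(seconds * 65536)) & 0xFFFFFFFF


def fixed_to_realtime(fixed):
    """Decode a 16.16 fixed-point timestamp back to seconds."""
    return fixed / 65536
```

Either endpoint could run this conversion; the sender knows the
reference pair directly, while a receiver would have to learn it
through the session/control protocol.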

The strawman protocols propose a packet sequence number in addition to
the timestamp in order to differentiate lost packets from packets not
sent during silence.  The vat protocol uses a flag on the first packet
of a talkspurt because packet mis-ordering makes the sequence number
hard to use.  On the other hand, a sequence number may be required for
video applications that don't have talkspurts but require multiple
packets per frame all with the same timestamp.
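
The role of the sequence number in separating loss from silence can be
illustrated with the sketch below.  It assumes in-order arrival,
timestamps in units of samples, and a fixed number of samples per
packet; none of these is specified by the strawman, and the function
name is hypothetical.

```python
def classify_gap(prev, curr, samples_per_packet):
    """Explain the gap between two consecutively received packets.

    prev and curr are (sequence_number, timestamp) pairs.  Returns
    (lost_packets, silence_samples): a jump in the 16-bit sequence number
    means packets were lost, while any timestamp advance beyond what the
    sent packets account for is attributed to silence.
    """
    lost = ((curr[0] - prev[0]) & 0xFFFF) - 1
    ts_gap = (curr[1] - prev[1]) & 0xFFFFFFFF
    accounted = (lost + 1) * samples_per_packet
    silence = max(ts_gap - accounted, 0)
    return lost, silence
```

With only a timestamp and no sequence number, the second and third
cases in a test such as (5, 1000) to (8, 1480) versus (5, 1000) to
(6, 1480) would be indistinguishable, which is the problem the
start-of-talkspurt flag addresses in a different way.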

The Flow ID in the strawman protocol serves two purposes:  it provides
multiplexing of multiple streams (e.g., audio and video) from the same
source on one IP multicast address and port, and it allows for different
encodings to be used, with each Flow ID bound to an encoding descriptor
using the session/control protocol.  As defined, the vat protocol
includes an explicit encoding format field in the data packet, but the
Working Group deemed 5 bits to be too small a number.  The vat encoding
values could also be bound to a dynamic set of encoding descriptors using a
control protocol.
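
A receiver-side table binding Flow IDs to encoding descriptors might
look like the sketch below.  The descriptor fields, the class names,
and the per-source keying are hypothetical; these Minutes do not define
what an encoding descriptor contains or how the control protocol
delivers it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncodingDescriptor:
    """Hypothetical descriptor delivered by the session/control protocol."""
    medium: str        # e.g. "audio" or "video"
    name: str          # encoding name, e.g. "PCM"
    sample_rate: int   # media clock rate in Hz


class FlowTable:
    """Maps (source, flow_id) to the encoding bound by a control message."""

    def __init__(self):
        self._bindings = {}

    def bind(self, source, flow_id, descriptor):
        if not 0 <= flow_id <= 0xFF:
            raise ValueError("Flow ID must fit in 8 bits")
        self._bindings[(source, flow_id)] = descriptor

    def lookup(self, source, flow_id):
        """Return the bound descriptor, or None if no binding is known."""
        return self._bindings.get((source, flow_id))
```

A data packet then needs to carry only the 8-bit Flow ID, and rebinding
an ID through the control protocol changes the encoding without
changing the data packet format.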

The vat Conference ID discriminates among conferences in case of a
collision in random IP multicast address allocation and because many
BSD-derived systems don't allow discriminating on the multicast
destination address.  The strawman assumes a repair of the BSD
deficiency (which seems feasible at this time for multicast-capable
systems) and assumes some other method to avoid address collisions.

3.  Completeness and Compatibility with Connection Management

In addition to resolving differences among the protocol proposals, we
must consider whether the protocols are sufficiently complete.  Unlike
the audio and video conferencing applications, distributed simulation
and PBX trunking may require aggregation of multiple frames of data into
a single packet.  If the frames can all share the same header
information, then aggregation can be consigned to the next layer up; if
not, then some additional encapsulating mechanism would be required.  We
did not consider this further.

Another extension would be flow control.  In previous Working Group
discussions, it has been assumed that network resource management
mechanisms and protocols would be available to allow real-time
applications to avoid congestion.  Christian Huitema pointed out that at
least over some paths we will probably need a feedback mechanism to
allow adjustable codecs to accommodate congestion.  The Group was unsure
whether an application-independent feedback mechanism could be defined.
Christian is to write a specification as a starting point.

This Working Group's low-level protocol must also be compatible with
higher-level connection management protocols such as those under
discussion in the Remote Conferencing Architecture BOF.  Provision of
encoding format selections from a conference directory server seems
straightforward.  However, the server must also have a means to acquire
an IP multicast address.  Lixia Zhang suggested (remotely!) that we
really should consider a distributed system of servers to hand out
globally unique IP multicast addresses; this capability will be needed
by several groups considering multicast, not just ours.

4.  Software Encoding and Enumeration

The real-time transport protocol should be independent of the media
encoding algorithms and formats that belong to the next higher layer
except that the format must be identified by the lower layer.  However,
in keeping with the Working Group goal to foster interoperation and
experimentation with packet audio and video, it may be valuable to agree
on some (perhaps low performance) software compression techniques for
use until hardware is generally available.  This suggests that some of
the encoding formats we need to identify will be non-standard and hence
not included in any standard enumeration.

The Working Group feels a strong need to pick up a task that has been
deferred by others, to define an IANA-managed enumeration or naming
convention for audio and video encoding algorithms to enable
interoperation.  The enumeration should not be part of the protocol
itself, but the protocol must provide the space to carry the encoding
identification.  There was substantial discussion of numeric vs
text/parametric identification of formats.  This issue was not resolved.

The third Working Group session was concluded with descriptions and
demonstrations of the software encoding algorithms developed by Working
Group participants.  Paul Milazzo gave an update on the protocol for the
BBN Desktop Video Conference program, which was used to multicast packet
video from IETF.  Christian Huitema showed the INRIA H.261 video
compression software.  Hans Eriksson described the packet audio and
video experiments at SICS.

5.  Further Discussion

While several issues were not resolved, we laid out the considerations
for each choice well enough to guide the design of a complete set of
consistent choices as the first draft protocol from this Group.  Our
(revised) goal is to have an Internet Draft protocol submitted by
November.  Further discussion by email will be required to make this
happen.

During the IETF meeting, some notes from the first session, including a
description of the strawman and vat protocols, were sent to the rem-conf
list.  They should be in the archive, or may be requested from
casner@isi.edu.  A message from last March on MMDS is also available.

An extensive summary of the issues and a protocol recommendation have
been prepared by Henning Schulzrinne and are available from:


    gaia.cs.umass.edu:~ftp/pub/rtp/rtp.ps


This working paper will be made an Internet Draft for wider
distribution.

Thanks to Eve Schooler, Henning Schulzrinne and Christian Huitema for
taking the notes from which these Minutes were prepared.

Attendees

George Abe               abe@infonet.com
J. Allard                jallard@microsoft.com
John Batzer
Lou Berger               lberger@penril.com
James Berry              beri@sandia.llnl.gov
Luc Boulianne            lucb@cs.mcgill.ca
Scott Brim               swb@cornell.edu
Alan Bryenton            bryenton@bnr.ca
Randy Butler             rbutler@ncsa.uiuc.edu
Stephen Casner           casner@isi.edu
Yee-Hsiang Chang         yhc@concert.net
Andrew Cherenson         arc@sgi.com
Robert Clements          clements@bbn.com
Michael Collins          collins@ccc.nersc.gov
Steve Deering            deering@parc.xerox.com
Tony DeSimone            tds@hoserve.att.com
Jack Drescher            drescher@concert.net
Hans Eriksson            hans@sics.se
Julio Escobar            jescobar@bbn.com
Roger Fajman             raf@cu.nih.gov
Margaret Forsythe        mrf@ftp.com
Osten Franberg           euaokf@eua.ericsson.se
Ron Frederick            frederick@parc.xerox.com
Jerry Friesen            jafries@sandia.llnl.gov
Robert Gilligan          Bob.Gilligan@eng.sun.com
Simon Hackett            simon@internode.com.au
Robert Hagens            hagens@ans.net
Christian Huitema        christian.huitema@sophia.inria.fr
Peter Kirstein           P.Kirstein@cs.ucl.ac.uk
Jim Knowles              jknowles@trident.arc.nasa.gov
Padma Krishnaswamy       kri@sabre.bellcore.com
Matt Mathis              mathis@a.psc.edu
Cindy Mazza
Donald Merritt           don@brl.mil
Paul Milazzo             milazzo@bbn.com
Robert Mines             rfm@sandia.llnl.gov
Donald Morris            morris@ucar.edu
Ari Ollikainen           ari@es.net
Roger Osmond             bytex!rfo@uunet.uu.net
Larry Palmer             lp@decvax.dec.com
Michael Powell           mdpowel@pacbell.com
Russell Pretty           pretty@bnr.ca
K. K. Ramakrishnan       rama@erlang.enet.dec.com
Bradley Rhoades          bdrhoades@mmc.mmmg.com
Allan Rubens             acr@merit.edu
Henry Sanders            henrysa@microsoft.com
Eve Schooler             schooler@isi.edu
Koichiro Seto            seto@hitachi-cable.co.jp
Vincent Sgro             sgro@cs.rutgers.edu
Louis Steinberg          louiss@vnet.ibm.com
Terrance Sullivan        terrys@newbridge.com
Sally Tarquinio          sallyt@gateway.mitre.org
Claudio Topolcic         topolcic@nri.reston.va.us
Mark Uhrmacher           maui@tabasco.lcs.mit.edu
Andrew Veitch            aveitch@bbn.com
John Vollbrecht          jrv@merit.edu
David Waitzman           djw@bbn.com
Sandro Wallach           sandro@elf.com
Abel Weinrib             abel@bellcore.com
Rick Wilder              wilder@ans.net
Walter Wimer             walter.wimer@andrew.cmu.edu
Linda Winkler            lwinkler@anl.gov
Jeff Wong                jaw@io.att.com
Richard Woundy           rwoundy@rhqvm21.vnet.ibm.com
John Wroclawski          jtw@lcs.mit.edu
Paul Zawada              Zawada@ncsa.uiuc.edu