Minutes of the Audio/Video Transport Working Group

Reported by Colin Perkins and Steve Casner.

The audio/video transport working group met twice at the 47th IETF in
Adelaide. In the first session the group discussed the update of the RTP
specification, work on a new profile for unicast RTP with retransmission,
and RTP header compression and multiplexing. The second session included
discussion of a number of payload formats, transport of MPEG-4, and
authentication of RTP streams.

The meeting started with a review of work in progress. A number of RFCs 
have been published since the last meeting:
- RTP payload format for generic FEC (RFC 2733)
- Guidelines for authors of RTP payload formats (RFC 2736)
- Sampling group membership (RFC 2762)
In addition, a number of drafts are with the IESG, awaiting publication:
- RTP MIB
- RTP payload format for DTMF tones
- RTP payload format for real-time pointers

The working group last call on the RTP specification and audio/video profile 
concluded at the Washington DC meeting, and the set of edits agreed at that 
meeting were completed in January, resulting in draft-ietf-avt-rtp-new-06.txt 
and draft-ietf-avt-profile-new-08.txt. 

A number of additional minor edits have since been made to the RTP
specification, based on comments made on the mailing list, leading to
draft-ietf-avt-rtp-new-07.txt. Updates for the profile text on G.729 and
G.723.1 have been received from the ITU-T, but have yet to be incorporated.

Considering the companion documents to the RTP specification, the MIME
registration document (draft-ietf-avt-rtp-mime-02.txt) has had L16
pre-emphasis parameters added, whilst the RTCP bandwidth modifiers document
(draft-ietf-avt-rtcp-bw-01.txt) has not changed (just an increase in version
number, to prevent the draft expiring). Both of these are intended for
proposed standard. 

The RTP specification and audio/video profile are considered ready for IETF
last call; however, they cannot be issued as draft standard RFCs until an
interoperability statement has been produced. 

The current status of the testing was presented by Colin Perkins. There are a 
number of drafts listing interoperability requirements and testing strategies:
- draft-ietf-avt-rtp-interop-02.txt 
- draft-ietf-avt-profile-interop-00.txt
- draft-ietf-avt-rtptest-02.txt
which have not changed since the last meeting. There is also a web page 
  http://www-mice.cs.ucl.ac.uk/multimedia/mist/avt/RTPinterop/
showing the current status. It was noted that progress has been slow and
that we urgently require interoperability statements from implementors of
RTP and the audio/video profile.

In particular, please contact the working group chairs if you have an
implementation which implements any of the following features of RTP:
padding, header extension, SDES PHONE/LOC/PRIV, BYE with multiple
SSRC/reason text, RTCP APP packets, encryption, RTCP reconsideration
algorithms and step join back-off, SSRC collision/loop detection,
modification of the RTCP bandwidth fraction, or transport using TCP.

Furthermore, if you have implemented any of the following codecs and have
tested against another implementation, please contact the chairs: 1016,
G.726-32, G.723.1, G.722, QCELP, G.728, G.729, GSM HR/EFR, PCMA, CellB, JPEG,
MPT, MP1S, MP2P, BMPEG, H.263, H.263+, BT.656.

The other area of discussion relating to the base RTP specification was
congestion control. Steve Casner noted that the IESG are concerned that RTP
does not implement congestion control, and that this has the potential to
harm the network. Other protocols are required to have congestion control
to advance on the standards track; why should RTP be exempt?

It was suggested that the requirements for congestion control are, to some
extent, profile specific: it makes sense to include a warning of the dangers 
of not implementing congestion control in the main RTP specification, but to
defer details to the profiles. A number of people agreed with this, noting
also that there is not much more we can design in right now, but a discussion 
of how to do congestion control would be a useful addition - exploring the 
design space for adaptation, and why it's needed. 

Mark Handley noted that we have work ongoing with the retransmission profile
for unicast, which needs to have strong congestion control. The existing
A/V profile also needs a discussion of the TCP-equivalent rate and when to
stop or adapt if your loss rate exceeds that. Steve Casner asked if the
existing feedback from RTCP was sufficient to perform rate adaptation for
congestion control. Mark Handley answered that it probably was not, but
should be sufficient to know when to leave the session.
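
As a rough illustration of the TCP-equivalent rate mentioned above, the
sketch below uses the widely cited simplified steady-state approximation
rate = sqrt(3/2) * MSS / (RTT * sqrt(p)). The minutes do not say which
formula the group had in mind, so the choice of formula, the function
names, and the parameters are all assumptions made for illustration.

```python
import math


def tcp_equivalent_rate_bps(packet_size_bytes, rtt_seconds, loss_fraction):
    """Approximate TCP-equivalent throughput in bits per second.

    Assumption: uses the simplified steady-state approximation
    sqrt(3/2) * MSS / (RTT * sqrt(p)); the minutes do not name a formula.
    """
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    if loss_fraction <= 0:
        # With no observed loss the approximation gives no upper bound.
        return math.inf
    return (math.sqrt(1.5) * packet_size_bytes * 8
            / (rtt_seconds * math.sqrt(loss_fraction)))


def should_adapt(current_rate_bps, packet_size_bytes, rtt_seconds,
                 loss_fraction):
    """Return True if the sender exceeds the TCP-equivalent rate.

    A hypothetical policy hook: a sender above this rate would be expected
    to reduce its rate or leave the session.
    """
    limit = tcp_equivalent_rate_bps(packet_size_bytes, rtt_seconds,
                                    loss_fraction)
    return current_rate_bps > limit
```

For example, with 1000-byte packets, a 100 ms round-trip time and 1% loss,
the approximation gives a limit of a little under 1 Mb/s, so a 2 Mb/s
stream would be expected to adapt while a 64 kb/s stream would not.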

The consensus of the group was that the main RTP spec will mention that
congestion control is important, but will defer to the profiles for the
details. Mark Handley volunteered to write the congestion control section
for the A/V profile, which he did before the end of this IETF. The
suggested text will be discussed on the mailing list.

Steve Casner presented work on an enhanced comfort noise RTP payload format
for Robert Zopf. This is a replacement for the existing payload format, as
discussed at the previous meeting, and includes noise spectrum data in
addition to energy level (extending the one-octet existing format to a
multi-octet format). It is planned that this will replace the existing
format. Steve Casner asked if anyone knew of any compatibility problems that
would arise from the new definition specifying a longer format under the
existing static type number. Feedback from the group on these compatibility
issues is solicited.

The next items for discussion were retransmission schemes for RTP packets,
and RTCP reporting extensions. 

Colin Perkins presented work on RTCP Reporting Extensions for Timur Friedman 
(draft-friedman-avt-rtcp-report-extns-00.txt). This proposes a framework to
extend SR/RR packets, using profile specific reporting extensions. This
comprises a minimal header (two fields only: type and length) with several
proposed uses:
- Run length encoding (RLE) of the packet loss pattern
- RLE of duplicates
- List of timestamps
- Detailed loss/duplicate/jitter statistics

The motivation for the work is that RTCP SR/RR packets include limited 
information, and a number of applications (e.g. MINC and MRM) can use
extended statistics if provided in a uniform format. There are a number 
of open issues: How to handle large extensions? Include all extensions?  
Some?  Which? As extensions to the SR/RR reception report array, or in
a separate APP packet?  Feedback is solicited.
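
To make the idea of run length encoding a packet loss pattern concrete,
here is a minimal sketch. It works on a list of received/lost flags and is
not the wire format from the draft, which these minutes do not describe;
the function names and the (flag, length) representation are assumptions.

```python
def rle_encode(received_flags):
    """Run-length encode a sequence of booleans (True = packet received).

    Returns a list of (flag, run_length) tuples. This representation is
    illustrative only, not the encoding defined in the draft.
    """
    runs = []
    for flag in received_flags:
        if runs and runs[-1][0] == flag:
            # Extend the current run.
            runs[-1] = (flag, runs[-1][1] + 1)
        else:
            # Start a new run.
            runs.append((flag, 1))
    return runs


def rle_decode(runs):
    """Expand (flag, run_length) tuples back into the original flag list."""
    flags = []
    for flag, length in runs:
        flags.extend([flag] * length)
    return flags
```

Long bursts of received or lost packets collapse into single entries, which
is why such an encoding can be attractive for reporting loss patterns
compactly.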

The next presentation was on the subject of RTCP-based retransmission
(draft-podolsky-avt-rtptx-01.txt) by Koichi Yano. The changes from the
previous version include:
- Now posed as a new profile, RTP/RX, which inherits the A/V profile,
  except for the RTCP interval (it allows immediate NACKs, but still keeps
  the average bandwidth limit) and the provision of a new RTCP packet type
  for NACKs (including first sequence number, 16 bit loss bitmap, and SSRC)
- The SSRC is now included in NACK packets
- It has been simplified: only for NACK (former draft defined a
  multi-purpose ACK, but NACK is enough for most purposes)
- Removed RX-protocol subtype and flag fields (for simplicity, ease of
  implementation)
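
The NACK fields listed above (SSRC, first sequence number, 16 bit loss
bitmap) can be sketched as follows. The bit ordering is an assumption made
for illustration: the first sequence number is taken to be a lost packet,
and bit i of the bitmap (counting from the least significant bit) marks
packet first_seq + 1 + i as lost. The draft may order the bits differently,
and the type and function names here are hypothetical.

```python
from typing import List, NamedTuple


class Nack(NamedTuple):
    ssrc: int       # source whose packets are being reported lost
    first_seq: int  # 16-bit RTP sequence number of the first lost packet
    bitmap: int     # 16-bit mask; bit i set => first_seq + 1 + i also lost


def build_nacks(ssrc: int, lost_seqs: List[int]) -> List[Nack]:
    """Group lost sequence numbers (in transmission order) into NACKs."""
    nacks = []
    first = None
    bitmap = 0
    for seq in lost_seqs:
        seq &= 0xFFFF
        if first is not None:
            # Modular distance copes with 16-bit sequence number wrap.
            offset = (seq - first) & 0xFFFF
            if offset == 0:
                continue  # duplicate of the first lost packet
            if offset <= 16:
                bitmap |= 1 << (offset - 1)
                continue
            # Too far from the current first_seq: emit and start a new NACK.
            nacks.append(Nack(ssrc, first, bitmap))
        first, bitmap = seq, 0
    if first is not None:
        nacks.append(Nack(ssrc, first, bitmap))
    return nacks


def expand_nack(nack: Nack) -> List[int]:
    """Return the list of sequence numbers a NACK reports as lost."""
    seqs = [nack.first_seq]
    for i in range(16):
        if nack.bitmap & (1 << i):
            seqs.append((nack.first_seq + 1 + i) & 0xFFFF)
    return seqs
```

Under these assumptions a single NACK can report up to 17 losses (the first
packet plus 16 bitmap positions), so a burst of loss needs only one
feedback packet.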

The need for congestion control to be well specified and non-optional in
this profile was noted. Mark Handley stated that we have to assume that
loss is due to congestion; this means that just re-transmitting lost
packets is something we have to be very careful of, since it can lead to a
positive feedback loop, worsening congestion. The TCP equivalent throughput
should be considered to be the maximum acceptable rate for a scheme such as
this. It was also noted that retransmissions bias your loss fraction sample, 
since by definition, they are only sent when the network is congested and 
likely to drop packets.

The effects of retransmission on RTCP receiver reports were discussed (e.g.
should packets repaired by retransmission be included in the loss fraction?), 
as was the possibility to extend the RR to include the number of NACKs sent, 
requested packets, duplicates, etc. Another open issue is how to denote the 
loss bitmap: as the first sequence number plus bitmap, or as the last
sequence number plus bitmap?

It was noted that there are a number of security implications to this
profile: for example, the potential for denial of service due to bogus
NACKs, especially if multicast is used.  Also NACKs may need to include 
an unpredictable nonce/cookie, to make this difficult to spoof. Finally, 
the NACK format is deployable for multicast, but does it make sense?
Note that retransmission is being considered only for unicast at this
point pending further work in the Reliable Multicast Research Group
and RMT working group.

The next steps for this draft are as follows:
- Re-issue as an AVT work item (draft-ietf-avt-...) 
- Add more description relating to SDP, RTSP
- Add more description of sender and receiver's recommended behavior
- Implement and test

The final presentation in this area was an RTP payload for selective
retransmission (draft-miyazaki-avt-rtp-selret-00) presented by Carsten
Burmeister. The target applications for this payload format are streaming
applications over wireless links which have a high bit error rate, which
implies non-congestive packet loss, hence a retransmission scheme is useful
(just the important data, if possible). The draft defines a new payload
format, with header additions to mark important packets, together with a
retransmission request scheme.

A number of comments were made following this presentation:
- It was noted that there are IPR considerations with this proposal.
- Steve Casner noted that it is not clear that a payload format is the right 
  thing to do here. Either this scheme is specific to MPEG, in which case it 
  should be merged with the MPEG payload format, or it should be a new RTP 
  profile.
- Some meeting participants did not like the use of two payload type
  identifiers, when a single one would be sufficient. The introduction 
  of excess de-multiplexing points is to be avoided.
- Concern was raised that the necessary changes to RTCP timing rules have
  not been addressed.
- A comment was made that the idea of marking important packets within an
  application level header is `hackish'. It may be better to multiplex under 
  different connections (e.g. all the I frames in one stream, P frames in a 
  different one) using the UDP/IP layer as a multiplex (e.g. use different 
  UDP ports, or a different diffserv TOS).

This section of the meeting concluded with the suggestion that this
proposal could be combined with the previous.

The next section of the meeting discussed RTP compression and multiplexing.
The first presentation was by Tmima Koren on enhancements to CRTP. This
work makes CRTP more robust to packet loss, and allows it to work better on
high delay links. Discussion related to whether the extra complexity of
this form of compressed packet is justified by the efficiency gain. It was
noted that we need to introduce this complexity to correct for the
non-constant increment of the IP-ID. Carsten Bormann asked how much we care
about preserving the IP-ID, since it's redundant if DF is set. It was noted
that the IESG have previously expressed pushback on schemes which don't
protect the IP-ID. If we consider patterns of the IP-ID, can we reduce the
number of bits (e.g. we may be able to send the least significant bits of
the IP-ID in many common cases)? We don't always need 16 bits for this. How
much do we care about backwards compatibility with RFC2508, since we have
to negotiate use of this extension anyway? Are these changes the correct
ones to make when moving CRTP to draft standard? These questions require
further discussion on the mailing list.
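
The suggestion of sending only the least significant bits of the IP-ID can
be illustrated with the sketch below. It is not the encoding from the CRTP
enhancements draft; it assumes the decompressor knows the previous IP-ID
and that the IP-ID has advanced by fewer than 2^k since then. The function
names and the parameter k are hypothetical.

```python
def ip_id_lsbs(ip_id, k):
    """Return the k least significant bits of a 16-bit IP-ID."""
    return ip_id & ((1 << k) - 1)


def reconstruct_ip_id(previous_ip_id, lsbs, k):
    """Rebuild the full 16-bit IP-ID from its k least significant bits.

    Assumption: the IP-ID advanced by between 0 and 2**k - 1 (modulo 2**16)
    relative to previous_ip_id. Larger jumps are reconstructed incorrectly
    and would need the full 16-bit value to be sent instead.
    """
    mask = (1 << k) - 1
    advance = (lsbs - previous_ip_id) & mask
    return (previous_ip_id + advance) & 0xFFFF
```

With k = 4, an IP-ID that moves from 1000 to 1003 is recovered from four
bits, as is one that wraps from 0xFFFE to 0x0003. A jump from 1000 to 1020
exceeds the assumed range and is not recovered, which is the case where the
full value has to be sent.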

Carsten Bormann gave a pointer to the robust header compression working
group, which is doing related work.

Bruce Thompson presented tunnel compressed RTP (draft-ietf-avt-tcrtp-00.txt).  
The scheme proposed in the original draft submitted in Oslo
(draft-wing-avt-tcrtp-00.txt) has since been broken into distinct parts: IP
Tunneling, PPP Multiplexing, and CRTP with enhancements. The current draft
reflects this split, and describes how those parts fit together. It was
noted that the bandwidth efficiency of multiplexing in this manner is
equivalent to CRTP, once 3-4 calls are multiplexed.

Steve Casner noted that this is intended to be the standard multiplexing
solution for RTP streams in place of other "RTP multiplexing" schemes
discussed previously in AVT. Comments from the authors of the other schemes 
on the suitability of this solution would be greatly appreciated.

In the last presentation of the first day, Lou Berger discussed the
extension of CRTP to work with MPLS (draft-berger-mpls-hdr-comp-00.txt,
draft-berger-mpls-hdr-comp-over-ppp-00.txt). This work will be done in 
the MPLS working group, in conjunction with AVT. Open issues include:
- A bit naming collision with draft-koren-avt-crtp-enhance-01.txt (both
  define a bit named "N" in different locations)
- The current draft doesn't support the CRTP enhancements introduced by
  Koren
- Should MPLS/IP header compression over PPP reuse the same packet type 
  values as IP header compression or new values to allow for easier debugging?

The second day of the meeting comprised discussion of RTP payload formats
for a number of codecs. The first of these was the G.722.1 payload format
(draft-ietf-avt-rtp-g7221-00.txt) presented by Steve Casner for Tony
Crossman. This is a straightforward payload format, with one or more codec
frames being packed into an RTP packet with no additional payload header.
It was noted that G.722.1 is specified to use a 16kHz RTP timestamp clock
(unlike G.722 which was mistakenly specified to use an 8kHz clock which is
retained for backward compatibility).  G.722.1 also supports several data
rates, which must be signaled out of band.

A number of drafts were submitted relating to the RTP payload format for
the AMR speech codec. The current status of this work was presented by
Johan Sjoberg and Ari Lakaniemi. It has been agreed that the authors of
these drafts will merge them into a single draft over the next few weeks.
In addition, the MIME type registration will be merged into the payload
format document, and an additional document will be referenced to specify
the storage format.

The RTP payload formats for DV audio/video were presented by Katsushi
Kobayashi. Changes to the DV video format are minor. The major discussion
point was if audio-only streams should be sent as DV format, or converted
to, for example, L16. It was decided to leave the current specification
unchanged in recommending that audio-only streams be converted to a native
format. We also discussed how to specify audio channels, since DV allows
for more channels than the AIFF-C convention used in the A/V profile, hence
an SDP attribute may be needed to define the channel ordering. This is only
needed when sending the data unbundled from DV, as native audio. This can
clearly be defined for the new audio formats specified here, but does it
make sense to allow this channel specification to be applied to L16 format?
Should it be allowed for channel specifications which fit into the AIFF-C
convention?  Carsten Bormann stated the question as: can we just add, after
the fact, a=fmtp parameters to existing payload formats? I.e. can this be
written to apply to L16 also? Do we need to revise the L16 MIME
registration? After some discussion Steve Casner noted that the group
consensus was to go ahead and define this SDP method.

The RTP payload formats for HDTV and AC3 audio were discussed by Allison
Mankin. She noted that a lot of implementation work has been conducted
since the last meeting (by the University of Washington, 3Com, and
Tektronix) and this has led to a number of changes in the drafts, and
some issues remain. Bill Nowicki has asked if this can be written as a
payload format for any kind of uncompressed video, to make it more generic?
It has also been noted that a breakdown on which SMPTE standards should be
included here needs to be added: there are compressed (but still high
bandwidth) payloads which may need a format. It is not entirely clear which
way to cut the document, and advice is solicited. It is also unclear if
1920 scanlines is enough. It is now, but what about in future? Can we use
SDP signaling to save bits in the header but still allow a wider scanline
range? What about the Vertical Blanking Interval? This has been left out of
the current draft, to be thought about (since it increases the data rate
significantly). Several people have wanted to include this, so it will most
likely go back in. Is a 90kHz timestamp sufficient? Finally, a plea was
made for anyone who has implemented RTP in hardware to contact Allison;
insights are sought...

Ross Finlayson presented a revision of the more-loss-tolerant payload
format for MP3 audio (draft-ietf-avt-rtp-mp3-01.txt). Since the previous
version, the draft has been updated to include optional support for
interleaving (as discussed in Washington DC), and the process by which this
is achieved was described.  Feedback is solicited, and it was proposed to
advance this document to proposed standard after the next meeting. A question
was asked regarding performance analysis, relative to RFC 2250.  The
payload format just moves bytes around compared to RFC 2250, and is not
expected to be significantly processor intensive. Subjective tests show it
sounds better, but these were not done in a very scientific manner.

MP2 transport stream extensions were presented by Steve Casner for Humphrey
Liu. This draft extends RFC 2250 to allow an alternate RTP timestamp clock
rate of 27MHz so that every MPEG packet in a 40Mb/s multi-program MPEG
transport stream can be positioned accurately in time, and defines a
"piecewise CBR" method to reconstruct timing at the receiver. At the last
AVT meeting, some participants questioned whether this would work.  Since
the last meeting, work has been underway on experimental validation of the
draft. This work was outlined, and it was noted that it seems to work so
far, but more experiments are needed.
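
The motivation for the 27MHz clock can be checked with some simple
arithmetic, sketched below. It assumes the standard 188-byte MPEG transport
stream packet and compares the proposed clock with the 90kHz clock of RFC
2250; the function names are hypothetical.

```python
TS_PACKET_BITS = 188 * 8  # one MPEG transport stream packet, in bits


def packet_duration_us(stream_rate_bps):
    """Time taken to send one transport stream packet, in microseconds."""
    return TS_PACKET_BITS / stream_rate_bps * 1e6


def ticks_per_ts_packet(stream_rate_bps, clock_hz):
    """Number of RTP timestamp ticks that elapse per transport packet."""
    return TS_PACKET_BITS * clock_hz / stream_rate_bps


def tick_duration_us(clock_hz):
    """Duration of one timestamp tick, in microseconds."""
    return 1e6 / clock_hz
```

At 40Mb/s a transport packet lasts about 37.6 microseconds. That is roughly
3.4 ticks of a 90kHz clock, so one tick of rounding (about 11 microseconds)
is a large fraction of the packet spacing. At 27MHz the same packet spans
about 1015 ticks, so each packet can be placed to within about 0.04
microseconds.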

The discussion of MPEG-4 was introduced by Colin Perkins, with a review of
the consensus from the last meeting. It was noted that:
- We should allow MPEG-4 elementary stream codecs to be packetized in a manner 
  similar to other codecs, with standards track payload formats. The group has 
  adopted draft-ietf-avt-rtp-mpeg4-es-00.txt as a work item, for eventual 
  submission to the standards track.
- Multiplexed MPEG-4 media is to be treated in a similar manner to earlier
  bundled MPEG transport. Hence, we will consider a FlexMux payload format
  if one is submitted.
- We do not believe we fully understand the issues involved in the
  transport of the complete MPEG-4 system over RTP. Hence we will submit
  such payload formats for publication as experimental RFCs, whilst we gain
  implementation experience. At present we have two such payload formats:
  draft-ietf-avt-rtp-mpeg4-02.txt and draft-ietf-avt-mpeg4streams-00.txt
The primary focus of the discussion at this meeting was the payload format
for elementary streams, and in particular the transport of back-channel
information. We also had an update on one of the formats for the complete
system; the other payload format (draft-ietf-avt-rtp-mpeg4-02.txt) has not
changed since the last meeting.

The payload format for elementary streams (draft-ietf-avt-rtp-mpeg4-es-00.txt) 
was presented by Yoshihiro Kikuchi. Changes since the last meeting include:
- The marker bit in the visual format has changed from marking random
  access points to marking the last packet in a VOP (for consistency with
  other video formats).
- The timestamp resolution has a default of 90kHz, for consistency with
  other MPEG formats.
- The specification has been updated to match changes in MPEG-4 Version 2
  in the move from FPDAM to FDAM. 
There were also several minor editorial changes. A number of
interoperability tests have also been conducted, successfully. 

The payload format for elementary streams also includes an RTCP format for
MPEG-4 backward channel messages, such as NEWPRED error resilience. This
was presented by Shigeru Fukunaga. A number of issues were noted:
- Timing of Sending RTCP: RTCP packets should be sent as soon as possible;
  the issues are much as in the re-transmission profiles. Is a new profile
  required here?
- Multicast or Unicast: NEWPRED is workable with small scale multicast, but
  it is probably desirable to restrict this to unicast.
- Congestion Control: There is no description in the current draft
  (congestion control may not be a major issue, since the data rate of a
  media stream is not increased when NEWPRED is in use; although there is
  the additional backtraffic).
- Should backchannel information be transported in RTCP or in a
  separate RTP stream? The consensus of the group was that RTCP is appropriate.
- What should be the format of compound RTCP packets which include these
  backchannel messages?

Steve Casner noted that he thinks this doesn't need a new profile, but we
may need to relax some of the rules in the RTP spec to allow this. Carsten
Bormann noted that this is not the first codec which needs a backchannel,
and won't be the last one (for example, H.263+ needs a similar backchannel). 
Having a common means for sending these backchannel messages would be nice.
Maybe a more generic RTCP extension (profile addition)?  Jonathan Rosenberg
noted that a common backchannel may just be a namespace.  Or do we have
more information in common? Carsten Bormann noted that the value here is
more in getting the other issues out of the way - bandwidth and timing
issues, etc.

Steve Casner concluded with the chairs' view that we should proceed 
with the definition of this backchannel scheme as is, and consider a 
more general solution in future.

Paul Christ presented draft-ietf-avt-mpeg4streams-00.txt. Changes since 
the previous draft include:
- Adoption as an AVT work item for eventual submission as an experimental
  RFC, hence the name change (from draft-guillemot-avt-genrtp-03.txt).
- Flexmux section added, see draft-rgcc-flexmuxmpeg4-00.txt
- Payload Type: Different payload types should be assigned for ES, SL-PDU
  and FlexMux streams. A payload type in the dynamic range should be chosen. 
- TSOFFSET removed, E bits included

Steve Casner asked if this is ready for working group last call for
experimental? Paul Christ noted that there are RTCP retransmission issues
to be resolved (NEWPRED, etc).  Further discussion is needed, but if
retransmission is not relevant it is ready for last call.

The final presentation of the day was by Michael Thomas, summarizing the
PacketCable security extensions to RTP. There are two main goals of this
work: providing privacy and integrity for media, and selecting algorithms
friendly to large PSTN gateways. RTP header and payload are to be covered
by an MMH MAC, placed as a trailer in the packet; the payload is to be
protected by RC-4 encryption. It is hoped that a draft will be submitted
describing this work in time for the next meeting. It should also be noted
that there are several proposals for insertion of the MAC: hide it in RTP
padding, use a header extension, or use a profile extension to specify a
fixed-length trailer. Further discussion is clearly needed.

Due to lack of time, the meeting closed at this point. The final agenda
item - charter bashing - was omitted. It should be noted that the last
milestone on our charter was in November 1999, and we actually completed
most milestones! It's now time to re-charter or close down the group, and
since we still have ongoing work, we propose to re-charter.  The chairs'
proposal, which will be elaborated in more detail on the mailing list, is 
to include the following items in the revised charter:
- Move RTP to draft standard (needs interoperability statements, and a
  discussion of congestion control)
- RTP Multiplexing (move the TCRTP framework to BCP)
- Transport of MPEG-4
        - ES format to proposed
        - Formats for complete system to experimental
        - FlexMux?
        - Applicability statement (informational)
- New RTP profile including unicast retransmission and congestion control
- Authentication of RTP streams
        - New profile? Padding? Header extension?
- Ongoing development of payload formats 
This is a preliminary suggestion only; comments from the working group
are solicited.