Minutes of the Audio/Video Transport Working Group

Reported by Colin Perkins.

The audio/video transport working group held two full meetings in
Minneapolis and, in addition, a sub-group met to discuss the transport of
MPEG-4 in a telephone conference with the MPEG committee meeting in Korea.

Since the last meeting the group has published RFC2508 (IP/UDP/RTP header
compression). The group does not currently have any drafts awaiting IESG
approval for publication although some are now ready for working group last
call.

The revised RTP specification (draft-ietf-avt-rtp-new-03.txt) was presented
by Steve Casner. Recent changes include a clarification that the payload
type may change during a session, a review of sections 6.2 and 6.3 which
resulted in several minor corrections and a removal of the requirement that
inactive participant state is retained for 30 minutes (obsoleted by the
reconsideration rules), and several corrections to the code in appendix A.7.  
The specification is now believed to be complete: all members of the working 
group are encouraged to read, check and comment on this document.

The SSRC sampling draft (draft-ietf-avt-rtpsample-02.txt) is now complete
except that the wording of the IPR statement has to be updated to match the
guidelines in RFC2026. Once this is done it is intended to hold last call
on this document for experimental status.

The new RTCP conformance testing draft (draft-ietf-avt-rtcptest-00.txt) was
presented by Jonathan Rosenberg. This describes several tests which may be
performed on an RTP implementation to determine if it correctly implements
the RTCP send/receive rules. Reaction to this draft was favourable, and a
number of additional tests were noted for possible inclusion in a future
version: check that SSRC identifiers are randomly allocated and check
response to an SSRC collision. It was also noted that the draft assumes the
default RTCP sender/receiver bandwidth fractions, and should be made more
general. This document was adopted as an AVT work item, for eventual
progression as an informational RFC.  This draft also specifies the behavior 
of a software "test instrument" to be used in performing the tests; it
would be a tremendous service to the working group if one or more members
decided to implement this test instrument.
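
As an illustration of one of the additional tests noted above (response to an SSRC collision), the following is a minimal sketch only. The Participant class and check_collision_response function are hypothetical and are not taken from the draft; the sketch assumes the behaviour in the RTP specification, where a participant that sees its own SSRC in use sends a BYE for the old identifier and picks a new random one.

```python
import secrets

class Participant:
    """Toy model of an RTP participant's SSRC handling (hypothetical)."""

    def __init__(self):
        self.ssrc = secrets.randbits(32)   # random 32-bit identifier
        self.bye_sent_for = []             # SSRCs for which a BYE was sent

    def on_packet_from(self, other_ssrc):
        # A packet from another source using our SSRC is a collision.
        if other_ssrc == self.ssrc:
            old = self.ssrc
            self.bye_sent_for.append(old)
            while self.ssrc == old:        # pick a different identifier
                self.ssrc = secrets.randbits(32)

def check_collision_response(participant):
    """Inject a colliding packet and check the participant moved on."""
    before = participant.ssrc
    participant.on_packet_from(before)
    return participant.ssrc != before and participant.bye_sent_for == [before]

print(check_collision_response(Participant()))
```

A real test instrument would observe these effects on the wire rather than by inspecting internal state.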

Recent changes to the RTP A/V profile (draft-ietf-avt-profile-new-05.txt)
were presented by Steve Casner. The use of MUST, SHOULD, MAY, etc, is now
complete throughout the document, the possible use of a non-default RTCP
bandwidth fraction is noted, a "changes from RFC1890" section has been
added and the GSM-HR, GSM-EFR, QCELP, BT656, H263-1998 and BMPEG codecs
have been added. 

The use of non-default RTCP bandwidth fractions requires the definition
of appropriate SDP bandwidth modifiers. For example:
        b=RS:<kb/s>             RTCP sender bandwidth
        b=RR:<kb/s>             RTCP receiver bandwidth
However, it was noted that SDP allows bandwidth modifiers to be expressed
only as an integer number of kb/s. The SDP specification must be updated either to
allow bandwidth modifiers to be specified as being in b/s or to accept
fractional values.  Mark Handley expressed a preference for the former
choice and no counter-arguments were given.  It was agreed that the
specification of these modifiers could simply be added to the RTP A/V
profile, although since they may be of use in other profiles it may be
better to start a new draft to record additions and changes to SDP.
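
To illustrate why integer kb/s is too coarse, the following sketch works through the arithmetic for a hypothetical 64 kb/s session. It assumes the default fractions in the RTP specification (5% of the session bandwidth for RTCP, one quarter of that for senders). The function name and the session bandwidth are illustrative, and the values are printed in b/s only because that was the option preferred at the meeting, not because the syntax is settled.

```python
# Sketch: default RTCP bandwidth shares for a hypothetical session.
# Assumes the RTP defaults: RTCP gets 5% of the session bandwidth,
# and senders get 25% of the RTCP share.

def rtcp_shares_bps(session_kbps):
    """Return (sender_bps, receiver_bps) as integers in b/s."""
    session_bps = session_kbps * 1000
    rtcp_bps = session_bps * 5 // 100        # 5% of session bandwidth
    sender_bps = rtcp_bps * 25 // 100        # 25% of the RTCP share
    receiver_bps = rtcp_bps - sender_bps     # remainder for receivers
    return sender_bps, receiver_bps

sender, receiver = rtcp_shares_bps(64)
print(f"b=RS:{sender}")     # 800 b/s, i.e. 0.8 kb/s
print(f"b=RR:{receiver}")   # 2400 b/s, i.e. 2.4 kb/s
```

Neither value is a whole number of kb/s, so neither could be expressed under the existing integer-only rule.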

The new GSM payload formats are defined by reference to an ETSI document
only, yet it is unclear whether this is acceptable. Concern was raised that
a definition by reference made it difficult to find the information, yet
copying the relevant tables into the profile raises the potential for
inconsistency. A compromise was suggested whereby a copy of the format is
included in the profile as an appendix for convenience, but it is noted
that the ETSI document is authoritative.

A first draft document registering the RTP codec names in the MIME namespace
has been produced by Philipp Hoschka (draft-hoschka-rtp-mime-00.txt).  This
draft defines procedures for registration of codecs in the MIME namespace
with "encoding considerations" specifying how they are transported in RTP; it 
also registers the existing codecs in the A/V profile. At present it's a rough 
draft needing completion.  This draft is separate from the profile so that it 
may define procedures for carrying the codec data via more traditional MIME 
transports in addition to RTP; for example, draft-alvestrand-audio-l16-01.txt 
on the L16 codec should be merged into this draft.  This draft is referenced 
by the revised profile draft, but if advancement of the profile to Draft
Standard status would be blocked by a reference to this separate draft at
Proposed Standard status, we may consider merging the new draft into the
profile.  Open issue: what to do with vnd.wave and vnd.avi types defined in
RFC 2361?

The A/V profile is now believed to be complete. Once again, careful review
by the working group is requested.

A revision to the PureVoice (QCELP) payload format has been produced in
response to last call comments (draft-mckay-qcelp-02.txt). A new working
group last call is now in progress on this draft - comments are solicited.

The guidelines for writers of RTP payload format specifications draft
(draft-ietf-avt-rtp-format-guidelines-01.txt) is now complete. Working
group last call for BCP will be issued shortly.

The RTP MIB (draft-ietf-rtp-mib-04.txt) was presented by Bill Strahm.
Changes include: clarification of the difference between monitor and host
implementations, explicit allowance of non-consecutive indexes into the
rtpSessionEntry table, use of 32 bit rather than 16 bit indexes into this
table, the removal of rtpSessionIfAddr and type change of rtpSessionIfIndex
into InterfaceIndex rather than InterfaceIndexOrZero.  There are two known
issues with the current specification: the references need updating to
match the most recent SNMP documents, and compatibility with IPv6 needs to
be checked. With these two exceptions, the document is believed to be ready
for last call for proposed standard. Comments from the working group are
solicited.

The generic FEC draft (draft-ietf-fec-05.txt) was presented by Jonathan
Rosenberg. This revision clarifies usage with the RFC2198 payload format 
and defines SDP attributes for FEC protected media. Since the FEC data is
sent as a separate stream from the media, it is represented in SDP by an
additional "m=" line, with "a=fmtp" lines linking it to the media stream
via "a=tag" directives, for example:

        m=audio 49170 RTP/AVP 0         -+
        c=IN IP4 224.2.17.12/127         | Stream protected by FEC
        a=tag:1                         -+
        m=video 50274 RTP/AVP 31
        m=audio 47182 RTP/AVP 121       -+
        c=IN IP4 224.2.17.13/127         | FEC stream
        a=rtpmap:121 parityfec/8000      | 
        a=fmtp:121 1                    -+
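
The linkage in the example above can be sketched as follows. This is a minimal illustration only: it assumes the "a=tag" and "a=fmtp" usage exactly as shown in these minutes, it is not a full SDP parser, and the function names are hypothetical.

```python
# Sketch: find which media stream an FEC stream protects, using the
# a=tag / a=fmtp linkage from the example in these minutes.

SDP = """\
m=audio 49170 RTP/AVP 0
c=IN IP4 224.2.17.12/127
a=tag:1
m=video 50274 RTP/AVP 31
m=audio 47182 RTP/AVP 121
c=IN IP4 224.2.17.13/127
a=rtpmap:121 parityfec/8000
a=fmtp:121 1
"""

def split_media(sdp):
    """Group SDP lines into one list per "m=" section."""
    sections = []
    for line in sdp.splitlines():
        if line.startswith("m="):
            sections.append([line])
        elif sections:
            sections[-1].append(line)
    return sections

def fec_links(sdp):
    """Map each parityfec "m=" line to the "m=" line it protects."""
    sections = split_media(sdp)
    tagged = {}                                   # tag value -> m= line
    for sec in sections:
        for line in sec[1:]:
            if line.startswith("a=tag:"):
                tagged[line[len("a=tag:"):].strip()] = sec[0]
    links = {}
    for sec in sections:
        # Payload types in this section that are mapped to parityfec.
        fec_pts = {line.split(":", 1)[1].split()[0]
                   for line in sec[1:]
                   if line.startswith("a=rtpmap:") and "parityfec/" in line}
        for line in sec[1:]:
            if line.startswith("a=fmtp:"):
                pt, _, tag = line[len("a=fmtp:"):].partition(" ")
                if pt in fec_pts and tag.strip() in tagged:
                    links[sec[0]] = tagged[tag.strip()]
    return links

for fec, media in fec_links(SDP).items():
    print(f"{fec}  protects  {media}")
```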

It was suggested that this usage is unclear, since the FEC is really a
content transfer encoding, rather than a new media type; a better solution
may be to specify multiple "c=" lines for the media stream.

Furthermore, it is unclear how to register parity FEC as a MIME type, since
it can apply to both audio and video. One possibility is to register it as
both "audio/parityfec" and "video/parityfec", another may be to define a new
top level MIME type and register, say, "encoding/parityfec". If the solution
of using multiple "c=" directives in the session description is chosen, the
problem may be avoided since the MIME type will be that of the media stream,
and the parity FEC becomes a MIME content-transfer-encoding instead.

Finally, it was noted that the parity FEC work may be subject to a patent
owned by 3com. The meeting received an assurance that 3com would "license 
the patent in accordance with [rfc]2026", and it is expected that a formal
IPR statement will be forthcoming. The parity FEC draft will be modified to
note these issues.

A more loss-tolerant payload format for MPEG (1 or 2) layer III audio
(draft-finlayson-rtp-mp3-00.txt) was presented by Ross Finlayson. The
existing payload format, RFC2250, is fine for layer I or II audio but is
not optimal for layer III (.mp3) since frames are not ADUs in MP3 and are
not independently decodable, and hence such a stream is not very loss
tolerant.  This new payload format is a data-preserving rearrangement of
the original stream, such that each packet contains complete ADUs, not
codec frames.  This makes the stream more error resilient, although the
implementation needs more knowledge of MPEG audio to perform the encoding.
It was decided to make this new payload format a work item of AVT.

A new payload format for DV format video (draft-kobayashi-dv-video-00.txt)
was presented by Akimichi Ogawa. This format is straightforward, with
multiple blocks of the codec output (DIF blocks) being packed into an RTP
packet with no format specific header. Audio and video are typically
bundled (for a data rate of around 30Mbps), but may be transmitted
separately if desired. Again, this will become an AVT work item.
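
The packing described above can be sketched as follows. This is an illustration under stated assumptions, not the format defined in the draft: the 80-byte DIF block size comes from the DV format rather than from these minutes, and the 1400-byte payload budget and the function name are hypothetical.

```python
# Sketch: pack whole DIF blocks into RTP payloads, with no
# format-specific header in front of them.

DIF_BLOCK_SIZE = 80     # bytes per DIF block (assumed, per the DV format)
MAX_PAYLOAD = 1400      # hypothetical payload budget per RTP packet

def pack_dif_blocks(dif_stream, max_payload=MAX_PAYLOAD):
    """Split a stream of DIF blocks into payloads of whole blocks."""
    if len(dif_stream) % DIF_BLOCK_SIZE != 0:
        raise ValueError("stream is not a whole number of DIF blocks")
    blocks_per_packet = max_payload // DIF_BLOCK_SIZE
    if blocks_per_packet == 0:
        raise ValueError("payload budget is smaller than one DIF block")
    step = blocks_per_packet * DIF_BLOCK_SIZE
    return [dif_stream[i:i + step] for i in range(0, len(dif_stream), step)]

payloads = pack_dif_blocks(bytes(DIF_BLOCK_SIZE * 40))   # 40 dummy blocks
print([len(p) // DIF_BLOCK_SIZE for p in payloads])      # blocks per payload
```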

Open issues with this draft include the handling for the 12 bit sampling
option for DV audio (a new payload format for 12 bit audio may be defined
and referenced from the DV draft) and the definition of SDP attributes to
describe DV sessions.

A proposal to include location information in RTCP was presented by Jon
Crowcroft (draft-crowcroft-rtcp-latlong-00.txt). This suggests defining an
RTCP APP packet to include the real (or virtual) position of an RTP session
participant in a media independent manner. Reaction to this was favourable,
and it was suggested that the DNS LOC field has a definition for the format
of a location description which could be reused. Also, it was suggested
that this could be a new SDES packet type, rather than an APP packet. There
was also concern expressed that RTCP would not be sufficiently timely to
convey location information for fast moving sources - a solution more along
the lines of the MPEG BIFS concept may be more appropriate there. A revised
draft with more details is expected in time for the Oslo meeting.

The RTP payload format for MPEG-4 streams (draft-ietf-avt-rtp-mpeg4-01.txt)
was presented by Reha Civanlar and is a result of collaboration between AVT
and the MPEG committee. The payload format maps MPEG-4 SL packets onto RTP
in an efficient manner: those bits of the SL header which have a direct
analogue in the RTP header are mapped onto the RTP header, whilst a payload
header carries the additional features of the SL header. Open issues are
the mapping between RTP streams and MPEG elementary stream identifiers
(ESs), which will probably require definition of additional SDP attributes,
and the transport of the initial object descriptor (IOD). 

It was suggested that a possible alternative for conveying the mapping
between ESs and RTP streams would be via an RTCP SDES item, since conveying
it in the session description will only work if the selection of SSRCs can
be controlled. It was also noted that some applications do not send RTCP
reports, so an initial out-of-band mapping may be needed. It is possible
that a combination of the two approaches may make most sense.

Concerns were also expressed that the transport of the IOD as part of the
initial session description may be inappropriate. The IOD may be large, in
which case it is wasteful to include it in a SAP announcement, and a URL
may be more appropriate; in a SIP invitation it may be sensible to include the
complete IOD as a MIME multipart response or in the SDP response; a
reference to a BIFS stream may also be possible. 

It may be that we cannot provide a single solution for communicating the
IOD, and that the draft should give a list of examples of how this can be
done, rather than defining a single solution, since it is clearly scenario
dependent.

Finally, transport of MPEG-4 streams requires a multiplexing solution 
(for reasons outlined in the draft). It was noted that the GeRM proposal
(draft-ietf-avt-germ-00.txt) provides a good fit to these needs.

Following the discussion of MPEG-4 transport, the group revisited the need
for an RTP multiplexing scheme. The questions asked to focus this discussion
were: should AVT standardize a multiplexing scheme? If yes, more than one
scheme? Which one(s)? In addition, the chairs made a strawman proposal to
standardize one scheme (GeRM) and recommend the use of Tmux (RFC1692) for
other situations, noting that applications for which neither of these are
satisfactory may specify their own multiplexing schemes, but that these
would not need to be standardized by AVT.

There was considerable discussion on the merits of this proposal, initially
much of it regarding Tmux. Many people were concerned that Tmux is a new IP
protocol which is not well supported and cannot be (easily) implemented at
the application level (these are similar arguments to those for why RTP was
not given its own protocol number). It was also noted that multiplexing
gateways are likely to be dedicated systems, so the issue of IP protocol
numbers is less of an issue. Concern was also expressed that Tmux does not
compress headers although it does save on IP headers and reduce the packet
count, whereas the other multiplexing proposals also include some form of
header compression.

Overall, the feeling of the meeting was that Tmux is not generally applicable 
for RTP multiplexing, and should not be one of our recommended solutions.

There was considerably less dissent regarding the idea of using GeRM as our
multiplexing protocol. 

It was noted that GeRM does not include the UDP port numbers, and that we
may need to extend it to include these since they are needed in some cases
to distinguish multiple streams. 

It was noted that we may wish to allow reduced complexity decoders which
can only decode a subset of the GeRM functionality (for example, which can
multiplex but not compress). This may also require SDP attributes to signal
which subset of GeRM was being used by the sender. This may result in an
additional section being added to the GeRM specification which describes
the range of solutions, references Tmux, includes the non-compressed GeRM
and SDP extensions. 

A number of comments were made that GeRM is too complex. These concerns may
be offset if it becomes clear that "profiles" of GeRM, whereby applications
may implement only a subset, are acceptable.

It was noted that compressed RTP (RFC2508) may also be applicable: if all
that is required is to save the header overhead, then this will achieve the
desired effect without the problems caused by multiplexing.
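
The header overhead at issue can be illustrated with a short calculation. The sketch assumes IPv4 with no options, an RTP header with no CSRC entries, and the 2-byte compressed header that RFC2508 describes for the common case without UDP checksums; the 20-byte payload is a hypothetical low bit rate audio frame.

```python
# Sketch: header overhead per packet, with and without RFC2508
# header compression.

IPV4_HDR, UDP_HDR, RTP_HDR = 20, 8, 12   # bytes; no IP options, no CSRCs
CRTP_HDR = 2                             # assumed typical compressed header

def overhead_fraction(header_bytes, payload_bytes):
    """Fraction of the packet taken up by headers."""
    return header_bytes / (header_bytes + payload_bytes)

payload = 20                             # hypothetical audio frame, in bytes
full = IPV4_HDR + UDP_HDR + RTP_HDR      # uncompressed IP/UDP/RTP headers

print(f"uncompressed: {full} bytes, {overhead_fraction(full, payload):.0%}")
print(f"compressed:   {CRTP_HDR} bytes, {overhead_fraction(CRTP_HDR, payload):.0%}")
```

Under these assumptions the headers drop from about two thirds of each packet to under a tenth, without combining streams.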

The consensus of the group was that we should work with the current GeRM
proposal as our starting point. GeRM solves the transport problem, we would
prefer one solution, and the holes in the other proposals were worse than in
GeRM. The next step is to try implementing GeRM for the proposed scenarios
and see whether there are any serious limitations that cannot be fixed and
would justify creating another solution.

The next agenda item was the generic payload format. Michael Speer noted
that there is work underway to revise this format as discussed in the Los
Angeles meeting, and it is expected that this will be complete before the
Oslo meeting.

Finally, the group discussed scaling RTCP for large groups. Concern has
been expressed about the amount of router state used by RTCP in very large
groups and it may be time to consider methods, other than SSRC sampling,
and write some additional profiles to specify these other modes of
operation. For example, the group has previously discussed sampling which
receivers respond to probes by sending RTCP packets (sliding key scheme, as
proposed by Bolot/Turletti/Wakeman), summarisation and aggregation of RTCP
reports, and unicast reporting to the source which can then forward as
multicast. Proposals for such extensions were solicited from the group.
Interested parties should write new RTP profile drafts which may reference
the existing A/V profile and just specify the differences.

Mark Handley noted that there is work in the area of reliable multicast
congestion control which may be applicable to RTP. The group is urged to
follow this work, and to consider adding congestion control to their
implementations.

Anders Klemets noted an additional problem with RTCP reception reports
where multiple unicast clients are reporting to a server. It is desired
to have a way by which the server can cause the clients to report less
often than their minimum interval to avoid being overloaded by
the multiple reports (from unicast clients which aren't aware of each
other). It was noted that this can be done by using an RTCP bandwidth
modifier in the session description, but this is done at startup only
and cannot vary dynamically. 
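
The scale of the problem can be sketched with a simplified interval calculation. This is an approximation only: it assumes the 5 second default minimum interval from the RTP specification, it omits the randomisation and the sender/receiver split, and the client count, packet size and bandwidth figures are hypothetical.

```python
# Sketch: RTCP report load on a server from many unicast clients,
# each of which sees a group of two (itself and the server).

RTCP_MIN_INTERVAL = 5.0   # seconds; default minimum in the RTP specification

def rtcp_interval(members, avg_rtcp_bytes, rtcp_bw_bps):
    """Simplified deterministic interval between RTCP reports."""
    computed = members * avg_rtcp_bytes * 8 / rtcp_bw_bps
    return max(RTCP_MIN_INTERVAL, computed)

def server_report_rate(clients, avg_rtcp_bytes, rtcp_bw_bps):
    """Reports per second arriving at the server."""
    per_client = rtcp_interval(2, avg_rtcp_bytes, rtcp_bw_bps)
    return clients / per_client

# 10,000 clients using the default share of a 64 kb/s session.
print(server_report_rate(10_000, avg_rtcp_bytes=100, rtcp_bw_bps=3200))

# The same clients with a much smaller RTCP bandwidth set at startup.
print(server_report_rate(10_000, avg_rtcp_bytes=100, rtcp_bw_bps=16))
```

In this model a smaller bandwidth value lengthens the computed interval beyond the minimum, which is how a bandwidth modifier in the session description reduces the load, but only as a value fixed at startup.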

It was briefly noted, as the meeting closed, that the security area
advisory has noted that DES encryption is insufficient, yet the RTP
specification recommends the use of DES as a default. The group should
consider changing this recommendation.  The revised RTP spec already does
refer to use of IPSEC, but defines the DES-based scheme for backward
compatibility with existing implementations based on RFC 1889.