TCPIMPL minutes
42nd IETF, Chicago
August 28, 1998

Reported by Evi Nemeth, with editing by the chairs & Sally Floyd.

Agenda:
1. Agenda bashing
2. Scott Bradner, intellectual property rights
3. Status
4. Known problems I-D
5. Security problems
6. RFC 2001 revision (congestion control)
7. WG closing after Orlando

1. Agenda bashing:

No changes to the agenda.

2. IPR (intellectual property rights) issues:

Scott Bradner reminded the group that if you know of intellectual
property issues in your company on a topic and don't say so, then
you cannot participate in the discussion and decisions regarding
that topic. This is outlined in RFC 2026.

3. Status

The testing tools document is done and has been published as RFC 2398. The larger initial
window documents have been approved by the IESG, and the 3 drafts will soon become RFCs.
Regarding the restart of idle connections, see draft-ietf-tcpimpl-restart, which Joe Touch
reports will soon be revised for publication.

4. Known problems

Seven new problems have been documented since the LA meeting. Bill Fenner described a
new bug: during a bi-directional transfer in which both ends are sending, if you are not
reading the incoming data very fast, you can end up deadlocked with a full buffer. An
example is a multithreaded client-server application where one thread is sending a lot and
another is receiving a lot, both over a single TCP connection. The fix is to change an
unsigned to an int and recognize -1 as a valid value. Bill will explain it better and submit it.

There are 3 others which are less serious and also not yet addressed. The document will
be forwarded to the IESG without outlining these bugs.

5. Security problems

There is a list of known problems:
Predictable initial sequence number
SYN flooding
Land attack

Phil Karn noted that the latter two are really denial of service attacks, and questioned the
title of the section.

6. RFC 2001 revision

High-level sketch of the revisions:
removed ambiguities
fixes for fast retransmit and fast recovery
added discussion of SACK
added larger initial window pointer

The discussion of the 2001 revision was a little chaotic at this point and went back and
forth between several topics. The comments about each topic have been grouped
together for the minutes, and therefore the comments are somewhat out-of-order.

Sally Floyd described two separate modifications to the Fast Retransmit and Fast
Recovery algorithms in RFC 2001. The first modification is the NewReno algorithm,
introduced in Janey Hoe's SIGCOMM 96 paper, which improves TCP's response to a
"partial ack" received during Fast Recovery, that is, an ACK acknowledging some but not
all of the packets sent before Fast Recovery was initiated. The preferred TCP algorithms
would be those of SACK TCP. However, when the SACK option is not available, the NewReno
algorithm was described as a small but important change to make to Reno TCP, avoiding
Reno TCP's well-documented problems with retransmit timeouts when multiple packets
are dropped from a window of data.
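
The partial-ack handling described above can be sketched roughly as follows. This is an
illustrative toy, not the algorithm from the paper or from any implementation: sequence
numbers count whole packets, the names (Sender, recover, enter_fast_recovery, on_new_ack)
are hypothetical, and congestion window adjustments are deliberately left out.

```python
from dataclasses import dataclass, field

@dataclass
class Sender:
    """Toy sender state; sequence numbers count whole packets, not bytes."""
    snd_una: int                      # oldest unacknowledged packet
    snd_nxt: int                      # next new packet to be sent
    in_fast_recovery: bool = False
    recover: int = -1                 # highest packet sent when Fast Recovery began
    retransmitted: list = field(default_factory=list)

def enter_fast_recovery(s):
    """Called on the third duplicate ACK: retransmit the first missing packet."""
    s.in_fast_recovery = True
    s.recover = s.snd_nxt - 1
    s.retransmitted.append(s.snd_una)

def on_new_ack(s, ack):
    """Handle an ACK that advances snd_una; `ack` is the next packet expected."""
    s.snd_una = ack
    if not s.in_fast_recovery:
        return "normal"
    if ack > s.recover:
        # Everything outstanding when Fast Recovery began is now acknowledged.
        s.in_fast_recovery = False
        return "full"
    # Partial ack: another packet from the same window was lost. Retransmit it
    # right away and stay in Fast Recovery instead of waiting for a timeout.
    s.retransmitted.append(ack)
    return "partial"
```

For example, with packets 10 through 19 outstanding and packets 10 and 13 lost, the ACK for
13 that follows the retransmission of 10 is a partial ack and triggers the retransmission
of 13 without leaving Fast Recovery.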

The second modification described was the bugfix algorithm for avoiding unnecessary
multiple fast retransmits. This problem occurs in Reno when, after a retransmit timeout,
packets are retransmitted that have already been received at the receiver. When the
TCP sender receives three duplicate ACKs acknowledging three retransmitted packets,
the sender can incorrectly interpret this as a new instance of congestion. Simulations
showed that the NewReno algorithm and the bugfix for avoiding-multiple-fast-retransmits
are orthogonal: it is possible to implement one and not the other. However, while it is
possible to create scenarios with Reno or NewReno TCP where the bugfix for
avoiding-multiple-fast-retransmits would be helpful, it is not possible to create the
pathological scenarios that can occur with Tahoe TCP (i.e., TCP with Fast Retransmit
but without Fast Recovery).
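
The minutes do not record how the bugfix works. One way such a guard could be built,
offered purely as an assumption for illustration, is to remember the highest packet sent
before the last retransmit timeout and to ignore a third duplicate ACK that does not
acknowledge anything beyond it. The class and the names send_high and acked_through are
hypothetical, and sequence numbers count whole packets.

```python
class FastRetransmitGuard:
    """Toy duplicate-ACK handler that suppresses spurious fast retransmits."""

    def __init__(self):
        self.send_high = -1     # assumption: highest packet sent before last timeout
        self.last_acked = -1    # highest packet cumulatively acknowledged so far
        self.dupacks = 0

    def on_timeout(self, highest_sent):
        """Record how far the sender had gotten when the retransmit timer fired."""
        self.send_high = highest_sent
        self.dupacks = 0

    def on_ack(self, acked_through):
        """Return True if this ACK should trigger a Fast Retransmit."""
        if acked_through != self.last_acked:
            self.last_acked = acked_through
            self.dupacks = 0
            return False
        self.dupacks += 1
        if self.dupacks != 3:
            return False
        # Duplicate ACKs for data at or below send_high may only reflect
        # retransmissions the receiver already had, so they are ignored.
        return acked_through > self.send_high
```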

Sally's slides can be found at: "ftp://ftp.ee.lbl.gov/talks/sf-tcpimpl-aug98.ps" (postscript),
and "ftp://ftp.ee.lbl.gov/talks/sf-tcpimpl-aug98.pdf" (PDF).

The chairs think that NewReno is a good thing; folks should implement it (Solaris 2.6
might already) and either put out an experimental RFC or include it (all or part) in the
RFC 2001 revision.

The decision was made to take this discussion to the mailing list and do an experimental
RFC with NewReno, rather than include it in the bugs list in RFC 2001.

There was then discussion of whether to include Sally's Reno modification for avoiding-
multiple-fast-retransmits in the RFC 2001 revision - how much experience with it do
we need to include it?

Vern suggested that an experimental document with Sally's modification could come
out at the same time, and be referenced by 2001.

Kacheong Poon (Sun) confirmed that some implementations of NewReno can behave
like stop and go during retransmission (like in Janey Hoe's paper). This occurs when
multiple packets are dropped from a window of data, and NewReno TCP recovers by
retransmitting at most one dropped packet per roundtrip time.

Sally said it is possible to implement NewReno with "stop and go" behavior, but that in
an alternate implementation, included as an option in the NS simulator, the retransmit
timer is reset on only the first retransmission. In this case, instead of slowly recovering
by retransmitting at most one dropped packet per roundtrip time, eventually the
retransmit timer times out and the sender slow-starts. The first-order fix for problems
with multiple packets dropped from a window of data is to use SACK, but when SACK is
not available, NewReno with this implementation should not perform worse than Reno.
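
A very rough timing model may help to contrast the two behaviors. It assumes, only for
illustration, that each retransmission recovers exactly one dropped packet per roundtrip
time and that the retransmit timeout (rto) is a fixed value larger than the roundtrip time
(rtt); the function name and its parameters are hypothetical, and times are in arbitrary
units.

```python
def recovery_outcome(drops, rtt, rto, reset_on_every_retransmit):
    """Return (outcome, elapsed_time) for recovering `drops` lost packets.

    reset_on_every_retransmit=True models the "stop and go" behavior: the
    retransmit timer restarts with each retransmission and so never expires.
    False models resetting the timer on the first retransmission only.
    """
    if not 0 < rtt < rto:
        raise ValueError("this sketch assumes 0 < rtt < rto")
    recovery_time = drops * rtt          # one dropped packet repaired per rtt
    if reset_on_every_retransmit or recovery_time <= rto:
        return "recovered by retransmissions", recovery_time
    # The timer set at the first retransmission expires before recovery ends.
    return "timeout, then slow start", rto
```

With 10 drops, rtt=100 and rto=500, the first variant takes 1000 time units to recover one
packet at a time, while the second gives up after 500 and slow-starts.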

Sally and Kacheong Poon agreed to confer on different possible implementations of NewReno.

Phil Karn asked if we want to make TCP more aggressive in the face of multiple packets
dropped. Sally answered that multiple packets dropped from a window of data count as
one instance of congestion, so the sender should cut the window in half and do one
retransmit; if retransmitted packets get lost, the congestion is more serious and the
sender should slow-start.
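
The response policy in this answer can be sketched as below. The function name, the byte
units, and the two-segment floor on the threshold are assumptions made for the example;
note that the number of packets dropped from the window is intentionally not an input.

```python
def react_to_congestion(cwnd, mss, retransmission_lost=False):
    """Return (new_cwnd, ssthresh) in bytes after one instance of congestion.

    However many packets were dropped from the window, the window is halved
    once. Only the loss of a retransmitted packet is treated as more serious.
    """
    ssthresh = max(cwnd // 2, 2 * mss)   # assumed floor of two segments
    if retransmission_lost:
        return mss, ssthresh             # slow start from one segment
    return ssthresh, ssthresh            # continue from half the old window
```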

The RFC 2001 discussion continued with the question of whether ACKing every second
full-sized segment should be a MUST and not a SHOULD. A wording tweak is needed: the
requirement is to ACK *at least* every second full-sized segment, since some systems
ACK every segment, and that's allowed.

Another issue arose concerning ACKing every second full-sized segment: there's no way
for the receiver to really know if the segments arriving are full-sized. Resolution: loosen
the language but word it so that today's TCPs are ok.
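
As a sketch of what "at least every second segment" permits, the toy receiver policy below
accepts either ACKing every segment or every second one. It counts every arriving segment
regardless of size, which is an assumption chosen because the receiver cannot tell
whether a segment is full-sized; this ACKs at least as often as counting only full-sized
segments would. The class name is hypothetical and the delayed-ACK timer is not modeled.

```python
class AckEveryN:
    """Toy receiver ACK policy: send an ACK for every n-th arriving segment."""

    def __init__(self, n=2):
        if n not in (1, 2):
            raise ValueError("only ACK-every-segment or every-second-segment")
        self.n = n
        self.pending = 0     # segments received since the last ACK was sent

    def on_segment(self):
        """Return True if an ACK should be sent now."""
        self.pending += 1
        if self.pending >= self.n:
            self.pending = 0
            return True
        return False         # a real stack would also arm a delayed-ACK timer
```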

A question was raised regarding the definition of 3 duplicate ACKs: must they be
consecutive? Answer: yes, they need to be consecutive, but it's rare that they're not, so
this should not cause an implementation to be labeled non-conformant.
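
One reading of "consecutive" is that any ACK carrying a different acknowledgment number
restarts the count. The helper below, with a hypothetical name and whole-packet ACK
numbers, scans a sequence of received ACK numbers under that reading and reports where the
third consecutive duplicate arrives.

```python
def fast_retransmit_points(acks, threshold=3):
    """Return indices in `acks` where the threshold-th consecutive duplicate arrives."""
    points = []
    last = None
    dups = 0
    for i, ack in enumerate(acks):
        if ack == last:
            dups += 1
            if dups == threshold:
                points.append(i)
        else:
            # Any ACK with a different number breaks the run of duplicates.
            last = ack
            dups = 0
    return points
```

In the input [5, 5, 5, 5] the first ACK is the original and the next three are its
duplicates, so the threshold is reached at index 3.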

7. WG closing after Orlando

The RFC 2001 revision is almost done if NewReno is not included. The plan is to close the
working group after the next meeting (in Orlando). However, in discussion after
adjournment, the issue was raised of documenting PMTU discovery issues, which may merit
keeping the WG active for one more meeting.