INTERIM_MEETING_REPORT_

Reported by Robin Iddon/AXON Networks and Jeanne Haney/Bay Networks

Minutes of the Remote LAN Monitoring Working Group (RMONMIB)

An interim meeting of the RMONMIB Working Group was held in Santa Clara,
CA on 15-17 May.  The meeting was sponsored by cisco Systems.


Agenda

   o Protocol directory
   o Protocol distribution
   o Address mapping
   o Network layer host/matrix
   o Seven-layer host/matrix
   o Relative offset filtering
   o Time filter
   o Probe capabilities
   o Generic control table issues
      -  dropped packet counter
      -  lastActivationTime
      -  lastDeleteTime elimination
      -  tableSizeRequested/Granted
   o Seven-layer topN/history
   o RMON1 additions
   o User history
   o Probe config MIB
   o Dynamic protocol discovery
   o Channel as dataSource


The following notes are intended to provide an overview of the issues
discussed at the meeting.  Refer to the upcoming draft for detailed
changes.


Protocol Directory

Issues involving the protocol identifier format were discussed.
Concerns over OID tree data explosion led to a new ID format using an
OID to represent the protocol layering and an octet string to represent
the attributes or parameters associated with each protocol layer.

OID tree data explosion issues:

     (1) Reference document explodes.
     (2) protocolDirectoryTable grows by same factor.
     (3) Agent code grows potentially.
     (4a) The number of stats/host/matrix rows may grow also.
     (4b) The number of filter entries grows also.

Agreed to add some kind of protocolDirType to indicate whether or not
this node could be (user) extended.

Adopted proposal:


     protocolDirEntry
         protocolDirID -- OID { ip.udp.tftp }
         protocolDirOptions -- OCTET STRING, 8-bit per sub ID

     INDEX { protocolDirID, protocolDirOptions }


There must be exactly one 8-bit option per sub ID in protocolDirID. The
intent is that all protocol defs have exactly one option -- those that
need none use zero; those that need one or more must define how to
combine their values into a single 8-bit value.
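
As an illustration of the adopted index, the sketch below builds the
instance sub-identifiers for one protocolDirEntry.  It assumes the
usual SMI rule that a variable-length index value is preceded by its
length; the function name and the numeric sub ID values standing in for
{ ip.udp.tftp } are hypothetical and not taken from the draft.

```python
from typing import List, Sequence


def encode_protocol_dir_index(sub_ids: Sequence[int], options: bytes) -> List[int]:
    """Build the sub-identifiers for INDEX { protocolDirID, protocolDirOptions }."""
    if len(options) != len(sub_ids):
        raise ValueError("need exactly one 8-bit option per sub ID")
    # Assumption: each variable-length index value is prefixed with its length.
    return [len(sub_ids), *sub_ids, len(options), *options]


# Hypothetical sub ID values standing in for { ip.udp.tftp }.
IP_UDP_TFTP = (1, 17, 69)
# Protocols that need no option use zero.
index = encode_protocol_dir_index(IP_UDP_TFTP, bytes([0, 0, 0]))
```

The length check enforces the one-option-per-sub-ID rule before any
index is formed.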

For WANs in particular, there is real concern that there is no way to
handle the multitude of link layer encapsulations.  Previously we hoped
to allow vendors to insert their own subtrees; we can still do the same
thing provided we identify the places where it will occur in advance and
provide for a vendor-extension bit.

Agreed to remove protocolDirParentID pending further discussion.

Agreed to add an ``unknown network layer protocol enumeration'' which
handles all cases where absolutely nothing could be determined about a
packet (except its MAC addresses and length).



Protocol Distribution


A proposal came up to add size distribution to this table.  Discussion
over the granularity of the buckets led to a proposal to use three
buckets:  < media-min, >= media-min, and > media-max.  Agreement could
not be reached and size distribution was dropped from consideration.

Proposal to use protocolDirIndex (aka local integer) in the
protocolDistTable INDEX { protocolDistControlIndex, protocolDirIndex }.
Agreed that if there was no other use for the protocolDirIndex then this
table will revert to its original use of protocolDirID.

Discussion of fragmentation and whether we are interested in monitoring
higher layer fragmentation (i.e., whether we want to try and provide
counters which instrument fragmentation at all layers) -- generally the
group appears not to be interested in directly counting fragmentation at
any layer.



Address Mapping

There was much discussion of whether to include the addressMapIfIndex in
the INDEX (and hence to differentiate rows on different interfaces that
are otherwise the same).

There was discussion on how much effort it is for the NMS to utilize
this table.  Possible problems include:


   o How does the NMS map RMON1 host addresses through this table
     without uploading the entire table?

   o Whether the NMS uses the random access capability.

   o Should ifIndex be replaced by an OID to allow it to point to a
     repeater port?  Data source still tells you which network the data
     came from.


Include controlIndex instead of ifIndex as (a) ifIndex is being replaced
and (b) do not want to keep port histories (which would happen if a
device moved from one port to another and the OID was part of the
INDEX).

Agreed to INDEX { protocol, address, controlIndex }

Agreed to incorporate portOID into addressMapEntry -- intent to point at
point of origin of this device (best guess of agent).
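
A minimal sketch of why keeping the port OID out of the INDEX avoids
port histories, assuming the table can be modelled as a dictionary.
The function name, the string form of the key fields, and the example
OIDs are illustrative assumptions only.

```python
from typing import Dict, Tuple

# Key mirrors INDEX { protocol, address, controlIndex }; the value is the
# portOID column (the agent's best guess at the point of origin).
AddressMapKey = Tuple[str, str, int]


def learn(table: Dict[AddressMapKey, str], protocol: str, address: str,
          control_index: int, port_oid: str) -> None:
    """Record or update where this address was last seen."""
    table[(protocol, address, control_index)] = port_oid


address_map: Dict[AddressMapKey, str] = {}
learn(address_map, "ip", "192.0.2.7", 1, "1.3.6.1.2.1.2.2.1.1.3")
# The device moves to another port: the same row is updated in place,
# so no second (historical) row is created.
learn(address_map, "ip", "192.0.2.7", 1, "1.3.6.1.2.1.2.2.1.1.5")
```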

After much discussion about NMS control of agent resource utilization it
was agreed that the protocolDirectory should contain a set of flags to
control usage of this protocol.  At a minimum this should control
whether or not a protocol is used in maintaining the address mapping
(hence it appears in this section of the agenda).  Ideally we would also
have a few more flags to enable usage in the protocolDistribution and
the host/matrix tables.



Network Layer Host/Matrix

Discussion of using an enumerated value vs.  protocol dir index led to
further discussion of protocol directory `counting' issues and the need
to control which protocols are counted in which tables:


   o One idea is to turn on/off a protocol via the protocol dir table.
     This means you collect the same protocols for all interfaces and
     all application tables.  This seems very restrictive.

   o The second idea is to define the protocol channel which defines a
     set of protocols that a control entry points to, to determine which
     protocols it is collecting.  The control tables would still have a
     separate data source value (i.e., not tied to the protocol channel,
     so a protocol channel can be shared across several control tables).
     This serves two purposes.  It allows the NMS to give the agent help
     in conserving its resources.  It also makes the tables smaller to
     retrieve so it helps the NMS.

   o The final choice was to turn the protocol on/off on a per
     application (Net Map, Matrix, Host, etc.).  You cannot control it
     on a per interface basis.  You cannot control it on a per control
     table basis.  This is the one that most people voted for.
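
The per-application choice could be pictured as a set of enable bits on
each protocol directory entry.  The sketch below is only an
illustration: the flag names, their values, and the dictionary layout
are assumptions, not objects defined by the group.

```python
from enum import Flag


class ProtocolUse(Flag):
    """Hypothetical per-application enable bits for one protocol."""
    ADDRESS_MAP = 1
    PROTOCOL_DIST = 2
    HOST = 4
    MATRIX = 8


def is_collected(flags: ProtocolUse, application: ProtocolUse) -> bool:
    """True if the protocol is enabled for the given application."""
    return bool(flags & application)


# One setting per protocol; it applies to every interface and to every
# control entry, since neither can be controlled separately.
directory = {
    "ip": ProtocolUse.ADDRESS_MAP | ProtocolUse.HOST | ProtocolUse.MATRIX,
    "ip.udp": ProtocolUse.PROTOCOL_DIST,
}
```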


The counters within the nlHostTable were discussed:


   o nlHostOutErrors discussion -- agreed object removed.

   o nlHostOutMACNUCastPkts agreed to replace nlHostOutBroadcastPkts,
     nlHostOutMulticastPkts.

   o nlHostOutFragmentPkts agreed not to implement this class of
     counter.


nlHostEntry creation was discussed.  Certainly do not insert on MAC
error packets; do insert on new source address.  There was some
discussion on whether or not to insert on destination address.  It was
finally agreed to insert on good source and destination addresses but
that the agent may need to use an improved aging technique to eliminate
the host destination addresses generated by programs which ping
sequential addresses in an attempt to discover which hosts exist.
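
One possible shape for such an aging technique is sketched below.  It
is an assumption for illustration, not an agreed algorithm: entries
that have never sourced a packet get a shorter lifetime.  The class
name, function name, and both time limits are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class HostEntry:
    last_seen: float      # seconds on the agent's clock
    out_pkts: int = 0     # packets sourced by this address
    in_pkts: int = 0      # packets addressed to this address


def age_hosts(hosts: Dict[str, HostEntry], now: float,
              ttl: float = 3600.0, dest_only_ttl: float = 60.0) -> None:
    """Delete idle entries; destination-only entries expire sooner."""
    for address in list(hosts):
        entry = hosts[address]
        limit = ttl if entry.out_pkts > 0 else dest_only_ttl
        if now - entry.last_seen > limit:
            del hosts[address]
```

Note that a rule like this would also age out broadcast and multicast
addresses quickly, since they never appear as a source.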

Agreed to drop hlMatrix[SD|DS]Errors.

Agreed to keep both DS and SD tables (despite there being good reasons
not to).  It was deemed that (a) the NMS's inability to easily know of
some classes of uni-directional conversations is too complex to dismiss
and (b) the overheads on the agent are not severe enough to make the
pain of pushing this through worth it.

Agreed to not do subnet aggregation because there was no standardizable
proposal and no one volunteered to do one.



Seven-Layer Host/Matrix


Three models were discussed based on nl/sl host tables:


  1. Merge them

  2. Keep them separate but closely related so that the agent can be
     efficient

  3. Keep them totally independent


Long discussion over the product class<->mib group mapping followed.

Eventually the group came to a vote on:


  1. Single control table causing a nlHostTable and slHostTable to be
     constructed (related solution 1' recognizes that within the single
     control table entry will be parameters specific to the nl and sl
     tables, e.g., rm2HostControlNlMaxDesired and
     rm2HostControlSlMaxDesired).

  2. Merge both tables (voted out 1 for merge, 16 against).

  3. Split control tables but slHostControlTable depends on an instance
     of nlHostControlTable.  Notice that this provides the same function
     as 1'.

  4. No sharing of data, hence duplicate memory requirements!
     (Deleted.)


Proposal 1' was accepted over 3.

Steve will add a straw proposal for the combined sl/nlHostControlTable
in the next draft.

slHostEntry will contain only inPkts/outPkts and inOctets/outOctets.
slHostEntry will not contain slHostAddress, instead INDEX will reference
nlHostAddress, and words will be added to ensure that for each
slHostEntry there must be an nlHostEntry with the same address and hence
deleting an nlHostEntry will cause deletion of the associated
slHostEntries.

Misconfiguring the protocolDirectory such that slHost function is
enabled for a protocol but nlHost function is not enabled for its
network layer protocol causes no data to be collected in either table
for this protocol (because there are no nlHostEntries to relate
slHostEntries to).

Proposal adopted:  
INDEX { controlIndex, protDirIndex(addrType), nlHostAddress,
protDirIndex(protocolType) } and that the slHostTable contain neither an
address nor a MACNUCastPkts counter.
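
The dependency of slHostEntries on nlHostEntries can be sketched as
follows.  This is a toy model under stated assumptions: the tables are
held as an in-memory set and dictionary, index fields are plain strings
and integers, and the function names are hypothetical.

```python
from typing import Dict, Set, Tuple

NlKey = Tuple[int, str, str]        # (controlIndex, addrType, nlHostAddress)
SlKey = Tuple[int, str, str, str]   # NlKey followed by protocolType

nl_hosts: Set[NlKey] = set()
sl_hosts: Dict[SlKey, int] = {}     # value: packet count


def count_sl_packet(key: SlKey) -> bool:
    """Count a packet in the sl table only if the parent nl entry exists."""
    if key[:3] not in nl_hosts:
        return False                # no nlHostEntry to relate to: not recorded
    sl_hosts[key] = sl_hosts.get(key, 0) + 1
    return True


def delete_nl_host(key: NlKey) -> None:
    """Delete an nl entry together with every sl entry that refers to it."""
    nl_hosts.discard(key)
    for sl_key in [k for k in sl_hosts if k[:3] == key]:
        del sl_hosts[sl_key]
```

Because the sl key starts with the nl key, the slHostTable needs no
address column of its own.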

A proposal was adopted to include a bit/enum in the protocolDirectory to
indicate whether or not a network layer address is available for this
protocolDirectoryEntry (it would not make sense to set this bit for
ip.udp, for instance, but it could be set for both the ip entry and the
ip.udp.appleTalk entry; an agent would set the bit if it supports the
protocol as a network layer protocol and not if it supports it only as
an application protocol).  Ideally we would incorporate this into the
nodeType object.  This is not something to be placed in the parameters
object because it can only relate to the final protocol of the OID, not
all of them.

Proposal for slMatrix is:
INDEX { controlIndex, protDirIndex(addrType), sa, da, protDir(protType) }

Agreed to let Steve apply results of the nl/sl host table discussions to
the matrix and so avoid long discussions over basically the same
subject.

Agreed to move forward to the topN/history on host/matrix tables out of
order because we want to discuss it in the context of the host/matrix
tables.

Discussion of data table columns:


   o Issue of error counters.  What do they include?  Why count L2
     errors by protocol?  Errors can propagate up to this table.  It is
     too hard to make it meaningful to count network layer errors.
     Therefore we will leave them out.

   o Bcast and mcast?  Could there be permutations of bcast/mcast at the
     L2 level and bcast/mcast at the L3 level?  Is a broadcast to MAC
     addresses with a multicast IP address counted as bcast or mcast?
     Robin believes that the impact on the net is the fact that it is
     bcast, i.e., everyone received and processed it.  We decided that
     we are merging the bcast and mcast counts into one counter.  We are
     still counting L2 counters with an NLHostOutNUcastPkts (not
     unicast).  Get rid of Broadcast, Multicast, and Errors.

   o Robin proposes an OutFragment counter that only bumps up when
     fragments are detected from a particular SA. Most people abstained,
     so it is a closed issue.  Fragments are not counted.

   o We discussed not adding entries to the Host Table based on DA, so
     that the table does not get filled up with erroneous addresses from
     MIB sweeps, etc.  On the other hand there are L3 broadcast
     addresses and video multicast addresses that will never appear as
     a source.  Maybe we can use a different aging algorithm so that
     entries without out pkts get deleted sooner.  But then would you
     be deleting these interesting mcast and bcast addresses as
     frequently as these bogus sweep addresses?

   o Good packets for this table are defined as good MAC packets.

   o Drop the Matrix error counters, do not add the bcast counter, they
     can get them from the host table.  Remove nlMatrixSDAddressType.


Discussion of encapsulated network layers (e.g., IP in IP):


   o NL protocols being wrapped in other NL protocols causes some
     problems in how to count the pkt and what the NL address is.  How
     do you record both NL addresses?  Do you consider the encapsulated
     protocol to be an application?  There is no place to save the
     encapsulated NL address.

   o Steve proposes an address structure that encodes what the protocol
     is so that we can model both NL protocols and NL protocols
     encapsulated in other NL protocols.  Should we try to solve this
     problem?  (Vote:  8-2-5.)  Now the NL tables could have entries
     that count a pkt twice, since the NL table accounts for all NL
     protocols, not just the NL usage at this particular probe in the
     network.  Not all probes need to implement this, but all NMSs need
     to be aware of this anomaly.  I.e., if you take all the entries for
     a particular NL Host, they could total up to more than 100% of the
     Net utilization for that Host.  How does this affect the protocol
     distribution table?  There would be a protocol directory entry for
     AppleTalk with IP and it would be counted in the prot distribution.
   o The upshot of the vote is to handle protocols that may be
     encapsulated within other protocols; the open question is how to
     represent the addressing.  Can we change the network address
     mapping table to record the information that we have learned from
     encapsulated NL protocols?  Add pDirIndex as the last index to the
     slHostTable (and slMatrix):  NL -- address object nonUnicasts; SL
     -- pDirIndex on end.  Add a bit/boolean to pDirTable that defines
     whether addresses are recognized for that protocol.



Relative Offset Filtering

There was a lot of discussion of various filtering related topics.  In
the end it was agreed to treat the channels as data source issue
elsewhere.

Agreed to pursue filterLogicTable and mod to filterChannelIndex
0..65535.  Robin to write up proposal (15 for, 0 against, 2 abstain).



Time Filter

After an example and some discussion it was agreed to implement time
filter as proposed (15 for, 0 against, 1 abstain).

It was also agreed that the timeMark goes in between the control index
and the rest of the index.


Probe Capabilities

We discussed probe classes and the nl/sl split.  We finally closed with
nl/sl remaining different tables (7 for, 3 against, 3 abstain).

Next we voted on whether or not any kind of capabilities object was
needed; in favour (11 for, 1 against, 2 abstain).

Next we discussed per-interface vs.  per-device capabilities.  First
vote on scalar only (per-device) (7 for, 2 against, 4 abstain).  Scalar
adopted.


Generic Control Table Issues

  A) Dropped Packet Counter

     There was a lot of discussion about how the counters work and what
     they are (and are not) intended to do.  In the end it was agreed
     that these counters are not intended to enable the agent to do
     statistical sampling/scaling.  Indeed the notion of scaled data in
     the RMON2 tables is explicitly precluded (the group cannot define a
     scaling algorithm that is universally appropriate).  Finally there
     was debate over whether statistical sampling and scaling were
     really the only solution to the 10x media speed increases, and
     while there was no agreement the discussion polarized between those
     that felt that the current agent technology would enable 100MBit
     and those that did not.

     It was agreed that there would be one droppedFrame counter per
     control entry by default but that for some groups/functions we may
     decide to use a scalar should that prove more appropriate.

     It was agreed that the [ether|tokenRingP]StatsDropEvents would
     continue to exist in RMON2 agents and that its semantics would be
     unchanged.  The following rules define how the fooDropFrame counter
     (from the fooControlEntry) relates to the
     [ether|tokenRingP]StatsDropEvents counter and
     [ether|tokenRingP]StatsPkts counter for the same interface:

      1. For each time the agent recognizes that one or more packets
         have been missed without it knowing exactly how many were
         missed it must increment the dropEvents counter for that
         interface.  This is the only time that the dropEvents counter
         is incremented.

      2. Whenever the agent chooses not to update a table/data
         collection function based on the contents of a packet which it
         knows was present on the network it must increment the
         droppedFrames counter for that table/function.

      3. For all packets which are not lost in (1) above or dropped in
         (2) above the agent must update tables/data collection
         functions accurately.

     Two results of applying these rules are:

      1. The sum of all packet counters in a table or data collection
         function (e.g., the hostOutPkt counter) plus the associated
         droppedFrame counter should be exactly equal to the sum of the
         [ether|tokenRingP]StatsPkts and [ether|tokenRingP]DroppedFrames
         counters for the same data source.  Of course this assumes that
         there are enough resources in the agent such that the table is
         not being LRU'd.

      2. For all agents where the dropEvent counter is zero the sum of
         the droppedFrame and Pkt counters in a given table or function
         on the same interface should be exactly equal to the number of
         packets that there were on the network.

     It was agreed that there should be strong recommendations for RMON2
     agents to utilize the droppedFrame counters as a means of
     accurately reporting the number of frames missed and that if at all
     possible the dropEvents counters should never be incremented -- in
     this way an NMS can use the data with much higher confidence.
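
     The three rules above can be sketched as a small accounting model.
     It is only an illustration: the class and function names are
     hypothetical, and a single table is attached to a single interface.

```python
from dataclasses import dataclass


@dataclass
class Interface:
    stats_pkts: int = 0       # packets the agent knows were on the network
    drop_events: int = 0      # rule 1: an unknown number of packets was missed


@dataclass
class Collection:
    pkts: int = 0             # packets counted by this table/function
    dropped_frames: int = 0   # rule 2: known packets this table did not process


def on_packet(iface: Interface, coll: Collection, can_process: bool) -> None:
    """Account for one packet the agent knows was present on the network."""
    iface.stats_pkts += 1
    if can_process:
        coll.pkts += 1              # rule 3: counted accurately
    else:
        coll.dropped_frames += 1    # rule 2: seen but deliberately skipped


def on_unknown_loss(iface: Interface) -> None:
    """Rule 1: packets were missed and the agent cannot say how many."""
    iface.drop_events += 1
```

     While drop_events stays at zero, pkts plus dropped_frames equals
     the number of packets on the network, which is the second result
     listed above.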

  B) lastActivationTime

     Proposal to have this object set to sysUpTime at the point in time
     this control row's status transitioned from not active to active.
     This lets the NMS notice that another NMS restarted data collection
     (without picking a new control index) and so deltas will be
     invalid.  It also gives an indication of the age of the table (but
     may not be used to rate the first ever poll -- the data counters
     still do not have to start from zero and so you do not know the
     delta over the interval).

     Agreed to adopt proposal (13 for, 3 abstain, 0 against).  Notice
     that we will decide later which tables and functions to apply this
     to.
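
     A sketch of how an NMS might use lastActivationTime when computing
     deltas, assuming it reads the object together with the counter on
     every poll.  The names are hypothetical and counter wrap is
     ignored for brevity.

```python
from typing import NamedTuple, Optional


class Poll(NamedTuple):
    last_activation_time: int   # sysUpTime when the control row went active
    counter: int                # some data counter under that control row


def delta(prev: Poll, curr: Poll) -> Optional[int]:
    """Return the counter delta, or None if collection restarted between polls."""
    if curr.last_activation_time != prev.last_activation_time:
        return None   # row was re-activated: the samples are not comparable
    return curr.counter - prev.counter
```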

  C) lastDeleteTime Elimination

     Discussion -- it was agreed that lastDeleteTime was easy to
     implement, but it was also agreed that it was designed specifically
     for creationOrder which no longer exists.

     Proposal is to replace tableSize and lastDeleteTime with
     insertCount and deleteCount (where insertCount - deleteCount ==
     tableSize).

     Agreed unanimously to adopt.
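
     The relation between the two counters and the table size can be
     sketched from the NMS side as below.  The names are hypothetical
     and counter wrap is ignored.

```python
from typing import NamedTuple


class TableCounters(NamedTuple):
    insert_count: int
    delete_count: int

    @property
    def table_size(self) -> int:
        """Current number of rows, derived rather than stored."""
        return self.insert_count - self.delete_count


def rows_changed(prev: TableCounters, curr: TableCounters) -> bool:
    """True if any row was inserted or deleted between two polls."""
    return curr != prev
```

     Two polls can show the same table size while the counters differ,
     which tells the NMS that rows were replaced in between.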

  D) tableSizeRequested/Granted

     Proposal to implement a maxDesired (i.e., a ceiling) per
     controlEntry.  0 implies consume as much memory as is
     required/available.  > 0 instructs the agent to create at most this
     many data table entries associated with this control entry -- once
     this ceiling is reached the agent should delete old resources
     (associated with this control entry) in order to create new rows.

     Agreed to adopt proposal (16 for, 0 against, 1 abstain).

     Notice that we later had a discussion which suggested a valid use
     of zero would be for the new hostTable where the control entry
     creates both nlHostTable and slHostTable; a user who did not want
     an slHostTable on an interface might use 0 to indicate that.
     Perhaps we should use -1 to imply unlimited rather than zero.
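
     A minimal sketch of the adopted maxDesired behaviour, with 0
     meaning no ceiling as in the proposal.  Choosing the least
     recently used row as the "old resource" to delete is an assumption
     made here for illustration; the class and method names are
     hypothetical.

```python
from collections import OrderedDict
from typing import Hashable


class DataTable:
    """Data rows owned by one control entry, capped by maxDesired."""

    def __init__(self, max_desired: int) -> None:
        self.max_desired = max_desired      # 0 means no ceiling
        self.rows: "OrderedDict[Hashable, int]" = OrderedDict()
        self.insert_count = 0
        self.delete_count = 0

    def count(self, key: Hashable) -> None:
        """Count one packet for key, evicting the oldest row if at the ceiling."""
        if key in self.rows:
            self.rows.move_to_end(key)          # mark as most recently used
        else:
            if self.max_desired > 0 and len(self.rows) >= self.max_desired:
                self.rows.popitem(last=False)   # least recently used row goes
                self.delete_count += 1
            self.rows[key] = 0
            self.insert_count += 1
        self.rows[key] += 1
```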


Seven-Layer topN/history

Agreed to do any kind of topN in addition to the RMON1 stuff (8 for, 0
against, 7 abstain).

   o Agreed to do slMatrixTopN (7 for, 0 against, 0 abstain).

   o Marginally agreed to do nlMatrixTopN (5 for, 1 against, 5 abstain).

   o Agreed to not do slHostTopN and nlHostTopN (1 for, 5 against, 7
     abstain and 0 for, 4 against, 7 abstain respectively).

Agreed not to support TopN by protocol (1 for, 10 against, 4 abstain).

A real proposal bringing together all the best ideas of how to do TopN
on the nl/sl matrix tables is needed -- Steve, Matt and Shay to get
together on producing this proposal.


RMON1 Additions

  1. netUtilization

     Etherstats gives you the number of octets seen.  Robin proposes
     that we provide a count of the number of bits and include interpkt
     gap and the preamble.  This gives you a better approximation of
     utilization.  Bytes seem like a better unit to use, since then the
     counter will not wrap as readily.  It still is the same way another
     analyzer would calculate utilization.  We still run the risk that
     RMON gets compared with these analyzers and is not identical.  So
     the question is, is the etherStatsOctets value a good enough
     approximation to get utilization or do we want to provide a new
     object that counts more of the overhead.  People seem to favor just
     sticking with the original counter and obtaining an approximation
     to utilization for thresholding via Alarms.
     The group voted to use the octets approximation and not add any new
     bandwidth utilization indicators.

  2. filterDescr

     Proposal withdrawn without opposition.

  3. [filter changes]

     Robin to make proposal on the list based on what was discussed at
     the meeting (i.e.  the filterLogicTable with m:1 relation
     reversed).

  4. Control table additions

     The group considered four additions and how they applied to each
     control table:

     (a) insert, delete counters
     (b) maxDesired
     (c) activationTime
     (d) droppedFrames

     EtherStatsTable+TokenRingPStats+TokenRingMLStats
         activationTime, droppedFrames

     HistoryControlTable
         Nothing

     EtherHistoryTable+TokenRingPHistoryTable+TokenRingMLHistoryTable
         droppedFrames

     AlarmTable
         Nothing

     HostControlTable
         maxDesired, activationTime and droppedFrames
         (maxDesired needs note in implementors guide, apparently)

     HostTable/HostTimeTable
         Nothing

     HostTopNControlTable
         Nothing

     HostTopNTable
         Nothing

     MatrixControlTable
         Same as hostControlTable

     MatrixSD/DSTable
         Nothing

     FilterTable/ChannelTable/BufferTable
         Nothing

     EventControlEntry + LogTable
         Nothing

     RingStationControlTable
         activationTime, droppedFrames

     SourceRoutingStatsControlTable
         activationTime, droppedFrames

  5. Storage type

     Steve to propose an object which is per-control row and indicates
     what NVRAM processing an agent has performed on that row (ROM,
     will-write, wont-write, written).  (7 for, 0 against, 4 abstain).

  6. Alarms enhancements

     Make it robust when monitored OID disappears.

     It was agreed that Steve would produce a draft based on an
     alarmValueStatus object which defines whether the agent managed to
     get the value last interval, an alarmValueUnavailable event/trap,
     an alarmUnavailableEventPollThreshold (i.e.  the number of
     unavailable intervals before generating the event).

  7. WAN status bits

     It was agreed that bit6 will be supported in the pktStatus bitmask
     as the packet direction bit.  Further study of bit7 (other physical
     errors) will be done, but this bit needs to be more clearly defined
     before it can be adopted.
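
     The alarm enhancement in item 6 above could behave roughly as
     sketched here.  This is an assumption about a draft that had not
     yet been written: the threshold is taken to count consecutive
     unavailable intervals, and the event fires once per run.  The
     function name is hypothetical.

```python
from typing import Iterable, Iterator, Optional


def unavailable_events(samples: Iterable[Optional[int]],
                       poll_threshold: int) -> Iterator[int]:
    """Yield each interval number at which an 'unavailable' event would fire.

    A sample of None means the monitored object could not be read
    in that interval.
    """
    missed = 0
    for interval, value in enumerate(samples):
        if value is None:
            missed += 1
            if missed == poll_threshold:
                yield interval       # fire once per run of unavailable intervals
        else:
            missed = 0               # value is readable again: reset the run
```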


User History

Get rid of objectsGranted.  BucketsRequested cannot be changed after row
goes valid.  Otherwise it stands as is.


Probe Config MIB

In this section OK means that we agreed to do it -- there were no
votes as such, just a call for objections.


   o probeID, probeFirmwareRev, probeHardwareRev OK. Discussion on
     converting probeDateAndTime from ASCII into v2 DateAndTime TC
     (except we will do our own TC which is length 0 or 8 or 11 to allow
     optionality) OK.

   o probeResetControl OK.

   o probeDownloadFile, etc.  (5 for, 0 against, 7 abstain) OK.

   o serialConfigTable:  Agreed to keep serialIP and serialSubnet.

   o The agent might need to implement the two tables that contain line
     speed and flow control objects, but we will try to get away without
     doing it; should it be rejected elsewhere we will have to adopt
     usage of the appropriate serial MIBs instead (charPortTable and
     portTable).

   o Modify serialConfigProtocol slip(1), ppp(2), other(3).

   o Take all modem string DEFVALs and make them comments instead.

   o Rename serialTrapTimeout to serialDialoutTimeout OK.

   o netConfigIpAddress/netConfigSubnetMask OK.  Remove netConfigIfSpeed
     and netConfigIfRingNumber.

   o trapDestIndex, trapDestCommunity, trapDestIpAddress, trapDestOwner,
     trapDestStatus OK (8 for, 0 against, 2 abstain).

   o serialConnect:  index, dest ip, connection type (direct, modem,
     switch, switch-and-modem), dial/connect strings, owner, status.
     OK.



Dynamic Protocol Discovery

Populate the protocol directory with a sensible set of protocols at
startup.  The agent could then add protocols that it discovers on the
net.  The assumption is that the agent was capable of decoding those
protocols all along, but there is such a large set of them and they may
never appear on the net.  It is OK that these added protocols grow more
than a single level, as originally thought.  It is up to the probe
whether or not to turn them on for collection.  How do we document
these in the protocol document and provide options for these fields?

Further discussion on the mailing list is needed.


Channel as dataSource

There was a vote on how many people violently object to using channel as
data source (2).

Those that wanted the standard to be changed to mandate that an agent
must allow channel as data source (3).

Those that want to leave the standard as is (and accept that there will
continue to be proprietary extensions) and that the behaviour of any
other kind of data source value is undefined (11).

We voted to modify the text to state that ifIndex is the only recognized
dataSource that all agents should support, but that other values are not
illegal -- just considered out of scope.