Open Fabrics Enterprise Distribution (OFED)
                CHELSIO T3 RNIC RELEASE NOTES
			December 2008


The iw_cxgb3 and cxgb3 modules provide RDMA and NIC support for the
Chelsio S310/320 and R310/320 series adapters.  Make sure you choose the
'cxgb3' and 'libcxgb3' options when generating your ofed-1.4 rpms.

============================================
New for ofed-1.4
============================================

- 7.0 Firmware support.  See below for more information on updating
your RNIC to the latest firmware.

- Memory Management Extensions including:
	- Fast register memory regions
	- Invalidate local memory region work request
	- Zero stag support via the local DMA lkey field
	- Read with invalidate local stag work request

- RDS bcopy mode enabled for iWARP devices

============================================
Recent Enhancements
============================================

- Several MPI libraries are enabled via a new iw_cxgb3 module option
called peer2peer.  When loading iw_cxgb3, set peer2peer=1 to enable Intel
MPI version 3.1.038, HP MPI version 2.02.05.01, Open MPI (support will be
released with Open MPI 1.3), and Scali MPI (support will be available in
version 3.13.7).  This option must be set on all systems in your cluster.
See below for more information on running these MPIs.  NOTE: None of these
MPIs is included in the ofed-1.4 release.  Contact the respective vendors
to obtain the MPI code.  Open MPI can be downloaded from www.open-mpi.org.

- Large memory registration.  User applications can now register memory
regions larger than 30MB.

============================================
Enabling Various MPIs
============================================

For Open MPI, Intel MPI, HP MPI, and Scali MPI, you must set the iw_cxgb3
module option peer2peer=1 on all systems.  This can be done by writing
to the /sys/module file system during boot.  For example:

# echo 1 > /sys/module/iw_cxgb3/parameters/peer2peer

Or you can add the following line to /etc/modprobe.conf to set the option
at module load time:

options iw_cxgb3 peer2peer=1
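
The two methods above can be combined in a small helper.  The following
is a minimal sketch, not part of the OFED distribution: the function names
are hypothetical, and the file paths are passed as arguments so that a
distribution that reads module options from a different file (for example
a file under /etc/modprobe.d) can be handled as well.

```shell
# Sketch only: peer2peer_status and persist_peer2peer are illustrative
# names, not tools shipped with OFED.

# Print the live value of the peer2peer parameter, or "unloaded" if the
# parameter file is absent (for example, iw_cxgb3 is not loaded).
peer2peer_status() {
    param_file=${1:-/sys/module/iw_cxgb3/parameters/peer2peer}
    if [ -r "$param_file" ]; then
        cat "$param_file"
    else
        echo "unloaded"
    fi
}

# Append the options line to a modprobe configuration file unless an
# identical line is already present, so repeated runs are harmless.
persist_peer2peer() {
    conf=${1:-/etc/modprobe.conf}
    line='options iw_cxgb3 peer2peer=1'
    grep -qxF "$line" "$conf" 2>/dev/null || echo "$line" >> "$conf"
}
```

Run as root, 'persist_peer2peer' followed by 'peer2peer_status' on each
node gives a quick check that the whole cluster agrees on the setting.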

For Intel MPI, HP MPI, and Scali MPI, enable the Chelsio device by adding
an entry to /etc/dat.conf for the Chelsio interface.  For instance,
if your Chelsio interface name is eth2, then the following line adds a
DAT device named "chelsio" for that interface:

chelsio u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 "eth2 0" ""
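
If the interface name differs between nodes, the entry can be generated
rather than typed by hand.  This is a minimal sketch: dat_entry and
dat_has_entry are hypothetical helper names, and the fixed fields are
copied verbatim from the example line above.

```shell
# Sketch only: helper names are illustrative, not part of uDAPL or OFED.

# Print a dat.conf line for DAT device name $1 on network interface $2,
# using the same fixed fields as the example entry in these notes.
dat_entry() {
    name=$1
    iface=$2
    printf '%s u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 "%s 0" ""\n' \
        "$name" "$iface"
}

# Succeed if file $1 already contains an entry whose first field is $2.
dat_has_entry() {
    awk -v n="$2" '$1 == n { found = 1 } END { exit !found }' "$1"
}
```

For example, 'dat_has_entry /etc/dat.conf chelsio || dat_entry chelsio eth2
>> /etc/dat.conf' adds the entry only when it is missing.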

=============
Intel MPI:
=============

The following env vars enable Intel MPI version 3.1.038.  Place these
in your user env after installing and setting up Intel MPI:

export RSH=ssh
export DAPL_MAX_INLINE=64
export I_MPI_DEVICE=rdssm:chelsio
export MPIEXEC_TIMEOUT=180
export MPI_BIT_MODE=64

Note: I_MPI_DEVICE=rdssm:chelsio assumes you have an entry in
/etc/dat.conf named "chelsio".

Contact Intel to obtain its MPI with DAPL support.

=============
HP MPI:
=============

To run HP MPI applications, use these mpirun options:

-prot -e DAPL_MAX_INLINE=64 -UDAPL

For example:

$ mpirun -prot -e DAPL_MAX_INLINE=64 -UDAPL -hostlist r1-iw,r2-iw ~/tests/presta-1.4.0/glob

Where r1-iw and r2-iw are hostnames mapping to the chelsio interfaces.

This also assumes that the first entry in /etc/dat.conf is for the chelsio
device.
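
Which entry comes first in /etc/dat.conf can be checked before launching
a job.  This is a minimal sketch with a hypothetical function name; it
assumes that blank lines and lines starting with '#' in dat.conf are not
entries.

```shell
# Sketch only: first_dat_entry is an illustrative name.
# Print the device name (first field) of the first line in the given
# dat.conf that is neither blank nor a '#' comment.
first_dat_entry() {
    awk '!/^[[:space:]]*(#|$)/ { print $1; exit }' "${1:-/etc/dat.conf}"
}
```

If the command does not print the name of the chelsio entry, reorder the
file before running mpirun with -UDAPL.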

Contact HP to obtain its MPI with DAPL support.

=============
Scali MPI:
=============

The following env vars enable Scali MPI.  Place these in your user env
after installing and setting up Scali MPI for running over InfiniBand:

export DAPL_MAX_INLINE=64
export SCAMPI_NETWORKS=chelsio
export SCAMPI_CHANNEL_ENTRY_COUNT="chelsio:128"

Note: SCAMPI_NETWORKS=chelsio assumes you have an entry in /etc/dat.conf
named "chelsio".

Contact Scali to obtain its MPI with DAPL support.

=============
OpenMPI:
=============

Open MPI iWARP support is only available in Open MPI version 1.3 or later.

Open MPI works over the openib btl without any specific configuration.
Users who want to tune performance may wish to inspect the receive queue
values, which can be found in the "Chelsio T3" section of
mca-btl-openib-hca-params.ini.

============================================
Loadable Module options:
============================================

The following options can be used when loading the iw_cxgb3 module to
tune the iWARP driver:

cong_flavor     - set the congestion control algorithm.  Default is 1.
                  0 == Reno
                  1 == Tahoe
                  2 == NewReno
                  3 == HighSpeed

snd_win         - set the TCP send window in bytes. Default is 32kB.

rcv_win         - set the TCP receive window in bytes. Default is 256kB.

crc_enabled     - set whether MPA CRC should be negotiated.  Default is 1.

markers_enabled - set whether to request receiving MPA markers.  Default is
                  0; do not request to receive markers.

                  NOTE: The Chelsio RNIC fully supports markers, but
                  the current OFA RDMA-CM doesn't provide an API for
                  requesting either markers or crc to be negotiated.  Thus
                  this functionality is provided via module parameters.

mpa_rev         - set the MPA revision to be used.  Default is 1, which is
                  spec compliant.  Set to 0 to connect with the Ammasso 1100
                  RNIC.

ep_timeout_secs - set the number of seconds for timing out MPA start up
                  negotiation and normal close.  Default is 60.

peer2peer       - Enables connection setup changes to allow peer2peer
                  applications to work over Chelsio RNICs.  This enables
                  the following applications:
                        Intel MPI
                        HP MPI
                        Open MPI
                        Scali MPI
                  Set peer2peer=1 on all systems to enable these
                  applications.

The following options can be used when loading the cxgb3 module to
tune the NIC driver:

msi             - select which interrupt types the driver may use.
                  Default is 2.
                  0 = pin (legacy INTx) interrupts only
                  1 = MSI or pin interrupts only
                  2 = MSI-X, MSI, or pin interrupts, based on system
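
To confirm which values a loaded module is actually using, the parameter
files under /sys/module can be listed.  This is a minimal sketch with a
hypothetical function name; only parameters that the driver exports
through sysfs will appear, which may be a subset of the options described
above.

```shell
# Sketch only: show_module_params is an illustrative name.
# Print name=value for every parameter file of module $1.  The optional
# second argument overrides the sysfs module directory.
show_module_params() {
    module=$1
    sysfs_root=${2:-/sys/module}
    for f in "$sysfs_root/$module/parameters"/*; do
        [ -f "$f" ] || continue
        printf '%s=%s\n' "$(basename "$f")" "$(cat "$f")"
    done
}
```

For example, 'show_module_params iw_cxgb3' and 'show_module_params cxgb3'
can be run on each node and the output compared across the cluster.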

============================================
Updating Firmware:
============================================

This release requires firmware version 7.x, and Protocol SRAM version
1.1.x.  This firmware can be downloaded from http://service.chelsio.com.

If your distro/kernel supports firmware loading, you can place the
chelsio firmware and psram images in /lib/firmware, then unload and reload
the cxgb3 module to get the new images loaded.  If this does not work,
then you can load the firmware images manually:

Obtain the cxgbtool tool and the update_eeprom.sh script from Chelsio.

To build cxgbtool:

# cd <path-to-cxgbtool>
# make && make install

Then load the cxgb3 driver:

# modprobe cxgb3

Now note the ethernet interface name for the T3 device.  This can be
done by typing 'ifconfig -a' and noting the interface name for the
interface with a HW address that begins with "00:07:43".  Then load the
new firmware and eeprom file:

# cxgbtool ethxx loadfw <firmware_file>
# update_eeprom.sh ethxx <eeprom_file>
# reboot
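
Instead of reading the 'ifconfig -a' output by eye, the T3 interface can
be located by its hardware address prefix.  This is a minimal sketch with
a hypothetical function name; it relies on the per-interface 'address'
files under /sys/class/net and on the "00:07:43" prefix mentioned above.

```shell
# Sketch only: find_t3_ifaces is an illustrative name.
# Print the name of every network interface whose hardware address
# begins with 00:07:43.  The optional argument overrides the sysfs
# network directory.
find_t3_ifaces() {
    net_root=${1:-/sys/class/net}
    for dev in "$net_root"/*; do
        [ -r "$dev/address" ] || continue
        case "$(cat "$dev/address")" in
            00:07:43:*) basename "$dev" ;;
        esac
    done
}
```

The printed name is the one to substitute for ethxx in the cxgbtool and
update_eeprom.sh commands.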

============================================
Testing connectivity with ping and rping:
============================================

Configure the ethernet interfaces for your cxgb3 device.  After you
load iw_cxgb3 with modprobe, you will see one or two ethernet interfaces
for the T3 device.  Configure them with an appropriate IP address, netmask,
etc.  You can use the Linux ping command to test basic connectivity via
the T3 interface.

To test RDMA, use the rping command that is included in the librdmacm-utils 
rpm:

On the server machine:

# rping -s -a 0.0.0.0 -p 9999

On the client machine:

# rping -c -VvC10 -a server_ip_addr -p 9999

You should see ping data like this on the client:

ping data: rdma-ping-0: ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqr
ping data: rdma-ping-1: BCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrs
ping data: rdma-ping-2: CDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrst
ping data: rdma-ping-3: DEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstu
ping data: rdma-ping-4: EFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuv
ping data: rdma-ping-5: FGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvw
ping data: rdma-ping-6: GHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwx
ping data: rdma-ping-7: HIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxy
ping data: rdma-ping-8: IJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz
ping data: rdma-ping-9: JKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyzA
client DISCONNECT EVENT...
#
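
For scripted testing, the client output can be checked automatically.
This is a minimal sketch with a hypothetical function name; it assumes
that a successful run prints one "ping data: rdma-ping-N" line per
iteration, as in the sample output above.

```shell
# Sketch only: rping_ok is an illustrative name.
# Read rping client output on stdin and succeed only if it contains
# exactly $1 "ping data" lines.
rping_ok() {
    expected=$1
    count=$(grep -c '^ping data: rdma-ping-' || true)
    [ "$count" -eq "$expected" ]
}
```

For example, 'rping -c -VvC10 -a server_ip_addr -p 9999 | rping_ok 10'
returns success when all ten iterations produced ping data.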

============================================
Additional Notes and Issues
============================================

1) To run uDAPL over the chelsio device, you must export this environment
variable:

        export DAPL_MAX_INLINE=64

2) If you have a multi-homed host and the physical ethernet networks
are bridged, or if you have multiple Chelsio RNICs in the system, then
you need to configure ARP to send replies only on the interface with
the target IP address:

        sysctl -w net.ipv4.conf.all.arp_ignore=2
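
The setting can be verified by reading the corresponding file under
/proc/sys.  This is a minimal sketch with a hypothetical function name.
Note that a value set with 'sysctl -w' does not survive a reboot; a common
way to make it persistent is a line in /etc/sysctl.conf, but check what
your distribution expects.

```shell
# Sketch only: check_arp_ignore is an illustrative name.
# Report whether arp_ignore is set to 2.  The optional argument
# overrides the /proc/sys file that is read.
check_arp_ignore() {
    f=${1:-/proc/sys/net/ipv4/conf/all/arp_ignore}
    value=$(cat "$f")
    if [ "$value" = "2" ]; then
        echo "arp_ignore=2 (ok)"
    else
        echo "arp_ignore=$value (expected 2)"
        return 1
    fi
}
```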

3) If you are building OFED against a kernel.org kernel later than
2.6.20, then make sure your kernel is configured with the cxgb3 and
iw_cxgb3 modules enabled.  This forces the kernel to pull in the genalloc
allocator, which is required for the OFED iw_cxgb3 module.  Make sure
these config options are included in your .config file:

	CONFIG_CHELSIO_T3=m
	CONFIG_INFINIBAND_CXGB3=m
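
A kernel .config file can be checked for these options before starting
the OFED build.  This is a minimal sketch with a hypothetical function
name; the option names are passed as arguments rather than hard-coded,
and each is expected to be set to 'm' as listed above.

```shell
# Sketch only: check_kernel_config is an illustrative name.
# Usage: check_kernel_config <config-file> <OPTION>...
# Print each option that is not set to 'm' in the config file and
# return non-zero if any is missing.
check_kernel_config() {
    config=$1
    shift
    missing=0
    for opt in "$@"; do
        if ! grep -qxF "${opt}=m" "$config"; then
            echo "missing: ${opt}=m"
            missing=1
        fi
    done
    return "$missing"
}
```

Pass the path to the .config of the kernel you are building against,
followed by the option names from the list above.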

4) If you run the RDMA latency test using the ib_rdma_lat program, make
sure you use the following command lines to limit the amount of inline
data to 64 bytes:

	server:	ib_rdma_lat -c -I 64
	client:	ib_rdma_lat -c -I 64 server_ip_addr