========================================================================

             OpenFabrics Enterprise Distribution (OFED)
              MVAPICH2-1.2p1 in OFED 1.4 Release Notes

                          December 2008


Overview
--------

These are the release notes for MVAPICH2-1.2p1, the OFED edition of the
MVAPICH2-1.2p1 release.  MVAPICH2 is an MPI-2 implementation over
InfiniBand and iWARP from The Ohio State University
(http://mvapich.cse.ohio-state.edu/).


User Guide
----------

For more information on using MVAPICH2-1.2p1, please visit the user guide at
http://mvapich.cse.ohio-state.edu/support/.


Software Dependencies
---------------------

MVAPICH2 depends on the installation of the OFED distribution stack with
OpenSM running.  The MPI module also requires an established network
interface (InfiniBand, IPoIB, iWARP, uDAPL, or Ethernet).  BLCR is
required if MVAPICH2 is built with fault-tolerance (checkpoint-restart)
support.


New Features
------------

MVAPICH2 (MPI-2 over InfiniBand and iWARP) is an MPI-2 implementation based on
MPICH2.  MVAPICH2 1.2p1 is available as a single integrated package (with
MPICH2 1.0.7).  This version of MVAPICH2-1.2p1 for OFED has the following
changes from MVAPICH2-1.0.3:

MVAPICH2-1.2p1 (11/11/2008)
- Fix a shared-memory communication issue on AMD Barcelona systems.

MVAPICH2-1.2 (11/06/2008)

* Bugs fixed since MVAPICH2-1.2-RC2
  - Ignore the last bit of the pkey and remove the pkey_ix option since the
    index can be different on different machines.  Thanks to Pasha@Mellanox
    for the patch.
  - Fix data types for memory allocations.  Thanks to Dr. Bill Barth
    from TACC for the patches.
  - Fix a bug when MV2_NUM_HCAS is larger than the number of active HCAs.
  - Allow builds on architectures for which tuning parameters do not exist.

* Efficient support for intra-node shared memory communication on 
  diskless clusters

* Changes related to the mpirun_rsh framework
  - Always build and install mpirun_rsh in addition to the process 
    manager(s) selected through the --with-pm mechanism.
  - Cleaner job abort handling
  - Ability to detect the path to mpispawn if the Linux proc filesystem is
    available.
  - Added TotalView debugger support
  - Stdin is only available to rank 0.  Other ranks get /dev/null.
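
As a rough illustration of the stdin behavior above (a sketch, not part
of this release; the file name stdin_rank0.c is hypothetical), rank 0
can read the input and distribute it with MPI_Bcast, since the other
ranks only see /dev/null:

    /* stdin_rank0.c: under mpirun_rsh only rank 0 receives stdin,
     * so the input line is broadcast to every other rank. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        char line[256] = "";
        int rank;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0 && fgets(line, sizeof line, stdin) == NULL)
            line[0] = '\0';                /* no input available */

        /* A fixed-size buffer keeps the distribution to one call. */
        MPI_Bcast(line, sizeof line, MPI_CHAR, 0, MPI_COMM_WORLD);

        printf("rank %d got: %s\n", rank, line);
        MPI_Finalize();
        return 0;
    }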

* Other miscellaneous changes
  - Add sequence numbers for RPUT and RGET finish packets.
  - Increase the number of allowed nodes for shared memory broadcast to 4K.
  - Use /dev/shm on Linux as the default temporary file path for shared
    memory communication.  Thanks to Doug Johnson@OSC for the patch.
  - MV2_DEFAULT_MAX_WQE has been replaced with MV2_DEFAULT_MAX_SEND_WQE and
    MV2_DEFAULT_MAX_RECV_WQE for send and receive WQEs, respectively.
  - Fix compilation warnings.

MVAPICH2-1.2-RC2 (08/20/2008)

* The following bugs are fixed in RC2
    - Properly handle the scenario in shared memory broadcast code when the
      datatypes of different processes taking part in broadcast are different.
    - Fix a bug in Checkpoint-Restart code to determine whether a connection
      is a shared memory connection or a network connection.
    - Support non-standard path for BLCR header files.
    - Increase the maximum heap size to avoid race condition in realloc().
    - Use int32_t for rank to support larger jobs with 32k or more processes.
    - Improve mvapich2-1.2 bandwidth to the same level as mvapich2-1.0.3.
    - An error handling patch for the uDAPL interface.  Thanks to Nilesh
      Awate for the patch.
    - Explicitly set some of the EP attributes when on-demand connection
      is used in the uDAPL interface.


MVAPICH2-1.2-RC1 (07/02/2008)

* Based on MPICH2 1.0.7

* Scalable and robust daemon-less job startup

   - Enhanced and robust mpirun_rsh framework (non-MPD-based) to
     provide scalable job launching on multi-thousand-core clusters

   - Available for OpenFabrics (IB and iWARP) and uDAPL interfaces
     (including Solaris)

* Checkpoint-restart with intra-node shared memory support

   - Allows best performance and scalability with fault-tolerance
     support

* Enhancements to software installation
   - Full autoconf-based configuration
   - An application (mpiname) for querying the MVAPICH2
     library version and configuration information

* Enhanced processor affinity using PLPA for multi-core architectures
   - Allows user-defined flexible processor affinity
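
To see where each rank actually lands, a small Linux-specific sketch
(sched_getcpu() requires glibc 2.6 or later) can report the current CPU
per rank; the run-time options for user-defined bindings, such as the
MV2_CPU_MAPPING variable, are documented in the MVAPICH2 user guide and
are mentioned here only as a pointer to that guide:

    /* where_am_i.c (hypothetical name): each rank reports the CPU it
     * is running on, making the effect of affinity settings visible. */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        printf("rank %d is running on CPU %d\n", rank, sched_getcpu());
        MPI_Finalize();
        return 0;
    }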

* Enhanced scalability for RDMA-based direct one-sided communication
  with fewer communication resources
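
The following is a minimal sketch of the MPI-2 one-sided pattern this
path accelerates (the file name put_fence.c is hypothetical; it needs
at least two ranks): rank 0 writes directly into rank 1's memory window
with MPI_Put, synchronized by MPI_Win_fence.

    /* put_fence.c: rank 0 puts a value into rank 1's window. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size, buf = 0, value = 42;
        MPI_Win win;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        if (size < 2) {
            if (rank == 0)
                fprintf(stderr, "needs at least 2 ranks\n");
            MPI_Finalize();
            return 1;
        }

        /* Every process exposes its own buf as a one-int window. */
        MPI_Win_create(&buf, sizeof buf, sizeof buf, MPI_INFO_NULL,
                       MPI_COMM_WORLD, &win);

        MPI_Win_fence(0, win);
        if (rank == 0)   /* write into rank 1's window at offset 0 */
            MPI_Put(&value, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
        MPI_Win_fence(0, win);

        if (rank == 1)
            printf("rank 1 received %d via MPI_Put\n", buf);

        MPI_Win_free(&win);
        MPI_Finalize();
        return 0;
    }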

* Shared memory optimized MPI_Bcast operations

* Optimized and tuned MPI_Alltoall
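
As a sketch of the operation being tuned (the file name alltoall.c is
hypothetical), every rank sends one distinct integer to every other
rank; after the call, recvbuf[i] on each rank holds the element that
rank i addressed to it:

    /* alltoall.c: personalized all-to-all, one int per rank pair. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        int rank, size, i;
        int *sendbuf, *recvbuf;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        sendbuf = malloc(size * sizeof *sendbuf);
        recvbuf = malloc(size * sizeof *recvbuf);
        for (i = 0; i < size; i++)
            sendbuf[i] = rank * 100 + i;   /* element i goes to rank i */

        MPI_Alltoall(sendbuf, 1, MPI_INT, recvbuf, 1, MPI_INT,
                     MPI_COMM_WORLD);

        /* recvbuf[i] now holds i * 100 + rank, sent by rank i. */
        printf("rank %d: recvbuf[0] = %d\n", rank, recvbuf[0]);

        free(sendbuf);
        free(recvbuf);
        MPI_Finalize();
        return 0;
    }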


Main Verification Flows
-----------------------

To verify the correctness of MVAPICH2-1.2p1, the following tests and
benchmarks were run.

Test                            Description
====================================================================
Intel                           Intel's MPI functionality test suite
OSU Benchmarks                  OSU's performance tests
IMB                             Intel's MPI Benchmark test
mpich2                          Test suite distributed with MPICH2
mpitest                         b_eff test
Linpack                         Linpack benchmark
NAS                             NAS Parallel Benchmarks (NPB3.2)
NAMD                            NAMD application


Mailing List
------------

There is a public mailing list, mvapich-discuss@cse.ohio-state.edu, for
MVAPICH users and developers to
- Ask for help and support from each other and get prompt responses
- Contribute patches and enhancements

========================================================================