I have a small cluster (16 W2K8 nodes + a W2K8 R2 head node + a W2K8 storage node) with HPC 2008 R2 SP1 and ConnectX-2 cards.
Since I upgraded the head node to R2, the InfiniBand network crashes very often.
I've upgraded the ConnectX cards to the latest firmware, and WinOF + the driver to the latest Mellanox package ==> no improvement.
The IB network crashes when I launch a CCM job on several nodes.
I have to restart the OpenSM subnet manager to reactivate the network.
Any ideas?