Question: InfiniBand crash

  • Tuesday, January 25, 2011, 10:40 PM
     
     

    Hi,

    I have a small cluster (16 W2K8 nodes + a W2K8 R2 head node + a W2K8 storage node) running HPC 2008 R2 SP1 with ConnectX-2 cards.

    Since I upgraded the head node to R2, the InfiniBand network crashes very often.

    I have upgraded the ConnectX cards to the latest firmware, and WinOF and the driver to the latest Mellanox package, but that did not fix it.

    The IB network crashes when I launch a CCM job on several nodes.

    I have to restart the OpenSM subnet manager to bring the network back up.

    Any ideas?