HPC Pack not reaching full network speed

  • Question

  • Wondering if anyone can help me with this issue:

    Running HPC Pack 2016 (Update 3) on Windows Server 2016, 8 nodes, InfiniBand FDR (56 Gb/s) on the application network. Running the MPI PingPong throughput test, I am only getting 5480 MB/sec max between nodes. Last I checked, I should be getting around 6800 MB/sec for FDR.
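
    (For reference, my 6800 MB/sec figure is roughly the FDR data rate: 56 Gb/s raw signalling with 64/66 encoding leaves about 54.3 Gb/s of payload, which works out to roughly 6.8 GB/sec. Happy to be corrected if I have that math wrong.)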

    I have tried multiple FDR network cards and the problem persists. The switch and cables are fully FDR capable, the scheduler reports a 56 Gb/s connection on all nodes, Network Direct shows True, and the drivers and firmware are all correct.

    Oddly, when I change the HPC network to run on gigabit only, I see a similar bottleneck: MPI PingPong tops out at 70 MB/sec, well short of the 125 MB/sec line rate (1 Gb/s ÷ 8).

    Perhaps I am interpreting something incorrectly? Has anyone else run into a similar issue?
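
    In case it helps anyone reproduce this outside the built-in diagnostics, below is a minimal MPI ping-pong bandwidth check. This is a sketch only, assuming MS-MPI (or any MPI implementation); the file name, 4 MB message size, and iteration count are my own picks, not necessarily what mpipingpong.exe uses. Run one rank on each of two nodes, e.g. with mpiexec -n 2:

        /* pingpong.c - minimal MPI ping-pong bandwidth check (sketch). */
        #include <mpi.h>
        #include <stdio.h>
        #include <stdlib.h>

        #define MSG_BYTES (4 * 1024 * 1024)   /* 4 MB messages */
        #define ITERS     100

        int main(int argc, char **argv)
        {
            int rank, size, i;
            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &size);
            if (size != 2) {
                if (rank == 0)
                    fprintf(stderr, "run with exactly 2 ranks, one per node\n");
                MPI_Abort(MPI_COMM_WORLD, 1);
            }

            char *buf = (char *)malloc(MSG_BYTES);

            /* Warm-up exchange so connection setup is not timed. */
            if (rank == 0) {
                MPI_Send(buf, MSG_BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, MSG_BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            } else {
                MPI_Recv(buf, MSG_BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                MPI_Send(buf, MSG_BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }

            MPI_Barrier(MPI_COMM_WORLD);
            double t0 = MPI_Wtime();
            for (i = 0; i < ITERS; i++) {
                if (rank == 0) {
                    MPI_Send(buf, MSG_BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                    MPI_Recv(buf, MSG_BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                } else {
                    MPI_Recv(buf, MSG_BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                    MPI_Send(buf, MSG_BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
                }
            }
            double t1 = MPI_Wtime();

            if (rank == 0) {
                /* Two transfers of MSG_BYTES per iteration (there and back). */
                double mb = 2.0 * ITERS * MSG_BYTES / (1024.0 * 1024.0);
                printf("throughput: %.0f MB/sec\n", mb / (t1 - t0));
            }

            free(buf);
            MPI_Finalize();
            return 0;
        }

    On the gigabit network the same loop should approach the 125 MB/sec line rate, so it gives a second data point for the ~70 MB/sec result above.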


    Saturday, August 24, 2019 3:45 PM

All replies

  • Hi RymerR,

    I assume you are running the cluster on premises, not on Azure (H-series)? As for 6800 MB/sec on FDR, have you ever seen a number close to that in a real run? If you are running the MPIPingPong throughput test from the HPC Pack built-in diagnostics, the parameter passed to mpipingpong.exe is -pt with a 4 MB packet size, and the real measured throughput can be lower than the theoretical bandwidth.
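
    For example, with the 4 MB packet size, 5480 MB/sec is about 80% of the ~6800 MB/sec theoretical FDR data rate (5480 / 6800 ≈ 0.81), which may simply reflect per-connection MPI and protocol overhead rather than a hardware problem.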

    Regards,

    Yutong Sun 

    Friday, September 6, 2019 7:22 AM