Symphony to HPC - Terrible performance with parallel application

  • Question

  • Long story short, many variables are changing at once: same application, but new middleware (Symphony > HPC), new infrastructure (hosted > cloud vendor), and new OS/hardware (2003 > 2008/2012).

    Based on the hardware changes alone, though, we should be seeing some performance increase. However, when dissecting the jobs and doing a proof of concept, we found HPC to be eight times slower.

    We stripped everything down to bare metal and did some network capturing. The network isn't the issue and the CPU is being utilized fine, but the jobs still just take longer.

    Has anyone had to do anything like this before?

    Tuesday, March 24, 2015 3:45 AM

All replies

  • This doesn't happen for other users.

    Could you describe your architecture? We can help you narrow down the issue if you have a simple repro.

    Basic things to check:

    1. When you manually run a single task on a compute node, does it finish as fast as on the old hardware and OS?

    2. Are the tasks in the job dispatched to the compute nodes efficiently? You can check the start and end time of each task (see the sketch after this list).

    3. Do you have many jobs? What balancing and preemption configuration do you use at the cluster level?
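
    For point 2, a minimal sketch of how you might quantify dispatch efficiency, assuming the per-task start/end times have been exported to a CSV (the file name, column names, and ISO timestamp format are assumptions; adjust to match your actual export):

    ```python
    # Compare summed task run time against the wall-clock span of the job.
    import csv
    from datetime import datetime

    with open("tasks.csv", newline="") as f:  # hypothetical export file
        rows = [(datetime.fromisoformat(r["start"]),
                 datetime.fromisoformat(r["end"]))
                for r in csv.DictReader(f)]

    # Total seconds tasks actually spent running.
    busy = sum((end - start).total_seconds() for start, end in rows)
    # Wall-clock span from the first task start to the last task end.
    span = (max(e for _, e in rows) - min(s for s, _ in rows)).total_seconds()

    core_count = 16  # assumption: total cores available to the job
    print(f"wall-clock span: {span:.1f}s, summed task time: {busy:.1f}s")
    print(f"utilization vs. {core_count} cores: {busy / (span * core_count):.1%}")
    ```

    If utilization is far below 100%, tasks are waiting in the queue rather than running, which points at scheduling/dispatch overhead rather than slow compute.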

    Friday, March 27, 2015 5:24 AM
  • Hi,

      Actually, we have seen a few migrations from Symphony to HPC and the performance was comparable. Did you use SOA? Which cloud vendor are you using? Is it a move from physical to virtualized hardware? Is your application compute-intensive, communication-intensive, or I/O-intensive?
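
      A rough, empirical way to answer the compute/communication/I/O question is to sample system counters while one task runs. A minimal sketch using the third-party psutil package (psutil and the task binary name are assumptions on our side, not part of HPC Pack):

      ```python
      # Sample CPU, disk, and network counters while a task runs to see
      # which resource dominates. Requires the third-party psutil package.
      import subprocess
      import psutil

      proc = subprocess.Popen(["mytask.exe"])  # hypothetical task binary
      disk0, net0 = psutil.disk_io_counters(), psutil.net_io_counters()
      cpu = []
      while proc.poll() is None:
          cpu.append(psutil.cpu_percent(interval=1.0))  # one sample per second
      disk1, net1 = psutil.disk_io_counters(), psutil.net_io_counters()

      print(f"avg CPU: {sum(cpu) / max(len(cpu), 1):.0f}%")
      print(f"disk bytes read/written: {disk1.read_bytes - disk0.read_bytes} / "
            f"{disk1.write_bytes - disk0.write_bytes}")
      print(f"net bytes sent/recv: {net1.bytes_sent - net0.bytes_sent} / "
            f"{net1.bytes_recv - net0.bytes_recv}")
      ```

      High CPU with little I/O suggests compute-bound; heavy network traffic relative to CPU suggests communication-bound, which virtualized networks tend to punish.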

      If possible, we can review your configuration and the way you use HPC for your workloads. Please contact hpcpack@Microsoft.com.

    Qiufang Shi


    Friday, March 27, 2015 6:39 AM
  • 1) When I put it into a loop, it takes twice as long.

    2) I will check this.

    3) Yes, over 1000 "tasks".
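
    For context on point 3: with over 1000 tasks, a fixed per-task dispatch cost compounds quickly. A back-of-the-envelope sketch (all numbers are illustrative, not measured from this cluster):

    ```python
    # With many short tasks, fixed per-task dispatch overhead can dominate.
    import math

    def total_runtime(n_tasks, work_s, overhead_s, cores):
        # Tasks run 'cores' at a time; each wave pays work plus dispatch overhead.
        waves = math.ceil(n_tasks / cores)
        return waves * (work_s + overhead_s)

    n, cores = 1000, 16  # illustrative task and core counts
    for overhead in (0.1, 1.0, 5.0):
        t = total_runtime(n, work_s=2.0, overhead_s=overhead, cores=cores)
        print(f"{overhead:>4}s overhead per task -> {t:7.1f}s total")
    ```

    A few seconds of dispatch overhead on two-second tasks already yields a multi-x slowdown, so comparing per-task start/end gaps between the Symphony and HPC runs is worth doing before blaming the hardware.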

    Monday, March 30, 2015 2:46 PM
  • We did not use SOA; it is IBM SoftLayer. We have tried completely physical, completely virtual, and mixed-and-matched components in between, all with similar results.
    Monday, March 30, 2015 2:47 PM
  • Anyone who is interested in seeing the Wireshark captures or perf logs, I'd be more than happy to share them. The Wireshark captures are about 50 MB in size, as are the perf logs.
    Monday, March 30, 2015 4:44 PM
  • We would be interested in the workload pattern. Can you share how it is designed on HPC Pack and whether it is appropriately configured?


    Qiufang Shi

    Thursday, April 9, 2015 2:03 AM