Using .NET 4.0 Parallel Framework with Win HPC R2 Beta Cluster - Unusually slow performance

  • Question

  • Hi,

    We have been working on a problem that requires executing nearly a million database transactions (each a simple update of a couple of columns in 2 tables in the database). Here are the different options we've tried:

    1. Executing the transactions sequentially on a single node. (No involvement of any HPC jobs)

    2. Executing the transactions using the .NET 4.0 Parallel Framework on a single node. (No involvement of any HPC jobs)

    The speedup observed in this case is nearly 600%, which is fantastic.

    To run the same program on an HPC cluster, here is what we do:

    1. Create a job and distribute the load onto 2 compute nodes (scheduling per node). Each of these nodes is a dual-core machine with 4 GB of RAM.

    2. We then run the program, which internally uses the .NET 4 Parallel Framework to exploit the cores on the machine.

    Surprisingly, the performance of the application is not even half of what one would ideally expect. Given that the job now runs on 2 machines, the expectation would be that the application performs at least 2 times faster than on a single node targeting 2 cores.
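
The post does not say how the million transactions are divided between the two nodes. As a minimal sketch in Python (not the poster's .NET code), assuming each node's process is told its own index and the total node count (both hypothetical parameters), each node could take a disjoint slice of the work and use local threads only for that slice. Thread-level parallelism stays inside one process on one machine; it does not spread work across nodes by itself.

```python
from concurrent.futures import ThreadPoolExecutor


def node_slice(total_items, node_count, node_index):
    """Return the half-open range [start, stop) of item indexes for one node."""
    base, extra = divmod(total_items, node_count)
    # The first `extra` nodes take one additional item, so sizes differ by at most 1.
    start = node_index * base + min(node_index, extra)
    stop = start + base + (1 if node_index < extra else 0)
    return start, stop


def update_row(item_id):
    """Hypothetical stand-in for one database transaction."""
    return item_id


def run_on_node(total_items, node_count, node_index, threads_per_node=2):
    """Process only this node's slice, using local threads within the slice."""
    start, stop = node_slice(total_items, node_count, node_index)
    with ThreadPoolExecutor(max_workers=threads_per_node) as pool:
        return list(pool.map(update_row, range(start, stop)))
```

If every node instead runs the full workload, or only one node receives work, adding nodes cannot shorten the run, so this split is worth checking before looking elsewhere.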

    A few more details about the program:

    The application uses .NET 4 Entity Framework for data access. It has been thoroughly reviewed and verified to make sure that there aren't any race conditions and that each thread works on its own EF context. Care has also been taken that objects are disposed of as soon as each operation completes.

    Any ideas about what the problem could be and its solution?

    Raghu Kishore Vempati
    Tuesday, July 20, 2010 4:56 AM

All replies

  • > Is the database on a separate networked server (i.e., on none of the nodes involved in the computation)? If the database is resident on the node in your first non-HPC test, then network latency may affect the performance of the HPC tests, because the nodes would be accessing the database differently than in the single-node test.

    > Have you tried a situation where you're not using the parallel extensions in the code that's running within HPC? It would be nice to know how that code performs; HPC is able to exploit multiple cores as well so the code using parallel extensions for this purpose isn't strictly necessary. 

    > How are you timing the performance of the HPC code? Ensure that you're timing the actual performance of the database access code.
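
A minimal sketch of the timing suggestion above, written in Python rather than .NET: it accumulates the time spent inside database calls separately from the total wall-clock time, so the two can be compared. `DbTimer`, `fake_update`, and `run` are hypothetical names, and the sleep merely stands in for a real update.

```python
import time
from contextlib import contextmanager


class DbTimer:
    """Accumulates wall-clock time spent inside database calls only."""

    def __init__(self):
        self.calls = 0
        self.seconds = 0.0

    @contextmanager
    def measure(self):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the elapsed time even if the database call raised.
            self.seconds += time.perf_counter() - start
            self.calls += 1


def fake_update(item_id):
    """Hypothetical stand-in for one database update."""
    time.sleep(0.001)


def run(item_ids, timer):
    """Return total wall time; `timer` records the database share of it."""
    wall_start = time.perf_counter()
    for item_id in item_ids:
        with timer.measure():
            fake_update(item_id)
    return time.perf_counter() - wall_start
```

This version is single-threaded; with several threads, one timer per thread avoids unsynchronized updates to the counters. A large gap between the wall time and the accumulated database time would point at job startup or scheduling overhead rather than at the database access itself.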

    One note: I believe that there is a limit to how much you can improve the performance of I/O-bound tasks using distributed computing alone; you also need to remove the I/O bottlenecks.
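
The note about I/O-bound limits can be illustrated with a deliberately simple model, assuming all workers share one database that sustains at most `db_max_tps` transactions per second. The function and every number used with it are made up for illustration; none of them are measurements from this thread.

```python
def estimated_seconds(transactions, workers, seconds_per_txn, db_max_tps):
    """Estimate run time when all workers share a single database.

    Client-side throughput grows with the worker count, but the overall
    rate can never exceed what the database itself can sustain.
    """
    client_tps = workers / seconds_per_txn
    effective_tps = min(client_tps, db_max_tps)
    return transactions / effective_tps


if __name__ == "__main__":
    # Hypothetical inputs: 1,000,000 transactions, 10 ms each, database cap of 250 tps.
    for workers in (1, 2, 4, 8):
        print(workers, estimated_seconds(1_000_000, workers, 0.01, 250))
```

With these made-up inputs, going from 2 to 4 workers lowers the estimate from about 5,000 s to 4,000 s instead of halving it, and 8 workers give no further gain, because the shared database has become the limit.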



    Sunday, August 15, 2010 7:36 AM
  • The database is on the same enterprise network but, yes, "on none of the nodes involved in computation". The database access is the same in both the non-HPC and the HPC tests.

    We did try the other situation, where the .NET 4 parallel extensions are not used. There is an improvement, but it is small (maybe around 50% better than a non-HPC execution).

    The timing is uniform across the HPC code.

    The only thing that beats me is how the .NET 4 Parallel Extensions on a dual-core machine outperform a cluster (2 compute nodes, 4 cores) even when we don't combine the two.

    Are there any limitations on whether the .NET 4 Parallel Extensions can co-exist with the HPC Server cluster computing API?

    Comments welcome.

    Raghu Kishore Vempati
    Monday, September 27, 2010 1:13 PM
  • I have had a similar experience running against a cluster with about 30 nodes, each with 4 or more cores. I'm running an Excel 2010 VBA app in the cluster. It always hangs at some point in the job, usually near the end (I'm getting callbacks to the client, so I know it's at least running on some nodes). Most often I have to cancel the job. My HPC database is on SQL Server running on the head node.
    Tuesday, October 5, 2010 12:59 PM
  • This is something that I have been facing as well. Have you tried selecting only the compute nodes (no head node) to run the job? I think you might get an error. In my case, which is the same as yours except for the database part, the threads created by the .NET Parallel Extensions were running only on the head node, even though I had the whole cluster assigned. The time to complete the task was the same for the cluster and for the single node. So I ran the job on the cluster excluding the head node, and I got an error with some exit code -216... .

    I wonder whether the threads created by the Parallel Extensions run on other nodes at all.



    Saturday, October 30, 2010 4:30 PM