Reading the same large files from each node causing performance degradation

  • Question

  • I've been using HPC 2008 R2 for several simple tasks using parametric sweeps, so I'm only mildly competent with the technology.  I'm currently working on a somewhat more complicated problem and struggling.  I'm using C# and calling the APIs directly to schedule jobs, which is working fine.

    The problem involves processing semi-large chunks of data (say 40 MB) spread across a few files.  Each node needs read-only access to this information and reads ALL of the data once prior to doing any processing.

    I don't know a lot about our cluster architecture, but based on the performance numbers I'm seeing, it appears that putting these files on the head node doesn't distribute them to each node.  I'm assuming this because I'm seeing large increases in file-read time as I scale up the number of nodes used.

    Is there a recommended approach to dealing with this type of problem?  (i.e., read the data once and pass it to each node with some message-passing technique (if so, could you point me to an example?), or some way of copying the files to each node, or some other approach I haven't considered?)

    Friday, October 19, 2012 1:59 AM

All replies

  • You can do something like:

    clusrun /nodegroup:ComputeNodes xcopy \\%CCP_SCHEDULER%\apps\%packageName%\*.* E:\apps\%packageName%\ /E /Y

    Substitute your own source and destination paths for whatever files you need to copy.

    If there are a lot of files, Robocopy may be a better option.
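    For example, a sketch reusing the share and local paths from the xcopy command above (the paths, node group, and retry settings are assumptions you'd adapt to your environment):

    ```shell
    rem Mirror the package share to local disk on every compute node.
    rem /E copies subdirectories (including empty ones); /R:3 /W:5 cap
    rem retries so a locked file doesn't stall the whole job.
    clusrun /nodegroup:ComputeNodes robocopy \\%CCP_SCHEDULER%\apps\%packageName% E:\apps\%packageName% /E /R:3 /W:5
    ```

    Robocopy also skips files that are already up to date, which helps when you rerun a sweep against mostly unchanged data.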

    Realize, though, that if files are locked the copy can fail, so if you use local copies you'll need a strategy to confirm that the files on each node are in sync with the source.
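    One simple check, sketched here under the assumption that certutil is available on the nodes (the file name data.bin is a placeholder for one of your data files), is to hash the local copy on every node and compare the results against the hash of the source file:

    ```shell
    rem Print a SHA-256 hash of the local copy on each compute node;
    rem any node whose hash differs from the head-node copy is stale.
    clusrun /nodegroup:ComputeNodes certutil -hashfile E:\apps\%packageName%\data.bin SHA256
    ```

    clusrun prefixes each line of output with the node name, so mismatched hashes are easy to spot by eye or with a small script.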

    As to head node performance: I'd move the files to a separate file server, and depending on load and network performance, perhaps put them on an SSD or even a RAM disk if you have a fast enough pipe between the nodes.  There are more complex ways to pass the data, of course, but "it depends" would be the answer for that.

    Steve Radich

    Thursday, November 1, 2012 4:21 PM