Odd malloc behavior when job scheduled via job manager

  • Question

  • We have an app that uses a lot of memory, possibly allocating 50 GBytes for a large array (C++ x64 app). This program works fine if run from a command prompt on the compute node, but fails the allocation of the large array when run from the job scheduler. I created a simple test program that starts at 32 GBytes and keeps dividing by two until it can successfully get memory from a malloc call. Run manually, it succeeds immediately on the 32 GBytes, but run through the job scheduler, it doesn't successfully get an allocation until the request size in the malloc call is <500 MBytes. Any idea what's going on here? Is there some job parameter I need to be setting?
    Tuesday, April 26, 2011 9:45 PM
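
For readers who want to reproduce the test described in the question, here is a minimal sketch of a halving probe. It is not the poster's actual program: the function name `probe_largest_malloc` and its `min_bytes` floor are assumptions made for illustration. Note also that on systems that overcommit memory, a non-null return from malloc does not by itself prove the memory is usable.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical helper (not from the original post): request start_bytes from
// malloc and halve the request after every failure. Returns the first size
// that was allocated successfully, or 0 if no request >= min_bytes succeeded.
std::size_t probe_largest_malloc(std::size_t start_bytes, std::size_t min_bytes) {
    if (min_bytes == 0) min_bytes = 1;  // guarantees the loop terminates
    for (std::size_t size = start_bytes; size >= min_bytes; size /= 2) {
        void* block = std::malloc(size);
        if (block != nullptr) {
            std::free(block);  // only probing, so release immediately
            return size;
        }
        std::printf("malloc(%zu) failed\n", size);
    }
    return 0;
}

// Example call on a 64-bit build, starting at 32 GBytes with a 1 MByte floor:
//   std::size_t got = probe_largest_malloc(32ULL << 30, 1u << 20);
```

Running the same binary once from a command prompt on the node and once through the scheduler, then comparing the returned sizes, would reproduce the comparison described above.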

All replies

  • On further testing, it looks like this is isolated to a single compute node, so it is probably a hardware or software infrastructure error on that node.
    Wednesday, April 27, 2011 1:43 PM