x64 program using the IScheduler API in HPC Pack 2008 SDK SP1 got errors

  • Question

  • Hi:

    Below is my program. My platform is Windows Server 2008 Enterprise edition + HPC Pack 2008 SP1 (x64).
    For an x86 build of this program, everything works fine.
    But for an x64 build, I get an error at pJob->put_Name().
    The error is: job->put_Name() failed with 0x80040232.
    Can anyone help? Thanks very much!
    /////////////////////////////////////////////////////////////////////////////////

    #define _WIN32_DCOM

    #include <windows.h>
    #include <stdio.h>
    #include <comutil.h>
    #pragma comment(lib, "comsupp.lib")

    // The Microsoft.Hpc.Scheduler.tlb and Microsoft.Hpc.Scheduler.Properties.tlb type
    // libraries are included in the Microsoft HPC Pack 2008 SDK. The type libraries are
    // located in the "Microsoft HPC Pack 2008 SDK\Lib\i386" or \amd64 folder. Include the rename
    // attributes to avoid name collisions.
    #import <Microsoft.Hpc.Scheduler.tlb> named_guids no_namespace raw_interfaces_only \
        rename("SetEnvironmentVariable","SetHpcEnvironmentVariable") \
        rename("AddJob", "AddHpcJob")
    #import <Microsoft.Hpc.Scheduler.Properties.tlb> named_guids no_namespace raw_interfaces_only

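    int main(int argc, char **argv)
    {
        // argv[1] and argv[2] are passed to SubmitJob() below, so require both.
        if(argc < 3)
        {
            wprintf(L"Usage: <program> <username> <password>\n");
            exit(-1);
        }

        CoInitializeEx(NULL, COINIT_MULTITHREADED);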
        HRESULT hr = S_OK;
        IScheduler *pScheduler = NULL;
        // Get an instance of the Scheduler object.
        hr = CoCreateInstance(__uuidof(Scheduler), // CLSID_Scheduler,
                              NULL,
                              CLSCTX_INPROC_SERVER,
                              __uuidof(IScheduler), // IID_IScheduler,
                              reinterpret_cast<void **>(&pScheduler));
        if(FAILED(hr))
        {
            wprintf(L"CoCreateInstance() failed with 0x%x.\n", hr);
            if(pScheduler)
            {
                pScheduler->Release();
            }
            exit(-1);
        }
       
        hr = pScheduler->Connect(_bstr_t("localhost"));
        if(FAILED(hr))
        {
            wprintf(L"Connect() failed with 0x%x.\n", hr);
            pScheduler->Release();
            exit(-1);
        }
       
        ISchedulerJob *pJob = NULL;
        hr = pScheduler->CreateJob(&pJob);
        if(FAILED(hr))
        {
            wprintf(L"CreateJob() failed with 0x%x.\n", hr); fflush(stdout);
            pScheduler->Release();
            exit(-1);
        }

        hr = pJob->put_Name(_bstr_t("MyHPCJob"));
        if(FAILED(hr))
        {
            wprintf(L"job->put_Name() failed with 0x%x.\n", hr); fflush(stdout);
            pJob->Release();
            pScheduler->Release();
            exit(-1);
        }

        ISchedulerTask *pTask = NULL;
        hr = pJob->CreateTask(&pTask);
        if(FAILED(hr))
        {
            wprintf(L"CreateTask() failed with 0x%x.\n", hr); fflush(stdout);
            pJob->Release();
            pScheduler->Release();
            exit(-1);
        }

        hr = pTask->put_Name(_bstr_t("MyHPCTask"));
        if(FAILED(hr))
        {
            wprintf(L"task->put_Name() failed with 0x%x.\n", hr); fflush(stdout);
            pTask->Release();
            pJob->Release();
            pScheduler->Release();
            exit(-1);
        }

        hr = pTask->put_CommandLine(_bstr_t("hostname"));
        if(FAILED(hr))
        {
            wprintf(L"put_CommandLine() failed with 0x%x.\n", hr); fflush(stdout);
            pTask->Release();
            pJob->Release();
            pScheduler->Release();
            exit(-1);
        }

        hr = pJob->AddTask(pTask);
        if(FAILED(hr))
        {
            wprintf(L"AddTask() failed with 0x%x.\n", hr); fflush(stdout);
            pTask->Release();
            pJob->Release();
            pScheduler->Release();
            exit(-1);
        }

        hr = pScheduler->SubmitJob(pJob, _bstr_t(argv[1]), _bstr_t(argv[2]));
        if(FAILED(hr))
        {
            wprintf(L"SubmitJob() failed with 0x%x.\n", hr);
            pTask->Release();
            pJob->Release();
            pScheduler->Release();
            exit(-1);       
        }

        pTask->Release();
        pJob->Release();
        pScheduler->Release();

        return 0;
    }

    • Moved by Alex Sutton Tuesday, October 27, 2009 11:16 AM (From:Windows HPC Server Developers - General)
    Tuesday, October 27, 2009 10:30 AM

All replies

  • You say you are running SP1 of the HPC Pack; are you also running the most recent version of the SDK? We fixed some COM issues in SP1. You can get the latest build here:
    http://www.microsoft.com/downloads/details.aspx?displaylang=en&FamilyID=3fe15731-b1b6-42de-b278-5ccd46c0863b

    Thanks,
    Josh
    Tuesday, October 27, 2009 5:40 PM
    Moderator
  • Hi:

    I am already using the HPC Pack 2008 SDK SP1 and still get this error.

    My program is built on Windows XP Professional 32-bit with Visual Studio 2005.

    Wednesday, October 28, 2009 1:38 AM
  • I've asked the team to take a look and see if they can provide an answer.

    Thanks,
    Josh
    Thursday, November 19, 2009 1:12 AM
    Moderator
  • VS 2005 C++ has a buggy #import mechanism.

    You should be able to work around the problem by (i) keeping the code 32-bit; (ii) moving up to VS2008; or (iii) trying this:

    [1] First, compile your code for 32-bit using the existing #import directives. The #import mechanism causes the compiler to generate the following two temporary .tlh headers in the target directory:

    Microsoft.Hpc.Scheduler.tlh

    Microsoft.Hpc.Scheduler.Properties.tlh

    [2] Now copy these two .tlh files into your source file directory and hang on to them!

    [3] Finally, modify your source code to #include the .tlh files (or rename them to .h if you like) instead of #import'ing the typelibs.

    [4] You should now be able to compile in both 32-bit and 64-bit successfully every time, with no need for the .tlbs.

    • Marked as answer by Don PatteeModerator Wednesday, December 9, 2009 6:11 AM
    • Unmarked as answer by Seifer Lin Thursday, January 21, 2010 8:12 AM
    Thursday, November 19, 2009 6:56 PM
  • Hi,
    Using the headers directly does solve the problem of submitting jobs.

    But there are other errors when getting job information.

    The sample code is as follows:

    #define _WIN32_DCOM

    #include <windows.h>
    #include <stdio.h>
    #include <comutil.h>
    #pragma comment(lib, "comsupp.lib")
    #include "microsoft.hpc.scheduler.h"
    #include "microsoft.hpc.scheduler.properties.h"

    int main(int argc, char **argv)
    {
        long JobId = 0;
        printf("Enter Job ID\n");
        scanf("%ld", &JobId);

        CoInitializeEx(NULL, COINIT_MULTITHREADED);
       
        HRESULT hr = S_OK;
        IScheduler *pScheduler = NULL;
        hr = CoCreateInstance(__uuidof(Scheduler), // CLSID_Scheduler,
                              NULL,
                              CLSCTX_INPROC_SERVER,
                              __uuidof(IScheduler), // IID_IScheduler,
                              reinterpret_cast<void **>(&pScheduler));
        if(FAILED(hr))
        {
            printf("CreateInstance() failed with 0x%x.\n", hr);
            if(pScheduler)
            {
                pScheduler->Release();
            }
            exit(-1);
        }
        else
        {
            printf("CreateInstance() ok.\n", hr);
        }
       
        hr = pScheduler->Connect(_bstr_t("localhost"));
        if(FAILED(hr))
        {
            printf("Connect() failed with 0x%x.\n", hr);
            pScheduler->Release();
            exit(-1);
        }
        else
        {
            printf("Connect() ok.\n", hr);
        }
     
        ISchedulerJob *pJob = NULL;
        hr = pScheduler->OpenJob(JobId, &pJob);
        if(FAILED(hr))
        {
            printf("OpenJob() failed with 0x%x.\n", hr);
            if(pJob)
            {
                pJob->Release();
            }
            pScheduler->Release();
            exit(-1);       
        }
        else
        {
            printf("OpenJob() ok.\n", hr);
        }
     
        hr = pJob->Refresh();
        if(FAILED(hr))
        {
            printf("RefreshJob() failed with 0x%x.\n", hr);
            pJob->Release();
            pScheduler->Release();
            exit(-1);
        }
        else
        {
            printf("RefreshJob() ok.\n", hr);
        }

        IStringCollection *pAllocatedNodesCollection = NULL;
        hr = pScheduler->CreateStringCollection(&pAllocatedNodesCollection);
        if(FAILED(hr))
        {
            printf("CreateJobAllocatedNodesCollection() failed with 0x%x.\n", hr);
            pJob->Release();
            pScheduler->Release();
            exit(-1);
        }
        else
        {
            printf("CreateJobAllocatedNodesCollection() ok.\n", hr);
        }
       
        hr = pJob->get_AllocatedNodes(&pAllocatedNodesCollection);
        if(FAILED(hr))
        {
            printf("GetJobAllocatedNodes() failed with 0x%x.\n", hr);
            pJob->Release();
            pScheduler->Release();
            exit(-1);
        }
        else
        {
            printf("GetJobAllocatedNodes() ok.\n", hr);
        }

        long NAllocatedNodes = 0;
        hr = pAllocatedNodesCollection->get_Count(&NAllocatedNodes);
        if(FAILED(hr))
        {
            printf("GetJobAllocatedNodesCount failed with 0x%x.\n", hr);
            pJob->Release();
            pScheduler->Release();
            exit(-1);
        }
        else
        {
            printf("GetJobAllocatedNodesCount ok.\n");
            printf("NAllocatedNodes=%ld\n", NAllocatedNodes);
        }

        char **pAllocatedNodes = new char*[NAllocatedNodes];
       
        for(long i = 0; i < NAllocatedNodes; i++)
        {
            BSTR bstr = NULL;
            hr = pAllocatedNodesCollection->get_Item(i, &bstr);
            if(FAILED(hr))
            {
                printf("GetJobAllocatedNodesItem failed with 0x%x.\n", hr);
                pJob->Release();
                pScheduler->Release();
                exit(-1);
            }
            else
            {
                // Convert only after get_Item() has succeeded, then free the BSTR.
                pAllocatedNodes[i] = ::_com_util::ConvertBSTRToString(bstr);
                SysFreeString(bstr);
                printf("GetJobAllocatedNodesItem ok.\n");
                printf("pAllocateNodes[%ld]=[%s]\n", i, pAllocatedNodes[i]); fflush(stdout);
            }
        }


        return 0;
    }


    For a 32-bit build, everything works fine, and the output is:
    ---------------------------------------------------------------------
    C:\Users\Administrator\Desktop>HPCCreateJob.exe
    Enter Job ID
    274
    CreateInstance() ok.
    Connect() ok.
    OpenJob() ok.
    RefreshJob() ok.
    CreateJobAllocatedNodesCollection() ok.
    GetJobAllocatedNodes() ok.
    GetJobAllocatedNodesCount ok.
    NAllocatedNodes=1
    GetJobAllocatedNodesItem ok.
    pAllocateNodes[0]=[WIN-SSGZA1V4Z8C]

    C:\Users\Administrator\Desktop>
    ---------------------------------------------------------------------

    But for a 64-bit build, an error occurs:
    ---------------------------------------------------------------------
    C:\Users\Administrator\Desktop>HPCCreateJob.exe
    Enter Job ID
    274
    CreateInstance() ok.
    Connect() ok.
    OpenJob() ok.
    RefreshJob() ok.
    CreateJobAllocatedNodesCollection() ok.
    GetJobAllocatedNodes() ok.
    GetJobAllocatedNodesCount ok.
    NAllocatedNodes=1
    GetJobAllocatedNodesItem failed with 0x8007000b.

    C:\Users\Administrator\Desktop>
    ---------------------------------------------------------------------
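
    A note on reading the error code: an HRESULT packs a failure bit, a facility and a 16-bit code into 32 bits. The sketch below is plain portable C++, and the helper name `decode_hresult` is hypothetical and not part of any SDK. It splits 0x8007000b into facility 7 (FACILITY_WIN32) and code 11, which is the Win32 error ERROR_BAD_FORMAT. That would be consistent with a 32-bit/64-bit mismatch somewhere in the call path, although this thread does not confirm the root cause. The earlier 0x80040232 decodes to facility 4 (FACILITY_ITF), whose codes are defined by the interface that returned them.

```cpp
#include <cstdint>

// Fields of a 32-bit HRESULT:
//   bit 31      severity (1 = failure)
//   bits 16-26  facility
//   bits 0-15   code
struct HresultParts {
    bool failure;
    unsigned facility;
    unsigned code;
};

// Hypothetical helper (not an SDK function): splits an HRESULT into its fields.
inline HresultParts decode_hresult(std::uint32_t hr) {
    HresultParts parts;
    parts.failure = (hr & 0x80000000u) != 0;
    parts.facility = (hr >> 16) & 0x7FFu;
    parts.code = hr & 0xFFFFu;
    return parts;
}
```

    For example, `decode_hresult(0x8007000bu)` gives `failure == true`, `facility == 7` and `code == 11`.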

    We are developing a product with MS-MPI and the Job Scheduler, but I am stuck on these errors.

    Thank you!

    • Edited by Seifer Lin Thursday, January 21, 2010 2:00 AM
    Thursday, January 21, 2010 1:48 AM
  • Hi Seifer,

    Sorry for the delay getting back to you.

    Yes, I get the same result on x64 when using your code. I'm working on discovering the root cause, but in the meantime, I have a workaround for you which works OK for me on x64. The workaround is to use an IEnumVARIANT interface to access the elements of your IStringCollection instead of using the get_Item method. Please try the code below and let me know if it solves your problem:

        char **pAllocatedNodes = new char*[NAllocatedNodes];
        
        IEnumVARIANT *pIEnumVARIANT = NULL;
    
        hr = pAllocatedNodesCollection->GetEnumerator (
                &pIEnumVARIANT // /*[out,retval]*/ struct IEnumVARIANT * * pRetVal
                );
    
        if (FAILED(hr))
        {
            printf("ERROR 0x%0x calling pAllocatedNodesCollection->GetEnumerator()\n", hr);
    
        } else {
    
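            for(long i = 0; i < NAllocatedNodes; i++)
            {
                VARIANT vt;
                VariantInit(&vt);
                ULONG celtReceived = 0;
    
                hr = pIEnumVARIANT->Next(
                        1,            // unsigned long celt,
                        &vt,          // VARIANT FAR* rgvar,
                        &celtReceived // unsigned long FAR* pceltFetched
                        );
    
                // Next() returns S_FALSE (a success code) when no element was fetched,
                // so compare against S_OK instead of using FAILED().
                if (hr != S_OK || celtReceived != 1 || vt.vt != VT_BSTR)
                {
                    printf("ERROR 0x%0x calling pIEnumVARIANT->Next()\n", hr);
                    VariantClear(&vt);
                    break;
    
                } else {
    
                    pAllocatedNodes[i] = ::_com_util::ConvertBSTRToString(vt.bstrVal);
                    printf("pAllocateNodes[%ld]=[%s]\n", i, pAllocatedNodes[i]); fflush(stdout);
                    VariantClear(&vt);
                }
            }
            pIEnumVARIANT->Release();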
        }
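
    One detail worth noting about the enumerator approach: IEnumVARIANT::Next() returns S_FALSE when it could not fetch the requested number of elements, and S_FALSE is a success code, so a FAILED(hr) test alone does not detect the end of the collection. The following is a portable model of that contract, not real COM code: `MockStringEnum`, `drain`, `HResult`, `kOk` and `kFalse` are hypothetical stand-ins introduced for illustration (the two constants mirror the values of S_OK and S_FALSE). It sketches a loop that stops on anything other than S_OK instead of relying on a separately fetched count.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Portable stand-ins for the Windows definitions (assumed names, real values).
using HResult = std::int32_t;
constexpr HResult kOk = 0;     // mirrors S_OK
constexpr HResult kFalse = 1;  // mirrors S_FALSE: success, but fewer items than requested

// Same test that the FAILED() macro performs: only negative values are failures.
inline bool failed(HResult hr) { return hr < 0; }

// Hypothetical mock that mimics the Next() contract for a list of strings.
class MockStringEnum {
public:
    explicit MockStringEnum(std::vector<std::string> items)
        : items_(std::move(items)) {}

    // Copies up to `celt` items into `out`; returns kOk only if all were delivered.
    HResult Next(unsigned long celt, std::string* out, unsigned long* fetched) {
        unsigned long n = 0;
        while (n < celt && pos_ < items_.size()) {
            out[n++] = items_[pos_++];
        }
        if (fetched) {
            *fetched = n;
        }
        return n == celt ? kOk : kFalse;
    }

private:
    std::vector<std::string> items_;
    std::size_t pos_ = 0;
};

// Drains the enumerator one item at a time, stopping on anything other than kOk.
inline std::vector<std::string> drain(MockStringEnum& e) {
    std::vector<std::string> result;
    std::string item;
    unsigned long fetched = 0;
    while (e.Next(1, &item, &fetched) == kOk && fetched == 1) {
        result.push_back(item);
    }
    return result;
}
```

    With the real interface the same shape would apply: call Next(1, ...) in a loop, continue only while it returns S_OK, and call VariantClear() on each fetched VARIANT.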
    


    Regards,

    Patrick
    Wednesday, February 3, 2010 7:31 PM
  • Hi Patrick:

    Your code works fine for getting the allocated nodes' names.

    However, the following code, which assigns compute nodes to a job, still fails (sorry that I did not test every IScheduler function I use individually). Again the 64-bit build fails, while the 32-bit build works fine.

    The output of the 64-bit build is:

    pNodeCollection->Add failed with 0x8007000b.

    #define _WIN32_DCOM
    
    #include <windows.h>
    #include <stdio.h>
    #include <comutil.h>
    #pragma comment(lib, "comsupp.lib")
    #include "microsoft.hpc.scheduler.h"
    #include "microsoft.hpc.scheduler.properties.h"
    
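    int main(int argc, char **argv)
    {
        // argv[1] and argv[2] are passed to SubmitJob() below, so require both.
        if(argc < 3)
        {
            wprintf(L"Usage: <program> <username> <password>\n");
            exit(-1);
        }

        CoInitializeEx(NULL, COINIT_MULTITHREADED);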
        HRESULT hr = S_OK;
        IScheduler *pScheduler = NULL;
        // Get an instance of the Scheduler object. 
        hr = CoCreateInstance(__uuidof(Scheduler), // CLSID_Scheduler, 
                              NULL,
                              CLSCTX_INPROC_SERVER,
                              __uuidof(IScheduler), // IID_IScheduler, 
                              reinterpret_cast<void **>(&pScheduler));
        if(FAILED(hr))
        {
            wprintf(L"CoCreateInstance() failed with 0x%x.\n", hr);
            if(pScheduler)
            {
                pScheduler->Release();
            }
            exit(-1);
        }
        
        hr = pScheduler->Connect(_bstr_t("localhost"));
        if(FAILED(hr))
        {
            wprintf(L"Connect() failed with 0x%x.\n", hr);
            pScheduler->Release();
            exit(-1);
        }
        
        ISchedulerJob *pJob = NULL;
        hr = pScheduler->CreateJob(&pJob);
        if(FAILED(hr))
        {
            wprintf(L"CreateJob() failed with 0x%x.\n", hr); fflush(stdout);
            pScheduler->Release();
            exit(-1);
        }
    
        hr = pJob->put_Name(_bstr_t("MyHPCJob"));
        if(FAILED(hr))
        {
            wprintf(L"job->put_Name() failed with 0x%x.\n", hr); fflush(stdout);
            pJob->Release();
            pScheduler->Release();
            exit(-1);
        }
    
        ////////////////////////////////////////////////////////////////////////////
        IStringCollection *pNodeCollection = NULL;
        hr = pScheduler->CreateStringCollection(&pNodeCollection);
        if(FAILED(hr))
        {
            wprintf(L"Scheduler->CreateStringCollection() failed with 0x%x.\n", hr);
            pJob->Release();
            pScheduler->Release();
            exit(-1);
        }    
    
        const char* ComputeNodeHostname = "WIN-SSGZA1V4Z8C"; //The hostname of the compute node
    
        hr = pNodeCollection->Add(_bstr_t(ComputeNodeHostname));
        if(FAILED(hr))
        {
            wprintf(L"pNodeCollection->Add failed with 0x%x.\n", hr);
            pJob->Release();
            pScheduler->Release();
            exit(-1);
        }
        ////////////////////////////////////////////////////////////////////////////
    
        hr = pJob->put_UnitType(JobUnitType_Node);
        if(FAILED(hr))
        {
            wprintf(L"job->put_UnitType() failed with 0x%x.\n", hr); fflush(stdout);
            pJob->Release();
            pScheduler->Release();
            exit(-1);
        }
     
        hr = pJob->putref_RequestedNodes(pNodeCollection);
        if(FAILED(hr))
        {
            wprintf(L"job->putref_RequestedNodes() failed with 0x%x.\n", hr); fflush(stdout);
            pJob->Release();
            pScheduler->Release();
            exit(-1);
        }
        
        ISchedulerTask *pTask = NULL;
        hr = pJob->CreateTask(&pTask);
        if(FAILED(hr))
        {
            wprintf(L"CreateTask() failed with 0x%x.\n", hr); fflush(stdout);
            pJob->Release();
            pScheduler->Release();
            exit(-1);
        }
    
        hr = pTask->put_Name(_bstr_t("MyHPCTask"));
        if(FAILED(hr))
        {
            wprintf(L"task->put_Name() failed with 0x%x.\n", hr); fflush(stdout);
            pTask->Release();
            pJob->Release();
            pScheduler->Release();
            exit(-1);
        }
    
        hr = pTask->put_CommandLine(_bstr_t("hostname"));
        if(FAILED(hr))
        {
            wprintf(L"put_CommandLine() failed with 0x%x.\n", hr); fflush(stdout);
            pTask->Release();
            pJob->Release();
            pScheduler->Release();
            exit(-1);
        }
    
        hr = pJob->AddTask(pTask);
        if(FAILED(hr))
        {
            wprintf(L"AddTask() failed with 0x%x.\n", hr); fflush(stdout);
            pTask->Release();
            pJob->Release();
            pScheduler->Release();
            exit(-1);
        }
    
        hr = pScheduler->SubmitJob(pJob, _bstr_t(argv[1]), _bstr_t(argv[2]));
        if(FAILED(hr))
        {
            wprintf(L"SubmitJob() failed with 0x%x.\n", hr);
            pTask->Release();
            pJob->Release();
            pScheduler->Release();
            exit(-1);        
        }
    
        pTask->Release();
        pJob->Release();
        pScheduler->Release();
    
        return 0;
    }
    
    Thursday, February 4, 2010 7:48 AM
  • Hi Seifer,

    Yes, you're definitely running into some odd behavior on x64. For this most recent error (with IStringCollection::Add()), I'm sorry, but I can't see any way to code around this in 64-bit unmanaged code. We will try to fix the problems with IStringCollection in our next service pack (SP2).

    In the meantime, there are a few ways to get your application running, depending on your requirements. If you are able to run 32-bit code, then we already know that the job scheduler COM interfaces work correctly. If you need your code to be 64-bit, then using managed code (e.g. C#) will work correctly. If you need to run unmanaged code, then it's possible to create a managed code wrapper with its own COM interface. You may also be able to run an unmanaged 32-bit process exposing a COM interface that you can access from your 64-bit application.

    Would any of these alternatives work for you?

    Regards,

    Patrick
    Thursday, February 4, 2010 10:13 PM
  • Hi Patrick,

    Thanks for the reply.

    You mentioned that:

    1. If you need to run unmanaged code, it's possible to create a managed code wrapper with its own COM interface.
    2. You may be able to run an unmanaged 32-bit process exposing a COM interface that you can access from your 64-bit application.

    Is there any sample code or a tutorial for these?


    In our product, we wrap the HPC COM APIs in a DLL that our C++ program links against and uses to submit HPC jobs.


    regards,
    Seifer
    Friday, February 5, 2010 12:31 AM
  • Hi Seifer,

    Sorry for the delay again. Here is a link to a general article that may help, although it is not specific to HPC:

    http://dnjonline.com/article.aspx?ID=jun07_access3264

    Let me know whether it helps.

    Regards,

    Patrick
    Wednesday, February 10, 2010 6:41 AM