Non-domain-joined Windows compute nodes

  • Question

  • Is there more documentation on setting up non-domain-joined compute nodes in HPC 2016 Update 1? I have added a node to the cluster, but it is not able to run jobs because of credential mapping errors:

    Error from node: COMPUTE4:System.Security.Principal.IdentityNotMappedException: Some or all identity references could not be translated.
       at System.Security.Principal.NTAccount.Translate(IdentityReferenceCollection sourceAccounts, Type targetType, Boolean forceSuccess)
       at System.Security.Principal.NTAccount.Translate(Type targetType)
       at Microsoft.Hpc.NodeManager.RemotingExecutor.JobEntry.Name2Sid(String username)
       at Microsoft.Hpc.NodeManager.RemotingExecutor.JobEntry.<Init>d__24.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
       at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
       at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
       at Microsoft.Hpc.NodeManager.RemotingExecutor.JobEntryFactory.<GetJobEntryAsync>d__4.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
       at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
       at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
       at Microsoft.Hpc.NodeManager.RemotingExecutor.RemotingNMExecImpl.<StartJob>d__39.MoveNext()

    Wednesday, January 3, 2018 5:38 AM

All replies

  • Hi Jim,

      First, please tell us whether your head node is domain joined, and how you submitted the job.

      I checked our logic: Name2Sid is only called if the credential you provided is in the same domain as your compute node. See the code below (I've simplified it a little).

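        // Check whether the submitting account is in the same domain as the node.
        bool isDomainUser = IsUserInSameDomain(userAccount);
        if (isDomainUser)
        {
            if (!string.IsNullOrEmpty(userAccount))
            {
                // Name2Sid translates the account name to a SID; it is the frame that throws in the stack trace above.
                SecurityIdentifier sid = new SecurityIdentifier(Name2Sid(userAccount));
            }
        }
        else
        {
            // Otherwise, map the account to a local account on the node.
            this.userAccount = Credentials.ToLocalAccount(userAccount);
        }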

    We compare Environment.UserDomainName with the userAccount to see whether they match.

    So please tell us the domain name of your account and the value of the "USERDOMAIN" environment variable on your compute node. My guess is that your non-domain-joined compute node has the same USERDOMAIN value as your domain.
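
To make the comparison described above concrete, here is a minimal sketch of what such a check could look like. It is not the HPC Pack implementation, which is not shown in this thread: the class `DomainCheckSketch`, the helper `DomainPart`, the two-argument overload, and the assumption that accounts arrive as `DOMAIN\user` are all hypothetical. Only `Environment.UserDomainName` is a real .NET property.

```csharp
using System;

static class DomainCheckSketch
{
    // Hypothetical helper: returns the text before '\' in "DOMAIN\user",
    // or an empty string when the account has no domain prefix.
    public static string DomainPart(string userAccount)
    {
        if (string.IsNullOrEmpty(userAccount)) return string.Empty;
        int slash = userAccount.IndexOf('\\');
        return slash > 0 ? userAccount.Substring(0, slash) : string.Empty;
    }

    // Hypothetical stand-in for IsUserInSameDomain: a case-insensitive
    // comparison of the account's domain prefix with the node's domain name.
    public static bool IsUserInSameDomain(string userAccount, string nodeDomain)
    {
        string domain = DomainPart(userAccount);
        return domain.Length > 0 &&
               string.Equals(domain, nodeDomain, StringComparison.OrdinalIgnoreCase);
    }

    // Convenience overload that reads the node's domain name from the
    // environment, which is the value the post above asks about.
    public static bool IsUserInSameDomain(string userAccount)
    {
        return IsUserInSameDomain(userAccount, Environment.UserDomainName);
    }
}
```

Under this sketch, `IsUserInSameDomain(@"CORP\user1", "NODE1")` (both names made up) returns false, so the local-account branch would be taken. The real check may consult other sources as well, so treat this only as an illustration of the idea.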


    Qiufang Shi

    Wednesday, January 3, 2018 6:43 AM
  • Hi Qiufang,

    Yes, my head node is domain joined. The domain name is HPCLUSTER. I have been testing job submissions using single task and multi task node jobs.

    The non domain joined compute node USERDOMAIN variable is set to COMPUTE4, the machine name. USERDOMAIN on the head node is set to HPCLUSTER, when logged in as a domain user.

    So the USERDOMAIN on the compute node does NOT match the domain name of the head node. However, using your suggestion, I found something that finally makes the jobs run without the authentication errors.

    When I set up the non-domain-joined compute node, I set the workgroup name to the same value as the domain name, HPCLUSTER. The USERDOMAIN variable did not get the domain name; it got the machine name.

    If I change the non domain joined compute node workgroup name to something other than the domain name of the head node, the jobs process without any issues. If the workgroup name matches the domain name, that is when the jobs fail.

    One other question: If I submit the job with a different domain username, will that name need to be added to the non-domain-joined compute node? Or will the process create the username?

    Thanks,

    Jim

    Thursday, January 4, 2018 2:55 AM
  • Hi Jim,

      If you submit the job with a different domain username, you don't need to add that user to the compute node first. The logic on the compute node is:

    - If the node is domain joined, the username is always treated as a domain user.

    - If the node is not domain joined, it checks whether the same username exists on the local machine. If it does, that user is used; if not, a new local user with the same name is created.


    Qiufang Shi

    • Marked as answer by Jim Miranto Tuesday, January 16, 2018 10:23 PM
    Monday, January 8, 2018 5:46 AM
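
The two rules in the accepted answer can be written down as a small decision function. This is a sketch under stated assumptions, not the product code: `AccountAction`, `AccountResolutionSketch`, and `Resolve` are hypothetical names. It also assumes that the domain prefix is dropped before the local lookup and that names are compared case-insensitively. It only decides what would happen; it does not create any accounts.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical outcomes for the two rules described in the answer.
enum AccountAction { UseDomainUser, UseExistingLocalUser, CreateLocalUser }

static class AccountResolutionSketch
{
    public static AccountAction Resolve(
        bool nodeIsDomainJoined, string userAccount, IEnumerable<string> localUsers)
    {
        // Rule 1: a domain-joined node always treats the name as a domain user.
        if (nodeIsDomainJoined) return AccountAction.UseDomainUser;

        // Assumption: drop any "DOMAIN\" prefix before looking for a local user.
        int slash = userAccount.IndexOf('\\');
        string name = slash >= 0 ? userAccount.Substring(slash + 1) : userAccount;

        // Rule 2: reuse a local user with the same name, otherwise create one.
        bool exists = localUsers.Any(
            u => string.Equals(u, name, StringComparison.OrdinalIgnoreCase));
        return exists ? AccountAction.UseExistingLocalUser : AccountAction.CreateLocalUser;
    }
}
```

For example, on a non-domain-joined node whose only local user is `Administrator`, `Resolve(false, @"HPCLUSTER\user1", new[] { "Administrator" })` returns `CreateLocalUser` (`user1` is a made-up name).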