Unable to list HPC users

    Question

  • Hello,

    Since today some users can connect to the HPC cluster but can't create new jobs. When they try, they get the error message "Access denied".
    Some users can work as usual and some jobs are running now.
    When we (the administrators) want to display the list of users (Configuration / Users) we see the error message "There was a network problem or the server was disconnected. Please try connecting again". We've tried with different accounts, remotely or locally on the head node. We've restarted the HPC servers. The problem isn't solved.

    Diagnostics are OK (all functional tests).
    Nothing found in the logs.

    Any idea?

    Thanks for your help.

    Marc

    PS: SP1 isn't installed.


    • Moved by Alex Sutton (Owner) Wednesday, August 26, 2009 4:41 PM (From: Windows HPC Server Deployment, Management, and Administration)
    • Moved by Josh Barnard Friday, September 18, 2009 11:42 PM Sorry to move this again; looks like this problem is related to user-management features. (From:Windows HPC Server Job Submission and Scheduling)
    Wednesday, August 26, 2009 12:00 PM

Answers

  • We were finally able to solve the issue :-)

    Because of an ongoing domain migration (ARGROUP to AR), two accounts with the same SID had been added to the local Administrators group on the head node: the old one from the ARGROUP domain, and the new one from the AR domain (migrated from ARGROUP).
    I removed the old account from the local Administrators group, and now I can manage users through the console again.
    It seems support found the cause in the last dump captured from the HPC processes, which crashed during the action.

    Thanks for your help and comments anyway; I hope this helps other HPC admins who run into the same issue...
    Regards.
    Monday, November 02, 2009 5:53 PM
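
For anyone checking their own head node for the condition described in this answer, here is a minimal Python sketch of the check: group the members of a local group by SID and report any SID that maps to more than one account. The function name `accounts_sharing_sid`, the account names, and the SID values are all hypothetical, made up for illustration. How the (account, SID) pairs are collected from the head node is left open and is not something this thread describes.

```python
from collections import defaultdict

def accounts_sharing_sid(members):
    """Map each SID that is held by more than one account to those accounts.

    `members` is an iterable of (account_name, sid) pairs, e.g. the members
    of the local Administrators group exported by whatever tool you prefer.
    """
    by_sid = defaultdict(list)
    for account, sid in members:
        # SIDs are compared case-insensitively after trimming whitespace.
        by_sid[sid.strip().upper()].append(account)
    return {sid: accts for sid, accts in by_sid.items() if len(accts) > 1}

# Hypothetical sample data: names and SIDs are invented for illustration.
SAMPLE_MEMBERS = [
    ("ARGROUP\\example_admin", "S-1-5-21-1111111111-2222222222-3333333333-1105"),
    ("AR\\example_admin", "S-1-5-21-1111111111-2222222222-3333333333-1105"),
    ("AR\\other_admin", "S-1-5-21-1111111111-2222222222-3333333333-1200"),
]

if __name__ == "__main__":
    for sid, accounts in accounts_sharing_sid(SAMPLE_MEMBERS).items():
        print(sid, "is shared by:", ", ".join(accounts))
```

If the result is non-empty, the fix reported above was to remove the old (pre-migration) account from the group; whether that applies to other setups is an assumption to verify case by case.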

All replies


  • Now all users can schedule jobs. We don't know why...
    But we aren't able to list the HPC users in the HPC administration console.

    PS: I'm not sure that's the right forum...
    Thursday, August 27, 2009 7:59 AM
  • Hi Marc
    What output do you get from the Get-HpcMembers Powershell cmdlet? Try when running the PowerShell console as administrator.
    Cheers
    Dan

    Thursday, August 27, 2009 10:50 AM
  • Hi Dan,

    I work with Marc on this subject, and I tried this command on head node.
    Unfortunately the command isn't recognized, whether it is run from HPC PowerShell or from standard Windows PowerShell.
    Could you please tell us from which path it should be launched?

    Thanks.

    Regards.
    Friday, August 28, 2009 2:35 PM
  • Can you double-check that your cluster has a good connection to Active Directory?

    A quick test for this is to run "net time /domain" on your head node.  You could also use the domain connectivity diagnostic test.

    Thanks,
    Josh
    -Josh
    Friday, August 28, 2009 11:46 PM
  • Thanks for your answer Josh.

    I've run all diagnostic tests on all nodes (including domain connectivity) and they're successful.
    The result from command "net time /domain" on head node is also good, resolving the right DC date+time.

    Regards.

    Monday, August 31, 2009 8:41 AM
  • After a few attempts I was able to run the Get-HpcMember command (without the "s") successfully.
    Here's the output:

    Name                                    Role
    ----                                    ----
    NT AUTHORITY\INTERACTIVE                User
    NT AUTHORITY\AUTHENTICATED USERS        User
    AR\ARSVCHPC                             User
    ENTENHAUSEN\G-DE-APP-ABAQUS             User
    GRENOBLE\G-FR-APP-ABAQUS                User
    ARGROUP\ARRAYNETTECHNICALTEAM           Administrator
    GRENOBLE\ARRAYNETTECHNICALTEAM          Administrator
    AR\OMNI_SRV                             Administrator
    AR\OMNI_SRV                             Administrator

    That matches what was defined during the initial installation; Marc and I are members of the GRENOBLE\ARRAYNETTECHNICALTEAM and ARGROUP\ARRAYNETTECHNICALTEAM groups (we have several ADs in our structure).

    I've also tried adding a test user with the Add-HpcMember command; that works fine too.
    There seems to be a problem in the Users configuration interface, but where...?

    Regards.

    Monday, August 31, 2009 12:02 PM
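
One thing that can be checked mechanically in output like the listing above is whether any account appears more than once. The Python sketch below parses the two-column text and reports repeated names. The function name `duplicate_members` is hypothetical, and whether a repeated row actually points to the root cause is an assumption, not something the thread confirms.

```python
from collections import Counter

def duplicate_members(table_text):
    """Return account names listed more than once in Name/Role table text."""
    names = []
    for line in table_text.splitlines():
        line = line.strip()
        # Skip blank lines, the header row, and the dashed separator row.
        if not line or line.split() == ["Name", "Role"] or line.startswith("----"):
            continue
        # The role is the last space-separated token; the rest is the name.
        name, _, _role = line.rpartition(" ")
        names.append(name.strip().upper())
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)

# The listing posted above, pasted as a raw string so backslashes stay literal.
SAMPLE_OUTPUT = r"""
Name                                    Role
----                                    ----
NT AUTHORITY\INTERACTIVE                User
NT AUTHORITY\AUTHENTICATED USERS        User
AR\ARSVCHPC                             User
ENTENHAUSEN\G-DE-APP-ABAQUS             User
GRENOBLE\G-FR-APP-ABAQUS                User
ARGROUP\ARRAYNETTECHNICALTEAM           Administrator
GRENOBLE\ARRAYNETTECHNICALTEAM          Administrator
AR\OMNI_SRV                             Administrator
AR\OMNI_SRV                             Administrator
"""

if __name__ == "__main__":
    print(duplicate_members(SAMPLE_OUTPUT))
```

The parser assumes role names contain no spaces, which holds for the "User" and "Administrator" values shown here.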
  • Just so I understand: the issue here is that you cannot list the users in the admin console, but you can see them in PowerShell?
    (And the users are able to schedule their jobs all right?)
    thanks
    -parmita


    pm
    Monday, August 31, 2009 9:52 PM
    Moderator
  • That's it Parmita ; users can work normally with HPC, launching their Abaqus jobs.
    But we can only manage the users and groups that access HPC through PowerShell, as the console doesn't work properly.

    Regards.
    Tuesday, September 01, 2009 7:08 AM
  • Is the admin console exhibiting the problem running on the head node itself or on a client system?  Can you reproduce the problem from another client?  What types of errors do you see in the logs at %ccp_data%logfiles on the system(s) which have the problem?

    Thanks,
    --Brian
    Tuesday, September 01, 2009 5:45 PM
  • We have the issue opening the console either from the head node itself, or from an HPC management client (I tested it on my PC, and from another one with the same standard installation).
    If I have a look at the logs on head node I see nothing inside, each time I reproduce the problem.
    Wednesday, September 02, 2009 8:49 AM
  • Hi Rudolphe
    Just to confirm, which domain hosts your cluster machines, and which domain hosts the account that you are using to manage the cluster? If these domains are different are you able to try the same operation using an admin account hosted in the same domain as the cluster machines?
    Regards
    Dan
    PS apologies for the Get-HpcMember typo, hope it didn't cause you too much grief!
    Thursday, September 03, 2009 8:33 AM
  • Hi,

    Our cluster environment is in the AR domain, a subtree of a bigger AD:

    RAY -- AR
        |- DE (ENTENHAUSEN)
        |- FR (GRENOBLE)
           
    

    Clients are usually in the DE or FR domains; I've tried with accounts in AR, DE, and FR, with the same results (all sub-domains are trusted, of course).

    Regards.
    Thursday, September 03, 2009 2:33 PM
  • One last question: you are not logged in as 'local admin', right? You say you've tried it with accounts in AR, DE, or FR; I assume you were logged into the head node/TS client with these accounts, right?

    -parmita


    pm
    Wednesday, September 30, 2009 5:21 PM
    Moderator
  • Hi,

    I've tried it in many ways: directly on the head node, remotely from a client... with local or domain accounts.
    Each time the symptoms are the same.

    We also applied HPC 2008 Service Pack 1 last week, without any improvement on this issue.
    I have an open case with HP/Microsoft support about it; the network error raised while trying to add users through the interface points them toward a network/socket problem... but so far there is neither a solution nor a workaround.

    Regards.
    Thursday, October 01, 2009 9:39 AM
  • Hi Rudolphe
    I was just thinking about this thread and see that you raised a support call with HP/Microsoft. Did you ever get to the bottom of things? I'd be interested in hearing about the solution (if one was found) for future reference.
    Thanks very much.
    Dan

    Wednesday, October 28, 2009 8:48 AM
  • Hi Dan,

    The case is still in progress (they requested logs 2 days ago)... but if we find the solution I'll post it here, for sure.

    Regards.
    Wednesday, October 28, 2009 9:05 AM
  • Many thanks for the update Rudolphe, very good to hear you got a solution sorted out.
    Dan
    Tuesday, November 03, 2009 1:55 PM