HPC Pack Server 2008 Compute Node "Unable to access working directory on <local node>"

  • Question

  • I'm trying to set up a cluster using MS HPC Pack. All nodes are in the proper domain, and appear online and healthy on the head node's cluster manager page. However, when I tried to test the cluster using Lizard from the head node, it failed (but the job appeared in the compute nodes' job manager). All 3 compute nodes gave the same error message -- "Unable to access working directory on <the name of each node>." I'm sure it's not a firewall issue, and I went through all the deployment wizards very meticulously. Did I forget a step? My ADDC is properly set up and part of the domain. I figure there is a sharing setting I am missing or something. Any help with getting this ironed out will be much appreciated.

    Edit: I am on a remote client, and manually set each primary DNS address to the address of my ADDC.

    • Edited by Nolan.330 Monday, June 18, 2012 8:10 PM
    Monday, June 18, 2012 8:07 PM

All replies

  • When you log on to the CNs manually, can you access the share?
    Tuesday, June 19, 2012 8:21 PM
  • I probably should have put this in my original post, but I am extremely inexperienced with setting up networks. This is my first try. When you say "share" I assume it's a file shared between the nodes. I don't think I even have one. At this point all I know is that I have a properly configured ADDC, a head node, and 3 compute nodes that are all in the same domain. I tried turning the firewall completely off, and still got the same error, so it can't be that. The HN recognizes that the CNs are in the domain and shows them as healthy and online until I try to run a simulation (using Lizard). Then the compute nodes show either an error or a warning (with the above error message). When I tried to run "clusrun systeminfo" on the CNs (from the HN cluster manager), I got two different errors. The first said that the command output proxy had failed, and the other said that the node was unable to create the standard output file.

    In any event, thank you for responding! I posted similar questions in a few different places and this is the first response I've gotten back.

    Tuesday, June 19, 2012 9:22 PM
  • You have to set up a Windows file share and run from there.
    Tuesday, June 19, 2012 11:40 PM
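
    One way to test the advice above ("can you access the share?") from each compute node is a small script that tries to create and delete a scratch file in the intended working directory, roughly what a job does when it writes its standard output file. This is only a minimal sketch: it assumes Python is installed on the node, and the UNC path \\HEADNODE\JobShare and the function name check_working_directory are hypothetical placeholders, not names from this thread or from HPC Pack.

```python
import os
import tempfile


def check_working_directory(path):
    """Return (ok, message) saying whether `path` is usable as a working directory."""
    if not os.path.isdir(path):
        return False, f"not a reachable directory: {path}"
    try:
        # Create and remove a scratch file, much like a job writing its stdout file.
        fd, scratch = tempfile.mkstemp(prefix="hpc_share_check_", dir=path)
        os.close(fd)
        os.remove(scratch)
    except OSError as exc:
        return False, f"cannot write to {path}: {exc}"
    return True, f"read/write access to {path} looks fine"


if __name__ == "__main__":
    # Hypothetical UNC path; replace it with the share you actually created.
    ok, message = check_working_directory(r"\\HEADNODE\JobShare")
    print(message)
```

    Run it on each compute node while logged on as the same domain account that submits the job. If it fails there, the job will most likely fail the same way.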
  • Alright, great! This is what I was looking for. I'll try looking into that tomorrow and let you know how it goes.
    Wednesday, June 20, 2012 3:16 AM
  • Alright, I spent all morning looking into setting up a file share on the domain. I gave my domain user account full access to everything in Start->Administrative Tools->Share and Storage Management. I still think I'm missing one large step, while the rest is in place. On my head node, in the cluster manager, it says each node is online and healthy, but in the "Network" tab, it says "Bound to Network: none" and pinging them fails. I'm genuinely confused by this because in the Network and Sharing Center, the HN and all 3 CNs say they are connected to "<domainname>.local", classified as a "domain network." I somehow joined each one to its own network, all named exactly the same thing, and I don't know how to join them together. In Start->Network, I can't see any of the other computers, which seems problematic... I just don't understand how the HN can see them in cluster manager and display them as OK, when they are clearly suffering from my inability to network.

    Thanks again for any help you can offer my poor lost soul.

    • Edited by Nolan.330 Wednesday, June 20, 2012 5:52 PM
    Wednesday, June 20, 2012 5:51 PM
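
    Since pinging the nodes fails and the DNS addresses were set by hand, it may help to check separately whether each node name resolves and whether the Windows file-sharing port (TCP 445, the standard SMB port) answers. The sketch below is an illustration only: it assumes Python is available on the head node, and the node names CN01, CN02 and CN03 are made up. Ping uses ICMP, which this script does not test, so a node can fail ping and still pass here, or the reverse.

```python
import socket


def resolve(hostname):
    """Return the sorted IP addresses that `hostname` resolves to (empty list if none)."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Hypothetical node names; replace them with the real compute node names.
    for node in ["CN01", "CN02", "CN03"]:
        addresses = resolve(node)
        smb_ok = can_connect(node, 445) if addresses else False
        print(node, addresses or "does not resolve", "SMB port 445 reachable:", smb_ok)
```

    A name that does not resolve points at DNS on the domain controller. A name that resolves but refuses port 445 points at file sharing or the firewall profile on that node.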