What happened to my DHCP exclusions?
-
September 21, 2009, 22:04
When I configured HPC Pack 2008, I allowed the GUI to set up a DHCP scope on the headnode. Then I added a number of exclusions using the DHCP admin tool, so as not to stomp over the Linux cluster with which I share a network.
Unfortunately, the exclusions appear to be periodically removed from the DHCP scope's configuration. Not all of them, mind you -- an exclusion that covers a single IP address remains, so only the most important exclusions (the address ranges) have been removed.
This cannot continue to happen -- when the Windows DHCP server goes rogue this way, it's quite possible to take down the other 1080 nodes of our 1200-node production cluster. We're using about 120 nodes to experiment with Windows, and the rest is a production Linux cluster.
So, my questions are as follows:
- Can we remove the Windows DHCP server and use the standard Unix dhcpd that we know how to control? In addition to guaranteeing that Windows doesn't go rogue and destabilize our Linux cluster, this will make it much easier to have the cluster automatically reboot nodes into Linux or Windows based on what the users ask for.
- How can I look under the hood to find out and control what changes the HPC Pack tools are making to the DHCP configuration? Normally, I can just read the Perl script or whatever modifies the files, or look at logfiles -- but I haven't found either of those on Windows. Alternatively, documentation that describes what the HPC Pack does to the Windows DHCP server would allow me to troubleshoot the problem and script around any issues that arise.
- The word on the street is that WDS runs a secret DHCP server that sends additional packets on top of the regular DHCP server. Can anyone fill me in on how this is supposed to work? Knowing how it's supposed to work may save me many hours of looking at Wireshark output.
My office-mate reported a variation on this issue on Wed, 09 Apr 2008 18:51:32 -0500, so it's been around a while. Reconfiguring services on a cron-like schedule in the background, without telling the sysadmin and with no visible way for the admin to control the process, is a major problem -- it really is a systems time-bomb. I'd love to go to a totally manual (or scripted) DHCP configuration (preferably using our existing Unix dhcpd configuration), but I haven't found documentation describing which changes the HPC Pack tools make to the DHCP server, and what methods/interfaces they use to make those changes.
-Luke
All replies
-
September 22, 2009, 20:10
Here's a typical packet from the secret DHCP server in WDS. The packet was parsed and formatted into text by Wireshark:

No.  Time      Source      Destination  Protocol  Info
8    4.098599  10.1.69.56  10.1.66.13   DHCP      DHCP ACK - Transaction ID 0xc6fde161

Frame 8 (303 bytes on wire, 303 bytes captured)
    Arrival Time: Sep 22, 2009 13:25:06.209898000
    [Time delta from previous captured frame: 0.000534000 seconds]
    [Time delta from previous displayed frame: 0.000534000 seconds]
    [Time since reference or first frame: 4.098599000 seconds]
    Frame Number: 8
    Frame Length: 303 bytes
    Capture Length: 303 bytes
    [Frame is marked: False]
    [Protocols in frame: eth:ip:udp:bootp]
    [Coloring Rule Name: Checksum Errors]
    [Coloring Rule String: cdp.checksum_bad==1 || edp.checksum_bad==1 || ip.checksum_bad==1 || tcp.checksum_bad==1 || udp.checksum_bad==1 || mstp.checksum_bad==1]
Ethernet II, Src: Dell_61:c1:bd (00:24:e8:61:c1:bd), Dst: Dell_fd:e1:61 (00:15:c5:fd:e1:61)
    Destination: Dell_fd:e1:61 (00:15:c5:fd:e1:61)
        Address: Dell_fd:e1:61 (00:15:c5:fd:e1:61)
        .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
        .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
    Source: Dell_61:c1:bd (00:24:e8:61:c1:bd)
        Address: Dell_61:c1:bd (00:24:e8:61:c1:bd)
        .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
        .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
    Type: IP (0x0800)
Internet Protocol, Src: 10.1.69.56 (10.1.69.56), Dst: 10.1.66.13 (10.1.66.13)
    Version: 4
    Header length: 20 bytes
    Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00)
        0000 00.. = Differentiated Services Codepoint: Default (0x00)
        .... ..0. = ECN-Capable Transport (ECT): 0
        .... ...0 = ECN-CE: 0
    Total Length: 289
    Identification: 0x3128 (12584)
    Flags: 0x00
        0... = Reserved bit: Not set
        .0.. = Don't fragment: Not set
        ..0. = More fragments: Not set
    Fragment offset: 0
    Time to live: 128
    Protocol: UDP (0x11)
    Header checksum: 0x0000 [incorrect, should be 0x6d5d]
        [Good: False]
        [Bad : True]
        [Expert Info (Error/Checksum): Bad checksum]
            [Message: Bad checksum]
            [Severity level: Error]
            [Group: Checksum]
    Source: 10.1.69.56 (10.1.69.56)
    Destination: 10.1.66.13 (10.1.66.13)
User Datagram Protocol, Src Port: altserviceboot (4011), Dst Port: bootpc (68)
    Source port: altserviceboot (4011)
    Destination port: bootpc (68)
    Length: 269
    Checksum: 0x9c65 [validation disabled]
        [Good Checksum: False]
        [Bad Checksum: False]
Bootstrap Protocol
    Message type: Boot Reply (2)
    Hardware type: Ethernet
    Hardware address length: 6
    Hops: 0
    Transaction ID: 0xc6fde161
    Seconds elapsed: 4
    Bootp flags: 0x0000 (Unicast)
        0... .... .... .... = Broadcast flag: Unicast
        .000 0000 0000 0000 = Reserved flags: 0x0000
    Client IP address: 10.1.66.13 (10.1.66.13)
    Your (client) IP address: 0.0.0.0 (0.0.0.0)
    Next server IP address: 10.1.69.56 (10.1.69.56)
    Relay agent IP address: 0.0.0.0 (0.0.0.0)
    Client MAC address: Dell_fd:e1:61 (00:15:c5:fd:e1:61)
    Client hardware address padding: 00000000000000000000
    Server host name: msabesched.abe.local
    Boot file name: Boot\x64\WdsNbp.com
    Magic cookie: (OK)
    Option: (t=53,l=1) DHCP Message Type = DHCP ACK
        Option: (53) DHCP Message Type
        Length: 1
        Value: 05
    Option: (t=60,l=9) Vendor class identifier = "PXEClient"
        Option: (60) Vendor class identifier
        Length: 9
        Value: 505845436C69656E74
    Option: (t=54,l=4) DHCP Server Identifier = 10.1.69.56
        Option: (54) DHCP Server Identifier
        Length: 4
        Value: 0A014538
    End Option
It doesn't appear to be handing out addresses (I don't see any DHCP Offers from the Windows server in my packet capture when the Windows DHCP server is disabled), but this does seem to cause the nodes to boot into Windows, even when we're using the Linux dhcpd to hand out the addresses.
I didn't know you could append a DHCP offer by sending a follow-up packet -- that's neat!
But I really wish that this kind of thing wasn't hidden from the admin. It's a ticking time-bomb when the admin doesn't know about it and must employ subterfuge to control it -- especially in a network where Windows is a minority system. (Windows is running on less than 10% of the nodes in this particular system.)
- Edited by Luke Scharf, September 22, 2009, 20:11: Fine-tune formatting
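For anyone who wants to separate this kind of packet from a real address assignment in their own captures, here is a minimal Python sketch. It parses the UDP payload of a BOOTP/DHCP reply and flags replies that carry PXE boot information but no address (Boot Reply, "Your (client) IP address" of 0.0.0.0, vendor class "PXEClient"), which is the pattern visible in the decoded ACK above. The function names `parse_bootp`, `is_pxe_only_reply`, and `sample_ack` are made up for this illustration, and the sample payload is rebuilt from the decoded field values rather than from the raw capture bytes. Whether WDS is the process emitting these packets is not established here; the sketch only classifies what is on the wire.

```python
import socket
import struct

# Fixed BOOTP header layout: 236 bytes, followed by a 4-byte magic cookie.
BOOTP_HEADER = struct.Struct("!BBBBIHH4s4s4s4s16s64s128s")
MAGIC_COOKIE = bytes([0x63, 0x82, 0x53, 0x63])


def parse_bootp(payload: bytes) -> dict:
    """Parse the UDP payload of a BOOTP/DHCP message into a small dict."""
    cookie_end = BOOTP_HEADER.size + len(MAGIC_COOKIE)
    if len(payload) < cookie_end:
        raise ValueError("payload too short for BOOTP header and magic cookie")
    (op, _htype, hlen, _hops, xid, _secs, _flags, ciaddr, yiaddr,
     siaddr, _giaddr, chaddr, _sname, bootfile) = BOOTP_HEADER.unpack_from(payload)
    if payload[BOOTP_HEADER.size:cookie_end] != MAGIC_COOKIE:
        raise ValueError("missing DHCP magic cookie")

    # Walk the type-length-value options that follow the cookie.
    options = {}
    i = cookie_end
    while i < len(payload):
        code = payload[i]
        if code == 255:          # End option
            break
        if code == 0:            # Pad option has no length byte
            i += 1
            continue
        if i + 1 >= len(payload):
            break                # truncated option; stop parsing
        length = payload[i + 1]
        options[code] = payload[i + 2:i + 2 + length]
        i += 2 + length

    return {
        "op": op,
        "xid": xid,
        "ciaddr": socket.inet_ntoa(ciaddr),
        "yiaddr": socket.inet_ntoa(yiaddr),
        "siaddr": socket.inet_ntoa(siaddr),
        "client_mac": ":".join(f"{b:02x}" for b in chaddr[:hlen]),
        "boot_file": bootfile.split(b"\x00", 1)[0].decode("ascii", "replace"),
        "options": options,
    }


def is_pxe_only_reply(msg: dict) -> bool:
    """True for a Boot Reply that offers no address but identifies as PXEClient."""
    return (msg["op"] == 2
            and msg["yiaddr"] == "0.0.0.0"
            and msg["options"].get(60, b"").startswith(b"PXEClient"))


def sample_ack() -> bytes:
    """Rebuild the ACK's BOOTP payload from the field values Wireshark decoded."""
    header = BOOTP_HEADER.pack(
        2, 1, 6, 0, 0xC6FDE161, 4, 0x0000,
        socket.inet_aton("10.1.66.13"),   # client IP address
        socket.inet_aton("0.0.0.0"),      # your (client) IP address
        socket.inet_aton("10.1.69.56"),   # next server IP address
        socket.inet_aton("0.0.0.0"),      # relay agent IP address
        bytes.fromhex("0015c5fde161"),    # client MAC, zero-padded to 16 bytes
        b"msabesched.abe.local",
        b"Boot\\x64\\WdsNbp.com",
    )
    options = (bytes([53, 1, 5])                      # DHCP Message Type = ACK
               + bytes([60, 9]) + b"PXEClient"        # Vendor class identifier
               + bytes([54, 4]) + socket.inet_aton("10.1.69.56")
               + bytes([255]))                        # End option
    return header + MAGIC_COOKIE + options


if __name__ == "__main__":
    msg = parse_bootp(sample_ack())
    print(msg["client_mac"], msg["boot_file"], is_pxe_only_reply(msg))
```

The rebuilt payload is 261 bytes, which together with the 8-byte UDP header matches the UDP length of 269 in the decode. To use this on live traffic you would still need something to pull UDP payloads for ports 67, 68, and 4011 out of a capture file; that part is left out.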
-
September 22, 2009, 22:25 (Moderator)
Alright, so here is what I think is happening:
You have a Linux cluster and a Windows cluster sharing the 'private' network, with two DHCP servers.
When you set up the Windows HPC head node, you picked a subnet for the Windows nodes (in the network wizard) that overlaps with your Linux DHCP server's range, and you went into the Windows DHCP server to set up 'exclusions', which we, the Windows HPC services, stomped on.
The Windows HPC management service uses the 'private' network to send management traffic, including responses to DHCP and PXE requests. We also use this network for TFTP traffic, for multicasting OS images during install, and for sending the management commands that deploy, manage, and monitor the compute nodes. Windows expects this network to be private :).
It is not private in your case, since you share it with a Linux cluster, but there are a few things you can do to ensure Windows does not bring down your Linux nodes and vice versa.
a) If you pick a scope in the network wizard (to-do list, step 1), we will stomp on any exclusions you put in there. So put your Linux and Windows nodes on separate subnets; this way we will not mess with your exclusions.
b) If this is a problem, consider DHCP reservations for each of the Windows nodes based on MAC addresses, and prevent Windows DHCP from responding to any machines that do not have a known MAC address.
c) The default setting for the PXE server on the head node is to NOT respond to any PXE requests from 'unknown' or new machines, but it seems that has been turned off, producing the 'secret DHCP' behavior you describe. You can change this back via the UI (admin console->Menu->options->deployment settings->"respond only to PXE requests that come from existing compute nodes"): select it and save.
If you want to use PowerShell,
Set-HpcClusterProperty -WDSMode IgnoreUnknown
will set this for you.
PowerShell reference: http://technet.microsoft.com/en-us/library/cc947676(WS.10).aspx
I would also recommend creating a node list XML file identifying the compute nodes you want to deploy as Windows nodes (assuming these are the same 120 machines each time, since they are identified by MAC address), so you can use the same name for the same machine each time you deploy:
http://technet.microsoft.com/en-us/library/cc707389(WS.10).aspx
There are other topics in the deployment guide at the link above that will help you create DHCP reservations if you need to.
Let me know if you run into further problems.
Thanks
-Parmita
pm
-
September 23, 2009, 14:37
Alright, so here is what I think is happening:
You have a Linux cluster and a Windows cluster sharing the 'private' network, with two DHCP servers.
When you set up the Windows HPC head node, you picked a subnet for the Windows nodes (in the network wizard) that overlaps with your Linux DHCP server's range, and you went into the Windows DHCP server to set up 'exclusions', which we, the Windows HPC services, stomped on.
Of course they're sharing the same subnet -- we have one private network for our cluster (a 10-net) -- are you telling me that you expect Windows to only run on dedicated networks?? With the shared network, we only have to maintain one network, and Windows and Linux can share resources like storage, schedulers, and so forth. I've been doing this kind of thing with DHCP servers for well over a decade. But the HPC Pack service is periodically unconfiguring the IP address exclusions that make it possible for these systems to play well with others, which is a problem.
We've disabled the Microsoft DHCP server. The packet that I posted above must be coming from WDS, and it only seems to be going out to known clients. I loaded a list of known clients using an XML file a while ago, and I was able to delete a client from the "HPC Cluster Manager" -- though I'm trying to automate this process so that it can eventually be controlled by scripts, and possibly by Moab.
Hidden DHCP servers are something that Are Not Allowed on production networks. This is something that every admin needs to know about before installing HPC Pack -- especially admins who work primarily on non-Windows platforms.
b) If this is a problem, consider DHCP reservations for each of the Windows nodes based on MAC addresses, and prevent Windows DHCP from responding to any machines that do not have a known MAC address.
Done. We didn't even consider doing it any other way, since our network contains mostly net-booted Linux workstations. I've loaded the list into Microsoft's DHCP server via a shell script.
c) The default setting for the PXE server on the head node is to NOT respond to any PXE requests from 'unknown' or new machines, but it seems that has been turned off, producing the 'secret DHCP' behavior you describe. You can change this back via the UI (admin console->Menu->options->deployment settings->"respond only to PXE requests that come from existing compute nodes"): select it and save.
This part seems to be behaving -- we wouldn't be having this conversation if 1080 production Linux nodes weren't booting. The Windows servers would have been turned off immediately, and possibly permanently.
The problem is that the main Microsoft DHCP server was forgetting the address-range exclusions. We've disabled the Microsoft DHCP server, and we're going to try to manage this thing with the Linux dhcpd that the rest of the cluster depends on. The problem is that the WDS server is still sending out DHCP packets, which makes it technically a rogue DHCP server -- even though the packets it's sending out are follow-up PXE-related packets, it's still very bad that we had to figure this out the hard way. We need to know about these things before we put these services on our network.
The other problem is that since I couldn't figure out what information flows between the HPC Pack admin tools and the Microsoft DHCP server, I set up the HPC Pack to manage the DHCP server -- which seems to work OK, except that it removes the address-range exclusions that make it possible for these two systems to play nicely on the same network. But it doesn't matter now -- rogue DHCP servers must be terminated for the good of 90% of the hosts on our network, and we'll figure out what information must be sent by the Linux DHCP server by trial, error, and Wireshark. Wireshark is good.
I would also recommend creating a node list XML file identifying the compute nodes you want to deploy as Windows nodes (assuming these are the same 120 machines each time, since they are identified by MAC address), so you can use the same name for the same machine each time you deploy:
http://technet.microsoft.com/en-us/library/cc707389(WS.10).aspx
There are other topics in the deployment guide at the link above that will help you create DHCP reservations if you need to.
That was done months ago -- though the examples that I picked up from another admin here weren't formatted quite right and I had to ask for help:
http://social.microsoft.com/Forums/en-US/windowshpcitpros/thread/446bc3e0-872f-4cc3-a73f-38552e1216f4
The main problem (which is mostly solved) is that the HPC Pack appears to be "managing" my DHCP server in ways that I didn't expect via some sort of hidden cronjob, and the next problem is that I'm still getting DHCP packets from a Windows host when I've disabled DHCP. If it works the way I think it does, the separation between "system" DHCP services and "WDS" DHCP services makes sense -- but it needs to be written on big honking warning labels that WDS sends out DHCP packets. Windows is a guest on this network -- and it needs to act that way.
-Luke
- Edited by Luke Scharf, September 23, 2009, 15:08: Fix HTML formatting
-
September 23, 2009, 14:41
I set the DHCP server's "Startup Type" in the "Services" control panel to "Disabled" yesterday. It was enabled and started this morning, which means it's really-super-ungood-misbehaving. It also re-added all of the nodes that I deleted from the HPC Job Manager's Node Management interface -- nodes that were supposed to just pxeboot into Linux.
I've removed the DHCP role from the Windows headnode. Even if it breaks something on the Windows side, it will allow the remaining 90% of the cluster to run properly.
How much can I cuss before I get banned from this forum?
-Luke
-
September 23, 2009, 16:27 (Moderator)
:)
WDS does not send out 'DHCP' packets. When you disable DHCP outside of the HPC admin console, the management service notices and tries to fix it. The management service is doing its job in trying to be a 'self-healing' system. We don't have a hidden DHCP service; as you said, you would ban setting this up on any network, ever, if we were to set up a DHCP server without telling the admin.
What you need to do is go to the network wizard and uncheck DHCP. Disabling the service from the DHCP add-in is not the right way to disable DHCP on the HPC head node: it is going to break other things, the management service will attempt to resurrect the DHCP service, and your event log will fill up with tons of error messages.
Go back to the network wizard (configuration pane, to-do list, first step) and, on the page 'Private network configuration', uncheck the box that says enable DHCP.
Now you can manage DHCP outside of the HPC head node and you will not have bad behavior.
-parmita
pm
- Marked as answer by parmita mehta (Moderator), September 24, 2009, 17:22
-
September 23, 2009, 17:42
We don't have a hidden DHCP service; as you said, you would ban setting this up on any network, ever, if we were to set up a DHCP server without telling the admin.
Then why am I sniffing DHCP packets originating from the Windows headnode IMMEDIATELY after I disable the DHCP server? This is before the system "heals" itself and restarts the DHCP server -- my Windows headnode stops sending out DHCP Offers (which implies that something changed when I disabled the Windows DHCP server, and it appears to be stopped), but even though there aren't any DHCP Offers, it continues to send out DHCP ACKs. This seems to be pretty strong evidence that there's a secret DHCP server somewhere on the Windows headnode. Then when it "heals" itself, it starts sending out the DHCP Offers again. Also, I've removed the DHCP server role from my Windows headnode, and it's still sending out DHCP packets. That's why I posted the decoded packet capture above. If this packet isn't coming from some sort of hidden DHCP server, then where is it coming from?
Also, how was I supposed to know that the normal Windows administration practices no longer apply? I'm primarily a Unix admin, but I've run Windows Active Directory systems with 120+ workstations in the not-too-distant past...
I've re-run the network wizard, but I'm reluctant to run those things because I never know what they're going to steamroll...! I'll keep Wireshark running and see what I see.
-Luke
P.S. Here are two decoded DHCP ACKs that I captured with Wireshark after I removed the DHCP role from my headnode (but before I disabled DHCP using the HPC Cluster Manager). The first is a packet sent from my headnode (10.1.69.56) to a compute node (10.1.66.48). The second is a DHCP ACK packet that's being broadcast to the entire network. The packets are clearly intended primarily for PXE booting and aren't offering an address -- but 10.1.66.48 is a compute node that we're trying to boot back into Linux after deleting it from the HPC Cluster Manager and disabling the DHCP server:
No.  Time         Source      Destination  Protocol  Info
672  1220.327151  10.1.69.56  10.1.66.48   DHCP      DHCP ACK - Transaction ID 0xc6fd3658

Frame 672 (377 bytes on wire, 377 bytes captured)
    Arrival Time: Sep 23, 2009 10:37:10.969158000
    [Time delta from previous captured frame: 0.000347000 seconds]
    [Time delta from previous displayed frame: 0.264298000 seconds]
    [Time since reference or first frame: 1220.327151000 seconds]
    Frame Number: 672
    Frame Length: 377 bytes
    Capture Length: 377 bytes
    [Frame is marked: True]
    [Protocols in frame: eth:ip:udp:bootp]
    [Coloring Rule Name: Checksum Errors]
    [Coloring Rule String: cdp.checksum_bad==1 || edp.checksum_bad==1 || ip.checksum_bad==1 || tcp.checksum_bad==1 || udp.checksum_bad==1 || mstp.checksum_bad==1]
Ethernet II, Src: Dell_61:c1:bd (00:24:e8:61:c1:bd), Dst: Dell_fd:36:58 (00:15:c5:fd:36:58)
    Destination: Dell_fd:36:58 (00:15:c5:fd:36:58)
    Source: Dell_61:c1:bd (00:24:e8:61:c1:bd)
    Type: IP (0x0800)
Internet Protocol, Src: 10.1.69.56 (10.1.69.56), Dst: 10.1.66.48 (10.1.66.48)
    Version: 4
    Header length: 20 bytes
    Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00)
    Total Length: 363
    Identification: 0x0383 (899)
    Flags: 0x00
    Fragment offset: 0
    Time to live: 128
    Protocol: UDP (0x11)
    Header checksum: 0x0000 [incorrect, should be 0x9a95]
    Source: 10.1.69.56 (10.1.69.56)
    Destination: 10.1.66.48 (10.1.66.48)
User Datagram Protocol, Src Port: altserviceboot (4011), Dst Port: bootpc (68)
    Source port: altserviceboot (4011)
    Destination port: bootpc (68)
    Length: 343
    Checksum: 0x9cd2 [validation disabled]
Bootstrap Protocol
    Message type: Boot Reply (2)
    Hardware type: Ethernet
    Hardware address length: 6
    Hops: 0
    Transaction ID: 0xc6fd3658
    Seconds elapsed: 0
    Bootp flags: 0x0000 (Unicast)
    Client IP address: 10.1.66.48 (10.1.66.48)
    Your (client) IP address: 0.0.0.0 (0.0.0.0)
    Next server IP address: 10.1.69.56 (10.1.69.56)
    Relay agent IP address: 0.0.0.0 (0.0.0.0)
    Client MAC address: Dell_fd:36:58 (00:15:c5:fd:36:58)
    Client hardware address padding: 00000000000000000000
    Server host name: msabesched.abe.local
    Boot file name: Boot\x64\WdsNbp.com
    Magic cookie: (OK)
    Option: (t=53,l=1) DHCP Message Type = DHCP ACK
    Option: (t=60,l=9) Vendor class identifier = "PXEClient"
    Option: (t=54,l=4) DHCP Server Identifier = 10.1.69.56
    Option: (t=250,l=72) Private
    End Option

No.   Time         Source      Destination      Protocol  Info
6741  3694.852647  10.1.69.56  255.255.255.255  DHCP      DHCP ACK - Transaction ID 0xaf284c6

Frame 6741 (303 bytes on wire, 303 bytes captured)
    Arrival Time: Sep 23, 2009 11:18:25.494654000
    [Time delta from previous captured frame: 0.000531000 seconds]
    [Time delta from previous displayed frame: 2474.525496000 seconds]
    [Time since reference or first frame: 3694.852647000 seconds]
    Frame Number: 6741
    Frame Length: 303 bytes
    Capture Length: 303 bytes
    [Frame is marked: True]
    [Protocols in frame: eth:ip:udp:bootp]
    [Coloring Rule Name: UDP]
    [Coloring Rule String: udp]
Ethernet II, Src: Dell_61:c1:bd (00:24:e8:61:c1:bd), Dst: Broadcast (ff:ff:ff:ff:ff:ff)
    Destination: Broadcast (ff:ff:ff:ff:ff:ff)
    Source: Dell_61:c1:bd (00:24:e8:61:c1:bd)
    Type: IP (0x0800)
Internet Protocol, Src: 10.1.69.56 (10.1.69.56), Dst: 255.255.255.255 (255.255.255.255)
    Version: 4
    Header length: 20 bytes
    Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00)
    Total Length: 289
    Identification: 0x1057 (4183)
    Flags: 0x00
    Fragment offset: 0
    Time to live: 128
    Protocol: UDP (0x11)
    Header checksum: 0xda3c [correct]
    Source: 10.1.69.56 (10.1.69.56)
    Destination: 255.255.255.255 (255.255.255.255)
User Datagram Protocol, Src Port: bootps (67), Dst Port: bootpc (68)
    Source port: bootps (67)
    Destination port: bootpc (68)
    Length: 269
    Checksum: 0x5057 [validation disabled]
Bootstrap Protocol
    Message type: Boot Reply (2)
    Hardware type: Ethernet
    Hardware address length: 6
    Hops: 0
    Transaction ID: 0x0af284c6
    Seconds elapsed: 4
    Bootp flags: 0x8000 (Broadcast)
    Client IP address: 0.0.0.0 (0.0.0.0)
    Your (client) IP address: 0.0.0.0 (0.0.0.0)
    Next server IP address: 10.1.69.56 (10.1.69.56)
    Relay agent IP address: 0.0.0.0 (0.0.0.0)
    Client MAC address: Dell_f2:84:c6 (00:1d:09:f2:84:c6)
    Client hardware address padding: 00000000000000000000
    Server host name: msabesched.abe.local
    Boot file name: Boot\x64\WdsNbp.com
    Magic cookie: (OK)
    Option: (t=53,l=1) DHCP Message Type = DHCP ACK
    Option: (t=60,l=9) Vendor class identifier = "PXEClient"
    Option: (t=54,l=4) DHCP Server Identifier = 10.1.69.56
    End Option
-
September 23, 2009, 19:25
After running through the wizard and rebooting the Windows headnode, it seems like everything is behaving (the returned-to-Linux nodes boot) except for one thing. I'm still seeing DHCP ACKs being broadcast from my Windows headnode to 255.255.255.255 with PXE information. The packets look like the one I posted above.
We haven't established that these broadcast-packets are harmless, but so far it looks like removing the Windows DHCP server role and going through the network wizard did get rid of the egregious issues. I still want to know where in Windows these possibly-harmless DHCP ACK packets are coming from. I'm guessing it's the WDS service, but I haven't proven that.
The Windows compute nodes aren't booting at the moment, but I think I need to add a few properties that I saw with the sniffer into dhcpd. That's my problem, though.
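As a starting point for that dhcpd work, here is a minimal Python sketch that renders an ISC dhcpd.conf host stanza carrying the two boot fields that appear in the decoded ACKs: the next-server address and the boot file name. The function name `render_host` and the host name `node13` are hypothetical, and it is an assumption (not something established in this thread) that these two settings are sufficient for the Windows nodes to PXE boot; the capture also shows a "PXEClient" vendor class that may need handling.

```python
def render_host(name: str, mac: str, ip: str, next_server: str, boot_file: str) -> str:
    """Return one ISC dhcpd.conf host stanza with PXE boot settings."""
    # Assumption: ISC dhcpd treats backslash as an escape character inside
    # quoted strings, so the Windows-style path separators are doubled.
    escaped = boot_file.replace("\\", "\\\\")
    return "\n".join([
        f"host {name} {{",
        f"  hardware ethernet {mac};",
        f"  fixed-address {ip};",
        f"  next-server {next_server};",
        f'  filename "{escaped}";',
        "}",
    ])


if __name__ == "__main__":
    # MAC, addresses, and boot file are taken from the first decoded ACK;
    # the host name is made up for the example.
    print(render_host(
        name="node13",
        mac="00:15:c5:fd:e1:61",
        ip="10.1.66.13",
        next_server="10.1.69.56",
        boot_file="Boot\\x64\\WdsNbp.com",
    ))
```

Generating the stanzas from the same MAC list that feeds the node XML file would keep the Linux and Windows views of the 120 nodes in sync, and removing a node's `next-server` and `filename` lines is one way to send it back to the Linux boot path.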
This "Self Healing" stuff really needs to be done differently. The admin needs to be informed of every detail of the changes made to these services one way or another. It doesn't matter to me if it's documentation, if it's something like UAC, or if it's just a Perl script in "C:\Program Files\Microsoft HPC Pack" that I can read and modify -- but if the admin isn't informed as to what's going on, it's a ticking time-bomb that's likely to do a lot more harm than good.
-
September 24, 2009, 17:21 (Moderator)
Thanks for your feedback.
There is no secret DHCP service. We will try to reproduce this in our test environment and see what you might be seeing.
thanks
-parmita
pm