Headnode stops sending events under heavy load
-
22 October 2011 22:54
I was reading the answer to this post and point three caught my eye...
"The headnode is under heavy load, and is unable to keep up with the large number of job/task events it needs to send out to a particular connected client, in which case it will explicitly close the connection."
1. Is there any definition of "heavy load"? Is it administrator configurable?
2. Is there any other way to work out if we have been dropped by the server other than "periodic polling" as mentioned in the last answer?
All replies
-
25 October 2011 00:27
Hi,
To answer your first question: "heavy load" depends on several factors, such as:
- size of the cluster,
- number of concurrently running jobs,
- number of tasks in currently running jobs,
- task run time (shorter tasks cause more processing on the headnode because job/task state changes are more frequent),
- headnode hardware,
- database software version (express vs full) and installation location (headnode vs remote server).
Administrators can use job templates (http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=5659) and activation/submission filters (http://technet.microsoft.com/en-us/library/cc972783(WS.10).aspx) to control some of the above factors.
About your second question, clients can also use scheduler reconnection events: http://msdn.microsoft.com/en-us/library/microsoft.hpc.scheduler.reconnecthandler(v=VS.85).aspx
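To illustrate how the two approaches can be combined on the client side, here is a minimal, generic sketch in Python. It is not the HPC scheduler API: `EventWatchdog`, `check_connection`, `poll_state` and `reconnect` are hypothetical names chosen for this example. The idea is to record when the last event arrived and to poll only after the event stream has been silent for longer than a chosen threshold.

```python
import time
from typing import Callable


class EventWatchdog:
    """Remembers when the last event arrived (hypothetical helper)."""

    def __init__(self, max_silence_s: float,
                 clock: Callable[[], float] = time.monotonic):
        self._max_silence_s = max_silence_s
        self._clock = clock
        self._last_event_at = clock()

    def on_event(self) -> None:
        # Call this from every job/task event handler.
        self._last_event_at = self._clock()

    def is_stale(self) -> bool:
        # True when no event has been seen for longer than the threshold.
        return self._clock() - self._last_event_at > self._max_silence_s


def check_connection(watchdog: EventWatchdog,
                     poll_state: Callable[[], None],
                     reconnect: Callable[[], None]) -> str:
    """Poll only when the event stream has gone quiet.

    poll_state and reconnect are placeholders for whatever calls the
    real client library provides; poll_state is assumed to raise
    ConnectionError when the server has dropped the client.
    """
    if not watchdog.is_stale():
        return "ok"
    try:
        poll_state()
    except ConnectionError:
        reconnect()
        watchdog.on_event()  # restart the silence timer
        return "reconnected"
    watchdog.on_event()  # server answered, restart the silence timer
    return "polled"
```

Calling `check_connection` from a timer keeps polling traffic low while events are flowing, which matters when the headnode is already busy. The silence threshold is a judgment call and depends on how often events are expected.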
Thanks,
Łukasz
Marked as answer by Cube00, 25 October 2011 08:20
-
8 November 2011 11:06
Hi Łukasz,
Thanks for the answer.
Is this still applicable when using the COM API or is COM more robust?
Thanks.