Load balanced front end EE servers do not failover nicely.

  • Question

  • The OCS 2007 Planning Guide describes robust failover handling in a load-balanced scenario. Here is a short extract:

     

    "The multiple Front End Servers that make up an Enterprise Edition pool provide a high availability solution wherein if a single Front End Server fails, clients will detect the failure and automatically reconnect to one of the other available Front End Servers. Meeting state is preserved because a meeting is hosted by the pool, not by any single server ... etc. etc."

     

    This does not match our experience, which is described below:

     

    We have 2 EE Front End servers (composites). One is hosting a 3-party AV conference. We stop this server and hope the session(s) will seamlessly fail over onto the other server. This is what we notice ...

    a. all the clients are disconnected from the conference: "Conference error"

    b. after a few seconds all clients automatically sign in to the OCS server (Reconnect)

    c. but the conference could not be joined again -> the conference is lost!

     

    Question 1: Have we misunderstood the failover functionality, or do we have a problem?

     

    Question 2: Is there a "nice" way of taking down a Front End for maintenance with minimal impact on end users? We could imagine a command which first prevents new conferences from being set up and then, maybe later, starts closing the services.

    Monday, January 28, 2008 5:53 PM

All replies

  • If I understand you correctly, your servers are operating properly. Let me see if I understand...

    The A/V conference you refer to is a Communicator conference with voice and/or video, correct?

    These types of conferences will be interrupted and can be continued by joining the parties together again in another conference.

     

    I believe the conference discussed in the guide is a Live Meeting conference. Live Meeting conferences will be interrupted and can be rejoined, but Communicator conferences cannot be rejoined dynamically.

     

    If you are referring to a Live Meeting conference, we may need to do some additional troubleshooting.

     

     

    Tuesday, January 29, 2008 12:05 AM
  • OK. Thanks. It was a Communicator 2007 conference. But the OCS 2007 Planning Guide does go on to say ...

     

    "When the server goes down due to hardware or network failure, there will be an interruption in the experience of the clients that are using that server for IM, presence, and conferencing. Those clients will reconnect to resume the service...."

     

    which is carefully worded, at least to give the casual reader the impression that a sort of "failover" will happen. Or does it mean that the users of those (Communicator) clients will have to manually reconnect to resume the service (and then manually rebuild the conference which was lost)?

     

    Anyway, if there is no failover and users (of Communicator 2007) have to manually rebuild conferences when a server goes down, how are we supposed to handle planned server maintenance in such a way as to minimise impact on the users?

     

     

     

    Tuesday, January 29, 2008 10:07 AM