OCS 2007 R2/Communicator: contacts show as presence unknown for majority of users on one server

  • Dear all,

    I have just completed a migration from OCS 2007 to OCS 2007 R2 and Communicator R2. The migration has gone well so far: I've moved our complete infrastructure, which consists of three Standard Edition servers, one per geographic region we provide services to, and all clients are now using Communicator R2, which we deployed via GPO.

    We host 600 users across the three servers. We're running Windows Server 2008 64-bit Standard Edition with the default install of OCS Standard Edition, including SQL Express 2005. We are using internally issued certificates, and all appropriate validation checks on all servers come up green. The event logs are clean on all servers, bar the errors mentioned below.

    For users hosted on our Americas and Europe servers there are no problems:

    From Communicator, all users can see the presence of users in both their home region and remote regions (UPNs take the form jbloggs@asia.company.net, jbloggs@europe.company.net, jbloggs@americas.company.net). For approximately 70 percent of the users in our Asia region, any users already in their contact lists show as normal, but anyone not in the contact list shows as "Presence Unknown", unless that user has contacts that are hosted on another region's server, in which case they show as "Presence Unknown".

    Despite the "Presence Unknown" tag for all contacts not in the contact list, calls and IMs work as normal, and users on other regions' servers (and those users in Asia unaffected by the issue) are able to see presence.

    All affected users were able to see Communicator presence as normal after the upgrade and before this problem began to appear (or so they say; I can't verify this), and there seems to be no relationship between this issue occurring and the client version of Communicator R2 in use:

    We are using the following versions, all are affected: 3.5.6907.0, 3.5.6907.22 and 3.5.6907.34.

    When I look in the IIS logs on our affected server, I see a great many 404 and 401 errors from ABS retrieval, as shown below. The users who have no problems generally pull down the AB with a 200:

    2009-07-28 00:11:07 10.0.1.245 GET /Abs/Int/Handler/D-0c38-0c3a.lsabs - 443 ASIA\**** 10.0.2.79 Mozilla/4.0+(compatible;+MSIE+7.0;+Windows+NT+6.0;+SLCC1;+.NET+CLR+2.0.50727;+.NET+CLR+3.0.04506;+InfoPath.2;+MS-RTC+LM+8) 404 0 64 78
    2009-07-28 00:11:07 10.0.1.245 GET /Abs/Int/Handler/D-0c38-0c39.lsabs - 443 - 10.0.2.79 Mozilla/4.0+(compatible;+MSIE+7.0;+Windows+NT+6.0;+SLCC1;+.NET+CLR+2.0.50727;+.NET+CLR+3.0.04506;+InfoPath.2;+MS-RTC+LM+8) 401 2 5 62
    2009-07-28 00:11:07 10.0.1.245 GET /Abs/Int/Handler/D-0c38-0c39.lsabs - 443 ASIA\**** 10.0.2.79 Mozilla/4.0+(compatible;+MSIE+7.0;+Windows+NT+6.0;+SLCC1;+.NET+CLR+2.0.50727;+.NET+CLR+3.0.04506;+InfoPath.2;+MS-RTC+LM+8) 200 0 0 62
    2009-07-28 00:14:04 10.0.1.245 GET /Abs/Int/Handler/D-0c38-0c3a.lsabs - 443 - 10.0.2.86 Mozilla/4.0+(compatible;+MSIE+7.0;+Windows+NT+6.0;+SLCC1;+.NET+CLR+2.0.50727;+.NET+CLR+3.0.04506;+InfoPath.2;+MS-RTC+LM+8) 401 2 5 62
    2009-07-28 00:14:04 10.0.1.245 GET /Abs/Int/Handler/D-0c38-0c3a.lsabs - 443 ASIA\**** 10.0.2.86 Mozilla/4.0+(compatible;+MSIE+7.0;+Windows+NT+6.0;+SLCC1;+.NET+CLR+2.0.50727;+.NET+CLR+3.0.04506;+InfoPath.2;+MS-RTC+LM+8) 404 0 64 78
    2009-07-28 00:14:04 10.0.1.245 GET /Abs/Int/Handler/D-0c38-0c39.lsabs - 443 - 10.0.2.86 Mozilla/4.0+(compatible;+MSIE+7.0;+Windows+NT+6.0;+SLCC1;+.NET+CLR+2.0.50727;+.NET+CLR+3.0.04506;+InfoPath.2;+MS-RTC+LM+8) 401 2 5 62
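
One way to get an overview of entries like these is to tally the address book requests by client, authentication state, and status code. The following is a minimal Python sketch, not part of the original troubleshooting. It assumes the field order seen in the sample lines above (date, time, server IP, method, URI stem, query, port, username, client IP, user agent, status, substatus, Win32 status, time taken), and the names `parse_line` and `tally_abs_requests` are hypothetical.

```python
from collections import Counter

# Field order assumed from the sample log lines above; a real log
# declares its own order in a "#Fields:" header, which may differ.
FIELDS = [
    "date", "time", "s_ip", "method", "uri_stem", "uri_query", "s_port",
    "username", "c_ip", "user_agent", "status", "substatus",
    "win32_status", "time_taken",
]


def parse_line(line):
    """Return a dict for one log entry, or None for headers and odd lines."""
    if line.startswith("#"):
        return None
    parts = line.split()
    if len(parts) != len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))


def tally_abs_requests(lines):
    """Count /Abs/ requests per (client IP, authenticated?, HTTP status)."""
    counts = Counter()
    for line in lines:
        rec = parse_line(line)
        if rec is None or not rec["uri_stem"].lower().startswith("/abs/"):
            continue
        # IIS logs "-" as the username when the request carried no credentials.
        authenticated = rec["username"] != "-"
        counts[(rec["c_ip"], authenticated, rec["status"])] += 1
    return counts
```

Passing an open log file to `tally_abs_requests` and printing the sorted items separates anonymous requests (username `-`) from authenticated ones, which makes it easier to see which clients end on a 404 rather than a 200.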

    I've run tracing and see the following events again and again:

    ModuleName IIS Web Core
    Notification 2
    HttpStatus 401
    HttpReason Unauthorized
    HttpSubStatus 2
    ErrorCode 2147942405
    ConfigExceptionInfo  
    Notification AUTHENTICATE_REQUEST
    ErrorCode Access is denied. (0x80070005)
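
The two ErrorCode values in this trace are the same error in different notations: 2147942405 is the decimal form of 0x80070005. A small Python sketch (the helper name `decode_hresult` is hypothetical) shows the conversion and splits the value into the standard HRESULT fields:

```python
def decode_hresult(value):
    """Split a 32-bit HRESULT into its severity, facility, and code parts."""
    value &= 0xFFFFFFFF  # treat the input as an unsigned 32-bit number
    return {
        "hex": f"0x{value:08X}",
        "failure": bool(value >> 31),       # top bit set means failure
        "facility": (value >> 16) & 0x7FF,  # 11-bit facility field
        "code": value & 0xFFFF,             # low 16 bits: the error code
    }


print(decode_hresult(2147942405))
# {'hex': '0x80070005', 'failure': True, 'facility': 7, 'code': 5}
```

Facility 7 is FACILITY_WIN32 and code 5 is ERROR_ACCESS_DENIED, which matches the "Access is denied" text in the trace.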

    When I look at an affected client's trace log there is a lot of information, little of which makes sense to me, but this may or may not be related:
    07/29/2009|09:53:56.669 15F4:1774 TRACE :: CUccLogicalSubscription::ProcessCategoryDataCollection - Presentity object not found for [sip:****@asia.apco.net], this 0414B990
    07/29/2009|09:53:56.669 15F4:1774 TRACE :: CUccLogicalSubscription::ProcessCategoryDataCollection - Presentity object not found for [sip:****@asia.apco.net], this 0414B8F0
    07/29/2009|09:53:56.669 15F4:1774 TRACE :: CUccLogicalSubscription::ProcessCategoryDataCollection - Presentity object not found for [sip:****@asia.apco.net], this 0414B850
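
To see which contacts these trace entries refer to and how often, the SIP URIs can be pulled out and counted. This is a rough Python sketch that assumes the message wording shown in the lines above. The function name `count_missing_presentities` is hypothetical, and nothing here says whether the entries are actually related to the problem.

```python
import re
from collections import Counter

# Matches the trace message seen above and captures the SIP URI in brackets.
PATTERN = re.compile(r"Presentity object not found for \[(sip:[^\]]+)\]")


def count_missing_presentities(lines):
    """Count 'Presentity object not found' entries per SIP URI."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            # Lower-case the URI so the same address is not counted twice.
            counts[match.group(1).lower()] += 1
    return counts
```

Comparing the resulting URIs against the list of contacts that show as "Presence Unknown" would indicate whether the two sets overlap.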

    I've looked through the post below, but we don't get any address book errors from the client:

    http://ucnoevil.blogspot.com/2008/03/address-book-download-issues-vista.html

    I've tried deleting the cache on clients and signing in again, without any luck.

    Two things have occurred that differentiate the OCS R2 instance where we have the problem from those where we don't.

    The first is the RtcQmsAgent issue described at the link below, which was corrected as per the blog post (I also made sure I installed MSMQ on the other servers to prevent someone doing a careless update and breaking everything like I did):

    http://blog.tiensivu.com/aaron/archives/1867-RtcQmsAgent-fails-to-start-on-OCS-2007-R2-server-and-causes-KB-967831-April-2009-update-for-Front-End-Server-components-install-to-fail.html

    The other issue is that on boot I get the error below from the affected server. I don't understand why, if it's an issue with certificates, some clients hosted on the Asia OCS server have no problem viewing presence and some do:

    Log Name:      System
    Source:        Schannel
    Date:          7/29/2009 7:04:56 AM
    Event ID:      36870
    Task Category: None
    Level:         Error
    Keywords:      Classic
    User:          N/A
    Computer:      *******.asia.****.net
    Description:
    A fatal error occurred when attempting to access the SSL server credential private key. The error code returned from the cryptographic module is 0x80090011.
    Event Xml:
    <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
      <System>
        <Provider Name="Schannel" />
        <EventID Qualifiers="49152">36870</EventID>
        <Level>2</Level>
        <Task>0</Task>
        <Keywords>0x80000000000000</Keywords>
        <TimeCreated SystemTime="2009-07-28T23:04:56.000Z" />
        <EventRecordID>28986</EventRecordID>
        <Channel>System</Channel>
        <Computer>*******.asia.****.net</Computer>
        <Security />
      </System>
      <EventData>
        <Data>server</Data>
        <Data>80090011</Data>
      </EventData>
    </Event>

    Clearly this is an error on my part, as we have two perfectly functional instances, but I've been looking at this for two weeks, and if anyone can give me any direction I'd be grateful.

    Thanks,

    Jim
    • Edited by Jim Bullock Wednesday, July 29, 2009 5:09 AM
    Wednesday, July 29, 2009 4:09 AM