none
IaaS deployment script stopped working suddenly as of September 1 RRS feed

  • Question

  • I and a colleague are using the IaaS deployment script (different configuration files to prevent conflicts) and have been doing so without issue until recently.  I last ran it successfully on Monday August 31.  My colleague has been having issues since yesterday (Tuesday September 1).  When he brought it to my attention today, I found that I too was running into the same errors with the identical inputs used on Monday; we run into errors when it comes to promoting the domain contrller.  The error we are getting is as follows:

    [Information]09/01/2015 16:41:05 - Promoting VM DomainController as a domain 
    controller, the domain FQDN is hpcdomain.local.
    [Warning]09/01/2015 16:41:08 - Remote PowerShell Call to DomainController failed (ErrorCode=-2144108102): Connecting to remote server 
    localhpcsvc.cloudapp.net failed with the following error message : The 
    SSL connection cannot be established. Verify that the service on the 
    remote host is properly configured to listen for HTTPS requests. Consult 
    the logs and documentation for the WS-Management service running on the 
    destination, most commonly IIS or WinRM. If the destination is the WinRM 
    service, run the following command on the destination to analyze and 
    configure the WinRM service: "winrm quickconfig -transport:https". For 
    more information, see the about_Remote_Troubleshooting Help topic.
    
    Has something changed in Azure which is resulting in these errors?  We're at a loss as to how things suddenly stopped working.

    Wednesday, September 2, 2015 7:44 PM

All replies

  • After a day spent trying to debug this error, it is still an issue.  There appears to be a certificate error on the server itself.  When running the winrm -quickconfig -transport:https command, the same error number results and the following error message appears.

    Running netstat -an on the Vm in question confirms that the server is not listening on the ports specified by the IaaS deployment script

    Thursday, September 3, 2015 8:52 PM
  • Hi KWilliams1,

    This error could be caused by an invalid/corrupted cloud service certificated imported under Cert:\LocalMachine\My used as the SSL cert. I would suggest you remove the DC VM and the Cloud Service entirely, and then rerun the IaaS Deployment script to regenerate the certificate for a retry.

    Meanwhile, please make sure the latest IaaS Deployment scripts (download page) are used for the deployment.

    BR,

    Yutong Sun

    Friday, September 4, 2015 1:06 PM
    Moderator
  • It's odd but as suddenly as it started acting up, it stopped this morning.  No (additional) changes were made to the scripts.  It started by testing it out on a new machine, which automatically worked.  When going back to the original machines on which the scripts were initially run, it started working.

    I note that initially we were using the the certificates rather than Add-AzureAccount to workaround the issue of having to log in several times a day.  Perusing the Azure blog, I found the SDK for Azure Resource Manager was announced on September 1.  Is this in any way related?  I have read that certificates are being phased out and everything is moving to ARM eventually.

    Friday, September 4, 2015 9:37 PM
  • The management certificate used for Azure PowerShell cmdlets by the IaaS deployment script is different from the service certificate which is generated in the Cloud Service and imported on the VM in the Cloud Service. The original transient error was only related to the service certificate which is used for the HTTPS communication for PowerShell remoting.

    BR,

    Yutong Sun

    Sunday, September 6, 2015 6:31 AM
    Moderator
  • The cloud service itself was created by the IaaS script so they were new.  I didn't think to check the certificates of those services (they have long since been removed and I am unable to check now) when the error was occurring.  I have re-run the script now that it's working and the certificates have expiration dates of November 25, 2023.  I checked on the cloud certificates as outlined here.  Is there any additional information on the certificates in question that you can point me to?

    Thursday, September 10, 2015 7:09 PM
  • Please refer here for managing certificates on Azure. For the previous transient certificate issue, I currently don't have clue about the root cause. If such error happens again, I suggest you keep the Cloud Service and the VM and then open an Azure support ticket about the certificate problem which leads to the Powershell remoting failure.

    BR,

    Yutong Sun

    Friday, September 11, 2015 10:24 AM
    Moderator