Following on from my previous article: after swapping a faulty disk in the Front-End Pool server, the server refused to come online and kept hitting a BSOD (Blue Screen of Death), rebooting itself before it could reach the desktop. Despite attempts to repair the operating system, the system just couldn’t come back online.

Hence, I decided to start fresh (since we’re running an EE Pool with SQL as the back end). After rebuilding the entire system and getting the Lync services up and running, to my amazement none of the users in the environment was able to connect to the server. The behavior was:

[Screenshot: Unable to Sign In to Lync 01]

Although the machine has been joined to the domain, the Lync client still prompts for another round of authentication.

[Screenshot: Unable to Sign In to Lync 02]

After entering the correct password, the Lync client still reports that the user credentials are incorrect and doesn’t allow any user to log on.

That led me to suspect that my earlier attempt to repair the RTC database had caused this incident. To confirm this, I created another dummy account and SIP-enabled it; the problem persisted.

The following command was also run against both an existing account and the new SIP-enabled dummy account:

Get-CsUser -Identity "sip:AccountName@mydomain.com"

Identity : CN=AccountName ,OU=MyOU, DC=Child, DC=MyDomain, DC=com
VoicePolicy : Some Call Policy
VoiceRoutingPolicy :
ConferencingPolicy :
PresencePolicy :
DialPlan :
LocationPolicy :
ClientPolicy :
ClientVersionPolicy :
ArchivingPolicy :
ExchangeArchivingPolicy : Uninitialized
PinPolicy :
ExternalAccessPolicy :
MobilityPolicy :
PersistentChatPolicy :
UserServicesPolicy :
HostedVoiceMail :
HostedVoicemailPolicy :
HostingProvider : SRV:
RegistrarPool : Pool1.child.mydomain.com
Enabled : True
SipAddress : sip:Accountname@mydomain.com
LineURI : tel:+XXXX
EnterpriseVoiceEnabled  : True
ExUmEnabled : True
HomeServer : CN=LcServices,CN=Microsoft,CN=1:17,CN=Pools,CN=RTCService,CN=Microsoft,CN=System,DC=mydomain,DC=com
DisplayName : Account Display Name
SamAccountName : AccountSAMName

However, registration failed for both accounts when tested with the following cmdlet:

Test-CSRegistration -UserSipAddress AccountName@mydomain.com -TargetFQDN Pool1.child.mydomain.com

Output:

Target Fqdn : Pool1.Child.MyDomain.com

Result : Failure

Latency : 00:00:00

Error Message : 404, Not Found

Diagnosis : ErrorCode=1003,Source=sip.chassasia.com,Reason=User does not exist, destination=mydomain.com Microsoft.Rtc.Signaling.DiagnosticHeader

Querying an existing account with DBAnalyze returned the following:

>dbanalyze.exe /sqlserver:Pool1.child.mydomain.com\RTCLOCAL /report:user /user:sip:AccountName@mydomain.com

Output:

User : jamesooi@mydomain.com
————————————————
Resource Id : 99
Database Type : Usc Db
Registrar Pool : NULL
Usc Pool : NULL
GUID : NULL
SID : NULL
Display Name : NULL
Enabled : False
OptionFlags :  0x0
ArchivingFlags :
ForwardingUrl : NULL
MovingAway : False
Contact Version : 1104

I was stunned that almost every value returned from the database was NULL, which means none of the information published by Active Directory was being written into the RTC database. Either Active Directory or the SQL database(s) had gone faulty.

Hence, further investigation was needed to identify the root cause:

  1. OCS Logger Tool

Enabling the following traces allows us to observe the actual process when a user attempts to sign in:

  • SIPStack
  • S4
  • Web Infrastructure
  • User Services

Output in the Traces window:

[Screenshot: Unable Sign in to Lync 003]

Line 1: —-Odbc State: 42000, Severity: 11, Native: 50010, Sproc: DbRaiseError, Line: 20, Sql State: 1, Message: [Microsoft][SQL Server Native Client 11.0][SQL Server]###50010:CertStoreGetPublishedCert:jamesooi@mydomain.com is not found in this database.—-

Line 2: ^^^^ CertStoreGetPublishedCert sproc execution failed : ExecHr = [hr=S_OK], NativeError = [50010], NativeErrorSeverity = [11], NativeErrorLineNumber = [20], NativeErrorSqlState = [1], OdbcSqlState = [42000], ErrorText = [[# [Microsoft][SQL Server Native Client 11.0][SQL Server]###50010:CertStoreGetPublishedCert:jamesooi@mydomain.com is not found in this database. #]]^^^^.

Line 3: ( 0000000007105720 ) Replying with 403 hr[S_OK] Ms-diag[4005] AuthzId [jamesooi@mydomain.com] Reason[]

In the Messages window, there are plenty of 401 Unauthorized and 403 Forbidden errors:

[Screenshot: Unable Sign in to Lync 004]

Based on these error messages, the next component to examine was the SQL Server database service.

  2. SQL Profiler

SQL Profiler is used here to check whether Lync and SQL Server are actually communicating. As we needed to reproduce the problem, another account was created in Active Directory and enabled in the Lync Server Control Panel. In this trace:

[Screenshot: Unable to Sign in to Lync 005]

Line 1: Beginning of the trace. Lync Server 2013 is connected to the SQL Server instance and executes commands

SP : StmtStarting – raiserror (@Message, @Severity, @_State, @SprocName, @_Param1, @_Param2, @_Param3)

Line 2: Error message raised by the stored procedure; SQL Server was not able to locate the entry in the database

SP: User Error Message – ###50010:ReportUserData:danny@mydomain.com is not found in this database

Line 3: Trace ends

SP : StmtCompleted – raiserror (@Message, @Severity, @_State, @SprocName, @_Param1, @_Param2, @_Param3)

Thus, SQL Profiler showed that Lync Server 2013 was communicating properly with SQL Server 2012. But this left us at our wits’ end: if Lync Server 2013 is interacting with SQL Server as expected, why isn’t Lync Server able to write or pick up the correct information?

Finally, we made another attempt at forcing the user database to update, hoping that it would give us a different result. When we ran Get-CsUserDatabaseState, the following output was generated:

Identity : UserDatabase:MY-DB02.child.mydomain.com

Online   : True

Subsequently, we ran Get-CsUserReplicatorConfiguration, and this was the only result returned:

Identity : Global
ADDomainNamingContextList : {dc=mydomain,dc=com}
ReplicationCycleInterval  : 00:01:00

At that point I realized that my child domain was no longer part of the User Replicator configuration. The entry had probably gone missing or been corrupted during the repair of the RTC database (needs to be confirmed).

So, we went ahead and added the child domain entry to the User Replicator configuration:

Set-CsUserReplicatorConfiguration -Identity global -ADDomainNamingContextList @{Add="DC=child,DC=mydomain,DC=com"}

Update-CSUserDatabase
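
For anyone who wants to check for this condition before changing anything, here is a minimal sketch of the same fix with a guard in front of it. The two function names are hypothetical helpers of my own, not part of the Lync Server module; only the Get-, Set- and Update- cmdlets are the real ones used above. It also assumes, as I understand the cmdlet documentation, that an empty ADDomainNamingContextList means the User Replicator covers every domain, so an empty list is treated as "nothing missing".

```powershell
# Hypothetical helper, not part of the Lync Server module.
# Returns $true when the list is non-empty and does not contain the given DN.
function Test-NamingContextMissing {
    param(
        [string[]]$ContextList,
        [Parameter(Mandatory = $true)][string]$DomainDn
    )

    # Assumption: an empty list means "replicate every domain", so nothing is missing.
    if ($null -eq $ContextList -or $ContextList.Count -eq 0) { return $false }

    # Strip whitespace so "DC=child, DC=mydomain" matches "DC=child,DC=mydomain".
    $wanted = $DomainDn -replace '\s', ''
    $known  = $ContextList | ForEach-Object { $_ -replace '\s', '' }

    # -notcontains compares strings case-insensitively by default.
    return ($known -notcontains $wanted)
}

# Hypothetical wrapper around the cmdlets used in this article.
# Must be run from the Lync Server Management Shell.
function Add-MissingNamingContext {
    param([Parameter(Mandatory = $true)][string]$DomainDn)

    $config = Get-CsUserReplicatorConfiguration -Identity Global
    if (Test-NamingContextMissing -ContextList $config.ADDomainNamingContextList -DomainDn $DomainDn) {
        # Add the domain, then force the user database to resynchronize.
        Set-CsUserReplicatorConfiguration -Identity Global -ADDomainNamingContextList @{Add = $DomainDn}
        Update-CsUserDatabase
    }
}
```

In our case the call would have been Add-MissingNamingContext -DomainDn 'DC=child,DC=mydomain,DC=com'. Nothing is changed when the child domain is already listed, so it should be safe to run more than once.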

After that, the Test-CsRegistration cmdlet succeeded, and all users were able to log on to the system!

Test-CsRegistration -UserSipAddress jamesooi@mydomain.com -TargetFqdn  MY-LYNCP1.child.mydomain.com

Target Fqdn : MY-LYNCP1.child.mydomain.com

Result : Success

Latency : 00:00:02.6059438

Error Message :

Diagnosis :

It took us almost 3 to 4 days to look into this issue, assuming that the interaction between Lync and SQL Server had gone wrong. It turned out that a User Replicator configuration entry had simply gone missing, so the SIP-enabled user information could not be seen and replicated by Lync Server 2013.