This article is aimed to provide SDK users (mostly Member Associations and the TMS) an easy process for troubleshooting various issues related to sending and receiving the Clearing House (CH) messages, i.e. Domestic Transfer Declaration (DTD), First Pro Registration (FPR) and Proof of Payment (POP). Note, that this is the first version of the instruction and some of the actions specified may require contact with Connect ID Support (support.id@fifa.org).
In most cases, in order to troubleshoot problems, Connect ID SDK logs are required. Due to the nature of the SDK, where the logs are stored, in what form and how exactly they look like, depends on the client (MA) implementation. Some store logs in files, some in DB and others will send them to MS Application Insights. In order to retrieve logs you need to locate the place where they are stored. This is based on your implementation of logger described in the chapter 2.3.4 Logger of the SDK documentation.
In most cases a problem can be easily detected and fix provided thanks to error handling. Connect ID SDK validates the message prior to sending and the TMS validates the message upon receiving it (and sends back an explanation about a potential issue found). The problem occurs, however, when MA tries to send a CH message and seems not to receive a reply from TMS within the expected 30 seconds. Since there may be multiple reasons for this behaviour, troubleshooting such situations may pose a challenge.
Prerequisites:
- Logger interface is implemented as part of the integration with the Connect ID SDK. For details refer to chapter 2.3.4 Logger of the SDK documentation
- SDK user familiarised themselves with the main Clearing House article.
- SDK user read chapter 6. [Optional] Transfer Matching System (TMS) / Clearing House integration of the SDK documentation and implemented integration with the TMS/ClearingHouse according to recommendations given there.
Identify the major area of a problem
The very first step of the troubleshooting process will be to understand where & when the message has been lost. There are 3 possible options here:
- Option 1 - the original message did not leave MA
- Option 2 - message left MA but the return message did not leave TMS
- Option 3 - message left MA, return message left TMS but there is still a time-out on the MA side
There is an API method in the Service Bus API (/api/support/support/getmessagehistory) that checks the message history and provides the answer to the question above.
Before you use it, you need to find the correlation ID of your message. It's an [almost] unique message identifier which has a very useful characteristic: message reply (from TMS) has the same correlation ID as the original message. It allows us to properly track the lost message.
Correlation ID is always logged as "CID" in logs. To find it please search in logs (around the timestamp when message was supposed to be sent) for phrase "Sending..." or "Failed to ". For example in PHP for Domestic Transfer Declaration it may look like this:
[2022-05-09 14:36:39][DEBUG] [CID: 62790ad34f7f8] [ReqType: domestic-transfer-declaration] [Req: DomesticTransferDeclaration (Player's FIFA_ID=1KG6H61)] Sending...
or
[2022-05-09 14:36:39][ERROR] [CID: 62790ad34f7f8] Failed to send Domestic Transfer Declaration {"exception":"Fifa\\ConnectId\\Sdk\\Tms\\DeclarationRejectedException: Domestic Transfer Declaration was rejected. Reason: ... in ... Stack trace: ..."}
In the examples above Correlation ID is equal to 62790ad34f7f8.
Now that we have a correlation ID, we can go back to Swagger and use the tool for message tracking. There are 3 possible outcomes of the tool, corresponding 1 to 1 with the issue areas specified above:
- Option 1 - no message log was returned - it means that the message did not leave MA, i.e. was not properly sent to TMS
- Option 2 - single message log was returned - message successfully left MA but return message was not sent by TMS
- Option 3 - two message logs were returned - message left MA and the return message left TMS but it was not properly processed back on the MA side
The three chapters below will help you troubleshoot the problem depending on the option.
Option 1 - the original message did not leave MA
This is an indication that something might be wrong with your message. SDK has an in-built validation mechanism that works on two levels:
- XSD validation - does a message form an XML that is following a prescribed XSD schema
- additional validation - does a message follow some additional rules defined by the business. For example in the FPP message there is a requirement for the existence of one and only one registration. This is an additional restriction on top of the XSD schema.
To verify this scenario, using Correlation ID try searching for log indicating that the message was sent. For example for Domestic Transfer Declaration and Correlation ID equal to 62790ad34f7f8 it may look like this:
[2022-05-09 14:36:40][DEBUG] [CID: 62790ad34f7f8] [ReqType: domestic-transfer-declaration] [Req: DomesticTransferDeclaration (Player's FIFA_ID=1KG6H61)] Sent.
If you can't find this log, then it is very likely that you will find the error logged for this Correlation ID starting with "Failed to".
[2022-05-09 14:36:39][ERROR] [CID: 62790ad34f7f8] Failed to send Domestic Transfer Declaration {"exception":"Fifa\\ConnectId\\Sdk\\Tms\\DeclarationRejectedException: Domestic Transfer Declaration was rejected. Reason: ... in ... Stack trace: ..."}
Option 2 - message left MA but the return message did not leave TMS
This indicates that the problem is not MA related. Message successfully left the MA but a return message has never been received. We recommend the following troubleshooting steps:
- check if TMS listener is up & processing messages in general - go to DevTools (URL depends on the environment), impersonate yourself as FIFA_TMS (dropdown in the top right hand corner) and choose the main listener test PE02.
- if not, check test PE04, to verify if TMS is at least able to send an empty string
- if not, check the TMS topic and the dead letter queue. TMS may not be processing messages at all.
- consider looking at the Power BI report ([PPU] Service bus messages) to see when the last message has been properly exchanged with TMS
- open the ticket with TMS and let them know of all the results of your pre-checks
Option 3 - message left MA, return message left TMS but there is still a time-out on the MA side
The problem is with the configuration on the MA side. What we know for sure is that the message was successfully sent to TMS and then the related message (with the same correlation ID) was sent back to the MA. There's a couple of potential reasons that may be causing problems on this side, so we recommend the following troubleshooting process:
- check if MA's listener work - PE04 test (DevTools) will be sufficient to prove that the listener works and is processing messages
- if yes, check your listener logs to find the correlation ID - it's going to be the same correlation ID as for the original message sent. If it was received then there should be a log with "Received and locked a message:" for this correlation ID, outlining the message's properties, for example:
[2022-05-09 13:41:32][DEBUG] [LID: 6279003e8e84a] [CID: 62790ad34f7f8] Received and locked a message: Id: XYZ, Action: domestic-transfer-confirmation, ContentLength: 330, Properties: correlationId=54516282-8a1d-4c04-b82d-99a2c88df557, uniqueFifaId=..., version=..., BrokerProperties: LockToken=..., MessageId=..., ReplyTo=..., Sender=..., LockToken: ...
After finding this log, please note down the message Id (in this example it is "XYZ") and search for it in the logs; there are two possible logs to let you know if the message was handled properly;[2022-05-09 13:51:52][DEBUG] [LID: 6279003e8e84a] The message with ID: XYZ was processed properly. It will be deleted now.
or not;
[2022-05-09 13:52:32][ERROR] [LID: 6279003e8e84a] The message with ID: XYZ was NOT processed properly. It will be unlocked now. {"exception":"Exception: Error occurred when trying to put message in cache system. ... in ... Stack trace: ..."}
Please note: the above exception message is only an example.
- if the correlation ID cannot be found, check if there is a possible 2nd listener running. Both listeners could be reading from the same queue but writing to different log files (depending on the MA's setup of logs).
- If your infrastructure for the application using SDK consists of multiple servers or your listener and registration facade are running in separate processes, please refer to 2.4 Configure listener to verify if implementation is correct
- It might be helpful to check logs. If there are different logger IDs in logs after "Starting with the following handlers" is logged, there is a very likely possibility that a second listener is running. Example of this issue:
[2022-05-09 13:51:26][DEBUG] [LID: 6279003e8e84a] Starting with the following handlers: ... ... [2022-05-09 13:51:32][DEBUG] [LID: 6279003e8e84a] [CID: 52790bd34f7f2] Received and locked a message: ... [2022-05-09 13:51:36][DEBUG] [LID: 8459003f8a84b] Starting with the following handlers: ... ... [2022-05-09 13:51:45][DEBUG] [LID: 6279003e8e84a] [CID: 42790ad34f7f1] Received and locked a message: ...
- Please notice how in the above example chronologically the listener with id 6279003e8e84a starts and begins receiving messages, then listener with id 8459003f8a84b also starts while the first listener is still active, because it received the message.
- check if your CH message handlers have been correctly registered. This information will be visible in logs after starting your listener.
- To verify this, please search in logs for phrase "Starting with the following handlers". For example in PHP it may look like this:
[2022-05-09 13:51:26][DEBUG] [LID: 6279003e8e84a] Starting with the following handlers: Fifa\ConnectId\Sdk\MessageHandlers\PersonDetailsRequestMessageHandler (person-details-request), Fifa\ConnectId\Sdk\MessageHandlers\PersonDetailsMessageHandler (person-details), Fifa\ConnectId\Sdk\Tms\MessageHandlers\CorrelationIdContainerBasedMessageHandler (domestic-transfer-confirmation), Fifa\ConnectId\Sdk\Tms\MessageHandlers\CorrelationIdContainerBasedMessageHandler (proof-of-payment-confirmation), Fifa\ConnectId\Sdk\Tms\MessageHandlers\CorrelationIdContainerBasedMessageHandler (first-pro-registration-confirmation), Fifa\ConnectId\Sdk\MessageHandlers\GenericMessageHandler
- To verify this, please search in logs for phrase "Starting with the following handlers". For example in PHP it may look like this:
- check your cache for CH messages. It should be the same as for the personLocal, i.e. if you use in memory cache for personLocal, you can use in memory for CH messages successfully. But in many setups it is required to implement a custom cache so that your listener (responsible for saving a message in the cache upon arrival) and your TMSFacade (responsible for getting the message from cache using the correlation ID) both have a shared storage they can access. This approach is described in chapter 6.1. Using TMS facade of the SDK documentation.