Customers may be missing some responses starting on Sunday, April 9
Incident Report for ISO ClaimSearch
Postmortem

RETROSPECTIVE SUMMARY:

Incident date: 4/10/2023

Resolution date:  4/10/2023

Retrospective date: 4/14/2023 (delayed due to resource availability)

ATTENDEES:  Kanika, Mangesh, Rebecca, Mohan, Dean, Praveen, Praphul, Sijo, Paresh, Arvind, Rita, Ram, Mohammed, Tim.

CUSTOMER IMPACT:  Customers reported missing responses – Claims were lost in processing for some customers.

ROOT CAUSE: In October 2022, XML schema changes and related enhancements across the entire system were deployed in the acceptance environment. The changes to the response schema file removed two optional fields that were relevant only in one specific case: an individual claims party with more than one coverage on a claim, where both coverages had both a close date and a suit filed indicator. This case is rare and seldom occurs in the test system, but it affects some of our claims in production, particularly those submitted by companies with closed dates. The defect in the response schema file, which is used to check the XML responses going from ClaimSearch to the customer, was identified in testing in November 2022. The changes were deployed in production on Sunday morning, April 9, 2023.

In October 2022, XML schema changes, along with related enhancements across the entire system, were deployed in the acceptance environment. A defect was identified during testing in November 2022, specifically in the check for the XML responses going from ClaimSearch to the customer.
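The failure mode described above can be illustrated with a minimal sketch of a regression check that exercises the rare case (one claims party, two coverages, each with both a close date and a suit filed indicator). The element names, the sample response, and the function name are hypothetical, since the actual ClaimSearch response schema is not shown in this report, and the check is a simplified stand-in for full XML schema validation.

```python
import xml.etree.ElementTree as ET

# Hypothetical field names; the real response schema is not part of this report.
FULL_COVERAGE_FIELDS = {"CoverageType", "CloseDate", "SuitFiledIndicator"}

# Hypothetical fixture for the rare case: one claims party with two coverages,
# both carrying a close date and a suit filed indicator.
RARE_CASE_RESPONSE = """
<Response>
  <ClaimsParty>
    <Coverage>
      <CoverageType>COLL</CoverageType>
      <CloseDate>2023-01-15</CloseDate>
      <SuitFiledIndicator>Y</SuitFiledIndicator>
    </Coverage>
    <Coverage>
      <CoverageType>LIAB</CoverageType>
      <CloseDate>2023-02-01</CloseDate>
      <SuitFiledIndicator>N</SuitFiledIndicator>
    </Coverage>
  </ClaimsParty>
</Response>
"""


def unexpected_coverage_fields(response_xml, allowed_fields):
    """Return the sorted names of Coverage child elements not in allowed_fields."""
    root = ET.fromstring(response_xml.strip())
    unexpected = set()
    for coverage in root.iter("Coverage"):
        for child in coverage:
            if child.tag not in allowed_fields:
                unexpected.add(child.tag)
    return sorted(unexpected)


if __name__ == "__main__":
    # With the full field set, the rare-case response is accepted.
    print(unexpected_coverage_fields(RARE_CASE_RESPONSE, FULL_COVERAGE_FIELDS))
    # With the two optional fields removed, the same response is flagged.
    print(unexpected_coverage_fields(RARE_CASE_RESPONSE, {"CoverageType"}))
```

Keeping a fixture like this in the acceptance test suite would make the rare case occur on every run instead of depending on it appearing in test traffic.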

MONITORING: Are there any improvements in monitoring and alerts? These alerts currently go to an MS Teams channel only. An action item has been captured to page the on-call teams through PagerDuty.
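As one possible shape for that action item, the sketch below builds a trigger event in the format used by the PagerDuty Events API v2. It assumes that API would be the integration point, which this report does not state. The routing key, summary text, and source name are placeholders, and the function only builds the event body without sending it.

```python
import json


def build_pagerduty_event(routing_key, summary, source, severity="critical"):
    """Build a trigger event body in the PagerDuty Events API v2 format."""
    return {
        "routing_key": routing_key,      # integration key of the on-call service
        "event_action": "trigger",       # opens a new alert
        "payload": {
            "summary": summary,          # short description shown to the responder
            "source": source,            # system that raised the alert
            "severity": severity,        # critical, error, warning, or info
        },
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    event = build_pagerduty_event(
        routing_key="PLACEHOLDER_ROUTING_KEY",
        summary="XML response validation failures detected",
        source="xml-response-check",
    )
    print(json.dumps(event, indent=2))
```

The same alert condition that posts to the MS Teams channel could produce this event, so both destinations receive the alert.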

Are there any improvements in the troubleshooting process that could reduce the resolution time? No.

CHANGE: Was this incident caused by a change? Yes

Was the change tested in Production? No

Was the change tested in Acceptance? Yes – the defect was logged but not fixed prior to production deployment

Is there any due diligence that needs to be done to avoid similar incidents caused by a change? Yes. All changes need to flow through the BRP and receive proper prioritization, with notification of all development and support teams. The extended timeline of this particular change was also a contributing factor in deploying a known defect.

RCA CATEGORY: Defect (this was identified but not fixed.)

CORRECTIVE ACTION: The schema file was reverted.

The Aviators team reviewed the schema file, determined which attribute was missing, applied it to the current schema, redeployed and tested it in acceptance, and then redeployed it in production.

 

PREVENTIVE ACTION ITEMS BY POINT OF FAILURE:

Posted Sep 25, 2023 - 16:41 EDT

Resolved
This incident has been resolved. Customers who are missing responses should re-send the impacted claims.
Posted Apr 11, 2023 - 10:00 EDT
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Apr 10, 2023 - 21:19 EDT
Identified
The root cause has been identified. We are implementing a fix.
Posted Apr 10, 2023 - 16:54 EDT
This incident affected: System to System Interfaces (XML).