Configuration data discrepancies in automation rules
Incident Report for Jira Service Management
Postmortem

SUMMARY

On February 29, 2024, between 05:11 and 08:46 UTC, Automation Rules belonging to customers of Jira Software, Jira Service Management, Jira Work Management, Jira Product Discovery, and Confluence were updated with incorrect data. This was caused by an incomplete feature flag rollout and bugs in the data upgrade code. 99.99% of affected Automation Rules were remediated by 17:00 UTC on February 29, 2024. The remaining 0.01% of Rule Components had been edited by users during the impact window and required confirmation before being restored, to ensure those changes weren’t overridden. These customers were contacted proactively.
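
The split between automatically remediated Rules and Rules needing confirmation can be pictured with a minimal sketch. It is an illustration only, not the tooling used during the incident; the `last_user_edit` field and the `plan_restore` function are hypothetical names, and only the 05:11 UTC start time comes from this report.

```python
from datetime import datetime, timezone

# Start of the impact window, as stated in the summary (05:11 UTC, February 29, 2024).
INCIDENT_START = datetime(2024, 2, 29, 5, 11, tzinfo=timezone.utc)


def plan_restore(rules: list[dict]) -> tuple[list[str], list[str]]:
    """Split rule ids into (auto_restore, needs_confirmation).

    Each rule is a dict with an "id" and an optional "last_user_edit"
    datetime (hypothetical field: the last time a user saved the rule).
    """
    auto_restore, needs_confirmation = [], []
    for rule in rules:
        edited = rule.get("last_user_edit")
        if edited is not None and edited >= INCIDENT_START:
            # A user changed the rule after the bad upgrade began, so a
            # blind restore from backup could overwrite that change.
            needs_confirmation.append(rule["id"])
        else:
            auto_restore.append(rule["id"])
    return auto_restore, needs_confirmation
```

Under these assumptions, only Rules in the second list would need a manual check with the customer before any backup data is written back.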

IMPACT

Some Rules containing Issue Edit, Issue Create, or Issue Clone actions in Jira products had the Advanced Fields section removed. These Rules continued to run as if the Advanced Fields section were empty. In addition, for affected Issue Edit actions, Jira notifications were not sent.

Some Rules using the Send Email action in Jira and Confluence had the “From name” field removed. Emails continued to be sent but fell back to the Automation default for the “From name” field.

ROOT CAUSE

We were rolling out a new feature for Automation that required data upgrades. These upgrades carried risk because the current system runs them automatically upon Rule activity. To minimize that risk, we used feature flags to control when the upgrades occurred.

However, an error in the feature flag configuration caused the upgrades to start immediately. Additionally, bugs in these upgrades overrode customers' saved values with default values. Affected Rules ran with this incorrect configuration until recovery.
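
The two failure modes described above (a flag that enabled the upgrade too early, and an upgrade that replaced saved values with defaults) can be illustrated with a minimal sketch. It is not the actual Automation code; the flag name, the field names, the defaults, and the schema version numbers are all assumptions made for the example.

```python
from copy import deepcopy

# Hypothetical defaults for the new schema version; not taken from the report.
NEW_FIELD_DEFAULTS = {"advanced_fields": "", "from_name": None}

# Hypothetical flag name.
UPGRADE_FLAG = "automation.rule-upgrade"


def is_upgrade_enabled(flags: dict) -> bool:
    """Fail closed: only an explicit boolean True enables the upgrade."""
    return flags.get(UPGRADE_FLAG) is True


def upgrade_component(component: dict) -> dict:
    """Return an upgraded copy, filling defaults only for missing keys."""
    upgraded = deepcopy(component)
    for key, default in NEW_FIELD_DEFAULTS.items():
        # setdefault never overwrites a value the customer already saved.
        upgraded.setdefault(key, default)
    upgraded["schema_version"] = 2
    return upgraded


def load_component(component: dict, flags: dict) -> dict:
    """Upgrade on rule activity only when the flag explicitly allows it."""
    if component.get("schema_version", 1) >= 2:
        return component  # already upgraded
    if not is_upgrade_enabled(flags):
        return component  # flag missing, off, or malformed: leave data alone
    return upgrade_component(component)
```

In this sketch a missing or malformed flag value leaves the stored data untouched, and even an enabled upgrade only fills in keys that are absent.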

REMEDIAL ACTION PLAN & NEXT STEPS

We are prioritizing the following improvement actions to avoid repeating this type of incident:

  • Changing the approach for upgrading Rule configurations to allow better testing and prevent accidental upgrades.
  • Improving feature flag rollout and verification processes to avoid such incorrect configurations.
  • Increasing the frequency of backups of the Automation Rule data.
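
One way to picture the "better testing" of upgrades in the first item is a dry run that applies an upgrade to copies of Rule components and reports any saved value that would be lost. This is a sketch under stated assumptions, not a description of the planned approach; `find_lost_values`, `dry_run`, and the component layout (a dict with an `id` key) are hypothetical.

```python
from typing import Callable


def find_lost_values(before: dict, after: dict) -> list[str]:
    """List keys whose saved (non-empty) value changed or disappeared."""
    lost = []
    for key, value in before.items():
        if key == "schema_version":
            continue  # expected to change during an upgrade
        if value not in (None, "") and after.get(key) != value:
            lost.append(key)
    return sorted(lost)


def dry_run(
    components: list[dict], upgrade: Callable[[dict], dict]
) -> dict[str, list[str]]:
    """Run the upgrade on shallow copies and report lost fields per component id."""
    report = {}
    for component in components:
        lost = find_lost_values(component, upgrade(dict(component)))
        if lost:
            report[component["id"]] = lost
    return report
```

An empty report from such a dry run would be a reasonable precondition for enabling the upgrade; a non-empty one names the components and fields that would be overridden.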

We apologize to customers who were impacted during this incident; we are taking immediate steps to improve the reliability of Automation.

Thanks,

Atlassian Customer Support

Posted Apr 05, 2024 - 01:04 UTC

Resolved
All the impacted customers have been contacted and all the impacted rules have been fixed. We are now marking this issue as resolved.
Posted Mar 06, 2024 - 04:52 UTC
Update
We have successfully restored the majority of the affected automation rules and are manually fixing the remaining ones. Our support team will contact the customers impacted by this issue to resolve their cases.
Posted Mar 01, 2024 - 14:52 UTC
Update
This issue has been mitigated, and we have restored the majority of automation rules that were affected. Those rules are working as normal, including the rules that had not been modified in the last 48 hours and the rules created after the incident. The remaining rules that haven't yet been restored are cases where customers have manually updated the rules after the incident happened.
We are proceeding with caution to avoid overwriting these changes.
Posted Mar 01, 2024 - 03:17 UTC
Update
We are continuously working on restoring the affected automation rules configuration data. At the moment the majority of the impact has already been mitigated and the team is currently focusing on a few specific outstanding scenarios. We will continue to update you as we progress.
Posted Feb 29, 2024 - 17:30 UTC
Update
We are currently working on restoring the lost automation rules configuration data. The email notifications related to the automation will be addressed once the lost data is restored. Customers can update their Automation rule configurations themselves, and our fix will not override those changes. We will share further updates as we progress.
Posted Feb 29, 2024 - 15:20 UTC
Identified
We have identified the root cause and are currently working on deploying a fix to restore any lost configuration/rules. The next update will be provided in an hour.
Posted Feb 29, 2024 - 10:54 UTC
Update
We continue to investigate the issue with the Automation rules losing configuration data that is impacting certain cloud products. We are actively working to resolve this issue as quickly as possible.
Posted Feb 29, 2024 - 09:23 UTC
Investigating
We are investigating an issue with the Automation rules losing configuration data that is impacting some of the Cloud products. We are actively working to resolve this issue as quickly as possible.
Posted Feb 29, 2024 - 08:22 UTC
This incident affected: Automation for Jira.