{"id":4561,"date":"2025-12-18T15:21:39","date_gmt":"2025-12-18T09:51:39","guid":{"rendered":"https:\/\/blog.spike.sh\/?p=4561"},"modified":"2026-01-09T13:16:38","modified_gmt":"2026-01-09T07:46:38","slug":"postmortem-on-datadog-incidents-not-autoresolving","status":"publish","type":"post","link":"https:\/\/blog.spike.sh\/postmortem-on-datadog-incidents-not-autoresolving\/","title":{"rendered":"Postmortem on Datadog incidents not auto-resolving"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">On December 17th, 2025, we found that Datadog incidents were not auto-resolving as expected. A user reported the problem, and upon investigation, we identified an issue in our Incident Grouping logic that prevented Datadog incidents from auto-resolving. We resolved the issue, and going forward, Datadog incidents will auto-resolve as expected. We did not update our <a href=\"https:\/\/status.spike.sh\/\">status page<\/a> during the incident because it escalated quickly. However, at Spike we believe in transparency, so we&#8217;re sharing all the details of this incident with you in this blog post.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Table of Contents<\/strong><\/p>\n\n\n\n<nav aria-label=\"Table of Contents\" class=\"wp-block-table-of-contents\"><ol><li><a class=\"wp-block-table-of-contents__entry\" href=\"https:\/\/blog.spike.sh\/postmortem-on-datadog-incidents-not-autoresolving\/#summary\">Summary<\/a><\/li><li><a class=\"wp-block-table-of-contents__entry\" href=\"https:\/\/blog.spike.sh\/postmortem-on-datadog-incidents-not-autoresolving\/#impact\">Impact<\/a><\/li><li><a class=\"wp-block-table-of-contents__entry\" href=\"https:\/\/blog.spike.sh\/postmortem-on-datadog-incidents-not-autoresolving\/#timelines\">Timelines<\/a><\/li><li><a class=\"wp-block-table-of-contents__entry\" 
href=\"https:\/\/blog.spike.sh\/postmortem-on-datadog-incidents-not-autoresolving\/#response\">Response<\/a><\/li><li><a class=\"wp-block-table-of-contents__entry\" href=\"https:\/\/blog.spike.sh\/postmortem-on-datadog-incidents-not-autoresolving\/#recovery\">Recovery<\/a><\/li><li><a class=\"wp-block-table-of-contents__entry\" href=\"https:\/\/blog.spike.sh\/postmortem-on-datadog-incidents-not-autoresolving\/#lessons-learnt\">Lessons learnt<\/a><\/li><li><a class=\"wp-block-table-of-contents__entry\" href=\"https:\/\/blog.spike.sh\/postmortem-on-datadog-incidents-not-autoresolving\/#conclusion\">Conclusion<\/a><\/li><\/ol><\/nav>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"summary\">Summary<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">On 17 December 2025 at 11:38 AM UTC, a user notified us that incidents triggered from Datadog were not auto-resolving in Spike when the Datadog monitor state returned to <strong>OK<\/strong>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The same customer had earlier raised a concern on 14 December 2025 at 9:45 AM UTC via a private Slack channel. At the time, we concluded that it was likely a configuration issue on the user&#8217;s end. 
On 17 December, the user confirmed that the behavior was recurring, which prompted a deeper investigation.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>What happened<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">After the issue was confirmed, members of the Spike team joined several members of the customer&#8217;s team on a live video call to debug the problem together.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">During the session, we observed that:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Incidents were being correctly triggered in Spike when Datadog monitors moved to an alerting state.<\/li>\n\n\n\n<li>Incidents were <strong><em>not<\/em><\/strong> auto-resolving in Spike when the monitor state returned to <strong>OK<\/strong>.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">Our initial observation suggested that Datadog might not be sending the <strong>OK<\/strong> state payload to Spike, which would explain the missing auto-resolve behavior.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To gather more data, we ran a controlled test:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Datadog was configured to send both <em>triggered<\/em> and <em>OK<\/em> events.<\/li>\n\n\n\n<li>One path delivered events to Spike.<\/li>\n\n\n\n<li>Another path delivered events to Datadog\u2019s native Slack integration.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">From this test, we confirmed that Datadog was successfully sending <strong>OK<\/strong> state notifications to Slack. This showed that Datadog was emitting <strong>OK<\/strong> events and pointed the investigation towards how Spike handled them.<\/p>\n\n\n\n
<p class=\"wp-block-paragraph\">We created a ticket with P2 urgency to begin our investigation.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"impact\">Impact<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Our analysis indicates that 2,236 Datadog-triggered incidents this year were likely affected by the auto-resolve issue before the fix was deployed.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"timelines\">Timelines<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Here\u2019s the detailed timeline of events in UTC on 17th December.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>11:38 AM<\/strong><strong><br><\/strong>The customer reports that Datadog-triggered incidents are not auto-resolving in Spike.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>11:42 AM<\/strong><strong><br><\/strong>The Spike team (Damanpreet and Kaushik) discusses the report internally to assess severity and next steps.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>11:45 AM<br><\/strong>Multiple members from the customer\u2019s team and the Spike team join a video call. During the call, we review the Datadog configuration, which appears correct. 
We begin live debugging and inspect logs to identify potential issues.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>12:25 PM<\/strong><strong><br><\/strong>An investigation ticket is created to formally track the issue.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>1:50 PM<\/strong><strong><br><\/strong>The Spike team begins a focused investigation into the Datadog integration and auto-resolve behavior.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>2:35 PM<br><\/strong>We identify a bug in the Datadog integration: Spike is unable to correctly associate resolve (OK) events with the corresponding open incident. As a result, the incident is neither auto-resolved nor re-triggered correctly.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>2:43 PM<br><\/strong>A fix is implemented and deployed via a pull request. The issue is resolved.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"response\">Response<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Once the issue was confirmed to be reproducible, we acknowledged it as an incident and assigned it a <strong>P2 priority<\/strong>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We created a ticket in Linear and linked it to the corresponding Spike incident to track the investigation and follow-ups. 
A dedicated Slack thread was opened for real-time coordination, and a war room was set up to focus on reproducing the issue, collecting evidence, and narrowing down the possible causes.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">During the investigation, members of the Spike team worked closely with the affected customer, including live debugging sessions, to validate assumptions and gather additional data.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"recovery\">Recovery<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Step 1: Replication<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To reproduce the issue, we set up a controlled test environment. We created a fresh Datadog datasource, configured a new monitor, and triggered an alert that successfully created an incident in Spike. We then moved the alert to the OK state, but the incident did not auto-resolve. With the issue successfully replicated, we could move forward with confidence.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Step 2: Root Cause Analysis<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">With replication confirmed, we moved to isolate the problem by examining Spike&#8217;s architecture. Incident events first arrive at the Hooks microservice, where we identify the integration type and convert the payload to a human-readable format. These events are then sent to the escalations engine, which checks for existing open incidents. 
The engine then takes one of the following five actions:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>If a triggered event arrives and a matching incident is already open, the event is suppressed.<\/li>\n\n\n\n<li>If a resolved event arrives and a matching incident is open, the incident is resolved.<\/li>\n\n\n\n<li>If a resolved event arrives with no matching open incident, the event is discarded.<\/li>\n\n\n\n<li>If a triggered event arrives and the matching incident is already resolved, the incident is re-triggered and alerts are sent.<\/li>\n\n\n\n<li>If a triggered event arrives and no matching incident is found, a new incident is triggered and alerts are sent.<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"602\" height=\"824\" data-attachment-id=\"4572\" data-permalink=\"https:\/\/blog.spike.sh\/postmortem-on-datadog-incidents-not-autoresolving\/incident-seggregation-1\/\" data-orig-file=\"https:\/\/blog.spike.sh\/wp-content\/uploads\/2025\/12\/incident-seggregation-1.png\" data-orig-size=\"602,824\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"incident seggregation (1)\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/blog.spike.sh\/wp-content\/uploads\/2025\/12\/incident-seggregation-1.png\" src=\"https:\/\/blog.spike.sh\/wp-content\/uploads\/2025\/12\/incident-seggregation-1.png\" alt=\"\" class=\"wp-image-4572\" srcset=\"https:\/\/blog.spike.sh\/wp-content\/uploads\/2025\/12\/incident-seggregation-1.png 602w, 
https:\/\/blog.spike.sh\/wp-content\/uploads\/2025\/12\/incident-seggregation-1-219x300.png 219w\" sizes=\"auto, (max-width: 602px) 100vw, 602px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">We started by checking the Hooks microservice logs: requests were arriving as expected. Next, we examined the escalation and grouping logic. That&#8217;s where we found the issue: the logic for finding the matching open incident for the Datadog integration was incorrect. Along with the keys found in the payload, Spike was also using the pre-formatted message from Hooks to identify the incident. Because Datadog incident messages include the monitor status in their title, the message of a resolve (OK) event differed from that of the triggered event, so matching the two was nearly impossible. Removing the message from the matching criteria fixed the issue.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Step 3: Fix &amp; Deployment<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Once we identified the root cause, the fix was straightforward. We corrected the grouping query for Datadog events and deployed the changes to production. We then verified the fix by triggering a complete Datadog alert and resolution cycle, confirming that incidents now auto-resolve as expected.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"lessons-learnt\">Lessons learnt<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">We need stronger, routine data analysis around incident resolution. Tracking how many incidents Spike auto-resolves versus how many it does not would have helped surface this issue earlier and with more confidence.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">When the issue was first raised on Sunday, 14 December, we treated it as a configuration problem. That assessment turned out to be incorrect. 
A deeper investigation at that point could have led to an earlier diagnosis and resolution.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Getting on a live call with the customer significantly accelerated our understanding of the problem. Seeing the product in real use and debugging together reduced guesswork. Maintaining a dedicated Slack channel for ongoing discussions proved valuable and continues to pay off over time.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Spike\u2019s integration architecture also proved resilient. Datadog integrations are isolated from other integrations, which meant no other sources were affected. Even within Datadog, incident triggering continued to work as expected. The separation of incident identification and resolution into distinct services allowed us to isolate the failure quickly and fix it without broader impact. We plan to share a deeper architectural breakdown in the future.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"conclusion\">Conclusion<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Integrations keep sending new types of incidents as they evolve and update. We have plans to bring AI into the picture to help identify these new incident types more easily and catch potential problems before they impact you.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Thanks to Adil and Mark for flagging this issue. Going forward, Datadog incidents will auto-resolve as expected.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>On December 17th, 2025, we found that Datadog incidents weren&#8217;t auto-resolving due to an issue in our Incident Grouping logic. We resolved the issue, and now Datadog incidents are auto-resolved as expected. 
This postmortem details the incident timeline, root cause analysis, and lessons learned.<\/p>\n","protected":false},"author":263547071,"featured_media":4568,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_crdt_document":"","_import_markdown_pro_load_document_selector":0,"_import_markdown_pro_submit_text_textarea":"","_lmt_disableupdate":"","_lmt_disable":"","_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":true,"token":"eyJpbWciOiJodHRwczpcL1wvYmxvZy5zcGlrZS5zaFwvd3AtY29udGVudFwvdXBsb2Fkc1wvMjAyNVwvMTJcL2JhY2tncm91bmQtNDctMTAyNHg1NTUucG5nIiwidHh0IjoiUG9zdG1vcnRlbSBvbiBEYXRhZG9nIGluY2lkZW50cyBub3QgYXV0by1yZXNvbHZpbmciLCJ0ZW1wbGF0ZSI6ImhpZ2h3YXkiLCJmb250IjoiIiwiYmxvZ19pZCI6MjMzMTM4OTAwfQ.cZmjJ4s4hP9Jgr8Gre7bixt8R07WqNfEYOr1SUtqSBEMQ"},"version":2},"_wpas_customize_per_network":false,"jetpack_post_was_ever_published":false},"categories":[1444],"tags":[],"class_list":["post-4561","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-postmortem"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Postmortem on Datadog incidents not auto-resolving<\/title>\n<meta name=\"description\" content=\"This postmortem details how a flaw in the Incident Grouping prevented Datadog incidents from auto-resolving.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" 
href=\"https:\/\/blog.spike.sh\/postmortem-on-datadog-incidents-not-autoresolving\/\" \/>\n<meta property=\"og:locale\" content=\"en_GB\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Postmortem on Datadog incidents not auto-resolving\" \/>\n<meta property=\"og:description\" content=\"This postmortem details how a flaw in the Incident Grouping prevented Datadog incidents from auto-resolving.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/blog.spike.sh\/postmortem-on-datadog-incidents-not-autoresolving\/\" \/>\n<meta property=\"og:site_name\" content=\"Spike&#039;s blog\" \/>\n<meta property=\"article:published_time\" content=\"2025-12-18T09:51:39+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2026-01-09T07:46:38+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/blog.spike.sh\/wp-content\/uploads\/2025\/12\/background-47.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1040\" \/>\n\t<meta property=\"og:image:height\" content=\"564\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Damanpreet\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Damanpreet\" \/>\n\t<meta name=\"twitter:label2\" content=\"Estimated reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"6 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/blog.spike.sh\\\/postmortem-on-datadog-incidents-not-autoresolving\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/blog.spike.sh\\\/postmortem-on-datadog-incidents-not-autoresolving\\\/\"},\"author\":{\"name\":\"Damanpreet\",\"@id\":\"https:\\\/\\\/blog.spike.sh\\\/#\\\/schema\\\/person\\\/4bc971837058eacf02e9ef12fb625155\"},\"headline\":\"Postmortem on 
Datadog incidents not auto-resolving\",\"datePublished\":\"2025-12-18T09:51:39+00:00\",\"dateModified\":\"2026-01-09T07:46:38+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/blog.spike.sh\\\/postmortem-on-datadog-incidents-not-autoresolving\\\/\"},\"wordCount\":1233,\"commentCount\":0,\"image\":{\"@id\":\"https:\\\/\\\/blog.spike.sh\\\/postmortem-on-datadog-incidents-not-autoresolving\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/blog.spike.sh\\\/wp-content\\\/uploads\\\/2025\\\/12\\\/background-47.png\",\"articleSection\":[\"Postmortem\"],\"inLanguage\":\"en-GB\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/blog.spike.sh\\\/postmortem-on-datadog-incidents-not-autoresolving\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/blog.spike.sh\\\/postmortem-on-datadog-incidents-not-autoresolving\\\/\",\"url\":\"https:\\\/\\\/blog.spike.sh\\\/postmortem-on-datadog-incidents-not-autoresolving\\\/\",\"name\":\"Postmortem on Datadog incidents not auto-resolving\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/blog.spike.sh\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/blog.spike.sh\\\/postmortem-on-datadog-incidents-not-autoresolving\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/blog.spike.sh\\\/postmortem-on-datadog-incidents-not-autoresolving\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/blog.spike.sh\\\/wp-content\\\/uploads\\\/2025\\\/12\\\/background-47.png\",\"datePublished\":\"2025-12-18T09:51:39+00:00\",\"dateModified\":\"2026-01-09T07:46:38+00:00\",\"author\":{\"@id\":\"https:\\\/\\\/blog.spike.sh\\\/#\\\/schema\\\/person\\\/4bc971837058eacf02e9ef12fb625155\"},\"description\":\"This postmortem details how a flaw in the Incident Grouping prevented Datadog incidents from 
auto-resolving.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/blog.spike.sh\\\/postmortem-on-datadog-incidents-not-autoresolving\\\/#breadcrumb\"},\"inLanguage\":\"en-GB\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/blog.spike.sh\\\/postmortem-on-datadog-incidents-not-autoresolving\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-GB\",\"@id\":\"https:\\\/\\\/blog.spike.sh\\\/postmortem-on-datadog-incidents-not-autoresolving\\\/#primaryimage\",\"url\":\"https:\\\/\\\/blog.spike.sh\\\/wp-content\\\/uploads\\\/2025\\\/12\\\/background-47.png\",\"contentUrl\":\"https:\\\/\\\/blog.spike.sh\\\/wp-content\\\/uploads\\\/2025\\\/12\\\/background-47.png\",\"width\":1040,\"height\":564,\"caption\":\"Blog cover titled \\\"Postmortem on Datadog incidents not auto-resolving\\\"\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/blog.spike.sh\\\/postmortem-on-datadog-incidents-not-autoresolving\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/blog.spike.sh\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Postmortem on Datadog incidents not auto-resolving\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/blog.spike.sh\\\/#website\",\"url\":\"https:\\\/\\\/blog.spike.sh\\\/\",\"name\":\"Spike&#039;s blog\",\"description\":\"Learnings and opinions in a changing 
world\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/blog.spike.sh\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-GB\"},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/blog.spike.sh\\\/#\\\/schema\\\/person\\\/4bc971837058eacf02e9ef12fb625155\",\"name\":\"Damanpreet\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-GB\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/9a938d08ab9251a99e241686ddce63028cca486f4937fffad851519903da88d3?s=96&d=robohash&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/9a938d08ab9251a99e241686ddce63028cca486f4937fffad851519903da88d3?s=96&d=robohash&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/9a938d08ab9251a99e241686ddce63028cca486f4937fffad851519903da88d3?s=96&d=robohash&r=g\",\"caption\":\"Damanpreet\"},\"url\":\"https:\\\/\\\/blog.spike.sh\\\/author\\\/daman50ab08d672\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Postmortem on Datadog incidents not auto-resolving","description":"This postmortem details how a flaw in the Incident Grouping prevented Datadog incidents from auto-resolving.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/blog.spike.sh\/postmortem-on-datadog-incidents-not-autoresolving\/","og_locale":"en_GB","og_type":"article","og_title":"Postmortem on Datadog incidents not auto-resolving","og_description":"This postmortem details how a flaw in the Incident Grouping prevented Datadog incidents from auto-resolving.","og_url":"https:\/\/blog.spike.sh\/postmortem-on-datadog-incidents-not-autoresolving\/","og_site_name":"Spike&#039;s blog","article_published_time":"2025-12-18T09:51:39+00:00","article_modified_time":"2026-01-09T07:46:38+00:00","og_image":[{"width":1040,"height":564,"url":"https:\/\/blog.spike.sh\/wp-content\/uploads\/2025\/12\/background-47.png","type":"image\/png"}],"author":"Damanpreet","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Damanpreet","Estimated reading time":"6 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/blog.spike.sh\/postmortem-on-datadog-incidents-not-autoresolving\/#article","isPartOf":{"@id":"https:\/\/blog.spike.sh\/postmortem-on-datadog-incidents-not-autoresolving\/"},"author":{"name":"Damanpreet","@id":"https:\/\/blog.spike.sh\/#\/schema\/person\/4bc971837058eacf02e9ef12fb625155"},"headline":"Postmortem on Datadog incidents not 
auto-resolving","datePublished":"2025-12-18T09:51:39+00:00","dateModified":"2026-01-09T07:46:38+00:00","mainEntityOfPage":{"@id":"https:\/\/blog.spike.sh\/postmortem-on-datadog-incidents-not-autoresolving\/"},"wordCount":1233,"commentCount":0,"image":{"@id":"https:\/\/blog.spike.sh\/postmortem-on-datadog-incidents-not-autoresolving\/#primaryimage"},"thumbnailUrl":"https:\/\/blog.spike.sh\/wp-content\/uploads\/2025\/12\/background-47.png","articleSection":["Postmortem"],"inLanguage":"en-GB","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/blog.spike.sh\/postmortem-on-datadog-incidents-not-autoresolving\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/blog.spike.sh\/postmortem-on-datadog-incidents-not-autoresolving\/","url":"https:\/\/blog.spike.sh\/postmortem-on-datadog-incidents-not-autoresolving\/","name":"Postmortem on Datadog incidents not auto-resolving","isPartOf":{"@id":"https:\/\/blog.spike.sh\/#website"},"primaryImageOfPage":{"@id":"https:\/\/blog.spike.sh\/postmortem-on-datadog-incidents-not-autoresolving\/#primaryimage"},"image":{"@id":"https:\/\/blog.spike.sh\/postmortem-on-datadog-incidents-not-autoresolving\/#primaryimage"},"thumbnailUrl":"https:\/\/blog.spike.sh\/wp-content\/uploads\/2025\/12\/background-47.png","datePublished":"2025-12-18T09:51:39+00:00","dateModified":"2026-01-09T07:46:38+00:00","author":{"@id":"https:\/\/blog.spike.sh\/#\/schema\/person\/4bc971837058eacf02e9ef12fb625155"},"description":"This postmortem details how a flaw in the Incident Grouping prevented Datadog incidents from 
auto-resolving.","breadcrumb":{"@id":"https:\/\/blog.spike.sh\/postmortem-on-datadog-incidents-not-autoresolving\/#breadcrumb"},"inLanguage":"en-GB","potentialAction":[{"@type":"ReadAction","target":["https:\/\/blog.spike.sh\/postmortem-on-datadog-incidents-not-autoresolving\/"]}]},{"@type":"ImageObject","inLanguage":"en-GB","@id":"https:\/\/blog.spike.sh\/postmortem-on-datadog-incidents-not-autoresolving\/#primaryimage","url":"https:\/\/blog.spike.sh\/wp-content\/uploads\/2025\/12\/background-47.png","contentUrl":"https:\/\/blog.spike.sh\/wp-content\/uploads\/2025\/12\/background-47.png","width":1040,"height":564,"caption":"Blog cover titled \"Postmortem on Datadog incidents not auto-resolving\""},{"@type":"BreadcrumbList","@id":"https:\/\/blog.spike.sh\/postmortem-on-datadog-incidents-not-autoresolving\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/blog.spike.sh\/"},{"@type":"ListItem","position":2,"name":"Postmortem on Datadog incidents not auto-resolving"}]},{"@type":"WebSite","@id":"https:\/\/blog.spike.sh\/#website","url":"https:\/\/blog.spike.sh\/","name":"Spike&#039;s blog","description":"Learnings and opinions in a changing 
world","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/blog.spike.sh\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-GB"},{"@type":"Person","@id":"https:\/\/blog.spike.sh\/#\/schema\/person\/4bc971837058eacf02e9ef12fb625155","name":"Damanpreet","image":{"@type":"ImageObject","inLanguage":"en-GB","@id":"https:\/\/secure.gravatar.com\/avatar\/9a938d08ab9251a99e241686ddce63028cca486f4937fffad851519903da88d3?s=96&d=robohash&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/9a938d08ab9251a99e241686ddce63028cca486f4937fffad851519903da88d3?s=96&d=robohash&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/9a938d08ab9251a99e241686ddce63028cca486f4937fffad851519903da88d3?s=96&d=robohash&r=g","caption":"Damanpreet"},"url":"https:\/\/blog.spike.sh\/author\/daman50ab08d672\/"}]}},"modified_by":"Sreekar","jetpack_publicize_connections":[],"jetpack_featured_media_url":"https:\/\/blog.spike.sh\/wp-content\/uploads\/2025\/12\/background-47.png","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/pfMe4Q-1bz","jetpack-related-posts":[{"id":3229,"url":"https:\/\/blog.spike.sh\/best-automated-incident-response-tools\/","url_meta":{"origin":4561,"position":0},"title":"9 Best Automated Incident Response Tools (2026)","author":"Sreekar","date":"26th September, 2025","format":false,"excerpt":"From triage to post-mortem, this article evaluates the 9 best\u00a0automated incident response tools. 
See a full comparison of Spike, PagerDuty, and more to find the perfect fit for your team\u2019s workflow and budget.","rel":"","context":"In &quot;Automation&quot;","block_context":{"text":"Automation","link":"https:\/\/blog.spike.sh\/category\/incident-management\/automation\/"},"img":{"alt_text":"Blog cover image titled \"9 Best Automated Incident Response Tools (2025)\"","src":"https:\/\/i0.wp.com\/blog.spike.sh\/wp-content\/uploads\/2025\/09\/blog-cover-1.png?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/blog.spike.sh\/wp-content\/uploads\/2025\/09\/blog-cover-1.png?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/blog.spike.sh\/wp-content\/uploads\/2025\/09\/blog-cover-1.png?resize=525%2C300&ssl=1 1.5x, https:\/\/i0.wp.com\/blog.spike.sh\/wp-content\/uploads\/2025\/09\/blog-cover-1.png?resize=700%2C400&ssl=1 2x, https:\/\/i0.wp.com\/blog.spike.sh\/wp-content\/uploads\/2025\/09\/blog-cover-1.png?resize=1050%2C600&ssl=1 3x, https:\/\/i0.wp.com\/blog.spike.sh\/wp-content\/uploads\/2025\/09\/blog-cover-1.png?resize=1400%2C800&ssl=1 4x"},"classes":[]},{"id":2440,"url":"https:\/\/blog.spike.sh\/9-best-incident-response-tools\/","url_meta":{"origin":4561,"position":1},"title":"9 Best Incident Response Tools (Plus 4 Open-Source Options)","author":"Sreekar","date":"30th July, 2025","format":false,"excerpt":"I\u2019ve curated a list of 9 best incident response tools, plus 4 open-source options for you. But first, a quick note: Many people mix up alerting, monitoring, and incident response. Incident response is what you do after receiving an alert. 
It includes alert acknowledgment, escalations, incident communication, post-incident analysis, and\u2026","rel":"","context":"In &quot;Comparison&quot;","block_context":{"text":"Comparison","link":"https:\/\/blog.spike.sh\/category\/comparison\/"},"img":{"alt_text":"Blog cover image titled \"9 Best Incident Response Tools\"","src":"https:\/\/i0.wp.com\/blog.spike.sh\/wp-content\/uploads\/2025\/07\/9-Best-Incident-Response-Tools.png?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/blog.spike.sh\/wp-content\/uploads\/2025\/07\/9-Best-Incident-Response-Tools.png?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/blog.spike.sh\/wp-content\/uploads\/2025\/07\/9-Best-Incident-Response-Tools.png?resize=525%2C300&ssl=1 1.5x, https:\/\/i0.wp.com\/blog.spike.sh\/wp-content\/uploads\/2025\/07\/9-Best-Incident-Response-Tools.png?resize=700%2C400&ssl=1 2x"},"classes":[]},{"id":1652,"url":"https:\/\/blog.spike.sh\/6-better-squadcast-alternatives-2026\/","url_meta":{"origin":4561,"position":2},"title":"6 Better Squadcast Alternatives 2026","author":"Sreekar","date":"12th May, 2025","format":false,"excerpt":"Concerned about Squadcast after the SolarWinds acquisition? Explore these 6 alternatives that provide better reliability, advanced features, and competitive pricing. 
Don't let acquisition uncertainty impact your incident management effectiveness\u2014find your ideal replacement today.","rel":"","context":"In &quot;Comparison&quot;","block_context":{"text":"Comparison","link":"https:\/\/blog.spike.sh\/category\/comparison\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/blog.spike.sh\/wp-content\/uploads\/2025\/05\/Basics-of-Incident-Management.png?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/blog.spike.sh\/wp-content\/uploads\/2025\/05\/Basics-of-Incident-Management.png?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/blog.spike.sh\/wp-content\/uploads\/2025\/05\/Basics-of-Incident-Management.png?resize=525%2C300&ssl=1 1.5x, https:\/\/i0.wp.com\/blog.spike.sh\/wp-content\/uploads\/2025\/05\/Basics-of-Incident-Management.png?resize=700%2C400&ssl=1 2x"},"classes":[]},{"id":372,"url":"https:\/\/blog.spike.sh\/6-better-atlassian-opsgenie-alternatives-2026\/","url_meta":{"origin":4561,"position":3},"title":"6 Better Atlassian OpsGenie Alternatives (2026)","author":"Sreekar","date":"12th February, 2025","format":false,"excerpt":"Switching from OpsGenie? Discover six top alternatives for 2026. 
From Spike's user-friendly interface to PagerDuty's enterprise features, compare key options, pricing, and find the ideal incident response platform for your team\u2014whether you're a startup or a large enterprise.","rel":"","context":"In &quot;OpsGenie&quot;","block_context":{"text":"OpsGenie","link":"https:\/\/blog.spike.sh\/category\/comparison\/opsgenie\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/blog.spike.sh\/wp-content\/uploads\/2025\/02\/Basics-of-Incident-Management.png?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/blog.spike.sh\/wp-content\/uploads\/2025\/02\/Basics-of-Incident-Management.png?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/blog.spike.sh\/wp-content\/uploads\/2025\/02\/Basics-of-Incident-Management.png?resize=525%2C300&ssl=1 1.5x, https:\/\/i0.wp.com\/blog.spike.sh\/wp-content\/uploads\/2025\/02\/Basics-of-Incident-Management.png?resize=700%2C400&ssl=1 2x"},"classes":[]},{"id":2360,"url":"https:\/\/blog.spike.sh\/9-best-it-alerting-software-2026\/","url_meta":{"origin":4561,"position":4},"title":"9 Best IT Alerting Software in 2026 (Plus 3 Open-Source Options)","author":"Sreekar","date":"25th July, 2025","format":false,"excerpt":"I\u2019ve curated a list of 9 best IT alerting software and 3 open-source alternatives for you. Every tool on this list handles the core alerting functions you need: incident detection, fast alert delivery, clear escalation paths, and reliable incident logging. 
Since all these tools tick those boxes, I focused on\u2026","rel":"","context":"In &quot;Alerts&quot;","block_context":{"text":"Alerts","link":"https:\/\/blog.spike.sh\/category\/incident-management\/alerts\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/blog.spike.sh\/wp-content\/uploads\/2025\/07\/9-Best-IT-Alterting-Software-in-2025-1.png?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/blog.spike.sh\/wp-content\/uploads\/2025\/07\/9-Best-IT-Alterting-Software-in-2025-1.png?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/blog.spike.sh\/wp-content\/uploads\/2025\/07\/9-Best-IT-Alterting-Software-in-2025-1.png?resize=525%2C300&ssl=1 1.5x, https:\/\/i0.wp.com\/blog.spike.sh\/wp-content\/uploads\/2025\/07\/9-Best-IT-Alterting-Software-in-2025-1.png?resize=700%2C400&ssl=1 2x, https:\/\/i0.wp.com\/blog.spike.sh\/wp-content\/uploads\/2025\/07\/9-Best-IT-Alterting-Software-in-2025-1.png?resize=1050%2C600&ssl=1 3x, https:\/\/i0.wp.com\/blog.spike.sh\/wp-content\/uploads\/2025\/07\/9-Best-IT-Alterting-Software-in-2025-1.png?resize=1400%2C800&ssl=1 4x"},"classes":[]},{"id":371,"url":"https:\/\/blog.spike.sh\/pagerduty-alternatives\/","url_meta":{"origin":4561,"position":5},"title":"6 Better PagerDuty Alternatives (2026)","author":"Sreekar","date":"30th January, 2025","format":false,"excerpt":"Looking to switch from PagerDuty? In this blog post, we explore six powerful PagerDuty alternatives, comparing their key features, pricing, and target audiences. 
Whether you need a simpler setup, better integrations, or more cost-effective pricing, you'll find the right incident management solution for your team.","rel":"","context":"In &quot;PagerDuty&quot;","block_context":{"text":"PagerDuty","link":"https:\/\/blog.spike.sh\/category\/comparison\/pagerduty\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/blog.spike.sh\/wp-content\/uploads\/2025\/01\/pagerduty-vs-opsgenie.png?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/blog.spike.sh\/wp-content\/uploads\/2025\/01\/pagerduty-vs-opsgenie.png?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/blog.spike.sh\/wp-content\/uploads\/2025\/01\/pagerduty-vs-opsgenie.png?resize=525%2C300&ssl=1 1.5x, https:\/\/i0.wp.com\/blog.spike.sh\/wp-content\/uploads\/2025\/01\/pagerduty-vs-opsgenie.png?resize=700%2C400&ssl=1 2x"},"classes":[]}],"_links":{"self":[{"href":"https:\/\/blog.spike.sh\/wp-json\/wp\/v2\/posts\/4561","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.spike.sh\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.spike.sh\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.spike.sh\/wp-json\/wp\/v2\/users\/263547071"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.spike.sh\/wp-json\/wp\/v2\/comments?post=4561"}],"version-history":[{"count":10,"href":"https:\/\/blog.spike.sh\/wp-json\/wp\/v2\/posts\/4561\/revisions"}],"predecessor-version":[{"id":4575,"href":"https:\/\/blog.spike.sh\/wp-json\/wp\/v2\/posts\/4561\/revisions\/4575"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blog.spike.sh\/wp-json\/wp\/v2\/media\/4568"}],"wp:attachment":[{"href":"https:\/\/blog.spike.sh\/wp-json\/wp\/v2\/media?parent=4561"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.spike.sh\/wp-json\/wp\/v2\/categories?post=4561"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.spike.sh\/wp-json\/wp\/v2\/tags?post=4561"}],"curies":[{"name":"wp
","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}