With our regular monthly release (2025.04), we introduced the possibility for the DBMS metric log.appended_bytes to return a negative value. This became apparent because the release also shipped a new datastore version. When this happened, the metrics HTTP endpoint of the Neo4j DBMS failed.
The issue went undetected because it affected only a subset of instances, and because it was corrected on the instances that were subsequently rolled (restarted) for another component change delivered as part of the release.
The issue therefore only affected instances that were not restarted as part of that roll.
This issue also prevented us from fully collecting metrics from those DBMS instances, which impacted both monitoring of the instances and troubleshooting by our engineers.
To resolve the issue, we filtered out the metric that caused it and rolled the affected instances.
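As an illustration only, filtering a single metric out of a metrics payload could look something like the Python sketch below. It assumes the metrics are served in the Prometheus text exposition format, and the function name and the metric name used in the example are hypothetical. It is not the actual change applied in Aura.

```python
def filter_metric(exposition: str, blocked_name: str) -> str:
    """Return the exposition text without any line that belongs to blocked_name.

    Assumption: input is Prometheus text exposition format, where sample lines
    look like "<name>{labels} <value>" or "<name> <value>", and metadata lines
    look like "# HELP <name> ..." or "# TYPE <name> ...".
    """
    kept = []
    for line in exposition.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            # Metadata line: the metric name is the third whitespace-separated token.
            parts = stripped.split()
            is_meta = len(parts) >= 3 and parts[1] in ("HELP", "TYPE")
            name = parts[2] if is_meta else ""
        elif stripped:
            # Sample line: the name ends at the first "{" or the first space.
            name = stripped.split("{", 1)[0].split(" ", 1)[0]
        else:
            name = ""
        if name == blocked_name:
            continue  # Drop every line of the blocked metric.
        kept.append(line)
    return "\n".join(kept)


# Hypothetical payload: the metric name below is an assumption for the example.
SAMPLE = (
    "# TYPE example_log_appended_bytes counter\n"
    'example_log_appended_bytes{db="neo4j"} -5\n'
    "example_page_cache_hits 42"
)
FILTERED = filter_metric(SAMPLE, "example_log_appended_bytes")
```

Dropping the metric by name, rather than only dropping negative samples, keeps the remaining metrics available while the underlying cause is fixed.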
Affected customers (a random subset of instances across tiers) could not retrieve any instance metrics via the endpoint customer-metrics-api.neo4j.io. This also affected the built-in metrics shown in the monitoring section of the Aura console (console.neo4j.io).
Following a review of the sequence of events and their impact, we have identified a number of actions to improve Neo4j Aura and to better prevent, detect, mitigate, and handle similar issues.
Prevention
Detection
Mitigation, handling and troubleshooting
Communication