Auto-Rollback
Auto-rollback reverts an in-progress deployment when a CloudWatch alarm fires. It acts as a safety net for production deployments: if your health metrics degrade, DevRamps rolls back to the last known good revision without manual intervention.
How It Works
- You configure a CloudWatch alarm in your AWS account that monitors your application's health (e.g., error rates, latency, CPU utilization).
- You reference the alarm name in your pipeline stage configuration.
- During bootstrap, DevRamps sets up an EventBridge rule in your account to forward alarm state changes.
- When the alarm fires:
  - During a deployment: DevRamps cancels the current deployment and rolls back to the last successful revision.
  - After a deployment: DevRamps blocks new deployments to the stage until the alarm clears.
- When the alarm clears (returns to OK): The deployment blocker is removed and deployments can resume.
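For orientation, the sketch below shows roughly what an EventBridge rule matching alarm state changes could look like in Terraform. DevRamps creates its own rule during bootstrap, so you do not need to add this yourself. The resource label, rule name, and the choice to match only the ALARM and OK states are assumptions for illustration; the actual rule DevRamps installs, and the target it forwards to, are not described here.

```hcl
# Illustrative only: DevRamps provisions its own rule during bootstrap.
# The resource label and rule name below are hypothetical.
resource "aws_cloudwatch_event_rule" "alarm_state_change_example" {
  name        = "example-alarm-state-change"
  description = "Matches state changes of the auto-rollback alarm"

  # CloudWatch publishes alarm transitions as "CloudWatch Alarm State Change" events.
  event_pattern = jsonencode({
    source        = ["aws.cloudwatch"]
    "detail-type" = ["CloudWatch Alarm State Change"]
    detail = {
      alarmName = ["my-service-health-alarm"]
      state = {
        value = ["ALARM", "OK"] # alarm fired / alarm cleared
      }
    }
  })
}
```

The pattern filters on the alarm name and on the new state, which corresponds to the two transitions described above: the alarm firing and the alarm returning to OK.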
Configuration
Add auto_rollback_alarm_name to any stage:
stages:
  - name: production
    account_id: "222222222222"
    region: us-east-1
    auto_rollback_alarm_name: my-service-health-alarm
    vars:
      env: production
The alarm must exist in the same AWS account and region as the stage.
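If you manage the alarm with Terraform, one way to guard against creating it in the wrong place is to pin the provider to the stage's account and region. This is a minimal sketch that reuses the account ID and region from the stage above; how you authenticate to that account (profiles, assumed roles) is left out and depends on your setup.

```hcl
# Sketch: pin the AWS provider to the same account and region as the stage.
provider "aws" {
  region = "us-east-1" # must match the stage's region

  # Terraform refuses to run against any account not listed here.
  allowed_account_ids = ["222222222222"] # must match the stage's account_id
}
```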
Alarm Behavior
Alarm fires during deployment (stage is In Progress)
DevRamps:
- Cancels the current deployment.
- Finds the last successfully deployed revision.
- Starts a rollback deployment with that revision.
- Skips infrastructure synthesis during rollback (to avoid unnecessary Terraform changes).
- Blocks both this stage and the next stage from receiving new deployments. The next stage is blocked as well because, once this stage has been rolled back, the next stage could be running a newer (potentially broken) revision than this one. Blocking it keeps the pipeline from becoming inconsistent.
Alarm fires after deployment (stage is Succeeded)
DevRamps:
- Adds a deployment blocker to the stage.
- New pushes to this stage are queued but won't execute.
- When the alarm clears, the blocker is removed and queued deployments can proceed.
Alarm fires during rollback (stage is Rolling Back)
No action is taken -- the rollback is already in progress.
Alarm clears (returns to OK)
DevRamps removes the deployment blocker, allowing deployments to resume.
After a Rollback
After an auto-rollback completes, the stage is blocked from automatic promotions. This prevents the problematic code from being immediately redeployed. To resume normal operations:
- Investigate and fix the issue that caused the alarm.
- Manually unblock the stage in the dashboard.
- Push the fix -- the new deployment will proceed through the pipeline normally.
Setting Up CloudWatch Alarms
To use auto-rollback, you need a CloudWatch alarm in your target account. Here's an example Terraform configuration:
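resource "aws_cloudwatch_metric_alarm" "service_health" {
  alarm_name          = "my-service-health-alarm"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  # Count of 5xx responses generated by the targets behind the load balancer
  metric_name         = "HTTPCode_Target_5XX_Count"
  namespace           = "AWS/ApplicationELB"
  period              = 60
  statistic           = "Sum"
  threshold           = 10
  # This metric is only reported when it is nonzero, so treat missing data
  # as healthy; otherwise the alarm cannot return to OK once errors stop.
  treat_missing_data  = "notBreaching"
  dimensions = {
    LoadBalancer = aws_lb.main.arn_suffix
  }
  alarm_actions = [] # DevRamps handles the alarm via EventBridge
}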
The alarm name in your Terraform must match the auto_rollback_alarm_name in your pipeline YAML. You can configure one alarm per stage.
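To make the name easy to compare against the pipeline YAML, you could expose it as a Terraform output. This is only a convenience sketch: the output name is hypothetical, and it assumes the `service_health` alarm resource from the example above.

```hcl
# Hypothetical output: surfaces the alarm name so it can be checked against
# auto_rollback_alarm_name in the pipeline YAML.
output "auto_rollback_alarm_name" {
  description = "CloudWatch alarm name referenced by the DevRamps stage"
  value       = aws_cloudwatch_metric_alarm.service_health.alarm_name
}
```

After an apply, `terraform output -raw auto_rollback_alarm_name` prints the value to copy into the stage configuration.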
Choosing alarm metrics
Good metrics for auto-rollback alarms include:
- 5xx error rate — Catches application errors introduced by a bad deployment.
- P99 latency — Detects performance regressions.
- Target group healthy host count — Catches containers failing to start.
- Custom application metrics — Any business-critical metric your application emits.
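As a sketch of the first two options, the alarms below use Application Load Balancer metrics: a 5xx error rate computed with metric math, and P99 target response time. The alarm names, thresholds, and evaluation periods are illustrative assumptions to tune for your service, and `aws_lb.main` is taken from the earlier example. Because a stage takes one alarm, you would reference only one of these names in `auto_rollback_alarm_name`.

```hcl
# 5xx error rate as a percentage of all requests (metric math).
resource "aws_cloudwatch_metric_alarm" "error_rate" {
  alarm_name          = "my-service-error-rate-alarm" # hypothetical name
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  threshold           = 5 # percent, illustrative
  # Treat periods where the expression yields no data point (for example,
  # no 5xx responses were reported) as healthy.
  treat_missing_data  = "notBreaching"
  alarm_actions       = []

  metric_query {
    id          = "error_rate"
    expression  = "100 * errors / requests"
    label       = "5xx error rate (%)"
    return_data = true # the alarm evaluates this query
  }

  metric_query {
    id = "errors"
    metric {
      metric_name = "HTTPCode_Target_5XX_Count"
      namespace   = "AWS/ApplicationELB"
      period      = 60
      stat        = "Sum"
      dimensions = {
        LoadBalancer = aws_lb.main.arn_suffix
      }
    }
  }

  metric_query {
    id = "requests"
    metric {
      metric_name = "RequestCount"
      namespace   = "AWS/ApplicationELB"
      period      = 60
      stat        = "Sum"
      dimensions = {
        LoadBalancer = aws_lb.main.arn_suffix
      }
    }
  }
}

# P99 latency of responses from the targets, in seconds.
resource "aws_cloudwatch_metric_alarm" "p99_latency" {
  alarm_name          = "my-service-p99-latency-alarm" # hypothetical name
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 3
  metric_name         = "TargetResponseTime"
  namespace           = "AWS/ApplicationELB"
  period              = 60
  extended_statistic  = "p99" # percentile statistics use extended_statistic
  threshold           = 1.5   # seconds, illustrative
  alarm_actions       = []
  dimensions = {
    LoadBalancer = aws_lb.main.arn_suffix
  }
}
```

A rate-based alarm tends to be less sensitive to traffic volume than a raw error count, since the same threshold applies at both low and high request rates.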
Limitations
- Auto-rollback can only roll back to a previous revision that has been successfully deployed. If the stage has never had a successful deployment, rollback isn't possible.
- Infrastructure synthesis (Terraform) is skipped during auto-rollback to keep the rollback fast. If you need to revert infrastructure changes, use a manual rollback.