AI Failure Analysis

When a deployment step fails, DevRamps can use AI to analyze the failure, identify the root cause, and suggest a fix. The AI agent examines step logs, deployment context, and your source code to produce an actionable diagnosis.

Data Privacy

AI failure analysis is powered by Claude (Anthropic). When you trigger an analysis, the following data is sent to the AI model:

  • Step execution logs from the failed step.
  • Deployment context (step type, parameters, stage configuration).
  • Relevant source code from your repository (limited to files related to the failure).

This data is used solely for the analysis and is not stored by the AI provider beyond the request. If your organization has data residency or compliance requirements, confirm that sending logs, deployment context, and source code to a third-party model is permitted before using the feature.

How It Works

  1. A deployment step fails.
  2. You click Analyze Failure on the failed step in the dashboard.
  3. DevRamps creates an analysis job that:
    • Reads the step's execution logs.
    • Examines the deployment context (step type, parameters, stage configuration).
    • Reviews relevant source code from your repository.
    • Uses AI to identify the likely root cause.
  4. The analysis results appear in the dashboard with:
    • A root cause explanation describing what went wrong and why.
    • A suggested fix with specific code changes (when applicable).
  5. If the AI suggests code changes, you can create a pull request directly from the analysis results.
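
The inputs and outputs described in the steps above can be pictured as plain data. The sketch below is illustrative only: DevRamps does not document its internal types, so `StepContext`, `AnalysisInput`, `AnalysisResult`, `selectRelatedFiles`, and `buildAnalysisInput` are hypothetical names, and matching file paths against log lines is an assumed heuristic for "files related to the failure", not the product's actual selection logic.

```typescript
// Hypothetical shapes for illustration; not DevRamps's real API.
interface StepContext {
  stepType: string;
  parameters: Record<string, string>;
  stage: string;
}

interface AnalysisInput {
  logs: string[];        // execution logs from the failed step
  context: StepContext;  // step type, parameters, stage configuration
  sourceFiles: string[]; // only files judged relevant to the failure
}

interface AnalysisResult {
  rootCause: string;
  suggestedFix?: string; // present only when a code change applies
}

// Assumed heuristic: keep repository files whose path appears in a log line.
function selectRelatedFiles(logs: string[], repoFiles: string[]): string[] {
  return repoFiles.filter((path) =>
    logs.some((line) => line.indexOf(path) !== -1)
  );
}

// Bundle the three kinds of data listed above into one request payload.
function buildAnalysisInput(
  logs: string[],
  context: StepContext,
  repoFiles: string[]
): AnalysisInput {
  return { logs, context, sourceFiles: selectRelatedFiles(logs, repoFiles) };
}
```

Modeling `suggestedFix` as optional mirrors the "when applicable" wording: every result carries a root cause, but only some carry a proposed change.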

Triggering an Analysis

  1. Navigate to the failed deployment in the dashboard.
  2. Click on the failed stage, then the failed step.
  3. Click Analyze Failure.
  4. The analysis runs in the background. You'll see a status indicator while it processes.
  5. Once complete, the results appear inline with the step details.

Reviewing Results

The analysis provides:

  • Root Cause: A plain-language explanation of what caused the failure. For example: "The ECS service health check failed because the container is exiting with code 1. The application is crashing on startup due to a missing environment variable DATABASE_URL."
  • Suggested Fix: When the AI can identify a code change that would resolve the issue, it provides a diff showing the proposed changes.

Example

Here's an example of what an analysis result might look like:

Root Cause: The ECS service health check is failing because the container exits immediately with code 1. The application's server.ts references process.env.DATABASE_URL, which is not set in the task definition. The environment variable was recently removed in commit abc123.

Suggested Fix: Add the DATABASE_URL environment variable to the ECS task definition, referencing the stage secret:

variables:
  database_url: ${{ stage_secret("DATABASE_URL") }}

Creating a Fix PR

If the analysis includes a suggested fix:

  1. Review the proposed changes in the analysis results.
  2. Click Create Pull Request.
  3. DevRamps creates a branch in your repository with the suggested changes.
  4. A pull request is opened for your review.
  5. Review, iterate, and merge the PR as you would any other code change.
  6. Once the PR is merged and the change is pushed, a new deployment starts automatically.

Limitations

  • Analysis quality depends on log verbosity. Steps that produce detailed error messages get better analysis.
  • The AI may not always identify the correct root cause, especially for complex infrastructure issues. Always verify suggestions before applying.
  • Analysis is available for failed steps only; it cannot analyze successful deployments.
  • Only one analysis job can run per step at a time. You can't start a second analysis for the same failed step while one is still in progress.
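
Because analysis quality depends on what the step logs contain, it can help to make the application fail with an explicit message. The following is a minimal sketch of a startup check for the missing `DATABASE_URL` scenario used in the example on this page. `findMissingEnv` and `assertEnv` are hypothetical helper names, not part of DevRamps, and the check treats an empty string as missing, which is an assumption you may want to change.

```typescript
// Returns the names of required variables that are unset or empty.
function findMissingEnv(
  env: Record<string, string | undefined>,
  required: string[]
): string[] {
  return required.filter((name) => !env[name]);
}

// Throws one descriptive error naming every missing variable, so the
// failed step's logs state the cause instead of only an exit code.
function assertEnv(
  env: Record<string, string | undefined>,
  required: string[]
): void {
  const missing = findMissingEnv(env, required);
  if (missing.length > 0) {
    throw new Error(
      `Startup aborted: missing required environment variable(s): ${missing.join(", ")}`
    );
  }
}

// Typical use at the top of an entry point such as server.ts (Node.js):
//   assertEnv(process.env, ["DATABASE_URL"]);
```

With a check like this, the container still exits on startup, but the log line names the variable directly, giving both a human reader and the analysis more to work with.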