Cassandra Repair

Adaptive and scheduled repair that maintains data consistency without impacting production workloads. Real-time progress tracking, full execution history, and failure alerting across every node.

Try Demo Sandbox Start Free

Consistency without the operational burden

Repair is one of the most operationally demanding processes in Cassandra. Running it too aggressively degrades performance. Running it too infrequently leads to data inconsistency and failed reads. Most teams either avoid repair entirely or run it on a rigid schedule that ignores cluster load. AxonOps provides both adaptive and scheduled repair so you get continuous consistency enforcement without manual tuning or production impact.

Query performance comes first

Adaptive Repair executes slowly by design, because query performance always comes first. Intensity only increases when the cluster is idle.

Full repair visibility

Track repair progress per node and keyspace in real time. See what has been repaired, what is pending, and what has failed, without running nodetool commands.

Intelligent alert routing

Repair failures trigger alerts routed by cluster, datacenter, severity, or custom rules to exactly the right team via Slack, Microsoft Teams, PagerDuty, OpsGenie, ServiceNow, or webhook. No noise, no missed failures.

The most advanced repair system for Apache Cassandra

Adaptive and scheduled repair with real-time monitoring, full execution history, and failure alerting.

Capability	What you get	Detail
Adaptive Repair
Continuous consistency	Repairs run continuously in the background, maintaining consistency without manual scheduling	Always-on consistency enforcement
Query-first intensity	Repair executes slowly by design, prioritising query performance. Repair intensity adjusts based on cluster load.	Queries always come first
Keyspace and table scope	Configure which keyspaces and tables are included in adaptive repair	Focus repair where it matters most
Scheduled Repair
Cron-based scheduling	Define repair schedules per keyspace or table with flexible cron expressions	Fits into existing maintenance windows
Repair type	Choose between full, incremental, or subrange repair per schedule	Match repair strategy to workload
Scope control	Target specific keyspaces, tables, or token ranges per repair job	Precise control over what gets repaired
Parallelism	Configure sequential, datacenter-parallel, or fully parallel repair execution	Balance speed against cluster load
Thread count	Set the number of repair threads per node to control resource consumption	Fine-tune repair intensity
Execution windows	Restrict repair execution to off-peak hours with start and end time constraints	Avoid production impact
Datacenter scope	Run repair within specific datacenters or across all datacenters	Localise repair to where it is needed
Monitoring & history
Real-time progress	Track repair progress per node and keyspace with live status updates	See what is running, pending, and completed
Repair history	Complete log of every repair execution with duration, scope, and outcome	Audit and troubleshoot past repairs
Failure alerting	Immediate notification when repairs stall or fail, routed by cluster, datacenter, or severity to Slack, Teams, PagerDuty, OpsGenie, ServiceNow, or webhook	Intelligent routing, zero noise
Data repaired metrics	Volume of data repaired per session with before/after consistency state	Quantify repair impact
Governance
Role-based access	Assign edit or read-only rights per user, with access scoped to specific clusters	Prevent unauthorized changes
SSO authentication	Optional Enterprise SAML integration for single sign-on	Centralized identity management
Automation
Terraform provider	Automate repair schedules, consistency targets, and adaptive repair policies as code	Codify your repair strategy

Adaptive Repair

Adaptive Repair runs continuously in the background, maintaining data consistency across replicas. It deliberately executes slowly because query performance is always the priority. Adaptive regulation of repair velocity means intensity throttles back during heavy workloads and only accelerates when the cluster is idle. The regulation logic is also aware of each table's gc_grace_seconds, and acceleration during quiet periods is continuously recalibrated to ensure all tables complete repair within their required window.

Designed to run slowly by default, keeping query latency unaffected
Intensity automatically adjusts based on real-time cluster load
Configure scope per keyspace and table to focus repair where it matters
Repair plan per table is derived from gc_grace_seconds settings, ensuring consistency is maintained within the required window

Scheduled Repair

For teams that prefer planned repair cycles, scheduled repair gives you full control over every parameter. Choose the repair type, target specific keyspaces or tables, set parallelism, configure thread counts, and restrict execution to off-peak windows. Every run is logged with full execution history and failure alerting.

Full, incremental, or subrange repair per schedule
Scope to specific keyspaces, tables, or token ranges
Sequential, datacenter-parallel, or fully parallel execution
Configurable thread count per node for resource control
Execution windows with start and end time constraints
Datacenter-scoped or cross-datacenter repair
Full execution history with duration, scope, and outcome

Scheduled repair configuration in AxonOps

Repair Monitoring & History

Track repair progress across every node with real-time status updates. See which keyspaces have been repaired, what is currently running, and where failures have occurred. Complete repair history gives you the audit trail to troubleshoot issues and verify consistency.

Real-time repair progress per node and keyspace
Complete history of every repair execution
Immediate alerting on stalled or failed repairs
Data volume repaired per session for impact tracking

Repair monitoring and history in AxonOps

See Cassandra repair in action

Open Demo Sandbox Book an Expert

Demo Sandbox