Driver Best Practices¶
This page consolidates production configuration recommendations for Cassandra drivers. Examples use the DataStax Java driver 4.x.
Session Management¶
Single Session per Application¶
Create one session and reuse it throughout the application lifecycle:
```java
// CORRECT: single session, created once and reused
public class CassandraConfig {
    private static CqlSession session;

    public static synchronized CqlSession getSession() {
        if (session == null) {
            session = CqlSession.builder()
                .withLocalDatacenter("dc1")
                .build();
        }
        return session;
    }

    public static void shutdown() {
        if (session != null) {
            session.close();
        }
    }
}
```

```java
// WRONG: session per request
public User getUser(UUID id) {
    try (CqlSession session = CqlSession.builder().build()) { // Expensive!
        return session.execute(...);
    }
}
```
| Aspect | Single Session | Session per Request |
|---|---|---|
| Connection overhead | Once at startup | Every request |
| Metadata discovery | Once | Every request |
| Prepared statement cache | Shared | Rebuilt each time |
| Resource usage | Predictable | Unbounded |
Graceful Shutdown¶
Close the session cleanly on application shutdown:
```java
Runtime.getRuntime().addShutdownHook(new Thread(() -> {
    session.close(); // waits for in-flight requests to complete
}));
```
Connection Configuration¶
Contact Points¶
Provide multiple contact points for initial connection:
```java
CqlSession session = CqlSession.builder()
    .addContactPoint(new InetSocketAddress("10.0.1.1", 9042))
    .addContactPoint(new InetSocketAddress("10.0.1.2", 9042))
    .addContactPoint(new InetSocketAddress("10.0.1.3", 9042))
    .withLocalDatacenter("dc1")
    .build();
```
The driver only needs one successful connection to discover the full cluster topology, but multiple contact points provide redundancy during startup.
Local Datacenter¶
In Java driver 4.x, the local datacenter must be configured whenever contact points are specified explicitly, regardless of cluster topology. Session creation fails with IllegalStateException if it is omitted:

```java
// REQUIRED (driver 4.x, any topology)
.withLocalDatacenter("dc1")
```
In multi-DC deployments, this setting also ensures the driver routes queries to the correct datacenter.
Connection Pool Sizing¶
Default pool settings work for most workloads. Adjust only when:
- Measured stream exhaustion occurs
- Throughput exceeds tens of thousands of requests per second per node
- Monitoring shows pool-related bottlenecks
In driver 4.x, pool sizing is configured via application.conf:
```
datastax-java-driver {
  advanced.connection.pool {
    local.size = 2  // connections per local node (default: 1)
    remote.size = 1 // connections per remote node (default: 1)
  }
}
```
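Pool size bounds concurrency: together with the per-connection request limit (`advanced.connection.max-requests-per-connection`, default 1024 in driver 4.x), it caps how many requests can be in flight to a single node. A quick sanity check (a sketch, not driver code):

```java
public class PoolCapacity {
    // Upper bound on concurrent requests the driver can have outstanding to
    // one node: connections in the pool times the per-connection stream limit.
    public static int maxInFlightPerNode(int poolSize, int maxRequestsPerConnection) {
        return poolSize * maxRequestsPerConnection;
    }

    public static void main(String[] args) {
        // local.size = 2 with the default limit of 1024 requests per connection
        System.out.println(maxInFlightPerNode(2, 1024)); // 2048
    }
}
```

If monitoring shows in-flight requests approaching this bound, increase `local.size` before anything else.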
Query Execution¶
Use Prepared Statements¶
Prepare all production queries:
```java
// Prepare once at startup
private final PreparedStatement selectUser = session.prepare(
    "SELECT * FROM users WHERE user_id = ?");

// Execute with bound values
public User getUser(UUID userId) {
    Row row = session.execute(selectUser.bind(userId)).one();
    return mapToUser(row);
}
```
Benefits:
- Reduced parsing overhead
- Enables token-aware routing (via routing key metadata bound to the statement)
- Protection against CQL injection
Set Appropriate Consistency Levels¶
Choose consistency level based on requirements:
```java
BoundStatement statement = selectUser.bind(userId)
    .setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM); // explicit per query
```
| Use Case | Recommended CL |
|---|---|
| Strong consistency reads | LOCAL_QUORUM |
| Strong consistency writes | LOCAL_QUORUM |
| Eventually consistent reads | LOCAL_ONE |
| Analytics/reporting | ONE |
| Cross-DC consistency | QUORUM or EACH_QUORUM |
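A session-wide default can also be set in `application.conf` (the driver's own default is `LOCAL_ONE`); per-statement calls like the one above override it:

```
datastax-java-driver {
  basic.request.consistency = LOCAL_QUORUM // session-wide default
}
```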
Set Query Timeouts¶
Configure appropriate timeouts:
```java
BoundStatement statement = selectUser.bind(userId)
    .setTimeout(Duration.ofSeconds(5)); // query-specific timeout
```
| Timeout Type | Typical Range | Notes |
|---|---|---|
| Read timeout | 5-10 seconds | Should exceed expected P99; align with server-side read_request_timeout |
| Write timeout | 10-30 seconds | Should align with server-side write_request_timeout; varies by workload |
| Connection timeout | 5 seconds | Adjust for network conditions |
Client-side timeouts should be set relative to server-side timeouts and application SLA requirements.
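As a baseline, the global request timeout can be set in `application.conf` (the driver 4.x default is 2 seconds) and then tightened or relaxed per query with `setTimeout()`:

```
datastax-java-driver {
  basic.request.timeout = 5 seconds // global default; setTimeout() overrides per query
}
```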
Error Handling¶
Handle Specific Exceptions¶
```java
try {
    session.execute(statement);
} catch (NoNodeAvailableException e) {
    // All nodes down or unreachable: fail fast or trip a circuit breaker
    log.error("Cluster unavailable", e);
    throw new ServiceUnavailableException();
} catch (ReadTimeoutException e) {
    // Replica(s) didn't respond in time: reads are generally safe to retry
    log.warn("Read timeout: received {}/{} required",
        e.getReceived(), e.getBlockFor());
} catch (WriteTimeoutException e) {
    // Write may or may not have been applied
    log.error("Write timeout for {}: received {}/{}",
        e.getWriteType(), e.getReceived(), e.getBlockFor());
    // DO NOT retry non-idempotent writes automatically
} catch (UnavailableException e) {
    // Coordinator rejected the query up front: not enough replicas alive
    log.warn("Unavailable: alive {}/{} required",
        e.getAlive(), e.getRequired());
}
```
Idempotency Marking¶
Mark idempotent operations explicitly:
// Safe to retry
Statement readStatement = selectUser.bind(userId)
.setIdempotent(true);
// NOT safe to retry
Statement counterStatement = updateCounter.bind(pageId)
.setIdempotent(false);
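If most of an application's statements are idempotent, the default can be flipped in `application.conf` (the driver's own default is `false`, so nothing is retried or speculatively executed unless explicitly marked):

```
datastax-java-driver {
  // Use with care: per-statement setIdempotent() still takes precedence
  basic.request.default-idempotence = true
}
```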
Policy Configuration¶
Production Policy Template¶
In driver 4.x, policies are configured through `application.conf` rather than builder methods (the 3.x `withRetryPolicy`/`withReconnectionPolicy` builder calls no longer exist):

```java
CqlSession session = CqlSession.builder()
    .addContactPoints(contactPoints)
    .withLocalDatacenter("dc1")
    .build();
```

```
datastax-java-driver {
  // Load balancing: default policy; local datacenter set here or via the builder
  basic.load-balancing-policy.local-datacenter = "dc1"

  // Retry: conservative default, respects idempotency
  advanced.retry-policy.class = DefaultRetryPolicy

  // Reconnection: exponential backoff
  advanced.reconnection-policy {
    class = ExponentialReconnectionPolicy
    base-delay = 1 second
    max-delay = 5 minutes
  }

  // Speculative execution: disabled by default.
  // Enable only for idempotent, latency-sensitive queries.
}
```
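As a rough illustration of the exponential reconnection schedule (a sketch, not the driver's internal code): the delay doubles after each failed attempt until it reaches the configured maximum.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

public class BackoffDemo {
    // Sketch of exponential backoff: delay doubles per attempt, capped at maxDelay
    static List<Duration> delays(Duration base, Duration max, int attempts) {
        List<Duration> out = new ArrayList<>();
        Duration d = base;
        for (int i = 0; i < attempts; i++) {
            out.add(d);
            d = d.multipliedBy(2);
            if (d.compareTo(max) > 0) {
                d = max;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // base 1s, max 5m: 1s, 2s, 4s, ... then pinned at 5 minutes
        System.out.println(delays(Duration.ofSeconds(1), Duration.ofMinutes(5), 10));
    }
}
```

With a 1-second base, a node that stays down for several minutes is probed only every 5 minutes, keeping reconnection traffic negligible.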
Per-Query Policy Override¶
In driver 4.x there are no per-statement policy setters; overrides are expressed as execution profiles in `application.conf` and selected per statement. (The default retry policy already never retries a write that is not marked idempotent.)

```
datastax-java-driver.profiles {
  fast-read {
    advanced.speculative-execution-policy {
      class = ConstantSpeculativeExecutionPolicy
      max-executions = 2
      delay = 100 milliseconds
    }
  }
}
```

```java
// Latency-sensitive read with speculative execution
BoundStatement fastRead = selectUser.bind(userId)
    .setIdempotent(true)
    .setExecutionProfileName("fast-read");

// Non-idempotent write: marking it ensures the retry policy
// and speculative executions leave it alone
BoundStatement counterUpdate = incrementCounter.bind(pageId)
    .setIdempotent(false);
```
Monitoring¶
Essential Metrics¶
Monitor these driver metrics:
| Metric Category | Key Metrics |
|---|---|
| Latency | Request latency percentiles (P50, P95, P99) |
| Throughput | Requests per second |
| Errors | Error rate by type (timeout, unavailable, etc.) |
| Connections | Open connections per node |
| Pool | In-flight requests, available streams |
| Retries | Retry rate, retry success rate |
| Speculative | Trigger rate, win rate |
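Driver metrics are disabled by default in 4.x and enabled individually in `application.conf`; the names below are a representative subset (check the driver's `reference.conf` for the full list):

```
datastax-java-driver {
  advanced.metrics {
    session.enabled = [ cql-requests, cql-client-timeouts ]
    node.enabled = [ pool.open-connections, pool.in-flight ]
  }
}
```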
Health Checks¶
Implement application health checks:
```java
public boolean isHealthy() {
    try {
        // Lightweight query to verify connectivity
        session.execute("SELECT now() FROM system.local");
        return true;
    } catch (Exception e) {
        return false;
    }
}
```
Logging¶
Configure appropriate driver logging:
```xml
<!-- Log connection events -->
<logger name="com.datastax.oss.driver.internal.core.pool" level="INFO"/>
<!-- Log retries and speculative execution -->
<logger name="com.datastax.oss.driver.internal.core.retry" level="DEBUG"/>
<!-- Reduce noise from metadata refresh -->
<logger name="com.datastax.oss.driver.internal.core.metadata" level="WARN"/>
```
Common Anti-Patterns¶
| Anti-Pattern | Problem | Solution |
|---|---|---|
| Session per request | Massive overhead | Single shared session |
| Unprepared statements in loops | Parsing overhead, no token-aware routing | Prepare and reuse |
| Ignoring local datacenter | Cross-DC latency | Configure explicitly |
| Retrying non-idempotent writes | Data corruption | Mark idempotency, custom retry |
| Unbounded IN clauses | Prepared statement cache churn | Fixed sizes or pagination |
| Synchronous calls in async context | Thread pool exhaustion | Use async API consistently |
| No timeout configuration | Requests hang indefinitely | Set explicit timeouts |
| Catching generic Exception | Hides specific error handling | Catch specific exceptions |
Checklist¶
Before deploying to production:
- [ ] Single session instance shared across application
- [ ] Local datacenter configured explicitly
- [ ] All queries use prepared statements
- [ ] Consistency levels set explicitly
- [ ] Timeouts configured appropriately
- [ ] Idempotent operations marked
- [ ] Error handling for specific exception types
- [ ] Driver metrics exported to monitoring
- [ ] Health check endpoint implemented
- [ ] Graceful shutdown configured
- [ ] Connection pool sized appropriately (if non-default)
- [ ] Retry policy reviewed for workload
- [ ] Speculative execution evaluated (if latency-sensitive)
Related Documentation¶
- Connection Management — Connection pooling details
- Policies — Policy configuration reference
- Prepared Statements — Statement preparation and caching