Monthly Archives: April 2016

Cassandra: Batch Loading Without the Batch — The Nuanced Edition

My previous post on this subject has proven extraordinarily popular and I get commentary on it all the time, most of it quite good. It has, however, gotten a decent number of comments from people quibbling with the nuance of the post … Continue reading 

Posted in Cassandra, Java | Leave a comment


A couple of times a week I get a question where someone wants to know how to “failover” to a remote DC in the driver if the local Cassandra DC fails, or even if there are only a couple of … Continue reading 

Posted in Cassandra | Leave a comment

Connection to Oracle From Spark

For some silly reason there has been a fair amount of difficulty in reading and writing to Oracle from Spark when using DataFrames. SPARK-10648: Spark-SQL JDBC fails to set a default precision and scale when they are not defined … Continue reading 

Posted in Spark | Leave a comment

Reflection Scala-2.10 and Spark weird errors when saving to Cassandra

This originally started with this SO question, and I’ll be honest, I was flummoxed for a couple of days looking at this (in no small part because the code was doing a lot). But at some point I was able to … Continue reading 

Posted in Cassandra, Spark | Leave a comment

Logging The Generated CQL from the Spark Cassandra Connector

This has come up a few times in the last few days, so I thought I’d share the available options and the tradeoffs. Option 1: Turn ON ALL THE TRACING! nodetool settraceprobability 1.0 Probabilistic tracing is a handy feature for finding expensive … Continue reading 

Posted in Cassandra, Spark | Leave a comment

Don’t use TextField for your unique key in Solr

This seems immediately obvious when you think about it, but TextField is what you use for fuzzy searches in Solr, and why would a person want a fuzzy search on a unique value? While I can come up with some … Continue reading 

Posted in Cassandra, Solr | Leave a comment

Spark job that writes to Cassandra just hangs when one node goes down?

If one node takes down your app, do you have any replicas?

Posted in Cassandra, Spark | Leave a comment