Cassandra C# Driver: Surprising gotcha with SimpleStatement


When helping someone with a Batch using the C# driver and I had a bit of a surprise. I wanted to reuse the CQL and I couldn’t at that point use a Prepare because of a bug, since SimpleStatement has a nearly identical API I figured it’d be a one for one replacement so my friend and I started with:

var batchStmt = new BatchStmt();
var v2Insert = new SimpleStatement("insert into v1_readonly.documents " +
"(key, column1, value) values(?, ?, ?);");
batchStmt.Add(v2Insert.Bind("1", "DateStamp", "foo")) //1
.Add(v2Insert.Bind("1", "EprsVersion", "foo")) //2
.Add(v2Insert.Bind("1", "ExternalDocId", "foo")) //3
.Add(v2Insert.Bind("1", "RecordType", "foo")) //4
.Add(v2Insert.Bind("1", "Status", "foo")) //5
.Add(v2Insert.Bind("1", "XmlRecord", "foo")); //6
session.Execute(batchStmt);

and expected to get 6 rows but instead got the last record:

key | column1 | value
——-+———————-+———-
 1 | XmlRecord | foo
(1 rows)

We switched out for the prepared statement on our test and got

var batchStmt = new BatchStmt();
var v2Insert = session.prepare("insert into v1_readonly.documents" + 
"(key, column1, value) values(?, ?, ?);");
batchStmt.Add(v2Insert.Bind("1", "DateStamp", "foo")) //1
.Add(v2Insert.Bind("1", "EprsVersion", "foo")) //2
.Add(v2Insert.Bind("1", "ExternalDocId", "foo")) //3
.Add(v2Insert.Bind("1", "RecordType", "foo")) //4
.Add(v2Insert.Bind("1", "Status", "foo")) //5
.Add(v2Insert.Bind("1", "XmlRecord", "foo")); //6
session.Execute(batchStmt);

and this time got :

key | column1 | value
——-+———————-+———-
 1 | DateStamp | foo
 1 | EprsVersion | foo
 1 | ExternalDocId | foo
 1 | RecordType | foo
 1 | Status | foo
 1 | XmlRecord | foo
(6 rows)

So it occurred to me at some point we were making an assumption about how Bind works that it was probably still the same reference on every Bind, so lets test that:

batchStmt.Add(new SimpleStatement("insert into v1_readonly.documents (key, column1, value) values(?, ?, ?);")
 .Bind("1", "DateStamp", "foo")) //1
.Add(new SimpleStatement("insert into v1_readonly.documents (key, column1, value) values(?, ?, ?);")
 .Bind("1", "EprsVersion", "foo")) //2
.Add(new SimpleStatement("insert into v1_readonly.documents (key, column1, value) values(?, ?, ?);")
 .Bind("1", "ExternalDocId", "foo")) //3
.Add(new SimpleStatement("insert into v1_readonly.documents (key, column1, value) values(?, ?, ?);")
 .Bind("1", "RecordType", "foo")) //4
.Add(new SimpleStatement("insert into v1_readonly.documents (key, column1, value) values(?, ?, ?);")
 .Bind("1", "Status", "foo")) //5
.Add(new SimpleStatement("insert into v1_readonly.documents (key, column1, value) values(?, ?, ?);")
 .Bind("1", "XmlRecord", "foo")); //6
session.Execute(batchStmt);

and it works:

key | column1 | value
——-+———————-+———-
 1 | DateStamp | foo
 1 | EprsVersion | foo
 1 | ExternalDocId | foo
 1 | RecordType | foo
 1 | Status | foo
 1 | XmlRecord | foo
(6 rows)

Ok so what’s going on here? Why does this work on the prepared statement? Viewing the source for the DataStax csharp-driver prepared statement

public BoundStatement Bind(params object[] values) 
{ 
  var bs = new BoundStatement(this); 
  if (values != null && values.Length == 1 && Utils.IsAnonymousType(values[0])) 
  { 
    //Using named params 
    //Reorder the params according the position in the query 
    values = Utils.GetValues(Metadata.Columns.Select(c => c.Name), values[0]).ToArray(); 
  }  
  bs.SetValues(values); 
  return bs; // EDITOR NOTE: returns a new copy of BoundStatement
}

Compare this with the SimpleStatement Bind

public SimpleStatement Bind(params object[] values) 
{ 
  if (values != null && values.Length == 1 && Utils.IsAnonymousType(values[0])) 
  { 
    var keyValues = Utils.GetValues(values[0]); //Force named values to lowercase as identifiers are lowercased in Cassandra 
    QueryValueNames = keyValues.Keys.Select(k => k.ToLowerInvariant()).ToList();
    values = keyValues.Values.ToArray(); 
  } 
  SetValues(values);    
  return this;  // returns ITSELF
}

I had some thought this was intentional and the driver team confirmed this. For further verification I looked at the Java driver’s SimpleStatement and found out there was no bind and that the only way to bind variables was by passing them in the constructor, which is probably with hindsight the same interface the C# driver should have used.

public SimpleStatement(String query, Object… values) { 
  this.query = query;
  this.values = values; 
}

Hope this helps anyone using the C# driver.

Cassandra: Batch loading without the Batch keyword