Cassandra Terminology : Cheat Sheet


*This document will be updated continuously until it is as complete as it can be. Please let me know what I can add to make it more useful for everyone, especially new Cassandra users.*

Columns

At the bottom of the hierarchy is the column. A column has three parts: a name, a value and a timestamp. The name and value are stored as raw byte arrays (byte[]) and can be of any size.

Super Columns

A super column is similar in that it has a name and a value; however, it does not have a timestamp.

The major difference between a column and a super column is this:

A column maps a name to the binary representation of a string value, whereas a super column maps a name to a number of columns.
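A minimal sketch of the two structures described above, as plain Java classes. The class and method names here are made up for illustration and are not Cassandra's or Hector's API; the only things taken from the text are that a column carries a name, a value and a timestamp (name and value as raw bytes), and that a super column maps a name to a number of columns and has no timestamp of its own.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class ColumnModelSketch {

    /** A column: name and value held as raw bytes, plus a timestamp. */
    static final class Column {
        final byte[] name;
        final byte[] value;
        final long timestamp;

        Column(String name, String value, long timestamp) {
            this.name = name.getBytes(StandardCharsets.UTF_8);
            this.value = value.getBytes(StandardCharsets.UTF_8);
            this.timestamp = timestamp;
        }

        String nameAsString() {
            return new String(name, StandardCharsets.UTF_8);
        }

        String valueAsString() {
            return new String(value, StandardCharsets.UTF_8);
        }
    }

    /** A super column: a name that maps to a number of columns, with no timestamp. */
    static final class SuperColumn {
        final byte[] name;
        final Map<String, Column> columns = new LinkedHashMap<>();

        SuperColumn(String name) {
            this.name = name.getBytes(StandardCharsets.UTF_8);
        }

        // Adds (or replaces) a sub column, keyed by its name.
        void put(Column column) {
            columns.put(column.nameAsString(), column);
        }

        // Returns the sub column with the given name, or null if there is none.
        Column get(String columnName) {
            return columns.get(columnName);
        }
    }

    public static void main(String[] args) {
        SuperColumn address = new SuperColumn("address");
        address.put(new Column("city", "London", 1L));
        address.put(new Column("postcode", "N1", 2L));
        System.out.println(address.get("city").valueAsString());
        System.out.println(address.columns.size());
    }
}
```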

Cassandra Cheat Sheets


I’ll be getting some free time over the next few weeks and I intend to put together a series of Cassandra Cheat Sheets. I’ve been wanting to for a while but haven’t gotten round to it. The following list shows the individual sheets, each of which targets a specific area of Cassandra.

  1. Cassandra Terminology
  2. Cassandra Query Language
  3. Cassandra Thrift API
  4. Cassandra Internals

I’ll add more as I get time, in PDF, HTML and DOC format. The first cheat sheet is already being written.

Cassandra Hector Wrapper – Hector Simplified


This is a simple wrapper I wrote for Hector.
-----------------------------------------
Available at : https://github.com/zcourts/cassandra-hector-wrapper

It doesn’t support all the features of Hector, and that is partly the point. The main aim was to get something quick and simple.
I did this on the train over three or four mornings while heading to work. I wanted it to not have anything too complex or low level.
In effect I hope that even a new Cassandra user could just download a copy and start using it without the need to
fully understand all of Cassandra’s concepts.
I’ll review it and make some needed changes; however, it does currently work fine. I noticed after pushing to GitHub that
I’m not checking whether Hector returns null, which generally means that whatever was requested couldn’t be found or that
an error occurred.
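One way to guard against the null case mentioned above is to wrap the nullable result in an Optional. The sketch below is not the wrapper's code: fetchValueOrNull is a hypothetical stand-in (backed by a plain Map) for any client call that returns null when nothing is found.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class NullSafeLookupSketch {

    // Hypothetical stand-in for a client call that returns null when nothing is found.
    static String fetchValueOrNull(Map<String, String> store, String columnName) {
        return store.get(columnName);
    }

    // Wraps the nullable result so callers have to deal with the "not found" case.
    static Optional<String> fetchValue(Map<String, String> store, String columnName) {
        return Optional.ofNullable(fetchValueOrNull(store, columnName));
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>();
        store.put("email", "someone@example.com");

        System.out.println(fetchValue(store, "email").orElse("<missing>"));
        System.out.println(fetchValue(store, "phone").orElse("<missing>"));
    }
}
```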

Usage is similar to Hector’s; see https://github.com/rantav/hector if you want to use Hector directly.
See the file https://github.com/zcourts/cassandra-hector-wrapper/blob/master/src/main/java/com/scriptandscroll/adt/UsageExamples.java
for a decent set of usage examples.

You start by creating a Keyspace object.

Keyspace ks=new Keyspace("clusterName", "keyspaceName", "localhost:9160") ;

//then a column or super column family object
ColumnFamily cf = new ColumnFamily(ks, "columnFamilyName");

//now the magic happens, you simply call cf.get[Column|Columns|Row|Rows]
Row row = cf.getRow("rowKey", "startColumn", "endColumn");

//you can now do
Column col = row.getColumn("columnName");
//then
String val= col.getValue();
//  OR .....
String val2=row.getColumnValue("columnName");
//OR
Iterator<Column> it=row.iterator();
while(it.hasNext()){
  Column c = it.next();
  //do whatever
}

That’s it!

It’s important to note that I didn’t write this because the Hector client was lacking in any way at all.
Quite the opposite, in fact. The guys working on Hector have done an awesome job, and I, and I’m sure others,
appreciate it. However, when I was working on updating a project recently, it was taking me far too much time to sift
through the Hector docs and get familiar with all the changes. I started with a single file but that quickly got too nasty,
so I stopped, drew out some ideas, and they turned into all the classes currently in the project.

https://github.com/zcourts/cassandra-hector-wrapper/blob/master/src/main/java/com/scriptandscroll/adt/UsageExamples.java

package com.scriptandscroll.adt;

import java.util.ArrayList;
import java.util.List;

/**
 *Shows basic usage of the classes.
 * It firstly makes no provision to allow you to create keyspaces or column families, YET!
 * But once those are created from the CLI or some other way it provides a way to deal with
 * just about everything
 * @author Courtney Robinson <courtney@crlog.info>
 */
public class UsageExamples {

	public static void main(String[] args) {
		Keyspace ks = new Keyspace("clusterName", "keyspaceName", "localhost:9160");
		//Standard column family examples
		//create a column family object - THIS DOES NOT CREATE A COLUMN FAMILY IN CASSANDRA but assumes one with the given name already exists!
		ColumnFamily cf = new ColumnFamily(ks, "cfName");
		//now we can perform actions on this column family.
		//
		//first lets get a single column
		Column col = cf.getColumn("rowkey1", "columnName");
		//now we can use its value or name using
		col.getName();//returns a string
		//or
		col.getValue();//returns a string
		//
		//
		//we can get a set of columns from a row in three ways, by giving a start and end column name
		List<Column> cols = cf.getColumns("rowkey2", "startCol", "endCol");
		//by giving start and end col names and specifying a max amount of cols to get
		List<Column> cols2 = cf.getColumns("rowkey2", "startCol", "endCol", false, 5);
		//or by giving an array of all columns to get
		//in this case it will only return the given columns
		List<Column> cols3 = cf.getColumns("rowkey2", new String[]{"col1", "col2", "col3", "col4"});
		//
		//We can also get rows within a CF
		//by setting start and end column names to an empty string and not setting a max value
		//we can get all the columns within the given row
		//the same options as getColumns apply, you specify columns by start and end key with an optional max amount or an array of columns
		Row row = cf.getRow("rowkey", "", "");
		//you can now do cool stuff with this row object like add and remove columns.
		//if you later pass this object to a column family it will apply those changes in Cassandra e.g.
		row.putColumn("newColName", "newColValue");
		//or
		row.putColumn(new Column("newerColName", "newerColValue"));
		//while we're at it we can remove columns from this row
		row.removeColumn(col);
		//or
		row.removeColumn("colName");
		//if we now write this row back to the column family all those changes are applied
		cf.putRow(row);//that's it! two new columns will be added, and two removed
		//we could also do
		//setting false stops it removing columns from cassandra that were removed from the object
		//columns that were added are still added obviously...
		cf.putRow(row, false);
		//we can also get multiple rows like this
		//setting start and end row keys and column names to empty gets everything
		//but we set the max rows to return as 20 and the max columns per row to 5
		//so up to 20 rows are returned which will contain up to 5 columns
		//there are multiple variations on these methods that allows various operations
		List<Row> rows = cf.getRows("", "", "", "", false, 20, 5);
		//Simple? Good! That is the aim!

		//Luckily super column family operations work in a similar manner
		SuperColumnFamily scf = new SuperColumnFamily(ks, "superCFName");
		//now go through the same thing again...
		SuperColumn scol = scf.getSuperColumn("rowkey", "supercolName");
		//get sub columns of this super column
		List<Column> subCols = scol.getAllColumns();
		//or get multiple super columns
		List<SuperColumn> scol2 = scf.getSuperColumns("rowkey", new String[]{"superCol1", "superCol2", "superCol3"});

		//get a single sub column
		Column subCol = scf.getSubColumn("rowkey", "superColname", "subcolname");
		List<Column> subCols2 = scf.getSubColumns("rowkey", "superCol", "startSubcol", "endSubCol");

		//we can get sub columns from multiple rows
		List<String> keys = new ArrayList<String>();
		keys.add("key1");
		keys.add("key2");
		keys.add("key3");
		keys.add("key4");
		keys.add("key5");
		//gets a list of rows with the sub columns requested
		List<Row> rowSubCols = scf.getSubColumnsFromMultipleRows(keys, "superColumn", "startSubCol", "endSubCol", false, 20);
		//get an entire super row
		SuperRow srow = scf.getSuperRow("rowkey", "startColumn", "endCol");
		SuperColumn sc = srow.getSuperColumn("superCol");//now do what we want
		List<SuperRow> lsuperRows = scf.getSuperRows(keys, "startCol", "endCol");
		//get up to 20 rows
		List<SuperRow> srows2 = scf.getSuperRows("startKey", "endKey", new String[]{}, 20);

		//we can also add and remove from a super row just as we did with a normal row
		ArrayList<Column> cols5 = new ArrayList<Column>();
		cols5.add(new Column("subname", "subval"));

		srow.putSuperColumn(new SuperColumn("colname", cols5));
		//and now
		scf.putSuperRow(srow);
		//all done...
		//still simple?
	}
}

On its own this would all mean nothing so again, a big thank you to the Hector guys.

If you can think of a better name then by all means, please say. Any comments, suggestions or general thoughts on it are most welcomed.

Cassandra Query Language (CQL) v2.0 reference


This is an update to my two previous posts:
http://crlog.info/2011/03/29/cassandra-query-language-aka-cql-syntax/ AND
http://crlog.info/2011/06/13/cassandra-query-language-cql-v1-0-0-updated/
If you’re using a version of Cassandra prior to v1.0 beta, one of the above links may be more appropriate, as little things may have changed here and there. The official doc is on GitHub in Textile markup here: https://github.com/apache/cassandra/blob/trunk/doc/cql/CQL.textile

Cassandra Query Language (CQL) v2.0

Install and configure Varnish (3.0.1) cache with WordPress


A few days ago I promised I’d go through what I did when setting up Varnish with WordPress. It’s been slightly delayed, but here goes. The blog I used it on was Script and Scroll. I’ve run the Apache benchmark tool against the site, hitting only one URL but making 1 million requests to it, and it held up quite well. The requests were completed in about 5 minutes; I may post the results in a later entry.

So, on the day that alarm bells went off from a traffic spike, I just installed Varnish and cached everything and anything that I could. Obviously this isn’t ideal in most cases, but I had very little time to spend on this. So method one below is what I’d call the “naive” approach, but it works and works fairly well. Method two is my current configuration, which is based on a post by Donncha O Caoim. I had to tweak a few things because I am using the latest version of Varnish, and some of the configuration options he used changed in version 3.

Introduction to NoSQL and Apache Cassandra.


A while back (April 2011) I wrote an article that I wanted to publish on Sitepoint but I guess they didn’t like it because I haven’t had a reply back. I was just randomly cleaning up my mail and found it in my sent messages folder.

I’ve published it on Script and Scroll, under Introduction to NoSQL. While it was written months ago, it is still perfectly relevant and could serve as a good starting point for a beginner interested in getting started with Cassandra and the whole NoSQL idea.

I’d be interested in any feedback on it, so have a quick read and mail me your thoughts :-) .

Setting up a multi-node Cassandra cluster on a single Windows machine


In Windows explorer, go to “C:\Windows\System32\drivers\etc”
Copy the file called “hosts” to your desktop ( or any editable location)
Open the hosts file from the desktop and add the following to the end of the file:

#cassandra nodes
127.0.0.1               127.0.0.2
127.0.0.1               127.0.0.3
127.0.0.1               127.0.0.4
127.0.0.1               127.0.0.5
127.0.0.1               127.0.0.6
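The entries above refer to the addresses 127.0.0.2 through 127.0.0.6. As a quick sanity check, the sketch below builds that list and asks Java whether each address is a loopback address, which is what lets several nodes share one machine. The class and method names are made up for illustration, and nothing here starts or configures Cassandra.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

public class LoopbackNodesSketch {

    // Builds the addresses 127.0.0.<first> .. 127.0.0.<last>, inclusive.
    static List<String> nodeAddresses(int first, int last) {
        List<String> addresses = new ArrayList<>();
        for (int i = first; i <= last; i++) {
            addresses.add("127.0.0." + i);
        }
        return addresses;
    }

    // Returns true if the literal IP address is a loopback address.
    // A literal IP is parsed directly, so no name lookup is involved.
    static boolean isLoopback(String address) {
        try {
            return InetAddress.getByName(address).isLoopbackAddress();
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        for (String address : nodeAddresses(2, 6)) {
            System.out.println(address + " loopback=" + isLoopback(address));
        }
    }
}
```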

CQL : Creating a column family


Following on from my last post about how to create a keyspace with CQL, in this tutorial/post I’ll create a column family. Unless this has changed recently, CQL does not support creating super column families. As far as I know there are no plans to do so either.
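To give a rough idea of the kind of statement involved, here is a small sketch that assembles a CREATE COLUMNFAMILY string in Java. It assumes the early CQL form in which the row key is declared as KEY ... PRIMARY KEY followed by optional typed columns; the helper name, the column family name and the columns are invented for illustration, so check the CQL doc for your Cassandra version before relying on the exact syntax.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class CreateColumnFamilySketch {

    // Builds a CREATE COLUMNFAMILY statement (assumed early-CQL syntax).
    // columns maps column name -> CQL type, in declaration order.
    static String createColumnFamily(String name, String keyType, Map<String, String> columns) {
        StringJoiner definitions = new StringJoiner(", ", "(", ")");
        definitions.add("KEY " + keyType + " PRIMARY KEY");
        for (Map.Entry<String, String> column : columns.entrySet()) {
            definitions.add(column.getKey() + " " + column.getValue());
        }
        return "CREATE COLUMNFAMILY " + name + " " + definitions + ";";
    }

    public static void main(String[] args) {
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("name", "text");
        columns.put("age", "int");
        System.out.println(createColumnFamily("users", "text", columns));
    }
}
```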

CQL : Creating a simple keyspace


I promised a few tutorials covering CQL examples so I’m going to kick off a series to demonstrate all/most of the features of Cassandra’s new query language, CQL.

In this example we’ll create a keyspace and look at the properties available when doing so, which demonstrates one of the keywords in CQL, CREATE KEYSPACE. This example has been done in Java since it is one of the easiest languages to get up and running with; however, the CQL statements are portable, i.e. not dependent on the language you use. There is a PHP CQL driver being developed by Nick, Dave, myself and others. I’ve been out for about a month, preoccupied with a few things, but the guys have been making some progress, so check out the driver if you’re using PHP and want to use it or contribute to its development.
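As a rough sketch of the statement itself, the helper below assembles a CREATE KEYSPACE string in Java. It assumes the strategy_class and strategy_options:replication_factor properties from the CQL.textile doc of that era (later CQL versions changed this syntax); the helper name and the keyspace name are invented for illustration, and the string still has to be sent through whichever CQL driver you use.

```java
public class CreateKeyspaceSketch {

    // Builds a CREATE KEYSPACE statement (assumed early-CQL syntax).
    static String createKeyspace(String name, String strategyClass, int replicationFactor) {
        return "CREATE KEYSPACE " + name
                + " WITH strategy_class = '" + strategyClass + "'"
                + " AND strategy_options:replication_factor = " + replicationFactor + ";";
    }

    public static void main(String[] args) {
        System.out.println(createKeyspace("demo", "SimpleStrategy", 1));
    }
}
```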

Cassandra Query Language (CQL) v1.0.0 (UPDATED)


NOTE: CQL V2 reference is available here http://crlog.info/2011/09/17/cassandra-query-language-cql-v2-0-reference/

Cassandra Query Language (CQL) v1.0.0

This is an update to my previous post documenting the Cassandra query language, CQL. A few changes have been made in CQL, the biggest being the addition of the INSERT keyword. Previously the UPDATE statement would perform an insert if a value did not already exist; the INSERT statement now does this inserting explicitly. BATCH and ALTER TABLE are also now included in the mix; see the official doc here: https://github.com/apache/cassandra/blob/trunk/doc/cql/CQL.textile.
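The "UPDATE performs an insert if the value does not exist" behaviour can be pictured with a toy in-memory model. This is only an illustration using nested maps, not how Cassandra stores data, and the class and method names are made up.

```java
import java.util.HashMap;
import java.util.Map;

public class UpsertSketch {

    // row key -> (column name -> value)
    private final Map<String, Map<String, String>> rows = new HashMap<>();

    // Models UPDATE as described: the write succeeds whether or not
    // the row or column already exists.
    void update(String key, String column, String value) {
        rows.computeIfAbsent(key, k -> new HashMap<>()).put(column, value);
    }

    // Returns the stored value, or null if the row or column is absent.
    String get(String key, String column) {
        Map<String, String> row = rows.get(key);
        return row == null ? null : row.get(column);
    }

    public static void main(String[] args) {
        UpsertSketch store = new UpsertSketch();
        store.update("jsmith", "password", "first");   // row does not exist yet: acts as an insert
        store.update("jsmith", "password", "second");  // row exists: overwrites the value
        System.out.println(store.get("jsmith", "password"));
    }
}
```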

If you’re new to NoSQL and Cassandra you can read this gentle Introduction to NoSQL and Apache Cassandra