Cassandra
The Apache Cassandra Project develops a highly scalable second-generation distributed database, bringing together Dynamo's fully distributed design and Bigtable's ColumnFamily-based data model.
Apache Cassandra is an open source NoSQL distributed database trusted by thousands of companies for scalability and high availability without compromising performance.
Data modeling approach
Data in Cassandra is often arranged as one table per query, and the same data is repeated across many tables, a process known as denormalization. Because each query reads from a single table designed for it, queries can run faster.
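The one-table-per-query idea can be sketched without a running cluster. In the example below, plain Python dictionaries stand in for Cassandra tables; the names USERS, build_tables, users_by_id and users_by_lname are illustrative assumptions, and the sample records simply mirror the users table created later on this page.

```python
from collections import defaultdict

# Hypothetical sample records; field names mirror the users table used on this page.
USERS = [
    {'user_id': 1745, 'fname': 'john', 'lname': 'smith'},
    {'user_id': 1744, 'fname': 'john', 'lname': 'doe'},
    {'user_id': 1746, 'fname': 'john', 'lname': 'smith'},
]


def build_tables(users):
    """Write every record into two 'tables', one per query (denormalization)."""
    users_by_id = {}                    # serves: look up a user by user_id
    users_by_lname = defaultdict(list)  # serves: list users with a given lname
    for user in users:
        users_by_id[user['user_id']] = user
        users_by_lname[user['lname']].append(user)
    return users_by_id, dict(users_by_lname)


users_by_id, users_by_lname = build_tables(USERS)
print(users_by_id[1744]['lname'])
print(sorted(u['user_id'] for u in users_by_lname['smith']))
```

Each record is stored twice, which costs space and extra writes, but each of the two queries becomes a single key lookup.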
Slackbuild
cd /tmp
wget http://slackbuilds.org/slackbuilds/14.1/system/apache-cassandra.tar.gz
tar xvzf apache-cassandra.tar.gz
cd apache-cassandra
wget http://archive.apache.org/dist/cassandra/2.0.7/apache-cassandra-2.0.7-bin.tar.gz
./apache-cassandra.SlackBuild
installpkg /tmp/apache-cassandra-2.0.7-noarch-1_SBo.tgz
Node up
cqlsh
http://wiki.apache.org/cassandra/GettingStarted
bin/cqlsh
CREATE KEYSPACE mykeyspace WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
describe keyspaces;
USE mykeyspace;
CREATE TABLE users ( user_id int PRIMARY KEY, fname text, lname text );
INSERT INTO users (user_id, fname, lname) VALUES (1745, 'john', 'smith');
INSERT INTO users (user_id, fname, lname) VALUES (1744, 'john', 'doe');
INSERT INTO users (user_id, fname, lname) VALUES (1746, 'john', 'smith');
describe tables;
SELECT * FROM users;
desc table users;
CREATE INDEX ON users (lname);
desc table users;
SELECT * FROM users WHERE lname = 'smith';
Python sample app
http://datastax.github.io/python-driver/getting_started.html
cd /tmp/
wget http://slackbuilds.org/slackbuilds/14.1/libraries/libev.tar.gz
tar xvzf libev.tar.gz
cd libev
wget http://dist.schmorp.de/libev/Attic/libev-4.15.tar.gz
./libev.SlackBuild
installpkg /tmp/libev-4.15-i486-2_SBo.tgz
easy_install pip # if not installed
pip install cassandra-driver
pip install blist
python3 cass.py
python3 asyncCass.py
import time
from cassandra.cluster import Cluster


def successHandler(rows):
    # Called with the result rows once the query completes.
    print('Received data !')
    try:
        for user_row in rows:
            print('>>> %d %s %s' % (user_row.user_id, user_row.fname, user_row.lname))
    except Exception as ex:
        print(ex)


def errorHandler(exception):
    # Called if the query fails.
    print(exception)


if __name__ == '__main__':
    cluster = Cluster(['127.0.0.1'])
    session = cluster.connect('mykeyspace')
    # execute_async returns immediately; the callbacks fire when the response arrives.
    futurex = session.execute_async('SELECT user_id , fname , lname FROM users')
    futurex.add_callbacks(successHandler, errorHandler)
    print('Wait 3 seconds ...')
    time.sleep(3)
    cluster.shutdown()
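The listing above uses execute_async, so it presumably corresponds to asyncCass.py; cass.py itself is not shown on this page. A synchronous counterpart might look like the following sketch. The helper names format_user, fetch_user_lines and main are assumptions made for illustration, not the contents of the original cass.py.

```python
def format_user(row):
    """Format one result row the same way the asynchronous listing does."""
    return '>>> %d %s %s' % (row.user_id, row.fname, row.lname)


def fetch_user_lines(session):
    """Run the query and block until the rows are available."""
    rows = session.execute('SELECT user_id, fname, lname FROM users')
    return [format_user(row) for row in rows]


def main():
    # Imported here so the helpers above can be exercised without the driver installed.
    from cassandra.cluster import Cluster

    cluster = Cluster(['127.0.0.1'])
    session = cluster.connect('mykeyspace')
    try:
        for line in fetch_user_lines(session):
            print(line)
    finally:
        cluster.shutdown()
```

With a node running on 127.0.0.1 and the mykeyspace keyspace populated as in the cqlsh section, calling main() should print one line per user. Unlike the asynchronous version, no sleep is needed because session.execute only returns once the result is in.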
Python types conversion
http://datastax.github.io/python-driver/getting_started.html
Python Type | CQL Literal Type
None | NULL
bool | boolean
float | float, double
int | int
long | bigint, varint, counter
decimal.Decimal | decimal
str, unicode | ascii, varchar, text
buffer, bytearray | blob
date, datetime | timestamp
list, tuple, generator | list
set, frozenset | set
dict, OrderedDict | map
uuid.UUID | timeuuid, uuid
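As a quick reference, the table can be turned into a lookup. The sketch below is only a restatement of the table in Python 3 terms, not the driver's own encoding logic; PYTHON_TO_CQL and cql_literal_types are hypothetical names, and folding the long row into int (Python 3 has no separate long) is an assumption. The Python 2-only types unicode and buffer are left out.

```python
import datetime
import decimal
import types
import uuid
from collections import OrderedDict

# Restates the table above; this is not the driver's internal mapping.
PYTHON_TO_CQL = {
    type(None): ('NULL',),
    bool: ('boolean',),
    float: ('float', 'double'),
    # Assumption: Python 3 has no separate long, so int covers both table rows.
    int: ('int', 'bigint', 'varint', 'counter'),
    decimal.Decimal: ('decimal',),
    str: ('ascii', 'varchar', 'text'),
    bytearray: ('blob',),
    datetime.date: ('timestamp',),
    datetime.datetime: ('timestamp',),
    list: ('list',),
    tuple: ('list',),
    types.GeneratorType: ('list',),
    set: ('set',),
    frozenset: ('set',),
    dict: ('map',),
    OrderedDict: ('map',),
    uuid.UUID: ('timeuuid', 'uuid'),
}


def cql_literal_types(value):
    """Return the CQL literal types the table lists for this value's Python type."""
    # Walk the MRO so subclasses resolve to the most specific entry
    # (bool is checked before its base class int).
    for cls in type(value).__mro__:
        if cls in PYTHON_TO_CQL:
            return PYTHON_TO_CQL[cls]
    raise TypeError('no CQL type listed for %s' % type(value).__name__)


print(cql_literal_types(True))
print(cql_literal_types(1745))
print(cql_literal_types(uuid.uuid4()))
```

Walking the method resolution order matters because bool is a subclass of int in Python; a plain isinstance check against int first would report True as an integer.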
Java types conversion
CQL3 data type | Java type
ascii | java.lang.String
bigint | long
blob | java.nio.ByteBuffer
boolean | boolean
counter | long
decimal | java.math.BigDecimal
double | double
float | float
inet | java.net.InetAddress
int | int
list | java.util.List<T>
map | java.util.Map<K, V>
set | java.util.Set<T>
text | java.lang.String
timestamp | java.util.Date
timeuuid | java.util.UUID
uuid | java.util.UUID
varchar | java.lang.String
varint | java.math.BigInteger