This repo shows how to build an example live data stream with Apache Kafka. The streamed data is processed and pushed into a standalone Cassandra database. All of the applications run in Docker containers, and the live Bitcoin data is streamed from the Coinranking API. See Coinranking.com for more information on setting up an account to try out the API.
If you don't have Docker Desktop installed already, you can install it from Docker.
git clone https://github.com/yTek01/streaming-with-kafka-cassandra.git

Run the docker-compose.yml to start all the containers with the command below.
cd streaming-with-kafka-cassandra
docker-compose up -d

Next, open a shell in the Kafka container and create the messages topic.

docker exec -it <Kafka_container_name> /bin/sh
cd /opt/kafka_version/bin
kafka-topics.sh --create --zookeeper zookeeper:2181 --replication-factor 1 --partitions 1 --topic messages
kafka-topics.sh --list --zookeeper zookeeper:2181

Then open cqlsh in the Cassandra container and create the keyspace and table.

docker exec -it <container_name> cqlsh

CREATE KEYSPACE btc_keyspaces WITH REPLICATION={'class': 'SimpleStrategy', 'replication_factor': 1};

CREATE TABLE IF NOT EXISTS btc_keyspaces.bitcoin_info (
uuiid UUID,
coinID TEXT,
symbol TEXT,
name TEXT,
color TEXT,
iconUrl TEXT,
marketCap TEXT,
price TEXT,
listedAt BIGINT,
tier INT,
change TEXT,
rank INT,
sparkline list<text>,
lowVolume BOOLEAN,
coinrankingUrl TEXT,
twenty4hvolume TEXT,
btcPrice TEXT,
PRIMARY KEY((uuiid), name));

Start the applications in this order: first python consumer.py, then python producer.py. The consumer will wait for messages from the producer.
python consumer.py
python producer.py

To check that rows are arriving, open cqlsh again and count them.

docker exec -it <container_name> cqlsh
select count(*) from btc_keyspaces.bitcoin_info;
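
The repo's producer.py and consumer.py are not reproduced here. As a rough sketch of the data shaping that has to happen between the API and the table, the snippet below maps one coin record onto the bitcoin_info columns, serializes it for the Kafka topic, and builds a matching INSERT statement. The helper names (to_row, encode, decode, params, insert_statement) are hypothetical, and the source field names "uuid" and "24hVolume" are assumptions about the Coinranking response, so check them against the payload you actually receive.

```python
import json
import uuid

# Column names copied from the CREATE TABLE statement above.
COLUMNS = [
    "uuiid", "coinID", "symbol", "name", "color", "iconUrl", "marketCap",
    "price", "listedAt", "tier", "change", "rank", "sparkline", "lowVolume",
    "coinrankingUrl", "twenty4hvolume", "btcPrice",
]

# Assumption: the API calls its coin identifier "uuid" and its 24-hour
# volume "24hVolume"; both are renamed to match the table columns.
RENAMED = {"coinID": "uuid", "twenty4hvolume": "24hVolume"}


def to_row(coin):
    """Map one coin record (a dict) onto the bitcoin_info columns."""
    row = {col: coin.get(RENAMED.get(col, col)) for col in COLUMNS}
    row["uuiid"] = str(uuid.uuid4())  # fresh partition key for each message
    return row


def encode(row):
    """Serialize a row to bytes, as a producer might before sending."""
    return json.dumps(row).encode("utf-8")


def decode(payload):
    """Inverse of encode, as a consumer might apply to a message value."""
    return json.loads(payload.decode("utf-8"))


def params(row):
    """Order the values to match COLUMNS and restore the UUID type."""
    values = [row[col] for col in COLUMNS]
    values[0] = uuid.UUID(row["uuiid"])  # the uuiid column is typed UUID
    return tuple(values)


def insert_statement(keyspace="btc_keyspaces", table="bitcoin_info"):
    """Build an INSERT with one positional %s placeholder per column."""
    cols = ", ".join(COLUMNS)
    marks = ", ".join(["%s"] * len(COLUMNS))
    return f"INSERT INTO {keyspace}.{table} ({cols}) VALUES ({marks})"
```

With a Cassandra session from the Python driver, the consumer side could then run something like session.execute(insert_statement(), params(decode(message_value))), where message_value is the raw bytes read from the messages topic. Fields missing from a record come through as None in this sketch, which Cassandra would store as nulls.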