= Apache Kafka =

https://kafka.apache.org/intro

Apache Kafka® is a distributed streaming platform. A streaming platform has three key capabilities:
 * Publish and subscribe to streams of records, similar to a message queue or enterprise messaging system.
 * Store streams of records in a fault-tolerant durable way.
 * Process streams of records as they occur.

Publisher/Subscriber, Observer pattern, Message queues.

First a few concepts:
 * Kafka is run as a cluster on one or more servers that can span multiple datacenters.
 * The Kafka cluster stores streams of records in categories called topics.
 * Each record consists of a key, a value, and a timestamp.

Kafka has four core APIs:
 * The Producer API allows an application to publish a stream of records to one or more Kafka topics.
 * The Consumer API allows an application to subscribe to one or more topics and process the stream of records produced to them.
 * The Streams API allows an application to act as a stream processor, consuming an input stream from one or more topics and producing an output stream to one or more output topics, effectively transforming the input streams to output streams.
 * The Connector API allows building and running reusable producers or consumers that connect Kafka topics to existing applications or data systems. For example, a connector to a relational database might capture every change to a table.

Topics in Kafka are always multi-subscriber; that is, a topic can have zero, one, or many consumers that subscribe to the data written to it.

How does Kafka's notion of streams compare to a traditional enterprise messaging system? Messaging traditionally has two models: queuing and publish-subscribe. In a queue, a pool of consumers may read from a server and each record goes to one of them; in publish-subscribe the record is broadcast to all consumers.
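
The two messaging models can be illustrated with a small in-memory sketch. This is not Kafka code: the function names `deliver_queue` and `deliver_pubsub` are made up for illustration, and round-robin assignment is only one assumed way of giving each record to a single consumer.

```python
from itertools import cycle


def deliver_queue(records, consumers):
    """Queuing: each record goes to exactly one consumer.

    Round-robin is an assumption made for this sketch; a real queue may
    pick whichever consumer is free.
    """
    inbox = {consumer: [] for consumer in consumers}
    for record, consumer in zip(records, cycle(consumers)):
        inbox[consumer].append(record)
    return inbox


def deliver_pubsub(records, consumers):
    """Publish-subscribe: every record is broadcast to every consumer."""
    return {consumer: list(records) for consumer in consumers}


records = ["r0", "r1", "r2", "r3"]
consumers = ["A", "B"]

print(deliver_queue(records, consumers))
# {'A': ['r0', 'r2'], 'B': ['r1', 'r3']}
print(deliver_pubsub(records, consumers))
# {'A': ['r0', 'r1', 'r2', 'r3'], 'B': ['r0', 'r1', 'r2', 'r3']}
```

In the queue case the work is divided between the consumers; in the publish-subscribe case every consumer sees the full stream.
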
By having a notion of parallelism (the partition) within the topics, Kafka is able to provide both ordering guarantees and load balancing over a pool of consumer processes.

https://kafka.apache.org/uses

Kafka works well as a replacement for a more traditional message broker. Kafka is comparable to traditional messaging systems such as ActiveMQ or RabbitMQ.

== Example ==
{{{#!highlight bash
wget http://mirrors.up.pt/pub/apache/kafka/2.3.0/kafka_2.11-2.3.0.tgz
tar xvzf kafka_2.11-2.3.0.tgz
cd kafka_2.11-2.3.0/
# single-node ZooKeeper instance (port 2181); runs in the foreground
bin/zookeeper-server-start.sh config/zookeeper.properties
# new tab ....
cd kafka_2.11-2.3.0/
bin/kafka-server-start.sh config/server.properties # listens port 9092
# create topic
bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic test
# check topics
bin/kafka-topics.sh --list --bootstrap-server localhost:9092
# send messages to topic (type each message at the > prompt, one per line)
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
# >hello
# >test
# consume messages
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning
# Python client: https://pypi.org/project/kafka/
apt install python-pip # as root
pip install kafka
}}}
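
The console producer and consumer steps can also be done from Python. The following is a minimal sketch, assuming the `kafka` package installed above exposes `KafkaProducer` and `KafkaConsumer` as kafka-python does, and that the broker from the example is listening on `localhost:9092` with the topic `test` already created. The helper names (`encode`, `decode`, `produce`, `consume`) are made up for illustration.

```python
BOOTSTRAP = "localhost:9092"  # broker address from the example above
TOPIC = "test"                # topic created in the example above


def encode(text):
    """Kafka record values are bytes; encode text as UTF-8."""
    return text.encode("utf-8")


def decode(raw):
    """Decode a UTF-8 record value back to text."""
    return raw.decode("utf-8")


def produce(messages, topic=TOPIC, servers=BOOTSTRAP):
    """Send each message to the topic, like kafka-console-producer.sh."""
    # Imported here so the module loads even without the client installed.
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers=servers)
    for message in messages:
        producer.send(topic, value=encode(message))
    producer.flush()  # block until buffered records have been sent
    producer.close()


def consume(topic=TOPIC, servers=BOOTSTRAP, timeout_ms=5000):
    """Read the topic from the beginning, like --from-beginning."""
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        topic,
        bootstrap_servers=servers,
        auto_offset_reset="earliest",    # start at the oldest record
        consumer_timeout_ms=timeout_ms,  # stop iterating when idle
    )
    try:
        return [decode(record.value) for record in consumer]
    finally:
        consumer.close()
```

With the broker running, calling `produce(["hello", "test"])` followed by `consume()` should return the messages written to the topic so far. The `timeout_ms` value is an arbitrary choice so that `consume()` returns instead of waiting forever.
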