25.12.11

ActiveMQ–KahaDB Architecture

Synchronous dispatch through a persistent broker

After receiving a message from a producer, the broker dispatches the message to the consumers, as follows:

  1. The broker pushes the message into the message store. Assuming that the enableJournalDiskSyncs option is true, the message store also writes the message to disk, before the broker proceeds.

  2. The broker now sends the message to all of the interested consumers (but does not wait for consumer acknowledgments). For topics, the broker dispatches the message immediately, while for queues, the broker adds the message to a destination cursor.

  3. The broker then sends a receipt back to the producer. The receipt can thus be sent back before the consumers have finished acknowledging messages (in the case of topic messages, consumer acknowledgments are usually not required anyway).
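The ordering of these three steps can be sketched as a toy model. This is not ActiveMQ code: the class `SyncBroker`, its `journal_disk_syncs` flag (standing in for the enableJournalDiskSyncs option), and the `events` trace are all hypothetical names used only for illustration.

```python
class SyncBroker:
    """Toy model of synchronous dispatch; not the real broker."""

    def __init__(self, consumers, journal_disk_syncs=True):
        self.consumers = consumers              # callables that receive a message
        self.journal_disk_syncs = journal_disk_syncs
        self.events = []                        # ordered trace of what happened

    def on_producer_send(self, message):
        # Step 1: push the message into the store and, if enabled,
        # sync it to disk before doing anything else.
        self.events.append("store")
        if self.journal_disk_syncs:
            self.events.append("disk_sync")
        # Step 2: dispatch to every interested consumer without
        # waiting for acknowledgments.
        for deliver in self.consumers:
            deliver(message)
            self.events.append("dispatch")
        # Step 3: only now send the receipt back to the producer.
        self.events.append("receipt")
        return "receipt"


received = []
broker = SyncBroker(consumers=[received.append, received.append])
broker.on_producer_send("hello")
print(broker.events)
```

The point of the sketch is the ordering: the producer's receipt comes after the disk sync, so with disk syncs enabled the producer always pays the cost of the write.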

Concurrent store and dispatch

  • Concurrent store and dispatch is enabled by default for queues.

After receiving a message from a producer, the broker dispatches the message to the consumers, as follows:

  1. The broker pushes the message onto the message store and, concurrently, sends the message to all of the interested consumers. After sending the message to the consumers, the broker then sends a receipt back to the producer, without waiting for consumer acknowledgments or for the message store to synchronize to disk.

  2. As soon as the broker receives acknowledgments from all the consumers, the broker removes the message from the message store. Because consumers typically acknowledge messages faster than a message store can write them to disk, this often means that the write to disk is optimized away entirely. That is, the message is removed from the message store before it is ever physically written to disk.
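The "optimized away" write can be illustrated with a small single-threaded sketch. The real broker does this with concurrent tasks; here the class `ConcurrentStoreBroker`, the `pending` and `on_disk` maps, and the explicit `flush()` call are hypothetical stand-ins for the store's background writer.

```python
class ConcurrentStoreBroker:
    """Toy model of concurrent store and dispatch; not the real broker."""

    def __init__(self):
        self.pending = {}      # message id -> body, queued but not yet written
        self.on_disk = {}      # message id -> body, physically written
        self.disk_writes = 0   # how many writes actually reached "disk"

    def on_producer_send(self, msg_id, body):
        # The store only queues the write; dispatch to consumers would
        # happen at the same time. The receipt does not wait for either
        # the disk sync or the consumer acknowledgments.
        self.pending[msg_id] = body
        return "receipt"

    def on_all_consumers_acked(self, msg_id):
        # Remove the message from the store. If it is still pending,
        # its disk write is simply never performed.
        if msg_id in self.pending:
            del self.pending[msg_id]
        else:
            self.on_disk.pop(msg_id, None)

    def flush(self):
        # Stand-in for the background writer catching up.
        for msg_id, body in self.pending.items():
            self.on_disk[msg_id] = body
            self.disk_writes += 1
        self.pending.clear()


broker = ConcurrentStoreBroker()
broker.on_producer_send(1, "fast consumer")
broker.on_all_consumers_acked(1)   # acked before the writer ran: no disk write
broker.on_producer_send(2, "slow consumer")
broker.flush()                     # writer ran first: one disk write
broker.on_all_consumers_acked(2)
print(broker.disk_writes)
```

Only the message whose acknowledgment arrived after the flush costs a disk write; the other one never touches the disk.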

KahaDB Architecture

  • The bulk of the data is stored in rolling journal files (data logs), to which all broker events are continuously appended.
  • In particular, pending messages are also stored in the data logs.
  • A B-tree index facilitates rapid retrieval of messages from the data logs.
    • The complete B-tree index is stored on disk, and part or all of it is held in a cache in memory.
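The split between append-only data logs and a separate index can be sketched as follows. This is a minimal illustration, not KahaDB's implementation: `RollingJournal` is a hypothetical name, each `bytearray` stands in for one data log file, and a plain `dict` stands in for the B-tree index.

```python
class RollingJournal:
    """Toy append-only journal with a separate index; not KahaDB itself."""

    def __init__(self, max_log_bytes=32):
        self.max_log_bytes = max_log_bytes
        self.logs = [bytearray()]   # each bytearray stands in for one data log
        self.index = {}             # message id -> (log number, offset, length)

    def append(self, msg_id, payload):
        log = self.logs[-1]
        # Roll over to a new data log when the current one would overflow.
        if log and len(log) + len(payload) > self.max_log_bytes:
            log = bytearray()
            self.logs.append(log)
        offset = len(log)
        log.extend(payload)         # data is only ever appended
        self.index[msg_id] = (len(self.logs) - 1, offset, len(payload))

    def read(self, msg_id):
        # The index gives the exact location, so no log scan is needed.
        log_no, offset, length = self.index[msg_id]
        return bytes(self.logs[log_no][offset:offset + length])

    def remove(self, msg_id):
        # In this sketch, deleting touches only the index; the bytes
        # stay in the data log.
        del self.index[msg_id]


journal = RollingJournal(max_log_bytes=16)
journal.append("m1", b"first message")    # 13 bytes, fits in log 0
journal.append("m2", b"second message")   # 14 bytes, forces a roll to log 1
print(journal.read("m2"), len(journal.logs))
```

Appends never rewrite existing bytes and lookups go through the index alone, which is why both operations stay cheap regardless of how much data the logs hold.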

So the secret of KahaDB is its rolling data storage plus B-tree indexing, which is good for fast appending and fast deleting, but not good for storing data, which is something that traditional RDBMSs try to avoid.

I designed similar storage solutions for three systems, based on different read/write requirements. I tried a flat index file with B-tree storage. I also tried in-memory indexing. The common part is that we need to separate the indexing from the massive data storage.
