Apache Pulsar (incubating) is an enterprise-grade publish-subscribe (aka pub-sub) messaging system that was originally developed at Yahoo. Pulsar was first open-sourced in late 2016, and is now undergoing incubation under the auspices of the Apache Software Foundation. At Yahoo, Pulsar has been in production for over three years, powering major applications like Yahoo! Mail, Yahoo! Finance, Yahoo! Sports, Flickr, the Gemini Ads platform, and Sherpa, Yahoo’s distributed key-value store.
An application that feeds data into Pulsar is called a producer, while an application that consumes data from Pulsar is called a consumer. Consumer applications are also sometimes referred to as subscribers. Following the general pub-sub pattern, the topic is the core resource in Pulsar. Loosely speaking, a topic represents a channel into which producers append data and from which consumers pull data. This is shown in Figure 1.
Figure 1: Producers, Consumers and Topics
Pulsar was built from the ground up to support multi-tenant use cases. Pulsar supports two multi-tenancy-specific resources to enable multi-tenancy: properties and namespaces. A property represents a tenant in the system. To give an example, imagine a Pulsar cluster deployed to support a wide variety of applications (as was the case with Pulsar at Yahoo). Within a Pulsar cluster, each property can represent a team in the enterprise, a core feature, or a product line, to give just a few examples. Each property, in turns, can contain several namespaces, for example one namespace for each application or use case. A namespace can then contain any number of topics. This hierarchy is shown in Figure 2.
Figure 2: Relationship between Pulsar concepts
The namespace is the basic administrative unit in Pulsar. At the namespace level, you can set permissions, fine-tune replication settings, manage geo-replication of message data across clusters, control message expiry, and perform critical operations. All topics in a namespace inherit the same settings, which provides a powerful administrative lever for configuring many topics at once. There are two different types of namespace based on the level at which the namespace is visible:
- Local - A local namespace is visible only to the cluster in which it is defined.
- Global - In this case, the namespace is visible across multiple clusters, be they within a data center or geographically separated. This depends on whether replication is enabled across clusters in the namespace's settings.
While local and global namespaces differ based on scope, both can be shared across different teams or different organizations with the appropriate settings. Once the permission to write to a namespace is provided, an application can write to any topic within the namespace. If a topic does not exist, it is created on the fly when a producer first writes to it.
As mentioned before, each namespace can have one or more topics; each topic can have multiple subscriptions; and each subscription is set to retain and receive all messages published on the topic. To provide even more flexibility to application, Pulsar enables three different types of subscriptions that can coexist on the same topic:
- Exclusive subscription - For this type of subscription, there can be only a single consumer at any given time.
- Shared subscription - In this case, multiple consumers can attach to the same subscription and each consumer will receive a fraction of the messages.
- Failover subscription - With failover subscriptions, multiple consumers are allowed to connect to a topic but only one consumer will receive messages at any given time. The other consumers will start receiving messages only when the current receeiving consumer fails.
The three different types of subscriptions are illustrated in Figure 3. Subscriptions provide a major advantage in that they decouple how messages are produced and consumed. With support for different types of subscriptions, Pulsar provides resilience in applications without increasing complexity in development.
Figure 3: Different types of Pulsar subscriptions
The data fed into a topic can range from a few megabytes (small data) to several terabytes ("Big Data"). This means that topics need to be able to sustain steady low throughput in some cases and very high throughput in others, depending on the number of consumers. So what happens when high throughput is required on one topic and lower throughput is required on another? To address this problem, Pulsar enables you to shard the data in a topic and store it on multiple machines. These shards of data are called partitions.
Using partitions across a set of machines is a common approach for processing large volumes of data across several nodes while also achieving high throughput. By default, Pulsar topics are created as non-partitioned topics but you can very easily—using a simple CLI command or API call—create partitioned topics and assign them a specific number of partitions.
When you create a partitioned topic, Pulsar automatically partitions the data and ensures that consumers and producers can be partitioning agnostic. That is, an application that was written with a single topic can still work when the topic is partitioned, with no code changes, making partitioning a purely administrative—rather than application-level—concern.
In Pulsar, partitioning of topics is handled by a process called a broker; each node in a Pulsar cluster runs its own broker. Figure 4 shows in detail how partitions are handled by broker nodes.
Figure 4. Partitioning a topic to multiple Pulsar brokers
While applications can take advantage of partitioning with no code changes, there are a few additional hooks that can help you achieve a better distribution of data across partitions and across available consumers. Pulsar provides control over how messages are routed to a particular partitions by enabling you to select a routing policy. There are four basic routing policies:
- Single partitioning - The producer picks one random partition and routes data to that partition only. This model retains the same guarantees provided by non-partitioned topics but can be helpful when many producers are publishing to a topic.
- Round robin partitioning - In this case, the producer distributes data evenly across all the partitions in a round robin fashion. The first message goes into the first partition, the second message goes to the second partition, and so on.
- Hash partitioning - In this case, every message carries a key. The choice of the partition is based on the value generated by applying a hash function on the message key. Hash partitioning will guarantee ordering based on the key.
- Custom partitioning - Here, the producer uses a custom function that takes the message and produces a partition number. The message is then directed to that partition. Custom partitioning enables the application full control over partitioning logic.
Once a Pulsar broker receives and acknowledges data that it receives from a producer on a topic, it needs to ensure that the data is never lost under any circumstances. Unlike several other messaging systems, Pulsar achieves durability using Apache BookKeeper, which provides low-latency persistent storage. When a Pulsar broker receives messages, it sends the message data to multiple BookKeeper nodes (based on the replication factor). These nodes write the data into a write-ahead log and also write a copy to memory. Before the node sends out an acknowledgement, the log is force-written to stable storage. By force-writing the log to storage, the data is retain even in the incident of a power outage. Since the Pulsar broker writes to multiple BookKeeper nodes, it sends an acknowledgement to the producer only when the data has been successfully written to a write quorum. Pulsar is thereby able to ensure zero data loss even in the presence of hardware failures, network partitions, and other failure states. We will delve into further details of how Pulsar achieves full durability in a later blog post.
Pulsar currently powers major Yahoo applications like Mail, Finance, Sports, Gemini Ads, and Sherpa, Yahoo’s distributed key-value service. Many use cases require strong durability guarantees—for example zero data loss—while also meeting stringent performance requirements. Pulsar was deployed beginning in 2015 and is now run by Yahoo at massive scale. Pulsar:
- Is deployed globally, in 10+ datacenters, with full mesh replication capability
- Handles over 100 billion published messages a day
- Serves more than 1.4 million topics
- Provides an average publish latency of less than 5 milliseconds across the entire service
In this blog post, we provided a simple introduction to the concepts behind Apache Pulsar (incubating) and showed how Pulsar achieves durability by committing data before sending acknowledgements to consumers, enables high throughput by partitioning data, and more. In subsequent blog posts, we'll delve more deeply into Pulsar's overall architecture and into individual features, and we'll provide guides to using Pulsar in a way that takes full advantage of its many strengths.
This post features contributions from Matteo Merli and Karthik Ramasamy.