The major difference between the queuing model and the Pub/Sub model is that, in the Pub/Sub model, every consumer gets its own copy of each message, whereas in the queuing model each message is delivered to only one of the consumers. This can be understood with the following diagram:
As you can see, Consumer 1 and Consumer 2 both get copies of the m1 and m2 messages.
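The fan-out behavior can also be sketched in a few lines of code. The following is a minimal in-memory illustration, not the API of any real broker; the `Topic` class and its `subscribe` and `publish` methods are hypothetical names introduced only for this example.

```python
class Topic:
    """Toy in-memory topic (hypothetical, for illustration only)."""

    def __init__(self):
        # Maps each consumer name to the list of messages it has received.
        self.inboxes = {}

    def subscribe(self, consumer_name):
        # Each subscriber gets its own, initially empty, inbox.
        self.inboxes[consumer_name] = []

    def publish(self, message):
        # Pub/Sub semantics: every subscriber receives a copy of the message.
        for inbox in self.inboxes.values():
            inbox.append(message)


topic = Topic()
topic.subscribe("consumer1")
topic.subscribe("consumer2")

for message in ("m1", "m2"):
    topic.publish(message)

for name, inbox in topic.inboxes.items():
    print(name, inbox)
```

In a queuing model, `publish` would instead hand each message to exactly one of the consumers; here, both inboxes end up holding m1 and m2.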
This model is useful when you want several consumers to work on the same messages. This is usually the norm in event-driven architecture (EDA), which we saw previously.
A natural question is whether this means that load balancing can't be done in a Pub/Sub setup. If every consumer receives every message, won't that interfere with scalability?
Generally, messaging systems also provide load-balancing semantics on Topics through a feature called virtual topics (the term is specific to ActiveMQ, but equivalent functionality is available in most messaging systems). The scheme is shown in the following diagram:
As you can see, the Topic still does Pub/Sub-style message routing, but each consumer effectively gets its own Subscription Queue. Multiple instances of the same consumer read from that shared queue and therefore receive different messages, which enables scale-out and load balancing for the consumer. It should be noted that the Subscription Queue is created automatically when a consumer registers for consumption against a Topic; it doesn't need to be created manually.
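The scheme can be sketched as follows. This is a simplified in-memory model under stated assumptions, not ActiveMQ's actual API: the `VirtualTopic` and `ConsumerInstance` classes, and the "billing" and "audit" subscription names, are hypothetical and exist only for illustration.

```python
from collections import deque


class VirtualTopic:
    """Toy topic that fans out to one queue per subscription (hypothetical)."""

    def __init__(self):
        # Maps each subscription name to its Subscription Queue.
        self.subscription_queues = {}

    def register(self, subscription):
        # The Subscription Queue is created automatically on first registration;
        # later instances of the same consumer get the same queue back.
        return self.subscription_queues.setdefault(subscription, deque())

    def publish(self, message):
        # Pub/Sub routing: every Subscription Queue gets a copy.
        for queue in self.subscription_queues.values():
            queue.append(message)


class ConsumerInstance:
    """One running instance of a consumer, bound to a subscription."""

    def __init__(self, name, topic, subscription):
        self.name = name
        self.queue = topic.register(subscription)
        self.received = []

    def poll(self):
        # Queue semantics: a message taken by one instance is gone for the others.
        if not self.queue:
            return False
        self.received.append(self.queue.popleft())
        return True


topic = VirtualTopic()
billing_a = ConsumerInstance("billing-a", topic, "billing")
billing_b = ConsumerInstance("billing-b", topic, "billing")
audit = ConsumerInstance("audit", topic, "audit")

for message in ("m1", "m2", "m3", "m4"):
    topic.publish(message)

# Let every instance poll in turn until all queues are drained.
instances = [billing_a, billing_b, audit]
while any([instance.poll() for instance in instances]):
    pass

for instance in instances:
    print(instance.name, instance.received)
```

The two "billing" instances split the four messages between them, while the single "audit" instance, which has its own Subscription Queue, still sees all four. A real broker would deliver messages concurrently and handle acknowledgements and redelivery, which this sketch leaves out.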