
Publish-Subscribe

posted Aug 6, 2013, 2:40 AM by Joe Pritzel   [ updated Aug 7, 2013, 12:07 AM ]
So, I was looking for a publish-subscribe library and tried Guava first, but its throughput was far too low. I kept looking until I found MBassador. I tried it out and got much better performance, but I wanted to see if I could do better still. So, I created my own library and called it Feather.

Now, I needed a way to compare Guava, MBassador, and Feather. I had previously run into the benchmarks here, but their usage pattern was quite different from mine (I need few subscribers per type and high message throughput). So, after I ran them and saw that Feather was the fastest according to those, I decided to create my own. These should be more representative of a large number of messages coming from a single source, with no changes to subscriptions. I've attached the code that was used in the file called "First Test.zip". I should mention that the number of messages displayed is not the number of actual messages that were published/read. To get that number, you need to multiply the number of messages by the number of subscribers.
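To make that usage pattern concrete, here is a rough sketch of a single-source test. This is not the attached code: `TinyBus` is a made-up stand-in, not the API of Guava, MBassador, or Feather, and the message and subscriber counts in `main` are arbitrary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class SingleSourceBenchmark {
    // Hypothetical minimal bus, only here to keep the sketch self-contained.
    static final class TinyBus {
        private final List<Consumer<Object>> subscribers = new ArrayList<>();

        void subscribe(Consumer<Object> subscriber) {
            subscribers.add(subscriber);
        }

        // Delivers the message synchronously to every subscriber.
        void publish(Object message) {
            for (Consumer<Object> subscriber : subscribers) {
                subscriber.accept(message);
            }
        }
    }

    // Publishes `msgs` messages from one source to `subs` fixed subscribers
    // and returns the number of messages actually delivered.
    static long run(int msgs, int subs) {
        TinyBus bus = new TinyBus();
        long[] delivered = {0};
        for (int i = 0; i < subs; i++) {
            bus.subscribe(message -> delivered[0]++);
        }
        Object message = new Object();
        for (int i = 0; i < msgs; i++) {
            bus.publish(message);
        }
        return delivered[0];
    }

    public static void main(String[] args) {
        int msgs = 1_000_000; // arbitrary value for illustration
        int subs = 10;        // arbitrary value for illustration
        long start = System.nanoTime();
        long delivered = run(msgs, subs);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(delivered + " deliveries in " + elapsedMs + " ms");
    }
}
```

Calling `run(1000, 10)` returns 10,000 deliveries, which is the "messages times subscribers" figure mentioned above.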

So, without further ado, here are my initial results (after some cleaning up of the data).

Publish Subscribe Data

So, let me explain what I did to the raw data... Basically, I merged the two runs for each sub-test by taking the lower of the two timings for each measurement. This is because Guava got caught in a GC cycle a few times, which massively inflated some of its timings. Some of the Guava timings still have GC time in them, but that's because Guava generated so much garbage and took so long that I couldn't avoid it.
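The merge step is simple enough to sketch. This is only my illustration of "take the lower timing from the two runs"; the class name, method name, and the timings in `main` are made up and are not taken from the actual data.

```java
import java.util.Arrays;

class MergeRuns {
    // Returns, for each measurement, the lower of the two timings.
    static long[] mergeByMinimum(long[] runA, long[] runB) {
        if (runA.length != runB.length) {
            throw new IllegalArgumentException("runs must have the same number of timings");
        }
        long[] merged = new long[runA.length];
        for (int i = 0; i < runA.length; i++) {
            merged[i] = Math.min(runA[i], runB[i]);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Made-up timings in milliseconds; 950 and 890 play the role of GC-inflated outliers.
        long[] firstRun = {120, 950, 131};
        long[] secondRun = {125, 128, 890};
        System.out.println(Arrays.toString(mergeByMinimum(firstRun, secondRun)));
        // prints [120, 128, 131]
    }
}
```

Taking the minimum throws away a GC spike as long as it only hit one of the two runs, which is why some GC time can still survive in the merged numbers.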

I wanted to see how far I could push MBassador and Feather. It got to the point where both Feather and MBassador were hitting memory limits and spending a lot of time in GC.

Anyway, the changes from the first test are as follows:
Removed testing of Guava.
Changed the msgs array to have the values of 10,000,000.
Changed the subs array to 1, 10, 25, 50 and 100.

I ended up with this data (after the same process as before):

Publish Subscribe Data 2


At this point, I realized I couldn't scale the tests further with messages or subscribers because of the limits imposed by the hardware I am running on. So, I decided to create a different test.

All this test does is use messages to count. This test is different in that there isn't a single, external source of messages. Each message (except for the initial one) must be generated by a message. This should give a better sense of the throughput than the previous test, at least for how I'm going to be using it. The code for this is attached in the file "Second Test.zip". I should mention that the number of messages displayed is not the number of actual messages that were published/read. To get that number, you need to multiply the number of messages by the number of subscribers and then add the number of subscribers.
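Here is a rough sketch of the idea, not the attached code. I'm assuming one counting chain per subscriber, where each subscriber answers a count with the next count until it reaches the message limit; the `Bus` class, its per-subscriber channels, and its queue-based dispatch are all made up for this illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.IntConsumer;

class CountingTest {
    // Hypothetical minimal bus: one channel per subscriber, with pending
    // messages queued so long counting chains do not recurse deeply.
    static final class Bus {
        private final List<IntConsumer> subscribers = new ArrayList<>();
        private final Queue<int[]> pending = new ArrayDeque<>(); // entries are {channel, count}
        long delivered = 0;

        void subscribe(IntConsumer handler) {
            subscribers.add(handler);
        }

        void publish(int channel, int count) {
            pending.add(new int[] {channel, count});
        }

        // Delivers queued messages until no handler publishes anything new.
        void drain() {
            int[] next;
            while ((next = pending.poll()) != null) {
                delivered++;
                subscribers.get(next[0]).accept(next[1]);
            }
        }
    }

    // Each subscriber counts from 0 up to `msgs`, one message per step.
    static long run(int msgs, int subs) {
        Bus bus = new Bus();
        for (int i = 0; i < subs; i++) {
            int channel = i;
            bus.subscribe(count -> {
                if (count < msgs) {
                    bus.publish(channel, count + 1); // a message generates the next message
                }
            });
        }
        for (int channel = 0; channel < subs; channel++) {
            bus.publish(channel, 0); // the only externally generated messages
        }
        bus.drain();
        return bus.delivered;
    }

    public static void main(String[] args) {
        System.out.println(run(1000, 5)); // prints 5005
    }
}
```

With 1,000 messages and 5 subscribers this delivers 1,000 × 5 + 5 = 5,005 messages, matching the "multiply, then add the number of subscribers" rule: the extra 5 are the initial messages that start each chain.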

So, here are the results:

Publish Subscribe Data 3


Because memory usage was so low in that test and performance was good, I decided to bump up the number of subscribers.

I set the number of messages to 100,000 and set subs to 100 and 500.

Here is the data:

Publish Subscribe Data 4


First Test.zip (3k), Joe Pritzel, Aug 7, 2013, 12:06 AM
Second Test.zip (4k), Joe Pritzel, Aug 6, 2013, 11:18 PM