Embedded Google PubSub Emulator for Testing

Recently I’ve been working with Google Cloud Platform Pub/Sub for various Java applications and Apache Beam pipelines. Pub/Sub offers an effortless, cloud-native messaging topic for your software applications or big-data processing requirements.

One of the problems I faced was running the Pub/Sub emulator locally as part of my integration test suite. After hunting for ways to do this, I’m happy to present the following code snippets for whoever out there is looking for the same solution.

Dependencies

<dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-pubsub</artifactId>
    <version>1.35.0</version>
</dependency>
<dependency>
    <groupId>org.testcontainers</groupId>
    <artifactId>testcontainers</artifactId>
    <version>1.7.0</version>
</dependency>

Usage
EmbededPubSub embededPubSub = new EmbededPubSub();
embededPubSub.start();
embededPubSub.subscriber(message -> System.out.println(" @@ Received : " + message), 10000);
embededPubSub.publish("Hello Pub/Sub!");
embededPubSub.stop();

I’ve been utilising Testcontainers to spin up a container which runs the gcloud CLI components, and which then exposes a few ports for the host computer to communicate through.
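As a rough sketch of what such a wrapper can look like internally, the class below starts the emulator inside a gcloud container via Testcontainers. The image name, command, class name, and port here are my assumptions for illustration, not the exact code from this post; adjust them to your setup.

```java
import org.testcontainers.containers.GenericContainer;

// Sketch of an embedded Pub/Sub emulator backed by Testcontainers.
// "google/cloud-sdk:latest" and port 8085 are assumptions.
public class EmbeddedPubSubSketch {

    static final int EMULATOR_PORT = 8085;

    private final GenericContainer container =
            new GenericContainer("google/cloud-sdk:latest")
                    .withExposedPorts(EMULATOR_PORT)
                    .withCommand("gcloud", "beta", "emulators", "pubsub", "start",
                                 "--host-port=0.0.0.0:" + EMULATOR_PORT);

    public void start() {
        container.start();
    }

    public void stop() {
        container.stop();
    }

    // Endpoint the host should use to reach the emulator once started
    public String endpoint() {
        return endpoint(container.getContainerIpAddress(),
                        container.getMappedPort(EMULATOR_PORT));
    }

    // Pure helper: formats host + mapped port into an emulator endpoint string
    static String endpoint(String host, int port) {
        return host + ":" + port;
    }
}
```

The returned endpoint can then be fed into the Pub/Sub client’s channel configuration instead of the default `pubsub.googleapis.com` target.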

Next, if you ever need to pull or browse messages, subscriptions, or other information from the local Pub/Sub emulator for debugging purposes, unfortunately the `gcloud` CLI tools cannot be used for this at the time of writing this blog post. The workarounds are either to write client code or to use the Pub/Sub REST API.

Pub/Sub REST API

https://cloud.google.com/pubsub/docs/reference/rest/
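For example, the topics of a project can be listed by issuing a plain HTTP GET against the emulator. A minimal sketch using only the JDK follows; the host/port (`localhost:8085`) and project id (`my-project`) are assumptions for illustration.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// Minimal sketch: browse topics on a local Pub/Sub emulator via the REST API.
public class EmulatorRestClient {

    // Pure helper: builds the REST path for listing a project's topics
    static String listTopicsUrl(String hostPort, String project) {
        return "http://" + hostPort + "/v1/projects/" + project + "/topics";
    }

    public static void main(String[] args) throws Exception {
        URL url = new URL(listTopicsUrl("localhost:8085", "my-project"));
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line); // JSON listing of topics
            }
        }
    }
}
```

The same pattern works for subscriptions (`/v1/projects/{project}/subscriptions`) when you need to inspect the emulator’s state during a debugging session.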

I hope this article will be useful for somebody who couldn’t get their head around Pub/Sub integration testing.

Streaming MySQL change sets to Kafka / AWS Kinesis

Streaming the data changes happening in MySQL is a fairly new subject, but I think this particular way of capturing change sets is going to be used more widely with other data stores in future. AWS DynamoDB already supports streams natively, without any middleware.

I’ve used these CDC (Change Data Capture) streams in one of my previous projects in order to move events to Kafka for further processing; some people use similar behaviour to invalidate caches as changes happen at the MySQL table-row level, and so on.

I’ve used the Debezium/Kafka Connect project for moving MySQL CDC events to Kafka, which is described in the following slides in more detail.
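For reference, registering a Debezium MySQL connector with Kafka Connect typically means POSTing a JSON payload like the following to the Connect REST API. The hostnames, credentials, database name, and topic names below are placeholders, not values from my project:

```json
{
  "name": "mysql-cdc-connector",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "mysql",
    "database.port": "3306",
    "database.user": "debezium",
    "database.password": "secret",
    "database.server.id": "184054",
    "database.server.name": "dbserver1",
    "database.whitelist": "inventory",
    "database.history.kafka.bootstrap.servers": "kafka:9092",
    "database.history.kafka.topic": "schema-changes.inventory"
  }
}
```

Once registered, Debezium tails the MySQL binlog and emits one Kafka topic per captured table.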

For more information about Debezium visit http://debezium.io/.

Recently I read an article about how you can achieve similar behaviour with an AWS RDS or standard MySQL instance and move CDC events to AWS Kinesis with minimal effort. It’s a great article, please take a look!

https://aws.amazon.com/blogs/database/streaming-changes-in-a-database-with-amazon-kinesis/

I think this particular technology and behaviour opens up many possibilities; let me know if you think it could be useful for your future implementations or for a current business problem you’re trying to resolve.