Google Cloud Storage integration with Apache Camel
The rising adoption of cloud services brings the need to integrate them with new and existing applications. Apache Camel is an open source integration framework that lets developers quickly and easily integrate various systems that consume or produce data. It is widely used because it is a small library with minimal dependencies, which makes it easy to embed in Java applications. For more information about Camel and its documentation, visit camel.apache.org.
I am a Camel fan: I have used it in many projects, and I have now decided to start contributing to it on GitHub. As a start, I decided to help write missing components, and my first official contribution is the Google Cloud Storage component. After some work, tests and pull requests, I am proud to announce that, starting with version 3.9.0, Camel ships with the Google Cloud Storage component. It can be used to quickly integrate your Java applications with Google Cloud Storage buckets. Let's see how to use it in practice.
The Google Cloud Storage Component
Before starting with the example, let's introduce the component. The Google Cloud Storage component has the following endpoint:
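As a sketch of the URI shape documented for the component, where bucketNameOrArn stands for the bucket name and [options] for the query parameters (both are placeholders):

```
google-storage://bucketNameOrArn?[options]
```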
It supports both producer and consumer.
Behind the scenes, the component uses a Google Storage client to interact with the cloud. Authentication is designed for GCP service accounts. For more information, please refer to the Google Storage Auth Guide. Once you have generated the service account key, you can provide the credentials to your application code in two ways:
- Directly by the endpoint:
- Or by setting the environment variable GOOGLE_APPLICATION_CREDENTIALS:
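The two approaches can be sketched with plain string handling. This is a minimal illustration and not part of the component: the class StorageEndpoints, the method endpoint and the paths are hypothetical. It assumes the serviceAccountKey URI option for the first approach. When no key path is given, the URI carries no credentials and the Google client is expected to fall back to GOOGLE_APPLICATION_CREDENTIALS.

```java
final class StorageEndpoints {

    private StorageEndpoints() {
    }

    /**
     * Builds a google-storage endpoint URI for the given bucket.
     * keyPath may be null or empty: in that case no credentials are put in
     * the URI and GOOGLE_APPLICATION_CREDENTIALS must be set in the environment.
     */
    static String endpoint(String bucket, String keyPath) {
        String uri = "google-storage://" + bucket;
        if (keyPath != null && !keyPath.isEmpty()) {
            // First way: pass the service account key file directly on the endpoint.
            uri += "?serviceAccountKey=" + keyPath;
        }
        // Second way: nothing to add here, the environment variable is read by the Google client.
        return uri;
    }
}
```

For example, StorageEndpoints.endpoint("myCamelBucket", "/home/user/Downloads/my-key.json") returns google-storage://myCamelBucket?serviceAccountKey=/home/user/Downloads/my-key.json, while passing null as the key path returns google-storage://myCamelBucket, which relies on the environment variable being exported before the application starts.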
The consumer
The consumer reads the objects inside a bucket and sends them to a destination. It is a ScheduledBatchPollingConsumer, which means that it polls the bucket at regular intervals. To prevent objects from being reprocessed, the consumer can be configured to automatically move each processed object to another bucket. The following options control this behavior:
- moveAfterRead
- destinationBucket
- deleteAfterRead.
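Put together, a consumer using these options could look like the sketch below. It assumes the Java DSL and that credentials come from GOOGLE_APPLICATION_CREDENTIALS. The class name, the bucket names and the log endpoint are illustrative.

```java
import org.apache.camel.builder.RouteBuilder;

public class ConsumerRouteSketch extends RouteBuilder {

    @Override
    public void configure() {
        // Poll source-bucket. Once an exchange completes successfully, the object
        // is expected to be copied to consumed-bucket and deleted from source-bucket.
        from("google-storage://source-bucket"
                + "?moveAfterRead=true"
                + "&destinationBucket=consumed-bucket"
                + "&deleteAfterRead=true")
            // Illustrative destination: just log each consumed object.
            .to("log:consumed-objects");
    }
}
```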
Note that the bucket names in this example are not real. If you use them as they are, the route may fail, because bucket names are globally unique across Google Cloud Storage.
The producer
The Google Storage component provides the following operations on the producer side:
- copyObject
- listObjects
- deleteObject
- deleteBucket
- listBuckets
- getObject
- createDownloadLink
To upload a file to a bucket, we can define a route like this:
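A minimal sketch of such an upload route, modeled on the usage shown in the component documentation. The endpoint direct:upload, the object name camel.txt, the bucket name and the key path are placeholders.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.component.google.storage.GoogleCloudStorageConstants;

public class UploadRouteSketch extends RouteBuilder {

    @Override
    public void configure() {
        from("direct:upload")
            .process(exchange -> {
                // The object name travels as a header, the content as the message body.
                byte[] payload = "Camel rocks!".getBytes(StandardCharsets.UTF_8);
                exchange.getIn().setHeader(GoogleCloudStorageConstants.OBJECT_NAME, "camel.txt");
                exchange.getIn().setBody(new ByteArrayInputStream(payload));
            })
            // With no operation set, the producer uploads the body to the bucket.
            .to("google-storage://myCamelBucket?serviceAccountKey=/home/user/Downloads/my-key.json")
            .log("uploaded object: ${header.CamelGoogleCloudStorageObjectName}");
    }
}
```

The other operations in the list are selected with the operation option on the endpoint, for example operation=listObjects.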
For more usage examples, see the Camel Google Storage component documentation.
Now that we know how to use the component, let’s use it for a real example.
Use case: Thumbnail creator
This use case shows how to automate the creation of thumbnails for a set of images. Suppose that we have a “source-bucket” where we put some images, and that we want to automatically create the thumbnail of each image and store it in another bucket.
We can define a route that starts from the “source-bucket”. A Google Storage Camel consumer polls the source-bucket to check for unprocessed images. For each new file, it loads the image and moves the file, unmodified, to the “consumed-bucket”. The loaded image is then processed by the “thumbnail-processor”, which implements the resizing logic. The resulting image is sent to a Google Storage producer that writes it to a “result-bucket”. Finally, another producer creates a download URL for the object.
The use case can be represented by the following schema:
Development details: Camel main
All the source code is available on my GitHub at the following link: Camel experiments.
The quickest way to start is to use the Camel Maven archetypes to create a new Maven project that runs Camel standalone:
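A sketch of the archetype invocation, assuming the camel-archetype-main archetype and the 3.9.0 release in which the component first shipped. Maven then prompts for the groupId, artifactId and version of the new project.

```
mvn archetype:generate \
  -DarchetypeGroupId=org.apache.camel.archetypes \
  -DarchetypeArtifactId=camel-archetype-main \
  -DarchetypeVersion=3.9.0
```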
Next, we need to set up the pom.xml file by adding the dependency on the Camel Google Storage component:
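A sketch of the dependency entries. It assumes that the generated pom.xml imports the Camel BOM, so camel-google-storage needs no explicit version, and it uses a hypothetical thumbnailator.version property in place of a concrete version number.

```xml
<dependency>
  <groupId>org.apache.camel</groupId>
  <artifactId>camel-google-storage</artifactId>
</dependency>
<dependency>
  <groupId>net.coobird</groupId>
  <artifactId>thumbnailator</artifactId>
  <version>${thumbnailator.version}</version>
</dependency>
```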
Note that I also added the dependency on the thumbnailator library, which is used to implement the processor that creates the thumbnails.
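To make concrete what such a processor has to do, here is a dependency-free sketch of the resize step. It deliberately swaps in plain java.awt for the thumbnailator library, so it is an illustration of the technique and not the code used in the project. The class name ThumbnailSketch and the policy of fitting the image inside a bounding box without upscaling are assumptions.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

final class ThumbnailSketch {

    private ThumbnailSketch() {
    }

    /**
     * Scales the image down so that it fits inside maxWidth x maxHeight,
     * keeping the aspect ratio. Images that already fit are copied unscaled.
     */
    static BufferedImage resize(BufferedImage source, int maxWidth, int maxHeight) {
        // Use the smaller of the two ratios so both dimensions fit; never go above 1.0.
        double scale = Math.min(1.0, Math.min(
                (double) maxWidth / source.getWidth(),
                (double) maxHeight / source.getHeight()));
        int width = Math.max(1, (int) Math.round(source.getWidth() * scale));
        int height = Math.max(1, (int) Math.round(source.getHeight() * scale));

        BufferedImage thumbnail = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D graphics = thumbnail.createGraphics();
        try {
            // Bilinear interpolation gives smoother results than the default when shrinking.
            graphics.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            graphics.drawImage(source, 0, 0, width, height, null);
        } finally {
            graphics.dispose();
        }
        return thumbnail;
    }
}
```

With a 400x200 source and a 100x100 bounding box the result is 100x50. An image that is already smaller than the box keeps its original size.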
After that, the route can be defined. I decided to use the “direct” Camel component to split the route into pieces, to make it more readable and easier to debug:
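The sketch below shows one way such a route could be laid out. It is a reconstruction under assumptions, not the code from the repository: the class name, the direct endpoint names, the inline processor and the 100x100 thumbnail size are all illustrative, and credentials are assumed to come from GOOGLE_APPLICATION_CREDENTIALS.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;

import net.coobird.thumbnailator.Thumbnails;
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.component.google.storage.GoogleCloudStorageConstants;

public class ThumbnailRouteSketch extends RouteBuilder {

    @Override
    public void configure() {
        // 1. Poll source-bucket and move each consumed file to consumed-bucket.
        from("google-storage://source-bucket"
                + "?moveAfterRead=true"
                + "&destinationBucket=consumed-bucket"
                + "&autoCreateBucket=true"
                + "&deleteAfterRead=true"
                + "&includeBody=true")
            .log("consumed: ${header.CamelGoogleCloudStorageObjectName}")
            .to("direct:thumbnail");

        // 2. The thumbnail-processor step: resize the image held in the body.
        from("direct:thumbnail")
            .process(exchange -> {
                InputStream image = exchange.getIn().getMandatoryBody(InputStream.class);
                ByteArrayOutputStream thumbnail = new ByteArrayOutputStream();
                // The 100x100 bounding box is an assumption, not a value from the article.
                Thumbnails.of(image).size(100, 100).toOutputStream(thumbnail);
                exchange.getIn().setBody(new ByteArrayInputStream(thumbnail.toByteArray()));
            })
            .to("direct:store");

        // 3. Write the thumbnail to result-bucket under the same object name.
        from("direct:store")
            .to("google-storage://result-bucket?autoCreateBucket=true")
            .log("stored: ${header.CamelGoogleCloudStorageObjectName}")
            .to("direct:link");

        // 4. Ask a second producer for a download URL of the stored object.
        from("direct:link")
            // Set explicitly so the link targets result-bucket even if an earlier
            // step left another bucket name in this header.
            .setHeader(GoogleCloudStorageConstants.BUCKET_NAME, constant("result-bucket"))
            .to("google-storage://result-bucket?operation=createDownloadLink")
            .log("download URL: ${body}");
    }
}
```

Splitting the flow into direct endpoints means that each stage can be started on its own in a test by sending a message to the corresponding direct endpoint.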
Note that the route is very simple and reflects the previous diagram. I also added some logging to help understand what happens during execution.
To run it, execute the following commands:
When the application starts, the Camel engine creates the routes and the consumer component waits to find a new file to consume.
Note that, by default (this is configurable), the component automatically creates the buckets if they do not exist:
When a new file is available in the bucket:
the component consumes the file and moves the data through the routes, processes the image, saves it in the result bucket and creates a URL:
Let's see how the buckets changed:
Note that the source bucket is empty, the consumed bucket contains the original images and the result bucket contains the thumbnail. In addition, the original image was 1.3 MB and the resulting thumbnail is 4 KB. So, mission accomplished!
Conclusion
This article aims to explain quickly how to use the Camel Google Storage component in a simple use case, and also to demonstrate the power of open source software and contribution. I am proud to have contributed to the Camel project. By developing this component I solved my own need (integrating my applications with Google Cloud Storage) and I also shared my work with the community, which can reuse the component and, why not, even improve it. I also had the opportunity to study the structure of the project more closely, to learn the best practices and standards it adopts, and to collaborate with the people who develop it. I do not think these things can be learned from books; they can only be learned through experience. For this reason I highly recommend contributing to open source projects.
Resources
If you want to learn more: