Time to Choose the New Project Tech Stack
Intro #
As a team leader, it is my responsibility (in coordination with the company’s architects and its wider technical experience) to choose the tech stack for the new project. I am the kind of person who would go with bleeding-edge tech just for the rush of experiencing something new and making my everyday developer life more interesting. However, from a practical point of view, the project that we are about to create (integration with a major odds supplier) is crucial for the business, so I do not have that liberty and have to stick with modern but proven technologies. Interestingly enough, I can go with any JVM-supported language, but despite my love for Kotlin I had to choose Java (11) over it, simply because we do not have time to get four people up to speed with a new language plus a couple of new technologies for each of them. But let’s get to the point: in the sections below I will discuss each choice that had to be made in more detail.
JDK Version / Vendor #
This should be a no-brainer: the latest version offering LTS, preferably offered by Oracle. So our choice was JDK 11, as JDK 12 and 13 are non-LTS releases and do not offer any major dev-related changes.
However, due to Oracle’s new licensing terms, we had to choose a different vendor, one whose build of course passes the TCK (Technology Compatibility Kit). So we ended up with Amazon’s Corretto.
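Once the vendor matters, it helps to see at startup which JDK a service is actually running on. Below is a minimal sketch that uses only standard system properties and `Runtime.version()`; the class name `JdkInfo` and the `MIN_FEATURE` constant are assumptions made for illustration, not part of the project.

```java
// Hypothetical helper (not part of the project): reports which JDK the service runs on.
final class JdkInfo {

    // Lowest feature release we are willing to run on (assumption: 11, the LTS chosen above).
    static final int MIN_FEATURE = 11;

    // Builds a one-line description from standard system properties and the runtime version.
    static String describe() {
        return System.getProperty("java.vendor")
                + " " + System.getProperty("java.vm.name")
                + " " + Runtime.version();
    }

    // True when the running JVM is at least the required feature release.
    static boolean isSupported() {
        return Runtime.version().feature() >= MIN_FEATURE;
    }

    public static void main(String[] args) {
        System.out.println(describe());
        if (!isSupported()) {
            throw new IllegalStateException("JDK " + MIN_FEATURE + "+ required");
        }
    }
}
```

Logging `JdkInfo.describe()` once at boot would make it obvious when an image was accidentally built on a different vendor’s JDK.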
Why Amazon Corretto? #
Longer support time windows (at least August 2024). Backport of security fixes from more recent versions of Java; see the changelog. The F.A.Q. explains exactly what you are getting when you choose Corretto.
As a bonus here is the James Gosling talk on Corretto:
Other vendors: #
While we have chosen Corretto, the vendors below might be considered at a later stage:
Azul’s Zulu #
They are a long-time “player” in offering a JDK implementation. Their Zing JVM sounds good but comes at a hefty price. They offer their own set of tools and/or implementations of existing Oracle tools such as Mission Control, Flight Recorder, etc. However, to fully rely on those you first get coupled to a vendor, and second, most of the really useful features come as part of Zulu’s subscription plan or have to be paid for.
AdoptOpenJDK #
Closer to pure Java, but currently lacks the promise of backporting fixes to older versions of their flavor of the JDK.
Build tool #
We have chosen Gradle not only because it is the “modern standard”, but also because of my long-standing hatred of the XML format. The redundancy and verbosity of having a “close tag” can be easily avoided using a different approach (JSON, YAML, TOML, Groovy/Kotlin DSLs). I have worked for a company that stored its business data as XML, and I have used XSLT/XPath and XSL-FO to generate PDFs from that internal XML format. IMO XML configuration should not be part of any Java code, and I am more than happy that the times of Struts and early Spring are long gone. E.g. in recent versions of Gradle, you need just the following two lines to compile and target Java 11:
sourceCompatibility = 11
targetCompatibility = 11
Compare that to the required Maven config:
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-compiler-plugin</artifactId>
    <version>3.8.0</version>
    <configuration>
        <release>11</release>
    </configuration>
</plugin>
Also keep in mind those headaches related to using Maven and JDK >= 9.
Main framework / Server(less) provider / WebApp container #
To be honest, this is not as easy a choice as one might think. We have projects using Micronaut, Dropwizard and AWS Lambda, along with monoliths using Spring deployed on Jetty. Quarkus, currently the new kid on the block, was also considered. Struts is still alive, and Vert.x and Vaadin are also valid options.
Someone once compared Micronaut to a tool and Spring to the whole hardware store (I could not agree more). Our choice (a tough one) was to go with Spring. Not only does it offer almost any tool that we will ever need, but it is also modular, so one can choose which things “to grab” from the “hardware store”. It is easy to find talented devs who know its internals. The documentation is in a league of its own; IMO no other framework comes close. The “more modern” frameworks (looking @ you, Micronaut) may offer the basics, but when it comes to something specific they feel somewhat lacking, to the point that a team at my company had to open a PR to fix a bug that we were experiencing on Prod.
Our choice: Spring Boot + Spring #
Production proven. Suitable for microservice architecture. Easy to find people with working experience and knowledge of it. Rich ecosystem. Fast to fix CVEs.
Quality of Life framework #
Lombok - Some people hate it (Bruna Pereira, Gonzalo Vallejos), some love it and can’t live without it. Defining data classes becomes easier than in plain Java, but still not at the level of Kotlin. Extra care should be taken not to hit any known bugs/features of Lombok. We also agreed that we need to be extra careful and keep it up to date when we upgrade to the next LTS version of Java.
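To make the “data classes” point concrete, here is a rough sketch of what a two-field immutable value class looks like when written by hand in Java 11. With Lombok, the `@Value` annotation would generate roughly this boilerplate (getters, constructor, `equals`, `hashCode`, `toString`). The class `OddsUpdate` and its fields are hypothetical names used only for illustration.

```java
import java.util.Objects;

// Hand-written, roughly what Lombok's @Value generates for a two-field class.
// With Lombok the whole body would shrink to:
//   @Value class OddsUpdate { String eventId; double price; }
// OddsUpdate and its fields are hypothetical names used only for illustration.
final class OddsUpdate {
    private final String eventId;
    private final double price;

    OddsUpdate(String eventId, double price) {
        this.eventId = eventId;
        this.price = price;
    }

    String getEventId() { return eventId; }

    double getPrice() { return price; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof OddsUpdate)) return false;
        OddsUpdate other = (OddsUpdate) o;
        return Double.compare(price, other.price) == 0
                && Objects.equals(eventId, other.eventId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(eventId, price);
    }

    @Override
    public String toString() {
        return "OddsUpdate(eventId=" + eventId + ", price=" + price + ")";
    }

    public static void main(String[] args) {
        OddsUpdate a = new OddsUpdate("match-1", 1.85);
        OddsUpdate b = new OddsUpdate("match-1", 1.85);
        System.out.println(a + " equals copy: " + a.equals(b));
    }
}
```

For comparison, Kotlin expresses the same thing as a one-line `data class`, which is the gap mentioned above.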
Container tech / Container orchestration #
Docker Extensive documentation, easy to find both Developers and DevOps with working experience with it. More than 80% of the container market is “owned” by Docker. CoreOS’s rkt, Mesos’s Marathon, lxc and others lack the rich ecosystem that Docker has, or are too complex or too immature to consider.
Kubernetes My company is using an on-premise hosted K8s deployment. Built-in logging and monitoring tools. Dashboard. Automatic rollbacks, rolling updates. We were discussing AWS Lambda, but as the project requirements came in it was out of the question, as the services we were going to build became far more complex.
Helm To ease our life deploying to K8s. For the persistence layer, we briefly discussed using K8s operators, but after some quick digging we decided against it, especially after reading this horror story.
Logging framework #
SLF4J This logging facade already offers a fluent API akin to Flogger. Painless switching of backends. Mandatory read: SLF4J Manual
Log4j2 / Logback We should consider which backend engine to use, as both have Pros and Cons. Spring’s usage of Logback might tilt the scale.
Flogger Fluent API logging developed and supported by Google - sounds cool, but still too immature; until recently it was using Log4j (not 2!) as the only backend.
Our choice: Logback #
The default for Spring. Relatively easy to configure JSON logging for the k8s environments and human-readable format for the local one.
Message system / Stream data processing #
Kafka The de facto standard in stream processing. We will receive real-time data which needs to be validated, transformed, stored, and sent to our internal systems. The possibility to use Kafka Streams at a later stage of the project is also there.
Flink This is still in discussion, but it might be overkill for our use case.
Kafka Streams My own choice, but as no one has experience with it, we agreed on using Camel; if it turns out Camel is not needed, we can define our flow using plain old Spring beans.
Our choice: Camel #
The currently chosen tech to build the pipelines/streams. It offers a nice fluent DSL, is less complex than using Project Reactor, and some team members already have experience using it in Prod.
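Whichever tool ends up wiring it together, the flow itself is small: validate, transform, store, forward. Here is a minimal sketch of that flow in plain Java, with no Kafka, Camel, or Spring involved. Every name in it (`MessagePipeline`, `process`, the in-memory lists standing in for the store and the internal systems) is hypothetical and only shows the shape of the steps.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;

// Sketch of the validate -> transform -> store -> forward flow in plain Java.
// No Kafka, Camel, or Spring here; every name below is hypothetical.
final class MessagePipeline {
    private final Predicate<String> validator;
    private final Function<String, String> transformer;
    private final Consumer<String> store;
    private final Consumer<String> forwarder;

    MessagePipeline(Predicate<String> validator,
                    Function<String, String> transformer,
                    Consumer<String> store,
                    Consumer<String> forwarder) {
        this.validator = validator;
        this.transformer = transformer;
        this.store = store;
        this.forwarder = forwarder;
    }

    // Returns the transformed message, or empty when validation rejects the input.
    Optional<String> process(String raw) {
        if (!validator.test(raw)) {
            return Optional.empty();
        }
        String transformed = transformer.apply(raw);
        store.accept(transformed);      // persist before handing the message on
        forwarder.accept(transformed);  // send to the internal systems
        return Optional.of(transformed);
    }

    public static void main(String[] args) {
        List<String> stored = new ArrayList<>();
        List<String> forwarded = new ArrayList<>();
        MessagePipeline pipeline = new MessagePipeline(
                raw -> raw != null && !raw.isBlank(), // validate: reject empty payloads
                String::trim,                         // transform: strip surrounding whitespace
                stored::add,                          // store: in-memory stand-in
                forwarded::add);                      // forward: in-memory stand-in
        pipeline.process("  {\"type\":\"event\"}  ");
        pipeline.process("   ");
        System.out.println("stored=" + stored + " forwarded=" + forwarded);
    }
}
```

Keeping each step behind a small functional interface means the same steps could later be registered as Spring beans or called from a Camel route without changing their logic.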
Persistence #
We need to store messages in a JSON format and some simple state (e.g. reconnection timestamps, scheduled tasks info, etc.). The main point is that we want to do hundreds of reads/writes per second and we don’t care if some of the data is lost, as it can easily be recovered. The options that we discussed:
MySQL has no native JSON datastore. Free, but not sure how fast the free version is. Not suitable for our use case.
PostgreSQL is used in our core system. An old-time player and one of the fastest; it offers replication and backup. Still, the JSON capabilities feel like some kind of ugly patch, and the language used to access/work with JSON fields/columns feels verbose and error-prone (versions >12 were considered, as they increase the JSON capabilities greatly).
MongoDB We were not sure that we would need the search/aggregation operations on the JSON fields (e.g. get all JSON messages with a receivedDate after some timestamp, or get all JSON messages with type == event). The use cases where we would need some kind of special key to group the entries were few. (Versions >4 were discussed, as we wanted ACID guarantees.)
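The two example queries mentioned for MongoDB (messages with a receivedDate after some timestamp, and messages with type == event) can be illustrated with a small in-memory sketch. It uses Java streams over a list rather than any real database client, and `MessageQueries` and `StoredMessage` are hypothetical names; a document store would run such filters server-side, while a plain key-value store would need the filtering done by the application.

```java
import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;

// In-memory illustration of the two query shapes discussed above.
// MessageQueries and StoredMessage are hypothetical; no database client is involved.
final class MessageQueries {

    // Minimal stand-in for a stored JSON message plus the two fields we filter on.
    static final class StoredMessage {
        final String type;
        final Instant receivedDate;
        final String json;

        StoredMessage(String type, Instant receivedDate, String json) {
            this.type = type;
            this.receivedDate = receivedDate;
            this.json = json;
        }
    }

    // "get all JSON messages with a receivedDate after some timestamp"
    static List<StoredMessage> receivedAfter(List<StoredMessage> all, Instant cutoff) {
        return all.stream()
                .filter(m -> m.receivedDate.isAfter(cutoff))
                .collect(Collectors.toList());
    }

    // "get all JSON messages with type == event"
    static List<StoredMessage> ofType(List<StoredMessage> all, String type) {
        return all.stream()
                .filter(m -> type.equals(m.type))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Instant cutoff = Instant.parse("2020-01-01T00:00:00Z"); // arbitrary sample timestamp
        List<StoredMessage> all = List.of(
                new StoredMessage("event", cutoff.minusSeconds(60), "{}"),
                new StoredMessage("event", cutoff.plusSeconds(60), "{}"),
                new StoredMessage("heartbeat", cutoff.plusSeconds(120), "{}"));
        System.out.println(receivedAfter(all, cutoff).size() + " received after cutoff");
        System.out.println(ofType(all, "event").size() + " of type event");
    }
}
```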
Our choice: Redis #
Redis is widely used in other similar services, and our DevOps already have some experience with it. Storing JSON messages is easy. We know from experience (proven by other teams) that Redis had no problems with storing, accessing, and updating 4+ million key-value pairs, while MongoDB was struggling with ~2.5 million rows.