The newer the better: how we moved from GSON to Kotlinx.Serialization

The newer the better: how we moved from GSON to Kotlinx.Serialization


This article has been translated from its original publication at https://habr.com/ru/companies/tinkoff/articles/728928/


Hello, I am Andrey Meshcheryakov, an Android developer in the Tinkoff Investment Growth team. At Invest, we always strive to explore new things and maintain a modern technology stack. We have recently adopted the Kotlin.Serialization library, and I would like to share our experience with migrating from Gson to Kotlinx.

In this article, I will discuss the challenges, the less obvious aspects of its usage, and compare the functionality of these two libraries.

Kotlin.Serialization and its benefits

Kotlinx Serialization is a multi-platform, multi-format serialization library developed by JetBrains specifically for Kotlin. It consists of a compiler plugin, core library, and a set of auxiliary libraries that support various data serialization protocols.

Support for protocols in Kotlinx Serialization is made possible by dividing the serialization process into two stages: transforming an object into a sequence of primitives and encoding the resulting sequence according to the specified format. Once the logic for representing an object through primitive types is implemented, different formats can be applied to generate the final result (instances for working with the most popular data formats are provided together with the library).

Object serialization in Kotlinx.Serialization follows a code generation approach. During compilation, a serializer is generated for each class marked with the @Serializable annotation and placed within the class's companion object.

The compiler plugin analyzes classes annotated with @Serializable and informs the IDE about the existence or non-existence of serializers for those classes even before they are generated. This helps avoid errors when adding non-serializable fields to serializable classes.

At Invest, we use the JSON format for serialization, so we will focus solely on it from now on.

Before adopting Kotlinx.Serialization, we mostly relied on the Gson library, which had a significant drawback - its use of reflection. Gson uses UnsafeAllocator to allocate memory for an object and fills its fields with null values. Then, the fields are overwritten based on the input data, and any missing data is simply ignored. This can lead to NullPointerExceptions at runtime.

An interesting fact is that even though Kotlinx.Serialization is not primarily based on reflection, it still uses it to access companion objects and generated serializers.

The main advantage of Kotlinx.Serialization is its full compatibility with Kotlin. This implies using Kotlin's type system, supporting non-nullable types, and default field values. Unlike Gson, Kotlinx.Serialization constructs class objects during deserialization by invoking their constructors, which eliminates errors when working with the resulting objects. While it is possible to achieve these features with Gson by writing custom TypeAdapters, it complicates the development process and overall code elegance.

Kotlinx.Serialization also offers a more extensive exception system compared to Gson's single JsonParseException. It allows for a detailed understanding of the causes of encountered errors. Furthermore, in version 1.4.0, part of the exception hierarchy became public.

In most cases Kotlin.Serialization is easy to use and convenient for developers. But there are cases where you can't do without additional logic. For example, you will have to write your own serializers for library classes.

Transition to Kotlinx.Serialization and Migration Cases

We took our first steps towards migration in mid-2020. Initially, we introduced Kotlinx.Serialization in our experimental application and tested it with non-standard use cases such as serialization of enum and sealed classes, generics, and library classes. Then, we began the migration in our main application, adding everything necessary for its operation:

  • Kotlinx Json object provider, responsible for encoding the sequence of primitives generated by serializers into JSON format and containing the project's accepted configuration.
  • Custom enum serializer with default value handling.
  • Core serializers for commonly used library classes.
  • Wrapper for the Gradle plugin to simplify library integration into modules and disable compiler warnings from @RequiresOptIn(ExperimentalSerializationApi).
  • R8/ProGuard rules (included with the library starting from version 1.5.0).
  • Lint rules for serializable classes (we have a large development team, and these rules facilitate adherence to the project's approaches).

When writing new features, we started using Kotlinx.Serialization and gradually converted old ones to it. We also migrated the work with Firebase Remote Config to Kotlinx.Serialization. Initially, we manually wrote the code, but after adding Kotlinx.Serialization support to the OpenAPI Generator, we began generating most of the code automatically for new features, only making minor adjustments according to the project's conventions.

Migration Cases

Kotlin nullable types. The default behavior of Kotlinx Json configuration differs in terms of optionality between the language's perspective and the deserializer's perspective. Therefore, for nullable properties, it is necessary to add null as the default value. Without a default value, assigning null is only possible by explicitly specifying it in the incoming JSON: { "name": null }

The absence of the "name" field in the response will result in a JsonDecodingException. This issue can be addressed by setting the configuration flag explicitNulls = false (its default value is true in the standard configuration) when creating a Json instance. This flag allows ignoring the absence of fields in the response for nullable properties.

We encountered situations where the backend contract did not establish clear rules regarding the optionality of field values. A property that was implemented as mandatory with Gson turned out to be optional and only appeared under certain conditions. Gson did not produce errors during the JSON parsing stage, and property access occurred after it was already initialized. However, during the migration to Kotlinx.Serialization, this led to unexpected errors due to the mandatory contract enforcement during deserialization. It is important to be cautious with this aspect.

Handling errors during enum deserialization. The project continues to evolve with the addition of new features, but it is also necessary to support older versions of the application. We cannot be certain that enums will always contain the same set of values as they did from the beginning.

For example:

If a different value is received from the backend, attempting to deserialize TestInstance will result in a JsonDecodingException. This problem can be solved by specifying the configuration flag coerceInputValues = true when creating a Json instance, which enables handling of incorrect values, and adding a default value to the enum property:

Such situations are handled by standard library tools, but when it becomes necessary to get an enum list, Kotlinx.Serialization has no built-in mechanisms for declaring a default value:

A new, unknown value of "Bonus" will result in deserialization errors, and we won't be able to obtain the TestInstance at all. The flexibility of Kotlinx.Serialization allows us to implement custom serialization logic for any class if the standard library facilities are insufficient. To do this, we need to provide an implementation of the KSerializer interface for the required class.

Serializer full code link

The serializer consists of three parts: the serialize() and deserialize() functions, and the descriptor property. The functions are responsible for converting objects into a stream of primitives for encoding by the encoder in the required data format, and for the reverse process of obtaining an object from the stream of primitives. The descriptor describes the data structure in the serializable class, which is used during encoding to represent the object correctly.

Deserialization of polymorphic data structures. Sometimes objects of different types not only differ in the value of a property but also contain different data. Such objects can be represented as polymorphic hierarchies with a base parent class.

When the type of the received object is not known in advance and is determined at runtime among the subclasses of the base class, PolymorphicSerializer is used for deserialization. The simplest way to create such a structure is by using sealed classes.

The standard implementation of the polymorphic serializer looks for a "type" field in the input data and uses it to determine the type of the final object. In all subclasses, you just need to specify with the @SerialName annotation which values in the "type" field correspond to which classes.

You can change the classDiscriminator to use a different field for type determination by simply rewriting the value of classDiscriminator when creating the Json instance. However, this will affect all polymorphic structures that use this instance.

Starting from version 1.3.0, you can add the @JsonClassDiscriminator annotation to the polymorphic structure, which is similar to the classDiscriminator field and applies only to the specific structure.

If you add a property in the class that is similar to the one declared in classDiscriminator, there will be a conflict during deserialization. However, there are situations where the field is needed for some business logic. To resolve the conflict, you can annotate the property with @Transient, which indicates the ignoring of the corresponding value in the input data, and manually specify the necessary property value for each subtype.

When using abstract classes and interfaces instead of sealed classes, the serializer cannot automatically know about all the subclasses. Therefore, when creating a Json instance, they need to be explicitly specified.

To provide default handling in sealed classes, you can use the extension functions SerializersModuleBuilder.polymorphicDefaultSerializer and Deserializer when creating a Json instance.

The main drawback of this approach is that we manually specify the discriminator for each subclass, including the default implementation. If we are not interested in the data of some received objects and plan to store them in a default container, their type will be lost during deserialization and replaced with the value from the default implementation. To solve this problem, you can implement custom serialization logic for polymorphic structures.

Additionally, this approach does not work well with generics in subclass classes. Although Kotlinx.Serialization supports generics, there are no built-in mechanisms for full support of generics in polymorphic structures. The documentation suggests treating the parameter of that class as a polymorphic structure and manually declaring all the classes that the parameterized property can accept. Alternatively, you can write a custom serializer.

In Gson, we used the RuntimeTypeAdapterFactory to handle polymorphic structures, which we adapted to support default values.

Deserialization of library classes: Kotlinx.Serialization adds serialization support for some classes from the Kotlin standard library, such as primitives. You can find the full list on GitHub. If a class is not included in the supported list and is not written by us, we cannot add the @Serializable annotation and expect the plugin to generate a serializer for that class. For such classes, we need to write the serializers ourselves.

One approach to support serialization of classes outside the supported list is to declare their serializers when creating a Json instance. You need to annotate the fields with @Contextual, and then the plugin will assume that we know what we are doing, and at runtime, ContextualSerializer will select the declared class.

class BigDecimalSerializer : KSerializer<BigDecimal> {

    val descriptor: SerialDescriptor =
        PrimitiveSerialDescriptor(
            “java.math.BigDecimal”,
            PrimitiveKind.STRING
        )

    fun serialize(encoder: Encoder, value: BigDecimal) {
        encoder.encodeString(value.toString())
    }
 
    fun deserialize(decoder: Decoder): BigDecimal {
     return BigDecimal(decoder.decodeString())
    }
}

class OffsetDateTimeSerializer : KSerializer<OffsetDateTime> {

    val descriptor: SerialDescriptor =
        PrimitiveSerialDescriptor(
            “java.time.OffsetDateTime”,
            PrimitiveKind.STRING
        )

    fun serialize(encoder: Encoder, value: OffsetDateTime) {
        encoder.encodeString(DateTimeFormatter.ISO_OFFSET_DATE_TIME.format(value))
    }
 
 fun deserialize(decoder: Decoder): OffsetDateTime {
     return OffsetDateTime.parse(decoder.decodeString(), DateTimeFormatter.ISO_OFFSET_DATE_TIME)
    }
}

You can find open-source solutions for serializing commonly used classes. For example, there is a library on GitHub for BigDecimal and BigInteger classes.

Results After Migration

To compare the performance of Gson and Kotlinx.Serialization, I conducted tests on deserialization speed and memory consumption of these two libraries.

The deserialization speed comparison was done on a test project using the Jetpack Benchmark library. The test device used was a Samsung Galaxy S22 with Android 12 (One UI 4.1) and Qualcomm Snapdragon 8 gen 1 SoC. You can view the results on the GitHub project.

The testing of Kotlinx.Serialization was performed with R8 enabled and disabled to evaluate the impact of code optimization on the results. The time taken to deserialize objects from JSON was evaluated by creating test JSON files in raw resources.

The process of reading from resources to a string was not included in the final time measurement. To prevent the JVM from optimizing out this logic as unused, the deserialized result was stored in a variable.

Finally, to ensure that Gson actually received the fields correctly and the deserialization was done correctly, we performed a check on the received objects using assertEquals for one of their fields. This check was also excluded from the timing measurement.

Here is a list of objects that were compared:

  1. Simple case: objects consisting only of primitives and objects with nested types but no additional logic. Evaluation of the impact of object nesting on parsing speed (in both cases, the JSONs consisted of 30 fields).
  2. Deserialization of a list of objects consisting only of primitives - 30 fields vs 300 fields. Evaluation of the impact of JSON size on parsing speed.
  3. Deserialization of polymorphic data structures. Overall JSON size - 30 fields and 300 fields. Evaluation of the impact of JSON size on parsing speed.
  4. Deserialization of objects consisting of Java classes - 30 fields. Evaluation of the impact of runtime serializer selection in Kotlinx.Serialization.

Based on the measurement results, it can be said that the libraries showed different efficiency in different scenarios. The use of contextual deserialization in Kotlinx.Serialization led to a noticeable decrease in performance compared to Gson. However, when deserializing polymorphic structures, Kotlinx.Serialization performed better. It is worth noting that R8 code optimization for Kotlinx.Serialization showed significant improvements only for contextual serialization, while in other cases, the difference was within the margin of error.

By the way, an interesting observation is that when creating a Json object using the default implementation, the benchmark results, where its use was possible, yielded better performance, even though the speed difference should only occur during object creation.

Afterword. Library Development

In reality, benchmarking Kotlinx.Serialization against Gson was conducted out of research interest. When choosing a library, the developers prioritize its ease of use.

Kotlinx.Serialization is a flexible, modern, and actively evolving tool that offers advantages in code cleanliness and logic transparency.

The current version of Kotlinx.Serialization is 1.5.0, and in this release, they have added the following:

  • Definition of transformations applied to property names during serialization.
  • Support for deserializing large strings using the ChunkedDecoder interface (currently implemented only by JsonDecoder).
  • A serializer for the kotlin.Nothing class (throws a SerializationException when attempting to serialize, necessary for supporting parameterized classes).

The extension functions SerializersModuleCollector.polymorphicDefault and PolymorphicModuleBuilder.default are now marked as deprecated. Instead, you should use the functions SerializersModuleBuilder.polymorphicDefaultSerializer/polymorphicDefaultDeserializer and PolymorphicModuleBuilder.defaultDeserializer.

Starting from this version, the R8/ProGuard rules are shipped with the library, eliminating the need to add them manually.



Report Page