Blog

  • Kotlin IR: Unlocking Incredible Possibilities for Code Manipulation

    I’m currently working on a pet project: a Kotlin assertion library designed to handle deep assertions over any object type. The goal is to write code that looks like this:

    Kotlin
    confirmThat { (1..3).toList() } deepMatches { listOf(1, 2, 3) }

    To make this work, the library needs to convert arbitrary code into a structured matcher tree. Under the hood, the goal is to transform that simple list check into something like this:

    Kotlin
    confirmThat { (1..3).toList() } deepMatches {
        ListMatcher(
            ValueMatcher(1),
            ValueMatcher(2),
            ValueMatcher(3)
        )
    }
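To make the target shape concrete, here is a minimal runtime sketch of what such a matcher tree could evaluate to. Only the `ValueMatcher` and `ListMatcher` names come from the example above; the `Matcher` interface and its `matches()` signature are my own assumptions for illustration, not the library's actual API.

```kotlin
// Hypothetical matcher tree: each node checks one piece of the actual value.
interface Matcher {
    fun matches(actual: Any?): Boolean
}

// Leaf node: plain equality against an expected value.
class ValueMatcher(private val expected: Any?) : Matcher {
    override fun matches(actual: Any?) = actual == expected
}

// Composite node: matches a list element-by-element against child matchers.
class ListMatcher(private vararg val elements: Matcher) : Matcher {
    override fun matches(actual: Any?) =
        actual is List<*> &&
            actual.size == elements.size &&
            elements.zip(actual).all { (matcher, element) -> matcher.matches(element) }
}

fun main() {
    val matcher = ListMatcher(ValueMatcher(1), ValueMatcher(2), ValueMatcher(3))
    println(matcher.matches((1..3).toList()))  // true
    println(matcher.matches(listOf(1, 2)))     // false
}
```

The point of the plugin is that the user never writes this tree by hand; it is generated from the plain expression inside the lambda.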

To pull this off, I weighed two technical paths:

    Option 1: Runtime Bytecode Manipulation

    The first option is to modify the compiled bytecode while the application is running using tools like Byte Buddy.

    • The Pro: It’s a standard way to handle introspection on the JVM without needing a custom compiler setup.
    • The Con: Its capabilities are very limited, because Kotlin-specific details, such as null-safety metadata, are often erased or transformed by the time the code is compiled. Some things are simply impossible: for example, you cannot convert primitives (int) into objects (Int), which might make it impossible to replace an int with an IntMatcher.

    Option 2: Kotlin Intermediate Representation (IR)

    The more “hardcore” approach is hijacking the Kotlin compilation process itself. By using a Kotlin Compiler Plugin, I can intercept the IR (Intermediate Representation). This happens after the code is parsed but before it undergoes any “lowering” steps (converting high-level constructs into simpler ones) or gets turned into bytecode.

    • The Pro: This allows me to see the code in its purest form. I can generate highly efficient, type-safe matchers that are baked directly into the program.
    • The Con: It’s significantly more complex to implement since it requires working deep within the compiler’s internal mechanics.
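For reference, an IR plugin hangs off the IrGenerationExtension entry point of the Kotlin compiler. Below is a bare skeleton, not my actual implementation: it needs the kotlin-compiler-embeddable dependency to compile, and MatcherTransformer with its deepMatches detection logic is a placeholder I made up for illustration.

```kotlin
import org.jetbrains.kotlin.backend.common.IrElementTransformerVoidWithContext
import org.jetbrains.kotlin.backend.common.extensions.IrGenerationExtension
import org.jetbrains.kotlin.backend.common.extensions.IrPluginContext
import org.jetbrains.kotlin.ir.declarations.IrModuleFragment
import org.jetbrains.kotlin.ir.expressions.IrCall
import org.jetbrains.kotlin.ir.expressions.IrExpression
import org.jetbrains.kotlin.ir.visitors.transformChildrenVoid

// Registered through the compiler-plugin machinery; called once per module,
// after parsing and analysis but before lowering and bytecode generation.
class MatcherIrExtension : IrGenerationExtension {
    override fun generate(moduleFragment: IrModuleFragment, pluginContext: IrPluginContext) {
        moduleFragment.transformChildrenVoid(MatcherTransformer(pluginContext))
    }
}

class MatcherTransformer(private val context: IrPluginContext) :
    IrElementTransformerVoidWithContext() {

    override fun visitCall(expression: IrCall): IrExpression {
        // Placeholder: detect calls to `deepMatches` here and rebuild the
        // lambda body into a matcher tree. This is the interesting part.
        return super.visitCall(expression)
    }
}
```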

    My Take

    I decided to go the “hardcore” route. I’ve been experimenting with Kotlin IR, and it works perfectly! It handles the primitive-to-object mapping and null-safety metadata with ease.

    Stay tuned—I’ll be sharing more on how I actually implemented the IR transformer in my next post.

  • Reducing memory usage 10 times with High-Performance Primitive Collections

Kotlin basic types such as Int or Double correspond to high-performance Java primitive types such as int or double. But nullable (Int?) and generic (<Int>) versions of those types are mapped to boxed Java types such as java.lang.Integer or java.lang.Double.
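You can verify this mapping with plain JVM reflection. The two holder classes below are hypothetical demo classes; the same Kotlin type compiles to a primitive or a boxed field depending only on nullability:

```kotlin
// Int compiles to the primitive int; Int? compiles to boxed java.lang.Integer.
data class PrimitiveHolder(val x: Int)
data class BoxedHolder(val x: Int?)

fun main() {
    println(PrimitiveHolder::class.java.getDeclaredField("x").type)  // int
    println(BoxedHolder::class.java.getDeclaredField("x").type)      // class java.lang.Integer
}
```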

    Boxed types are memory heavy. Let’s make a simple comparison.

    Kotlin
    @Test
    fun `memory occupied by primitive int`() {
        data class A(
            val x: Int
        )
    
        val N = 100_000_000
    
        val mem1 = calculateOccupiedMemoryMB()
    
        val list = List(N) { A(it) }
    
        val mem2 = calculateOccupiedMemoryMB()
    
        println("Occupied memory: ${mem2 - mem1} MB")
    
        list
    }
    
    > Occupied memory: 1910 MB
    Kotlin
    @Test
    fun `memory occupied by boxed Int`() {
        data class A(
            val x: Int?
        )
    
        val N = 100_000_000
    
        val mem1 = calculateOccupiedMemoryMB()
    
        val list = List(N) { A(it) }
    
        val mem2 = calculateOccupiedMemoryMB()
    
        println("Occupied memory: ${mem2 - mem1} MB")
    
        list
    }
    
    > Occupied memory: 3436 MB

We already see an almost 2x difference, and the real gap is even bigger: the List itself and the A instances cost the same in both tests, so that shared overhead masks part of the boxing penalty.

    Code explained

calculateOccupiedMemoryMB measures the difference between total and free JVM memory. Before measuring, it requests a garbage collection and waits 3 seconds to give the collector time to run, reducing the garbage footprint in the numbers.

    Kotlin
    import java.lang.Runtime.getRuntime

    fun calculateOccupiedMemoryMB(): Int {
        getRuntime().gc()   // request a GC pass
        Thread.sleep(3000)  // give the collector time to actually run
        return ((getRuntime().totalMemory() - getRuntime().freeMemory()) / (1024 * 1024)).toInt()
    }

The list reference at the end of the block is a trick to avoid a JVM optimization: if the JVM sees that an object is no longer used, it may garbage-collect it before we take the second measurement.

What if we need a huge Set of Ints or a huge Map from Int to some object? Unfortunately, standard Java collections are based on generics, which means all of the elements will be autoboxed.
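A quick, hypothetical demo of that autoboxing: even though the set is declared over Kotlin's Int, every element it stores at runtime is a boxed object.

```kotlin
fun main() {
    val set = hashSetOf<Int>()
    set.add(42)

    // Pull the element back out as Any to inspect its runtime class.
    val stored: Any = set.first()
    println(stored.javaClass)  // class java.lang.Integer
}
```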

Here HPPC: High Performance Primitive Collections comes to the rescue. This library has predefined collections for all the primitive types.

    Let’s compare memory footprints of a normal Java HashSet<Int> and a corresponding HPPC IntHashSet.

    Kotlin
    @Test
    fun `memory occupied by HashSet`() {
        val N = 100_000_000
    
        val mem1 = calculateOccupiedMemoryMB()
    
        val set = hashSetOf<Int>()
        repeat(N) { set.add(it) }
    
        val mem2 = calculateOccupiedMemoryMB()
    
        println("Occupied memory: ${mem2 - mem1} MB")
    
        1 in set
    }
    
    > Occupied memory: 5098 MB
    Kotlin
    @Test
    fun `memory occupied by IntHashSet`() {
        val N = 100_000_000
    
        val mem1 = calculateOccupiedMemoryMB()
    
        val set = IntHashSet()
        repeat(N) { set.add(it) }
    
        val mem2 = calculateOccupiedMemoryMB()
    
        println("Occupied memory: ${mem2 - mem1} MB")
    
        1 in set
    }
    
    > Occupied memory: 518 MB

    10 times less memory used!
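The ~518 MB figure also holds up to a back-of-the-envelope check: IntHashSet keeps its keys in a flat int[] array. Assuming a power-of-two capacity and HPPC's default 0.75 load factor (my assumptions, not measured), 100 million entries need 2^27 slots:

```kotlin
fun main() {
    val n = 100_000_000

    // Smallest power-of-two capacity that keeps n entries under the load factor
    var capacity = 1
    while (capacity * 0.75 < n) capacity = capacity shl 1

    println(capacity)                       // 134217728 slots (2^27)
    println(capacity * 4L / (1024 * 1024))  // 512 MB for the int[] keys array
}
```

512 MB for the raw keys array is within a few percent of the measured 518 MB, so almost the entire footprint is the data itself rather than object headers and references.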