Tag: Kotlin

  • Behavioral drift: silent bugs in LLM workflows

    I am working on an AI workflow that, like many others, has a classifier step. It looks at a user prompt and routes it to the proper specialist agent: support, sales, or feedback. The classifier’s system prompt had this line:

    Respond only with the exact query type. No explanation. No formatting.

    During daily editing this line was accidentally lost.

    After deployment, the classifier started returning markdown instead of a clean enum value:

    **feedback**

    At that point there was neither validation nor a safe fallback value, so the workflow failed at this step every time.

    Why can’t you just “add validation”?

    First of all, this is a runtime issue. It shows up only when a real LLM is called, not in unit tests.

    Second, the change that caused the bug is not in the code; it’s in the system prompt.

    Third, and most important: this exact issue could be caught by validation, but what if the change happened inside a complex JSON response? Or what if a node always returns markdown, but its meaning is now completely different? And what if the form is still correct, but the enum distribution shifted from 50:50 to 1:99?

    In practice most AI work lives in the prototyping and research phase. Prompts change weekly. Models get swapped. Workflows are restructured. The output shapes are being discovered. Under such conditions proper validation is a big challenge in itself.

    What actually happens here is that the implicit contract between the LLM and the code is silently broken.

    I’d call this behavioral drift. It has three properties that make it especially nasty:

    • It looks correct to humans. **feedback** next to feedback doesn’t read as a bug.
    • It often passes automated checks. Both feedback and **feedback** are valid strings. Schemas describe what’s allowed, not what was normal.
    • Safe fallbacks make things worse. If a **feedback** value is silently classified as "General request", days might pass before you notice something is wrong.
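
    The last two points can be sketched in a few lines. This is only an illustration: QueryType, isValidResponse and parseWithFallback are hypothetical names, not code from my workflow, and the check is deliberately as loose as a typical "it is a string" schema.

    ```kotlin
    // Hypothetical types and helpers, for illustration only.
    enum class QueryType { SUPPORT, SALES, FEEDBACK, GENERAL }

    // A loose schema-style check: any non-blank string is "valid",
    // so both "feedback" and "**feedback**" pass.
    fun isValidResponse(raw: String): Boolean = raw.isNotBlank()

    // A "safe" fallback: anything that is not an exact enum name becomes GENERAL.
    fun parseWithFallback(raw: String): QueryType =
        QueryType.values().firstOrNull { it.name.equals(raw.trim(), ignoreCase = true) }
            ?: QueryType.GENERAL

    fun main() {
        for (raw in listOf("feedback", "**feedback**")) {
            println("$raw -> valid=${isValidResponse(raw)}, routed=${parseWithFallback(raw)}")
        }
        // feedback -> valid=true, routed=FEEDBACK
        // **feedback** -> valid=true, routed=GENERAL
    }
    ```

    Nothing fails and nothing is logged: the markdown answer is simply routed to the wrong agent.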

    Don’t fight it, monitor it

    How do we deal with such bugs then?

    Instead of adding more and more layers of sophisticated validation, one can compare the workflow’s behaviour before and after any change to the code or to the system prompts. This is a kind of git diff applied to the workflow behaviour.

    Imagine that after dropping the aforementioned line from the system prompt we get a notification:

    Classification result format changed from "scalar string" to "markdown".

    This is a serious alert worth immediate checking.
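
    A rough sketch of how such a format check could work is below. The heuristics, the OutputShape enum and the function names are my assumptions for this example; they are not the logic of any particular tool.

    ```kotlin
    // Hypothetical shape categories; a real profile would likely track more.
    enum class OutputShape { SCALAR_STRING, MARKDOWN, JSON, FREE_TEXT }

    // Very rough markdown markers: bold, underscore emphasis, inline code, headings.
    private val markdownMarkers = Regex("""\*\*|__|`|^#{1,6} """, RegexOption.MULTILINE)

    fun shapeOf(output: String): OutputShape {
        val text = output.trim()
        return when {
            text.startsWith("{") && text.endsWith("}") -> OutputShape.JSON
            text.startsWith("[") && text.endsWith("]") -> OutputShape.JSON
            markdownMarkers.containsMatchIn(text) -> OutputShape.MARKDOWN
            text.isNotEmpty() && text.none { it.isWhitespace() } -> OutputShape.SCALAR_STRING
            else -> OutputShape.FREE_TEXT
        }
    }

    // Returns a human-readable finding, or null when the shape did not change.
    fun describeChange(before: String, after: String): String? {
        val oldShape = shapeOf(before)
        val newShape = shapeOf(after)
        return if (oldShape == newShape) null else "Output shape changed from $oldShape to $newShape"
    }

    fun main() {
        println(describeChange("feedback", "**feedback**"))
        // Output shape changed from SCALAR_STRING to MARKDOWN
    }
    ```

    The point is that the check compares against what was normal before, not against a list of what is allowed.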

    In a more complex case the monitoring system might warn you that

    Distribution of node responses changed from 30/30/40 to 10/0/90

    which is a behavioural drift probably concealing a serious problem.
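
    One simple way to put a number on such a shift is the total variation distance between the two label distributions. The sketch below is an assumption about how this could be done, not a description of a specific implementation; the 0.2 alert threshold is arbitrary.

    ```kotlin
    import kotlin.math.abs

    // Relative frequency of each label in a non-empty list of node responses.
    fun distribution(labels: List<String>): Map<String, Double> =
        labels.groupingBy { it }.eachCount().mapValues { it.value.toDouble() / labels.size }

    // Total variation distance: 0.0 means identical distributions, 1.0 means disjoint.
    fun totalVariationDistance(p: Map<String, Double>, q: Map<String, Double>): Double =
        (p.keys + q.keys).sumOf { abs((p[it] ?: 0.0) - (q[it] ?: 0.0)) } / 2

    fun main() {
        // 30/30/40 before the change, 10/0/90 after it.
        val before = List(30) { "support" } + List(30) { "sales" } + List(40) { "feedback" }
        val after = List(10) { "support" } + List(90) { "feedback" }

        val drift = totalVariationDistance(distribution(before), distribution(after))
        println("drift = $drift")
        if (drift > 0.2) println("ALERT: response distribution changed significantly")
    }
    ```

    For the 30/30/40 versus 10/0/90 example the distance is about 0.5, far above what normal run-to-run noise would produce.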

    Anomaly Detector tool

    To obtain some peace of mind during rapid AI workflow prototyping I’ve implemented a small Kotlin/JVM tool which (in its current version) does the following:

    1. captures the most important characteristics of each step output as “workflow node profiles”;
    2. upon request compares two workflow versions, flagging possible anomalies ordered by severity.

    Example findings:

    1. a node is missing (never visited) in the new version;
    2. the node is here, but its output form changed drastically;
    3. the node started to route to a different target most of the time.

    I tried to keep the tool minimally invasive. To collect profiling information you just need to add checkpoints like this to your code:

    Kotlin
    detector.checkpoint(step = "classify-query", output = response.content)

    You can find the tool with full documentation here:

    https://github.com/minogin/ai-anomaly-detector

    The bigger point

    Non-deterministic software is still software. It’s time to expand our toolkit for these kinds of systems. Not just answering deterministic-world questions, such as “is this node output correct?”, but also questions like “did this node’s output distribution change significantly?”

    If you’ve shipped LLM workflows in production, you’ve probably seen something like this. Please share your experience: what non-deterministic problems did you face, and what was the cure?

  • Robust and convenient Kotlin primitives

    How often do you get into this situation?

    Kotlin
    fun Document.isAccessible(tenantId: Int, userId: Int): Boolean
    
    if (doc.isAccessible(userId, tenantId)) ... // Error or security breach

    I hope – never – because you always add parameter names to function calls and also have 100% test coverage.

    Still, maintaining hundreds of indistinguishable val id: Int properties is frustrating and inevitably leads to mistakes.

    There’s a nice trick to solve this problem once and for all.

    Kotlin
    @JvmInline // Required for JVM world
    value class TenantId(val value: Int)

    Now you have a custom type for your Int id with almost no overhead. (“Almost” because Kotlin may still box a value class when it is used as a type argument or as a nullable value.)

    Now the dangerous code above will just fail to compile:

    Kotlin
    if (doc.isAccessible(userId, tenantId)) ...
    // Argument type mismatch: actual type is 'UserId', but 'TenantId' was expected.

    To make it easier to convert Ints to custom classes use helpers like:

    Kotlin
    fun Int.toTenantId() = TenantId(this)
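
    Put together, a minimal self-contained version might look like this. The Document model and the access rule here are invented for the example; only the value classes and the helper follow the snippets above.

    ```kotlin
    @JvmInline
    value class TenantId(val value: Int)

    @JvmInline
    value class UserId(val value: Int)

    fun Int.toTenantId() = TenantId(this)
    fun Int.toUserId() = UserId(this)

    // Hypothetical document model, for illustration only.
    data class Document(val ownerTenant: TenantId, val readers: Set<UserId>)

    fun Document.isAccessible(tenantId: TenantId, userId: UserId): Boolean =
        tenantId == ownerTenant && userId in readers

    fun main() {
        val doc = Document(ownerTenant = 1.toTenantId(), readers = setOf(42.toUserId()))

        println(doc.isAccessible(1.toTenantId(), 42.toUserId())) // true
        println(doc.isAccessible(2.toTenantId(), 42.toUserId())) // false

        // Swapping the arguments no longer compiles:
        // doc.isAccessible(42.toUserId(), 1.toTenantId())
    }
    ```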

    This turned out to be so helpful that I now also use custom value classes for UUIDs and Strings:

    Kotlin
    @JvmInline
    value class DocumentReference(val value: String)
    
    data class Document(
      val ref: DocumentReference,
      ...
    )
    
    @JvmInline
    value class ExternalResourceId(val value: UUID)
    
    data class ExternalResource(
      val id: ExternalResourceId,
      ...
    )

    You can go further and make your data framework understand these custom types. For example, with jOOQ:

    Kotlin
    class TenantIdConverter : AbstractConverter<Int, TenantId>(Int::class.java, TenantId::class.java) {
        override fun from(v: Int?): TenantId? = v?.toTenantId()
    
        override fun to(tenantId: TenantId?): Int? = tenantId?.value
    }
    
    // Then in jooq configuration
    forcedType {
        includeExpression = "tenant\\.id"
        userType = "org.example.TenantId"
        converter = "org.example.TenantIdConverter"
    }
                        
    forcedType {
        includeExpression = ".*\\.tenant_id"
        userType = "org.example.TenantId"
        converter = "org.example.TenantIdConverter"
    }

    Now you can save and fetch your TenantId directly.

    Moving away from primitives is a big step towards Domain-Driven Design and Hexagonal Architecture, which, in my opinion, is essential for any enterprise project!

  • Kotlin IR: Transforming DSL at Compile-Time

    In my previous post, we explored the basics of the Kotlin IR (Intermediate Representation). Today, we’ll move from theory to practice by building a transformer plugin that solves a common DSL design dilemma: Type-safe syntax vs. Runtime flexibility.

    The “Primitive” Limitation

    When building an assertion library, you often want a syntax that feels natural but remains strictly typed:

    Kotlin
    val x: Int = 10
    confirmThat { x } deepMatches { 10 }            // Should compile
    confirmThat { x } deepMatches { lessThan(20) }  // Should compile
    confirmThat { x } deepMatches { "abc" }         // Should FAIL

    Here, deepMatches expects a value of the same type as x. However, to make assertions work, we need lessThan(20) to return a Matcher object, not a primitive Java int.

    Usually, we have two bad options:

    • Wrappers: Force the user to write confirmThat { Box(x) }, which ruins the DSL.
    • Runtime Proxies: Impossible for primitive types like Int or Long.

    The IR Solution: We let the user write code that satisfies the Kotlin compiler (returning Int), then use an IR Transformer to swap that call for a Matcher object before the bytecode is generated.
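
    To show what "code that satisfies the compiler" can mean, here is a rough sketch of the API side only, without any plugin. Every declaration below (Subject, the Matcher interface, the signatures of confirmThat, deepMatches and lessThan) is an assumption made for this example and may differ from the real library; the important part is that the placeholder lessThan is typed as Int.

    ```kotlin
    // Hypothetical declarations; the real library may look different.
    interface Matcher {
        fun matches(actual: Any?): Boolean
    }

    class LessThanMatcher(private val bound: Int) : Matcher {
        override fun matches(actual: Any?): Boolean = actual is Int && actual < bound
    }

    class Subject<T>(val actual: T)

    fun <T> confirmThat(actual: () -> T): Subject<T> = Subject(actual())

    // Placeholder: returns Int only so that `deepMatches { lessThan(20) }` type-checks.
    // Without the compiler plugin it must never actually run.
    fun lessThan(bound: Int): Int =
        error("lessThan($bound) should have been rewritten by the compiler plugin")

    infix fun <T> Subject<T>.deepMatches(expected: () -> T): Boolean {
        val result: Any? = expected()
        // After the IR rewrite the lambda would hand back a Matcher instead of a plain value.
        return if (result is Matcher) result.matches(actual) else result == actual
    }

    fun main() {
        val x = 10
        println(confirmThat { x } deepMatches { 10 })                               // true
        println(runCatching { confirmThat { x } deepMatches { lessThan(20) } }.isFailure) // true: no plugin applied
        // confirmThat { x } deepMatches { "abc" }  // does not compile: String is not Int
    }
    ```

    The second call compiles but fails at runtime, which is exactly the gap the transformer has to close.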

    Implementing the Transformer

    To perform the swap, we implement an IrElementTransformerVoid. This allows us to intercept function calls and replace them with new expressions.

    1. Identifying the Target

    First, we filter for the specific API call we want to replace:

    Kotlin
    override fun visitCall(expression: IrCall): IrExpression {
        val fqName = expression.symbol.owner.kotlinFqName.asString()
        
        if (fqName == "com.minogin.confirm.api.lessThan") {
            // Logic for replacement goes here...
        }
        
        return super.visitCall(expression)
    }

    2. Creating the Replacement

    Once we’ve caught the lessThan call, we need to replace it with a constructor call to LessThanMatcher. This requires finding the class symbol in the classpath and mapping the original arguments.

    Kotlin
    // Setup the builder for the current scope
    val scopeSymbol = allScopes.lastOrNull()?.scope?.scopeOwnerSymbol 
        ?: return super.visitCall(expression)
        
    val builder = DeclarationIrBuilder(context, scopeSymbol, expression.startOffset, expression.endOffset)
    
    // Reference the implementation class (LessThanMatcher)
    val classId = ClassId(FqName("com.minogin.confirm.impl"), Name.identifier("LessThanMatcher"))
    val classSymbol = context.referenceClass(classId) ?: error("Implementation class not found")
    val constructorSymbol = classSymbol.owner.constructors.first().symbol
    
    // Rewrite the IR: lessThan(x) -> LessThanMatcher(x)
    return builder.irCall(constructorSymbol).apply {
        arguments[0] = expression.arguments[0] 
    }

    The “Magic” in the Bytecode

    Because this transformation happens at the IR level (after type checking but before JVM bytecode generation), the compiler is happy, and the runtime gets the object it needs.

    If we decompile the resulting .class file, we see that our placeholder function has vanished:

    Java
    // Original source
    confirmThat(10, () -> lessThan(20));
    
    // Decompiled output
    DSLKt.confirmThat(10, (scope) -> {
        return (Matcher)(new LessThanMatcher(20));
    });

    Summary & Best Practices

    By using IR Transformers, we’ve created a “syntax illusion”—providing a clean, type-safe API that behaves differently under the hood.

    A note for 2026: Since the K2 compiler is now standard, ensure your plugin is registered via the IrGenerationExtension. IR manipulation is powerful, but remember that you are bypassing standard language constraints; always provide clear error messages using IrMessageLogger if the transformation fails.


    Next time: We will dive into the project configuration (Gradle setup) and how to effectively debug your IR code.

  • Reducing memory usage 10 times with High-Performance Primitive Collections

    Kotlin basic types such as Int or Double correspond to high-performance Java primitive types such as int or double. But nullable (Int?) and generic (<Int>) versions of those types are mapped to boxed Java types such as Integer or Double.

    Boxed types are memory heavy. Let’s make a simple comparison.

    Kotlin
    @Test
    fun `memory occupied by primitive int`() {
        data class A(
            val x: Int
        )
    
        val N = 100_000_000
    
        val mem1 = calculateOccupiedMemoryMB()
    
        val list = List(N) { A(it) }
    
        val mem2 = calculateOccupiedMemoryMB()
    
        println("Occupied memory: ${mem2 - mem1} MB")
    
        list
    }
    
    > Occupied memory: 1910 MB

    Kotlin
    @Test
    fun `memory occupied by boxed Int`() {
        data class A(
            val x: Int?
        )
    
        val N = 100_000_000
    
        val mem1 = calculateOccupiedMemoryMB()
    
        val list = List(N) { A(it) }
    
        val mem2 = calculateOccupiedMemoryMB()
    
        println("Occupied memory: ${mem2 - mem1} MB")
    
        list
    }
    
    > Occupied memory: 3436 MB

    We already see an almost 2x difference, but the real gap is more serious, as our test is not accurate enough.
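
    A back-of-the-envelope estimate helps to read these numbers. The sketch below assumes a 64-bit HotSpot JVM with compressed references: a 12-byte object header, 4-byte references and 8-byte object alignment. These layout figures are assumptions about the JVM, not something the test measures.

    ```kotlin
    // Assumed JVM layout (64-bit HotSpot, compressed references).
    const val OBJECT_HEADER = 12L
    const val REFERENCE = 4L
    const val INT_FIELD = 4L
    const val MIB = 1024L * 1024L

    // Objects are padded to a multiple of 8 bytes.
    fun aligned(bytes: Long): Long = (bytes + 7) / 8 * 8

    // List slot + A(x: Int)
    fun primitiveCaseBytes(n: Long): Long =
        n * (REFERENCE + aligned(OBJECT_HEADER + INT_FIELD))

    // List slot + A(x: Int?) holding a reference + the boxed java.lang.Integer
    fun boxedCaseBytes(n: Long): Long =
        n * (REFERENCE + aligned(OBJECT_HEADER + REFERENCE) + aligned(OBJECT_HEADER + INT_FIELD))

    fun main() {
        val n = 100_000_000L
        println("primitive: ${primitiveCaseBytes(n) / MIB} MB") // primitive: 1907 MB
        println("boxed:     ${boxedCaseBytes(n) / MIB} MB")     // boxed:     3433 MB
    }
    ```

    Under these assumptions the estimate is close to the measured 1910 MB and 3436 MB. It also shows why the 2x figure understates the problem: in both tests 16 of the bytes per element are the wrapper object A and the list slot, while the integer itself grows from 4 bytes to 20 (a 4-byte reference plus a 16-byte Integer).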

    Code explained

    calculateOccupiedMemoryMB returns the occupied memory, that is, the difference between total and free memory. Before measuring, it requests a garbage collection and waits 3 seconds to reduce the garbage footprint.

    Kotlin
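    fun calculateOccupiedMemoryMB(): Int {
        val runtime = Runtime.getRuntime()
        runtime.gc()        // only a request; the JVM decides when to collect
        Thread.sleep(3000)  // give the collector time to finish
        // occupied = total heap currently reserved minus its free part, in MB
        return ((runtime.totalMemory() - runtime.freeMemory()) / (1024 * 1024)).toInt()
    }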

    The list reference at the end of the block is a trick to avoid JVM optimization. If the JVM sees that an object is no longer used, it may reclaim it before the second measurement.

    What if we need a huge Set of Ints or a huge Map of Int to Object? Unfortunately, standard Java collections are based on generics, which means all of the values will be autoboxed.

    Here HPPC (High Performance Primitive Collections) comes to the rescue. This library has predefined collections for all the primitive types.

    Let’s compare memory footprints of a normal Java HashSet<Int> and a corresponding HPPC IntHashSet.

    Kotlin
    @Test
    fun `memory occupied by HashSet`() {
        val N = 100_000_000
    
        val mem1 = calculateOccupiedMemoryMB()
    
        val set = hashSetOf<Int>()
        repeat(N) { set.add(it) }
    
        val mem2 = calculateOccupiedMemoryMB()
    
        println("Occupied memory: ${mem2 - mem1} MB")
    
        1 in set
    }
    
    > Occupied memory: 5098 MB

    Kotlin
    @Test
    fun `memory occupied by IntHashSet`() {
        val N = 100_000_000
    
        val mem1 = calculateOccupiedMemoryMB()
    
        val set = IntHashSet()
        repeat(N) { set.add(it) }
    
        val mem2 = calculateOccupiedMemoryMB()
    
        println("Occupied memory: ${mem2 - mem1} MB")
    
        1 in set
    }
    
    > Occupied memory: 518 MB

    10 times less memory used!
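
    The same kind of rough estimate explains where the factor of 10 comes from. The sketch assumes a 64-bit HotSpot JVM with compressed references (32-byte HashMap node, 16-byte Integer, 4-byte table slot), and it assumes that both HashSet and IntHashSet use a power-of-two table with a 0.75 load factor, with IntHashSet keeping its keys in a single int array. Treat the layout details as assumptions, not as library documentation.

    ```kotlin
    // Assumed sizes (64-bit HotSpot, compressed references).
    const val REFERENCE = 4L
    const val INT = 4L
    const val HASH_MAP_NODE = 32L  // 12-byte header + hash + key/value/next references, aligned
    const val BOXED_INTEGER = 16L  // 12-byte header + int value
    const val MIB = 1024L * 1024L

    // Smallest power-of-two table that holds n entries at a 0.75 load factor.
    fun tableCapacity(n: Long): Long {
        var capacity = 16L
        while (n > capacity * 3 / 4) capacity *= 2
        return capacity
    }

    // HashSet<Int>: one node and one boxed Integer per element, plus the reference table.
    fun hashSetBytes(n: Long): Long =
        n * (HASH_MAP_NODE + BOXED_INTEGER) + tableCapacity(n) * REFERENCE

    // IntHashSet: just the int array (plus one extra slot assumed for the zero key).
    fun intHashSetBytes(n: Long): Long = (tableCapacity(n) + 1) * INT

    fun main() {
        val n = 100_000_000L
        println("HashSet<Int>: ${hashSetBytes(n) / MIB} MB")    // HashSet<Int>: 5089 MB
        println("IntHashSet:   ${intHashSetBytes(n) / MIB} MB") // IntHashSet:   512 MB
    }
    ```

    Under these assumptions the estimate lands close to the measured 5098 MB and 518 MB: almost all of the HashSet memory is per-element node and boxing overhead, which the primitive collection does not have.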