In Kotlin primitive types such as Int
or Double
correspond to high-performant Java primitive types such as int
or double
. But nullable (Int?
) and generic (<Int>
) versions of those types are mapped to boxed Java types such as Integer
or Double
.
Boxed types have a serious impact on memory usage. Let’s make a simple comparison.
@Test
fun `memory occupied by primitive int`() {
data class A(
val x: Int
)
val N = 100_000_000
val mem1 = calculateOccupiedMemoryMB()
val list = List(N) { A(it) }
val mem2 = calculateOccupiedMemoryMB()
println("Occupied memory: ${mem2 - mem1} MB")
list
}
> Occupied memory: 1910 MB
@Test
fun `memory occupied by boxed Int`() {
data class A(
val x: Int?
)
val N = 100_000_000
val mem1 = calculateOccupiedMemoryMB()
val list = List(N) { A(it) }
val mem2 = calculateOccupiedMemoryMB()
println("Occupied memory: ${mem2 - mem1} MB")
list
}
> Occupied memory: 3436 MB
We already see almost 2x difference, but actually it’s more serious as our test is not accurate enough.
Now imagine we need a set of Int
or a map of Int
to Object
. If we use standard Java collections we will get all the boxed type overhead because all Java collection types are generic.
Fortunately there’re quite a lot of nice primitive collection implementations. After some research I sticked to HPPC: High Performance Primitive Collections.
Let’s compare memory footprints of a normal Java HashSet
of Int
and an HPPC IntHashSet
.
@Test
fun `memory occupied by HashSet`() {
val N = 100_000_000
val mem1 = calculateOccupiedMemoryMB()
val set = hashSetOf<Int>()
repeat(N) { set.add(it) }
val mem2 = calculateOccupiedMemoryMB()
println("Occupied memory: ${mem2 - mem1} MB")
1 in set
}
> Occupied memory: 5098 MB
@Test
fun `memory occupied by HashSet`() {
val N = 100_000_000
val mem1 = calculateOccupiedMemoryMB()
val set = IntHashSet()
repeat(N) { set.add(it) }
val mem2 = calculateOccupiedMemoryMB()
println("Occupied memory: ${mem2 - mem1} MB")
1 in set
}
> Occupied memory: 518 MB
10x less memory used!