Learning about Android runtime

Filip Wiesner
Published in MateeDevs
8 min read · Apr 7, 2024

Unknown knowns

Throughout my professional career as an Android developer, I have encountered a few “facts” that I just accepted without really asking how or why. We all have some tools we use without knowing how they truly work under the hood, and that’s perfectly okay. But sometimes it’s good to uncover some of the mystery. I used writing this article as a way to do just that. So, if you want to learn with me, here are a few such facts that I want to explore.

The first “fact” is that Compose is slower than Views. Is this really true, and if so, why? I understand that Compose is a relatively new technology that hasn’t gone through the years of optimization that Views have, but does that really mean it’s generally slower? Another one is that baseline profiles improve startup performance. That sounds good, but how do they work? The final fact is that app performance may improve a day after installation. I’m not even sure where I heard this, but it is really noticeable on a weaker phone. However, I am still unsure about the exact reasons behind it. One more question on top of all of these: “Why don’t iOS developers face these problems or need tools like baseline profiles? Are there any drawbacks to their approach?”

Android Baseline Profiles are a set of classes and methods that get compiled before the app first runs, resulting in improved startup performance.

To find answers to all of these questions (and more), we must first understand a few things about Android runtime and compilers. It turns out that building and distributing Android apps is not a simple task. The code you build on your computer and upload to the Play Store might not really be the same code the phone executes when installed. So let’s dive deep into the compilation process.

Compilation

You might have heard that Android uses the JVM (Java Virtual Machine) to execute apps. However, that is not entirely true. Android uses Java .class files as an intermediate step of the compilation, so, in theory, any language that compiles to JVM bytecode could run on Android. But instead of the JVM, Android has its own runtime, aptly called Android Runtime (ART). It’s not just a rebrand; it’s a completely different runtime. But that’s not important now. Let’s revisit this later to explain the details.

Right now, let’s go over some terms: high-level code, bytecode, and machine code. All of these can mean different things depending on the technology used, but in general, a program is written in high-level code (e.g., Kotlin, Java, Swift…), which is the easiest for humans to read. Bytecode is something in the middle, and there can be several stages of it. For example, when compiling Swift, the code is first lowered to SIL (Swift Intermediate Language) bytecode, then to LLVM IR (Intermediate Representation), which is finally compiled into machine code for the target platform. Machine code is what actually runs on the target architecture, and each architecture accepts different machine code.
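To make these levels concrete, here is a tiny Kotlin function next to a sketch of what the lower levels look like. The bytecode and assembly are simplified illustrations, not exact compiler output:

```kotlin
// High-level code: a single Kotlin function.
fun add(a: Int, b: Int) = a + b

// Roughly the same logic one and two levels down (simplified):
//
// JVM bytecode (.class, stack-based):    ARM64 machine code:
//   iload_0   // push a onto the stack     add w0, w0, w1  // w0 = a + b
//   iload_1   // push b                    ret             // return w0
//   iadd      // pop both, push a + b
//   ireturn   // return the stack top
```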

There is a catch, though: bytecode does not have to be compiled down to machine code; it can also be interpreted directly. This is the case for JVM bytecode (depending on the runtime). Interpreting bytecode might not be as fast as running machine code, but Java makes up for it with portability: the same bytecode runs on any system that has a JRE (Java Runtime Environment) installed. There are even cases where a high-level language is interpreted directly, skipping bytecode and machine code entirely, for example, JavaScript.

So now we understand the different levels of code compilation. Feels good, right? :) But what if I told you that both Kotlin and JavaScript in the example can actually end up as machine code as well? Let’s learn more!

When discussing compilation to machine code, there are generally two approaches: AOT (Ahead-Of-Time) and JIT (Just-In-Time) compilation. AOT is quite simple to explain. The code is compiled before the program’s execution so that when the program runs, it is already in the form of machine code. JIT, on the other hand, is a little bit harder to explain.

Just-In-Time (JIT) compilation typically refers to the process of identifying crucial sections of code that initially run in interpreted form and compiling them into machine code to optimize them. These “crucial code sections” can be whole methods (e.g., in the V8 JavaScript engine), where the runtime takes a method that is hot (called often) and optimizes it, or traces, where the runtime identifies a hot path that may be smaller than a method or span several methods. JITing has a few advantages over AOT: the code is compiled on-device, so the machine code can be optimized specifically for that environment, and since only the hot code is compiled, the app takes up less space on disk. The disadvantage is that there are simply more moving parts, and performance can vary between runs, so it’s less predictable.
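As a mental model, here is the kind of Kotlin method ART’s JIT might flag as hot. This is purely illustrative; the runtime decides what gets compiled, not your code:

```kotlin
// Purely illustrative: ART, not your code, decides what is "hot".
fun sumPrices(prices: List<Double>): Double {
    var total = 0.0
    for (price in prices) {  // If this method is called often enough, the
        total += price       // interpreter's invocation counters mark it as
    }                        // hot, and the JIT compiles it to machine code
    return total             // in the background for subsequent calls.
}
```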

So, is Android Runtime using JIT or AOT compilation? Well, actually both. Now that we know the details, let’s look at the bigger picture of the entire process.

How does it all work together?

The Android platform code itself is optimized and precompiled during OS updates, so the system runs smoothly from the start. Your app, however, does not benefit from this optimization immediately. Let’s explore the steps, and the various tools involved, that take your app from source code to smooth performance for users.

  1. The first step is the compilation of the high-level code into .class files. A class file is then typically interpreted directly by a JVM, but on Android, this is only the initial stage. The high-level code can be written in any language that compiles to JVM bytecode; the officially supported languages for Android are Java and Kotlin.
  2. Now, because the Android runtime does not understand class files, we must “dex” them. “Dex” stands for Dalvik Executable, because before ART, the Android runtime was called Dalvik (pre-Android 5). The tool that does this is called D8 (also known as the dexer), and it works in tandem with R8, a code shrinker and optimizer. The reason we need a different file format is that ART (and Dalvik before it) uses a “register-based” design, while the JVM uses a “stack-based” one. This decreases the instruction count by accessing registers directly instead of constantly manipulating a stack to retrieve values (see the sketch just after this list).
  3. Android Studio uses bundletool and other tools to package your compiled code, resources, and other files into an AAB (Android App Bundle). The AAB contains everything needed for your app to be distributed through the Google Play Store. In this step, we also pack in the baseline profile if we have one.
  4. When a user clicks “install,” the Play Store uses that AAB to generate an APK for the specific device, leaving out all the resources that would end up unused. Now the magic happens: the device opens the APK and looks for a baseline or cloud profile (we’ll talk about cloud profiles in the next step). If there is one, the hot code paths described by the profile are pre-compiled (AOT), so that when the app runs, that code is already machine code.
  5. While the app is running, the device traces the code and looks for any hot paths (code that is being run very often). Anything worth compiling is asynchronously JITed at runtime, and the rest of the traces are saved. JITed code is preferred over AOT-compiled code because it can take advantage of runtime information.
    When the device is idle and charging, it can use the time to catch up and look through all the traces it generated while running the app. If there are any places that could benefit from speeding up, it compiles them (AOT) and stores the information into the .oat file (the AOT binary for the .dex file). So when the app is opened next time, it can benefit from the pre-compiled code. This process is continuous and improves the app’s runtime every time it runs.
    But wait, it gets even better! These traces don’t stay on the device; they are uploaded to the Play Store and distributed with the APKs as cloud profiles to other users. So, by running the app, you are helping other users pre-compile hot code at install time, making their experience even better.
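Picking up the `add` function from earlier, here is the stack-based vs. register-based difference from step 2 side by side. Both columns are simplified sketches; the DEX side uses smali-style notation:

```kotlin
// The same `fun add(a: Int, b: Int) = a + b` in both bytecode formats.
//
// JVM bytecode (stack-based):      DEX bytecode (register-based, smali-style):
//   iload_0   // push a              add-int v0, p0, p1  // v0 = a + b
//   iload_1   // push b              return v0
//   iadd      // pop two, push sum
//   ireturn   // return stack top
//
// Four stack instructions become two register instructions.
```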

Disclaimer: I learned most of this while researching for this article, so I might have gotten something wrong. Please correct me in the comment section if that happens. Thank you.

Answering questions

Well, that was exciting! Now we are ready to answer the questions we had at the start. So, is Compose really slower than Views? That’s a really complicated and nuanced question, but I believe we can explain at least some of it. As we’ve discussed, the OS should already be optimized from the start, and Views are part of that. If you are old enough to remember the days before AppCompat, you know that some of the Views we used daily were locked behind an OS version. That’s because the actual implementation of the Views differed between OS versions. This is the same setup iOS developers have: UIKit and SwiftUI are coupled with the OS. But Compose is different; it is just a library and behaves like any other code you’ve written. That means it must undergo the same process of JITing and gradual improvement, which is why it can feel a bit slower. It also benefits a lot from R8 optimizations (e.g., inlining), which is why debug builds are even slower.
I believe the baseline profiles were covered pretty well in step 4 of the previous section. Just to reiterate, the code described in the profile is compiled when the app is installed. This means the device does not have to interpret the DEX code on the first startup and can instead focus on tracing and JITing other parts of the code. We can even understand the name better now: the profile sets a new “baseline” for further optimizations by the JIT.
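If you want to ship such a profile with your own app, this is roughly how one is produced today: a macrobenchmark test drives the app’s critical user journeys on a device and records which classes and methods get hit. A minimal sketch, assuming the androidx.benchmark.macro library and a placeholder package name:

```kotlin
import androidx.benchmark.macro.junit4.BaselineProfileRule
import androidx.test.ext.junit.runners.AndroidJUnit4
import org.junit.Rule
import org.junit.Test
import org.junit.runner.RunWith

@RunWith(AndroidJUnit4::class)
class BaselineProfileGenerator {
    @get:Rule
    val baselineProfileRule = BaselineProfileRule()

    @Test
    fun generate() = baselineProfileRule.collect(
        packageName = "com.example.app" // placeholder package name
    ) {
        // Exercise the journeys you want pre-compiled at install time;
        // every class and method touched here ends up in the profile.
        pressHome()
        startActivityAndWait()
    }
}
```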
As for our last question, “How can the app improve overnight?”, that is also explained in the previous section. Your device uses idle time to review the traces of the app and optimize everything it can.
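If you are curious whether this overnight work has actually happened on a device, the androidx.profileinstaller library ships a verifier you can query. A small sketch, assuming profileinstaller 1.3+; note that `get()` blocks, so we ask from a background thread:

```kotlin
import android.util.Log
import androidx.profileinstaller.ProfileVerifier
import kotlin.concurrent.thread

fun logCompilationStatus() {
    thread {
        // Blocks until the status is available, hence the background thread.
        val status = ProfileVerifier.getCompilationStatusAsync().get()
        when {
            status.isCompiledWithProfile ->
                Log.d("ART", "AOT-compiled with a profile")
            status.hasProfileEnqueuedForCompilation() ->
                Log.d("ART", "Profile recorded; waiting for idle compilation")
            else ->
                Log.d("ART", "No profile yet; running interpreted/JITed code")
        }
    }
}
```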

And by now, you probably guessed why iOS developers don’t have the same problems. Well, their apps are compiled to native code before uploading to the App Store, so there is no need to further optimize them. There are certainly benefits to this approach. The runtime performance of your app is more predictable and testable, and everything should be blazing fast because nothing has to be interpreted.

Finishing up, we’ve learned A LOT about the way Android runtime works. I’ve even skipped a few interesting facts about Dalvik and the history of ART. For example, ART was AOT only when initially released in Android 5, with JIT being added in Android 7.
I had a lot of fun researching for this article, and I hope you had a lot of fun reading it!
