
How we reduced our Android startup time by 77%

Mobile performance at Turo

Pavlo Stavytskyi
Turo Engineering
Apr 10, 2023

The startup time of a mobile app is one of the most important indicators of its performance and has a significant impact on the user experience. A fast-loading app not only provides a positive first impression, but also increases engagement and retention, reduces user churn, and can improve visibility in app store rankings.

In this story, we would like to share our experience of reducing the startup time of the Turo Android app: what improvements we were able to achieve, what steps we took, and how we measure the results.

We reduced the cold startup time of the Turo Android app by 77% on average.

The same metric was improved by 84% at the 50th percentile. The figure below shows a comparison of reported metrics before and after we improved the startup time of Turo’s Android app. All the values are measured in milliseconds.

Startup metrics before and after the improvement, measured in milliseconds

The illustration below demonstrates how the application looks and feels before and after the improvement that is already available for 50% of our users.

Illustration before and after the improvement

Before the improvement, the startup duration depended directly on the network connection: in some cases it was slightly faster, but it could also take much longer to start the app. This is no longer the case after the improvement.

The improvement

Before jumping into the specifics of the improvements we’ve made, it’s worth examining the startup process of the Turo app prior to the optimization efforts.

Startup process before improvements

The startup process of the app consisted of the following stages:

  • App initialization. This includes the application process initialization, which developers have little control over, as well as all the code the application runs before rendering the first UI frame. Android Vitals reported the end of this stage as startup complete.
  • Synchronous network requests. The app performed a number of network requests while showing a custom splash screen before allowing the user to see the home screen.
  • Custom splash animation. After the network requests were completed, the app ran a custom splash animation that took roughly 1 second. After that, the app navigated from SplashActivity to HomeActivity.
  • Home skeleton. At this point, the user is able to partially interact with the app while waiting for the home content to load and seeing a shimmer animation. While this stage was already cacheable, some improvements were required there as well.

Splash screen

One easy way we were able to improve the startup time of the app was by removing a custom splash animation that was displayed on each launch. While the animation was part of the design language, it was launched after all the synchronous network requests during startup, causing a delay in the overall startup time. By removing the animation, we were able to save approximately 1 second.

Now that we’ve tackled this item, let’s move on to more technically interesting aspects of startup improvement.

Initially, we used a dedicated SplashActivity to run all the startup work before routing the app to the HomeActivity. However, the latest guidelines advise against this approach. Therefore, we eliminated the redundant SplashActivity and transferred all the startup logic to our root activity by leveraging the Splash Screen API. This allowed for a unified experience across all Android versions and, as a bonus, improved our startup metrics in the Google Play Console’s Android Vitals, as shown in the diagram below.

Splash screen API helps to correct Android Vitals metrics

If an app displays a custom image or animation as a splash screen, the startup time metric in Android Vitals may not accurately reflect the actual startup time. This is because the time spent displaying the custom screen is not counted as part of the startup process. To address this issue, using a Splash Screen API can help explicitly define when the startup process is complete, resulting in more accurate reporting.
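
As an illustration, here is a minimal sketch (not the exact Turo code) of keeping the system splash screen visible until startup work completes. RootActivity and the isStartupComplete flag are hypothetical placeholders for whatever initialization the app still performs before the first frame.

class RootActivity : AppCompatActivity() {

    // Flipped to true once the deferred startup work has finished.
    private var isStartupComplete = false

    override fun onCreate(savedInstanceState: Bundle?) {
        val splashScreen = installSplashScreen()
        super.onCreate(savedInstanceState)

        // The first frame is not drawn until this condition returns false, so the
        // time spent waiting here is counted by Android Vitals as startup time.
        splashScreen.setKeepOnScreenCondition { !isStartupComplete }
    }
}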

Deferring synchronous network requests

Our app was performing a series of network requests while displaying the splash screen in order to fetch the necessary data before users could interact with it. This led to a noticeable slowdown in the startup process of the app, particularly when the network connectivity was poor.

In addition, the app performed a different number of startup network requests depending on whether it was used in guest mode, host mode, or unauthenticated. This resulted in inconsistent experiences for different types of users.

As an example, if the app is used in guest mode, a few of the requests help to decide which screen should be shown after the splash screen. By default, it's the home screen. However, there are special cases where the user could be redirected either straight to an active vehicle reservation or to a feedback screen where they can rate their last experience booking a vehicle.

Since in the majority of cases it will be the home screen, we don’t want to slow down every app startup by checking for minority scenarios. Therefore, we now open the home screen by default while making those network requests asynchronously, as sketched below. If a redirect is required, the corresponding screen is displayed on top of the home screen.
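
Here is a minimal sketch of the idea, assuming Kotlin coroutines. The RedirectTarget type, the checkStartupRedirect lambda, and the showOnTopOfHome callback are hypothetical names used for illustration rather than the actual Turo implementation.

sealed interface RedirectTarget {
    object None : RedirectTarget
    object ActiveReservation : RedirectTarget
    object Feedback : RedirectTarget
}

// Called from the home screen once it is already rendered.
fun resolveStartupRedirect(
    scope: CoroutineScope,
    checkStartupRedirect: suspend () -> RedirectTarget, // wraps the former startup requests
    showOnTopOfHome: (RedirectTarget) -> Unit,          // presents the redirect screen over home
) {
    scope.launch {
        val target = checkStartupRedirect()
        if (target != RedirectTarget.None) {
            // Only a minority of startups ever reach this branch.
            showOnTopOfHome(target)
        }
    }
}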

Fetching feature flags

While it’s a relatively straightforward optimization in many cases, there are network requests that require special handling. When fetching feature flags from the network, it is not sufficient just to run the network request asynchronously.

If no feature flag is requested immediately after the app startup, the home screen contents are displayed instantly with no issues.

However, what if we need to use a feature flag on the home screen? In that case, the network request usually won’t complete in time, so the app falls back to a cached value that might be outdated or missing, which in particular may lead to incorrect results for A/B tests.

To solve the problem, we start fetching feature flags in the background right on app launch. If any feature flag is required before the request completes, that call is suspended until the flags are fetched from the server. Meanwhile, we show a home screen skeleton with a shimmer animation. All subsequent feature flag reads in the app are instant, as shown in the figure below.

Asynchronous fetching of feature flags on startup

In order to avoid showing the loading skeleton on each subsequent app startup, we’re using a short-term cache for feature flags. This could be a few hours, a day, or a few days. We are still experimenting with the exact cache expiration values.
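
A minimal sketch of this behavior, assuming Kotlin coroutines, could look like the following. FeatureFlagStore, prefetch, and isEnabled are hypothetical names used for illustration; the real implementation additionally layers the short-term cache described above.

class FeatureFlagStore(
    private val scope: CoroutineScope,
    private val fetchFlagsFromNetwork: suspend () -> Map<String, Boolean>,
) {
    private val flags = CompletableDeferred<Map<String, Boolean>>()

    // Started right on app launch, e.g. from Application.onCreate.
    fun prefetch() {
        scope.launch { flags.complete(fetchFlagsFromNetwork()) }
    }

    // Suspends only until the initial fetch completes; all later reads are instant.
    suspend fun isEnabled(key: String): Boolean = flags.await()[key] ?: false
}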

Baseline Profiles

After we removed all the synchronous network requests, our startup duration became more deterministic, and it started to make sense to apply Baseline Profiles. This feature pre-compiles startup code paths ahead of time, boosting app startup.

Applying Baseline Profiles to our app resulted in an approximately 15% improvement according to Macrobenchmark results (compared to a run with no compilation applied).

Results of startup benchmarks
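
For reference, a Macrobenchmark startup comparison of this kind can be expressed roughly as follows; the package name and iteration count below are placeholders rather than our actual configuration.

@RunWith(AndroidJUnit4::class)
class StartupBenchmark {

    @get:Rule
    val benchmarkRule = MacrobenchmarkRule()

    @Test
    fun startupNoCompilation() = startup(CompilationMode.None())

    @Test
    fun startupWithBaselineProfile() = startup(CompilationMode.Partial())

    private fun startup(compilationMode: CompilationMode) = benchmarkRule.measureRepeated(
        packageName = "com.example.app", // placeholder package name
        metrics = listOf(StartupTimingMetric()),
        compilationMode = compilationMode,
        startupMode = StartupMode.COLD,
        iterations = 10,
    ) {
        pressHome()
        startActivityAndWait()
    }
}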

Optimizing disk I/O operations

When investigating ways to improve startup times for our app, we noticed that it performed a number of disk I/O operations during startup.

Strict mode violations. StrictMode is a tool that, among other things, helps to detect accidental disk I/O operations performed on the main thread. Such operations are especially unwelcome during the application’s startup, and it is recommended to defer them or move them to a background thread. This tool should be used in debug builds only. When configured, it posts events to Logcat that look like the example below and include stack traces that help identify the source of the issue.

StrictMode policy violation; ~duration=107 ms: android.os.strictmode.DiskReadViolation

This helped us identify a number of reads from local storage on the UI thread that could be moved to the background.
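
For context, a debug-only StrictMode configuration of the kind described above can be sketched as follows; the exact set of detections is a choice and not necessarily the Turo configuration.

class App : Application() {

    override fun onCreate() {
        if (BuildConfig.DEBUG) {
            StrictMode.setThreadPolicy(
                StrictMode.ThreadPolicy.Builder()
                    .detectDiskReads()
                    .detectDiskWrites()
                    .detectNetwork()
                    .penaltyLog() // logs violations such as DiskReadViolation to Logcat
                    .build()
            )
        }
        super.onCreate()
    }
}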

Shared preferences. When investigating the strict mode logs, we identified that a number of the issues were caused by reads from SharedPreferences. Therefore, some of them were moved to a background thread. Others, however, needed to be performed before the home screen is displayed, so the code that accesses them was deferred until the very end of the app initialization.

It is important to emphasize that SharedPreferences starts reading from disk the moment the object is initialized, according to the documentation of Context.getSharedPreferences:

Retrieve and hold the contents of the preferences file ‘name’, returning a SharedPreferences through which you can retrieve and modify its values.

context.getSharedPreferences("my-prefs", Context.MODE_PRIVATE)

Often, these objects are eagerly initialized and injected into constructors using dependency injection, which can lead to early disk reads on the UI thread.

Lazy initialization

A quick solution for the issue with SharedPreferences is to ensure they are initialized lazily. If Dagger is used as the dependency injection framework in a project, dagger.Lazy can help achieve this goal, as shown in the example below.

class LocalDataSource(private val prefs: dagger.Lazy<SharedPreferences>) {

    fun getData(): String? {
        // The SharedPreferences instance (and its disk read) is only created
        // on the first call to get(), not when LocalDataSource is constructed.
        return prefs.get().getString("data", null)
    }
}

In addition, lazy injection can be used for other types of classes that might perform extensive work on the UI thread during their initialization.

Application lifecycle callbacks

Another factor that can negatively impact startup time is the callbacks tied to the app’s lifecycle, such as Application.ActivityLifecycleCallbacks.

They are easy to miss at first glance, but they can hold heavy initialization code that will obviously slow down app startup. Moreover, these callbacks can be implicitly used by third-party SDKs.

Third-party SDKs

Speaking of third-party SDKs: many of them suggest being initialized in Application.onCreate. Unfortunately, in some cases their initialization logic might not be the most efficient in terms of performance. Some of them are easy to fix just by moving their initialization to a background thread or deferring it until they are actually used, as sketched below; others are trickier and take more effort to adapt or remove.
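
For the easy cases, deferring initialization off the critical path can be sketched as below; ThirdPartySdk and the application-scoped coroutine scope are hypothetical, and moving initialization to a background thread is only safe if the SDK in question supports it.

class App : Application() {

    private val appScope = CoroutineScope(SupervisorJob() + Dispatchers.Default)

    override fun onCreate() {
        super.onCreate()
        // Initialize off the main thread instead of blocking Application.onCreate.
        appScope.launch {
            ThirdPartySdk.initialize(this@App)
        }
    }
}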

As an example, we’re using an SDK that needs to be tied to the lifecycle of every Activity. Moreover, it is implicitly initialized under the hood the first time onActivityCreated is called, performing disk I/O and other operations on the UI thread and slowing down app startup. While it’s possible to improve its initialization performance, doing so requires additional effort, and such cases are something one should be prepared to deal with.

Home screen initialization

While the Turo Android app already used a local cache for the home screen contents, we found an issue that forced the app to show the loading skeleton on every startup. It turned out that one of the network requests at some point started returning a null value where a non-null field was expected on the client. This caused a serialization error, which was suppressed by one of the RxJava operators, such as onErrorResumeNext. As a result, a portion of the home screen contents was always requested from the network, causing a loading delay, since it always failed to be cached.
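
A simplified sketch of the pitfall is shown below. The loadHomeFromCache and loadHomeFromNetwork functions and the HomeContent type are hypothetical; the point is that the fallback operator should at least log the error instead of silently swallowing it.

fun homeContent(
    loadHomeFromCache: () -> Single<HomeContent>,
    loadHomeFromNetwork: () -> Single<HomeContent>,
): Single<HomeContent> =
    loadHomeFromCache()
        .onErrorResumeNext { error ->
            // Without this log, a broken cache (e.g. a serialization error caused by
            // an unexpected null field) fails silently on every startup, and the home
            // content is always re-fetched from the network.
            Log.w("HomeRepository", "Cache read failed, falling back to network", error)
            loadHomeFromNetwork()
        }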

Measuring the improvement

Making performance improvements is surely an important task, but measuring them is essential.

Configuring an A/B test

The described startup improvement is already available to 50% of our users as part of the ongoing A/B test. Since we got rid of the old SplashActivity and moved all the startup logic to the HomeActivity, the legacy startup configuration is simulated in the latter with the help of the Splash Screen API, which covers the screen while the old network requests and I/O operations are performed.

In order to decide whether a user should be placed in the control group (slow startup) or the treatment group (fast startup), we don’t use a traditional feature flag requested from a remote server. This is because it would have to be requested very early in the app lifecycle, which may lead to consistency issues. Instead, we use a straightforward method to divide users into experiment groups locally on the device, giving a roughly 50/50 split as shown below.

val deviceId: String = <locally generated and cached UUID>

val variant = when (abs(deviceId.hashCode() % 2)) {
    0 -> Variant.SLOW_STARTUP
    1 -> Variant.FAST_STARTUP
    else -> error("unreachable branch")
}

This code maps a string to one of two possible numbers, 0 or 1, each associated with a corresponding variant. When the string is a randomly generated UUID, this gives an approximately even 50/50 split. In our case, we use a locally generated and cached deviceId.

The experiment is presently in progress, and we are actively gathering and assessing the data.

Measuring the startup time

In addition, we needed to measure the startup time of the app ourselves. We could not rely on tools like Android Vitals, as we needed a clear distinction between the two variants of the A/B test. Instead, we needed a simple tool that identifies the cold startup time and reports it to our analytics under the corresponding experiment variant.

In order to measure the startup time, we needed to identify 2 points in time: startup beginning and end.

Startup beginning. We consider the time when the system calls ContentProvider.onCreate as the starting point for measurements. Why a content provider? Simply because content providers are among the first developer-written components initialized in the app, even before Application.onCreate is called.

We also track the time since the process started using Process.getStartUptimeMillis. However, the majority of the app process initialization is not under the developer’s control, so counting the startup time from this point might be slightly redundant. Therefore, we rely on the former value in the scope of this blog post.
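
A minimal sketch of this idea is shown below. StartupClock and StartupTimestampProvider are hypothetical names; the provider is registered in the manifest so that its onCreate runs before Application.onCreate.

object StartupClock {
    var startUptimeMillis: Long = 0L
        private set

    fun markStart() {
        startUptimeMillis = SystemClock.uptimeMillis()
    }
}

class StartupTimestampProvider : ContentProvider() {

    override fun onCreate(): Boolean {
        StartupClock.markStart() // the startup beginning for our measurements
        return true
    }

    // The remaining ContentProvider methods are unused.
    override fun query(uri: Uri, projection: Array<out String>?, selection: String?, selectionArgs: Array<out String>?, sortOrder: String?): Cursor? = null
    override fun getType(uri: Uri): String? = null
    override fun insert(uri: Uri, values: ContentValues?): Uri? = null
    override fun delete(uri: Uri, selection: String?, selectionArgs: Array<out String>?): Int = 0
    override fun update(uri: Uri, values: ContentValues?, selection: String?, selectionArgs: Array<out String>?): Int = 0
}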

Startup end. Since we’re using the Splash Screen API, it comes in handy for identifying when startup is complete. It has a setKeepOnScreenCondition callback that is called when the splash screen is about to be closed. Under the hood, it’s just an OnPreDrawListener. The most straightforward way to do it is shown below.

class HomeActivity : AppCompatActivity() {

    private val handler = Handler(Looper.getMainLooper())

    override fun onCreate(savedInstanceState: Bundle?) {
        val splashScreen = installSplashScreen()
        super.onCreate(savedInstanceState)

        splashScreen.setKeepOnScreenCondition {
            // Post at the front of the main thread queue so the report runs
            // right after the first frame is rendered.
            handler.postAtFrontOfQueue {
                reportStartupComplete()
            }
            false // don't keep the splash screen on screen
        }
    }
}

We report the app startup time as the difference between the end and beginning points, where the timestamp for each is captured using SystemClock.uptimeMillis.
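
Putting it together, the reporting call can be sketched as follows, reusing the hypothetical StartupClock above and with trackEvent standing in for our analytics client.

private fun reportStartupComplete() {
    val startupMillis = SystemClock.uptimeMillis() - StartupClock.startUptimeMillis
    trackEvent(
        name = "cold_startup_time",
        properties = mapOf(
            "duration_ms" to startupMillis,
            "variant" to variant.name, // SLOW_STARTUP or FAST_STARTUP
        ),
    )
}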

If you would like to know more about what metrics to pick in order to measure the Android app startup time or how to identify a cold startup, we highly recommend checking this series of blog posts from Pierre-Yves Ricau: 1, 2, 3.

The issue?

Regardless of which metric is used as the startup beginning, after analyzing the reported data we noticed that the app reported a number of enormously large values (e.g., in some cases the app reported startup taking minutes, hours, or even days). As it turns out, in some cases the app process might be started long before the user taps the app icon to launch the app.

After analyzing the source code of existing startup monitoring solutions for Android, such as Firebase, we saw that this issue is handled by simply ignoring any value greater than 60 seconds. Therefore, we followed a similar approach.
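
In the hypothetical reportStartupComplete sketch above, that amounts to a single guard clause before tracking the event:

// Likely a process that was pre-warmed long before the user actually launched the app.
if (startupMillis > 60_000L) return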

Verifying the data

But how can we confirm that filtering out all values greater than 60 seconds will keep the reported metrics precise?

What helped us verify it is the fact that we ran the startup A/B test in two iterations. The reader might remember that the first change we made was removing a custom splash animation that took roughly 1 second. This was the only change in the first iteration of the experiment, and it provided a constant difference in startup time between control and treatment.

We analyzed the reported metrics while filtering out all values greater than 60 seconds. By doing so, we observed a constant 1-second difference between the experiment groups at each percentile. On the other hand, when the filter was not applied, the difference grew noticeably as we moved toward the 99th percentile. This confirmed that filtering out values greater than 60 seconds produces correct results.

Learnings


There are multiple reasons why the startup process of an app might be slow. It’s important to start with the improvements that require the least effort and produce the biggest result. When moving toward the other end of that spectrum, it’s important to evaluate each step and decide whether it is worth the effort compared to the improvement it can provide.

Due to the significant difference in performance metrics between the before and after states, this project helped ignite interest in other areas of mobile performance and developer productivity at Turo, on both Android and iOS.

While improving startup time is a challenging task, measuring it brings its own challenges. There is no simple API on Android that provides it. Finding a proper way to report startup metrics requires trial and error, while testing every idea or hypothesis is slowed down by the product release process and its gradual rollout to users.

In many cases, engineering faces a conflict between allocating time and resources to product work or to performance optimization. However, treating performance as a standalone product feature is essential for its success.
