Performance of 2GIS for Android

This article has been translated from its original publication at https://habr.com/ru/companies/2gis/articles/734688/

Every large application eventually faces the task of increasing its startup speed. The 2GIS application on Android was no exception. Let me tell you how the testing team searched for the causes of slow startup.

Initial performance issues

In November 2021, the application took two seconds to launch on a test Huawei Mate 10, and then it took an additional 10 seconds to render the map. This means that users had to wait 12 seconds before they could start using 2GIS, which was catastrophically slow.

The problem turned out to be the enabled "Metro" layer: on startup, the application applied additional styles to it, forcing a redraw of the map. In other words, the user had to wait while we first rendered the map with one set of styles and then redrew it with another.

To fix the problem as quickly as possible, we shipped a temporary workaround: the layer was simply not loaded at startup. Its state was not persisted, so users had to re-enable it manually every time. Later, this stopgap was replaced with a proper solution.

After this incident, our QA team decided to include startup speed checks in every release.

Manual checks 

The initial checks on development builds were done manually. And by manually, I mean truly manual: a tester with a stopwatch in hand would launch the application five times at intervals of 10 minutes. They would then record the results and calculate the average. The same process was repeated for the previous release to compare the startup speeds.

Naturally, this was not an ideal solution as it took a lot of time and was highly unreliable.

We started looking for internal parameters within the application that could accurately track the startup speed. Timestamps of module initialization, which were written to the log, proved to be such parameters.

In addition, we made changes to the calculation algorithm:

  1. We abandoned the 10-minute waiting period between application launches, as the device could "rest" during that time and activate battery-saving features.
  2. We increased the number of launches to 10 (data from only five launches was insufficient for analysis).
  3. We started calculating the median instead of the arithmetic mean, since the mean is easily skewed by anomalous outliers such as suspiciously fast or slow launches. A minimal sketch of this calculation follows the list.
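For illustration, here is a minimal sketch of the calculation (not our production code; the numbers are invented) showing how the median shrugs off a single anomalous launch that would drag the mean upward:

```kotlin
// Median of startup times: robust against anomalous launches.
fun median(samples: List<Int>): Double {
    val sorted = samples.sorted()
    val mid = sorted.size / 2
    return if (sorted.size % 2 == 1) sorted[mid].toDouble()
    else (sorted[mid - 1] + sorted[mid]) / 2.0
}

fun main() {
    // Ten launches, startup time in ms; one suspiciously slow outlier.
    val launches = listOf(2100, 2050, 2200, 2150, 9800, 2080, 2120, 2090, 2110, 2070)
    println("mean   = %.0f ms".format(launches.average())) // 2877 ms, skewed by the outlier
    println("median = %.0f ms".format(median(launches)))   // 2105 ms, unaffected
}
```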

These measurement results on test devices more or less matched the information about startup speeds reported by users. However, over time, this process stopped being effective.

Critically long startup 

After that, our application went through many major changes.

We completely redesigned the main screen.

We redesigned the search results. 

In each category, the filters can now be customized individually: for food, cuisine types and the average bill; for beauty, the type of service or the cost of a haircut. There is also a new "Friends were here" section.

We changed the design of the company and building cards.

And in April 2022, we released a "hybrid" feature: the ability to use the application without downloading the entire city.

Everything seemed great, and we were satisfied with the release, until we looked at the graph and saw it clearly: startup time had gone up, and retention had dropped with it. Soon we also started receiving user feedback that the application had become slower.

Automated tests 

We formulated several hypotheses for the reasons behind these performance issues. But before testing them, we decided to determine how to reliably assess whether we had actually improved performance.

One option was to install an optimized build on a device or emulator and run it multiple times while digging through the logs for startup times and their changes. Overall it was a decent approach, but it had a major drawback: it was labor-intensive. The process could take anywhere from 15 to 30 minutes, depending on the tester, the device, and the method of collecting metrics.

In addition to that:

  1. Too little data. In theory, we could manually launch the application a thousand times and then analyze the startup times, but that would take a colossal amount of time.
  2. Device background processes. It is difficult to account for all the smartphone processes that may affect performance at any given moment.
  3. Throttling. Launching an application is a resource-intensive operation: the device does a lot of heavy work, power consumption rises, and the device can overheat. To keep this under control, the smartphone artificially lowers its own performance.

We decided that the fairest approach would be a benchmark executed on real devices, ideally the models most popular among our users.

The next step was to ensure that the results could be trusted. For this, we turned to mathematical statistics. With its help, we managed to understand how to calculate the results and ensure that they were not misleading.

What we did:

  1. We wrote an automated test that launches the application N times with a timeout of M seconds. The timeout helps prevent the device from being overloaded while also preventing it from going into a deep sleep.
  2. For each device, we selected an optimal timeout based on the relative standard deviation, also known as the coefficient of variation (CV). Roughly speaking, this is the percentage by which results differ from each other within the sample: the lower the percentage, the less variation. For example, with a timeout of 10 seconds we might see a CV of 40%, meaning the device struggles to perform a launch every 10 seconds; so we increase the timeout and check how the CV changes. Every device we run this automated test on goes through this calibration (a sketch follows this list).
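A minimal sketch of the calibration logic (the function names, the 10% threshold, and the 120-second cap are illustrative assumptions, not our exact parameters):

```kotlin
import kotlin.math.sqrt

// Coefficient of variation (relative standard deviation), as a percentage.
fun coefficientOfVariation(samples: List<Double>): Double {
    val mean = samples.average()
    val variance = samples.sumOf { (it - mean) * (it - mean) } / samples.size
    return sqrt(variance) / mean * 100.0
}

// Increase the inter-launch timeout until the results stabilize.
// `runBenchmark` stands in for N launches with the given timeout.
fun calibrateTimeout(runBenchmark: (timeoutSec: Int) -> List<Double>): Int {
    var timeoutSec = 10
    while (timeoutSec < 120) {
        val cv = coefficientOfVariation(runBenchmark(timeoutSec))
        if (cv < 10.0) return timeoutSec // 10% threshold is an assumption
        timeoutSec += 10                 // device is struggling; give it more time
    }
    return 120
}
```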

PROFIT! In the end, we obtained an automated test that can launch two different versions of the application concurrently. This addresses the problem of devices "living their own lives" and lets us compare startup-speed changes, for example between the previous and current versions of the application.
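Our harness is custom-built (the concurrent two-version comparison in particular), and its code is not shown here. As a point of reference, a minimal cold-startup benchmark on stock tooling, Jetpack Macrobenchmark, would look roughly like this; the package name is a placeholder:

```kotlin
import androidx.benchmark.macro.StartupMode
import androidx.benchmark.macro.StartupTimingMetric
import androidx.benchmark.macro.junit4.MacrobenchmarkRule
import androidx.test.ext.junit.runners.AndroidJUnit4
import org.junit.Rule
import org.junit.Test
import org.junit.runner.RunWith

@RunWith(AndroidJUnit4::class)
class StartupBenchmark {
    @get:Rule
    val benchmarkRule = MacrobenchmarkRule()

    @Test
    fun coldStartup() = benchmarkRule.measureRepeated(
        packageName = "com.example.app",         // placeholder, not the real app id
        metrics = listOf(StartupTimingMetric()), // reports time to initial display
        iterations = 10,
        startupMode = StartupMode.COLD
    ) {
        pressHome()
        startActivityAndWait() // launch the default activity, wait for its first frame
    }
}
```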

After implementing the benchmark, we were able to assess how the optimizations we made were helpful.

Adding Traces 

First, we explored how to analyze current performance issues.

To do this, we started actively profiling the application and adding traces. Our application consists of various components: Qt, QML, C++, Java. Some traces were already added, but they were not sufficient. We added traces in places where needed and monitored what was happening in these components during basic scenarios such as application launch, search, and card opening. We examined the results in Perfetto.
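For example, on the Java/Kotlin side a custom trace section visible in Perfetto can be added with androidx.tracing (the section name here is illustrative; the C++/Qt components use their own tracing mechanisms):

```kotlin
import androidx.tracing.Trace

fun initSearchModule() {
    Trace.beginSection("SearchModule.init") // shows up as a named slice in Perfetto
    try {
        // ... module initialization work ...
    } finally {
        Trace.endSection() // always close the section, even on exceptions
    }
}
```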

We identified bottlenecks:

  1. Not all threads were fully utilized; some were occasionally idle. How we addressed it: We optimized the workload in the threads to minimize idle time.
  2. Many objects were created during application startup that were not actually needed for the initial launch. For example, distance measurement elements from two simultaneous long taps. How we addressed it: We deferred the creation of such objects.
  3. We identified elements that are not immediately needed by the user when opening a card or performing a search. For example, the action bar displaying the route, "Call," "Favorites," "Show entrance," sharing, taxi, and promotional CTA.


To address this, we deferred the creation of these elements by a few milliseconds. This let us display the header and body quickly, with the action bar following right after.
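The real code lives in QML/C++; a rough Kotlin sketch of the same idea (all names and the delay value are illustrative) would be:

```kotlin
import android.os.Handler
import android.os.Looper

class DistanceMeasurer // stands in for the two-long-tap measurement tool

class CardScreen(private val buildActionBar: () -> Unit) {
    // Created on first use instead of during startup.
    private val distanceMeasurer by lazy { DistanceMeasurer() }

    fun onFirstFrameDrawn() {
        // Header and body are already on screen; build the action bar
        // a few milliseconds later instead of up front.
        Handler(Looper.getMainLooper()).postDelayed({ buildActionBar() }, 50)
    }
}
```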

The card launch speed improved, but the overall application launch speed was still unsatisfactory.

Problem with accumulated data 

To find the cause of the long startup, we compared the main version of the application with the beta. The beta lived on one device as the daily driver and was used constantly, while the main version was frequently reinstalled.

One day, the beta version had a drop in launch speed, while the main version launched quickly. We initially thought the cause was known bugs that were fixed in the next version, so we were surprised when the launch speed remained slow after updating.

We suspected that the problem was not in the application's functionality but in its accumulated data on the device. We captured logs and simply reinstalled the beta version, and voila, the application launched as fast as the main version!

There were no doubts anymore—the problem was definitely with the data.

We had run into a classic pitfall of test devices: the application's lifecycle on them is very short and it is reinstalled often, so a large volume of data never has time to accumulate.

By removing just one line of code, Directories::logDiskUsage(options);, which collected statistics on the application's files and computed their total size, we gained a significant speed improvement. One line of code, and hundreds of thousands of satisfied users.
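We won't reproduce the internals of Directories::logDiskUsage here, but the cost profile of any such scan is easy to see: a typical disk-usage walk is linear in the number of files, which is exactly why it got slower as data piled up on long-lived installs. A sketch:

```kotlin
import java.io.File

// One stat call per file: cheap on a fresh install,
// painful across hundreds of thousands of accumulated files.
fun diskUsageBytes(root: File): Long =
    root.walkTopDown()
        .filter { it.isFile }
        .sumOf { it.length() }
```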

Accelerating card opening and search results 

So, we tackled the main problem—the critically slow startup. The application started much faster, and the graphs showed a significant improvement. However, there were other important user scenarios that we also wanted to speed up.

We sped up card opening using two simple methods:

  1. Previously, the card object was compiled and then instantiated at the moment it was needed. We improved this by pre-compiling the object ahead of time and instantiating it only when the user opens the card (sketched below).
  2. When opening a card, we used to display the header first, then the card actions, and only after that the card body containing the information most important to the user. Elements such as the card actions delayed the visual display of the main card information: organization name, address, contact details.

We rearranged the display order and achieved a visual speedup: all the essential data appears earlier.
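A rough Kotlin analogue of the pre-compilation idea from point 1 (the real implementation deals with QML components; all names here are illustrative):

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Deferred
import kotlinx.coroutines.async

class CardInstance
class CompiledCard { fun instantiate() = CardInstance() }

// Heavy one-time preparation (QML compilation in the real app).
fun compileCardTemplate(): CompiledCard = CompiledCard()

class CardFactory(scope: CoroutineScope) {
    // Pre-compile in the background right after startup...
    private val compiled: Deferred<CompiledCard> = scope.async { compileCardTemplate() }

    // ...and pay only the cheap instantiation cost when a card is opened.
    suspend fun openCard(): CardInstance = compiled.await().instantiate()
}
```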

In the search results, we prioritized card creation over action creation: the cards are displayed first because they carry the most value for the user. We also started preparing the cards in the search results even before the results arrive and the card types are known.

Added automated tests

To guard against regressions, we added automated tests:

  1. Benchmark for the speed of opening advertising cards and places. 

It performs N consecutive card openings: launch the application → open a card → collect metrics → open another card → …

In addition, it performs N application launches with subsequent card openings: launch the application → open a card → collect metrics → close the application → open the application → ...

We do this because the first card opening after application startup is a heavy operation.

  2. Benchmark for the rendering speed of elements in the search results. 

We also measure the opening of advertising cards here. Like the previous one, this benchmark can perform N application launches with subsequent metric measurements, or multiple measurements within a single session. Both measurement modes are sketched below.
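A sketch of the two measurement modes with UiAutomator (the package, activity, and helper names are placeholders):

```kotlin
import androidx.test.platform.app.InstrumentationRegistry
import androidx.test.uiautomator.UiDevice

val device: UiDevice =
    UiDevice.getInstance(InstrumentationRegistry.getInstrumentation())

// Mode 1: one session, N consecutive card openings.
fun consecutiveOpenings(n: Int) {
    device.executeShellCommand("am start -W com.example.app/.MainActivity")
    repeat(n) {
        openCardAndCollectMetrics() // open a card -> collect metrics -> next card
    }
}

// Mode 2: N cold cycles, because the first opening after startup is the heavy one.
fun coldOpenings(n: Int) {
    repeat(n) {
        device.executeShellCommand("am start -W com.example.app/.MainActivity")
        openCardAndCollectMetrics()
        device.executeShellCommand("am force-stop com.example.app")
    }
}

fun openCardAndCollectMetrics() { /* UI interaction + metric collection */ }
```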

Conclusion

Screencasts recorded on three different devices—Samsung Galaxy A51, Google Pixel (first generation), and Redmi Note 8 Pro—will tell the story better than words.

On each device, we launched three different versions of 2GIS: 6.12 (let's call it the slow one), 6.15 (the latest version where the situation has already improved significantly), and another version currently under development (let's call it 6.15+ for convenience). In each version, we performed a simple action—opening a company card.

Currently, the production version is 6.20, and we continue to optimize.


