Best Practices: A/B testing
A/B testing is a crucial component of every business strategy. It involves testing your assumptions against a control group to ensure your new approach maximizes revenue and grows your business. With Unity LevelPlay’s A/B testing tool, you can challenge and experiment with different monetization variables to understand how your users engage with ads and choose a winning strategy.
A/B testing requires smart planning and defined goals to get the most conclusive results. The following best practices will help you conduct clean tests using the Unity LevelPlay A/B testing tool, so you can make the best decisions based on accurate data.
Best practices for setting up an A/B test
- Test everything that can impact ARPDAU: You should constantly be testing any variable that can help boost your ARPDAU. Use the following tests as a starting point:
- Add new ad networks to see if they are profitable
- Waterfall optimization tests such as hybrid vs. traditional waterfall, a new waterfall for a specific country, different instance pricing, adding or removing instances, waterfalls for segments and groups
- Different capping and pacing strategies to bring the greatest profits without harming engagement
- Banner refresh rates to maximize banner performance
- Different rewarded video reward amounts to determine which has the greatest level of engagement
- Utilize the A/B traffic allocator: If you don’t want to start your test with an even 50/50 split between your control and test groups, use the A/B traffic allocator to divide your traffic, for example at a 90/10 split. This allows you to analyze the performance of the test group without committing too much traffic to it. If the test group performs well, start increasing its traffic 5-10% at a time (see the bucketing sketch after this list).
- Test one variable at a time: As you optimize your ad strategy, you’ll probably find there are multiple variables you want to test. The only way to effectively evaluate the significance of any change is to isolate one variable and measure its impact. This means, for example, that you shouldn’t activate a new network while also changing the reward amount of a video placement.
- Make bulk changes to the variable you’re testing: Test multiple changes to the same variable in your B group, in line with your KPIs. For example, if you’re using Real Time Pivot latency reports to identify the instances with high latency (the variable), remove all of the networks you suspect are causing high waterfall latency in a single test.
- Avoid testing on very low DAU apps: Have at least 15K DAU on the app level to reach the most data-driven business conclusions.
- Test for a minimum of 3 days and a maximum of 14: This ensures enough learning for group B and allows you to test other variables in quick succession.
- Leave group A alone: Once you’ve identified the variable you want to test, leave your control group (group A) unaltered. The settings in group A already exist as your current ad strategy. Changes should only be made in group B to challenge the current strategy and compare the results.
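LevelPlay applies the traffic split for you in the platform, but the sketch below shows how a deterministic allocator can keep a 90/10 split stable as you ramp traffic. It is a minimal illustration in plain Python; the function name, salt, and user ID are our own assumptions, not LevelPlay internals. Hashing each user ID into a fixed bucket means a user stays in the same group for the life of the test, and raising the test percentage only moves additional buckets into group B.

```python
import hashlib

def assign_group(user_id: str, test_pct: int = 10, salt: str = "waterfall-test-1") -> str:
    """Deterministically bucket a user into group 'A' (control) or 'B' (test).

    Hashing user_id + salt yields a stable, uniform bucket in [0, 100),
    so each user keeps the same group for the life of the test.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return "B" if bucket < test_pct else "A"

# Start at 90/10; widening to 80/20 keeps everyone already in B in group B.
print(assign_group("user-12345", test_pct=10))
print(assign_group("user-12345", test_pct=20))
```

Because buckets 0-9 stay in group B when the test percentage rises, ramping traffic never reshuffles existing users between groups, which keeps the comparison clean.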
Best practices while an A/B test is running
- See results in real time: Go to the A/B dashboard in Real Time Pivot to verify that your test is up and running and to see your results in real time.
- Monitor the test data: Use the A/B dashboard to keep track of your data to make sure that nothing too drastic occurs as a result of the changes you made.
- Don’t make changes to a 50/50 split test: Don’t make changes in group B during a 50/50 split test. If you want to test a different assumption, or even make small adjustments to the current test, terminate the test and start another one. If you started a test with a different traffic allocation, make sure not to make any changes within 24 hours of the test starting. This is crucial and will help keep the data organized and clean for analysis.
Best practices for analyzing the results
- Dig into the data: Use the A/B dashboard as a starting point for analyzing results. Compare the data and performance of the two groups against the KPIs you set before initiating the test. If you need to dive deeper, review the performance and cohort reports that are linked in the dashboard.
- Use ARPDAU as your guiding light: The main metric that should guide your decision about which group to continue with is ARPDAU. Analyze the average revenue per daily active user to see how users behave in your app after the changes made in the test group. While other metrics might be affected by new users or user acquisition, ARPDAU gives you a clearer picture of all your existing users’ behavior and their reaction to your strategy (see the calculation sketch after this list).
Cumulative ARPU is also an important metric to consider, especially when analyzing new users from a specific day. It’s closely tied to ARPDAU, so it’s important to monitor the two together.
- Look at the right data at the right time: Only start analyzing results after all changes have been made to group B, so you can see the impact of all your changes together. You should also exclude the first day of the test from your analysis. If you used the A/B traffic allocator to give your test group more or less traffic over the period of the test, make sure you wait at least 24 hours after the change before analyzing, so you can get the full results of the test.
- Ignore daily data: When testing an assumption, you are aiming for long-term improvement rather than daily changes. Since daily performance can be volatile, analyze the data for the entire test as a whole, without breaking it into days (the sketch below aggregates the data this way).
- Follow up on app trends post-test termination: Come back to the tested app’s A/B dashboard two weeks after the test closes to analyze the ongoing results of the initial A/B test. This will show you the sustainability and long-term effects of your test.
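To make the ARPDAU and cumulative ARPU calculations above concrete, here is a minimal Python sketch; the numbers are made up for illustration, standing in for a real export from the A/B dashboard or performance reports. It aggregates revenue and DAU across the whole test window before dividing, rather than averaging volatile daily ARPDAU figures, which is exactly the whole-test view recommended above.

```python
# Illustrative daily export per group: (date, group, revenue_usd, dau).
rows = [
    ("2024-05-01", "A", 512.40, 16200), ("2024-05-01", "B", 58.10, 1790),
    ("2024-05-02", "A", 498.75, 15900), ("2024-05-02", "B", 61.35, 1810),
    ("2024-05-03", "A", 520.10, 16050), ("2024-05-03", "B", 63.90, 1802),
]

def arpdau(rows, group):
    """Whole-test ARPDAU: total revenue / total DAU, not a mean of daily values."""
    revenue = sum(r[2] for r in rows if r[1] == group)
    dau = sum(r[3] for r in rows if r[1] == group)
    return revenue / dau

for group in ("A", "B"):
    print(f"Group {group} ARPDAU: ${arpdau(rows, group):.4f}")

# Cumulative ARPU for a cohort of new users from a specific day:
# all revenue that cohort has generated so far / cohort size.
cohort_size = 900        # group B users who installed on 2024-05-01 (made up)
cohort_revenue = 41.20   # revenue from that cohort across days 1-3 (made up)
print(f"Cumulative ARPU (day 3): ${cohort_revenue / cohort_size:.4f}")
```

Dividing totals rather than averaging daily ratios weights each day by its actual traffic, so a single low-DAU day can’t skew the comparison between the two groups.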
Make A/B testing a routine to keep improving your app’s performance. Small, incremental changes can quickly add up to significant revenue growth, and there’s always room for more optimization. Use the insights from previous tests to understand how you can improve further in your next ones.