Whilst metrics such as total visitors and total converters are useful in understanding the impact of an experience, they don't tell us whether it has been successful in meeting its defined objectives or how your visitors engaged with it. To do this, we need to apply the principles of A/B testing:
In this simple 50/50 split test, your site traffic is divided equally between two treatments, the control and the variation. By observing the behavior of visitors in the control, comparing it against the same behavior of visitors in the variation, and applying our statistical modeling (with a bit of prior thrown in), we can answer focused questions such as:
Of course, what success means depends on what we are trying to achieve. This is where goals come in. Goals enable you to define the criteria that you will use to evaluate whether an experiment has been successful.
Each Qubit experience will have a primary goal and secondary goals. You can add a maximum of five goals for each experience.
Your experience's goals are displayed in the Test Summary panel. Let's look at an example:
Here we can see that the user has created an experience with the default goals:
TIP: Of course, you are free to define the goals for your experience and can add up to a maximum of five.
INFO: When referring specifically to the metric reported for an experience, RPV and RPC refer to revenue from the moment a visitor enters the experience until the moment they leave or the experience ends.
In this next example, we see that the user has added a custom goal, addToBag:clicked:
INFO: Custom goals are a great option where you are looking to evaluate the success of an experience in triggering a QP event or getting the user to interact with a button or a similar UI element.
For each goal, we report the results of experience variations compared to the experience control. Variations are always compared to the experience control, so, for example, A v B, A v C, A v D, etc:
WARNING: We do not perform direct comparisons between variations, B/C, C/D, etc.
By comparing the variation to the control, you always have a solid basis to determine which variation, if any, is having the biggest impact on each of your goals, whether that be conversions, Revenue Per Visitor, or the firing of specific QP events.
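The control-only comparison described above can be sketched in a few lines. This is an illustration only: the function names, the variant labels, and the conversion rates are hypothetical, and relative uplift is assumed here to be the change in a rate relative to the control's rate, which may differ from how Qubit computes the reported figure.

```python
def relative_uplift(control_rate: float, variation_rate: float) -> float:
    """Relative change of a variation's rate versus the control's rate."""
    if control_rate <= 0:
        raise ValueError("control_rate must be positive")
    return (variation_rate - control_rate) / control_rate


def compare_to_control(rates: dict[str, float], control: str = "A") -> dict[str, float]:
    """Return the uplift of every non-control variant against the control only.

    Variations are never compared with each other (no B v C, C v D, ...).
    """
    base = rates[control]
    return {
        name: relative_uplift(base, rate)
        for name, rate in rates.items()
        if name != control
    }


# Hypothetical conversion rates for a control (A) and two variations (B, C).
rates = {"A": 0.040, "B": 0.044, "C": 0.038}
uplifts = compare_to_control(rates)
for name, uplift in uplifts.items():
    print(f"A v {name}: {uplift:+.1%}")
```

Because every variation is measured against the same baseline, the resulting uplifts can be ranked directly to see which variation, if any, has the biggest effect on a goal.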
Stated simply, the variation that is most successful at achieving the primary goal is the winner.
A goal is considered complete when the sample size is 100%. Any changes in uplift, positive or negative, are statistically significant because, at 100% sample size, we have collected a sufficient quantity of data to be able to declare with a high degree of accuracy that the observed change is not due to a random factor.
INFO: Remember that an experience is considered complete only when the primary goal has reached statistical significance.
INFO: A goal is declared as the statistically significant winner when the sample size is 100% and the probability of uplift is greater than 95%.
Goals are therefore the key to understanding how your experience is performing and we provide clear visual cues to help you evaluate your experience against each one.
When your primary goal has reached sample size, we will present the outcome of the experience. There are a number of possible outcomes, each shown as a New finding:
If the probability of uplift is between 80% and 95%¹ for the primary goal, we will report that a variation is performing better than the control.
This means we are getting more confident that the change in uplift is attributable to the experience. However, because the result is not yet statistically significant (>95%), we can't yet declare it a winner:
We have a winner! - in short, your experiment has been a success. We are more than 95%¹ sure that the observed change in uplift for the primary goal is a result of the experience and not some random factor.
In the following example we have observed a 2.95% uplift in conversions for those visitors that saw the experience variation:
INFO: ¹ 95% is the default winning threshold for all Qubit experiences but you have the option of changing this. See Setting custom statistical thresholds.
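As a rough sketch, the two outcomes described above can be expressed as a small decision function. The function name, the returned labels, and the handling of the exact boundaries (80% counted as "performing better", exactly 95% not yet a winner) are assumptions made for illustration; only the 80% and 95% figures and the 100% sample size condition come from the text, and other possible outcomes are not modeled.

```python
def primary_goal_outcome(
    sample_size_pct: float,
    probability_of_uplift: float,
    winning_threshold: float = 0.95,     # default threshold; can be customized
    promising_threshold: float = 0.80,
) -> str:
    """Classify the primary goal into one of the outcomes described in the text."""
    if sample_size_pct < 100:
        # Outcomes are only presented once the goal has reached sample size.
        return "still collecting data"
    if probability_of_uplift > winning_threshold:
        return "winner"
    if probability_of_uplift >= promising_threshold:
        return "performing better than control"
    # Other outcomes exist but are not covered by this sketch.
    return "other outcome"


print(primary_goal_outcome(100, 0.97))  # winner
print(primary_goal_outcome(100, 0.88))  # performing better than control
print(primary_goal_outcome(60, 0.97))   # still collecting data
```

Passing a different `winning_threshold` mirrors the option of setting a custom statistical threshold for an experience.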
These metrics will not display if you have selected the All traffic allocation mode. See Traffic allocation for more information.
Pilot tests run at much lower power (they don't have much chance of detecting uplift), so they complete much faster. They should generally only be used to test that a change does not produce a massive negative effect.
A normal test requires 6,000 converters in each variant. If a customer has only a 5% chance of being put in the control, it will take a much longer time to get to 6,000 for that variant. 50/50 is the fastest possible A/B test.
Although this can differ between clients and depends on the configuration of your property, typically the QProtocol events used to report conversions and revenue are identified in the following table:
Vertical | Events |
---|---|
eCommerce | |
eGaming | |
Travel | |
Goals are attributed for two weeks after an experience is paused to cater for an experience influencing a visitor who purchases slightly later. We believe this is the most accurate way to handle the changes induced by iterations.
A visitor's conversions and other goals are counted towards experience results only if achieved during the same iteration in which the visitor entered the experience.
Indeed, the statistics must not carry conversions or other goals across iterations, because both the experience itself and the visitor's allocation to control or variation may well have changed.
Multiple iterations can therefore delay the achievement of statistical significance.
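The iteration rule above can be illustrated with a simple filter. The `GoalEvent` type, its field names, and the sample data are hypothetical and do not reflect how Qubit stores or processes events; the sketch only shows the idea that a goal counts when it is achieved in the same iteration in which the visitor entered the experience.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GoalEvent:
    visitor_id: str
    iteration: int  # iteration during which the goal was achieved


def attributable_goals(entry_iteration: dict[str, int], events: list[GoalEvent]) -> list[GoalEvent]:
    """Keep only goals achieved in the iteration in which the visitor entered."""
    return [
        event
        for event in events
        if entry_iteration.get(event.visitor_id) == event.iteration
    ]


# Hypothetical data: v1 entered during iteration 1, v2 during iteration 2.
entry_iteration = {"v1": 1, "v2": 2}
events = [
    GoalEvent("v1", 1),  # counted: same iteration as entry
    GoalEvent("v1", 2),  # not counted: achieved in a later iteration
    GoalEvent("v2", 2),  # counted
    GoalEvent("v3", 2),  # not counted: visitor never entered the experience
]
counted = attributable_goals(entry_iteration, events)
print([(e.visitor_id, e.iteration) for e in counted])  # [('v1', 1), ('v2', 2)]
```

Each new iteration effectively starts the count again for the visitors who enter it, which is why frequent iterations push back the point at which a goal reaches statistical significance.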