Google has recently completed a significant overhaul of web measurement, marking the end of Universal Analytics – the industry standard for the past nine years. The change goes beyond a revamped user interface: analysts must now adapt to new dimensions, metrics, default attribution models and conversion modeling. In this article, we explain in detail how Google Analytics 4 models attribution. We then compare its approach and attribution outcomes with an unbiased Roivenue AI Attribution model.
The concept of attribution, often referred to as the “attribution problem,” is a persistent challenge that marketers face. It involves determining how to allocate credit to various marketing activities. Let’s imagine a scenario involving your customer:
Now the question arises: which ad should receive credit for driving that purchase? Traditionally, Google Analytics (GA) would attribute 100% of the credit to the last touchpoint in the customer’s journey, which, in this case, is the remarketing platform. However, this approach tends to overvalue marketing activities at the bottom of the funnel while undervaluing other touchpoints. Other rule-based models run into similar issues, as they may under- or overvalue certain groups of channels. On the flip side, these models are relatively easy to implement and understand due to their straightforward logic.
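To make the difference between rule-based models concrete, here is a minimal Python sketch of last-click, first-click and linear attribution for a single conversion. The function name `attribute`, the channel names and the three-step journey are hypothetical and chosen only for illustration; they are not taken from GA or Roivenue.

```python
from collections import defaultdict


def attribute(journey, model="last_click"):
    """Split one conversion across the channels of an ordered journey.

    `journey` is a list of channel names, oldest touchpoint first.
    Returns a dict mapping channel -> share of the conversion (sums to 1).
    """
    if not journey:
        return {}
    credit = defaultdict(float)
    if model == "last_click":
        credit[journey[-1]] += 1.0      # everything goes to the final touchpoint
    elif model == "first_click":
        credit[journey[0]] += 1.0       # everything goes to the first touchpoint
    elif model == "linear":
        share = 1.0 / len(journey)      # equal share for every touchpoint
        for channel in journey:
            credit[channel] += share
    else:
        raise ValueError(f"unknown model: {model}")
    return dict(credit)


# Hypothetical journey ending on a remarketing ad.
journey = ["display", "paid_social", "remarketing"]
for model in ("last_click", "first_click", "linear"):
    print(model, attribute(journey, model))
```

Each rule is trivial to explain, which is the appeal of these models, but none of them looks at whether a touchpoint actually changed the customer’s behaviour.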
A more advanced approach to tackling this issue is the data-driven approach, where the entire customer journey is analysed, and a model attempts to estimate the impact of each touchpoint. There are various methods for consolidating customer journey data, as well as multiple data-driven models available. In our exploration, we will delve into how GA4 addresses this challenge and evaluate its performance in both theory and real-world scenarios.
First and foremost, it’s worth noting that Google has been providing data-driven attribution modelling for quite some time, so it’s not an entirely new concept in GA4. However, the major shift is that it has now become the default attribution model, making it the most widely used among marketers.
The second notable change is that Google has discontinued all other models, such as first-click and linear, leaving only last-click and data-driven. This decision stems from Google’s finding that these additional models were not widely used. Consequently, analysts now have fewer tools at their disposal when seeking to comprehend the role of different channels within their marketing mix.
Google Analytics reliably measures website visits for users who have consented to cookies and do not have anti-tracking mechanisms installed in their browsers. Server-side measurement is an option for gathering additional data, although that is a topic for a separate article.
One notable feature offered by Google Analytics is the utilization of Google Signals for tracking customer journeys. This functionality enables cross-device tracking when users are logged into their Google accounts on multiple devices. However, it’s important to note that this feature needs to be activated as an optional setting and may not function optimally for iOS 14.5 and subsequent versions.
Both models in GA4 (last-click and data-driven) share a common approach to processing Direct visits. In the post-processing stage, the significance of Direct is minimized. When a customer journey includes both Direct and any other channel, all the credit is allocated exclusively to the other channel. As a result, in GA4, Direct receives credit only in cases where the customer journey consists solely of Direct visits.
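The Direct rule described above can be sketched in a few lines of Python. This is only an illustration of the stated behaviour, not Google’s actual processing pipeline; the helper names `strip_direct` and `last_click_channel` and the channel labels are assumptions made for the example.

```python
DIRECT = "direct"  # assumed label for Direct visits in this sketch


def strip_direct(journey):
    """Drop Direct visits unless the journey consists solely of Direct."""
    non_direct = [channel for channel in journey if channel != DIRECT]
    return non_direct if non_direct else list(journey)


def last_click_channel(journey):
    """Channel credited by a last-click rule after the Direct post-processing."""
    if not journey:
        raise ValueError("journey must contain at least one visit")
    return strip_direct(journey)[-1]


print(last_click_channel(["paid_search", "direct", "direct"]))  # credit moves to paid_search
print(last_click_channel(["direct", "direct"]))                 # Direct keeps the credit
```

The same filtered journey would be the input to a data-driven model, which is why Direct ends up with credit only for Direct-only journeys in either model.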
In the following section, we will provide a detailed explanation of the underlying logic of GA4 attribution modeling. However, if you prefer to read the official documentation prepared by Google, we have included a link for your reference.
Google analyzes the impact of each interaction on conversion probability: how likely is a user to convert at any given point in their journey? It uses factors such as time from conversion, device type, ad interactions, and creative assets.
The data-driven attribution model assigns credit based on how the addition of each ad interaction to the path changes the estimated conversion probability.
In the following high-level illustration, the combination of Ad Exposure #1 (Paid search), Ad Exposure #2 (Social), Ad Exposure #3 (Affiliate), and Ad Exposure #4 (Search) leads to a 3% probability of conversion. When Ad Exposure #4 does not occur, the probability drops to 2%, so we know that Ad Exposure #4 drives a +50% relative increase in conversion probability (from 2% to 3%). We repeat this for each ad interaction and use the learned contributions as attribution weights.
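The arithmetic behind this example can be written out as a short Python sketch. Only the 3% and 2% figures for Ad Exposure #4 come from the illustration; the "without" probabilities for the other three exposures are hypothetical, and normalizing the probability drops so that they sum to 1 is an assumption made here to show how contributions could become weights, not a documented detail of Google’s model.

```python
def relative_lift(p_with, p_without):
    """Relative increase in conversion probability caused by one exposure."""
    return (p_with - p_without) / p_without


def attribution_weights(p_full, p_without):
    """Turn per-touchpoint probability drops into weights that sum to 1.

    `p_without[t]` is the estimated conversion probability of the same path
    with touchpoint `t` removed (a counterfactual estimate).
    """
    drops = {t: max(p_full - p, 0.0) for t, p in p_without.items()}
    total = sum(drops.values())
    if total == 0:
        return {t: 0.0 for t in drops}
    return {t: drop / total for t, drop in drops.items()}


# From the illustration: 3% with Ad Exposure #4, 2% without it.
print(round(relative_lift(0.03, 0.02), 2))

# Hypothetical counterfactuals for the remaining exposures.
p_without = {
    "exposure_1_paid_search": 0.025,  # hypothetical
    "exposure_2_social": 0.028,       # hypothetical
    "exposure_3_affiliate": 0.027,    # hypothetical
    "exposure_4_search": 0.020,       # from the illustration
}
for touchpoint, weight in attribution_weights(0.03, p_without).items():
    print(touchpoint, round(weight, 2))
```

With these made-up inputs, the exposure whose removal hurts the conversion probability most receives the largest weight.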
It is important to note that Google does not incorporate parameters related to visit quality. Consequently, it may not accurately identify channels that commonly appear in conversion journeys but have minimal impact on the user’s decision-making process. The impact of this will be discussed later when we compare the attribution results of different models.
A significant drawback of the attribution modeling GA uses is its inherent limitation in accounting for the post-view impact of brand-awareness campaigns and other upper funnel activities. These types of campaigns often have low click-through rates but play a crucial role in influencing buyers over the medium to long term.
The last-click model ignores direct traffic and attributes 100% of the conversion value to the last channel that the customer clicked through (or engaged view-through, for YouTube) before converting.
Google provides alternative perspectives to analyze the data, such as a model that attributes all the credit to the final interaction with Google Ads (known as Google Paid Channels Last Click). However, we will not discuss these models in greater detail as they inherently skew the results.
Google provides an intriguing option for analysts using BigQuery: the Streaming Export feature. This feature captures the raw, unprocessed measurement data and saves it in Google BigQuery. Analysts can then access this unaltered data and apply their own customized rules and analyses. In the comparative section of this article, we will explore and compare the processed data within GA4 with the raw data obtained through BigQuery.
In today’s web measurement landscape, numerous challenges leave a portion of conversions unobservable or customer journeys incomplete. These challenges commonly arise for the following reasons:
To address these limitations, Google employs machine learning techniques to model and estimate the missing conversions. Google’s machine learning models analyse trends between directly observed conversions and those that are unattributed. By identifying similarities between attributed conversions on one browser and unattributed conversions on another, the machine learning model can predict overall attribution. This prediction allows for the aggregation of both modeled and observed conversions.
While Google’s intent to provide a more comprehensive view is understandable, it also introduces an additional layer of complexity for analysts. Some reports in Google Analytics (GA) include modeled conversions, while others do not. This lack of differentiation between observed and modeled conversions within GA reports can pose challenges for analysts seeking to understand the data accurately.
Roivenue serves as a specialized attribution tool for marketers, aiming to enhance marketing ROI. Its core functionality revolves around integrating data from multiple sources and employing advanced modeling techniques to achieve precise attribution. This ensures that all digital channels are fairly represented in the attribution process. In this part, we will provide an in-depth explanation of Roivenue’s methodology and conduct a thorough comparison with GA4, assessing their respective approaches and methodologies.
The first notable distinction in modeling between Roivenue and GA4 lies in the underlying customer journey data used for attribution. While Roivenue utilizes visit-level data from Google Analytics 4 as a starting point, it goes beyond this by incorporating impression-level data from Demand-side Platforms (DSPs). This inclusion enables the evaluation of the post-view impact of real-time buying campaigns and the accurate measurement of conversions through direct deals with publishers.
Additionally, Roivenue provides a unique solution for measuring post-view and cross-device conversions from walled gardens such as Meta (Facebook, Instagram), TikTok, Twitter, YouTube, and others. While the process of connecting visit-level data is similar to GA4, the incorporation of impression-level data requires a deeper exploration to fully comprehend its added value.
Roivenue retrieves events from Google Analytics to reconstruct visit-level customer journeys. It is worth highlighting that Roivenue also captures multiple qualitative parameters of the visit, such as bounces, pageviews, events and more. This comprehensive data allows Roivenue to assess the true impact of each specific visit on conversions more accurately. By considering these qualitative parameters, Roivenue can provide a more nuanced understanding of the effectiveness and significance of individual visits in driving conversions.
Demand-side Platforms (DSPs) are essential tools for marketers to purchase ad space in real time across the internet. They not only facilitate real-time buying but also provide performance measurement for inventory obtained through direct deals with publishers. DSPs often offer pixel-based measurement and employ advanced techniques to track conversions effectively. Additionally, these platforms grant access to granular data, including browser identifiers and timestamps, for each served ad.
By integrating this detailed impression-level data with Google Analytics data, Roivenue can enhance visits with precise information about impressions. This integration allows for a comprehensive view of the customer journey, ensuring that even campaigns with zero direct visits receive fair credit if they contribute to increased awareness and impact conversions further down the funnel.
This represents a significant breakthrough in the field of attribution. In the past, tracking the post-view performance of walled gardens posed a challenge because these platforms did not allow pixel-based tracking. As a result, marketers had to rely on Google Analytics data, which couldn’t capture the full brand effect of campaigns, or trust the potentially inflated results reported by advertising platforms that have a vested interest in selling more ad space. Moreover, these platforms typically count the entire conversion for themselves within their attribution windows, leading to duplicate reporting if a person is targeted across multiple platforms.
Roivenue introduces a solution to this dilemma. By integrating the most granular data available from platforms (typically hourly data on the ad level) and combining it with conversions tracked in Google Analytics, Roivenue can identify matches between reported conversions. When a match is found, Roivenue generates synthetic touchpoints for all the platforms claiming involvement in a given conversion. While Roivenue cannot directly observe the real impression-level data from these platforms, it can indirectly assess their post-view impact on conversions by leveraging their reporting.
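The idea of synthetic touchpoints can be sketched as follows. This is a simplified illustration, not Roivenue’s actual algorithm: the matching rule (a GA conversion falls into the same hour in which a platform reports at least one conversion), the function names and all field names (`platform`, `ad_id`, `hour`, `conversions`, `id`, `timestamp`) are assumptions made for the example. A real implementation would also have to deal with the case where a platform reports fewer conversions in an hour than GA recorded.

```python
from datetime import datetime


def hour_bucket(ts):
    """Truncate a timestamp to the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


def synthetic_touchpoints(ga_conversions, platform_reports):
    """Create one synthetic touchpoint per platform claiming a matched conversion."""
    # Index platform claims by hour, keeping only rows that report conversions.
    claims = {}
    for row in platform_reports:
        if row["conversions"] > 0:
            key = hour_bucket(row["hour"])
            claims.setdefault(key, []).append((row["platform"], row["ad_id"]))

    touchpoints = []
    for conv in ga_conversions:
        for platform, ad_id in claims.get(hour_bucket(conv["timestamp"]), []):
            touchpoints.append({
                "conversion_id": conv["id"],
                "platform": platform,
                "ad_id": ad_id,
                "type": "synthetic",
            })
    return touchpoints


# Hypothetical data: two platforms claim a conversion in the 14:00 hour.
ga_conversions = [
    {"id": "c1", "timestamp": datetime(2023, 7, 1, 14, 25)},
    {"id": "c2", "timestamp": datetime(2023, 7, 1, 16, 5)},
]
platform_reports = [
    {"platform": "meta", "ad_id": "ad_1", "hour": datetime(2023, 7, 1, 14), "conversions": 1},
    {"platform": "tiktok", "ad_id": "ad_9", "hour": datetime(2023, 7, 1, 14), "conversions": 1},
    {"platform": "meta", "ad_id": "ad_1", "hour": datetime(2023, 7, 1, 15), "conversions": 0},
]
for touchpoint in synthetic_touchpoints(ga_conversions, platform_reports):
    print(touchpoint)
```

Because both platforms receive a touchpoint on the same journey, the attribution model can split one conversion between them instead of each platform counting it in full.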
As a result, the customer journey consists of visits, true impressions, and synthetic touchpoints, all of which are used in the attribution calculation. This comprehensive approach allows for a more accurate understanding of the contribution made by each platform and facilitates a more insightful attribution analysis.
Roivenue employs an AI attribution model based on recurrent neural networks (RNNs) and places a strong emphasis on transparency regarding its methodology. The model utilizes the aforementioned customer journey data and aims to predict the probability of a conversion. It then assigns credit to each touchpoint based on the extent to which it contributed to increasing the likelihood of a conversion. The process consists of three key steps:
Example of step 3 – how conversion likelihood estimates are converted into scores
By following this systematic approach, Roivenue’s AI attribution model leverages machine learning techniques to effectively assess the significance of touchpoints and provide valuable insights into the attribution process.
Now that we have explored the methodologies of both GA4 and Roivenue, it is crucial to compare them and highlight the differences in their results. Ultimately, marketers rely on these results to make informed decisions, and any attribution model must offer a distinct view of the results; otherwise, there is no point in deploying it in addition to the existing model.
This is a brief summary of the differences in the methodologies, but now let’s dive into the results!
To conduct a comprehensive comparison, we have collected data from six distinct online retailers operating across multiple countries and various industries, including fashion, furniture, professional equipment and more. Our aim was to ensure a balanced representation of retailers with different customer life cycles, varied marketing mixes and diverse strategies. By including this diverse range of retailers, we can obtain a holistic view of how attribution models perform across various business contexts, providing valuable insights into their strengths and weaknesses.
We use Last Click results from Google BigQuery as the baseline, so it always appears as 100%. All other models are expressed as a percentage of the BigQuery last-click value. We chose this baseline because it also shows how post-processing of the data can influence the results you see in GA4. All the data are in percentages. A summary of the patterns that appeared frequently in the data follows in the next chapter, and a description for each client comes after the summary if you want to dive deeper into the comparison.
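The indexing described above can be reproduced with a small Python sketch. The conversion counts below are hypothetical and are not taken from any of the six clients; the model labels and the choice to index each channel against its own BigQuery last-click value are assumptions made for the example.

```python
def index_to_baseline(results, baseline="bigquery_last_click"):
    """Express every model's result as a percentage of the baseline model.

    `results` maps channel -> {model name -> attributed conversions}.
    Channels with a zero baseline get None, since the ratio is undefined.
    """
    indexed = {}
    for channel, by_model in results.items():
        base = by_model[baseline]
        indexed[channel] = {
            model: (100.0 * value / base if base else None)
            for model, value in by_model.items()
        }
    return indexed


# Hypothetical attributed conversions per channel and model.
results = {
    "paid_search": {
        "bigquery_last_click": 200,
        "ga4_last_click": 220,
        "ga4_data_driven": 210,
        "roivenue_ai": 180,
    },
    "paid_social": {
        "bigquery_last_click": 50,
        "ga4_last_click": 45,
        "ga4_data_driven": 55,
        "roivenue_ai": 90,
    },
}
for channel, by_model in index_to_baseline(results).items():
    print(channel, {model: round(pct, 1) for model, pct in by_model.items()})
```

A value above 100 means the model gives the channel more credit than raw last click in BigQuery; a value below 100 means it gives less.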
When combining all of the results, we can see some interesting patterns:
We can see a combination of previously described patterns, specifically:
Furthermore, an observation can be made regarding the “Direct” channel. Google acknowledges that this channel lacks substantial insights for marketers, so the model minimises its value by design, which is a useful feature. Roivenue allows marketers to configure the approach to this channel, granting greater flexibility in the analysis.
This client confirms some of the previous observations but also shows some less frequent results:
We can see a combination of previously described patterns, specifically:
These observations reinforce the consistent patterns observed across different channels in GA and Roivenue. The significant overvaluation of Organic in GA4 and the minimizing effect on Direct in both platforms highlight the importance of comprehensive attribution models that can accurately represent the true impact of each channel. Additionally, the contrasting treatment of Walled Gardens between the two platforms may indicate inherent differences in their attribution methodologies and underscores the significance of choosing the most suitable tool for accurate marketing analysis.
The description of client 6 would be the same as for client 5; the patterns are very similar.
In this article, we explored the world of data-driven attribution in Google Analytics 4 (GA4) and compared it with the independent data-driven attribution model offered by Roivenue AI. The attribution problem, which marketers face, involves allocating credit to various marketing activities in a customer’s journey. Traditional models, such as last-click attribution, tend to undervalue touchpoints leading up to the final conversion.
GA4’s data-driven attribution model, which has now become the default in the platform, attempts to estimate the impact of each touchpoint by analyzing factors like time from conversion, device type, ad interactions, and creative assets. However, it has limitations in accounting for post-view impact and cross-device conversions from walled gardens.
Roivenue AI’s attribution model stands out by integrating visit-level data from Google Analytics with impression-level data from Demand-side Platforms (DSPs) and walled gardens. This allows Roivenue to accurately measure post-view impacts and cross-device conversions, providing a more comprehensive view of customer journeys. The AI-powered recurrent neural network model in Roivenue predicts conversion probabilities and assigns credit to each touchpoint based on its contribution to increasing the likelihood of conversion.
In comparing the results of both models across various business contexts and industries, we discovered some interesting patterns. Google’s data-driven model did not significantly inflate the contribution of Google Ads campaigns, but it showed challenges in accurately capturing the impact of other channels, such as organic search and walled gardens. Roivenue, on the other hand, excelled in recognizing the post-impression effect of walled garden campaigns, giving a more accurate representation of their impact.
Overall, Roivenue AI’s attribution model demonstrated greater transparency, customization options, and a more comprehensive approach to attribution, making it a valuable tool for marketers seeking to enhance their understanding of customer journeys and marketing ROI. However, both models have their strengths and weaknesses, and the choice between them depends on the specific needs and goals of each marketer.