Sampling in Google Analytics

Unsampler is a discontinued product that was previously provided by Roy App. Its purpose was to help users with sampled data in Google Analytics Standard.

Services which get unsampled data from Google Analytics

Please try these great alternatives for extracting unsampled data from Google Analytics if you do not have Google Analytics 360 Suite (Premium):

OWOX BI Streaming
By hooking into GA through a ga('require') call, OWOX BI Streaming replicates the HTTP request and sends the copy to your own Google Storage. OWOX has built its own version of the Google Analytics table schema that can be queried through BigQuery.

This data is truly unsampled, but you will only be able to access it via BigQuery.

Supermetrics Google Sheets add-on
To extract data that is as close to unsampled as possible via the Google Analytics Core Reporting API, we recommend the Supermetrics Google Sheets add-on.

It queries the smallest possible date ranges to keep each request below the sampling threshold. This does not work in all cases with very narrow segments.

More on Google Analytics sampling

Roy App Nostalgia

Read the legacy information about the Unsampler

I can use my free Google Analytics account and get unsampled data! Best of all, it keeps refreshing all the time, so I don’t have to update anything.



How much data are you missing?

The picture below shows a landing page report in Google Analytics, with a custom segment called “Returning Customers” applied.

The report is sampled and based on only 16% of the actual sessions:

sampled-landing-pages

Using the Unsampler, the same query could be exported to a CSV, giving these results:

unsampled-landing-pages

The difference is stunning:

unsampled-results-table

In this simple example the actual numbers are lower than the sampled report suggested! Are you making decisions based on these numbers?


How it works

The Roy App Unsampler uses a well-known technique: it iterates over your Google Analytics data day by day, making the most of the 500,000-session sampling limit per request. This allows even the narrowest segment to be extracted unsampled.

Once all days of data have been imported, the Unsampler keeps polling each new day’s data so you can make new exports from the Unsampler.
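The day-by-day technique can be sketched roughly as follows. This is a minimal illustration, not the Unsampler's actual code: fetch_day is a hypothetical callback standing in for one API request covering a single day, and the numbers in the usage example are made up.

```python
from datetime import date, timedelta
from typing import Callable, Dict, Iterator


def daily_ranges(start: date, end: date) -> Iterator[date]:
    """Yield every day from start to end, inclusive."""
    day = start
    while day <= end:
        yield day
        day += timedelta(days=1)


def fetch_unsampled(
    start: date,
    end: date,
    fetch_day: Callable[[date], Dict[str, int]],
) -> Dict[str, int]:
    """Sum sessions per landing page over one-day requests.

    fetch_day is a hypothetical callback that runs one query for a
    single day and returns {landing_page: sessions}.
    """
    totals: Dict[str, int] = {}
    for day in daily_ranges(start, end):
        for page, sessions in fetch_day(day).items():
            totals[page] = totals.get(page, 0) + sessions
    return totals


# Stand-in for the real API call, returning made-up numbers.
def fake_fetch_day(day: date) -> Dict[str, int]:
    return {"/": 10, "/pricing": day.day}


totals = fetch_unsampled(date(2015, 5, 1), date(2015, 5, 3), fake_fetch_day)
print(totals)  # {'/': 30, '/pricing': 6}
```

Summing daily results like this only works for additive metrics such as sessions or pageviews. Metrics such as unique users cannot simply be added up across days.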

Made to look like what you’re used to

You’ve probably come across different tools that make use of the Google Analytics API. We’ve tried to make the Unsampler look very similar to the Google Analytics Query Explorer. Set up your queries, let the Unsampler work, and then export the data to the format of your choice.


What is Google Analytics sampling – and why is it bad?

For Google Analytics to be able to serve data to all its users (remember, Google Analytics is used across half the known web), Google needs to limit how much data each request processes. Sampling is a way of choosing a smaller set of sessions and then extrapolating the results from that small set.
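To make the idea concrete, here is a simplified sketch of sampling and extrapolation. It is not Google's actual algorithm, only an illustration under assumed conditions: the session list, the page names and the estimate_from_sample helper are all made up for this example.

```python
import random


def estimate_from_sample(sessions, matches, sampling_rate, seed=0):
    """Estimate how many sessions satisfy `matches` from a random sample."""
    rng = random.Random(seed)
    sample_size = max(1, round(len(sessions) * sampling_rate))
    sample = rng.sample(sessions, sample_size)
    hits = sum(1 for session in sample if matches(session))
    # Scale the count found in the sample up to the full data set.
    return hits * len(sessions) / sample_size


def is_rare(page):
    return page == "/rare-page"


# Made-up data: 10,000 sessions, 50 of which landed on a rarely visited page.
sessions = ["/rare-page"] * 50 + ["/home"] * 9950

full = estimate_from_sample(sessions, is_rare, 1.0)
sampled = estimate_from_sample(sessions, is_rare, 0.16)
print(full)     # 50.0, because a 100% "sample" is the real data
print(sampled)  # an estimate that depends on which sessions were drawn
```

Changing the seed changes the sampled estimate, which is exactly the uncertainty that sampling introduces for small segments.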

When do I get sampling in Google Analytics?

When the dreaded yellow box pops up, you know you are looking at sampled data.

In the Google Analytics user interface, the box will pop up in the top right corner, just below the selected date range:

sampling-ga-ui

In the Query Explorer, a yellow box pops up just above the results table, next to the “Get data” button:

sampling-query-explorer

In a Google Analytics API response, sampled data is flagged by the containsSampledData field being set to true:

sampling-api-response
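A script can check this flag before trusting a result. The sketch below assumes the response has already been parsed into a dictionary. Besides containsSampledData, it reads sampleSize and sampleSpace, which to my understanding the Core Reporting API v3 returns as strings for sampled responses. Treat those two field names as assumptions and verify them against the API reference. The values in the example are made up.

```python
import json


def sampling_info(response: dict) -> dict:
    """Report whether a parsed API response is sampled, and by how much."""
    if not response.get("containsSampledData", False):
        return {"sampled": False, "share": 1.0}
    size = response.get("sampleSize")    # assumed field name
    space = response.get("sampleSpace")  # assumed field name
    if size is None or space is None or int(space) == 0:
        return {"sampled": True, "share": None}
    return {"sampled": True, "share": int(size) / int(space)}


# Made-up response body for illustration.
raw = '{"containsSampledData": true, "sampleSize": "160000", "sampleSpace": "1000000"}'
info = sampling_info(json.loads(raw))
print(info["sampled"], info["share"])
```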

Can I trust sampled data? Not really.

For some larger aggregates, when the sample covers as much as 95% of the real dataset, you can probably still do your analysis. But when you’re looking at narrow drilldowns and the sample set is small, you can’t trust it.

sampled-landing-pages

Above is an example of a typical landing page report with a segment applied to it. The sampling is based on 16% of sessions. Many of the top rows show the same number of sessions, and this is due to sampling: Google Analytics estimates the sessions from the sample rather than counting them. Once you see patterns like this, with many rows sharing similar metric values (in this case 12s and 6s), you should be very careful in your analysis.
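The repeated 6s and 12s follow from simple arithmetic. With a 16% sample, every sampled session stands for 100 / 16 = 6.25 real sessions, so a row with only a few sampled sessions can only show a few distinct values. The sketch below is a worked illustration. The scaled_estimate helper is made up for this example, and exactly how Google Analytics rounds the result is not covered here.

```python
def scaled_estimate(sampled_sessions: int, sample_percent: int) -> float:
    """Scale a count found in the sample up to an estimate for all sessions."""
    return sampled_sessions * 100 / sample_percent


for n in range(1, 5):
    print(n, scaled_estimate(n, 16))
# 1 6.25
# 2 12.5
# 3 18.75
# 4 25.0
```

A row reported as roughly 6 sessions therefore rests on a single sampled session, and a row reported as roughly 12 rests on two.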

Sampling is bad because you can’t trust the data any longer. Don’t worry, the data is still there, in the Google Analytics data warehouse. It is just your means of retrieving it that samples the data.

What about Google Analytics Premium?

With Google Analytics Premium, you won’t have any issues with sampling. By default, the user interface still samples (albeit with a much larger sample), but you can create special unsampled reports. It is also possible to export data to Google BigQuery for further processing.

Google Analytics Premium is one of the best deals out there, but it is rather pricey. Depending on whether you make a deal with a Google Analytics Certified Partner or buy directly from Google, your cost will end up at around 150,000 USD/year.

Modified on: 2018-05-03
Published on: 2015-05-27
Published by: David Jurelius