NoGA project (SIDN fonds)
by Marnix Dessing
This page aims to give a detailed description of the tracking implementations of Google Analytics and Matomo. We have concluded that there are four main tracking techniques, and we now take a deep dive into how both tools implement them.
Google has published a tracking code overview for its Web Analytics platform. It explains that the platform uses the JavaScript tag technique[1]. The tracking code works in three steps:
Google explains that first-party cookies are used for[2]:
By default, cookies are set on the domain specified in the document.host browser property and the cookie path is set to the root level (/).
However, if one chooses to enable advertising features, third-party cookies are set[1]. The domains of those third-party cookies are those of various Google marketing products, such as DoubleClick.net and gstatic.com[3]. This kind of third-party cookie can be used to track users across the web: for each site containing a reference to one of these domains, the cookie is sent to Google. (See HTTP Cookies)
User identification over multiple sessions is done via the first-party cookie, which contains a randomly generated ID (the client ID) to identify a user. Google Analytics also provides a feature to use a custom user ID besides the client ID. One problem this overcomes is tracking the same user across different browsers and devices (when the user ID is known and provided). With the client ID alone it is not possible to recognize the same user on different browsers or devices. [4]
The parameters of the GIF request described above have been documented by Google[5]. An overview of the parameters is displayed below.
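The difference between the client ID and the user ID can be illustrated with a minimal sketch. This is not Google's implementation: the function names and the `u-1001` user ID are hypothetical, and a UUID merely stands in for "a randomly generated ID".

```python
import uuid
from typing import Optional


def new_client_id() -> str:
    """Return a random identifier, as a tracker might store in a first-party cookie."""
    return uuid.uuid4().hex


def visitor_key(client_id: str, user_id: Optional[str] = None) -> str:
    """Prefer the site-supplied user ID; fall back to the per-browser client ID."""
    return f"user:{user_id}" if user_id else f"client:{client_id}"


# Two browsers of the same person receive unrelated client IDs ...
laptop, phone = new_client_id(), new_client_id()
# ... but map to the same key once the site supplies a user ID (e.g. after login).
same_person = visitor_key(laptop, "u-1001") == visitor_key(phone, "u-1001")
```

Without a user ID, `visitor_key(laptop)` and `visitor_key(phone)` differ, which is exactly the cross-device limitation described above.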
Variable | Description |
---|---|
utmac | Account String. Appears on all requests. |
utmcc | Cookie values. This request parameter sends all the cookies requested from the page. |
utmcn | Starts a new campaign session. Either utmcn or utmcr is present on any given request. Changes the campaign tracking data, but does not start a new session. |
utmcr | Indicates a repeat campaign visit. This is set when any subsequent clicks occur on the same link. Either utmcn or utmcr is present on any given request. |
utmcs | Language encoding for the browser. Some browsers don’t set this, in which case it is set to “-”. |
utmdt | Page title, which is a URL-encoded string. |
utme | Extensible Parameter Value is encoded. Used for events and custom variables. |
utmfl | Flash Version |
utmhn | Host Name, which is a URL-encoded string. |
utmhid | A random number used to link Analytics GIF requests with Google AdSense. |
utmipc | Product Code. This is the sku code for a given product. |
utmipn | Product Name, which is a URL-encoded string. |
utmipr | Unit Price. Set at the item level. Value is set to numbers only in U.S. currency format. |
utmiqt | Quantity. |
utmiva | Variations on an item. For example: large, medium, small, pink, white, black, green. String is URL-encoded. |
utmje | Indicates if browser is Java-enabled. 1 is true. |
utmn | Unique ID generated for each GIF request to prevent caching of the GIF image. |
utmp | Page request of the current page. |
utmr | Referral, complete URL. |
utmsc | Screen color depth |
utmsr | Screen resolution |
utmt | Indicates the type of request, which is one of: event, transaction, item, or custom variable. If this value is not present in the GIF request, the request is typed as page. |
utmtci | Billing City |
utmtco | Billing Country |
utmtid | Order ID, URL-encoded string. |
utmtrg | Billing region, URL-encoded string. |
utmtsp | Shipping cost. Values as for unit and price. |
utmtst | Affiliation. Typically used for brick and mortar applications in ecommerce. |
utmtto | Total. Values as for unit and price. |
utmttx | Tax. Values as for unit and price. |
utmul | Browser language. |
utmwv | Tracking code version |
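
To make the table concrete, the sketch below decodes the query string of a GIF request into its parameters. The URL (host, path and all values) is made up for illustration; only the parameter names come from the table above.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical request URL; host, path and values are invented for this example.
EXAMPLE_URL = (
    "https://analytics.example/collect.gif"
    "?utmwv=1&utmn=1234567890&utmhn=www.example.com"
    "&utmdt=Home%20page&utmp=%2Findex.html&utmsr=1920x1080&utmul=en-us"
)


def parse_gif_request(url: str) -> dict:
    """Return the URL-decoded request parameters as a flat dictionary."""
    query = urlsplit(url).query
    # parse_qs returns a list per key; each parameter occurs once here.
    return {name: values[0] for name, values in parse_qs(query).items()}


params = parse_gif_request(EXAMPLE_URL)
# Per the utmt row: a request without utmt is typed as a page request.
request_type = params.get("utmt", "page")
```

Note that `parse_qs` already undoes the URL encoding mentioned for fields such as utmdt and utmhn.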
Like Google Analytics, Matomo uses a JavaScript tag tracking technique. However, Matomo is equipped with other tools too.
When no JavaScript execution is possible, Matomo can use a web beacon instead[6]. Either the JavaScript tag or the web beacon is used, depending on the browser of the user. The web beacon can also be used to track the reading of emails (that is, if the mail client does not block the beacon)[7].
Another feature is log importing for user tracking, aimed mainly at protecting the privacy of users by replacing the front-end techniques. The examples given of sites on which privacy is highly important are government and healthcare portals. On sites whose pages may contain sensitive information, JavaScript tracking (content tracking, for example) can violate the privacy rights of users.
Matomo also offers IP anonymisation, can respect DoNotTrack requests, and has other privacy features, both for JavaScript tracking and for log importing.[8]
Matomo uses first-party cookies only, unless it is configured to place a third-party cookie[9], which can be done when tracking across multiple domains, for example.
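As a rough illustration of these two privacy features, the sketch below masks the trailing bytes of an IPv4 address and checks the `DNT` request header. It is not Matomo's code: the function names are hypothetical, and masking by whole bytes is an assumption about how such anonymisation can work.

```python
import ipaddress


def anonymise_ipv4(address: str, masked_bytes: int = 1) -> str:
    """Zero the last `masked_bytes` bytes of an IPv4 address."""
    if not 0 <= masked_bytes <= 4:
        raise ValueError("masked_bytes must be between 0 and 4")
    ip = ipaddress.IPv4Address(address)
    mask = (0xFFFFFFFF << (8 * masked_bytes)) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(int(ip) & mask))


def do_not_track_requested(headers: dict) -> bool:
    """Return True when the browser sent the DoNotTrack header with value 1."""
    return headers.get("DNT") == "1"
```

For example, `anonymise_ipv4("192.168.100.57")` yields `192.168.100.0`, and with `masked_bytes=2` it yields `192.168.0.0`. The more bytes are masked, the less precise any location derived from the address becomes.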
The first-party cookies are listed in the documentation (by name, time to live and details)[9]:
It is possible to disable cookie usage in Matomo, in which case Matomo falls back on other identification methods such as a browser fingerprint. The cost of not using cookies is a loss of data accuracy: reports such as “Days since last visits” or “Visit by visit count” will be affected[10].
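The idea of a browser fingerprint can be sketched as a hash over attributes that the browser reveals anyway. Which attributes Matomo actually combines is not stated here, so the inputs below (IP address, user agent, language, screen resolution) and the function name are assumptions chosen for illustration only.

```python
import hashlib


def fingerprint(ip: str, user_agent: str, language: str, resolution: str) -> str:
    """Derive a short, stable identifier from browser attributes (no cookie needed)."""
    raw = "|".join([ip, user_agent, language, resolution])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]


first_visit = fingerprint("203.0.113.7", "ExampleBrowser/1.0", "nl-NL", "1920x1080")
second_visit = fingerprint("203.0.113.7", "ExampleBrowser/1.0", "nl-NL", "1920x1080")
other_browser = fingerprint("203.0.113.7", "OtherBrowser/2.0", "nl-NL", "1920x1080")
```

The same attributes always give the same value, but two visitors with identical settings behind one IP address collide, and a visitor whose attributes change looks new. This is one way to see why accuracy drops without cookies.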
Matomo has several options to identify new or returning users[11]:
Matomo defines the following list of data that is tracked by default[12]:
Bullet points marked with an asterisk (*) are mentioned as data possibly containing Personally Identifiable Information (PII). Besides the marked points above, the following data can also contain PII:
Downloads are not always registered by Google Analytics, at least not by default. Matomo provides features for registering downloads automatically. This works by detecting links to a file (based on the file extension in the URL). If a URL does not contain a file extension, a workaround is to add a CSS class to the link. This will register the link as a download in Matomo. [13]
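The detection rule described above can be sketched as follows. The extension set is a small illustrative subset, and the class name `matomo_download` is an assumption on my part; the documentation [13] gives the authoritative list and class name.

```python
import posixpath
from urllib.parse import urlsplit

# Illustrative subset only; not Matomo's actual extension list.
DOWNLOAD_EXTENSIONS = {"pdf", "zip", "doc", "xls", "csv", "mp3"}
# Assumed class name used to force a link to count as a download.
DOWNLOAD_CLASS = "matomo_download"


def is_download(href: str, css_classes: str = "") -> bool:
    """Treat a link as a download if its path has a known extension or it carries the class."""
    path = urlsplit(href).path  # drops the query string and fragment
    extension = posixpath.splitext(path)[1].lstrip(".").lower()
    if extension in DOWNLOAD_EXTENSIONS:
        return True
    return DOWNLOAD_CLASS in css_classes.split()
```

A link such as `https://example.com/download?id=7` has no extension and is only counted when the class is present, which is the workaround mentioned above.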
Both Google Analytics and Matomo use a 1x1 GIF image to send the collected data to the backend. In searching for the reason why a GIF image is used, the first suspect was Cross-Origin Resource Sharing (CORS), as the JavaScript is often executed from a different origin than that of the actual web analytics service. The fact that images can be loaded from any domain without CORS makes a tiny GIF image suitable for cross-origin requests. The GIF image is never rendered to the DOM and is only retrieved by the JavaScript. [16]
Cookies are used to maintain a form of state between the client (a visitor's browser) and the server. Cookies are set by the server via HTTP or by JavaScript. The cookie is then sent along with each request to the server [14].
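The request itself is just an image URL whose query string carries the data. The sketch below builds such a URL; the endpoint and the parameter names (`title`, `res`, `nocache`) are hypothetical, and the random value plays the same cache-preventing role as the utmn parameter listed earlier.

```python
import random
from urllib.parse import urlencode


def beacon_url(endpoint: str, data: dict) -> str:
    """Encode the collected data as the query string of an image request."""
    params = dict(data)
    # A random value makes every URL unique, so the image is not served from cache.
    params["nocache"] = random.randint(0, 2**31 - 1)
    return f"{endpoint}?{urlencode(params)}"


url = beacon_url(
    "https://analytics.example/pixel.gif",  # hypothetical endpoint
    {"title": "Home page", "res": "1920x1080"},
)
```

In the browser, the tracking script would request this URL as an image; no response body beyond the tiny GIF is needed, because the data travels in the request.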
Cookie setting | Description |
---|---|
Name | The name of the cookie. |
Expires | The expiration date and time. |
Secure | If set, the cookie is sent over HTTPS only. |
HttpOnly | If set, the cookie is inaccessible to JavaScript. |
Domain | The domain on which the cookie is active. |
Path | The path within the domain on which the cookie is active; ‘/’ is the root. |
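
The settings in the table map directly onto the attributes of a `Set-Cookie` response header. A minimal sketch using Python's standard `http.cookies` module follows; the cookie name, value, domain and date are made up.

```python
from http.cookies import SimpleCookie
from typing import Optional


def build_set_cookie(
    name: str,
    value: str,
    domain: str,
    path: str = "/",
    expires: Optional[str] = None,
    secure: bool = True,
    http_only: bool = True,
) -> str:
    """Return the value of a Set-Cookie header with the given settings."""
    cookie = SimpleCookie()
    cookie[name] = value
    morsel = cookie[name]
    morsel["domain"] = domain
    morsel["path"] = path
    if expires:
        morsel["expires"] = expires  # an HTTP date string
    if secure:
        morsel["secure"] = True
    if http_only:
        morsel["httponly"] = True
    return morsel.OutputString()


header = build_set_cookie(
    "visitor_id", "abc123", "example.com",
    expires="Wed, 01 Jan 2031 00:00:00 GMT",
)
```

A cookie set from JavaScript cannot carry the HttpOnly flag, so a tracker that reads its own cookie in the browser would call this with `http_only=False`.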
The distinction between first-party and third-party cookies is important in the field of Web Analytics and tracking. Third-party cookies are mostly used for advertising and tracking. Browsers and add-ons such as ad blockers now often block third-party cookies.
For a website that is being browsed (let's say example.com), a cookie with the domain example.com is a first-party cookie. A cookie with another domain, e.g. doubleclick.net, is a third-party cookie. Such a third-party cookie can be placed by an ad or banner that is loaded onto the page. The user can then be identified on other websites that contain an ad or banner from that same third-party domain [15].
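The classification above comes down to comparing the cookie's domain with the host of the page being browsed. The sketch below uses a simple suffix match; the function name is hypothetical, and real browsers use stricter rules (for example, a list of public suffixes) that this example ignores.

```python
def is_first_party(cookie_domain: str, page_host: str) -> bool:
    """Return True if the cookie domain equals the page host or is a parent domain of it."""
    cookie_domain = cookie_domain.lstrip(".").lower()
    page_host = page_host.lower()
    return page_host == cookie_domain or page_host.endswith("." + cookie_domain)


# While browsing www.example.com:
own_cookie = is_first_party("example.com", "www.example.com")
ad_cookie = is_first_party("doubleclick.net", "www.example.com")
```

Here `own_cookie` is `True` and `ad_cookie` is `False`. Matching on `"." + cookie_domain` rather than on the bare suffix prevents `notexample.com` from being treated as part of `example.com`.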
[6] Beacon usage
[7] Tracking without JavaScript
[8] Log importing and analysis
[10] No Cookies
[11] Unique visitor recognition
[12] What data does Matomo track?
[13] Tracking downloads
[15] HTTP cookies (Mozilla)