The driver exposes the Google Analytics data model in the form of a relational schema. Google Analytics stores two types of data: admin data and analytics data. Admin data consists of objects related to the management of Google Analytics services. The driver maps these objects directly as relational tables. Analytics data consists of all the data that Google Analytics collects about a website.

Given the large volume of analytics data available and the limitations around which aspects of analytic data that can be returned in a single result set, it can be challenging to formulate valid SQL queries. To address this issue, the driver requires that you create custom tables containing the specific metrics and dimensions that you want to run queries against. The Configuration Manager includes a table designer tool to assist with custom table creation. See Adding custom tables for details.

Note: The driver itself defines a number of functional tables, such as INCOMPATIBILITY and METADATA. The driver uses these tables to translate the Google Analytics data model into a relational schema. The metadata for these tables, as well as the admin and custom tables, is provided in corresponding topics beginning with the ACCOUNTS topic.

The following sections provide information to help you write effective queries against Google Analytics.

Google Analytics data structure

Google Analytics is a service that generates detailed statistics about a website's traffic and traffic sources. But Google Analytics is not just a database. It is a multi-dimensional hypercube containing all kinds of measurements about traffic to a website. When you connect to Google Analytics using the driver, you can reach into this repository and flatten it into relational data that can be used with your JDBC application.

Imagine a very small store of data about your website. For each hit, the Google Analytics service logs the date, language of user, country of origin, new or returning user, and their time on the site (in seconds).

2019-01-01 en US new 5.3
2019-01-01 nl DK new 90.4
2019-01-01 es ES new 24
2019-01-01 ja JP new 4.2
2019-01-01 es MX new 345.3
2019-01-01 ja JP returning 655.9
2019-01-02 en US new 45.7
2019-01-02 en US new 345.9
2019-01-02 es ES new 57.7
2019-01-02 en US new 6.8
2019-01-03 es MX new 876.1
2019-01-03 ja JP returning 5.7
2019-01-03 en GB new 5.6
2019-01-03 en US new 617.9
2019-01-03 en US returning 56.1
2019-01-04 es MX new 45.1
2019-01-04 jp JP new 178
2019-01-04 en US returning 103.9

Google Analytics collected data for this website over four days. The data is broken down by date, language, country and user type. For each visit, the time spent on the site was recorded. The time on the site is a metric, and the other columns are dimensions.

In an actual scenario, Google Analytics aggregates a vast amount of information about your website. It measures hundreds of things, and categorizes them by hundreds of dimensions. The query interface that Google Analytics provides allows you to fetch these metrics and group them. Because of the massive amount of information they store, their interface limits you to fetching up to ten metrics at a time, grouped by no more than nine dimensions.

For example, suppose you want to know how much time new visitors spent on the site. Your dimension is user type and your metric is time. You would get back two rows:

new 2648
returning 821.6

How much data you get back depends on how you ask for it. If you ask for two dimensions, you get even more data, because you get one row per permutation. Requesting how much time users have spent on each day, broken down by country, returns more rows:

2019-01-01 DK 90.4
2019-01-01 ES 24
2019-01-01 JP 660.1
2019-01-01 MX 345.3
2019-01-01 US 5.3
2019-01-02 ES 57.7
2019-01-02 US 398.4
2019-01-03 GB 5.6
2019-01-03 JP 5.7
2019-01-03 MX 876.1
2019-01-03 US 674
2019-01-04 JP 178
2019-01-04 MX 45.1
2019-01-04 US 103.9

Adding custom tables

Custom tables can be added to your Google Analytics relational schema, and further customized, using the Configuration Manager. Once you open the Configuration Manager, you can add and customize tables from the Configure Logical Schema button on the Schema Settings tab. This tool allows you define a new table, or customize an existing table, in terms of metrics and dimensions. You can select up to ten metrics and nine dimensions per table. Based on your selections, a table definition is created in the form of a JSON string which is incorporated into the connection string as a value for the Add Tables (AddTables) option. For example:


AddTables='{"MyTable":["sessions","_language","_country"]}'
Note: Dimension names specified in the string must include a leading underscore character. For example, "_browser". The driver uses the underscore character to distinguish dimensions from metrics.

The table that results from this example would include the three selected columns, plus the columns currencyCode, keepEmptyRows, propertyId, rowId, and timeZone. The following is an example query for your custom table:


SELECT _LANGUAGE,SESSIONS FROM MyTable

Paging support

The driver employs row offset paging when returning results using the Reports API. The page size returned for results is set to 10,000 rows.

Realtime reports

Realtime reports return events and usage data for periods ranging from the present to 30 minutes ago (up to 60 minutes for Google Analytics 360 properties). Reports can return up to 14 dimensions and 4 metrics. For more information on the supported dimensions and metrics, refer to the documentation for the Google Analytics Data API.

You can configure Realtime reports by specifying the following parameters for the Default Query Options (DefaultQueryOptions) option.
  • startMinutesAgo: The inclusive start minutes for the query with Realtime Report API. The default is 29 (twenty-nine minutes prior to the current time).
  • endMinutesAgo: The inclusive end minutes for the query with Realtime report API. The default is 0 (current time).
These filters will automatically be applied to your query. For example, the following query can be used to return realtime reports:
SELECT _appVersion,_city,activeUsers,eventCount FROM REALTIMEREPORTS where propertyID=123344545

Note that paging is not supported when querying realtime reports, and a maximum of 100,000 rows can be returned.