Methodology & Notes
Definitions, measurement methodology, and known constraints — updated 2026-04-13 09:56
Definitions
Key metrics and how they are computed in this dashboard.
New vs Returning visitors
- New Visitor: a visitor coming to the site for the very first time.
- Returning Visitor: a visitor who has visited the site before and is coming back.
- The distinction does not depend on the selected date range — it depends on the visitor's global history.
- Recognition relies on cookies / User ID. If cookies are deleted or blocked, a returning visitor may be counted as new.
- Recognition is per device / browser (unless User ID is enabled).
- Google Analytics 4 defines new users differently (via its first_visit / first_open events), which can lead to interpretation differences between the two tools.
How this dashboard computes it: via Matomo's VisitFrequency.get API, which returns nb_visits_new and nb_visits_returning per day based on the visitor's cookie history. When a date filter is applied, new and returning visits are summed across daily rows, preserving Matomo's global-history definition.
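The summing step can be sketched as follows. This is a minimal illustration, not the dashboard's actual code: the function name `sum_new_returning` and the sample rows are hypothetical, and the input is assumed to be a mapping from ISO date to one day's VisitFrequency.get metrics, where a day without data may come back as an empty list.

```python
from datetime import date


def sum_new_returning(daily_rows, start, end):
    """Sum new and returning visits over the inclusive [start, end] range.

    daily_rows maps an ISO date string to that day's metrics (assumed shape).
    """
    totals = {"nb_visits_new": 0, "nb_visits_returning": 0}
    for day, metrics in daily_rows.items():
        if not (start <= date.fromisoformat(day) <= end):
            continue  # outside the selected date filter
        if not isinstance(metrics, dict):
            continue  # a day without data may be an empty list
        for key in totals:
            totals[key] += int(metrics.get(key, 0))
    return totals


# Hypothetical sample rows, not real cefic.org figures.
sample = {
    "2025-06-01": {"nb_visits_new": 120, "nb_visits_returning": 45},
    "2025-06-02": {"nb_visits_new": 98, "nb_visits_returning": 51},
    "2025-06-03": [],
    "2025-06-04": {"nb_visits_new": 110, "nb_visits_returning": 40},
}
result = sum_new_returning(sample, date(2025, 6, 1), date(2025, 6, 3))
print(result)  # {'nb_visits_new': 218, 'nb_visits_returning': 96}
```

Because each daily value is already classified by Matomo against the visitor's full history, adding the rows does not reclassify anyone as new relative to the filter window.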
Visits, Unique Visitors, Pageviews
- Visit (session): a series of consecutive actions by the same visitor, ended after 30 minutes of inactivity (Matomo default).
- Unique Visitor: distinct visitor identified by cookies/fingerprint within the analysed period. Summing daily unique-visitor counts across days overcounts anyone who visited on more than one day, so treat multi-day totals with caution.
- Pageview: one page load recorded by the tracker. Does not deduplicate reloads of the same URL within a visit.
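The overcounting effect for Unique Visitors can be seen in a small sketch. The visitor IDs below are hypothetical; the dashboard itself only receives daily counts, not IDs, which is exactly why the deduplicated figure cannot be recovered by summing.

```python
# Hypothetical visitor IDs seen on each day (illustrative only).
visitors_by_day = {
    "2025-06-01": {"a", "b", "c"},
    "2025-06-02": {"b", "c", "d"},
    "2025-06-03": {"a", "d"},
}

# Summing daily unique counts counts every multi-day visitor once per day.
summed_daily_uniques = sum(len(ids) for ids in visitors_by_day.values())

# The period-level figure deduplicates visitors across all days.
period_uniques = len(set().union(*visitors_by_day.values()))

print(summed_daily_uniques, period_uniques)  # 8 4
```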
Bounce rate, Avg. visit duration
- Bounce rate: percentage of visits with a single pageview. A high bounce is not inherently bad — for a "read-one-article" news page it's normal.
- Avg. visit duration: average time between the first and last recorded action of a visit. Bounced visits contribute 0 seconds (no second action to measure from).
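As an illustration of the two definitions above, the sketch below computes both metrics from a list of visits. The `Visit` record and its field names are assumptions made for this example and do not mirror Matomo's schema.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    pageviews: int
    first_action_ts: int  # seconds since epoch
    last_action_ts: int   # equals first_action_ts for a bounced visit


def bounce_rate(visits):
    """Percentage of visits with exactly one pageview."""
    if not visits:
        return 0.0
    bounced = sum(1 for v in visits if v.pageviews == 1)
    return 100.0 * bounced / len(visits)


def avg_visit_duration(visits):
    """Mean seconds between first and last action; bounces add 0 seconds."""
    if not visits:
        return 0.0
    total = sum(v.last_action_ts - v.first_action_ts for v in visits)
    return total / len(visits)


# Hypothetical visits: two bounces, one 180 s visit, one 60 s visit.
visits = [
    Visit(1, 1000, 1000),
    Visit(3, 2000, 2180),
    Visit(2, 3000, 3060),
    Visit(1, 4000, 4000),
]
print(bounce_rate(visits), avg_visit_duration(visits))  # 50.0 60.0
```

Note how the two bounced visits pull the average down to 60 seconds even though the engaged visits lasted 120 seconds on average, so a high bounce rate and a low average duration tend to move together.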
Conversions
- Conversion: a visit that triggered a configured Goal (e.g. download, event registration, video play).
- Attribution model: last non-direct touch — the campaign of the converting visit receives the conversion. Multi-touch attribution is not available.
Page type classification
Pages on cefic.org are grouped into categories (News, Policy, Guidance, Case Studies, Events, Science, Industry Data, Sectors, Highlights, Resources, About, Home, Other) by URL-prefix matching rules defined in the ingestion script. Pages that don't match any prefix are labelled “Other”.
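A minimal sketch of prefix-based classification is shown below. The prefixes in `PREFIX_RULES` and the function name `classify_page` are hypothetical; the real rules live in the ingestion script and may differ in both content and matching details.

```python
from urllib.parse import urlparse

# Hypothetical rules for illustration; not the actual ingestion rules.
PREFIX_RULES = [
    ("/news/", "News"),
    ("/policy/", "Policy"),
    ("/events/", "Events"),
    ("/about/", "About"),
]


def classify_page(url):
    """Return the page category for a URL, or "Other" if no prefix matches."""
    path = urlparse(url).path or "/"
    if path == "/":
        return "Home"
    # Normalise to a trailing slash so "/news" matches "/news/"
    # while "/newsletter" does not.
    if not path.endswith("/"):
        path += "/"
    for prefix, label in PREFIX_RULES:
        if path.startswith(prefix):
            return label
    return "Other"


print(classify_page("https://cefic.org/news/some-article"))  # News
print(classify_page("https://cefic.org/newsletter"))         # Other
```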
Matomo Cloud API constraints
| Limitation | Impact | Status |
|---|---|---|
| Segment pre-processing required | Some Matomo Cloud plans require custom segments to be created in the Segment Editor before the API returns data for them. Initially visitorType==returning was not available via VisitsSummary.get. | Resolved: switched to VisitFrequency.get, which returns both new and returning visits in a single call without segments. |
| API rate limits & chunking | Matomo Cloud enforces rate limits on API calls. Requesting large date ranges in a single call may fail or return partial data. | Mitigated — the pipeline splits requests into 30-day chunks with automatic retry. |
| Period-snapshot tables overwrite | Tables ending in _period (keywords, websites, socials, exit pages, etc.) are fully refreshed on each run. Historical period-level data is not preserved; only the latest snapshot exists. | By design: these tables hold current-state summaries, not time series. |
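
The chunking-with-retry mitigation listed in the table can be sketched roughly as follows. The helper names, the retry count, and the exponential backoff are assumptions for illustration; the pipeline's actual retry policy is not described here.

```python
import time
from datetime import date, timedelta


def split_into_chunks(start, end, chunk_days=30):
    """Split the inclusive [start, end] range into chunks of at most chunk_days."""
    chunks = []
    cursor = start
    while cursor <= end:
        chunk_end = min(cursor + timedelta(days=chunk_days - 1), end)
        chunks.append((cursor, chunk_end))
        cursor = chunk_end + timedelta(days=1)
    return chunks


def call_with_retry(fetch, chunk, attempts=3, base_delay=1.0):
    """Call fetch(start, end); on failure wait and retry, re-raising at the end."""
    for attempt in range(attempts):
        try:
            return fetch(*chunk)
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)  # assumed backoff policy


chunks = split_into_chunks(date(2025, 5, 19), date(2025, 7, 31))
for chunk_start, chunk_end in chunks:
    print(chunk_start.isoformat(), chunk_end.isoformat())
```

For the range above this yields three chunks: two full 30-day windows and a final 14-day remainder. Each chunk would then be passed to `call_with_retry` together with whatever function performs the API request.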
Data coverage gaps
| Gap | Detail | Workaround |
|---|---|---|
| Data starts 2025-05-19 | The Matomo tracking tag was installed on cefic.org on that date. No analytics data exists before this point. | Year-over-year comparisons will only be available from May 2026 onwards. |
| Page-type classification is rule-based | Pages are classified (News, Policy, Guidance, etc.) based on URL prefix matching. Pages that don't match any known prefix are labelled “Other”. | New sections on cefic.org require adding rules in ingest_matomo.py. Currently ~27% of pageviews fall into “Other”. |
| Campaign audience parsing | Audiences (members, staff, anyone, non_members) are inferred by parsing the campaign name string. Non-standard naming produces “undefined”. | Standardise UTM campaign names following the pattern topic_audience_variant. |
| Conversion attribution | Conversions are attributed to the campaign of the visit session. No multi-touch or cross-session attribution is available. | This is a Matomo-level limitation (last-touch model). Consider Matomo's Multi Channel Conversion Attribution plugin for advanced needs. |
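
To make the campaign audience parsing row concrete, here is one way the topic_audience_variant pattern could be parsed. This is a sketch under stated assumptions: the topic and variant segments are assumed to contain no underscores (so that non_members can be recovered as the middle part), and the function name `parse_audience` is hypothetical.

```python
KNOWN_AUDIENCES = {"members", "staff", "anyone", "non_members"}


def parse_audience(campaign_name):
    """Extract the audience from a topic_audience_variant campaign name."""
    parts = campaign_name.strip().lower().split("_")
    if len(parts) < 3:
        return "undefined"  # not enough segments to follow the pattern
    # Everything between the first and last segment is the audience,
    # which lets the two-word audience "non_members" survive the split.
    audience = "_".join(parts[1:-1])
    return audience if audience in KNOWN_AUDIENCES else "undefined"


# Hypothetical campaign names for illustration.
print(parse_audience("reach_members_a"))      # members
print(parse_audience("reach_non_members_b"))  # non_members
print(parse_audience("SpringNewsletter"))     # undefined
```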
Dashboard & display limitations
| Limitation | Detail |
|---|---|
| One-screen layout | Main dashboards are designed for desktop screens (1280×800 minimum). On smaller screens or mobile devices, some panels may overlap or require scrolling. |
| Date filter scope | The date filter applies to daily-granularity data. Period-snapshot tables (keywords, websites, socials, exit pages, page performance) always show the full-period view regardless of the selected date range. |
| Plotly chart rendering | Charts require the Plotly.js library (~4.5 MB). A CDN is used with a local fallback. In fully offline/air-gapped environments, ensure plotly.min.js is present in site/assets/. |
| Static generation | All data is embedded at build time. There is no live connection to Matomo — dashboards reflect the state at last pipeline run (daily 06:00 CET). |
Pipeline & infrastructure
| Item | Detail |
|---|---|
| Single-machine pipeline | The pipeline runs on a single server via cron. There is no high availability (HA), no retry scheduler, and no alerting beyond log inspection. If the machine is down at 06:00, that day's update is skipped; the next day's run catches up on the missing data automatically. |
| DuckDB single-writer | DuckDB allows only one writer at a time. Running the pipeline concurrently (e.g. manual + cron) may cause lock errors. |
| GitHub → Azure deployment | The pipeline pushes to GitHub, which triggers Azure Static Web Apps deployment. If the GitHub push succeeds but Azure deployment fails, the live site will be stale until the next successful deployment. |
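
One common way to avoid the concurrent-run lock errors mentioned in the DuckDB row is to guard the pipeline entry point with a lock file. The sketch below is not part of the described pipeline; the `pipeline_lock` helper and the lock path are hypothetical, and a crash would leave a stale lock file that has to be removed manually.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def pipeline_lock(lock_path):
    """Hold an exclusive lock file for the duration of a pipeline run."""
    try:
        # O_EXCL makes creation fail if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise RuntimeError(f"another pipeline run holds {lock_path}") from None
    try:
        os.write(fd, str(os.getpid()).encode())  # record the owner for debugging
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)


# Demonstration in a fresh temporary directory (hypothetical path).
lock_path = os.path.join(tempfile.mkdtemp(), "pipeline.lock")
with pipeline_lock(lock_path):
    try:
        with pipeline_lock(lock_path):
            second_run_started = True
    except RuntimeError:
        second_run_started = False

print(second_run_started)  # False
```

With such a guard, a manual run started while the cron job is active fails fast with a clear message instead of hitting a database lock error partway through.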