Definitions

Key metrics and how they are computed in this dashboard.

New vs Returning visitors

Important (common sources of confusion):
  • The distinction does not depend on the selected date range — it depends on the visitor's global history.
  • Recognition relies on cookies / User ID. If cookies are deleted or blocked, a returning visitor may be counted as new.
  • Recognition is per device / browser (unless User ID is enabled).
Matomo vs GA4 (do not compare directly): In Matomo, the New / Returning distinction relies on the visitor's global history, independent of the analysed period. In GA4, the classification depends on events detected within the selected period (first_visit / first_open), so the two tools can report different New / Returning splits for the same period.

How this dashboard computes it: the data comes from Matomo's VisitFrequency.get API, which returns nb_visits_new and nb_visits_returning per day based on the visitor's cookie history. When a date filter is applied, the daily values are summed over the selected range, which preserves Matomo's global-history definition.
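The date-filter summation described above can be illustrated with a minimal sketch. The row layout, the sample values, and the function name sum_new_returning are assumptions made for illustration; only the metric names nb_visits_new and nb_visits_returning come from the text.

```python
from datetime import date

# Hypothetical daily rows, shaped like one VisitFrequency.get result per day.
# The numbers are illustrative, not real dashboard data.
daily_rows = [
    {"date": date(2025, 6, 1), "nb_visits_new": 120, "nb_visits_returning": 45},
    {"date": date(2025, 6, 2), "nb_visits_new": 98, "nb_visits_returning": 52},
    {"date": date(2025, 6, 3), "nb_visits_new": 110, "nb_visits_returning": 40},
]


def sum_new_returning(rows, start, end):
    """Sum new and returning visits over rows whose date lies in [start, end]."""
    selected = [r for r in rows if start <= r["date"] <= end]
    return {
        "new": sum(r["nb_visits_new"] for r in selected),
        "returning": sum(r["nb_visits_returning"] for r in selected),
    }


# Apply a date filter covering the first two days only.
totals = sum_new_returning(daily_rows, date(2025, 6, 1), date(2025, 6, 2))
```

Because each daily value was already classified by Matomo against the visitor's full history, summing the rows does not reclassify anyone relative to the selected range.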

Visits, Unique Visitors, Pageviews

Bounce rate, Avg. visit duration

Conversions

Page type classification

Pages on cefic.org are grouped into categories (News, Policy, Guidance, Case Studies, Events, Science, Industry Data, Sectors, Highlights, Resources, About, Home, Other) by URL-prefix matching rules defined in the ingestion script. Pages that don't match any prefix are labelled “Other”.
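A minimal sketch of URL-prefix classification, assuming a small illustrative rule list. The prefixes in PREFIX_RULES, the function name classify_page, and the treatment of the root path as "Home" are hypothetical; the actual rules are those defined in the ingestion script.

```python
from urllib.parse import urlparse

# Hypothetical prefix rules for illustration only; the real list lives in
# the ingestion script and covers more categories.
PREFIX_RULES = [
    ("/news/", "News"),
    ("/policy/", "Policy"),
    ("/events/", "Events"),
]


def classify_page(url: str) -> str:
    """Return the category of the first matching URL prefix, else "Other"."""
    path = urlparse(url).path or "/"
    if path == "/":
        # Assumption: the site root is what the dashboard calls "Home".
        return "Home"
    for prefix, category in PREFIX_RULES:
        if path.startswith(prefix):
            return category
    return "Other"
```

For example, classify_page("https://cefic.org/news/some-article/") falls under the "/news/" rule, while a path with no matching prefix is labelled "Other".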

Matomo Cloud API constraints

Limitation | Impact | Status
Segment pre-processing required | Some Matomo Cloud plans require custom segments to be created in the Segment Editor before the API returns data for them. Initially visitorType==returning was not available via VisitsSummary.get. | Resolved: switched to VisitFrequency.get, which returns both new and returning visits in a single call without segments.
API rate limits & chunking | Matomo Cloud enforces rate limits on API calls. Requesting large date ranges in a single call may fail or return partial data. | Mitigated: the pipeline splits requests into 30-day chunks with automatic retry.
Period-snapshot tables overwrite | Tables ending in _period (keywords, websites, socials, exit pages, etc.) are fully refreshed on each run. Historical period-level data is not preserved; only the latest snapshot exists. | By design: these tables hold current-state summaries, not time series.
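
The 30-day chunking with retry mentioned in the rate-limit row can be sketched as follows. This is an illustration under stated assumptions, not the pipeline's actual code: the names date_chunks and fetch_with_retry, the number of attempts, and the exponential backoff are all hypothetical, and fetch stands in for whatever function performs the API call.

```python
import time
from datetime import date, timedelta


def date_chunks(start: date, end: date, max_days: int = 30):
    """Yield (chunk_start, chunk_end) pairs covering [start, end] inclusive,
    each spanning at most max_days days."""
    cursor = start
    while cursor <= end:
        chunk_end = min(cursor + timedelta(days=max_days - 1), end)
        yield cursor, chunk_end
        cursor = chunk_end + timedelta(days=1)


def fetch_with_retry(fetch, chunk_start, chunk_end, attempts=3, base_delay=1.0):
    """Call fetch(chunk_start, chunk_end); on failure, wait and try again.

    The delay doubles after each failed attempt. The last failure is re-raised.
    """
    for attempt in range(attempts):
        try:
            return fetch(chunk_start, chunk_end)
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)
```

A caller would loop over date_chunks(start, end) and pass each pair to fetch_with_retry, so a single failed request only costs one chunk rather than the whole date range.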

Data coverage gaps

Gap | Detail | Workaround
Data starts 2025-05-19 | The Matomo tracking tag was installed on cefic.org on that date. No analytics data exists before this point. | Year-over-year comparisons will only be available from May 2026 onwards.
Page-type classification is rule-based | Pages are classified (News, Policy, Guidance, etc.) based on URL prefix matching. Pages that don't match any known prefix are labelled “Other”. | New sections on cefic.org require adding rules in ingest_matomo.py. Currently ~27% of pageviews fall into “Other”.
Campaign audience parsing | Audiences (members, staff, anyone, non_members) are inferred by parsing the campaign name string. Non-standard naming produces “undefined”. | Standardise UTM campaign names following the pattern topic_audience_variant.
Conversion attribution | Conversions are attributed to the campaign of the visit session. No multi-touch or cross-session attribution is available. | This is a Matomo-level limitation (last-touch model). Consider Matomo's Multi Attribution plugin for advanced needs.
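
The campaign audience parsing described in the table above can be sketched as follows, assuming names follow topic_audience_variant with the topic as a single underscore-free first field. The function name parse_audience and the exact matching rules are hypothetical; the actual parsing logic may differ.

```python
# Audience labels listed in the table above.
KNOWN_AUDIENCES = ("members", "staff", "anyone", "non_members")


def parse_audience(campaign_name: str) -> str:
    """Infer the audience from a name shaped like topic_audience_variant.

    Returns "undefined" when the name does not follow the pattern.
    """
    # Split off the topic (first field); the rest should begin with the audience.
    _topic, sep, rest = campaign_name.strip().lower().partition("_")
    if not sep:
        return "undefined"
    for audience in KNOWN_AUDIENCES:
        # Matching from the start of the remainder keeps "non_members"
        # from being read as "members".
        if rest == audience or rest.startswith(audience + "_"):
            return audience
    return "undefined"
```

Under these assumptions, a name such as "reach_non_members_v2" resolves to non_members, while a free-form name such as "Summer Newsletter" resolves to "undefined", which is why standardised naming matters.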

Dashboard & display limitations

Limitation | Detail
One-screen layout | Main dashboards are designed for desktop screens (1280×800 minimum). On smaller screens or mobile devices, some panels may overlap or require scrolling.
Date filter scope | The date filter applies to daily-granularity data. Period-snapshot tables (keywords, websites, socials, exit pages, page performance) always show the full-period view regardless of the selected date range.
Plotly chart rendering | Charts require the Plotly.js library (~4.5 MB). A CDN is used with a local fallback. In fully offline/air-gapped environments, ensure plotly.min.js is present in site/assets/.
Static generation | All data is embedded at build time. There is no live connection to Matomo; dashboards reflect the state at the last pipeline run (daily 06:00 CET).

Pipeline & infrastructure

Item | Detail
Single-machine pipeline | The pipeline runs on a single server via cron. There is no HA, no retry scheduler, and no alerting beyond log inspection. If the machine is down at 06:00, that day's update is skipped (caught up automatically the next day).
DuckDB single-writer | DuckDB allows only one writer at a time. Running the pipeline concurrently (e.g. manual + cron) may cause lock errors.
GitHub → Azure deployment | The pipeline pushes to GitHub, which triggers Azure Static Web Apps deployment. If the GitHub push succeeds but Azure deployment fails, the live site will be stale until the next successful deployment.