Courrier des statistiques N10 - 2023

With issue 10, the Courrier des Statistiques journal celebrates five years of publication in its new format and continues to explore issues and methods in the area of official statistics.
The issue opens with a topic that statisticians can no longer avoid: data visualisation. Falling between dissemination and communication, data visualisation seeks to simplify messages so that readers understand them more easily and want to read them.
The second article, on defence statistics, addresses an area in which data are often sensitive: they are highly confidential, yet open to researchers under very secure conditions.
What administrative data should be studied, what surveys should be used and what choices should be made in relation to statistics on sport? This is precisely what the third article is about.
In this issue, two articles on registers echo those already published on this subject in issue 8. FINESS is the French register of health and social establishments and plays a fundamental role in the ecosystem of health information systems. Ramsese, the French academic and ministerial register of educational establishments, is used in a wide variety of ways: for governance, management, interoperability and statistical needs. A shared feature of these two registers is that they are both, in their respective fields, organised centrally and subject to high quality requirements.
Finally, the last paper uses an educational approach and striking examples to discuss the differences between probability (random) and non-probability sampling.

Courrier des statistiques
Published on: 12/02/2025
Pascal Ardilly
Courrier des statistiques - February 2025

Can we rely on non-probability sampling?

Pascal Ardilly

Sample surveys are based on either a probability sample or a non-probability sample. In the non-probability approach, the probability of a given individual being included in the sample generally depends on the value of the variable collected from that individual. This produces a particular error known as ‘selection bias’. In the non-probability ‘quota’ method, this bias is limited by structuring the sample according to certain variables that explain the measured phenomenon. However, a bias remains if those variables fail to account for all of its variability. To fully justify the method, one must rely on an assumption about how individuals behave, an approach known as modelling. Other non-probability selection methods exist, such as purposive selection, which reflects the common perception of ‘representativeness’, or volunteer sampling, which has developed particularly in recent years through ‘access panels’. In this last case, the selection bias can be significant, even considerable. Unfortunately, the bias cannot be reduced by increasing the size of the sample. Two striking examples, one relating to the coronavirus vaccine uptake rate, the other to the 1936 presidential election in the United States, illustrate this phenomenon, known as the ‘big data paradox’.
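
The persistence of selection bias in large volunteer samples can be illustrated with a small simulation. The sketch below is not taken from the article: the population sizes, the true rate and the joining probabilities are all hypothetical values chosen for illustration, and the function name `simulate` is an assumption. It compares a small simple random sample with a much larger volunteer sample in which the probability of taking part depends on the variable being measured.

```python
import random


def simulate(pop_size=200_000, true_rate=0.5, srs_size=1_000, seed=0):
    """Compare a small probability sample with a large volunteer sample.

    All numeric values are illustrative assumptions, not figures from the article.
    """
    rng = random.Random(seed)

    # Hypothetical population: y = 1 if the individual has the trait of interest.
    population = [1 if rng.random() < true_rate else 0 for _ in range(pop_size)]
    true_mean = sum(population) / pop_size

    # Probability sample: simple random sampling without replacement.
    srs = rng.sample(population, srs_size)
    srs_mean = sum(srs) / srs_size

    # Volunteer sample: the chance of taking part depends on y itself
    # (assumed 30% when y = 1 and 20% when y = 0), which creates selection bias.
    p_join = {1: 0.30, 0: 0.20}
    volunteers = [y for y in population if rng.random() < p_join[y]]
    vol_mean = sum(volunteers) / len(volunteers)

    return true_mean, srs_mean, vol_mean, len(volunteers)


if __name__ == "__main__":
    # Growing the population (and hence the volunteer sample) does not shrink the bias.
    for pop_size in (20_000, 200_000):
        true_mean, srs_mean, vol_mean, n_vol = simulate(pop_size=pop_size)
        print(f"population={pop_size}: true={true_mean:.3f}, "
              f"SRS (n=1000)={srs_mean:.3f}, volunteers (n={n_vol})={vol_mean:.3f}")
```

Under these assumed joining probabilities, the volunteer estimate tends towards 0.15 / (0.15 + 0.10) = 0.6 rather than the true rate of 0.5, however many volunteers respond, whereas the error of the simple random sample is only sampling noise that shrinks as its size grows.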