Courrier des statistiques N3 - 2019
The Secure Data Access Centre (CASD), a Service for Datascience and Scientific Research
Giving researchers access to individual data collected by the Official Statistical System constitutes a major scientific challenge. This very detailed information requires a very high level of security to avoid any disclosure, which would be prejudicial to the citizen, or any use by an unauthorised third party. To meet this security requirement, in 2010 INSEE created the Secure Data Access Centre (Centre d'accès sécurisé aux données, or CASD), whose teams have designed a secure device allowing remote access while ensuring strong user authentication and confinement of the files. CASD, now autonomous, has developed over time, extending its scope to other data producers and other types of highly detailed, sensitive data such as health data and administrative data. This service provides new solutions to the issue of record linkage and reproducibility of research work based on confidential data. CASD is increasingly used by the research community in France, and the originality of the experience, although relatively recent compared with that of its foreign partners, is enabling it to expand on an international level.
- Individual Data and Confidentiality
- The Specific Needs of Researchers
- Box 1. A Few Key Achievements
- Confidentiality vs Openness: Initial Steps
- Technology to Open Up New Possibilities
- Extremely Tight, Certified Security...
- ... Which Does Not Restrict Use on the Part of Researchers
- Access to Data: The Administrative Procedure...
- ... Followed by an Enrolment Session
- Autonomy but Not Without Checks
- The Benefits for Data Producers...
- ... and for Researchers
- Record Linkage to Increase the Possibilities...
- ... Using the “Hashed NIR”
- An Innovative Approach to Certification of Research Based on Confidential Data...
- International Opportunities
- Box 2. Status and Funding of CASD
The first work devoted entirely to statistical confidentiality appeared this year (Le Gléau, 2019). It gives a complete overview for France and some observations with regard to foreign countries in relation to the measures implemented to manage the use of data collected to produce statistics. Emphasis is placed in particular on the duality between the need to collect individual data and the requirement for security to ensure such data remain confidential, from both a legal and a technical point of view. For the statisticians of the Public Statistical Department, safeguards are provided by their status and by their integration in INSEE or in ministries. For researchers, with their ever-increasing need for very detailed public statistical data, there is the issue of the legal framework for access, as well as the question of technical security safeguards required to maintain the confidentiality of the data. We will observe how the Secure Data Access Centre (CASD) provides an answer to these questions and therefore fosters the development of access to data for scientific research. We will also consider how relations with both producers and researchers have evolved over time and have enabled new uses to be made available associated with new technology relating to data science, record linkage and the certification of results.
Individual Data and Confidentiality
A large amount of individual data on persons and companies is collected today by INSEE and the ministerial statistical departments for public statistics purposes, by the authorities in the course of executing their duties, by companies for their management purposes and by universities for the purposes of research in different fields such as health. In addition, there is ever-increasing individual data associated with the use of electronic methods (credit card payments, etc.) collected automatically. All this information covers a wide range of fields that are of particular interest in terms of research: income, wealth, health, company accounting data, information on geographic locations, schooling, professional careers, etc.
Although they are not all directly-identifying data (names or identifiers such as social security numbers or addresses), many of these data are indirectly identifying as a result of their precision. Some of them are sensitive under the law and pose an even greater risk to the persons or companies concerned in the event of identification. For companies, minimal amounts of information are usually sufficient to identify them.
The Specific Needs of Researchers
The data are covered by various forms of confidentiality contained in regulations and laws depending on the field: fiscal confidentiality, medical confidentiality, criminal confidentiality, business confidentiality and general professional confidentiality. In the case of public statistics, the confidentiality obligation for statisticians is written into law. This confidentiality obligation is known as “statistical confidentiality”, a specialised version of professional confidentiality.
Generally, these various provisions did not initially take the research purpose into account. The provisions have been progressively amended to incorporate this to enable researchers to use this very rich source of data for their quantitative analyses.
The initial progress in researchers’ access to anonymised public statistical data, made through the Quetelet network, quickly showed that although these less-detailed files, sent directly to researchers, represented a significant step forward, they did not meet the requirements of many research projects. The growing concern of CNIL in relation to personal data led to a further reduction in the level of detail of these data, sometimes to the point where certain studies, for example in demography or urban sociology, could no longer be undertaken. In parallel, new statistical methods combined with greater computing power required very detailed information, particularly for economists, at a time when it was becoming possible to analyse a growing volume of administrative data relevant to the evaluation of public policy.
To meet these requirements, the Act on statistical confidentiality, known as the 1951 Act, which had already seen several amendments, notably to allow researchers to use business data, was extended in 2008 to cover the use of data relating to persons and households for research purposes. An amendment in these terms for all personal data was made in 2004 in the Act on information technology and civil liberties. Amendments to other provisions also followed, notably in the field of tax data.
Legal developments do not of themselves suffice to safeguard the confidentiality of data. Appropriate security measures providing additional technical safeguards must be applied when the data are actually accessed. These measures require very high standards, which are clearly easier to apply within the public statistics department, given its function, than outside it. This explains the difficulties that have had to be addressed to ensure that such measures could be applied in entities such as universities and research centres before sending them the data. It was with this aim of extending the security mechanism beyond the Public Statistical Department, and therefore responding to researchers’ needs to access data, that INSEE and GENES created the Secure Data Access Centre (CASD) in 2010, initially focused on access to data resulting from public statistics (see Le Gléau and Royer, 2011, and Box 1).
Box 1. A Few Key Achievements
Today CASD is hosting approximately 500 research projects being undertaken in France (Amiens, Lyon, Marseille, Dijon, Paris, etc.) and abroad (United Kingdom, Germany, the Netherlands, Poland, Spain, Italy, etc.), representing approximately 350 sites deployed. In total, almost 1,500 users rely on CASD to access confidential data.
June 1999 – Report of Roxane Silberman on access to data for research.
October 2007 – INSEE launched a CASD pilot project following peer evaluation of the European Statistics Code of Practice.
July 2008 – Amendment to Act 51-711 to allow researchers to access data on individuals and households.
October 2009 – First Statistical Confidentiality Committee relating to data on individuals and households - Announcement with regard to the implementation of usage-based invoicing to cover costs.
February 2010 – Roll-out of CASD for 30 projects.
January 2011 – CASD is awarded the tender for the Équipex project (Équipement d’Excellence du programme Investissements d’avenir) and obtains funding of €4M to aid its development.
March 2012 – Creation of CASD entity at Genes.
September 2014 – Tax data are made available.
October 2016 – the Law for a Digital Republic, Article 36 of which extends the powers of the Statistical Confidentiality Committee to administrative data.
December 2018 – CASD Public Interest Group is established.
Confidentiality vs Openness: Initial Steps
Since the 1980s, to solve the conflict between data confidentiality and researchers’ wishes to use these data more widely, certain countries, including the United States, Canada, Great Britain and Germany, have set up secure access centres taking the form of isolated premises: users must physically travel to centres to work at them, with very strict checks when they enter and leave the premises. In particular, any extracted results can only be retrieved after verification has been carried out by the operators to ensure that statistical confidentiality has been adhered to. Although the data could be accessed, it was nonetheless very inconvenient for researchers to be required to travel, sometimes over long distances, to access the data.
To overcome the drawbacks of these arrangements, experiments started in the late 1990s with a view toward developing systems allowing both secure and remote access. In 1999, a report provided to the minister responsible for higher education and research (Silberman, 1999) highlighted the need for researchers to use very detailed data. The report referred to the fact that this had been trialled abroad, notably at Statistique Québec. From the 2000s, this type of system could already be found in the United States (NORC Data Enclave in Chicago) and in several European countries. The case of Denmark was often cited in the early discussions on the issue of setting up such a mechanism in France by CREST researchers.
These systems differed in terms of their technical implementation and because of differing national data protection legislation, but their characteristics were quite similar nonetheless. In particular, they relied on software dedicated to remote access. Such systems require software to be installed at unmanaged workstations. They do not therefore offer sufficient security safeguards and are complex to implement: they often generate compatibility problems and conflicts relating to installation or maintenance.
Technology to Open Up New Possibilities
These technical constraints, combined with the restricted resources they initially had available, led INSEE and GENES to design specific equipment, thereby meeting the need for access while avoiding the drawbacks cited above. France therefore developed its own remote access system to enable researchers to access and exploit confidential data resulting primarily from public statistics.
Rather than using third-party software solutions, the CASD project team designed a computerized box specifically for this particular purpose of secure remote access to confidential data: the SD-Box. Once the user has an SD-Box, he/she only needs to log in to gain remote access to resources for processing confidential data stored at secure technical locations. The location at which the data are stored and processed is called a secure bubble (Figure 1). The principle of this bubble is that no data can leave it without an appropriate checking procedure occurring. User authentication is carried out with the aid of a mechanism based on a smart card containing a security certificate and a biometric fingerprint reader. In accordance with the law, this process has received CNIL authorisation. The secure bubble system ensures that the box is entirely isolated, and everything operates within a closed circuit with no external contact: this guarantees a high level of security from end to end.
Figure 1. The SD-Box and the Secure Bubble: A Patented Mechanism Designed by INSEE and Genes
Extremely Tight, Certified Security...
The technology developed has made it possible to attain ISO 27001 security certification, the international benchmark in the field. Security is extremely tight because each component of the access chain is fully managed. For example, to log in, the user must be located in an entity that is contractually linked to CASD and have an up-to-date, authenticated SD-Box, as well as a biometric smart card and a valid user account. Risk can be managed efficiently through a fully controlled technical and organisational mechanism. This unified architecture model has been approved by a number of security audits conducted by specialist companies, all of which have highlighted the mechanism's very high security level.
Finally, having technology providing authentication, confinement and traceability of data offers safeguards that are essential for the secure dissemination of confidential information. Confinement is a fundamental technical prerequisite if the traceability of data is to be ensured: once data are out in the open, unlimited copies can be made at almost zero marginal cost. It then becomes impossible to trace the data.
... Which Does Not Restrict Use on the Part of Researchers
Data confinement must not create conditions of use that are so restrictive as to substantially complicate certain tasks or even prevent them from being carried out. Researchers must have access to all necessary tools as well as appropriate computational power. The latter point has been of major concern to CASD since the architecture was designed and continues to be viewed as such in its daily management. Unlike an ordinary service providing a computation environment, as is observed on the cloud, for example, the confinement requirement of CASD does not offer the user the option of installing software him/herself. This is a significant constraint for researchers, and it must therefore be offset by a wide range of scientific software being made available to them with the option of adding to this if necessary at fairly short notice. The same applies to the computational power, which must be able to be configured on the basis of requirements, type of processing and volume of data.
In addition to technical security, legal security must also be considered. Before accessing the technical infrastructure, researchers must carry out a series of steps with respect to which they also incur personal liability.
Access to Data: The Administrative Procedure...
In the case of data resulting from public statistics or tax data, a research project must first be submitted to the Statistical Confidentiality Committee to enable a waiver of statistical or tax confidentiality to be obtained for the members of this project. For other administrative data, the data producer may also refer the matter to the committee.
In this regard, it must be remembered that researcher status certainly covers varying institutions (universities, institutes, etc.); however, unlike the status of public statisticians, the status is not defined by legislation (Acts or Decrees). This is why INSEE decided to set up the Statistical Confidentiality Committee and to entrust it with the task of checking that the project submitted can be classified as a scientific research project and that the project sponsors are researchers.
This committee, the members of which include data producers and representatives of researchers, considers a range of criteria defined in legislation, including the purpose of the proposed research, the relevance of the data to which access has been requested and the status of the researchers.
Following this investigation, the committee issues an opinion, which is followed by a decision of the administration of the National Archives or of the Minister for the Budget in the case of tax data. The CNIL is also involved in the case of personal data. Irrespective of the procedure, the agreement of the data producing department is required.
... Followed by an Enrolment Session
Before accessing CASD, researchers must, once authorised, undertake a training and awareness course known as an enrolment session at the premises of CASD in Palaiseau, during which they are made aware of the laws on the protection of confidentiality and of compliance with the rules on statistical confidentiality, mirroring training sessions that exist at various centres abroad.
The conditions for hosting the box are also explained, although these conditions are contained in a contract between CASD and the entity where the SD-Box is installed. For example, there is a requirement for the SD-Box to be installed in premises that can be locked, for the screen to be visible only to the user, etc. At the end of the session, the researcher obtains his/her access card and affixes his/her digital fingerprints to it, following a procedure overseen by CASD engineers (Figure 2).
Figure 2. The Researcher Incurs Liability When He/She Undertakes the Enrolment Sessions
Autonomy but Not Without Checks
After enrolment, researchers receive an SD-Box at their entity. Then, all they have to do is plug it into a screen and keyboard and connect it to the network. They can immediately start work and can carry out their analyses with real autonomy. The only exception is that users cannot technically retrieve any files from their box (Figure 3).
Figure 3. The Researcher Works Autonomously but Not Without Checks
The box is isolated from any other device. Files cannot be printed or transferred, and copy and paste operations cannot be undertaken. When their work is sufficiently advanced and they wish to retrieve the results files, researchers use a CASD programme. Amongst other things, this programme posts the results in an area of the server reserved for this purpose:
- Where a procedure known as a priori checking occurs, CASD managers check that the user has taken all necessary steps to ensure that the result files fulfil the confidentiality rules laid down by the data producer; if this is the case, they send the data to the user.
- There is an automatic procedure, without manual checking, for certain categories of data such as health data. Users are required to fill out an online form in which they confirm that they have adhered to the confidentiality rules. The file is then made available to them automatically, by means of a text message notification accompanied by a secure download link. A copy of these files is retained by CASD for a period of five years to enable ex post checks to be carried out.
In both cases, the checks relate exclusively to data confidentiality and never to the quality or scientific relevance of the work.
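The kind of check a user might run before requesting an export can be sketched as follows. This is only an illustration: the text does not describe the confidentiality rules themselves, which are laid down by each data producer, so the minimum cell size `MIN_UNITS_PER_CELL` and the helper names `small_cells` and `ready_for_export` are hypothetical.

```python
from collections import Counter

# Hypothetical threshold: the real rule is laid down by each data producer.
MIN_UNITS_PER_CELL = 3


def small_cells(rows, group_key, min_units=MIN_UNITS_PER_CELL):
    """Return the cells of a frequency table backed by fewer than min_units records."""
    counts = Counter(row[group_key] for row in rows)
    return {cell: n for cell, n in counts.items() if n < min_units}


def ready_for_export(rows, group_key, min_units=MIN_UNITS_PER_CELL):
    """True when no cell falls below the threshold."""
    return not small_cells(rows, group_key, min_units)


# Fictitious micro-data: one row per individual.
rows = [
    {"region": "A"}, {"region": "A"}, {"region": "A"},
    {"region": "B"}, {"region": "B"},
]

flagged = small_cells(rows, "region")
exportable = ready_for_export(rows, "region")
```

With these fictitious rows, region B rests on only two individuals, so the table would have to be aggregated or the cell suppressed before export. A check of this kind concerns confidentiality only, not the scientific content of the result.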
The Benefits for Data Producers...
The fact that secure dissemination of the data is undertaken by a third party, such as CASD, saves the producer, for whom this is generally only an ancillary task, from having to invest too heavily in infrastructure to offer this service. This also enables this service to be shared among several data producers to minimize investment and operating costs; it should be noted that there is no joining fee or operating cost for a data producer who wishes to make its data available in CASD for scientific research purposes. CASD also undertakes contracting with researchers (Figure 4). In many cases, producers no longer need to enter into an agreement with researchers to allow access to their data. Additionally, for producers who nonetheless wish to have an agreement with researchers, the agreement is significantly shorter because it no longer contains specific clauses concerning technical and security matters.
Since the implementation of the GDPR, security safeguards have become stringent legal requirements. A standardized model enables these formalities to be considerably reduced and therefore facilitates compliance with the requirements of processing registers or impact studies.
Figure 4. The Role Played by CASD as Intermediary Reduces the Burden on the Producer and Facilitates the Sharing of Sources
... and for Researchers
This configuration also offers the major advantage of making it possible to use data from several producers by joint use or by record linkage within a single working environment.
Researchers have been quick to take advantage of this: in 2013, 16% of projects were already using the sources of two or three producers, at a time when INSEE sources still constituted the majority of the data stored in CASD servers. Since then, with almost four times as many projects, the proportion using sources from several producers has increased to 52%, with projects now including sources from four or even five data producers. Currently, 171 projects jointly use the data of INSEE and DGFiP (Ministry of Finance).
This option represents a benefit for researchers, compared with a situation such as that of the United Kingdom, where the arrangements for accessing tax and public statistics data are different. In most comparable countries abroad, the development of a number of silo-based secure access centres has meant that it is complex for researchers to carry out record linkage or to jointly use several data sources from a number of producers. This difficulty is explained in particular by the fact that, historically, the first secure access centres abroad were created in the form of physical centres that later became remote access centres.
This model based on specialisation and sharing means that there is a clear advantage in using French data. To our knowledge, this asset is the only one of its kind, on this scale, in the world. This explains why an increasing number of European researchers are now requesting access to French data.
Record Linkage to Increase the Possibilities...
The data available in CASD are in and of themselves a very rich source of information for study and research. However, their informative and explanatory power is increased when record linkage occurs, specifically by adding the data collected for an individual in a file to the data available for the same individual in another file. Certain studies or evaluations can only be carried out on the condition that record linkage occurs first. This is the case, for example, when studying links between income from employment and replacement income (unemployment, daily allowances, health insurance, and pensions), between the educational career and professional career of an individual, or between health and work.
This record linkage of individual files is particularly useful and can be necessary for the design, implementation and evaluation of public policy in many fields. This linkage offers advantages over surveys that would be specifically designed to answer predetermined questions. Such surveys would be very costly with regard to the resources available to researchers and could clearly only cover a substantially more limited sample than administrative files, one feature of which is that they are often exhaustive with regard to the population in question.
Until now, however, there have been very few studies or research works in France based on the linkage of this type of file. Linkage that actually enables individuals appearing in two files to be matched is generally carried out using an identification number such as the NIR (registration number in the National Directory for the Identification of Natural Persons). This appears in a large number of files. However, its use was very restricted until 2017 because it required the prior publication of a decree at the Conseil d’État authorising the processing.
... Using the “Hashed NIR”
Since the Act for a Digital Republic, enacted in 2016, and its implementing decree, which appeared a year later, it has become legally possible to undertake data processing using a derivative of the NIR. The NIR is a partially meaningful indicator (sex, age, and place of birth), and when it was created, it was only expected to be used for social security purposes.
The Health Act extended this field to that of an individual's health identifier for their treatment for health and social care purposes. The CNIL nevertheless continues to be vigilant and prefers to limit its use. However, there are techniques allowing an NIR to be linked to another indicator known as the “hashed NIR” using an asymmetrical (one-way) process that gives each NIR a unique reference but does not allow the original NIR to be recomputed from the “hashed NIR”. Hashing the NIR allows the same services to be performed as using the NIR itself, but with a considerably lower risk of a person being identified.
To use the “hashed NIR”, a trusted third party should be used to manage the secret keys required for the cryptographic hash functions. A different secret key is created for each research project, generating different hashed NIRs for each linkage. The result of the linkage of two files brings with it a greater risk of re-identification than the initial files taken separately. This requires specific precautions to be taken in relation to its dissemination. This is why the law stipulates that a second third party, such as CASD, should be instructed to effectively carry out the record linkage using the “hashed NIR” and to make the data securely available once the data have been linked (Figure 5).
Figure 5. A First Trusted Third Party Produces a Secret Key, and CASD Carries Out the Record Linkage
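The principle of the “hashed NIR” can be illustrated with a minimal sketch. It rests on assumptions that are not in the text: HMAC-SHA256 is used here as the keyed hash function, the hash is applied before the files reach the party carrying out the linkage, and all function names, identifiers and values are fictitious. The actual functions and protocol used in practice are not described in this article.

```python
import hashlib
import hmac
import secrets
from typing import Dict


def new_project_key() -> bytes:
    """Role of the first trusted third party: one fresh secret key per project."""
    return secrets.token_bytes(32)


def hash_nir(nir: str, project_key: bytes) -> str:
    """Derive a keyed, one-way pseudonym from a NIR (HMAC-SHA256 is an assumption)."""
    return hmac.new(project_key, nir.encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymise(records: Dict[str, dict], project_key: bytes) -> Dict[str, dict]:
    """Replace each NIR by its hashed NIR (assumed to happen before linkage)."""
    return {hash_nir(nir, project_key): row for nir, row in records.items()}


def link(file_a: Dict[str, dict], file_b: Dict[str, dict]) -> Dict[str, dict]:
    """Role of the second third party: join two files on the hashed NIR only."""
    return {h: {**file_a[h], **file_b[h]} for h in file_a.keys() & file_b.keys()}


# Fictitious identifiers and values, for illustration only.
earnings = {"1850578006084": {"wage": 31000}, "2920175123456": {"wage": 42000}}
benefits = {"1850578006084": {"pension": 0}, "1600233999111": {"pension": 15000}}

key = new_project_key()
linked = link(pseudonymise(earnings, key), pseudonymise(benefits, key))
```

In this sketch the linking party only ever sees hashed identifiers, and because a new key is drawn for each project, the hashed NIRs produced for one linkage cannot be matched against those produced for another.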
An Innovative Approach to Certification of Research Based on Confidential Data...
Since confidential data have become secure open data, the question of reproducibility of computations, which underpins the scientific basis for the work, has arisen. Scientific journals ask researchers to file data and code so that the results published can be verified by third parties, which clearly poses problems in relation to confidential data (Pérignon et al., 2019).
Until now, the checking of results has had to occur after submission to the reviewers appointed by the publisher. For the reviewer to be able to carry out these checks, he/she must follow the accreditation procedure laid down by the Statistical Confidentiality Committee and travel to CASD premises to be enrolled. This process can take months. In practice, no reviewer has had the time or resources required to commit to this type of checking.
CASD has formed a partnership with a certification agency, CASCAD, to implement a solution for certification of the reproducibility of a research work based on confidential data.
The specialized and accredited agency checks the conformity of the results before they are submitted to the scientific journal. Certification is awarded following an evaluation process led by a specialist in the programming language used by the researcher on the basis of source data present in CASD and all computer code made available by the researcher. This should increase the likelihood of publication of an article in academic journals.
Thanks to the support of the Statistical Confidentiality Committee and of all producers, a pilot was able to start in April 2019 for a period of one year. The principle of this pilot is that after approval of the certifier, CASD gives it access to a secure environment for each request for the period required for the certification. Environments created in this way are closed at the end of each certification. The programmes and associated data are stamped and archived in encrypted form over a period of five years.
At a time when the issue of reproducibility of the results of research is often of primary importance in several fields, the rationalisation of certification offered by this service should allow for considerable progress to be made.
International Opportunities
On an international level, data producers have agreed to give transnational remote access to researchers from the European Union and EFTA member countries. Researchers located in the United States and Canada have recently been authorised to work on data from INSEE and the Ministry of Agriculture, subject to certain additional conditions.
CASD, well placed from this perspective at an international level, after having participated in the European project Data without Boundaries (DwB) is coordinating the implementation of a collaboration between the French, British, German and Dutch secure access centres (Figure 6): the aim of the IDAN (International Data Access Network) is to facilitate access to the secure data of these countries for researchers, avoiding them having to travel and enabling them to deploy data from several countries more easily from the location of each of the network partners. Therefore, by the end of 2019, it should be possible to access the data of all the centres from each of the other centres.
Set up belatedly compared with other major countries in Europe and North America, CASD is currently in a leading position due to both its technology and the amount of data it stores. It was therefore one of the four secure centres questioned by the American Commission responsible for the report on the use of open administrative data for the evaluation of public policy and research (American Congress, 2017). The increasing amount of data available in CASD demonstrates the trust that has been generated among producers, a sign that further developments are likely, notably in the field of health at a time at which the interplay between this field and the economic and social sciences field is increasingly important. Among researchers, the fears initially expressed in relation to the constraints to be overcome by a secure system have been broadly offset by the amount and quality of the new open data, enabling major projects to be undertaken using French data, including at an international level.
Figure 6. CASD Coordinates a Network Between the French, British, German and Dutch Centres
Box 2. Status and Funding of CASD
CASD is a facility which received the Équipex label (Équipement d’Excellence du programme Investissements d’avenir) in 2011 and has accordingly received funding for development up to 2019. The Équipex project tender rules notably required the implementation of invoicing arrangements to ensure the service was self-funding after 2019. CASD has met this requirement since 2012 by invoicing for its services. Average annual invoicing today is just over one thousand euros per user. Invoicing enables operating costs to be covered in part, the remainder being covered by contributions from project partners, namely INSEE, Genes, CNRS, the École polytechnique and HEC Paris.
At the end of 2018, the project partners decided to set up a dedicated structure for CASD by creating a Public Interest Group, a public law non-profit legal entity having administrative and financial autonomy. The conversion of CASD into a Public Interest Group in line with the Équipex consortium provides it with a more flexible modus operandi suited to its needs, whilst offering guarantees to public partners in its capacity as a public entity.
The creation of this structure has enabled the missions of CASD to be written into legislation, these being to set up and implement secure access services for confidential data for non-profit purposes relating to research, study, evaluation or innovation, activities classified as research services, primarily public in nature, and to optimise the technology developed to secure access to data in the private sector.
The recent implementation of the new European data protection regulation (GDPR) suggests that there will be an increased need for security for anyone wishing to work on highly detailed data. By adopting an autonomous structure, CASD will therefore be more capable of providing the service required for research.
Published on: 22/06/2021
It was succeeded by “Quetelet Progedo Diffusion”, the French network of data centres for the social sciences, which disseminated French data relating to human and social sciences to the research community.
French data protection authority (Commission Nationale de l’informatique et des Libertés, or CNIL).
Loi n° 51-711 du 7 juin 1951 modifiée sur l’Obligation, la coordination et le secret en matière de statistiques (Law No 51-711 of 7 June 1951 on legal obligation, coordination and confidentiality in the field of statistics, as amended).
The French Data Protection Act (Loi n° 78-17 du 6 janvier 1978 relative à l’Informatique, aux fichiers et aux libertés) is Law No 78-17 of 6 January 1978 on information technology, data files and civil liberties.
The Group of National Schools of Economics and Statistics (Groupe des Écoles Nationales d’Économie et Statistique, or GENES) is a public higher education and research institution attached to the ministry of economics and finance, subject to the technical supervision of INSEE.
National Opinion Research Center (Lane & Shipp, 2007).
The Centre for Research in Economics and Statistics (Centre de recherche en économie et statistique, or CREST) is a research centre attached to GENES.
The ISO 27001 standard allows companies and authorities to obtain certification confirming that an information security management system has been effectively implemented.
General Data Protection Regulation.
Law No 2016-1321 of 7 October 2016 (Article 34) amending Articles 22 and 25 of the Law on Information Technology and Civil Liberties, and Decree No 2016-1930 of 28 December 2016, simplifying the prior formalities relating to processing for statistical or research purposes.
Loi n° 2016-41 du 26 janvier 2016 de Modernisation de notre système de santé (Law No 2016-41 of 26 January 2016 on the Modernisation of our Health System) (Article 147).
CASCAD (Certification Agency for Scientific Code and Data) is a non-profit research support body, funded by various French institutions including the CNRS, HEC Paris and the University of Orléans.
See also the public consultation of researchers launched in July 2019 by the CNIL on data processing for scientific research purposes (CNIL, 2019).
European Free Trade Association.
Meet at least one of the following two conditions: either be a European citizen or have a European entity involved in the project.
This is a project which falls within the scope of the European Union’s seventh framework programme for 2007-2013 (FP7), for research and technological development (Alvheim et alii, 2012).
Further reading
ALVHEIM, Atle, BOND, Steve, GADOUCHE, Kamel, GÜRKE, Christopher and SCHILLER, David, 2012. Report on the state of the art of current SC in Europe. September 2012. Project DwB, funded under: FP7-Infrastructures
AMERICAN CONGRESS, 2017. The Promise of Evidence-Based Policymaking. [online]. September 2017. Report of the Commission on Evidence-Based Policymaking. [Accessed 8 October 2019]
CAPELLE-BLANCARD, Günther and BELLANDO, Raphaëlle, 2015. L’accès aux données bancaires et financières : une mission de service public. [online]. July 2015. Report of the CNIS working group. [Accessed 8 October 2019]
COMMISSION NATIONALE INFORMATIQUE ET LIBERTÉS (CNIL), 2016. Communication cadre relative au Big Data. [online]. 18 February 2016. [Accessed 8 October 2019]
COMMISSION NATIONALE INFORMATIQUE ET LIBERTÉS (CNIL), 2019. Régime juridique applicable aux traitements poursuivant une finalité de recherche scientifique (hors santé). [online]. [Accessed 26 September 2019]
GUESDON, Maxence, BENZENINE, Eric, GADOUCHE, Kamel and QUANTIN, Catherine, 2016. Securizing data linkage in french public statistics. [online]. 6 October 2016. BMC Medical Informatics and Decision Making. [Accessed 8 October 2019]
LANE, Julia and SHIPP, Stephanie, 2007. Using a Remote Access Data Enclave for Data Dissemination. In: The International Journal of Digital Curation. [online]. 27 July 2007. No 1, Volume 2, 2007, pp. 128-134. [Accessed 14 October 2019]
LE GLÉAU, Jean-Pierre, 2019. Le secret statistique. 2 May 2019. EDP Sciences, Collection Le monde des données. ISBN 978-2-75982-342-0
LE GLÉAU, Jean-Pierre and ROYER, Jean-François, 2011. Le centre d'accès sécurisé aux données de la statistique publique française : un nouvel outil pour les chercheurs. In: Courrier des statistiques. [online]. May 2011. No 130, pp. 1-5. [Accessed 14 October 2019]
LOTH, André et alii, 2015. Données de santé : anonymat et risque de ré-identification. [online]. 6 July 2015. Dossiers solidarité et santé, Drees, No 64. [Accessed 14 October 2019]
DE MONTJOYE, Yves-Alexandre et alii, 2018. On the privacy conscientious use of mobile phone data. [online]. 11 December 2018. Nature, Scientific Data, Comment. [Accessed 14 October 2019]
MOREL-À-L’HUISSIER, Pierre and PETIT, Valérie, 2018. Rapport d'information sur l’évaluation des dispositifs d’évaluation des politiques publiques. [online]. 15 March 2018. Assemblée Nationale. [Accessed 14 October 2019]
PÉRIGNON, Christophe, GADOUCHE, Kamel, HURLIN, Christophe, SILBERMAN, Roxane and DEBONNEL, Éric, 2019. Certify reproducibility with confidential data. In: Science. [online]. July 2019. [Accessed 8 October 2019]
SILBERMAN, Roxane, 1999. Les sciences sociales et leurs données. [online]. June 1999. Report, Ministère de l’éducation nationale, de la recherche et de la technologie. [Accessed 14 October 2019]
SILBERMAN, Roxane, 2013. Transnational Access to Official Micro-data: The Data without Boundaries European Network. In: KLEINER, Brian, RENSCHLER, Isabelle, WERNLI, Boris, FARAGO, Peter and JOYE, Dominique, 2013. Understanding Research Infrastructures in the Social Sciences. Seismo Press, Social Sciences and Social Issues AG, Zurich, pp. 47-66. ISBN 978-3-03777-133-4