By enabling secondary analysis and the replication of studies, the sharing of research data is essential to the social sciences. Facilitating access to data has become a political objective, and a growing number of journals and funding agencies are adopting data availability policies. In this context, it is useful to understand the challenges and potential benefits of making survey data widely available.
An article recently published in Population describes the archival activities carried out by the Surveys Department of the French Institute for Demographic Studies (Institut national d’études démographiques, INED) to provide access to INED quantitative surveys. It examines INED’s activities in both the international and French contexts of access to quantitative social science survey data.
The development of institutional structures that provide access to survey data, known as data archives, began after the Second World War, especially in the field of political science. Survey data archives set up international networks and developed international standards for documenting surveys. Today, the Data Documentation Initiative (DDI) standard is widely used, for example by the Consortium of European Social Science Data Archives (CESSDA). DDI contains items ranging from the general description of an empirical study down to the level of each variable in a dataset. Survey data archives also developed software for publishing and analyzing data files online, such as Nesstar, which many European data archives use today.
France was a late starter in this domain, mainly due to a weak academic tradition in large-scale surveys and a tendency toward strong protection of personal data. Today, the Réseau Quetelet, created in the early 2000s, centralizes requests for access to data files from the Center for Socio-Political Data (CDSP), from sociodemographic surveys managed by INED, and from the National Archive of Data from Official Statistics (ADISP, a data service of the Centre Maurice Halbwachs). Another partner of the Réseau, the Secure Data Access Centre (CASD), manages access to very detailed data from official statistics.
Like all the partners of the Réseau Quetelet, the INED Surveys Department adopted DDI and Nesstar for disseminating its surveys. As of February 2016, the INED-Nesstar catalogue contains more than 240 survey records, of which 59 are available for access through the online portal of the Réseau Quetelet. Most INED surveys carried out since the institute's creation in 1945 (sometimes in collaboration with other public bodies) are listed and documented. They cover a broad spectrum of topics, including fertility, contraception, sexuality, marriage, migration, discrimination, gender, generation, inequality, health, ageing, housing, and employment. For some international surveys, the documentation is available in English. INED also manages the Nesstar catalogue of the Generations and Gender Surveys (GGS), for which requests for access are handled by the United Nations Economic Commission for Europe (UNECE) via an online platform.
Before a survey is made available for re-use, an “invisible” and meticulous task of reviewing and documenting the data is performed. Metadata must be provided according to the DDI standard. This involves gathering, selecting and synthesizing information about the survey itself (e.g. summary, producers, funding agencies, data collection methodology, sampling procedures, links to the questionnaire(s) and relevant bibliography) and its data file(s) (e.g. notes on missing data and derived variables). Additionally, each variable is documented in great detail (e.g. the question text, the universe, how derived variables are calculated), and the variables are reorganized to follow the order of the questionnaire. When the teams involved in data collection supply exhaustive documents and codebooks, as well as clean and anonymized data files, the work of data preparation and documentation is simplified and optimized.
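To make the two levels of documentation described above more concrete, here is a minimal Python sketch of a survey record and a check for missing documentation. It is only an illustration: the class names, field names, and the `documentation_gaps` helper are hypothetical, they are not DDI element names, and this is not the tooling used by INED or Nesstar. The example survey is fictional.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VariableRecord:
    """Variable-level documentation (hypothetical field names, not DDI elements)."""
    name: str
    questionnaire_position: int        # position of the question in the questionnaire
    question_text: Optional[str] = None
    universe: Optional[str] = None     # which respondents were asked the question
    is_derived: bool = False
    derivation: Optional[str] = None   # how a derived variable was calculated


@dataclass
class StudyRecord:
    """Study-level documentation (hypothetical field names, not DDI elements)."""
    title: str
    summary: Optional[str] = None
    producers: List[str] = field(default_factory=list)
    funding_agencies: List[str] = field(default_factory=list)
    collection_methodology: Optional[str] = None
    sampling_procedure: Optional[str] = None
    variables: List[VariableRecord] = field(default_factory=list)


def order_by_questionnaire(study: StudyRecord) -> List[VariableRecord]:
    """Return the variables sorted to follow the order of the questionnaire."""
    return sorted(study.variables, key=lambda v: v.questionnaire_position)


def documentation_gaps(study: StudyRecord) -> List[str]:
    """List the documentation items that are still empty for a study."""
    gaps = []
    study_items = {
        "summary": study.summary,
        "producers": study.producers,
        "funding_agencies": study.funding_agencies,
        "collection_methodology": study.collection_methodology,
        "sampling_procedure": study.sampling_procedure,
    }
    for label, value in study_items.items():
        if not value:
            gaps.append(f"study: missing {label}")
    for var in order_by_questionnaire(study):
        if not var.is_derived and not var.question_text:
            gaps.append(f"{var.name}: missing question_text")
        if not var.universe:
            gaps.append(f"{var.name}: missing universe")
        if var.is_derived and not var.derivation:
            gaps.append(f"{var.name}: missing derivation")
    return gaps


# Fictional survey used only to exercise the sketch.
example = StudyRecord(
    title="Example survey (fictional)",
    summary="A fictional survey used for illustration.",
    producers=["Example institute"],
    variables=[
        VariableRecord("Q2_AGE", 2, "How old are you?", "All respondents"),
        VariableRecord("AGEGRP", 3, universe="All respondents", is_derived=True),
        VariableRecord("Q1_SEX", 1, "What is your sex?"),
    ],
)

for gap in documentation_gaps(example):
    print(gap)
```

A check of this kind reflects the point made above: the more complete the documents and codebooks supplied by the data collection team, the fewer gaps remain for the archive to fill.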
As the trend toward open data strengthens, high-quality data documentation is crucial to ensuring that survey data are made available for scientifically valid re-use. To this end, collaboration with data producers is key to optimizing the work of survey data archives.
This post has been jointly written by Arianna Caporali and Amandine Morisset, engineer at the Surveys Department of the French Institute for Demographic Studies (INED).
Caporali, A., Morisset, A., and Legleye, S. (2015). Providing Access to Quantitative Surveys for Social Research: The Example of INED. Population-E 70(3): 537-566. DOI: 10.3917/pope.1503.0537.