Data and Code Availability Policy


It is the policy of the American Economic Association to publish papers only if the data and code used in the analysis are clearly and precisely documented and access to the data and code is nonexclusive to the authors.

Authors of accepted papers that contain empirical work, simulations, or experimental work must provide, prior to acceptance, information about the data, programs, and other details of the computations sufficient to permit replication, as well as information about access to data and programs.

The Editor should be notified at the time of submission if access to the data used in a paper is restricted or limited or if, for some other reason, the requirements above cannot be met.

If data or programs cannot be published in an openly accessible trusted data repository, authors must commit to preserving data and code for a period of no less than five years following publication of the manuscript and to providing reasonable assistance to requests for clarification and replication.

The AEA Data Editor will assess compliance with this policy, including by conducting reproducibility checks, will verify the accuracy of the information provided, and will assist the authors in achieving compliance with this policy, prior to acceptance by the Editor.

The American Economic Association endorses the Data and Code Availability Standard (DCAS) v1.0, which is used by multiple journals in economics, and this data and code availability policy is compatible with DCAS. The specific terms and requirements are described in more detail below.

Data

Data Availability Statement (DCAS #1)

A Data Availability Statement covering both the source data and any derivative data must be provided either in the README file (see below) or as part of an online appendix. This statement should contain detailed information about data provenance, i.e., how, where, and under what conditions an independent researcher can replicate the steps needed to access the original data, including any limitations and the expected monetary and time cost of data access. This information must be provided even when all data are included as part of the deposit.  

Raw Data (DCAS #2)

Raw data used in the research (primary data collected by the author and secondary data not otherwise available) must be included in the replication package unless the exceptions for non-public data apply (see below) or unless the exact extract of the raw data used in the analysis is published in a trusted repository that satisfies the FAIR data principles (see guidance) and a permanent identifier (e.g., DOI) is provided as part of the Data Availability Statement.

Analysis Data (DCAS #3)

Analysis data should be provided as part of the replication package unless the exceptions for non-public data apply (see below) or unless they can be fully reproduced from accessible data within a reasonable time frame and with reasonable resources.

Non-Public Data

If raw or analysis data cannot be published as part of a replication package or in an openly accessible trusted data repository, the reason(s) must be provided in the Data Availability Statement. Examples include confidential data with identifying information of persons or businesses and data subject to data use agreements or copyrights that prohibit redistribution. It is generally not acceptable that data be provided "upon request" if the request must be approved by the authors themselves. For non-public data, the author should indicate to the AEA Data Editor (in a form provided by the editorial office when requesting final manuscript files) whether a private (not to be published) version of the data can be provided directly to the Data Editor and/or a designated third-party replicator. Please do not upload data to the draft deposit that are not meant for publication.

Formats (DCAS #4)

The data files may be provided in any format compatible with any commonly used statistical package or software. Authors are encouraged to provide data files in open, non-proprietary formats.

Metadata (DCAS #5)

Each variable in the provided datasets should have a meaningful name or description (label), or authors may provide separate codebooks or similar metadata that describe the allowed values and their meaning. It is acceptable to reference publicly available documentation for these items.
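
As an illustration only, a short check along the following lines could flag variables that lack a label before the package is deposited. This is a minimal sketch in Python, not an AEA requirement; the function name `undocumented_variables`, the codebook layout (a dictionary keyed by variable name with a `label` entry), and the sample variables are all hypothetical.

```python
import csv
import io

def undocumented_variables(csv_text, codebook):
    """Return the column names in csv_text that have no non-empty label."""
    header = next(csv.reader(io.StringIO(csv_text)))
    missing = []
    for name in header:
        entry = codebook.get(name) or {}
        label = entry.get("label") or ""
        if not label.strip():
            missing.append(name)
    return missing

# Hypothetical two-variable dataset and a codebook that covers only one of them.
sample = "hh_id,inc_2019\n1,52000\n2,61000\n"
codebook = {"hh_id": {"label": "Household identifier"}}
print(undocumented_variables(sample, codebook))  # prints ['inc_2019']
```

An empty list would indicate that every variable in the file has a description; variables documented only in external, publicly available documentation would need to be handled separately.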

Data Citations (DCAS #6)

Please cite all data used in the paper and the approved online appendices as per AEA Reference Style.

Computer Code

A master script is strongly encouraged. When no master script is included, please provide sufficient and precise step-by-step instructions, allowing users to exactly reproduce the generated outputs with the least amount of effort.
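
One possible shape for such a master script is sketched below in Python. It is an illustration under stated assumptions rather than a prescribed layout: the script names in `STEPS`, the `code/` directory, and the function name `run_all` are hypothetical, and the same idea applies to a Stata do-file or an R script that calls the other programs in order.

```python
import subprocess
import sys
from pathlib import Path

# Hypothetical program names; replace with the package's actual programs.
STEPS = [
    "code/01_clean_data.py",
    "code/02_build_analysis_data.py",
    "code/03_estimate.py",
    "code/04_tables_figures.py",
]

def run_all(steps, root=".", runner=subprocess.run):
    """Run each program in order and stop at the first failure."""
    completed = []
    for step in steps:
        script = Path(root) / step
        if not script.exists():
            raise FileNotFoundError(f"Missing program: {script}")
        # check=True makes subprocess.run raise if the program exits with an error.
        runner([sys.executable, str(script)], check=True)
        completed.append(step)
    return completed
```

Calling `run_all(STEPS)` from the top-level directory of the package would then execute the whole pipeline with a single command. Stopping at the first error keeps a failed step from silently producing incomplete outputs.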

When additional packages or libraries are required to run the code, please provide a setup program, containing commands to download and install the necessary packages or libraries.
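
A setup program of this kind might look like the following Python sketch, which installs pinned package versions through pip. The package names and version numbers are placeholders chosen for illustration, and the function names `pip_command` and `install_all` are hypothetical; authors would list whatever their own code actually requires.

```python
import subprocess
import sys

# Illustrative pins only; list the packages and versions the analysis actually used.
REQUIREMENTS = {"pandas": "2.1.4", "statsmodels": "0.14.1"}

def pip_command(name, version):
    """Build the pip command that installs one pinned package."""
    return [sys.executable, "-m", "pip", "install", f"{name}=={version}"]

def install_all(requirements, runner=subprocess.check_call):
    """Install every pinned package, in alphabetical order."""
    for name, version in sorted(requirements.items()):
        runner(pip_command(name, version))
```

Running `install_all(REQUIREMENTS)` once before the other programs would install the listed versions. Pinning exact versions is a design choice that makes it more likely that a replicator obtains the same software environment as the authors.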

Code for Data Transformation and Data Cleaning (DCAS #7)

All programs used to generate the analysis data from raw data must be included, even if the raw data cannot be provided.

Code for Analysis (DCAS #8)

Programs that produce computational results such as estimation, simulation, model solution, and visualization must be included. Ideally, these programs reproduce all the computational exhibits in the paper and approved online appendices with minimal human intervention.

Formats (DCAS #9)

The programs may be provided in any format compatible with commonly used statistical packages or software. Should unusual or costly software be required, please notify the AEA Data Editor.

Software Citations

Citation of software packages (e.g., Stata packages, R libraries) is encouraged.

Supporting Materials

Instruments and Experiment Instructions (DCAS #10)

For papers collecting original data through surveys or experiments, the replication materials must include survey instruments or experiment instructions, computer code for experiment or survey collection mechanisms, and original instructions and details on subject selection, unless this information is already provided as part of the paper's appendix. Please see the supplementary Policy for Experimental and Survey Papers.

Ethics Approval (DCAS #11)

If applicable, approval by ethics boards—the Institutional Review Board (IRB) in the United States and equivalent institutions elsewhere—should be demonstrated by including the name of the ethics board and any approval or exemption record number in the title footnote and the author disclosure statement(s). Please see the Disclosure Policy.

Registration of Randomized Controlled Trials (DCAS #12)

It is the policy of the AEA that randomized controlled trials must be registered on the RCT Registry. Please cite all such registrations in the title footnote and elsewhere in the paper as appropriate. For more information, see the RCT Registry Policy.

Documentation (README) (DCAS #13)

Please include a README document in PDF format in the uppermost directory of the replication package. The README file should include the following information:

  • A Data Availability Statement as described above (or a reference to the appendix containing such information), and statements that the authors had legitimate access to the data for their research and that they have the rights to redistribute the data that are included in the replication package.
  • A description of the content of the replication package.
  • An indication of the software and hardware used in the package, including expected running time and specific requirements needed to successfully reproduce the results (software versions, libraries to be installed, etc.). If the requirements and execution time are heterogeneous across significant portions of the package, please indicate specific requirements and running times for each of the different parts.
  • Instructions on all the steps needed to run the computer code and reproduce all the results.
  • Information mapping programs to output and how each output relates to the exhibits in the paper and appendices.
  • Data citations that are not part of the paper itself.
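
The mapping of programs to output listed above could, for example, be kept in one place and rendered as a plain-text table for the README. The Python sketch below is one way to do this and is not a required format; the exhibit names, file paths, and the function name `mapping_table` are hypothetical.

```python
# Hypothetical entries; each row ties one exhibit to the program that creates it.
EXHIBITS = [
    {"exhibit": "Table 1", "program": "code/03_estimate.py", "output": "results/table1.tex"},
    {"exhibit": "Figure 2", "program": "code/04_figures.py", "output": "results/figure2.pdf"},
]

def mapping_table(rows):
    """Format the exhibit map as aligned plain-text columns for a README."""
    columns = ["exhibit", "program", "output"]
    # Each column is as wide as its longest entry, including the header.
    widths = {c: max([len(c)] + [len(r[c]) for r in rows]) for c in columns}
    lines = ["  ".join(c.capitalize().ljust(widths[c]) for c in columns)]
    for r in rows:
        lines.append("  ".join(r[c].ljust(widths[c]) for c in columns))
    return "\n".join(line.rstrip() for line in lines)

print(mapping_table(EXHIBITS))
```

Keeping the map as data rather than free text makes it easier to update the README when programs or exhibits are renumbered.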

The README must clearly indicate any omission of the required parts of the package due to legal requirements, limitations, or other approved agreements.

While not required, the use of the Social Sciences Data Editors' Template README is strongly encouraged.

Creating a Deposit for the Replication Package

Location (DCAS #14)

The use of the AEA Data and Code Repository is strongly encouraged. Other repositories and archives considered to be "trusted" may be acceptable (see guidance); please contact the AEA Data Editor with any questions. The Data Editor has automatic access to draft deposits created in the AEA Data and Code Repository. If depositing elsewhere, appropriate arrangements should be made to provide the Data Editor with access to the draft deposit.

License (DCAS #15)

Authors retain the copyright to their own data and code and convey any permissions or restrictions imposed on secondary data they include in the replication package. The authors must permit others to use all files in the deposit for the purpose of replication and are encouraged to permit unrestricted access for broader uses. These permissions are recorded in a license. A default license is provided in the AEA Data and Code Repository; other licenses are permissible after review by the Data Editor.

Version of Record

After the data and code deposit is accepted by the AEA Data Editor, it will become the version of record associated with the paper. Corrections and revisions are subject to the Policy on Data and Code Revisions.

Additional Guidance and Documentation

Detailed instructions for preparing and depositing replication packages are provided in the AEA Data Editor's step-by-step guide. Authors are encouraged to reach out to the AEA Data Editor if they believe their particular situation is not covered by the examples and guidance.

For more information, see Frequently Asked Questions.

This version (February 1, 2024) supplants all prior data policies.