AEA Data and Code Guidance


Guidance for authors wishing to create data and code supplements, and for replicators.

Frequently Asked Questions


… although some are not frequently asked, but might nevertheless be useful. Below are questions and answers in no particular order. Please be sure to check out the official list of FAQ first. If you have other questions that do not appear on either page, please create a new issue on GitHub, ask the question on Twitter, or send an email to the AEA Data Editor.

What is the DOI of my openICPSR deposit? I have not yet published it, but I am asked to add a citation to it in my manuscript.

Generically, each openICPSR project has a number (e.g., “109622”) that might show up in the right panel. [Image: project number in the right panel] Then …
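As a rough illustration of how a project number maps to a citable DOI, the sketch below assumes the pattern `10.3886/E<project>V<version>`, which is commonly seen on openICPSR deposits. The pattern is not stated in the text above, and the function names are hypothetical, so verify the result against the DOI displayed on your own deposit page.

```python
def openicpsr_doi(project_number, version=1):
    """Build a DOI string, ASSUMING the pattern 10.3886/E<project>V<version>."""
    project = str(project_number).strip()
    if not project.isdigit():
        raise ValueError(f"Project number must be numeric, got {project_number!r}")
    if int(version) < 1:
        raise ValueError("Version must be 1 or greater")
    return f"10.3886/E{project}V{int(version)}"


def doi_url(doi):
    """Turn a bare DOI into a resolvable https://doi.org/ link."""
    return f"https://doi.org/{doi}"


# Example with the project number mentioned above, version 1.
print(doi_url(openicpsr_doi("109622", 1)))
# prints https://doi.org/10.3886/E109622V1
```

The version number is kept as a separate argument because a revised deposit would receive a new version and therefore, under this assumed pattern, a different DOI.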


How do I cite my own data and code supplement?

If you created your own data (experiments, surveys, etc.), you should do one of two things:

Should we keep the data and directory structure as we used it ourselves or should we set up the files in a way that would make replication as straightforward as possible?

… the directory structure has gotten a little clunky over the years working on this project…

The Data and Code Availability Policy says:

“Files uploaded to the AEA Data and Code Repository should retain the file names as originally executed or used, their original file format, and their original “grouping” in terms of directories.”

You should feel free to reorganize, but you should ensure that, when we run the reorganized files, they produce the same results as those reported in the paper. Put differently, the numbers in the paper should be produced by the reorganized files. We are not trying to reproduce your historical path to the paper, only the current state of the paper.

Such restructuring may also be appropriate if you have a very sophisticated reproducible setup in your lab or group. A replicator does not need the dynamic setup scripts that are useful in a lab but unnecessarily complicate the process for someone outside it. You should attempt to simplify the final setup to make it easy for anybody to run this particular project, once.
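One way to read "easy for anybody to run, once" is a single driver script that runs every step in a fixed order from one root directory. The following is a minimal sketch under stated assumptions, not a required layout: the `code/` directory and the three script names are hypothetical placeholders.

```python
import subprocess
import sys
from pathlib import Path

# Hypothetical script names, listed in the order they must run.
STEPS = ["01_clean_data.py", "02_analysis.py", "03_tables_figures.py"]


def build_commands(root, steps=STEPS):
    """Return one command per step, each pointing at <root>/code/<step>."""
    code_dir = Path(root) / "code"  # assumed location of the programs
    return [[sys.executable, str(code_dir / step)] for step in steps]


def run_all(root):
    """Run every step in order and stop at the first failure."""
    for command in build_commands(root):
        print("Running:", " ".join(command))
        # check=True raises CalledProcessError if a step exits with an error.
        subprocess.run(command, check=True, cwd=root)
```

A replicator would then call `run_all(".")` from the top-level directory of the package. Keeping the root path as the only setting avoids the lab-specific configuration that the answer above advises against.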

The paper uses confidential data, covering [geography] for period [2001-2015]. The repository only contains code. Should the repository metadata be filled out for the data characteristics, even if the repository only has code?

[Answer from ICPSR] I think it still makes sense to complete as much metadata as possible. There are syntax files specific to the data available through a restricted-use agreement. The metadata are for increasing findability of the data collection, even if only the syntax files are in the repository. It’s useful to know that the data analyzed with the syntax files cover a specific geography for a specific time period.

I use confidential data. I am allowed to provide the data to the Data Editor for the purpose of replication, but you are not allowed to publish the data. How do I proceed?

Moved to main FAQ

We already use git/svn/GitHub/GitLab/BitBucket/etc. Do you facilitate integration of existing version-controlled code to the AEA repo? Or even planned functionality for linking out directly to such projects where they can be found online?

Moved to main FAQ

Some econometrics papers might be accompanied by (for example) an R or Stata package (perhaps published on CRAN or SSC). What about surfacing references to associated packages more prominently?

Moved to main FAQ

Do you support Docker/Jupyter/etc.?

Moved to main FAQ

I have been told by the Data Editor to remove PSID data from my submitted materials. What do I do?

Moved to main FAQ

Aligning AEA RCT Registry and AEA Data and Code Repository

The AEA RCT registry has a field that codes whether data associated with a registration is publicly available. Many authors will have this coded as “non public” prior to the publication of the replication package. When the replication package is about to be published on the AEA Data and Code Repository, this field needs to be updated. Only the authors of the registry can update this field. Steps to follow:

[EXTRA] You should also record the RCT DOI as a related publication of your deposit on the AEA Data and Code Repository:


[Screenshots: entering a related publication, selecting import via DOI, importing via DOI, selecting the relationship]

I was asked to modify files in my repository (not yet published) but I cannot upload or edit anything

When you first submitted to the AEA, your deposit became locked. There are two ways it can be edited:

You can “recall” the submission

On the right, under “Change status”, choose “Recall submission”.

You should then be able to upload and make changes.

Once you are done, choose “Re-submit” from the same menu.

The Data Editor staff can request revisions

If you received a notice via the openICPSR communication log requesting revisions, you should be able to make modifications as outlined in the request. You should be all set.

Again, once you are done, choose “Re-submit” from the same menu as above.

I was wondering whether (and how) I can update the published repository for our paper. I was contacted by a researcher who is doing a replication … couple of minor issues … forgotten to include two auxiliary datasets in the repository without which one of the programs does not run successfully.

First off, excellent initiative. Our team cannot always conduct a full replication (not all data may be accessible, not enough time, no access to the software). We appreciate it when others are able to do that work, and when authors then correct the replication package.

Updating the repository is actually very easy, and updates like these are exactly why we moved to the openICPSR repository for this. We have a policy on how such changes are recorded; see the Revision Policy.

1) Log back in to your openICPSR deposit. If you don’t remember where it is, simply click on the “Share Data” link on openICPSR, and it will show you your deposits.

2) You may need to click on “Create new version”. This depends on when the deposit was initially created (it applies to all deposits made after July 2020).

3) Update the README as per the policy. Authors should list the files added, any changes made to the programs, and ideally the reason why. No more than a paragraph.

4) Once you have updated all files (remember to update the README), choose “Submit to AEA” in “Change Status”.

5) The AEA Data Editor will check that the criteria of the Revision Policy are satisfied, but will conduct no other checks.

6) In most cases, the article will remain linked to the V1 deposit (“version of record”), but anybody navigating there will see a banner indicating that a more recent version exists (the V2 deposit).
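To help draft the README note in step 3, it can be useful to list which files were added or changed between the published copy and the revised one. The sketch below is one possible approach and not part of the openICPSR workflow: it assumes you have both copies as local directories, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_hashes(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def summarize_changes(old_root, new_root):
    """Return sorted lists of files added, removed, and modified between two copies."""
    old, new = file_hashes(old_root), file_hashes(new_root)
    common = set(old) & set(new)
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "modified": sorted(name for name in common if old[name] != new[name]),
    }
```

Calling `summarize_changes("v1", "v2")` on two local copies gives the raw lists. The reason for each change still has to be written by hand in the README paragraph.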

I have a paper that uses data from 14 different sources. How do I comply with the requirement for data citations and fit within page limits (for instance, in Papers and Proceedings)?

We understand page limits. Here are possible workarounds, in decreasing order of preference: