For a video tutorial on this process, see this YouTube video.
Start the deposit process
Go to the AEA Data and Code Repository and start the process.
Checklist for Metadata
- Title (Suggested: “Data and Code for: (NAME OF PAPER)”)
- If only data or only code is provided, adjust accordingly
- “Principal Investigators” (=Authors)
- These need not be in the same order as on the paper.
- These need not be the same people as on the paper (more or fewer is OK)
- Please ensure that all authors have affiliations (if not affiliated: “Independent Researcher”)
- Summary (Suggested: The abstract from the article and/or a note that this is data and/or code accompanying the article)
- Do not cite the article, or include “forthcoming”.
- Subject Terms (e.g., “Machine Learning”, “Randomized Control Trial”, “Nudges”, …)
- JEL Classification (can be the same as article)
- Manuscript Number (your manuscript tracking number as assigned by the editorial office, e.g., “AER-2019-0000”)
Most deposits will also need to provide the following metadata elements. In some cases, it may not make sense to fill them out (for instance, a laboratory experiment may have no meaningful “geographic coverage”). These elements contribute to better inclusion in search engines.
- Geographic coverage (e.g., “United States”, “Florida, U.S.”, “Indonesia”, …)
- Time period(s) (e.g., “1982-2008”)
- Collection date(s)
- Universe (e.g., “All households in Canada”, “Manufacturing establishments in Indonesia”, …)
- Data Type(s)
The following elements are suggested for certain types of data, and may not apply to all types of data.
- Data Source
- Units of Observation
- Any additional metadata elements
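The required items in the checklist above can be checked on a local draft before opening the deposit form. The following is a minimal sketch, not an openICPSR tool: the field names, the `check_metadata` helper, and the accepted "Data for:" and "Code for:" title variants are assumptions made for illustration.

```python
# Hypothetical pre-submission check. Field names mirror the checklist,
# not any openICPSR API.
REQUIRED_FIELDS = [
    "title",
    "principal_investigators",
    "summary",
    "subject_terms",
    "jel_classification",
    "manuscript_number",
]

# "Data for:" and "Code for:" are assumed variants for data-only or code-only deposits.
TITLE_PREFIXES = ("Data and Code for: ", "Data for: ", "Code for: ")


def check_metadata(meta: dict) -> list[str]:
    """Return a list of problems; an empty list means the draft passes."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not meta.get(field):
            problems.append(f"missing required field: {field}")

    title = meta.get("title", "")
    if title and not title.startswith(TITLE_PREFIXES):
        problems.append("title does not follow the suggested pattern")

    # Every author needs an affiliation ("Independent Researcher" if none).
    for person in meta.get("principal_investigators") or []:
        if not person.get("affiliation"):
            problems.append(f"no affiliation for {person.get('name', '<unnamed>')}")

    # The summary should not include "forthcoming".
    if "forthcoming" in meta.get("summary", "").lower():
        problems.append("summary mentions 'forthcoming'")
    return problems


draft = {
    "title": "Data and Code for: An Example Paper",
    "principal_investigators": [
        {"name": "A. Author", "affiliation": "Independent Researcher"}
    ],
    "summary": "Data and code accompanying the article.",
    "subject_terms": ["Nudges"],
    "jel_classification": ["C93"],
    "manuscript_number": "AER-2019-0000",
}
print(check_metadata(draft))  # prints []
```

The sketch only checks that fields are present; it does not validate JEL codes or the manuscript number format, which the editorial office assigns.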
Start by providing the metadata (descriptors) for the data and code you are uploading.
Details on Filling Out Metadata
Describe the project
- The title should be “Data and Code for: [Title of article]”
- The authors should be those who compiled the data and code. The names, and order of the names, may differ (if necessary) from the article.
- The summary can be short. It can include the abstract of the article itself. It does not need to include information on the related article (which has its own field).
- Identify any funding sources here - the information can be queried by some funders, and can assist with your award reporting.
Scope of project section
To fill out the required metadata elements Subject Terms, JEL Classification, and Manuscript Number, open the “Scope of Project” section:
Click on each + to open the related section:
- Authors MUST provide additional subject terms (keywords). You do not need to repeat JEL codes.
- Authors MUST provide JEL codes (under “Scope of Project”)
- Authors MUST provide the Manuscript Number (your manuscript tracking number as assigned by the editorial office, e.g., “AER-2019-0000”), as this will allow us to properly connect the repository with the manuscript.
- Where appropriate, authors are REQUIRED to define
- the geographical scope(s)
- the time period(s)
- the universe(s)
- data type(s)
- Most fields are repeatable; please enter as many values as needed. For instance, if subsets of the data cover different periods (e.g., 2004-2019), click “add value” next to the time period field for each time period.
- This information can also be provided when only code is made available.
- When only code is produced, authors should choose data type = program source code.
- Methodology is particularly relevant for survey or experimental data:
- response rates, sampling rates, etc.
- We ENCOURAGE all authors to define
- the unit of observation (e.g. individual, firm, establishment, county, country)
Related publications section
- The AEA editorial office will provide an entry for this field that links back to the published manuscript - authors do not need to add any reference to the manuscript anywhere in the deposit form (other than the Manuscript Number).
- Authors are encouraged to link back to working papers or related publications that have used or will use this (same!) data.
- If code is derived from or continues to be updated on a Git repository (GitHub, GitLab, Bitbucket, etc.), authors can link to it here.
- Future functionality will automatically list articles (including articles by third parties) that cite the data.
Once the metadata is completed, authors can upload files.
Upload files in the way you expect the files to be organized in order to run the code.
Checklist for Uploading
- README is in PDF or TXT format
- Do not upload a ZIP file - IMPORT IT!
- Do not upload manuscripts, appendices, responses to editors, etc.
- Directory structure does not contain redundant or superfluous directories
- Do not upload data that you do not have the rights to publish!
- If the UNCOMPRESSED contents of the deposit (the UNZIPPED size of the ZIP file) are larger than 30GB, please send an email to the AEA Data Editor to request an increase in the quota. Reasonable requests will be authorized. Size of the deposit is never a reason not to provide materials, as we have found solutions for every single case so far.
- If you have more than 1,000 files in your deposit, talk to us before uploading.
- The Import functionality can handle ZIP files, but cannot handle other compression formats (RAR, 7z, etc.). Please convert to ZIP before importing.
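Several items in this checklist can be verified on a local copy of the deposit before anything is uploaded. The sketch below is one way to do that; the `check_deposit` helper is hypothetical, and treating 30GB as 30 × 2^30 bytes is an assumption (the repository may count differently).

```python
from pathlib import Path

MAX_BYTES = 30 * 1024**3  # 30GB quota; binary gigabytes are an assumption
MAX_FILES = 1000
ARCHIVE_SUFFIXES = {".zip", ".rar", ".7z"}


def check_deposit(root: str) -> list[str]:
    """Return a list of problems found in a local copy of the deposit."""
    top = Path(root)
    files = [p for p in top.rglob("*") if p.is_file()]
    problems = []

    # README must be PDF or TXT and sit at the top level.
    has_readme = any(
        p.is_file()
        and p.stem.lower() == "readme"
        and p.suffix.lower() in {".pdf", ".txt"}
        for p in top.iterdir()
    )
    if not has_readme:
        problems.append("no README.pdf or README.txt at the top level")

    # No compressed archives should be visible in the deposit.
    for p in files:
        if p.suffix.lower() in ARCHIVE_SUFFIXES:
            problems.append(f"archive file present: {p.relative_to(top).as_posix()}")

    if len(files) > MAX_FILES:
        problems.append(f"{len(files)} files: contact the Data Editor before uploading")

    total = sum(p.stat().st_size for p in files)
    if total > MAX_BYTES:
        problems.append(f"{total} bytes uncompressed: request a quota increase")

    # A lone top-level directory with no sibling files is a redundant wrapper.
    entries = list(top.iterdir())
    if len(entries) == 1 and entries[0].is_dir():
        problems.append(f"redundant wrapper directory: {entries[0].name}")

    return problems
```

Calling `check_deposit("path/to/deposit")` returns an empty list when none of these checks fail. It cannot tell whether you have the rights to publish a file or whether a file is a manuscript or appendix; those items still need a manual review.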
Do not upload data that you do not want published!
- Contact the AEA Data Editor if you are able to share data for reproducibility checks that cannot be published.
Consult the Sharing restricted-access data with the AEA Data Editor page.
- If you can share the data more broadly, but want to control access, you must create a separate deposit for the parts of the data that are sensitive while keeping the code, and any non-sensitive data, in the “primary” deposit as described on the present page. Your README must describe how to combine the two deposits.
- It is possible to IMPORT a ZIP file (do NOT upload a ZIP file - no ZIP files should be visible in the deposit). Replicators will be downloading a ZIP file that preserves the directory structure.
- Please upload the README (in PDF or TXT) as the very first file - ensuring that it can be found easily by browsers of the archive.
- It is OK to upload Markdown or Word documents in addition to, but not instead of, the PDF or TXT version.
- Please upload the README to the root of the repository - any data and code can be in subdirectories, but it is easier to find the README if it is not in subdirectories.
- There should be no duplicate README files in the repository
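If you prepare a ZIP file for the Import function, the paths stored inside it decide where files land in the deposit. The sketch below, using Python's standard `zipfile` module, stores every path relative to the deposit directory so that the README sits at the archive root and no wrapper directory is added. The `build_import_zip` helper and its arguments are hypothetical names used for illustration.

```python
import zipfile
from pathlib import Path


def build_import_zip(deposit_dir: str, zip_path: str) -> list[str]:
    """Write deposit_dir's contents to zip_path and return the stored names."""
    src = Path(deposit_dir)
    out = Path(zip_path).resolve()
    names = []
    with zipfile.ZipFile(out, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*")):
            # Skip directories, and skip the archive itself if it lives inside src.
            if path.is_file() and path.resolve() != out:
                # Paths relative to src keep README at the archive root.
                arcname = path.relative_to(src).as_posix()
                zf.write(path, arcname)
                names.append(arcname)
    return names
```

Inspecting the returned names (or the archive's `namelist()`) before importing shows the directory structure that replicators will see after the import.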
Your deposit should have
- no redundant directories: the first thing you should see is the README and any subdirectories
- no ZIP files
- the same structure as when you last ran the code
[NOTE] The AEA staff will not re-arrange or otherwise restructure your deposit in any way. What you see in the deposit interface is what others will see once it is published.
You should see something like this:
data_directory/
prog_directory/
README.pdf
LICENSE.txt
LICENSE.txt is optional if you want to adopt one of the standard openICPSR licenses upon publication. (See our licensing guidance for other options.)
Submitting to the Data Editor
Once you are satisfied that all data files are present and complete, and that all metadata is satisfactory, with all required elements filled out, submit the deposit by changing its status:
Choose “Submit to AEA” under “Change Status”.
You will be presented with a page to confirm that you are going through with the submission. You will then be presented with a page asking various questions.
You should answer these questions with regard to the data in the deposit you are submitting. The answers should NOT consider any other data that may have been used as part of the manuscript’s analysis but that are not present in this particular deposit.
Contact the Data Editor if you have any questions or concerns.
Can individuals be identified?
The normal answer to this question is “No.”
Are the data sensitive?
The normal answer to this question is “No.”
You should answer the question about the type of access as a function of the earlier answers. “Public download” means users will be registered users of openICPSR and consent to the license (next question), without further controls. “Restricted access” means that data will be distributed through openICPSR’s Restricted Data Access Mechanism.
Choose a license
You should choose a license from the drop-down menu, or, if you have a custom license as part of the deposit, select “Other”. See our Licensing Guidance.
Press “submit.” Should you have forgotten something, you can “recall” the submission, fix the issue, and re-submit.
Citing Your Deposit
At present (2020), the openICPSR repository does not display the Digital Object Identifier (DOI) that will be associated with your deposit. However, it can be deduced easily.
- Each openICPSR project has a number (e.g., “109622”), which may show up in the right panel.
- If the openICPSR project has not been published, then the DOI will be “http://doi.org/10.3886/E” + number + “V1” (e.g., http://doi.org/10.3886/E109622V1).
- If the project has already been published before, and you are updating it, then the “V1” will be incremented. See our FAQ.
- You should then cite your deposit as follows (see AEA Sample References):
Romer, Christina D., and David H. Romer. 2010. “Replication data for: The Macroeconomic Effects of Tax Changes: Estimates Based on a New Measure of Fiscal Shocks.” American Economic Association [publisher], Inter-university Consortium for Political and Social Research [distributor]. https://doi.org/10.3886/E112357V1.
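The DOI rule described above is simple string concatenation, and can be sketched as follows. The `openicpsr_doi` function name is hypothetical, and the sketch uses the `https://` form that appears in the sample citation.

```python
def openicpsr_doi(project_number: int, version: int = 1) -> str:
    """Build the expected DOI from the project number and version."""
    if version < 1:
        raise ValueError("version numbers start at 1")
    # Pattern from the text: "10.3886/E" + number + "V" + version
    return f"https://doi.org/10.3886/E{project_number}V{version}"


print(openicpsr_doi(109622))     # prints https://doi.org/10.3886/E109622V1
print(openicpsr_doi(109622, 2))  # prints https://doi.org/10.3886/E109622V2
```

For a first publication, leave `version` at its default of 1; for an update to an already published project, pass the incremented version number.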