Improving Metadata

Published:

Emails sent

Authors of articles may have received the following email:

Dear Dr. So-and-so,

Since July 16, 2019, the American Economic Association has used the AEA Data and Code Repository at openICPSR as the default archive for its supplements. The migration increases the findability of your data through a variety of federated search interfaces such as Google Dataset Search, the openICPSR search interface, and the general ICPSR search interface.

[more details]

For articles with more than one author, each co-author is receiving an identical email with an individualized link. Thank you for your effort!

These emails are legitimate.

Background

As the email states, the AEA announced on July 16, 2019 that it would start using the AEA Data and Code Repository for new replication package deposits. In October and December of 2019, several thousand supplements from previously published articles were migrated to the AEA Data and Code Repository.

Metadata

While the migration immediately made these replication packages findable through Google Dataset Search and the ICPSR search interfaces, the metadata for these deposits is limited to title, authorship, abstract, and JEL codes. Modern metadata standards allow for a much richer description of the materials, including geographic context, statistical coverage, and sampling techniques.

Request to Authors

Authors have the best knowledge of the data they used. We are thus asking authors to enter additional metadata through an online form. Once the project is completed, all metadata provided will be integrated into the openICPSR database and will become publicly available.

Frequently asked questions

Also see the Twitter thread on this topic.

FAQ 1

The deposit is not actually a data set, just code that simulates data.

Answer: It is still a deposit. Add “program source code” under “Data type”.

FAQ 2

The replication package does not contain data (confidentiality). I cannot share the data from that paper.

Answer: You can still provide metadata such as the geographic area and time period covered by the replication package.

FAQ 3

I don’t know what you mean by providing additional metadata for a set of *.do files and README with no data.

Answer: See [FAQ 1], and add information about the data sources to the metadata (it is not searchable if hidden in the README).
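
To make the answers above more concrete, a deposit's metadata can be pictured as a simple record. The sketch below is only an illustration: the field names (data_type, geographic_coverage, time_period, data_sources) and the helper added_fields are hypothetical and do not reflect the actual openICPSR form or schema, and the parenthesized values are placeholders. It shows that a code-only deposit can still carry descriptive entries beyond the four migrated fields.

```python
# Illustrative sketch only: field names are hypothetical, not the openICPSR schema.
# The four fields that the migration already filled in (per the text above).
MIGRATED_FIELDS = {"title", "authors", "abstract", "jel_codes"}


def added_fields(record: dict) -> list:
    """Return the names of non-empty fields beyond the four migrated ones."""
    return sorted(
        name
        for name, value in record.items()
        if name not in MIGRATED_FIELDS and value
    )


# A code-only deposit (for example, *.do files plus a README, no data).
# Parenthesized values are placeholders, not real metadata.
record = {
    "title": "(replication package title)",
    "authors": ["(author name)"],
    "abstract": "(abstract text)",
    "jel_codes": ["(JEL code)"],
    # Entries an author could add through the form:
    "data_type": ["program source code"],
    "geographic_coverage": ["(country or region)"],
    "time_period": "(start year) to (end year)",
    "data_sources": [],  # still empty: sources named only in the README
}

print(added_fields(record))
```

In this sketch, data_sources is left empty to mirror the situation in FAQ 3: until the sources are entered as metadata, they remain invisible to search even if the README names them.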
