Internship with the AEA Data Editor Summer 2024

The AEA Data Editor is happy to announce a pilot project for an internship with the AEA Data Editor. The collaborating institutions are Cornell University (AEA Data Editor), Wellesley College, Haverford College, University of Notre Dame, and University of Colorado Boulder.

NOTE: Interns cannot apply independently at this time. Applications for 2024 are closed. If you are an interested local organizer, please contact Lars Vilhuber for further details on how to join in 2025.

The internship will be a 10-12 week, part-time (20 hours per week) paid position.

Description

Goal: Ensure that supplementary materials for articles in a journal with a replication policy (a) are accessible, (b) reproduce the intended results, and (c) document results and findings.

Work description: The American Economic Association (AEA) monitors compliance with its Data and Code Availability Policy, under the leadership of the AEA Data Editor. LDI Replication Lab members will access pre-publication materials provided by authors, and assess how well these materials reproduce the results published in the manuscript or article. The provided materials and instructions will be assessed using a checklist. Authors’ instructions will be followed (if possible), and success or failure to (i) perform the analysis and (ii) replicate the authors’ results will be documented. Other related activities, such as literature search or tabulation of results, may also be assigned. Teamwork is encouraged, and activity will be supervised by a graduate student or faculty member. Team members must be at ease working in various computer environments (Windows Remote Desktop, local laptops) and with various software tools (statistical software, Git).

Examples of replication packages in economics can be found at the AEA Data and Code Repository.

What interns will learn: Interns will learn and observe parts of the scientific publication process. They will learn and practice the details of the process of reproducibility checking, and will experience the challenges of ensuring that data and code are available and functional. At the end of their internship, they will have run and learned to debug code for multiple papers (typically around 5-6), reviewed output, and prepared reports which will be read by senior economists throughout the world (after review by the Data Editor). They may encounter and learn about novel software and data sources, as well as how to run code on multiple platforms, including powerful Windows and Linux servers.

While the internship is new, the LDI Lab has been training undergraduate replicators for the past 6 years, and has employed more than 175 students. Students report that the experience is valuable in their future careers, including in the private sector, government, and as graduate students in academia.

Internship experience: While interns will be working remotely, they will be part of a team of interns and regular (undergraduate) staff in the LDI Replication Lab, meeting at least twice a week. They will be mentored by academic staff at their own institution as well. They will have the opportunity to interact with other interns and staff, and to learn from each other. They will also have the opportunity to interact regularly with the AEA Data Editor and his staff.

Required Qualifications/Skills/Experience: Some experience with empirical social science data analysis using statistical software is required. Knowledge of at least one of Stata, Matlab, R or SAS is required, as is familiarity with the Windows Desktop environment. Experience with Git and the command line (Linux, Mac, or PowerShell) is an asset. Applicants must be current students at a participating institution, residing in the United States.

Start and end dates: The internship will start after institution-specific final exams are over, so exact dates will vary. Broadly, the internship will take place between mid-May and mid-August, for 10-12 weeks (depending on the institution).

Requirements: Training is required as a condition of hiring. While employed, attendance (via Zoom) at two weekly meetings is required. Training will take place on Saturday, April 6, 2024 in person, and on several subsequent days (TBD) remotely. Live attendance is expected, plus some significant self-paced work.

Once training is completed, trainees will transition to the actual “replicator” activity.

Remuneration: An hourly rate commensurate with experience will be offered. Starting wage is $15.25 per hour, up to 10 hours per week while in session; up to 20 hours per week during the summer.

Organizational Details

Location of internship

The internship is remote.

Location and date of training day

Training in 2024 will take place at Wellesley College on April 6, 2024. Participants should plan to arrive the day before. Training will end by 6 PM at the latest, so it is possible to leave the same evening or the next day.

Agenda for training day

Time Location: see email
8:00 Breakfast
9:00 Introduction
10:00 Reproducible Practices, Template README
11:00 Data provenance, Data Citations
12:00 Lunch Break
13:00 What will you be doing in the Lab
14:00 Command Line/Git
15:00 A prototypical replication report
16:00 A walkthrough of the Workflow
17:00 How to run Stata
17:30 End (probably earlier), walk to restaurant
18:00 Restaurant

There will be various breaks throughout the day.

Additional details on training day

1. Accommodation

Wellesley cannot offer on-campus accommodation, unfortunately. The best option is the Verve Hotel in Natick, which is about a 10- to 15-minute ride to campus. Participants can get corporate rates if they book through Wellesley’s corporate link: https://www.hilton.com/en/hotels/bosqeup-the-verve-hotel-boston-natick/. The rate should be approximately $170 a night (as of February 2024). If they have any issues with booking, they can contact their local organizer.

2. Transportation options

From the airport to the Verve Hotel, people can either take a cab or use the Logan Express bus headed to Framingham (which is a cheaper option). They can purchase tickets on the bus or in advance here: https://loganexpress.com/buy-tickets.asp. They can then take a cab from the Logan Express bus station in Framingham to the Verve Hotel (less than a 5-minute ride) or arrange in advance for the hotel’s complimentary shuttle from the bus station by calling the front desk at 508-653-8800, extension 6.

3. Meals

We are grateful to Wellesley College for providing breakfast and lunch on campus on April 6 during the training, and dinner on TBD. If any participants have dietary restrictions, they should please let their local organizer know.

Follow-up activities: Test cases and peer mentoring

Test cases

Test cases are worked through, and jointly handled, including with repeated peer mentoring by senior (experienced) RAs in the Lab. Three (non-consecutive) days are set aside for the peer-mentoring and walk-throughs, but work on these test cases can be done any time (adapted to individual class and exam schedules). We strongly suggest doing these immediately after the in-person training, however, as experience has shown that those who delay too long will ultimately struggle later in their work.

  • each test article should take you no more than 2-3 hours of work (decreasing as you progress)
    • Test article 1: Data is available, Stata
    • Test article 2: No data is available, only “dry” analysis
    • Test article 3: Data is available, Matlab

Live Peer group mentoring with existing replicators

(all sessions on Zoom, see calendar invite for details)

  • April 10 @ 6PM Stata : Kareena, Gary
  • April 15 @ 6PM Confidential data : Ilona, Bianca
  • April 17 @ 6PM Matlab : Jessica, Gary

Follow-up meetings:

Meetings with Lars serve both as follow-up to any further questions interns have after the peer mentoring sessions and as a forum for any other relevant discussions.

  • April 12 @ 10 AM
  • April 16 @ 10 AM
  • April 19 @ 10 AM (POSTPONED)

Start of work

The actual work in the Lab will start at various dates, depending on your institution’s exam schedule and other details. Contact your local coordinator for the exact start date if you are unsure.

Questions and Answers

Will I need to find accommodation in Ithaca?

The internship is remote. You will reside either at your home college, or wherever you are during your summer break. The only condition is that you are present within the United States.

Can I participate and have another internship?

The AEA Data Editor internship pays for 20 hours, but many participating colleges top that up with a research-oriented job. It is technically feasible to hold another job in addition to the internship, but you will need to be very good at time management. Please contact us for further details.

Can I participate remotely for the training day?

No. In our experience, remote participants are far more likely to abandon the internship, and we require that you participate in person.