Last updated: April 2, 2026

All of the following scripts are available either in the $HOME/bin directory once you have run the bash setup, or in the tools/ folder of each repository.

Data Download and Synchronization Tools

download_box_private.py

Python script for downloading files from private Box folders using JWT authentication.

Links: Source | Help

download_dv.py

Python script for downloading complete datasets from Dataverse repositories as ZIP archives, identified by their DOI.

Links: Source | Help

download_openicpsr-private.py

Python script for downloading files from private (unpublished) openICPSR deposits with authentication.

Links: Source | Help

download_openicpsr-public.py

Python script for downloading files from public (published) openICPSR deposits.

Links: Source | Help

download_osf.sh

Bash script for downloading all files and directories from Open Science Framework (OSF) projects.

Links: Source | Help

download_sivacor.py

Python script that downloads SIVACOR submission artifacts, handles ZIP extraction, and commits the results to a git branch.

Links: Source | Help

get_sivacor_info.py

Python script for extracting computing environment and timing information from SIVACOR JSONLD files. Can output to stdout or automatically update replication reports.

Links: Source | Help

download_zenodo_draft.py

Python script for downloading files from Zenodo draft deposits that require authentication.

Links: Source | Help

download_zenodo_public.sh

Bash script for downloading files from public Zenodo repositories using the zenodo_get tool.

Links: Source | Help

list_box_files.py

Lists files from a private Box folder using JWT authentication and outputs results to a text file.

Links: Source | Help

sync-codeocean.sh

Synchronizes CodeOcean capsules with local repositories, maintaining both live Git clones and static copies.

Links: Source | Help

zenodo_get_ci.py

CI-friendly wrapper for zenodo_get that suppresses the animated progress bar in automated pipelines.

Links: Source | Help

File Format Conversion Tools

convert_eps.sh

Bash script that recursively converts EPS (Encapsulated PostScript) files to PNG format using ImageMagick.

Links: Source | Help

convert_graphs.do

Stata script that converts GPH graph files to PDF and PNG formats.

Links: Source | Help

csv2md.py

Python tool for converting arbitrary CSV files to Markdown format.

Links: Source | Help
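
As an illustration of the kind of conversion csv2md.py performs, here is a minimal sketch. It is not the script's actual code: the function name `csv_to_markdown` and the choices to treat the first row as the header and to escape pipe characters are assumptions made for this example.

```python
import csv
import io


def csv_to_markdown(text: str) -> str:
    """Render CSV text as a Markdown table (hypothetical helper, not from csv2md.py)."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ""
    # Pad short rows so every line has the same number of columns.
    width = max(len(row) for row in rows)

    def fmt(row):
        # Escape pipes so cell content cannot break the table layout.
        cells = [cell.replace("|", "\\|") for cell in row]
        cells += [""] * (width - len(cells))
        return "| " + " | ".join(cells) + " |"

    header, *body = rows  # assumption: the first row holds the column names
    separator = "| " + " | ".join(["---"] * width) + " |"
    return "\n".join([fmt(header), separator] + [fmt(row) for row in body])
```

Calling `csv_to_markdown("a,b\n1,2\n")` yields a three-line table: the header row, a `---` separator row, and one data row.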

matlab_convert_fig.m

MATLAB script that converts .fig files to PNG format, processing all figure files in the current directory.

Links: Source | Help

matlab_convert_mat2csv.m

MATLAB script that converts .mat files to CSV format, extracting all variables as separate CSV files.

Links: Source | Help

mk_tex_table.sh

Converts standalone LaTeX table files to complete PDF documents with comprehensive formatting packages.

Links: Source | Help

Checking and Validation Tools

These are usually not run directly, but are invoked by the Pipelines.

Stata_scan_code/

Directory containing Stata code scanning tools and packages for analyzing Stata scripts and dependencies.

Links: Source | Help

check_ipynb_order.py

Python script that verifies Jupyter notebook code cells were executed in sequential order for reproducibility.

Links: Source | Help
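
The idea behind an execution-order check can be sketched as follows. This is an illustrative example, not the code of check_ipynb_order.py: the function names are hypothetical, and the rule applied here (execution counts of code cells must strictly increase from top to bottom, with never-executed cells ignored) is an assumption about what "sequential order" means.

```python
import json


def execution_counts(notebook: dict) -> list:
    """Return the execution_count of each code cell, in document order."""
    return [
        cell.get("execution_count")
        for cell in notebook.get("cells", [])
        if cell.get("cell_type") == "code"
    ]


def executed_in_order(notebook: dict) -> bool:
    """True if the executed code cells ran top to bottom.

    Assumption: cells that were never executed (count is None) are skipped.
    """
    counts = [c for c in execution_counts(notebook) if c is not None]
    return all(a < b for a, b in zip(counts, counts[1:]))


def check_file(path: str) -> bool:
    """Load an .ipynb file (which is plain JSON) and check its cell order."""
    with open(path, encoding="utf-8") as handle:
        return executed_in_order(json.load(handle))
```

A notebook whose code cells carry counts 1, 2, 3 passes; one with counts 3, 1 does not, which suggests a cell was re-run out of order.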

check_r_deps.R

R script that finds and outputs all R package dependencies as CSV from a project directory.

Links: Source | Help

check_rds_files.R

R script for checking RDS files (R data files), designed to run automatically without manual changes.

Links: Source | Help

doi_validator.py

Python module to validate DOI links and convert between formats for Harvard Dataverse DOIs.

Links: Source | Help
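
A DOI format conversion of the kind described can be sketched as below. This does not reproduce doi_validator.py: the function names, the list of accepted prefixes, and the simplified syntax pattern are assumptions for illustration, and the check is purely syntactic (it does not resolve the DOI over the network).

```python
import re

# Simplified DOI shape: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Prefixes commonly seen in front of a DOI (assumed list, lowercase).
PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def bare_doi(value: str) -> str:
    """Strip any known prefix and return the bare DOI, or raise ValueError."""
    value = value.strip()
    lowered = value.lower()
    for prefix in PREFIXES:
        if lowered.startswith(prefix):
            value = value[len(prefix):]
            break
    if not DOI_PATTERN.match(value):
        raise ValueError(f"not a syntactically valid DOI: {value!r}")
    return value


def as_url(value: str) -> str:
    """Return the https://doi.org/ link form."""
    return "https://doi.org/" + bare_doi(value)


def as_doi_scheme(value: str) -> str:
    """Return the doi: form."""
    return "doi:" + bare_doi(value)
```

For instance, `as_url("doi:10.7910/DVN/EXAMPLE")` and `as_doi_scheme("https://doi.org/10.7910/DVN/EXAMPLE")` convert between the two forms; the identifier `EXAMPLE` is a placeholder, not a real dataset.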

find_cran_date.py

Python tool that determines the minimum CRAN snapshot date for pinned R packages and reports matching Docker images.

Links: Source | Help

install.R

R package installation utility with version control; provides pkgTest() function to install and require packages.

Links: Source | Help

scan_pkg.jl

Julia package scanner that identifies and lists packages used in Julia files via using and import statements.

Links: Source | Help
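
The scanning approach can be illustrated with a short sketch. It is written in Python for consistency with the other examples on this page and is not a port of scan_pkg.jl; the function name and the handling of edge cases (relative modules such as `.Local` are skipped, submodules are reduced to their top-level package) are assumptions.

```python
import re

# A line that starts with "using" or "import", capturing the rest of the line.
STATEMENT = re.compile(r"^\s*(?:using|import)\s+(.+)$", re.MULTILINE)


def julia_packages(source: str) -> list:
    """Return the sorted top-level package names referenced in Julia source."""
    packages = set()
    for match in STATEMENT.finditer(source):
        clause = match.group(1).split("#", 1)[0]  # drop a trailing comment
        clause = clause.split(":", 1)[0]          # "Plots: plot" -> "Plots"
        for item in clause.split(","):
            words = item.split()                  # "Statistics as S" -> first word
            if not words:
                continue
            name = words[0].split(".")[0]         # "A.B" -> "A"; ".Local" -> ""
            if name:
                packages.add(name)
    return sorted(packages)
```

A regex pass like this is only a heuristic: it does not parse Julia, so statements inside strings or block comments would also be picked up.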

summarize_data.py

Python script that summarizes data metadata by directory level, aggregating file sizes from a CSV file.

Links: Source | Help
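
Aggregating sizes by directory level might look like the sketch below. The input layout is an assumption: the column names `path` and `size` are hypothetical, as is the function name, and the real script's CSV format may differ.

```python
import csv
import io
from collections import defaultdict
from pathlib import PurePosixPath


def sizes_by_level(csv_text: str, level: int = 1) -> dict:
    """Sum file sizes per directory, truncated to `level` path components.

    Assumes a CSV with hypothetical columns "path" and "size" (bytes).
    Files at the top of the tree are grouped under ".".
    """
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        parts = PurePosixPath(row["path"]).parent.parts[:level]
        key = "/".join(parts) or "."
        totals[key] += int(row["size"])
    return dict(totals)
```

With `level=1` every file under `data/` is counted together; with `level=2`, `data/raw` and `data/clean` are reported separately.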

Ad-hoc Data Analysis and Comparison Tools

compare_manifests.py

Python script that compares two SHA256 manifest files to identify overlaps in filenames, checksums, and complete records.

Links: Source | Help
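
The three kinds of overlap can be sketched with set intersections. This is an illustrative example rather than the code of compare_manifests.py; it assumes manifests in the usual `sha256sum` text layout (checksum, whitespace, filename), and the function names are hypothetical.

```python
def parse_manifest(text: str) -> set:
    """Parse sha256sum-style lines into a set of (checksum, filename) pairs."""
    records = set()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        checksum, _, name = line.partition(" ")
        # sha256sum marks binary mode with a leading "*" on the filename.
        records.add((checksum, name.strip().lstrip("*")))
    return records


def compare(a: set, b: set) -> dict:
    """Return overlaps by filename, by checksum, and by complete record."""
    return {
        "filenames": {name for _, name in a} & {name for _, name in b},
        "checksums": {digest for digest, _ in a} & {digest for digest, _ in b},
        "records": a & b,
    }
```

A filename that appears in both manifests but not in `records` points to a file whose content changed; a checksum overlap without a filename overlap points to a renamed or duplicated file.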

generate_png_diff.sh

Generates visual diffs for modified PNG images by comparing them against their git repository versions.

Links: Source | Help

summarize_diff_stats.py

Parses and summarizes statistical differences from files, extracting numerical values and filenames.

Links: Source | Help

Pipeline and Workflow Tools

pipeline-steps1-4.sh

Combined pipeline script that handles multiple steps of the openICPSR download process.

Links: Source | Help

run_scanner.sh

Runs the Stata code scanner on an ICPSR directory: reads the configuration and executes the scanning operations.

Links: Source | Help

sbatch-shell.sh

SLURM batch job script template for running Stata jobs on HPC clusters with resource specifications.

Links: Source | Help

Jira Integration Tools

These tools integrate with the AEA Data Editor Jira system for task tracking and metadata extraction.

jira_add_comment.py

Posts comments to Jira issues using the Jira API with support for wiki markup formatting.

Links: Source | Help

jira_find_task_by_icpsr.py

Finds the highest-numbered Jira Task issue for a given openICPSR project ID.

Links: Source | Help

jira_get_info.py

Retrieves various information fields from Jira issues including DOIs, openICPSR URLs, and SIVACOR IDs.

Links: Source | Help

Configuration and Setup Tools

linux-system-info.sh

System information collector that displays OS details, processor info, and memory availability.

Links: Source | Help

update_tools.sh

Tool updater that downloads the latest replication template files from GitHub and copies them to the template directory.

Links: Source | Help

Document Processing Tools

prepare-revision.py (inactive)

Processes Markdown files by replacing code block content in Appendix sections while maintaining headers.

Links: Source | Help

Configuration Files

requirements-scanner.txt

Python requirements file for scanner tools.

Links: Source

requirements.txt

Python requirements file for general tools.

Links: Source

template.tex

LaTeX template file for document generation.

Links: Source