download_from_jira_url.py - Download replication packages from Jira-specified URLs

Description¶

This script orchestrates downloads from supported repositories (Dataverse, Zenodo, and, once implemented, OSF) using the replication package URL stored in a Jira issue. It first checks whether the issue has an openICPSR deposit, then detects the repository type from the URL and calls the appropriate download tool with the correct parameters.

Usage¶

python3.12 tools/download_from_jira_url.py <issue-key>
python3.12 tools/download_from_jira_url.py -h|--help

Arguments¶

issue-key (Required) - Jira issue key, case-insensitive (e.g., AEAREP-8983 or aearep-8361)
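As an illustration of what case-insensitive handling of the issue key might look like, here is a minimal sketch. The function name `normalize_issue_key` and the key pattern are assumptions made for this example; the script's actual validation may differ.

```python
import re

# Assumed key shape: a project prefix, a hyphen, and a number (e.g. AEAREP-8983).
_ISSUE_KEY = re.compile(r"[A-Za-z][A-Za-z0-9]*-\d+")


def normalize_issue_key(raw: str) -> str:
    """Return the issue key in upper case, or raise ValueError if malformed."""
    key = raw.strip()
    if not _ISSUE_KEY.fullmatch(key):
        raise ValueError(f"Not a Jira issue key: {raw!r}")
    return key.upper()
```

With this sketch, `aearep-8361` and `AEAREP-8361` both resolve to the same key.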

Examples¶

# Download replication package for a Jira issue
python3.12 tools/download_from_jira_url.py AEAREP-8983

# Show help
python3.12 tools/download_from_jira_url.py --help

Workflow¶

The script follows this sequence:

1. Check openICPSR: Verifies whether the openICPSR Project Number field is populated in Jira
- If yes: exits with code 2 (openICPSR handled separately)
- If no: proceeds to the next step
2. Retrieve URL: Gets the “Replication package URL” field from the Jira issue
3. Detect Repository: Analyzes the URL to determine the repository type:
- Dataverse: URLs containing “DVN” or “dataverse”
- Zenodo: URLs containing “zenodo”
- OSF: URLs containing “osf.io”
4. Download: Calls the appropriate download tool:
- Dataverse: download_dv.py (extracts DOI)
- Zenodo draft: download_zenodo_draft.py (for /deposit/ URLs)
- Zenodo public: download_zenodo_public.sh (for /record/ or /records/ URLs)
- OSF: download_osf.sh (if available)
5. Git Integration: Handles staging and committing in CI mode
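The detection step above can be sketched as a simple substring check. This is only an illustration of the rules listed in the workflow: the function name `detect_repository`, the return values, and the case-insensitive comparison are assumptions, not the script's confirmed implementation.

```python
from typing import Optional


def detect_repository(url: str) -> Optional[str]:
    """Classify a replication package URL by the substrings it contains."""
    lowered = url.lower()  # assumption: matching is case-insensitive
    if "dvn" in lowered or "dataverse" in lowered:
        return "dataverse"
    if "zenodo" in lowered:
        return "zenodo"
    if "osf.io" in lowered:
        return "osf"
    return None  # unsupported repository
```

A caller would map the returned label to the matching download tool and treat `None` as the "unsupported repository" error case.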

Repository Detection¶

Dataverse¶

Recognizes URLs matching:

https://doi.org/10.7910/DVN/XXXXX
https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/XXXXX
Any URL containing “DVN” or “dataverse”

Extracts DOI and passes to download_dv.py.
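One way the DOI extraction could work is a regular expression applied to the URL after percent-decoding. The sketch below is an assumption for illustration (the name `extract_dataverse_doi` and the pattern are not taken from the script); it returns the `doi:`-prefixed form shown in the URL parsing examples.

```python
import re
from typing import Optional
from urllib.parse import unquote

# A DOI starts with "10.", a registrant code, a slash, then a suffix.
# The suffix stops at whitespace or at URL delimiters (?, &, #).
_DOI = re.compile(r"10\.\d{4,9}/[^\s?&#]+")


def extract_dataverse_doi(url: str) -> Optional[str]:
    """Return 'doi:<DOI>' found in the URL, or None if no DOI is present."""
    match = _DOI.search(unquote(url))  # unquote handles doi%3A10.7910%2F...
    return f"doi:{match.group(0)}" if match else None
```

Returning `None` rather than raising lets the caller decide how to report a URL from which no DOI can be extracted.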

Zenodo¶

Recognizes URLs matching:

https://zenodo.org/record/12345678 or https://zenodo.org/records/12345678 (public record)
https://zenodo.org/deposit/12345678 (draft deposit)
10.5281/zenodo.12345678 (DOI format)

Detects draft vs. public based on /deposit/ in URL path.
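A minimal sketch of the record ID extraction and the draft/public distinction follows. The function name `parse_zenodo` and the regular expression are hypothetical; only the rule "a /deposit/ path means draft" comes from the description above.

```python
import re
from typing import Optional, Tuple

# Matches the numeric ID after /record/, /records/, /deposit/, or "zenodo." (DOI form).
_ZENODO_ID = re.compile(r"(?:/(?:records?|deposit)/|zenodo\.)(\d+)")


def parse_zenodo(url: str) -> Optional[Tuple[str, str]]:
    """Return (record_id, 'draft' or 'public'), or None if no ID is found."""
    match = _ZENODO_ID.search(url)
    if match is None:
        return None
    kind = "draft" if "/deposit/" in url else "public"
    return match.group(1), kind
```

In this sketch the DOI form (`10.5281/zenodo.<id>`) is always treated as public, matching the URL parsing examples.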

OSF¶

Recognizes URLs containing:

osf.io

Note: OSF download not yet fully implemented in this script.

Output Structure¶

Downloads create repository-specific directories:

Dataverse: dv-[PUBLISHER]-[DATASET_ID]/
Zenodo: zenodo-[RECORD_ID]/
OSF: osf-[PROJECT_ID]/ (when implemented)

Exit Codes¶

0: Success - download completed
1: Error - missing arguments, Jira errors, download failures, unsupported repository
2: openICPSR deposit found (intentional skip - handled separately)
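A calling script may want to treat exit code 2 as a skip rather than a failure. The wrapper below is a sketch under that assumption; `describe_exit_code` and `run_download` are hypothetical helper names, and it runs the tool with the current interpreter rather than an explicit `python3.12`.

```python
import subprocess
import sys

EXIT_MEANINGS = {
    0: "download completed",
    1: "error",
    2: "skipped: openICPSR deposit is handled separately",
}


def describe_exit_code(code: int) -> str:
    """Translate an exit code into a short status string."""
    # Other codes may be propagated from an underlying download tool.
    return EXIT_MEANINGS.get(code, f"other exit code {code}")


def run_download(issue_key: str) -> int:
    """Run the download script for one issue and report how it ended."""
    result = subprocess.run(
        [sys.executable, "tools/download_from_jira_url.py", issue_key],
        check=False,  # inspect the return code instead of raising
    )
    print(f"{issue_key}: {describe_exit_code(result.returncode)}")
    return result.returncode
```

For example, `run_download("AEAREP-8983")` would print the status and return the raw exit code, so the caller can continue on 0 or 2 and stop on anything else.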

Prerequisites¶

Required Environment Variables¶

JIRA_USERNAME - Your Jira email address
JIRA_API_KEY - API token from https://id.atlassian.com/manage-profile/security/api-tokens

Optional Environment Variables¶

ZENODO_ACCESS_TOKEN - Required for Zenodo draft deposits
CI - Set in CI/CD environments for automatic git commits
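Before invoking the script, it can help to confirm that the variables listed above are set. The following is an illustrative pre-flight check; the function name `missing_env_vars` and the `zenodo_draft` flag are assumptions introduced for this sketch.

```python
from typing import List, Mapping

REQUIRED = ("JIRA_USERNAME", "JIRA_API_KEY")


def missing_env_vars(env: Mapping[str, str], zenodo_draft: bool = False) -> List[str]:
    """List required variables that are unset or empty in the given environment."""
    names = list(REQUIRED)
    if zenodo_draft:
        # The token is only needed when downloading a Zenodo draft deposit.
        names.append("ZENODO_ACCESS_TOKEN")
    return [name for name in names if not env.get(name)]
```

Calling `missing_env_vars(os.environ)` (after `import os`) returns an empty list when both Jira variables are present.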

Required Tools¶

tools/jira_get_info.py with ‘replicationurl’ keyword support
Download tools for supported repositories:
- tools/download_dv.py (Dataverse)
- tools/download_zenodo_draft.py (Zenodo drafts)
- tools/download_zenodo_public.sh (Zenodo public)
- tools/download_osf.sh (OSF, optional)

Git Integration¶

In CI Environments¶

When the CI environment variable is set:

Automatically stages downloaded files with git add
Commits with descriptive message including repository type and identifier
Example: "[skip ci] Adding files from Dataverse dataset doi:10.7910/DVN/ABC123"
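The CI behavior described above could be approximated as follows. This is a sketch, not the script's code: `build_commit_message` and `git_commands` are hypothetical names, the message template is inferred from the single Dataverse example, and the commands are returned rather than executed so the logic can be inspected safely.

```python
from typing import List, Mapping


def build_commit_message(source: str, identifier: str) -> str:
    """Compose a commit message in the style of the documented example."""
    return f"[skip ci] Adding files from {source} {identifier}"


def git_commands(directory: str, message: str, env: Mapping[str, str]) -> List[List[str]]:
    """Return the git commands to run; empty outside CI (manual staging)."""
    if not env.get("CI"):
        return []
    return [
        ["git", "add", directory],
        ["git", "commit", "-m", message],
    ]
```

Each returned command could then be passed to `subprocess.run(cmd, check=True)`; outside CI the empty list corresponds to leaving staging to the user.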

In Local Environments¶

Suggests running git add manually
Does not auto-commit (leaves control to the user)

Error Handling¶

The script handles various error conditions:

Missing Jira credentials: Reports error and exits
Missing Replication package URL: Reports error and suggests checking Jira field
Unsupported repository: Reports error and lists supported repositories
Invalid URL format: Reports that the DOI or record ID could not be extracted from the URL
Download tool failures: Propagates exit code from underlying tool

URL Parsing Examples¶

Dataverse¶

| Input URL | Extracted DOI |
| --- | --- |
| `https://doi.org/10.7910/DVN/ABC123` | `doi:10.7910/DVN/ABC123` |
| `https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/ABC123` | `doi:10.7910/DVN/ABC123` |
| `https://dataverse.example.edu/file.xhtml?persistentId=doi:10.5072/DVN/XYZ789` | `doi:10.5072/DVN/XYZ789` |

Zenodo¶

| Input URL | Record ID | Type |
| --- | --- | --- |
| `https://zenodo.org/record/1234567` | `1234567` | Public |
| `https://zenodo.org/deposit/1234567` | `1234567` | Draft |
| `10.5281/zenodo.1234567` | `1234567` | Public |
| `https://zenodo.org/records/12345678` | `12345678` | Public |

Requirements¶

Python 3.12+
All prerequisites of the download tools it calls:
- requests library (for Dataverse, Zenodo Python tools)
- zenodo_get (for Zenodo public downloads)
- Jira API credentials

Integration with Pipeline¶

This script is designed to integrate with the AEA replication workflow:

# Example bitbucket-pipelines.yml usage
script:
  - python3.12 tools/download_from_jira_url.py $JIRATICKET

It can replace or supplement existing openICPSR/Zenodo download logic for cases where the replication package is hosted on alternative repositories.

Troubleshooting¶

“No Replication package URL found in Jira issue”¶

Cause: The “Replication package URL” field is not populated in the Jira issue.

Solution:

Check the Jira issue in browser
Verify the “Replication package URL” field contains a valid URL
Ensure Jira credentials are correctly configured

“Could not extract DOI from Dataverse URL”¶

Cause: URL format doesn’t match expected Dataverse patterns.

Solution:

Verify the URL is a valid Dataverse URL
Ensure the URL contains either a DOI or DVN identifier
Check for typos in the URL

“Could not extract record ID from Zenodo URL”¶

Cause: URL format doesn’t match expected Zenodo patterns.

Solution:

Verify the URL is a valid Zenodo URL
Ensure the URL contains a numeric record ID
Try using just the record ID number instead of full URL

“openICPSR deposit found (exit code 2)”¶

Cause: The Jira issue has an openICPSR Project Number populated.

Solution: This is intentional behavior. openICPSR deposits are handled separately through download_openicpsr-private.py or download_openicpsr-public.py.

Known Limitations¶

OSF download currently reports “not yet implemented” - manual download required
Zenodo detection defaults to trying public download first; may fail for draft deposits requiring authentication
Only supports public Dataverse datasets (no authentication support)
Custom Dataverse instances must use standard API patterns

Future Enhancements¶

Potential improvements:

Full OSF integration
Support for additional repositories (WorldBank, Box, etc.)
Better Zenodo draft vs. public detection
Parallel download support for multiple URLs
URL validation before attempting download
