Description¶
This script orchestrates downloads from various repositories (Dataverse, Zenodo, OSF) using the replication package URL stored in a Jira issue. It automatically detects the repository type, checks for openICPSR deposits, and calls the appropriate download tool with the correct parameters.
Usage¶
python3.12 tools/download_from_jira_url.py <issue-key>
python3.12 tools/download_from_jira_url.py -h|--help
Arguments¶
issue-key (Required) - Jira issue key (e.g., AEAREP-8983, aearep-8361, case-insensitive)
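As an illustration of the case-insensitive handling, the following is a minimal sketch of how an issue key could be normalized before it is sent to Jira. The function name `normalize_issue_key` and the validation pattern are assumptions made for this example; the script's actual logic may differ.

```python
import re

# Hypothetical pattern: a project prefix, a hyphen, and a numeric ID.
ISSUE_KEY_RE = re.compile(r"^[A-Za-z][A-Za-z0-9]*-\d+$")


def normalize_issue_key(raw: str) -> str:
    """Return the issue key in upper case, or raise ValueError if malformed."""
    key = raw.strip()
    if not ISSUE_KEY_RE.match(key):
        raise ValueError(f"Not a valid Jira issue key: {raw!r}")
    return key.upper()
```

With this sketch, `aearep-8361` and `AEAREP-8361` resolve to the same key.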
Examples¶
# Download replication package for a Jira issue
python3.12 tools/download_from_jira_url.py AEAREP-8983
# Show help
python3.12 tools/download_from_jira_url.py --help
Workflow¶
The script follows this sequence:
Check openICPSR: Verifies if openICPSR Project Number is populated in Jira
If yes: exits with code 2 (openICPSR handled separately)
If no: proceeds to next step
Retrieve URL: Gets “Replication package URL” from Jira issue
Detect Repository: Analyzes URL to determine repository type:
Dataverse: URLs containing “DVN” or “dataverse”
Zenodo: URLs containing “zenodo”
OSF: URLs containing “osf.io”
Download: Calls appropriate download tool:
Dataverse: download_dv.py (extracts DOI)
Zenodo draft: download_zenodo_draft.py (for /deposit/ URLs)
Zenodo public: download_zenodo_public.sh (for /record/ URLs)
OSF: download_osf.sh (if available)
Git Integration: Handles staging/commit in CI mode
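The detection step in the sequence above can be sketched as a substring check. This is an illustrative sketch, not the script's code: the function name `detect_repository`, the returned labels, and the case-insensitive matching are assumptions.

```python
from typing import Optional


def detect_repository(url: str) -> Optional[str]:
    """Classify a replication package URL by substring, as described above."""
    lowered = url.lower()  # assumption: matching ignores case
    if "dvn" in lowered or "dataverse" in lowered:
        return "dataverse"
    if "zenodo" in lowered:
        return "zenodo"
    if "osf.io" in lowered:
        return "osf"
    return None  # unsupported repository; the script exits with code 1
```

The checks run in the order listed in the workflow, so a URL that matches several patterns is classified by the first one that applies.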
Repository Detection¶
Dataverse¶
Recognizes URLs matching:
https://doi.org/10.7910/DVN/XXXXX
https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/XXXXX
Any URL containing “DVN” or “dataverse”
Extracts DOI and passes to download_dv.py.
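A minimal sketch of the DOI extraction, assuming a regular-expression approach. The pattern and the function name `extract_dataverse_doi` are hypothetical and only illustrate the kind of parsing involved.

```python
import re
from typing import Optional
from urllib.parse import unquote

# Hypothetical DOI pattern: "10.", a registrant code, "/", then a suffix.
DOI_RE = re.compile(r"(10\.\d{4,9}/[A-Za-z0-9._/-]+)")


def extract_dataverse_doi(url: str) -> Optional[str]:
    """Return the DOI as 'doi:10.xxxx/...' or None if no DOI is present."""
    # unquote() handles URLs where the DOI is percent-encoded (doi%3A10...).
    match = DOI_RE.search(unquote(url))
    return f"doi:{match.group(1)}" if match else None
```

Under these assumptions, both the doi.org form and the `persistentId=` form yield the same `doi:` string, which would then be passed to download_dv.py.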
Zenodo¶
Recognizes URLs matching:
https://zenodo.org/record/12345678 (public record)
https://zenodo.org/deposit/12345678 (draft deposit)
10.5281/zenodo.12345678 (DOI format)
Detects draft vs. public based on /deposit/ in URL path.
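The record ID extraction and the draft check can be sketched as follows. The pattern and the function name `parse_zenodo` are assumptions for illustration; the `/deposit/` test follows the rule stated above.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: digits after /record/, /records/, /deposit/,
# or after "zenodo." in a DOI such as 10.5281/zenodo.1234567.
RECORD_RE = re.compile(r"(?:/records?/|/deposit/|zenodo\.)(\d+)")


def parse_zenodo(url: str) -> Optional[Tuple[str, str]]:
    """Return (record_id, 'draft' or 'public'), or None if no ID is found."""
    match = RECORD_RE.search(url)
    if match is None:
        return None
    kind = "draft" if "/deposit/" in url else "public"
    return match.group(1), kind
```

In this sketch a draft result would be routed to download_zenodo_draft.py and a public result to download_zenodo_public.sh.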
OSF¶
Recognizes URLs containing:
osf.io
Note: OSF download not yet fully implemented in this script.
Output Structure¶
Downloads create repository-specific directories:
Dataverse: dv-[PUBLISHER]-[DATASET_ID]/
Zenodo: zenodo-[RECORD_ID]/
OSF: osf-[PROJECT_ID]/ (when implemented)
Exit Codes¶
0: Success - download completed
1: Error - missing arguments, Jira errors, download failures, unsupported repository
2: openICPSR deposit found (intentional skip - handled separately)
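A calling program can branch on these codes. The sketch below is one possible wrapper, not part of the tool: the helper names are hypothetical, and treating code 2 as non-fatal is an assumption about how a caller might want to handle the intentional skip.

```python
import subprocess
from typing import List

# Meanings taken from the exit-code list above.
EXIT_MEANINGS = {
    0: "success: download completed",
    1: "error: see the script output for details",
    2: "skipped: openICPSR deposit found, handled separately",
}


def describe_exit_code(code: int) -> str:
    """Map an exit code to a short human-readable description."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def should_fail_build(code: int) -> bool:
    """Assumption: the intentional openICPSR skip (2) is not a failure."""
    return code not in (0, 2)


def run_download(issue_key: str) -> int:
    """Run the download script and return its exit code without raising."""
    cmd: List[str] = ["python3.12", "tools/download_from_jira_url.py", issue_key]
    return subprocess.run(cmd, check=False).returncode
```

A caller could then use `code = run_download("AEAREP-8983")` and fall back to the openICPSR tools when `code == 2`.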
Prerequisites¶
Required Environment Variables¶
JIRA_USERNAME - Your Jira email address
JIRA_API_KEY - API token from https://id.atlassian.com/manage-profile/security/api-tokens
Optional Environment Variables¶
ZENODO_ACCESS_TOKEN - Required for Zenodo draft deposits
CI - Set in CI/CD environments for automatic git commits
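Before running the script, it can help to confirm that the variables are set. The following is a small sketch of such a check. The helper names are hypothetical, and the script itself may validate credentials differently.

```python
import os
from typing import List, Mapping

REQUIRED_VARS = ("JIRA_USERNAME", "JIRA_API_KEY")


def missing_credentials(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


def zenodo_token_available(env: Mapping[str, str] = os.environ) -> bool:
    """True if ZENODO_ACCESS_TOKEN is set (needed only for draft deposits)."""
    return bool(env.get("ZENODO_ACCESS_TOKEN"))
```

Passing a dictionary instead of `os.environ` makes the check easy to exercise without touching the real environment.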
Required Tools¶
tools/jira_get_info.py with ‘replicationurl’ keyword support
Download tools for supported repositories:
tools/download_dv.py (Dataverse)
tools/download_zenodo_draft.py (Zenodo drafts)
tools/download_zenodo_public.sh (Zenodo public)
tools/download_osf.sh (OSF, optional)
Git Integration¶
In CI Environments¶
When CI environment variable is set:
Automatically stages downloaded files with git add
Commits with descriptive message including repository type and identifier
Example:
"[skip ci] Adding files from Dataverse dataset doi:10.7910/DVN/ABC123"
In Local Environments¶
Suggests manual git add operation
Does not auto-commit (leaves control to user)
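The CI and local behaviors can be sketched as a function that decides which git commands to run. This is an illustration under assumptions: the function names are hypothetical, and the message format is inferred from the single Dataverse example in this section, so the wording for other repositories may differ.

```python
import os
from typing import List, Mapping


def build_commit_message(source_label: str, identifier: str) -> str:
    """Build a commit message in the style of the documented example."""
    return f"[skip ci] Adding files from {source_label} {identifier}"


def git_commands(
    directory: str, message: str, env: Mapping[str, str] = os.environ
) -> List[List[str]]:
    """Return the git commands to run: none locally, add + commit in CI."""
    if not env.get("CI"):
        return []  # local run: the user stages and commits manually
    return [["git", "add", directory], ["git", "commit", "-m", message]]
```

Each returned command is an argument list suitable for `subprocess.run`. Returning the commands instead of executing them keeps the sketch free of side effects.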
Error Handling¶
The script handles various error conditions:
Missing Jira credentials: Reports error and exits
Missing Replication package URL: Reports error and suggests checking Jira field
Unsupported repository: Reports error and lists supported repositories
Invalid URL format: Reports error with URL pattern extraction failure
Download tool failures: Propagates exit code from underlying tool
URL Parsing Examples¶
Dataverse¶
| Input URL | Extracted DOI |
|---|---|
| https://doi.org/10.7910/DVN/ABC123 | doi:10.7910/DVN/ABC123 |
| https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/ABC123 | doi:10.7910/DVN/ABC123 |
| https://dataverse.example.edu/file.xhtml?persistentId=doi:10.5072/DVN/XYZ789 | doi:10.5072/DVN/XYZ789 |
Zenodo¶
| Input URL | Record ID | Type |
|---|---|---|
| https://zenodo.org/record/1234567 | 1234567 | Public |
| https://zenodo.org/deposit/1234567 | 1234567 | Draft |
| 10.5281/zenodo.1234567 | 1234567 | Public |
| https://zenodo.org/records/12345678 | 12345678 | Public |
Requirements¶
Python 3.12+
All prerequisites from called download tools:
requests library (for Dataverse, Zenodo Python tools)
zenodo_get (for Zenodo public downloads)
Jira API credentials
Integration with Pipeline¶
This script is designed to integrate with the AEA replication workflow:
# Example bitbucket-pipelines.yml usage
script:
- python3.12 tools/download_from_jira_url.py $JIRATICKET
It can replace or supplement the existing openICPSR/Zenodo download logic in cases where the replication package is hosted on an alternative repository.
See Also¶
jira_get_info.py - Retrieve Jira issue information
download_dv.py - Download from Dataverse
download_zenodo_draft.py - Download from Zenodo draft deposits
download_zenodo_public.sh - Download from public Zenodo records
download_osf.sh - Download from OSF (if available)
Troubleshooting¶
“No Replication package URL found in Jira issue”¶
Cause: The “Replication package URL” field is not populated in the Jira issue.
Solution:
Check the Jira issue in browser
Verify the “Replication package URL” field contains a valid URL
Ensure Jira credentials are correctly configured
“Could not extract DOI from Dataverse URL”¶
Cause: URL format doesn’t match expected Dataverse patterns.
Solution:
Verify the URL is a valid Dataverse URL
Ensure the URL contains either a DOI or DVN identifier
Check for typos in the URL
“Could not extract record ID from Zenodo URL”¶
Cause: URL format doesn’t match expected Zenodo patterns.
Solution:
Verify the URL is a valid Zenodo URL
Ensure the URL contains a numeric record ID
Try using just the record ID number instead of full URL
“openICPSR deposit found (exit code 2)”¶
Cause: The Jira issue has an openICPSR Project Number populated.
Solution: This is intentional behavior. openICPSR deposits are handled separately through download_openicpsr-private.py or download_openicpsr-public.py.
Known Limitations¶
OSF download currently reports “not yet implemented” - manual download required
Zenodo detection defaults to trying public download first; may fail for draft deposits requiring authentication
Only supports public Dataverse datasets (no authentication support)
Custom Dataverse instances must use standard API patterns
Future Enhancements¶
Potential improvements:
Full OSF integration
Support for additional repositories (WorldBank, Box, etc.)
Better Zenodo draft vs. public detection
Parallel download support for multiple URLs
URL validation before attempting download