get_sivacor_info.py - Extract information from SIVACOR JSONLD files

Description¶

Parses SIVACOR JSONLD (TRO, Transparent Research Object) files to extract computing environment and execution timing information, and to generate Part B snippets for replication reports. Output goes to stdout, to a Markdown file (--output), or directly into the appropriate sections of a replication report (--report).

Usage¶

# Output to stdout
python3 tools/get_sivacor_info.py <jsonld_file> <keyword>
python3 tools/get_sivacor_info.py --jsonld <file> --key <keyword>
python3 tools/get_sivacor_info.py --jobid <job_id> --key <keyword>

# Update report file
python3 tools/get_sivacor_info.py --jobid <job_id> --key <keyword> --report <report_file>

# Dry-run (preview without updating)
python3 tools/get_sivacor_info.py --jobid <job_id> --key <keyword> --report <report_file> --dry-run

# Generate SIVACOR Part B insert snippets from the TRO
python3 tools/get_sivacor_info.py --jsonld 246665/tro/tro-6a23045802a927359ccb67f4.jsonld --key sivacor-computing-environment --output generated/sivacor-partb-computing-environment.md
python3 tools/get_sivacor_info.py --jsonld 246665/tro/tro-6a23045802a927359ccb67f4.jsonld --key sivacor-replication-steps --output generated/sivacor-partb-replication-steps.md
python3 tools/get_sivacor_info.py --jsonld 246665/tro/tro-6a23045802a927359ccb67f4.jsonld --key sivacor-findings --output generated/sivacor-partb-findings.md
python3 tools/get_sivacor_info.py --jsonld 246665/tro/tro-6a23045802a927359ccb67f4.jsonld --key sivacor-appendix --output generated/sivacor-partb-appendix.md

# Generate a template-consistent SIVACOR Part B file
tools/generate_sivacor_partb.sh --dry-run
./automations/18_summarize_sivacor.sh --dry-run

Arguments¶

Positional Arguments:

jsonld_file - Path to JSONLD file (e.g., tro-69cede1db3a6af67b1c01c3d.jsonld)
keyword - Information keyword to extract (computing, time, partb, partb-sivacor, sivacor-computing-environment, sivacor-replication-steps, sivacor-findings, or sivacor-appendix)

Named Options:

--jsonld <file> - Path to JSONLD file (alternative to positional)
--jobid <id> - SIVACOR job ID (searches for tro-{jobid}.jsonld)
--key <keyword> - Information keyword (alternative to positional)
--report <file> - Report file to update with information
--output <file> - Write generated Markdown to a file
--dry-run - Preview changes without modifying files
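
The way positional and named forms can coexist is sketched below with argparse. This is a minimal illustration, not the script's actual parser: the names `build_parser`, `resolve`, `jsonld_opt`, and `key_opt` are hypothetical, and the rule that a named option takes precedence over its positional counterpart is an assumption.

```python
import argparse

# Keywords as listed in this document.
KEYWORDS = [
    "computing", "time", "partb", "partb-sivacor",
    "sivacor-computing-environment", "sivacor-replication-steps",
    "sivacor-findings", "sivacor-appendix",
]


def build_parser():
    """Hypothetical parser mirroring the documented arguments."""
    parser = argparse.ArgumentParser(prog="get_sivacor_info.py")
    # Positionals are optional so that the named forms can replace them.
    parser.add_argument("jsonld_file", nargs="?")
    parser.add_argument("keyword", nargs="?", choices=KEYWORDS)
    parser.add_argument("--jsonld", dest="jsonld_opt")
    parser.add_argument("--jobid")
    parser.add_argument("--key", dest="key_opt", choices=KEYWORDS)
    parser.add_argument("--report")
    parser.add_argument("--output")
    parser.add_argument("--dry-run", action="store_true")
    return parser


def resolve(args):
    """Return (jsonld_path_or_None, keyword); named options win (assumption)."""
    return args.jsonld_opt or args.jsonld_file, args.key_opt or args.keyword
```

With this layout, `--jobid` alone leaves the file path unset, which is the signal to fall back to a file search.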

Keywords¶

`computing`¶

Extracts and displays computing environment information:

SIVACOR Job ID
Processor type
Number of CPUs
Total memory
Operating system and version
Kernel version
Docker image
Max CPU usage percentage
Max memory usage
OS type

When --report is specified, adds information to the “Computing Environment of the Replicator” section under a “SIVACOR” heading.

`time`¶

Extracts and displays execution timing information:

SIVACOR Job ID
Start timestamp
Finish timestamp
Calculated duration (formatted as hours/minutes/seconds)

When --report is specified, adds information to the “Findings” section under a “SIVACOR Execution Time” heading.

`partb`¶

Extracts and displays a Part B-ready SIVACOR execution summary:

TRO provenance and creation tool
Workflow execution steps recorded by SIVACOR
Container images, processor, CPU count, memory, and operating system
Per-step duration and observed maximum memory usage
File arrangements before and after workflow steps
Counts of added, removed, and modified paths between arrangements
A reviewer note that SIVACOR-generated repositories should not be rerun

When --report is specified, adds the generated summary to the “Replication steps” section. This is intended for SIVACOR-generated submissions, where the author has already run the package through SIVACOR: the reviewer uses the TRO for Part B, then compares the deposited outputs against the manuscript for Part C.

`sivacor-replication-steps`¶

Generates only the Markdown block to insert into “Replication steps”:

Checks the third-party reproducibility box
Describes what SIVACOR actually ran
Does not rerun author code

`sivacor-computing-environment`¶

Generates only the Markdown block to insert into “Computing Environment of the Replicator”:

Checks the third-party reproducibility box
Lists the SIVACOR job ID, processor, CPU count, memory, operating system, kernel, and container images recorded in the TRO

`sivacor-findings`¶

Generates only the Markdown block to insert into “Findings”:

Compares arrangement 0 with the highest available arrangement
Summarizes generated, removed, and modified file counts
Points reviewers to the Appendix for the full arrangement comparison
Notes that SIVACOR is not designed to compare figures and tables against the manuscript
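
The arrangement comparison described above can be pictured as set arithmetic over path-to-hash mappings. The sketch below assumes each arrangement has already been reduced to a `{path: content_hash}` dictionary keyed by arrangement number; the real TRO structure may differ, and both function names are hypothetical.

```python
def pick_endpoints(arrangements):
    """Return arrangement 0 and the highest-numbered arrangement.

    `arrangements` is assumed to map an integer index to a
    {path: content_hash} dictionary.
    """
    return arrangements[0], arrangements[max(arrangements)]


def compare_arrangements(before, after):
    """Classify paths as added, removed, or modified between two arrangements."""
    before_paths, after_paths = set(before), set(after)
    return {
        "added": sorted(after_paths - before_paths),
        "removed": sorted(before_paths - after_paths),
        # A path present in both arrangements counts as modified
        # when its content hash changed.
        "modified": sorted(
            p for p in before_paths & after_paths if before[p] != after[p]
        ),
    }
```

The counts reported in “Findings” would then be the lengths of the three lists, with the lists themselves left to the Appendix.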

`sivacor-appendix`¶

Generates the full SIVACOR arrangement comparison for the Appendix:

Lists generated table output paths
Lists generated figure output paths
Lists other generated output, data/intermediate, log, R environment, and uncategorized paths
Lists removed and modified paths, if any

tools/generate_sivacor_partb.sh extracts Part B from the single REPLICATION.md template, fills its SIVACOR placeholders, and writes generated/REPLICATION-PartB-SIVACOR.md. For non-SIVACOR reports, normal preprocessing replaces those placeholders with empty content. The SIVACOR appendix snippet is written to generated/sivacor-partb-appendix.md and included when the normal appendix template is regenerated. automations/18_summarize_sivacor.sh --replace-report then copies the generated Part B file over REPLICATION-PartB.md, or replaces the Part B section inside REPLICATION.md when a revision report is not split.

Examples¶

# Extract computing info and print to stdout
python3 tools/get_sivacor_info.py --jobid 69cede1db3a6af67b1c01c3d --key computing

# Extract timing info and print to stdout
python3 tools/get_sivacor_info.py --jobid 69cede1db3a6af67b1c01c3d --key time

# Generate a template-consistent SIVACOR Part B file
tools/generate_sivacor_partb.sh

# Preview what would be added to report (dry-run)
python3 tools/get_sivacor_info.py --jobid 69cede1db3a6af67b1c01c3d --key computing --report REPLICATION-PartB.md --dry-run

# Preview SIVACOR Part B generation
./automations/18_summarize_sivacor.sh --dry-run

# Add computing info to report
python3 tools/get_sivacor_info.py --jobid 69cede1db3a6af67b1c01c3d --key computing --report REPLICATION-PartB.md

# Add timing info to report
python3 tools/get_sivacor_info.py --jobid 69cede1db3a6af67b1c01c3d --key time --report REPLICATION-PartB.md

# Generate generated/REPLICATION-PartB-SIVACOR.md, then apply it to the current report
./automations/18_summarize_sivacor.sh --replace-report

# Using positional arguments
cd 246302
python3 ../tools/get_sivacor_info.py tro-69cede1db3a6af67b1c01c3d.jsonld computing

# Using file path directly
python3 tools/get_sivacor_info.py --jsonld 246302/tro-69cede1db3a6af67b1c01c3d.jsonld --key time

SIVACOR Workflow Note¶

For repositories generated by SIVACOR, do not rerun the author code as part of the AEA workflow. The submitted repository should include a tro/ directory containing the TRO JSON-LD file.

Use tools/generate_sivacor_partb.sh or automations/18_summarize_sivacor.sh to generate a template-consistent generated/REPLICATION-PartB-SIVACOR.md, then apply it with --replace-report when ready. In split-report cases, this updates REPLICATION-PartB.md; in single-file revision reports, it replaces the Part B section inside REPLICATION.md.

The generated file inserts SIVACOR computing environment facts into “Computing Environment of the Replicator,” SIVACOR workflow steps into “Replication steps,” and a concise SIVACOR-generated file summary into “Findings.” The full arrangement comparison is written to generated/sivacor-partb-appendix.md for inclusion by the normal generated appendix template.

Human review still compares output files against the manuscript, evaluates substantive code behavior, checks requirements completeness against the README, and assigns the final classification.

Requirements¶

Python >= 3.12
Standard library modules: json, argparse, sys, os, glob, re, datetime

Output Format¶

Information is formatted as Markdown bullet points with the SIVACOR Job ID displayed in backticks for proper rendering.

Computing Output Example¶

- SIVACOR Job ID: `69cede1db3a6af67b1c01c3d`
- Processor: AMD EPYC-Milan Processor
- CPUs: 16
- Total Memory: 58.8 GB
- Operating System: Ubuntu 24.04.3 LTS (Version 24.04)
- Kernel Version: 6.17.0-14-generic
- Docker Image: `dynare/dynare:6.1-R2024a`
- Max CPU Usage: 315.32%
- Max Memory Usage: 2.34 GB
- OS Type: linux

Time Output Example¶

- SIVACOR Job ID: `69cede1db3a6af67b1c01c3d`
- Started: 2026-04-02T21:22:40.882500138Z
- Finished: 2026-04-03T05:13:05.690180035Z
- Duration: 7h 50m 24s
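
The duration in the example above can be reproduced from the two timestamps. The sketch below is one way to do it, not necessarily how the script does: it assumes the timestamps are always UTC with a trailing `Z`, and it truncates the nanosecond fraction to microseconds because `datetime` stores no finer resolution. The function names are hypothetical.

```python
import re
from datetime import datetime, timezone

_TS = re.compile(r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z")


def parse_timestamp(ts):
    """Parse a UTC timestamp, truncating fractions beyond microseconds."""
    match = _TS.fullmatch(ts)
    if match is None:
        raise ValueError(f"unsupported timestamp: {ts!r}")
    base = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
    # Keep at most six fractional digits, padding short fractions with zeros.
    micro = int((match.group(2) or "0")[:6].ljust(6, "0"))
    return base.replace(microsecond=micro, tzinfo=timezone.utc)


def format_duration(start, finish):
    """Format the elapsed time as 'Hh Mm Ss', dropping fractional seconds."""
    delta = parse_timestamp(finish) - parse_timestamp(start)
    total = int(delta.total_seconds())
    hours, rem = divmod(total, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours}h {minutes}m {seconds}s"


print(format_duration("2026-04-02T21:22:40.882500138Z",
                      "2026-04-03T05:13:05.690180035Z"))  # 7h 50m 24s
```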

Report Integration¶

When using the --report option, the script:

For the computing keyword:
- Locates the “Computing Environment of the Replicator” section
- Inserts SIVACOR information after existing environment items
- Adds it under a “SIVACOR” heading

For the time keyword:
- Locates the “Findings” section
- Inserts timing information after the heading and INSTRUCTIONS
- Adds it under a “SIVACOR Execution Time” heading
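
The section lookup and insertion described above can be sketched as follows. This is an illustration under stated assumptions, not the script's code: it assumes report sections are Markdown headings starting with `#`, appends the block at the end of the matching section, and uses the hypothetical name `insert_into_section`. It does not handle `#` comment lines inside fenced code blocks, and it does not reproduce the "after the heading and INSTRUCTIONS" placement used for timing information.

```python
def insert_into_section(report_text, section_title, block):
    """Append `block` at the end of the section whose heading contains `section_title`."""
    lines = report_text.splitlines()
    start = next(
        (i for i, line in enumerate(lines)
         if line.startswith("#") and section_title in line),
        None,
    )
    if start is None:
        raise ValueError(f"section not found: {section_title}")
    # The section ends at the next heading, or at the end of the file.
    end = next(
        (i for i in range(start + 1, len(lines)) if lines[i].startswith("#")),
        len(lines),
    )
    # Separate the block from preceding content with one blank line.
    prefix = [""] if lines[end - 1].strip() else []
    new_lines = lines[:end] + prefix + block.splitlines() + [""] + lines[end:]
    return "\n".join(new_lines) + "\n"
```

Returning the new text rather than writing it makes a dry run trivial: the caller prints the result instead of saving it.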

Duplicate Detection¶

If the script detects that a SIVACOR section already exists in the report:

Displays a warning message with ⚠️ emoji
Shows the existing information in Markdown format
Does not update the file (prevents duplicates)

Example warning output:

⚠️  WARNING: SIVACOR computing section already exists in report.

Existing information in Markdown notation:

**SIVACOR**

- SIVACOR Job ID: `69cede1db3a6af67b1c01c3d`
...
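
A duplicate check of this kind can be as simple as looking for the bold heading line shown in the warning. The sketch below assumes the heading appears on a line of its own exactly as `**SIVACOR**`; the function name and the `**SIVACOR Execution Time**` form used for the timing section are assumptions.

```python
def has_sivacor_section(report_text, heading="**SIVACOR**"):
    """Return True if a line consisting only of `heading` is already in the report."""
    return any(line.strip() == heading for line in report_text.splitlines())


report = "## Computing Environment of the Replicator\n\n**SIVACOR**\n\n- CPUs: 16\n"
if has_sivacor_section(report):
    print("WARNING: SIVACOR computing section already exists in report.")
```

Matching the whole line avoids a false positive on `**SIVACOR Execution Time**` when checking for the computing section.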

Job ID Detection¶

The script determines the SIVACOR Job ID, and the TRO file that goes with it, in one of the following ways:

Via --jobid option: Directly specified by user
From filename: Extracts from pattern tro-{jobid}.jsonld
File search: Recursively searches for tro-{jobid}.jsonld when --jobid is used
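
Both directions of this lookup, from filename to job ID and from job ID to file, can be sketched with the standard library. The function names are hypothetical, and the sketch assumes job IDs are alphanumeric and that the first match in sorted order is an acceptable choice when several files match.

```python
import glob
import os
import re

# Matches the documented pattern tro-{jobid}.jsonld.
TRO_NAME = re.compile(r"^tro-([0-9A-Za-z]+)\.jsonld$")


def jobid_from_filename(path):
    """Extract the job ID from a tro-{jobid}.jsonld filename, or return None."""
    match = TRO_NAME.match(os.path.basename(path))
    return match.group(1) if match else None


def find_tro_file(jobid, root="."):
    """Recursively search `root` for tro-{jobid}.jsonld; return the first match or None."""
    pattern = os.path.join(root, "**", f"tro-{jobid}.jsonld")
    matches = sorted(glob.glob(pattern, recursive=True))
    return matches[0] if matches else None
```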

Workflow¶

Parse command line arguments
Locate JSONLD file (by path or job ID search)
Read and parse JSON data
Extract relevant information based on keyword
Format information as Markdown bullets
If --report specified:
- Check for existing SIVACOR section
- If dry-run, display what would be added
- Otherwise, insert into appropriate report section
Output results to stdout or update report file

Error Handling¶

Validates JSONLD file existence
Checks for valid JSON format
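Verifies keyword is supported (computing, time, partb, partb-sivacor, sivacor-computing-environment, sivacor-replication-steps, sivacor-findings, or sivacor-appendix)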
Ensures report file exists when --report is used
Handles missing SIVACOR fields gracefully
Prevents duplicate section creation
