Benchmarking Examples and Estimated Costs on the Roman Research Nexus

This page provides representative benchmarking examples for common tasks performed on the Roman Research Nexus, a cloud-based environment for Roman science. It summarizes typical CPU usage measured for well-defined cases and is intended to support planning and resource estimation for Nexus users. The benchmark cases and results shown here are illustrative and should be interpreted alongside the detailed case definitions that follow. 

Benchmark Case Definitions

This section defines the benchmark cases used to estimate typical CPU usage for common operations on the Roman Research Nexus. Each case describes the processing configuration, data volume, and key assumptions (e.g., inclusion of source detection or file I/O). Short illustrative code sketches follow several of the case lists below.

RomanCal Exposure-Level Pipeline: L1 → L2

  • Case 1: Single-detector calibration, full processing

    • Executes the full RomanCal Exposure-Level Pipeline (L1 → L2) on one WFI detector
    • Includes source detection and source catalog generation
    • Executed entirely in memory (no file I/O)
    • Scene-dependent test executed with ~1,500 sources (stars and galaxies combined, uniformly distributed)
  • Case 2: Single-detector calibration, no catalog products

    • Executes the full RomanCal Exposure-Level Pipeline (L1 → L2) on one WFI detector
    • Source detection and source catalog generation are excluded
    • Executed entirely in memory (no file I/O)
    • Independent of source density, since the detection and catalog steps are excluded
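
For reference, a minimal Python sketch of how such runs might be invoked with romancal is shown below. The input filename is a placeholder, and the step name used to skip catalog generation is an assumption that should be checked against the installed romancal version.

```python
from romancal.pipeline import ExposurePipeline

# Case 1: full L1 -> L2 processing of a single detector
# (the input filename is a placeholder)
result = ExposurePipeline.call("r0000101001001001001_01101_0001_wfi01_uncal.asdf")

# Case 2: skip source detection and catalog generation
# (the step name "source_catalog" is an assumption; check your romancal version)
result = ExposurePipeline.call(
    "r0000101001001001001_01101_0001_wfi01_uncal.asdf",
    steps={"source_catalog": {"skip": True}},
)
```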

RomanCal Mosaic-Level Pipeline: L2 → L3

  • Case 1: Single-detector 4-point gap-filling dither, full processing
  • Case 2: Single-detector 4-point gap-filling dither, no catalog products
  • Case 3: Full WFI field-of-view mosaic, no source products
    • Executes the RomanCal Mosaic-Level Pipeline (L2 → L3) on all 18 WFI detectors observed in a single exposure
    • Combines the 18 exposure-level images into a single mosaic
    • Source detection and source catalog generation are excluded
    • Includes file I/O
    • Independent of source density, since the detection and catalog steps are excluded
    • Due to its memory footprint, requires ≥ 64 GB of RAM (medium server or larger)
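
A minimal sketch of invoking the mosaic pipeline, assuming an association file that lists the exposure-level inputs (the filename is illustrative):

```python
from romancal.pipeline import MosaicPipeline

# L2 -> L3: combine the exposure-level images listed in an association
# file into a single mosaic (the association filename is a placeholder)
result = MosaicPipeline.call("mosaic_asn.json", save_results=True)
```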

Roman I-Sim Simulations

All cases include PSF generation with STPSF. A sample command-line invocation is sketched after the case list below.

  • Case 1: Simulation of uncalibrated data (L1), galaxies
    • Uses Roman I-Sim to simulate WFI uncalibrated imaging data (L1) for one detector
    • Input catalog contains ~1,000 galaxies, uniformly distributed
  • Case 2: Simulation of the exposure-level calibrated product (L2), galaxies
    • Uses Roman I-Sim to simulate the WFI exposure-level calibrated product (L2) for one detector
    • Input catalog contains ~1,000 galaxies, uniformly distributed
  • Case 3: Simulation of uncalibrated data (L1), stars
    • Uses Roman I-Sim to simulate WFI uncalibrated imaging data (L1) for one detector
    • Input catalog contains ~1,000 stars, uniformly distributed
  • Case 4: Simulation of the exposure-level calibrated product (L2), stars
    • Uses Roman I-Sim to simulate the WFI exposure-level calibrated product (L2) for one detector
    • Input catalog contains ~1,000 stars, uniformly distributed
  • Case 5: Simulation of uncalibrated data (L1), mixed scene
    • Uses Roman I-Sim to simulate WFI uncalibrated imaging data (L1) for one detector
    • Input catalog contains ~500 galaxies and ~500 stars, uniformly distributed
  • Case 6: Simulation of the exposure-level calibrated product (L2), mixed scene
    • Uses Roman I-Sim to simulate the WFI exposure-level calibrated product (L2) for one detector
    • Input catalog contains ~500 galaxies and ~500 stars, uniformly distributed
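
As an illustration, a run corresponding to Case 1 might be launched from Python via the romanisim command-line tool as sketched below. The catalog path, filter, and detector number are placeholders, and flag names should be verified against the installed romanisim version.

```python
import subprocess

# Simulate L1 (uncalibrated) data for one WFI detector from a galaxy catalog.
# Catalog path, filter, and detector number are illustrative.
subprocess.run(
    [
        "romanisim-make-image",
        "--catalog", "galaxies.ecsv",  # ~1,000 uniformly distributed galaxies
        "--bandpass", "F158",
        "--sca", "1",                  # detector WFI01
        "--level", "1",                # 1 = L1 (uncalibrated); 2 = L2 (calibrated)
        "--usecrds",
        "wfi01_l1_sim.asdf",
    ],
    check=True,
)
```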

Aperture Photometry

  • Case 1: Large stellar catalog
    • Performs source detection and aperture photometry on ~10,000 stars
    • Uses photutils.DAOStarFinder
    • Approximately 92% of injected sources recovered
  • Case 2: Moderate stellar catalog
    • Performs source detection and aperture photometry on ~1,000 stars
    • Uses photutils.DAOStarFinder
    • All injected sources recovered, with some spurious detections
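
A minimal sketch of this workflow with photutils; the detection threshold, FWHM, and aperture radius are illustrative choices, not the benchmark's exact parameters.

```python
import numpy as np
from astropy.stats import sigma_clipped_stats
from photutils.detection import DAOStarFinder
from photutils.aperture import CircularAperture, aperture_photometry

# image: a 2D stellar image (assumed already loaded)
mean, median, std = sigma_clipped_stats(image, sigma=3.0)

# Detect point sources above 5 sigma (fwhm and threshold are illustrative)
finder = DAOStarFinder(fwhm=3.0, threshold=5.0 * std)
sources = finder(image - median)

# Measure fluxes in circular apertures at the detected positions
positions = np.transpose((sources["xcentroid"], sources["ycentroid"]))
apertures = CircularAperture(positions, r=4.0)
phot_table = aperture_photometry(image - median, apertures)
```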

Galaxy Shape Measurements

  • Case 1: Simple moment-based shape measurements
    • Lightweight per-object shape estimator that computes basic galaxy-shape quantities:
      • position angle
      • major-to-minor axis ratio
  • Case 2: Sérsic profile fitting
    • Fits a Sérsic surface-brightness model to each individual galaxy
    • Computes the full covariance matrix of the fitted parameters
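
Both cases can be sketched with standard astropy and photutils tools, assuming a background-subtracted cutout of a single galaxy; the initial-guess values for the Sérsic fit are illustrative.

```python
import numpy as np
from astropy.modeling import models, fitting
from photutils.morphology import data_properties

# cutout: 2D background-subtracted image of a single galaxy (assumed loaded)

# Case 1: moment-based quantities
props = data_properties(cutout)
position_angle = props.orientation                     # PA of the major axis
axis_ratio = (props.semimajor_sigma / props.semiminor_sigma).value

# Case 2: Sersic profile fit with parameter covariance
y, x = np.mgrid[0 : cutout.shape[0], 0 : cutout.shape[1]]
init = models.Sersic2D(
    amplitude=cutout.max(), r_eff=5.0, n=2.0,          # illustrative guesses
    x_0=cutout.shape[1] / 2, y_0=cutout.shape[0] / 2,
    ellip=0.3, theta=0.0,
)
fitter = fitting.LevMarLSQFitter(calc_uncertainties=True)
fit = fitter(init, x, y, cutout)
covariance = fitter.fit_info["param_cov"]              # full covariance matrix
```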

Astrocut

  • Case 1: Cutout generation, 100×100 pixels
    • Uses Astrocut to generate 100 cutouts of size 100×100 pixels
  • Case 2: Cutout generation, 10×10 pixels
    • Uses Astrocut to generate 100 cutouts of size 10×10 pixels
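
A sketch using astrocut's fits_cut function; the input filename and cutout centers are placeholders, and memory_only keeps the cutouts in memory rather than writing files.

```python
from astropy.coordinates import SkyCoord
from astrocut import fits_cut

# Cutout centers (placeholder coordinates); the benchmark used 100 cutouts
centers = SkyCoord(ra=[150.10, 150.12], dec=[2.20, 2.22], unit="deg")

# Generate 100x100-pixel cutouts from a FITS image (filename is illustrative)
cutouts = [
    fits_cut(["wfi_image.fits"], center, cutout_size=100, memory_only=True)
    for center in centers
]
```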

Exposure Time Calculations (Pandeia)

  • Case 1: Signal-to-noise ratio estimates
    • Uses Pandeia to compute the SNR for a source at a given magnitude and a given exposure configuration
    • Benchmark based on 100 estimates
  • Case 2: Limiting magnitude estimates
    • Uses Pandeia to compute the 5σ point-source limiting magnitude for a given exposure configuration
    • Benchmark based on 100 estimates
  • Case 3: Exposure time estimates
    • Uses Pandeia to compute the exposure time needed to reach a target SNR
    • Benchmark based on 100 estimates
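
A minimal sketch of a single SNR estimate (Case 1) with the Pandeia engine; the filter choice is illustrative, and the benchmark repeats such a calculation 100 times.

```python
from pandeia.engine.calc_utils import build_default_calc
from pandeia.engine.perform_calculation import perform_calculation

# Start from a default Roman WFI imaging calculation and adjust it
calc = build_default_calc("roman", "wfi", "imaging")
calc["configuration"]["instrument"]["filter"] = "f129"  # illustrative filter

# Run the engine and pull out the scalar signal-to-noise ratio
report = perform_calculation(calc)
snr = report["scalar"]["sn"]
```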

Catalog Cross-matching

  • Case 1: ZTF × Pan-STARRS cross-match
    • Cross-matches ~10,000 ZTF sources against the Pan-STARRS catalog
    • Input catalogs sourced from:
      • ZTF, pulled from IRSA's public S3 bucket
      • Pan-STARRS, pulled from the STScI Open Data S3 bucket
    • Benchmark assumes the ZTF catalog was pre-selected via a cone search to limit the sample to 10,000 sources
    • Reported CPU usage reflects only the cross-matching computation, not the initial catalog query
    • Catalogs are streamed directly into memory
    • No intermediate disk I/O is performed during the matching step
    • Executed using a parallelized cross-matching implementation (distributed across multiple cores within the server)
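
The benchmark used a parallelized implementation; for orientation, here is a simpler serial sketch of the same nearest-neighbor match with astropy. The column names and match tolerance are assumptions.

```python
import astropy.units as u
from astropy.coordinates import SkyCoord

# ztf, ps1: catalogs with RA/Dec columns in degrees, assumed already
# streamed into memory (e.g., from the S3 buckets noted above)
ztf_coords = SkyCoord(ra=ztf["ra"] * u.deg, dec=ztf["dec"] * u.deg)
ps1_coords = SkyCoord(ra=ps1["ra"] * u.deg, dec=ps1["dec"] * u.deg)

# For each ZTF source, find the nearest Pan-STARRS source on the sky
idx, sep2d, _ = ztf_coords.match_to_catalog_sky(ps1_coords)

# Keep matches within a 1-arcsecond tolerance (illustrative cutoff)
good = sep2d < 1.0 * u.arcsec
matched_idx = idx[good]
```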




Benchmark Results Summary 

The Summary of Benchmark Results table below summarizes measured Roman Research Nexus CPU usage for the benchmark cases defined in the previous section. CPU hours are normalized per detector, mosaic, object, or batch, as indicated.

Summary of Benchmark Results

| Operation | Case | Server Size* | CPU Hours (normalized) |
|---|---|---|---|
| RomanCal Exposure-Level Pipeline (L1 → L2) | Case 1: Full processing | Small | 0.0333 / detector |
| | Case 2: No source products | Small | 0.0117 / detector |
| RomanCal Mosaic-Level Pipeline (L2 → L3) | Case 1: 4-point dither, full processing | Small | 0.232 / mosaic |
| | Case 2: 4-point dither, no catalog | Small | 0.143 / mosaic |
| | Case 3: Full WFI FOV mosaic (18 detectors) | Medium | 2.634 / mosaic |
| Roman I-Sim | Case 1: L1 simulation (~1,000 galaxies) | Small | 11.385 / detector |
| | Case 2: L2-only simulation (~1,000 galaxies) | Small | 11.907 / detector |
| | Case 3: L1 simulation (~1,000 stars) | Small | 3.521 / detector |
| | Case 4: L2-only simulation (~1,000 stars) | Small | 3.815 / detector |
| | Case 5: L1 simulation (~500 galaxies + ~500 stars) | Small | 7.479 / detector |
| | Case 6: L2-only simulation (~500 galaxies + ~500 stars) | Small | 7.695 / detector |
| Aperture Photometry | Case 1: ~10,000 stars | Small | 1.671×10⁻³ / 10,000 stars |
| | Case 2: ~1,000 stars | Small | 1.592×10⁻³ / 1,000 stars |
| Galaxy Shape Measurements | Case 1: Simple moments | Small | 2.791×10⁻⁷ / galaxy |
| | Case 2: Sérsic profile fitting | Small | 6.833×10⁻⁴ / galaxy |
| Astrocut | Case 1: Cutout generation (100×100 pixels) | Small | 3.130×10⁻³ / cutout |
| | Case 2: Cutout generation (10×10 pixels) | Small | 3.130×10⁻³ / cutout |
| Pandeia Exposure Time Calculations | Case 1: 100 SNR estimates | Small | 0.038 / 100 estimates |
| | Case 2: 100 limiting magnitude estimates | Small | 0.414 / 100 estimates |
| | Case 3: 100 exposure time estimates | Small | 0.229 / 100 estimates |
| Catalog Cross-Matching | Case 1: ZTF × Pan-STARRS (~10,000 sources) | Small | 0.021 / 10,000 sources matched |

*Note: Server sizes listed reflect the smallest configuration on which each benchmark case could be executed. Larger servers may reduce wall-clock time, but total CPU hours typically decrease only for inherently multi-threaded tasks (e.g., RomanCal and Roman I-Sim), not for primarily single-threaded or user-parallelized analyses.



How To Interpret the Benchmarks

The CPU usage values in the Summary of Benchmark Results table should always be interpreted in the context of the corresponding case definitions described above, including assumptions about input data volume, source density, and whether file I/O or catalog generation is included.
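
For example, at 0.0333 CPU hours per detector for full exposure-level processing (Case 1), calibrating all 18 WFI detectors of a single exposure would consume roughly 18 × 0.0333 ≈ 0.6 CPU hours.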

  • Benchmarks were executed on the smallest server configuration capable of running each case.

In most cases, server size was chosen to satisfy memory requirements rather than to optimize runtime.

  • CPU hours and wall-clock time are not the same quantity.

The values reported in this table are CPU hours, which measure total compute usage. Using a larger server may reduce the elapsed runtime (wall-clock time), but it does not necessarily reduce the total CPU hours consumed.

  • For primarily single-threaded tasks, larger servers do not reduce CPU usage.

Many operations execute largely in a single process. For these tasks, selecting a larger server mainly provides additional memory and typically does not reduce CPU usage.

  • Parallelization can reduce wall-clock time but usually does not reduce CPU hours.

Tasks parallelized with frameworks such as Dask or Ray can complete faster by using more CPU cores simultaneously, but the total CPU hours are often similar (and may increase slightly due to parallel overhead). Some benchmarks on this page (e.g., catalog cross-matching) already reflect parallel execution; a minimal sketch of this pattern appears after this list.

  • Some tasks are inherently multi-threaded and may show reduced CPU usage on larger servers.

RomanCal and Roman I-Sim support multi-threaded execution and can take advantage of additional CPU cores. For these tasks, both wall-clock time and total CPU usage may decrease on larger server configurations. The reported values should therefore be interpreted as upper-limit estimates.

  • These values are intended as guidelines, not plug-in estimates.

Actual CPU usage may vary depending on algorithm choices, source density, I/O patterns, runtime parameters, and degree of parallelism.

  • Reported values represent upper-limit estimates.

Ongoing software optimization and infrastructure improvements are expected to reduce resource usage over time.

  • Expanded parallel capabilities are planned.

Beginning in FY27, the Roman Research Nexus is expected to offer larger server configurations for highly parallel workloads, along with additional support for parallel execution using AWS-native services.
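
As a minimal illustration of the user-parallelized pattern mentioned above (a sketch, not the implementation used for any benchmark), a per-object task can be distributed across local cores with Dask:

```python
import dask.bag as db

def measure(x):
    # Stand-in for a per-object analysis task (e.g., one shape measurement)
    return x ** 2

# Spreading the work across cores reduces wall-clock time, but the total
# CPU hours stay roughly the same (plus some scheduling overhead)
results = db.from_sequence(range(10_000), npartitions=8).map(measure).compute()
```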




For additional questions not answered in this article, please contact the Roman Help Desk.




Latest Update

  • Updates to examples
  • Initial publication of this article