Roman STScI Data Pipelines

This article gives a high-level overview of the science data pipelines that process Roman Wide Field Instrument (WFI) imaging data at STScI, including their design, philosophy, and installation instructions.



Overview of WFI Pipelines at STScI

WFI imaging observations are processed through several pipelines to create different data products. At STScI, we are developing three pipelines:

  • Exposure Level Pipeline: Performs detector-level calibration of Level 1 to Level 2 WFI data products.
  • Mosaic Level Pipeline: Handles repixelation and mosaicking of Level 2 WFI imaging data into Level 3 products.
  • Catalog Level Pipeline: Generates Level 4 catalogs from Level 2 and Level 3 products (currently under development).


Additional data processing specific to the WFI spectroscopic mode and to microlensing exoplanet science is carried out by the Science Support Center at IPAC.


The romancal repository, which holds the pipeline code (see Installation Instructions), undergoes continuous integration testing using both unit tests and larger regression test suites to ensure that code changes do not cause unexpected changes in the products. Furthermore, before WFI pipeline steps are released, they are rigorously tested and validated by the engineers and instrument scientists at STScI.

We expect that most users will be able to use data products directly from the Roman archive (see the Accessing WFI Data article for more information); however, there may be instances when users wish to re-run elements of the WFI science data pipelines or customize the pipeline for particular science use cases.

Installation Instructions

All of the STScI pipelines for Roman are contained in a single Python package called romancal, which is developed publicly on GitHub, with released versions available via the Python Package Index (PyPI).

Additional information on how to install specific versions, including the latest development version, can be found on the pipeline installation page of the romancal readthedocs documentation. Basic installation on a Unix-based operating system using a Conda environment manager can be accomplished in a bash terminal by typing the following:

$ conda create -n <environment_name> python
$ conda activate <environment_name>
$ pip install romancal 

Note that the $ symbol indicates the bash prompt. The variable environment_name is at the discretion of the user. Passing the argument "python" during environment creation installs the latest available version of Python in the environment, along with other necessary tools such as pip.

Installing romancal will also install several dependency packages, including but not limited to:

  • roman_datamodels
  • asdf
  • crds
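After installation, it can be useful to confirm which versions of these packages ended up in the environment. The following is a minimal sketch using only the Python standard library; the helper names installed_version and report_versions are illustrative and are not part of romancal.

```python
from importlib import metadata

# Package names taken from the list above; extend as needed.
PACKAGES = ("romancal", "roman_datamodels", "asdf", "crds")


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report_versions(packages=PACKAGES):
    """Map each package name to its installed version (None if missing)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Run inside the activated Conda environment, this prints one line per package and reports any that are missing.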

Users will also need access to calibration reference files for some pipeline steps and should see the CRDS for Reference Files article for additional information, including how to set up necessary environment variables.
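As a quick pre-flight check before running a pipeline step that needs reference files, a script can verify that the CRDS environment variables are defined. The sketch below assumes the variable names CRDS_PATH and CRDS_SERVER_URL, which are not stated in this article; the values to use are described in the CRDS for Reference Files article. The function name missing_crds_variables is illustrative.

```python
import os

# Assumed variable names (not stated in this article): CRDS_PATH for the
# local reference-file cache and CRDS_SERVER_URL for the CRDS server address.
REQUIRED_VARIABLES = ("CRDS_PATH", "CRDS_SERVER_URL")


def missing_crds_variables(environ=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARIABLES if not env.get(name)]


if __name__ == "__main__":
    missing = missing_crds_variables()
    if missing:
        print("Unset CRDS variables:", ", ".join(missing))
    else:
        print("CRDS environment variables are set.")
```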


Pipeline Descriptions

Here, we provide a high-level description of the individual pipelines used to produce Roman WFI data products. Detailed information about each pipeline is provided in separate articles. Users are also advised to consult the Data Levels and Products page for information on the formats and contents of the different WFI data products.

The WFI detectors are an updated version of the detectors used in JWST instruments; therefore, the philosophical starting point for the development of the WFI data pipelines is the JWST science data pipeline. Deviations from the JWST pipelines occur either when the JWST pipeline steps are inappropriate or insufficient for WFI data, or when Roman mission science accuracy requirements necessitate changes to the underlying algorithms.

Level 2 Exposure Pipeline

The Level 2 exposure pipeline contains the algorithms necessary to correct raw WFI ramps for instrumental effects and to collapse the ramps along the time axis into rate images suitable for scientific analysis. The exposure pipeline corrects for the following instrumental effects:

  • Signal induced by the readout electronics (e.g., 1/f noise) using reference pixels
  • Dark current
  • Classic non-linearity
  • Flat-field (variations in quantum efficiency)
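As a rough illustration of two of the corrections listed above, the sketch below subtracts a dark rate from an already collapsed rate image and then divides by a flat field. This is a simplification under stated assumptions, not the romancal implementation: the function name correct_rate_image, the input values, and the choice to mark pixels with a non-positive flat value as NaN are all hypothetical, and the actual pipeline steps may differ in order and detail.

```python
import math


def correct_rate_image(rate, dark_rate, flat):
    """Subtract a dark rate and divide by a flat field, pixel by pixel.

    All inputs are 2-D lists of equal shape. Pixels with a non-positive
    flat value are set to NaN (an illustrative choice).
    """
    corrected = []
    for rate_row, dark_row, flat_row in zip(rate, dark_rate, flat):
        row = []
        for r, d, f in zip(rate_row, dark_row, flat_row):
            # A non-positive flat value cannot be divided out meaningfully.
            row.append((r - d) / f if f > 0 else math.nan)
        corrected.append(row)
    return corrected


# Hypothetical 2x2 inputs, in arbitrary count-rate units.
rate = [[10.0, 12.0], [8.0, 9.0]]
dark = [[1.0, 1.0], [1.0, 1.0]]
flat = [[1.0, 1.1], [0.9, 0.0]]
result = correct_rate_image(rate, dark, flat)
```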

In addition, static bad pixels and poorly calibrated pixels identified in the calibration reference files are flagged in the data quality arrays. Rows that intersect the guide window on each detector are also flagged in the data quality arrays because the noise properties of those rows change. Finally, the following steps are performed:

  • A slope per pixel is fit up the ramp to produce a count rate per pixel
  • A WCS model, including the geometric distortion, for transformation from pixels to sky coordinates (and the inverse) is added to the metadata
  • Photometric calibration information, including zeropoints and nominal pixel area, is added to the metadata
  • Alignment to Gaia astrometric sources is performed to update the WCS model
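The first step above, fitting a slope up the ramp, can be illustrated with an ordinary least-squares fit for a single pixel. This is only a sketch of the basic idea: it is not the algorithm used by romancal, and the function name fit_ramp_slope, the 3-second read spacing, and the sample values are hypothetical.

```python
def fit_ramp_slope(times, counts):
    """Ordinary least-squares slope of counts versus time for one pixel."""
    n = len(times)
    if n != len(counts) or n < 2:
        raise ValueError("need at least two matching time/count samples")
    mean_t = sum(times) / n
    mean_c = sum(counts) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("all time samples are identical")
    sxy = sum((t - mean_t) * (c - mean_c) for t, c in zip(times, counts))
    return sxy / sxx


# Hypothetical ramp: six reads 3 s apart, accumulating 5 counts/s on top
# of a 100-count pedestal.
times = [3.0 * i for i in range(6)]
counts = [100.0 + 5.0 * t for t in times]
rate = fit_ramp_slope(times, counts)  # count rate in counts per second
```

The pedestal drops out of the fit, so only the accumulation rate is returned. A production ramp fit would also need to account for read noise, correlated samples, and flagged reads.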

Note that the input Level 1 files to the exposure pipeline are separated per WFI detector (i.e., there are 18 files for a full WFI exposure), and, similarly, the output Level 2 files from the exposure pipeline are also separated per detector.

Please see the Exposure Level Pipeline article for more information.

Level 3 Mosaic Pipeline

The Level 3 mosaic pipeline contains the modules necessary to combine the calibrated Level 2 data into mosaic images. This pipeline will:

  • Convert the data into megajanskys per steradian (MJy/sr);
  • Apply a background correction so that contiguous mosaics have matching backgrounds;
  • Perform outlier detection;
  • Combine and resample the input images into a mosaic Level 3 output. 
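The background matching, outlier rejection, and combination steps above can be sketched on a toy example. The code below assumes the input images have already been resampled onto a common pixel grid and flattened to 1-D lists, matches backgrounds by equalizing image medians, and rejects outliers with a clip based on the median absolute deviation (MAD). These are swapped-in, simplified techniques chosen for illustration, not the algorithms used by romancal, and all function names are hypothetical.

```python
from statistics import median


def match_backgrounds(images):
    """Shift each image so its median equals that of the first image."""
    reference = median(images[0])
    return [[v - (median(img) - reference) for v in img] for img in images]


def combine_pixels(values, threshold=5.0):
    """Average the values that lie within threshold * MAD of the median."""
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        # No measurable scatter around the median: use the median itself.
        return center
    kept = [v for v in values if abs(v - center) <= threshold * mad]
    return sum(kept) / len(kept)


def make_mosaic(images, threshold=5.0):
    """Combine aligned images pixel by pixel after background matching."""
    matched = match_backgrounds(images)
    return [combine_pixels(list(stack), threshold) for stack in zip(*matched)]


# Three aligned toy images: the second has a +1 background offset and the
# third has a single outlier (for example, a cosmic-ray hit) in the last pixel.
images = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 3.0, 4.0, 5.0, 6.0],
    [1.0, 2.0, 3.0, 4.0, 50.0],
]
mosaic = make_mosaic(images)
```

In this toy case the offset is removed and the outlier does not propagate into the combined image.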

This pipeline is where the images from the 18 WFI detectors are combined into a single image, and where multiple WFI fields of view (FOVs) are combined for greater depth and/or area.

Please see the Mosaic Level Pipeline article for more information.

Level 4 Catalog Pipeline

Information regarding the generation of WFI catalog products will be added in future RDox releases.


Automatic Data Processing

As WFI data are downlinked from the Roman spacecraft, they are automatically processed through several data pipelines, with only a few variations depending on the observation type. For example, the exposure level pipeline does not apply a flat-field correction to WFI spectroscopic observations, because the Science Support Center applies a wavelength-dependent flat-field correction in the spectroscopic pipeline. Where such special cases exist, they are described in detail in the articles dedicated to each pipeline. After the data products are generated, they are ingested into MAST and immediately made available to the community with no exclusive access period. See Accessing WFI Data for more information on how to retrieve WFI data products.
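The flat-field example above amounts to selecting a slightly different list of steps per observation type. The sketch below shows one way to express that kind of branching; the step names, the observation-type labels, and the function steps_for are hypothetical and chosen only for illustration, and they do not reflect how the automatic processing is actually configured.

```python
# Hypothetical step and observation-type names, for illustration only.
EXPOSURE_STEPS = (
    "reference_pixel",
    "dark_current",
    "linearity",
    "ramp_fit",
    "flat_field",
)
SPECTROSCOPIC_TYPES = {"grism", "prism"}


def steps_for(observation_type):
    """Return the exposure-level steps to run for an observation type."""
    if observation_type.lower() in SPECTROSCOPIC_TYPES:
        # The flat-field correction is left to the spectroscopic pipeline.
        return tuple(s for s in EXPOSURE_STEPS if s != "flat_field")
    return EXPOSURE_STEPS


imaging_steps = steps_for("imaging")
grism_steps = steps_for("grism")
```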




For additional questions not answered in this article, please contact the Roman Help Desk at STScI.




References

  1. The Roman Space Telescope Calibration Pipeline, Readthedocs maintained by STScI 2023, Latest version
  2. JWST Science Calibration Pipeline Overview, Last Update 29 Nov 2022, JWST User Documentation (JDox)


Latest Update

Updated article with new info on the exposure and mosaic pipeline.

Publication

Initial publication of the article.