Data Levels and Products

Wide Field Instrument (WFI) science data products are categorized by data levels that indicate their calibration status and types of products. This article offers a comprehensive overview of the composition of WFI data products at each data level. 



Overview of Data Levels

The Roman Space Telescope (Roman) Wide Field Instrument (WFI) science data products will be available to users via the Roman Archive (see Accessing WFI Data for more information). Science data products from the Roman WFI are stored in Advanced Scientific Data Format (ASDF). The WFI files are categorized into five data levels, 1 – 5 (often abbreviated L1, L2, etc.). A sixth level (L0) refers to the raw, packetized data received from the WFI; however, L0 data are not publicly accessible. L1 – L4 products are generated by the Roman science centers, while L5 products are contributed by the community.

Changes to the technical details and other specifications presented here are anticipated as part of the development of the Roman data management system. In addition, some details may be omitted while topics are in active development; information on these topics will be added in future RDox releases.

Table with High-Level Summary of WFI Data Products

Data Processing Level [1] | File Suffix | Description
Level 0 | | Raw, packetized data from the telescope.
Level 1 | _uncal.asdf, _wcs.asdf, _gw.asdf, _face.asdf | Uncalibrated detector data. The following file suffixes correspond to:
  • _uncal - uncalibrated science data.
  • _wcs - Gaia-aligned world coordinate system (WCS) and definitive ephemeris information (when available).
  • _gw - uncalibrated guide window pixel data and telemetry.
  • _face - fine attitude correction estimate (FACE) telemetry. These values represent the corrections made by the Attitude Control System (ACS) in response to the measured positions of the guide stars during fine guiding.
Level 2 | _cal.asdf | Calibrated detector rate images. The following file suffix corresponds to:
  • _cal - calibrated detector rate images.
Level 3 | _coadd.asdf (additional TBD) | Re-pixelated data including, e.g., co-additions and mosaics. The following suffix corresponds to:
  • _coadd - co-added mosaics created by romancal. Does not apply to Galactic Bulge Time-Domain Survey or spectroscopic mode observations.
Level 4 | _cat.asdf, _segm.asdf (additional TBD) | Information extracted from pixel data including, e.g., source catalogs, 1-D spectra, and light curves. The following suffixes correspond to:
  • _cat - single-band source catalogs. Does not apply to Galactic Bulge Time-Domain Survey or spectroscopic mode observations.
  • _segm - segmentation maps. Does not apply to Galactic Bulge Time-Domain Survey or spectroscopic mode observations.
Level 5 | TBD

[1] All L1 – L5 products are accessible to Roman Archive users, but L0 data are restricted. See Accessing WFI Data for more information.
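
The suffix-to-level mapping in the summary table can be expressed as a small lookup. The following is a minimal sketch, not part of any Roman software: the names SUFFIX_LEVELS and data_level are hypothetical, and the mapping covers only the suffixes listed above, which are themselves subject to change.

```python
# Suffix-to-level lookup transcribed from the summary table above.
# SUFFIX_LEVELS and data_level are hypothetical names used for illustration.
SUFFIX_LEVELS = {
    "uncal": 1,
    "wcs": 1,
    "gw": 1,
    "face": 1,
    "cal": 2,
    "coadd": 3,
    "cat": 4,
    "segm": 4,
}


def data_level(file_name):
    """Return the data level implied by the suffix of a WFI file name."""
    # Drop the file extension if it is present.
    if file_name.endswith(".asdf"):
        file_name = file_name[: -len(".asdf")]
    # The suffix is the last underscore-separated token of the name.
    suffix = file_name.rsplit("_", 1)[-1]
    if suffix not in SUFFIX_LEVELS:
        raise ValueError(f"unrecognized WFI suffix: {suffix!r}")
    return SUFFIX_LEVELS[suffix]


level = data_level("r0012301008002013005_0005_wfi06_f184_cal.asdf")
```

Keeping the mapping in a single dictionary makes it straightforward to extend once the additional L3 and L4 suffixes marked TBD are defined.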



WFI File Naming Conventions

WFI file names consist of a root name and a suffix that denotes the data product type (see Overview of Data Levels above). Root names are a combination of several types of information (e.g., observing program and instrument information), and the letter "r" is always prefixed to WFI file root names to indicate that the data products are from Roman. Note that higher data levels (L3 and L4) as described above may contain one or more data products with the same root name but differing suffixes. 

File naming conventions for WFI data products are under development and subject to change.

L1 and L2 File Names

Both L1 and L2 WFI data products share common root names with the differing suffixes _uncal (L1) and _cal (L2). The root names of L1 and L2 files are also sometimes called the observation identifier (or "Observation ID") and consist of the following components (file metadata keywords for each component are shown in parentheses):

Table of L1 and L2 Root Name Components

Component | Format | Elements
Visit Identifier (visit_id) | PPPPPCCAAASSSOOOVVV | where:
  • PPPPP is the observing program number (example: 00123)
  • CC is the execution plan number (example: 01)
  • AAA is the pass number within the execution plan (example: 008)
  • SSS is the segment number within a pass (example: 002)
  • OOO is the observation number within a segment (example: 013)
  • VVV is the visit number within an observation (example: 005)
Exposure Identifier (exposure_id) | eeee | where:
  • eeee is the exposure number within a visit (example: 0005)

The root name components are separated by an underscore and prefixed with the letter "r", such that the final root name has the format 'rPPPPPCCAAASSSOOOVVV_eeee'. Using the examples in the table above, the resulting root name of an L1 or L2 observation would be 'r0012301008002013005_0005'. For L1 and L2 files, the root name is followed by the WFI detector number in the format wfiNN, where NN is a zero-padded integer between 01 and 18 (e.g., wfi06). The detector number is followed by the filter name with the filter prefix in lowercase (e.g., f158). Finally, the filter name is followed by the data product suffix and file extension. As an example, an L2 data product may have a complete file name like 'r0012301008002013005_0005_wfi06_f184_cal', followed by the file extension.
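
The naming scheme described above can be checked with a regular expression. The sketch below is illustrative only: the pattern, the group names, and the function parse_l1_l2_name are assumptions made for this example rather than an official parser, and the naming conventions themselves are still subject to change.

```python
import re

# Pattern assembled from the components described above. The group names are
# hypothetical labels chosen for this example, not metadata keywords.
L1_L2_NAME = re.compile(
    r"^r(?P<program>\d{5})(?P<execution_plan>\d{2})(?P<pass_>\d{3})"
    r"(?P<segment>\d{3})(?P<observation>\d{3})(?P<visit>\d{3})"
    r"_(?P<exposure>\d{4})"
    r"_wfi(?P<detector>\d{2})"
    r"_(?P<optical_element>[a-z0-9]+)"
    r"_(?P<suffix>uncal|cal)"
    r"(?:\.asdf)?$"
)


def parse_l1_l2_name(file_name):
    """Split an L1 or L2 file name into its components.

    Returns a dict of strings, except for "detector", which is an int.
    """
    match = L1_L2_NAME.match(file_name)
    if match is None:
        raise ValueError(f"not an L1/L2 file name: {file_name!r}")
    parts = match.groupdict()
    detector = int(parts["detector"])
    # Detector numbers are zero-padded integers between 01 and 18.
    if not 1 <= detector <= 18:
        raise ValueError(f"detector number out of range: {detector}")
    parts["detector"] = detector
    return parts


parts = parse_l1_l2_name("r0012301008002013005_0005_wfi06_f184_cal.asdf")
```

For the example file name, the parsed components match the values listed in the table above (program 00123, execution plan 01, pass 008, and so on).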

Guide Window File Names

Guide window files share the same root name components as the L1 and L2 files described above, with an additional component that denotes the guide star acquisition number. A guide window file root name may be represented as 'rPPPPPCCAAASSSOOOVVV_eeee_Q', where 'Q' is the guide star acquisition number and can have values in the range of 1–9 (inclusive). Using the previous L2 file name example, a complete guide window file name may be 'r0012301008002013005_0005_1_wfi06_f184_gw' followed by the file extension. Note that the guide window files are always archived as L1 data products, so the suffix will always be _gw. In addition to the guide window pixel data and telemetry, a fine attitude correction estimate (FACE) telemetry file is also available. The FACE telemetry consists of information about the corrections made by the Attitude Control System (ACS) in response to the measured positions of the guide stars during fine guiding. The FACE file uses the same root name as the guide window file with the suffix _face, and using the previous example would appear as 'r0012301008002013005_0005_1_wfi06_f184_face' followed by the file extension.
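
As a sketch of how the guide window and FACE file names relate to the observation identifier, the helper below assembles both names from their parts. The function name guide_window_names and its arguments are hypothetical, and the .asdf extension is taken from the summary table in the overview.

```python
def guide_window_names(observation_id, acquisition, detector, optical_element):
    """Return the (_gw, _face) file names for one guide star acquisition.

    observation_id is the L1/L2 root name, e.g. 'r0012301008002013005_0005'.
    """
    # The guide star acquisition number takes values 1-9 (inclusive).
    if not 1 <= acquisition <= 9:
        raise ValueError(f"acquisition number out of range: {acquisition}")
    # Detector numbers are zero-padded integers between 01 and 18.
    if not 1 <= detector <= 18:
        raise ValueError(f"detector number out of range: {detector}")
    root = f"{observation_id}_{acquisition}_wfi{detector:02d}_{optical_element}"
    return f"{root}_gw.asdf", f"{root}_face.asdf"


gw_name, face_name = guide_window_names("r0012301008002013005_0005", 1, 6, "f184")
```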


Detailed Descriptions of WFI Data Products

In addition to the descriptions below, the schema detailing the contents of the WFI science files may be found in the Roman Attribute Dictionary (RAD) repository on GitHub.

Level 1 - Uncalibrated Data

Science Ramps 

WFI L1 files are constructed from packetized L0 data. During this process, data are reoriented from the detector frame to the science coordinate frame (see Coordinate Systems article for more information on the WFI coordinate frames), and essential metadata are populated. Each L1 file contains a three-dimensional data cube representing a single, uncalibrated ramp exposure. Unlike charge-coupled devices (CCDs), infrared detectors enable non-destructive readouts during an exposure. This allows the signal in each pixel to be sampled repeatedly over time, producing a “ramp” that improves noise performance and facilitates cosmic ray rejection. Each detector is written to a separate L1 file, so a single WFI exposure yields 18 L1 files. The primary science data cube in each file has dimensions (N resultants, 4096 rows, 4096 columns), where N is defined by the multi-accumulation table used for the exposure. Resultants are analogous to groups in JWST, but offer greater flexibility: they do not need to be evenly spaced in time and can represent different numbers of averaged reads. In addition to the science cube, each L1 file includes a secondary data cube with dimensions (N resultants, 4096 rows, 128 columns), corresponding to samples of the 33rd amplifier’s virtual reference pixels (see Description of WFI for more information).

Table of L1 Science Data Specifications


Array | Description | Units | Type | Dimensions
data | Science data, including the border reference pixels. | Digital Number (DN) | uint16 | (N resultants, 4096 rows, 4096 columns)
amp33 | Amp 33 reference pixel data. | DN | uint16 | (N resultants, 4096 rows, 128 columns)
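
As a rough illustration of how resultants relate to individual reads, the sketch below averages consecutive reads of a single pixel into resultants of differing sizes. The function make_resultants and the list-of-counts representation of the multi-accumulation table are assumptions made for this example; the actual table format and the processing that forms resultants may differ.

```python
def make_resultants(reads, reads_per_resultant):
    """Average consecutive reads of one pixel into resultants.

    reads: per-read signal values (DN) for a single pixel, in time order.
    reads_per_resultant: number of consecutive reads averaged into each
        resultant (a hypothetical stand-in for a multi-accumulation table).
    """
    if any(count < 1 for count in reads_per_resultant):
        raise ValueError("each resultant needs at least one read")
    if sum(reads_per_resultant) > len(reads):
        raise ValueError("table requests more reads than are available")
    resultants = []
    start = 0
    for count in reads_per_resultant:
        chunk = reads[start:start + count]
        resultants.append(sum(chunk) / count)
        start += count
    return resultants


# Six reads grouped into three resultants of 1, 2, and 3 reads.
resultants = make_resultants([10, 20, 30, 40, 50, 60], [1, 2, 3])
# resultants == [10.0, 25.0, 50.0]
```

Because each resultant can average a different number of reads, the resultants in this example are not evenly spaced in time, which is the flexibility described above.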

Guide Window Data

A small subregion of each detector, referred to as a guide window, is configured to be read out at high cadence during an exposure. These guide windows are typically positioned on pre-selected bright stars and are used by the spacecraft's onboard systems for target acquisition and to maintain fine attitude control throughout an exposure.

Guide window operations differ from the full-frame science exposures. The guide window pixels are reset and then read multiple times in rapid succession. These fast readouts, referred to as reads, are grouped and averaged together to form a combined resultant. After a short pause, during which a portion of the detector outside the guide window is read, this cycle of reset and rapid readout is repeated. This results in many guide window resultants per full-frame science readout, and many more per science ramp (i.e., the full set of reads accumulated over the exposure duration).

The resulting guide window data are stored in separate L1 files from the main science ramps. These files are not propagated to higher-level calibrated data products, but are available in the Roman Archive.

Table of L1 Guide Window Specifications


Array | Description | Units | Type | Dimensions
signal_frames | Reconstituted and oriented signal frames. | DN | uint16 | (I frames, J combined resultants, K reads, Y rows, X columns)
pedestal_frames | Reconstituted and oriented pedestal frame GW images. | DN | uint16 | (I frames, J combined resultants, K reads, Y rows, X columns)
amp33 | Amp 33 reference pixel data. | DN | uint16 | (I frames, J combined resultants, K reads, Y rows, X columns)


Level 2 - Calibrated Exposures

WFI Level 2 (L2) data products are calibrated, two-dimensional rate images expressed in instrumental units of DN/s. These are generated from Level 1 (L1) ramp data by the Exposure Pipeline in romancal (see Exposure Level Pipeline for more details on pipeline algorithms). The core function of the Exposure Pipeline is to fit a constant count rate (i.e., a slope) to the accumulated signal in each pixel over time. This slope-fitting step converts the 3D ramp into a 2D rate image, with the best-fit slope for each pixel stored as the data array in the L2 file. The pipeline also propagates uncertainties through this fit. The variance arrays store the per-pixel variance on the fitted slope, with separate arrays for read noise and photon (Poisson) noise. These contributions are computed using standard error propagation techniques in conjunction with reference files for gain and read noise. The error array in the L2 file is the square root of the total variance, which includes an additional flat-field term that captures uncertainties associated with the intensity calibration.
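
To illustrate the slope-fitting idea, the following sketch fits an unweighted least-squares line to each pixel of a small synthetic ramp. It is a simplified stand-in and does not reproduce the fitting or weighting used by the Exposure Pipeline. The function name fit_rate and the assumption that each resultant has a single representative time are choices made for this example.

```python
import numpy as np


def fit_rate(ramp, times):
    """Unweighted least-squares slope for every pixel of a ramp.

    ramp: array of shape (N resultants, rows, columns), in DN.
    times: array of shape (N,), the assumed time of each resultant in seconds.
    Returns a (rows, columns) float32 rate image in DN/s.
    """
    ramp = np.asarray(ramp, dtype=np.float64)
    times = np.asarray(times, dtype=np.float64)
    if ramp.ndim != 3 or times.shape != (ramp.shape[0],):
        raise ValueError("expected ramp (N, rows, cols) and times (N,)")
    if ramp.shape[0] < 2:
        raise ValueError("at least two resultants are needed to fit a slope")
    # Centering the times lets the slope be written as
    # sum(dt * y) / sum(dt**2), because sum(dt) == 0.
    dt = times - times.mean()
    slope = np.tensordot(dt, ramp, axes=(0, 0)) / np.sum(dt**2)
    return slope.astype(np.float32)


# Synthetic ramp: constant 2.5 DN/s on top of a 100 DN pedestal,
# sampled at unevenly spaced times.
times = np.array([3.0, 9.0, 21.0, 45.0])
ramp = 100.0 + 2.5 * times[:, None, None] * np.ones((1, 2, 3))
rate = fit_rate(ramp, times)
```

The input times need not be evenly spaced, which matches the flexibility of resultants, and the output has one fewer dimension than the input, mirroring the 3D-to-2D conversion described above.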

The Exposure Pipeline also applies detector-level calibrations, including bad pixel masking, classic non-linearity correction, and dark current subtraction. It aligns the resulting rate image to Gaia astrometry and populates additional metadata, such as the conversion to physical surface brightness units. Note that L2 products from WFI spectroscopic exposures are not flat-fielded and do not include photometric calibration metadata. Wavelength-dependent flat-fielding and absolute flux calibration are instead performed later, during 1-D spectral extraction and processing by the Science Support Center (SSC) spectroscopic pipelines. 

As with Level 1, each WFI detector is stored in a separate L2 file, so a complete WFI exposure yields 18 L2 files. The science data arrays in these files have dimensions of (4088, 4088) pixels, reflecting the removal of a 4-pixel-wide border of reference pixels (refer to the Description of WFI for additional information on reference pixels). Reference pixel values from the L1 data are copied into auxiliary arrays in the L2 files, enabling inspection of the reference data used in the reference pixel correction step applied during processing.
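
The relationship between the 4096 × 4096 detector frame, the 4088 × 4088 science array, and the border reference pixel arrays can be sketched with array slicing. This is an illustration only: the function split_reference_border is hypothetical, and treating row 0 as the bottom edge and column 0 as the left edge is an assumption of this example rather than a statement about the science coordinate frame.

```python
import numpy as np

BORDER = 4  # width of the reference pixel border, in pixels


def split_reference_border(frame, border=BORDER):
    """Split a 2-D detector frame into active pixels and four border strips."""
    frame = np.asarray(frame)
    if frame.ndim != 2:
        raise ValueError("expected a 2-D (rows, columns) frame")
    return {
        # Active pixels: everything inside the border.
        "data": frame[border:-border, border:-border],
        # Left and right strips span all rows.
        "border_ref_pix_left": frame[:, :border],
        "border_ref_pix_right": frame[:, -border:],
        # Assumption: row 0 is the bottom edge of the frame.
        "border_ref_pix_bottom": frame[:border, :],
        "border_ref_pix_top": frame[-border:, :],
    }


frame = np.zeros((4096, 4096), dtype=np.uint16)
pieces = split_reference_border(frame)
shapes = {name: array.shape for name, array in pieces.items()}
```

With a full-size frame, the resulting shapes agree with the dimensions listed in the L2 table for the science array and the border reference pixel arrays.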

Data quality (DQ) flags are also added by the Exposure Level Pipeline. These are stored in a DQ array as a bitwise sum of individual flags, with each power-of-two value representing a specific detector artifact or calibration condition. Detailed flag definitions will be provided in future documentation updates.
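
Because each flag occupies its own power-of-two bit, individual conditions can be tested with a bitwise AND. The sketch below shows the mechanics only; the flag names and bit values in ExampleDQ are hypothetical placeholders, since the actual flag definitions are not yet given in this article.

```python
from enum import IntFlag


class ExampleDQ(IntFlag):
    """Hypothetical flag names and bit values, for illustration only."""
    EXAMPLE_BAD_PIXEL = 1
    EXAMPLE_SATURATED = 2
    EXAMPLE_COSMIC_RAY = 4


def has_flag(dq_value, flag):
    """Return True if the given flag bit is set in a DQ value."""
    return (dq_value & flag) != 0


def decode(dq_value):
    """List the names of all example flags set in a DQ value."""
    return [flag.name for flag in ExampleDQ if dq_value & flag]


# A DQ value of 5 is the sum of bits 1 and 4.
names = decode(5)
# names == ["EXAMPLE_BAD_PIXEL", "EXAMPLE_COSMIC_RAY"]
```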

Table of L2 Science Data Specifications


Array | Description | Units | Type | Dimensions
data | Science data, excludes reference pixels. | DN / sec | float32 | (4088 rows, 4088 columns)
err | Total error array. | DN / sec | float32 | (4088 rows, 4088 columns)
dq | Data quality array. | N/A | uint32 | (4088 rows, 4088 columns)
var_flat | Variance array associated with the flat field. | DN^2 / sec^2 | float32 | (4088 rows, 4088 columns)
var_poisson | Variance array associated with the Poisson noise. | DN^2 / sec^2 | float32 | (4088 rows, 4088 columns)
var_rnoise | Variance array associated with the read noise. | DN^2 / sec^2 | float32 | (4088 rows, 4088 columns)
amp33 | Amp 33 reference pixel data. | DN | uint16 | (4096 rows, 128 columns)
border_ref_pix_left | Original border reference pixels (left). | DN | uint16 | (4096 rows, 4 columns)
border_ref_pix_right | Original border reference pixels (right). | DN | uint16 | (4096 rows, 4 columns)
border_ref_pix_top | Original border reference pixels (top). | DN | uint16 | (4 rows, 4096 columns)
border_ref_pix_bottom | Original border reference pixels (bottom). | DN | uint16 | (4 rows, 4096 columns)
dq_border_ref_pix_left | Data quality for border reference pixels (left). | N/A | uint32 | (N resultants, 4096 rows, 4 columns)
dq_border_ref_pix_right | Data quality for border reference pixels (right). | N/A | uint32 | (N resultants, 4096 rows, 4 columns)
dq_border_ref_pix_top | Data quality for border reference pixels (top). | N/A | uint32 | (N resultants, 4 rows, 4096 columns)
dq_border_ref_pix_bottom | Data quality for border reference pixels (bottom). | N/A | uint32 | (N resultants, 4 rows, 4096 columns)
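
The relationship between the err array and the three variance arrays in the table can be sketched as follows. This example assumes that the variance components are independent and simply add before the square root is taken; that formula is an assumption of this sketch and is not stated explicitly in this article. The function name total_error is hypothetical.

```python
import numpy as np


def total_error(var_poisson, var_rnoise, var_flat):
    """Combine variance components (DN^2 / sec^2) into an error array (DN / sec).

    Assumes the three components are independent and sum to the total variance.
    """
    total_variance = (
        np.asarray(var_poisson, dtype=np.float64)
        + np.asarray(var_rnoise, dtype=np.float64)
        + np.asarray(var_flat, dtype=np.float64)
    )
    return np.sqrt(total_variance).astype(np.float32)


# Small stand-in arrays; real L2 arrays are 4088 x 4088.
err = total_error(
    np.full((2, 2), 9.0),   # Poisson variance
    np.full((2, 2), 16.0),  # read noise variance
    np.zeros((2, 2)),       # flat-field variance
)
```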


Level 3 - Mosaics 

L3 products are the co-additions or mosaics of L2 files. A single L3 product may be based on the input of one or more L2 products. During L3 product generation, the data are corrected for geometric distortion and are converted from instrumental units to physical surface brightness units of megajanskys per steradian (MJy / sr). Data quality information from the L2 file(s) is used to screen out various undesirable detector effects from the final L3 product. The size and shape of the L3 products depend on the settings used to create the final product. For more information on the L3 science data pipeline, see Mosaic Level Pipeline.

Information regarding specific L3 products produced by the Roman science centers will be added in future RDox releases.

Level 4 - Extracted Data

L4 products contain information that has been extracted from pixelated L2 and L3 data. These products include source catalogs, 1-D spectra, and light curves.

Information regarding specific L4 products produced by the Roman science centers will be added in a future RDox release.

Level 5 - User Contributed Products 

L5 data products are created by Roman users and made available to the community via the Roman Archive. As these products are heterogeneous in nature, documentation on collections of L5 products will be made available via MAST. See Accessing WFI Data for more information on L5 products.




For additional questions not answered in this article, please contact the Roman Help Desk.




Latest Update

Most recent RDox release
Publication

 

Initial publication of the article.