WFI Data Levels and Products
Wide Field Instrument (WFI) science data products are classified into data levels that indicate their calibration stage and the types of products they contain. These levels progress from Level 0 (L0) to Level 5 (L5), representing an increasing level of complexity. This article describes the composition of WFI data products at each data level.
Overview of Data Levels
The Roman Space Telescope (Roman) WFI science data products will be available to users via the Roman Archive (see Accessing WFI Data for more information). Science data products from the Roman WFI are stored in Advanced Scientific Data Format (ASDF). The WFI files are categorized into five data levels, 1 – 5 (often abbreviated L1, L2, etc.); see the Table with High-Level Summary of WFI Data Products for a summary. A sixth level (L0) refers to the raw, packetized data received from the WFI; however, L0 data are not publicly accessible. L1 – 4 products are generated by the Roman science centers, while L5 products are contributed by the community.
Changes to the technical details and other specifications presented here are anticipated as part of the development of the Roman data management system. In addition, some details may be omitted while topics are in active development; information on these topics will be added in future RDox releases.
Table with High-Level Summary of WFI Data Products
| Data Processing Level¹ | File Suffix | Description |
|---|---|---|
| Level 0 | N/A | Raw, packetized data from the telescope. |
| Level 1 | _uncal, _gw, _face | Uncalibrated detector data: science ramps (_uncal), guide window data (_gw), and fine attitude correction estimate telemetry (_face). |
| Level 2 | _cal | Calibrated detector rate images. |
| Level 3 | _coadd.asdf, _asdf.json | Re-pixelated data including, e.g., co-additions and mosaics. |
| Level 4 | _cat.parquet; additional TBD | Information extracted from pixel data including, e.g., source catalogs, 1-D spectra, and light curves. |
| Level 5 | N/A | User-contributed high-level data products. |

¹ All L1 – 5 products are accessible to Roman Archive users, but L0 data are restricted. See Accessing WFI Data for more information.
WFI File Naming Conventions
WFI file names consist of a root name and a suffix that denotes the data product type (see Overview of Data Levels above). Root names are a combination of several types of information (e.g., observing program and instrument information), and the letter "r" is always prefixed to WFI file root names to indicate that the data products are from Roman. Note that higher data levels (L3 and L4) as described above may contain one or more data products with the same root name but differing suffixes.
L1 and L2 File Names
Both L1 and L2 WFI data products share common root names with the differing suffixes _uncal (L1) and _cal (L2). The root names of L1 and L2 files are also sometimes called the observation identifier (or "Observation ID") and consist of the components (file metadata keywords for each component are shown in parentheses) described in the Table of L1 and L2 Root Name Components.
Table of L1 and L2 Root Name Components
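| Component | Format | Elements |
|---|---|---|
| Visit Identifier (visit_id) | PPPPPCCAAASSSOOOVVV | PPPPP: program number (e.g., 00123); CC: execution plan number (e.g., 01); AAA: pass number (e.g., 008); SSS: segment number (e.g., 002); OOO: observation number (e.g., 013); VVV: visit number (e.g., 005) |
| Exposure Identifier (exposure_id) | eeee | eeee: exposure number (e.g., 0005) |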
The root name components are separated by an underscore such that the final root name, including the "r" prefix, is of the format 'rPPPPPCCAAASSSOOOVVV_eeee'. Using the examples in the table above, the resulting root name shared by the L1 and L2 files of an observation would be 'r0012301008002013005_0005'. For L1 and L2 files, the root name is followed by the WFI detector number in the format wfiNN, where NN is a zero-padded integer between 01 and 18 (e.g., wfi06). The detector number is followed by the filter name with the filter prefix in lowercase (e.g., f158). Finally, the filter name is followed by the data product suffix and file extension. As an example, an L2 data product may have a complete file name like 'r0012301008002013005_0005_wfi06_f184_cal', followed by the file extension.
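The naming scheme above can be sketched as a small parser. This is an illustrative sketch, not an official tool: the function name, the regular expression, and the field names are hypothetical, and the split of the visit identifier into execution plan, pass, segment, observation, and visit follows the decomposition given for the L3 example file name later in this article.

```python
import re

# Hypothetical pattern for names such as
# 'r0012301008002013005_0005_wfi06_f184_cal' (file extension optional).
# Group names are illustrative, not official metadata keywords.
L1_L2_NAME = re.compile(
    r"^r(?P<program>\d{5})"
    r"(?P<execution_plan>\d{2})"
    r"(?P<pass_number>\d{3})"
    r"(?P<segment>\d{3})"
    r"(?P<observation>\d{3})"
    r"(?P<visit>\d{3})"
    r"_(?P<exposure>\d{4})"
    r"_wfi(?P<detector>\d{2})"
    r"_(?P<optical_element>[a-z0-9]+)"
    r"_(?P<suffix>uncal|cal)"
    r"(?:\.(?P<extension>\w+))?$"
)


def parse_l1_l2_name(name):
    """Split an L1/L2 file name into its components (strings, except detector)."""
    match = L1_L2_NAME.match(name)
    if match is None:
        raise ValueError(f"not an L1/L2 file name: {name!r}")
    fields = match.groupdict()
    detector = int(fields["detector"])
    # The detector number is a zero-padded integer between 01 and 18.
    if not 1 <= detector <= 18:
        raise ValueError(f"detector number out of range: {detector}")
    fields["detector"] = detector
    return fields


fields = parse_l1_l2_name("r0012301008002013005_0005_wfi06_f184_cal")
print(fields["program"], fields["exposure"], fields["detector"], fields["suffix"])
```

Keeping the components as strings preserves the zero padding, which matters when the parts are joined back into a file name.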
Guide Window File Names
Guide window files share the same root name components as the L1 and L2 files described above, with an additional component that denotes the guide star acquisition number. A guide window identifier is represented as 'PPPPPCCAAASSSOOOVVV_Q', where the first component before the underscore is the visit identifier and 'Q' is the guide star acquisition number, which can take integer values from 1 to 9 (inclusive). Using the previous L2 file name example, a complete guide window file name may be 'r0012301008002013005_0005_1_wfi06_f184_gw' followed by the file extension. Note that the guide window files are always archived as L1 data products, so the suffix will always be _gw. In addition to the guide window pixel data and telemetry, a fine attitude correction estimate (FACE) telemetry file is also available. The FACE telemetry describes the corrections made by the Attitude Control System (ACS) in response to the measured positions of the guide stars during fine guiding. The FACE file uses the same root name as the guide window file with the suffix _face; using the previous example, it would appear as 'r0012301008002013005_0005_1_wfi06_f184_face' followed by the file extension.
When the Coronagraph is observing, the WFI will be operated in parallel to facilitate guiding. In this scenario, the WFI visit ID does not match the Coronagraph visit ID, and the WFI guide window files will be named with the Coronagraph guide window ID.
Level 3 and 4 File Names
The file names of L3 and L4 products are made up of the observing program number, an identifier for prompt or data release products, a subset name, the skycell name, and the optical element. As all archived L3 and L4 data products (except those produced from the Galactic Bulge Time Domain Survey) use a skymap tessellation, the name of the skycell is a component of the file name. The complete file name of an L3 or L4 product can be represented as rPPPPP_prdr_subset_skycell_element. Note that in L4 products, the optical element is only listed for prompt products; in data releases, multiple filters will be combined in L4 data products. The Table of L3 and L4 File Name Components provides further details for all of these file name components.
Table of L3 and L4 File Name Components
| Component Name | Component from Example File Name | Description |
|---|---|---|
| Observing Program Number | PPPPP | Five-digit observing program number. Example value: 00186 |
| Prompt or Data Release Identifier | prdr | An indicator of whether the data product was generated as part of prompt processing or as part of a data release. Example value: p (prompt product) |
| Subset | subset | Name of the subset used to create the L3 or L4 product. Subsets allow a combination of an observing program and a prompt or data release product to have multiple versions corresponding to different combinations of input L2 images. In the Archive, prompt product subsets will all be based on visits, while data release product subsets will be based on either passes, full-depth stacks (of a single program), or arbitrary subsets. Example value: v01004007001012 (a visit-based subset) |
| Skycell Name | skycell | The name of a skycell is composed of three components: the projection region celestial coordinates, the skycell X position within the projection region, and the skycell Y position within the projection region. Example value: 10m6x2y50 |
| Optical Element Name | element | Name of the optical element (example: f184). |
Using the above information, a non-microlensing L3 file name from the Archive may be represented as, for example, r00186_p_v01004007001012_10m6x2y50_f184_coadd.asdf. In this example, this product is a prompt L3 image from program 00186 that combines exposures in the F184 element in execution plan 01, pass 004, segment 007, observation 001, visit 012 and represents the skymap tessellation at projection region center (RA, Dec) = (10.0, –6.0) degrees and skycell (X, Y) = (2, 50). The corresponding prompt L4 source catalog and segmentation maps would be named r00186_p_v01004007001012_10m6x2y50_f184_cat.parquet and r00186_p_v01004007001012_10m6x2y50_f184_segm.asdf, respectively.
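As a rough illustration of how the example name above decomposes, the following sketch parses an L3 or L4 file name and the skycell name inside it. The function name, patterns, and field names are hypothetical. The sketch also assumes that "m" marks a negative declination (as in the example), that "p" would mark a positive one, and that coordinates appear as whole degrees; only the "m" case is shown in this article.

```python
import re

# Hypothetical pattern for rPPPPP_prdr_subset_skycell_element_suffix.ext.
# The optical element is optional because L4 data release products omit it.
L3_L4_NAME = re.compile(
    r"^r(?P<program>\d{5})"
    r"_(?P<prdr>[a-z0-9]+)"
    r"_(?P<subset>[a-z0-9]+)"
    r"_(?P<skycell>\d+[pm]\d+x\d+y\d+)"
    r"_(?:(?P<element>[a-z0-9]+)_)?"
    r"(?P<suffix>[a-z]+)"
    r"\.(?P<extension>[a-z]+)$"
)

# Skycell name: projection region (RA, Dec), then skycell X and Y positions.
SKYCELL = re.compile(
    r"^(?P<ra>\d+)(?P<sign>[pm])(?P<dec>\d+)x(?P<x>\d+)y(?P<y>\d+)$"
)


def parse_l3_l4_name(name):
    """Split an L3/L4 file name into components and decode the skycell name."""
    match = L3_L4_NAME.match(name)
    if match is None:
        raise ValueError(f"not an L3/L4 file name: {name!r}")
    fields = match.groupdict()
    cell = SKYCELL.match(fields["skycell"])
    dec = int(cell["dec"])
    fields["ra_deg"] = int(cell["ra"])
    # Assumption: 'm' means minus; 'p' (plus) is not shown in the article.
    fields["dec_deg"] = -dec if cell["sign"] == "m" else dec
    fields["cell_x"] = int(cell["x"])
    fields["cell_y"] = int(cell["y"])
    return fields


info = parse_l3_l4_name("r00186_p_v01004007001012_10m6x2y50_f184_coadd.asdf")
print(info["program"], info["prdr"], info["subset"], info["element"])
print(info["ra_deg"], info["dec_deg"], info["cell_x"], info["cell_y"])
```

Subset names other than the visit-based form in the example may follow conventions not covered by this pattern.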
Detailed Descriptions of WFI Data Products
In addition to the descriptions below, the schema detailing the contents of the WFI science files may be found in the Roman Attribute Dictionary (RAD) repository on GitHub.
Level 1 - Uncalibrated Data
Science Ramps
WFI L1 files are constructed from packetized L0 data. During this process, data are reoriented from the detector frame to the science coordinate frame (see Coordinate Systems article for more information on the WFI coordinate frames), and essential metadata are populated. Each L1 file contains a three-dimensional data cube representing a single, uncalibrated ramp exposure. Unlike charge-coupled devices (CCDs), infrared detectors enable non-destructive readouts during an exposure. This allows the signal in each pixel to be sampled repeatedly over time, producing a “ramp” that improves noise performance and facilitates cosmic ray rejection. Each detector is written to a separate L1 file, resulting in a single WFI exposure generating 18 L1 files. The primary science data cube in each file has dimensions (N resultants, 4096 rows, 4096 columns), where N is defined by the multi-accumulation table used for the exposure. Resultants are analogous to groups in JWST, but offer greater flexibility: they do not need to be evenly spaced in time and can represent different numbers of averaged reads. In addition to the science cube, each L1 file includes a secondary data cube with dimensions (N resultants, 4096 rows, 128 columns), corresponding to samples of the 33rd amplifier’s virtual reference pixels (see Description of WFI for more information on reference pixels). A summary of this information is contained in the Table of L1 Science Data Specifications below.
Table of L1 Science Data Specifications
| Array | Description | Units | Type | Dimensions |
|---|---|---|---|---|
| data | Science data, including the border reference pixels. | Data Number (or Digital Number; DN) | uint16 | (N resultants, 4096 rows, 4096 columns) |
| amp33 | Amp 33 reference pixel data. | DN | uint16 | (N resultants, 4096 rows, 128 columns) |
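The array dimensions and types in the table translate directly into an uncompressed data volume. The sketch below is a back-of-the-envelope calculation only: the function name is hypothetical, the value N = 6 is an arbitrary illustration rather than a real multi-accumulation table, and metadata and any file-level compression are ignored.

```python
ROWS = 4096
SCIENCE_COLUMNS = 4096
AMP33_COLUMNS = 128
BYTES_PER_SAMPLE = 2   # uint16
N_DETECTORS = 18       # one L1 file per detector


def l1_array_bytes(n_resultants):
    """Uncompressed size in bytes of the L1 data and amp33 arrays."""
    data = n_resultants * ROWS * SCIENCE_COLUMNS * BYTES_PER_SAMPLE
    amp33 = n_resultants * ROWS * AMP33_COLUMNS * BYTES_PER_SAMPLE
    return {"data": data, "amp33": amp33, "total": data + amp33}


# Illustrative value of N only; the real N comes from the MA table.
sizes = l1_array_bytes(6)
mib = 1024 * 1024
print(f"per detector: {sizes['total'] / mib:.0f} MiB")
print(f"per exposure: {N_DETECTORS * sizes['total'] / mib:.0f} MiB")
```

Each resultant contributes 32 MiB of science data plus 1 MiB of amp33 data per detector, so the volume scales linearly with N.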
Guide Window Data
A small subregion of each detector, referred to as a guide window, is configured to be read out at high cadence during an exposure. These guide windows are typically positioned on pre-selected bright stars and are used by the spacecraft's onboard systems for target acquisition and to maintain fine attitude control throughout an exposure.
Guide window operations differ from the full-frame science exposures. The guide window pixels are reset and then read multiple times in rapid succession. These fast readouts, referred to as reads, are grouped and averaged together to form a combined resultant. After a short pause, during which a portion of the detector outside the guide window is read, this cycle of reset and rapid readout is repeated. This results in many guide window resultants per full-frame science readout, and many more per science ramp (i.e., the full set of reads accumulated over the exposure duration).
The resulting guide window data are stored in separate L1 files from the main science ramps. These files are not propagated to higher-level calibrated data products, but are available in the Archive.
The three stages of guiding are each stored in their own section: guide star acquisition (acq_data), spectral edge acquisition (edge_acq_data; WSM observations only), and the tracking phase (track_data). These sections contain all the information downlinked in the guide window (GW) data packets. The files also contain, in centroids, centroid information from the engineering database for each of the pedestal and signal resultant pairs. The amp 33 reference pixel data are stored in amp33. The Table of L1 Guide Window Specifications contains details of what information is available in the guide window data.
Fine Attitude Correction Estimate (FACE) Data
Level 2 - Calibrated Exposures
WFI Level 2 (L2) data products are calibrated, two-dimensional rate images expressed in instrumental units of DN/s. These are generated from Level 1 (L1) ramp data by the Exposure Pipeline in romancal (see Exposure Level Pipeline for more details on pipeline algorithms). The core function of the Exposure Pipeline is to fit a constant count rate (i.e., a slope) to the accumulated signal in each pixel over time. This slope-fitting step converts the 3D ramp into a 2D rate image, with the best-fit slope for each pixel stored as the data array in the L2 file. The pipeline also propagates uncertainties through this fit. The variance array stores the per-pixel variance on the fitted slope, accounting for both read noise and photon (Poisson) noise. These contributions are computed using standard error propagation techniques in conjunction with reference files for gain and read noise. The error array in the L2 file is the square root of the variance array, and includes an additional flat error term that captures uncertainties associated with the intensity calibration.
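To make the idea of slope fitting concrete, here is a minimal sketch for a single pixel. It uses an ordinary unweighted least-squares fit, which is a stand-in chosen for simplicity and is not claimed to be the algorithm romancal implements. The function names, the sample times, and the reading of the error array as the square root of the slope variance plus a flat-field variance term added in quadrature are all assumptions made for illustration.

```python
import math


def fit_rate(times, signal):
    """Unweighted least-squares slope (DN/s) of signal (DN) versus time (s).

    The times do not need to be evenly spaced, mirroring the flexibility
    of resultants.
    """
    n = len(times)
    if n < 2 or n != len(signal):
        raise ValueError("need two or more matching time and signal samples")
    t_mean = sum(times) / n
    s_mean = sum(signal) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (s - s_mean) for t, s in zip(times, signal))
    return sxy / sxx


def total_error(slope_variance, flat_variance):
    """Assumed combination: slope variance and flat term added in quadrature."""
    return math.sqrt(slope_variance + flat_variance)


# Hypothetical, unevenly spaced resultant times (s) and a noiseless ramp (DN).
times = [3.0, 9.0, 18.0, 33.0]
signal = [100.0 + 2.5 * t for t in times]
print(fit_rate(times, signal))
```

In a real ramp the samples are correlated because each read includes the charge accumulated in all earlier reads, which is one reason a production pipeline would weight the fit rather than treat the samples as independent.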
The Exposure Pipeline also applies detector-level calibrations, including bad pixel masking, classic non-linearity correction, and dark current subtraction. It aligns the resulting rate image to Gaia astrometry and populates additional metadata, such as the conversion to physical surface brightness units. Note that L2 products from WFI spectroscopic exposures are not flat-fielded and do not include photometric calibration metadata. Wavelength-dependent flat-fielding and absolute flux calibration are instead performed later, during 1-D spectral extraction and processing by the Science Support Center (SSC) spectroscopic pipelines.
As with Level 1, each WFI detector is stored in a separate L2 file, so a complete WFI exposure yields 18 L2 files. The science data arrays in these files have dimensions of (4088, 4088) pixels, reflecting the removal of a 4-pixel-wide border of reference pixels (refer to the Description of WFI for additional information on reference pixels). Reference pixel values from the L1 data are copied into auxiliary arrays in the L2 files, enabling inspection of the reference data used in the reference pixel correction step applied during processing. The Table of L2 Science Data Specifications below contains more information.
Data quality (DQ) flags are also added by the Exposure Level Pipeline. These are stored in a DQ array as a bitwise sum of individual flags, with each power-of-two value representing a specific detector artifact or calibration condition. Detailed flag definitions will be provided in future documentation updates.
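Because each DQ value is a bitwise sum of powers of two, it can be decomposed with plain integer operations. The sketch below is generic: the helper names are hypothetical, and no meanings are attached to individual bits because the flag definitions are not given in this article.

```python
def set_bits(dq_value):
    """Return the power-of-two components of a DQ value, smallest first."""
    if dq_value < 0:
        raise ValueError("DQ values are non-negative integers")
    return [
        1 << bit
        for bit in range(dq_value.bit_length())
        if dq_value & (1 << bit)
    ]


def has_flag(dq_value, flag):
    """True if the given power-of-two flag is set in dq_value."""
    return (dq_value & flag) != 0


# A pixel carrying three hypothetical flags with values 1, 4, and 1024.
dq = 1 + 4 + 1024
print(set_bits(dq))
print(has_flag(dq, 4), has_flag(dq, 2))
```

The same bitwise test applies element by element to a whole DQ array, which makes it possible to build a mask for any chosen combination of flags.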
Level 3 - Mosaics
L3 products are the co-additions or mosaics of L2 files. A single L3 product may be based on the input of one or more L2 products. During L3 product generation, the data are corrected for geometric distortion and are converted from instrumental units to physical surface brightness units of MegaJanskys per steradian (MJy / sr). Data quality information from the L2 file(s) is used to screen out various undesirable detector effects from the final L3 product. For more information on the L3 science data pipeline, see the Mosaic Level Pipeline article for all surveys except the Galactic Bulge Time Domain Survey (GBTDS) or the Galactic Bulge Survey Pipelines article for information about microlensing product generation.
L3 products (excluding microlensing products) retrieved from the Roman Archive are created using a skymap tessellation. With this tessellation, all L3 non-microlensing products have the same dimensions of (5000 rows, 5000 columns). This is true regardless of the overlap of the input L2 images and the output L3 product. The Mosaic Level Pipeline can be re-run to generate skycell-based L3 products or custom mosaics. If using custom mosaics, note that the image dimensions may vary from those listed in the Table of L3 Science Data Specifications (Excluding Microlensing Products) below.
Level 4 - Extracted Data
L4 products contain information that has been extracted from pixelated L2 and L3 data. These products include source catalogs, 1-D spectra, and light curves. Pixel-based L4 products, such as segmentation maps, and extracted spectra will be stored in ASDF format. Catalog information will be stored in parquet format.
Information regarding specific L4 products produced by the Roman science centers will be added in a future RDox release.
Level 5 - User Contributed Products
L5 data products are created by Roman users and made available to the community via the Roman Archive. As these products are heterogeneous in nature, documentation on collections of L5 products will be made available via MAST. See Accessing WFI Data for more information on L5 products.
For additional questions not answered in this article, please contact the Roman Help Desk.