Mosaic Level Pipeline

This article contains a user-oriented overview of the Mosaic Level Pipeline, one of the Roman pipelines developed at STScI. It explains how mosaics are created with the drizzle algorithm and offers basic guidance for users interested in performing custom processing of Roman data from Level 2 to Level 3.




Introduction to the Mosaic Pipeline 

The Mosaic Level Pipeline is responsible for combining individual detector exposures into mosaics. In this context, the term "mosaic" refers to a combination or stack of multiple exposures into a larger or deeper image. The Mosaic Pipeline combines groups of Roman Wide Field Instrument (WFI) images with an implementation of the drizzle algorithm (Fruchter & Hook 2002). Drizzle was originally developed to combine Hubble Space Telescope (HST) dithered images to mitigate the effects of undersampling, while also enabling the removal of image artifacts, cosmic rays, and other spurious detections. The drizzle algorithm combines and distributes the flux from multiple input pixels onto a new pixel grid, which is typically of a different size. This grid could cover a larger area on the sky, enabled by large dither offsets between individual exposures, or may provide finer spatial resolution through subpixel dithering that oversamples the pixels. Please refer to the DrizzlePac software documentation for more information.

For the JWST mission, STScI developed a code called resample, which is based on the drizzle algorithm. Please see the JWST pipeline readthedocs for code documentation or the JDox article for an introduction to the package. The resample code has been repurposed for Roman data and is one of the modules in the mosaic pipeline.

The mosaic pipeline takes Roman Level 2 (L2) data products, individual calibrated exposures, and processes them into Level 3 (L3) mosaic images. Please review the Roman Data Levels and Products article for a detailed explanation of the Roman data level naming conventions and the associated products created at each level.

While users can create custom mosaics tailored to their science goals, the automatic pipeline products stored in the Roman Archive follow a sky tessellation scheme that limits individual file size while supporting the creation of mosaics across large, contiguous regions of the sky. More information on the Roman tessellation is available in the Tessellation article.


WFI Prism and Grism Data

The mosaic pipeline is NOT applied to prism and grism data. The Science Support Center (SSC) at IPAC is responsible for the pipeline development and processing of higher level prism and grism data products.

Pipeline Step Descriptions

Each step of the mosaic pipeline is summarized below. 

Flux

Flux is the initial step in the STScI Roman mosaic pipeline. It scales the science data, error, and variance arrays by a scale factor that converts the input L2 flux units from digital numbers per second (DN/s) into megaJanskys per steradian (MJy/sr). Users can find the scale factor in the metadata under the name meta.photometry.conversion_megajanskys, which differs for each of the 18 WFI detectors. More information about the properties of the WFI detectors is available in the WFI Detectors article. Additionally, details on the Flux algorithm are available in the readthedocs code documentation. Essentially, the data and error arrays are multiplied by the flux scale factor, while the variance arrays are multiplied by the square of the flux scale factor.

This step takes a single input image or an association table, and neither requires nor accepts other user-supplied arguments. The input image can be the full path of an ASDF file on disk or an image already loaded as a Roman datamodel.
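The scaling performed by this step is simple arithmetic. The following is a minimal sketch that uses NumPy arrays in place of a Roman datamodel; the function name, the array names, and the numerical scale factor are illustrative assumptions and not the romancal implementation.

```python
import numpy as np


def apply_flux_scale(data, err, variances, scale):
    """Return copies of the inputs converted from DN/s to MJy/sr.

    `scale` stands in for meta.photometry.conversion_megajanskys.
    """
    scaled_data = data * scale
    scaled_err = err * scale
    # Variances carry squared units, so they scale by the square of the factor.
    scaled_var = {name: var * scale**2 for name, var in variances.items()}
    return scaled_data, scaled_err, scaled_var


# Hypothetical values, chosen only for illustration.
data = np.array([[10.0, 20.0], [30.0, 40.0]])  # DN/s
err = np.array([[1.0, 2.0], [3.0, 4.0]])       # DN/s
variances = {"var_rnoise": err**2}             # (DN/s)^2
scale = 0.5                                    # assumed MJy/sr per DN/s

flux, flux_err, flux_var = apply_flux_scale(data, err, variances, scale)
```

Because the error scales linearly and the variance quadratically, the scaled variance remains equal to the square of the scaled error in this example.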

SkyMatch

The SkyMatch step measures the sky level in the input images and allows for background manipulation in the output mosaics. Note that the term "sky" in this context refers to any background signal, which may include true sky emission as well as thermal emission from the spacecraft. SkyMatch offers multiple methods for sky measurements. The two most commonly used are: 1) a local sky measurement, which calculates the background value of each image individually, and 2) a matching measurement, which computes offsets between overlapping regions of two or more images. SkyMatch can also subtract a local background or apply the calculated offset to groups of images so that the backgrounds are consistent.

A complete list of all arguments accepted by this step is available on readthedocs. For convenience, the three most commonly used are provided in the table below.

Table of Common SkyMatch Step Arguments

| Argument | Type | Default | Options | Explanation |
|---|---|---|---|---|
| skymethod | string | Match | Local, Global, Match, Global+Match | Specifies the algorithm SkyMatch uses to calculate the sky background. |
| match_down | boolean | True | True, False | This option selects between the minimum and maximum sky value when Match or Global+Match is chosen for skymethod. If set to True, the matching algorithm uses the lowest sky value; if False, it uses the highest sky value. |
| dqbits | string | ~DO_NOT_USE+NON_SCIENCE | A table with Roman data quality (DQ) values is provided on readthedocs. | Indicates the DQ bits that should be taken into account when calculating the sky background. The default instructs SkyMatch to NOT use pixels flagged with the DO_NOT_USE and NON_SCIENCE values. |
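To illustrate how a string such as ~DO_NOT_USE+NON_SCIENCE can be interpreted, here is a minimal sketch in plain Python. The flag values in DQ_FLAGS are an assumed subset of the Roman DQ table (the authoritative values are in the readthedocs table), and the helper name is_usable is hypothetical; the pipeline uses its own bit-flag parsing.

```python
# Assumed subset of Roman DQ flag values; check the readthedocs DQ table.
DQ_FLAGS = {"DO_NOT_USE": 1, "SATURATED": 2, "JUMP_DET": 4, "NON_SCIENCE": 512}


def is_usable(dq_value, bits="~DO_NOT_USE+NON_SCIENCE"):
    """Decide whether a pixel with the given DQ value should be used."""
    inverted = bits.startswith("~")
    mask = 0
    for name in bits.lstrip("~").split("+"):
        mask |= DQ_FLAGS[name.strip()]
    if inverted:
        # With a leading tilde, only the listed bits make a pixel unusable.
        return (dq_value & mask) == 0
    # Without a tilde, the listed bits are tolerated; any other set bit rejects.
    return (dq_value & ~mask) == 0


print(is_usable(0))    # unflagged pixel
print(is_usable(512))  # NON_SCIENCE pixel
print(is_usable(4))    # JUMP_DET only
```

Under these assumptions, a pixel flagged only with JUMP_DET is still used by the default setting, while any pixel carrying DO_NOT_USE or NON_SCIENCE is excluded.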

The algorithms corresponding to each skymethod option listed in the Table of Common SkyMatch Step Arguments are summarized below:

  • Local - computes the sky background value for each input image using a simple median, after excluding bad pixels based on DQ flags.
  • Global - computes sky background values for all input images using the Local method, then assigns the minimum of those results to all images. 
  • Match - calculates a correction value for each image that, when applied, minimizes the mismatch between all pairs of overlapping images in a least-squares sense. 
  • Global+Match - first applies the Global method to define a baseline sky level, then uses the Match method to equalize sky values across images. 
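The methods listed above can be sketched with NumPy as follows. This is a simplified illustration, not the romancal code: the function names are hypothetical, the pairwise sky differences are assumed to have been measured already in the overlap regions, and the final shift to a zero minimum is only one way to mimic the match_down=True behavior.

```python
import numpy as np


def local_sky(image, bad_mask):
    """Local: median of the pixels not excluded by the DQ-based mask."""
    return float(np.median(image[~bad_mask]))


def global_sky(images, bad_masks):
    """Global: minimum of the Local values, assigned to every image."""
    return min(local_sky(im, m) for im, m in zip(images, bad_masks))


def match_offsets(pair_diffs, n_images):
    """Match: least-squares offsets from pairwise sky differences.

    pair_diffs maps (i, j) to the measured difference sky_i - sky_j.
    """
    rows, rhs = [], []
    for (i, j), diff in pair_diffs.items():
        row = np.zeros(n_images)
        row[i], row[j] = 1.0, -1.0
        rows.append(row)
        rhs.append(diff)
    offsets, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    # Offsets are defined only up to a constant; reference them to the lowest.
    return offsets - offsets.min()


# Three hypothetical images with constant sky levels of 1, 3, and 2.
images = [np.full((4, 4), level) for level in (1.0, 3.0, 2.0)]
masks = [np.zeros((4, 4), dtype=bool) for _ in images]
images[0][0, 0] = 1000.0  # a bad pixel ...
masks[0][0, 0] = True     # ... excluded through the mask

local = [local_sky(im, m) for im, m in zip(images, masks)]
lowest = global_sky(images, masks)
diffs = {(0, 1): local[0] - local[1],
         (1, 2): local[1] - local[2],
         (0, 2): local[0] - local[2]}
offsets = match_offsets(diffs, len(images))
```

Subtracting each offset from its image brings all three backgrounds to the same level in this toy case.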

Additional details, including example results for each algorithm, can be found in the developer documentation on readthedocs.

Outlier Detection 

The next step in the mosaic pipeline is outlier detection. The mosaic pipeline is often used to combine many images, which enables the detection of outliers that may not have been identified during the ramp fitting step of the exposure level pipeline. The outlier detection code was originally developed for JWST and has been adapted for Roman data. The algorithm works by creating a median image from all input images and performing a statistical comparison between this median and the individual input images to identify outliers. A full description of the algorithm is available on readthedocs.

This step offers a large number of arguments to customize the outlier detection. See the Argument page on readthedocs for a complete list.

The most commonly used argument is weight_type, which specifies how the data should be weighted when creating the median image. The available options for weight_type are:

  • exptime (default) - uses the exposure time to weight the pixels,
  • ivm - stands for inverse-variance map, which can be provided by the user or generated by the pipeline,
  • None - indicates that no weight image is used; instead, a mask is generated based on the DQ arrays of the input calibrated rate (L2) images.

More information about the different types of weights can be found on readthedocs. For background on how weights are used in HST AstroDrizzle processing, refer to the section titled "A Note about Photometry and Weights in AstroDrizzle" in the DrizzlePac Handbook.

Pixels identified as outliers in this step are flagged to ensure they are excluded from further processing. 
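The median-comparison idea behind this step can be sketched as follows. This is a deliberately simplified illustration: it assumes the input images are already aligned pixel for pixel, uses a plain n-sigma threshold on the per-pixel error, and assumes DO_NOT_USE has the value 1. The function name and threshold are hypothetical, and the full algorithm is described on readthedocs.

```python
import numpy as np

DO_NOT_USE = np.uint32(1)  # assumed flag value; see the Roman DQ table


def flag_outliers(stack, err_stack, dq_stack, nsigma=5.0):
    """Flag pixels that deviate strongly from the median of the stack.

    stack, err_stack, dq_stack have shape (n_images, ny, nx).
    """
    median = np.nanmedian(stack, axis=0)
    outliers = np.abs(stack - median) > nsigma * err_stack
    dq = dq_stack.copy()
    dq[outliers] |= DO_NOT_USE
    return median, dq


# Five hypothetical aligned images with one cosmic-ray-like spike.
stack = np.full((5, 3, 3), 10.0)
stack[2, 1, 1] = 100.0
err_stack = np.ones_like(stack)
dq_stack = np.zeros(stack.shape, dtype=np.uint32)

median, dq = flag_outliers(stack, err_stack, dq_stack)
```

Only the spiked pixel in the third image is flagged; the median image is unaffected by it because four of the five samples at that position agree.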


Please note that outlier detection can mistakenly flag moving objects and/or highly variable sources as outliers in the DQ array, potentially excluding them from subsequent processing steps. It is important to adjust the outlier rejection input arguments according to the specific science goals. 

Resample 

The actual image combination happens in the resample step, which combines groups of WFI images using an implementation of the drizzle algorithm. Users can supply an extensive list of parameters to customize this step. The automated L3 outputs produced by the mosaic pipeline will use parameters defined in the CRDS reference file, which, as of the publication of this article, is still under development. When the reference file does not override a specific argument, the resample step uses the default value for that argument. A table of the most commonly used arguments is provided below, along with guidance on how users might modify them.

Table of Common Resample Step Arguments

| Argument | Type | Default | Options | Explanation |
|---|---|---|---|---|
| pixfrac | float | 1.0 | 0.0 to 1.0 | The scale fraction by which the original pixel is reduced in size before being mapped to the output image. The default value of 1.0 indicates no reduction, while a value of 0.5 will halve the pixel size. |
| pixel_scale_ratio | float | 1.0 | 0.0 to 1.0 | The ratio of the pixel scale of the output image to that of the original pixels. A default value of 1.0 preserves the native WFI pixel scale, while a value of 0.5 indicates each input pixel is mapped onto four output pixels. This is often used to mitigate the effects of undersampling, assuming the exposure depth is sufficient to support the recovery of the information. |
| weight_type | string | ivm | ivm, exptime, none | The ivm option uses the VAR_RDNOISE array to generate an inverse-variance map. If no VAR_RDNOISE array exists, then all pixels are given equal weighting. The exptime option weights the pixels by the exposure time of each image. |
| good_bits | string | ~DO_NOT_USE+NON_SCIENCE | A table with Roman DQ values is provided on readthedocs. | This string indicates which DQ bits are considered good when processing the images. By default, all pixels not flagged as DO_NOT_USE or NON_SCIENCE are considered "good". The tilde (~) at the beginning indicates bits that should not be used as "good". |

The output image of resample is the L3 data product. 
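The effect of pixfrac and pixel_scale_ratio on the output grid can be worked out with a few lines of arithmetic. The sketch below assumes an approximate native WFI pixel scale of 0.11 arcsec per pixel; the function name and the returned keys are hypothetical and serve only to illustrate the relationships described in the table.

```python
NATIVE_PIXEL_SCALE = 0.11  # arcsec/pixel, approximate WFI value (assumption)


def drizzle_geometry(pixfrac=1.0, pixel_scale_ratio=1.0,
                     native_scale=NATIVE_PIXEL_SCALE):
    """Summarize the grid geometry implied by two resample arguments."""
    return {
        # Pixel scale of the output mosaic.
        "output_scale": native_scale * pixel_scale_ratio,
        # Side length of the shrunken input pixel ("drop") on the sky.
        "drop_size": native_scale * pixfrac,
        # Number of output pixels covering the area of one input pixel.
        "outputs_per_input": 1.0 / pixel_scale_ratio**2,
    }


geometry = drizzle_geometry(pixfrac=0.5, pixel_scale_ratio=0.5)
```

With both arguments set to 0.5, the output pixels are half the native linear size, so each input pixel covers four output pixels.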

Finally, several utility functions are available to work with the resulting L3 data products. These include:

  • Building a model of the weighting used by the resample step;
  • Creating a pixel grid map of the transformation between the input and output world coordinate system (WCS);
  • Identifying which images contributed flux to each pixel in the output mosaic, based on the context image; and
  • Constructing a function that translates the input image coordinates into the output image coordinates. 

A detailed description of all these utility functions can be found in the readthedocs documentation.
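As an illustration of how a context image identifies contributing inputs, the sketch below decodes the values of a single output pixel. It assumes the bit-packed convention commonly used by drizzle-based context images, in which each plane is a 32-bit integer and bit k of plane p marks input image 32*p + k. The function name is hypothetical; the romancal utilities should be preferred for real data.

```python
def contributing_images(context_planes):
    """Return the indices of the input images encoded for one output pixel.

    context_planes holds one 32-bit integer per context plane.
    """
    indices = []
    for plane, value in enumerate(context_planes):
        for bit in range(32):
            if value & (1 << bit):
                indices.append(32 * plane + bit)
    return indices


# Hypothetical pixel: images 0 and 2 in the first plane, image 33 in the second.
example = contributing_images([0b101, 0b10])
print(example)  # [0, 2, 33]
```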

Source Catalog

The source catalog step (romancal.source_catalog.SourceCatalogStep()) uses the photutils package to detect, separate (deblend), and perform photometric and morphological measurements of astronomical sources in an image. Within this step, if the option psf_fit = True is selected, PSF models generated by STPSF are fit to stars detected via DAOStarFinder.

The source catalog step produces two output data products: 1) a source catalog and 2) a segmentation map. Both products are stored as ASDF files (see Data Levels and Products for more information). The source catalog contains photometric and morphological information. The morphology measurements are based on 2-D image moments within the source segment. Additional details about the source catalog can be found in the readthedocs. This step can be executed on L2 or co-added mosaic Level 3 (L3) images, and the output data products are considered Level 4 (L4) products.
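To make the phrase "2-D image moments within the source segment" concrete, here is a minimal NumPy sketch that computes a flux-weighted centroid and second-order central moments for one segment. The function name and the toy image are illustrative assumptions; the pipeline relies on photutils for the actual measurements.

```python
import numpy as np


def segment_moments(image, segment_mask):
    """Flux-weighted centroid and shape of the pixels in one segment."""
    y, x = np.nonzero(segment_mask)
    flux = image[y, x]
    total = flux.sum()
    # First-order moments give the centroid.
    xc = (x * flux).sum() / total
    yc = (y * flux).sum() / total
    # Second-order central moments form a 2x2 covariance matrix.
    mxx = ((x - xc) ** 2 * flux).sum() / total
    myy = ((y - yc) ** 2 * flux).sum() / total
    mxy = ((x - xc) * (y - yc) * flux).sum() / total
    covariance = np.array([[mxx, mxy], [mxy, myy]])
    # Its eigenvalues (ascending) are the variances along the principal axes.
    eigenvalues = np.clip(np.linalg.eigvalsh(covariance), 0.0, None)
    semiminor, semimajor = np.sqrt(eigenvalues)
    return xc, yc, semimajor, semiminor


# Hypothetical source elongated along the x axis.
image = np.zeros((7, 7))
image[3, 1:6] = [1.0, 2.0, 4.0, 2.0, 1.0]
image[2, 3] = 1.0
image[4, 3] = 1.0

xc, yc, semimajor, semiminor = segment_moments(image, image > 0)
```

The ratio of the two axis lengths gives a simple measure of elongation for the segment.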

Running the Mosaic Pipeline

To run the mosaic pipeline, users need:

  • An environment that contains the mosaic pipeline, 
  • WFI calibrated images (L2, exposure level products), and 
  • An association file that contains a properly formatted list of the L2 input images. 

The environment can be on a local system or on the Roman Research Nexus. Detailed installation instructions can be found on the romancal readthedocs installation page. romancal is the package name for the WFI imaging calibration pipelines developed at STScI.

The mosaic pipeline processes L2 images and must be given either the path to a single L2 image or a JSON association table that contains a list of all the L2 input images. In the examples below, the association table is the same example file created in the Associations article. The Roman simulated data and resultant association table can be viewed and downloaded on Box at this link.
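For orientation, the general shape of an association can be sketched with the standard json module. The keys shown here (products, members, expname, exptype) follow the layout commonly used by STScI association files, but the set of keys, the identifier values, and the file names are assumptions for illustration only; the Associations article describes the supported way to generate a valid file.

```python
import json


def make_mosaic_association(product_name, exposure_files):
    """Build a minimal association-like dictionary (illustrative only)."""
    return {
        "asn_type": "image",  # assumed value
        "asn_id": "o001",     # hypothetical identifier
        "products": [
            {
                "name": product_name,
                "members": [
                    {"expname": name, "exptype": "science"}
                    for name in exposure_files
                ],
            }
        ],
    }


# Hypothetical L2 file names.
exposures = [f"exposure_{i:02d}_cal.asdf" for i in range(1, 5)]
association = make_mosaic_association("example_mosaic", exposures)
association_text = json.dumps(association, indent=2)
```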

Currently, the command line interface is recommended for running the pipeline, but other methods of calling the pipeline are described in the readthedocs, including details on running individual pipeline steps.

Run via the Command Line

The command line call to run romancal is strun, and the mosaic pipeline can be called by the class romancal.pipeline.MosaicPipeline or the alias roman_mos. The alias is used below.

Basic Mosaic Pipeline Command
strun roman_mos r00000-o9999_RDoxExampleName_ATYPE_001_asn.json

Run with User-Supplied Arguments

Now user-supplied arguments are added: one for the overall pipeline and one that is used exclusively by the resample step.

Mosaic Pipeline with User-Supplied Arguments
strun roman_mos r00000-o9999_RDoxExampleName_ATYPE_001_asn.json --output_file=RDox_Example_Output --steps.resample.weight_type=exptime

This command produces an output mosaic image that combines the four example images in the ASN file and names it RDox_Example_Output_i2d.asdf. It uses exptime for the weight_type when performing the resample step.

Run with Specific Steps Turned Off

Finally, use the format below to turn off a specific step. This can be helpful for specific science goals, such as skipping the Outlier Detection step when searching for variable objects, or skipping the Source Catalog step if the L4 data products are not required.

Mosaic Pipeline with Source Catalog Turned Off
strun --disable-crds-steppars roman_mos r00000-o9999_RDoxExampleName_ATYPE_001_asn.json --steps.sourcecatalog.skip=True

The --disable-crds-steppars option prevents the mosaic pipeline from connecting to CRDS and using the reference files hosted there to populate the mosaic pipeline step parameters. This option is helpful for users who plan to provide their own arguments. The --steps.sourcecatalog.skip=True argument skips the source catalog step.
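When many runs with different arguments are needed, the command lines shown above can be assembled programmatically. The helper below is a hypothetical convenience function, not part of romancal; it only builds the argument list and reproduces the argument patterns used in this article.

```python
import shlex


def build_mosaic_command(asn_file, output_file=None, step_args=None,
                         skip_steps=(), disable_crds_steppars=False):
    """Assemble an strun command for the mosaic pipeline as a list."""
    command = ["strun"]
    if disable_crds_steppars:
        command.append("--disable-crds-steppars")
    command += ["roman_mos", asn_file]
    if output_file is not None:
        command.append(f"--output_file={output_file}")
    # Step-specific arguments use the --steps.<step>.<argument>=<value> form.
    for step, params in (step_args or {}).items():
        for name, value in params.items():
            command.append(f"--steps.{step}.{name}={value}")
    for step in skip_steps:
        command.append(f"--steps.{step}.skip=True")
    return command


ASN = "r00000-o9999_RDoxExampleName_ATYPE_001_asn.json"
with_args = build_mosaic_command(
    ASN,
    output_file="RDox_Example_Output",
    step_args={"resample": {"weight_type": "exptime"}},
)
no_catalog = build_mosaic_command(
    ASN, skip_steps=["sourcecatalog"], disable_crds_steppars=True
)
print(shlex.join(with_args))
print(shlex.join(no_catalog))
```

In an environment where romancal is installed, either list could be passed to subprocess.run(command, check=True) to execute the pipeline.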

If the mosaic pipeline does not complete successfully, please try disabling the CRDS connection and the Source Catalog step as shown above. 


The STScI Roman pipelines, including the mosaic pipeline, are developed and maintained by the Roman Science Operations Center at STScI. The code is publicly available on GitHub.




For additional questions not answered in this article, please contact the Roman Help Desk.




References




Latest Update

Updated for romancal version 0.18.0

Publication

Initial publication of the article.