Accessing WFI Data
The large data volume produced by Nancy Grace Roman Space Telescope observations requires new ways to store, host, and access data. This article summarizes plans for providing access to Roman Wide Field Instrument (WFI) data at STScI, including the introduction of cloud-based data access and analysis tools.
The Roman WFI Era of Big Data
The Roman Wide Field Instrument (WFI) will produce data at significantly higher rates than the JWST or Hubble instruments. The table below compares the Roman WFI against the JWST NIRCam instrument and the Hubble Wide Field Camera 3 (WFC3) IR channel. The WFI Data Format page contains additional details about the data formats at each stage of the calibration process. On average, the observatory will downlink up to approximately 11 terabits (Tb), or 1375 GB, of compressed WFI observation data per day, which requires the calibration, science analysis, and archive software pipelines to perform at scale.
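The daily downlink figure mixes bits and bytes, so a short conversion may help. The sketch below assumes decimal (SI) prefixes, i.e., 1 Tb = 1000 Gb, and 8 bits per byte; the function name is illustrative only and is not part of any Roman software.

```python
BITS_PER_BYTE = 8

def terabits_to_gigabytes(terabits: float) -> float:
    """Convert terabits to gigabytes, assuming decimal (SI) prefixes."""
    gigabits = terabits * 1000  # 1 Tb = 1000 Gb
    return gigabits / BITS_PER_BYTE  # 8 bits per byte

# Approximate average daily downlink of compressed WFI data quoted above.
daily_gb = terabits_to_gigabytes(11)
print(f"{daily_gb:.0f} GB per day")
```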
Table to Compare Hubble WFC3, JWST NIRCam, and Roman WFI
| | Hubble WFC3 | JWST NIRCam | Roman WFI |
|---|---|---|---|
| Number of pixels | 1 Megapixel | 34 Megapixel | 302 Megapixel |
| Field of view | 4.7 arcmin² | 9.7 arcmin² | 1035 arcmin² |
| Field of view image data size | 0.013 GB (13 MB) | 2 GB | 8 GB |
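To put the table in relative terms, the ratios between the Roman WFI and the other two instruments can be computed directly from the tabulated values. The following is a minimal sketch; the dictionary keys and the helper name `wfi_ratio` are chosen here for illustration.

```python
# Values copied from the comparison table above.
instruments = {
    "Hubble WFC3": {"megapixels": 1, "fov_arcmin2": 4.7, "image_gb": 0.013},
    "JWST NIRCam": {"megapixels": 34, "fov_arcmin2": 9.7, "image_gb": 2},
    "Roman WFI": {"megapixels": 302, "fov_arcmin2": 1035, "image_gb": 8},
}

def wfi_ratio(other: str, quantity: str) -> float:
    """Return the Roman WFI value divided by the other instrument's value."""
    return instruments["Roman WFI"][quantity] / instruments[other][quantity]

for name in ("Hubble WFC3", "JWST NIRCam"):
    for quantity in ("megapixels", "fov_arcmin2", "image_gb"):
        print(f"Roman WFI / {name}, {quantity}: {wfi_ratio(name, quantity):.1f}")
```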
The Roman data volume is so much larger than that of prior space telescope missions that a new paradigm of data access is needed. The Roman Science Platform (RSP) will enable users to run analysis scripts in the cloud, i.e., "bringing the compute to the data." This model provides rapid and cost-effective data access, reducing file transfer times and removing the need for sufficiently powerful, user-maintained computing hardware. At the same time, standard archive access will remain fully available.
How to Access Roman WFI Science Data
All Roman science data will immediately be publicly available and there will be no exclusive access period.
Users are encouraged to use the Roman Science Platform (RSP) for the analysis of WFI data. The RSP environment allows easy access to WFI data, along with analysis tools and computing resources. More information on the RSP will be added in future RDox releases.
Users will be able to access Roman data through the Roman Science Platform (RSP) and the Barbara A. Mikulski Archive for Space Telescopes (MAST). The RSP provides a rich computing environment, allowing broad, low-barrier access to data, compute, and software resources. The MAST website features a graphical user interface for querying data, such as searching or filtering files by keywords and positions. MAST will also enable WFI cutout services (similar to TESScut), a cross-mission source catalog-based search, and Virtual Observatory services. These services will also be enabled on the RSP, and the two access points (RSP and MAST) will be interconnected.
On the back-end, the Data Access Application Programming Interface (DAAPI) will provide unified public access to Roman data stored on-premises and in cloud endpoints. The DAAPI is not a single software stack, but rather a collection of services with consistent access patterns that will allow users to query Roman data. In principle, Roman data can also be directly downloaded using AWS URIs associated with each AWS Simple Storage Service (S3) object.
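As a rough illustration of the direct-download route, an S3 URI has the form `s3://<bucket>/<key>`, and most S3 clients need the bucket and key as separate arguments. The sketch below splits a URI using only the Python standard library. The bucket name and object key are hypothetical placeholders, not actual Roman archive locations.

```python
from urllib.parse import urlparse

def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split an S3 URI of the form s3://bucket/key into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not a valid S3 URI: {uri!r}")
    # The bucket is the network location; the key is the path without
    # its leading slash.
    return parsed.netloc, parsed.path.lstrip("/")

# Hypothetical URI for illustration only; real bucket names and keys
# would come from the archive or the data access services.
bucket, key = split_s3_uri("s3://example-roman-bucket/wfi/example_file.asdf")
print(bucket)
print(key)
```

The resulting bucket and key could then be passed to an S3 client of the user's choice to retrieve the object.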
How to Access Reference Files and Other Data
The Calibrated Engineering Database (EDB) hosts engineering mnemonics that describe the Roman observatory and instruments, such as telemetry and spacecraft environment data. The EDB will be accessible via both the MAST website and a Python-based query engine.
Further information about the Calibrated Engineering Database, including examples of how to access the information, will be added in future RDox releases.
The Roman Calibration Reference Data System (CRDS) hosts files necessary for calibration and data processing (e.g., reference files for dark subtraction). CRDS can retrieve a particular reference file if the software pipeline, instrument, and reference file parameters (known as the CRDS "context") are specified. Access to CRDS is provided via Python code that can be run on macOS and Linux, both inside and outside of AWS. More information on CRDS may be found in the article CRDS for Reference Files.
MAST also provides an interface for searching the Roman Calibration and Supplemental Data (CaSSI) archive, which includes copies of the calibration reference files from CRDS as well as other engineering files.
For additional questions not answered in this article, please contact the Roman Help Desk at STScI.