Observational astrophysics has produced many large datasets, but they are often distributed in data formats that are inefficient for both storage and large-scale analysis.
Here we publicly release useful observational datasets converted to single-file (monolithic) HDF5 and/or Zarr formats.
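Files can be downloaded from the command line, for example with the following command (the -c option resumes an interrupted transfer):
wget -c https://www.tng-project.org/path/to/file.hdf5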
The entire Gaia DR3 catalog. Converted directly from the raw CSV files available on the ESA Gaia Archive. The full dataset is split into two files; the first contains the fields most users will be interested in.
Every dataset has a description attribute that includes its physical units.
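As a minimal sketch of how these descriptions can be inspected, the following uses h5py to walk a file and collect the description attribute of every dataset. The attribute name "description" is assumed from the text above, and the function name list_descriptions is hypothetical; adjust both to match the file you actually downloaded.

```python
import h5py


def list_descriptions(path, attr="description"):
    """Return {dataset_name: description} for every dataset in an HDF5 file.

    `attr` is the assumed attribute name; datasets lacking it map to None.
    """
    found = {}

    def visit(name, obj):
        # visititems() calls this for every group and dataset in the file.
        if isinstance(obj, h5py.Dataset):
            value = obj.attrs.get(attr)
            # Depending on how it was written, a string attribute may come back as bytes.
            if isinstance(value, bytes):
                value = value.decode("utf-8")
            found[name] = value

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return found
```

Calling list_descriptions("file.hdf5") on a downloaded file (the file name here is a placeholder) and printing the resulting dictionary gives a quick overview of the available fields and their units without loading any of the data itself.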
All spectra taken by SDSS/BOSS (DR17). Converted from the original FITS files available on the SDSS Science Archive Server into a single HDF5 file.
All spectra have been placed on a common wavelength grid of 4700 points, spanning 3531 to 10324 angstroms.
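To illustrate what placing spectra on a common grid involves, here is a minimal sketch that resamples a single spectrum onto a shared 4700-point grid by linear interpolation. The grid limits and size are taken from the text above, but the grid spacing is not stated there, so the logarithmic spacing used here is an assumption. The function names are hypothetical, and the released file may have been built with a different method.

```python
import numpy as np

# Grid limits and size quoted in the text; the log spacing is an assumption.
N_PIX = 4700
WAVE_MIN, WAVE_MAX = 3531.0, 10324.0  # angstroms


def common_grid(n=N_PIX, lo=WAVE_MIN, hi=WAVE_MAX):
    """Return n wavelengths from lo to hi, uniformly spaced in log(wavelength)."""
    return np.geomspace(lo, hi, n)


def resample(wave, flux, grid):
    """Linearly interpolate flux(wave) onto grid; NaN outside the observed range."""
    wave = np.asarray(wave, dtype=float)
    flux = np.asarray(flux, dtype=float)
    order = np.argsort(wave)  # np.interp requires increasing x values
    return np.interp(grid, wave[order], flux[order], left=np.nan, right=np.nan)
```

Plain interpolation like this does not conserve flux when the input and output pixel sizes differ, so it is meant only to show the idea of a shared wavelength axis. Its practical benefit is that all spectra can then be stored as rows of one two-dimensional array.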
Every dataset has a description attribute that includes its physical units.
The Keck Observatory Database of Ionized Absorption toward Quasars (KODIAQ) catalogs contain all quasar spectra taken with the ESI and HIRES spectrographs on Keck. All spectra are fully reduced, coadded, continuum normalized, and publicly available from the Keck Observatory Archive (KOA).
All data taken with the Hubble Space Telescope Cosmic Origins Spectrograph (COS), as compiled by the Hubble Spectroscopic Legacy Archive (HSLA) up to HST Cycle 26 (archive made on 15 May 2018). Contains all raw and associated combined ultraviolet (FUV and NUV) spectra from COS. The targets span all science categories. Each grating is stored in its own group, with the datasets: flux, wave, error, target_dec, target_ra, target_name, target_desc, target_type.
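Given the layout described above, a minimal sketch of collecting all spectra of one target could look as follows. The dataset names are taken from the list above. It is an assumption that each top-level group corresponds to one grating, that flux, wave and error are two-dimensional with one row per spectrum, and that row i of each belongs to target_name[i]. The function name spectra_for_target is hypothetical.

```python
import h5py
import numpy as np


def spectra_for_target(path, target):
    """Return {grating: [(wave, flux, error), ...]} for all rows matching `target`."""
    results = {}
    with h5py.File(path, "r") as f:
        for grating, group in f.items():
            # Skip top-level objects that are not groups holding target names.
            if not isinstance(group, h5py.Group) or "target_name" not in group:
                continue
            raw = group["target_name"][()]
            # String datasets are typically read back as bytes; normalize to str.
            names = np.array(
                [n.decode("utf-8") if isinstance(n, bytes) else str(n) for n in raw]
            )
            rows = np.flatnonzero(names == target).tolist()
            if not rows:
                continue
            # Read only the matching rows rather than the whole arrays.
            results[grating] = [
                (group["wave"][i], group["flux"][i], group["error"][i]) for i in rows
            ]
    return results
```

Because only the target names are read in full and the spectra are fetched row by row, this pattern avoids loading an entire grating into memory just to extract a few objects.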
The public data release of reduced spectra from the E-XQR-30 quasar sample. These are very high resolution (R ~ 10,000) spectra of reionization-era (z > 6) quasars taken with XSHOOTER.
For the future: DESI, HETDEX, eROSITA, LoTSS, MUSE.
Interested? Other ideas? Get in touch.