Building data files for archival at NASA SPDF
The codes and routines at pysatNASA are designed for end-users of NASA data
products. However, pysat in general has also been used to build operational
instruments for generating archival data to be uploaded to the Space Physics
Data Facility (SPDF) at NASA.
In general, such instruments should use their own naming conventions, separate
from those of the end-user instruments. An example of this is the REACH data,
where netCDF4 files are generated for archival purposes as part of the
ops_reach package, but can be accessed by the end user through pysatNASA.
In general, a pysat.Instrument object can be constructed for any dataset. Full
instructions and conventions can be found here. In the case of the REACH data,
the operational code reads in a series of csv files and updates the metadata
according to user specifications. Once the file is loaded, it can be exported
to a netCDF4 file via pysat. In the simplest case, this is:
import pysat
# Assumes `aero_reach` (instrument module) and `inst_id` are already defined.
reach = pysat.Instrument(inst_module=aero_reach, tag='l1b', inst_id=inst_id)
pysat.utils.io.inst_to_netcdf(reach, 'output_file.nc', epoch_name='Epoch')
However, there are additional options when translating pysat metadata to SPDF preferred formats. An example of this is:
# Use meta translation table to include SPDF preferred format.
# Note that multiple names are output for compliance with pysat.
# Using the most generalized form for labels for future compatibility.
meta_dict = {reach.meta.labels.min_val: ['VALIDMIN'],
             reach.meta.labels.max_val: ['VALIDMAX'],
             reach.meta.labels.units: ['UNITS'],
             reach.meta.labels.name: ['CATDESC', 'LABLAXIS', 'FIELDNAM'],
             reach.meta.labels.notes: ['VAR_NOTES'],
             reach.meta.labels.fill_val: ['_FillValue'],
             'Depend_0': ['DEPEND_0'],
             'Format': ['FORMAT'],
             'Monoton': ['MONOTON'],
             'Var_Type': ['VAR_TYPE']}
pysat.utils.io.inst_to_netcdf(reach, 'output_file.nc', epoch_name='Epoch',
                              meta_translation=meta_dict,
                              export_pysat_info=False)
In this case, note that the pysat ‘name’ label is output to three different
metadata values required by the ISTP standards. Additionally, the
export_pysat_info option is set to False here. This drops several internal
pysat metadata values before writing to file.
A full guide to SPDF metadata standards can be found here.
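To make the fan-out of one label into several output names concrete, the following is a minimal, standard-library sketch of what a translation table of this shape does to a single variable's metadata. It is not pysat's implementation: the `translate_meta` helper, the lowercase label keys, and the sample values are all hypothetical and chosen only for illustration.

```python
def translate_meta(var_meta, table):
    """Copy each labelled value to every export name listed in `table`.

    Labels without an entry in `table` keep their original name.
    """
    out = {}
    for label, value in var_meta.items():
        for export_name in table.get(label, [label]):
            out[export_name] = value
    return out


# Hypothetical metadata for a single variable, keyed by label.
dose_meta = {'name': 'Dose rate', 'units': 'rad/s', 'min_val': 0.0,
             'max_val': 100.0, 'fill_val': -1.0e31}

# Same shape as the translation table above: label -> list of output names.
table = {'name': ['CATDESC', 'LABLAXIS', 'FIELDNAM'],
         'units': ['UNITS'],
         'min_val': ['VALIDMIN'],
         'max_val': ['VALIDMAX'],
         'fill_val': ['_FillValue']}

translated = translate_meta(dose_meta, table)
for key in sorted(translated):
    print(key, '=', translated[key])
```

In this sketch the single 'name' value ends up under CATDESC, LABLAXIS, and FIELDNAM, while the original 'name' key is not written at all.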
Other best practices for archival include adding the operational software version to the metadata header before writing, as in the line below. The pysat version is written to the metadata automatically.
reach.meta.header.Software_version = ops_reach.__version__
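Before uploading, it can also help to confirm that every variable carries the expected attributes and that the header records a software version. The sketch below shows one way to run such a check on plain dictionaries. The `find_missing` helper, the `REQUIRED_ATTRS` tuple, and the sample metadata are hypothetical; the attribute list only mirrors a subset of the names used in the translation table in this section and is not the complete SPDF requirement list.

```python
# Hypothetical subset of attribute names, taken from the translation table
# in this section; it is not the complete SPDF requirement list.
REQUIRED_ATTRS = ('CATDESC', 'FIELDNAM', 'LABLAXIS', 'UNITS',
                  'VALIDMIN', 'VALIDMAX', '_FillValue')


def find_missing(var_attrs, header, required=REQUIRED_ATTRS):
    """Return a list of problems; an empty list means none were found."""
    problems = []
    if not header.get('Software_version'):
        problems.append('header: Software_version is not set')
    for var in sorted(var_attrs):
        for name in required:
            if name not in var_attrs[var]:
                problems.append('{}: missing {}'.format(var, name))
    return problems


# Hypothetical, already-translated metadata for two variables.
complete = dict.fromkeys(REQUIRED_ATTRS, 'placeholder')
partial = {key: val for key, val in complete.items() if key != 'UNITS'}
var_attrs = {'dose': complete, 'flux': partial}

problems = find_missing(var_attrs, header={'Software_version': ''})
for line in problems:
    print(line)
```

A check of this kind is meant as a quick guard in the operational script, not as a substitute for validating the written files against the SPDF guidelines.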
A full example script to generate output files can be found at https://github.com/jklenzing/ops_reach/blob/main/scripts/netcdf_gen.py