The ```seabass-task-tools``` module provides tools for loading groups of SeaBASS files into standardized data structures, along with helper functions for analyzing, plotting, and manipulating the bundled data.
The ```BundleSB``` class bundles and standardizes data from multiple SeaBASS files. It takes a list of file paths pointing to separate SeaBASS files, extracts user-specified variables and key metadata, and collates the data into a structured dictionary format.
The goal of ```BundleSB``` is to make it easier to load, analyze, and visualize groups of related SeaBASS data files in a consistent manner. Different SeaBASS files frequently contain slightly different variables, depths, timestamps, and so on, which makes aggregation tricky. ```BundleSB``` abstracts these inconsistencies away behind a clean dictionary interface.
In addition to the core ```BundleSB``` class for bundling SeaBASS data, this Python module also includes a suite of helper functions for analyzing, plotting, and working with ```BundleSB``` objects and SeaBASS files in general. For example, handy functions like ```list_sb()``` and ```plot_depth_profile()``` reduce boilerplate code for common tasks like file listing and profile visualization. There are also specialized plotting functions such as ```plot_spectra_subset()```, which enable tailored spectral plots from ```BundleSB``` subsets meeting certain wavelength, depth, or other criteria. And lower-level functions assist with direct manipulation of variables within Bundle objects.
Having these helpers alongside the main ```BundleSB``` class fills gaps in the path from raw SeaBASS files to publication plots and journal exports. It avoids reinventing the wheel by moving repetitive data wrangling and visualization steps into reusable code. Additional helper functions will continue to be added over time, improving interoperability between modules that rely on the standardized Bundle data structure. Together, this functionality moves toward an integrated, end-to-end SeaBASS analysis toolkit.
```
bb3_file_list = sb.list_sb('./data', matching_re='*BB3*.sb')
bb3_bundle = sb.BundleSB(bb3_file_list, user_variables=['vsfp', 'vsfg', 'sal'])

bb3_bundle.variables
OrderedDict([('lat', ('lat', 'degrees')),
             ('lon', ('lon', 'degrees')),
             ('station', ('station', 'none')),
             ('depth', ('depth', 'meters')),
             ('time', ('time', 'datetime')),
             ('sal', ('sal', 'psu')),
             ('vsfg527_124ang', ('vsfg527_124ang', '1/m/sr')),
             ('filename', 'filename'),
             ('vsfg469_124ang', ('vsfg469_124ang', '1/m/sr')),
             ('vsfg652_124ang', ('vsfg652_124ang', '1/m/sr')),
             ('vsfp527_124ang', ('vsfp527_124ang', '1/m/sr')),
             ('vsfp652_124ang', ('vsfp652_124ang', '1/m/sr')),
             ('vsfp469_124ang', ('vsfp469_124ang', '1/m/sr'))])

bb3_bundle.wavelengths
OrderedDict([('sal', [nan]),
             ('vsfg', [469.0, 527.0, 652.0]),
             ('vsfp', [469.0, 527.0, 652.0])])
```
After initialization, the object contains these main attribute dictionaries:
```data```: Contains the actual bundled variable data extracted from the files\
```variables```: Metadata associated with each variable such as units\
```wavelengths``` and ```angles```: Wavelengths and angles for each radiometric variable, if any. NaN is assigned when the variable name does not embed wavelength or angle information.\
```size_class```: Particle size classes associated with each variable, typically for PSD (particle size distribution) data. NaN is returned when the variable name does not contain size class information.\
```parsed_variables```: List of which variables were successfully extracted
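To illustrate how wavelength and angle information might be recovered from a variable name such as ```vsfg527_124ang```, here is a minimal sketch. The function ```parse_variable_name``` and its regular expression are hypothetical and are not taken from the module; they only assume the naming pattern visible in the example output above (base name, optional wavelength, optional ```_<angle>ang``` suffix).

```python
import math
import re

# Hypothetical pattern (an assumption, not the module's own parser):
# base name, optional wavelength digits, optional "_<angle>ang" suffix.
_VAR_PATTERN = re.compile(
    r'^(?P<base>[A-Za-z_]+?)'
    r'(?P<wl>\d+(?:\.\d+)?)?'
    r'(?:_(?P<ang>\d+(?:\.\d+)?)ang)?$'
)


def parse_variable_name(name):
    """Split a variable name into (base, wavelength, angle).

    Wavelength and angle are NaN when the name does not embed them.
    """
    match = _VAR_PATTERN.match(name)
    if match is None:
        # Unrecognized layout: keep the name and report no wavelength/angle.
        return name, math.nan, math.nan
    wl = match.group('wl')
    ang = match.group('ang')
    return (
        match.group('base'),
        float(wl) if wl else math.nan,
        float(ang) if ang else math.nan,
    )


base, wavelength, angle = parse_variable_name('vsfg527_124ang')
```

With this sketch, a name without embedded digits (for example ```sal```) yields NaN for both wavelength and angle, which mirrors the ```[nan]``` entry shown for ```sal``` above.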
Additionally, the class handles a number of common data management tasks automatically:\
Appending filename tags to keep track of which measurements came from which file.\
Extracting location, depth, station, and timestamp data into separate entries if available from either the data columns or the headers.\
Optionally extracting the standard deviation alongside each main variable.\
Handling missing variables across files by padding with NaNs to keep data aligned.\
Checking for and warning about any negative variable values.
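The padding and tagging behavior described above can be sketched in plain Python. This is not the ```BundleSB``` implementation; ```bundle_records``` and its input layout (a dict mapping each filename to a dict of equal-length columns) are assumptions made purely for illustration.

```python
import math
import warnings


def bundle_records(records, variables):
    """Collate per-file columns into one dict of aligned lists.

    records: {filename: {variable: [values, ...]}} (hypothetical layout;
    all columns within one file are assumed to have the same length).
    """
    bundle = {var: [] for var in variables}
    bundle['filename'] = []
    for filename, columns in records.items():
        n_rows = max((len(col) for col in columns.values()), default=0)
        for var in variables:
            values = columns.get(var)
            if values is None:
                # Variable missing from this file: pad with NaN to stay aligned.
                values = [math.nan] * n_rows
            bundle[var].extend(values)
        # Tag every row with the file it came from.
        bundle['filename'].extend([filename] * n_rows)
    for var in variables:
        # NaN compares False against 0, so padded entries never trigger this.
        if any(value < 0 for value in bundle[var]):
            warnings.warn(f"Negative values found in '{var}'")
    return bundle


example = bundle_records(
    {'a.sb': {'depth': [1.0, 2.0], 'sal': [35.0, 35.1]},
     'b.sb': {'depth': [5.0]}},
    ['depth', 'sal'],
)
```

Keeping every variable list the same length means row ```i``` refers to the same measurement in every entry, including ```filename```.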
```BundleSB``` solves the problem of pulling together groups of separate but related SeaBASS data sources into one easy-to-use Python dictionary interface. This enables simpler plotting, analysis, and data sharing workflows compared to handling many one-off SeaBASS files.
```
pip install git+https://oceandata.sci.gsfc.nasa.gov/rcs/joaquin/seabass-task-tools.git
```
The ```list_sb``` function is designed to generate a list of SeaBASS file paths from a specified directory. It handles several common tasks when programmatically accessing groups of SeaBASS files:
Listing all files matching the standard ```'.sb'``` SeaBASS extension by default.\
Allowing a custom file extension pattern if data uses something non-standard.\
Accepting optional regular expression matching to only get a filtered subgroup of files.\
Automatically prepending full directory paths to file names.\
Returning to the original working directory after listing.\
Raising an error if no files are found, to avoid silent failures.

The goal is to reduce the effort required to get a Python list of SeaBASS file locations compared to calling ```glob.glob()``` directly.
```
sb_files = list_sb('../data')
# Returns full paths for all .sb files in ../data dir

sb_files = list_sb('../data', sb_file_ext='*.sb')
# Returns paths for all .sb files in ../data dir
```
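For comparison, the listing steps described above can be approximated with the standard library alone. The sketch below is an assumption about the general approach, not the source of ```list_sb```: the name ```list_files_sketch``` and its parameters are hypothetical, and it treats ```matching_re``` as a true regular expression, which may differ from how the module interprets that argument.

```python
import re
from pathlib import Path


def list_files_sketch(directory, file_ext='*.sb', matching_re=None):
    """Return sorted absolute paths of files in `directory` matching `file_ext`.

    Hypothetical stand-in for illustration only.
    """
    root = Path(directory).resolve()
    # Path.glob avoids changing the working directory, so nothing to restore.
    paths = sorted(p for p in root.glob(file_ext) if p.is_file())
    if matching_re is not None:
        pattern = re.compile(matching_re)
        # Keep only files whose name contains a match for the expression.
        paths = [p for p in paths if pattern.search(p.name)]
    if not paths:
        # Fail loudly rather than returning an empty list.
        raise FileNotFoundError(f"No files matching {file_ext!r} found in {root}")
    return [str(p) for p in paths]
```

Because ```Path.glob``` works on an absolute path, this version never needs to change and restore the working directory.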
[Work in progress: more functions still need documentation here.]
```plot_depth_profile(sb_bundle, var, xlabel=None, filename=None)```
The ```plot_depth_profile``` function creates a standard depth profile plot from oceanographic data contained in a SeaBASS Bundle instance. Depth profile plots have depth on the y-axis and the variable value on the x-axis.
This function handles several common tasks when making a publication-quality depth profile plot:
Extracting the ```'depth'``` and specified variable data from the SeaBASS Bundle.\
Handling missing data or large depth gaps by inserting ```NaN``` breaks.\
Plotting depth inverted on the y-axis, so the shallowest values sit at the top.\
Setting the x-axis ticks and labels to the top.\
Adding gridlines.\
Tightly fitting the figure size to the plot area.\
Accepting an optional custom x-axis label.\
Saving the figure to file if a filename is provided.

The goal is to reduce the effort required for consistent, polished oceanographic profile plots compared to general matplotlib use.
```
sb_bundle = sb.BundleSB(file_list)
ax = sb.plot_depth_profile(sb_bundle, 'sal', filename='salinity_profile')
```
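The NaN-break step mentioned in the list above can be sketched as follows. This is an illustrative guess at the idea, not the function's actual code; ```insert_nan_breaks``` and the ```max_gap``` parameter are hypothetical names.

```python
import math


def insert_nan_breaks(depths, values, max_gap):
    """Insert a NaN pair wherever consecutive depths differ by more than max_gap.

    Hypothetical helper for illustration only.
    """
    out_depths, out_values = [], []
    previous = None
    for depth, value in zip(depths, values):
        if previous is not None and abs(depth - previous) > max_gap:
            # A NaN point makes line plots leave a visible gap here.
            out_depths.append(math.nan)
            out_values.append(math.nan)
        out_depths.append(depth)
        out_values.append(value)
        previous = depth
    return out_depths, out_values


depths, sal = insert_nan_breaks(
    [1.0, 2.0, 10.0, 11.0], [35.0, 35.1, 34.8, 34.7], max_gap=5.0
)
```

Matplotlib does not draw line segments through NaN points, so passing the padded lists to ```ax.plot(sal, depths)``` leaves a gap instead of a misleading straight line. The inverted depth axis and top-side ticks described above correspond to matplotlib's ```ax.invert_yaxis()``` and ```ax.xaxis.tick_top()```.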
##
ylabel, title, filename, angle=None, plot_cv=False, alpha=0.5)
The ```plot_spectra_subset``` function generates spectral plots from selected subsets of data contained within a SeaBASS Bundle instance.
It handles several common tasks when visualizing spectral data:
Extracting data for only user-specified wavelengths rather than full spectra\
Optionally extracting angular or coefficient of variation data\
Filtering samples in the Bundle by depth threshold\
Reshaping extracted data into a matrix for plotting\
Setting titles, axis labels, and transparency\
Saving the figure to file if a filename is provided
The goal is to simplify the process of generating publication-quality spectral plots from SeaBASS Bundles based on specific user criteria.
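The filtering and reshaping steps can be pictured with a small stand-alone sketch. It is not the module's code: ```spectra_matrix``` is a hypothetical helper, the plain dict stands in for a Bundle's data, the column names ```ap440``` and ```ap555``` are made up, and the choice to keep samples at or above the threshold depth (```depth <= depth_threshold```) is an assumption.

```python
def spectra_matrix(data, var, wavelengths, depth_threshold):
    """Build a samples-by-wavelengths matrix for rows shallower than a threshold.

    data: dict of column name -> list, with columns named f'{var}{wavelength}'
    (hypothetical layout used only for this illustration).
    """
    # Indices of samples that pass the depth filter.
    keep = [i for i, depth in enumerate(data['depth']) if depth <= depth_threshold]
    # One column per requested wavelength, in the order given.
    columns = [f'{var}{wavelength}' for wavelength in wavelengths]
    # Each row is one spectrum, ready to plot against `wavelengths`.
    return [[data[col][i] for col in columns] for i in keep]


data = {
    'depth': [1.0, 5.0, 20.0],
    'ap440': [0.10, 0.11, 0.05],
    'ap555': [0.04, 0.05, 0.02],
}
matrix = spectra_matrix(data, 'ap', [440, 555], depth_threshold=10.0)
```

Each row of ```matrix``` can then be drawn as one line against the wavelength list, with a transparency value applied so overlapping spectra remain visible.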
Example usage:
ax = sb.plot_spectra_subset(var, bb3_bundle, depth_threshold,