Changelog
v0.8.0 (unreleased)
Contributors to this version: Gabriel Rondeau-Genesse (@RondeauG), Pascal Bourgault (@aulemahal), Juliette Lavoie (@juliettelavoie), Sarah-Claude Bourdeau-Goulet (@sarahclaude), Trevor James Smith (@Zeitsperre), Marco Braun (@vindelico).
Announcements
xscen now adheres to PEPs 517/518/621 using the setuptools and setuptools-scm backend for building and packaging. (PR/292).
New features and enhancements
New function xscen.indicators.select_inds_for_avail_vars to filter the indicators that can be calculated with the variables available in an xarray.Dataset. (PR/291).
Replaced aggregation function climatological_mean() with climatological_op(), offering more types of operations to aggregate over climatological periods. (PR/290).
Added the ability to search for simulations that reach a given warming level. (PR/251).
xs.spatial_mean now accepts the region="global" keyword to perform a global average (GH/94, PR/260).
xs.spatial_mean with method='xESMF' will also automatically segmentize polygons (down to a 1° resolution) to ensure a correct average (PR/260).
Added documentation for require_all_on in search_data_catalogs. (PR/263).
xs.save_to_table and xs.io.to_table to transform datasets and arrays to DataFrames, but with support for multi-columns, multi-sheets and localized table of content generation.
Better xs.extract.resample: support for weighted resampling operations when starting with frequencies coarser than daily and missing timesteps/values handling. (GH/80, GH/93, PR/265).
New argument attribute_weights to generate_weights to allow for custom weights. (PR/252).
xs.io.round_bits to round floating-point variables to a given number of bits, allowing for better compression. This can be combined with the saving step through argument "bitround" of save_to_netcdf and save_to_zarr. (PR/266).
Added annual global tas timeseries for CMIP6’s models CMCC-ESM2 (ssp245, ssp370, ssp585), EC-Earth3-CC (ssp245, ssp585), KACE-1-0-G (ssp245, ssp370, ssp585) and TaiESM1 (ssp245, ssp370). Moved global tas database to a netCDF file. (GH/268, PR/270).
Implemented support for multiple levels and models in xs.subset_warming_level. Better support for DataArray and DataFrame in xs.get_warming_level. (PR/270).
Added the ability to directly provide an ensemble dataset to xs.ensemble_stats. (PR/299).
Added support in xs.ensemble_stats for the new robustness-related functions available in xclim. (PR/299).
New function xs.ensembles.get_partition_input (PR/289).
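The bit-rounding idea behind the round_bits entry above can be illustrated without xscen. The following is a minimal sketch using only the Python standard library: the helper round_mantissa_bits and the sample series are hypothetical, not part of the xscen API, and the actual implementation may differ. It keeps a chosen number of float32 mantissa bits and shows that the rounded data compresses better.

```python
import math
import struct
import zlib


def round_mantissa_bits(value: float, keep_bits: int) -> float:
    """Round a float32 value so only `keep_bits` of its 23 mantissa bits remain."""
    if not 0 <= keep_bits <= 23:
        raise ValueError("keep_bits must be between 0 and 23")
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    drop = 23 - keep_bits
    if drop > 0:
        half = 1 << (drop - 1)               # half of the last kept bit
        mask = (0xFFFFFFFF >> drop) << drop  # zeroes the dropped bits
        bits = (bits + half) & mask          # round half up, then truncate
    return struct.unpack("<f", struct.pack("<I", bits))[0]


# Hypothetical temperature-like series, stored as float32.
values = [15.0 + 5.0 * math.sin(i / 10.0) for i in range(2000)]
rounded = [round_mantissa_bits(v, keep_bits=8) for v in values]

# Compare the compressed size of the raw and the rounded float32 buffers.
raw_size = len(zlib.compress(struct.pack(f"<{len(values)}f", *values)))
rounded_size = len(zlib.compress(struct.pack(f"<{len(rounded)}f", *rounded)))
print(raw_size, rounded_size)
```

The trailing zero bits produced by the rounding are what a lossless compressor exploits; the price is a bounded relative error that depends on the number of bits kept.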
Breaking changes
climatological_mean() has been replaced with climatological_op() and will be abandoned in a future version. (PR/290).
experiment_weights argument in generate_weights was renamed to balance_experiments. (PR/252).
New argument attribute_weights to generate_weights to allow for custom weights. (PR/252).
For a sequence of models, the output of xs.get_warming_level is now a list. Revert to a dictionary with output='selected'. (PR/270).
The global average temperature database is now a netCDF file; custom databases must follow the same format. (PR/270).
Bug fixes
Fixed a bug in xs.search_data_catalogs when searching for fixed fields and specific experiments/members. (PR/251).
Fixed a bug in the documentation build configuration that prevented stable/latest and tagged documentation builds from resolving on ReadTheDocs. (PR/256).
Fixed get_warming_level to avoid incomplete matches. (PR/269).
search_data_catalogs now eliminates anything that matches any entry in exclusions. (GH/275, PR/280).
Fixed a bug in xs.scripting.save_and_update where build_path_kwargs was ignored when trying to guess the file format. (PR/282).
Add a warning to xs.extract._dispatch_historical_to_future. (GH/286, PR/287).
Modify use_cftime for the calendar conversion in to_dataset. (GH/303, PR/289).
Internal changes
Continued work on adding tests. (PR/251).
Fixed pre-commit’s pretty-format-json hook so that it ignores notebooks. (PR/254).
Fixed the labeler so docs/CI isn’t automatically added for contributions by new collaborators. (PR/254).
Made it so that tests are no longer treated as an installable package. (PR/248).
Renamed the pytest marker from requires_docs to requires_netcdf. (PR/248).
Included the documentation in the source distribution, while excluding the NetCDF files. (PR/248).
Reduced the size of the files in /docs/notebooks/samples and changed the notebooks and tests accordingly. (GH/247, PR/248).
Added a new xscen.testing module with the datablock_3d function previously located in /tests/conftest.py. (PR/248).
New function xscen.testing.fake_data to generate fake data for testing. (PR/248).
xESMF 0.8 Regridder and SpatialAverager argument out_chunks is now accepted by xs.regrid_dataset and xs.spatial_mean. (PR/260).
- Testing, Packaging, and CI adjustments. (PR/274):
    xscen builds now install in a tox environment with conda-provided ESMF in GitHub Workflows.
    tox now offers a method for installing esmpy from a tag/branch (via ESMF_VERSION environment variable).
    $ make translate is now called on ReadTheDocs and within tox.
    Linters are now called by order of most common failures first, to speed up the CI.
    Manifest.in is much more specific about what is installed.
    Re-adds a dev recipe to the setup.py.
Multiple improvements to the docstrings and type annotations. (PR/282).
pip check in conda builds in GitHub workflows has been temporarily set to always pass. (PR/288).
- The cookiecutter template has been updated to the latest commit via cruft. (PR/292):
    setup.py has been mostly hollowed-out, save for the babel-related translation function.
    pyproject.toml has been added, with most package configurations migrated into it.
    HISTORY.rst has been renamed to CHANGES.rst.
    actions-version-updater.yml has been added to automate the versioning of the package.
    pre-commit hooks have been updated to the latest versions; check-toml and toml-sort have been added to cleanup the pyproject.toml file, and check-json-schema has been added to ensure GitHub and ReadTheDocs workflow files are valid.
    ruff has been added to the linting tools to replace most flake8 and pydocstyle verifications.
    tox builds are more pure Python environment/PyPI-friendly.
    xscen now uses Trusted Publishing for TestPyPI and PyPI uploads.
Linting checks now examine the testing folder, function complexity, and alphabetical order of __all__ lists. (PR/292).
publish_release_notes now uses better logic for finding and reformatting the CHANGES.rst file. (PR/292).
bump2version version-bumping utility was replaced by bump-my-version. (PR/292).
Documentation build checks no longer fail due to broken external links; Notebooks are now nested and numbered. (PR/304).
v0.7.1 (2023-08-23)
Update dependencies by removing pygeos, pinning shapely>=2 and intake-esm>=2023.07.07 as well as other small fixes to the environment files. (PR/243).
Fix xs.aggregate.spatial_mean with method cos-lat when the data is on a rectilinear grid. (PR/243).
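For readers unfamiliar with the cos-lat method mentioned above, the following is a minimal sketch of cosine-of-latitude weighting on a rectilinear grid, written with the standard library only. The function cos_lat_mean and the tiny sample grid are hypothetical illustrations and do not reproduce the xscen implementation.

```python
import math


def cos_lat_mean(field, lats):
    """Weighted mean of field[lat][lon], using cos(latitude) as the cell weight."""
    num = 0.0
    den = 0.0
    for row, lat in zip(field, lats):
        weight = math.cos(math.radians(lat))  # cells shrink towards the poles
        for value in row:
            if not math.isnan(value):         # ignore missing values
                num += weight * value
                den += weight
    return num / den


# Two latitude rows (0° and 60°), two longitudes each.
field = [[1.0, 1.0], [4.0, 4.0]]
lats = [0.0, 60.0]
result = cos_lat_mean(field, lats)
print(result)
```

On a rectilinear grid the weight depends only on the latitude of each row, which is why a single weight per row is enough in this sketch.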
Internal changes
v0.7.0 (2023-08-22)
Contributors to this version: Gabriel Rondeau-Genesse (@RondeauG), Pascal Bourgault (@aulemahal), Trevor James Smith (@Zeitsperre), Juliette Lavoie (@juliettelavoie), Marco Braun (@vindelico).
Announcements
Dropped support for Python 3.8, added support for 3.11. (PR/199, PR/222).
xscen is now available on conda-forge, and can be installed with conda install -c conda-forge xscen. (PR/241).
New features and enhancements
New function get_warming_level to search within the IPCC CMIP global temperatures CSV without requiring data. (GH/208, PR/210).
File restructuring from catalogs with xscen.catutils.build_path. (PR/205, PR/237).
New scripting functions save_and_update and move_and_delete. (PR/214).
Spatial dimensions can be generalized as X/Y when rechunking and will be mapped to rlon/rlat or lon/lat accordingly. (PR/221).
New argument var_as_string for get_cat_attrs to return variable names as strings. (PR/233).
New argument copy for move_and_delete. (PR/233).
New argument restrict_year for compute_indicators. (PR/233).
generate_weights now allows splitting weights between experiments and making them vary along the time/horizon axis. (GH/108, PR/231).
New independence_level, institution, added to generate_weights. (PR/231).
Updated produce_horizon so it can accept multiple periods or warming levels. (PR/231, PR/240).
Add more comments in the template. (PR/233, PR/235, GH/232).
New function diagnostics.health_checks that can perform multiple checkups on a dataset. (PR/238).
Breaking changes
Columns date_start and date_end now use a datetime64[ms] dtype. (PR/222).
The default output of date_parser is now pd.Timestamp (output_dtype='datetime'). (PR/222).
date_parser(date, end_of_period=True) has time “23:59:59”, instead of “23:00”. (PR/222, PR/237).
driving_institution was removed from the “default” xscen columns. (PR/222).
Folder parsing utilities (parse_directory) moved to xscen.catutils. Signature changed: globpattern removed, dirglob added, new patterns specifications. See doc for all changes. (PR/205).
compute_indicators now returns all outputs produced by indicators with multiple outputs (such as rain_season). (PR/228).
In generate_weights, independence_level all was renamed model. (PR/231).
In response to a bugfix, results for generate_weights(independence_level='GCM') are significantly altered. (GH/230, PR/231).
Legacy support for stats_kwargs in ensemble_stats was dropped. (PR/231).
period in produce_horizon has been deprecated and replaced with periods. (PR/231).
Some automated to_level were updated to reflect more recent changes. (PR/231).
Removed diagnostics.fix_unphysical_values. (PR/238).
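To make the end_of_period change listed above concrete, here is a minimal sketch of the convention using only the standard library. The helper end_of_period below is hypothetical and is not date_parser itself; how partial dates are completed (to the last month and last day) is an assumption of this sketch. Only the final time of 23:59:59 comes from the entry above.

```python
import calendar
from datetime import datetime


def end_of_period(date_str: str) -> datetime:
    """Return the last second covered by a 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD' string."""
    parts = [int(p) for p in date_str.split("-")]
    year = parts[0]
    # Missing month: assume the period runs to December.
    month = parts[1] if len(parts) > 1 else 12
    # Missing day: assume the period runs to the last day of the month.
    day = parts[2] if len(parts) > 2 else calendar.monthrange(year, month)[1]
    return datetime(year, month, day, 23, 59, 59)


year_end = end_of_period("2000")
month_end = end_of_period("2000-02")
print(year_end, month_end)
```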
Bug fixes
Fix bug in unstack_dates with seasonal climatological mean. (GH/202, PR/202).
Added NotImplemented errors when trying to call climatological_mean and compute_deltas with daily data. (PR/187).
Fixed a bug in unstack_dates where it failed for anything other than seasons. (PR/228).
cleanup with common_attrs_only now works even when no cat attribute is present in the datasets. (PR/231).
Internal changes
Folder parsing utilities now in pure python, platform independent. New dependency parse. (PR/205).
Updated ReadTheDocs configuration to prevent --eager installation of xscen (PR/209).
Implemented a template to be used for unit tests. (PR/187).
Updated GitHub Actions to remove deprecation warnings. (PR/187).
Updated the cookiecutter used to generate boilerplate documentation and code via cruft. (PR/212).
A few changes to subset_warming_level so it doesn’t need driving_institution. (PR/215).
Added more tests. (PR/228).
In compute_indicators, the logic to manage indicators returning multiple outputs was simplified. (PR/228).
v0.6.0 (2023-05-04)
Contributors to this version: Trevor James Smith (@Zeitsperre), Juliette Lavoie (@juliettelavoie), Pascal Bourgault (@aulemahal), Gabriel Rondeau-Genesse (@RondeauG).
Announcements
xscen is now offered as a conda package available through Anaconda.org. Refer to the installation documentation for more information. (GH/149, PR/171).
Deprecation: Release 0.6.0 of xscen will be the last version to support xscen.extract.clisops_subset. Use xscen.spatial.subset instead. (PR/182, PR/184).
Deprecation: The argument region, used in multiple functions, has been slightly reformatted. Release 0.6.0 of xscen will be the last version to support the old format. (GH/99, GH/101, PR/184).
New features and enhancements
Support for computing anomalies in compute_deltas. (PR/165).
Add function diagnostics.measures_improvement_2d. (PR/167).
Add function regrid.create_bounds_rotated_pole and automatic use in regrid_dataset and spatial_mean. This is temporary, while we wait for a functioning method in cf_xarray. (PR/174, GH/96).
Add spatial submodule with functions creep_weights and creep_fill for filling NaNs using neighbours. (PR/174).
Allow passing GeoDataFrame instances in spatial_mean’s region argument, not only geospatial file paths. (PR/174).
Allow searching for periods in catalog.search. (GH/123, PR/170).
Allow searching and extracting multiple frequencies for a given variable. (GH/168, PR/170).
New function xs.spatial.subset to replace xs.extract.clisops_subset and add method “sel”. (GH/180, PR/182).
Add long_name attribute to diagnostics. (PR/189).
New utils.standardize_periods to standardize that argument across multiple functions. (GH/87, PR/192).
New coverage_kwargs argument added to search_data_catalogs to allow modifying the default values of subset_file_coverage. (GH/87, PR/192).
Breaking changes
‘mean’ averaging has been deprecated in spatial_mean. (PR/125).
‘interp_coord’ has been renamed to ‘interp_centroid’ in spatial_mean. (PR/125).
The ‘datasets’ dimension of the output of diagnostics.measures_heatmap is renamed ‘realization’. (PR/167).
_subset_file_coverage was renamed subset_file_coverage and moved to catalog.py to prevent circular imports. (PR/170).
extract_dataset doesn’t fail when a variable is in the dataset, but not variables_and_freqs. (PR/185).
The argument period, used in multiple functions, is now always a single list, while periods is more flexible. (GH/87, PR/192).
The parameters reference_period and simulation_period of xscen.train and xscen.adjust were renamed period/periods to respect the point above. (GH/87, PR/192).
Bug fixes
Forbid pandas v1.5.3 in the environment files, as the linux conda build breaks the data catalog parser. (GH/161, PR/162).
Only return requested variables when using DataCatalog.to_dataset. (PR/163).
compute_indicators no longer crashes if fewer than 3 timesteps are produced. (PR/125).
xarray is temporarily pinned below v2023.3.0 due to an API-breaking change. (GH/175, PR/173).
xscen.utils.unstack_fill_nan can now handle datasets that have non-dimension coordinates. (GH/156, PR/175).
extract_dataset now skips a simulation way earlier if the frequency doesn’t match. (PR/170).
extract_dataset now correctly tries to extract in reverse timedelta order. (PR/170).
compute_deltas no longer creates all NaN values if the input dataset is in a non-standard calendar. (PR/188).
Internal changes
xscen now manages packaging for PyPI and TestPyPI via GitHub workflows. (PR/159).
Pre-load coordinates in extract.clisops_subset (PR/163).
Minimal documentation for templates. (PR/163).
xscen is now indexed in Zenodo, under the ouranos community of projects. (PR/164).
Better warning messages in _subset_file_coverage when coverage is insufficient. (PR/125).
The top-level Makefile now includes a linkcheck recipe, and the ReadTheDocs configuration no longer reinstalls the llvmlite compiler library. (PR/173).
The checkups on coverage and duplicates can now be skipped in subset_file_coverage. (PR/170).
Changed the ProjectCatalog docstrings to make it more obvious that it needs to be created empty. (GH/99, PR/184).
Added parse_config to creep_fill, creep_weights, and reduce_ensemble (PR/191).
v0.5.0 (2023-02-28)
Contributors to this version: Gabriel Rondeau-Genesse (@RondeauG), Juliette Lavoie (@juliettelavoie), Trevor James Smith (@Zeitsperre), Sarah Gammon (@SarahG-579462) and Pascal Bourgault (@aulemahal).
New features and enhancements
Possibility of excluding variables read from file from the catalog produced by parse_directory. (PR/107).
New functions extract.subset_warming_level and aggregate.produce_horizon. (PR/93).
Add round_var to xs.clean_up. (PR/93).
New “timeout_cleanup” option for save_to_zarr, which removes variables that were in the process of being written when receiving a TimeoutException. (PR/106).
New scripting.skippable context, allowing the use of CTRL-C to skip code sections. (PR/106).
Possibility of fields with underscores in the patterns of parse_directory. (PR/111).
New utils.show_versions function for printing or writing to file the dependency versions of xscen. (GH/109, PR/112).
Added previously private notebooks to the documentation. (PR/108).
Notebooks are now tested using pytest with nbval. (PR/108).
New restrict_warming_level argument for extract.search_data_catalogs to filter datasets that are not in the warming level csv. (GH/105, PR/138).
Set configuration value programmatically through CONFIG.set. (PR/144).
New to_dataset method on DataCatalog. The same as to_dask, but exposing more aggregation options. (PR/147).
New templates folder with one general template. (GH/151, PR/158).
Breaking changes
Functions that are called internally can no longer parse the configuration. (PR/133).
Bug fixes
properties_and_measures no longer casts month coordinates to string. (PR/106).
search_data_catalogs no longer crashes if it finds nothing. (GH/42, PR/92).
Prevented fixed fields from being duplicated during _dispatch_historical_to_future (GH/81, PR/92).
Added missing parse_config to functions in reduce.py (PR/92).
Added deepcopy before skipna is popped in spatial_mean (PR/92).
subset_warming_level now validates that the data exists in the dataset provided (GH/117, PR/119).
Adapt stack_drop_nan for the newest version of xarray (2022.12.0). (GH/122, PR/126).
Fix stack_drop_nan not working if intermediate directories don’t exist (GH/128).
Fixed a crash when compute_indicators produced fixed fields (PR/139).
Internal changes
compute_deltas skips the unstacking step if there is no time dimension and casts object dimensions to string. (PR/9).
Added the “2sem” frequency to the translations CVs. (PR/111).
Skip files we can’t read in parse_directory. (PR/111).
Fixed non-numpy-standard Docstrings. (PR/108).
Added more metadata to package description on PyPI. (PR/108).
Faster search_data_catalogs and extract_dataset through a faster DataCatalog.unique, date parsing and a rewrite of the ensure_correct_time logic. (PR/127).
The search_data_catalogs function now accepts str or pathlib.Path variables (in addition to lists of either data type) for performing catalog lookups. (PR/121).
produce_horizons now supports fixed fields (PR/139).
Rewrite of unstack_dates for better performance with dask arrays. (PR/144).
v0.4.0 (2022-09-28)
Contributors to this version: Gabriel Rondeau-Genesse (@RondeauG), Juliette Lavoie (@juliettelavoie), Trevor James Smith (@Zeitsperre) and Pascal Bourgault (@aulemahal).
New features and enhancements
New functions diagnostics.properties_and_measures, diagnostics.measures_heatmap and diagnostics.measures_improvement. (GH/5, PR/54).
Add argument resample_methods to xs.extract.resample. (GH/57, PR/57).
Added a ReadTheDocs configuration to expose public documentation. (GH/65, PR/66).
xs.utils.stack_drop_nans/xs.utils.unstack_fill_nan will now format the to_file/coords string to add the domain and the shape. (GH/59, PR/67).
New unstack_dates function to “extract” seasons or months from a timeseries. (PR/68).
Better spatial_mean for cases using xESMF and a shapefile with multiple polygons. (PR/68).
- Yet more changes to parse_directory: (PR/68).
    Better parallelization by merging the finding and name-parsing step in the same dask tree.
    Allow cvs for the variable columns.
    Fix parsing the variable names from datasets.
    Sort the variables in the tuples (for a more consistent output).
In extract_dataset, add option ensure_correct_time to ensure the time coordinate matches the expected freq. Ex: monthly values given on the 15th day are moved to the 1st, as expected when asking for “MS”. (GH/53).
- In regrid_dataset: (PR/68).
    Allow passing skipna to the regridder kwargs.
    Do not fail for any grid mapping problem, including if a grid_mapping attribute mentions a variable that doesn’t exist.
Default email sent to the local user. (PR/68).
Special accelerated pathway for parsing catalogs with all dates within the datetime64[ns] range. (PR/75).
New functions reduce_ensemble and build_reduction_data to support kkz and kmeans clustering. (GH/4, PR/63).
ensemble_stats can now loop through multiple statistics, supports functions located in xclim.ensembles._robustness, and supports weighted realizations. (PR/63).
New function ensemble_stats.generate_weights that estimates weights based on simulation metadata. (PR/63).
New function catalog.unstack_id to reverse-engineer IDs. (PR/63).
generate_id now accepts Datasets. (PR/63).
Add rechunk option to properties_and_measures (PR/76).
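The ensure_correct_time behaviour described in the extract_dataset entry above (monthly values stamped on the 15th being moved to the 1st for “MS”) can be pictured with a short standard-library sketch. The helper snap_to_month_start is hypothetical and only illustrates the idea for month-start frequencies; it is not the xscen implementation.

```python
from datetime import datetime


def snap_to_month_start(times):
    """Move each timestamp to midnight on the first day of its month."""
    return [datetime(t.year, t.month, 1) for t in times]


# Monthly values stamped mid-month, as some datasets provide them.
times = [datetime(2001, 1, 15), datetime(2001, 2, 15, 12), datetime(2001, 3, 16)]
snapped = snap_to_month_start(times)
print(snapped)
```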
Breaking changes
statistics / stats_kwargs have been changed/eliminated in ensemble_stats, respectively. (PR/63).
Bug fixes
Internal changes
Default method of xs.extract.resample now depends on frequency. (GH/57, PR/58).
Bugfix for _restrict_by_resolution with CMIP6 datasets (PR/71).
More complete check of coverage in _subset_file_coverage. (GH/70, PR/72).
The code that performs common_attrs_only in ensemble_stats has been moved to clean_up. (PR/63).
Removed the default to_level in clean_up. (PR/63).
xscen now has an official logo. (PR/69).
Use numpy max and min in properties_and_measures (PR/76).
Cast catalog date_start and date_end to “%4Y-%m-%d %H:00” when writing to disk. (GH/83, PR/79)
Skip test of coverage on the sum if the list of select files is empty. (PR/79)
Added missing CMIP variable names in conversions.yml and added the ability to provide a custom file instead (GH/86, PR/88)
Changed ‘allow_conversion’ and ‘allow_resample’ default to False in search_data_catalogs (GH/86, PR/88)
v0.3.0 (2022-08-23)
Contributors to this version: Gabriel Rondeau-Genesse (@RondeauG), Juliette Lavoie (@juliettelavoie), Trevor James Smith (@Zeitsperre) and Pascal Bourgault (@aulemahal).
New features and enhancements
parse_directory: Fixes to xr_open_kwargs and support for wildcards (*) in the directories. (PR/19).
New function xscen.ensemble.ensemble_stats added. (GH/3, PR/28).
New functions spatial_mean, climatological_mean and deltas added. (GH/4, PR/35).
Add argument intermediate_reg_grids to xscen.regridding.regrid. (GH/34, PR/39).
Add argument moving_yearly_window to xscen.biasadjust.adjust. (PR/39).
Many adjustments to parse_directory: better wildcards (GH/24), allow custom columns, fastpaths for parse_from_ds, and more (PR/30).
Documentation now makes better use of autodoc to generate package index. (PR/41).
periods argument added to compute_indicators to support datasets with jumps in time (PR/35).
Breaking changes
Patterns in parse_directory start at the end of the paths in directories. (PR/30).
Argument extension of parse_directory has been renamed globpattern. (PR/30).
The xscen API and filestructure have been significantly refactored. (GH/40, PR/41). The following functions are available from the top-level: adjust, train, ensemble_stats, clisops_subset, dispatch_historical_to_future, extract_dataset, resample, restrict_by_resolution, restrict_multimembers, search_data_catalogs, save_to_netcdf, save_to_zarr, rechunk, compute_indicators, regrid_dataset, and create_mask.
xscen now requires geopandas and shapely (PR/35).
Following a change in intake-esm, xscen now uses “cat:” to prefix the dataset attributes extracted from the catalog. All catalog-generated attributes should now be valid when saving to netCDF. (GH/13, PR/51).
Internal changes
parse_directory: Fixes to xr_open_kwargs. (PR/19).
Fix for indicators removing the ‘time’ dimension. (PR/23).
Security scanning using CodeQL and GitHub Actions is now configured for the repository. (PR/21).
Bumpversion action now configured to automatically augment the version number on each merged pull request. (PR/21).
Add align_on = 'year' argument in bias adjustment converting of calendars. (PR/39).
GitHub Actions using Ubuntu-22.04 images are now configured for running testing ensemble using tox-conda. (PR/44).
import xscen smoke test is now run on all pull requests. (PR/44).
Fix for create_mask removing attributes (PR/35).
v0.2.0 (first official release)
Contributors to this version: Gabriel Rondeau-Genesse (@RondeauG), Pascal Bourgault (@aulemahal), Trevor James Smith (@Zeitsperre), Juliette Lavoie (@juliettelavoie).
Announcements
This is the first official release for xscen!
New features and enhancements
Supports workflows with YAML configuration files for better transparency, reproducibility, and long-term backups.
Intake-esm-based catalog to find and manage climate data.
Climate dataset extraction, subsetting, and temporal aggregation.
Calculate missing variables through Intake-esm’s DerivedVariableRegistry.
Regridding with xESMF.
Bias adjustment with xclim.
Breaking changes
N/A
Internal changes
N/A