Open
Description
🐛 Bug Report
The below script shows a difference between lazy aggregated_by
and realized (eager) aggregated_by
.
With py37
, iris=3.0.3
and dask=2021.06.0
real and lazy give the same result.
With py311
, iris>=3.9.0
and dask>=2024.5.0
they are very different at the end of the time series. The lazy version is wrong. I have plots generated with the release candidate for Iris 3.4, where there was no problem.
One more thing I tried was to recreate the lazy cube by touching the data, and then making it lazy again i.e., hovmollers.data = hovmollers.lazy_data()
. The problem went away.
Thanks,
import datetime
import cf_units
import numpy as np
import iris
import iris.coord_categorisation as cat
import iris.quickplot as qplt
def pp_callback(cube, field, filename):
"""
Hovmollers saved by IDL as pp have time headers not understood by Iris,
and wrong units.
"""
if filename.endswith('.pp'):
# Figure out times.
TIME_UNIT = cf_units.Unit('hours since 1970-01-01 00:00:00', 'gregorian')
first_date = datetime.datetime(field.lbyr, field.lbmon, field.lbdat)
last_date = datetime.datetime(field.lbyrd, field.lbmond, field.lbdatd)
first_bound = TIME_UNIT.date2num(first_date)
last_bound = TIME_UNIT.date2num(last_date)
n_bounds = (last_date - first_date).days // 5 + 1
all_bounds = np.linspace(first_bound, last_bound, n_bounds)
bounds = np.vstack((all_bounds[:-1], all_bounds[1:])).T
points = bounds.mean(axis=1)
time_coord = iris.coords.DimCoord(
points, bounds=bounds, standard_name='time', units=TIME_UNIT)
cube.add_dim_coord(time_coord, 0)
# Fix units.
cube.units = 'mm day-1'
hovmollers = iris.load(
'gpcp_pentadal_v2.2_1979_2013.pp',
callback=pp_callback).concatenate_cube()
cat.add_day_of_month(hovmollers, 'time')
cat.add_month_number(hovmollers, 'time')
lazy_stdev = hovmollers.aggregated_by(['month_number', 'day_of_month'], iris.analysis.STD_DEV)
hovmollers.data
real_stdev = hovmollers.aggregated_by(['month_number', 'day_of_month'], iris.analysis.STD_DEV)
diff = lazy_stdev - real_stdev
print(diff.data.max())
qplt.contourf(diff)
qplt.show()
This issue was raised on behalf of @rcomer 👍
Metadata
Metadata
Assignees
Type
Projects
Status
No status