Following a discussion on the mailing list, I'd like to propose adding a new example to the CF Convention document to illustrate the use of cell_methods to specify different mean quantities when using a mask which is time varying (e.g. sea_ice). The qualifier where has been introduced into the cell_methods to specify masked spatial operations, e.g. area: mean where sea_ice to represent a spatial mean over sea ice. The current convention does not explicitly comment on whether the where construct can be used with other dimensions. For the CMIP6 data request there is a requirement to specify the temporal mean of quantities averaged over sea ice, and the spatial extent of the sea ice is generally varying in time.
The proposal is to make it clear that use of where for non-spatial dimensions is allowed by adding examples in section 7. It is also necessary to provide these examples to clarify the subtle differences implied by different formulations of the cell_methods statement.
Clarification at start of section 7.3.3
Add a clarification after this sentence in the first paragraph of 7.3.3 "Sometimes, however, it is useful to limit consideration to only a portion of a cell (e.g. a mean over the sea-ice area)", to introduce the idea of time-varying area fractions:
The portion concerned is constant in time in many cases, but it could be time-varying.
New example for time-varying area fractions
The following new example and explanatory text should be added in section 7.3.3:
Example 7.8: Time mean over area fractions which vary with time
float simple_mean(lat,lon):
  simple_mean:cell_methods: area: mean where sea_ice time: mean
float weighted_mean(lat,lon):
  weighted_mean:cell_methods: area: time: mean where sea_ice
float partial_mean(lat,lon):
  partial_mean:cell_methods: area: mean where sea_ice over sea time: mean
When the area fraction is varying with time, there are several different ways in which a time mean can be formulated. Three of these are illustrated in this example. Suppose, for instance, we are averaging over three time steps and the data at one grid point are -10, -6, -2 with area fractions .75, .50, .25. The values of simple_mean, weighted_mean and partial_mean are, respectively, (-10 - 6 - 2)/3 = -6, (-10*.75 - 6*.5 - 2*.25)/(.75 + .5 + .25) = -7.333, and (-10*.75 - 6*.5 - 2*.25)/3 = -3.667. The partial mean provides the contribution to the mean over the entire grid from a specified area type. The simple mean weights each time period equally, while the weighted mean gives equal weight to each unit area of sea_ice.
In example 7.8, time could be replaced by any other coordinate over which an average is taken, such as an ensemble index.
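The arithmetic in the proposed Example 7.8 can be checked with a short Python sketch. This is only an illustration of the three formulas as described in the text; the function names mirror the variable names in the example but are otherwise hypothetical, and the sketch handles a single grid point with plain lists rather than real CF data.

```python
# Illustrative check of the three time means in the proposed Example 7.8.
# The function names are hypothetical helpers, not part of CF or any library.

def simple_mean(values):
    """Time mean of the sea-ice means, each time step weighted equally."""
    return sum(values) / len(values)


def weighted_mean(values, fractions):
    """Mean weighted by the sea-ice area fraction at each time step."""
    weighted_sum = sum(v * f for v, f in zip(values, fractions))
    return weighted_sum / sum(fractions)


def partial_mean(values, fractions):
    """Contribution of sea ice to the time mean over the whole cell."""
    weighted_sum = sum(v * f for v, f in zip(values, fractions))
    return weighted_sum / len(values)


# Data at one grid point over three time steps, as in the example text.
values = [-10.0, -6.0, -2.0]
fractions = [0.75, 0.50, 0.25]

print(f"simple_mean   = {simple_mean(values):.3f}")               # -6.000
print(f"weighted_mean = {weighted_mean(values, fractions):.3f}")  # -7.333
print(f"partial_mean  = {partial_mean(values, fractions):.3f}")   # -3.667
```

The weighted and partial means share the same numerator and differ only in the divisor: the sum of the area fractions in one case, the number of time steps in the other.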
Hello Jonathan,
The answer to the first question is yes, and I've modified the layout to make it a bit clearer.
On the 2nd point, I'm not sure. What I'm proposing is clarifying the usage with time-varying area fractions, or, in principle, area fractions which vary with any other coordinate dimension (e.g. an ensemble index). At the moment the construct is still dependent on the standard name area_fraction, so it can only apply to horizontal areas.
I've added a sentence to the proposed insert to clarify that time can be replaced by another dimension.
regards, Martin
Dear Martin
Thanks for the first point.
CF 7.3.3 begins "By default, the statistical method indicated by cell_methods is assumed to have been evaluated over the entire horizontal area of the cell. Sometimes, however, it is useful to limit consideration to only a portion of a cell (e.g. a mean over the sea-ice area)." My concern is that this implies that the "where" syntax is concerned only with fractions of area, which is indeed what it was intended to imply. If we are now generalising it to fractions of a cell in any possible dimension, I think that we need to make this clear at the outset.
Best wishes
Jonathan
Dear Jonathan,
Perhaps there has been a misunderstanding: what I am proposing is clarifying the cell_methods usage for time-varying area fractions. The dimensions over which the method (mean in the example) is applied will then be area and time. The area fraction is still, however, the fraction of the cell area identified with the specified type at any time. I was not proposing to generalise to other kinds of cell fractions.
regards, Martin
Dear Martin
You could then insert some text to acknowledge that the areas could be time-varying, which is not obvious currently. Alternatively you could take a broader view, which is how it seemed to me. Consider that you only had a dimension of time and were measuring sea-ice at a point. Would it not be natural still to allow a cell_method of "time: mean where sea_ice"?
Best wishes
Jonathan
Dear Jonathan,
I'm not sure I understand the first point: the fact that the proposed example 7.8 is about time varying area fractions is in the example title.
On the 2nd point: yes, I can see that this is possible, but I don't think it would fit in 7.3.3, which is about portions of cells. If you want to generalise the idea of cell, I think that would raise quite a number of issues. As I don't have a use case for this, I'd like to leave it out of this ticket. Another approach might be to add a new sub-section, e.g. "7.3.5 Cell methods on a masked axis" (as you would essentially be using sea_ice as a mask). This looks like a sensible generalisation to me, but it is not directly relevant to my use case, so, again, I would like to leave it out of this ticket.
regards, Martin
Dear Martin
I suggest inserting a new third sentence in the first paragraph of 7.3.3: "The portion concerned is constant in time in many cases, but it could be time-varying." This will make the generality clear.
I think the generalisation to applying "where" in time when there is no space would be obvious, but we can leave it until it's explicitly requested, as you say.
Best wishes
Jonathan
Dear Jonathan,
First point: Thanks, I see the point now. I've added your suggested sentence to the proposal.
Second point: OK
regards, Martin
