Opened 3 years ago
Closed 9 months ago
#140 closed enhancement (fixed)
Clarifying the role of attributes on boundary variables.
| Reported by: | davidhassell | Owned by: | cf-conventions@… |
|---|---|---|---|
| Priority: | medium | Milestone: | |
| Component: | cf-conventions | Version: | |
| Keywords: | boundary variable, attribute | Cc: | |
Description
1. Title
Clarifying the role of attributes on boundary variables.
2. Moderator
TBC (any offer will be gladly accepted).
3. Requirement
To disallow inconsistencies between particular attributes of a boundary variable and its associated coordinate or auxiliary coordinate variable.
For example, it is currently possible for a boundary variable to have a different standard_name attribute to its associated coordinate or auxiliary coordinate variable. This would be unsatisfactory because the user of the data cannot know which of the possibilities is correct.
It is proposed that if a boundary variable has attributes which determine the coordinate type (units, standard_name, axis and positive) or those which affect the interpretation of the boundary variable array values (units, calendar, leap_month, leap_year and month_lengths) then they must always agree exactly with the same attributes of its associated coordinate or auxiliary coordinate variable. In addition, it is recommended that these attributes are not provided to a boundary variable since they are already inherited implicitly.
No restriction is made on any other boundary variable attributes.
This does not affect datasets encoded with previous versions of CF.
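As a rough illustration of the agreement rule proposed above, here is a minimal Python sketch. It assumes attributes are held in plain dicts (a hypothetical representation, not a real netCDF API), and the function name `find_attribute_conflicts` is invented for this example. It also assumes that an attribute present on the boundary variable but absent from its parent counts as a disagreement, which the proposal does not spell out.

```python
# Attributes that the proposal says must agree exactly when present
# on a boundary variable.
AGREEMENT_ATTRIBUTES = (
    "units", "standard_name", "axis", "positive",
    "calendar", "leap_month", "leap_year", "month_lengths",
)


def find_attribute_conflicts(parent_attrs, bounds_attrs):
    """Return (name, parent_value, bounds_value) for each disagreement.

    Both arguments are dicts mapping attribute name to value.
    An attribute missing from the boundary variable never conflicts,
    because it is inherited implicitly from the parent.
    """
    conflicts = []
    for name in AGREEMENT_ATTRIBUTES:
        if name not in bounds_attrs:
            continue
        parent_value = parent_attrs.get(name)  # None if the parent lacks it
        if bounds_attrs[name] != parent_value:
            conflicts.append((name, parent_value, bounds_attrs[name]))
    return conflicts


# Example: the standard_name on the bounds contradicts the coordinate.
time_attrs = {"standard_name": "time", "units": "days since 2000-01-01"}
time_bnds_attrs = {"standard_name": "forecast_reference_time"}
print(find_attribute_conflicts(time_attrs, time_bnds_attrs))
```

Attributes outside the listed set, such as long_name, are deliberately ignored, since the proposal places no restriction on them.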
4. Initial Statement of Technical Proposal
The following changes should be made to section 7.1. Cell Boundaries (additions marked by **TEXT**, deletions by ~~TEXT~~):
To represent cells we add the attribute bounds to the appropriate coordinate variable(s). The value of bounds is the name of the variable that contains the vertices of the cell boundaries. We refer to this type of variable as a "boundary variable". A boundary variable will have one more dimension than its associated coordinate or auxiliary coordinate variable. The additional dimension should be the most rapidly varying one, and its size is the maximum number of cell vertices. Since a boundary variable is considered to be part of a coordinate variable's metadata, it is not necessary to provide it with attributes (such as long_name and units)~~.~~ **and providing no attributes is always acceptable. Boundary variable attributes which determine the coordinate type (units, standard_name, axis and positive) or those which affect the interpretation of the array values (units, calendar, leap_month, leap_year and month_lengths) must always agree exactly with the same attributes of its associated coordinate or auxiliary coordinate variable. To avoid duplication, however, it is recommended that the attributes units, standard_name, axis, positive, calendar, leap_month, leap_year and month_lengths are not provided to a boundary variable.**
In section 7.1 Cell Boundaries of the conformance document (additions marked by **TEXT**, deletions by ~~TEXT~~):
- Requirements:
  - The type of the bounds attribute is a string whose value is a single variable name. The specified variable must exist in the file.
  - A boundary variable must have the same dimensions as its associated variable, plus have a trailing dimension (CDL order) for the maximum number of vertices in a cell.
  - A boundary variable must be a numeric data type.
  - If a boundary variable has ~~units or standard_name attributes, they must agree with those of its associated variable.~~ **units, standard_name, axis, positive, calendar, leap_month, leap_year and month_lengths attributes, they must agree exactly with those of its associated variable.**
- Recommendations:
  - The points specified by a coordinate or auxiliary coordinate variable should lie within, or on the boundary, of the cells specified by the associated boundary variable.
  - Boundary variables should not have the ~~_FillValue or missing_value~~ **_FillValue, missing_value, units, standard_name, axis, positive, calendar, leap_month, leap_year or month_lengths** attributes.
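To make the split between requirements and recommendations concrete, the following is a minimal Python sketch of how a checker might treat the dimension requirement and the attribute recommendation. The function name, the message strings and the use of plain tuples and dicts are all assumptions made for illustration. This is not how the CF checker is actually implemented.

```python
# Attributes that the proposed recommendation says a boundary variable
# should not carry.
DISCOURAGED_ATTRIBUTES = (
    "_FillValue", "missing_value", "units", "standard_name", "axis",
    "positive", "calendar", "leap_month", "leap_year", "month_lengths",
)


def check_boundary_variable(parent_dims, bounds_dims, bounds_attrs):
    """Return (errors, warnings) for one boundary variable.

    parent_dims and bounds_dims are sequences of dimension names in
    CDL order; bounds_attrs is a dict of the boundary variable's attributes.
    """
    errors, warnings = [], []
    parent_dims, bounds_dims = tuple(parent_dims), tuple(bounds_dims)

    # Requirement: same dimensions as the parent plus one trailing dimension.
    if len(bounds_dims) != len(parent_dims) + 1 or bounds_dims[:-1] != parent_dims:
        errors.append("dimensions must be those of the parent plus one trailing dimension")

    # Recommendation: the listed attributes are better left off the bounds.
    for name in DISCOURAGED_ATTRIBUTES:
        if name in bounds_attrs:
            warnings.append(f"attribute '{name}' is not recommended on a boundary variable")

    return errors, warnings


errors, warnings = check_boundary_variable(("lat",), ("lat", "nv"), {"units": "degrees_north"})
print(errors, warnings)
```

A failed requirement is reported as an error and a breached recommendation as a warning, mirroring the two lists above.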
5. Benefits
It would be disallowed to encode type-determining attributes (units, calendar, standard_name, axis and positive) or array value interpretation attributes (units, calendar, leap_month, leap_year and month_lengths) on a boundary variable if they conflict with those of the associated coordinate or auxiliary coordinate variable.
6. Status Quo
Attributes on a boundary variable may conflict with the associated coordinate or auxiliary coordinate variable, and this is not always checked by the CF checker.
This proposal does not affect datasets encoded under previous versions of CF, other than via the potential for extra warnings being raised by the CF checker.
David Hassell
Change History (32)
comment:1 Changed 3 years ago by jonathan
comment:2 Changed 12 months ago by taylor13
I support this proposal with the caveat that if we allow formula_terms on parametric coordinate *bounds* (as I've advocated in ticket #147), then we might want to include some mention here that the formula_terms attached to the bounds should be consistent with the formula_terms attached to the parametric coordinate variable itself. By "consistent" I mean that the same parameters must be defined (but of course the parameter values will be stored in different variables from the parameters of the coordinates themselves).
thanks, David, for proposing this change.
best regards, Karl
comment:3 Changed 12 months ago by davidhassell
Karl,
I agree with your note on formula_terms. I would go further to say that if the parent coordinate variable also has formula_terms which refers to a variable with bounds then those bounds must be referred to by the same parameter of the bounds' formula_terms.
I'll draft some text to add to the section 7.1 and conformance changes proposed above...
Thanks, David
comment:4 Changed 12 months ago by davidhassell
Proposed first paragraph of section 7.1. Cell Boundaries (original additions marked by **TEXT**, deletions by ~~TEXT~~, new additions in ***TEXT***):
To represent cells we add the attribute bounds to the appropriate coordinate variable(s). The value of bounds is the name of the variable that contains the vertices of the cell boundaries. We refer to this type of variable as a "boundary variable". A boundary variable will have one more dimension than its associated coordinate or auxiliary coordinate variable. The additional dimension should be the most rapidly varying one, and its size is the maximum number of cell vertices. Since a boundary variable is considered to be part of a coordinate variable's metadata, it is not necessary to provide it with attributes (such as long_name and units)~~.~~ **and providing no attributes is always acceptable. Boundary variable attributes which determine the coordinate type (units, standard_name, axis and positive) or those which affect the interpretation of the array values (units, calendar, leap_month, leap_year and month_lengths) must always agree exactly with the same attributes of its associated coordinate or auxiliary coordinate variable. To avoid duplication, however, it is recommended that the attributes units, standard_name, axis, positive, calendar, leap_month, leap_year and month_lengths are not provided to a boundary variable.** ***If the associated variable is a parametric coordinate variable with a formula_terms attribute (ref section 4.3.2) then two cases are possible:***
1) ***if the boundary variable also has a formula_terms attribute then its terms must be the same as those for the parametric coordinate variable, but with different variables named as term values and using, wherever possible, the boundary variables of variables named by the parametric coordinate variable's formula_terms;***
2) ***if the boundary variable does not have a formula_terms attribute then it is assumed that the formula_terms of the parametric coordinate variable applies, substituting a named variable with its boundary variable, wherever possible.***
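Case 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: formula_terms is given as a string, and `bounds_of` is a hypothetical dict mapping a variable name to the boundary variable named by its bounds attribute. The function names are invented for this example.

```python
import re


def parse_formula_terms(text):
    """Turn 'a: A b: B' into {'a': 'A', 'b': 'B'}.

    The pattern tolerates a missing space after the colon ('p0:p0').
    """
    return dict(re.findall(r"(\w+)\s*:\s*(\w+)", text))


def implied_bounds_formula_terms(formula_terms, bounds_of):
    """Infer the terms that apply to the boundary variable.

    Each variable named by the parent's formula_terms is replaced by its
    boundary variable where one exists and is kept unchanged otherwise.
    """
    terms = parse_formula_terms(formula_terms)
    return {term: bounds_of.get(var, var) for term, var in terms.items()}


# 'a' and 'b' have bounds attributes; 'ps' and 'p0' do not.
bounds_of = {"a": "abnds", "b": "bbnds"}
print(implied_bounds_formula_terms("a: a b: b ps: ps p0: p0", bounds_of))
```

In case 1 the same dictionary would instead be read directly from the boundary variable's own formula_terms attribute.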
We'll also need some extra changes to the conformance document ...
comment:5 Changed 12 months ago by taylor13
Just to note: This would be much less complicated if we decide to reject Jonathan's alternative under ticket #147. Then the new text would read:
If the boundary variable is associated with a parametric coordinate variable and both the coordinate variable and the boundary variable have formula_terms (ref section 4.3.2), then the terms in the formula definition must be the same for the coordinates and its bounds, but with different parametric variable names specified for any terms in the definition that depend on the vertical coordinate.
comment:6 Changed 12 months ago by davidhassell
Hi Karl,
I like this approach, but I think we can retain your clarity whilst retaining Jonathan's alternative:
If the boundary variable is associated with a parametric coordinate variable then it is assumed that the formula definition of the parametric coordinate variable also applies to the bounds. The term values are the same except when the named variable depends on the vertical coordinate, in which case the named variable is substituted with its boundary variable, if it exists. Note that a formula_terms attribute may also be provided on a boundary variable provided it adheres to these restrictions.
comment:7 Changed 12 months ago by jonathan
Dear Karl and David
I don't understand why Karl thinks the 1D formula terms (things like sigma values) are not anything like coordinate data. I have the same view as David that they do contain something like coordinate data, even though they're not coordinates by themselves. Evidently they do have bounds; in Karl's preferred arrangement (David's second case in comment 21 of ticket 147), hybrid_sigma:formula_terms points to A_bounds and B_bounds. If you don't call these boundary variables, what are they? If they are boundary variables, why not point to them with a bounds attribute? However since we've said all these things already, and know each other's point of view, it must be some philosophical disagreement. We'll have to arrange a conference about it sometime!
There is an important advantage in Karl's arrangement that you don't have to work out the identities of the formula terms for the bounds, since there's a formula_terms attribute to tell you them explicitly. What if we make it mandatory for the bounds variable of a parametric vertical coordinate to have a formula_terms attribute? This would be a backward-incompatible change, in the sense that data that was compliant with earlier versions of CF might not be compliant with the new version.
That would simplify the text here. Starting from the bold bit, we would have
If a parametric coordinate variable with a formula_terms attribute (ref section 4.3.2) also has a bounds attribute, its boundary variable must have a formula_terms attribute too. Because the same standard_name must describe both variables, the formula must have the same terms (as specified in Appendix D), but a different variable must be named by the two formula_terms attributes for any term which depends on the vertical dimension, because the boundary variables have one more dimension.
Then my preferred arrangement can be permitted by further text
The boundary variables for these formula terms may also be identified by bounds attributes of the formula terms variables. In that case, the formula_terms of the boundary variable and the bounds of the formula terms variables must be consistent.
So this permits David's case 2 and the case 3 I wrote down in ticket 147, but not David's case 1, which Karl doesn't like. In Martin's list in comment 20 of ticket 147, I would advocate option 4 - do nothing. We always permit non-standardised attributes in CF. The formula_terms attribute used other than for variables containing coordinate data (in the broad sense in which David and I interpret it) doesn't mean anything to CF, but it's allowed. It may have a meaning to the data-writer. Of course, it might be a mistake as well, but we don't police such mistakes. We have no general prohibition of or recommendation against using attributes from Appendix A in situations where CF doesn't describe their use.
Best wishes
Jonathan
comment:8 Changed 12 months ago by davidhassell
Hello Karl, Jonathan,
Allowing the term values which span the vertical dimension to not have a bounds attribute would certainly make writing software harder, as the software would have to work out that a formula terms named variable is associated with a boundary variable and then make that connection explicit.
Running with Jonathan's idea of insisting that the boundary variable has a formula_terms attribute, I would take it further and insist that term values which span the vertical dimension must have a bounds attribute which points to the appropriate variable named in the boundary variable's formula_terms for its boundary variable. This is also a backwards-incompatible change:
If a parametric coordinate variable with a formula_terms attribute (ref section 4.3.2) also has a bounds attribute, its boundary variable must have a formula_terms attribute too. Because the same standard_name must describe both variables, the formula must have the same terms (as specified in Appendix D), but a different variable must be named by the two formula_terms attributes for any term which depends on the vertical dimension, because the boundary variables have one more dimension. For these terms, the boundary variable's formula_terms must name the bounds of the variables named by the vertical coordinate variable's formula_terms.
That said, I like to think that we can find some non-confusing wording which allows my case 1, and so no backward-incompatible changes would be necessary.
All the best,
David
comment:9 Changed 12 months ago by taylor13
Hi David,
Could you expand on why you think software will want to extract the so-called "bounds" values for variables appearing in formula_terms along with the values themselves? I would have thought that for parametric coordinates you would want to primarily associate formula terms with the coordinate values they are used to transform. So for the coordinates themselves you would associate the parameter values in the formula_terms that is attached to the parametric coordinate. For the *bounds* on that coordinate you would associate the parameter values in the formula_terms attached to the parametric coordinate's bounds.
Why is there any need to associate the parameter values used for coordinate bound transformations with the parameter values used for coordinate transformations? I should think these two sets of parameter values will invariably be used independently. I suppose one might want to put into a container all the coordinate and bound information, but I don't think you would ever put the coordinate information and the coordinate and bounds parameters together without also including the coordinate bounds themselves. If this is the case then your code could easily construct such a container without a "bounds" attribute attached to the parameter variables.
I'm sorry if I'm a bit slow on this, but you seem to have a specific use case where "working out" needed relationships is difficult. Could you describe it in a bit more detail? This could help us reach consensus.
Sorry this seems to be taking up your valuable time, but I assure you if there is a compelling use case, then I'll favor including Jonathan's alternative.
best regards, Karl
comment:10 Changed 11 months ago by davidhassell
Hello Karl, Jonathan and all,
It'd be nice to wrap this up! How about this text, which says "feel free to put formula_terms on bounds, but if they're not there then infer them from the parent parametric coordinate":
If a parametric coordinate variable with a formula_terms attribute (ref section 4.3.2) also has a bounds attribute, then that formula must also be associated with the boundary variable, because the same standard_name describes both variables. The association is explicit if the boundary variable also has a formula_terms attribute defining the same terms (as specified in Appendix D), but a different variable must be named for any term which depends on the vertical dimension, because the boundary variables have one more dimension. If the association is implicit then the formula_terms attribute is omitted from the boundary variable and it is assumed that the formula_terms attribute from the parametric coordinate variable applies, but for any term which depends on the vertical dimension, the named variable is replaced with its boundary variable. In this case, such named variables must be auxiliary coordinate variables with associated boundary variables.
Explicit example:
float eta(eta) ;
  eta:long_name = "eta at layer midpoints" ;
  eta:positive = "down" ;
  eta:standard_name = "atmosphere_hybrid_sigma_pressure_coordinate" ;
  eta:formula_terms = "a: a b: b ps: ps p0: p0" ;
  eta:bounds = "etabnds" ;
float etabnds(eta, 2) ;
  etabnds:formula_terms = "a: abnds b: bbnds ps: ps p0: p0" ;
float a(eta) ;
  a:long_name = "'a' coefficient for vertical coordinate (at full levels)" ;
  a:units = "Pa" ;
float b(eta) ;
  b:long_name = "'b' coefficient for vertical coordinate (at full levels)" ;
  b:units = "Pa Pa-1" ;
float abnds(eta, 2) ;
  abnds:long_name = "'a' coefficient for vertical coordinate (at half-levels)" ;
float bbnds(eta, 2) ;
  bbnds:long_name = "'b' coefficient for vertical coordinate (at half-levels)" ;
float ps(lat, lon) ;
  ps:units = "Pa" ;
float p0 ;
  p0:units = "Pa" ;
float T(eta, lat, lon) ;
  T:standard_name = "air_temperature" ;
  T:units = "K" ;
Implicit example:
float eta(eta) ;
  eta:long_name = "eta at layer midpoints" ;
  eta:positive = "down" ;
  eta:standard_name = "atmosphere_hybrid_sigma_pressure_coordinate" ;
  eta:formula_terms = "a: a b: b ps: ps p0: p0" ;
  eta:bounds = "etabnds" ;
float etabnds(eta, 2) ;
float a(eta) ;
  a:long_name = "'a' coefficient for vertical coordinate (at full levels)" ;
  a:units = "Pa" ;
  a:bounds = "abnds" ;
float b(eta) ;
  b:long_name = "'b' coefficient for vertical coordinate (at full levels)" ;
  b:units = "Pa Pa-1" ;
  b:bounds = "bbnds" ;
float abnds(eta, 2) ;
float bbnds(eta, 2) ;
float ps(lat, lon) ;
  ps:units = "Pa" ;
float p0 ;
  p0:units = "Pa" ;
float T(eta, lat, lon) ;
  T:standard_name = "air_temperature" ;
  T:units = "K" ;
  T:coordinates = "a b" ;
Note also that there is nothing stopping the two methods being combined, which the checker would have to look out for. E.g.
float eta(eta) ;
  eta:long_name = "eta at layer midpoints" ;
  eta:positive = "down" ;
  eta:standard_name = "atmosphere_hybrid_sigma_pressure_coordinate" ;
  eta:formula_terms = "a: a b: b ps: ps p0: p0" ;
  eta:bounds = "etabnds" ;
float etabnds(eta, 2) ;
  etabnds:formula_terms = "a: abnds b: bbnds ps: ps p0: p0" ;
float a(eta) ;
  a:long_name = "'a' coefficient for vertical coordinate (at full levels)" ;
  a:units = "Pa" ;
  a:bounds = "abnds" ;
float b(eta) ;
  b:long_name = "'b' coefficient for vertical coordinate (at full levels)" ;
  b:units = "Pa Pa-1" ;
  b:bounds = "bbnds" ;
float abnds(eta, 2) ;
float bbnds(eta, 2) ;
float ps(lat, lon) ;
  ps:units = "Pa" ;
float p0 ;
  p0:units = "Pa" ;
float T(eta, lat, lon) ;
  T:standard_name = "air_temperature" ;
  T:units = "K" ;
  T:coordinates = "a b" ;
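When the two methods are combined, the check a checker would need is that both routes lead to the same boundary variables. A minimal Python sketch follows, assuming the attributes have already been read into strings and a dict. The names `inconsistent_terms` and `bounds_attr` are hypothetical and do not come from any existing tool.

```python
import re


def parse_formula_terms(text):
    """Turn 'a: A b: B' into {'a': 'A', 'b': 'B'}."""
    return dict(re.findall(r"(\w+)\s*:\s*(\w+)", text))


def inconsistent_terms(coord_formula_terms, bounds_formula_terms, bounds_attr):
    """List the terms for which the explicit and implicit routes disagree.

    bounds_attr maps a variable name to the value of its bounds attribute,
    for those formula terms variables that have one.
    """
    coord_terms = parse_formula_terms(coord_formula_terms)
    bounds_terms = parse_formula_terms(bounds_formula_terms)
    problems = []
    for term, var in coord_terms.items():
        declared = bounds_attr.get(var)
        # Only variables that declare bounds can contradict the explicit terms.
        if declared is not None and bounds_terms.get(term) != declared:
            problems.append(term)
    return problems


print(inconsistent_terms(
    "a: a b: b ps: ps p0: p0",
    "a: abnds b: bbnds ps: ps p0: p0",
    {"a": "abnds", "b": "bbnds"},
))
```

Terms such as ps and p0, whose variables declare no bounds, are never reported, so a file that uses only the explicit method passes this check trivially.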
comment:11 Changed 11 months ago by jonathan
Dear David
Thanks for your posting. I agree that we should resolve this for the sake of CMIP6 and CF - we could still get it into 1.7 if we agree now about it.
I would be happy with what you suggest, but I think that Karl wouldn't be. He thinks it's important that the bounds must have formula terms, so he would not like to allow your "Implicit" case. For the sake of reaching an agreement, I think it would be fine to require bounds to have formula terms (as Karl prefers), but I still want to permit formula terms to have bounds (as you and I prefer) and hence allow your "two methods" case.
That was the aim of the text I proposed in comment 7
If a parametric coordinate variable with a formula_terms attribute (ref section 4.3.2) also has a bounds attribute, its boundary variable must have a formula_terms attribute too. Because the same standard_name must describe both variables, the formula must have the same terms (as specified in Appendix D), but a different variable must be named by the two formula_terms attributes for any term which depends on the vertical dimension, because the boundary variables have one more dimension. The boundary variables for these formula terms may also be identified by bounds attributes of the formula terms variables. In that case, the formula_terms of the boundary variable and the bounds of the formula terms variables must be consistent.
which would be well-illustrated by your "explicit" and "two methods" CDL.
What do you think, Karl?
Cheers
Jonathan
comment:12 Changed 11 months ago by davidhassell
Hello Jonathan, Karl,
I too prefer the "explicit" method, but am quite keen on keeping the "implicit" method to preserve backwards compatibility. To date, both methods have been allowed (by virtue of neither being disallowed), so to preclude the "implicit" method now would cause problems.
I would be more than happy to say that the "explicit" method is strongly recommended.
Note that the implicit method requires that formula_terms variables with bounds are auxiliary coordinate variables - otherwise their bounds attributes are not standardised.
Given all that, how about this text, which updates Jonathan's:
If a parametric coordinate variable with a formula_terms attribute (ref section 4.3.2) also has a bounds attribute, it is strongly recommended that its boundary variable has a formula_terms attribute too. Because the same standard_name must describe both variables, the formula must have the same terms (as specified in Appendix D), but a different variable must be named by the two formula_terms attributes for any term which depends on the vertical dimension, because the boundary variables have one more dimension. The boundary variables for these formula terms may also be identified by bounds attributes of the formula terms variables that are auxiliary coordinate variables. In that case, the formula_terms of the boundary variable and the bounds of the formula terms variables must be consistent. If the formula_terms attribute is omitted from the boundary variable then it is assumed that the formula_terms attribute from the parametric coordinate variable applies to the boundary variable, but for any term which depends on the vertical dimension, the named variable is replaced with its boundary variable. Therefore, such a named variable must be an auxiliary coordinate variable with an associated boundary variable.
All the best,
David
comment:13 Changed 10 months ago by taylor13
Thanks, David and Jonathan, for continuing to engage on this, because I agree that it is important to try to include this in 1.7 (in support of CMIP6). I think there is probably no harm (and may be some virtue) in allowing the "implicit" case, so I support Jonathan's most recent proposal and David's amendment that "it is strongly recommended that its boundary variable has a formula_terms attribute too."
I must confess that I didn't find the text added by David at the end, starting with "If the formula_terms attribute is omitted" to be helpful. Perhaps it's the time of the evening or the afterglow of a nice glass of wine, but unless Jonathan thinks this text will be understandable and enlightening, then I would favor omitting it.
Let's try to wrap this up.
Anyone else with any objections to the proposal?
best regards, Karl
comment:14 follow-up: ↓ 15 Changed 10 months ago by davidhassell
Hello Karl,
On a re-read, I agree that we can omit text as you suggest. That would give (changing one "the" to an "a"):
If a parametric coordinate variable with a formula_terms attribute (ref section 4.3.2) also has a bounds attribute, it is strongly recommended that its boundary variable has a formula_terms attribute too. Because the same standard name must describe both variables, the formula must have the same terms (as specified in Appendix D), but a different variable must be named by the two formula_terms attributes for any term which depends on the vertical dimension, because the boundary variables have one more dimension. The boundary variables for these formula terms may also be identified by bounds attributes of the formula terms variables that are auxiliary coordinate variables. In that case, a formula_terms attribute of the boundary variable and the bounds of the formula terms variables must be consistent.
Thanks, David
comment:15 in reply to: ↑ 14 Changed 10 months ago by jonathan
Dear Karl and David
I agree with this text as well, thank you. I suggest a reordering of the penultimate sentence, and I think with this rewording "the" is better than "a" in the final sentence:
If a parametric coordinate variable with a formula_terms attribute (ref section 4.3.2) also has a bounds attribute, it is strongly recommended that its boundary variable has a formula_terms attribute too. Because the same standard name must describe both variables, the formula must have the same terms (as specified in Appendix D), but a different variable must be named by the two formula_terms attributes for any term which depends on the vertical dimension, because the boundary variables have one more dimension. In addition, any formula terms variable which is an auxiliary coordinate variable may have a bounds attribute to identify its boundary variable. In that case, the formula_terms attribute of the boundary variable and the bounds attribute of the formula terms variable must be consistent.
I'm very glad we seem likely to agree this at last! I think an example is also needed, as we previously discussed, presumably the one with both methods shown. David, please could you repeat the proposed example? Also, please could you draft the required changes to the conformance document? We can discuss that if you like.
I assume that if we accept this ticket we also accept the uncontroversial main part of David's proposal in the initial statement, about all the other attributes of boundary variables. I'm still happy with that part. Are you, Karl?
In addition, if we accept this ticket, we can close ticket 147, since this one has dealt with the issue. It's a long-standing issue in CF, and it's good to have it resolved!
Best wishes
Jonathan
comment:16 Changed 10 months ago by davidhassell
Dear Jonathan and Karl,
I'm afraid that I don't favour the "the" in the last sentence. It implies to me that in this case there has to be a formula_terms attribute on the bounds, which is not the case. How about
In that case, the bounds attribute of the formula terms variable must be consistent with the formula_terms attribute of the boundary variable, if present.
or leaving the whole sentence out and moving it to the conformance doc.
I shall re-check and re-post the example.
Many thanks,
David
comment:17 Changed 10 months ago by jonathan
Dear David
Ah, I see what you mean by putting "a" - when I read it first I didn't pick up that implication. I think your new version of this sentence is fine, thanks. It should appear in the conventions document in order to provide the reason for the conformance document to state it as a requirement.
Best wishes
Jonathan
comment:18 Changed 10 months ago by taylor13
Hi David and Jonathan,
I was about to finally sign off on this when I realized what we’ve agreed upon isn’t entirely made clear in the text. I thought Jonathan (in the generous spirit of “reaching an agreement”) proposed that although in the past neither explicit nor implicit methods were part of the standard, both were used (in different datasets). For backward compatibility with these legacy datasets, software should be able to infer the formula_terms using either method, but going forward, we deprecate sole use of the implicit method (but allow it in addition to the explicit method). If that is what we’ve agreed, then would the following text be clear?
If a parametric coordinate variable with a formula_terms attribute (ref section 4.3.2) also has a bounds attribute, its boundary variable must have a formula_terms attribute too. In this case the same terms would appear in both (as specified in Appendix D), since the transformation from the parametric coordinate values to physical space is realized through the same formula. For any term that depends on the vertical dimension, however, the variable names appearing in the formula terms would differ from those found in the formula_terms of the coordinate variable itself because the 2-dimensional bound locations do not generally coincide with the 1-dimensional coordinate locations.
Whenever a formula_terms attribute is attached to a boundary variable, the formula terms may additionally be identified using a second method: the variables appearing in the vertical coordinate's formula_terms may be declared to be auxiliary coordinates, and those coordinates may have bounds attributes that identify their boundary variables. In that case, the bounds attribute of a formula terms variable must be consistent with the formula_terms attribute of the boundary variable. Software digesting legacy datasets (constructed prior to version 1.7 of this standard) may have to rely in some cases on the first method of identifying the formula term variables and in other cases, on the second. Henceforth, however, the first method will be sufficient.
My impression is that Jonathan thought this would be o.k., but David was worried that disallowing sole use of the “implicit” (or second) approach might be problematic.
The "explicit" and "combined" examples provided by David in https://cf-trac.llnl.gov/trac/ticket/140#comment:10 should also be included.
best regards,
Karl
comment:19 Changed 10 months ago by davidhassell
Hi Karl, Jonathan,
OK - I'll go along with that, on the grounds that existing software "should" already cater for both methods. I like the clear wording.
If Jonathan agrees, I'll put together in a new comment all of the new text that this ticket implies, plus the examples and conformances.
Thanks,
David
comment:20 Changed 10 months ago by jonathan
Dear Karl and David
Yes, that wording is fine, thanks. Instead of "Henceforth" in the final sentence, I'd put "Starting from version 1.7", because this text will remain in future versions, when "Henceforth" would be confusing.
Best wishes
Jonathan
comment:21 Changed 10 months ago by davidhassell
That's good, then - I'll put together the entire set of changes in the next few days.
Thanks, David
comment:22 Changed 10 months ago by davidhassell
Here are the proposed changes in full:
After the first paragraph of section 7.1 Cell Boundaries, add the following three paragraphs and two examples:
Boundary variable attributes which determine the coordinate type (units, standard_name, axis and positive) or those which affect the interpretation of the array values (units, calendar, leap_month, leap_year and month_lengths) must always agree exactly with the same attributes of its associated coordinate, scalar coordinate or auxiliary coordinate variable. To avoid duplication, however, it is recommended that these are not provided to a boundary variable.
If a parametric coordinate variable with a formula_terms attribute (section 4.3.2) also has a bounds attribute, its boundary variable must have a formula_terms attribute too. In this case the same terms would appear in both (as specified in Appendix D), since the transformation from the parametric coordinate values to physical space is realized through the same formula. For any term that depends on the vertical dimension, however, the variable names appearing in the formula terms would differ from those found in the formula_terms attribute of the coordinate variable itself because the 2-dimensional bound locations do not generally coincide with the 1-dimensional coordinate locations.
Whenever a formula_terms attribute is attached to a boundary variable, the formula terms may additionally be identified using a second method: variables appearing in the vertical coordinate's formula_terms may be declared to be coordinate, scalar coordinate or auxiliary coordinate variables, and those coordinates may have bounds attributes that identify their boundary variables. In that case, the bounds attribute of a formula terms variable must be consistent with the formula_terms attribute of the boundary variable. Software digesting legacy datasets (constructed prior to version 1.7 of this standard) may have to rely in some cases on the first method of identifying the formula term variables and in other cases, on the second. Starting from version 1.7, however, the first method will be sufficient.
Example: Specifying formula_terms on a boundary variable when the named variables that depend on the vertical dimension are not associated with coordinate, scalar coordinate or auxiliary coordinate variables.
float eta(eta) ;
  eta:long_name = "eta at half levels" ;
  eta:positive = "down" ;
  eta:standard_name = "atmosphere_hybrid_sigma_pressure_coordinate" ;
  eta:formula_terms = "a: A b: B ps: PS p0: P0" ;
  eta:bounds = "eta_bnds" ;
float eta_bnds(eta, 2) ;
  eta_bnds:formula_terms = "a: A_full b: B_full ps: PS p0: P0" ;
float A(eta) ;
  A:long_name = "'a' coefficient for vertical coordinate at half levels" ;
  A:units = "Pa" ;
float B(eta) ;
  B:long_name = "'b' coefficient for vertical coordinate at half levels" ;
  B:units = "1" ;
float A_full(eta, 2) ;
  A_full:long_name = "'a' coefficient for vertical coordinate at full levels" ;
  A_full:units = "Pa" ;
float B_full(eta, 2) ;
  B_full:long_name = "'a' coefficient for vertical coordinate at full levels" ;
  B_full:units = "1" ;
float PS(lat, lon) ;
  PS.units = 'Pa' ;
float P0 ;
  P0.units = 'Pa' ;
float temp(eta, lat, lon) ;
  temp:standard_name = "air_temperature" ;
  temp:units = "K" ;
Example: Specifying formula_terms on a boundary variable when the named variables that depend on the vertical dimension contain boundary values to auxiliary coordinate variables.
float eta(eta) ;
  eta:long_name = "eta at half levels" ;
  eta:positive = "down" ;
  eta:standard_name = " atmosphere_hybrid_sigma_pressure_coordinate" ;
  eta:formula_terms = "a: A b: B ps: PS p0: P0" ;
  eta:bounds = "eta_bnds" ;
float eta_bnds(eta, 2) ;
  eta_bnds:formula_terms = "a: A_bnds b: B_bnds ps: PS p0: P0" ;
float A(eta) ;
  A:long_name = "'a' coefficient for vertical coordinate at full levels" ;
  A:units = "Pa" ;
  A:bounds = "A_bnds" ;
float B(eta) ;
  B:long_name = "'b' coefficient for vertical coordinate at full levels" ;
  B:units = "1" ;
  B:bounds = "B_bnds" ;
float A_bnds(eta, 2) ;
float B_bnds(eta, 2) ;
float PS(lat, lon) ;
  PS.units = 'Pa' ;
float P0 ;
  P0.units = 'Pa' ;
float temp(eta, lat, lon) ;
  temp:standard_name = "air_temperature" ;
  temp:units = "K" ;
  temp:coordinates = "A B" ;
The conformance document is to be changed as follows:
In section 7.1 Cell Boundaries:
Replace requirement "If a boundary variable has units or standard_name attributes, they must agree with those of its associated variable." with
If a boundary variable has units, standard_name, axis, positive, calendar, leap_month, leap_year or month_lengths attributes, they must agree with those of its associated variable.
Add a new requirement (This requirement is backwards incompatible):
Starting with version 1.7, a boundary variable must have a formula_terms attribute when it contains bounds for a parametric vertical coordinate variable that has a formula_terms attribute. In this case the same terms and named variables must appear in both except for any term that depends on the vertical dimension, for which the variable name appearing in the boundary variable's formula_terms must differ from that found in the formula_terms of the coordinate variable itself. In this case, if the named variable in the formula_terms attribute of the vertical coordinate variable is a coordinate, scalar coordinate or auxiliary coordinate variable then its bounds attribute must be consistent with the equivalent term in formula_terms attribute of the boundary variable.
Replace recommendation "Boundary variables should not have the _FillValue or missing_value attributes." with
Boundary variables should not have the _FillValue, missing_value, units, standard_name, axis, positive, calendar, leap_month, leap_year or month_lengths attributes.
comment:23 Changed 10 months ago by taylor13
Hi David,
Thanks for getting this done so quickly.
This looks fine except:
In the examples, the half levels should apply to bounds, and the full levels to the coordinate itself. This is, I think, always the case for eta; one example is provided at: https://rda.ucar.edu/datasets/ds627.2/docs/Eta_coordinate/
In the 2nd example, wouldn't it be better to label it: "Linking variables appearing in formula_terms associated with coordinate values with their counterparts associated with the bounds of the coordinates."
In the 2nd example, I think we should require the units attribute to be defined for both A_bnds and B_bnds. Otherwise software relying on the now recommended method for defining these would have to figure out the units from the bounds attributes attached to A and B, which is more work. We would probably also want to tweak the last sentence in the 1st paragraph which states: "To avoid duplication, however, it is recommended that these are not provided to a boundary variable" (since, in the case of bounds attached to formula term variables, I think the units should be provided).
best regards, Karl
comment:24 Changed 10 months ago by jonathan
Dear David and Karl
Thanks for doing this, David.
I agree with Karl that it would seem more obvious and usual to have the half-levels as bounds in the examples.
The two examples are of course very similar. I can imagine a reader glancing at them and thinking they are the same, and being consequently puzzled, or poring over them to spot the differences, as I have just been doing. :-) The titles of the examples are not easy to understand, I would say, and therefore don't help very much. I suggest solving these difficulties by having only one example, labelled something like "Specifying formula_terms when a parametric coordinate variable has bounds", in which you give the bounds attributes of the formula terms variables and the coordinates attribute of the data variable a CDL comment such as // This attribute is included for the optional second method. You could perhaps also comment on the formula_terms of the bounds // This attribute is mandatory.
I'm sorry to say that I might disagree with Karl's final comment. I hope this isn't going to break our consensus! Actually I hadn't thought of this consequence before, which relates to our different views about the paths to formula terms, and indeed the original motive for this ticket. David and I feel that bounds variables are closely related to their parent variables, and the natural route to a bounds variable is via its parent; therefore they can share attributes, including units and standard_name. Thus Karl's requirement would contradict the first paragraph if the formula terms have bounds, as you say, Karl. What do you think, David? Karl, is there a use-case in which you are dealing with eta_bnds without being aware of eta?
In the second paragraph, we currently have, "the variable names appearing in the formula terms would differ from those found in the formula_terms attribute of the coordinate variable itself because the 2-dimensional bound locations do not generally coincide with the 1-dimensional coordinate locations." I think we can state this more strongly and clearly, "... because the boundary variables for formula terms are two-dimensional while the formula terms themselves are one-dimensional." That is, they must be different variables not because the locations are different, though that is almost certainly the case, but because they have different dimensionality.
In the new requirement for the conformance document, can we say something more about the dimensionality of the bounds variable of the formula terms? I think it must have the same dimensions as the formula terms variable, with the addition of the final (CDL) dimension of the coordinate bounds variable.
Best wishes
Jonathan
comment:25 Changed 10 months ago by taylor13
Dear David and Jonathan,
I like Jonathan's idea to combine both examples and annotate it.
I don't think it important to *require* the units attributes to be included with the parametric bounds variables, so let's not worry about my "final comment". For the record, I think the most common use case where one might care only about the eta bounds and not eta is in performing a mass-weighted vertical integral of a quantity. One would need the values of the quantity being integrated and the eta-bounds. The eta-bounds would be transformed to pressure using the formula terms attached to eta-bounds. The pressure "widths" of each cell could then be calculated and multiplied by the variable being integrated; then summed. There would be no need to know anything about eta itself. Of course, in this procedure there would be no need to know anything about the parametric units either, so it doesn't demand that we make it easy to extract the units associated with A_bnds and B_bnds.
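The integral described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it handles a single column, it assumes the Appendix D form p = a*p0 + b*ps for the terms a, b, ps and p0 with a dimensionless a, and the function names, the coefficient values and the value of g are all made up for the example.

```python
def pressure_bounds(a_bnds, b_bnds, ps, p0):
    """Convert parametric bounds to pressure bounds for one column.

    Assumes p = a*p0 + b*ps; a_bnds and b_bnds hold one
    [lower, upper] pair per layer.
    """
    return [
        [a * p0 + b * ps for a, b in zip(a_pair, b_pair)]
        for a_pair, b_pair in zip(a_bnds, b_bnds)
    ]


def mass_weighted_integral(values, a_bnds, b_bnds, ps, p0, g=9.80665):
    """Sum value * (pressure width) / g over the layers of one column."""
    p_bnds = pressure_bounds(a_bnds, b_bnds, ps, p0)
    return sum(v * abs(p_hi - p_lo) / g for v, (p_lo, p_hi) in zip(values, p_bnds))


# Hypothetical three-layer column (illustrative numbers only).
a_bnds = [[0.0, 0.2], [0.2, 0.1], [0.1, 0.0]]
b_bnds = [[0.0, 0.0], [0.0, 0.4], [0.4, 1.0]]
ps = 100000.0  # surface pressure in Pa
p0 = 100000.0  # reference pressure in Pa
values = [220.0, 250.0, 280.0]  # quantity being integrated, one per layer

column_total = mass_weighted_integral(values, a_bnds, b_bnds, ps, p0)
print(column_total)
```

Note that nothing in the sketch reads eta itself or any units attribute: only the terms named in the formula_terms of the bounds are needed.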
I think Jonathan's rewrite of the sentence in the 2nd paragraph is an improvement.
I think we must be about "there".
best wishes, Karl
comment:26 Changed 10 months ago by davidhassell
Dear Karl and Jonathan,
I agree with all of your suggestions and changes. I hope that I have parsed them correctly and that we are indeed "there". The proposed changes, in full, are now:
After the first paragraph of section 7.1 Cell Boundaries, add the following three paragraphs and two examples:
Boundary variable attributes which determine the coordinate type (units, standard_name, axis and positive) or those which affect the interpretation of the array values (units, calendar, leap_month, leap_year and month_lengths) must always agree exactly with the same attributes of its associated coordinate, scalar coordinate or auxiliary coordinate variable. To avoid duplication, however, it is recommended that these are not provided to a boundary variable.
If a parametric coordinate variable with a formula_terms attribute (section 4.3.2) also has a bounds attribute, its boundary variable must have a formula_terms attribute too. In this case the same terms would appear in both (as specified in Appendix D), since the transformation from the parametric coordinate values to physical space is realized through the same formula. For any term that depends on the vertical dimension, however, the variable names appearing in the formula terms would differ from those found in the formula_terms attribute of the coordinate variable itself because the boundary variables for formula terms are two-dimensional while the formula terms themselves are one-dimensional.
Whenever a formula_terms attribute is attached to a boundary variable, the formula terms may additionally be identified using a second method: variables appearing in the vertical coordinate's formula_terms may be declared to be coordinate, scalar coordinate or auxiliary coordinate variables, and those coordinates may have bounds attributes that identify their boundary variables. In that case, the bounds attribute of a formula terms variable must be consistent with the formula_terms attribute of the boundary variable. Software digesting legacy datasets (constructed prior to version 1.7 of this standard) may have to rely in some cases on the first method of identifying the formula term variables and in other cases, on the second. Starting from version 1.7, however, the first method will be sufficient.
Example: Specifying formula_terms when a parametric coordinate variable has bounds.
float eta(eta) ;
  eta:long_name = "eta at full levels" ;
  eta:positive = "down" ;
  eta:standard_name = " atmosphere_hybrid_sigma_pressure_coordinate" ;
  eta:formula_terms = "a: A b: B ps: PS p0: P0" ;
  eta:bounds = "eta_bnds" ;
float eta_bnds(eta, 2) ;
  eta_bnds:formula_terms = "a: A_bnds b: B_bnds ps: PS p0: P0" ; // This attribute is mandatory
float A(eta) ;
  A:long_name = "'a' coefficient for vertical coordinate at full levels" ;
  A:units = "Pa" ;
  A:bounds = "A_bnds" ; // This attribute is included for the optional second method
float B(eta) ;
  B:long_name = "'b' coefficient for vertical coordinate at full levels" ;
  B:units = "1" ;
  B:bounds = "B_bnds" ; // This attribute is included for the optional second method
float A_bnds(eta, 2) ;
float B_bnds(eta, 2) ;
float PS(lat, lon) ;
  PS.units = "Pa" ;
float P0 ;
  P0.units = "Pa" ;
float temp(eta, lat, lon) ;
  temp:standard_name = "air_temperature" ;
  temp:units = "K" ;
  temp:coordinates = "A B" ; // This attribute is included for the optional second method
The conformance document is to be changed as follows:
In section 7.1 Cell Boundaries:
Replace requirement "If a boundary variable has units or standard_name attributes, they must agree with those of its associated variable." with
If a boundary variable has units, standard_name, axis, positive, calendar, leap_month, leap_year or month_lengths attributes, they must agree with those of its associated variable.
Add a new requirement (This requirement is backwards incompatible):
Starting with version 1.7, a boundary variable must have a formula_terms attribute when it contains bounds for a parametric vertical coordinate variable that has a formula_terms attribute. In this case the same terms and named variables must appear in both except for any term that depends on the vertical dimension, for which the variable name appearing in the boundary variable's formula_terms attribute must differ from that found in the formula_terms attribute of the coordinate variable itself. The different variable must have the same dimensions as its associated variable, plus have a trailing dimension (CDL order) for the maximum number of vertices in a cell. If the named variable in the formula_terms attribute of the vertical coordinate variable is a coordinate, scalar coordinate or auxiliary coordinate variable then its bounds attribute must be consistent with the equivalent term in formula_terms attribute of the boundary variable.
Replace recommendation "Boundary variables should not have the _FillValue or missing_value attributes." with
Boundary variables should not have the _FillValue, missing_value, units, standard_name, axis, positive, calendar, leap_month, leap_year or month_lengths attributes.
comment:27 Changed 10 months ago by taylor13
Hi David,
Looks good to me (except in the example P0.units should be P0:units and PS.units should be PS:units).
What a shame we're done with this. Now we'll have to busy ourselves with less important stuff.
cheers, Karl
comment:28 Changed 10 months ago by jonathan
Dear David
Thanks for this. In the preamble, it still says two examples, although there's only one now. "The different variable" is maybe an unclear phrase. You could say explicitly, "The boundary variable of the formula_terms variable must have the same dimensions as the formula_terms variable, plus a trailing dimension (CDL order) for the maximum number of vertices in a cell, which must be the same as the trailing dimension of the boundary variable of the parametric vertical coordinate variable." I've added the last bit. That's quite a mouthful - I hope digestible, if not palatable.
Best wishes and thanks to you and Karl on this successful outcome.
Jonathan
comment:29 Changed 10 months ago by taylor13
another slight tweak for the sentences under consideration (but you be the judge on whether it makes it any more readable):
In this case the same terms and named variables must appear in both except for terms that depend on the vertical dimension. For such terms the variable name appearing in the boundary variable's formula_terms attribute must differ from that found in the formula_terms attribute of the coordinate variable itself. The boundary variable of the formula_terms variable must have the same dimensions as the formula_terms variable, plus a trailing dimension (CDL order) for the maximum number of vertices in a cell, which must be the same as the trailing dimension of the boundary variable of the parametric vertical coordinate variable.
best, Karl
comment:30 Changed 10 months ago by davidhassell
OK - I like the two proposed changes - thanks to Jonathan and Karl for working on the minutiae of this.
Here it is again, in full (one thing I've learnt from Jeff's and Tanya's excellent work on the production of 1.7 is that having all the changes in place at the bottom of the ticket is very useful when implementing it - having to trawl the ticket for odd sentences here and there can be difficult).
After the first paragraph of section 7.1 Cell Boundaries, add the following three paragraphs and one example:
Boundary variable attributes which determine the coordinate type (units, standard_name, axis and positive) or those which affect the interpretation of the array values (units, calendar, leap_month, leap_year and month_lengths) must always agree exactly with the same attributes of its associated coordinate, scalar coordinate or auxiliary coordinate variable. To avoid duplication, however, it is recommended that these are not provided to a boundary variable.
If a parametric coordinate variable with a formula_terms attribute (section 4.3.2) also has a bounds attribute, its boundary variable must have a formula_terms attribute too. In this case the same terms would appear in both (as specified in Appendix D), since the transformation from the parametric coordinate values to physical space is realized through the same formula. For any term that depends on the vertical dimension, however, the variable names appearing in the formula terms would differ from those found in the formula_terms attribute of the coordinate variable itself because the boundary variables for formula terms are two-dimensional while the formula terms themselves are one-dimensional.
Whenever a formula_terms attribute is attached to a boundary variable, the formula terms may additionally be identified using a second method: variables appearing in the vertical coordinate's formula_terms may be declared to be coordinate, scalar coordinate or auxiliary coordinate variables, and those coordinates may have bounds attributes that identify their boundary variables. In that case, the bounds attribute of a formula terms variable must be consistent with the formula_terms attribute of the boundary variable. Software digesting legacy datasets (constructed prior to version 1.7 of this standard) may have to rely in some cases on the first method of identifying the formula term variables and in other cases, on the second. Starting from version 1.7, however, the first method will be sufficient.
Example: Specifying formula_terms when a parametric coordinate variable has bounds.
float eta(eta) ;
  eta:long_name = "eta at full levels" ;
  eta:positive = "down" ;
  eta:standard_name = "atmosphere_hybrid_sigma_pressure_coordinate" ;
  eta:formula_terms = "a: A b: B ps: PS p0: P0" ;
  eta:bounds = "eta_bnds" ;
float eta_bnds(eta, 2) ;
  eta_bnds:formula_terms = "a: A_bnds b: B_bnds ps: PS p0: P0" ; // This attribute is mandatory
float A(eta) ;
  A:long_name = "'a' coefficient for vertical coordinate at full levels" ;
  A:units = "Pa" ;
  A:bounds = "A_bnds" ; // This attribute is included for the optional second method
float B(eta) ;
  B:long_name = "'b' coefficient for vertical coordinate at full levels" ;
  B:units = "1" ;
  B:bounds = "B_bnds" ; // This attribute is included for the optional second method
float A_bnds(eta, 2) ;
float B_bnds(eta, 2) ;
float PS(lat, lon) ;
  PS:units = "Pa" ;
float P0 ;
  P0:units = "Pa" ;
float temp(eta, lat, lon) ;
  temp:standard_name = "air_temperature" ;
  temp:units = "K" ;
  temp:coordinates = "A B" ; // This attribute is included for the optional second method
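For anyone wanting to test a dataset against the rules illustrated by this example, here is a minimal Python sketch of one possible check. It is not taken from any existing CF checker: the in-memory layout (a dict mapping variable names to "dims" and "attrs"), the function names and the dimension name "nv" (standing in for the trailing dimension of size 2) are all assumptions made for illustration.

```python
def parse_formula_terms(text):
    """Turn 'a: A b: B' into {'a': 'A', 'b': 'B'}."""
    tokens = text.split()
    if len(tokens) % 2:
        raise ValueError("formula_terms must be 'term: variable' pairs")
    return {tokens[i].rstrip(":"): tokens[i + 1] for i in range(0, len(tokens), 2)}


def check_formula_terms_bounds(variables, coord_name):
    """Return a list of problems found for one parametric coordinate."""
    coord = variables[coord_name]
    bounds_name = coord["attrs"]["bounds"]
    bounds = variables[bounds_name]
    if "formula_terms" not in bounds["attrs"]:
        return [bounds_name + " has no formula_terms attribute"]
    coord_terms = parse_formula_terms(coord["attrs"]["formula_terms"])
    bounds_terms = parse_formula_terms(bounds["attrs"]["formula_terms"])
    problems = []
    if set(coord_terms) != set(bounds_terms):
        problems.append("the two formula_terms attributes name different terms")
    vertex_dim = bounds["dims"][-1]
    for term, var_name in coord_terms.items():
        if term not in bounds_terms:
            continue
        bvar_name = bounds_terms[term]
        var = variables[var_name]
        if not any(dim in var["dims"] for dim in coord["dims"]):
            # Term does not depend on the vertical dimension: same variable.
            if bvar_name != var_name:
                problems.append(term + ": expected the same variable in both")
            continue
        if bvar_name == var_name:
            problems.append(term + ": boundary term must name a different variable")
        if variables[bvar_name]["dims"] != var["dims"] + (vertex_dim,):
            problems.append(term + ": wrong dimensions on " + bvar_name)
        declared = var["attrs"].get("bounds")
        if declared is not None and declared != bvar_name:
            problems.append(term + ": bounds attribute disagrees with formula_terms")
    return problems


# The example above, in the hypothetical in-memory layout.
variables = {
    "eta": {"dims": ("eta",), "attrs": {
        "formula_terms": "a: A b: B ps: PS p0: P0", "bounds": "eta_bnds"}},
    "eta_bnds": {"dims": ("eta", "nv"), "attrs": {
        "formula_terms": "a: A_bnds b: B_bnds ps: PS p0: P0"}},
    "A": {"dims": ("eta",), "attrs": {"bounds": "A_bnds"}},
    "B": {"dims": ("eta",), "attrs": {"bounds": "B_bnds"}},
    "A_bnds": {"dims": ("eta", "nv"), "attrs": {}},
    "B_bnds": {"dims": ("eta", "nv"), "attrs": {}},
    "PS": {"dims": ("lat", "lon"), "attrs": {"units": "Pa"}},
    "P0": {"dims": (), "attrs": {"units": "Pa"}},
}
print(check_formula_terms_bounds(variables, "eta"))
```

The bounds attributes on A and B are optional (the second method), so the sketch only compares them with the boundary variable's formula_terms when they are present.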
The conformance document is to be changed as follows:
In section 7.1 Cell Boundaries:
Replace requirement "If a boundary variable has units or standard_name attributes, they must agree with those of its associated variable." with
If a boundary variable has units, standard_name, axis, positive, calendar, leap_month, leap_year or month_lengths attributes, they must agree with those of its associated variable.
Add a new requirement (This requirement is backwards incompatible):
Starting with version 1.7, a boundary variable must have a formula_terms attribute when it contains bounds for a parametric vertical coordinate variable that has a formula_terms attribute. In this case the same terms and named variables must appear in both except for terms that depend on the vertical dimension. For such terms the variable name appearing in the boundary variable's formula_terms attribute must differ from that found in the formula_terms attribute of the coordinate variable itself. The boundary variable of the formula_terms variable must have the same dimensions as the formula_terms variable, plus a trailing dimension (CDL order) for the maximum number of vertices in a cell, which must be the same as the trailing dimension of the boundary variable of the parametric vertical coordinate variable. If a named variable in the formula_terms attribute of the vertical coordinate variable depends on the vertical dimension and is a coordinate, scalar coordinate or auxiliary coordinate variable, then its bounds attribute must be consistent with the equivalent term in the formula_terms attribute of the boundary variable.
Replace recommendation "Boundary variables should not have the _FillValue or missing_value attributes." with
Boundary variables should not have the _FillValue, missing_value, units, standard_name, axis, positive, calendar, leap_month, leap_year or month_lengths attributes.
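The first requirement and the recommendation above could be checked along the following lines. This is a minimal Python sketch rather than code from an existing conformance checker: attributes are assumed to be held in plain dicts, the function name is hypothetical, and treating an attribute that is present on the boundary variable but absent from its parent as a disagreement is an interpretation made here, not something the text states.

```python
# Attributes that must agree with the parent variable if present.
MUST_AGREE = (
    "units", "standard_name", "axis", "positive",
    "calendar", "leap_month", "leap_year", "month_lengths",
)
# Attributes that a boundary variable is recommended not to carry.
DISCOURAGED = ("_FillValue", "missing_value") + MUST_AGREE


def check_boundary_attributes(coord_attrs, bounds_attrs):
    """Return (errors, warnings) as lists of attribute names.

    errors: attributes whose values differ from the parent variable's.
    warnings: attributes present although the recommendation is to omit them.
    """
    errors = [
        name for name in MUST_AGREE
        if name in bounds_attrs and bounds_attrs[name] != coord_attrs.get(name)
    ]
    warnings = [name for name in DISCOURAGED if name in bounds_attrs]
    return errors, warnings


# Hypothetical time coordinate and its boundary variable.
time_attrs = {
    "standard_name": "time",
    "units": "days since 2000-01-01",
    "calendar": "noleap",
}
time_bnds_attrs = {
    "units": "days since 2000-01-01",
    "calendar": "gregorian",
}

errors, warnings = check_boundary_attributes(time_attrs, time_bnds_attrs)
print(errors)    # attributes that break the requirement
print(warnings)  # attributes that go against the recommendation
```

A boundary variable with no attributes at all passes both checks, which is the layout the recommendation encourages.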
comment:31 Changed 9 months ago by davidhassell
Hello,
A consensus has been reached, and the standard three weeks have passed, so I think that we can accept this ticket for inclusion in version 1.7.
Many thanks to all who contributed, all the best,
David
comment:32 Changed 9 months ago by painter1
- Resolution set to fixed
- Status changed from new to closed
Thank you for making this proposal, which I support.
Jonathan