Handling and formatting of vector quantities in CF
Reported by: lavergne | Owned by: cf-conventions@…
- Lavergne (Norwegian Met Institute - MET.NO)
At the time of writing, the CF convention allows a file to contain variables that are components of 2D (or 3D) vectors. For example, a CF file might contain a wind_speed variable and a wind_direction variable.
It lacks, however, the ability to indicate that these two variables are components of the same higher-dimensional vector (the wind vector, which in this case is 2D). Such a capability would be useful for:
- Allowing a single status_flag to apply to the vector as a whole, that is, to all of its components. Such a flag variable would have "wind_vector status_flag" as its standard_name (if the vector object were given the standard name wind_vector).
- Enabling third-party software (such as map plotting software) to identify that there is a vector object in the file, and that it can thus be displayed with arrows (or barbs).
- Easing vector-based operations such as rotation or scaling of components when changing map projection.
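To illustrate the kind of vector-based operation meant in the last item, here is a minimal sketch of rotating a 2D vector given by two Cartesian components. The function name `rotate_components` is hypothetical, and the rotation angle is assumed to be known already; in a real reprojection it would have to be derived from the two map projections at each grid point.

```python
import math


def rotate_components(u, v, angle_deg):
    """Rotate the 2D vector (u, v) counter-clockwise by angle_deg degrees.

    Hypothetical helper: the angle is assumed to be supplied by the caller.
    """
    a = math.radians(angle_deg)
    u_rot = u * math.cos(a) - v * math.sin(a)
    v_rot = u * math.sin(a) + v * math.cos(a)
    return u_rot, v_rot


# A vector pointing along +x, rotated by a quarter turn, points along +y.
u_rot, v_rot = rotate_components(1.0, 0.0, 90.0)
```

The operation only makes sense when software knows which two variables form a pair, which is the information the proposal aims to record.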
If feasible, the implementation of this proposal should be made backward compatible.
4. Initial Statement of Technical Proposal
The requirements above can be implemented using an umbrella variable for the vector variable. Such a variable would hold no data, have no dimension, and be of arbitrary type, just like the grid mapping variable (5.6. Horizontal Coordinate Reference Systems, Grid Mappings, and Projections).
The (proposed) vector variable would need only two string attributes: standard_name and components. It may hold others, such as long_name, where relevant.
The standard_name identifies what the vector quantity is about, and allows for later cross-referencing (e.g. in the status_flag example above).
The components attribute is a space-separated list of variable names that are all components of the vector variable. Note that a component may come from any decomposition of the vector along some axis (e.g. speed/dir, x/y, u/v, north/east, etc.). To be valid, a vector variable shall have at least as many components as the dimensionality of the vector. A 2D wind vector shall thus list at least 2 components (e.g. u and v), but the speed may also be in the file and be listed as a component.
As is the case now, each component variable shall define its own dimensions, units, standard_name, grid_mapping, etc. The vector variable holds only the attributes needed to find its components.
Note that all the component variables named by the vector variable must have the same set of coordinate axes, identified by the standard_names of their coordinate variables, although they do not have to have the same sets of coordinate values. This is to exclude, for instance, one component variable having time-latitude-longitude and another time-altitude-latitude-longitude as coordinate variables, but it does permit components to be on an Arakawa C-grid.
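As a rough illustration of how a reader of such a file might handle the components attribute, the sketch below splits the attribute value and checks the minimum-count rule. The function names are hypothetical, and the vector dimensionality is passed in by the caller because the proposal does not say where a reader would obtain it.

```python
def parse_components(attr_value):
    """Split a space-separated `components` attribute into variable names."""
    return attr_value.split()


def has_enough_components(attr_value, vector_dimensionality):
    """True if at least as many components are listed as the vector has dimensions.

    `vector_dimensionality` is an assumption of this sketch: it must be
    supplied by the caller.
    """
    return len(parse_components(attr_value)) >= vector_dimensionality


names = parse_components("dX dY dir")        # three components listed
ok_2d = has_enough_components("u v", 2)      # a 2D vector with u and v
too_few = has_enough_components("u", 2)      # only one component for a 2D vector
```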
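The coordinate-axes rule could be checked along the following lines. This is only a sketch under stated assumptions: each component is modelled as a tuple of dimension names, each dimension is assumed to have a coordinate variable of the same name, and `coord_standard_names` is a hypothetical mapping from that name to the coordinate variable's standard_name. The dimension names such as `lon_u` are invented for illustration.

```python
def axis_standard_names(dimensions, coord_standard_names):
    """Set of standard_names of the coordinate variables for these dimensions."""
    return frozenset(coord_standard_names[d] for d in dimensions)


def share_coordinate_axes(component_dims, coord_standard_names):
    """True if every component has the same set of coordinate axes.

    Axes are compared by standard_name only, so components may still have
    different coordinate values (for example on a staggered grid).
    """
    axis_sets = {
        axis_standard_names(dims, coord_standard_names)
        for dims in component_dims.values()
    }
    return len(axis_sets) <= 1


coord_standard_names = {
    "time": "time",
    "alt": "altitude",
    "lat": "latitude",
    "lon": "longitude",
    "lon_u": "longitude",  # staggered longitude points, same axis
}

# Staggered components: different coordinate values, same axes.
staggered = {"u": ("time", "lat", "lon_u"), "v": ("time", "lat", "lon")}
# One component has an extra altitude axis.
mismatched = {"u": ("time", "lat", "lon"), "w": ("time", "alt", "lat", "lon")}
```

With these inputs, `share_coordinate_axes(staggered, coord_standard_names)` is true and the same call on `mismatched` is false, matching the two cases described above.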
Listing two variables that both have the same standard_name in the components attribute should probably not be allowed: the vector variable could then have, for example, two (possibly different) directions.
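A checker could detect this situation with a simple count, as in the sketch below. The function name and the mapping from component variable name to standard_name are assumptions made for illustration.

```python
from collections import Counter


def duplicated_standard_names(component_standard_names):
    """Return the standard_names used by more than one component, sorted."""
    counts = Counter(component_standard_names.values())
    return sorted(name for name, n in counts.items() if n > 1)


valid = {
    "dX": "sea_ice_x_displacement",
    "dY": "sea_ice_y_displacement",
    "dir": "direction_of_sea_ice_displacement",
}
# Hypothetical invalid case: two direction variables for the same vector.
invalid = dict(valid, dir2="direction_of_sea_ice_displacement")
```

Here `duplicated_standard_names(valid)` is empty, while the `invalid` mapping reports the direction standard_name once.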
The proposal could be implemented as a section "3.6 Vector quantities".
Example for a sea ice drift dataset:
// The X and Y components and the direction.
float dX(time, yc, xc) ;
    dX:long_name = "component of the displacement along the x axis of the grid" ;
    dX:standard_name = "sea_ice_x_displacement" ;
    dX:units = "km" ;
    dX:_FillValue = -1.e+10f ;
    dX:coordinates = "lat lon" ;
    dX:grid_mapping = "Polar_Stereographic_Grid" ;
float dY(time, yc, xc) ;
    dY:long_name = "component of the displacement along the y axis of the grid" ;
    dY:standard_name = "sea_ice_y_displacement" ;
    dY:units = "km" ;
    dY:_FillValue = -1.e+10f ;
    dY:coordinates = "lat lon" ;
    dY:grid_mapping = "Polar_Stereographic_Grid" ;
float dir(time, yc, xc) ;
    dir:long_name = "direction of the displacement" ;
    dir:standard_name = "direction_of_sea_ice_displacement" ;
    dir:units = "degrees" ;
    dir:_FillValue = -1.e+10f ;
    dir:coordinates = "lat lon" ;
    dir:grid_mapping = "Polar_Stereographic_Grid" ;
// The new vector variable:
int ice_drift_vector ;
    ice_drift_vector:standard_name = "sea_ice_displacement_vector" ;
    ice_drift_vector:long_name = "sea ice drift vector" ;
    ice_drift_vector:components = "dX dY dir" ;
// A status flag for the vector:
byte status_flag(time, yc, xc) ;
    status_flag:standard_name = "sea_ice_displacement_vector status_flag" ;
    status_flag:long_name = "rejection and quality level flag" ;
    status_flag:valid_min = 0b ;
    status_flag:valid_max = 30b ;
    status_flag:grid_mapping = "Polar_Stereographic_Grid" ;
    status_flag:coordinates = "lat lon" ;
    status_flag:flag_values = 0b, 1b,..., 22b, 30b ;
    status_flag:flag_meanings = "missing_input_data over_land ... interpolated nominal_quality" ;
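The following sketch shows how software might discover the vector variable and its status flag in a file laid out like the example. It does not use a real netCDF library: the file is modelled as a plain dictionary mapping variable names to their dimensions and attributes, and both function names are hypothetical. Only the attributes needed for discovery are included.

```python
def find_vector_variables(variables):
    """Map each dimensionless variable carrying `components` to its component names."""
    found = {}
    for name, var in variables.items():
        components = var["attrs"].get("components")
        if components is not None and not var["dimensions"]:
            found[name] = components.split()
    return found


def find_status_flags(variables, vector_standard_name):
    """Names of variables whose standard_name is '<vector standard_name> status_flag'."""
    target = vector_standard_name + " status_flag"
    return [
        name
        for name, var in variables.items()
        if var["attrs"].get("standard_name") == target
    ]


# In-memory stand-in for the example file (an assumption of this sketch).
grid = ("time", "yc", "xc")
example = {
    "dX": {"dimensions": grid, "attrs": {"standard_name": "sea_ice_x_displacement"}},
    "dY": {"dimensions": grid, "attrs": {"standard_name": "sea_ice_y_displacement"}},
    "dir": {"dimensions": grid, "attrs": {"standard_name": "direction_of_sea_ice_displacement"}},
    "ice_drift_vector": {
        "dimensions": (),
        "attrs": {
            "standard_name": "sea_ice_displacement_vector",
            "components": "dX dY dir",
        },
    },
    "status_flag": {
        "dimensions": grid,
        "attrs": {"standard_name": "sea_ice_displacement_vector status_flag"},
    },
}

vectors = find_vector_variables(example)
flags = find_status_flags(example, "sea_ice_displacement_vector")
```

Because the lookup relies only on the components attribute and on the standard_name cross-reference, files without a vector variable simply yield an empty result, which is in line with the backward-compatibility goal.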
This proposal will ease the use of vector quantities such as winds, currents, sea ice motion, etc.
It enables the data producer to document that vector quantities are present in the file. This should greatly help third-party software, for example, to locate the vectors and handle them appropriately (plot them with arrows, rotate them, etc.).
7. Status Quo and other approaches
This proposal stems from a discussion on the main CF list, originally posted as "[CF-metadata] Proposal for better handling vector quantities in CF" on Nov 24th 2011. The discussion contains arguments both for and against the proposal, as well as some alternative approaches. We will not summarize the discussion here, but the alternatives raised were:
- Extensive use of ancillary_variables;
- Define a vector dimension;
- Introduce Groups (Common Data Model-2) in CF;
- Introduce Compound Data Types (HDF5) in CF.
Change History (69)
comment:24 follow-up: ↓ 33 Changed 5 years ago by lavergne
- Resolution set to worksforme
- Status changed from new to closed
comment:25 Changed 5 years ago by lavergne
- Resolution worksforme deleted
- Status changed from closed to reopened