wiki:PointObservationConventionsJMG

Chapter 9 Discrete Sampling Geometries

Chapter 5 explains how to specify Coordinate Systems for data arranged in a multidimensional rectangular spatiotemporal grid. This chapter extends and modifies that framework for data with discrete sampling geometries in space and time, meaning that the spatiotemporal dimensions are not all independent, and gridpoints do not exist for all possible combinations of space and time coordinates.

9.1.1 Feature types

The types of discrete sampling geometry, called feature types, that are allowed by this chapter are:

  • point: a collection of data points with no structure in time and space
  • timeSeries: a series of data points at the same location, with varying time
  • trajectory: a series of data points along a 1D curve in time and space [JMG: could it be confusing to call it 1D? since in another sense it is a curve in 4D! ][RPS: I agree. How about "trajectory" instead of "1D curve"]
  • profile: a set of data points along a vertical line
  • timeSeriesProfile: a series of profiles at the same location, with varying time
  • trajectoryProfile: a set of profiles which originate from points along a trajectory

A single instance of each feature type can be formally described and distinguished by the coordinates and dimensions it involves (with dimensions shown in CDL order):

Feature type        Data           Coordinates
point               data(i)        x(i) y(i) [z(i)] t(i)
timeSeries          data(i,o)      x(i) y(i) [z(i)] t(i,o)
trajectory          data(i,o)      x(i,o) y(i,o) [z(i,o)] t(i,o)
profile             data(i,o)      x(i) y(i) z(i,o) t(i)
timeSeriesProfile   data(i,p,o)    x(i) y(i) z(i,p,o) t(i,p)
trajectoryProfile   data(i,p,o)    x(i,p) y(i,p) z(i,p,o) t(i,p)

where x y z t are the spatiotemporal coordinates, [] indicate optional coordinates, i is the subscript identifying the instance of the feature type, while o and p are subscripts of the data values that compose that instance. For example, in a collection of timeSeries features, each timeSeries instance i has data values at each o time index.

The aim of this chapter is to provide efficient ways of storing many instances of a given feature type in each data variable. There may be more than one data variable in the file, but in this version of CF the data variables must all be of the same feature type. Future versions of CF may generalize this to allow multiple feature types in a file.

The feature type is specified by a global attribute featureType [JMG: not CF:featureType, because we haven't agreed that namespace convention yet], which must have a value of one of the above feature types (case-insensitive). The global attribute featureType is required if either of the ragged array representations (see 9.1.2) is used. New file writers are strongly recommended to use these new representations, and to include the featureType attribute even if using the multidimensional representation (as in Sections 5.4 and 5.5), because it gives useful information about the nature of the data. [JMG: Do you agree with this? It's not quite what we said before. I'm suggesting that any multidimensional rep is already legal by ch 5, so we can't mandate the featureType, but we can strongly recommend it on grounds of providing extra metadata. On the other hand, we can mandate it for the ragged reps, since these are anyway brand new, and the attr is needed to decode them successfully in some cases.]
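As an illustration of the case-insensitive matching described above, a reader might normalise the attribute value before using it. This is only a sketch: the function name normalize_feature_type is hypothetical and is not part of CF or of any library.

```python
# Allowed feature types, as listed in section 9.1.1.
FEATURE_TYPES = (
    "point", "timeSeries", "trajectory",
    "profile", "timeSeriesProfile", "trajectoryProfile",
)

def normalize_feature_type(value):
    """Return the canonical spelling of a featureType value.

    The comparison ignores case; an unknown value raises ValueError.
    """
    lookup = {name.lower(): name for name in FEATURE_TYPES}
    try:
        return lookup[value.lower()]
    except KeyError:
        raise ValueError(f"not a recognised featureType: {value!r}") from None

print(normalize_feature_type("TIMESERIES"))  # timeSeries
```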

9.1.2 Representations

There are two approaches for representing data with a discrete sampling geometry in CF:

  • the multidimensional (rectangular array) representation is simplest but requires the coordinate variable for each dimension to contain the union of all the values taken for that dimension by all the instances. This representation is recommended only in special cases, outlined below and in following subsections. Typically the data variable would be sparsely populated so this representation is inefficient of space. For example, if there are several timeSeries, the time coordinate variable must include all of the sampling times of all of the timeSeries; that wastes a lot of space if the timeSeries do not have sampling times in common. This representation is described by the conventions of chapter 5.
  • the two ragged array representations, which allow different instances to have different numbers of data values. These representations are described by the conventions of this section.

In the multidimensional representation, the variables containing data and coordinates should be dimensioned as shown in the table of section 9.1.1. The dimension which runs over instances of the feature type (timeSeries, profiles, etc.) should be the outer dimension, i.e. the leading dimension in CDL, as shown by i in the table. We call this the "instance dimension".

In the ragged representations, variables which have both the instance dimension (i) and dimensions running over data elements of the feature (o and p) become one-dimensional, with a size equal to the total number of data elements in all instances, as shown in the examples in following subsections. We call this the "sample dimension". Variables which have only the instance dimension provide metadata that describe the instance (timeSeries, trajectory, etc.). In general, we call these "instance variables". For timeSeries in particular, they are called "station variables". The instance dimension may be larger than the number of instances which are currently present in the data, with the unused instances having missing values in the instance variables.

A single instance of the point feature type is zero-dimensional, so a collection of them is one-dimensional. For that feature type, only the multidimensional representation is used (following chapter 5), because a one-dimensional array cannot be ragged, and the ragged representations would needlessly take up extra space.

For other feature types, if there is only a single instance, then there is no need for an instance dimension, the data will therefore be one-dimensional, and again the multidimensional representation should be used. The multidimensional representation can also be used if there are multiple instances, and the multivalued coordinates have the same values for each instance; for instance, if a collection of timeSeries all have the same sampling times, the multidimensional representation will be efficient of space. In other cases, the ragged representations are recommended. The following subsections detail each feature type and show examples of the possible ragged representations of each.
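The space argument can be made concrete with a small count of array elements. The sketch below uses made-up sampling times and hypothetical function names. It counts the elements of a single data variable only, and ignores the small count or index variable that a ragged representation also needs.

```python
def multidimensional_size(sample_times):
    """Elements per data variable when the time axis is the union of all times."""
    union = set()
    for times in sample_times:
        union.update(times)
    return len(sample_times) * len(union)

def ragged_size(sample_times):
    """Elements per data variable on the sample dimension of a ragged array."""
    return sum(len(times) for times in sample_times)

# Three hypothetical timeSeries with no sampling times in common.
disjoint = [[0.0, 1.0, 2.0], [0.5, 1.5], [0.25, 0.75, 1.25, 1.75]]
# Three hypothetical timeSeries sharing the same sampling times.
shared = [[0.0, 1.0, 2.0]] * 3

print(multidimensional_size(disjoint), ragged_size(disjoint))  # 27 9
print(multidimensional_size(shared), ragged_size(shared))      # 9 9
```

With no times in common the multidimensional array is three times larger than the ragged one in this example; with shared times the two are the same size.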

The two ragged representations have distinct advantages and structure:

  • The contiguous ragged array representation is the most efficient storage method but can be used only if each instance can be written all at once. It stores each instance as a set of adjacent elements in the data variable. The canonical use case for this is when all the data to be written is accessible at the same time, and you expect that the common pattern will be to read all the data at once from each instance. If the sample dimension is the netCDF unlimited dimension, the data for each timeSeries will be contiguous on disk. [JMG: is that not true if it isn't the unlimited dimension?] This representation is identifiable by the presence of a count attribute on each data variable, which names another variable that records how many samples there are in each feature instance, as shown in the examples. This count variable must be of type integer and must have the instance dimension as its sole dimension.
  • The indexed ragged array representation stores the instances interleaved in the data variable, so they can be written incrementally. The canonical use case is when writing real-time data streams that contain reports from many sources; the data can be written as it arrives. If the sample dimension is the netCDF unlimited dimension, new data can be appended to the file. This representation is identifiable by the presence of an index attribute on each data variable. This attribute names another variable which is a zero-based index that assigns each sample to one of the feature instances, as shown in the examples. This index variable must be of type integer, must have the sample dimension as its single dimension, and must have an instance attribute naming the instance dimension.
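The two ragged representations hold the same information, so data written incrementally in indexed form could later be rewritten in contiguous form. The following is a minimal sketch of that reordering using plain Python lists; the function name and the sample values are hypothetical.

```python
def indexed_to_contiguous(index, n_instances, *sample_vars):
    """Reorder indexed-ragged samples so that each instance is contiguous.

    index        -- zero-based instance number of every sample
    n_instances  -- size of the instance dimension
    sample_vars  -- any number of lists on the sample dimension
    Returns (count, reordered_vars), where count[i] is the number of
    samples belonging to instance i.
    """
    count = [0] * n_instances
    for i in index:
        count[i] += 1
    # sorted() is stable, so the original order within an instance is kept.
    order = sorted(range(len(index)), key=lambda k: index[k])
    reordered = [[var[k] for k in order] for var in sample_vars]
    return count, reordered

index = [1, 0, 1, 2, 0]                      # hypothetical index variable
temp = [11.0, 20.0, 12.0, 30.0, 21.0]        # hypothetical data variable
count, (temp_contiguous,) = indexed_to_contiguous(index, 3, temp)
print(count)            # [2, 2, 1]
print(temp_contiguous)  # [20.0, 21.0, 11.0, 12.0, 30.0]
```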

9.1.3 Coordinates

It is required that the data [JMG: not observations necessarily - could be model data] can be located in the space and time dimensions required for each feature type (see table in section 9.1.1 [JMG: that is more general than saying x and y reqd, z optional. Also x and y might not be lat and lon. They could be projection coords e.g. on a national grid.]) by information contained in the file. Therefore:

  1. The coordinates attribute must identify the auxiliary coordinate variables needed to locate the data.
  2. The location must be unambiguous, and so the coordinates attribute must not point to multiple variables for any of the spatiotemporal dimensions.
  3. If there is a vertical coordinate variable, it must be identified as specified in chapter 4.3. The use of the attribute axis="Z" is recommended for clarity. [JMG: I'm not sure about that. I agree, 4.3 seems to allow it, but I recall a big debate about whether 2D lat and lon aux coord vars would have the axis attr, and I thought we concluded they wouldn't. If it's an aux coord var, it's not really an axis.] A standard_name attribute (see section 3.3) that clarifies the vertical coordinate is recommended, e.g. "altitude", "height", "height_above_reference_ellipsoid", "geopotential_height", or "surface_altitude". [JMG: Perhaps this bullet about vertical coord could be one about spatiotemporal coords in general i.e. including horizontal and time as well, and say that they should be identified as in chapter 4.]
  4. It is strongly recommended to include an instance variable which uniquely identifies the instance with a standard_name of station_id, trajectory_id or profile_id as appropriate. [JMG: I am not sure about what this contains - is it a name or a number? Is it useful if its values are not standardised, or are we proposing to standardise them?]
  5. There may optionally be other instance variables describing stations, trajectories or profiles, for instance with standard_names of station_desc [JMG: what would that contain?] or WMO_station_id [JMG: These are standardised, I presume - can we give a URL to the list of them?]
  6. Instance variables do not have to be identified as auxiliary coordinate variables, by naming them in the coordinates attribute, but doing so is recommended because it is then easier for generic software to associate the correct metadata with the instances. [JMG: do you agree with that?]

For coordinates that are connected [JMG: what do you mean by connected?], coordinate bounds are specified following section 7.1 "Bounds for 1-D coordinate variables". For coordinates that are not connected, follow section 7.1 "Bounds for 2-D coordinate variables with 4-sided cells" and "Bounds for multi-dimensional coordinate variables with p-sided cells".

Auxiliary coordinates may use missing values to indicate that the corresponding data element contains missing data, so a reader can check just the auxiliary coordinate values to infer where there is missing data. A data variable must have missing values wherever any of its auxiliary coordinate variables does.
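A checker could test the last rule along the following lines. This is a sketch under stated assumptions: the function name is hypothetical, all variables are plain lists on the same dimension, and missing values are detected by exact equality with a fill value (which would not work for a NaN fill value).

```python
def missing_value_violations(data, aux_coords, data_fill, coord_fills):
    """Return sample indices where a coordinate is missing but the data is not.

    data         -- list of data values
    aux_coords   -- dict: coordinate name -> list of values, same length as data
    data_fill    -- value marking missing data
    coord_fills  -- dict: coordinate name -> value marking a missing coordinate
    """
    bad = []
    for k, value in enumerate(data):
        coord_missing = any(
            values[k] == coord_fills[name] for name, values in aux_coords.items()
        )
        if coord_missing and value != data_fill:
            bad.append(k)
    return bad

temp = [1.0, -999.9, 3.0]                    # hypothetical data variable
coords = {"time": [0.0, -999.9, -999.9]}     # hypothetical auxiliary coordinate
print(missing_value_violations(temp, coords, -999.9, {"time": -999.9}))  # [2]
```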

9.2 Point Data

[JMG: This example is not a ragged representation, it's plain 1D and conforms to chapter 5 with the addition of the featureType. It's a possible data variable without chapter 9 and may already be in use. Why don't we move it to chapter 5 (including the featureType) and just refer to it from here?]

To represent data at scattered, unconnected locations, both data and coordinates use the same, single dimension. The coordinates attribute is used on the data variables to unambiguously identify the time, lat, lon [JMG: couldn't that be horizontal, in general, not necessarily lat and lon?], and (optional) vertical auxiliary coordinate variables.

dimensions:
  obs = 1234 ;

variables:
  double time(obs) ;
    time:standard_name = "time"; // JMG: changed to standard_name
    time:units = "days since 1970-01-01 00:00:00" ;
  float lon(obs) ;
    lon:standard_name = "longitude"; // JMG: changed to standard_name
    lon:units = "degrees_east";
  float lat(obs) ;
    lat:standard_name = "latitude" ; // JMG: changed to standard_name
    lat:units = "degrees_north" ;
  float alt(obs) ;
    alt:standard_name = "height" ;
    alt:units = "m";
    alt:positive = "up";
    alt:axis = "Z"; // JMG: not sure this is right. If Z, why not X Y and T as well? But they aren't really axes.

  float humidity(obs) ;
    humidity:long_name = "specific humidity" ;
    humidity:coordinates = "time lat lon alt" ;
  float temp(obs) ;
    temp:long_name = "temperature" ;
    temp:units = "Celsius" ;
    temp:coordinates = "time lat lon alt" ;

attributes:
  :featureType = "point";

In this example, the humidity(i) and temp(i) data are associated with the coordinate values time(i), lat(i), lon(i), and optionally alt(i). The obs dimension may use the unlimited dimension or not. If the time coordinate is ordered, the obs dimension may be named time (making time a coordinate variable rather than an auxiliary coordinate variable). [JMG: It could be confusing to leave that choice open. Of course either way is legal, since this is in any case an ordinary chapter-5 kind of variable, but wouldn't it be simpler to recommend not having a coordinate variable? Choosing time is arbitrary. It could be any of them. They might equally well be ordered in latitude, for instance.] [JMG: Can we omit this: "The time coordinate may use a missing value, which indicates to skip all the data for that time index."? That is true of any of them, not just time. Also, it would not be legal if time was actually a Unidata coordinate variable, because then missing data is not allowed.]

9.3 TimeSeries Data

Discrete data may be taken at a set of locations called stations. The stations have horizontal coordinates (usually latitude and longitude), and optionally a vertical coordinate and other station variables. The set of observations at a particular station, if ordered by time, is a timeSeries. [JMG: Do they have to be ordered by time? Is ordering a requirement of any of the dimensions of other feature types? If so, it would be useful to indicate this in the table of 9.1.1.]

9.3.1 Multidimensional representation

When the numbers of observations at each location are the same, one can use the multidimensional representation:

[JMG: Is this representation necessary? Example 5.4 is fine, in which the data is 2D (station,time) and all timeSeries have the same sample times. But this generalisation to allow time also to be 2D is only possible if they just happen to have the same number of samples, and I think it's surprising to regard this as multidimensional. Could we remove this one in the interest of simplicity? Of course, they don't really have to have the same number of samples as you could force any collection of timeSeries into the multidimensional rep and have a lot of missing data, but that's why we provide the ragged reps.]

dimensions:
  station = UNLIMITED ;
  obs = 13 ;

variables:
  float lon(station) ;
    lon:long_name = "station longitude";
    lon:units = "degrees_east";
  float lat(station) ;
    lat:long_name = "station latitude" ;
    lat:units = "degrees_north" ;
  float alt(station) ;
    alt:long_name = "vertical distance above the surface" ;
    alt:standard_name = "height" ;
    alt:units = "m";
    alt:positive = "up";
    alt:axis = "Z";
  char station_name(station, name_strlen) ;
    station_name:long_name = "station name" ;
    station_name:standard_name = "station_id";
  int station_info(station) ;
    station_info:long_name = "any kind of station info" ;
  float station_elevation(station) ;
    station_elevation:long_name = "height above the geoid" ;
    station_elevation:standard_name = "surface_altitude" ;
    station_elevation:units = "m";

  double time(station, obs) ;
    time:long_name = "time of measurement" ;
    time:units = "days since 1970-01-01 00:00:00" ;
    time:missing_value = -999.9;
  float humidity(station, obs) ;
    humidity:long_name = "specific humidity" ;
    humidity:coordinates = "time lat lon alt" ;
    humidity:_FillValue = -999.9;
  float temp(station, obs) ;
    temp:long_name = "temperature" ;
    temp:units = "Celsius" ;
    temp:coordinates = "time lat lon alt" ;
    temp:_FillValue = -999.9;

attributes:
    :featureType = "timeSeries";

The humidity(s,i) and temp(s,i) data are associated with the coordinate values time(s,i), lat(s), lon(s), and optionally alt(s). The station dimension may be the unlimited dimension or not.

The time coordinate may use a missing value, which indicates that data is missing for that location and obs index. This allows one to have a variable number of observations at different stations, at the cost of some wasted space. The data variables may also use missing data values, to indicate that just that data variable is missing. If all the time values are identical for all timeSeries, you may use time(obs) or time(time) to indicate this.
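A reader of this representation has to skip the padded elements. The sketch below shows one way to do so with nested Python lists standing in for the (station, obs) arrays; the function name and values are hypothetical, and missing times are detected by exact equality with the missing value.

```python
def series_from_padded(time, data, time_missing):
    """Split padded (station, obs) arrays into one list of (time, value) per station.

    Samples whose time equals time_missing are padding and are skipped.
    """
    series = []
    for station_times, station_data in zip(time, data):
        series.append([
            (t, v) for t, v in zip(station_times, station_data)
            if t != time_missing
        ])
    return series

# Two hypothetical stations with 2 and 1 valid observations, padded to obs = 3.
time = [[0.0, 1.0, -999.9], [0.5, -999.9, -999.9]]
temp = [[10.0, 11.0, -999.9], [20.0, -999.9, -999.9]]
print(series_from_padded(time, temp, -999.9))
# [[(0.0, 10.0), (1.0, 11.0)], [(0.5, 20.0)]]
```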

Note that this is a generalization of Example 5.4, which assumes that all the timeSeries have observations with the same time coordinates.

9.3.2 Contiguous ragged array representation

dimensions:
  station = 23 ;
  obs = 1234 ;

variables:
  float lon(station) ;
    lon:standard_name = "longitude"; // JMG: changed to standard name
    lon:units = "degrees_east";
  float lat(station) ;
    lat:standard_name = "latitude" ; // JMG: changed to standard name
    lat:units = "degrees_north" ;
  float alt(station) ;
    alt:long_name = "vertical distance above the surface" ;
    alt:standard_name = "height" ;
    alt:units = "m";
    alt:positive = "up";
    alt:axis = "Z"; // JMG: not sure about this---comment earlier
  char station_name(station, name_strlen) ;
    station_name:long_name = "station name" ;
    station_name:standard_name = "station_id"; // JMG: not sure about this---comment earlier
  int station_info(station) ;
    station_info:long_name = "some kind of station info" ;
  int rowsize(station) ; // named by the count attribute of the data variables

  double time(obs) ;
    time:long_name = "time of measurement" ;
    time:standard_name = "time";
    time:units = "days since 1970-01-01 00:00:00" ;
  float humidity(obs) ;
    humidity:standard_name = "specific_humidity" ; // JMG: changed to standard name
    humidity:coordinates = "time lat lon alt station_name" ; // JMG: added station_name
    humidity:_FillValue = -999.9;
    humidity:count = "rowsize" ; // points to the variable of that name
  float temp(obs) ;
    temp:standard_name = "air_temperature" ; // JMG: changed to standard name
    temp:units = "Celsius" ;
    temp:coordinates = "time lat lon alt station_name" ; // JMG: added station_name
    temp:_FillValue = -999.9;
    temp:count = "rowsize" ; // points to the variable of that name

attributes:
    :featureType = "timeSeries";

Here, station is the instance dimension, and obs the sample dimension. The sample dimension could be the netCDF unlimited dimension, but that is not required. The auxiliary coordinate variables lat, lon, alt and station_name are station variables.

The rowsize variable contains the length of each timeSeries and is identified by the count attribute of the data variables. The first rowsize[0] data values and time coordinates are for the first timeSeries (instance station=0), the next rowsize[1] are for the second timeSeries (instance station=1), etc.
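The decoding rule just described amounts to walking along the sample dimension with a running offset. A minimal sketch with plain Python lists follows; the function name and the values are hypothetical.

```python
def split_contiguous(rowsize, values):
    """Return one list per instance from a contiguous ragged array.

    rowsize -- number of samples in each instance (the count variable)
    values  -- any variable on the sample dimension
    """
    if sum(rowsize) != len(values):
        raise ValueError("rowsize does not add up to the sample dimension")
    instances = []
    start = 0
    for n in rowsize:
        instances.append(values[start:start + n])
        start += n
    return instances

rowsize = [2, 0, 3]                          # hypothetical count variable
temp = [10.0, 11.0, 20.0, 21.0, 22.0]        # hypothetical data variable
print(split_contiguous(rowsize, temp))
# [[10.0, 11.0], [], [20.0, 21.0, 22.0]]
```

The same offsets apply to every variable on the sample dimension, so time and the data variables can be split with the same call.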

9.3.3 Indexed ragged array representation

dimensions:
  station = 23 ;
  obs = UNLIMITED ;

variables:
  float lon(station) ;
    lon:standard_name = "longitude"; // JMG: changed to standard name
    lon:units = "degrees_east";
  float lat(station) ;
    lat:standard_name = "latitude" ; // JMG: changed to standard name
    lat:units = "degrees_north" ;
  float alt(station) ;
    alt:long_name = "vertical distance above the surface" ;
    alt:standard_name = "height" ;
    alt:units = "m";
    alt:positive = "up";
    alt:axis = "Z"; // JMG: not sure about this---comment earlier
  char station_name(station, name_strlen) ;
    station_name:long_name = "station name" ; // JMG: not sure about this---comment earlier
    station_name:standard_name = "station_id";
  int station_info(station) ;
    station_info:long_name = "some kind of station info" ;

  int stationIndex(obs) ;
    stationIndex:long_name = "which station this obs is for" ;
    stationIndex:instance = "station" ;
  double time(obs) ;
    time:long_name = "time of measurement" ;
    time:units = "days since 1970-01-01 00:00:00" ;
  float humidity(obs) ;
    humidity:standard_name = "specific_humidity" ; // JMG: changed to standard name
    humidity:coordinates = "time lat lon alt station_name" ; // JMG: added station_name
    humidity:_FillValue = -999.9;
    humidity:index = "stationIndex" ;  // points to the variable of that name
  float temp(obs) ;
    temp:standard_name = "air_temperature" ; // JMG: changed to standard name
    temp:units = "Celsius" ;
    temp:coordinates = "time lat lon alt station_name" ; // JMG: added station_name
    temp:_FillValue = -999.9;
    temp:index = "stationIndex" ;  // points to the variable of that name

attributes:
    :featureType = "timeSeries";

The dimensions and the station variables are all the same as in the contiguous representation. The stationIndex variable assigns each sample to one of the timeSeries and is identified by the index attribute of the data variables. The instance attribute of the stationIndex variable records the instance dimension. Thus, time[0], humidity[0] and temp[0] belong to the element of the station dimension that is indicated by stationIndex[0]; time[1], humidity[1] and temp[1] belong to element stationIndex[1] of the station dimension, etc.
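Decoding the indexed form is a grouping operation on the index variable. A minimal sketch with plain Python lists follows; the function name and the values are hypothetical.

```python
def split_indexed(station_index, n_stations, values):
    """Return one list per station from an indexed ragged array.

    station_index -- zero-based station number of every sample
    n_stations    -- size of the station (instance) dimension
    values        -- any variable on the sample dimension
    """
    instances = [[] for _ in range(n_stations)]
    for i, v in zip(station_index, values):
        if not 0 <= i < n_stations:
            raise ValueError(f"index {i} is outside the station dimension")
        instances[i].append(v)
    return instances

station_index = [1, 0, 1, 2, 0]              # hypothetical index variable
temp = [11.0, 20.0, 12.0, 30.0, 21.0]        # hypothetical data variable
print(split_indexed(station_index, 3, temp))
# [[20.0, 21.0], [11.0, 12.0], [30.0]]
```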

9.3.4 Single timeSeries

[JMG: Can we omit this? I have stated generally that it is possible in the introduction, and it conforms to chapter 5. We could note in the ch 5 example that if there's only one timeSeries the coord vars could be scalar.]

When there is a single timeSeries in the file, one can use the multidimensional representation with number of stations = 1. One can also use scalar coordinates. This case is identified when the lat and lon coordinates are scalar. In this case, no connecting variable between station and observations is required, since they all belong to the same station.

dimensions:
  obs = 1233 ;
  name_strlen = 23 ;

variables:
  float lon ;
    lon:long_name = "station longitude";
    lon:units = "degrees_east";
  float lat ;
    lat:long_name = "station latitude" ;
    lat:units = "degrees_north" ;
  float alt ;
    alt:long_name = "vertical distance above the surface" ;
    alt:standard_name = "height" ;
    alt:units = "m";
    alt:positive = "up";
    alt:axis = "Z";
  char station_name(name_strlen) ;
    station_name:long_name = "station name" ;
    station_name:standard_name = "station_id";

  double time(obs) ;
    time:long_name = "time of measurement" ;
    time:units = "days since 1970-01-01 00:00:00" ;
    time:missing_value = -999.9;
  float humidity(obs) ;
    humidity:long_name = "specific humidity" ;
    humidity:coordinates = "time lat lon alt" ;
    humidity:_FillValue = -999.9;
  float temp(obs) ;
    temp:long_name = "temperature" ;
    temp:units = "Celsius" ;
    temp:coordinates = "time lat lon alt" ;
    temp:_FillValue = -999.9;

attributes:
    :featureType = "timeSeries";

9.3.5 Flattened representation

[JMG: Can we omit this? It strikes me as an unnecessary complication, unless there's a clear reason why one might prefer it.]

When factoring out the station information is not desired, one may use a 'flattened representation', in which the station information is repeated for each observation. The station_id variable, which is required, is used to associate the observations into a time series.

dimensions:
  obs = UNLIMITED ;

variables:
  float lon(obs) ;
    lon:long_name = "station longitude";
    lon:units = "degrees_east";
  float lat(obs) ;
    lat:long_name = "station latitude" ;
    lat:units = "degrees_north" ;
  float alt(obs) ;
    alt:long_name = "vertical distance above the surface" ;
    alt:standard_name = "height" ;
    alt:units = "m";
    alt:positive = "up";
    alt:axis = "Z";
  int station_id(obs) ;
    station_id:long_name = "station" ;
    station_id:standard_name = "station_id";

  double time(obs) ;
    time:long_name = "time of measurement" ;
    time:units = "days since 1970-01-01 00:00:00" ;
  float humidity(obs) ;
    humidity:long_name = "specific humidity" ;
    humidity:coordinates = "time lat lon alt" ;
    humidity:_FillValue = -999.9;
  float temp(obs) ;
    temp:long_name = "temperature" ;
    temp:units = "Celsius" ;
    temp:coordinates = "time lat lon alt" ;
    temp:_FillValue = -999.9;

attributes:
    :featureType = "timeSeries";

The humidity(i) and temp(i) data are associated with the coordinate values time(i), lat(i), lon(i), and optionally alt(i). All observations with the same station_id are assumed to belong to that timeSeries.
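Grouping flattened observations into timeSeries could be sketched as below. The function name and the values are hypothetical, and sorting each group by time is an assumption made here so that each group reads as a time series.

```python
def group_by_station(station_id, time, values):
    """Group flattened observations into one time-ordered series per station_id."""
    series = {}
    for sid, t, v in zip(station_id, time, values):
        series.setdefault(sid, []).append((t, v))
    for samples in series.values():
        samples.sort(key=lambda s: s[0])  # order each series by time
    return series

station_id = [7, 3, 7]                       # hypothetical station_id variable
time = [2.0, 1.0, 0.5]                       # hypothetical time variable
temp = [10.0, 20.0, 11.0]                    # hypothetical data variable
print(group_by_station(station_id, time, temp))
# {7: [(0.5, 11.0), (2.0, 10.0)], 3: [(1.0, 20.0)]}
```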

In some observational networks, station location may change. However, for timeSeries feature types this should be infrequent and not overly consequential. In principle, a new station identifier should be assigned. In practice, occasional and small adjustments to station location may not matter for typical processing of data for visualization, and generic clients may not detect these changes, e.g. they may assume that the first location encountered is valid for all other observations at the same station. Specialized clients, of course, may be more careful in examining station location data, and nothing prevents data providers from using a factored representation as in 9.3.1, 9.3.2, and 9.3.3, and also putting location information into the observation record, as in the flattened representation in 9.3.5.

9.4 Trajectory Data

Point data may be taken along a flight path or ship path, constituting a connected set of points called a trajectory.

Some assumptions are common to all trajectory representations:

  • It is strongly recommended that there always be a variable (of any data type) with standard_name attribute "trajectory_id", whose values uniquely identify the trajectory.
  • The outer dimension of the trajectory_id variable is the 'trajectory dimension'.
  • All variables that have the trajectory dimension as their only dimension are considered to be information about that trajectory.
  • The trajectory_id variable may use missing values. This allows one to reserve more space than is needed.

9.4.1 Multidimensional representation

When storing multiple trajectories in the same file, and the number of observations in each trajectory is the same, one can use the multidimensional representation:

dimensions:
  obs = 1000 ;
  trajectory = 77 ;

variables:
  char trajectory(trajectory, name_strlen) ;
    trajectory:standard_name = "trajectory_id";
    trajectory:long_name = "trajectory name" ;
  int trajectory_info(trajectory) ;
    trajectory_info:long_name = "some kind of trajectory info" ;

  double time(trajectory, obs) ;
    time:long_name = "time" ;
    time:units = "days since 1970-01-01 00:00:00" ;
  float lon(trajectory, obs) ;
    lon:long_name = "longitude" ;
    lon:units = "degrees_east" ;
  float lat(trajectory, obs) ;
    lat:long_name = "latitude" ;
    lat:units = "degrees_north" ;

  float z(trajectory, obs) ;
    z:long_name = "height above mean sea level" ;
    z:units = "km" ;
    z:positive = "up" ; 
    z:axis = "Z" ; 

  float O3(trajectory, obs) ;
    O3:long_name = "ozone concentration" ;
    O3:units = "1e-9" ;
    O3:coordinates = "time lon lat z" ;

  float NO3(trajectory, obs) ;
    NO3:long_name = "NO3 concentration" ;
    NO3:units = "1e-9" ;
    NO3:coordinates = "time lon lat z" ;

attributes:
  :featureType = "trajectory";

The NO3(t,i) and O3(t,i) data are associated with the coordinate values time(t,i), lat(t,i), lon(t,i), and z(t,i). The trajectory dimension may be the unlimited dimension or not. All variables that have trajectory as their only dimension are considered to be information about that trajectory.

The time coordinate may use a missing value, which indicates that data is missing for that trajectory and obs index. This allows one to have a variable number of observations for different trajectories, at the cost of some wasted space. The data variables may also use missing data values.

9.4.2 Single Trajectory

When a single trajectory is stored in a file, one can use a variation of 9.4.1 which removes the trajectory dimension:

dimensions:
  time = 42;

variables:
  char trajectory(name_strlen) ;
    trajectory:standard_name = "trajectory_id";

  double time(time) ;
    time:long_name = "time" ;
    time:units = "days since 1970-01-01 00:00:00" ;
  float lon(time) ;
    lon:long_name = "longitude" ;
    lon:units = "degrees_east" ;
  float lat(time) ;
    lat:long_name = "latitude" ;
    lat:units = "degrees_north" ;
  float z(time) ;
    z:long_name = "height above mean sea level" ;
    z:units = "km" ;
    z:positive = "up" ; 
    z:axis = "Z" ; 

  float O3(time) ;
    O3:long_name = "ozone concentration" ;
    O3:units = "1e-9" ;
    O3:coordinates = "time lon lat z" ;

  float NO3(time) ;
    NO3:long_name = "NO3 concentration" ;
    NO3:units = "1e-9" ;
    NO3:coordinates = "time lon lat z" ;

attributes:
  :featureType = "trajectory";

The NO3(n) and O3(n) data are associated with the coordinate values time(n), z(n), lat(n), and lon(n). When the time coordinate is ordered, it is appropriate to use a coordinate variable for time, i.e. time(time). The time dimension may be unlimited or not.

Note that structurally this looks like unconnected point data as in section 9.2. The presence of the featureType = "trajectory" global attribute indicates that in fact the points are connected along a trajectory.

Note that this is the same as Example 5.5.

9.4.3 Ragged array (contiguous) representation

When the number of observations for each trajectory varies, and one can control the order of writing, one can use the contiguous ragged array representation. The canonical use case for this is when rewriting raw data, and you expect that the common read pattern will be to read all the data from each trajectory. If the obs dimension is the unlimited dimension, this data will be contiguous on disk.

dimensions:
  obs = 3443;
  trajectory = 77 ;

variables:
  char trajectory(trajectory, name_strlen) ;
    trajectory:standard_name = "trajectory_id";
  int rowSize(trajectory) ;
    rowSize:long_name = "number of obs for this trajectory " ;
    rowSize:CF\:ragged_row_count = "obs" ;

  double time(obs) ;
    time:long_name = "time" ;
    time:units = "days since 1970-01-01 00:00:00" ;
  float lon(obs) ;
    lon:long_name = "longitude" ;
    lon:units = "degrees_east" ;
  float lat(obs) ;
    lat:long_name = "latitude" ;
    lat:units = "degrees_north" ;
  float z(obs) ;
    z:long_name = "height above mean sea level" ;
    z:units = "km" ;
    z:positive = "up" ; 
    z:axis = "Z" ; 

  float O3(obs) ;
    O3:long_name = "ozone concentration" ;
    O3:units = "1e-9" ;
    O3:coordinates = "time lon lat z" ;

  float NO3(obs) ;
    NO3:long_name = "NO3 concentration" ;
    NO3:units = "1e-9" ;
    NO3:coordinates = "time lon lat z" ;

attributes:
  :featureType = "trajectory";

The O3(i) and NO3(i) data are associated with the coordinate values time(i), lat(i), lon(i), and z(i). All observations for one trajectory are contiguous along the obs dimension, and should be time ordered. All variables that have trajectory as their single dimension are considered to be information about that trajectory. The obs dimension may use the unlimited dimension or not.

The rowSize variable contains the number of observations for each trajectory, and is identified by having an attribute with name "CF:ragged_row_count" whose value is the name of the observation dimension being counted. It must have the trajectory dimension as its single dimension, and must be of type integer. The observations are associated with the trajectory using the same algorithm as in 9.3.2.
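The following Python fragment is a minimal sketch of how a reader might turn the count variable into obs index ranges. It assumes that the trajectories are stored along obs in the same order as the trajectory dimension; the function name row_bounds and the sample values are hypothetical and are not part of the convention.

```python
from itertools import accumulate

def row_bounds(row_size):
    """Return one (start, end) obs index pair per feature, end exclusive."""
    ends = list(accumulate(row_size))
    starts = [0] + ends[:-1]
    return list(zip(starts, ends))

# Hypothetical values: three trajectories with 2, 3 and 1 observations.
row_size = [2, 3, 1]
o3 = [10.0, 11.0, 20.0, 21.0, 22.0, 30.0]

bounds = row_bounds(row_size)
# Slice the flat obs array into one list of values per trajectory.
per_trajectory = [o3[start:end] for start, end in bounds]
```

Because each trajectory occupies a single index range, all of its data can be read with one contiguous slice.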

9.4.4 Ragged array (indexed) representation

When the number of observations for each trajectory varies, and the observations cannot be written in order, one can use the indexed ragged array representation. The canonical use case is when writing real-time data streams that contain reports from many trajectories. The data can be written as it arrives; if the obs dimension is the unlimited dimension, this will effectively append to the file.

dimensions:
  obs = UNLIMITED ;
  trajectory = 77 ;

variables:
  char trajectory(trajectory, name_strlen) ;
    trajectory:standard_name = "trajectory_id";

  int trajectory_index(obs) ;
    trajectory_index:long_name = "index of trajectory this obs belongs to " ;
    trajectory_index:CF\:ragged_row_index= "trajectory" ;
  double time(obs) ;
    time:long_name = "time" ;
    time:units = "days since 1970-01-01 00:00:00" ;
  float lon(obs) ;
    lon:long_name = "longitude" ;
    lon:units = "degrees_east" ;
  float lat(obs) ;
    lat:long_name = "latitude" ;
    lat:units = "degrees_north" ;
  float z(obs) ;
    z:long_name = "height above mean sea level" ;
    z:units = "km" ;
    z:positive = "up" ;
    z:axis = "Z" ;  

  float O3(obs) ;
    O3:long_name = "ozone concentration" ;
    O3:units = "1e-9" ;
    O3:coordinates = "time lon lat z" ;

  float NO3(obs) ;
    NO3:long_name = "NO3 concentration" ;
    NO3:units = "1e-9" ;
    NO3:coordinates = "time lon lat z" ;

attributes:
  :CF\:featureType = "trajectory";

The O3(i) and NO3(i) data are associated with the coordinate values time(i), lat(i), lon(i), and z(i). All observations for one trajectory will have the same trajectory index value, and should be time ordered. The obs dimension may use the unlimited dimension or not. All indices are zero based.

The trajectory_index variable is identified by having an attribute with name of "CF:ragged_row_index" whose value is the trajectory dimension name.
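As an illustration only, the grouping implied by the index variable can be sketched in Python as below. The helper name group_by_index and the sample index values are hypothetical; the sketch relies on the zero-based indices stated above.

```python
def group_by_index(index, n_features):
    """Collect, for each feature, the obs positions whose index points to it."""
    groups = [[] for _ in range(n_features)]
    for obs, feature in enumerate(index):
        groups[feature].append(obs)
    return groups

# Hypothetical values: six observations arriving interleaved
# from three trajectories.
trajectory_index = [0, 2, 0, 1, 2, 0]
groups = group_by_index(trajectory_index, 3)
```

Unlike the contiguous representation, reading one trajectory requires scanning the whole index variable, which is the price paid for being able to append observations in arrival order.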

9.5 Profile Data

A series of connected observations along a vertical line, like an atmospheric or ocean sounding, is called a profile. The lat, lon locations are factored out into the profile.

Some assumptions are common to all profile representations:

  • It is strongly recommended that there always be a variable (of any type) with standard_name attribute "profile_id", whose values uniquely identify the profile.
  • The outer dimension of the profile_id variable is the 'profile dimension'.
  • All variables that have the profile dimension as their only dimension are considered to be information about that profile.
  • The profile_id variable may use missing values. This allows one to reserve more space than is needed.

9.5.1 Multidimensional representation

When storing multiple profiles in the same file, and the numbers of vertical levels in each profile are the same, one can use the multidimensional representation:

dimensions:
  z = 42 ;
  profile = 142 ;

variables:
  int profile(profile) ;
     profile:standard_name = "profile_id";
  double time(profile);
    time:long_name = "time" ;
    time:units = "days since 1970-01-01 00:00:00" ;
  float lon(profile);
    lon:long_name = "longitude" ;
    lon:units = "degrees_east" ;
  float lat(profile);
    lat:long_name = "latitude" ;
    lat:units = "degrees_north" ;

  float alt(profile, z) ;
    alt:long_name = "height above mean sea level" ;
    alt:units = "km" ;
    alt:positive = "up" ; 
    alt:axis = "Z" ;  

  float pressure(profile, z) ;
    pressure:long_name = "pressure level" ;
    pressure:units = "hPa" ;
    pressure:coordinates = "time lon lat alt" ;

  float temperature(profile, z) ;
    temperature:long_name = "skin temperature" ;
    temperature:units = "Celsius" ;
    temperature:coordinates = "time lon lat alt" ;

  float humidity(profile, z) ;
    humidity:long_name = "relative humidity" ;
    humidity:units = "%" ;
    humidity:coordinates = "time lon lat alt" ;

attributes:
  :CF\:featureType = "profile";

The pressure(p,i), temperature(p,i), and humidity(p,i) data is associated with the coordinate values time(p), alt(p,i), lat(p), and lon(p). If the vertical coordinates are the same for all profiles, one can use z(z) instead of alt(profile,z). The time coordinate may depend on z also, e.g. time(profile,z).

When there are a variable number of observations for different profiles, use alt(profile, z) with missing values.
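A minimal Python sketch of this padding is shown below. The missing value -999.9 and the helper name pad_profiles are assumptions made for the example; a real file would declare its own missing value on the variable.

```python
MISSING = -999.9  # hypothetical missing value, not mandated by the text

def pad_profiles(profiles, fill=MISSING):
    """Pad each profile with the fill value up to the longest profile."""
    n_levels = max((len(p) for p in profiles), default=0)
    return [list(p) + [fill] * (n_levels - len(p)) for p in profiles]

# Two hypothetical profiles with 3 and 2 levels become a 2 x 3 array.
alt = pad_profiles([[0.1, 0.5, 1.0], [0.2, 0.8]])
```

The padded array has the rectangular shape (profile, z) required by the multidimensional representation, at the cost of storing the fill values.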

9.5.2 Single Profile

When a single profile is stored in a file, one can use a variation of 9.5.1 which removes the profile dimension:

dimensions:
  z = 42 ;

variables:
  int profile ;
    profile:standard_name = "profile_id";

  double time;
    time:long_name = "time" ;
    time:units = "days since 1970-01-01 00:00:00" ;
  float lon;
    lon:long_name = "longitude" ;
    lon:units = "degrees_east" ;
  float lat;
    lat:long_name = "latitude" ;
    lat:units = "degrees_north" ;

  float alt(z) ;
    alt:long_name = "height above mean sea level" ;
    alt:units = "km" ;
    alt:positive = "up" ; 
    alt:axis = "Z" ;  

  float pressure(z) ;
    pressure:long_name = "pressure level" ;
    pressure:units = "hPa" ;
    pressure:coordinates = "time lon lat alt" ;

  float temperature(z) ;
    temperature:long_name = "skin temperature" ;
    temperature:units = "Celsius" ;
    temperature:coordinates = "time lon lat alt" ;

  float humidity(z) ;
    humidity:long_name = "relative humidity" ;
    humidity:units = "%" ;
    humidity:coordinates = "time lon lat alt" ;

attributes:
  :CF\:featureType = "profile";

The pressure(i), temperature(i), and humidity(i) data is associated with the coordinate values time, alt(i), lat, and lon. The time coordinate may depend on z also, e.g. time(z).

9.5.3 Ragged array (contiguous) representation

When the number of vertical levels for each profile varies, one can use the contiguous ragged array representation. One stores the set of observations for each profile contiguously along the obs dimension. The canonical use case for this is when rewriting raw data, and you expect that the common read pattern will be to read all the data from each profile.

dimensions:
  obs = UNLIMITED ;
  profile = 142 ;

variables:
  int profile(profile) ;
    profile:standard_name = "profile_id";
  double time(profile);
    time:long_name = "time" ;
    time:units = "days since 1970-01-01 00:00:00" ;
  float lon(profile);
    lon:long_name = "longitude" ;
    lon:units = "degrees_east" ;
  float lat(profile);
    lat:long_name = "latitude" ;
    lat:units = "degrees_north" ; 
  int rowSize(profile) ;
    rowSize:long_name = "number of obs for this profile " ;
    rowSize:CF\:ragged_row_count = "obs" ;

  float z(obs) ;
    z:long_name = "height above mean sea level" ;
    z:units = "km" ;
    z:positive = "up" ;
    z:axis = "Z" ;  

  float pressure(obs) ;
    pressure:long_name = "pressure level" ;
    pressure:units = "hPa" ;
    pressure:coordinates = "time lon lat z" ;

  float temperature(obs) ;
    temperature:long_name = "skin temperature" ;
    temperature:units = "Celsius" ;
    temperature:coordinates = "time lon lat z" ;

  float humidity(obs) ;
    humidity:long_name = "relative humidity" ;
    humidity:units = "%" ;
    humidity:coordinates = "time lon lat z" ;

attributes:
  :CF\:featureType = "profile";

The pressure(i), temperature(i), and humidity(i) data is associated with the coordinate values time(p), z(i), lat(p), and lon(p), where p is found by reading the rowSize variable values as in 9.3.4. The time coordinate may depend on z also, e.g. time(p,z).
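One way to picture the association is to repeat each per-profile coordinate once for every observation in that profile. The Python sketch below does this under the assumption that profiles are stored along obs in profile order; the name expand_to_obs and the sample values are hypothetical.

```python
def expand_to_obs(per_profile, row_size):
    """Repeat each per-profile value once for every obs in that profile."""
    expanded = []
    for value, count in zip(per_profile, row_size):
        expanded.extend([value] * count)
    return expanded

# Hypothetical values: two profiles with 2 and 3 levels.
row_size = [2, 3]
lat = [45.0, -10.0]

# lat_obs[i] is the latitude that applies to observation i.
lat_obs = expand_to_obs(lat, row_size)
```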

9.5.4 Ragged array (indexed) representation

When the number of vertical levels for each profile varies, and one cannot write them contiguously, one can use the indexed ragged array representation. The canonical use case is when writing real-time data streams that contain reports from many profiles, arriving randomly.

dimensions:
  obs = UNLIMITED ;
  profile = 142 ;

variables:
  int profile(profile) ;
    profile:standard_name = "profile_id";
  double time(profile);
    time:long_name = "time" ;
    time:units = "days since 1970-01-01 00:00:00" ;
  float lon(profile);
    lon:long_name = "longitude" ;
    lon:units = "degrees_east" ;
  float lat(profile);
    lat:long_name = "latitude" ;
    lat:units = "degrees_north" ; 

  int parentIndex(obs) ;
    parentIndex:long_name = "index of profile " ;
    parentIndex:CF\:ragged_row_index= "profile" ;
  
  float z(obs) ;
    z:long_name = "height above mean sea level" ;
    z:units = "km" ;
    z:positive = "up" ;
    z:axis = "Z" ;  

  float pressure(obs) ;
    pressure:long_name = "pressure level" ;
    pressure:units = "hPa" ;
    pressure:coordinates = "time lon lat z" ;

  float temperature(obs) ;
    temperature:long_name = "skin temperature" ;
    temperature:units = "Celsius" ;
    temperature:coordinates = "time lon lat z" ;

  float humidity(obs) ;
    humidity:long_name = "relative humidity" ;
    humidity:units = "%" ;
    humidity:coordinates = "time lon lat z" ;

attributes:
  :CF\:featureType = "profile";

The pressure(i), temperature(i), and humidity(i) data is associated with the coordinate values time(p), z(i), lat(p), and lon(p), where p = parentIndex(i). The time coordinate may depend on z also, e.g. time(p,z). All indices are zero based.

9.6 Time Series of Profiles

When profiles are taken at a set of stations, one gets a time series of profiles at each station, called a timeSeriesProfile.

The same assumptions are made as with timeSeries data:

  • The outer dimension of the latitude and longitude coordinates (which must agree) is the 'station dimension'.
  • All variables that have the station dimension as their outer dimension are considered to be station information, and are called 'station variables'.
  • It is strongly recommended that there always be a station variable (of any type) with standard_name attribute "station_id", whose values uniquely identify the station.
  • The station_id variable may use missing values. This allows one to reserve more space than is needed for stations.
  • There may be station variables with standard_name attribute "station_desc", "surface_altitude", and "station_WMO_id".

9.6.1 Multidimensional representation

When storing time series of profiles at multiple stations in the same file, if there are the same number of time points for all timeSeries, and the same number of vertical levels for every profile, one can use the multidimensional representation:

dimensions:
  station = 22 ;
  profile = 3002 ;
  z = 42 ;

variables:
  float lon(station) ;
    lon:long_name = "station longitude";
    lon:units = "degrees_east";
  float lat(station) ;
    lat:long_name = "station latitude" ;
    lat:units = "degrees_north" ;
  char station_name(station, name_strlen) ;
    station_name:standard_name = "station_id" ;
    station_name:long_name = "station name" ;
  int station_info(station) ;
    station_info:long_name = "some kind of station info" ;

  float alt(station, profile , z) ;
    alt:long_name = "height above mean sea level" ;
    alt:units = "km" ;
    alt:positive = "up" ; 
    alt:axis = "Z" ;  

  double time(station, profile ) ;
    time:long_name = "time of measurement" ;
    time:units = "days since 1970-01-01 00:00:00" ;
    time:missing_value = -999.9;

  float pressure(station, profile , z) ;
    pressure:long_name = "pressure level" ;
    pressure:units = "hPa" ;
    pressure:coordinates = "time lon lat alt" ;

  float temperature(station, profile , z) ;
    temperature:long_name = "skin temperature" ;
    temperature:units = "Celsius" ;
    temperature:coordinates = "time lon lat alt" ;

  float humidity(station, profile , z) ;
    humidity:long_name = "relative humidity" ;
    humidity:units = "%" ;
    humidity:coordinates = "time lon lat alt" ;

attributes:
 :CF\:featureType = "timeSeriesProfile";

The pressure(s,p,i), temperature(s,p,i), and humidity(s,p,i) data is associated with the coordinate values time(s,p), alt(s,p,i), lat(s), and lon(s).

The time coordinate may depend on z also, e.g. time(station,profile,z). If all of the profiles use the same z coordinate, alt(station, profile, z) may be factored out into z(z).

When the number of profiles varies between stations, use time(station, profile) with missing values. When the number of levels varies between profiles, use alt(station, profile, z) with missing values.

9.6.2 Profile time series at a single station

If there is only one station in a file, one can use a variation of 9.6.1 which removes the station dimension:

dimensions:
  profile = 30 ;
  z = 42 ;

variables:
  float lon ;
    lon:long_name = "station longitude";
    lon:units = "degrees_east";
  float lat ;
    lat:long_name = "station latitude" ;
    lat:units = "degrees_north" ;
  char station_name(name_strlen) ;
    station_name:standard_name = "station_id" ;
    station_name:long_name = "station name" ;
  int station_info;
    station_info:long_name = "some kind of station info" ;

  float alt(profile , z) ;
    alt:long_name = "height above mean sea level" ;
    alt:units = "km" ;
    alt:axis = "Z" ;  
    alt:positive = "up" ; 

  double time(profile ) ;
    time:long_name = "time of measurement" ;
    time:units = "days since 1970-01-01 00:00:00" ;
    time:missing_value = -999.9;

  float pressure(profile , z) ;
    pressure:long_name = "pressure level" ;
    pressure:units = "hPa" ;
    pressure:coordinates = "time lon lat alt" ;

  float temperature(profile , z) ;
    temperature:long_name = "skin temperature" ;
    temperature:units = "Celsius" ;
    temperature:coordinates = "time lon lat alt" ;

  float humidity(profile , z) ;
    humidity:long_name = "relative humidity" ;
    humidity:units = "%" ;
    humidity:coordinates = "time lon lat alt" ;

attributes:
 :CF\:featureType = "timeSeriesProfile";

The pressure(p,i), temperature(p,i), and humidity(p,i) data are associated with the coordinate values time(p), alt(p,i), lat, and lon. The time coordinate may depend on z also, e.g. time(profile,z). If all of the profiles use the same z coordinate, alt(profile, z) may be factored out into z(z).

9.6.3 Ragged array of profile time series

When the number of profiles and levels for each station varies, one can use the ragged array representation. This uses the contiguous ragged array representation for profiles (9.5.3), and adds the (factored out) station information with station indexes (9.2.4). The canonical use case is when writing real-time data streams that contain profiles from many stations, arriving randomly. However, the data for an entire profile is written all at once, and contiguously.

dimensions:
  obs = UNLIMITED ;
  profile = 1420 ;
  station = 42 ;

variables:
  float lon(station) ;
    lon:long_name = "station longitude";
    lon:units = "degrees_east";
  float lat(station) ;
    lat:long_name = "station latitude" ;
    lat:units = "degrees_north" ;
  float alt(station) ;
    alt:long_name = "altitude above MSL" ;
    alt:units = "m" ;
  char station_name(station, name_strlen) ;
    station_name:long_name = "station name" ;
    station_name:standard_name = "station_id";
  int station_info(station) ;
    station_info:long_name = "some kind of station info" ;

  int profile(profile) ;
    profile:standard_name = "profile_id";
  double time(profile);
    time:long_name = "time" ;
    time:units = "days since 1970-01-01 00:00:00" ;
  int station_index(profile) ;
    station_index:long_name = "which station this profile is for" ;
    station_index:CF\:ragged_row_index = "station" ;
  int row_size(profile) ;
    row_size:long_name = "number of obs for this profile " ;
    row_size:CF\:ragged_row_count = "obs" ;

  float z(obs) ;
    z:long_name = "height above mean sea level" ;
    z:units = "km" ;
    z:axis = "Z" ;  
    z:positive = "up" ;

  float pressure(obs) ;
    pressure:long_name = "pressure level" ;
    pressure:units = "hPa" ;
    pressure:coordinates = "time lon lat z" ;

  float temperature(obs) ;
    temperature:long_name = "skin temperature" ;
    temperature:units = "Celsius" ;
    temperature:coordinates = "time lon lat z" ;

  float humidity(obs) ;
    humidity:long_name = "relative humidity" ;
    humidity:units = "%" ;
    humidity:coordinates = "time lon lat z" ;

attributes:
  :CF\:featureType = "timeSeriesProfile";

The profile is associated with a station using the station_index(profile). For each profile, the observations must be written contiguously, and the number of obs for each profile written in row_size(profile).

The pressure(i), temperature(i), and humidity(i) data is associated with the coordinate values time(p), z(i), lat(s), and lon(s), where s = station_index(p). The time coordinate may depend on z also, e.g. time(obs) instead of time(profile).
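The two-step lookup described above (observation to profile through row_size, then profile to station through station_index) can be sketched in Python as follows. The function name station_of_each_obs and all sample values are hypothetical, and the sketch assumes profiles are stored along obs in profile order.

```python
def station_of_each_obs(row_size, station_index):
    """Return the station index s for every obs i, via its profile p."""
    stations = []
    for p, count in enumerate(row_size):
        # Every obs in profile p inherits that profile's station.
        stations.extend([station_index[p]] * count)
    return stations

# Hypothetical values: three profiles taken at two stations.
row_size = [2, 1, 3]        # number of obs in each profile
station_index = [1, 0, 1]   # station of each profile (zero based)
lon = [-105.0, 12.5]        # per-station longitude

obs_station = station_of_each_obs(row_size, station_index)
obs_lon = [lon[s] for s in obs_station]
```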

9.7 Trajectory of Profiles

When profiles are taken along a trajectory, one gets a time series of profiles called a trajectoryProfile. This looks like a collection of profiles (see 9.5), except that the profile locations are assumed to be a connected set of points along a trajectory. A single file may contain one or more such trajectoryProfile features.

Some assumptions are common to all trajectoryProfile representations:

  • It is strongly recommended that there always be a variable (of any type) with standard_name attribute "trajectory_id", whose values uniquely identify the trajectory.
  • The outer dimension of the trajectory_id variable is the 'trajectory dimension'.
  • All variables that have the trajectory dimension as their only dimension are considered to be information about that trajectory.
  • The trajectory_id variable may use missing values. This allows one to reserve more space than is needed.

9.7.1 Trajectory Profile multidimensional representation

If there are the same number of profiles for all trajectories, and the same number of vertical levels for every profile, one can use the multidimensional representation:

dimensions:
  trajectory = 22 ;
  profile = 33;
  z = 42 ;

variables:
  int trajectory (trajectory ) ;
    trajectory:standard_name = "trajectory_id" ;

  float lon(trajectory, profile) ;
    lon:units = "degrees_east";
  float lat(trajectory, profile) ;
    lat:long_name = "station latitude" ;
    lat:units = "degrees_north" ;

  float alt(trajectory, profile , z) ;
    alt:long_name = "height above mean sea level" ;
    alt:units = "km" ;
    alt:positive = "up" ; 
    alt:axis = "Z" ;  

  double time(trajectory, profile ) ;
    time:long_name = "time of measurement" ;
    time:units = "days since 1970-01-01 00:00:00" ;
    time:missing_value = -999.9;

  float pressure(trajectory, profile , z) ;
    pressure:long_name = "pressure level" ;
    pressure:units = "hPa" ;
    pressure:coordinates = "time lon lat alt" ;

  float temperature(trajectory, profile , z) ;
    temperature:long_name = "skin temperature" ;
    temperature:units = "Celsius" ;
    temperature:coordinates = "time lon lat alt" ;

  float humidity(trajectory, profile , z) ;
    humidity:long_name = "relative humidity" ;
    humidity:units = "%" ;
    humidity:coordinates = "time lon lat alt" ;

attributes:
 :CF\:featureType = "trajectoryProfile";

The pressure(s,p,i), temperature(s,p,i), and humidity(s,p,i) data is associated with the coordinate values time(s,p), alt(s,p,i), lat(s,p), and lon(s,p).

The time coordinate may depend on z also, e.g. time(trajectory,profile,z). If all of the profiles use the same z coordinate, alt(trajectory, profile, z) may be factored out into z(z).

When the number of profiles varies between trajectories, use time(trajectory, profile) with missing values. When the number of levels varies between profiles, use alt(trajectory, profile, z) with missing values.

9.7.2 Single Trajectory in the file

If there is only one trajectory in the file, one can use a variation of 9.7.1 which removes the trajectory dimension:

dimensions:
  profile = 33;
  z = 42 ;

variables:
  int trajectory;
    trajectory:standard_name = "trajectory_id" ;

  float lon(profile) ;
    lon:units = "degrees_east";
  float lat(profile) ;
    lat:long_name = "station latitude" ;
    lat:units = "degrees_north" ;

  float alt(profile, z) ;
    alt:long_name = "height above mean sea level" ;
    alt:units = "km" ;
    alt:positive = "up" ; 
    alt:axis = "Z" ;  

  double time(profile ) ;
    time:long_name = "time of measurement" ;
    time:units = "days since 1970-01-01 00:00:00" ;
    time:missing_value = -999.9;

  float pressure(profile, z) ;
    pressure:long_name = "pressure level" ;
    pressure:units = "hPa" ;
    pressure:coordinates = "time lon lat alt" ;

  float temperature(profile, z) ;
    temperature:long_name = "skin temperature" ;
    temperature:units = "Celsius" ;
    temperature:coordinates = "time lon lat alt" ;

  float humidity(profile, z) ;
    humidity:long_name = "relative humidity" ;
    humidity:units = "%" ;
    humidity:coordinates = "time lon lat alt" ;

attributes:
 :CF\:featureType = "trajectoryProfile";

9.7.3 Ragged array of trajectoryProfile data

When the number of profiles and levels for each trajectory varies, one can use the ragged array representation. This uses the contiguous ragged array representation for profiles (9.5.3), and adds trajectory information with trajectory indexes. The canonical use case is when writing real-time data streams that contain profiles from many trajectories, arriving randomly. However, the data for an entire profile is written contiguously all at once.

dimensions:
  obs = UNLIMITED ;
  profile = 142 ;
  trajectory = 3 ;

variables:
  int trajectory(trajectory) ;
    trajectory:standard_name = "trajectory_id" ;

  double time(profile);
    time:long_name = "time" ;
    time:units = "days since 1970-01-01 00:00:00" ;
  float lon(profile);
    lon:long_name = "longitude" ;
    lon:units = "degrees_east" ;
  float lat(profile);
    lat:long_name = "latitude" ;
    lat:units = "degrees_north" ; 
  int row_size(profile) ;
    row_size:long_name = "number of obs for this profile " ;
    row_size:CF\:ragged_row_count = "obs" ;
  int trajectory_index(profile) ;
    trajectory_index:long_name = "which trajectory this profile is for" ;
    trajectory_index:CF\:ragged_row_index= "trajectory" ;
  
  float z(obs) ;
    z:long_name = "height above mean sea level" ;
    z:units = "km" ;
    z:positive = "up" ;
    z:axis = "Z" ;  

  float pressure(obs) ;
    pressure:long_name = "pressure level" ;
    pressure:units = "hPa" ;
    pressure:coordinates = "time lon lat z" ;

  float temperature(obs) ;
    temperature:long_name = "skin temperature" ;
    temperature:units = "Celsius" ;
    temperature:coordinates = "time lon lat z" ;

  float humidity(obs) ;
    humidity:long_name = "relative humidity" ;
    humidity:units = "%" ;
    humidity:coordinates = "time lon lat z" ;

attributes:
  :CF\:featureType = "trajectoryProfile";

The profile is associated with a trajectory using the trajectory_index(profile). The observations for each profile must be written contiguously, and the number of obs in each profile is stored in row_size(profile).

The pressure(i), temperature(i), and humidity(i) data is associated with the coordinate values time(p), z(i), lat(p), and lon(p). The time coordinate may depend on z also, e.g. time(obs) instead of time(profile).
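To show how the count and index variables combine, the Python sketch below rebuilds the nested structure: for each trajectory, the list of obs index ranges of its profiles. The name nest_profiles and the sample values are hypothetical, and the sketch assumes profiles are stored along obs in profile order.

```python
from itertools import accumulate

def nest_profiles(row_size, trajectory_index, n_trajectories):
    """For each trajectory, list the (start, end) obs range of its profiles."""
    ends = list(accumulate(row_size))
    starts = [0] + ends[:-1]
    nested = [[] for _ in range(n_trajectories)]
    for p, t in enumerate(trajectory_index):
        nested[t].append((starts[p], ends[p]))  # end is exclusive
    return nested

# Hypothetical values: three profiles belonging to two trajectories.
row_size = [2, 1, 3]
trajectory_index = [0, 1, 0]

nested = nest_profiles(row_size, trajectory_index, 2)
```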

New standard names

  • station_id JMG: not sure what this contains---a name or a number?---same query about profile and trajectory
  • station_desc JMG: what would this contain?
  • station_WMO_id JMG: is there a URL for a standardised list we can point to?
  • profile_id
  • trajectory_id

New variable attributes

  • count whose value is the name of a variable which counts the number of elements of each instance of a feature type in the contiguous ragged array representation.
  • index whose value is the name of a variable which assigns each element to a particular instance of a feature type in the indexed ragged array representation.
  • instance whose value is the name of the dimension that runs over instances of the feature type in the indexed ragged array representation.

New global attribute

featureType can take one of these values (case-insensitive):

  • point
  • timeSeries
  • trajectory
  • profile
  • timeSeriesProfile
  • trajectoryProfile
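Since the values are case-insensitive, a reader may wish to normalize the attribute before comparing it. The Python sketch below is one possible approach; the function name canonical_feature_type and the choice to raise ValueError for unknown values are assumptions, not part of the convention.

```python
FEATURE_TYPES = (
    "point",
    "timeSeries",
    "trajectory",
    "profile",
    "timeSeriesProfile",
    "trajectoryProfile",
)

def canonical_feature_type(value):
    """Map a featureType value in any letter case to its listed spelling."""
    lookup = {name.lower(): name for name in FEATURE_TYPES}
    try:
        return lookup[value.lower()]
    except KeyError:
        raise ValueError(f"unknown featureType: {value!r}") from None
```

For example, canonical_feature_type("TIMESERIES") returns "timeSeries".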

Other modifications

In section 5, insert the following new paragraph between the second and third:

"When the data exists in more than one spatiotemporal dimension, spatiotemporal coordinate variables can conveniently be used to locate it provided it is arranged on a multidimensional grid, which means that the spatiotemporal dimensions are all independent, and there exists a gridpoint for every possible combination of the subscripts along the spatiotemporal dimensions. Some data is not gridded in this sense, i.e. only a subset of the possible combinations of spatiotemporal coordinates exists, for instance timeseries, trajectories and vertical profiles. We refer to the latter kind of arrangement as a discrete sampling geometry. Different conventions are provided for it in chapter 9, which are typically much more space-efficient than the gridded representation and provide additional metadata to describe its structure."

In section 5, third paragraph, change: [JMG: Do we need this change? With the count and index attributes, I am not sure whether any exclusion is reqd for chapter 9.]

"The dimensions of an auxiliary coordinate variable must be a subset of the dimensions of the variable with which the coordinate is associated (an exception is label coordinates (Section 6.1, “Labels”) which contain a dimension for maximum string length)"

to

"The dimensions of an auxiliary coordinate variable must be a subset of the dimensions of the variable with which the coordinate is associated. There are two exceptions to this. First, label coordinates (see Section 6.1, “Labels”) contain a dimension for maximum string length. Second, indexed and contiguous representations of discrete sampling geometries (see Section 9, “Discrete Sampling Geometries”) allow special kinds of coordinates which are connected in a different way than by the dimension."

In section 5.4, first paragraph, add at the end:

If there is more than one set of sample times, it is strongly recommended that new data writers use the "Discrete Sampling Geometries" Conventions in Chapter 9.3 and 9.6, which provide more extensive options for writing timeSeries data.

In section 5.5, first paragraph, add at the end:

If there is more than one trajectory, it is strongly recommended that new data writers use the "Discrete Sampling Geometries" Conventions in Chapter 9.4, which provide more extensive options for writing trajectory data.

[JMG: suggest we also add subsections to 5 showing the (scalar or 1D) point feature type, and the 1D profile feature type, which is already mentioned in an example elsewhere. These new subsections would have a similar recommendation to look at chapter 9.]

Also see

Summary of encodings

Last modified on 10/12/10 05:56:33