Opened 3 years ago

# Add support for complex numbers to CF

Reported by: Owned by: mikedixon cf-conventions@… high cf-conventions

## Motivation

The ongoing development of the CfRadial convention, aimed at representing radar and lidar data in CF NetCDF, has highlighted the need to properly represent complex numbers in CF. The intention of this proposal is to obtain agreement on adopting a complex number convention that (a) respects the CF approach, (b) satisfies the requirements of the radar and lidar research community and (c) provides a general solution so that complex numbers from any discipline can be accurately and simply represented.

## The nature of complex numbers

A complex number can be expressed in the form P = X + iY, where X and Y are real numbers, and i is the square root of -1. For the complex number P = X + iY, X is called the real part, and Y is called the imaginary part.

Complex numbers can also be represented in polar or exponential form.

The relationship between the Cartesian form and the polar form is:

P = X + iY = R(cos(theta) + i sin(theta))

where R is the distance from the origin O to the point P, and theta is the angle between the X axis and the line OP, measured counterclockwise.
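
As a minimal sketch of the relationship above, Python's standard `cmath` module converts between the two forms. The helper names `to_polar` and `to_cartesian` are illustrative only and are not part of the proposal; theta is in radians here.

```python
import cmath


def to_polar(p: complex) -> tuple[float, float]:
    """Return (R, theta) for P = X + iY, with theta in radians."""
    return cmath.polar(p)


def to_cartesian(r: float, theta: float) -> complex:
    """Return X + iY = R(cos(theta) + i sin(theta))."""
    return cmath.rect(r, theta)


p = complex(3.0, 4.0)        # X = 3, Y = 4
r, theta = to_polar(p)       # R = 5, theta = atan2(4, 3)
q = to_cartesian(r, theta)   # round trip, equal to p up to rounding
```

Either pair, (X, Y) or (R, theta), carries the same information, which is the point made in the next section.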

## Representing complex numbers as 2 reals

The above discussion demonstrates that no matter which form you choose to use, a complex number requires the storage of two real numbers, as well as a clear description of which form you are using. It seems reasonable to determine the form being used by inspecting the units attribute.

## Use of complex numbers in radar data

At the most raw level, radars generate time series of I (in-phase) and Q (quadrature) signals. These two signal components are considered to comprise a complex number.

Additionally, the spectra and covariances of radar signals are stored as complex numbers.

Because of the very large dynamic range of radar signals (up to 10 orders of magnitude), these complex numbers are often stored as power, in log units, and phase, in degrees.
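
The conversion between an I/Q pair and the power/phase pair can be sketched as follows. This is an illustration under stated assumptions, not a radar calibration: power is taken as I^2 + Q^2 relative to a unit reference (so the result is in dB, not calibrated dBm), and the function names are hypothetical.

```python
import math


def iq_to_power_phase(i: float, q: float) -> tuple[float, float]:
    """Convert an I/Q sample to (power in dB, phase in degrees).

    Assumption: power is I^2 + Q^2 relative to a unit reference.
    Undefined for i == q == 0, where log10 raises ValueError.
    """
    power_db = 10.0 * math.log10(i * i + q * q)
    phase_deg = math.degrees(math.atan2(q, i))
    return power_db, phase_deg


def power_phase_to_iq(power_db: float, phase_deg: float) -> tuple[float, float]:
    """Inverse of iq_to_power_phase under the same assumption."""
    amplitude = math.sqrt(10.0 ** (power_db / 10.0))
    theta = math.radians(phase_deg)
    return amplitude * math.cos(theta), amplitude * math.sin(theta)


# A sample with I = 3, Q = 4: about 13.98 dB and 53.13 degrees
power_db, phase_deg = iq_to_power_phase(3.0, 4.0)
```

Ten orders of magnitude in power span only 100 dB in log units, which is why this form is convenient for storage.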

## Storing complex numbers in NetCDF

Because complex numbers must be represented by 2 real numbers, a complex variable will have one extra dimension, of size 2.

We propose that this dimension should be the last of the dimensions used for a variable – i.e. the 2 parts of the number are stored immediately one after the other in the array.

So, for example, a radar variable holding complex data for an I/Q time series could be represented as follows:

```
time = 3000 ;
range = 996 ;
complex = 2 ;
float IQ(time, range, complex) ;
IQ:is_complex = "true" ;
IQ:units = "volt" ;
IQ:_FillValue = -9999F ;
IQ:coordinates = "time range" ;
```
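
To make the storage order concrete, the sketch below computes where the two parts of one sample sit in a flat array, assuming row-major ordering with the last dimension varying fastest (as in CDL and C). The function names and the tiny 2 x 3 stand-in for the 3000 x 996 example are illustrative only.

```python
def flat_offset(t: int, r: int, c: int, n_range: int, n_complex: int = 2) -> int:
    """Row-major offset of IQ(t, r, c), with complex as the last dimension."""
    return (t * n_range + r) * n_complex + c


def read_complex(flat: list[float], t: int, r: int, n_range: int) -> complex:
    """Read the sample at (t, r) from interleaved storage."""
    base = flat_offset(t, r, 0, n_range)
    # The two parts are adjacent: first part at base, second at base + 1
    return complex(flat[base], flat[base + 1])


# Tiny stand-in for the example above: 2 times, 3 ranges, 2 parts
n_time, n_range = 2, 3
flat = [float(k) for k in range(n_time * n_range * 2)]
sample = read_complex(flat, 1, 2, n_range)   # last gate of the last time
```

With this layout a reader can reinterpret the buffer as an array of complex values of shape (time, range) without copying, provided the library in use supports such a view.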

## Attribute indicating variable is complex

As indicated in the example above, we propose that the string attribute:

```
is_complex = "true";
```

be attached to any variable that holds complex numbers.
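
A reader could combine this attribute with the size of the last dimension to decide whether a variable is complex. The sketch below uses plain Python values in place of a real NetCDF library; the function name and the case-insensitive comparison are assumptions, not part of the proposal.

```python
def is_complex_variable(dim_sizes: tuple[int, ...], attrs: dict[str, str]) -> bool:
    """True if the variable is flagged complex and its last dimension has size 2."""
    # Assumption: the attribute value is compared case-insensitively
    flagged = attrs.get("is_complex", "").strip().lower() == "true"
    return flagged and len(dim_sizes) > 0 and dim_sizes[-1] == 2


# The IQ variable from the example: dimensions (time, range, complex)
iq_is_complex = is_complex_variable((3000, 996, 2), {"is_complex": "true", "units": "volt"})

# A trailing dimension of size 2 alone is not enough without the attribute
other_is_complex = is_complex_variable((3000, 2), {"units": "m s-1"})
```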

## Units attribute for real/imaginary form

If the complex number is stored using real and imaginary parts, the units of both parts will be the same.

This is true for the example above, where we have:

```
IQ:units = "volt";
```

## Units attribute for polar form

If the polar or exponential form is used, the units of the two parts will not be the same.

We could use a single units string attribute, but make it comma-delimited to separate the units for the 2 parts.

For example:

```
IQ:units = "dBm,degree";
```

Or we could have 2 separate units attributes:

For example:

```
IQ:units_first_part = "dBm";
IQ:units_second_part = "degree";
```

The full proposal is attached.

The documents are maintained at:

### comment:1 Changed 3 years ago by jonathan

Dear Mike

Thanks for your proposal. It would probably encourage discussion if you could put the text in this ticket. That's more accessible than a linked PDF or GitHub, because the ticket is distributed to everyone by email in plain text. We may change to using GitHub instead of trac, but we're not yet technically set up for that. It's being worked on.

As you say, a complex number is a pair of real numbers, where each member of the pair has a distinct function. A way we could accommodate this need in CF without any structural change is to store each complex data variable as a pair of real data variables, where the two members are distinguished by standard names. This is analogous to vector quantities, in which each component is stored as a separate data variable and distinguished by standard name, such as eastward, northward and upward components of sea water velocity. As you say, the components may have different units. A standard name implies canonical units, so components with different units cannot have the same standard name, so cannot be in the same data variable. To remove this restriction would probably raise a lot of potential problems.
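
As a rough sketch of the two-variable approach, a reader would locate the pair and recombine it. The dicts below stand in for netCDF data variables, and the standard names are placeholders invented for illustration; they are not entries in the CF standard name table.

```python
# Two real data variables standing in for netCDF variables.
# The standard names are hypothetical placeholders, NOT real CF standard names.
iq_real = {
    "standard_name": "hypothetical_real_part_of_signal",
    "units": "volt",
    "data": [1.0, 0.0, -2.0],
}
iq_imag = {
    "standard_name": "hypothetical_imaginary_part_of_signal",
    "units": "volt",
    "data": [0.5, 3.0, 0.0],
}


def combine(real_var: dict, imag_var: dict) -> list[complex]:
    """Recombine separately stored real and imaginary parts."""
    if real_var["units"] != imag_var["units"]:
        raise ValueError("real and imaginary parts must share units")
    if len(real_var["data"]) != len(imag_var["data"]):
        raise ValueError("real and imaginary parts must have the same length")
    return [complex(x, y) for x, y in zip(real_var["data"], imag_var["data"])]


iq = combine(iq_real, iq_imag)
```

For the polar form the two variables would carry different units, which this layout permits because each variable has its own `units` attribute.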

Best wishes

Jonathan

### comment:2 Changed 3 years ago by mikedixon

• Description modified (diff)

### comment:3 Changed 3 years ago by mikedixon

• Description modified (diff)

### comment:4 Changed 21 months ago by piyush.agram

Hi,

I wanted to comment on this CF proposal from an implementer's point of view (I apologize if my perspective is HDF5 and GDAL centered).

1. Representation of complex numbers.
• Would it be possible to consider using compound data types? This is already the case with h5py when storing numpy.complex datatypes. Here is an example h5ls output for a numpy.complex64 array written using h5py:
```
f32                      Dataset {5/5, 5/5}
    Location:  1:1400
    Storage:   200 logical bytes, 200 allocated bytes, 100.00% utilization
    Type:      struct {
                   "r"                +0    native float
                   "i"                +4    native float
               } 8 bytes
```
1. Products like these are already generated by GDAL's HDF5 driver and treated as complex data. See https://github.com/OSGeo/gdal/blob/master/gdal/frmts/hdf5/hdf5dataset.cpp#L407-L448
1. In general, working with an extra complex dimension seems cumbersome, since a lot of SAR/InSAR datasets are now represented as data cubes (acquisitiondate, time, range) or (baseline, time, range) or (polarization, time, range) for stacks/tomograms. It would be a lot simpler if queries for dimensions directly returned the sizes, instead of us having to check for a dimension of size 2 and is_complex. This is a personal preference, I guess.
1. I like the idea of the is_complex attribute for disambiguation. As an implementer, I would detect a compound data type and confirm with this attribute that the data represents complex numbers.
1. I would prefer the usage of a different attribute name than "units". Very often, amplitudes are replaced by a coherence/correlation measure, which is unitless. Maybe something like "format" or "representation", which could be used as follows:
```
# For real and imaginary parts
IQ:format = "cartesian"

# For polar representation for fields like coherence/correlation
IQ:format = "amplitude,degree"

# For db representation
IQ:format = "dBm,degree"
```
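
The compound layout in the h5ls output above (field "r" at offset 0, field "i" at offset 4, 8 bytes per element) can be emulated with Python's standard `struct` module. This sketch does not call h5py or HDF5; it assumes little-endian 32-bit floats, and the helper names are hypothetical.

```python
import struct

# One element: two 32-bit floats, "r" at offset 0 and "i" at offset 4.
# Assumption: little-endian byte order ("native float" depends on the machine).
ELEMENT = struct.Struct("<ff")


def pack_complex(values: list[complex]) -> bytes:
    """Serialize complex values as consecutive (r, i) pairs."""
    return b"".join(ELEMENT.pack(v.real, v.imag) for v in values)


def unpack_complex(buf: bytes) -> list[complex]:
    """Read consecutive (r, i) pairs back into complex values."""
    return [complex(r, i) for r, i in ELEMENT.iter_unpack(buf)]


values = [complex(k, -k) for k in range(25)]   # 5 x 5 elements, flattened
buf = pack_complex(values)                     # 25 elements * 8 bytes each
restored = unpack_complex(buf)
```

The byte layout is the same interleaving that a trailing dimension of size 2 produces; the difference is that the compound type keeps the pairing in the data type rather than in the shape, so dimension queries return only the physical dimensions.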

Thanks for starting this discussion.

Piyush

Last edited 21 months ago by piyush.agram