Opened 4 years ago
Last modified 3 months ago
#102 accepted enhancement
Request for two additional cell_methods
Reported by: | rhorne@… | Owned by: | davidhassell |
---|---|---|---|
Priority: | medium | Milestone: | |
Component: | cf-conventions | Version: | |
Keywords: | cell_method | Cc: |
Description
The GOES-R ground system generates products that contain two statistics that are not defined in the current set of cell methods in the CF standard. The two statistics are:
(1) root_mean_square (2) total_sum_of_squares
This trac item requests that these two statistics be added to the set of CF defined cell_methods. The changes to the CF standard required to support this are as follows...
In CF standard paragraph 7.3. Cell Methods, change this sentence:
The values of method should be selected from the list in Appendix E, Cell Methods, which includes point, sum, mean, maximum, minimum, mid_range, standard_deviation, variance, mode, and median.
To:
The values of method should be selected from the list in Appendix E, Cell Methods, which includes point, sum, mean, maximum, minimum, mid_range, standard_deviation, variance, mode, median, root_mean_square, and total_sum_of_squares.
In CF standard Appendix E Cell Methods, add the following two rows to Table E.1 Cell Methods:
(first new row)
cell_methods column: root_mean_square
Units column: u
Description column: Root mean square (RMS)
(second new row)
cell_methods column: total_sum_of_squares
Units column: u^2
Description column: Total sum of squares (TSS)
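For illustration, a minimal sketch of the two requested statistics over the values in one cell. The ticket gives no formulas, so this assumes the conventional definitions: the RMS is the square root of the mean of the squared values (same units u as the data), and the sum of squares is the plain sum of the squared values (units of u squared). The function names and sample values are hypothetical.

```python
import math

def root_mean_square(values):
    """Square root of the mean of the squared values (units: u)."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def sum_of_squares(values):
    """Plain sum of the squared values (units: u squared). Assumed definition."""
    return sum(v * v for v in values)

cell_values = [3.0, 4.0]  # hypothetical observations falling in one cell
rms = root_mean_square(cell_values)  # sqrt((9 + 16) / 2)
ss = sum_of_squares(cell_values)     # 9 + 16
```

Both functions expect a non-empty sequence of numbers.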
Change History (14)
comment:1 Changed 4 years ago by taylor13
comment:2 Changed 4 years ago by rhorne@…
Hi Karl:
Not a stupid question. I was going to request sum_of_squares. Before I opened the ticket, I wanted to make sure the term is unambiguous, so I googled "sum of squares" and found that the term can mean numerous things (check out: http://en.wikipedia.org/wiki/Sum_of_squares).
The specific type of "sum of squares" applicable to this domain appears to be generally referred to as total sum of squares. I came to this conclusion by looking over a variety of decent statistics-oriented web sites.
However, if the consensus is that sum_of_squares is sufficient, I'm fine with it.
very respectfully,
randy
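To make the ambiguity concrete, here is a sketch contrasting two common readings of "sum of squares": the plain sum of squared values, and the sum of squared deviations from the mean (what statistics references usually call the total sum of squares). Which reading the GOES-R products use is not stated in this ticket; the function names and data below are hypothetical.

```python
def raw_sum_of_squares(values):
    """Sum of the squared values themselves."""
    return sum(v * v for v in values)

def sum_of_squared_deviations(values):
    """Sum of squared deviations from the mean of the values."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

sample = [1.0, 2.0, 3.0]
raw = raw_sum_of_squares(sample)             # 1 + 4 + 9
centred = sum_of_squared_deviations(sample)  # 1 + 0 + 1
```

The two results differ whenever the mean of the data is non-zero, which is why the name of the cell method matters.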
comment:3 Changed 4 years ago by jonathan
Dear Randy
Thanks for this proposal. I'd agree to it, with three comments:
- root_mean_square has already been agreed to be added, in ticket 61. This shows that we need a new version of the standard document soon! So that part is not needed.
- It's good that you did the research - my question was the same as Karl's. I would say that sum_of_squares would be better. It seems to me that the various kinds of sum of squares are all the same formula, just used for different purposes. It's probably clearer to name the operation in the simplest way.
- To shorten both the standards document and this proposal, we could say in Sect 7.3, "The values of method should be selected from the list in Appendix E, Cell Methods, which includes for instance point, sum and mean". There's no need to list them all - that's what the Appendix is for. Those first three are mentioned in the subsequent discussion, and have particular importance.
Best wishes
Jonathan
comment:4 Changed 4 years ago by rhorne@…
Dear Jonathan:
Thanks for pointing out trac item #61. Note that trac item #61 does not include the markups required to the CF standard. As a result, I have consolidated the cell method changes required for this trac item and trac item #61 immediately below. Hopefully, no one will take offense to this. I have also included a note in trac item #61 indicating that this trac item contains the markups.
In CF standard paragraph 7.3. Cell Methods, change this sentence:
The values of method should be selected from the list in Appendix E, Cell Methods, which includes point, sum, mean, maximum, minimum, mid_range, standard_deviation, variance, mode, and median.
To:
The values of method should be selected from the list in Appendix E, Cell Methods, which includes for instance, point, sum, and mean.
In CF standard Appendix E Cell Methods, add the following three rows to Table E.1 Cell Methods:
(first new row)
cell_methods column: root_mean_square
Units column: u
Description column: Root mean square (RMS)
(second new row)
cell_methods column: sum_of_squares
Units column: u^2
Description column: Sum of squares
(third new row)
cell_methods column: mean_of_upper_decile
Description column: Mean of the upper group of data values defined by the upper tenth of their distribution
very respectfully,
randy
comment:5 Changed 4 years ago by jonathan
Dear Randy
Thanks for doing this. It looks fine. Please could others who are happy with this simple proposal express their support for it (especially, members of the conventions committee, since a proposal requires support from two members of the committee for it to be accepted).
Cheers
Jonathan
comment:6 Changed 4 years ago by lowry
This now looks fine to me.
comment:7 Changed 4 years ago by biard
An editorial note: The sentence
The values of method should be selected from the list in Appendix E, Cell Methods, which includes for instance, point, sum, and mean.
would be less confusing if you wrote it as
The values of method should be selected from the list in Appendix E, Cell Methods, which includes (for instance) point, sum, and mean.
Or
The values of method should be selected from the list in Appendix E, Cell Methods, which includes point, sum, and mean, among others.
Or
The values of method should be selected from the list in Appendix E, Cell Methods.
comment:8 Changed 4 years ago by taylor13
This looks good, as is, but I think Jim's 2nd rewording (ending with "among others") reads best. Karl
comment:9 Changed 4 years ago by graybeal
+1, and of Jim's revised wordings, #2 is a fine choice.
comment:10 Changed 4 years ago by rhorne@…
Folks:
I am going to piggyback a related issue on this ticket ....
On GOES-R ground, some of the data we report for a given geographic region (i.e. a cell containing many observations) is the percent that a particular geophysical quantity exists.
We were planning on using the cell method "sum" because it generally conforms to the definition of sum in Appendix E (The data values are representative of a sum or accumulation over the cell. This is the default method for a quantity that is extensive with respect to the specified dimension.)
The problem with doing this is ...
Percent is not the same unit as that of the geophysical quantity. To solve this, the units column for the "sum" row in Appendix E could be modified to allow percent.
Another option may be to treat "percent" as a cell method.
Comments appreciated.
very respectfully,
randy
comment:11 Changed 4 years ago by jonathan
Dear Randy
Although the above is a reasonable point to raise, I don't think it belongs in this ticket, which is otherwise clear-cut and near being accepted. Adding something different will confuse the process, so if it's OK with you I suggest that your comment should not be read as part of this ticket. I will reply to it on the email list instead.
Thanks
Jonathan
comment:12 Changed 4 years ago by rhorne@…
Dear Jonathan:
No problem!
very respectfully,
randy
comment:13 Changed 3 years ago by jonathan
This ticket should have been clearly stated as accepted 10 months ago, since it had enough support according to the rules. Below is the final form of the change (with the amended wording). The change should be included in CF 1.7 and Randy Horne should be included in the list of additional authors of the conventions. No change is needed to the conformance document. Thanks, Randy.
Jonathan
In CF standard paragraph 7.3. Cell Methods, change this sentence:
The values of method should be selected from the list in Appendix E, Cell Methods, which includes point, sum, mean, maximum, minimum, mid_range, standard_deviation, variance, mode, and median.
To:
The values of method should be selected from the list in Appendix E, Cell Methods, which includes point, sum, and mean, among others.
In CF standard Appendix E Cell Methods, add the following three rows to Table E.1 Cell Methods:
(first new row)
cell_methods column: root_mean_square
Units column: u
Description column: Root mean square (RMS)
(second new row)
cell_methods column: sum_of_squares
Units column: u^2
Description column: Sum of squares
(third new row)
cell_methods column: mean_of_upper_decile
Description column: Mean of the upper group of data values defined by the upper tenth of their distribution
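As an illustration of the third row, a minimal sketch of mean_of_upper_decile. The description above does not say how the upper tenth is delimited, so this assumes it means the largest 10% of the values by count (rounded up, at least one value); a percentile-threshold definition would be an equally plausible reading. The function name, sample data, and the "area" dimension name are hypothetical.

```python
import math

def mean_of_upper_decile(values):
    """Mean of the largest tenth of the values (assumed: by count, rounded up)."""
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    k = max(1, math.ceil(len(ordered) / 10))  # number of values in the upper tenth
    top = ordered[-k:]
    return sum(top) / len(top)

cell_values = [float(v) for v in range(1, 21)]  # hypothetical cell with values 1..20
upper_mean = mean_of_upper_decile(cell_values)  # mean of 19 and 20

# A cell_methods attribute using the new method would take the usual "name: method" form.
cell_methods = "area: mean_of_upper_decile"
```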
comment:14 Changed 3 months ago by davidhassell
- Owner changed from cf-conventions@… to davidhassell
- Status changed from new to accepted
Hope this isn't a stupid question: How does "Total sum of squares" differ from "sum of squares"? thanks, Karl