<!--

@license Apache-2.0

Copyright (c) 2020 The Stdlib Authors.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

-->

# dsemch

> Calculate the [standard error of the mean][standard-error] of a double-precision floating-point strided array using a one-pass trial mean algorithm.

<section class="intro">

The [standard error of the mean][standard-error] of a finite sample of size `n` is given by

<!-- <equation class="equation" label="eq:standard_error_of_the_mean" align="center" raw="\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}" alt="Equation for the standard error of the mean."> -->

<div class="equation" align="center" data-raw-text="\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}" data-equation="eq:standard_error_of_the_mean">
    <img src="https://cdn.jsdelivr.net/gh/stdlib-js/stdlib@3ebbbfc49c54971356c0cf8f6282e6720cb07755/lib/node_modules/@stdlib/stats/base/dsemch/docs/img/equation_standard_error_of_the_mean.svg" alt="Equation for the standard error of the mean.">
    <br>
</div>

<!-- </equation> -->

where `σ` is the population [standard deviation][standard-deviation].

Often in the analysis of data, the true population [standard deviation][standard-deviation] is not known _a priori_ and must be estimated from a sample drawn from the population distribution. In this scenario, one must use a sample [standard deviation][standard-deviation] to compute an estimate for the [standard error of the mean][standard-error]

<!-- <equation class="equation" label="eq:standard_error_of_the_mean_estimate" align="center" raw="\sigma_{\bar{x}} \approx \frac{s}{\sqrt{n}}" alt="Equation for estimating the standard error of the mean."> -->

<div class="equation" align="center" data-raw-text="\sigma_{\bar{x}} \approx \frac{s}{\sqrt{n}}" data-equation="eq:standard_error_of_the_mean_estimate">
    <img src="https://cdn.jsdelivr.net/gh/stdlib-js/stdlib@3ebbbfc49c54971356c0cf8f6282e6720cb07755/lib/node_modules/@stdlib/stats/base/dsemch/docs/img/equation_standard_error_of_the_mean_estimate.svg" alt="Equation for estimating the standard error of the mean.">
    <br>
</div>

<!-- </equation> -->

where `s` is the sample [standard deviation][standard-deviation].
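
As a rough worked illustration of the two formulas above, the following plain JavaScript sketch evaluates both the population-based value and the sample-based estimate directly from their definitions. The helper name `standardErrors` is hypothetical and is not part of this package, and the simple two-pass approach shown here is not the one-pass trial mean algorithm which `dsemch` uses.

```javascript
// Hypothetical helper (not part of this package): evaluates the standard error of the mean directly from the definitions using two passes over the data.
function standardErrors( x ) {
    var n = x.length;
    var mean = 0.0;
    var ss = 0.0;
    var d;
    var i;

    // First pass: compute the arithmetic mean...
    for ( i = 0; i < n; i++ ) {
        mean += x[ i ];
    }
    mean /= n;

    // Second pass: accumulate the squared deviations from the mean...
    for ( i = 0; i < n; i++ ) {
        d = x[ i ] - mean;
        ss += d * d;
    }
    return {
        // Data treated as an entire population: sigma / sqrt(n)
        'population': Math.sqrt( ss / n ) / Math.sqrt( n ),

        // Data treated as a sample: s / sqrt(n), where `s` uses the divisor `n-1`
        'sample': Math.sqrt( ss / ( n - 1 ) ) / Math.sqrt( n )
    };
}

var out = standardErrors( [ 1.0, -2.0, 2.0 ] );

console.log( out.population );
// => ~0.98131

console.log( out.sample );
// => ~1.20185
```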

</section>

<!-- /.intro -->

<section class="usage">

## Usage

```javascript
var dsemch = require( '@stdlib/stats/base/dsemch' );
```

#### dsemch( N, correction, x, stride )

Computes the [standard error of the mean][standard-error] of a double-precision floating-point strided array `x` using a one-pass trial mean algorithm.

```javascript
var Float64Array = require( '@stdlib/array/float64' );

var x = new Float64Array( [ 1.0, -2.0, 2.0 ] );
var N = x.length;

var v = dsemch( N, 1, x, 1 );
// returns ~1.20185
```

The function has the following parameters:

- **N**: number of indexed elements.
- **correction**: degrees of freedom adjustment. Setting this parameter to a value other than `0` has the effect of adjusting the divisor during the calculation of the [standard deviation][standard-deviation] according to `N-c` where `c` corresponds to the provided degrees of freedom adjustment. When computing the [standard deviation][standard-deviation] of a population, setting this parameter to `0` is the standard choice (i.e., the provided array contains data constituting an entire population). When computing the corrected sample [standard deviation][standard-deviation], setting this parameter to `1` is the standard choice (i.e., the provided array contains data sampled from a larger population; this is commonly referred to as Bessel's correction).
- **x**: input [`Float64Array`][@stdlib/array/float64].
- **stride**: index increment for `x`.

The `N` and `stride` parameters determine which elements in `x` are accessed at runtime. For example, to compute the [standard error of the mean][standard-error] of every other element in `x`,

```javascript
var Float64Array = require( '@stdlib/array/float64' );
var floor = require( '@stdlib/math/base/special/floor' );

var x = new Float64Array( [ 1.0, 2.0, 2.0, -7.0, -2.0, 3.0, 4.0, 2.0 ] );
var N = floor( x.length / 2 );

var v = dsemch( N, 1, x, 2 );
// returns 1.25
```

Note that indexing is relative to the first index. To introduce an offset, use [`typed array`][mdn-typed-array] views.

<!-- eslint-disable stdlib/capitalized-comments -->

```javascript
var Float64Array = require( '@stdlib/array/float64' );
var floor = require( '@stdlib/math/base/special/floor' );

var x0 = new Float64Array( [ 2.0, 1.0, 2.0, -2.0, -2.0, 2.0, 3.0, 4.0 ] );
var x1 = new Float64Array( x0.buffer, x0.BYTES_PER_ELEMENT*1 ); // start at 2nd element

var N = floor( x0.length / 2 );

var v = dsemch( N, 1, x1, 2 );
// returns 1.25
```
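
A small sketch of why the view approach works: a typed array view created with a byte offset shares memory with the original array, so index `0` of the view refers to the second element of `x0`. To stay dependency-free, this snippet assumes the built-in `Float64Array` global rather than `@stdlib/array/float64`; the variable names mirror the example above, and `accessed` is introduced only for illustration.

```javascript
var x0 = new Float64Array( [ 2.0, 1.0, 2.0, -2.0, -2.0, 2.0, 3.0, 4.0 ] );

// Create a view over the same buffer which skips the first element:
var x1 = new Float64Array( x0.buffer, x0.BYTES_PER_ELEMENT*1 );

console.log( x1.length );
// => 7

console.log( x1[ 0 ] === x0[ 1 ] );
// => true

// With `N = 4` and `stride = 2`, the accessed view elements are `x1[0]`, `x1[2]`, `x1[4]`, and `x1[6]`:
var accessed = [ x1[ 0 ], x1[ 2 ], x1[ 4 ], x1[ 6 ] ];
console.log( accessed );
// => [ 1, -2, 2, 4 ]

// The view shares memory with `x0`, so a write through the view is visible in the original array:
x1[ 0 ] = 5.0;
console.log( x0[ 1 ] );
// => 5
```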

#### dsemch.ndarray( N, correction, x, stride, offset )

Computes the [standard error of the mean][standard-error] of a double-precision floating-point strided array using a one-pass trial mean algorithm and alternative indexing semantics.

```javascript
var Float64Array = require( '@stdlib/array/float64' );

var x = new Float64Array( [ 1.0, -2.0, 2.0 ] );
var N = x.length;

var v = dsemch.ndarray( N, 1, x, 1, 0 );
// returns ~1.20185
```

The function has the following additional parameters:

- **offset**: starting index for `x`.

While [`typed array`][mdn-typed-array] views mandate a view offset based on the underlying `buffer`, the `offset` parameter supports indexing semantics based on a starting index. For example, to calculate the [standard error of the mean][standard-error] for every other value in `x` starting from the second value

```javascript
var Float64Array = require( '@stdlib/array/float64' );
var floor = require( '@stdlib/math/base/special/floor' );

var x = new Float64Array( [ 2.0, 1.0, 2.0, -2.0, -2.0, 2.0, 3.0, 4.0 ] );
var N = floor( x.length / 2 );

var v = dsemch.ndarray( N, 1, x, 2, 1 );
// returns 1.25
```
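
The relationship between `N`, `stride`, and `offset` can be sketched as follows. The helper `accessedIndices` is hypothetical (it is not exported by this package) and simply lists the indices which a strided traversal would visit, assuming that the `i`-th accessed index is `offset + i*stride`.

```javascript
// Hypothetical helper: lists the indices visited by a strided traversal.
function accessedIndices( N, stride, offset ) {
    var out = [];
    var ix = offset;
    var i;
    for ( i = 0; i < N; i++ ) {
        out.push( ix );
        ix += stride;
    }
    return out;
}

var x = [ 2.0, 1.0, 2.0, -2.0, -2.0, 2.0, 3.0, 4.0 ];

// Every other element, starting from the second element:
var idx = accessedIndices( 4, 2, 1 );
console.log( idx );
// => [ 1, 3, 5, 7 ]

var values = idx.map( function get( i ) {
    return x[ i ];
});
console.log( values );
// => [ 1, -2, 2, 4 ]
```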

</section>

<!-- /.usage -->

<section class="notes">

## Notes

- If `N <= 0`, both functions return `NaN`.
- If `N - c` is less than or equal to `0` (where `c` corresponds to the provided degrees of freedom adjustment), both functions return `NaN`.
- The underlying algorithm is a specialized case of Neely's two-pass algorithm. As the standard deviation is invariant with respect to changes in the location parameter, the underlying algorithm uses the first strided array element as a trial mean to shift subsequent data values and thus mitigate catastrophic cancellation. Accordingly, the algorithm's accuracy is best when data is **unordered** (i.e., the data is **not** sorted in either ascending or descending order such that the first value is an "extreme" value).
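
To make the last note more concrete, here is a minimal sketch of a one-pass trial mean (shifted data) computation in plain JavaScript. It is an illustration under stated assumptions and not the package's actual source: the function name `semTrialMean` is hypothetical, and the early return simply mirrors the `NaN` behavior listed above.

```javascript
// Hypothetical illustration of a one-pass trial mean computation (not the package's implementation).
function semTrialMean( N, correction, x, stride, offset ) {
    var n = N - correction;
    var mu;
    var M2;
    var ix;
    var M;
    var d;
    var i;

    // Mirror the documented edge cases:
    if ( N <= 0 || n <= 0.0 ) {
        return NaN;
    }
    // Use the first indexed element as the trial mean:
    ix = offset;
    mu = x[ ix ];

    M = 0.0; // sum of shifted values
    M2 = 0.0; // sum of squared shifted values
    for ( i = 0; i < N; i++ ) {
        d = x[ ix ] - mu;
        M += d;
        M2 += d * d;
        ix += stride;
    }
    // Standard deviation of the shifted data, divided by the square root of `N`:
    return Math.sqrt( ( M2 - ( (M*M)/N ) ) / n ) / Math.sqrt( N );
}

var v = semTrialMean( 3, 1, [ 1.0, -2.0, 2.0 ], 1, 0 );
console.log( v );
// => ~1.20185

v = semTrialMean( 4, 1, [ 2.0, 1.0, 2.0, -2.0, -2.0, 2.0, 3.0, 4.0 ], 2, 1 );
console.log( v );
// => 1.25

v = semTrialMean( 1, 1, [ 3.0 ], 1, 0 );
console.log( v );
// => NaN
```

Subtracting a trial mean before accumulating keeps both running sums small when the data shares a large common offset, which is why the final subtraction loses fewer significant digits than it would on unshifted data.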

</section>

<!-- /.notes -->

<section class="examples">

## Examples

<!-- eslint no-undef: "error" -->

```javascript
var randu = require( '@stdlib/random/base/randu' );
var round = require( '@stdlib/math/base/special/round' );
var Float64Array = require( '@stdlib/array/float64' );
var dsemch = require( '@stdlib/stats/base/dsemch' );

var x;
var i;

x = new Float64Array( 10 );
for ( i = 0; i < x.length; i++ ) {
    x[ i ] = round( (randu()*100.0) - 50.0 );
}
console.log( x );

var v = dsemch( x.length, 1, x, 1 );
console.log( v );
```

</section>

<!-- /.examples -->

* * *

<section class="references">

## References

- Neely, Peter M. 1966. "Comparison of Several Algorithms for Computation of Means, Standard Deviations and Correlation Coefficients." _Communications of the ACM_ 9 (7). Association for Computing Machinery: 496–99. doi:[10.1145/365719.365958][@neely:1966a].
- Ling, Robert F. 1974. "Comparison of Several Algorithms for Computing Sample Means and Variances." _Journal of the American Statistical Association_ 69 (348). American Statistical Association, Taylor & Francis, Ltd.: 859–66. doi:[10.2307/2286154][@ling:1974a].
- Chan, Tony F., Gene H. Golub, and Randall J. LeVeque. 1983. "Algorithms for Computing the Sample Variance: Analysis and Recommendations." _The American Statistician_ 37 (3). American Statistical Association, Taylor & Francis, Ltd.: 242–47. doi:[10.1080/00031305.1983.10483115][@chan:1983a].
- Schubert, Erich, and Michael Gertz. 2018. "Numerically Stable Parallel Computation of (Co-)Variance." In _Proceedings of the 30th International Conference on Scientific and Statistical Database Management_. New York, NY, USA: Association for Computing Machinery. doi:[10.1145/3221269.3223036][@schubert:2018a].

</section>

<!-- /.references -->

<section class="links">

[standard-error]: https://en.wikipedia.org/wiki/Standard_error

[standard-deviation]: https://en.wikipedia.org/wiki/Standard_deviation

[@stdlib/array/float64]: https://www.npmjs.com/package/@stdlib/array-float64

[mdn-typed-array]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/TypedArray

[@neely:1966a]: https://doi.org/10.1145/365719.365958

[@ling:1974a]: https://doi.org/10.2307/2286154

[@chan:1983a]: https://doi.org/10.1080/00031305.1983.10483115

[@schubert:2018a]: https://doi.org/10.1145/3221269.3223036

</section>

<!-- /.links -->