API
Top level user functions:
all (a[, axis, keepdims, split_every, out]) |
Test whether all array elements along a given axis evaluate to True. |
allclose (arr1, arr2[, rtol, atol, equal_nan]) |
Returns True if two arrays are element-wise equal within a tolerance. |
angle (x[, deg]) |
Return the angle of the complex argument. |
any (a[, axis, keepdims, split_every, out]) |
Test whether any array element along a given axis evaluates to True. |
apply_along_axis (func1d, axis, arr, *args[, …]) |
Apply a function to 1-D slices along the given axis. |
apply_over_axes (func, a, axes) |
Apply a function repeatedly over multiple axes. |
arange (*args, **kwargs) |
Return evenly spaced values from start to stop with step size step. |
arccos (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.arccos. |
arccosh (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.arccosh. |
arcsin (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.arcsin. |
arcsinh (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.arcsinh. |
arctan (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.arctan. |
arctan2 (x1, x2, /[, out, where, casting, …]) |
This docstring was copied from numpy.arctan2. |
arctanh (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.arctanh. |
argmax (x[, axis, split_every, out]) |
Return the indices of the maximum values along an axis. |
argmin (x[, axis, split_every, out]) |
Return the indices of the minimum values along an axis. |
argtopk (a, k[, axis, split_every]) |
Extract the indices of the k largest elements from a on the given axis, and return them sorted from largest to smallest. |
argwhere (a) |
Find the indices of array elements that are non-zero, grouped by element. |
around (x[, decimals]) |
Evenly round to the given number of decimals. |
array (object[, dtype, copy, order, subok, ndmin]) |
This docstring was copied from numpy.array. |
asanyarray (a) |
Convert the input to a dask array. |
asarray (a, **kwargs) |
Convert the input to a dask array. |
atleast_1d (*arys) |
Convert inputs to arrays with at least one dimension. |
atleast_2d (*arys) |
View inputs as arrays with at least two dimensions. |
atleast_3d (*arys) |
View inputs as arrays with at least three dimensions. |
average (a[, axis, weights, returned]) |
Compute the weighted average along the specified axis. |
bincount (x[, weights, minlength]) |
This docstring was copied from numpy.bincount. |
bitwise_and (x1, x2, /[, out, where, …]) |
This docstring was copied from numpy.bitwise_and. |
bitwise_not (x, /[, out, where, casting, …]) |
This docstring was copied from numpy.invert. |
bitwise_or (x1, x2, /[, out, where, casting, …]) |
This docstring was copied from numpy.bitwise_or. |
bitwise_xor (x1, x2, /[, out, where, …]) |
This docstring was copied from numpy.bitwise_xor. |
block (arrays[, allow_unknown_chunksizes]) |
Assemble an nd-array from nested lists of blocks. |
blockwise (func, out_ind, *args[, name, …]) |
Tensor operation: Generalized inner and outer products |
broadcast_arrays (*args, **kwargs) |
Broadcast any number of arrays against each other. |
broadcast_to (x, shape[, chunks]) |
Broadcast an array to a new shape. |
coarsen (reduction, x, axes[, trim_excess]) |
Coarsen array by applying reduction to fixed size neighborhoods |
ceil (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.ceil. |
choose (a, choices) |
Construct an array from an index array and a set of arrays to choose from. |
clip (*args, **kwargs) |
Clip (limit) the values in an array. |
compress (condition, a[, axis]) |
Return selected slices of an array along given axis. |
concatenate (seq[, axis, …]) |
Concatenate arrays along an existing axis |
conj (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.conjugate. |
copysign (x1, x2, /[, out, where, casting, …]) |
This docstring was copied from numpy.copysign. |
corrcoef (x[, y, rowvar]) |
Return Pearson product-moment correlation coefficients. |
cos (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.cos. |
cosh (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.cosh. |
count_nonzero (a[, axis]) |
Counts the number of non-zero values in the array a. |
cov (m[, y, rowvar, bias, ddof]) |
Estimate a covariance matrix, given data and weights. |
cumprod (x[, axis, dtype, out]) |
Return the cumulative product of elements along a given axis. |
cumsum (x[, axis, dtype, out]) |
Return the cumulative sum of the elements along a given axis. |
deg2rad (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.deg2rad. |
degrees (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.degrees. |
diag (v) |
Extract a diagonal or construct a diagonal array. |
diagonal (a[, offset, axis1, axis2]) |
Return specified diagonals. |
diff (a[, n, axis]) |
Calculate the n-th discrete difference along the given axis. |
divmod (x1, x2[, out1, out2], / [[, out, …]) |
This docstring was copied from numpy.divmod. |
digitize (a, bins[, right]) |
Return the indices of the bins to which each value in input array belongs. |
dot (a, b[, out]) |
This docstring was copied from numpy.dot. |
dstack (tup[, allow_unknown_chunksizes]) |
Stack arrays in sequence depth wise (along third axis). |
ediff1d (ary[, to_end, to_begin]) |
The differences between consecutive elements of an array. |
einsum (subscripts, *operands[, out, dtype, …]) |
This docstring was copied from numpy.einsum. |
empty (*args, **kwargs) |
Blocked variant of empty |
empty_like (a[, dtype, chunks]) |
Return a new array with the same shape and type as a given array. |
exp (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.exp. |
expm1 (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.expm1. |
eye (N[, chunks, M, k, dtype]) |
Return a 2-D Array with ones on the diagonal and zeros elsewhere. |
fabs (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.fabs. |
fix (*args, **kwargs) |
Round to nearest integer towards zero. |
flatnonzero (a) |
Return indices that are non-zero in the flattened version of a. |
flip (m, axis) |
Reverse element order along axis. |
flipud (m) |
Flip array in the up/down direction. |
fliplr (m) |
Flip array in the left/right direction. |
floor (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.floor. |
fmax (x1, x2, /[, out, where, casting, …]) |
This docstring was copied from numpy.fmax. |
fmin (x1, x2, /[, out, where, casting, …]) |
This docstring was copied from numpy.fmin. |
fmod (x1, x2, /[, out, where, casting, …]) |
This docstring was copied from numpy.fmod. |
frexp (x[, out1, out2], / [[, out, where, …]) |
This docstring was copied from numpy.frexp. |
fromfunction (func[, chunks, shape, dtype]) |
Construct an array by executing a function over each coordinate. |
frompyfunc (func, nin, nout) |
This docstring was copied from numpy.frompyfunc. |
full (*args, **kwargs) |
Blocked variant of full |
full_like (a, fill_value[, dtype, chunks]) |
Return a full array with the same shape and type as a given array. |
gradient (f, *varargs, **kwargs) |
Return the gradient of an N-dimensional array. |
histogram (a[, bins, range, normed, weights, …]) |
Blocked variant of numpy.histogram(). |
hstack (tup[, allow_unknown_chunksizes]) |
Stack arrays in sequence horizontally (column wise). |
hypot (x1, x2, /[, out, where, casting, …]) |
This docstring was copied from numpy.hypot. |
imag (*args, **kwargs) |
Return the imaginary part of the complex argument. |
indices (dimensions[, dtype, chunks]) |
Implements NumPy’s indices for Dask Arrays. |
insert (arr, obj, values, axis) |
Insert values along the given axis before the given indices. |
invert (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.invert. |
isclose (arr1, arr2[, rtol, atol, equal_nan]) |
Returns a boolean array where two arrays are element-wise equal within a tolerance. |
iscomplex (*args, **kwargs) |
Returns a bool array, where True if input element is complex. |
isfinite (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.isfinite. |
isin (element, test_elements[, …]) |
Calculates element in test_elements, broadcasting over element only. |
isinf (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.isinf. |
isneginf |
Test element-wise for negative infinity. |
isnan (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.isnan. |
isnull (values) |
pandas.isnull for dask arrays |
isposinf |
Test element-wise for positive infinity. |
isreal (*args, **kwargs) |
Returns a bool array, where True if input element is real. |
ldexp (x1, x2, /[, out, where, casting, …]) |
This docstring was copied from numpy.ldexp. |
linspace (start, stop[, num, endpoint, …]) |
Return num evenly spaced values over the closed interval [start, stop]. |
log (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.log. |
log10 (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.log10. |
log1p (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.log1p. |
log2 (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.log2. |
logaddexp (x1, x2, /[, out, where, casting, …]) |
This docstring was copied from numpy.logaddexp. |
logaddexp2 (x1, x2, /[, out, where, casting, …]) |
This docstring was copied from numpy.logaddexp2. |
logical_and (x1, x2, /[, out, where, …]) |
This docstring was copied from numpy.logical_and. |
logical_not (x, /[, out, where, casting, …]) |
This docstring was copied from numpy.logical_not. |
logical_or (x1, x2, /[, out, where, casting, …]) |
This docstring was copied from numpy.logical_or. |
logical_xor (x1, x2, /[, out, where, …]) |
This docstring was copied from numpy.logical_xor. |
map_overlap (x, func, depth[, boundary, trim]) |
Map a function over blocks of the array with some overlap |
map_blocks (func, *args[, name, token, …]) |
Map a function across all blocks of a dask array. |
matmul (x1, x2, /[, out, casting, order, …]) |
This docstring was copied from numpy.matmul. |
max (a[, axis, keepdims, split_every, out]) |
Return the maximum of an array or maximum along an axis. |
maximum (x1, x2, /[, out, where, casting, …]) |
This docstring was copied from numpy.maximum. |
mean (a[, axis, dtype, keepdims, …]) |
Compute the arithmetic mean along the specified axis. |
median (a[, axis, keepdims, out]) |
Compute the median along the specified axis. |
meshgrid (*xi, **kwargs) |
Return coordinate matrices from coordinate vectors. |
min (a[, axis, keepdims, split_every, out]) |
Return the minimum of an array or minimum along an axis. |
minimum (x1, x2, /[, out, where, casting, …]) |
This docstring was copied from numpy.minimum. |
modf (x[, out1, out2], / [[, out, where, …]) |
This docstring was copied from numpy.modf. |
moment (a, order[, axis, dtype, keepdims, …]) |
|
moveaxis (a, source, destination) |
Move axes of an array to new positions. |
nanargmax (x[, axis, split_every, out]) |
Return the indices of the maximum values along an axis, ignoring any NaNs. |
nanargmin (x[, axis, split_every, out]) |
Return the indices of the minimum values along an axis, ignoring any NaNs. |
nancumprod (x, axis[, dtype, out]) |
Return the cumulative product of array elements over a given axis treating Not a Numbers (NaNs) as one. |
nancumsum (x, axis[, dtype, out]) |
Return the cumulative sum of array elements over a given axis treating Not a Numbers (NaNs) as zero. |
nanmax (a[, axis, keepdims, split_every, out]) |
Return the maximum of an array or maximum along an axis, ignoring any NaNs. |
nanmean (a[, axis, dtype, keepdims, …]) |
Compute the arithmetic mean along the specified axis, ignoring NaNs. |
nanmedian (a[, axis, keepdims, out]) |
Compute the median along the specified axis, while ignoring NaNs. |
nanmin (a[, axis, keepdims, split_every, out]) |
Return minimum of an array or minimum along an axis, ignoring any NaNs. |
nanprod (a[, axis, dtype, keepdims, …]) |
Return the product of array elements over a given axis treating Not a Numbers (NaNs) as ones. |
nanstd (a[, axis, dtype, keepdims, ddof, …]) |
Compute the standard deviation along the specified axis, while ignoring NaNs. |
nansum (a[, axis, dtype, keepdims, …]) |
Return the sum of array elements over a given axis treating Not a Numbers (NaNs) as zero. |
nanvar (a[, axis, dtype, keepdims, ddof, …]) |
Compute the variance along the specified axis, while ignoring NaNs. |
nan_to_num (*args, **kwargs) |
Replace NaN with zero and infinity with large finite numbers (default behaviour) or with the numbers defined by the user using the nan, posinf and/or neginf keywords. |
nextafter (x1, x2, /[, out, where, casting, …]) |
This docstring was copied from numpy.nextafter. |
nonzero (a) |
Return the indices of the elements that are non-zero. |
notnull (values) |
pandas.notnull for dask arrays |
ones (*args, **kwargs) |
Blocked variant of ones |
ones_like (a[, dtype, chunks]) |
Return an array of ones with the same shape and type as a given array. |
outer (a, b) |
Compute the outer product of two vectors. |
pad (array, pad_width, mode, **kwargs) |
Pad an array. |
percentile (a, q[, interpolation, method]) |
Approximate percentile of 1-D array |
PerformanceWarning |
A warning given when bad chunking may cause poor performance |
piecewise (x, condlist, funclist, *args, **kw) |
Evaluate a piecewise-defined function. |
prod (a[, axis, dtype, keepdims, …]) |
Return the product of array elements over a given axis. |
ptp (a[, axis]) |
Range of values (maximum - minimum) along an axis. |
rad2deg (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.rad2deg. |
radians (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.radians. |
ravel (array) |
Return a contiguous flattened array. |
real (*args, **kwargs) |
Return the real part of the complex argument. |
rechunk (x[, chunks, threshold, block_size_limit]) |
Convert blocks in dask array x for new chunks. |
reduction (x, chunk, aggregate[, axis, …]) |
General version of reductions |
repeat (a, repeats[, axis]) |
Repeat elements of an array. |
reshape (x, shape) |
Reshape array to new shape |
result_type (*arrays_and_dtypes) |
This docstring was copied from numpy.result_type. |
rint (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.rint. |
roll (array, shift[, axis]) |
Roll array elements along a given axis. |
rollaxis (a, axis[, start]) |
|
round (a[, decimals]) |
Round an array to the given number of decimals. |
sign (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.sign. |
signbit (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.signbit. |
sin (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.sin. |
sinh (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.sinh. |
sqrt (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.sqrt. |
square (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.square. |
squeeze (a[, axis]) |
Remove single-dimensional entries from the shape of an array. |
stack (seq[, axis]) |
Stack arrays along a new axis |
std (a[, axis, dtype, keepdims, ddof, …]) |
Compute the standard deviation along the specified axis. |
sum (a[, axis, dtype, keepdims, split_every, out]) |
Sum of array elements over a given axis. |
take (a, indices[, axis]) |
Take elements from an array along an axis. |
tan (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.tan. |
tanh (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.tanh. |
tensordot (lhs, rhs[, axes]) |
Compute tensor dot product along specified axes. |
tile (A, reps) |
Construct an array by repeating A the number of times given by reps. |
topk (a, k[, axis, split_every]) |
Extract the k largest elements from a on the given axis, and return them sorted from largest to smallest. |
trace (a[, offset, axis1, axis2, dtype]) |
Return the sum along diagonals of the array. |
transpose (a[, axes]) |
Permute the dimensions of an array. |
tril (m[, k]) |
Lower triangle of an array with elements above the k-th diagonal zeroed. |
triu (m[, k]) |
Upper triangle of an array with elements below the k-th diagonal zeroed. |
trunc (x, /[, out, where, casting, order, …]) |
This docstring was copied from numpy.trunc. |
unify_chunks (*args, **kwargs) |
Unify chunks across a sequence of arrays |
unique (ar[, return_index, return_inverse, …]) |
Find the unique elements of an array. |
unravel_index (indices, shape[, order]) |
This docstring was copied from numpy.unravel_index. |
var (a[, axis, dtype, keepdims, ddof, …]) |
Compute the variance along the specified axis. |
vdot (a, b) |
This docstring was copied from numpy.vdot. |
vstack (tup[, allow_unknown_chunksizes]) |
Stack arrays in sequence vertically (row wise). |
where (condition, [x, y]) |
This docstring was copied from numpy.where. |
zeros (*args, **kwargs) |
Blocked variant of zeros |
zeros_like (a[, dtype, chunks]) |
Return an array of zeros with the same shape and type as a given array. |
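As a rough illustration of how a few of the top-level functions listed above fit together, the sketch below builds a small chunked array and calls `topk`, `argtopk` and `sum` on it. The input values and the chunk size of 2 are arbitrary choices made for this example.

```python
import numpy as np
import dask.array as da

# Six values split into three chunks of two elements each (illustrative data).
x = da.from_array(np.array([5, 1, 8, 4, 10, 3]), chunks=2)

# Operations build a lazy task graph; compute() evaluates it.
largest = da.topk(x, 2).compute()       # the two largest values, largest first
positions = da.argtopk(x, 2).compute()  # the indices of those values in x
total = x.sum().compute()               # a plain reduction over all chunks

print(largest, positions, total)
```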
Fast Fourier Transforms
fft.fft_wrap (fft_func[, kind, dtype]) |
Wrap 1D, 2D, and ND real and complex FFT functions |
fft.fft (a[, n, axis]) |
Wrapping of numpy.fft.fft |
fft.fft2 (a[, s, axes]) |
Wrapping of numpy.fft.fft2 |
fft.fftn (a[, s, axes]) |
Wrapping of numpy.fft.fftn |
fft.ifft (a[, n, axis]) |
Wrapping of numpy.fft.ifft |
fft.ifft2 (a[, s, axes]) |
Wrapping of numpy.fft.ifft2 |
fft.ifftn (a[, s, axes]) |
Wrapping of numpy.fft.ifftn |
fft.rfft (a[, n, axis]) |
Wrapping of numpy.fft.rfft |
fft.rfft2 (a[, s, axes]) |
Wrapping of numpy.fft.rfft2 |
fft.rfftn (a[, s, axes]) |
Wrapping of numpy.fft.rfftn |
fft.irfft (a[, n, axis]) |
Wrapping of numpy.fft.irfft |
fft.irfft2 (a[, s, axes]) |
Wrapping of numpy.fft.irfft2 |
fft.irfftn (a[, s, axes]) |
Wrapping of numpy.fft.irfftn |
fft.hfft (a[, n, axis]) |
Wrapping of numpy.fft.hfft |
fft.ihfft (a[, n, axis]) |
Wrapping of numpy.fft.ihfft |
fft.fftfreq (n[, d, chunks]) |
Return the Discrete Fourier Transform sample frequencies. |
fft.rfftfreq (n[, d, chunks]) |
Return the Discrete Fourier Transform sample frequencies (for usage with rfft, irfft). |
fft.fftshift (x[, axes]) |
Shift the zero-frequency component to the center of the spectrum. |
fft.ifftshift (x[, axes]) |
The inverse of fftshift. |
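The following sketch shows one way the FFT wrappers might be used. It assumes the transformed axis is held in a single chunk, which dask requires for FFTs; the sine-wave input and the row-wise chunking are made up for this example.

```python
import numpy as np
import dask.array as da

# Illustrative input: two rows holding scaled copies of the same sine wave.
signal = np.sin(2 * np.pi * np.arange(16) / 8.0)
data = np.stack([signal, 2 * signal])

# Chunk by row so that the last axis (the one being transformed)
# stays inside a single chunk.
x = da.from_array(data, chunks=(1, 16))

spectrum = da.fft.fft(x, axis=-1).compute()
expected = np.fft.fft(data, axis=-1)

print(np.allclose(spectrum, expected))
```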
Linear Algebra
linalg.cholesky (a[, lower]) |
Returns the Cholesky decomposition, \(A = L L^*\) or \(A = U^* U\) of a Hermitian positive-definite matrix A. |
linalg.inv (a) |
Compute the inverse of a matrix with LU decomposition and forward / backward substitutions. |
linalg.lstsq (a, b) |
Return the least-squares solution to a linear matrix equation using QR decomposition. |
linalg.lu (a) |
Compute the lu decomposition of a matrix. |
linalg.norm (x[, ord, axis, keepdims]) |
Matrix or vector norm. |
linalg.qr (a) |
Compute the qr factorization of a matrix. |
linalg.solve (a, b[, sym_pos]) |
Solve the equation a x = b for x. |
linalg.solve_triangular (a, b[, lower]) |
Solve the equation a x = b for x, assuming a is a triangular matrix. |
linalg.svd (a) |
Compute the singular value decomposition of a matrix. |
linalg.svd_compressed (a, k[, n_power_iter, …]) |
Randomly compressed rank-k thin Singular Value Decomposition. |
linalg.sfqr (data[, name]) |
Direct Short-and-Fat QR |
linalg.tsqr (data[, compute_svd, …]) |
Direct Tall-and-Skinny QR algorithm |
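A minimal sketch of `linalg.qr` on a tall-and-skinny matrix follows. The random input, its shape and the chunking (several row blocks, one column block) are assumptions chosen for this example.

```python
import numpy as np
import dask.array as da

# Illustrative tall-and-skinny input with a fixed seed.
rng = np.random.default_rng(0)
a_np = rng.standard_normal((100, 4))

# Several row blocks, a single column block.
a = da.from_array(a_np, chunks=(25, 4))

q, r = da.linalg.qr(a)
reconstructed = (q @ r).compute()
r_np = r.compute()

print(np.allclose(reconstructed, a_np))
```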
Masked Arrays
ma.average (a[, axis, weights, returned]) |
Return the weighted average of array over the given axis. |
ma.filled (a[, fill_value]) |
Return input as an array with masked data replaced by a fill value. |
ma.fix_invalid (a[, fill_value]) |
Return input with invalid data masked and replaced by a fill value. |
ma.getdata (a) |
Return the data of a masked array as an ndarray. |
ma.getmaskarray (a) |
Return the mask of a masked array, or full boolean array of False. |
ma.masked_array (data[, mask, fill_value]) |
An array class with possibly masked values. |
ma.masked_equal (a, value) |
Mask an array where equal to a given value. |
ma.masked_greater (x, value[, copy]) |
Mask an array where greater than a given value. |
ma.masked_greater_equal (x, value[, copy]) |
Mask an array where greater than or equal to a given value. |
ma.masked_inside (x, v1, v2) |
Mask an array inside a given interval. |
ma.masked_invalid (a) |
Mask an array where invalid values occur (NaNs or infs). |
ma.masked_less (x, value[, copy]) |
Mask an array where less than a given value. |
ma.masked_less_equal (x, value[, copy]) |
Mask an array where less than or equal to a given value. |
ma.masked_not_equal (x, value[, copy]) |
Mask an array where not equal to a given value. |
ma.masked_outside (x, v1, v2) |
Mask an array outside a given interval. |
ma.masked_values (x, value[, rtol, atol, shrink]) |
Mask using floating point equality. |
ma.masked_where (condition, a) |
Mask an array where a condition is met. |
ma.set_fill_value (a, fill_value) |
Set the filling value of a, if a is a masked array. |
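The masked-array helpers can be combined roughly as sketched below. The sentinel value -1 and the fill value 0 are arbitrary choices for this example.

```python
import numpy as np
import dask.array as da

# Illustrative data in which -1 marks a missing measurement.
raw = da.from_array(np.array([1, -1, 3, -1, 5]), chunks=2)

masked = da.ma.masked_equal(raw, -1)         # mask every -1
mask = da.ma.getmaskarray(masked).compute()  # full boolean mask
filled = da.ma.filled(masked, 0).compute()   # replace masked entries with 0

print(mask, filled)
```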
Random
random.beta (a, b[, size]) |
This docstring was copied from numpy.random.mtrand.RandomState.beta. |
random.binomial (n, p[, size]) |
This docstring was copied from numpy.random.mtrand.RandomState.binomial. |
random.chisquare (df[, size]) |
This docstring was copied from numpy.random.mtrand.RandomState.chisquare. |
random.choice (a[, size, replace, p]) |
This docstring was copied from numpy.random.mtrand.RandomState.choice. |
random.exponential ([scale, size]) |
This docstring was copied from numpy.random.mtrand.RandomState.exponential. |
random.f (dfnum, dfden[, size]) |
This docstring was copied from numpy.random.mtrand.RandomState.f. |
random.gamma (shape[, scale, size]) |
This docstring was copied from numpy.random.mtrand.RandomState.gamma. |
random.geometric (p[, size]) |
This docstring was copied from numpy.random.mtrand.RandomState.geometric. |
random.gumbel ([loc, scale, size]) |
This docstring was copied from numpy.random.mtrand.RandomState.gumbel. |
random.hypergeometric (ngood, nbad, nsample) |
This docstring was copied from numpy.random.mtrand.RandomState.hypergeometric. |
random.laplace ([loc, scale, size]) |
This docstring was copied from numpy.random.mtrand.RandomState.laplace. |
random.logistic ([loc, scale, size]) |
This docstring was copied from numpy.random.mtrand.RandomState.logistic. |
random.lognormal ([mean, sigma, size]) |
This docstring was copied from numpy.random.mtrand.RandomState.lognormal. |
random.logseries (p[, size]) |
This docstring was copied from numpy.random.mtrand.RandomState.logseries. |
random.negative_binomial (n, p[, size]) |
This docstring was copied from numpy.random.mtrand.RandomState.negative_binomial. |
random.noncentral_chisquare (df, nonc[, size]) |
This docstring was copied from numpy.random.mtrand.RandomState.noncentral_chisquare. |
random.noncentral_f (dfnum, dfden, nonc[, size]) |
This docstring was copied from numpy.random.mtrand.RandomState.noncentral_f. |
random.normal ([loc, scale, size]) |
This docstring was copied from numpy.random.mtrand.RandomState.normal. |
random.pareto (a[, size]) |
This docstring was copied from numpy.random.mtrand.RandomState.pareto. |
random.permutation (x) |
This docstring was copied from numpy.random.mtrand.RandomState.permutation. |
random.poisson ([lam, size]) |
This docstring was copied from numpy.random.mtrand.RandomState.poisson. |
random.power (a[, size]) |
This docstring was copied from numpy.random.mtrand.RandomState.power. |
random.randint (low[, high, size, dtype]) |
This docstring was copied from numpy.random.mtrand.RandomState.randint. |
random.random ([size]) |
This docstring was copied from numpy.random.mtrand.RandomState.random_sample. |
random.random_sample ([size]) |
This docstring was copied from numpy.random.mtrand.RandomState.random_sample. |
random.rayleigh ([scale, size]) |
This docstring was copied from numpy.random.mtrand.RandomState.rayleigh. |
random.standard_cauchy ([size]) |
This docstring was copied from numpy.random.mtrand.RandomState.standard_cauchy. |
random.standard_exponential ([size]) |
This docstring was copied from numpy.random.mtrand.RandomState.standard_exponential. |
random.standard_gamma (shape[, size]) |
This docstring was copied from numpy.random.mtrand.RandomState.standard_gamma. |
random.standard_normal ([size]) |
This docstring was copied from numpy.random.mtrand.RandomState.standard_normal. |
random.standard_t (df[, size]) |
This docstring was copied from numpy.random.mtrand.RandomState.standard_t. |
random.triangular (left, mode, right[, size]) |
This docstring was copied from numpy.random.mtrand.RandomState.triangular. |
random.uniform ([low, high, size]) |
This docstring was copied from numpy.random.mtrand.RandomState.uniform. |
random.vonmises (mu, kappa[, size]) |
This docstring was copied from numpy.random.mtrand.RandomState.vonmises. |
random.wald (mean, scale[, size]) |
This docstring was copied from numpy.random.mtrand.RandomState.wald. |
random.weibull (a[, size]) |
This docstring was copied from numpy.random.mtrand.RandomState.weibull. |
random.zipf (a[, size]) |
This docstring was copied from numpy.random.mtrand.RandomState.zipf. |
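As a sketch of how the random functions are typically called: besides the NumPy-style arguments shown in the table, dask's versions also take a `chunks` keyword that controls how the result is split into blocks (it is not shown in the abbreviated signatures above). The distribution parameters and sizes here are illustrative.

```python
import dask.array as da

# 1000 draws from a normal distribution, generated in four blocks of 250.
x = da.random.normal(10, 0.1, size=(1000,), chunks=250)

sample_mean = float(x.mean().compute())

print(x.chunks, sample_mean)
```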
Stats
stats.ttest_ind (a, b[, axis, equal_var]) |
Calculate the T-test for the means of two independent samples of scores. |
stats.ttest_1samp (a, popmean[, axis, nan_policy]) |
Calculate the T-test for the mean of ONE group of scores. |
stats.ttest_rel (a, b[, axis, nan_policy]) |
Calculate the T-test on TWO RELATED samples of scores, a and b. |
stats.chisquare (f_obs[, f_exp, ddof, axis]) |
Calculate a one-way chi square test. |
stats.power_divergence (f_obs[, f_exp, ddof, …]) |
Cressie-Read power divergence statistic and goodness of fit test. |
stats.skew (a[, axis, bias, nan_policy]) |
Compute the sample skewness of a data set. |
stats.skewtest (a[, axis, nan_policy]) |
Test whether the skew is different from the normal distribution. |
stats.kurtosis (a[, axis, fisher, bias, …]) |
Compute the kurtosis (Fisher or Pearson) of a dataset. |
stats.kurtosistest (a[, axis, nan_policy]) |
Test whether a dataset has normal kurtosis. |
stats.normaltest (a[, axis, nan_policy]) |
Test whether a sample differs from a normal distribution. |
stats.f_oneway (*args) |
Performs a 1-way ANOVA. |
stats.moment (a[, moment, axis, nan_policy]) |
Calculate the nth moment about the mean for a sample. |
Image Support
image.imread (filename[, imread, preprocess]) |
Read a stack of images into a dask array |
Slightly Overlapping Computations
overlap.overlap (x, depth, boundary) |
Share boundaries between neighboring blocks |
overlap.map_overlap (x, func, depth[, …]) |
Map a function over blocks of the array with some overlap |
overlap.trim_internal (x, axes[, boundary]) |
Trim sides from each block |
overlap.trim_overlap (x, depth[, boundary]) |
Trim sides from each block. |
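The sketch below illustrates the idea behind mapping a function over blocks with overlap. It uses the `map_overlap` method of a dask array rather than the module-level function listed above. The three-point mean and the periodic boundary are choices made for this example.

```python
import numpy as np
import dask.array as da

def three_point_mean(block):
    # The outermost elements of each padded block wrap around incorrectly,
    # but they lie in the overlap region and are trimmed away afterwards.
    return (np.roll(block, 1) + block + np.roll(block, -1)) / 3.0

values = np.arange(8, dtype=float)
x = da.from_array(values, chunks=4)

# depth=1 shares one element with each neighbouring block.
smoothed = x.map_overlap(
    three_point_mean, depth=1, boundary="periodic", dtype=float
).compute()

# Reference result computed on the whole array at once.
expected = (np.roll(values, 1) + values + np.roll(values, -1)) / 3.0

print(np.allclose(smoothed, expected))
```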
Create and Store Arrays
from_array (x[, chunks, name, lock, asarray, …]) |
Create dask array from something that looks like an array |
from_delayed (value, shape[, dtype, meta, name]) |
Create a dask array from a dask delayed value |
from_npy_stack (dirname[, mmap_mode]) |
Load dask array from stack of npy files |
from_zarr (url[, component, storage_options, …]) |
Load array from the zarr storage format |
from_tiledb (uri[, attribute, chunks, …]) |
Load array from the TileDB storage format |
store (sources, targets[, lock, regions, …]) |
Store dask arrays in array-like objects, overwrite data in target |
to_hdf5 (filename, *args, **kwargs) |
Store arrays in HDF5 file |
to_zarr (arr, url[, component, …]) |
Save array to the zarr storage format |
to_npy_stack (dirname, x[, axis]) |
Write dask array to a stack of .npy files |
to_tiledb (darray, uri[, compute, …]) |
Save array to the TileDB storage format |
Generalized Ufuncs
apply_gufunc (func, signature, *args, **kwargs) |
Apply a generalized ufunc or similar python function to arrays. |
as_gufunc ([signature]) |
Decorator for dask.array.gufunc. |
gufunc (pyfunc, **kwargs) |
Binds pyfunc into dask.array.apply_gufunc when called. |
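A minimal sketch of `apply_gufunc` follows. The helper `row_mean` and the signature `"(i)->()"` are made up for this example, and the core dimension `i` is kept in a single chunk.

```python
import numpy as np
import dask.array as da

def row_mean(block):
    # Receives a NumPy block whose last axis is the core dimension "i"
    # and reduces over it (hypothetical helper for this example).
    return block.mean(axis=-1)

# Three rows of four values; rows are split across chunks, columns are not.
a = da.from_array(np.arange(12.0).reshape(3, 4), chunks=(2, 4))

means = da.apply_gufunc(row_mean, "(i)->()", a, output_dtypes=float).compute()

print(means)
```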
Internal functions
blockwise (func, out_ind, *args[, name, …]) |
Tensor operation: Generalized inner and outer products |
normalize_chunks (chunks[, shape, limit, …]) |
Normalize chunks to tuple of tuples |
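To illustrate what "tuple of tuples" means here, the sketch below calls `normalize_chunks` on a few shorthand chunk specifications. It imports the function from `dask.array.core`; the shapes are illustrative.

```python
from dask.array.core import normalize_chunks

# A single blocksize is expanded along the one dimension.
one_d = normalize_chunks(1000, shape=(2500,))

# A blockshape is expanded per dimension; the last block holds the remainder.
two_d = normalize_chunks((2, 2), shape=(5, 6))

# -1 stands for the full extent of that dimension.
full = normalize_chunks((-1, 3), shape=(5, 6))

print(one_d, two_d, full)
```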
Other functions
dask.array.from_array(x, chunks='auto', name=None, lock=False, asarray=None, fancy=True, getitem=None, meta=None)
Create dask array from something that looks like an array

Input must have a .shape, .ndim, .dtype and support numpy-style slicing.

Parameters:
x : array_like
chunks : int, tuple
    How to chunk the array. Must be one of the following forms:
    - A blocksize like 1000.
    - A blockshape like (1000, 1000).
    - Explicit sizes of all blocks along all dimensions like ((1000, 1000, 500), (400, 400)).
    - A size in bytes, like "100 MiB", which will choose a uniform block-like shape.
    - The word "auto", which acts like the above but uses the configuration value array.chunk-size for the chunk size.
    -1 or None as a blocksize indicates the size of the corresponding dimension.
name : str, optional
    The key name to use for the array. Defaults to a hash of x. By default, hash uses python’s standard sha1. This behaviour can be changed by installing cityhash, xxhash or murmurhash. If installed, a large-factor speedup can be obtained in the tokenisation step. Use name=False to generate a random name instead of hashing (fast).
lock : bool or Lock, optional
    If x doesn’t support concurrent reads then provide a lock here, or pass in True to have dask.array create one for you.
asarray : bool, optional
    If True then call np.asarray on chunks to convert them to numpy arrays. If False then chunks are passed through unchanged. If None (default) then we use True if the __array_function__ method is undefined.
fancy : bool, optional
    If x doesn’t support fancy indexing (e.g. indexing with lists or arrays) then set to False. Default is True.
meta : Array-like, optional
    The metadata for the resulting dask array. This is the kind of array that will result from slicing the input array. Defaults to the input array.

Examples
>>> x = h5py.File('...')['/data/path']  # doctest: +SKIP
>>> a = da.from_array(x, chunks=(1000, 1000))  # doctest: +SKIP

If your underlying datastore does not support concurrent reads then include the lock=True keyword argument or lock=mylock if you want multiple arrays to coordinate around the same lock.

>>> a = da.from_array(x, chunks=(1000, 1000), lock=True)  # doctest: +SKIP

If your underlying datastore has a .chunks attribute (as h5py and zarr datasets do) then a multiple of that chunk shape will be used if you do not provide a chunk shape.

>>> a = da.from_array(x, chunks='auto')  # doctest: +SKIP
>>> a = da.from_array(x, chunks='100 MiB')  # doctest: +SKIP
>>> a = da.from_array(x)  # doctest: +SKIP
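The chunk forms described in the parameter list can be tried on an in-memory NumPy array, as in the sketch below. The array shape and chunk sizes are illustrative, chosen small so the resulting block layout is easy to read.

```python
import numpy as np
import dask.array as da

x = np.arange(200).reshape(25, 8)

# A blockshape: 10 rows by 4 columns per block, remainders in the last blocks.
by_shape = da.from_array(x, chunks=(10, 4))

# A single blocksize is applied to every dimension (capped at the axis length).
by_size = da.from_array(x, chunks=10)

# -1 keeps the whole of that dimension in one block.
whole_axis = da.from_array(x, chunks=(10, -1))

print(by_shape.chunks, by_size.chunks, whole_axis.chunks)
```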
dask.array.from_delayed(value, shape, dtype=None, meta=None, name=None)
Create a dask array from a dask delayed value

This routine is useful for constructing dask arrays in an ad-hoc fashion using dask delayed, particularly when combined with stack and concatenate.

The dask array will consist of a single chunk.

Examples
>>> import dask
>>> import numpy as np
>>> import dask.array as da
>>> value = dask.delayed(np.ones)(5)
>>> array = da.from_delayed(value, (5,), dtype=float)
>>> array
dask.array<from-value, shape=(5,), dtype=float64, chunksize=(5,), chunktype=numpy.ndarray>
>>> array.compute()
array([1., 1., 1., 1., 1.])
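The combination with concatenate mentioned above might look like the following sketch. `load_block` is a hypothetical loader standing in for reading one file or record batch, and the block size of 4 is arbitrary.

```python
import numpy as np
import dask
import dask.array as da

@dask.delayed
def load_block(i):
    # Hypothetical loader: returns a block of four values equal to i.
    return np.full(4, i, dtype=float)

# Each delayed value becomes a single-chunk dask array; the shape and dtype
# must be stated because dask cannot inspect the delayed result.
blocks = [da.from_delayed(load_block(i), shape=(4,), dtype=float) for i in range(3)]

# Concatenating the pieces yields one array with three chunks.
combined = da.concatenate(blocks)
result = combined.compute()

print(combined.chunks, result)
```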
-
dask.array.
store
(sources, targets, lock=True, regions=None, compute=True, return_stored=False, **kwargs)¶ Store dask arrays in array-like objects, overwrite data in target
This stores dask arrays into object that supports numpy-style setitem indexing. It stores values chunk by chunk so that it does not have to fill up memory. For best performance you can align the block size of the storage target with the block size of your array.
If your data fits in memory then you may prefer calling np.array(myarray) instead.
Parameters: sources: Array or iterable of Arrays
targets: array-like or Delayed or iterable of array-likes and/or Delayeds
These should support setitem syntax target[10:20] = ...
lock: boolean or threading.Lock, optional
Whether or not to lock the data stores while storing. Pass True (lock each file individually), False (don’t lock) or a particular threading.Lock object to be shared among all writes.
regions: tuple of slices or list of tuples of slices
Each region tuple in regions should be such that target[region].shape = source.shape for the corresponding source and target in sources and targets, respectively. If this is a tuple, the contents will be assumed to be slices, so do not provide a tuple of tuples.
compute: boolean, optional
If true compute immediately, return dask.delayed.Delayed otherwise
return_stored: boolean, optional
Optionally return the stored result (default False).
Examples
>>> x = ... # doctest: +SKIP
>>> import h5py # doctest: +SKIP
>>> f = h5py.File('myfile.hdf5', mode='a') # doctest: +SKIP
>>> dset = f.create_dataset('/data', shape=x.shape,
...                         chunks=x.chunks,
...                         dtype='f8') # doctest: +SKIP
>>> store(x, dset) # doctest: +SKIP
Alternatively store many arrays at the same time
>>> store([x, y, z], [dset1, dset2, dset3]) # doctest: +SKIP
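Any object with numpy-style setitem can serve as a target, so the behaviour can be sketched with plain NumPy arrays in place of HDF5 datasets. The example below is an illustration under that assumption; the array sizes and the region slice are arbitrary choices.

```python
import numpy as np
import dask.array as da

x = da.arange(10, chunks=5)

# Store into a target of the same shape, written chunk by chunk.
target = np.zeros(10, dtype=x.dtype)
da.store(x, target)

# Store into a sub-region of a larger target. The region must satisfy
# big[region].shape == x.shape.
big = np.full(20, -1, dtype=x.dtype)
da.store(x, big, regions=(slice(5, 15),))

print(target)
print(big)
```

Elements of big outside the region keep their original value, since store only assigns to the slices it is given.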
-
dask.array.
coarsen
(reduction, x, axes, trim_excess=False, **kwargs)¶ Coarsen array by applying reduction to fixed size neighborhoods
Parameters: reduction: function
Function like np.sum, np.mean, etc…
x: np.ndarray
Array to be coarsened
axes: dict
Mapping of axis to coarsening factor
Examples
>>> x = np.array([1, 2, 3, 4, 5, 6])
>>> coarsen(np.sum, x, {0: 2})
array([ 3, 7, 11])
>>> coarsen(np.max, x, {0: 3})
array([3, 6])
Provide dictionary of scale per dimension
>>> x = np.arange(24).reshape((4, 6))
>>> x
array([[ 0, 1, 2, 3, 4, 5],
       [ 6, 7, 8, 9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])
>>> coarsen(np.min, x, {0: 2, 1: 3})
array([[ 0, 3],
       [12, 15]])
You must avoid excess elements explicitly
>>> x = np.array([1, 2, 3, 4, 5, 6, 7, 8])
>>> coarsen(np.min, x, {0: 3}, trim_excess=True)
array([1, 4])
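To make the "fixed size neighborhoods" idea concrete, the sketch below runs da.coarsen on a dask array and compares it with a manual NumPy reshape that groups the same 2 x 3 neighborhoods. The chunk shape is an assumption chosen so that each block divides evenly by the coarsening factors.

```python
import numpy as np
import dask.array as da

x = np.arange(24).reshape(4, 6)
d = da.from_array(x, chunks=(2, 3))

# Reduce every 2 x 3 neighborhood to its minimum.
coarse = da.coarsen(np.min, d, {0: 2, 1: 3}).compute()

# The same reduction written by hand: split each axis into
# (number of neighborhoods, neighborhood size), then reduce over the
# two neighborhood-size axes.
manual = x.reshape(2, 2, 2, 3).min(axis=(1, 3))

print(coarse)
print(manual)
```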
-
dask.array.
stack
(seq, axis=0)¶ Stack arrays along a new axis
Given a sequence of dask arrays, form a new dask array by stacking them along a new dimension (axis=0 by default)
Examples
Create slices
>>> import dask.array as da
>>> import numpy as np
>>> data = [da.from_array(np.ones((4, 4)), chunks=(2, 2))
...         for i in range(3)]
>>> x = da.stack(data, axis=0)
>>> x.shape
(3, 4, 4)
>>> da.stack(data, axis=1).shape
(4, 3, 4)
>>> da.stack(data, axis=-1).shape
(4, 4, 3)
Result is a new dask Array
-
dask.array.
concatenate
(seq, axis=0, allow_unknown_chunksizes=False)¶ Concatenate arrays along an existing axis
Given a sequence of dask Arrays, form a new dask Array by joining them along an existing dimension (axis=0 by default)
Parameters: seq: list of dask.arrays
axis: int
Dimension along which to align all of the arrays
allow_unknown_chunksizes: bool
Allow unknown chunksizes, such as come from converting from dask dataframes. Dask.array is unable to verify that chunks line up. If data comes from differently aligned sources then this can cause unexpected results.
Examples
Create slices
>>> import dask.array as da
>>> import numpy as np
>>> data = [da.from_array(np.ones((4, 4)), chunks=(2, 2))
...         for i in range(3)]
>>> x = da.concatenate(data, axis=0)
>>> x.shape
(12, 4)
>>> da.concatenate(data, axis=1).shape
(4, 12)
Result is a new dask Array
-
dask.array.
all
(a, axis=None, keepdims=False, split_every=None, out=None)¶ Test whether all array elements along a given axis evaluate to True.
This docstring was copied from numpy.all.
Some inconsistencies with the Dask version may exist.
Parameters: a : array_like
Input array or object that can be converted to an array.
axis : None or int or tuple of ints, optional
Axis or axes along which a logical AND reduction is performed. The default (axis = None) is to perform a logical AND over all the dimensions of the input array. axis may be negative, in which case it counts from the last to the first axis.
New in version 1.7.0.
If this is a tuple of ints, a reduction is performed on multiple axes, instead of a single axis or all the axes as before.
out : ndarray, optional
Alternate output array in which to place the result. It must have the same shape as the expected output and its type is preserved (e.g., if dtype(out) is float, the result will consist of 0.0’s and 1.0’s). See doc.ufuncs (Section “Output arguments”) for more details.
keepdims : bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
If the default value is passed, then keepdims will not be passed through to the all method of sub-classes of ndarray, however any non-default value will be. If the sub-class’ method does not implement keepdims any exceptions will be raised.
Returns: all : ndarray, bool
A new boolean or array is returned unless out is specified, in which case a reference to out is returned.
See also
ndarray.all - equivalent method
any - Test whether any element along a given axis evaluates to True.
Notes
Not a Number (NaN), positive infinity and negative infinity evaluate to True because these are not equal to zero.
Examples
>>> np.all([[True,False],[True,True]]) # doctest: +SKIP
False
>>> np.all([[True,False],[True,True]], axis=0) # doctest: +SKIP
array([ True, False])
>>> np.all([-1, 4, 5]) # doctest: +SKIP
True
>>> np.all([1.0, np.nan]) # doctest: +SKIP
True
>>> o=np.array(False) # doctest: +SKIP
>>> z=np.all([-1, 4, 5], out=o) # doctest: +SKIP
>>> id(z), id(o), z # doctest: +SKIP
(28293632, 28293632, array(True)) # may vary
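The examples above use NumPy directly. A minimal sketch of the same reduction on a dask array follows; the input values and chunk shape are illustrative assumptions. The result is lazy until compute() is called, and keepdims=True keeps the reduced axis with size one.

```python
import numpy as np
import dask.array as da

mask = np.array([[True, False], [True, True]])
d = da.from_array(mask, chunks=(1, 2))

per_column = da.all(d, axis=0)               # lazy, shape (2,)
kept = da.all(d, axis=0, keepdims=True)      # lazy, shape (1, 2)
everything = da.all(d)                       # lazy, zero-dimensional

print(per_column.compute())
print(kept.shape)
print(everything.compute())
```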
-
dask.array.
allclose
(arr1, arr2, rtol=1e-05, atol=1e-08, equal_nan=False)¶ Returns True if two arrays are element-wise equal within a tolerance.
This docstring was copied from numpy.allclose.
Some inconsistencies with the Dask version may exist.
The tolerance values are positive, typically very small numbers. The relative difference (rtol * abs(b)) and the absolute difference atol are added together to compare against the absolute difference between a and b.
If either array contains one or more NaNs, False is returned. Infs are treated as equal if they are in the same place and of the same sign in both arrays.
Parameters: a, b : array_like
Input arrays to compare.
rtol : float
The relative tolerance parameter (see Notes).
atol : float
The absolute tolerance parameter (see Notes).
equal_nan : bool
Whether to compare NaN’s as equal. If True, NaN’s in a will be considered equal to NaN’s in b in the output array.
New in version 1.10.0.
Returns: allclose : bool
Returns True if the two arrays are equal within the given tolerance; False otherwise.
Notes
If the following equation is element-wise True, then allclose returns True.
absolute(a - b) <= (atol + rtol * absolute(b))
The above equation is not symmetric in a and b, so that allclose(a, b) might be different from allclose(b, a) in some rare cases.
The comparison of a and b uses standard broadcasting, which means that a and b need not have the same shape in order for allclose(a, b) to evaluate to True. The same is true for equal but not array_equal.
Examples
>>> np.allclose([1e10,1e-7], [1.00001e10,1e-8]) # doctest: +SKIP
False
>>> np.allclose([1e10,1e-8], [1.00001e10,1e-9]) # doctest: +SKIP
True
>>> np.allclose([1e10,1e-8], [1.0001e10,1e-9]) # doctest: +SKIP
False
>>> np.allclose([1.0, np.nan], [1.0, np.nan]) # doctest: +SKIP
False
>>> np.allclose([1.0, np.nan], [1.0, np.nan], equal_nan=True) # doctest: +SKIP
True
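The asymmetry mentioned in the Notes comes from the relative tolerance being scaled by the second argument only. The sketch below uses NumPy and deliberately exaggerated tolerance values (an assumption made purely for illustration) so that swapping the arguments changes the answer.

```python
import numpy as np

a, b = 100.0, 110.0
rtol, atol = 0.095, 0.0

# |a - b| = 10 is compared against atol + rtol * |second argument|.
forward = np.allclose(a, b, rtol=rtol, atol=atol)   # bound is 0.095 * 110 = 10.45
backward = np.allclose(b, a, rtol=rtol, atol=atol)  # bound is 0.095 * 100 = 9.5

print(forward, backward)
```

With the default tolerances the two orderings only disagree for values sitting right at the tolerance boundary, which is why the Notes describe this as rare.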
-
dask.array.
angle
(x, deg=0)¶ Return the angle of the complex argument.
This docstring was copied from numpy.angle.
Some inconsistencies with the Dask version may exist.
Parameters: z : array_like (Not supported in Dask)
A complex number or sequence of complex numbers.
deg : bool, optional
Return angle in degrees if True, radians if False (default).
Returns: angle : ndarray or scalar
The counterclockwise angle from the positive real axis on the complex plane in the range (-pi, pi], with dtype as numpy.float64.
Changed in version 1.16.0: This function works on subclasses of ndarray like ma.array.
See also
arctan2, absolute
Examples
>>> np.angle([1.0, 1.0j, 1+1j]) # in radians # doctest: +SKIP
array([ 0. , 1.57079633, 0.78539816]) # may vary
>>> np.angle(1+1j, deg=True) # in degrees # doctest: +SKIP
45.0
-
dask.array.
any
(a, axis=None, keepdims=False, split_every=None, out=None)¶ Test whether any array element along a given axis evaluates to True.
This docstring was copied from numpy.any.
Some inconsistencies with the Dask version may exist.
Returns single boolean unless axis is not None
Parameters: a : array_like
Input array or object that can be converted to an array.
axis : None or int or tuple of ints, optional
Axis or axes along which a logical OR reduction is performed. The default (axis = None) is to perform a logical OR over all the dimensions of the input array. axis may be negative, in which case it counts from the last to the first axis.
New in version 1.7.0.
If this is a tuple of ints, a reduction is performed on multiple axes, instead of a single axis or all the axes as before.
out : ndarray, optional
Alternate output array in which to place the result. It must have the same shape as the expected output and its type is preserved (e.g., if it is of type float, then it will remain so, returning 1.0 for True and 0.0 for False, regardless of the type of a). See doc.ufuncs (Section “Output arguments”) for details.
keepdims : bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
If the default value is passed, then keepdims will not be passed through to the any method of sub-classes of ndarray, however any non-default value will be. If the sub-class’ method does not implement keepdims any exceptions will be raised.
Returns: any : bool or ndarray
A new boolean or ndarray is returned unless out is specified, in which case a reference to out is returned.
See also
ndarray.any - equivalent method
all - Test whether all elements along a given axis evaluate to True.
Notes
Not a Number (NaN), positive infinity and negative infinity evaluate to True because these are not equal to zero.
Examples
>>> np.any([[True, False], [True, True]]) # doctest: +SKIP
True
>>> np.any([[True, False], [False, False]], axis=0) # doctest: +SKIP
array([ True, False])
>>> np.any([-1, 0, 5]) # doctest: +SKIP
True
>>> np.any(np.nan) # doctest: +SKIP
True
>>> o=np.array(False) # doctest: +SKIP
>>> z=np.any([-1, 4, 5], out=o) # doctest: +SKIP
>>> z, o # doctest: +SKIP
(array(True), array(True))
>>> # Check now that z is a reference to o
>>> z is o # doctest: +SKIP
True
>>> id(z), id(o) # identity of z and o # doctest: +SKIP
(191614240, 191614240)
-
dask.array.
apply_along_axis
(func1d, axis, arr, *args, dtype=None, shape=None, **kwargs)¶ Apply a function to 1-D slices along the given axis.
This docstring was copied from numpy.apply_along_axis.
Some inconsistencies with the Dask version may exist.
Apply a function to 1-D slices along the given axis. This is a blocked variant of numpy.apply_along_axis() implemented via dask.array.map_blocks()
Parameters: func1d : callable
Function to apply to 1-D slices of the array along the given axis
axis : int
Axis along which func1d will be applied
arr : dask array
Dask array to which func1d will be applied
args : any
Additional arguments to func1d.
dtype : str or dtype, optional
The dtype of the output of func1d.
shape : tuple, optional
The shape of the output of func1d.
kwargs : any
Additional keyword arguments for func1d.
Returns: out : ndarray (Ni…, Nj…, Nk…)
The output array. The shape of out is identical to the shape of arr, except along the axis dimension. This axis is removed, and replaced with new dimensions equal to the shape of the return value of func1d. So if func1d returns a scalar out will have one fewer dimensions than arr.
See also
apply_over_axes - Apply a function repeatedly over multiple axes.
Notes
If either of dtype or shape are not provided, Dask attempts to determine them by calling func1d on a dummy array. This may produce incorrect values for dtype or shape, so we recommend providing them.
Execute func1d(a, *args) where func1d operates on 1-D arrays and a is a 1-D slice of arr along axis.
This is equivalent to (but faster than) the following use of ndindex and s_, which sets each of ii, jj, and kk to a tuple of indices:
Ni, Nk = a.shape[:axis], a.shape[axis+1:]
for ii in ndindex(Ni):
    for kk in ndindex(Nk):
        f = func1d(arr[ii + s_[:,] + kk])
        Nj = f.shape
        for jj in ndindex(Nj):
            out[ii + jj + kk] = f[jj]
Equivalently, eliminating the inner loop, this can be expressed as:
Ni, Nk = a.shape[:axis], a.shape[axis+1:]
for ii in ndindex(Ni):
    for kk in ndindex(Nk):
        out[ii + s_[...,] + kk] = func1d(arr[ii + s_[:,] + kk])
Examples
>>> def my_func(a): # doctest: +SKIP
...     """Average first and last element of a 1-D array"""
...     return (a[0] + a[-1]) * 0.5
>>> b = np.array([[1,2,3], [4,5,6], [7,8,9]]) # doctest: +SKIP
>>> np.apply_along_axis(my_func, 0, b) # doctest: +SKIP
array([4., 5., 6.])
>>> np.apply_along_axis(my_func, 1, b) # doctest: +SKIP
array([2., 5., 8.])
For a function that returns a 1D array, the number of dimensions in outarr is the same as arr.
>>> b = np.array([[8,1,7], [4,3,9], [5,2,6]]) # doctest: +SKIP
>>> np.apply_along_axis(sorted, 1, b) # doctest: +SKIP
array([[1, 7, 8],
       [3, 4, 9],
       [2, 5, 6]])
For a function that returns a higher dimensional array, those dimensions are inserted in place of the axis dimension.
>>> b = np.array([[1,2,3], [4,5,6], [7,8,9]]) # doctest: +SKIP
>>> np.apply_along_axis(np.diag, -1, b) # doctest: +SKIP
array([[[1, 0, 0],
        [0, 2, 0],
        [0, 0, 3]],
       [[4, 0, 0],
        [0, 5, 0],
        [0, 0, 6]],
       [[7, 0, 0],
        [0, 8, 0],
        [0, 0, 9]]])
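The examples above call the NumPy function. For the dask version, the Notes recommend passing dtype and shape explicitly rather than relying on inference from a dummy array. A minimal sketch under that recommendation, where first_last_mean is an illustrative helper and shape=() is assumed to be the way to declare a scalar result per slice:

```python
import numpy as np
import dask.array as da


def first_last_mean(a):
    """Average the first and last element of a 1-D array."""
    return (a[0] + a[-1]) * 0.5


b = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
d = da.from_array(b, chunks=(2, 3))

# func1d returns a scalar, so the output shape per slice is () and the
# applied axis disappears from the result.
result = da.apply_along_axis(first_last_mean, 1, d, dtype=float, shape=())
values = result.compute()
print(values)
```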
-
dask.array.
apply_over_axes
(func, a, axes)¶ Apply a function repeatedly over multiple axes.
This docstring was copied from numpy.apply_over_axes.
Some inconsistencies with the Dask version may exist.
func is called as res = func(a, axis), where axis is the first element of axes. The result res of the function call must have either the same dimensions as a or one less dimension. If res has one less dimension than a, a dimension is inserted before axis. The call to func is then repeated for each axis in axes, with res as the first argument.
Parameters: func : function
This function must take two arguments, func(a, axis).
a : array_like
Input array.
axes : array_like
Axes over which func is applied; the elements must be integers.
Returns: apply_over_axis : ndarray
The output array. The number of dimensions is the same as a, but the shape can be different. This depends on whether func changes the shape of its output with respect to its input.
See also
apply_along_axis - Apply a function to 1-D slices of an array along the given axis.
Notes
This function is equivalent to tuple axis arguments to reorderable ufuncs with keepdims=True. Tuple axis arguments to ufuncs have been available since version 1.7.0.
Examples
>>> a = np.arange(24).reshape(2,3,4) # doctest: +SKIP
>>> a # doctest: +SKIP
array([[[ 0, 1, 2, 3],
        [ 4, 5, 6, 7],
        [ 8, 9, 10, 11]],
       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])
Sum over axes 0 and 2. The result has same number of dimensions as the original array:
>>> np.apply_over_axes(np.sum, a, [0,2]) # doctest: +SKIP
array([[[ 60],
        [ 92],
        [124]]])
Tuple axis arguments to ufuncs are equivalent:
>>> np.sum(a, axis=(0,2), keepdims=True) # doctest: +SKIP
array([[[ 60],
        [ 92],
        [124]]])
-
dask.array.
arange
(*args, **kwargs)¶ Return evenly spaced values from start to stop with step size step.
The values are half-open [start, stop), so including start and excluding stop. This is basically the same as python’s range function but for dask arrays.
When using a non-integer step, such as 0.1, the results will often not be consistent. It is better to use linspace for these cases.
Parameters: start : int, optional
The starting value of the sequence. The default is 0.
stop : int
The end of the interval, this value is excluded from the interval.
step : int, optional
The spacing between the values. The default is 1 when not specified.
chunks : int
The number of samples on each block. Note that the last block will have fewer samples if len(array) % chunks != 0.
dtype : numpy.dtype
Output dtype. Omit to infer it from start, stop, step
Returns: samples : dask array
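A minimal sketch of the chunking behaviour and of the linspace alternative suggested above for non-integer steps. The sizes and chunk values are illustrative assumptions.

```python
import numpy as np
import dask.array as da

# Ten values in blocks of four: the last block holds the remaining two.
r = da.arange(10, chunks=4)
print(r.chunks)

# For a fractional spacing, specify the number of points instead of a step.
grid = da.linspace(0, 1, 11, chunks=4)
grid_values = grid.compute()
print(grid_values)
```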
-
dask.array.
arccos
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.arccos.
Some inconsistencies with the Dask version may exist.
Trigonometric inverse cosine, element-wise.
The inverse of cos so that, if y = cos(x), then x = arccos(y).
Parameters: x : array_like
x-coordinate on the unit circle. For real arguments, the domain is [-1, 1].
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: angle : ndarray
The angle of the ray intersecting the unit circle at the given x-coordinate in radians [0, pi]. This is a scalar if x is a scalar.
Notes
arccos is a multivalued function: for each x there are infinitely many numbers z such that cos(z) = x. The convention is to return the angle z whose real part lies in [0, pi].
For real-valued input data types, arccos always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.
For complex-valued input, arccos is a complex analytic function that has branch cuts [-inf, -1] and [1, inf] and is continuous from above on the former and from below on the latter.
The inverse cos is also known as acos or cos^-1.
References
M. Abramowitz and I.A. Stegun, “Handbook of Mathematical Functions”, 10th printing, 1964, pp. 79. http://www.math.sfu.ca/~cbm/aands/
Examples
We expect the arccos of 1 to be 0, and of -1 to be pi:
>>> np.arccos([1, -1]) # doctest: +SKIP
array([ 0. , 3.14159265])
Plot arccos:
>>> import matplotlib.pyplot as plt # doctest: +SKIP
>>> x = np.linspace(-1, 1, num=100) # doctest: +SKIP
>>> plt.plot(x, np.arccos(x)) # doctest: +SKIP
>>> plt.axis('tight') # doctest: +SKIP
>>> plt.show() # doctest: +SKIP
-
dask.array.
arccosh
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.arccosh.
Some inconsistencies with the Dask version may exist.
Inverse hyperbolic cosine, element-wise.
Parameters: x : array_like
Input array.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: arccosh : ndarray
Array of the same shape as x. This is a scalar if x is a scalar.
Notes
arccosh is a multivalued function: for each x there are infinitely many numbers z such that cosh(z) = x. The convention is to return the z whose imaginary part lies in [-pi, pi] and the real part in [0, inf].
For real-valued input data types, arccosh always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.
For complex-valued input, arccosh is a complex analytical function that has a branch cut [-inf, 1] and is continuous from above on it.
References
[R117] M. Abramowitz and I.A. Stegun, “Handbook of Mathematical Functions”, 10th printing, 1964, pp. 86. http://www.math.sfu.ca/~cbm/aands/
[R118] Wikipedia, “Inverse hyperbolic function”, https://en.wikipedia.org/wiki/Arccosh
Examples
>>> np.arccosh([np.e, 10.0]) # doctest: +SKIP
array([ 1.65745445, 2.99322285])
>>> np.arccosh(1) # doctest: +SKIP
0.0
-
dask.array.
arcsin
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.arcsin.
Some inconsistencies with the Dask version may exist.
Inverse sine, element-wise.
Parameters: x : array_like
y-coordinate on the unit circle.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: angle : ndarray
The inverse sine of each element in x, in radians and in the closed interval [-pi/2, pi/2]. This is a scalar if x is a scalar.
Notes
arcsin is a multivalued function: for each x there are infinitely many numbers z such that sin(z) = x. The convention is to return the angle z whose real part lies in [-pi/2, pi/2].
For real-valued input data types, arcsin always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.
For complex-valued input, arcsin is a complex analytic function that has, by convention, the branch cuts [-inf, -1] and [1, inf] and is continuous from above on the former and from below on the latter.
The inverse sine is also known as asin or sin^{-1}.
References
Abramowitz, M. and Stegun, I. A., Handbook of Mathematical Functions, 10th printing, New York: Dover, 1964, pp. 79ff. http://www.math.sfu.ca/~cbm/aands/
Examples
>>> np.arcsin(1) # pi/2 # doctest: +SKIP
1.5707963267948966
>>> np.arcsin(-1) # -pi/2 # doctest: +SKIP
-1.5707963267948966
>>> np.arcsin(0) # doctest: +SKIP
0.0
-
dask.array.
arcsinh
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.arcsinh.
Some inconsistencies with the Dask version may exist.
Inverse hyperbolic sine element-wise.
Parameters: x : array_like
Input array.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: out : ndarray or scalar
Array of the same shape as x. This is a scalar if x is a scalar.
Notes
arcsinh is a multivalued function: for each x there are infinitely many numbers z such that sinh(z) = x. The convention is to return the z whose imaginary part lies in [-pi/2, pi/2].
For real-valued input data types, arcsinh always returns real output. For each value that cannot be expressed as a real number or infinity, it returns nan and sets the invalid floating point error flag.
For complex-valued input, arcsinh is a complex analytical function that has branch cuts [1j, infj] and [-1j, -infj] and is continuous from the right on the former and from the left on the latter.
The inverse hyperbolic sine is also known as asinh or sinh^-1.
References
[R119] M. Abramowitz and I.A. Stegun, “Handbook of Mathematical Functions”, 10th printing, 1964, pp. 86. http://www.math.sfu.ca/~cbm/aands/
[R120] Wikipedia, “Inverse hyperbolic function”, https://en.wikipedia.org/wiki/Arcsinh
Examples
>>> np.arcsinh(np.array([np.e, 10.0])) # doctest: +SKIP
array([ 1.72538256, 2.99822295])
-
dask.array.
arctan
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.arctan.
Some inconsistencies with the Dask version may exist.
Trigonometric inverse tangent, element-wise.
The inverse of tan, so that if y = tan(x) then x = arctan(y).
Parameters: x : array_like
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: out : ndarray or scalar
Out has the same shape as x. Its real part is in [-pi/2, pi/2] (arctan(+/-inf) returns +/-pi/2). This is a scalar if x is a scalar.
Notes
arctan is a multi-valued function: for each x there are infinitely many numbers z such that tan(z) = x. The convention is to return the angle z whose real part lies in [-pi/2, pi/2].
For real-valued input data types, arctan always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.
For complex-valued input, arctan is a complex analytic function that has [1j, infj] and [-1j, -infj] as branch cuts, and is continuous from the left on the former and from the right on the latter.
The inverse tangent is also known as atan or tan^{-1}.
References
Abramowitz, M. and Stegun, I. A., Handbook of Mathematical Functions, 10th printing, New York: Dover, 1964, pp. 79. http://www.math.sfu.ca/~cbm/aands/
Examples
We expect the arctan of 0 to be 0, and of 1 to be pi/4:
>>> np.arctan([0, 1]) # doctest: +SKIP
array([ 0. , 0.78539816])
>>> np.pi/4 # doctest: +SKIP
0.78539816339744828
Plot arctan:
>>> import matplotlib.pyplot as plt # doctest: +SKIP
>>> x = np.linspace(-10, 10) # doctest: +SKIP
>>> plt.plot(x, np.arctan(x)) # doctest: +SKIP
>>> plt.axis('tight') # doctest: +SKIP
>>> plt.show() # doctest: +SKIP
-
dask.array.
arctan2
(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.arctan2.
Some inconsistencies with the Dask version may exist.
Element-wise arc tangent of x1/x2 choosing the quadrant correctly.
The quadrant (i.e., branch) is chosen so that arctan2(x1, x2) is the signed angle in radians between the ray ending at the origin and passing through the point (1,0), and the ray ending at the origin and passing through the point (x2, x1). (Note the role reversal: the “y-coordinate” is the first function parameter, the “x-coordinate” is the second.) By IEEE convention, this function is defined for x2 = +/-0 and for either or both of x1 and x2 = +/-inf (see Notes for specific values).
This function is not defined for complex-valued arguments; for the so-called argument of complex values, use angle.
Parameters: x1 : array_like, real-valued
y-coordinates.
x2 : array_like, real-valued
x-coordinates. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: angle : ndarray
Array of angles in radians, in the range [-pi, pi]. This is a scalar if both x1 and x2 are scalars.
Notes
arctan2 is identical to the atan2 function of the underlying C library. The following special values are defined in the C standard: [R121]
x1        x2        arctan2(x1,x2)
+/- 0     +0        +/- 0
+/- 0     -0        +/- pi
> 0       +/-inf    +0 / +pi
< 0       +/-inf    -0 / -pi
+/-inf    +inf      +/- (pi/4)
+/-inf    -inf      +/- (3*pi/4)
Note that +0 and -0 are distinct floating point numbers, as are +inf and -inf.
References
[R121] ISO/IEC standard 9899:1999, “Programming language C.”
Examples
Consider four points in different quadrants:
>>> x = np.array([-1, +1, +1, -1]) # doctest: +SKIP
>>> y = np.array([-1, -1, +1, +1]) # doctest: +SKIP
>>> np.arctan2(y, x) * 180 / np.pi # doctest: +SKIP
array([-135., -45., 45., 135.])
Note the order of the parameters. arctan2 is defined also when x2 = 0 and at several other special points, obtaining values in the range [-pi, pi]:
>>> np.arctan2([1., -1.], [0., 0.]) # doctest: +SKIP
array([ 1.57079633, -1.57079633])
>>> np.arctan2([0., 0., np.inf], [+0., -0., np.inf]) # doctest: +SKIP
array([ 0. , 3.14159265, 0.78539816])
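"Choosing the quadrant correctly" is easiest to see next to a plain arctan of the ratio, which cannot distinguish opposite quadrants because the signs cancel in the division. A short NumPy sketch, with the point (-1, 1) chosen purely for illustration:

```python
import numpy as np

# A point in the second quadrant: x is negative, y is positive.
y, x = 1.0, -1.0

# arctan only sees the ratio y / x = -1 and reports a fourth-quadrant angle.
naive_deg = np.degrees(np.arctan(y / x))

# arctan2 receives both signs (y first, then x) and recovers the true quadrant.
full_deg = np.degrees(np.arctan2(y, x))

print(naive_deg, full_deg)
```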
-
dask.array.
arctanh
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.arctanh.
Some inconsistencies with the Dask version may exist.
Inverse hyperbolic tangent element-wise.
Parameters: x : array_like
Input array.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: out : ndarray or scalar
Array of the same shape as x. This is a scalar if x is a scalar.
See also
emath.arctanh
Notes
arctanh is a multivalued function: for each x there are infinitely many numbers z such that tanh(z) = x. The convention is to return the z whose imaginary part lies in [-pi/2, pi/2].
For real-valued input data types, arctanh always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.
For complex-valued input, arctanh is a complex analytical function that has branch cuts [-1, -inf] and [1, inf] and is continuous from above on the former and from below on the latter.
The inverse hyperbolic tangent is also known as atanh or tanh^-1.
References
[R122] M. Abramowitz and I.A. Stegun, “Handbook of Mathematical Functions”, 10th printing, 1964, pp. 86. http://www.math.sfu.ca/~cbm/aands/
[R123] Wikipedia, “Inverse hyperbolic function”, https://en.wikipedia.org/wiki/Arctanh
Examples
>>> np.arctanh([0, -0.5]) # doctest: +SKIP
array([ 0. , -0.54930614])
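The notes on real versus complex input and on the where parameter can be illustrated with NumPy directly. This is a minimal sketch, assuming NumPy is installed; it calls numpy.arctanh rather than the Dask wrapper, and whether the Dask version accepts out and where in the same way is not stated in this section.

```python
import numpy as np

# Real input outside [-1, 1] has no real result: NumPy yields nan and
# sets the "invalid" floating point flag (silenced here via errstate).
with np.errstate(invalid="ignore"):
    real_result = np.arctanh(np.array([0.5, 2.0]))

# The same value passed as a complex number gives a complex result z
# that satisfies tanh(z) == 2.
complex_result = np.arctanh(np.array([2.0 + 0.0j]))

# A where mask skips the out-of-domain entry; masked slots keep whatever
# the pre-filled out array contained (zeros here).
x = np.array([0.0, 0.5, 2.0])
masked = np.zeros_like(x)
np.arctanh(x, out=masked, where=np.abs(x) < 1)
```

Pre-filling out, as done with np.zeros_like above, avoids the uninitialized values that the parameter description warns about.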
dask.array.argmax(x, axis=None, split_every=None, out=None)
Return the indices of the maximum values along an axis.
This docstring was copied from numpy.amax, so the text below describes the maximum value rather than its index.
Some inconsistencies with the Dask version may exist.
Parameters: a : array_like (Not supported in Dask)
Input data.
axis : None or int or tuple of ints, optional
Axis or axes along which to operate. By default, flattened input is used.
New in version 1.7.0.
If this is a tuple of ints, the maximum is selected over multiple axes, instead of a single axis or all the axes as before.
out : ndarray, optional
Alternative output array in which to place the result. Must be of the same shape and buffer length as the expected output. See doc.ufuncs (Section “Output arguments”) for more details.
keepdims : bool, optional (Not supported in Dask)
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
If the default value is passed, then keepdims will not be passed through to the amax method of sub-classes of ndarray; any non-default value will be. If the sub-class’ method does not implement keepdims, an exception will be raised.
initial : scalar, optional (Not supported in Dask)
The minimum value of an output element. Must be present to allow computation on empty slice. See ~numpy.ufunc.reduce for details.
New in version 1.15.0.
where : array_like of bool, optional (Not supported in Dask)
Elements to compare for the maximum. See ~numpy.ufunc.reduce for details.
New in version 1.17.0.
Returns: amax : ndarray or scalar
Maximum of a. If axis is None, the result is a scalar value. If axis is given, the result is an array of dimension a.ndim - 1.
See also
amin - The minimum value of an array along a given axis, propagating any NaNs.
nanmax - The maximum value of an array along a given axis, ignoring any NaNs.
maximum - Element-wise maximum of two arrays, propagating any NaNs.
fmax - Element-wise maximum of two arrays, ignoring any NaNs.
argmax - Return the indices of the maximum values.
Notes
NaN values are propagated, that is if at least one item is NaN, the corresponding max value will be NaN as well. To ignore NaN values (MATLAB behavior), please use nanmax.
Don’t use amax for element-wise comparison of 2 arrays; when a.shape[0] is 2, maximum(a[0], a[1]) is faster than amax(a, axis=0).
Examples
>>> a = np.arange(4).reshape((2,2)) # doctest: +SKIP
>>> a # doctest: +SKIP
array([[0, 1],
       [2, 3]])
>>> np.amax(a) # Maximum of the flattened array # doctest: +SKIP
3
>>> np.amax(a, axis=0) # Maxima along the first axis # doctest: +SKIP
array([2, 3])
>>> np.amax(a, axis=1) # Maxima along the second axis # doctest: +SKIP
array([1, 3])
>>> np.amax(a, where=[False, True], initial=-1, axis=0) # doctest: +SKIP
array([-1, 3])
>>> b = np.arange(5, dtype=float) # doctest: +SKIP
>>> b[2] = np.nan # doctest: +SKIP
>>> np.amax(b) # doctest: +SKIP
nan
>>> np.amax(b, where=~np.isnan(b), initial=-1) # doctest: +SKIP
4.0
>>> np.nanmax(b) # doctest: +SKIP
4.0
You can use an initial value to compute the maximum of an empty slice, or to initialize it to a different value:
>>> np.max([[-50], [10]], axis=-1, initial=0) # doctest: +SKIP
array([ 0, 10])
Notice that the initial value is used as one of the elements for which the maximum is determined, unlike the default argument of Python’s max function, which is only used for empty iterables.
>>> np.max([5], initial=6) # doctest: +SKIP
6
>>> max([5], default=6) # doctest: +SKIP
5
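Because the text above was copied from numpy.amax, it may help to see what dask.array.argmax returns in practice. This is a minimal sketch, assuming Dask and NumPy are installed; the sample array and chunk sizes are arbitrary choices for illustration.

```python
import numpy as np
import dask.array as da

x = np.array([[0, 7, 2],
              [9, 1, 4]])
d = da.from_array(x, chunks=(1, 3))  # chunk sizes chosen arbitrarily

# argmax yields positions along the axis; max yields the values found there.
idx = da.argmax(d, axis=1).compute()
vals = da.max(d, axis=1).compute()

# Indexing the source array with those positions recovers the maxima.
recovered = x[np.arange(x.shape[0]), idx]
```

Here idx holds one column position per row, and recovered matches vals element for element.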
dask.array.argmin(x, axis=None, split_every=None, out=None)
Return the indices of the minimum values along an axis.
This docstring was copied from numpy.amin, so the text below describes the minimum value rather than its index.
Some inconsistencies with the Dask version may exist.
Parameters: a : array_like (Not supported in Dask)
Input data.
axis : None or int or tuple of ints, optional
Axis or axes along which to operate. By default, flattened input is used.
New in version 1.7.0.
If this is a tuple of ints, the minimum is selected over multiple axes, instead of a single axis or all the axes as before.
out : ndarray, optional
Alternative output array in which to place the result. Must be of the same shape and buffer length as the expected output. See doc.ufuncs (Section “Output arguments”) for more details.
keepdims : bool, optional (Not supported in Dask)
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
If the default value is passed, then keepdims will not be passed through to the amin method of sub-classes of ndarray; any non-default value will be. If the sub-class’ method does not implement keepdims, an exception will be raised.
initial : scalar, optional (Not supported in Dask)
The maximum value of an output element. Must be present to allow computation on empty slice. See ~numpy.ufunc.reduce for details.
New in version 1.15.0.
where : array_like of bool, optional (Not supported in Dask)
Elements to compare for the minimum. See ~numpy.ufunc.reduce for details.
New in version 1.17.0.
Returns: amin : ndarray or scalar
Minimum of a. If axis is None, the result is a scalar value. If axis is given, the result is an array of dimension a.ndim - 1.
See also
amax - The maximum value of an array along a given axis, propagating any NaNs.
nanmin - The minimum value of an array along a given axis, ignoring any NaNs.
minimum - Element-wise minimum of two arrays, propagating any NaNs.
fmin - Element-wise minimum of two arrays, ignoring any NaNs.
argmin - Return the indices of the minimum values.
Notes
NaN values are propagated, that is if at least one item is NaN, the corresponding min value will be NaN as well. To ignore NaN values (MATLAB behavior), please use nanmin.
Don’t use amin for element-wise comparison of 2 arrays; when a.shape[0] is 2, minimum(a[0], a[1]) is faster than amin(a, axis=0).
Examples
>>> a = np.arange(4).reshape((2,2)) # doctest: +SKIP
>>> a # doctest: +SKIP
array([[0, 1],
       [2, 3]])
>>> np.amin(a) # Minimum of the flattened array # doctest: +SKIP
0
>>> np.amin(a, axis=0) # Minima along the first axis # doctest: +SKIP
array([0, 1])
>>> np.amin(a, axis=1) # Minima along the second axis # doctest: +SKIP
array([0, 2])
>>> np.amin(a, where=[False, True], initial=10, axis=0) # doctest: +SKIP
array([10, 1])
>>> b = np.arange(5, dtype=float) # doctest: +SKIP
>>> b[2] = np.nan # doctest: +SKIP
>>> np.amin(b) # doctest: +SKIP
nan
>>> np.amin(b, where=~np.isnan(b), initial=10) # doctest: +SKIP
0.0
>>> np.nanmin(b) # doctest: +SKIP
0.0
>>> np.min([[-50], [10]], axis=-1, initial=0) # doctest: +SKIP
array([-50, 0])
Notice that the initial value is used as one of the elements for which the minimum is determined, unlike the default argument of Python’s min function, which is only used for empty iterables.
>>> np.min([6], initial=5) # doctest: +SKIP
5
>>> min([6], default=5) # doctest: +SKIP
6
dask.array.argtopk(a, k, axis=-1, split_every=None)
Extract the indices of the k largest elements from a on the given axis, and return them sorted from largest to smallest. If k is negative, extract the indices of the -k smallest elements instead, and return them sorted from smallest to largest.
This performs best when k is much smaller than the chunk size. All results will be returned in a single chunk along the given axis.
Parameters: x: Array
Data being sorted
k: int
axis: int, optional
split_every: int >=2, optional
See topk(). The performance considerations for topk also apply here.
Returns: Selection of np.intp indices of x with size abs(k) along the given axis.
Examples
>>> import numpy as np
>>> import dask.array as da
>>> x = np.array([5, 1, 3, 6])
>>> d = da.from_array(x, chunks=2)
>>> d.argtopk(2).compute()
array([3, 0])
>>> d.argtopk(-2).compute()
array([1, 2])
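One way to sanity-check argtopk is to compare it with a full sort in NumPy. This is a minimal sketch, assuming Dask and NumPy are installed; the np.argsort cross-check is an illustration only and says nothing about how Dask computes the result internally.

```python
import numpy as np
import dask.array as da

x = np.array([5, 1, 3, 6])
d = da.from_array(x, chunks=2)

largest = d.argtopk(2).compute()    # indices of the 2 largest, largest first
smallest = d.argtopk(-2).compute()  # indices of the 2 smallest, smallest first

# Reference answer from a full ascending sort of the in-memory array.
order = np.argsort(x)
ref_largest = order[::-1][:2]
ref_smallest = order[:2]

# Indexing with the result recovers the values themselves.
values = x[largest]
```

For data with ties, the two approaches may order equal elements differently, so the comparison is only exact when the selected values are distinct, as they are here.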
dask.array.argwhere(a)
Find the indices of array elements that are non-zero, grouped by element.
This docstring was copied from numpy.argwhere.
Some inconsistencies with the Dask version may exist.
Parameters: a : array_like
Input data.
Returns: index_array : ndarray
Indices of elements that are non-zero. Indices are grouped by element.
Notes
np.argwhere(a) is the same as np.transpose(np.nonzero(a)).
The output of argwhere is not suitable for indexing arrays. For this purpose use nonzero(a) instead.
Examples
>>> x = np.arange(6).reshape(2,3) # doctest: +SKIP
>>> x # doctest: +SKIP
array([[0, 1, 2],
       [3, 4, 5]])
>>> np.argwhere(x>1) # doctest: +SKIP
array([[0, 2],
       [1, 0],
       [1, 1],
       [1, 2]])
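The two statements in the Notes can be checked side by side. This is a minimal sketch, assuming NumPy is installed; it uses the NumPy functions rather than the Dask wrappers.

```python
import numpy as np

x = np.arange(6).reshape(2, 3)
mask = x > 1

# argwhere returns one row per matching element: (row, column) pairs.
pairs = np.argwhere(mask)

# The same information, transposed, is what nonzero returns.
same_as_nonzero = np.array_equal(pairs, np.transpose(np.nonzero(mask)))

# nonzero returns one index array per dimension, which is the form
# that fancy indexing expects.
selected = x[np.nonzero(mask)]
```

The pairs array has shape (4, 2), while np.nonzero(mask) is a tuple of two length-4 arrays; only the latter selects the four matching elements when used as an index.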
dask.array.around(x, decimals=0)
Evenly round to the given number of decimals.
This docstring was copied from numpy.around.
Some inconsistencies with the Dask version may exist.
Parameters: a : array_like (Not supported in Dask)
Input data.
decimals : int, optional
Number of decimal places to round to (default: 0). If decimals is negative, it specifies the number of positions to the left of the decimal point.
out : ndarray, optional (Not supported in Dask)
Alternative output array in which to place the result. It must have the same shape as the expected output, but the type of the output values will be cast if necessary. See doc.ufuncs (Section “Output arguments”) for details.
Returns: rounded_array : ndarray
An array of the same type as a, containing the rounded values. Unless out was specified, a new array is created. A reference to the result is returned.
The real and imaginary parts of complex numbers are rounded separately. The result of rounding a float is a float.
Notes
For values exactly halfway between rounded decimal values, NumPy rounds to the nearest even value. Thus 1.5 and 2.5 round to 2.0, -0.5 and 0.5 round to 0.0, etc. Results may also be surprising due to the inexact representation of decimal fractions in the IEEE floating point standard [R124] and errors introduced when scaling by powers of ten.
References
[R124] “Lecture Notes on the Status of IEEE 754”, William Kahan, https://people.eecs.berkeley.edu/~wkahan/ieee754status/IEEE754.PDF
[R125] “How Futile are Mindless Assessments of Roundoff in Floating-Point Computation?”, William Kahan, https://people.eecs.berkeley.edu/~wkahan/Mindless.pdf
Examples
>>> np.around([0.37, 1.64]) # doctest: +SKIP
array([0., 2.])
>>> np.around([0.37, 1.64], decimals=1) # doctest: +SKIP
array([0.4, 1.6])
>>> np.around([.5, 1.5, 2.5, 3.5, 4.5]) # rounds to nearest even value # doctest: +SKIP
array([0., 2., 2., 4., 4.])
>>> np.around([1,2,3,11], decimals=1) # ndarray of ints is returned # doctest: +SKIP
array([ 1, 2, 3, 11])
>>> np.around([1,2,3,11], decimals=-1) # doctest: +SKIP
array([ 0, 0, 0, 10])
dask.array.array(object, dtype=None, copy=True, order='K', subok=False, ndmin=0)
This docstring was copied from numpy.array.
Some inconsistencies with the Dask version may exist.
Create an array.
Parameters: object : array_like
An array, any object exposing the array interface, an object whose __array__ method returns an array, or any (nested) sequence.
dtype : data-type, optional
The desired data-type for the array. If not given, then the type will be determined as the minimum type required to hold the objects in the sequence. This argument can only be used to ‘upcast’ the array. For downcasting, use the .astype(t) method.
copy : bool, optional
If true (default), then the object is copied. Otherwise, a copy will only be made if __array__ returns a copy, if obj is a nested sequence, or if a copy is needed to satisfy any of the other requirements (dtype, order, etc.).
order : {‘K’, ‘A’, ‘C’, ‘F’}, optional
Specify the memory layout of the array. If object is not an array, the newly created array will be in C order (row major) unless ‘F’ is specified, in which case it will be in Fortran order (column major). If object is an array the following holds.
order  no copy    copy=True
‘K’    unchanged  F & C order preserved, otherwise most similar order
‘A’    unchanged  F order if input is F and not C, otherwise C order
‘C’    C order    C order
‘F’    F order    F order

When copy=False and a copy is made for other reasons, the result is the same as if copy=True, with some exceptions for A, see the Notes section. The default order is ‘K’.
subok : bool, optional
If True, then sub-classes will be passed-through, otherwise the returned array will be forced to be a base-class array (default).
ndmin : int, optional
Specifies the minimum number of dimensions that the resulting array should have. Ones will be pre-pended to the shape as needed to meet this requirement.
Returns: out : ndarray
An array object satisfying the specified requirements.
See also
empty_like - Return an empty array with shape and type of input.
ones_like - Return an array of ones with shape and type of input.
zeros_like - Return an array of zeros with shape and type of input.
full_like - Return a new array with shape of input filled with value.
empty - Return a new uninitialized array.
ones - Return a new array setting values to one.
zeros - Return a new array setting values to zero.
full - Return a new array of given shape filled with value.
Notes
When order is ‘A’ and object is an array in neither ‘C’ nor ‘F’ order, and a copy is forced by a change in dtype, then the order of the result is not necessarily ‘C’ as expected. This is likely a bug.
Examples
>>> np.array([1, 2, 3]) # doctest: +SKIP
array([1, 2, 3])
Upcasting:
>>> np.array([1, 2, 3.0]) # doctest: +SKIP
array([ 1., 2., 3.])
More than one dimension:
>>> np.array([[1, 2], [3, 4]]) # doctest: +SKIP
array([[1, 2],
       [3, 4]])
Minimum dimensions 2:
>>> np.array([1, 2, 3], ndmin=2) # doctest: +SKIP
array([[1, 2, 3]])
Type provided:
>>> np.array([1, 2, 3], dtype=complex) # doctest: +SKIP
array([ 1.+0.j, 2.+0.j, 3.+0.j])
Data-type consisting of more than one element:
>>> x = np.array([(1,2),(3,4)],dtype=[('a','<i4'),('b','<i4')]) # doctest: +SKIP
>>> x['a'] # doctest: +SKIP
array([1, 3])
Creating an array from sub-classes (np.asmatrix is used here because the np.mat alias is not available in NumPy 2.0 and later):
>>> np.array(np.asmatrix('1 2; 3 4')) # doctest: +SKIP
array([[1, 2],
       [3, 4]])
>>> np.array(np.asmatrix('1 2; 3 4'), subok=True) # doctest: +SKIP
matrix([[1, 2],
        [3, 4]])
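The order table in the parameter description can be checked through the contiguity flags of the resulting arrays. This is a minimal sketch, assuming NumPy is installed and using the default copy=True; the layouts dictionary is only a convenience for this illustration.

```python
import numpy as np

c = np.arange(6).reshape(2, 3)   # a C-ordered (row-major) array
f = np.array(c, order="F")       # copy forced into Fortran order

k_copy = np.array(f, order="K")  # 'K': the F order of the input is preserved
a_copy = np.array(f, order="A")  # 'A': F order, since the input is F and not C
c_copy = np.array(f, order="C")  # 'C': always C order

# Record (C-contiguous, F-contiguous) for each array.
layouts = {
    name: (arr.flags.c_contiguous, arr.flags.f_contiguous)
    for name, arr in [("c", c), ("f", f), ("K", k_copy),
                      ("A", a_copy), ("C", c_copy)]
}
```

The memory layout changes but the values do not: every copy compares equal to c element for element.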
dask.array.asanyarray(a)
Convert the input to a dask array.
Subclasses of np.ndarray will be passed through as chunks unchanged.
Parameters: a : array-like
Input data, in any form that can be converted to a dask array.
Returns: out : dask array
Dask array interpretation of a.
Examples
>>> import dask.array as da
>>> import numpy as np
>>> x = np.arange(3)
>>> da.asanyarray(x)
dask.array<array, shape=(3,), dtype=int64, chunksize=(3,), chunktype=numpy.ndarray>
>>> y = [[1, 2, 3], [4, 5, 6]]
>>> da.asanyarray(y)
dask.array<array, shape=(2, 3), dtype=int64, chunksize=(2, 3), chunktype=numpy.ndarray>
dask.array.asarray(a, **kwargs)
Convert the input to a dask array.
Parameters: a : array-like
Input data, in any form that can be converted to a dask array.
Returns: out : dask array
Dask array interpretation of a.
Examples
>>> import dask.array as da
>>> import numpy as np
>>> x = np.arange(3)
>>> da.asarray(x)
dask.array<array, shape=(3,), dtype=int64, chunksize=(3,), chunktype=numpy.ndarray>
>>> y = [[1, 2, 3], [4, 5, 6]]
>>> da.asarray(y)
dask.array<array, shape=(2, 3), dtype=int64, chunksize=(2, 3), chunktype=numpy.ndarray>
dask.array.atleast_1d(*arys)
Convert inputs to arrays with at least one dimension.
This docstring was copied from numpy.atleast_1d.
Some inconsistencies with the Dask version may exist.
Scalar inputs are converted to 1-dimensional arrays, whilst higher-dimensional inputs are preserved.
Parameters: arys1, arys2, … : array_like
One or more input arrays.
Returns: ret : ndarray
An array, or list of arrays, each with a.ndim >= 1. Copies are made only if necessary.
Examples
>>> np.atleast_1d(1.0) # doctest: +SKIP
array([1.])
>>> x = np.arange(9.0).reshape(3,3) # doctest: +SKIP
>>> np.atleast_1d(x) # doctest: +SKIP
array([[0., 1., 2.],
       [3., 4., 5.],
       [6., 7., 8.]])
>>> np.atleast_1d(x) is x # doctest: +SKIP
True
>>> np.atleast_1d(1, [3, 4]) # doctest: +SKIP
[array([1]), array([3, 4])]
dask.array.atleast_2d(*arys)
View inputs as arrays with at least two dimensions.
This docstring was copied from numpy.atleast_2d.
Some inconsistencies with the Dask version may exist.
Parameters: arys1, arys2, … : array_like
One or more array-like sequences. Non-array inputs are converted to arrays. Arrays that already have two or more dimensions are preserved.
Returns: res1, res2, … : ndarray
An array, or list of arrays, each with a.ndim >= 2. Copies are avoided where possible, and views with two or more dimensions are returned.
Examples
>>> np.atleast_2d(3.0) # doctest: +SKIP
array([[3.]])
>>> x = np.arange(3.0) # doctest: +SKIP
>>> np.atleast_2d(x) # doctest: +SKIP
array([[0., 1., 2.]])
>>> np.atleast_2d(x).base is x # doctest: +SKIP
True
>>> np.atleast_2d(1, [1, 2], [[1, 2]]) # doctest: +SKIP
[array([[1]]), array([[1, 2]]), array([[1, 2]])]
dask.array.atleast_3d(*arys)
View inputs as arrays with at least three dimensions.
This docstring was copied from numpy.atleast_3d.
Some inconsistencies with the Dask version may exist.
Parameters: arys1, arys2, … : array_like
One or more array-like sequences. Non-array inputs are converted to arrays. Arrays that already have three or more dimensions are preserved.
Returns: res1, res2, … : ndarray
An array, or list of arrays, each with a.ndim >= 3. Copies are avoided where possible, and views with three or more dimensions are returned. For example, a 1-D array of shape (N,) becomes a view of shape (1, N, 1), and a 2-D array of shape (M, N) becomes a view of shape (M, N, 1).
Examples
>>> np.atleast_3d(3.0) # doctest: +SKIP
array([[[3.]]])
>>> x = np.arange(3.0) # doctest: +SKIP
>>> np.atleast_3d(x).shape # doctest: +SKIP
(1, 3, 1)
>>> x = np.arange(12.0).reshape(4,3) # doctest: +SKIP
>>> np.atleast_3d(x).shape # doctest: +SKIP
(4, 3, 1)
>>> np.atleast_3d(x).base is x.base # x is a reshape, so not base itself # doctest: +SKIP
True
>>> for arr in np.atleast_3d([1, 2], [[1, 2]], [[[1, 2]]]): # doctest: +SKIP
...     print(arr, arr.shape) # doctest: +SKIP
...
[[[1]
  [2]]] (1, 2, 1)
[[[1]
  [2]]] (1, 2, 1)
[[[1 2]]] (1, 1, 2)
dask.array.average(a, axis=None, weights=None, returned=False)
Compute the weighted average along the specified axis.
This docstring was copied from numpy.average.
Some inconsistencies with the Dask version may exist.
Parameters: a : array_like
Array containing data to be averaged. If a is not an array, a conversion is attempted.
axis : None or int or tuple of ints, optional
Axis or axes along which to average a. The default, axis=None, will average over all of the elements of the input array. If axis is negative it counts from the last to the first axis.
New in version 1.7.0.
If axis is a tuple of ints, averaging is performed on all of the axes specified in the tuple instead of a single axis or all the axes as before.
weights : array_like, optional
An array of weights associated with the values in a. Each value in a contributes to the average according to its associated weight. The weights array can either be 1-D (in which case its length must be the size of a along the given axis) or of the same shape as a. If weights=None, then all data in a are assumed to have a weight equal to one.
returned : bool, optional
Default is False. If True, the tuple (average, sum_of_weights) is returned, otherwise only the average is returned. If weights=None, sum_of_weights is equivalent to the number of elements over which the average is taken.
Returns: retval, [sum_of_weights] : array_type or double
Return the average along the specified axis. When returned is True, return a tuple with the average as the first element and the sum of the weights as the second element. sum_of_weights is of the same type as retval. The result dtype follows a general pattern. If weights is None, the result dtype will be that of a, or float64 if a is integral. Otherwise, if weights is not None and a is non-integral, the result type will be the type of lowest precision capable of representing values of both a and weights. If a happens to be integral, the previous rules still apply but the result dtype will at least be float64.
Raises: ZeroDivisionError
When all weights along axis are zero. See numpy.ma.average for a version robust to this type of error.
TypeError
When the length of 1D weights is not the same as the shape of a along axis.
See also
ma.average - average for masked arrays, useful if your data contains “missing” values
numpy.result_type - Returns the type that results from applying the numpy type promotion rules to the arguments.
Examples
>>> data = list(range(1,5)) # doctest: +SKIP
>>> data # doctest: +SKIP
[1, 2, 3, 4]
>>> np.average(data) # doctest: +SKIP
2.5
>>> np.average(range(1,11), weights=range(10,0,-1)) # doctest: +SKIP
4.0
>>> data = np.arange(6).reshape((3,2)) # doctest: +SKIP
>>> data # doctest: +SKIP
array([[0, 1],
       [2, 3],
       [4, 5]])
>>> np.average(data, axis=1, weights=[1./4, 3./4]) # doctest: +SKIP
array([0.75, 2.75, 4.75])
>>> np.average(data, weights=[1./4, 3./4]) # doctest: +SKIP
Traceback (most recent call last):
    ...
TypeError: Axis must be specified when shapes of a and weights differ.
>>> a = np.ones(5, dtype=np.float128) # doctest: +SKIP
>>> w = np.ones(5, dtype=np.complex64) # doctest: +SKIP
>>> avg = np.average(a, weights=w) # doctest: +SKIP
>>> print(avg.dtype) # doctest: +SKIP
complex256
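The weighted average is the weighted sum divided by the sum of the weights, which can be verified by hand. This is a minimal sketch, assuming NumPy is installed; it calls numpy.average rather than the Dask wrapper, and the manual variable exists only for the comparison.

```python
import numpy as np

data = np.arange(6).reshape(3, 2)
w = np.array([0.25, 0.75])  # one weight per column, applied along axis=1

# returned=True gives back the sum of the weights alongside the average.
avg, weight_sum = np.average(data, axis=1, weights=w, returned=True)

# The same quantity computed explicitly: sum(a * w) / sum(w) per row.
manual = (data * w).sum(axis=1) / w.sum()
```

Because the weights here already sum to one, weight_sum is 1.0 for every row and the division does not change the weighted sum.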
dask.array.bincount(x, weights=None, minlength=0)
This docstring was copied from numpy.bincount.
Some inconsistencies with the Dask version may exist.
Count number of occurrences of each value in array of non-negative ints.
The number of bins (of size 1) is one larger than the largest value in x. If minlength is specified, there will be at least this number of bins in the output array (though it will be longer if necessary, depending on the contents of x). Each bin gives the number of occurrences of its index value in x. If weights is specified the input array is weighted by it, i.e. if a value n is found at position i, out[n] += weight[i] instead of out[n] += 1.
Parameters: x : array_like, 1 dimension, nonnegative ints
Input array.
weights : array_like, optional
Weights, array of the same shape as x.
minlength : int, optional
A minimum number of bins for the output array.
New in version 1.6.0.
Returns: out : ndarray of ints
The result of binning the input array. The length of out is equal to np.amax(x)+1.
Raises: ValueError
If the input is not 1-dimensional, or contains elements with negative values, or if minlength is negative.
TypeError
If the type of the input is float or complex.
Examples
>>> np.bincount(np.arange(5)) # doctest: +SKIP
array([1, 1, 1, 1, 1])
>>> np.bincount(np.array([0, 1, 1, 3, 2, 1, 7])) # doctest: +SKIP
array([1, 3, 1, 1, 0, 0, 0, 1])
>>> x = np.array([0, 1, 1, 3, 2, 1, 7, 23]) # doctest: +SKIP
>>> np.bincount(x).size == np.amax(x)+1 # doctest: +SKIP
True
The input array needs to be of integer dtype, otherwise a TypeError is raised:
>>> np.bincount(np.arange(5, dtype=float)) # doctest: +SKIP
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: array cannot be safely cast to required type
A possible use of bincount is to perform sums over variable-size chunks of an array, using the weights keyword.
>>> w = np.array([0.3, 0.5, 0.2, 0.7, 1., -0.6]) # weights # doctest: +SKIP
>>> x = np.array([0, 1, 1, 2, 2, 2]) # doctest: +SKIP
>>> np.bincount(x, weights=w) # doctest: +SKIP
array([ 0.3, 0.7, 1.1])
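The accumulation rule described at the top of this entry (out[n] += weight[i] instead of out[n] += 1) can be written as a short loop. This is a minimal sketch for illustration only: bincount_sketch is a hypothetical helper, not a NumPy or Dask function, and it does none of the input validation that numpy.bincount performs.

```python
import numpy as np

def bincount_sketch(x, weights=None, minlength=0):
    """Pure-Python illustration of the bincount accumulation rule."""
    # One bin per integer from 0 up to max(x), padded to at least minlength.
    n_bins = max(max(x, default=-1) + 1, minlength)
    out = [0.0] * n_bins if weights is not None else [0] * n_bins
    for i, n in enumerate(x):
        # Add the weight at position i if given, otherwise count 1.
        out[n] += weights[i] if weights is not None else 1
    return out

x = [0, 1, 1, 2, 2, 2]
w = [0.3, 0.5, 0.2, 0.7, 1.0, -0.6]

counts = bincount_sketch(x)
sums = bincount_sketch(x, weights=w)
padded = bincount_sketch(x, minlength=5)
```

With minlength=5 the result has five bins even though the largest value in x is 2, which matches the "at least this number of bins" wording above.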
dask.array.bitwise_and(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.bitwise_and.
Some inconsistencies with the Dask version may exist.
Compute the bit-wise AND of two arrays element-wise.
Computes the bit-wise AND of the underlying binary representation of the integers in the input arrays. This ufunc implements the C/Python operator &.
Parameters: x1, x2 : array_like
Only integer and boolean types are handled. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: out : ndarray or scalar
Result. This is a scalar if both x1 and x2 are scalars.
See also
logical_and, bitwise_or, bitwise_xor
binary_repr - Return the binary representation of the input number as a string.
Examples
The number 13 is represented by 00001101. Likewise, 17 is represented by 00010001. The bit-wise AND of 13 and 17 is therefore 00000001, or 1:
>>> np.bitwise_and(13, 17) # doctest: +SKIP
1
>>> np.bitwise_and(14, 13) # doctest: +SKIP
12
>>> np.binary_repr(12) # doctest: +SKIP
'1100'
>>> np.bitwise_and([14,3], 13) # doctest: +SKIP
array([12, 1])
>>> np.bitwise_and([11,7], [4,25]) # doctest: +SKIP
array([0, 1])
>>> np.bitwise_and(np.array([2,5,255]), np.array([3,14,16])) # doctest: +SKIP
array([ 2, 4, 16])
>>> np.bitwise_and([True, True], [False, True]) # doctest: +SKIP
array([False, True])
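The equivalence with the & operator and the broadcasting rule for mismatched shapes can be seen in a few lines. This is a minimal sketch, assuming NumPy is installed; the input values are arbitrary and the NumPy function is used rather than the Dask wrapper.

```python
import numpy as np

a = np.array([13, 14, 11], dtype=np.uint8)
b = np.array([17, 13, 4], dtype=np.uint8)

# The ufunc and the & operator give the same element-wise result.
via_func = np.bitwise_and(a, b)
via_op = a & b

# A scalar has shape () and broadcasts against the array of shape (3,).
with_scalar = np.bitwise_and(a, 13)

# Fixed-width binary strings make the surviving bits visible.
bits = [np.binary_repr(int(v), width=8) for v in via_func]
```

Only bit positions set in both operands survive, so 11 (00001011) AND 4 (00000100) is 0.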
dask.array.bitwise_not(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.invert.
Some inconsistencies with the Dask version may exist.
Compute bit-wise inversion, or bit-wise NOT, element-wise.
Computes the bit-wise NOT of the underlying binary representation of the integers in the input arrays. This ufunc implements the C/Python operator ~.
For signed integer inputs, the two’s complement is returned. In a two’s-complement system negative numbers are represented by the two’s complement of the absolute value. This is the most common method of representing signed integers on computers [R126]. An N-bit two’s-complement system can represent every integer in the range \(-2^{N-1}\) to \(+2^{N-1}-1\).
Parameters: x : array_like
Only integer and boolean types are handled.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: out : ndarray or scalar
Result. This is a scalar if x is a scalar.
See also
bitwise_and, bitwise_or, bitwise_xor, logical_not
binary_repr - Return the binary representation of the input number as a string.
Notes
bitwise_not is an alias for invert:
>>> np.bitwise_not is np.invert # doctest: +SKIP
True
References
[R126] Wikipedia, “Two’s complement”, https://en.wikipedia.org/wiki/Two's_complement
Examples
We’ve seen that 13 is represented by 00001101. The invert or bit-wise NOT of 13 is then:
>>> x = np.invert(np.array(13, dtype=np.uint8)) # doctest: +SKIP
>>> x # doctest: +SKIP
242
>>> np.binary_repr(x, width=8) # doctest: +SKIP
'11110010'
The result depends on the bit-width:
>>> x = np.invert(np.array(13, dtype=np.uint16)) # doctest: +SKIP
>>> x # doctest: +SKIP
65522
>>> np.binary_repr(x, width=16) # doctest: +SKIP
'1111111111110010'
When using signed integer types the result is the two’s complement of the result for the unsigned type:
>>> np.invert(np.array([13], dtype=np.int8)) # doctest: +SKIP
array([-14], dtype=int8)
>>> np.binary_repr(-14, width=8) # doctest: +SKIP
'11110010'
Booleans are accepted as well:
>>> np.invert(np.array([True, False])) # doctest: +SKIP
array([False, True])
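The dependence on signedness and bit-width reduces to two arithmetic identities: for signed integers ~x equals -x - 1, and for an unsigned N-bit integer ~x equals (2^N - 1) - x. This is a minimal sketch, assuming NumPy is installed; the sample values are arbitrary.

```python
import numpy as np

signed = np.array([13, 0, -1, 127], dtype=np.int8)
unsigned = np.array([13, 0, 255], dtype=np.uint8)

inv_signed = np.invert(signed)      # same as ~signed
inv_unsigned = np.invert(unsigned)  # same as ~unsigned

# Two's complement: flipping every bit of x gives -x - 1.
signed_identity = np.array_equal(inv_signed, -signed - 1)

# Unsigned 8-bit: flipping every bit of x gives 255 - x.
unsigned_identity = np.array_equal(inv_unsigned, 255 - unsigned)
```

This is why inverting 13 yields -14 as int8 but 242 as uint8: the bit pattern 11110010 is the same and only its interpretation differs.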
dask.array.bitwise_or(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.bitwise_or.
Some inconsistencies with the Dask version may exist.
Compute the bit-wise OR of two arrays element-wise.
Computes the bit-wise OR of the underlying binary representation of the integers in the input arrays. This ufunc implements the C/Python operator |.
Parameters: x1, x2 : array_like
Only integer and boolean types are handled. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: out : ndarray or scalar
Result. This is a scalar if both x1 and x2 are scalars.
See also
logical_or, bitwise_and, bitwise_xor
binary_repr - Return the binary representation of the input number as a string.
Examples
The number 13 has the binary representation 00001101. Likewise, 16 is represented by 00010000. The bit-wise OR of 13 and 16 is then 00011101, or 29:
>>> np.bitwise_or(13, 16) # doctest: +SKIP
29
>>> np.binary_repr(29) # doctest: +SKIP
'11101'
>>> np.bitwise_or(32, 2) # doctest: +SKIP
34
>>> np.bitwise_or([33, 4], 1) # doctest: +SKIP
array([33, 5])
>>> np.bitwise_or([33, 4], [1, 2]) # doctest: +SKIP
array([33, 6])
>>> np.bitwise_or(np.array([2, 5, 255]), np.array([4, 4, 4])) # doctest: +SKIP
array([ 6, 5, 255])
>>> np.array([2, 5, 255]) | np.array([4, 4, 4]) # doctest: +SKIP
array([ 6, 5, 255])
>>> np.bitwise_or(np.array([2, 5, 255, 2147483647], dtype=np.int32), # doctest: +SKIP
...               np.array([4, 4, 4, 2147483647], dtype=np.int32))
array([ 6, 5, 255, 2147483647])
>>> np.bitwise_or([True, True], [False, True]) # doctest: +SKIP
array([ True, True])
-
dask.array.bitwise_xor(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.bitwise_xor.
Some inconsistencies with the Dask version may exist.
Compute the bit-wise XOR of two arrays element-wise.
Computes the bit-wise XOR of the underlying binary representation of the integers in the input arrays. This ufunc implements the C/Python operator ^.
Parameters: x1, x2 : array_like
Only integer and boolean types are handled. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default
out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: out : ndarray or scalar
Result. This is a scalar if both x1 and x2 are scalars.
See also
logical_xor, bitwise_and, bitwise_or
binary_repr
- Return the binary representation of the input number as a string.
Examples
The number 13 is represented by 00001101. Likewise, 17 is represented by 00010001. The bit-wise XOR of 13 and 17 is therefore 00011100, or 28:
>>> np.bitwise_xor(13, 17) # doctest: +SKIP
28
>>> np.binary_repr(28) # doctest: +SKIP
'11100'
>>> np.bitwise_xor(31, 5) # doctest: +SKIP
26
>>> np.bitwise_xor([31,3], 5) # doctest: +SKIP
array([26, 6])
>>> np.bitwise_xor([31,3], [5,6]) # doctest: +SKIP
array([26, 5])
>>> np.bitwise_xor([True, True], [False, True]) # doctest: +SKIP
array([ True, False])
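A short sketch of the same XOR on Dask arrays, assuming inputs wrapped with da.from_array and one element per chunk (an arbitrary choice). It also illustrates that applying the same mask twice restores the original values.

```python
import numpy as np
import dask.array as da

a = da.from_array(np.array([31, 3]), chunks=1)
b = da.from_array(np.array([5, 6]), chunks=1)

# Element-wise XOR, built lazily.
xored = da.bitwise_xor(a, b)

# XOR is its own inverse: applying the same mask again restores the input.
restored = da.bitwise_xor(xored, b)

xored_np = xored.compute()
restored_np = restored.compute()
print(xored_np, restored_np)
```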
-
dask.array.block(arrays, allow_unknown_chunksizes=False)
Assemble an nd-array from nested lists of blocks.
Blocks in the innermost lists are concatenated along the last dimension (-1), then these are concatenated along the second-last dimension (-2), and so on until the outermost list is reached.
Blocks can be of any dimension, but will not be broadcasted using the normal rules. Instead, leading axes of size 1 are inserted, to make block.ndim the same for all blocks. This is primarily useful for working with scalars, and means that code like block([v, 1]) is valid, where v.ndim == 1.
When the nested list is two levels deep, this allows block matrices to be constructed from their components.
Parameters: arrays : nested list of array_like or scalars (but not tuples)
If passed a single ndarray or scalar (a nested list of depth 0), this is returned unmodified (and not copied).
Elements shapes must match along the appropriate axes (without broadcasting), but leading 1s will be prepended to the shape as necessary to make the dimensions match.
allow_unknown_chunksizes: bool
Allow unknown chunksizes, such as come from converting from dask dataframes. Dask.array is unable to verify that chunks line up. If data comes from differently aligned sources then this can cause unexpected results.
Returns: block_array : ndarray
The array assembled from the given blocks.
The dimensionality of the output is equal to the greatest of:
- the dimensionality of all the inputs
- the depth to which the input list is nested
Raises: ValueError
- If list depths are mismatched - for instance, [[a, b], c] is illegal, and should be spelt [[a, b], [c]]
- If lists are empty - for instance, [[a, b], []]
See also
concatenate
- Join a sequence of arrays together.
stack
- Stack arrays in sequence along a new dimension.
hstack
- Stack arrays in sequence horizontally (column wise).
vstack
- Stack arrays in sequence vertically (row wise).
dstack
- Stack arrays in sequence depth wise (along third dimension).
vsplit
- Split array into a list of multiple sub-arrays vertically.
Notes
When called with only scalars, block is equivalent to an ndarray call. So block([[1, 2], [3, 4]]) is equivalent to array([[1, 2], [3, 4]]).
This function does not enforce that the blocks lie on a fixed grid. block([[a, b], [c, d]]) is not restricted to arrays of the form:
AAAbb
AAAbb
cccDD
But is also allowed to produce, for some a, b, c, d:
AAAbb
AAAbb
cDDDD
Since concatenation happens along the last axis first, block is _not_ capable of producing the following directly:
AAAbb
cccbb
cccDD
Matlab’s “square bracket stacking”, [A, B, ...; p, q, ...], is equivalent to block([[A, B, ...], [p, q, ...]]).
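The two-level case can be sketched as follows. The block shapes and the names a, b, c, d are assumptions chosen so that the result matches the first layout pictured above (a 2x3 block, a 2x2 block, a 1x3 block and a 1x2 block).

```python
import dask.array as da

# Four blocks whose shapes line up on a fixed grid.
a = da.ones((2, 3), chunks=(2, 3))
b = da.zeros((2, 2), chunks=(2, 2))
c = da.zeros((1, 3), chunks=(1, 3))
d = da.ones((1, 2), chunks=(1, 2))

# Inner lists are joined along the last axis, then the rows are stacked.
m = da.block([[a, b], [c, d]])

m_np = m.compute()
print(m.shape)
print(m_np)
```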
-
dask.array.blockwise(func, out_ind, *args, name=None, token=None, dtype=None, adjust_chunks=None, new_axes=None, align_arrays=True, concatenate=None, meta=None, **kwargs)
Tensor operation: Generalized inner and outer products
A broad class of blocked algorithms and patterns can be specified with a concise multi-index notation. The blockwise function applies an in-memory function across multiple blocks of multiple inputs in a variety of ways. Many dask.array operations are special cases of blockwise including elementwise, broadcasting, reductions, tensordot, and transpose.
Parameters: func : callable
Function to apply to individual tuples of blocks
out_ind : iterable
Block pattern of the output, something like ‘ijk’ or (1, 2, 3)
*args : sequence of Array, index pairs
Sequence like (x, ‘ij’, y, ‘jk’, z, ‘i’)
**kwargs : dict
Extra keyword arguments to pass to function
dtype : np.dtype
Datatype of resulting array.
concatenate : bool, keyword only
If true concatenate arrays along dummy indices, else provide lists
adjust_chunks : dict
Dictionary mapping index to function to be applied to chunk sizes
new_axes : dict, keyword only
New indexes and their dimension lengths
Examples
2D embarrassingly parallel operation from two arrays, x, and y.
>>> z = blockwise(operator.add, 'ij', x, 'ij', y, 'ij', dtype='f8') # z = x + y # doctest: +SKIP
Outer product multiplying x by y, two 1-d vectors
>>> z = blockwise(np.outer, 'ij', x, 'i', y, 'j', dtype='f8') # doctest: +SKIP
z = x.T
>>> z = blockwise(np.transpose, 'ji', x, 'ij', dtype=x.dtype) # doctest: +SKIP
The transpose case above is illustrative because it does the same transposition both on each in-memory block, by calling np.transpose, and on the order of the blocks themselves, by switching the order of the index ij -> ji.
We can compose these same patterns with more variables and more complex in-memory functions
z = X + Y.T
>>> z = blockwise(lambda x, y: x + y.T, 'ij', x, 'ij', y, 'ji', dtype='f8') # doctest: +SKIP
Any index, like i, missing from the output index is interpreted as a contraction (note that this differs from Einstein convention; repeated indices do not imply contraction). In the case of a contraction the passed function should expect an iterable of blocks on any array that holds that index. To receive arrays concatenated along contracted dimensions instead pass concatenate=True.
Inner product multiplying x by y, two 1-d vectors
>>> def sequence_dot(x_blocks, y_blocks):
...     result = 0
...     for x, y in zip(x_blocks, y_blocks):
...         result += x.dot(y)
...     return result
>>> z = blockwise(sequence_dot, '', x, 'i', y, 'i', dtype='f8') # doctest: +SKIP
Add new single-chunk dimensions with the new_axes= keyword, including the length of the new dimension. New dimensions will always be in a single chunk.
>>> def f(x):
...     return x[:, None] * np.ones((1, 5))
>>> z = blockwise(f, 'az', x, 'a', new_axes={'z': 5}, dtype=x.dtype) # doctest: +SKIP
New dimensions can also be multi-chunk by specifying a tuple of chunk sizes. This has limited utility as is (because the chunks are all the same), but the resulting graph can be modified to achieve more useful results (see da.map_blocks).
>>> z = blockwise(f, 'az', x, 'a', new_axes={'z': (5, 5)}, dtype=x.dtype) # doctest: +SKIP
If the applied function changes the size of each chunk you can specify this with an adjust_chunks={...} dictionary holding a function for each index that modifies the dimension size in that index.
>>> def double(x):
...     return np.concatenate([x, x])
>>> y = blockwise(double, 'ij', x, 'ij',
...               adjust_chunks={'i': lambda n: 2 * n}, dtype=x.dtype) # doctest: +SKIP
Include literals by indexing with None
>>> y = blockwise(operator.add, 'ij', x, 'ij', 1234, None, dtype=x.dtype) # doctest: +SKIP
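A runnable sketch of two of the patterns above, the outer product and the transpose. The input arrays and chunk sizes are assumptions made for illustration, and np.outer is used as the in-memory function so that each pair of 1-d blocks yields a 2-d block.

```python
import numpy as np
import dask.array as da

x_np = np.arange(4.0)
y_np = np.arange(6.0)
x = da.from_array(x_np, chunks=2)
y = da.from_array(y_np, chunks=3)

# Output index 'ij' keeps both input indices, so no contraction happens;
# np.outer is applied to every (x block, y block) pair.
outer = da.blockwise(np.outer, "ij", x, "i", y, "j", dtype="f8")

# Transpose: np.transpose flips each block, and 'ij' -> 'ji' flips the block order.
m_np = np.arange(6.0).reshape(2, 3)
m = da.from_array(m_np, chunks=(1, 3))
transposed = da.blockwise(np.transpose, "ji", m, "ij", dtype=m.dtype)

outer_np = outer.compute()
transposed_np = transposed.compute()
print(outer.shape, transposed.shape)
```

In both cases the result should agree with the plain NumPy call on the unchunked inputs.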
-
dask.array.broadcast_arrays(*args, **kwargs)
Broadcast any number of arrays against each other.
This docstring was copied from numpy.broadcast_arrays.
Some inconsistencies with the Dask version may exist.
Parameters: `*args` : array_likes
The arrays to broadcast.
subok : bool, optional
If True, then sub-classes will be passed-through, otherwise the returned arrays will be forced to be a base-class array (default).
Returns: broadcasted : list of arrays
These arrays are views on the original arrays. They are typically not contiguous. Furthermore, more than one element of a broadcasted array may refer to a single memory location. If you need to write to the arrays, make copies first. While you can set the writable flag True, writing to a single output value may end up changing more than one location in the output array.
Deprecated since version 1.17: The output is currently marked so that if written to, a deprecation warning will be emitted. A future version will set the writable flag False so writing to it will raise an error.
Examples
>>> x = np.array([[1,2,3]]) # doctest: +SKIP
>>> y = np.array([[4],[5]]) # doctest: +SKIP
>>> np.broadcast_arrays(x, y) # doctest: +SKIP
[array([[1, 2, 3],
        [1, 2, 3]]), array([[4, 4, 4],
        [5, 5, 5]])]
Here is a useful idiom for getting contiguous copies instead of non-contiguous views.
>>> [np.array(a) for a in np.broadcast_arrays(x, y)] # doctest: +SKIP
[array([[1, 2, 3],
        [1, 2, 3]]), array([[4, 4, 4],
        [5, 5, 5]])]
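The same inputs can be broadcast as Dask arrays. This is a minimal sketch, assuming the arrays are wrapped with da.from_array using one chunk each; the names bx and by are illustrative.

```python
import numpy as np
import dask.array as da

x = da.from_array(np.array([[1, 2, 3]]), chunks=(1, 3))
y = da.from_array(np.array([[4], [5]]), chunks=(2, 1))

# Both results share the common broadcast shape (2, 3).
bx, by = da.broadcast_arrays(x, y)

bx_np = bx.compute()
by_np = by.compute()
print(bx.shape, by.shape)
```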
-
dask.array.broadcast_to(x, shape, chunks=None)
Broadcast an array to a new shape.
Parameters: x : array_like
The array to broadcast.
shape : tuple
The shape of the desired array.
chunks : tuple, optional
If provided, then the result will use these chunks instead of the same chunks as the source array. Setting chunks explicitly as part of broadcast_to is more efficient than rechunking afterwards. Chunks are only allowed to differ from the original shape along dimensions that are new on the result or have size 1 in the input array.
Returns: broadcast : dask array
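A small sketch of the chunks argument, assuming a 1-d input of length 3 broadcast to shape (4, 3). Only the new leading dimension is given a different chunking; the existing dimension keeps its original chunk size, as the parameter description requires.

```python
import numpy as np
import dask.array as da

x = da.from_array(np.array([1, 2, 3]), chunks=3)

# Axis 0 is new on the result, so it may be split into chunks of 2.
# Axis 1 already exists with size 3, so its chunking must stay (3,).
b = da.broadcast_to(x, (4, 3), chunks=(2, 3))

b_np = b.compute()
print(b.shape, b.chunks)
```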
-
dask.array.coarsen(reduction, x, axes, trim_excess=False, **kwargs)
Coarsen array by applying reduction to fixed size neighborhoods
Parameters: reduction: function
Function like np.sum, np.mean, etc…
x: np.ndarray
Array to be coarsened
axes: dict
Mapping of axis to coarsening factor
Examples
>>> x = np.array([1, 2, 3, 4, 5, 6])
>>> coarsen(np.sum, x, {0: 2})
array([ 3, 7, 11])
>>> coarsen(np.max, x, {0: 3})
array([3, 6])
Provide dictionary of scale per dimension
>>> x = np.arange(24).reshape((4, 6))
>>> x
array([[ 0, 1, 2, 3, 4, 5],
       [ 6, 7, 8, 9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])
>>> coarsen(np.min, x, {0: 2, 1: 3})
array([[ 0, 3],
       [12, 15]])
You must avoid excess elements explicitly
>>> x = np.array([1, 2, 3, 4, 5, 6, 7, 8])
>>> coarsen(np.min, x, {0: 3}, trim_excess=True)
array([1, 4])
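The examples above operate on NumPy arrays. A sketch of the 2-d case on a Dask array follows, assuming chunk sizes (2, 3) that are multiples of the coarsening factors so that every neighborhood falls inside a single chunk.

```python
import numpy as np
import dask.array as da

# Chunks of (2, 3) align with the coarsening factors {0: 2, 1: 3}.
x = da.from_array(np.arange(24).reshape((4, 6)), chunks=(2, 3))

# Reduce each 2x3 neighborhood to its minimum.
coarse = da.coarsen(np.min, x, {0: 2, 1: 3})

coarse_np = coarse.compute()
print(coarse_np)
```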
-
dask.array.ceil(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.ceil.
Some inconsistencies with the Dask version may exist.
Return the ceiling of the input, element-wise.
The ceil of the scalar x is the smallest integer i, such that i >= x. It is often denoted as \(\lceil x \rceil\).
Parameters: x : array_like
Input data.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default
out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : ndarray or scalar
The ceiling of each element in x, with float dtype. This is a scalar if x is a scalar.
Examples
>>> a = np.array([-1.7, -1.5, -0.2, 0.2, 1.5, 1.7, 2.0]) # doctest: +SKIP
>>> np.ceil(a) # doctest: +SKIP
array([-1., -1., -0., 1., 2., 2., 2.])
-
dask.array.choose(a, choices)
Construct an array from an index array and a set of arrays to choose from.
This docstring was copied from numpy.choose.
Some inconsistencies with the Dask version may exist.
First of all, if confused or uncertain, definitely look at the Examples - in its full generality, this function is less simple than it might seem from the following code description (below ndi = numpy.lib.index_tricks):
np.choose(a,c) == np.array([c[a[I]][I] for I in ndi.ndindex(a.shape)]).
But this omits some subtleties. Here is a fully general summary:
Given an “index” array (a) of integers and a sequence of n arrays (choices), a and each choice array are first broadcast, as necessary, to arrays of a common shape; calling these Ba and Bchoices[i], i = 0,…,n-1 we have that, necessarily, Ba.shape == Bchoices[i].shape for each i. Then, a new array with shape Ba.shape is created as follows:
- if mode=raise (the default), then, first of all, each element of a (and thus Ba) must be in the range [0, n-1]; now, suppose that i (in that range) is the value at the (j0, j1, …, jm) position in Ba - then the value at the same position in the new array is the value in Bchoices[i] at that same position;
- if mode=wrap, values in a (and thus Ba) may be any (signed) integer; modular arithmetic is used to map integers outside the range [0, n-1] back into that range; and then the new array is constructed as above;
- if mode=clip, values in a (and thus Ba) may be any (signed) integer; negative integers are mapped to 0; values greater than n-1 are mapped to n-1; and then the new array is constructed as above.
Parameters: a : int array
This array must contain integers in [0, n-1], where n is the number of choices, unless mode=wrap or mode=clip, in which cases any integers are permissible.
choices : sequence of arrays
Choice arrays. a and all of the choices must be broadcastable to the same shape. If choices is itself an array (not recommended), then its outermost dimension (i.e., the one corresponding to choices.shape[0]) is taken as defining the “sequence”.
out : array, optional (Not supported in Dask)
If provided, the result will be inserted into this array. It should be of the appropriate shape and dtype. Note that out is always buffered if mode=’raise’; use other modes for better performance.
mode : {‘raise’ (default), ‘wrap’, ‘clip’}, optional (Not supported in Dask)
Specifies how indices outside [0, n-1] will be treated:
- ‘raise’ : an exception is raised
- ‘wrap’ : value becomes value mod n
- ‘clip’ : values < 0 are mapped to 0, values > n-1 are mapped to n-1
Returns: merged_array : array
The merged result.
Raises: ValueError: shape mismatch
If a and each choice array are not all broadcastable to the same shape.
See also
ndarray.choose
- equivalent method
Notes
To reduce the chance of misinterpretation, even though the following “abuse” is nominally supported, choices should neither be, nor be thought of as, a single array, i.e., the outermost sequence-like container should be either a list or a tuple.
Examples
>>> choices = [[0, 1, 2, 3], [10, 11, 12, 13], # doctest: +SKIP
...   [20, 21, 22, 23], [30, 31, 32, 33]]
>>> np.choose([2, 3, 1, 0], choices # doctest: +SKIP
... # the first element of the result will be the first element of the
... # third (2+1) "array" in choices, namely, 20; the second element
... # will be the second element of the fourth (3+1) choice array, i.e.,
... # 31, etc.
... )
array([20, 31, 12, 3])
>>> np.choose([2, 4, 1, 0], choices, mode='clip') # 4 goes to 3 (4-1) # doctest: +SKIP
array([20, 31, 12, 3])
>>> # because there are 4 choice arrays
>>> np.choose([2, 4, 1, 0], choices, mode='wrap') # 4 goes to (4 mod 4) # doctest: +SKIP
array([20, 1, 12, 3])
>>> # i.e., 0
A couple examples illustrating how choose broadcasts:
>>> a = [[1, 0, 1], [0, 1, 0], [1, 0, 1]] # doctest: +SKIP
>>> choices = [-10, 10] # doctest: +SKIP
>>> np.choose(a, choices) # doctest: +SKIP
array([[ 10, -10, 10],
       [-10, 10, -10],
       [ 10, -10, 10]])
>>> # With thanks to Anne Archibald
>>> a = np.array([0, 1]).reshape((2,1,1)) # doctest: +SKIP
>>> c1 = np.array([1, 2, 3]).reshape((1,3,1)) # doctest: +SKIP
>>> c2 = np.array([-1, -2, -3, -4, -5]).reshape((1,1,5)) # doctest: +SKIP
>>> np.choose(a, (c1, c2)) # result is 2x3x5, res[0,:,:]=c1, res[1,:,:]=c2 # doctest: +SKIP
array([[[ 1, 1, 1, 1, 1],
        [ 2, 2, 2, 2, 2],
        [ 3, 3, 3, 3, 3]],
       [[-1, -2, -3, -4, -5],
        [-1, -2, -3, -4, -5],
        [-1, -2, -3, -4, -5]]])
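A sketch of the first example with Dask arrays, using only the (a, choices) signature shown for dask.array.choose. Wrapping each choice row with da.from_array and the chunk size of 2 are assumptions made for illustration.

```python
import numpy as np
import dask.array as da

rows = [[0, 1, 2, 3], [10, 11, 12, 13], [20, 21, 22, 23], [30, 31, 32, 33]]

# The index array and every choice array share the same shape and chunking.
index = da.from_array(np.array([2, 3, 1, 0]), chunks=2)
choices = [da.from_array(np.array(row), chunks=2) for row in rows]

# Position k of the result is taken from choices[index[k]][k].
picked = da.choose(index, choices)

picked_np = picked.compute()
print(picked_np)
```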
-
dask.array.clip(*args, **kwargs)
Clip (limit) the values in an array.
This docstring was copied from numpy.clip.
Some inconsistencies with the Dask version may exist.
Given an interval, values outside the interval are clipped to the interval edges. For example, if an interval of [0, 1] is specified, values smaller than 0 become 0, and values larger than 1 become 1.
Equivalent to but faster than np.maximum(a_min, np.minimum(a, a_max)). No check is performed to ensure a_min < a_max.
Parameters: a : array_like (Not supported in Dask)
Array containing elements to clip.
a_min : scalar or array_like or None (Not supported in Dask)
Minimum value. If None, clipping is not performed on lower interval edge. Not more than one of a_min and a_max may be None.
a_max : scalar or array_like or None (Not supported in Dask)
Maximum value. If None, clipping is not performed on upper interval edge. Not more than one of a_min and a_max may be None. If a_min or a_max are array_like, then the three arrays will be broadcasted to match their shapes.
out : ndarray, optional (Not supported in Dask)
The results will be placed in this array. It may be the input array for in-place clipping. out must be of the right shape to hold the output. Its type is preserved.
**kwargs
For other keyword-only arguments, see the ufunc docs.
New in version 1.17.0.
Returns: clipped_array : ndarray
An array with the elements of a, but where values < a_min are replaced with a_min, and those > a_max with a_max.
See also
numpy.doc.ufuncs
- Section “Output arguments”
Examples
>>> a = np.arange(10) # doctest: +SKIP
>>> np.clip(a, 1, 8) # doctest: +SKIP
array([1, 1, 2, 3, 4, 5, 6, 7, 8, 8])
>>> a # doctest: +SKIP
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> np.clip(a, 3, 6, out=a) # doctest: +SKIP
array([3, 3, 3, 3, 4, 5, 6, 6, 6, 6])
>>> a = np.arange(10) # doctest: +SKIP
>>> a # doctest: +SKIP
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> np.clip(a, [3, 4, 1, 1, 1, 4, 4, 4, 4, 4], 8) # doctest: +SKIP
array([3, 4, 2, 3, 4, 5, 6, 7, 8, 8])
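A minimal sketch of clipping a Dask array with scalar bounds passed positionally, mirroring the first NumPy example. The chunk size of 5 is an arbitrary assumption, and the out= form shown above is not used because out is marked as not supported in Dask.

```python
import dask.array as da

# A lazily evaluated range 0..9 split into two chunks.
x = da.arange(10, chunks=5)

# Values below 1 become 1 and values above 8 become 8.
clipped = da.clip(x, 1, 8)

clipped_np = clipped.compute()
print(clipped_np)
```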
-
dask.array.compress(condition, a, axis=None)
Return selected slices of an array along given axis.
This docstring was copied from numpy.compress.
Some inconsistencies with the Dask version may exist.
When working along a given axis, a slice along that axis is returned in output for each index where condition evaluates to True. When working on a 1-D array, compress is equivalent to extract.
Parameters: condition : 1-D array of bools
Array that selects which entries to return. If len(condition) is less than the size of a along the given axis, then output is truncated to the length of the condition array.
a : array_like
Array from which to extract a part.
axis : int, optional
Axis along which to take slices. If None (default), work on the flattened array.
out : ndarray, optional (Not supported in Dask)
Output array. Its type is preserved and it must be of the right shape to hold the output.
Returns: compressed_array : ndarray
A copy of a without the slices along axis for which condition is false.
See also
take, choose, diag, diagonal, select
ndarray.compress
- Equivalent method in ndarray
np.extract
- Equivalent method when working on 1-D arrays
numpy.doc.ufuncs
- Section “Output arguments”
Examples
>>> a = np.array([[1, 2], [3, 4], [5, 6]]) # doctest: +SKIP
>>> a # doctest: +SKIP
array([[1, 2],
       [3, 4],
       [5, 6]])
>>> np.compress([0, 1], a, axis=0) # doctest: +SKIP
array([[3, 4]])
>>> np.compress([False, True, True], a, axis=0) # doctest: +SKIP
array([[3, 4],
       [5, 6]])
>>> np.compress([False, True], a, axis=1) # doctest: +SKIP
array([[2],
       [4],
       [6]])
Working on the flattened array does not return slices along an axis but selects elements.
>>> np.compress([False, True], a) # doctest: +SKIP
array([2])
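The row and column selections above can be sketched on a Dask array as follows, assuming the condition is passed as a plain Python list of booleans and the input is wrapped with da.from_array (chunk sizes are arbitrary).

```python
import numpy as np
import dask.array as da

a = da.from_array(np.array([[1, 2], [3, 4], [5, 6]]), chunks=(2, 2))

# Keep the rows where the condition is True.
rows = da.compress([False, True, True], a, axis=0)

# Keep the columns where the condition is True.
cols = da.compress([False, True], a, axis=1)

rows_np = rows.compute()
cols_np = cols.compute()
print(rows_np)
print(cols_np)
```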
-
dask.array.concatenate(seq, axis=0, allow_unknown_chunksizes=False)
Concatenate arrays along an existing axis
Given a sequence of dask Arrays form a new dask Array by stacking them along an existing dimension (axis=0 by default)
Parameters: seq: list of dask.arrays
axis: int
Dimension along which to align all of the arrays
allow_unknown_chunksizes: bool
Allow unknown chunksizes, such as come from converting from dask dataframes. Dask.array is unable to verify that chunks line up. If data comes from differently aligned sources then this can cause unexpected results.
Examples
Create slices
>>> import dask.array as da
>>> import numpy as np
>>> data = [da.from_array(np.ones((4, 4)), chunks=(2, 2))
...         for i in range(3)]
>>> x = da.concatenate(data, axis=0)
>>> x.shape
(12, 4)
>>> da.concatenate(data, axis=1).shape
(4, 12)
Result is a new dask Array
-
dask.array.conj(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.conjugate.
Some inconsistencies with the Dask version may exist.
Return the complex conjugate, element-wise.
The complex conjugate of a complex number is obtained by changing the sign of its imaginary part.
Parameters: x : array_like
Input value.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default
out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : ndarray
The complex conjugate of x, with the same dtype as x. This is a scalar if x is a scalar.
Notes
conj is an alias for conjugate:
>>> np.conj is np.conjugate # doctest: +SKIP
True
Examples
>>> np.conjugate(1+2j) # doctest: +SKIP
(1-2j)
>>> x = np.eye(2) + 1j * np.eye(2) # doctest: +SKIP
>>> np.conjugate(x) # doctest: +SKIP
array([[ 1.-1.j, 0.-0.j],
       [ 0.-0.j, 1.-1.j]])
-
dask.array.copysign(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.copysign.
Some inconsistencies with the Dask version may exist.
Change the sign of x1 to that of x2, element-wise.
If x2 is a scalar, its sign will be copied to all elements of x1.
Parameters: x1 : array_like
Values to change the sign of.
x2 : array_like
The sign of x2 is copied to x1. If
x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default
out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: out : ndarray or scalar
The values of x1 with the sign of x2. This is a scalar if both x1 and x2 are scalars.
Examples
>>> np.copysign(1.3, -1) # doctest: +SKIP
-1.3
>>> 1/np.copysign(0, 1) # doctest: +SKIP
inf
>>> 1/np.copysign(0, -1) # doctest: +SKIP
-inf
>>> np.copysign([-1, 0, 1], -1.1) # doctest: +SKIP
array([-1., -0., -1.])
>>> np.copysign([-1, 0, 1], np.arange(3)-1) # doctest: +SKIP
array([-1., 0., 1.])
-
dask.array.corrcoef(x, y=None, rowvar=1)
Return Pearson product-moment correlation coefficients.
This docstring was copied from numpy.corrcoef.
Some inconsistencies with the Dask version may exist.
Please refer to the documentation for cov for more detail. The relationship between the correlation coefficient matrix, R, and the covariance matrix, C, is
\[R_{ij} = \frac{ C_{ij} } { \sqrt{ C_{ii} * C_{jj} } }\]
The values of R are between -1 and 1, inclusive.
Parameters: x : array_like
A 1-D or 2-D array containing multiple variables and observations. Each row of x represents a variable, and each column a single observation of all those variables. Also see rowvar below.
y : array_like, optional
An additional set of variables and observations. y has the same shape as x.
rowvar : bool, optional
If rowvar is True (default), then each row represents a variable, with observations in the columns. Otherwise, the relationship is transposed: each column represents a variable, while the rows contain observations.
bias : _NoValue, optional (Not supported in Dask)
Has no effect, do not use.
Deprecated since version 1.10.0.
ddof : _NoValue, optional (Not supported in Dask)
Has no effect, do not use.
Deprecated since version 1.10.0.
Returns: R : ndarray
The correlation coefficient matrix of the variables.
See also
cov
- Covariance matrix
Notes
Due to floating point rounding the resulting array may not be Hermitian, the diagonal elements may not be 1, and the elements may not satisfy the inequality abs(a) <= 1. The real and imaginary parts are clipped to the interval [-1, 1] in an attempt to improve on that situation, but this is not much help in the complex case.
This function accepts but discards arguments bias and ddof. This is for backwards compatibility with previous versions of this function. These arguments had no effect on the return values of the function and can be safely ignored in this and previous versions of numpy.
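A small sketch of the relationship between R and C described above, on made-up data in which the second row is a multiple of the first and the third row runs in the opposite direction. The data values and chunk sizes are assumptions for illustration.

```python
import numpy as np
import dask.array as da

# Each row is one variable, each column one observation (rowvar default).
data = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],   # perfectly correlated with row 0
    [4.0, 3.0, 2.0, 1.0],   # perfectly anti-correlated with row 0
])
x = da.from_array(data, chunks=(3, 2))

# Correlation matrix: covariance scaled by sqrt(C_ii * C_jj).
r = da.corrcoef(x).compute()
print(np.round(r, 6))
```

Up to floating point rounding, the entry for rows 0 and 1 should be 1 and the entry for rows 0 and 2 should be -1.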
-
dask.array.cos(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.cos.
Some inconsistencies with the Dask version may exist.
Cosine element-wise.
Parameters: x : array_like
Input array in radians.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default
out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : ndarray
The corresponding cosine values. This is a scalar if x is a scalar.
Notes
If out is provided, the function writes the result into it, and returns a reference to out. (See Examples)
References
M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions. New York, NY: Dover, 1972.
Examples
>>> np.cos(np.array([0, np.pi/2, np.pi])) # doctest: +SKIP
array([ 1.00000000e+00, 6.12303177e-17, -1.00000000e+00])
>>>
>>> # Example of providing the optional output parameter
>>> out1 = np.array([0], dtype='d') # doctest: +SKIP
>>> out2 = np.cos([0.1], out1) # doctest: +SKIP
>>> out2 is out1 # doctest: +SKIP
True
>>>
>>> # Example of ValueError due to provision of shape mis-matched `out`
>>> np.cos(np.zeros((3,3)),np.zeros((2,2))) # doctest: +SKIP
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: operands could not be broadcast together with shapes (3,3) (2,2)
-
dask.array.cosh(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.cosh.
Some inconsistencies with the Dask version may exist.
Hyperbolic cosine, element-wise.
Equivalent to 1/2 * (np.exp(x) + np.exp(-x)) and np.cos(1j*x).
Parameters: x : array_like
Input array.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default
out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: out : ndarray or scalar
Output array of same shape as x. This is a scalar if x is a scalar.
Examples
>>> np.cosh(0) # doctest: +SKIP
1.0
The hyperbolic cosine describes the shape of a hanging cable:
>>> import matplotlib.pyplot as plt # doctest: +SKIP
>>> x = np.linspace(-4, 4, 1000) # doctest: +SKIP
>>> plt.plot(x, np.cosh(x)) # doctest: +SKIP
>>> plt.show() # doctest: +SKIP
-
dask.array.count_nonzero(a, axis=None)
Counts the number of non-zero values in the array a.
This docstring was copied from numpy.count_nonzero.
Some inconsistencies with the Dask version may exist.
The word “non-zero” is in reference to the Python 2.x built-in method __nonzero__() (renamed __bool__() in Python 3.x) of Python objects that tests an object’s “truthfulness”. For example, any number is considered truthful if it is nonzero, whereas any string is considered truthful if it is not the empty string. Thus, this function (recursively) counts how many elements in a (and in sub-arrays thereof) have their __nonzero__() or __bool__() method evaluated to True.
Parameters: a : array_like
The array for which to count non-zeros.
axis : int or tuple, optional
Axis or tuple of axes along which to count non-zeros. Default is None, meaning that non-zeros will be counted along a flattened version of a.
New in version 1.12.0.
Returns: count : int or array of int
Number of non-zero values in the array along a given axis. Otherwise, the total number of non-zero values in the array is returned.
See also
nonzero
- Return the coordinates of all the non-zero values.
Examples
>>> np.count_nonzero(np.eye(4)) # doctest: +SKIP
4
>>> np.count_nonzero([[0,1,7,0,0],[3,0,0,2,19]]) # doctest: +SKIP
5
>>> np.count_nonzero([[0,1,7,0,0],[3,0,0,2,19]], axis=0) # doctest: +SKIP
array([1, 1, 1, 1, 1])
>>> np.count_nonzero([[0,1,7,0,0],[3,0,0,2,19]], axis=1) # doctest: +SKIP
array([2, 3])
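The same counts on a Dask array, as a minimal sketch. The input is the 2x5 array from the examples above wrapped with da.from_array, with one row per chunk as an arbitrary choice.

```python
import numpy as np
import dask.array as da

x = da.from_array(np.array([[0, 1, 7, 0, 0], [3, 0, 0, 2, 19]]), chunks=(1, 5))

# With axis=None the count runs over the whole array.
total = int(da.count_nonzero(x).compute())

# With axis=1 there is one count per row.
per_row = da.count_nonzero(x, axis=1).compute()
print(total, per_row)
```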
-
dask.array.
cov
(m, y=None, rowvar=1, bias=0, ddof=None)¶ Estimate a covariance matrix, given data and weights.
This docstring was copied from numpy.cov.
Some inconsistencies with the Dask version may exist.
Covariance indicates the level to which two variables vary together. If we examine N-dimensional samples, \(X = [x_1, x_2, ... x_N]^T\), then the covariance matrix element \(C_{ij}\) is the covariance of \(x_i\) and \(x_j\). The element \(C_{ii}\) is the variance of \(x_i\).
See the notes for an outline of the algorithm.
Parameters: m : array_like
A 1-D or 2-D array containing multiple variables and observations. Each row of m represents a variable, and each column a single observation of all those variables. Also see rowvar below.
y : array_like, optional
An additional set of variables and observations. y has the same form as that of m.
rowvar : bool, optional
If rowvar is True (default), then each row represents a variable, with observations in the columns. Otherwise, the relationship is transposed: each column represents a variable, while the rows contain observations.
bias : bool, optional
Default normalization (False) is by (N - 1), where N is the number of observations given (unbiased estimate). If bias is True, then normalization is by N. These values can be overridden by using the keyword ddof in numpy versions >= 1.5.
ddof : int, optional
If not None the default value implied by bias is overridden. Note that ddof=1 will return the unbiased estimate, even if both fweights and aweights are specified, and ddof=0 will return the simple average. See the notes for the details. The default value is None.
New in version 1.5.
fweights : array_like, int, optional (Not supported in Dask)
1-D array of integer frequency weights; the number of times each observation vector should be repeated.
New in version 1.10.
aweights : array_like, optional (Not supported in Dask)
1-D array of observation vector weights. These relative weights are typically large for observations considered “important” and smaller for observations considered less “important”. If ddof=0 the array of weights can be used to assign probabilities to observation vectors.
New in version 1.10.
Returns: out : ndarray
The covariance matrix of the variables.
See also
corrcoef
- Normalized covariance matrix
Notes
Assume that the observations are in the columns of the observation array m and let f = fweights and a = aweights for brevity. The steps to compute the weighted covariance are as follows:
>>> m = np.arange(10, dtype=np.float64)
>>> f = np.arange(10) * 2
>>> a = np.arange(10) ** 2.
>>> ddof = 1
>>> w = f * a
>>> v1 = np.sum(w)
>>> v2 = np.sum(w * a)
>>> m -= np.sum(m * w, axis=None, keepdims=True) / v1
>>> cov = np.dot(m * w, m.T) * v1 / (v1**2 - ddof * v2)
Note that when a == 1, the normalization factor v1 / (v1**2 - ddof * v2) goes over to 1 / (np.sum(f) - ddof) as it should.
Examples
Consider two variables, \(x_0\) and \(x_1\), which correlate perfectly, but in opposite directions:
>>> x = np.array([[0, 2], [1, 1], [2, 0]]).T # doctest: +SKIP
>>> x # doctest: +SKIP
array([[0, 1, 2],
       [2, 1, 0]])
Note how \(x_0\) increases while \(x_1\) decreases. The covariance matrix shows this clearly:
>>> np.cov(x) # doctest: +SKIP
array([[ 1., -1.],
       [-1.,  1.]])
Note that element \(C_{0,1}\), which shows the correlation between \(x_0\) and \(x_1\), is negative.
Further, note how x and y are combined:
>>> x = [-2.1, -1, 4.3] # doctest: +SKIP
>>> y = [3, 1.1, 0.12] # doctest: +SKIP
>>> X = np.stack((x, y), axis=0) # doctest: +SKIP
>>> np.cov(X) # doctest: +SKIP
array([[11.71    , -4.286   ], # may vary
       [-4.286   ,  2.144133]])
>>> np.cov(x, y) # doctest: +SKIP
array([[11.71    , -4.286   ], # may vary
       [-4.286   ,  2.144133]])
>>> np.cov(x) # doctest: +SKIP
array(11.71)
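The weighted-covariance steps outlined in the Notes can be checked against np.cov itself. The sketch below uses NumPy, because fweights and aweights are marked as not supported in Dask. The sample data and weights are made up for illustration.

```python
import numpy as np

# Two variables (rows) with five observations each (columns).
m = np.array([[1.0, 2.0, 4.0, 7.0, 11.0],
              [2.0, 1.0, 5.0, 3.0, 8.0]])
f = np.array([1, 2, 1, 3, 1])              # integer frequency weights
a = np.array([0.5, 1.0, 2.0, 1.0, 0.25])   # observation weights
ddof = 1

w = f * a
v1 = np.sum(w)
v2 = np.sum(w * a)

# Subtract the weighted mean of each variable (row).
centered = m - np.sum(m * w, axis=1, keepdims=True) / v1

# Weighted covariance with the normalization factor from the notes.
manual = np.dot(centered * w, centered.T) * v1 / (v1**2 - ddof * v2)

# Reference value computed by NumPy with the same weights.
reference = np.cov(m, fweights=f, aweights=a, ddof=ddof)
```

If the two results agree to floating-point tolerance, the manual steps reproduce the normalization that np.cov applies.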
-
dask.array.
cumprod
(x, axis=None, dtype=None, out=None)¶ Return the cumulative product of elements along a given axis.
This docstring was copied from numpy.cumprod.
Some inconsistencies with the Dask version may exist.
Parameters: a : array_like (Not supported in Dask)
Input array.
axis : int, optional
Axis along which the cumulative product is computed. By default the input is flattened.
dtype : dtype, optional
Type of the returned array, as well as of the accumulator in which the elements are multiplied. If dtype is not specified, it defaults to the dtype of a, unless a has an integer dtype with a precision less than that of the default platform integer. In that case, the default platform integer is used instead.
out : ndarray, optional
Alternative output array in which to place the result. It must have the same shape and buffer length as the expected output but the type of the resulting values will be cast if necessary.
Returns: cumprod : ndarray
A new array holding the result is returned unless out is specified, in which case a reference to out is returned.
See also
numpy.doc.ufuncs
- Section “Output arguments”
Notes
Arithmetic is modular when using integer types, and no error is raised on overflow.
Examples
>>> a = np.array([1,2,3]) # doctest: +SKIP
>>> np.cumprod(a) # intermediate results 1, 1*2 # doctest: +SKIP
... # total product 1*2*3 = 6
array([1, 2, 6])
>>> a = np.array([[1, 2, 3], [4, 5, 6]]) # doctest: +SKIP
>>> np.cumprod(a, dtype=float) # specify type of output # doctest: +SKIP
array([ 1., 2., 6., 24., 120., 720.])
The cumulative product for each column (i.e., over the rows) of a:
>>> np.cumprod(a, axis=0) # doctest: +SKIP
array([[ 1,  2,  3],
       [ 4, 10, 18]])
The cumulative product for each row (i.e. over the columns) of a:
>>> np.cumprod(a,axis=1) # doctest: +SKIP
array([[  1,   2,   6],
       [  4,  20, 120]])
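The note on modular integer arithmetic can be illustrated with a deliberately narrow accumulator. This is a NumPy sketch with illustrative values; the Dask version is assumed to wrap in the same way.

```python
import numpy as np

small = np.array([100, 3], dtype=np.int8)

# Forcing an 8-bit accumulator: 100 * 3 = 300 does not fit in int8,
# so the value wraps around modulo 256 and no error is raised.
wrapped = np.cumprod(small, dtype=np.int8)

# With the default dtype the accumulator is the platform integer,
# which is wide enough to hold this product.
widened = np.cumprod(small)
```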
-
dask.array.
cumsum
(x, axis=None, dtype=None, out=None)¶ Return the cumulative sum of the elements along a given axis.
This docstring was copied from numpy.cumsum.
Some inconsistencies with the Dask version may exist.
Parameters: a : array_like (Not supported in Dask)
Input array.
axis : int, optional
Axis along which the cumulative sum is computed. The default (None) is to compute the cumsum over the flattened array.
dtype : dtype, optional
Type of the returned array and of the accumulator in which the elements are summed. If dtype is not specified, it defaults to the dtype of a, unless a has an integer dtype with a precision less than that of the default platform integer. In that case, the default platform integer is used.
out : ndarray, optional
Alternative output array in which to place the result. It must have the same shape and buffer length as the expected output but the type will be cast if necessary. See doc.ufuncs (Section “Output arguments”) for more details.
Returns: cumsum_along_axis : ndarray.
A new array holding the result is returned unless out is specified, in which case a reference to out is returned. The result has the same size as a, and the same shape as a if axis is not None or a is a 1-d array.
Notes
Arithmetic is modular when using integer types, and no error is raised on overflow.
Examples
>>> a = np.array([[1,2,3], [4,5,6]]) # doctest: +SKIP
>>> a # doctest: +SKIP
array([[1, 2, 3],
       [4, 5, 6]])
>>> np.cumsum(a) # doctest: +SKIP
array([ 1, 3, 6, 10, 15, 21])
>>> np.cumsum(a, dtype=float) # specifies type of output value(s) # doctest: +SKIP
array([ 1., 3., 6., 10., 15., 21.])
>>> np.cumsum(a,axis=0) # sum over rows for each of the 3 columns # doctest: +SKIP
array([[1, 2, 3],
       [5, 7, 9]])
>>> np.cumsum(a,axis=1) # sum over columns for each of the 2 rows # doctest: +SKIP
array([[ 1,  3,  6],
       [ 4,  9, 15]])
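The shape rule stated under Returns can be checked directly. This is a NumPy sketch with illustrative names: with axis=None the result is one-dimensional but has the same number of elements as the input, while an explicit axis preserves the input shape.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

# axis=None: the input is flattened first, so the result is 1-D.
flat_result = np.cumsum(a)

# An explicit axis keeps the shape of the input.
along_rows = np.cumsum(a, axis=0)

# The last element of the flattened cumulative sum is the total.
total = int(flat_result[-1])
```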
-
dask.array.
deg2rad
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.deg2rad.
Some inconsistencies with the Dask version may exist.
Convert angles from degrees to radians.
Parameters: x : array_like
Angles in degrees.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : ndarray
The corresponding angle in radians. This is a scalar if x is a scalar.
See also
rad2deg
- Convert angles from radians to degrees.
unwrap
- Remove large jumps in angle by wrapping.
Notes
New in version 1.3.0.
deg2rad(x) is x * pi / 180.
Examples
>>> np.deg2rad(180) # doctest: +SKIP
3.1415926535897931
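The identity in the Notes can be verified numerically. This is a short NumPy sketch with illustrative angle values. It also round-trips through rad2deg, the inverse conversion listed under See also.

```python
import numpy as np

angles_deg = np.array([0.0, 90.0, 180.0, 360.0])

# The ufunc and the explicit formula x * pi / 180 should agree.
via_ufunc = np.deg2rad(angles_deg)
via_formula = angles_deg * np.pi / 180

# Converting back with rad2deg should recover the original angles.
round_trip = np.rad2deg(via_ufunc)
```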
-
dask.array.
degrees
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.degrees.
Some inconsistencies with the Dask version may exist.
Convert angles from radians to degrees.
Parameters: x : array_like
Input array in radians.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : ndarray of floats
The corresponding degree values; if out was supplied this is a reference to it. This is a scalar if x is a scalar.
See also
rad2deg
- equivalent function
Examples
Convert a radian array to degrees
>>> rad = np.arange(12.)*np.pi/6 # doctest: +SKIP
>>> np.degrees(rad) # doctest: +SKIP
array([ 0., 30., 60., 90., 120., 150., 180., 210., 240., 270., 300., 330.])
>>> out = np.zeros((rad.shape)) # doctest: +SKIP
>>> r = np.degrees(rad, out) # doctest: +SKIP
>>> np.all(r == out) # doctest: +SKIP
True
-
dask.array.
diag
(v)¶ Extract a diagonal or construct a diagonal array.
This docstring was copied from numpy.diag.
Some inconsistencies with the Dask version may exist.
See the more detailed documentation for numpy.diagonal if you use this function to extract a diagonal and wish to write to the resulting array; whether it returns a copy or a view depends on what version of numpy you are using.
Parameters: v : array_like
If v is a 2-D array, return a copy of its k-th diagonal. If v is a 1-D array, return a 2-D array with v on the k-th diagonal.
k : int, optional (Not supported in Dask)
Diagonal in question. The default is 0. Use k>0 for diagonals above the main diagonal, and k<0 for diagonals below the main diagonal.
Returns: out : ndarray
The extracted diagonal or constructed diagonal array.
Examples
>>> x = np.arange(9).reshape((3,3)) # doctest: +SKIP
>>> x # doctest: +SKIP
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
>>> np.diag(x) # doctest: +SKIP
array([0, 4, 8])
>>> np.diag(x, k=1) # doctest: +SKIP
array([1, 5])
>>> np.diag(x, k=-1) # doctest: +SKIP
array([3, 7])
>>> np.diag(np.diag(x)) # doctest: +SKIP
array([[0, 0, 0],
       [0, 4, 0],
       [0, 0, 8]])
-
dask.array.
diagonal
(a, offset=0, axis1=0, axis2=1)¶ Return specified diagonals.
This docstring was copied from numpy.diagonal.
Some inconsistencies with the Dask version may exist.
If a is 2-D, returns the diagonal of a with the given offset, i.e., the collection of elements of the form a[i, i+offset]. If a has more than two dimensions, then the axes specified by axis1 and axis2 are used to determine the 2-D sub-array whose diagonal is returned. The shape of the resulting array can be determined by removing axis1 and axis2 and appending an index to the right equal to the size of the resulting diagonals.
In versions of NumPy prior to 1.7, this function always returned a new, independent array containing a copy of the values in the diagonal.
In NumPy 1.7 and 1.8, it continues to return a copy of the diagonal, but depending on this fact is deprecated. Writing to the resulting array continues to work as it used to, but a FutureWarning is issued.
Starting in NumPy 1.9 it returns a read-only view on the original array. Attempting to write to the resulting array will produce an error.
In some future release, it will return a read/write view and writing to the returned array will alter your original array. The returned array will have the same type as the input array.
If you don’t write to the array returned by this function, then you can just ignore all of the above.
If you depend on the current behavior, then we suggest copying the returned array explicitly, i.e., use np.diagonal(a).copy() instead of just np.diagonal(a). This will work with both past and future versions of NumPy.
Parameters: a : array_like
Array from which the diagonals are taken.
offset : int, optional
Offset of the diagonal from the main diagonal. Can be positive or negative. Defaults to main diagonal (0).
axis1 : int, optional
Axis to be used as the first axis of the 2-D sub-arrays from which the diagonals should be taken. Defaults to first axis (0).
axis2 : int, optional
Axis to be used as the second axis of the 2-D sub-arrays from which the diagonals should be taken. Defaults to second axis (1).
Returns: array_of_diagonals : ndarray
If a is 2-D, then a 1-D array containing the diagonal and of the same type as a is returned unless a is a matrix, in which case a 1-D array rather than a (2-D) matrix is returned in order to maintain backward compatibility.
If a.ndim > 2, then the dimensions specified by axis1 and axis2 are removed, and a new axis inserted at the end corresponding to the diagonal.
Raises: ValueError
If the dimension of a is less than 2.
See also
diag
- MATLAB work-a-like for 1-D and 2-D arrays.
diagflat
- Create diagonal arrays.
trace
- Sum along diagonals.
Examples
>>> a = np.arange(4).reshape(2,2) # doctest: +SKIP
>>> a # doctest: +SKIP
array([[0, 1],
       [2, 3]])
>>> a.diagonal() # doctest: +SKIP
array([0, 3])
>>> a.diagonal(1) # doctest: +SKIP
array([1])
A 3-D example:
>>> a = np.arange(8).reshape(2,2,2); a # doctest: +SKIP
array([[[0, 1],
        [2, 3]],
       [[4, 5],
        [6, 7]]])
>>> a.diagonal(0, # Main diagonals of two arrays created by skipping # doctest: +SKIP
... 0, # across the outer(left)-most axis last and
... 1) # the "middle" (row) axis first.
array([[0, 6],
       [1, 7]])
The sub-arrays whose main diagonals we just obtained; note that each corresponds to fixing the right-most (column) axis, and that the diagonals are “packed” in rows.
>>> a[:,:,0] # main diagonal is [0 6] # doctest: +SKIP
array([[0, 2],
       [4, 6]])
>>> a[:,:,1] # main diagonal is [1 7] # doctest: +SKIP
array([[1, 3],
       [5, 7]])
The anti-diagonal can be obtained by reversing the order of elements using either numpy.flipud or numpy.fliplr.
>>> a = np.arange(9).reshape(3, 3) # doctest: +SKIP
>>> a # doctest: +SKIP
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
>>> np.fliplr(a).diagonal() # Horizontal flip # doctest: +SKIP
array([2, 4, 6])
>>> np.flipud(a).diagonal() # Vertical flip # doctest: +SKIP
array([6, 4, 2])
Note that the order in which the diagonal is retrieved varies depending on the flip function.
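The element rule a[i, i+offset] and the advice to copy before writing can be sketched as follows. This uses NumPy with an illustrative 3x4 array. The helper variable n, the length of the offset diagonal, is an assumption of this sketch and is valid for non-negative offsets only.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
offset = 1

# Collect the elements a[i, i + offset] by hand.
n = min(a.shape[0], a.shape[1] - offset)
by_hand = np.array([a[i, i + offset] for i in range(n)])

# The same elements as returned by diagonal().
diag_values = np.diagonal(a, offset=offset)

# An explicit copy is writable regardless of the NumPy version,
# and writing to it leaves the original array untouched.
diag_copy = np.diagonal(a, offset=offset).copy()
diag_copy[0] = -1
```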
-
dask.array.
diff
(a, n=1, axis=-1)¶ Calculate the n-th discrete difference along the given axis.
This docstring was copied from numpy.diff.
Some inconsistencies with the Dask version may exist.
The first difference is given by out[i] = a[i+1] - a[i] along the given axis; higher differences are calculated by using diff recursively.
Parameters: a : array_like
Input array
n : int, optional
The number of times values are differenced. If zero, the input is returned as-is.
axis : int, optional
The axis along which the difference is taken, default is the last axis.
prepend, append : array_like, optional
Values to prepend or append to “a” along axis prior to performing the difference. Scalar values are expanded to arrays with length 1 in the direction of axis and the shape of the input array along all other axes. Otherwise the dimension and shape must match “a” except along axis.
Returns: diff : ndarray
The n-th differences. The shape of the output is the same as a except along axis where the dimension is smaller by n. The type of the output is the same as the type of the difference between any two elements of a. This is the same as the type of a in most cases. A notable exception is datetime64, which results in a timedelta64 output array.
Notes
Type is preserved for boolean arrays, so the result will contain False when consecutive elements are the same and True when they differ.
For unsigned integer arrays, the results will also be unsigned. This should not be surprising, as the result is consistent with calculating the difference directly:
>>> u8_arr = np.array([1, 0], dtype=np.uint8) # doctest: +SKIP
>>> np.diff(u8_arr) # doctest: +SKIP
array([255], dtype=uint8)
>>> u8_arr[1,...] - u8_arr[0,...] # doctest: +SKIP
255
If this is not desirable, then the array should be cast to a larger integer type first:
>>> i16_arr = u8_arr.astype(np.int16) # doctest: +SKIP
>>> np.diff(i16_arr) # doctest: +SKIP
array([-1], dtype=int16)
Examples
>>> x = np.array([1, 2, 4, 7, 0]) # doctest: +SKIP
>>> np.diff(x) # doctest: +SKIP
array([ 1, 2, 3, -7])
>>> np.diff(x, n=2) # doctest: +SKIP
array([ 1, 1, -10])
>>> x = np.array([[1, 3, 6, 10], [0, 5, 6, 8]]) # doctest: +SKIP
>>> np.diff(x) # doctest: +SKIP
array([[2, 3, 4],
       [5, 1, 2]])
>>> np.diff(x, axis=0) # doctest: +SKIP
array([[-1, 2, 0, -2]])
>>> x = np.arange('1066-10-13', '1066-10-16', dtype=np.datetime64) # doctest: +SKIP
>>> np.diff(x) # doctest: +SKIP
array([1, 1], dtype='timedelta64[D]')
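The recursive definition of higher-order differences can be checked with a short NumPy sketch (variable names are illustrative): the first difference is a shifted subtraction, and the second difference is diff applied to the first.

```python
import numpy as np

x = np.array([1, 2, 4, 7, 0])

# First difference written out as out[i] = a[i+1] - a[i].
first_by_hand = x[1:] - x[:-1]

# Second difference: apply diff to the first difference ...
second_recursive = np.diff(np.diff(x))

# ... which matches asking for n=2 directly.
second_direct = np.diff(x, n=2)
```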
-
dask.array.
digitize
(a, bins, right=False)¶ Return the indices of the bins to which each value in input array belongs.
This docstring was copied from numpy.digitize.
Some inconsistencies with the Dask version may exist.
right | order of bins | returned index i satisfies
False | increasing | bins[i-1] <= x < bins[i]
True | increasing | bins[i-1] < x <= bins[i]
False | decreasing | bins[i-1] > x >= bins[i]
True | decreasing | bins[i-1] >= x > bins[i]
If values in x are beyond the bounds of bins, 0 or len(bins) is returned as appropriate.
Parameters: x : array_like (Not supported in Dask)
Input array to be binned. Prior to NumPy 1.10.0, this array had to be 1-dimensional, but can now have any shape.
bins : array_like
Array of bins. It has to be 1-dimensional and monotonic.
right : bool, optional
Indicating whether the intervals include the right or the left bin edge. Default behavior is (right==False) indicating that the interval does not include the right edge. The left bin edge is closed in this case, i.e., bins[i-1] <= x < bins[i] is the default behavior for monotonically increasing bins.
Returns: indices : ndarray of ints
Output array of indices, of same shape as x.
Raises: ValueError
If bins is not monotonic.
TypeError
If the type of the input is complex.
Notes
If values in x are such that they fall outside the bin range, attempting to index bins with the indices that digitize returns will result in an IndexError.
New in version 1.10.0.
np.digitize is implemented in terms of np.searchsorted. This means that a binary search is used to bin the values, which scales much better for larger number of bins than the previous linear search. It also removes the requirement for the input array to be 1-dimensional.
For monotonically increasing bins, the following are equivalent:
np.digitize(x, bins, right=True)
np.searchsorted(bins, x, side='left')
Note that as the order of the arguments is reversed, the side must be too. The searchsorted call is marginally faster, as it does not do any monotonicity checks. Perhaps more importantly, it supports all dtypes.
Examples
>>> x = np.array([0.2, 6.4, 3.0, 1.6]) # doctest: +SKIP
>>> bins = np.array([0.0, 1.0, 2.5, 4.0, 10.0]) # doctest: +SKIP
>>> inds = np.digitize(x, bins) # doctest: +SKIP
>>> inds # doctest: +SKIP
array([1, 4, 3, 2])
>>> for n in range(x.size): # doctest: +SKIP
...     print(bins[inds[n]-1], "<=", x[n], "<", bins[inds[n]])
...
0.0 <= 0.2 < 1.0
4.0 <= 6.4 < 10.0
2.5 <= 3.0 < 4.0
1.0 <= 1.6 < 2.5
>>> x = np.array([1.2, 10.0, 12.4, 15.5, 20.]) # doctest: +SKIP
>>> bins = np.array([0, 5, 10, 15, 20]) # doctest: +SKIP
>>> np.digitize(x,bins,right=True) # doctest: +SKIP
array([1, 2, 3, 4, 4])
>>> np.digitize(x,bins,right=False) # doctest: +SKIP
array([1, 3, 3, 4, 5])
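The equivalence with searchsorted mentioned in the Notes can be checked on the same data as the last example. This is a NumPy sketch that applies to monotonically increasing bins only. The pairing of right=False with side='right' is the mirror case of the one stated in the Notes.

```python
import numpy as np

x = np.array([1.2, 10.0, 12.4, 15.5, 20.0])
bins = np.array([0, 5, 10, 15, 20])

# right=True corresponds to side='left' (argument order is reversed).
right_closed = np.digitize(x, bins, right=True)
via_left = np.searchsorted(bins, x, side='left')

# right=False (the default) corresponds to side='right'.
left_closed = np.digitize(x, bins, right=False)
via_right = np.searchsorted(bins, x, side='right')
```

Values that sit exactly on a bin edge (10.0 and 20.0 here) are the ones whose index changes between the two settings.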
-
dask.array.
dot
(a, b, out=None)¶ This docstring was copied from numpy.dot.
Some inconsistencies with the Dask version may exist.
Dot product of two arrays. Specifically,
- If both a and b are 1-D arrays, it is the inner product of vectors (without complex conjugation).
- If both a and b are 2-D arrays, it is matrix multiplication, but using matmul() or a @ b is preferred.
- If either a or b is 0-D (scalar), it is equivalent to multiply() and using numpy.multiply(a, b) or a * b is preferred.
- If a is an N-D array and b is a 1-D array, it is a sum product over the last axis of a and b.
- If a is an N-D array and b is an M-D array (where M>=2), it is a sum product over the last axis of a and the second-to-last axis of b:
dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m])
Parameters: a : array_like
First argument.
b : array_like
Second argument.
out : ndarray, optional
Output argument. This must have the exact kind that would be returned if it was not used. In particular, it must have the right type, must be C-contiguous, and its dtype must be the dtype that would be returned for dot(a,b). This is a performance feature. Therefore, if these conditions are not met, an exception is raised, instead of attempting to be flexible.
Returns: output : ndarray
Returns the dot product of a and b. If a and b are both scalars or both 1-D arrays then a scalar is returned; otherwise an array is returned. If out is given, then it is returned.
Raises: ValueError
If the last dimension of a is not the same size as the second-to-last dimension of b.
Examples
>>> np.dot(3, 4) # doctest: +SKIP
12
Neither argument is complex-conjugated:
>>> np.dot([2j, 3j], [2j, 3j]) # doctest: +SKIP
(-13+0j)
For 2-D arrays it is the matrix product:
>>> a = [[1, 0], [0, 1]] # doctest: +SKIP
>>> b = [[4, 1], [2, 2]] # doctest: +SKIP
>>> np.dot(a, b) # doctest: +SKIP
array([[4, 1],
       [2, 2]])
>>> a = np.arange(3*4*5*6).reshape((3,4,5,6)) # doctest: +SKIP
>>> b = np.arange(3*4*5*6)[::-1].reshape((5,4,6,3)) # doctest: +SKIP
>>> np.dot(a, b)[2,3,2,1,2,2] # doctest: +SKIP
499128
>>> sum(a[2,3,2,:] * b[1,2,:,2]) # doctest: +SKIP
499128
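The N-D rule dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m]) is easier to follow with smaller shapes. This is a NumPy sketch with illustrative shapes and indices: the last axis of a (length 4) is contracted with the second-to-last axis of b (also length 4).

```python
import numpy as np

a = np.arange(2 * 3 * 4).reshape(2, 3, 4)
b = np.arange(5 * 4 * 6).reshape(5, 4, 6)

# The contracted axes disappear; the remaining axes of a come first,
# followed by the remaining axes of b.
full = np.dot(a, b)

# One element computed directly from the sum-product formula.
i, j, k, m = 1, 2, 3, 4
single = np.sum(a[i, j, :] * b[k, :, m])
```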
-
dask.array.
dstack
(tup, allow_unknown_chunksizes=False)¶ Stack arrays in sequence depth wise (along third axis).
This docstring was copied from numpy.dstack.
Some inconsistencies with the Dask version may exist.
This is equivalent to concatenation along the third axis after 2-D arrays of shape (M,N) have been reshaped to (M,N,1) and 1-D arrays of shape (N,) have been reshaped to (1,N,1). Rebuilds arrays divided by dsplit.
This function makes most sense for arrays with up to 3 dimensions. For instance, for pixel-data with a height (first axis), width (second axis), and r/g/b channels (third axis). The functions concatenate, stack and block provide more general stacking and concatenation operations.
Parameters: tup : sequence of arrays
The arrays must have the same shape along all but the third axis. 1-D or 2-D arrays must have the same shape.
Returns: stacked : ndarray
The array formed by stacking the given arrays; it will be at least 3-D.
See also
stack
- Join a sequence of arrays along a new axis.
vstack
- Stack along first axis.
hstack
- Stack along second axis.
concatenate
- Join a sequence of arrays along an existing axis.
dsplit
- Split array along third axis.
Examples
>>> a = np.array((1,2,3)) # doctest: +SKIP
>>> b = np.array((2,3,4)) # doctest: +SKIP
>>> np.dstack((a,b)) # doctest: +SKIP
array([[[1, 2],
        [2, 3],
        [3, 4]]])
>>> a = np.array([[1],[2],[3]]) # doctest: +SKIP
>>> b = np.array([[2],[3],[4]]) # doctest: +SKIP
>>> np.dstack((a,b)) # doctest: +SKIP
array([[[1, 2]],
       [[2, 3]],
       [[3, 4]]])
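The reshaping rule in the description, (M,N) to (M,N,1) and (N,) to (1,N,1) followed by concatenation along the third axis, can be written out by hand and compared with dstack. This is a NumPy sketch with illustrative arrays.

```python
import numpy as np

# 2-D inputs of shape (M, N) are treated as (M, N, 1).
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])
stacked_2d = np.dstack((a, b))
manual_2d = np.concatenate((a[:, :, np.newaxis], b[:, :, np.newaxis]), axis=2)

# 1-D inputs of shape (N,) are treated as (1, N, 1).
u = np.array([1, 2, 3])
v = np.array([2, 3, 4])
stacked_1d = np.dstack((u, v))
manual_1d = np.concatenate((u.reshape(1, -1, 1), v.reshape(1, -1, 1)), axis=2)
```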
-
dask.array.
ediff1d
(ary, to_end=None, to_begin=None)¶ The differences between consecutive elements of an array.
This docstring was copied from numpy.ediff1d.
Some inconsistencies with the Dask version may exist.
Parameters: ary : array_like
If necessary, will be flattened before the differences are taken.
to_end : array_like, optional
Number(s) to append at the end of the returned differences.
to_begin : array_like, optional
Number(s) to prepend at the beginning of the returned differences.
Returns: ediff1d : ndarray
The differences. Loosely, this is ary.flat[1:] - ary.flat[:-1].
Notes
When applied to masked arrays, this function drops the mask information if the to_begin and/or to_end parameters are used.
Examples
>>> x = np.array([1, 2, 4, 7, 0]) # doctest: +SKIP
>>> np.ediff1d(x) # doctest: +SKIP
array([ 1, 2, 3, -7])
>>> np.ediff1d(x, to_begin=-99, to_end=np.array([88, 99])) # doctest: +SKIP
array([-99, 1, 2, ..., -7, 88, 99])
The returned array is always 1D.
>>> y = [[1, 2, 4], [1, 6, 24]] # doctest: +SKIP
>>> np.ediff1d(y) # doctest: +SKIP
array([ 1, 2, -3, 5, 18])
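The "loosely ary.flat[1:] - ary.flat[:-1]" description can be checked on the 2-D input from the last example. This is a NumPy sketch with illustrative names. It also shows where to_begin and to_end land in the result.

```python
import numpy as np

y = np.array([[1, 2, 4], [1, 6, 24]])

# Flatten first, then subtract each element from its successor.
flat = y.ravel()
by_hand = flat[1:] - flat[:-1]

result = np.ediff1d(y)

# to_begin is prepended and to_end is appended to the differences.
padded = np.ediff1d(y, to_begin=-99, to_end=[88, 99])
```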
-
dask.array.
empty
(*args, **kwargs)¶ Blocked variant of empty
Follows the signature of empty exactly except that it also requires a keyword argument chunks=(…)
Original signature follows below.
empty(shape, dtype=float, order='C')
Return a new array of given shape and type, without initializing entries.
Parameters: shape : int or tuple of int
Shape of the empty array, e.g., (2, 3) or 2.
dtype : data-type, optional
Desired output data-type for the array, e.g., numpy.int8. Default is numpy.float64.
order : {‘C’, ‘F’}, optional, default: ‘C’
Whether to store multi-dimensional data in row-major (C-style) or column-major (Fortran-style) order in memory.
Returns: out : ndarray
Array of uninitialized (arbitrary) data of the given shape, dtype, and order. Object arrays will be initialized to None.
See also
empty_like
- Return an empty array with shape and type of input.
ones
- Return a new array setting values to one.
zeros
- Return a new array setting values to zero.
full
- Return a new array of given shape filled with value.
Notes
empty, unlike zeros, does not set the array values to zero, and may therefore be marginally faster. On the other hand, it requires the user to manually set all the values in the array, and should be used with caution.
Examples
>>> np.empty([2, 2])
array([[ -9.74499359e+001, 6.69583040e-309],
       [ 2.13182611e-314, 3.06959433e-309]]) #uninitialized
>>> np.empty([2, 2], dtype=int)
array([[-1073741821, -1067949133],
       [ 496041986, 19249760]]) #uninitialized
-
dask.array.
empty_like
(a, dtype=None, chunks=None)¶ Return a new array with the same shape and type as a given array.
Parameters: a : array_like
The shape and data-type of a define these same attributes of the returned array.
dtype : data-type, optional
Overrides the data type of the result.
chunks : sequence of ints
The number of samples on each block. Note that the last block will have fewer samples if len(array) % chunks != 0.
Returns: out : ndarray
Array of uninitialized (arbitrary) data with the same shape and type as a.
See also
ones_like
- Return an array of ones with shape and type of input.
zeros_like
- Return an array of zeros with shape and type of input.
empty
- Return a new uninitialized array.
ones
- Return a new array setting values to one.
zeros
- Return a new array setting values to zero.
Notes
This function does not initialize the returned array; to do that use zeros_like or ones_like instead. It may be marginally faster than the functions that do set the array values.
-
dask.array.
einsum
(subscripts, *operands, out=None, dtype=None, order='K', casting='safe', optimize=False)¶ This docstring was copied from numpy.einsum.
Some inconsistencies with the Dask version may exist.
Evaluates the Einstein summation convention on the operands.
Using the Einstein summation convention, many common multi-dimensional, linear algebraic array operations can be represented in a simple fashion. In implicit mode einsum computes these values.
In explicit mode, einsum provides further flexibility to compute other array operations that might not be considered classical Einstein summation operations, by disabling, or forcing summation over specified subscript labels.
See the notes and examples for clarification.
Parameters: subscripts : str
Specifies the subscripts for summation as comma separated list of subscript labels. An implicit (classical Einstein summation) calculation is performed unless the explicit indicator ‘->’ is included as well as subscript labels of the precise output form.
operands : list of array_like
These are the arrays for the operation.
out : ndarray, optional
If provided, the calculation is done into this array.
dtype : {data-type, None}, optional
If provided, forces the calculation to use the data type specified. Note that you may have to also give a more liberal casting parameter to allow the conversions. Default is None.
order : {‘C’, ‘F’, ‘A’, ‘K’}, optional
Controls the memory layout of the output. ‘C’ means it should be C contiguous. ‘F’ means it should be Fortran contiguous, ‘A’ means it should be ‘F’ if the inputs are all ‘F’, ‘C’ otherwise. ‘K’ means it should be as close to the layout of the inputs as is possible, including arbitrarily permuted axes. Default is ‘K’.
casting : {‘no’, ‘equiv’, ‘safe’, ‘same_kind’, ‘unsafe’}, optional
Controls what kind of data casting may occur. Setting this to ‘unsafe’ is not recommended, as it can adversely affect accumulations.
- ‘no’ means the data types should not be cast at all.
- ‘equiv’ means only byte-order changes are allowed.
- ‘safe’ means only casts which can preserve values are allowed.
- ‘same_kind’ means only safe casts or casts within a kind, like float64 to float32, are allowed.
- ‘unsafe’ means any data conversions may be done.
Default is ‘safe’.
optimize : {False, True, ‘greedy’, ‘optimal’}, optional
Controls if intermediate optimization should occur. No optimization will occur if False and True will default to the ‘greedy’ algorithm. Also accepts an explicit contraction list from the np.einsum_path function. See np.einsum_path for more details. Defaults to False.
Returns: output : ndarray
The calculation based on the Einstein summation convention.
Notes
New in version 1.6.0.
The Einstein summation convention can be used to compute many multi-dimensional, linear algebraic array operations. einsum provides a succinct way of representing these.
A non-exhaustive list of these operations, which can be computed by einsum, is shown below along with examples:
- Trace of an array, numpy.trace().
- Return a diagonal, numpy.diag().
- Array axis summations, numpy.sum().
- Transpositions and permutations, numpy.transpose().
- Matrix multiplication and dot product, numpy.matmul() numpy.dot().
- Vector inner and outer products, numpy.inner() numpy.outer().
- Broadcasting, element-wise and scalar multiplication, numpy.multiply().
- Tensor contractions, numpy.tensordot().
- Chained array operations, in efficient calculation order, numpy.einsum_path().
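Several of the operations listed above can be sketched side by side with their conventional NumPy counterparts. The arrays are illustrative, and the subscript strings follow the conventions of the numpy.einsum docstring this entry was copied from.

```python
import numpy as np

a = np.arange(9.0).reshape(3, 3)
b = np.arange(9.0, 18.0).reshape(3, 3)
v = np.arange(3.0)

trace = np.einsum('ii', a)           # like np.trace(a)
diagonal = np.einsum('ii->i', a)     # like np.diag(a)
row_sums = np.einsum('ij->i', a)     # like np.sum(a, axis=1)
transposed = np.einsum('ji', a)      # like np.transpose(a)
product = np.einsum('ij,jk', a, b)   # like np.matmul(a, b)
inner = np.einsum('i,i', v, v)       # like np.inner(v, v)
outer = np.einsum('i,j', v, v)       # like np.outer(v, v)
```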
The subscripts string is a comma-separated list of subscript labels, where each label refers to a dimension of the corresponding operand. Whenever a label is repeated it is summed, so np.einsum('i,i', a, b) is equivalent to np.inner(a,b). If a label appears only once, it is not summed, so np.einsum('i', a) produces a view of a with no changes. A further example np.einsum('ij,jk', a, b) describes traditional matrix multiplication and is equivalent to np.matmul(a,b). Repeated subscript labels in one operand take the diagonal. For example, np.einsum('ii', a) is equivalent to np.trace(a).
In implicit mode, the chosen subscripts are important since the axes of the output are reordered alphabetically. This means that np.einsum('ij', a) doesn’t affect a 2D array, while np.einsum('ji', a) takes its transpose. Additionally, np.einsum('ij,jk', a, b) returns a matrix multiplication, while, np.einsum('ij,jh', a, b) returns the transpose of the multiplication since subscript ‘h’ precedes subscript ‘i’.
In explicit mode the output can be directly controlled by specifying output subscript labels. This requires the identifier ‘->’ as well as the list of output subscript labels. This feature increases the flexibility of the function since summing can be disabled or forced when required. The call np.einsum('i->', a) is like np.sum(a, axis=-1), and np.einsum('ii->i', a) is like np.diag(a). The difference is that einsum does not allow broadcasting by default. Additionally np.einsum('ij,jh->ih', a, b) directly specifies the order of the output subscript labels and therefore returns matrix multiplication, unlike the example above in implicit mode.
To enable and control broadcasting, use an ellipsis. Default NumPy-style broadcasting is done by adding an ellipsis to the left of each term, like np.einsum('...ii->...i', a). To take the trace along the first and last axes, you can do np.einsum('i...i', a), or to do a matrix-matrix product with the left-most indices instead of rightmost, one can do np.einsum('ij...,jk...->ik...', a, b).
When there is only one operand, no axes are summed, and no output parameter is provided, a view into the operand is returned instead of a new array. Thus, taking the diagonal as np.einsum('ii->i', a) produces a view (changed in version 1.10.0).
einsum also provides an alternative way to provide the subscripts and operands as
einsum(op0, sublist0, op1, sublist1, ..., [sublistout])
. If the output shape is not provided in this format einsum will be calculated in implicit mode, otherwise it will be performed explicitly. The examples below have corresponding einsum calls with the two parameter methods.New in version 1.10.0.
Views returned from einsum are now writeable whenever the input array is writeable. For example, np.einsum('ijk...->kji...', a) will now have the same effect as np.swapaxes(a, 0, 2) and np.einsum('ii->i', a) will return a writeable view of the diagonal of a 2D array.
New in version 1.12.0.
Added the optimize argument which will optimize the contraction order of an einsum expression. For a contraction with three or more operands this can greatly increase the computational efficiency at the cost of a larger memory footprint during computation.
Typically a ‘greedy’ algorithm is applied which empirical tests have shown returns the optimal path in the majority of cases. In some cases ‘optimal’ will return the superlative path through a more expensive, exhaustive search. For iterative calculations it may be advisable to calculate the optimal path once and reuse that path by supplying it as an argument. An example is given below.
See numpy.einsum_path() for more details.
Examples
>>> a = np.arange(25).reshape(5,5) # doctest: +SKIP
>>> b = np.arange(5) # doctest: +SKIP
>>> c = np.arange(6).reshape(2,3) # doctest: +SKIP
Trace of a matrix:
>>> np.einsum('ii', a) # doctest: +SKIP
60
>>> np.einsum(a, [0,0]) # doctest: +SKIP
60
>>> np.trace(a) # doctest: +SKIP
60
Extract the diagonal (requires explicit form):
>>> np.einsum('ii->i', a) # doctest: +SKIP
array([ 0, 6, 12, 18, 24])
>>> np.einsum(a, [0,0], [0]) # doctest: +SKIP
array([ 0, 6, 12, 18, 24])
>>> np.diag(a) # doctest: +SKIP
array([ 0, 6, 12, 18, 24])
Sum over an axis (requires explicit form):
>>> np.einsum('ij->i', a) # doctest: +SKIP
array([ 10, 35, 60, 85, 110])
>>> np.einsum(a, [0,1], [0]) # doctest: +SKIP
array([ 10, 35, 60, 85, 110])
>>> np.sum(a, axis=1) # doctest: +SKIP
array([ 10, 35, 60, 85, 110])
For higher dimensional arrays summing a single axis can be done with ellipsis:
>>> np.einsum('...j->...', a) # doctest: +SKIP
array([ 10, 35, 60, 85, 110])
>>> np.einsum(a, [Ellipsis,1], [Ellipsis]) # doctest: +SKIP
array([ 10, 35, 60, 85, 110])
Compute a matrix transpose, or reorder any number of axes:
>>> np.einsum('ji', c) # doctest: +SKIP
array([[0, 3],
       [1, 4],
       [2, 5]])
>>> np.einsum('ij->ji', c) # doctest: +SKIP
array([[0, 3],
       [1, 4],
       [2, 5]])
>>> np.einsum(c, [1,0]) # doctest: +SKIP
array([[0, 3],
       [1, 4],
       [2, 5]])
>>> np.transpose(c) # doctest: +SKIP
array([[0, 3],
       [1, 4],
       [2, 5]])
Vector inner products:
>>> np.einsum('i,i', b, b) # doctest: +SKIP
30
>>> np.einsum(b, [0], b, [0]) # doctest: +SKIP
30
>>> np.inner(b,b) # doctest: +SKIP
30
Matrix vector multiplication:
>>> np.einsum('ij,j', a, b) # doctest: +SKIP
array([ 30, 80, 130, 180, 230])
>>> np.einsum(a, [0,1], b, [1]) # doctest: +SKIP
array([ 30, 80, 130, 180, 230])
>>> np.dot(a, b) # doctest: +SKIP
array([ 30, 80, 130, 180, 230])
>>> np.einsum('...j,j', a, b) # doctest: +SKIP
array([ 30, 80, 130, 180, 230])
Broadcasting and scalar multiplication:
>>> np.einsum('..., ...', 3, c) # doctest: +SKIP
array([[ 0,  3,  6],
       [ 9, 12, 15]])
>>> np.einsum(',ij', 3, c) # doctest: +SKIP
array([[ 0,  3,  6],
       [ 9, 12, 15]])
>>> np.einsum(3, [Ellipsis], c, [Ellipsis]) # doctest: +SKIP
array([[ 0,  3,  6],
       [ 9, 12, 15]])
>>> np.multiply(3, c) # doctest: +SKIP
array([[ 0,  3,  6],
       [ 9, 12, 15]])
Vector outer product:
>>> np.einsum('i,j', np.arange(2)+1, b) # doctest: +SKIP
array([[0, 1, 2, 3, 4],
       [0, 2, 4, 6, 8]])
>>> np.einsum(np.arange(2)+1, [0], b, [1]) # doctest: +SKIP
array([[0, 1, 2, 3, 4],
       [0, 2, 4, 6, 8]])
>>> np.outer(np.arange(2)+1, b) # doctest: +SKIP
array([[0, 1, 2, 3, 4],
       [0, 2, 4, 6, 8]])
Tensor contraction:
>>> a = np.arange(60.).reshape(3,4,5) # doctest: +SKIP
>>> b = np.arange(24.).reshape(4,3,2) # doctest: +SKIP
>>> np.einsum('ijk,jil->kl', a, b) # doctest: +SKIP
array([[4400., 4730.],
       [4532., 4874.],
       [4664., 5018.],
       [4796., 5162.],
       [4928., 5306.]])
>>> np.einsum(a, [0,1,2], b, [1,0,3], [2,3]) # doctest: +SKIP
array([[4400., 4730.],
       [4532., 4874.],
       [4664., 5018.],
       [4796., 5162.],
       [4928., 5306.]])
>>> np.tensordot(a,b, axes=([1,0],[0,1])) # doctest: +SKIP
array([[4400., 4730.],
       [4532., 4874.],
       [4664., 5018.],
       [4796., 5162.],
       [4928., 5306.]])
Writeable returned arrays (since version 1.10.0):
>>> a = np.zeros((3, 3)) # doctest: +SKIP
>>> np.einsum('ii->i', a)[:] = 1 # doctest: +SKIP
>>> a # doctest: +SKIP
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
Example of ellipsis use:
>>> a = np.arange(6).reshape((3,2)) # doctest: +SKIP
>>> b = np.arange(12).reshape((4,3)) # doctest: +SKIP
>>> np.einsum('ki,jk->ij', a, b) # doctest: +SKIP
array([[10, 28, 46, 64],
       [13, 40, 67, 94]])
>>> np.einsum('ki,...k->i...', a, b) # doctest: +SKIP
array([[10, 28, 46, 64],
       [13, 40, 67, 94]])
>>> np.einsum('k...,jk', a, b) # doctest: +SKIP
array([[10, 28, 46, 64],
       [13, 40, 67, 94]])
Chained array operations. For more complicated contractions, speed ups might be achieved by repeatedly computing a ‘greedy’ path or pre-computing the ‘optimal’ path and repeatedly applying it, using an einsum_path insertion (since version 1.12.0). Performance improvements can be particularly significant with larger arrays:
>>> a = np.ones(64).reshape(2,4,8) # doctest: +SKIP
Basic einsum: ~1520ms (benchmarked on 3.1GHz Intel i5.)
>>> for iteration in range(500): # doctest: +SKIP
...     _ = np.einsum('ijk,ilm,njm,nlk,abc->',a,a,a,a,a)
Sub-optimal einsum (due to repeated path calculation time): ~330ms
>>> for iteration in range(500): # doctest: +SKIP
...     _ = np.einsum('ijk,ilm,njm,nlk,abc->',a,a,a,a,a, optimize='optimal')
Greedy einsum (faster optimal path approximation): ~160ms
>>> for iteration in range(500): # doctest: +SKIP
...     _ = np.einsum('ijk,ilm,njm,nlk,abc->',a,a,a,a,a, optimize='greedy')
Optimal einsum (best usage pattern in some use cases): ~110ms
>>> path = np.einsum_path('ijk,ilm,njm,nlk,abc->',a,a,a,a,a, optimize='optimal')[0] # doctest: +SKIP
>>> for iteration in range(500): # doctest: +SKIP
...     _ = np.einsum('ijk,ilm,njm,nlk,abc->',a,a,a,a,a, optimize=path)
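The examples above call NumPy directly. As a minimal sketch, assuming dask.array.einsum accepts the same explicit-mode subscripts, the same kind of contraction can be expressed on chunked arrays. The array shapes and chunk sizes below are illustrative choices, not recommendations.

```python
import numpy as np
import dask.array as da

# Illustrative inputs; the chunk sizes are arbitrary choices for this sketch.
a_np = np.arange(12).reshape(3, 4)
b_np = np.arange(20).reshape(4, 5)
a = da.from_array(a_np, chunks=(2, 2))
b = da.from_array(b_np, chunks=(2, 5))

# Explicit mode: 'j' appears in both operands and not in the output,
# so it is summed over (matrix multiplication).
product = da.einsum('ij,jk->ik', a, b)

# Explicit mode also controls which axes survive: only 'i' is kept here.
row_sums = da.einsum('ij->i', a)

# Both results are lazy until compute() is called.
product_result = product.compute()
row_sums_result = row_sums.compute()
```

Using explicit output subscripts avoids relying on the alphabetical reordering of implicit mode.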
-
dask.array.
exp
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.exp.
Some inconsistencies with the Dask version may exist.
Calculate the exponential of all elements in the input array.
Parameters: x : array_like
Input values.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default
out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: out : ndarray or scalar
Output array, element-wise exponential of x. This is a scalar if x is a scalar.
See also
expm1
- Calculate exp(x) - 1 for all elements in the array.
exp2
- Calculate 2**x for all elements in the array.
Notes
The irrational number e is also known as Euler’s number. It is approximately 2.718281, and is the base of the natural logarithm, ln (this means that, if \(x = \ln y = \log_e y\), then \(e^x = y\)). For real input, exp(x) is always positive.
For complex arguments, x = a + ib, we can write \(e^x = e^a e^{ib}\). The first term, \(e^a\), is already known (it is the real argument, described above). The second term, \(e^{ib}\), is \(\cos b + i \sin b\), a function with magnitude 1 and a periodic phase.
References
[R127] Wikipedia, “Exponential function”, https://en.wikipedia.org/wiki/Exponential_function
[R128] M. Abramowitz and I. A. Stegun, “Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables,” Dover, 1964, p. 69, http://www.math.sfu.ca/~cbm/aands/page_69.htm
Examples
Plot the magnitude and phase of exp(x) in the complex plane:
>>> import matplotlib.pyplot as plt # doctest: +SKIP
>>> x = np.linspace(-2*np.pi, 2*np.pi, 100) # doctest: +SKIP
>>> xx = x + 1j * x[:, np.newaxis] # a + ib over complex plane # doctest: +SKIP
>>> out = np.exp(xx) # doctest: +SKIP
>>> plt.subplot(121) # doctest: +SKIP
>>> plt.imshow(np.abs(out), # doctest: +SKIP
...            extent=[-2*np.pi, 2*np.pi, -2*np.pi, 2*np.pi], cmap='gray')
>>> plt.title('Magnitude of exp(x)') # doctest: +SKIP
>>> plt.subplot(122) # doctest: +SKIP
>>> plt.imshow(np.angle(out), # doctest: +SKIP
...            extent=[-2*np.pi, 2*np.pi, -2*np.pi, 2*np.pi], cmap='hsv')
>>> plt.title('Phase (angle) of exp(x)') # doctest: +SKIP
>>> plt.show() # doctest: +SKIP
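The Notes state that \(e^{ib} = \cos b + i \sin b\) has magnitude 1. The following sketch checks this numerically on a Dask array without plotting; the sample points and chunk size are arbitrary choices made for illustration.

```python
import numpy as np
import dask.array as da

# Purely imaginary arguments ib, with b sampled over one period.
b_np = np.linspace(-np.pi, np.pi, 9)
b = da.from_array(b_np, chunks=3)

# da.exp is applied lazily, chunk by chunk; compute() evaluates it.
z = da.exp(1j * b).compute()

# Magnitude should be 1 everywhere, and the real and imaginary
# parts should match cos(b) and sin(b).
magnitude = np.abs(z)
```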
-
dask.array.
expm1
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.expm1.
Some inconsistencies with the Dask version may exist.
Calculate exp(x) - 1 for all elements in the array.
Parameters: x : array_like
Input values.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default
out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: out : ndarray or scalar
Element-wise exponential minus one: out = exp(x) - 1. This is a scalar if x is a scalar.
See also
log1p
- log(1 + x), the inverse of expm1.
Notes
This function provides greater precision than exp(x) - 1 for small values of x.
Examples
The true value of exp(1e-10) - 1 is 1.00000000005e-10 to about 32 significant digits. This example shows the superiority of expm1 in this case.
>>> np.expm1(1e-10) # doctest: +SKIP
1.00000000005e-10
>>> np.exp(1e-10) - 1 # doctest: +SKIP
1.000000082740371e-10
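The same precision comparison can be sketched with Dask arrays. The reference value is the one quoted in the text above; the single-element input and the variable names are illustrative.

```python
import numpy as np
import dask.array as da

x = da.from_array(np.array([1e-10]), chunks=1)

# expm1 evaluates exp(x) - 1 without the cancellation error
# that the two-step computation suffers for tiny x.
accurate = da.expm1(x).compute()[0]
naive = (da.exp(x) - 1).compute()[0]

# Reference value as quoted in the docstring above.
reference = 1.00000000005e-10
accurate_error = abs(accurate - reference)
naive_error = abs(naive - reference)
```

Under these assumptions, accurate_error should be several orders of magnitude smaller than naive_error.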
-
dask.array.
eye
(N, chunks='auto', M=None, k=0, dtype=<class 'float'>)¶ Return a 2-D Array with ones on the diagonal and zeros elsewhere.
Parameters: N : int
Number of rows in the output.
chunks : int, str
How to chunk the array. Must be one of the following forms:
- A blocksize like 1000.
- A size in bytes, like “100 MiB” which will choose a uniform block-like shape
- The word “auto” which acts like the above, but uses a configuration value array.chunk-size for the chunk size
M : int, optional
Number of columns in the output. If None, defaults to N.
k : int, optional
Index of the diagonal: 0 (the default) refers to the main diagonal, a positive value refers to an upper diagonal, and a negative value to a lower diagonal.
dtype : data-type, optional
Data-type of the returned array.
Returns: I : Array of shape (N,M)
An array where all elements are equal to zero, except for the k-th diagonal, whose values are equal to one.
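A short sketch of the parameters described above, using an integer blocksize for chunks and a positive k to select an upper diagonal. The sizes are arbitrary, and the comparison against numpy.eye is only there to show the expected values.

```python
import numpy as np
import dask.array as da

# 4x4 identity split into 2x2 blocks.
identity = da.eye(4, chunks=2)

# k=1 places the ones on the first upper diagonal instead.
upper = da.eye(4, chunks=2, k=1)

identity_result = identity.compute()
upper_result = upper.compute()
```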
-
dask.array.
fabs
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.fabs.
Some inconsistencies with the Dask version may exist.
Compute the absolute values element-wise.
This function returns the absolute values (positive magnitude) of the data in x. Complex values are not handled; use absolute to find the absolute values of complex data.
Parameters: x : array_like
The array of numbers for which the absolute values are required. If x is a scalar, the result y will also be a scalar.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default
out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : ndarray or scalar
The absolute values of x, the returned values are always floats. This is a scalar if x is a scalar.
See also
absolute
- Absolute values including complex types.
Examples
>>> np.fabs(-1) # doctest: +SKIP
1.0
>>> np.fabs([-1.2, 1.2]) # doctest: +SKIP
array([ 1.2, 1.2])
-
dask.array.
fix
(*args, **kwargs)¶ Round to nearest integer towards zero.
This docstring was copied from numpy.fix.
Some inconsistencies with the Dask version may exist.
Round an array of floats element-wise to nearest integer towards zero. The rounded values are returned as floats.
Parameters: x : array_like (Not supported in Dask)
An array of floats to be rounded
y : ndarray, optional
Output array
Returns: out : ndarray of floats (Not supported in Dask)
The array of rounded numbers
Examples
>>> np.fix(3.14) # doctest: +SKIP
3.0
>>> np.fix(3) # doctest: +SKIP
3.0
>>> np.fix([2.1, 2.9, -2.1, -2.9]) # doctest: +SKIP
array([ 2., 2., -2., -2.])
-
dask.array.
flatnonzero
(a)¶ Return indices that are non-zero in the flattened version of a.
This docstring was copied from numpy.flatnonzero.
Some inconsistencies with the Dask version may exist.
This is equivalent to np.nonzero(np.ravel(a))[0].
Parameters: a : array_like
Input data.
Returns: res : ndarray
Output array, containing the indices of the elements of a.ravel() that are non-zero.
Examples
>>> x = np.arange(-2, 3) # doctest: +SKIP
>>> x # doctest: +SKIP
array([-2, -1, 0, 1, 2])
>>> np.flatnonzero(x) # doctest: +SKIP
array([0, 1, 3, 4])
Use the indices of the non-zero elements as an index array to extract these elements:
>>> x.ravel()[np.flatnonzero(x)] # doctest: +SKIP
array([-2, -1, 1, 2])
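A minimal sketch of the same lookup on a Dask array. Because the number of non-zero elements depends on the data, the result is computed first here and then used to index the original NumPy data; the chunk size is an arbitrary choice.

```python
import numpy as np
import dask.array as da

x_np = np.arange(-2, 3)
x = da.from_array(x_np, chunks=2)

# Indices of the non-zero entries in the flattened array (lazy).
indices = da.flatnonzero(x)

# Evaluate, then use the concrete indices to extract the elements.
indices_result = indices.compute()
nonzero_values = x_np.ravel()[indices_result]
```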
-
dask.array.
flip
(m, axis)¶ Reverse element order along axis.
Parameters: axis : int
Axis to reverse element order of.
Returns: reversed array : ndarray
-
dask.array.
flipud
(m)¶ Flip array in the up/down direction.
This docstring was copied from numpy.flipud.
Some inconsistencies with the Dask version may exist.
Flip the entries in each column in the up/down direction. Rows are preserved, but appear in a different order than before.
Parameters: m : array_like
Input array.
Returns: out : array_like
A view of m with the rows reversed. Since a view is returned, this operation is \(\mathcal O(1)\).
See also
fliplr
- Flip array in the left/right direction.
rot90
- Rotate array counterclockwise.
Notes
Equivalent to m[::-1,...]. Does not require the array to be two-dimensional.
Examples
>>> A = np.diag([1.0, 2, 3]) # doctest: +SKIP
>>> A # doctest: +SKIP
array([[1., 0., 0.],
       [0., 2., 0.],
       [0., 0., 3.]])
>>> np.flipud(A) # doctest: +SKIP
array([[0., 0., 3.],
       [0., 2., 0.],
       [1., 0., 0.]])
>>> A = np.random.randn(2,3,5) # doctest: +SKIP
>>> np.all(np.flipud(A) == A[::-1,...]) # doctest: +SKIP
True
>>> np.flipud([1,2]) # doctest: +SKIP
array([2, 1])
-
dask.array.
fliplr
(m)¶ Flip array in the left/right direction.
This docstring was copied from numpy.fliplr.
Some inconsistencies with the Dask version may exist.
Flip the entries in each row in the left/right direction. Columns are preserved, but appear in a different order than before.
Parameters: m : array_like
Input array, must be at least 2-D.
Returns: f : ndarray
A view of m with the columns reversed. Since a view is returned, this operation is \(\mathcal O(1)\).
See also
flipud
- Flip array in the up/down direction.
rot90
- Rotate array counterclockwise.
Notes
Equivalent to m[:,::-1]. Requires the array to be at least 2-D.
Examples
>>> A = np.diag([1.,2.,3.]) # doctest: +SKIP
>>> A # doctest: +SKIP
array([[1., 0., 0.],
       [0., 2., 0.],
       [0., 0., 3.]])
>>> np.fliplr(A) # doctest: +SKIP
array([[0., 0., 1.],
       [0., 2., 0.],
       [3., 0., 0.]])
>>> A = np.random.randn(2,3,5) # doctest: +SKIP
>>> np.all(np.fliplr(A) == A[:,::-1,...]) # doctest: +SKIP
True
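The three flip functions documented above are closely related: flipud reverses axis 0 and fliplr reverses axis 1. The sketch below checks this on a small 3-D Dask array; the shape and chunking are illustrative assumptions.

```python
import numpy as np
import dask.array as da

m_np = np.arange(24).reshape(2, 3, 4)
m = da.from_array(m_np, chunks=(1, 3, 2))

# flipud(m) corresponds to m[::-1, ...], i.e. flip along axis 0.
up_down = da.flipud(m).compute()
along_axis0 = da.flip(m, 0).compute()

# fliplr(m) corresponds to m[:, ::-1], i.e. flip along axis 1.
left_right = da.fliplr(m).compute()
along_axis1 = da.flip(m, 1).compute()
```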
-
dask.array.
floor
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.floor.
Some inconsistencies with the Dask version may exist.
Return the floor of the input, element-wise.
The floor of the scalar x is the largest integer i, such that i <= x. It is often denoted as \(\lfloor x \rfloor\).
Parameters: x : array_like
Input data.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default
out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : ndarray or scalar
The floor of each element in x. This is a scalar if x is a scalar.
Notes
Some spreadsheet programs calculate the “floor-towards-zero”, in other words floor(-2.5) == -2. NumPy instead uses the definition of floor where floor(-2.5) == -3.
Examples
>>> a = np.array([-1.7, -1.5, -0.2, 0.2, 1.5, 1.7, 2.0]) # doctest: +SKIP
>>> np.floor(a) # doctest: +SKIP
array([-2., -2., -1., 0., 1., 1., 2.])
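The “floor-towards-zero” behaviour mentioned in the Notes corresponds to fix, documented earlier in this reference. The sketch below contrasts the two on negative inputs; the sample values are chosen only for illustration.

```python
import numpy as np
import dask.array as da

values = da.from_array(np.array([-2.5, -0.2, 0.2, 2.5]), chunks=2)

# floor rounds towards negative infinity.
floored = da.floor(values).compute()

# fix rounds towards zero, so negative inputs differ from floor.
fixed = da.fix(values).compute()
```

The two functions agree for non-negative inputs and differ for negative non-integers.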
-
dask.array.
fmax
(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.fmax.
Some inconsistencies with the Dask version may exist.
Element-wise maximum of array elements.
Compares two arrays and returns a new array containing the element-wise maxima. If one of the elements being compared is a NaN, then the non-NaN element is returned. If both elements are NaNs then the first is returned. The latter distinction is important for complex NaNs, which are defined as at least one of the real or imaginary parts being a NaN. The net effect is that NaNs are ignored when possible.
Parameters: x1, x2 : array_like
The arrays holding the elements to be compared. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default
out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : ndarray or scalar
The maximum of x1 and x2, element-wise. This is a scalar if both x1 and x2 are scalars.
Notes
New in version 1.3.0.
The fmax is equivalent to np.where(x1 >= x2, x1, x2) when neither x1 nor x2 are NaNs, but it is faster and does proper broadcasting.
Examples
>>> np.fmax([2, 3, 4], [1, 5, 2]) # doctest: +SKIP
array([2, 5, 4])
>>> np.fmax(np.eye(2), [0.5, 2]) # doctest: +SKIP
array([[ 1. , 2. ],
       [ 0.5, 2. ]])
>>> np.fmax([np.nan, 0, np.nan],[0, np.nan, np.nan]) # doctest: +SKIP
array([ 0., 0., nan])
-
dask.array.
fmin
(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.fmin.
Some inconsistencies with the Dask version may exist.
Element-wise minimum of array elements.
Compares two arrays and returns a new array containing the element-wise minima. If one of the elements being compared is a NaN, then the non-NaN element is returned. If both elements are NaNs then the first is returned. The latter distinction is important for complex NaNs, which are defined as at least one of the real or imaginary parts being a NaN. The net effect is that NaNs are ignored when possible.
Parameters: x1, x2 : array_like
The arrays holding the elements to be compared. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default
out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : ndarray or scalar
The minimum of x1 and x2, element-wise. This is a scalar if both x1 and x2 are scalars.
Notes
New in version 1.3.0.
The fmin is equivalent to np.where(x1 <= x2, x1, x2) when neither x1 nor x2 are NaNs, but it is faster and does proper broadcasting.
Examples
>>> np.fmin([2, 3, 4], [1, 5, 2]) # doctest: +SKIP
array([1, 3, 2])
>>> np.fmin(np.eye(2), [0.5, 2]) # doctest: +SKIP
array([[ 0.5, 0. ],
       [ 0. , 1. ]])
>>> np.fmin([np.nan, 0, np.nan],[0, np.nan, np.nan]) # doctest: +SKIP
array([ 0., 0., nan])
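The NaN handling described for fmax and fmin can be contrasted with a NaN-propagating maximum. The sketch below uses da.maximum for that contrast, which is not documented in this part of the reference and is brought in only for comparison; the inputs mirror the NaN example above.

```python
import numpy as np
import dask.array as da

x1 = da.from_array(np.array([np.nan, 0.0, np.nan]), chunks=2)
x2 = da.from_array(np.array([0.0, np.nan, np.nan]), chunks=2)

# fmax and fmin ignore a NaN when the other element is a number.
ignoring_max = da.fmax(x1, x2).compute()
ignoring_min = da.fmin(x1, x2).compute()

# maximum (used here only for contrast) propagates any NaN it sees.
propagating = da.maximum(x1, x2).compute()
```

Only the last position, where both inputs are NaN, stays NaN under fmax and fmin.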
-
dask.array.
fmod
(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.fmod.
Some inconsistencies with the Dask version may exist.
Return the element-wise remainder of division.
This is the NumPy implementation of the C library function fmod; the remainder has the same sign as the dividend x1. It is equivalent to the Matlab(TM) rem function and should not be confused with the Python modulus operator x1 % x2.
Parameters: x1 : array_like
Dividend.
x2 : array_like
Divisor. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default
out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : array_like
The remainder of the division of x1 by x2. This is a scalar if both x1 and x2 are scalars.
See also
remainder
- Equivalent to the Python % operator.
divide
Notes
The result of the modulo operation for negative dividend and divisors is bound by conventions. For fmod, the sign of result is the sign of the dividend, while for remainder the sign of the result is the sign of the divisor. The fmod function is equivalent to the Matlab(TM) rem function.
Examples
>>> np.fmod([-3, -2, -1, 1, 2, 3], 2) # doctest: +SKIP
array([-1, 0, -1, 1, 0, 1])
>>> np.remainder([-3, -2, -1, 1, 2, 3], 2) # doctest: +SKIP
array([1, 0, 1, 1, 0, 1])
>>> np.fmod([5, 3], [2, 2.]) # doctest: +SKIP
array([ 1., 1.])
>>> a = np.arange(-3, 3).reshape(3, 2) # doctest: +SKIP
>>> a # doctest: +SKIP
array([[-3, -2],
       [-1,  0],
       [ 1,  2]])
>>> np.fmod(a, [2,2]) # doctest: +SKIP
array([[-1,  0],
       [-1,  0],
       [ 1,  0]])
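The sign conventions described in the Notes can be compared side by side on a Dask array. This sketch assumes the % operator on a Dask array follows remainder, as it does for NumPy arrays; the inputs repeat the first example above.

```python
import numpy as np
import dask.array as da

dividend = da.from_array(np.array([-3, -2, -1, 1, 2, 3]), chunks=3)

# fmod: the result takes the sign of the dividend (C-style).
c_style = da.fmod(dividend, 2).compute()

# remainder: the result takes the sign of the divisor (Python-style).
python_style = da.remainder(dividend, 2).compute()

# The % operator is expected to match remainder, not fmod.
operator_style = (dividend % 2).compute()
```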
-
dask.array.
frexp
(x, [out1, out2, ]/, [out=(None, None), ]*, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.frexp.
Some inconsistencies with the Dask version may exist.
Decompose the elements of x into mantissa and twos exponent.
Returns (mantissa, exponent), where x = mantissa * 2**exponent. The mantissa lies in the open interval (-1, 1), while the twos exponent is a signed integer.
Parameters: x : array_like
Array of numbers to be decomposed.
out1 : ndarray, optional
Output array for the mantissa. Must have the same shape as x.
out2 : ndarray, optional
Output array for the exponent. Must have the same shape as x.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default
out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: mantissa : ndarray
Floating values between -1 and 1. This is a scalar if x is a scalar.
exponent : ndarray
Integer exponents of 2. This is a scalar if x is a scalar.
See also
ldexp
- Compute y = x1 * 2**x2, the inverse of frexp.
Notes
Complex dtypes are not supported; they will raise a TypeError.
Examples
>>> x = np.arange(9) # doctest: +SKIP
>>> y1, y2 = np.frexp(x) # doctest: +SKIP
>>> y1 # doctest: +SKIP
array([ 0. , 0.5 , 0.5 , 0.75 , 0.5 , 0.625, 0.75 , 0.875, 0.5 ])
>>> y2 # doctest: +SKIP
array([0, 1, 2, 2, 3, 3, 3, 3, 4])
>>> y1 * 2**y2 # doctest: +SKIP
array([ 0., 1., 2., 3., 4., 5., 6., 7., 8.])
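A sketch of the round trip between frexp and its inverse ldexp on a Dask array. It assumes da.frexp returns a pair of lazy arrays, matching the two outputs described above; the input values and chunk size are illustrative.

```python
import numpy as np
import dask.array as da

x_np = np.arange(9, dtype=float)
x = da.from_array(x_np, chunks=3)

# Decompose into mantissa and power-of-two exponent (both lazy).
mantissa, exponent = da.frexp(x)

# ldexp computes mantissa * 2**exponent, undoing the decomposition.
rebuilt = da.ldexp(mantissa, exponent).compute()
mantissa_result = mantissa.compute()
```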
-
dask.array.
fromfunction
(func, chunks='auto', shape=None, dtype=None, **kwargs)¶ Construct an array by executing a function over each coordinate.
This docstring was copied from numpy.fromfunction.
Some inconsistencies with the Dask version may exist.
The resulting array therefore has a value fn(x, y, z) at coordinate (x, y, z).
Parameters: function : callable (Not supported in Dask)
The function is called with N parameters, where N is the rank of shape. Each parameter represents the coordinates of the array varying along a specific axis. For example, if shape were (2, 2), then the parameters would be array([[0, 0], [1, 1]]) and array([[0, 1], [0, 1]])
shape : (N,) tuple of ints
Shape of the output array, which also determines the shape of the coordinate arrays passed to function.
dtype : data-type, optional
Data-type of the coordinate arrays passed to function. By default, dtype is float.
Returns: fromfunction : any
The result of the call to function is passed back directly. Therefore the shape of fromfunction is completely determined by function. If function returns a scalar value, the shape of fromfunction would not match the shape parameter.
Notes
Keywords other than dtype are passed to function.
Examples
>>> np.fromfunction(lambda i, j: i == j, (3, 3), dtype=int) # doctest: +SKIP
array([[ True, False, False],
       [False,  True, False],
       [False, False,  True]])
>>> np.fromfunction(lambda i, j: i + j, (3, 3), dtype=int) # doctest: +SKIP
array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4]])
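The Dask signature shown above takes shape and chunks as keyword arguments, unlike the NumPy examples. A minimal sketch under that signature follows; the helper name add_coordinates and the chunk sizes are hypothetical choices for illustration.

```python
import numpy as np
import dask.array as da

def add_coordinates(i, j):
    # Hypothetical helper: receives coordinate arrays and
    # returns an array of the same shape.
    return i + j

# shape and chunks are passed by keyword, following the Dask signature.
table = da.fromfunction(add_coordinates, chunks=(2, 2), shape=(3, 3), dtype=int)
table_result = table.compute()

# Expected values: entry (i, j) equals i + j.
expected = np.add.outer(np.arange(3), np.arange(3))
```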
-
dask.array.
frompyfunc
(func, nin, nout)¶ This docstring was copied from numpy.frompyfunc.
Some inconsistencies with the Dask version may exist.
Takes an arbitrary Python function and returns a NumPy ufunc.
Can be used, for example, to add broadcasting to a built-in Python function (see Examples section).
Parameters: func : Python function object
An arbitrary Python function.
nin : int
The number of input arguments.
nout : int
The number of objects returned by func.
Returns: out : ufunc
Returns a NumPy universal function (ufunc) object.
See also
vectorize
- evaluates pyfunc over input arrays using broadcasting rules of numpy
Notes
The returned ufunc always returns PyObject arrays.
Examples
Use frompyfunc to add broadcasting to the Python function oct:
>>> oct_array = np.frompyfunc(oct, 1, 1) # doctest: +SKIP
>>> oct_array(np.array((10, 30, 100))) # doctest: +SKIP
array(['0o12', '0o36', '0o144'], dtype=object)
>>> np.array((oct(10), oct(30), oct(100))) # for comparison # doctest: +SKIP
array(['0o12', '0o36', '0o144'], dtype='<U5')
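The same idea can be sketched with the Dask variant, assuming da.frompyfunc returns a callable that can be applied to a Dask array in the same way the NumPy ufunc is applied to an ndarray. The chunk size is arbitrary.

```python
import numpy as np
import dask.array as da

# Wrap the built-in oct so it can be applied element-wise.
oct_array = da.frompyfunc(oct, 1, 1)

numbers = da.from_array(np.array([10, 30, 100]), chunks=2)

# The wrapped function yields Python objects, so the result
# is an object array, as stated in the Notes.
octal = oct_array(numbers).compute()
```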
-
dask.array.
full
(*args, **kwargs)¶ Blocked variant of full
Follows the signature of full exactly except that it also requires a keyword argument chunks=(…)
Original signature follows below.
Return a new array of given shape and type, filled with fill_value.
Parameters: shape : int or sequence of ints
Shape of the new array, e.g., (2, 3) or 2.
fill_value : scalar
Fill value.
dtype : data-type, optional
The desired data-type for the array. The default, None, means np.array(fill_value).dtype.
order : {‘C’, ‘F’}, optional
Whether to store multidimensional data in C- or Fortran-contiguous (row- or column-wise) order in memory.
Returns: out : ndarray
Array of fill_value with the given shape, dtype, and order.
Examples
>>> np.full((2, 2), np.inf)
array([[inf, inf],
       [inf, inf]])
>>> np.full((2, 2), 10)
array([[10, 10],
       [10, 10]])
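The examples above are the NumPy ones. As a minimal sketch of the blocked variant, assuming dask.array is imported as da and that a tuple of block sizes is passed for chunks, the call might look like this:

```python
import dask.array as da

# A 4x6 array of sevens, split into 2x3 blocks
x = da.full((4, 6), 7, chunks=(2, 3))

# Nothing is materialised until compute() is called
result = x.compute()
```

The chunks attribute of x records the block sizes along each axis, here two blocks of 2 rows and two blocks of 3 columns.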
-
dask.array.
full_like
(a, fill_value, dtype=None, chunks=None)¶ Return a full array with the same shape and type as a given array.
Parameters: a : array_like
The shape and data-type of a define these same attributes of the returned array.
fill_value : scalar
Fill value.
dtype : data-type, optional
Overrides the data type of the result.
chunks : sequence of ints
The number of samples on each block. Note that the last block will have fewer samples if len(array) % chunks != 0.
Returns: out : ndarray
Array of fill_value with the same shape and type as a.
See also
zeros_like
- Return an array of zeros with shape and type of input.
ones_like
- Return an array of ones with shape and type of input.
empty_like
- Return an empty array with shape and type of input.
zeros
- Return a new array setting values to zero.
ones
- Return a new array setting values to one.
empty
- Return a new uninitialized array.
full
- Fill a new array.
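A short sketch of full_like on a Dask array follows. The input array and variable names are made up for illustration; the behaviour shown (shape and dtype taken from a unless dtype is given) is what the parameter descriptions above state.

```python
import numpy as np
import dask.array as da

# A small blocked input array with integer dtype
a = da.from_array(np.arange(6).reshape(2, 3), chunks=(1, 3))

# Same shape and dtype as `a`, filled with 7
same_dtype = da.full_like(a, 7)

# Same shape, but the dtype is overridden
as_float = da.full_like(a, 0.5, dtype=float)
```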
-
dask.array.
gradient
(f, *varargs, **kwargs)¶ Return the gradient of an N-dimensional array.
This docstring was copied from numpy.gradient.
Some inconsistencies with the Dask version may exist.
The gradient is computed using second order accurate central differences in the interior points and either first or second order accurate one-sided (forward or backward) differences at the boundaries. The returned gradient hence has the same shape as the input array.
Parameters: f : array_like
An N-dimensional array containing samples of a scalar function.
varargs : list of scalar or array, optional
Spacing between f values. Default unitary spacing for all dimensions. Spacing can be specified using:
1. A single scalar to specify a sample distance for all dimensions.
2. N scalars to specify a constant sample distance for each dimension, i.e. dx, dy, dz, …
3. N arrays to specify the coordinates of the values along each dimension of f. The length of each array must match the size of the corresponding dimension.
4. Any combination of N scalars/arrays with the meaning of 2. and 3.
If axis is given, the number of varargs must equal the number of axes. Default: 1.
edge_order : {1, 2}, optional
Gradient is calculated using N-th order accurate differences at the boundaries. Default: 1.
New in version 1.9.1.
axis : None or int or tuple of ints, optional
Gradient is calculated only along the given axis or axes. The default (axis = None) is to calculate the gradient for all the axes of the input array. axis may be negative, in which case it counts from the last to the first axis.
New in version 1.11.0.
Returns: gradient : ndarray or list of ndarray
A set of ndarrays (or a single ndarray if there is only one dimension) corresponding to the derivatives of f with respect to each dimension. Each derivative has the same shape as f.
Notes
Assuming that \(f\in C^{3}\) (i.e., \(f\) has at least 3 continuous derivatives) and letting \(h_{*}\) be a non-homogeneous stepsize, we minimize the “consistency error” \(\eta_{i}\) between the true gradient and its estimate from a linear combination of the neighboring grid-points:
\[\eta_{i} = f_{i}^{\left(1\right)} - \left[ \alpha f\left(x_{i}\right) + \beta f\left(x_{i} + h_{d}\right) + \gamma f\left(x_{i}-h_{s}\right) \right]\]
By substituting \(f(x_{i} + h_{d})\) and \(f(x_{i} - h_{s})\) with their Taylor series expansions, this translates into solving the following linear system:
\[\begin{split}\left\{ \begin{array}{r} \alpha+\beta+\gamma=0 \\ \beta h_{d}-\gamma h_{s}=1 \\ \beta h_{d}^{2}+\gamma h_{s}^{2}=0 \end{array} \right.\end{split}\]
The resulting approximation of \(f_{i}^{(1)}\) is the following:
\[\hat f_{i}^{(1)} = \frac{ h_{s}^{2}f\left(x_{i} + h_{d}\right) + \left(h_{d}^{2} - h_{s}^{2}\right)f\left(x_{i}\right) - h_{d}^{2}f\left(x_{i}-h_{s}\right)} { h_{s}h_{d}\left(h_{d} + h_{s}\right)} + \mathcal{O}\left(\frac{h_{d}h_{s}^{2} + h_{s}h_{d}^{2}}{h_{d} + h_{s}}\right)\]
It is worth noting that if \(h_{s}=h_{d}\) (i.e., data are evenly spaced) we find the standard second order approximation:
\[\hat f_{i}^{(1)}= \frac{f\left(x_{i+1}\right) - f\left(x_{i-1}\right)}{2h} + \mathcal{O}\left(h^{2}\right)\]
With a similar procedure the forward/backward approximations used for boundaries can be derived.
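The interior formula from the Notes can be checked by hand with a few lines of plain Python. The helper below is a hypothetical illustration, not the NumPy or Dask implementation: it applies the non-uniform central difference at interior points and first-order one-sided differences at the two ends, for 1-D input only.

```python
def nonuniform_gradient(f, x):
    """First derivative of samples f at coordinates x (1-D, at least 2 points)."""
    n = len(f)
    grad = [0.0] * n
    for i in range(1, n - 1):
        hs = x[i] - x[i - 1]   # spacing to the left neighbour, h_s
        hd = x[i + 1] - x[i]   # spacing to the right neighbour, h_d
        grad[i] = (
            hs**2 * f[i + 1] + (hd**2 - hs**2) * f[i] - hd**2 * f[i - 1]
        ) / (hs * hd * (hd + hs))
    # First-order one-sided differences at the two boundaries
    grad[0] = (f[1] - f[0]) / (x[1] - x[0])
    grad[-1] = (f[-1] - f[-2]) / (x[-1] - x[-2])
    return grad


f = [1.0, 2.0, 4.0, 7.0, 11.0, 16.0]
x = [0.0, 1.0, 1.5, 3.5, 4.0, 6.0]
result = nonuniform_gradient(f, x)
```

With these inputs the values agree, up to floating-point rounding, with the non-uniform example in the Examples section: 1, 3, 3.5, 6.7, 6.9 and 2.5.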
References
[R129] Quarteroni A., Sacco R., Saleri F. (2007) Numerical Mathematics (Texts in Applied Mathematics). New York: Springer.
[R130] Durran D. R. (1999) Numerical Methods for Wave Equations in Geophysical Fluid Dynamics. New York: Springer.
[R131] Fornberg B. (1988) Generation of Finite Difference Formulas on Arbitrarily Spaced Grids, Mathematics of Computation 51, no. 184 : 699-706.
Examples
>>> f = np.array([1, 2, 4, 7, 11, 16], dtype=float) # doctest: +SKIP
>>> np.gradient(f) # doctest: +SKIP
array([1. , 1.5, 2.5, 3.5, 4.5, 5. ])
>>> np.gradient(f, 2) # doctest: +SKIP
array([0.5 , 0.75, 1.25, 1.75, 2.25, 2.5 ])
Spacing can be also specified with an array that represents the coordinates of the values F along the dimensions. For instance a uniform spacing:
>>> x = np.arange(f.size) # doctest: +SKIP
>>> np.gradient(f, x) # doctest: +SKIP
array([1. , 1.5, 2.5, 3.5, 4.5, 5. ])
Or a non uniform one:
>>> x = np.array([0., 1., 1.5, 3.5, 4., 6.], dtype=float) # doctest: +SKIP
>>> np.gradient(f, x) # doctest: +SKIP
array([1. , 3. , 3.5, 6.7, 6.9, 2.5])
For two dimensional arrays, the return will be two arrays ordered by axis. In this example the first array stands for the gradient in rows and the second one in columns direction:
>>> np.gradient(np.array([[1, 2, 6], [3, 4, 5]], dtype=float)) # doctest: +SKIP
[array([[ 2.,  2., -1.],
       [ 2.,  2., -1.]]), array([[1. , 2.5, 4. ],
       [1. , 1. , 1. ]])]
In this example the spacing is also specified: uniform for axis=0 and non uniform for axis=1
>>> dx = 2. # doctest: +SKIP
>>> y = [1., 1.5, 3.5] # doctest: +SKIP
>>> np.gradient(np.array([[1, 2, 6], [3, 4, 5]], dtype=float), dx, y) # doctest: +SKIP
[array([[ 1. ,  1. , -0.5],
       [ 1. ,  1. , -0.5]]), array([[2. , 2. , 2. ],
       [2. , 1.7, 0.5]])]
It is possible to specify how boundaries are treated using edge_order
>>> x = np.array([0, 1, 2, 3, 4]) # doctest: +SKIP
>>> f = x**2 # doctest: +SKIP
>>> np.gradient(f, edge_order=1) # doctest: +SKIP
array([1., 2., 4., 6., 7.])
>>> np.gradient(f, edge_order=2) # doctest: +SKIP
array([0., 2., 4., 6., 8.])
The axis keyword can be used to specify a subset of axes of which the gradient is calculated
>>> np.gradient(np.array([[1, 2, 6], [3, 4, 5]], dtype=float), axis=0) # doctest: +SKIP
array([[ 2.,  2., -1.],
       [ 2.,  2., -1.]])
-
dask.array.
histogram
(a, bins=None, range=None, normed=False, weights=None, density=None)¶ Blocked variant of numpy.histogram().
Follows the signature of numpy.histogram() exactly with the following exceptions:
- Either an iterable specifying the bins or the number of bins and a range argument is required as computing min and max over blocked arrays is an expensive operation that must be performed explicitly.
- weights must be a dask.array.Array with the same block structure as a.
Examples
Using number of bins and range:
>>> import dask.array as da
>>> import numpy as np
>>> x = da.from_array(np.arange(10000), chunks=10)
>>> h, bins = da.histogram(x, bins=10, range=[0, 10000])
>>> bins
array([    0.,  1000.,  2000.,  3000.,  4000.,  5000.,  6000.,  7000.,
        8000.,  9000., 10000.])
>>> h.compute()
array([1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000])
Explicitly specifying the bins:
>>> h, bins = da.histogram(x, bins=np.array([0, 5000, 10000]))
>>> bins
array([    0,  5000, 10000])
>>> h.compute()
array([5000, 5000])
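The weights requirement can be sketched as follows, assuming a small array and a constant weight of 0.5 chosen purely for illustration. The weights array is built with the same chunks as the data so that both have the same block structure.

```python
import dask.array as da

# Eight samples in two blocks of four
x = da.arange(8, chunks=4)

# Weights as a Dask array with the same block structure as x
w = da.full((8,), 0.5, chunks=4)

# The number of bins and an explicit range are both supplied
h, bins = da.histogram(x, bins=4, range=[0, 8], weights=w)
counts = h.compute()
```

Each of the four bins receives two samples of weight 0.5, so every weighted count is 1.0.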
-
dask.array.
hstack
(tup, allow_unknown_chunksizes=False)¶ Stack arrays in sequence horizontally (column wise).
This docstring was copied from numpy.hstack.
Some inconsistencies with the Dask version may exist.
This is equivalent to concatenation along the second axis, except for 1-D arrays where it concatenates along the first axis. Rebuilds arrays divided by hsplit.
This function makes most sense for arrays with up to 3 dimensions. For instance, for pixel-data with a height (first axis), width (second axis), and r/g/b channels (third axis). The functions concatenate, stack and block provide more general stacking and concatenation operations.
Parameters: tup : sequence of ndarrays
The arrays must have the same shape along all but the second axis, except 1-D arrays which can be any length.
Returns: stacked : ndarray
The array formed by stacking the given arrays.
See also
stack
- Join a sequence of arrays along a new axis.
vstack
- Stack arrays in sequence vertically (row wise).
dstack
- Stack arrays in sequence depth wise (along third axis).
concatenate
- Join a sequence of arrays along an existing axis.
hsplit
- Split array along second axis.
block
- Assemble arrays from blocks.
Examples
>>> a = np.array((1,2,3)) # doctest: +SKIP
>>> b = np.array((2,3,4)) # doctest: +SKIP
>>> np.hstack((a,b)) # doctest: +SKIP
array([1, 2, 3, 2, 3, 4])
>>> a = np.array([[1],[2],[3]]) # doctest: +SKIP
>>> b = np.array([[2],[3],[4]]) # doctest: +SKIP
>>> np.hstack((a,b)) # doctest: +SKIP
array([[1, 2],
       [2, 3],
       [3, 4]])
-
dask.array.
hypot
(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.hypot.
Some inconsistencies with the Dask version may exist.
Given the “legs” of a right triangle, return its hypotenuse.
Equivalent to sqrt(x1**2 + x2**2), element-wise. If x1 or x2 is scalar_like (i.e., unambiguously cast-able to a scalar type), it is broadcast for use with each element of the other argument. (See Examples)
Parameters: x1, x2 : array_like
Leg of the triangle(s). If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: z : ndarray
The hypotenuse of the triangle(s). This is a scalar if both x1 and x2 are scalars.
Examples
>>> np.hypot(3*np.ones((3, 3)), 4*np.ones((3, 3))) # doctest: +SKIP
array([[ 5.,  5.,  5.],
       [ 5.,  5.,  5.],
       [ 5.,  5.,  5.]])
Example showing broadcast of scalar_like argument:
>>> np.hypot(3*np.ones((3, 3)), [4]) # doctest: +SKIP
array([[ 5.,  5.,  5.],
       [ 5.,  5.,  5.],
       [ 5.,  5.,  5.]])
-
dask.array.
imag
(*args, **kwargs)¶ Return the imaginary part of the complex argument.
This docstring was copied from numpy.imag.
Some inconsistencies with the Dask version may exist.
Parameters: val : array_like (Not supported in Dask)
Input array.
Returns: out : ndarray or scalar
The imaginary component of the complex argument. If val is real, the type of val is used for the output. If val has complex elements, the returned type is float.
Examples
>>> a = np.array([1+2j, 3+4j, 5+6j]) # doctest: +SKIP
>>> a.imag # doctest: +SKIP
array([2., 4., 6.])
>>> a.imag = np.array([8, 10, 12]) # doctest: +SKIP
>>> a # doctest: +SKIP
array([1. +8.j, 3.+10.j, 5.+12.j])
>>> np.imag(1 + 1j) # doctest: +SKIP
1.0
-
dask.array.
indices
(dimensions, dtype=<class 'int'>, chunks='auto')¶ Implements NumPy’s indices for Dask Arrays.
Generates a grid of indices covering the dimensions provided.
The final array has the shape (len(dimensions), *dimensions). The chunks are used to specify the chunking for axis 1 up to len(dimensions). The 0th axis always has chunks of length 1.
Parameters: dimensions : sequence of ints
The shape of the index grid.
dtype : dtype, optional
Type to use for the array. Default is int.
chunks : sequence of ints, str
The size of each block. Must be one of the following forms:
- A blocksize like (500, 1000)
- A size in bytes, like “100 MiB” which will choose a uniform block-like shape
- The word “auto” which acts like the above, but uses a configuration value array.chunk-size for the chunk size
Note that the last block will have fewer samples if len(array) % chunks != 0.
Returns: grid : dask array
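As a small sketch of the shape and chunking rules described above, the following builds an index grid for a 2x3 array with an explicit blocksize. The dimensions and chunk sizes are arbitrary illustrative choices.

```python
import numpy as np
import dask.array as da

# chunks applies to axes 1 and 2; axis 0 always has chunks of length 1
grid = da.indices((2, 3), chunks=(1, 3))

# The computed values should match NumPy's indices for the same dimensions
matches_numpy = np.array_equal(grid.compute(), np.indices((2, 3)))
```

Here grid.shape is (len(dimensions), *dimensions), that is (2, 2, 3).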
-
dask.array.
insert
(arr, obj, values, axis)¶ Insert values along the given axis before the given indices.
This docstring was copied from numpy.insert.
Some inconsistencies with the Dask version may exist.
Parameters: arr : array_like
Input array.
obj : int, slice or sequence of ints
Object that defines the index or indices before which values is inserted.
New in version 1.8.0.
Support for multiple insertions when obj is a single scalar or a sequence with one element (similar to calling insert multiple times).
values : array_like
Values to insert into arr. If the type of values is different from that of arr, values is converted to the type of arr. values should be shaped so that arr[...,obj,...] = values is legal.
axis : int, optional
Axis along which to insert values. If axis is None then arr is flattened first.
Returns: out : ndarray
A copy of arr with values inserted. Note that insert does not occur in-place: a new array is returned. If axis is None, out is a flattened array.
See also
append
- Append elements at the end of an array.
concatenate
- Join a sequence of arrays along an existing axis.
delete
- Delete elements from an array.
Notes
Note that for higher dimensional inserts obj=0 behaves very differently from obj=[0], just as arr[:,0,:] = values is different from arr[:,[0],:] = values.
Examples
>>> a = np.array([[1, 1], [2, 2], [3, 3]]) # doctest: +SKIP
>>> a # doctest: +SKIP
array([[1, 1],
       [2, 2],
       [3, 3]])
>>> np.insert(a, 1, 5) # doctest: +SKIP
array([1, 5, 1, ..., 2, 3, 3])
>>> np.insert(a, 1, 5, axis=1) # doctest: +SKIP
array([[1, 5, 1],
       [2, 5, 2],
       [3, 5, 3]])
Difference between sequence and scalars:
>>> np.insert(a, [1], [[1],[2],[3]], axis=1) # doctest: +SKIP
array([[1, 1, 1],
       [2, 2, 2],
       [3, 3, 3]])
>>> np.array_equal(np.insert(a, 1, [1, 2, 3], axis=1), # doctest: +SKIP
...                np.insert(a, [1], [[1],[2],[3]], axis=1))
True
>>> b = a.flatten() # doctest: +SKIP
>>> b # doctest: +SKIP
array([1, 1, 2, 2, 3, 3])
>>> np.insert(b, [2, 2], [5, 6]) # doctest: +SKIP
array([1, 1, 5, ..., 2, 3, 3])
>>> np.insert(b, slice(2, 4), [5, 6]) # doctest: +SKIP
array([1, 1, 5, ..., 2, 3, 3])
>>> np.insert(b, [2, 2], [7.13, False]) # type casting # doctest: +SKIP
array([1, 1, 7, ..., 2, 3, 3])
>>> x = np.arange(8).reshape(2, 4) # doctest: +SKIP
>>> idx = (1, 3) # doctest: +SKIP
>>> np.insert(x, idx, 999, axis=1) # doctest: +SKIP
array([[  0, 999,   1,   2, 999,   3],
       [  4, 999,   5,   6, 999,   7]])
-
dask.array.
invert
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.invert.
Some inconsistencies with the Dask version may exist.
Compute bit-wise inversion, or bit-wise NOT, element-wise.
Computes the bit-wise NOT of the underlying binary representation of the integers in the input arrays. This ufunc implements the C/Python operator ~.
For signed integer inputs, the two’s complement is returned. In a two’s-complement system negative numbers are represented by the two’s complement of the absolute value. This is the most common method of representing signed integers on computers [R132]. An N-bit two’s-complement system can represent every integer in the range \(-2^{N-1}\) to \(+2^{N-1}-1\).
Parameters: x : array_like
Only integer and boolean types are handled.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: out : ndarray or scalar
Result. This is a scalar if x is a scalar.
See also
bitwise_and, bitwise_or, bitwise_xor, logical_not
binary_repr
- Return the binary representation of the input number as a string.
Notes
bitwise_not is an alias for invert:
>>> np.bitwise_not is np.invert # doctest: +SKIP True
References
[R132] Wikipedia, “Two’s complement”, https://en.wikipedia.org/wiki/Two’s_complement
Examples
The number 13 is represented by 00001101 in binary. The invert or bit-wise NOT of 13 is then:
>>> x = np.invert(np.array(13, dtype=np.uint8)) # doctest: +SKIP
>>> x # doctest: +SKIP
242
>>> np.binary_repr(x, width=8) # doctest: +SKIP
'11110010'
The result depends on the bit-width:
>>> x = np.invert(np.array(13, dtype=np.uint16)) # doctest: +SKIP
>>> x # doctest: +SKIP
65522
>>> np.binary_repr(x, width=16) # doctest: +SKIP
'1111111111110010'
When using signed integer types the result is the two’s complement of the result for the unsigned type:
>>> np.invert(np.array([13], dtype=np.int8)) # doctest: +SKIP
array([-14], dtype=int8)
>>> np.binary_repr(-14, width=8) # doctest: +SKIP
'11110010'
Booleans are accepted as well:
>>> np.invert(np.array([True, False])) # doctest: +SKIP
array([False,  True])
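The relationship between the unsigned and signed results can be reproduced with Python's own integers. This is only a sketch of the arithmetic, with a hypothetical helper name, and does not involve NumPy or Dask.

```python
def invert_unsigned(value, width):
    """Bit-wise NOT of a non-negative integer stored in `width` bits."""
    mask = (1 << width) - 1
    return ~value & mask


inv8 = invert_unsigned(13, 8)     # 8-bit unsigned result
inv16 = invert_unsigned(13, 16)   # 16-bit unsigned result
signed = ~13                      # Python's unbounded signed result
bits8 = format(inv8, "08b")       # bit pattern of the 8-bit result
same_bits = (signed & 0xFF) == inv8
```

Masking the signed result -14 to 8 bits gives 242, which is why the int8 and uint8 examples above share the bit pattern 11110010.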
-
dask.array.
isclose
(arr1, arr2, rtol=1e-05, atol=1e-08, equal_nan=False)¶ Returns a boolean array where two arrays are element-wise equal within a tolerance.
This docstring was copied from numpy.isclose.
Some inconsistencies with the Dask version may exist.
The tolerance values are positive, typically very small numbers. The relative difference (rtol * abs(b)) and the absolute difference atol are added together to compare against the absolute difference between a and b.
Warning
The default atol is not appropriate for comparing numbers that are much smaller than one (see Notes).
Parameters: a, b : array_like
Input arrays to compare.
rtol : float
The relative tolerance parameter (see Notes).
atol : float
The absolute tolerance parameter (see Notes).
equal_nan : bool
Whether to compare NaN’s as equal. If True, NaN’s in a will be considered equal to NaN’s in b in the output array.
Returns: y : array_like
Returns a boolean array of where a and b are equal within the given tolerance. If both a and b are scalars, returns a single boolean value.
Notes
New in version 1.7.0.
For finite values, isclose uses the following equation to test whether two floating point values are equivalent.
absolute(a - b) <= (atol + rtol * absolute(b))
Unlike the built-in math.isclose, the above equation is not symmetric in a and b – it assumes b is the reference value – so that isclose(a, b) might be different from isclose(b, a). Furthermore, the default value of atol is not zero, and is used to determine what small values should be considered close to zero. The default value is appropriate for expected values of order unity: if the expected values are significantly smaller than one, it can result in false positives. atol should be carefully selected for the use case at hand. A zero value for atol will result in False if either a or b is zero.
Examples
>>> np.isclose([1e10,1e-7], [1.00001e10,1e-8]) # doctest: +SKIP
array([ True, False])
>>> np.isclose([1e10,1e-8], [1.00001e10,1e-9]) # doctest: +SKIP
array([ True,  True])
>>> np.isclose([1e10,1e-8], [1.0001e10,1e-9]) # doctest: +SKIP
array([False,  True])
>>> np.isclose([1.0, np.nan], [1.0, np.nan]) # doctest: +SKIP
array([ True, False])
>>> np.isclose([1.0, np.nan], [1.0, np.nan], equal_nan=True) # doctest: +SKIP
array([ True,  True])
>>> np.isclose([1e-8, 1e-7], [0.0, 0.0]) # doctest: +SKIP
array([ True, False])
>>> np.isclose([1e-100, 1e-7], [0.0, 0.0], atol=0.0) # doctest: +SKIP
array([False, False])
>>> np.isclose([1e-10, 1e-10], [1e-20, 0.0]) # doctest: +SKIP
array([ True,  True])
>>> np.isclose([1e-10, 1e-10], [1e-20, 0.999999e-10], atol=0.0) # doctest: +SKIP
array([False,  True])
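The asymmetry mentioned in the Notes is easy to see with scalars. The sketch below re-implements the stated inequality in plain Python under a hypothetical function name and compares it with the standard library's math.isclose; the tolerance of 0.095 is chosen only to make the two directions disagree.

```python
import math


def isclose_numpy_style(a, b, rtol=1e-05, atol=1e-08):
    """The inequality from the Notes, with b as the reference value."""
    return abs(a - b) <= (atol + rtol * abs(b))


# |10 - 11| = 1 is within 0.095 * 11 but not within 0.095 * 10
forward = isclose_numpy_style(10.0, 11.0, rtol=0.095, atol=0.0)
backward = isclose_numpy_style(11.0, 10.0, rtol=0.095, atol=0.0)

# math.isclose scales by the larger magnitude, so argument order does not matter
symmetric = (
    math.isclose(10.0, 11.0, rel_tol=0.095),
    math.isclose(11.0, 10.0, rel_tol=0.095),
)
```

Swapping the arguments changes the answer for the NumPy-style test but not for math.isclose.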
-
dask.array.
iscomplex
(*args, **kwargs)¶ Returns a bool array, where True if input element is complex.
This docstring was copied from numpy.iscomplex.
Some inconsistencies with the Dask version may exist.
What is tested is whether the input has a non-zero imaginary part, not if the input type is complex.
Parameters: x : array_like (Not supported in Dask)
Input array.
Returns: out : ndarray of bools
Output array.
Examples
>>> np.iscomplex([1+1j, 1+0j, 4.5, 3, 2, 2j]) # doctest: +SKIP
array([ True, False, False, False, False,  True])
-
dask.array.
isfinite
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.isfinite.
Some inconsistencies with the Dask version may exist.
Test element-wise for finiteness (not infinity and not Not a Number).
The result is returned as a boolean array.
Parameters: x : array_like
Input values.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : ndarray, bool
True where x is not positive infinity, negative infinity, or NaN; false otherwise. This is a scalar if x is a scalar.
Notes
Not a Number, positive infinity and negative infinity are considered to be non-finite.
NumPy uses the IEEE Standard for Binary Floating-Point Arithmetic (IEEE 754). This means that Not a Number is not equivalent to infinity, and that positive infinity is not equivalent to negative infinity, but infinity is equivalent to positive infinity. Errors result if the second argument is also supplied when x is a scalar input, or if the first and second arguments have different shapes.
Examples
>>> np.isfinite(1) # doctest: +SKIP
True
>>> np.isfinite(0) # doctest: +SKIP
True
>>> np.isfinite(np.nan) # doctest: +SKIP
False
>>> np.isfinite(np.inf) # doctest: +SKIP
False
>>> np.isfinite(np.NINF) # doctest: +SKIP
False
>>> np.isfinite([np.log(-1.),1.,np.log(0)]) # doctest: +SKIP
array([False,  True, False])
>>> x = np.array([-np.inf, 0., np.inf]) # doctest: +SKIP
>>> y = np.array([2, 2, 2]) # doctest: +SKIP
>>> np.isfinite(x, y) # doctest: +SKIP
array([0, 1, 0])
>>> y # doctest: +SKIP
array([0, 1, 0])
-
dask.array.
isin
(element, test_elements, assume_unique=False, invert=False)¶ Calculates element in test_elements, broadcasting over element only. Returns a boolean array of the same shape as element that is True where an element of element is in test_elements and False otherwise.
Parameters: element : array_like
Input array.
test_elements : array_like
The values against which to test each value of element. This argument is flattened if it is an array or array_like. See notes for behavior with non-array-like parameters.
assume_unique : bool, optional
If True, the input arrays are both assumed to be unique, which can speed up the calculation. Default is False.
invert : bool, optional
If True, the values in the returned array are inverted, as if calculating element not in test_elements. Default is False.
np.isin(a, b, invert=True) is equivalent to (but faster than) np.invert(np.isin(a, b)).
Returns: isin : ndarray, bool
Has the same shape as element. The values element[isin] are in test_elements.
See also
in1d
- Flattened version of this function.
numpy.lib.arraysetops
- Module with a number of other functions for performing set operations on arrays.
Notes
isin is an element-wise function version of the python keyword in.
isin(a, b) is roughly equivalent to np.array([item in b for item in a]) if a and b are 1-D sequences.
element and test_elements are converted to arrays if they are not already. If test_elements is a set (or other non-sequence collection) it will be converted to an object array with one element, rather than an array of the values contained in test_elements. This is a consequence of the array constructor’s way of handling non-sequence collections. Converting the set to a list usually gives the desired behavior.
New in version 1.13.0.
Examples
>>> element = 2*np.arange(4).reshape((2, 2))
>>> element
array([[0, 2],
       [4, 6]])
>>> test_elements = [1, 2, 4, 8]
>>> mask = np.isin(element, test_elements)
>>> mask
array([[False,  True],
       [ True, False]])
>>> element[mask]
array([2, 4])
The indices of the matched values can be obtained with nonzero:
>>> np.nonzero(mask)
(array([0, 1]), array([1, 0]))
The test can also be inverted:
>>> mask = np.isin(element, test_elements, invert=True)
>>> mask
array([[ True, False],
       [False,  True]])
>>> element[mask]
array([0, 6])
Because of how array handles sets, the following does not work as expected:
>>> test_set = {1, 2, 4, 8}
>>> np.isin(element, test_set)
array([[False, False],
       [False, False]])
Casting the set to a list gives the expected result:
>>> np.isin(element, list(test_set))
array([[False,  True],
       [ True, False]])
-
dask.array.
isinf
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.isinf.
Some inconsistencies with the Dask version may exist.
Test element-wise for positive or negative infinity.
Returns a boolean array of the same shape as x, True where x == +/-inf, otherwise False.
Parameters: x : array_like
Input values.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : bool (scalar) or boolean ndarray
True where x is positive or negative infinity, false otherwise. This is a scalar if x is a scalar.
Notes
NumPy uses the IEEE Standard for Binary Floating-Point Arithmetic (IEEE 754).
Errors result if the second argument is supplied when the first argument is a scalar, or if the first and second arguments have different shapes.
Examples
>>> np.isinf(np.inf) # doctest: +SKIP
True
>>> np.isinf(np.nan) # doctest: +SKIP
False
>>> np.isinf(np.NINF) # doctest: +SKIP
True
>>> np.isinf([np.inf, -np.inf, 1.0, np.nan]) # doctest: +SKIP
array([ True,  True, False, False])
>>> x = np.array([-np.inf, 0., np.inf]) # doctest: +SKIP
>>> y = np.array([2, 2, 2]) # doctest: +SKIP
>>> np.isinf(x, y) # doctest: +SKIP
array([1, 0, 1])
>>> y # doctest: +SKIP
array([1, 0, 1])
-
dask.array.
isneginf
(*args, **kwargs)¶ This docstring was copied from numpy.equal.
Some inconsistencies with the Dask version may exist.
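Test element-wise for negative infinity. Note that the copied text below documents numpy.equal, (x1 == x2) element-wise, rather than numpy.isneginf.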
Parameters: x1, x2 : array_like
Input arrays. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: out : ndarray or scalar
Output array, element-wise comparison of x1 and x2. Typically of type bool, unless dtype=object is passed. This is a scalar if both x1 and x2 are scalars.
See also
not_equal, greater_equal, less_equal, greater, less
Examples
>>> np.equal([0, 1, 3], np.arange(3)) # doctest: +SKIP
array([ True,  True, False])
What is compared are values, not types. So an int (1) and an array of length one can evaluate as True:
>>> np.equal(1, np.ones(1)) # doctest: +SKIP
array([ True])
-
dask.array.
isnan
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.isnan.
Some inconsistencies with the Dask version may exist.
Test element-wise for NaN and return result as a boolean array.
Parameters: x : array_like
Input array.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : ndarray or bool
True where x is NaN, false otherwise. This is a scalar if x is a scalar.
Notes
NumPy uses the IEEE Standard for Binary Floating-Point Arithmetic (IEEE 754). This means that Not a Number is not equivalent to infinity.
Examples
>>> np.isnan(np.nan) # doctest: +SKIP
True
>>> np.isnan(np.inf) # doctest: +SKIP
False
>>> np.isnan([np.log(-1.),1.,np.log(0)]) # doctest: +SKIP
array([ True, False, False])
-
dask.array.
isnull
(values)¶ pandas.isnull for dask arrays
-
dask.array.
isposinf
(*args, **kwargs)¶ This docstring was copied from numpy.equal.
Some inconsistencies with the Dask version may exist.
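Test element-wise for positive infinity. Note that the copied text below documents numpy.equal, (x1 == x2) element-wise, rather than numpy.isposinf.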
Parameters: x1, x2 : array_like
Input arrays. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: out : ndarray or scalar
Output array, element-wise comparison of x1 and x2. Typically of type bool, unless dtype=object is passed. This is a scalar if both x1 and x2 are scalars.
See also
not_equal, greater_equal, less_equal, greater, less
Examples
>>> np.equal([0, 1, 3], np.arange(3)) # doctest: +SKIP
array([ True, True, False])
What is compared are values, not types. So an int (1) and an array of length one can evaluate as True:
>>> np.equal(1, np.ones(1)) # doctest: +SKIP
array([ True])
-
dask.array.isreal(*args, **kwargs)
Returns a bool array, where True if input element is real.
This docstring was copied from numpy.isreal.
Some inconsistencies with the Dask version may exist.
If element has complex type with zero complex part, the return value for that element is True.
Parameters: x : array_like (Not supported in Dask)
Input array.
Returns: out : ndarray, bool
Boolean array of same shape as x.
Examples
>>> np.isreal([1+1j, 1+0j, 4.5, 3, 2, 2j]) # doctest: +SKIP
array([False, True, True, True, True, False])
-
dask.array.ldexp(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.ldexp.
Some inconsistencies with the Dask version may exist.
Returns x1 * 2**x2, element-wise.
The mantissas x1 and twos exponents x2 are used to construct floating point numbers x1 * 2**x2.
Parameters: x1 : array_like
Array of multipliers.
x2 : array_like, int
Array of twos exponents. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : ndarray or scalar
The result of x1 * 2**x2. This is a scalar if both x1 and x2 are scalars.
See also
frexp
- Return (y1, y2) from x = y1 * 2**y2, inverse to ldexp.
Notes
Complex dtypes are not supported; they will raise a TypeError.
ldexp is useful as the inverse of frexp; if used by itself, it is clearer to simply use the expression x1 * 2**x2.
Examples
>>> np.ldexp(5, np.arange(4)) # doctest: +SKIP
array([ 5., 10., 20., 40.], dtype=float16)
>>> x = np.arange(6) # doctest: +SKIP
>>> np.ldexp(*np.frexp(x)) # doctest: +SKIP
array([ 0., 1., 2., 3., 4., 5.])
-
dask.array.linspace(start, stop, num=50, endpoint=True, retstep=False, chunks='auto', dtype=None)
Return num evenly spaced values over the closed interval [start, stop].
Parameters: start : scalar
The starting value of the sequence.
stop : scalar
The last value of the sequence.
num : int, optional
Number of samples to include in the returned dask array, including the endpoints. Default is 50.
endpoint : bool, optional
If True, stop is the last sample. Otherwise, it is not included. Default is True.
retstep : bool, optional
If True, return (samples, step), where step is the spacing between samples. Default is False.
chunks : int
The number of samples in each block. Note that the last block will have fewer samples if num % chunks != 0.
dtype : dtype, optional
The type of the output array.
Returns: samples : dask array
step : float, optional
Only returned if retstep is True. Size of spacing between samples.
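The interaction between num, chunks and retstep can be sketched as follows. This is an illustrative example, assuming dask and NumPy are installed; the interval, sample count and chunk size are arbitrary choices.

```python
import numpy as np
import dask.array as da

# 11 samples over [0, 1], with at most 5 samples per block.
samples, step = da.linspace(0, 1, num=11, chunks=5, retstep=True)

# 11 is not a multiple of 5, so the last block is shorter.
block_sizes = samples.chunks

# The computed values should agree with NumPy's linspace.
matches_numpy = np.allclose(samples.compute(), np.linspace(0, 1, num=11))
```

With retstep=True the call returns a pair, so the dask array and the spacing are unpacked separately.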
-
dask.array.log(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.log.
Some inconsistencies with the Dask version may exist.
Natural logarithm, element-wise.
The natural logarithm log is the inverse of the exponential function, so that log(exp(x)) = x. The natural logarithm is logarithm in base e.
Parameters: x : array_like
Input value.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : ndarray
The natural logarithm of x, element-wise. This is a scalar if x is a scalar.
Notes
Logarithm is a multivalued function: for each x there is an infinite number of z such that exp(z) = x. The convention is to return the z whose imaginary part lies in [-pi, pi].
For real-valued input data types, log always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.
For complex-valued input, log is a complex analytical function that has a branch cut [-inf, 0] and is continuous from above on it. log handles the floating-point negative zero as an infinitesimal negative number, conforming to the C99 standard.
References
[R133] M. Abramowitz and I.A. Stegun, “Handbook of Mathematical Functions”, 10th printing, 1964, pp. 67. http://www.math.sfu.ca/~cbm/aands/
[R134] Wikipedia, “Logarithm”. https://en.wikipedia.org/wiki/Logarithm
Examples
>>> np.log([1, np.e, np.e**2, 0]) # doctest: +SKIP
array([  0.,   1.,   2., -inf])
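The difference between real and complex input described in the Notes can be seen with a single negative value. This is a small sketch using plain NumPy (the function this docstring was copied from); the choice of -1 as input is illustrative.

```python
import numpy as np

# Real input: log(-1) has no real value, so the result is nan.
# errstate silences the "invalid value" warning NumPy would emit.
with np.errstate(invalid="ignore"):
    real_result = np.log(-1.0)

# Complex input: the principal value of log(-1) is i*pi.
complex_result = np.log(-1.0 + 0.0j)
```

Casting the input to a complex dtype is therefore one way to obtain a finite result for negative values.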
-
dask.array.log10(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.log10.
Some inconsistencies with the Dask version may exist.
Return the base 10 logarithm of the input array, element-wise.
Parameters: x : array_like
Input values.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : ndarray
The logarithm to the base 10 of x, element-wise. NaNs are returned where x is negative. This is a scalar if x is a scalar.
See also
emath.log10
Notes
Logarithm is a multivalued function: for each x there is an infinite number of z such that 10**z = x. The convention is to return the z whose imaginary part lies in [-pi, pi].
For real-valued input data types, log10 always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.
For complex-valued input, log10 is a complex analytical function that has a branch cut [-inf, 0] and is continuous from above on it. log10 handles the floating-point negative zero as an infinitesimal negative number, conforming to the C99 standard.
References
[R135] M. Abramowitz and I.A. Stegun, “Handbook of Mathematical Functions”, 10th printing, 1964, pp. 67. http://www.math.sfu.ca/~cbm/aands/
[R136] Wikipedia, “Logarithm”. https://en.wikipedia.org/wiki/Logarithm
Examples
>>> np.log10([1e-15, -3.]) # doctest: +SKIP
array([-15., nan])
-
dask.array.log1p(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.log1p.
Some inconsistencies with the Dask version may exist.
Return the natural logarithm of one plus the input array, element-wise.
Calculates log(1 + x).
Parameters: x : array_like
Input values.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : ndarray
Natural logarithm of 1 + x, element-wise. This is a scalar if x is a scalar.
See also
expm1
- exp(x) - 1, the inverse of log1p.
Notes
For real-valued input, log1p is accurate also for x so small that 1 + x == 1 in floating-point accuracy.
Logarithm is a multivalued function: for each x there is an infinite number of z such that exp(z) = 1 + x. The convention is to return the z whose imaginary part lies in [-pi, pi].
For real-valued input data types, log1p always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.
For complex-valued input, log1p is a complex analytical function that has a branch cut [-inf, -1] and is continuous from above on it. log1p handles the floating-point negative zero as an infinitesimal negative number, conforming to the C99 standard.
References
[R137] M. Abramowitz and I.A. Stegun, “Handbook of Mathematical Functions”, 10th printing, 1964, pp. 67. http://www.math.sfu.ca/~cbm/aands/
[R138] Wikipedia, “Logarithm”. https://en.wikipedia.org/wiki/Logarithm
Examples
>>> np.log1p(1e-99) # doctest: +SKIP
1e-99
>>> np.log(1 + 1e-99) # doctest: +SKIP
0.0
-
dask.array.log2(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.log2.
Some inconsistencies with the Dask version may exist.
Base-2 logarithm of x.
Parameters: x : array_like
Input values.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : ndarray
Base-2 logarithm of x. This is a scalar if x is a scalar.
Notes
New in version 1.3.0.
Logarithm is a multivalued function: for each x there is an infinite number of z such that 2**z = x. The convention is to return the z whose imaginary part lies in [-pi, pi].
For real-valued input data types, log2 always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.
For complex-valued input, log2 is a complex analytical function that has a branch cut [-inf, 0] and is continuous from above on it. log2 handles the floating-point negative zero as an infinitesimal negative number, conforming to the C99 standard.
Examples
>>> x = np.array([0, 1, 2, 2**4]) # doctest: +SKIP
>>> np.log2(x) # doctest: +SKIP
array([-inf,   0.,   1.,   4.])
>>> xi = np.array([0+1.j, 1, 2+0.j, 4.j]) # doctest: +SKIP
>>> np.log2(xi) # doctest: +SKIP
array([ 0.+2.26618007j, 0.+0.j , 1.+0.j , 2.+2.26618007j])
-
dask.array.logaddexp(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.logaddexp.
Some inconsistencies with the Dask version may exist.
Logarithm of the sum of exponentiations of the inputs.
Calculates log(exp(x1) + exp(x2)). This function is useful in statistics where the calculated probabilities of events may be so small as to exceed the range of normal floating point numbers. In such cases the logarithm of the calculated probability is stored. This function allows adding probabilities stored in such a fashion.
Parameters: x1, x2 : array_like
Input values. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: result : ndarray
Logarithm of exp(x1) + exp(x2). This is a scalar if both x1 and x2 are scalars.
See also
logaddexp2
- Logarithm of the sum of exponentiations of inputs in base 2.
Notes
New in version 1.3.0.
Examples
>>> prob1 = np.log(1e-50) # doctest: +SKIP
>>> prob2 = np.log(2.5e-50) # doctest: +SKIP
>>> prob12 = np.logaddexp(prob1, prob2) # doctest: +SKIP
>>> prob12 # doctest: +SKIP
-113.87649168120691
>>> np.exp(prob12) # doctest: +SKIP
3.5000000000000057e-50
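To see why staying in log space matters, the sketch below compares logaddexp with the direct formula for inputs whose exponentials underflow. It uses plain NumPy, and the value -1000 is an arbitrary choice made for illustration.

```python
import numpy as np

# Two log-probabilities whose exponentials underflow to 0.0 in float64.
log_p1 = -1000.0
log_p2 = -1000.0

# Naive route: exp() underflows, so the sum is 0 and its log is -inf.
with np.errstate(divide="ignore"):
    naive = np.log(np.exp(log_p1) + np.exp(log_p2))

# logaddexp stays in log space and returns log(2) - 1000.
stable = np.logaddexp(log_p1, log_p2)
```

The naive result loses all information about the inputs, while the logaddexp result can still be compared or accumulated further.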
-
dask.array.logaddexp2(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.logaddexp2.
Some inconsistencies with the Dask version may exist.
Logarithm of the sum of exponentiations of the inputs in base-2.
Calculates log2(2**x1 + 2**x2). This function is useful in machine learning when the calculated probabilities of events may be so small as to exceed the range of normal floating point numbers. In such cases the base-2 logarithm of the calculated probability can be used instead. This function allows adding probabilities stored in such a fashion.
Parameters: x1, x2 : array_like
Input values. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: result : ndarray
Base-2 logarithm of 2**x1 + 2**x2. This is a scalar if both x1 and x2 are scalars.
See also
logaddexp
- Logarithm of the sum of exponentiations of the inputs.
Notes
New in version 1.3.0.
Examples
>>> prob1 = np.log2(1e-50) # doctest: +SKIP
>>> prob2 = np.log2(2.5e-50) # doctest: +SKIP
>>> prob12 = np.logaddexp2(prob1, prob2) # doctest: +SKIP
>>> prob1, prob2, prob12 # doctest: +SKIP
(-166.09640474436813, -164.77447664948076, -164.28904982231052)
>>> 2**prob12 # doctest: +SKIP
3.4999999999999914e-50
-
dask.array.logical_and(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.logical_and.
Some inconsistencies with the Dask version may exist.
Compute the truth value of x1 AND x2 element-wise.
Parameters: x1, x2 : array_like
Input arrays. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : ndarray or bool
Boolean result of the logical AND operation applied to the elements of x1 and x2; the shape is determined by broadcasting. This is a scalar if both x1 and x2 are scalars.
Examples
>>> np.logical_and(True, False) # doctest: +SKIP
False
>>> np.logical_and([True, False], [False, False]) # doctest: +SKIP
array([False, False])
>>> x = np.arange(5) # doctest: +SKIP
>>> np.logical_and(x>1, x<4) # doctest: +SKIP
array([False, False, True, True, False])
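As a hedged illustration of what "truth value" means for non-boolean input, the sketch below contrasts logical_and with np.bitwise_and on integers. The arrays a and b are made up for the example, and the comparison with bitwise_and is an addition here, not part of the copied docstring.

```python
import numpy as np

a = np.array([1, 2, 0])
b = np.array([2, 1, 3])

# logical_and tests truthiness: any non-zero value counts as True.
logical = np.logical_and(a, b)

# bitwise_and combines the bits of the integers instead.
bitwise = np.bitwise_and(a, b)
```

The two functions agree on boolean arrays but can differ on integers, so logical_and is the safer choice when the inputs are not already boolean.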
-
dask.array.logical_not(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.logical_not.
Some inconsistencies with the Dask version may exist.
Compute the truth value of NOT x element-wise.
Parameters: x : array_like
Logical NOT is applied to the elements of x.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : bool or ndarray of bool
Boolean result with the same shape as x of the NOT operation on elements of x. This is a scalar if x is a scalar.
Examples
>>> np.logical_not(3) # doctest: +SKIP
False
>>> np.logical_not([True, False, 0, 1]) # doctest: +SKIP
array([False, True, True, False])
>>> x = np.arange(5) # doctest: +SKIP
>>> np.logical_not(x<3) # doctest: +SKIP
array([False, False, False, True, True])
-
dask.array.logical_or(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.logical_or.
Some inconsistencies with the Dask version may exist.
Compute the truth value of x1 OR x2 element-wise.
Parameters: x1, x2 : array_like
Logical OR is applied to the elements of x1 and x2. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : ndarray or bool
Boolean result of the logical OR operation applied to the elements of x1 and x2; the shape is determined by broadcasting. This is a scalar if both x1 and x2 are scalars.
Examples
>>> np.logical_or(True, False) # doctest: +SKIP
True
>>> np.logical_or([True, False], [False, False]) # doctest: +SKIP
array([ True, False])
>>> x = np.arange(5) # doctest: +SKIP
>>> np.logical_or(x < 1, x > 3) # doctest: +SKIP
array([ True, False, False, False, True])
-
dask.array.logical_xor(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.logical_xor.
Some inconsistencies with the Dask version may exist.
Compute the truth value of x1 XOR x2, element-wise.
Parameters: x1, x2 : array_like
Logical XOR is applied to the elements of x1 and x2. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : bool or ndarray of bool
Boolean result of the logical XOR operation applied to the elements of x1 and x2; the shape is determined by broadcasting. This is a scalar if both x1 and x2 are scalars.
Examples
>>> np.logical_xor(True, False) # doctest: +SKIP
True
>>> np.logical_xor([True, True, False, False], [True, False, True, False]) # doctest: +SKIP
array([False, True, True, False])
>>> x = np.arange(5) # doctest: +SKIP
>>> np.logical_xor(x < 1, x > 3) # doctest: +SKIP
array([ True, False, False, False, True])
Simple example showing support of broadcasting
>>> np.logical_xor(0, np.eye(2)) # doctest: +SKIP
array([[ True, False],
       [False,  True]])
-
dask.array.map_blocks(func, *args, name=None, token=None, dtype=None, chunks=None, drop_axis=[], new_axis=None, meta=None, **kwargs)
Map a function across all blocks of a dask array.
Parameters: func : callable
Function to apply to every block in the array.
args : dask arrays or other objects
dtype : np.dtype, optional
The dtype of the output array. It is recommended to provide this. If not provided, will be inferred by applying the function to a small set of fake data.
chunks : tuple, optional
Chunk shape of resulting blocks if the function does not preserve shape. If not provided, the resulting array is assumed to have the same block structure as the first input array.
drop_axis : number or iterable, optional
Dimensions lost by the function.
new_axis : number or iterable, optional
New dimensions created by the function. Note that these are applied after drop_axis (if present).
token : string, optional
The key prefix to use for the output array. If not provided, will be determined from the function name.
name : string, optional
The key name to use for the output array. Note that this fully specifies the output key name, and must be unique. If not provided, will be determined by a hash of the arguments.
**kwargs :
Other keyword arguments to pass to function. Values must be constants (not dask.arrays)
See also
dask.array.blockwise
- Generalized operation with control over block alignment.
Examples
>>> import numpy as np
>>> import dask.array as da
>>> x = da.arange(6, chunks=3)
>>> x.map_blocks(lambda x: x * 2).compute()
array([ 0, 2, 4, 6, 8, 10])
The da.map_blocks function can also accept multiple arrays.
>>> d = da.arange(5, chunks=2)
>>> e = da.arange(5, chunks=2)
>>> f = da.map_blocks(lambda a, b: a + b**2, d, e)
>>> f.compute()
array([ 0, 2, 6, 12, 20])
If the function changes shape of the blocks then you must provide chunks explicitly.
>>> y = x.map_blocks(lambda x: x[::2], chunks=((2, 2),))
You have a bit of freedom in specifying chunks. If all of the output chunk sizes are the same, you can provide just that chunk size as a single tuple.
>>> a = da.arange(18, chunks=(6,))
>>> b = a.map_blocks(lambda x: x[:3], chunks=(3,))
If the function changes the dimension of the blocks you must specify the created or destroyed dimensions.
>>> b = a.map_blocks(lambda x: x[None, :, None], chunks=(1, 6, 1),
...                  new_axis=[0, 2])
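The opposite case, a function that removes a dimension, can be sketched with drop_axis. This is an illustrative example, assuming dask is installed; the array shape, the chunking and the row-sum function are arbitrary choices.

```python
import dask.array as da

# A 4x6 array split into two row blocks; each block spans all columns.
x = da.ones((4, 6), chunks=(2, 6))

# Summing over axis 1 removes that axis from every block,
# so drop_axis=1 tells dask the output is one-dimensional.
row_sums = x.map_blocks(lambda block: block.sum(axis=1),
                        drop_axis=1, dtype=x.dtype)

result = row_sums.compute()
```

The dropped axis is held in a single chunk here, so each block can compute its row sums without needing data from any other block.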
If chunks is specified but new_axis is not, then it is inferred to add the necessary number of axes on the left.
Map_blocks aligns blocks by block positions without regard to shape. In the following example we have two arrays with the same number of blocks but with different shape and chunk sizes.
>>> x = da.arange(1000, chunks=(100,))
>>> y = da.arange(100, chunks=(10,))
The relevant attribute to match is numblocks.
>>> x.numblocks
(10,)
>>> y.numblocks
(10,)
If these match (up to broadcasting rules) then we can map arbitrary functions across blocks
>>> def func(a, b):
...     return np.array([a.max(), b.max()])
>>> da.map_blocks(func, x, y, chunks=(2,), dtype='i8')
dask.array<func, shape=(20,), dtype=int64, chunksize=(2,), chunktype=numpy.ndarray>
>>> _.compute()
array([ 99, 9, 199, 19, 299, 29, 399, 39, 499, 49, 599, 59, 699, 69, 799, 79, 899, 89, 999, 99])
Your block function gets information about where it is in the array by accepting a special block_info keyword argument.
>>> def func(block, block_info=None):
...     pass
This will receive the following information:
>>> block_info # doctest: +SKIP
{0: {'shape': (1000,),
     'num-chunks': (10,),
     'chunk-location': (4,),
     'array-location': [(400, 500)]},
 None: {'shape': (1000,),
        'num-chunks': (10,),
        'chunk-location': (4,),
        'array-location': [(400, 500)],
        'chunk-shape': (100,),
        'dtype': dtype('float64')}}
For each argument and keyword argument that is a dask array (the positions of which are the first index), you will receive the shape of the full array, the number of chunks of the full array in each dimension, the chunk location (for example the chunk at index 4 along the first dimension), and the array location (for example the slice corresponding to 400:500). The same information is provided for the output, with the key None, plus the shape and dtype that should be returned.
These features can be combined to synthesize an array from scratch, for example:
>>> def func(block_info=None):
...     loc = block_info[None]['array-location'][0]
...     return np.arange(loc[0], loc[1])
>>> da.map_blocks(func, chunks=((4, 4),), dtype=np.float64)
dask.array<func, shape=(8,), dtype=float64, chunksize=(4,), chunktype=numpy.ndarray>
>>> _.compute()
array([0, 1, 2, 3, 4, 5, 6, 7])
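The entry for an input array can be used in the same way. The sketch below is illustrative and assumes dask is installed; add_offset is a hypothetical helper that adds each block's starting index to its values, reading it from the block_info entry keyed by the argument position 0.

```python
import dask.array as da

def add_offset(block, block_info=None):
    # block_info[0] describes the first array argument; its
    # 'array-location' holds one (start, stop) pair per dimension.
    start, _stop = block_info[0]['array-location'][0]
    return block + start

# Eight zeros in two blocks of four.
x = da.zeros(8, chunks=4, dtype='i8')

# dtype is given explicitly so dask does not need to infer it.
shifted = x.map_blocks(add_offset, dtype=x.dtype).compute()
```

The first block starts at index 0 and the second at index 4, so the two halves of the result differ even though every input block holds the same values.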
You may specify the key name prefix of the resulting task in the graph with the optional token keyword argument.
>>> x.map_blocks(lambda x: x + 1, token='increment') # doctest: +SKIP
dask.array<increment, shape=(100,), dtype=int64, chunksize=(10,), chunktype=numpy.ndarray>
-
dask.array.matmul(x1, x2, /, out=None, *, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.matmul.
Some inconsistencies with the Dask version may exist.
Matrix product of two arrays.
Parameters: x1, x2 : array_like
Input arrays, scalars not allowed.
out : ndarray, optional
A location into which the result is stored. If provided, it must have a shape that matches the signature (n,k),(k,m)->(n,m). If not provided or None, a freshly-allocated array is returned.
**kwargs
For other keyword-only arguments, see the ufunc docs.
New in version 1.16: Now handles ufunc kwargs
Returns: y : ndarray
The matrix product of the inputs. This is a scalar only when both x1, x2 are 1-d vectors.
Raises: ValueError
If the last dimension of x1 is not the same size as the second-to-last dimension of x2.
If a scalar value is passed in.
Notes
The behavior depends on the arguments in the following way.
- If both arguments are 2-D they are multiplied like conventional matrices.
- If either argument is N-D, N > 2, it is treated as a stack of matrices residing in the last two indexes and broadcast accordingly.
- If the first argument is 1-D, it is promoted to a matrix by prepending a 1 to its dimensions. After matrix multiplication the prepended 1 is removed.
- If the second argument is 1-D, it is promoted to a matrix by appending a 1 to its dimensions. After matrix multiplication the appended 1 is removed.
matmul differs from dot in two important ways:
- Multiplication by scalars is not allowed, use * instead.
- Stacks of matrices are broadcast together as if the matrices were elements, respecting the signature (n,k),(k,m)->(n,m):
>>> a = np.ones([9, 5, 7, 4]) # doctest: +SKIP
>>> c = np.ones([9, 5, 4, 3]) # doctest: +SKIP
>>> np.dot(a, c).shape # doctest: +SKIP
(9, 5, 7, 9, 5, 3)
>>> np.matmul(a, c).shape # doctest: +SKIP
(9, 5, 7, 3)
>>> # n is 7, k is 4, m is 3
The matmul function implements the semantics of the @ operator introduced in Python 3.5 following PEP465.
Examples
For 2-D arrays it is the matrix product:
>>> a = np.array([[1, 0], # doctest: +SKIP
...               [0, 1]])
>>> b = np.array([[4, 1], # doctest: +SKIP
...               [2, 2]])
>>> np.matmul(a, b) # doctest: +SKIP
array([[4, 1],
       [2, 2]])
For 2-D mixed with 1-D, the result is the usual.
>>> a = np.array([[1, 0], # doctest: +SKIP
...               [0, 1]])
>>> b = np.array([1, 2]) # doctest: +SKIP
>>> np.matmul(a, b) # doctest: +SKIP
array([1, 2])
>>> np.matmul(b, a) # doctest: +SKIP
array([1, 2])
Broadcasting is conventional for stacks of arrays
>>> a = np.arange(2 * 2 * 4).reshape((2, 2, 4)) # doctest: +SKIP
>>> b = np.arange(2 * 2 * 4).reshape((2, 4, 2)) # doctest: +SKIP
>>> np.matmul(a,b).shape # doctest: +SKIP
(2, 2, 2)
>>> np.matmul(a, b)[0, 1, 1] # doctest: +SKIP
98
>>> sum(a[0, 1, :] * b[0 , :, 1]) # doctest: +SKIP
98
Vector, vector returns the scalar inner product, but neither argument is complex-conjugated:
>>> np.matmul([2j, 3j], [2j, 3j]) # doctest: +SKIP
(-13+0j)
Scalar multiplication raises an error.
>>> np.matmul([1,2], 3) # doctest: +SKIP
Traceback (most recent call last):
...
ValueError: matmul: Input operand 1 does not have enough dimensions ...
New in version 1.10.0.
dask.array.max(a, axis=None, keepdims=False, split_every=None, out=None)
Return the maximum of an array or maximum along an axis.
This docstring was copied from numpy.max.
Some inconsistencies with the Dask version may exist.
Parameters: a : array_like
Input data.
axis : None or int or tuple of ints, optional
Axis or axes along which to operate. By default, flattened input is used.
New in version 1.7.0.
If this is a tuple of ints, the maximum is selected over multiple axes, instead of a single axis or all the axes as before.
out : ndarray, optional
Alternative output array in which to place the result. Must be of the same shape and buffer length as the expected output. See doc.ufuncs (Section “Output arguments”) for more details.
keepdims : bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
If the default value is passed, then keepdims will not be passed through to the amax method of sub-classes of ndarray, however any non-default value will be. If the sub-class’ method does not implement keepdims any exceptions will be raised.
initial : scalar, optional (Not supported in Dask)
The minimum value of an output element. Must be present to allow computation on empty slice. See ~numpy.ufunc.reduce for details.
New in version 1.15.0.
where : array_like of bool, optional (Not supported in Dask)
Elements to compare for the maximum. See ~numpy.ufunc.reduce for details.
New in version 1.17.0.
Returns: amax : ndarray or scalar
Maximum of a. If axis is None, the result is a scalar value. If axis is given, the result is an array of dimension a.ndim - 1.
See also
amin
- The minimum value of an array along a given axis, propagating any NaNs.
nanmax
- The maximum value of an array along a given axis, ignoring any NaNs.
maximum
- Element-wise maximum of two arrays, propagating any NaNs.
fmax
- Element-wise maximum of two arrays, ignoring any NaNs.
argmax
- Return the indices of the maximum values.
Notes
NaN values are propagated, that is if at least one item is NaN, the corresponding max value will be NaN as well. To ignore NaN values (MATLAB behavior), please use nanmax.
Don’t use amax for element-wise comparison of 2 arrays; when a.shape[0] is 2, maximum(a[0], a[1]) is faster than amax(a, axis=0).
Examples
>>> a = np.arange(4).reshape((2,2)) # doctest: +SKIP
>>> a # doctest: +SKIP
array([[0, 1],
       [2, 3]])
>>> np.amax(a) # Maximum of the flattened array # doctest: +SKIP
3
>>> np.amax(a, axis=0) # Maxima along the first axis # doctest: +SKIP
array([2, 3])
>>> np.amax(a, axis=1) # Maxima along the second axis # doctest: +SKIP
array([1, 3])
>>> np.amax(a, where=[False, True], initial=-1, axis=0) # doctest: +SKIP
array([-1, 3])
>>> b = np.arange(5, dtype=float) # doctest: +SKIP
>>> b[2] = np.NaN # doctest: +SKIP
>>> np.amax(b) # doctest: +SKIP
nan
>>> np.amax(b, where=~np.isnan(b), initial=-1) # doctest: +SKIP
4.0
>>> np.nanmax(b) # doctest: +SKIP
4.0
You can use an initial value to compute the maximum of an empty slice, or to initialize it to a different value:
>>> np.max([[-50], [10]], axis=-1, initial=0) # doctest: +SKIP
array([ 0, 10])
Notice that the initial value is used as one of the elements for which the maximum is determined, unlike the default argument of Python’s max function, which is only used for empty iterables.
>>> np.max([5], initial=6) # doctest: +SKIP
6
>>> max([5], default=6) # doctest: +SKIP
5
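The keepdims description above says the reduced result broadcasts against the input. A minimal NumPy sketch of that behaviour follows; the sample values are illustrative assumptions, and the same keyword appears in the dask.array.max signature.

```python
import numpy as np

a = np.array([[1.0, 4.0, 2.0],
              [8.0, 3.0, 5.0]])

# keepdims=True leaves the reduced axis in place with size one: shape (2, 1).
row_max = np.max(a, axis=1, keepdims=True)

# The (2, 1) result broadcasts against the (2, 3) input, so each row
# can be divided by its own maximum without any reshaping.
scaled = a / row_max

# Without keepdims the reduced axis is dropped: shape (2,).
flat_max = np.max(a, axis=1)
```

With the default keepdims=False, the shape (2,) result would need an explicit reshape before it could be divided into the rows of a.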
dask.array.maximum(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.maximum.
Some inconsistencies with the Dask version may exist.
Element-wise maximum of array elements.
Compares two arrays and returns a new array containing the element-wise maxima. If one of the elements being compared is a NaN, then that element is returned. If both elements are NaNs then the first is returned. The latter distinction is important for complex NaNs, which are defined as at least one of the real or imaginary parts being a NaN. The net effect is that NaNs are propagated.
Parameters: x1, x2 : array_like
The arrays holding the elements to be compared. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : ndarray or scalar
The maximum of x1 and x2, element-wise. This is a scalar if both x1 and x2 are scalars.
See also
Notes
The maximum is equivalent to np.where(x1 >= x2, x1, x2) when neither x1 nor x2 are NaNs, but it is faster and does proper broadcasting.
Examples
>>> np.maximum([2, 3, 4], [1, 5, 2]) # doctest: +SKIP
array([2, 5, 4])
>>> np.maximum(np.eye(2), [0.5, 2]) # broadcasting # doctest: +SKIP
array([[ 1. , 2. ],
       [ 0.5, 2. ]])
>>> np.maximum([np.nan, 0, np.nan], [0, np.nan, np.nan]) # doctest: +SKIP
array([nan, nan, nan])
>>> np.maximum(np.Inf, 1) # doctest: +SKIP
inf
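The notes above describe two properties: NaNs propagate, and for NaN-free input the result matches np.where(x1 >= x2, x1, x2). The following NumPy sketch checks both and contrasts them with numpy.fmax, which ignores NaNs; the input values are illustrative assumptions.

```python
import numpy as np

x1 = np.array([1.0, np.nan, 3.0])
x2 = np.array([2.0, 0.0, np.nan])

# maximum: the result is NaN wherever either input is NaN.
propagated = np.maximum(x1, x2)

# fmax: a NaN is returned only when both inputs are NaN.
ignored = np.fmax(x1, x2)

# For NaN-free input, maximum agrees with the np.where formulation.
clean1 = np.array([2, 3, 4])
clean2 = np.array([1, 5, 2])
same = np.array_equal(np.maximum(clean1, clean2),
                      np.where(clean1 >= clean2, clean1, clean2))
```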
dask.array.mean(a, axis=None, dtype=None, keepdims=False, split_every=None, out=None)
Compute the arithmetic mean along the specified axis.
This docstring was copied from numpy.mean.
Some inconsistencies with the Dask version may exist.
Returns the average of the array elements. The average is taken over the flattened array by default, otherwise over the specified axis. float64 intermediate and return values are used for integer inputs.
Parameters: a : array_like
Array containing numbers whose mean is desired. If a is not an array, a conversion is attempted.
axis : None or int or tuple of ints, optional
Axis or axes along which the means are computed. The default is to compute the mean of the flattened array.
New in version 1.7.0.
If this is a tuple of ints, a mean is performed over multiple axes, instead of a single axis or all the axes as before.
dtype : data-type, optional
Type to use in computing the mean. For integer inputs, the default is float64; for floating point inputs, it is the same as the input dtype.
out : ndarray, optional
Alternate output array in which to place the result. The default is None; if provided, it must have the same shape as the expected output, but the type will be cast if necessary. See doc.ufuncs for details.
keepdims : bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
If the default value is passed, then keepdims will not be passed through to the mean method of sub-classes of ndarray, however any non-default value will be. If the sub-class’ method does not implement keepdims any exceptions will be raised.
Returns: m : ndarray, see dtype parameter above
If out=None, returns a new array containing the mean values, otherwise a reference to the output array is returned.
Notes
The arithmetic mean is the sum of the elements along the axis divided by the number of elements.
Note that for floating-point input, the mean is computed using the same precision the input has. Depending on the input data, this can cause the results to be inaccurate, especially for float32 (see example below). Specifying a higher-precision accumulator using the dtype keyword can alleviate this issue.
By default, float16 results are computed using float32 intermediates for extra precision.
Examples
>>> a = np.array([[1, 2], [3, 4]]) # doctest: +SKIP
>>> np.mean(a) # doctest: +SKIP
2.5
>>> np.mean(a, axis=0) # doctest: +SKIP
array([2., 3.])
>>> np.mean(a, axis=1) # doctest: +SKIP
array([1.5, 3.5])
In single precision, mean can be inaccurate:
>>> a = np.zeros((2, 512*512), dtype=np.float32) # doctest: +SKIP
>>> a[0, :] = 1.0 # doctest: +SKIP
>>> a[1, :] = 0.1 # doctest: +SKIP
>>> np.mean(a) # doctest: +SKIP
0.54999924
Computing the mean in float64 is more accurate:
>>> np.mean(a, dtype=np.float64) # doctest: +SKIP
0.55000000074505806 # may vary
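The axis parameter above also accepts a tuple of ints, and the Notes define the mean as the sum along the axis divided by the number of elements. A small NumPy sketch of both points, with an illustrative array that is not part of the docstring:

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)

# Reduce over axes 0 and 2 at once; only axis 1 survives, so the shape is (3,).
m = np.mean(a, axis=(0, 2))

# The same result from the definition: sum divided by the element count.
manual = a.sum(axis=(0, 2)) / (a.shape[0] * a.shape[2])

# keepdims=True keeps the reduced axes with size one: shape (1, 3, 1).
kept = np.mean(a, axis=(0, 2), keepdims=True)
```

The input is an integer array, so the result dtype is float64, as stated in the dtype description.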
dask.array.median(a, axis=None, keepdims=False, out=None)
Compute the median along the specified axis.
This docstring was copied from numpy.median.
Some inconsistencies with the Dask version may exist.
This works by automatically rechunking the reduced axes to a single chunk and then calling the numpy.median function across the remaining dimensions.
Returns the median of the array elements.
Parameters: a : array_like
Input array or object that can be converted to an array.
axis : {int, sequence of int, None}, optional
Axis or axes along which the medians are computed. The default is to compute the median along a flattened version of the array. A sequence of axes is supported since version 1.9.0.
out : ndarray, optional
Alternative output array in which to place the result. It must have the same shape and buffer length as the expected output, but the type (of the output) will be cast if necessary.
overwrite_input : bool, optional (Not supported in Dask)
If True, then allow use of memory of input array a for calculations. The input array will be modified by the call to median. This will save memory when you do not need to preserve the contents of the input array. Treat the input as undefined, but it will probably be fully or partially sorted. Default is False. If overwrite_input is True and a is not already an ndarray, an error will be raised.
keepdims : bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original arr.
New in version 1.9.0.
Returns: median : ndarray
A new array holding the result. If the input contains integers or floats smaller than float64, then the output data-type is np.float64. Otherwise, the data-type of the output is the same as that of the input. If out is specified, that array is returned instead.
See also
Notes
Given a vector V of length N, the median of V is the middle value of a sorted copy of V, V_sorted, i.e., V_sorted[(N-1)/2], when N is odd, and the average of the two middle values of V_sorted when N is even.
Examples
>>> a = np.array([[10, 7, 4], [3, 2, 1]]) # doctest: +SKIP
>>> a # doctest: +SKIP
array([[10, 7, 4],
       [ 3, 2, 1]])
>>> np.median(a) # doctest: +SKIP
3.5
>>> np.median(a, axis=0) # doctest: +SKIP
array([6.5, 4.5, 2.5])
>>> np.median(a, axis=1) # doctest: +SKIP
array([7., 2.])
>>> m = np.median(a, axis=0) # doctest: +SKIP
>>> out = np.zeros_like(m) # doctest: +SKIP
>>> np.median(a, axis=0, out=m) # doctest: +SKIP
array([6.5, 4.5, 2.5])
>>> m # doctest: +SKIP
array([6.5, 4.5, 2.5])
>>> b = a.copy() # doctest: +SKIP
>>> np.median(b, axis=1, overwrite_input=True) # doctest: +SKIP
array([7., 2.])
>>> assert not np.all(a==b) # doctest: +SKIP
>>> b = a.copy() # doctest: +SKIP
>>> np.median(b, axis=None, overwrite_input=True) # doctest: +SKIP
3.5
>>> assert not np.all(a==b) # doctest: +SKIP
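The definition in the Notes (middle value of the sorted copy for odd N, average of the two middle values for even N) can be written out directly. This is a plain-Python sketch for illustration only; the helper name median_from_definition is hypothetical and this is not how NumPy or Dask implement the function.

```python
import statistics


def median_from_definition(values):
    """Median of a 1-D sequence, following the definition in the Notes."""
    v_sorted = sorted(values)
    n = len(v_sorted)
    if n == 0:
        raise ValueError("median of an empty sequence is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd N: n // 2 is the same index as (N - 1) / 2.
        return v_sorted[mid]
    # Even N: average the two middle values.
    return (v_sorted[mid - 1] + v_sorted[mid]) / 2


odd = median_from_definition([10, 7, 4])
even = median_from_definition([10, 7, 4, 3, 2, 1])
```

The even case uses the same six values as the flattened array in the Examples, where np.median(a) is 3.5.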
dask.array.meshgrid(*xi, **kwargs)
Return coordinate matrices from coordinate vectors.
This docstring was copied from numpy.meshgrid.
Some inconsistencies with the Dask version may exist.
Make N-D coordinate arrays for vectorized evaluations of N-D scalar/vector fields over N-D grids, given one-dimensional coordinate arrays x1, x2,…, xn.
Changed in version 1.9: 1-D and 0-D cases are allowed.
Parameters: x1, x2,…, xn : array_like
1-D arrays representing the coordinates of a grid.
indexing : {‘xy’, ‘ij’}, optional
Cartesian (‘xy’, default) or matrix (‘ij’) indexing of output. See Notes for more details.
New in version 1.7.0.
sparse : bool, optional
If True a sparse grid is returned in order to conserve memory. Default is False.
New in version 1.7.0.
copy : bool, optional
If False, views into the original arrays are returned in order to conserve memory. Default is True. Please note that sparse=False, copy=False will likely return non-contiguous arrays. Furthermore, more than one element of a broadcast array may refer to a single memory location. If you need to write to the arrays, make copies first.
New in version 1.7.0.
Returns: X1, X2,…, XN : ndarray
For vectors x1, x2,…, xn with lengths Ni=len(xi), return (N1, N2, N3,...Nn) shaped arrays if indexing=’ij’ or (N2, N1, N3,...Nn) shaped arrays if indexing=’xy’ with the elements of xi repeated to fill the matrix along the first dimension for x1, the second for x2 and so on.
See also
index_tricks.mgrid
- Construct a multi-dimensional “meshgrid” using indexing notation.
index_tricks.ogrid
- Construct an open multi-dimensional “meshgrid” using indexing notation.
Notes
This function supports both indexing conventions through the indexing keyword argument. Giving the string ‘ij’ returns a meshgrid with matrix indexing, while ‘xy’ returns a meshgrid with Cartesian indexing. In the 2-D case with inputs of length M and N, the outputs are of shape (N, M) for ‘xy’ indexing and (M, N) for ‘ij’ indexing. In the 3-D case with inputs of length M, N and P, outputs are of shape (N, M, P) for ‘xy’ indexing and (M, N, P) for ‘ij’ indexing. The difference is illustrated by the following code snippet:
xv, yv = np.meshgrid(x, y, sparse=False, indexing='ij')
for i in range(nx):
    for j in range(ny):
        # treat xv[i,j], yv[i,j]

xv, yv = np.meshgrid(x, y, sparse=False, indexing='xy')
for i in range(nx):
    for j in range(ny):
        # treat xv[j,i], yv[j,i]
In the 1-D and 0-D case, the indexing and sparse keywords have no effect.
Examples
>>> nx, ny = (3, 2) # doctest: +SKIP
>>> x = np.linspace(0, 1, nx) # doctest: +SKIP
>>> y = np.linspace(0, 1, ny) # doctest: +SKIP
>>> xv, yv = np.meshgrid(x, y) # doctest: +SKIP
>>> xv # doctest: +SKIP
array([[0. , 0.5, 1. ],
       [0. , 0.5, 1. ]])
>>> yv # doctest: +SKIP
array([[0., 0., 0.],
       [1., 1., 1.]])
>>> xv, yv = np.meshgrid(x, y, sparse=True) # make sparse output arrays # doctest: +SKIP
>>> xv # doctest: +SKIP
array([[0. , 0.5, 1. ]])
>>> yv # doctest: +SKIP
array([[0.],
       [1.]])
meshgrid is very useful to evaluate functions on a grid.
>>> import matplotlib.pyplot as plt # doctest: +SKIP
>>> x = np.arange(-5, 5, 0.1) # doctest: +SKIP
>>> y = np.arange(-5, 5, 0.1) # doctest: +SKIP
>>> xx, yy = np.meshgrid(x, y, sparse=True) # doctest: +SKIP
>>> z = np.sin(xx**2 + yy**2) / (xx**2 + yy**2) # doctest: +SKIP
>>> h = plt.contourf(x,y,z) # doctest: +SKIP
>>> plt.show() # doctest: +SKIP
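The shape rule from the Notes, (N, M) for ‘xy’ indexing and (M, N) for ‘ij’ indexing, can be confirmed with a short NumPy sketch. The input lengths are illustrative assumptions.

```python
import numpy as np

x = np.linspace(0, 1, 3)   # length M = 3
y = np.linspace(0, 1, 2)   # length N = 2

# Cartesian indexing: outputs have shape (N, M), addressed as xv[j, i].
xv_xy, yv_xy = np.meshgrid(x, y, indexing="xy")

# Matrix indexing: outputs have shape (M, N), addressed as xv[i, j].
xv_ij, yv_ij = np.meshgrid(x, y, indexing="ij")

print(xv_xy.shape, xv_ij.shape)
```

In the 2-D case the two conventions differ only by a transpose of each output array.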
dask.array.min(a, axis=None, keepdims=False, split_every=None, out=None)
Return the minimum of an array or minimum along an axis.
This docstring was copied from numpy.min.
Some inconsistencies with the Dask version may exist.
Parameters: a : array_like
Input data.
axis : None or int or tuple of ints, optional
Axis or axes along which to operate. By default, flattened input is used.
New in version 1.7.0.
If this is a tuple of ints, the minimum is selected over multiple axes, instead of a single axis or all the axes as before.
out : ndarray, optional
Alternative output array in which to place the result. Must be of the same shape and buffer length as the expected output. See doc.ufuncs (Section “Output arguments”) for more details.
keepdims : bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
If the default value is passed, then keepdims will not be passed through to the amin method of sub-classes of ndarray, however any non-default value will be. If the sub-class’ method does not implement keepdims any exceptions will be raised.
initial : scalar, optional (Not supported in Dask)
The maximum value of an output element. Must be present to allow computation on empty slice. See ~numpy.ufunc.reduce for details.
New in version 1.15.0.
where : array_like of bool, optional (Not supported in Dask)
Elements to compare for the minimum. See ~numpy.ufunc.reduce for details.
New in version 1.17.0.
Returns: amin : ndarray or scalar
Minimum of a. If axis is None, the result is a scalar value. If axis is given, the result is an array of dimension a.ndim - 1.
See also
amax
- The maximum value of an array along a given axis, propagating any NaNs.
nanmin
- The minimum value of an array along a given axis, ignoring any NaNs.
minimum
- Element-wise minimum of two arrays, propagating any NaNs.
fmin
- Element-wise minimum of two arrays, ignoring any NaNs.
argmin
- Return the indices of the minimum values.
Notes
NaN values are propagated, that is if at least one item is NaN, the corresponding min value will be NaN as well. To ignore NaN values (MATLAB behavior), please use nanmin.
Don’t use amin for element-wise comparison of 2 arrays; when a.shape[0] is 2, minimum(a[0], a[1]) is faster than amin(a, axis=0).
Examples
>>> a = np.arange(4).reshape((2,2)) # doctest: +SKIP
>>> a # doctest: +SKIP
array([[0, 1],
       [2, 3]])
>>> np.amin(a) # Minimum of the flattened array # doctest: +SKIP
0
>>> np.amin(a, axis=0) # Minima along the first axis # doctest: +SKIP
array([0, 1])
>>> np.amin(a, axis=1) # Minima along the second axis # doctest: +SKIP
array([0, 2])
>>> np.amin(a, where=[False, True], initial=10, axis=0) # doctest: +SKIP
array([10, 1])
>>> b = np.arange(5, dtype=float) # doctest: +SKIP
>>> b[2] = np.NaN # doctest: +SKIP
>>> np.amin(b) # doctest: +SKIP
nan
>>> np.amin(b, where=~np.isnan(b), initial=10) # doctest: +SKIP
0.0
>>> np.nanmin(b) # doctest: +SKIP
0.0
>>> np.min([[-50], [10]], axis=-1, initial=0) # doctest: +SKIP
array([-50, 0])
Notice that the initial value is used as one of the elements for which the minimum is determined. This is not the same as the default argument of Python’s min function, which is only used for empty iterables.
>>> np.min([6], initial=5) # doctest: +SKIP
5
>>> min([6], default=5) # doctest: +SKIP
6
dask.array.minimum(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.minimum.
Some inconsistencies with the Dask version may exist.
Element-wise minimum of array elements.
Compares two arrays and returns a new array containing the element-wise minima. If one of the elements being compared is a NaN, then that element is returned. If both elements are NaNs then the first is returned. The latter distinction is important for complex NaNs, which are defined as at least one of the real or imaginary parts being a NaN. The net effect is that NaNs are propagated.
Parameters: x1, x2 : array_like
The arrays holding the elements to be compared. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : ndarray or scalar
The minimum of x1 and x2, element-wise. This is a scalar if both x1 and x2 are scalars.
See also
Notes
The minimum is equivalent to np.where(x1 <= x2, x1, x2) when neither x1 nor x2 are NaNs, but it is faster and does proper broadcasting.
Examples
>>> np.minimum([2, 3, 4], [1, 5, 2]) # doctest: +SKIP
array([1, 3, 2])
>>> np.minimum(np.eye(2), [0.5, 2]) # broadcasting # doctest: +SKIP
array([[ 0.5, 0. ],
       [ 0. , 1. ]])
>>> np.minimum([np.nan, 0, np.nan],[0, np.nan, np.nan]) # doctest: +SKIP
array([nan, nan, nan])
>>> np.minimum(-np.Inf, 1) # doctest: +SKIP
-inf
dask.array.modf(x, [out1, out2, ]/, [out=(None, None), ]*, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.modf.
Some inconsistencies with the Dask version may exist.
Return the fractional and integral parts of an array, element-wise.
The fractional and integral parts are negative if the given number is negative.
Parameters: x : array_like
Input array.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y1 : ndarray
Fractional part of x. This is a scalar if x is a scalar.
y2 : ndarray
Integral part of x. This is a scalar if x is a scalar.
See also
divmod
- divmod(x, 1) is equivalent to modf with the return values switched, except it always has a positive remainder.
Notes
For integer input the return values are floats.
Examples
>>> np.modf([0, 3.5]) # doctest: +SKIP
(array([ 0. , 0.5]), array([ 0., 3.]))
>>> np.modf(-0.5) # doctest: +SKIP
(-0.5, -0)
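The relationship to divmod noted under See also shows up most clearly for negative input, where modf keeps the sign on both parts while divmod(x, 1) returns a non-negative remainder. A NumPy sketch with illustrative values:

```python
import numpy as np

# modf returns (fractional, integral); both parts carry the sign of the input.
frac_neg, whole_neg = np.modf(-3.25)

# divmod(x, 1) returns (quotient, remainder) with a non-negative remainder.
quot_neg, rem_neg = np.divmod(-3.25, 1)

# For positive input the two agree, with the return values switched.
frac_pos, whole_pos = np.modf(3.25)
quot_pos, rem_pos = np.divmod(3.25, 1)
```

For -3.25, modf splits the value as -3 + -0.25, whereas divmod splits it as -4 + 0.75.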
dask.array.moment(a, order, axis=None, dtype=None, keepdims=False, ddof=0, split_every=None, out=None)
dask.array.moveaxis(a, source, destination)
Move axes of an array to new positions.
This docstring was copied from numpy.moveaxis.
Some inconsistencies with the Dask version may exist.
Other axes remain in their original order.
New in version 1.11.0.
Parameters: a : np.ndarray
The array whose axes should be reordered.
source : int or sequence of int
Original positions of the axes to move. These must be unique.
destination : int or sequence of int
Destination positions for each of the original axes. These must also be unique.
Returns: result : np.ndarray
Array with moved axes. This array is a view of the input array.
See also
transpose
- Permute the dimensions of an array.
swapaxes
- Interchange two axes of an array.
Examples
>>> x = np.zeros((3, 4, 5)) # doctest: +SKIP
>>> np.moveaxis(x, 0, -1).shape # doctest: +SKIP
(4, 5, 3)
>>> np.moveaxis(x, -1, 0).shape # doctest: +SKIP
(5, 3, 4)
These all achieve the same result:
>>> np.transpose(x).shape # doctest: +SKIP
(5, 4, 3)
>>> np.swapaxes(x, 0, -1).shape # doctest: +SKIP
(5, 4, 3)
>>> np.moveaxis(x, [0, 1], [-1, -2]).shape # doctest: +SKIP
(5, 4, 3)
>>> np.moveaxis(x, [0, 1, 2], [-1, -2, -3]).shape # doctest: +SKIP
(5, 4, 3)
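Because the other axes keep their original order, moving a single axis corresponds to one specific permutation passed to transpose. The NumPy sketch below spells that permutation out for an illustrative array and also checks the statement that the NumPy result is a view of the input.

```python
import numpy as np

x = np.arange(3 * 4 * 5).reshape(3, 4, 5)

# Move axis 0 to the end; axes 1 and 2 keep their relative order.
moved = np.moveaxis(x, 0, -1)

# The same reordering written as an explicit permutation.
explicit = np.transpose(x, (1, 2, 0))

# In NumPy the result shares memory with the input rather than copying it.
is_view = np.shares_memory(moved, x)
```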
dask.array.nanargmax(x, axis=None, split_every=None, out=None)
Return the maximum of an array or maximum along an axis, ignoring any NaNs. When all-NaN slices are encountered a RuntimeWarning is raised and NaN is returned for that slice.
This docstring was copied from numpy.nanmax.
Some inconsistencies with the Dask version may exist.
Parameters: a : array_like (Not supported in Dask)
Array containing numbers whose maximum is desired. If a is not an array, a conversion is attempted.
axis : {int, tuple of int, None}, optional
Axis or axes along which the maximum is computed. The default is to compute the maximum of the flattened array.
out : ndarray, optional
Alternate output array in which to place the result. The default is None; if provided, it must have the same shape as the expected output, but the type will be cast if necessary. See doc.ufuncs for details.
New in version 1.8.0.
keepdims : bool, optional (Not supported in Dask)
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.
If the value is anything but the default, then keepdims will be passed through to the max method of sub-classes of ndarray. If the sub-class’ method does not implement keepdims any exceptions will be raised.
New in version 1.8.0.
Returns: nanmax : ndarray
An array with the same shape as a, with the specified axis removed. If a is a 0-d array, or if axis is None, an ndarray scalar is returned. The same dtype as a is returned.
See also
nanmin
- The minimum value of an array along a given axis, ignoring any NaNs.
amax
- The maximum value of an array along a given axis, propagating any NaNs.
fmax
- Element-wise maximum of two arrays, ignoring any NaNs.
maximum
- Element-wise maximum of two arrays, propagating any NaNs.
isnan
- Shows which elements are Not a Number (NaN).
isfinite
- Shows which elements are neither NaN nor infinity.
Notes
NumPy uses the IEEE Standard for Binary Floating-Point Arithmetic (IEEE 754). This means that Not a Number is not equivalent to infinity. Positive infinity is treated as a very large number and negative infinity is treated as a very small (i.e. negative) number.
If the input has an integer type the function is equivalent to np.max.
Examples
>>> a = np.array([[1, 2], [3, np.nan]]) # doctest: +SKIP
>>> np.nanmax(a) # doctest: +SKIP
3.0
>>> np.nanmax(a, axis=0) # doctest: +SKIP
array([3., 2.])
>>> np.nanmax(a, axis=1) # doctest: +SKIP
array([2., 3.])
When positive infinity and negative infinity are present:
>>> np.nanmax([1, 2, np.nan, np.NINF]) # doctest: +SKIP
2.0
>>> np.nanmax([1, 2, np.nan, np.inf]) # doctest: +SKIP
inf
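The text of this entry is copied from numpy.nanmax, which returns values. In NumPy, the function named nanargmax returns the positions of those values instead. The sketch below shows the difference using NumPy only; that dask.array.nanargmax mirrors numpy.nanargmax is an assumption based on its name and is not confirmed by the docstring above.

```python
import numpy as np

a = np.array([[1.0, 2.0],
              [3.0, np.nan]])

# nanmax returns the largest non-NaN value.
value = np.nanmax(a)

# nanargmax returns the index of that value in the flattened array.
flat_index = np.nanargmax(a)

# With an axis, one index is returned per slice along that axis.
per_column = np.nanargmax(a, axis=0)
```

The flat index can be converted to a coordinate with np.unravel_index(flat_index, a.shape), which gives (1, 0) for this array.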
dask.array.nanargmin(x, axis=None, split_every=None, out=None)
Return the minimum of an array or minimum along an axis, ignoring any NaNs. When all-NaN slices are encountered a RuntimeWarning is raised and NaN is returned for that slice.
This docstring was copied from numpy.nanmin.
Some inconsistencies with the Dask version may exist.
Parameters: a : array_like (Not supported in Dask)
Array containing numbers whose minimum is desired. If a is not an array, a conversion is attempted.
axis : {int, tuple of int, None}, optional
Axis or axes along which the minimum is computed. The default is to compute the minimum of the flattened array.
out : ndarray, optional
Alternate output array in which to place the result. The default is None; if provided, it must have the same shape as the expected output, but the type will be cast if necessary. See doc.ufuncs for details.
New in version 1.8.0.
keepdims : bool, optional (Not supported in Dask)
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.
If the value is anything but the default, then keepdims will be passed through to the min method of sub-classes of ndarray. If the sub-class’ method does not implement keepdims any exceptions will be raised.
New in version 1.8.0.
Returns: nanmin : ndarray
An array with the same shape as a, with the specified axis removed. If a is a 0-d array, or if axis is None, an ndarray scalar is returned. The same dtype as a is returned.
See also
nanmax
- The maximum value of an array along a given axis, ignoring any NaNs.
amin
- The minimum value of an array along a given axis, propagating any NaNs.
fmin
- Element-wise minimum of two arrays, ignoring any NaNs.
minimum
- Element-wise minimum of two arrays, propagating any NaNs.
isnan
- Shows which elements are Not a Number (NaN).
isfinite
- Shows which elements are neither NaN nor infinity.
Notes
NumPy uses the IEEE Standard for Binary Floating-Point Arithmetic (IEEE 754). This means that Not a Number is not equivalent to infinity. Positive infinity is treated as a very large number and negative infinity is treated as a very small (i.e. negative) number.
If the input has an integer type the function is equivalent to np.min.
Examples
>>> a = np.array([[1, 2], [3, np.nan]]) # doctest: +SKIP
>>> np.nanmin(a) # doctest: +SKIP
1.0
>>> np.nanmin(a, axis=0) # doctest: +SKIP
array([1., 2.])
>>> np.nanmin(a, axis=1) # doctest: +SKIP
array([1., 3.])
When positive infinity and negative infinity are present:
>>> np.nanmin([1, 2, np.nan, np.inf]) # doctest: +SKIP
1.0
>>> np.nanmin([1, 2, np.nan, np.NINF]) # doctest: +SKIP
-inf
dask.array.nancumprod(x, axis, dtype=None, out=None)
Return the cumulative product of array elements over a given axis treating Not a Numbers (NaNs) as one. The cumulative product does not change when NaNs are encountered and leading NaNs are replaced by ones.
This docstring was copied from numpy.nancumprod.
Some inconsistencies with the Dask version may exist.
Ones are returned for slices that are all-NaN or empty.
New in version 1.12.0.
Parameters: a : array_like (Not supported in Dask)
Input array.
axis : int, optional
Axis along which the cumulative product is computed. By default the input is flattened.
dtype : dtype, optional
Type of the returned array, as well as of the accumulator in which the elements are multiplied. If dtype is not specified, it defaults to the dtype of a, unless a has an integer dtype with a precision less than that of the default platform integer. In that case, the default platform integer is used instead.
out : ndarray, optional
Alternative output array in which to place the result. It must have the same shape and buffer length as the expected output but the type of the resulting values will be cast if necessary.
Returns: nancumprod : ndarray
A new array holding the result is returned unless out is specified, in which case it is returned.
See also
numpy.cumprod
- Cumulative product across array propagating NaNs.
isnan
- Show which elements are NaN.
Examples
>>> np.nancumprod(1) # doctest: +SKIP
array([1])
>>> np.nancumprod([1]) # doctest: +SKIP
array([1])
>>> np.nancumprod([1, np.nan]) # doctest: +SKIP
array([1., 1.])
>>> a = np.array([[1, 2], [3, np.nan]]) # doctest: +SKIP
>>> np.nancumprod(a) # doctest: +SKIP
array([1., 2., 6., 6.])
>>> np.nancumprod(a, axis=0) # doctest: +SKIP
array([[1., 2.],
       [3., 2.]])
>>> np.nancumprod(a, axis=1) # doctest: +SKIP
array([[1., 2.],
       [3., 3.]])
dask.array.nancumsum(x, axis, dtype=None, out=None)
Return the cumulative sum of array elements over a given axis treating Not a Numbers (NaNs) as zero. The cumulative sum does not change when NaNs are encountered and leading NaNs are replaced by zeros.
This docstring was copied from numpy.nancumsum.
Some inconsistencies with the Dask version may exist.
Zeros are returned for slices that are all-NaN or empty.
New in version 1.12.0.
Parameters: a : array_like (Not supported in Dask)
Input array.
axis : int, optional
Axis along which the cumulative sum is computed. The default (None) is to compute the cumsum over the flattened array.
dtype : dtype, optional
Type of the returned array and of the accumulator in which the elements are summed. If dtype is not specified, it defaults to the dtype of a, unless a has an integer dtype with a precision less than that of the default platform integer. In that case, the default platform integer is used.
out : ndarray, optional
Alternative output array in which to place the result. It must have the same shape and buffer length as the expected output but the type will be cast if necessary. See doc.ufuncs (Section “Output arguments”) for more details.
Returns: nancumsum : ndarray.
A new array holding the result is returned unless out is specified, in which case it is returned. The result has the same size as a, and the same shape as a if axis is not None or a is a 1-d array.
See also
numpy.cumsum
- Cumulative sum across array propagating NaNs.
isnan
- Show which elements are NaN.
Examples
>>> np.nancumsum(1) # doctest: +SKIP
array([1])
>>> np.nancumsum([1]) # doctest: +SKIP
array([1])
>>> np.nancumsum([1, np.nan]) # doctest: +SKIP
array([1., 1.])
>>> a = np.array([[1, 2], [3, np.nan]]) # doctest: +SKIP
>>> np.nancumsum(a) # doctest: +SKIP
array([1., 3., 6., 6.])
>>> np.nancumsum(a, axis=0) # doctest: +SKIP
array([[1., 2.],
       [4., 2.]])
>>> np.nancumsum(a, axis=1) # doctest: +SKIP
array([[1., 3.],
       [3., 3.]])
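The examples above use NumPy directly. A minimal sketch of the same computation through dask.array follows; the chunk size of (1, 1) is an arbitrary choice made here so that the running sum has to cross chunk boundaries. Note that the Dask signature listed above gives axis no default, so it is passed explicitly.

```python
import numpy as np
import dask.array as da

# Small 2-D array with one NaN, split into 1x1 chunks (illustrative choice).
x = da.from_array(np.array([[1.0, 2.0], [3.0, np.nan]]), chunks=(1, 1))

# Both calls build lazy Dask arrays; nothing is computed yet.
down_rows = da.nancumsum(x, axis=0)
along_rows = da.nancumsum(x, axis=1)

# Trigger the computation and get NumPy arrays back.
down_rows_result = down_rows.compute()
along_rows_result = along_rows.compute()
print(down_rows_result)
print(along_rows_result)
```

The results should agree with the NumPy outputs shown for axis=0 and axis=1.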
-
dask.array.
nanmax
(a, axis=None, keepdims=False, split_every=None, out=None)¶ Return the maximum of an array or maximum along an axis, ignoring any NaNs. When all-NaN slices are encountered a RuntimeWarning is raised and NaN is returned for that slice.
This docstring was copied from numpy.nanmax.
Some inconsistencies with the Dask version may exist.
Parameters: a : array_like
Array containing numbers whose maximum is desired. If a is not an array, a conversion is attempted.
axis : {int, tuple of int, None}, optional
Axis or axes along which the maximum is computed. The default is to compute the maximum of the flattened array.
out : ndarray, optional
Alternate output array in which to place the result. The default is None; if provided, it must have the same shape as the expected output, but the type will be cast if necessary. See doc.ufuncs for details.
New in version 1.8.0.
keepdims : bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.
If the value is anything but the default, then keepdims will be passed through to the max method of sub-classes of ndarray. If the sub-class's method does not implement keepdims, an exception will be raised.
New in version 1.8.0.
Returns: nanmax : ndarray
An array with the same shape as a, with the specified axis removed. If a is a 0-d array, or if axis is None, an ndarray scalar is returned. The same dtype as a is returned.
See also
nanmin
- The minimum value of an array along a given axis, ignoring any NaNs.
amax
- The maximum value of an array along a given axis, propagating any NaNs.
fmax
- Element-wise maximum of two arrays, ignoring any NaNs.
maximum
- Element-wise maximum of two arrays, propagating any NaNs.
isnan
- Shows which elements are Not a Number (NaN).
isfinite
- Shows which elements are neither NaN nor infinity.
Notes
NumPy uses the IEEE Standard for Binary Floating-Point for Arithmetic (IEEE 754). This means that Not a Number is not equivalent to infinity. Positive infinity is treated as a very large number and negative infinity is treated as a very small (i.e. negative) number.
If the input has an integer type the function is equivalent to np.max.
Examples
>>> a = np.array([[1, 2], [3, np.nan]]) # doctest: +SKIP
>>> np.nanmax(a) # doctest: +SKIP
3.0
>>> np.nanmax(a, axis=0) # doctest: +SKIP
array([3., 2.])
>>> np.nanmax(a, axis=1) # doctest: +SKIP
array([2., 3.])
When positive infinity and negative infinity are present:
>>> np.nanmax([1, 2, np.nan, np.NINF]) # doctest: +SKIP
2.0
>>> np.nanmax([1, 2, np.nan, np.inf]) # doctest: +SKIP
inf
-
dask.array.
nanmean
(a, axis=None, dtype=None, keepdims=False, split_every=None, out=None)¶ Compute the arithmetic mean along the specified axis, ignoring NaNs.
This docstring was copied from numpy.nanmean.
Some inconsistencies with the Dask version may exist.
Returns the average of the array elements. The average is taken over the flattened array by default, otherwise over the specified axis. float64 intermediate and return values are used for integer inputs.
For all-NaN slices, NaN is returned and a RuntimeWarning is raised.
New in version 1.8.0.
Parameters: a : array_like
Array containing numbers whose mean is desired. If a is not an array, a conversion is attempted.
axis : {int, tuple of int, None}, optional
Axis or axes along which the means are computed. The default is to compute the mean of the flattened array.
dtype : data-type, optional
Type to use in computing the mean. For integer inputs, the default is float64; for inexact inputs, it is the same as the input dtype.
out : ndarray, optional
Alternate output array in which to place the result. The default is None; if provided, it must have the same shape as the expected output, but the type will be cast if necessary. See doc.ufuncs for details.
keepdims : bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.
If the value is anything but the default, then keepdims will be passed through to the mean or sum methods of sub-classes of ndarray. If the sub-class's methods do not implement keepdims, an exception will be raised.
Returns: m : ndarray, see dtype parameter above
If out=None, returns a new array containing the mean values, otherwise a reference to the output array is returned. NaN is returned for slices that contain only NaNs.
Notes
The arithmetic mean is the sum of the non-NaN elements along the axis divided by the number of non-NaN elements.
Note that for floating-point input, the mean is computed using the same precision the input has. Depending on the input data, this can cause the results to be inaccurate, especially for float32. Specifying a higher-precision accumulator using the dtype keyword can alleviate this issue.
Examples
>>> a = np.array([[1, np.nan], [3, 4]]) # doctest: +SKIP
>>> np.nanmean(a) # doctest: +SKIP
2.6666666666666665
>>> np.nanmean(a, axis=0) # doctest: +SKIP
array([2., 4.])
>>> np.nanmean(a, axis=1) # doctest: +SKIP
array([1., 3.5]) # may vary
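The Notes mention that the dtype keyword selects the accumulator precision. A small sketch of that choice, using NumPy and an illustrative float32 array (the values are made up for the example and are too small to show any actual precision loss):

```python
import numpy as np

# float32 input containing one NaN; values are illustrative only.
a = np.array([[1.0, np.nan], [3.0, 4.0]], dtype=np.float32)

# Default: the mean is accumulated and returned in the input precision.
m32 = np.nanmean(a)

# Requesting a float64 accumulator via the dtype keyword.
m64 = np.nanmean(a, dtype=np.float64)

print(m32.dtype, m64.dtype)
```

On large float32 inputs the float64 accumulator is the safer choice when accuracy matters more than memory.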
-
dask.array.
nanmedian
(a, axis=None, keepdims=False, out=None)¶ Compute the median along the specified axis, while ignoring NaNs.
This docstring was copied from numpy.nanmedian.
Some inconsistencies with the Dask version may exist.
This works by automatically chunking the reduced axes to a single chunk and then calling the numpy.nanmedian function across the remaining dimensions.
Returns the median of the array elements.
New in version 1.9.0.
Parameters: a : array_like
Input array or object that can be converted to an array.
axis : {int, sequence of int, None}, optional
Axis or axes along which the medians are computed. The default is to compute the median along a flattened version of the array. A sequence of axes is supported since version 1.9.0.
out : ndarray, optional
Alternative output array in which to place the result. It must have the same shape and buffer length as the expected output, but the type (of the output) will be cast if necessary.
overwrite_input : bool, optional (Not supported in Dask)
If True, then allow use of memory of input array a for calculations. The input array will be modified by the call to median. This will save memory when you do not need to preserve the contents of the input array. Treat the input as undefined, but it will probably be fully or partially sorted. Default is False. If overwrite_input is True and a is not already an ndarray, an error will be raised.
keepdims : bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.
If this is anything but the default value it will be passed through (in the special case of an empty array) to the mean function of the underlying array. If the array is a sub-class and mean does not have the kwarg keepdims this will raise a RuntimeError.
Returns: median : ndarray
A new array holding the result. If the input contains integers or floats smaller than float64, then the output data-type is np.float64. Otherwise, the data-type of the output is the same as that of the input. If out is specified, that array is returned instead.
See also
Notes
Given a vector V of length N, the median of V is the middle value of a sorted copy of V, V_sorted - i.e., V_sorted[(N-1)/2], when N is odd and the average of the two middle values of V_sorted when N is even.
Examples
>>> a = np.array([[10.0, 7, 4], [3, 2, 1]]) # doctest: +SKIP
>>> a[0, 1] = np.nan # doctest: +SKIP
>>> a # doctest: +SKIP
array([[10., nan,  4.],
       [ 3.,  2.,  1.]])
>>> np.median(a) # doctest: +SKIP
nan
>>> np.nanmedian(a) # doctest: +SKIP
3.0
>>> np.nanmedian(a, axis=0) # doctest: +SKIP
array([6.5, 2. , 2.5])
>>> np.median(a, axis=1) # doctest: +SKIP
array([nan,  2.])
>>> b = a.copy() # doctest: +SKIP
>>> np.nanmedian(b, axis=1, overwrite_input=True) # doctest: +SKIP
array([7., 2.])
>>> assert not np.all(a==b) # doctest: +SKIP
>>> b = a.copy() # doctest: +SKIP
>>> np.nanmedian(b, axis=None, overwrite_input=True) # doctest: +SKIP
3.0
>>> assert not np.all(a==b) # doctest: +SKIP
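As a rough illustration of the chunking behaviour described at the top of this entry, the sketch below reduces a Dask array over an axis that is initially split across two chunks. The chunk layout is an assumption chosen for the example; overwrite_input is not used because it is marked as unsupported in Dask.

```python
import numpy as np
import dask.array as da

a = np.array([[10.0, np.nan, 4.0], [3.0, 2.0, 1.0]])

# Axis 0 is deliberately split into two chunks of one row each.
x = da.from_array(a, chunks=(1, 3))

# Reducing over axis 0: per the description above, Dask first gathers the
# reduced axis into a single chunk, then applies numpy.nanmedian per block.
med = da.nanmedian(x, axis=0)

med_result = med.compute()
print(med_result)
```

Because the reduced axis ends up in a single chunk, it has to fit in memory for each block; keeping the other axes chunked is what bounds memory use.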
-
dask.array.
nanmin
(a, axis=None, keepdims=False, split_every=None, out=None)¶ Return minimum of an array or minimum along an axis, ignoring any NaNs. When all-NaN slices are encountered a RuntimeWarning is raised and NaN is returned for that slice.
This docstring was copied from numpy.nanmin.
Some inconsistencies with the Dask version may exist.
Parameters: a : array_like
Array containing numbers whose minimum is desired. If a is not an array, a conversion is attempted.
axis : {int, tuple of int, None}, optional
Axis or axes along which the minimum is computed. The default is to compute the minimum of the flattened array.
out : ndarray, optional
Alternate output array in which to place the result. The default is None; if provided, it must have the same shape as the expected output, but the type will be cast if necessary. See doc.ufuncs for details.
New in version 1.8.0.
keepdims : bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.
If the value is anything but the default, then keepdims will be passed through to the min method of sub-classes of ndarray. If the sub-class's method does not implement keepdims, an exception will be raised.
New in version 1.8.0.
Returns: nanmin : ndarray
An array with the same shape as a, with the specified axis removed. If a is a 0-d array, or if axis is None, an ndarray scalar is returned. The same dtype as a is returned.
See also
nanmax
- The maximum value of an array along a given axis, ignoring any NaNs.
amin
- The minimum value of an array along a given axis, propagating any NaNs.
fmin
- Element-wise minimum of two arrays, ignoring any NaNs.
minimum
- Element-wise minimum of two arrays, propagating any NaNs.
isnan
- Shows which elements are Not a Number (NaN).
isfinite
- Shows which elements are neither NaN nor infinity.
Notes
NumPy uses the IEEE Standard for Binary Floating-Point for Arithmetic (IEEE 754). This means that Not a Number is not equivalent to infinity. Positive infinity is treated as a very large number and negative infinity is treated as a very small (i.e. negative) number.
If the input has an integer type the function is equivalent to np.min.
Examples
>>> a = np.array([[1, 2], [3, np.nan]]) # doctest: +SKIP
>>> np.nanmin(a) # doctest: +SKIP
1.0
>>> np.nanmin(a, axis=0) # doctest: +SKIP
array([1., 2.])
>>> np.nanmin(a, axis=1) # doctest: +SKIP
array([1., 3.])
When positive infinity and negative infinity are present:
>>> np.nanmin([1, 2, np.nan, np.inf]) # doctest: +SKIP
1.0
>>> np.nanmin([1, 2, np.nan, np.NINF]) # doctest: +SKIP
-inf
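The all-NaN case mentioned in the summary is not covered by the examples above. A minimal NumPy sketch, with an input invented for the illustration, that records the warning instead of letting it print:

```python
import warnings
import numpy as np

# The second row contains only NaNs (illustrative input).
a = np.array([[1.0, np.nan], [np.nan, np.nan]])

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")   # make sure the warning is recorded
    row_min = np.nanmin(a, axis=1)    # reduce each row

print(row_min)
print([w.category.__name__ for w in caught])
```

The first row yields its only valid value, while the all-NaN row yields NaN together with a RuntimeWarning.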
-
dask.array.
nanprod
(a, axis=None, dtype=None, keepdims=False, split_every=None, out=None)¶ Return the product of array elements over a given axis treating Not a Numbers (NaNs) as ones.
This docstring was copied from numpy.nanprod.
Some inconsistencies with the Dask version may exist.
One is returned for slices that are all-NaN or empty.
New in version 1.10.0.
Parameters: a : array_like
Array containing numbers whose product is desired. If a is not an array, a conversion is attempted.
axis : {int, tuple of int, None}, optional
Axis or axes along which the product is computed. The default is to compute the product of the flattened array.
dtype : data-type, optional
The type of the returned array and of the accumulator in which the elements are multiplied. By default, the dtype of a is used. An exception is when a has an integer type with less precision than the platform (u)intp. In that case, the default will be either (u)int32 or (u)int64 depending on whether the platform is 32 or 64 bits. For inexact inputs, dtype must be inexact.
out : ndarray, optional
Alternate output array in which to place the result. The default is None. If provided, it must have the same shape as the expected output, but the type will be cast if necessary. See doc.ufuncs for details. The casting of NaN to integer can yield unexpected results.
keepdims : bool, optional
If True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.
Returns: nanprod : ndarray
A new array holding the result is returned unless out is specified, in which case it is returned.
See also
numpy.prod
- Product across array propagating NaNs.
isnan
- Show which elements are NaN.
Examples
>>> np.nanprod(1) # doctest: +SKIP
1
>>> np.nanprod([1]) # doctest: +SKIP
1
>>> np.nanprod([1, np.nan]) # doctest: +SKIP
1.0
>>> a = np.array([[1, 2], [3, np.nan]]) # doctest: +SKIP
>>> np.nanprod(a) # doctest: +SKIP
6.0
>>> np.nanprod(a, axis=0) # doctest: +SKIP
array([3., 2.])
-
dask.array.
nanstd
(a, axis=None, dtype=None, keepdims=False, ddof=0, split_every=None, out=None)¶ Compute the standard deviation along the specified axis, while ignoring NaNs.
This docstring was copied from numpy.nanstd.
Some inconsistencies with the Dask version may exist.
Returns the standard deviation, a measure of the spread of a distribution, of the non-NaN array elements. The standard deviation is computed for the flattened array by default, otherwise over the specified axis.
For all-NaN slices or slices with zero degrees of freedom, NaN is returned and a RuntimeWarning is raised.
New in version 1.8.0.
Parameters: a : array_like
Calculate the standard deviation of the non-NaN values.
axis : {int, tuple of int, None}, optional
Axis or axes along which the standard deviation is computed. The default is to compute the standard deviation of the flattened array.
dtype : dtype, optional
Type to use in computing the standard deviation. For arrays of integer type the default is float64, for arrays of float types it is the same as the array type.
out : ndarray, optional
Alternative output array in which to place the result. It must have the same shape as the expected output but the type (of the calculated values) will be cast if necessary.
ddof : int, optional
Means Delta Degrees of Freedom. The divisor used in calculations is N - ddof, where N represents the number of non-NaN elements. By default ddof is zero.
keepdims : bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.
If this value is anything but the default it is passed through as-is to the relevant functions of the sub-classes. If these functions do not have a keepdims kwarg, a RuntimeError will be raised.
Returns: standard_deviation : ndarray, see dtype parameter above.
If out is None, return a new array containing the standard deviation, otherwise return a reference to the output array. If ddof is >= the number of non-NaN elements in a slice or the slice contains only NaNs, then the result for that slice is NaN.
Notes
The standard deviation is the square root of the average of the squared deviations from the mean: std = sqrt(mean(abs(x - x.mean())**2)).
The average squared deviation is normally calculated as x.sum() / N, where N = len(x). If, however, ddof is specified, the divisor N - ddof is used instead. In standard statistical practice, ddof=1 provides an unbiased estimator of the variance of the infinite population. ddof=0 provides a maximum likelihood estimate of the variance for normally distributed variables. The standard deviation computed in this function is the square root of the estimated variance, so even with ddof=1, it will not be an unbiased estimate of the standard deviation per se.
Note that, for complex numbers, std takes the absolute value before squaring, so that the result is always real and nonnegative.
For floating-point input, the std is computed using the same precision the input has. Depending on the input data, this can cause the results to be inaccurate, especially for float32 (see example below). Specifying a higher-accuracy accumulator using the dtype keyword can alleviate this issue.
Examples
>>> a = np.array([[1, np.nan], [3, 4]]) # doctest: +SKIP
>>> np.nanstd(a) # doctest: +SKIP
1.247219128924647
>>> np.nanstd(a, axis=0) # doctest: +SKIP
array([1., 0.])
>>> np.nanstd(a, axis=1) # doctest: +SKIP
array([0., 0.5]) # may vary
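To make the role of ddof in the Notes concrete, the sketch below recomputes the standard deviation by hand over the non-NaN values and compares it with np.nanstd. The input array and the variable names are illustrative.

```python
import numpy as np

a = np.array([1.0, np.nan, 3.0, 4.0])

# Manual computation over the non-NaN values only.
valid = a[~np.isnan(a)]
n = valid.size                                   # N: number of non-NaN elements
sq_dev = np.abs(valid - valid.mean()) ** 2       # squared deviations from the mean

std_ddof0 = np.sqrt(sq_dev.sum() / n)            # divisor N     (ddof=0)
std_ddof1 = np.sqrt(sq_dev.sum() / (n - 1))      # divisor N - 1 (ddof=1)

print(np.nanstd(a), std_ddof0)
print(np.nanstd(a, ddof=1), std_ddof1)
```

Each printed pair should match up to floating-point rounding.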
-
dask.array.
nansum
(a, axis=None, dtype=None, keepdims=False, split_every=None, out=None)¶ Return the sum of array elements over a given axis treating Not a Numbers (NaNs) as zero.
This docstring was copied from numpy.nansum.
Some inconsistencies with the Dask version may exist.
In NumPy versions <= 1.9.0 NaN is returned for slices that are all-NaN or empty. In later versions zero is returned.
Parameters: a : array_like
Array containing numbers whose sum is desired. If a is not an array, a conversion is attempted.
axis : {int, tuple of int, None}, optional
Axis or axes along which the sum is computed. The default is to compute the sum of the flattened array.
dtype : data-type, optional
The type of the returned array and of the accumulator in which the elements are summed. By default, the dtype of a is used. An exception is when a has an integer type with less precision than the platform (u)intp. In that case, the default will be either (u)int32 or (u)int64 depending on whether the platform is 32 or 64 bits. For inexact inputs, dtype must be inexact.
New in version 1.8.0.
out : ndarray, optional
Alternate output array in which to place the result. The default is None. If provided, it must have the same shape as the expected output, but the type will be cast if necessary. See doc.ufuncs for details. The casting of NaN to integer can yield unexpected results.
New in version 1.8.0.
keepdims : bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.
If the value is anything but the default, then keepdims will be passed through to the mean or sum methods of sub-classes of ndarray. If the sub-class's methods do not implement keepdims, an exception will be raised.
New in version 1.8.0.
Returns: nansum : ndarray.
A new array holding the result is returned unless out is specified, in which case it is returned. The result has the same size as a, and the same shape as a if axis is not None or a is a 1-d array.
See also
Notes
If both positive and negative infinity are present, the sum will be Not A Number (NaN).
Examples
>>> np.nansum(1) # doctest: +SKIP
1
>>> np.nansum([1]) # doctest: +SKIP
1
>>> np.nansum([1, np.nan]) # doctest: +SKIP
1.0
>>> a = np.array([[1, 1], [1, np.nan]]) # doctest: +SKIP
>>> np.nansum(a) # doctest: +SKIP
3.0
>>> np.nansum(a, axis=0) # doctest: +SKIP
array([2., 1.])
>>> np.nansum([1, np.nan, np.inf]) # doctest: +SKIP
inf
>>> np.nansum([1, np.nan, np.NINF]) # doctest: +SKIP
-inf
>>> from numpy.testing import suppress_warnings # doctest: +SKIP
>>> with suppress_warnings() as sup: # doctest: +SKIP
...     sup.filter(RuntimeWarning)
...     np.nansum([1, np.nan, np.inf, -np.inf]) # both +/- infinity present
nan
-
dask.array.
nanvar
(a, axis=None, dtype=None, keepdims=False, ddof=0, split_every=None, out=None)¶ Compute the variance along the specified axis, while ignoring NaNs.
This docstring was copied from numpy.nanvar.
Some inconsistencies with the Dask version may exist.
Returns the variance of the array elements, a measure of the spread of a distribution. The variance is computed for the flattened array by default, otherwise over the specified axis.
For all-NaN slices or slices with zero degrees of freedom, NaN is returned and a RuntimeWarning is raised.
New in version 1.8.0.
Parameters: a : array_like
Array containing numbers whose variance is desired. If a is not an array, a conversion is attempted.
axis : {int, tuple of int, None}, optional
Axis or axes along which the variance is computed. The default is to compute the variance of the flattened array.
dtype : data-type, optional
Type to use in computing the variance. For arrays of integer type the default is float64; for arrays of float types it is the same as the array type.
out : ndarray, optional
Alternate output array in which to place the result. It must have the same shape as the expected output, but the type is cast if necessary.
ddof : int, optional
“Delta Degrees of Freedom”: the divisor used in the calculation is N - ddof, where N represents the number of non-NaN elements. By default ddof is zero.
keepdims : bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.
Returns: variance : ndarray, see dtype parameter above
If out is None, return a new array containing the variance, otherwise return a reference to the output array. If ddof is >= the number of non-NaN elements in a slice or the slice contains only NaNs, then the result for that slice is NaN.
See also
numpy.doc.ufuncs
- Section “Output arguments”
Notes
The variance is the average of the squared deviations from the mean, i.e., var = mean(abs(x - x.mean())**2).
The mean is normally calculated as x.sum() / N, where N = len(x). If, however, ddof is specified, the divisor N - ddof is used instead. In standard statistical practice, ddof=1 provides an unbiased estimator of the variance of a hypothetical infinite population. ddof=0 provides a maximum likelihood estimate of the variance for normally distributed variables.
Note that for complex numbers, the absolute value is taken before squaring, so that the result is always real and nonnegative.
For floating-point input, the variance is computed using the same precision the input has. Depending on the input data, this can cause the results to be inaccurate, especially for float32 (see example below). Specifying a higher-accuracy accumulator using the dtype keyword can alleviate this issue.
For this function to work on sub-classes of ndarray, they must define sum with the kwarg keepdims.
Examples
>>> a = np.array([[1, np.nan], [3, 4]]) # doctest: +SKIP
>>> np.nanvar(a) # doctest: +SKIP
1.5555555555555554
>>> np.nanvar(a, axis=0) # doctest: +SKIP
array([1., 0.])
>>> np.nanvar(a, axis=1) # doctest: +SKIP
array([0., 0.25]) # may vary
-
dask.array.
nan_to_num
(*args, **kwargs)¶ Replace NaN with zero and infinity with large finite numbers (default behaviour) or with the numbers defined by the user using the nan, posinf and/or neginf keywords.
This docstring was copied from numpy.nan_to_num.
Some inconsistencies with the Dask version may exist.
If x is inexact, NaN is replaced by zero or by the user defined value in nan keyword, infinity is replaced by the largest finite floating point values representable by x.dtype or by the user defined value in posinf keyword and -infinity is replaced by the most negative finite floating point values representable by x.dtype or by the user defined value in neginf keyword.
For complex dtypes, the above is applied to each of the real and imaginary components of x separately.
If x is not inexact, then no replacements are made.
Parameters: x : scalar or array_like (Not supported in Dask)
Input data.
copy : bool, optional (Not supported in Dask)
Whether to create a copy of x (True) or to replace values in-place (False). The in-place operation only occurs if casting to an array does not require a copy. Default is True.
nan : int, float, optional (Not supported in Dask)
Value to be used to fill NaN values. If no value is passed then NaN values will be replaced with 0.0.
posinf : int, float, optional (Not supported in Dask)
Value to be used to fill positive infinity values. If no value is passed then positive infinity values will be replaced with a very large number.
neginf : int, float, optional (Not supported in Dask)
Value to be used to fill negative infinity values. If no value is passed then negative infinity values will be replaced with a very large negative number.
New in version 1.13.
Returns: out : ndarray
x, with the non-finite values replaced. If copy is False, this may be x itself.
See also
Notes
NumPy uses the IEEE Standard for Binary Floating-Point for Arithmetic (IEEE 754). This means that Not a Number is not equivalent to infinity.
Examples
>>> np.nan_to_num(np.inf) # doctest: +SKIP
1.7976931348623157e+308
>>> np.nan_to_num(-np.inf) # doctest: +SKIP
-1.7976931348623157e+308
>>> np.nan_to_num(np.nan) # doctest: +SKIP
0.0
>>> x = np.array([np.inf, -np.inf, np.nan, -128, 128]) # doctest: +SKIP
>>> np.nan_to_num(x) # doctest: +SKIP
array([ 1.79769313e+308, -1.79769313e+308,  0.00000000e+000, # may vary
       -1.28000000e+002,  1.28000000e+002])
>>> np.nan_to_num(x, nan=-9999, posinf=33333333, neginf=33333333) # doctest: +SKIP
array([ 3.3333333e+07,  3.3333333e+07, -9.9990000e+03,
       -1.2800000e+02,  1.2800000e+02])
>>> y = np.array([complex(np.inf, np.nan), np.nan, complex(np.nan, np.inf)]) # doctest: +SKIP
>>> np.nan_to_num(y) # doctest: +SKIP
array([  1.79769313e+308 +0.00000000e+000j, # may vary
         0.00000000e+000 +0.00000000e+000j,
         0.00000000e+000 +1.79769313e+308j])
>>> np.nan_to_num(y, nan=111111, posinf=222222) # doctest: +SKIP
array([222222.+111111.j, 111111.     +0.j, 111111.+222222.j])
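The replacement values depend on x.dtype, which the float64 examples above do not show. A short NumPy sketch with a float32 input (chosen here purely for illustration) compares the result against np.finfo:

```python
import numpy as np

# float32 input; the default replacements come from this dtype's limits.
x32 = np.array([np.inf, -np.inf, np.nan, -128.0, 128.0], dtype=np.float32)

cleaned = np.nan_to_num(x32)
limits = np.finfo(x32.dtype)

print(cleaned)
print(limits.max, limits.min)
```

The infinities map to the float32 extremes rather than to the float64 values printed in the examples above, and finite entries pass through unchanged.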
-
dask.array.
nextafter
(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.nextafter.
Some inconsistencies with the Dask version may exist.
Return the next floating-point value after x1 towards x2, element-wise.
Parameters: x1 : array_like
Values to find the next representable value of.
x2 : array_like
The direction where to look for the next representable value of x1. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: out : ndarray or scalar
The next representable values of x1 in the direction of x2. This is a scalar if both x1 and x2 are scalars.
Examples
>>> eps = np.finfo(np.float64).eps # doctest: +SKIP
>>> np.nextafter(1, 2) == eps + 1 # doctest: +SKIP
True
>>> np.nextafter([1, 2], [2, 1]) == [eps + 1, 2 - eps] # doctest: +SKIP
array([ True,  True])
-
dask.array.
nonzero
(a)¶ Return the indices of the elements that are non-zero.
This docstring was copied from numpy.nonzero.
Some inconsistencies with the Dask version may exist.
Returns a tuple of arrays, one for each dimension of a, containing the indices of the non-zero elements in that dimension. The values in a are always tested and returned in row-major, C-style order.
To group the indices by element, rather than dimension, use argwhere, which returns a row for each non-zero element.
Note
When called on a zero-d array or scalar, nonzero(a) is treated as nonzero(atleast_1d(a)).
Deprecated since version 1.17.0: Use atleast_1d explicitly if this behavior is deliberate.
Parameters: a : array_like
Input array.
Returns: tuple_of_arrays : tuple
Indices of elements that are non-zero.
See also
flatnonzero
- Return indices that are non-zero in the flattened version of the input array.
ndarray.nonzero
- Equivalent ndarray method.
count_nonzero
- Counts the number of non-zero elements in the input array.
Notes
While the nonzero values can be obtained with a[nonzero(a)], it is recommended to use x[x.astype(bool)] or x[x != 0] instead, which will correctly handle 0-d arrays.
Examples
>>> x = np.array([[3, 0, 0], [0, 4, 0], [5, 6, 0]]) # doctest: +SKIP
>>> x # doctest: +SKIP
array([[3, 0, 0],
       [0, 4, 0],
       [5, 6, 0]])
>>> np.nonzero(x) # doctest: +SKIP
(array([0, 1, 2, 2]), array([0, 1, 0, 1]))
>>> x[np.nonzero(x)] # doctest: +SKIP
array([3, 4, 5, 6])
>>> np.transpose(np.nonzero(x)) # doctest: +SKIP
array([[0, 0],
       [1, 1],
       [2, 0],
       [2, 1]])
A common use for nonzero is to find the indices of an array where a condition is True. Given an array a, the condition a > 3 is a boolean array and, since False is interpreted as 0, np.nonzero(a > 3) yields the indices of a where the condition is true.
>>> a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) # doctest: +SKIP
>>> a > 3 # doctest: +SKIP
array([[False, False, False],
       [ True,  True,  True],
       [ True,  True,  True]])
>>> np.nonzero(a > 3) # doctest: +SKIP
(array([1, 1, 1, 2, 2, 2]), array([0, 1, 2, 0, 1, 2]))
Using this result to index a is equivalent to using the mask directly:
>>> a[np.nonzero(a > 3)] # doctest: +SKIP
array([4, 5, 6, 7, 8, 9])
>>> a[a > 3] # prefer this spelling # doctest: +SKIP
array([4, 5, 6, 7, 8, 9])
nonzero can also be called as a method of the array.
>>> (a > 3).nonzero() # doctest: +SKIP
(array([1, 1, 1, 2, 2, 2]), array([0, 1, 2, 0, 1, 2]))
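With Dask arrays the number of non-zero elements is not known until the data has been computed, so the index arrays are lazy. The sketch below, with an illustrative chunk layout, computes both index arrays together using dask.compute:

```python
import numpy as np
import dask
import dask.array as da

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
x = da.from_array(a, chunks=(2, 2))   # chunk sizes chosen for illustration

# One lazy index array per dimension; their lengths depend on the data.
rows, cols = da.nonzero(x > 3)

# Evaluate both index arrays in a single call.
rows_np, cols_np = dask.compute(rows, cols)
print(rows_np, cols_np)
```

The computed indices should match np.nonzero(a > 3) on the underlying NumPy array.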
dask.array.notnull(values)
pandas.notnull for dask arrays
dask.array.ones(*args, **kwargs)
Blocked variant of ones
Follows the signature of ones exactly except that it also requires a keyword argument chunks=(…)
Original signature follows below.
Return a new array of given shape and type, filled with ones.
Parameters: shape : int or sequence of ints
Shape of the new array, e.g., (2, 3) or 2.
dtype : data-type, optional
The desired data-type for the array, e.g., numpy.int8. Default is numpy.float64.
order : {‘C’, ‘F’}, optional, default: C
Whether to store multi-dimensional data in row-major (C-style) or column-major (Fortran-style) order in memory.
Returns: out : ndarray
Array of ones with the given shape, dtype, and order.
Examples
>>> np.ones(5)
array([1., 1., 1., 1., 1.])
>>> np.ones((5,), dtype=int)
array([1, 1, 1, 1, 1])
>>> np.ones((2, 1))
array([[1.],
       [1.]])
>>> s = (2,2)
>>> np.ones(s)
array([[1., 1.],
       [1., 1.]])
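The NumPy examples above omit the chunks keyword that the blocked variant requires. A small sketch of the Dask call, assuming dask is installed (shape and chunk sizes are arbitrary illustrative values):

```python
import dask.array as da

# chunks=(2, 3) asks for blocks of 2 rows by 3 columns.
x = da.ones((4, 6), chunks=(2, 3), dtype=int)

block_layout = x.chunks            # ((2, 2), (3, 3))
total = int(x.sum().compute())     # blocks are only built during compute()
```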
dask.array.ones_like(a, dtype=None, chunks=None)
Return an array of ones with the same shape and type as a given array.
Parameters: a : array_like
The shape and data-type of a define these same attributes of the returned array.
dtype : data-type, optional
Overrides the data type of the result.
chunks : sequence of ints
The number of samples on each block. Note that the last block will have fewer samples if len(array) % chunks != 0.
Returns: out : ndarray
Array of ones with the same shape and type as a.
See also
zeros_like
- Return an array of zeros with shape and type of input.
empty_like
- Return an empty array with shape and type of input.
zeros
- Return a new array setting values to zero.
ones
- Return a new array setting values to one.
empty
- Return a new uninitialized array.
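As an illustration of the dtype and chunks parameters, here is a brief sketch assuming dask is installed. The array sizes are arbitrary, and the expectation that chunking follows the input by default reflects my reading of the signature rather than a documented guarantee.

```python
import dask.array as da

a = da.zeros((6, 4), chunks=(3, 2), dtype="f4")

same = da.ones_like(a)                               # shape, dtype and chunks follow a
other = da.ones_like(a, dtype="i8", chunks=(6, 1))   # override dtype and chunking
```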
dask.array.outer(a, b)
Compute the outer product of two vectors.
This docstring was copied from numpy.outer.
Some inconsistencies with the Dask version may exist.
Given two vectors, a = [a0, a1, ..., aM] and b = [b0, b1, ..., bN], the outer product [R139] is:
[[a0*b0  a0*b1 ... a0*bN ]
 [a1*b0    .
 [ ...          .
 [aM*b0            aM*bN ]]
Parameters: a : (M,) array_like
First input vector. Input is flattened if not already 1-dimensional.
b : (N,) array_like
Second input vector. Input is flattened if not already 1-dimensional.
out : (M, N) ndarray, optional
A location where the result is stored
New in version 1.9.0.
Returns: out : (M, N) ndarray
out[i, j] = a[i] * b[j]
See also
inner
einsum
- einsum('i,j->ij', a.ravel(), b.ravel()) is the equivalent.
ufunc.outer
- A generalization to N dimensions and other operations. np.multiply.outer(a.ravel(), b.ravel()) is the equivalent.
References
[R139] G. H. Golub and C. F. Van Loan, Matrix Computations, 3rd ed., Baltimore, MD, Johns Hopkins University Press, 1996, pg. 8.
Examples
Make a (very coarse) grid for computing a Mandelbrot set:
>>> rl = np.outer(np.ones((5,)), np.linspace(-2, 2, 5))  # doctest: +SKIP
>>> rl  # doctest: +SKIP
array([[-2., -1.,  0.,  1.,  2.],
       [-2., -1.,  0.,  1.,  2.],
       [-2., -1.,  0.,  1.,  2.],
       [-2., -1.,  0.,  1.,  2.],
       [-2., -1.,  0.,  1.,  2.]])
>>> im = np.outer(1j*np.linspace(2, -2, 5), np.ones((5,)))  # doctest: +SKIP
>>> im  # doctest: +SKIP
array([[0.+2.j, 0.+2.j, 0.+2.j, 0.+2.j, 0.+2.j],
       [0.+1.j, 0.+1.j, 0.+1.j, 0.+1.j, 0.+1.j],
       [0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j],
       [0.-1.j, 0.-1.j, 0.-1.j, 0.-1.j, 0.-1.j],
       [0.-2.j, 0.-2.j, 0.-2.j, 0.-2.j, 0.-2.j]])
>>> grid = rl + im  # doctest: +SKIP
>>> grid  # doctest: +SKIP
array([[-2.+2.j, -1.+2.j,  0.+2.j,  1.+2.j,  2.+2.j],
       [-2.+1.j, -1.+1.j,  0.+1.j,  1.+1.j,  2.+1.j],
       [-2.+0.j, -1.+0.j,  0.+0.j,  1.+0.j,  2.+0.j],
       [-2.-1.j, -1.-1.j,  0.-1.j,  1.-1.j,  2.-1.j],
       [-2.-2.j, -1.-2.j,  0.-2.j,  1.-2.j,  2.-2.j]])
An example using a “vector” of letters:
>>> x = np.array(['a', 'b', 'c'], dtype=object)  # doctest: +SKIP
>>> np.outer(x, [1, 2, 3])  # doctest: +SKIP
array([['a', 'aa', 'aaa'],
       ['b', 'bb', 'bbb'],
       ['c', 'cc', 'ccc']], dtype=object)
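To connect the definition out[i, j] = a[i] * b[j] with the Dask function, the following sketch compares da.outer against the einsum spelling listed under "See also". It assumes dask and numpy are installed; the input vectors are arbitrary.

```python
import numpy as np
import dask.array as da

a = da.arange(3, chunks=2)        # [0, 1, 2]
b = da.arange(1, 5, chunks=2)     # [1, 2, 3, 4]

result = da.outer(a, b).compute()

# The einsum equivalent, evaluated with plain NumPy for comparison.
expected = np.einsum("i,j->ij", np.arange(3), np.arange(1, 5))
```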
dask.array.pad(array, pad_width, mode, **kwargs)
Pad an array.
This docstring was copied from numpy.pad.
Some inconsistencies with the Dask version may exist.
Parameters: array : array_like of rank N
The array to pad.
pad_width : {sequence, array_like, int}
Number of values padded to the edges of each axis. ((before_1, after_1), … (before_N, after_N)) unique pad widths for each axis. ((before, after),) yields same before and after pad for each axis. (pad,) or int is a shortcut for before = after = pad width for all axes.
mode : str or function, optional
One of the following string values or a user supplied function.
- ‘constant’ (default)
Pads with a constant value.
- ‘edge’
Pads with the edge values of array.
- ‘linear_ramp’
Pads with the linear ramp between end_value and the array edge value.
- ‘maximum’
Pads with the maximum value of all or part of the vector along each axis.
- ‘mean’
Pads with the mean value of all or part of the vector along each axis.
- ‘median’
Pads with the median value of all or part of the vector along each axis.
- ‘minimum’
Pads with the minimum value of all or part of the vector along each axis.
- ‘reflect’
Pads with the reflection of the vector mirrored on the first and last values of the vector along each axis.
- ‘symmetric’
Pads with the reflection of the vector mirrored along the edge of the array.
- ‘wrap’
Pads with the wrap of the vector along the axis. The first values are used to pad the end and the end values are used to pad the beginning.
- ‘empty’
Pads with undefined values.
New in version 1.17.
- <function>
Padding function, see Notes.
stat_length : sequence or int, optional
Used in ‘maximum’, ‘mean’, ‘median’, and ‘minimum’. Number of values at edge of each axis used to calculate the statistic value.
((before_1, after_1), … (before_N, after_N)) unique statistic lengths for each axis.
((before, after),) yields same before and after statistic lengths for each axis.
(stat_length,) or int is a shortcut for before = after = statistic length for all axes.
Default is None, to use the entire axis.
constant_values : sequence or scalar, optional
Used in ‘constant’. The values to set the padded values for each axis.
((before_1, after_1), ... (before_N, after_N)) unique pad constants for each axis.
((before, after),) yields same before and after constants for each axis.
(constant,) or constant is a shortcut for before = after = constant for all axes.
Default is 0.
end_values : sequence or scalar, optional
Used in ‘linear_ramp’. The values used for the ending value of the linear_ramp and that will form the edge of the padded array.
((before_1, after_1), ... (before_N, after_N)) unique end values for each axis.
((before, after),) yields same before and after end values for each axis.
(constant,) or constant is a shortcut for before = after = constant for all axes.
Default is 0.
reflect_type : {‘even’, ‘odd’}, optional
Used in ‘reflect’, and ‘symmetric’. The ‘even’ style is the default with an unaltered reflection around the edge value. For the ‘odd’ style, the extended part of the array is created by subtracting the reflected values from two times the edge value.
Returns: pad : ndarray
Padded array of rank equal to array with shape increased according to pad_width.
Notes
New in version 1.7.0.
For an array with rank greater than 1, some of the padding of later axes is calculated from padding of previous axes. This is easiest to think about with a rank 2 array where the corners of the padded array are calculated by using padded values from the first axis.
The padding function, if used, should modify a rank 1 array in-place. It has the following signature:
padding_func(vector, iaxis_pad_width, iaxis, kwargs)
where
- vector : ndarray
- A rank 1 array already padded with zeros. Padded values are vector[:iaxis_pad_width[0]] and vector[-iaxis_pad_width[1]:].
- iaxis_pad_width : tuple
- A 2-tuple of ints, iaxis_pad_width[0] represents the number of values padded at the beginning of vector where iaxis_pad_width[1] represents the number of values padded at the end of vector.
- iaxis : int
- The axis currently being calculated.
- kwargs : dict
- Any keyword arguments the function requires.
Examples
>>> a = [1, 2, 3, 4, 5]  # doctest: +SKIP
>>> np.pad(a, (2, 3), 'constant', constant_values=(4, 6))  # doctest: +SKIP
array([4, 4, 1, ..., 6, 6, 6])
>>> np.pad(a, (2, 3), 'edge')  # doctest: +SKIP
array([1, 1, 1, ..., 5, 5, 5])
>>> np.pad(a, (2, 3), 'linear_ramp', end_values=(5, -4))  # doctest: +SKIP
array([ 5,  3,  1,  2,  3,  4,  5,  2, -1, -4])
>>> np.pad(a, (2,), 'maximum')  # doctest: +SKIP
array([5, 5, 1, 2, 3, 4, 5, 5, 5])
>>> np.pad(a, (2,), 'mean')  # doctest: +SKIP
array([3, 3, 1, 2, 3, 4, 5, 3, 3])
>>> np.pad(a, (2,), 'median')  # doctest: +SKIP
array([3, 3, 1, 2, 3, 4, 5, 3, 3])
>>> a = [[1, 2], [3, 4]]  # doctest: +SKIP
>>> np.pad(a, ((3, 2), (2, 3)), 'minimum')  # doctest: +SKIP
array([[1, 1, 1, 2, 1, 1, 1],
       [1, 1, 1, 2, 1, 1, 1],
       [1, 1, 1, 2, 1, 1, 1],
       [1, 1, 1, 2, 1, 1, 1],
       [3, 3, 3, 4, 3, 3, 3],
       [1, 1, 1, 2, 1, 1, 1],
       [1, 1, 1, 2, 1, 1, 1]])
>>> a = [1, 2, 3, 4, 5]  # doctest: +SKIP
>>> np.pad(a, (2, 3), 'reflect')  # doctest: +SKIP
array([3, 2, 1, 2, 3, 4, 5, 4, 3, 2])
>>> np.pad(a, (2, 3), 'reflect', reflect_type='odd')  # doctest: +SKIP
array([-1,  0,  1,  2,  3,  4,  5,  6,  7,  8])
>>> np.pad(a, (2, 3), 'symmetric')  # doctest: +SKIP
array([2, 1, 1, 2, 3, 4, 5, 5, 4, 3])
>>> np.pad(a, (2, 3), 'symmetric', reflect_type='odd')  # doctest: +SKIP
array([0, 1, 1, 2, 3, 4, 5, 5, 6, 7])
>>> np.pad(a, (2, 3), 'wrap')  # doctest: +SKIP
array([4, 5, 1, 2, 3, 4, 5, 1, 2, 3])
>>> def pad_with(vector, pad_width, iaxis, kwargs):  # doctest: +SKIP
...     pad_value = kwargs.get('padder', 10)
...     vector[:pad_width[0]] = pad_value
...     vector[-pad_width[1]:] = pad_value
>>> a = np.arange(6)  # doctest: +SKIP
>>> a = a.reshape((2, 3))  # doctest: +SKIP
>>> np.pad(a, 2, pad_with)  # doctest: +SKIP
array([[10, 10, 10, 10, 10, 10, 10],
       [10, 10, 10, 10, 10, 10, 10],
       [10, 10,  0,  1,  2, 10, 10],
       [10, 10,  3,  4,  5, 10, 10],
       [10, 10, 10, 10, 10, 10, 10],
       [10, 10, 10, 10, 10, 10, 10]])
>>> np.pad(a, 2, pad_with, padder=100)  # doctest: +SKIP
array([[100, 100, 100, 100, 100, 100, 100],
       [100, 100, 100, 100, 100, 100, 100],
       [100, 100,   0,   1,   2, 100, 100],
       [100, 100,   3,   4,   5, 100, 100],
       [100, 100, 100, 100, 100, 100, 100],
       [100, 100, 100, 100, 100, 100, 100]])
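The examples above call np.pad. As a rough sketch of the Dask counterpart, assuming dask and numpy are installed, the snippet below pads a chunked array in 'reflect' mode and compares the result with NumPy. Not every mode or keyword from the copied docstring is necessarily supported by the Dask version, so treat this as one illustrative case.

```python
import numpy as np
import dask.array as da

data = np.arange(1, 6)                    # [1, 2, 3, 4, 5]
x = da.from_array(data, chunks=2)

# Two values before and three after, as in the NumPy examples above.
padded = da.pad(x, (2, 3), mode="reflect")

result = padded.compute()
expected = np.pad(data, (2, 3), mode="reflect")
```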
dask.array.percentile(a, q, interpolation='linear', method='default')
Approximate percentile of 1-D array
Parameters: a : Array
q : array_like of float
Percentile or sequence of percentiles to compute, which must be between 0 and 100 inclusive.
interpolation : {‘linear’, ‘lower’, ‘higher’, ‘midpoint’, ‘nearest’}, optional
The interpolation method to use when the desired percentile lies between two data points i < j. Only valid for method='dask'.
- ‘linear’: i + (j - i) * fraction, where fraction is the fractional part of the index surrounded by i and j.
- ‘lower’: i.
- ‘higher’: j.
- ‘nearest’: i or j, whichever is nearest.
- ‘midpoint’: (i + j) / 2.
method : {‘default’, ‘dask’, ‘tdigest’}, optional
What method to use. By default, dask’s internal custom algorithm ('dask') is used. If set to 'tdigest', tdigest is used for floats and ints, falling back to 'dask' otherwise.
See also
numpy.percentile
- Numpy’s equivalent Percentile function
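Since the result is approximate when the data spans several blocks, it can help to compare it with NumPy on a small input. A minimal sketch, assuming dask and numpy are installed; the data and chunk size are arbitrary, and only positional arguments are used.

```python
import numpy as np
import dask.array as da

values = np.arange(1001, dtype="f8")        # 0.0 ... 1000.0
x = da.from_array(values, chunks=250)       # several blocks

q = [25, 50, 75]
approx = da.percentile(x, q).compute()      # merged per-block estimates
exact = np.percentile(values, q)            # [250., 500., 750.]
```

For evenly spread data like this the two should agree closely; skewed data or very small blocks can widen the gap.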
dask.array.piecewise(x, condlist, funclist, *args, **kw)
Evaluate a piecewise-defined function.
This docstring was copied from numpy.piecewise.
Some inconsistencies with the Dask version may exist.
Given a set of conditions and corresponding functions, evaluate each function on the input data wherever its condition is true.
Parameters: x : ndarray or scalar
The input domain.
condlist : list of bool arrays or bool scalars
Each boolean array corresponds to a function in funclist. Wherever condlist[i] is True, funclist[i](x) is used as the output value.
Each boolean array in condlist selects a piece of x, and should therefore be of the same shape as x.
The length of condlist must correspond to that of funclist. If one extra function is given, i.e. if len(funclist) == len(condlist) + 1, then that extra function is the default value, used wherever all conditions are false.
funclist : list of callables, f(x,*args,**kw), or scalars
Each function is evaluated over x wherever its corresponding condition is True. It should take a 1d array as input and give a 1d array or a scalar value as output. If, instead of a callable, a scalar is provided then a constant function (lambda x: scalar) is assumed.
args : tuple, optional
Any further arguments given to piecewise are passed to the functions upon execution, i.e., if called piecewise(..., ..., 1, 'a'), then each function is called as f(x, 1, 'a').
kw : dict, optional
Keyword arguments used in calling piecewise are passed to the functions upon execution, i.e., if called piecewise(..., ..., alpha=1), then each function is called as f(x, alpha=1).
Returns: out : ndarray
The output is the same shape and type as x and is found by calling the functions in funclist on the appropriate portions of x, as defined by the boolean arrays in condlist. Portions not covered by any condition have a default value of 0.
Notes
This is similar to choose or select, except that functions are evaluated on elements of x that satisfy the corresponding condition from condlist.
The result is:
      |--
      |funclist[0](x[condlist[0]])
out = |funclist[1](x[condlist[1]])
      |...
      |funclist[n2](x[condlist[n2]])
      |--
Examples
Define the sigma function, which is -1 for x < 0 and +1 for x >= 0.
>>> x = np.linspace(-2.5, 2.5, 6)  # doctest: +SKIP
>>> np.piecewise(x, [x < 0, x >= 0], [-1, 1])  # doctest: +SKIP
array([-1., -1., -1.,  1.,  1.,  1.])
Define the absolute value, which is -x for x < 0 and x for x >= 0.
>>> np.piecewise(x, [x < 0, x >= 0], [lambda x: -x, lambda x: x])  # doctest: +SKIP
array([2.5, 1.5, 0.5, 0.5, 1.5, 2.5])
Apply the same function to a scalar value.
>>> y = -2  # doctest: +SKIP
>>> np.piecewise(y, [y < 0, y >= 0], [lambda x: -x, lambda x: x])  # doctest: +SKIP
array(2)
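The absolute-value example can be repeated on a chunked array. This is a sketch assuming dask and numpy are installed and that the condition arrays are Dask arrays chunked like x; the names x_np and y are illustrative.

```python
import numpy as np
import dask.array as da

x_np = np.linspace(-2.5, 2.5, 6)
x = da.from_array(x_np, chunks=3)

# Absolute value expressed as two pieces; the conditions are lazy
# boolean dask arrays with the same chunking as x.
y = da.piecewise(x, [x < 0, x >= 0], [lambda v: -v, lambda v: v])

result = y.compute()
```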
dask.array.prod(a, axis=None, dtype=None, keepdims=False, split_every=None, out=None)
Return the product of array elements over a given axis.
This docstring was copied from numpy.prod.
Some inconsistencies with the Dask version may exist.
Parameters: a : array_like
Input data.
axis : None or int or tuple of ints, optional
Axis or axes along which a product is performed. The default, axis=None, will calculate the product of all the elements in the input array. If axis is negative it counts from the last to the first axis.
New in version 1.7.0.
If axis is a tuple of ints, a product is performed on all of the axes specified in the tuple instead of a single axis or all the axes as before.
dtype : dtype, optional
The type of the returned array, as well as of the accumulator in which the elements are multiplied. The dtype of a is used by default unless a has an integer dtype of less precision than the default platform integer. In that case, if a is signed then the platform integer is used while if a is unsigned then an unsigned integer of the same precision as the platform integer is used.
out : ndarray, optional
Alternative output array in which to place the result. It must have the same shape as the expected output, but the type of the output values will be cast if necessary.
keepdims : bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
If the default value is passed, then keepdims will not be passed through to the prod method of sub-classes of ndarray, however any non-default value will be. If the sub-class’ method does not implement keepdims any exceptions will be raised.
initial : scalar, optional (Not supported in Dask)
The starting value for this product. See ~numpy.ufunc.reduce for details.
New in version 1.15.0.
where : array_like of bool, optional (Not supported in Dask)
Elements to include in the product. See ~numpy.ufunc.reduce for details.
New in version 1.17.0.
Returns: product_along_axis : ndarray, see dtype parameter above.
An array shaped as a but with the specified axis removed. Returns a reference to out if specified.
See also
ndarray.prod
- equivalent method
numpy.doc.ufuncs
- Section “Output arguments”
Notes
Arithmetic is modular when using integer types, and no error is raised on overflow. That means that, on a 32-bit platform:
>>> x = np.array([536870910, 536870910, 536870910, 536870910])  # doctest: +SKIP
>>> np.prod(x)  # doctest: +SKIP
16 # may vary
The product of an empty array is the neutral element 1:
>>> np.prod([])  # doctest: +SKIP
1.0
Examples
By default, calculate the product of all elements:
>>> np.prod([1.,2.])  # doctest: +SKIP
2.0
Even when the input array is two-dimensional:
>>> np.prod([[1.,2.],[3.,4.]])  # doctest: +SKIP
24.0
But we can also specify the axis over which to multiply:
>>> np.prod([[1.,2.],[3.,4.]], axis=1)  # doctest: +SKIP
array([ 2., 12.])
Or select specific elements to include:
>>> np.prod([1., np.nan, 3.], where=[True, False, True])  # doctest: +SKIP
3.0
If the type of x is unsigned, then the output type is the unsigned platform integer:
>>> x = np.array([1, 2, 3], dtype=np.uint8)  # doctest: +SKIP
>>> np.prod(x).dtype == np.uint  # doctest: +SKIP
True
If x is of a signed integer type, then the output type is the default platform integer:
>>> x = np.array([1, 2, 3], dtype=np.int8)  # doctest: +SKIP
>>> np.prod(x).dtype == int  # doctest: +SKIP
True
You can also start the product with a value other than one:
>>> np.prod([1, 2], initial=5)  # doctest: +SKIP
10
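The parameter list above marks initial and where as not supported in Dask, so a Dask call is limited to the remaining arguments. A short sketch, assuming dask and numpy are installed, using the same 2x2 input as the NumPy examples:

```python
import numpy as np
import dask.array as da

x = da.from_array(np.array([[1., 2.], [3., 4.]]), chunks=(1, 2))

total = float(da.prod(x).compute())          # product of all elements
per_row = da.prod(x, axis=1).compute()       # one product per row
kept = da.prod(x, axis=1, keepdims=True)     # reduced axis kept at size 1
```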
dask.array.ptp(a, axis=None)
Range of values (maximum - minimum) along an axis.
This docstring was copied from numpy.ptp.
Some inconsistencies with the Dask version may exist.
The name of the function comes from the acronym for ‘peak to peak’.
Parameters: a : array_like
Input values.
axis : None or int or tuple of ints, optional
Axis along which to find the peaks. By default, flatten the array. axis may be negative, in which case it counts from the last to the first axis.
New in version 1.15.0.
If this is a tuple of ints, a reduction is performed on multiple axes, instead of a single axis or all the axes as before.
out : array_like (Not supported in Dask)
Alternative output array in which to place the result. It must have the same shape and buffer length as the expected output, but the type of the output values will be cast if necessary.
keepdims : bool, optional (Not supported in Dask)
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
If the default value is passed, then keepdims will not be passed through to the ptp method of sub-classes of ndarray, however any non-default value will be. If the sub-class’ method does not implement keepdims any exceptions will be raised.
Returns: ptp : ndarray
A new array holding the result, unless out was specified, in which case a reference to out is returned.
Examples
>>> x = np.arange(4).reshape((2,2))  # doctest: +SKIP
>>> x  # doctest: +SKIP
array([[0, 1],
       [2, 3]])
>>> np.ptp(x, axis=0)  # doctest: +SKIP
array([2, 2])
>>> np.ptp(x, axis=1)  # doctest: +SKIP
array([1, 1])
dask.array.rad2deg(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.rad2deg.
Some inconsistencies with the Dask version may exist.
Convert angles from radians to degrees.
Parameters: x : array_like
Angle in radians.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : ndarray
The corresponding angle in degrees. This is a scalar if x is a scalar.
See also
deg2rad
- Convert angles from degrees to radians.
unwrap
- Remove large jumps in angle by wrapping.
Notes
New in version 1.3.0.
rad2deg(x) is 180 * x / pi.
Examples
>>> np.rad2deg(np.pi/2)  # doctest: +SKIP
90.0
dask.array.radians(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.radians.
Some inconsistencies with the Dask version may exist.
Convert angles from degrees to radians.
Parameters: x : array_like
Input array in degrees.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : ndarray
The corresponding radian values. This is a scalar if x is a scalar.
See also
deg2rad
- equivalent function
Examples
Convert a degree array to radians
>>> deg = np.arange(12.) * 30.  # doctest: +SKIP
>>> np.radians(deg)  # doctest: +SKIP
array([ 0.        ,  0.52359878,  1.04719755,  1.57079633,  2.0943951 ,
        2.61799388,  3.14159265,  3.66519143,  4.1887902 ,  4.71238898,
        5.23598776,  5.75958653])
>>> out = np.zeros((deg.shape))  # doctest: +SKIP
>>> ret = np.radians(deg, out)  # doctest: +SKIP
>>> ret is out  # doctest: +SKIP
True
dask.array.ravel(array)
Return a contiguous flattened array.
This docstring was copied from numpy.ravel.
Some inconsistencies with the Dask version may exist.
A 1-D array, containing the elements of the input, is returned. A copy is made only if needed.
As of NumPy 1.10, the returned array will have the same type as the input array. (for example, a masked array will be returned for a masked array input)
Parameters: a : array_like (Not supported in Dask)
Input array. The elements in a are read in the order specified by order, and packed as a 1-D array.
order : {‘C’,’F’, ‘A’, ‘K’}, optional (Not supported in Dask)
The elements of a are read using this index order. ‘C’ means to index the elements in row-major, C-style order, with the last axis index changing fastest, back to the first axis index changing slowest. ‘F’ means to index the elements in column-major, Fortran-style order, with the first index changing fastest, and the last index changing slowest. Note that the ‘C’ and ‘F’ options take no account of the memory layout of the underlying array, and only refer to the order of axis indexing. ‘A’ means to read the elements in Fortran-like index order if a is Fortran contiguous in memory, C-like order otherwise. ‘K’ means to read the elements in the order they occur in memory, except for reversing the data when strides are negative. By default, ‘C’ index order is used.
Returns: y : array_like
y is an array of the same subtype as a, with shape (a.size,). Note that matrices are special cased for backward compatibility, if a is a matrix, then y is a 1-D ndarray.
See also
ndarray.flat
- 1-D iterator over an array.
ndarray.flatten
- 1-D array copy of the elements of an array in row-major order.
ndarray.reshape
- Change the shape of an array without changing its data.
Notes
In row-major, C-style order, in two dimensions, the row index varies the slowest, and the column index the quickest. This can be generalized to multiple dimensions, where row-major order implies that the index along the first axis varies slowest, and the index along the last quickest. The opposite holds for column-major, Fortran-style index ordering.
When a view is desired in as many cases as possible, arr.reshape(-1) may be preferable.
Examples
It is equivalent to reshape(-1, order=order).
>>> x = np.array([[1, 2, 3], [4, 5, 6]])  # doctest: +SKIP
>>> np.ravel(x)  # doctest: +SKIP
array([1, 2, 3, 4, 5, 6])
>>> x.reshape(-1)  # doctest: +SKIP
array([1, 2, 3, 4, 5, 6])
>>> np.ravel(x, order='F')  # doctest: +SKIP
array([1, 4, 2, 5, 3, 6])
When order is ‘A’, it will preserve the array’s ‘C’ or ‘F’ ordering:
>>> np.ravel(x.T)  # doctest: +SKIP
array([1, 4, 2, 5, 3, 6])
>>> np.ravel(x.T, order='A')  # doctest: +SKIP
array([1, 2, 3, 4, 5, 6])
When order is ‘K’, it will preserve orderings that are neither ‘C’ nor ‘F’, but won’t reverse axes:
>>> a = np.arange(3)[::-1]; a  # doctest: +SKIP
array([2, 1, 0])
>>> a.ravel(order='C')  # doctest: +SKIP
array([2, 1, 0])
>>> a.ravel(order='K')  # doctest: +SKIP
array([2, 1, 0])
>>> a = np.arange(12).reshape(2,3,2).swapaxes(1,2); a  # doctest: +SKIP
array([[[ 0,  2,  4],
        [ 1,  3,  5]],
       [[ 6,  8, 10],
        [ 7,  9, 11]]])
>>> a.ravel(order='C')  # doctest: +SKIP
array([ 0,  2,  4,  1,  3,  5,  6,  8, 10,  7,  9, 11])
>>> a.ravel(order='K')  # doctest: +SKIP
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
dask.array.real(*args, **kwargs)
Return the real part of the complex argument.
This docstring was copied from numpy.real.
Some inconsistencies with the Dask version may exist.
Parameters: val : array_like (Not supported in Dask)
Input array.
Returns: out : ndarray or scalar
The real component of the complex argument. If val is real, the type of val is used for the output. If val has complex elements, the returned type is float.
Examples
>>> a = np.array([1+2j, 3+4j, 5+6j])  # doctest: +SKIP
>>> a.real  # doctest: +SKIP
array([1., 3., 5.])
>>> a.real = 9  # doctest: +SKIP
>>> a  # doctest: +SKIP
array([9.+2.j, 9.+4.j, 9.+6.j])
>>> a.real = np.array([9, 8, 7])  # doctest: +SKIP
>>> a  # doctest: +SKIP
array([9.+2.j, 8.+4.j, 7.+6.j])
>>> np.real(1 + 1j)  # doctest: +SKIP
1.0
dask.array.rechunk(x, chunks='auto', threshold=None, block_size_limit=None)
Convert blocks in dask array x for new chunks.
Parameters: x: dask array
Array to be rechunked.
chunks: int, tuple, dict or str, optional
The new block dimensions to create. -1 indicates the full size of the corresponding dimension. Default is “auto” which automatically determines chunk sizes.
threshold: int, optional
The graph growth factor under which we don’t bother introducing an intermediate step.
block_size_limit: int, optional
The maximum block size (in bytes) we want to produce. Defaults to the configuration value array.chunk-size.
Examples
>>> import dask.array as da
>>> x = da.ones((1000, 1000), chunks=(100, 100))
Specify uniform chunk sizes with a tuple
>>> y = x.rechunk((1000, 10))
Or chunk only specific dimensions with a dictionary
>>> y = x.rechunk({0: 1000})
Use the value -1 to specify that you want a single chunk along a dimension or the value "auto" to specify that dask can freely rechunk a dimension to attain blocks of a uniform block size
>>> y = x.rechunk({0: -1, 1: 'auto'}, block_size_limit=1e8)
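To see what a rechunk call produced, the chunks and numblocks attributes can be inspected without computing anything. A small sketch reusing the array from the examples above, assuming dask is installed:

```python
import dask.array as da

x = da.ones((1000, 1000), chunks=(100, 100))

y = x.rechunk((1000, 10))      # one block of rows, 100 blocks of columns
z = x.rechunk({0: -1})         # merge axis 0 only; axis 1 keeps its chunks

y_blocks = y.numblocks         # (1, 100)
z_chunks_axis0 = z.chunks[0]   # (1000,)
```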
dask.array.reduction(x, chunk, aggregate, axis=None, keepdims=False, dtype=None, split_every=None, combine=None, name=None, out=None, concatenate=True, output_size=1, meta=None)
General version of reductions
Parameters: x: Array
Data being reduced along one or more axes
chunk: callable(x_chunk, axis, keepdims)
First function to be executed when resolving the dask graph. This function is applied in parallel to all original chunks of x. See below for function parameters.
combine: callable(x_chunk, axis, keepdims), optional
Function used for intermediate recursive aggregation (see split_every below). If omitted, it defaults to aggregate. If the reduction can be performed in less than 3 steps, it will not be invoked at all.
aggregate: callable(x_chunk, axis, keepdims)
Last function to be executed when resolving the dask graph, producing the final output. It is always invoked, even when the reduced Array counts a single chunk along the reduced axes.
axis: int or sequence of ints, optional
Axis or axes to aggregate upon. If omitted, aggregate along all axes.
keepdims: boolean, optional
Whether the reduction function should preserve the reduced axes, leaving them at size output_size, or remove them.
dtype: np.dtype, optional
Force output dtype. Defaults to x.dtype if omitted.
split_every: int >= 2 or dict(axis: int), optional
Determines the depth of the recursive aggregation. If set to the number of input chunks or more, the aggregation will be performed in two steps, one chunk function per input chunk and a single aggregate function at the end. If set to less than that, an intermediate combine function will be used, so that any one combine or aggregate function has no more than split_every inputs. The depth of the aggregation graph will be \(\log_{\text{split\_every}}(\text{input chunks along reduced axes})\). Setting to a low value can reduce cache size and network transfers, at the cost of more CPU and a larger dask graph.
Omit to let dask heuristically decide a good default. A default can also be set globally with the split_every key in dask.config.
name: str, optional
Prefix of the keys of the intermediate and output nodes. If omitted it defaults to the function names.
out: Array, optional
Another dask array whose contents will be replaced. Omit to create a new one. Note that, unlike in numpy, this setting gives no performance benefits whatsoever, but can still be useful if one needs to preserve the references to a previously existing Array.
concatenate: bool, optional
If True (the default), the outputs of the chunk/combine functions are concatenated into a single np.array before being passed to the combine/aggregate functions. If False, the input of combine and aggregate will be either a list of the raw outputs of the previous step or a single output, and the function will have to concatenate it itself. It can be useful to set this to False if the chunk and/or combine steps do not produce np.arrays.
output_size: int >= 1, optional
Size of the output of the aggregate function along the reduced axes. Ignored if keepdims is False.
Returns: dask array
Function Parameters
x_chunk: numpy.ndarray
Individual input chunk. For chunk functions, it is one of the original chunks of x. For combine and aggregate functions, it’s the concatenation of the outputs produced by the previous chunk or combine functions. If concatenate=False, it’s a list of the raw outputs from the previous functions.
axis: tuple
Normalized list of axes to reduce upon, e.g. (0, ). Scalar, negative, and None axes have been normalized away. Note that some numpy reduction functions cannot reduce along multiple axes at once and strictly require an int in input. Such functions have to be wrapped to cope.
keepdims: bool
Whether the reduction function should preserve the reduced axes or remove them.
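The chunk/aggregate contract described above can be sketched with a sum of squares. This assumes dask and numpy are installed; the helper names chunk_sum_squares and aggregate_sum are hypothetical, and combine is left to default to aggregate, which is acceptable here because summation can be applied in stages.

```python
import numpy as np
import dask.array as da


def chunk_sum_squares(x_chunk, axis, keepdims):
    # Runs once per original block: square, then sum within the block.
    return np.sum(x_chunk ** 2, axis=axis, keepdims=keepdims)


def aggregate_sum(x_chunk, axis, keepdims):
    # Runs on the concatenated partial sums and produces the final values.
    return np.sum(x_chunk, axis=axis, keepdims=keepdims)


data = np.arange(12, dtype="f8").reshape(4, 3)
x = da.from_array(data, chunks=(2, 3))

sum_squares = da.reduction(
    x, chunk_sum_squares, aggregate_sum, axis=0, dtype=x.dtype
)
result = sum_squares.compute()
```

Both helpers accept axis as a tuple, which np.sum handles directly; a NumPy function that requires an int would need a wrapper, as noted above.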
dask.array.repeat(a, repeats, axis=None)
Repeat elements of an array.
This docstring was copied from numpy.repeat.
Some inconsistencies with the Dask version may exist.
Parameters: a : array_like
Input array.
repeats : int or array of ints
The number of repetitions for each element. repeats is broadcasted to fit the shape of the given axis.
axis : int, optional
The axis along which to repeat values. By default, use the flattened input array, and return a flat output array.
Returns: repeated_array : ndarray
Output array which has the same shape as a, except along the given axis.
See also
tile
- Tile an array.
Examples
>>> np.repeat(3, 4)  # doctest: +SKIP
array([3, 3, 3, 3])
>>> x = np.array([[1,2],[3,4]])  # doctest: +SKIP
>>> np.repeat(x, 2)  # doctest: +SKIP
array([1, 1, 2, 2, 3, 3, 4, 4])
>>> np.repeat(x, 3, axis=1)  # doctest: +SKIP
array([[1, 1, 1, 2, 2, 2],
       [3, 3, 3, 4, 4, 4]])
>>> np.repeat(x, [1, 2], axis=0)  # doctest: +SKIP
array([[1, 2],
       [3, 4],
       [3, 4]])
-
dask.array.
reshape
(x, shape)¶ Reshape array to new shape
This is a parallelized version of the np.reshape function with the following limitations:
- It assumes that the array is stored in row-major order
- It only allows for reshapings that collapse or merge dimensions like (1, 2, 3, 4) -> (1, 6, 4) or (64,) -> (4, 4, 4)
When communication is necessary this algorithm depends on the logic within rechunk. It endeavors to keep chunk sizes roughly the same when possible.
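The two reshaping patterns named above can be tried on small arrays. This is a minimal sketch; the chunk sizes are arbitrary choices for illustration, not recommendations.

```python
import numpy as np
import dask.array as da

# Splitting one dimension: (64,) -> (4, 4, 4)
x = da.arange(64, chunks=16)
split = da.reshape(x, (4, 4, 4))

# Merging adjacent dimensions: (1, 2, 3, 4) -> (1, 6, 4)
y = da.ones((1, 2, 3, 4), chunks=(1, 1, 3, 4))
merged = da.reshape(y, (1, 6, 4))

# The element order follows the row-major convention used by np.reshape.
same_as_numpy = np.array_equal(
    split.compute(), np.arange(64).reshape(4, 4, 4)
)
```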
-
dask.array.
result_type
(*arrays_and_dtypes)¶ This docstring was copied from numpy.result_type.
Some inconsistencies with the Dask version may exist.
Returns the type that results from applying the NumPy type promotion rules to the arguments.
Type promotion in NumPy works similarly to the rules in languages like C++, with some slight differences. When both scalars and arrays are used, the array’s type takes precedence and the actual value of the scalar is taken into account.
For example, calculating 3*a, where a is an array of 32-bit floats, intuitively should result in a 32-bit float output. If the 3 is a 32-bit integer, the NumPy rules indicate it can’t convert losslessly into a 32-bit float, so a 64-bit float should be the result type. By examining the value of the constant, ‘3’, we see that it fits in an 8-bit integer, which can be cast losslessly into the 32-bit float.
Parameters: arrays_and_dtypes : list of arrays and dtypes
The operands of some operation whose result type is needed.
Returns: out : dtype
The result type.
See also
dtype, promote_types, min_scalar_type, can_cast
Notes
New in version 1.6.0.
The specific algorithm used is as follows.
Categories are determined by first checking which of boolean, integer (int/uint), or floating point (float/complex) the maximum kind of all the arrays and the scalars are.
If there are only scalars or the maximum category of the scalars is higher than the maximum category of the arrays, the data types are combined with promote_types() to produce the return value.
Otherwise, min_scalar_type is called on each array, and the resulting data types are all combined with promote_types() to produce the return value.
The set of int values is not a subset of the uint values for types with the same number of bits, something not reflected in min_scalar_type(), but handled as a special case in result_type.
Examples
>>> np.result_type(3, np.arange(7, dtype='i1')) # doctest: +SKIP
dtype('int8')
>>> np.result_type('i4', 'c8') # doctest: +SKIP
dtype('complex128')
>>> np.result_type(3.0, -2) # doctest: +SKIP
dtype('float64')
-
dask.array.
rint
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.rint.
Some inconsistencies with the Dask version may exist.
Round elements of the array to the nearest integer.
Parameters: x : array_like
Input array.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: out : ndarray or scalar
Output array is same shape and type as x. This is a scalar if x is a scalar.
Examples
>>> a = np.array([-1.7, -1.5, -0.2, 0.2, 1.5, 1.7, 2.0]) # doctest: +SKIP
>>> np.rint(a) # doctest: +SKIP
array([-2., -2., -0., 0., 2., 2., 2.])
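The example above is consistent with ties being rounded to the nearest even value, which is NumPy's usual behaviour for rint. A small sketch that isolates the halfway cases:

```python
import numpy as np

# Values that sit exactly halfway between two integers.
halves = np.array([-2.5, -1.5, -0.5, 0.5, 1.5, 2.5])

# Each tie goes to the even neighbour rather than always away from zero.
rounded = np.rint(halves)
```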
-
dask.array.
roll
(array, shift, axis=None)¶ Roll array elements along a given axis.
This docstring was copied from numpy.roll.
Some inconsistencies with the Dask version may exist.
Elements that roll beyond the last position are re-introduced at the first.
Parameters: array : array_like
Input array.
shift : int or tuple of ints
The number of places by which elements are shifted. If a tuple, then axis must be a tuple of the same size, and each of the given axes is shifted by the corresponding number. If an int while axis is a tuple of ints, then the same value is used for all given axes.
axis : int or tuple of ints, optional
Axis or axes along which elements are shifted. By default, the array is flattened before shifting, after which the original shape is restored.
Returns: res : ndarray
Output array, with the same shape as the input array.
See also
rollaxis
- Roll the specified axis backwards, until it lies in a given position.
Notes
New in version 1.12.0.
Supports rolling over multiple dimensions simultaneously.
Examples
>>> x = np.arange(10) # doctest: +SKIP
>>> np.roll(x, 2) # doctest: +SKIP
array([8, 9, 0, 1, 2, 3, 4, 5, 6, 7])
>>> np.roll(x, -2) # doctest: +SKIP
array([2, 3, 4, 5, 6, 7, 8, 9, 0, 1])
>>> x2 = np.reshape(x, (2,5)) # doctest: +SKIP
>>> x2 # doctest: +SKIP
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])
>>> np.roll(x2, 1) # doctest: +SKIP
array([[9, 0, 1, 2, 3],
       [4, 5, 6, 7, 8]])
>>> np.roll(x2, -1) # doctest: +SKIP
array([[1, 2, 3, 4, 5],
       [6, 7, 8, 9, 0]])
>>> np.roll(x2, 1, axis=0) # doctest: +SKIP
array([[5, 6, 7, 8, 9],
       [0, 1, 2, 3, 4]])
>>> np.roll(x2, -1, axis=0) # doctest: +SKIP
array([[5, 6, 7, 8, 9],
       [0, 1, 2, 3, 4]])
>>> np.roll(x2, 1, axis=1) # doctest: +SKIP
array([[4, 0, 1, 2, 3],
       [9, 5, 6, 7, 8]])
>>> np.roll(x2, -1, axis=1) # doctest: +SKIP
array([[1, 2, 3, 4, 0],
       [6, 7, 8, 9, 5]])
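The flatten-then-restore behaviour for axis=None and the tuple form of shift and axis can both be written out by hand. A minimal sketch in NumPy, with variable names chosen for illustration:

```python
import numpy as np

x2 = np.arange(10).reshape(2, 5)

# With axis=None the array is flattened, shifted, then reshaped back.
rolled_flat = np.roll(x2, 1)
manual_flat = np.roll(x2.ravel(), 1).reshape(x2.shape)

# With tuples, each listed axis is shifted by its corresponding amount.
rolled_both = np.roll(x2, shift=(1, 2), axis=(0, 1))
manual_both = np.roll(np.roll(x2, 1, axis=0), 2, axis=1)
```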
-
dask.array.
rollaxis
(a, axis, start=0)¶
-
dask.array.
round
(a, decimals=0)¶ Round an array to the given number of decimals.
This docstring was copied from numpy.round.
Some inconsistencies with the Dask version may exist.
See also
around
- equivalent function; see for details.
-
dask.array.
sign
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.sign.
Some inconsistencies with the Dask version may exist.
Returns an element-wise indication of the sign of a number.
The sign function returns -1 if x < 0, 0 if x==0, 1 if x > 0. nan is returned for nan inputs.
For complex inputs, the sign function returns sign(x.real) + 0j if x.real != 0 else sign(x.imag) + 0j.
complex(nan, 0) is returned for complex nan inputs.
Parameters: x : array_like
Input values.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : ndarray
The sign of x. This is a scalar if x is a scalar.
Notes
There is more than one definition of sign in common use for complex numbers. The definition used here is equivalent to \(x/\sqrt{x*x}\) which is different from a common alternative, \(x/|x|\).
Examples
>>> np.sign([-5., 4.5]) # doctest: +SKIP
array([-1., 1.])
>>> np.sign(0) # doctest: +SKIP
0
>>> np.sign(5-2j) # doctest: +SKIP
(1+0j)
-
dask.array.
signbit
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.signbit.
Some inconsistencies with the Dask version may exist.
Returns element-wise True where signbit is set (less than zero).
Parameters: x : array_like
The input value(s).
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: result : ndarray of bool
Output array, or reference to out if that was supplied. This is a scalar if x is a scalar.
Examples
>>> np.signbit(-1.2) # doctest: +SKIP
True
>>> np.signbit(np.array([1, -2.3, 2.1])) # doctest: +SKIP
array([False, True, False])
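The parenthetical "less than zero" is a close but not exact description: signbit inspects the sign bit itself, so it differs from a plain comparison for negative zero. A short sketch of the difference:

```python
import numpy as np

values = np.array([-1.0, -0.0, 0.0, 1.0])

sign_bits = np.signbit(values)  # reads the sign bit of each float
below_zero = values < 0         # ordinary numeric comparison
```

Only the second element differs: -0.0 has its sign bit set but does not compare as less than zero.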
-
dask.array.
sin
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.sin.
Some inconsistencies with the Dask version may exist.
Trigonometric sine, element-wise.
Parameters: x : array_like
Angle, in radians (\(2 \pi\) rad equals 360 degrees).
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : array_like
The sine of each element of x. This is a scalar if x is a scalar.
Notes
The sine is one of the fundamental functions of trigonometry (the mathematical study of triangles). Consider a circle of radius 1 centered on the origin. A ray comes in from the \(+x\) axis, makes an angle at the origin (measured counter-clockwise from that axis), and departs from the origin. The \(y\) coordinate of the outgoing ray’s intersection with the unit circle is the sine of that angle. It ranges from -1 for \(x=3\pi / 2\) to +1 for \(\pi / 2.\) The function has zeroes where the angle is a multiple of \(\pi\). Sines of angles between \(\pi\) and \(2\pi\) are negative. The numerous properties of the sine and related functions are included in any standard trigonometry text.
Examples
Print sine of one angle:
>>> np.sin(np.pi/2.) # doctest: +SKIP
1.0
Print sines of an array of angles given in degrees:
>>> np.sin(np.array((0., 30., 45., 60., 90.)) * np.pi / 180. ) # doctest: +SKIP
array([ 0. , 0.5 , 0.70710678, 0.8660254 , 1. ])
Plot the sine function:
>>> import matplotlib.pylab as plt # doctest: +SKIP
>>> x = np.linspace(-np.pi, np.pi, 201) # doctest: +SKIP
>>> plt.plot(x, np.sin(x)) # doctest: +SKIP
>>> plt.xlabel('Angle [rad]') # doctest: +SKIP
>>> plt.ylabel('sin(x)') # doctest: +SKIP
>>> plt.axis('tight') # doctest: +SKIP
>>> plt.show() # doctest: +SKIP
-
dask.array.
sinh
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.sinh.
Some inconsistencies with the Dask version may exist.
Hyperbolic sine, element-wise.
Equivalent to 1/2 * (np.exp(x) - np.exp(-x)) or -1j * np.sin(1j*x).
Parameters: x : array_like
Input array.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : ndarray
The corresponding hyperbolic sine values. This is a scalar if x is a scalar.
Notes
If out is provided, the function writes the result into it, and returns a reference to out. (See Examples)
References
M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions. New York, NY: Dover, 1972, pg. 83.
Examples
>>> np.sinh(0) # doctest: +SKIP
0.0
>>> np.sinh(np.pi*1j/2) # doctest: +SKIP
1j
>>> np.sinh(np.pi*1j) # (exact value is 0) # doctest: +SKIP
1.2246063538223773e-016j
>>> # Discrepancy due to vagaries of floating point arithmetic.
>>> # Example of providing the optional output parameter
>>> out1 = np.array([0], dtype='d') # doctest: +SKIP
>>> out2 = np.sinh([0.1], out1) # doctest: +SKIP
>>> out2 is out1 # doctest: +SKIP
True
>>> # Example of ValueError due to provision of shape mis-matched `out`
>>> np.sinh(np.zeros((3,3)),np.zeros((2,2))) # doctest: +SKIP
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: operands could not be broadcast together with shapes (3,3) (2,2)
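The two equivalent expressions quoted in the description of sinh can be checked numerically. A minimal sketch in NumPy; the sample points are arbitrary:

```python
import numpy as np

x = np.linspace(-2.0, 2.0, 9)

# Exponential form: 1/2 * (exp(x) - exp(-x))
via_exp = 0.5 * (np.exp(x) - np.exp(-x))

# Complex-sine form: -1j * sin(1j*x); the result is real up to rounding.
via_sin = (-1j * np.sin(1j * x)).real

matches_exp = np.allclose(np.sinh(x), via_exp)
matches_sin = np.allclose(np.sinh(x), via_sin)
```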
-
dask.array.
sqrt
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.sqrt.
Some inconsistencies with the Dask version may exist.
Return the non-negative square-root of an array, element-wise.
Parameters: x : array_like
The values whose square-roots are required.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : ndarray
An array of the same shape as x, containing the positive square-root of each element in x. If any element in x is complex, a complex array is returned (and the square-roots of negative reals are calculated). If all of the elements in x are real, so is y, with negative elements returning nan. If out was provided, y is a reference to it. This is a scalar if x is a scalar.
See also
lib.scimath.sqrt
- A version which returns complex numbers when given negative reals.
Notes
sqrt has–consistent with common convention–as its branch cut the real “interval” [-inf, 0), and is continuous from above on it. A branch cut is a curve in the complex plane across which a given complex function fails to be continuous.
Examples
>>> np.sqrt([1,4,9]) # doctest: +SKIP
array([ 1., 2., 3.])
>>> np.sqrt([4, -1, -3+4J]) # doctest: +SKIP
array([ 2.+0.j, 0.+1.j, 1.+2.j])
>>> np.sqrt([4, -1, np.inf]) # doctest: +SKIP
array([ 2., nan, inf])
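How negative reals are handled depends on the input dtype, as described under Returns. The sketch below contrasts real input, complex input, and np.emath.sqrt, which is assumed here to be the public alias of the lib.scimath.sqrt function listed under See also.

```python
import numpy as np

# Real input stays real, so the negative element becomes nan.
with np.errstate(invalid="ignore"):  # silence the RuntimeWarning
    real_result = np.sqrt(np.array([4.0, -1.0]))

# Complex input yields complex roots instead of nan.
complex_result = np.sqrt(np.array([4.0, -1.0], dtype=complex))

# The scimath variant switches to complex output on its own.
emath_result = np.emath.sqrt(-1.0)
```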
-
dask.array.
square
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.square.
Some inconsistencies with the Dask version may exist.
Return the element-wise square of the input.
Parameters: x : array_like
Input data.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: out : ndarray or scalar
Element-wise x*x, of the same shape and dtype as x. This is a scalar if x is a scalar.
See also
numpy.linalg.matrix_power, sqrt, power
Examples
>>> np.square([-1j, 1]) # doctest: +SKIP
array([-1.-0.j, 1.+0.j])
-
dask.array.
squeeze
(a, axis=None)¶ Remove single-dimensional entries from the shape of an array.
This docstring was copied from numpy.squeeze.
Some inconsistencies with the Dask version may exist.
Parameters: a : array_like
Input data.
axis : None or int or tuple of ints, optional
New in version 1.7.0.
Selects a subset of the single-dimensional entries in the shape. If an axis is selected with shape entry greater than one, an error is raised.
Returns: squeezed : ndarray
The input array, but with all or a subset of the dimensions of length 1 removed. This is always a itself or a view into a.
Raises: ValueError
If axis is not None, and an axis being squeezed is not of length 1
See also
expand_dims
- The inverse operation, adding singleton dimensions
reshape
- Insert, remove, and combine dimensions, and resize existing ones
Examples
>>> x = np.array([[[0], [1], [2]]]) # doctest: +SKIP
>>> x.shape # doctest: +SKIP
(1, 3, 1)
>>> np.squeeze(x).shape # doctest: +SKIP
(3,)
>>> np.squeeze(x, axis=0).shape # doctest: +SKIP
(3, 1)
>>> np.squeeze(x, axis=1).shape # doctest: +SKIP
Traceback (most recent call last):
...
ValueError: cannot select an axis to squeeze out which has size not equal to one
>>> np.squeeze(x, axis=2).shape # doctest: +SKIP
(1, 3)
-
dask.array.
stack
(seq, axis=0) Stack arrays along a new axis
Given a sequence of dask arrays, form a new dask array by stacking them along a new dimension (axis=0 by default)
Examples
Create slices
>>> import dask.array as da
>>> import numpy as np
>>> data = [da.from_array(np.ones((4, 4)), chunks=(2, 2))
...         for i in range(3)]
>>> x = da.stack(data, axis=0)
>>> x.shape
(3, 4, 4)
>>> da.stack(data, axis=1).shape
(4, 3, 4)
>>> da.stack(data, axis=-1).shape
(4, 4, 3)
Result is a new dask Array
-
dask.array.
std
(a, axis=None, dtype=None, keepdims=False, ddof=0, split_every=None, out=None)¶ Compute the standard deviation along the specified axis.
This docstring was copied from numpy.std.
Some inconsistencies with the Dask version may exist.
Returns the standard deviation, a measure of the spread of a distribution, of the array elements. The standard deviation is computed for the flattened array by default, otherwise over the specified axis.
Parameters: a : array_like
Calculate the standard deviation of these values.
axis : None or int or tuple of ints, optional
Axis or axes along which the standard deviation is computed. The default is to compute the standard deviation of the flattened array.
New in version 1.7.0.
If this is a tuple of ints, a standard deviation is performed over multiple axes, instead of a single axis or all the axes as before.
dtype : dtype, optional
Type to use in computing the standard deviation. For arrays of integer type the default is float64, for arrays of float types it is the same as the array type.
out : ndarray, optional
Alternative output array in which to place the result. It must have the same shape as the expected output but the type (of the calculated values) will be cast if necessary.
ddof : int, optional
Means Delta Degrees of Freedom. The divisor used in calculations is N - ddof, where N represents the number of elements. By default ddof is zero.
keepdims : bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
If the default value is passed, then keepdims will not be passed through to the std method of sub-classes of ndarray, however any non-default value will be. If the sub-class’ method does not implement keepdims any exceptions will be raised.
Returns: standard_deviation : ndarray, see dtype parameter above.
If out is None, return a new array containing the standard deviation, otherwise return a reference to the output array.
Notes
The standard deviation is the square root of the average of the squared deviations from the mean, i.e., std = sqrt(mean(abs(x - x.mean())**2)).
The average squared deviation is normally calculated as x.sum() / N, where N = len(x). If, however, ddof is specified, the divisor N - ddof is used instead. In standard statistical practice, ddof=1 provides an unbiased estimator of the variance of the infinite population. ddof=0 provides a maximum likelihood estimate of the variance for normally distributed variables. The standard deviation computed in this function is the square root of the estimated variance, so even with ddof=1, it will not be an unbiased estimate of the standard deviation per se.
Note that, for complex numbers, std takes the absolute value before squaring, so that the result is always real and nonnegative.
For floating-point input, the std is computed using the same precision the input has. Depending on the input data, this can cause the results to be inaccurate, especially for float32 (see example below). Specifying a higher-accuracy accumulator using the dtype keyword can alleviate this issue.
Examples
>>> a = np.array([[1, 2], [3, 4]]) # doctest: +SKIP
>>> np.std(a) # doctest: +SKIP
1.1180339887498949 # may vary
>>> np.std(a, axis=0) # doctest: +SKIP
array([1., 1.])
>>> np.std(a, axis=1) # doctest: +SKIP
array([0.5, 0.5])
In single precision, std() can be inaccurate:
>>> a = np.zeros((2, 512*512), dtype=np.float32) # doctest: +SKIP
>>> a[0, :] = 1.0 # doctest: +SKIP
>>> a[1, :] = 0.1 # doctest: +SKIP
>>> np.std(a) # doctest: +SKIP
0.45000005
Computing the standard deviation in float64 is more accurate:
>>> np.std(a, dtype=np.float64) # doctest: +SKIP
0.44999999925494177 # may vary
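The formula in the Notes, and the effect of ddof on the divisor, can be written out step by step. This is a sketch using NumPy on a small array; the intermediate names are illustrative.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
n = x.size

# Squared deviations from the mean (abs keeps complex input real).
squared_dev = np.abs(x - x.mean()) ** 2

# ddof=0 divides by N; ddof=1 divides by N - 1.
std_population = np.sqrt(squared_dev.sum() / n)
std_sample = np.sqrt(squared_dev.sum() / (n - 1))
```

Both values agree with np.std(x) and np.std(x, ddof=1) respectively.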
-
dask.array.
sum
(a, axis=None, dtype=None, keepdims=False, split_every=None, out=None)¶ Sum of array elements over a given axis.
This docstring was copied from numpy.sum.
Some inconsistencies with the Dask version may exist.
Parameters: a : array_like
Elements to sum.
axis : None or int or tuple of ints, optional
Axis or axes along which a sum is performed. The default, axis=None, will sum all of the elements of the input array. If axis is negative it counts from the last to the first axis.
New in version 1.7.0.
If axis is a tuple of ints, a sum is performed on all of the axes specified in the tuple instead of a single axis or all the axes as before.
dtype : dtype, optional
The type of the returned array and of the accumulator in which the elements are summed. The dtype of a is used by default unless a has an integer dtype of less precision than the default platform integer. In that case, if a is signed then the platform integer is used while if a is unsigned then an unsigned integer of the same precision as the platform integer is used.
out : ndarray, optional
Alternative output array in which to place the result. It must have the same shape as the expected output, but the type of the output values will be cast if necessary.
keepdims : bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
If the default value is passed, then keepdims will not be passed through to the sum method of sub-classes of ndarray, however any non-default value will be. If the sub-class’ method does not implement keepdims any exceptions will be raised.
initial : scalar, optional (Not supported in Dask)
Starting value for the sum. See ~numpy.ufunc.reduce for details.
New in version 1.15.0.
where : array_like of bool, optional (Not supported in Dask)
Elements to include in the sum. See ~numpy.ufunc.reduce for details.
New in version 1.17.0.
Returns: sum_along_axis : ndarray
An array with the same shape as a, with the specified axis removed. If a is a 0-d array, or if axis is None, a scalar is returned. If an output array is specified, a reference to out is returned.
See also
ndarray.sum
- Equivalent method.
add.reduce
- Equivalent functionality of add.
cumsum
- Cumulative sum of array elements.
trapz
- Integration of array values using the composite trapezoidal rule.
Notes
Arithmetic is modular when using integer types, and no error is raised on overflow.
The sum of an empty array is the neutral element 0:
>>> np.sum([]) # doctest: +SKIP
0.0
For floating point numbers the numerical precision of sum (and np.add.reduce) is in general limited by directly adding each number individually to the result causing rounding errors in every step. However, often numpy will use a numerically better approach (partial pairwise summation) leading to improved precision in many use-cases. This improved precision is always provided when no axis is given. When axis is given, it will depend on which axis is summed. Technically, to provide the best speed possible, the improved precision is only used when the summation is along the fast axis in memory. Note that the exact precision may vary depending on other parameters. In contrast to NumPy, Python’s math.fsum function uses a slower but more precise approach to summation. Especially when summing a large number of lower precision floating point numbers, such as float32, numerical errors can become significant. In such cases it can be advisable to use dtype="float64" to use a higher precision for the output.
Examples
>>> np.sum([0.5, 1.5]) # doctest: +SKIP
2.0
>>> np.sum([0.5, 0.7, 0.2, 1.5], dtype=np.int32) # doctest: +SKIP
1
>>> np.sum([[0, 1], [0, 5]]) # doctest: +SKIP
6
>>> np.sum([[0, 1], [0, 5]], axis=0) # doctest: +SKIP
array([0, 6])
>>> np.sum([[0, 1], [0, 5]], axis=1) # doctest: +SKIP
array([1, 5])
>>> np.sum([[0, 1], [np.nan, 5]], where=[False, True], axis=1) # doctest: +SKIP
array([1., 5.])
If the accumulator is too small, overflow occurs:
>>> np.ones(128, dtype=np.int8).sum(dtype=np.int8) # doctest: +SKIP
-128
You can also start the sum with a value other than zero:
>>> np.sum([10], initial=5) # doctest: +SKIP
15
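The precision remarks in the Notes can be made concrete by summing many float32 values with two accumulator types and comparing against math.fsum. The array size and fill value below are arbitrary choices for illustration.

```python
import math

import numpy as np

a = np.full(1_000_000, 0.1, dtype=np.float32)

sum32 = a.sum()                  # accumulates in float32
sum64 = a.sum(dtype=np.float64)  # accumulates in float64

# Reference value: fsum adds the stored float32 values without
# accumulating rounding error.
exact = math.fsum(a.tolist())

err32 = abs(float(sum32) - exact)
err64 = abs(float(sum64) - exact)
```

With these inputs the float64 accumulator lands much closer to the reference than the float32 one.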
-
dask.array.
take
(a, indices, axis=0)¶ Take elements from an array along an axis.
This docstring was copied from numpy.take.
Some inconsistencies with the Dask version may exist.
When axis is not None, this function does the same thing as “fancy” indexing (indexing arrays using arrays); however, it can be easier to use if you need elements along a given axis. A call such as np.take(arr, indices, axis=3) is equivalent to arr[:,:,:,indices,...].
Explained without fancy indexing, this is equivalent to the following use of ndindex, which sets each of ii, jj, and kk to a tuple of indices:
    Ni, Nk = a.shape[:axis], a.shape[axis+1:]
    Nj = indices.shape
    for ii in ndindex(Ni):
        for jj in ndindex(Nj):
            for kk in ndindex(Nk):
                out[ii + jj + kk] = a[ii + (indices[jj],) + kk]
Parameters: a : array_like (Ni…, M, Nk…)
The source array.
indices : array_like (Nj…)
The indices of the values to extract.
New in version 1.8.0.
Also allow scalars for indices.
axis : int, optional
The axis over which to select values. By default, the flattened input array is used.
out : ndarray, optional (Ni…, Nj…, Nk…)
If provided, the result will be placed in this array. It should be of the appropriate shape and dtype. Note that out is always buffered if mode=’raise’; use other modes for better performance.
mode : {‘raise’, ‘wrap’, ‘clip’}, optional (Not supported in Dask)
Specifies how out-of-bounds indices will behave.
- ‘raise’ – raise an error (default)
- ‘wrap’ – wrap around
- ‘clip’ – clip to the range
‘clip’ mode means that all indices that are too large are replaced by the index that addresses the last element along that axis. Note that this disables indexing with negative numbers.
Returns: out : ndarray (Ni…, Nj…, Nk…)
The returned array has the same type as a.
See also
compress
- Take elements using a boolean mask
ndarray.take
- equivalent method
take_along_axis
- Take elements by matching the array and the index arrays
Notes
By eliminating the inner loop in the description above, and using s_ to build simple slice objects, take can be expressed in terms of applying fancy indexing to each 1-d slice:
    Ni, Nk = a.shape[:axis], a.shape[axis+1:]
    for ii in ndindex(Ni):
        for kk in ndindex(Nk):
            out[ii + s_[...,] + kk] = a[ii + s_[:,] + kk][indices]
For this reason, it is equivalent to (but faster than) the following use of apply_along_axis:
out = np.apply_along_axis(lambda a_1d: a_1d[indices], axis, a)
Examples
>>> a = [4, 3, 5, 7, 6, 8] # doctest: +SKIP
>>> indices = [0, 1, 4] # doctest: +SKIP
>>> np.take(a, indices) # doctest: +SKIP
array([4, 3, 6])
In this example if a is an ndarray, “fancy” indexing can be used.
>>> a = np.array(a) # doctest: +SKIP
>>> a[indices] # doctest: +SKIP
array([4, 3, 6])
If indices is not one dimensional, the output also has these dimensions.
>>> np.take(a, [[0, 1], [2, 3]]) # doctest: +SKIP
array([[4, 3],
       [5, 7]])
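The ndindex description of take given earlier can be turned into a runnable reference and compared with np.take. The function name take_reference is hypothetical, and the sketch uses NumPy rather than Dask.

```python
import numpy as np


def take_reference(a, indices, axis):
    # Hypothetical helper that follows the triple ndindex loop literally.
    a = np.asarray(a)
    indices = np.asarray(indices)
    Ni, Nk = a.shape[:axis], a.shape[axis + 1:]
    Nj = indices.shape
    out = np.empty(Ni + Nj + Nk, dtype=a.dtype)
    for ii in np.ndindex(Ni):
        for jj in np.ndindex(Nj):
            for kk in np.ndindex(Nk):
                out[ii + jj + kk] = a[ii + (indices[jj],) + kk]
    return out


a = np.arange(24).reshape(2, 3, 4)
indices = np.array([[0, 2], [1, 1]])

reference = take_reference(a, indices, axis=1)
expected = np.take(a, indices, axis=1)
```

The output shape is Ni + Nj + Nk, here (2,) + (2, 2) + (4,).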
-
dask.array.
tan
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.tan.
Some inconsistencies with the Dask version may exist.
Compute tangent element-wise.
Equivalent to np.sin(x)/np.cos(x) element-wise.
Parameters: x : array_like
Input array.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : ndarray
The corresponding tangent values. This is a scalar if x is a scalar.
Notes
If out is provided, the function writes the result into it, and returns a reference to out. (See Examples)
References
M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions. New York, NY: Dover, 1972.
Examples
>>> from math import pi # doctest: +SKIP
>>> np.tan(np.array([-pi,pi/2,pi])) # doctest: +SKIP
array([ 1.22460635e-16, 1.63317787e+16, -1.22460635e-16])
>>>
>>> # Example of providing the optional output parameter illustrating
>>> # that what is returned is a reference to said parameter
>>> out1 = np.array([0], dtype='d') # doctest: +SKIP
>>> out2 = np.tan([0.1], out1) # doctest: +SKIP
>>> out2 is out1 # doctest: +SKIP
True
>>>
>>> # Example of ValueError due to provision of shape mis-matched `out`
>>> np.tan(np.zeros((3,3)),np.zeros((2,2))) # doctest: +SKIP
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: operands could not be broadcast together with shapes (3,3) (2,2)
-
dask.array.
tanh
(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])¶ This docstring was copied from numpy.tanh.
Some inconsistencies with the Dask version may exist.
Compute hyperbolic tangent element-wise.
Equivalent to np.sinh(x)/np.cosh(x) or -1j * np.tan(1j*x).
Parameters: x : array_like
Input array.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : ndarray
The corresponding hyperbolic tangent values. This is a scalar if x is a scalar.
Notes
If out is provided, the function writes the result into it, and returns a reference to out. (See Examples)
References
[R140] M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions. New York, NY: Dover, 1972, pg. 83. http://www.math.sfu.ca/~cbm/aands/
[R141] Wikipedia, “Hyperbolic function”, https://en.wikipedia.org/wiki/Hyperbolic_function
Examples
>>> np.tanh((0, np.pi*1j, np.pi*1j/2))  # doctest: +SKIP
array([ 0. +0.00000000e+00j,  0. -1.22460635e-16j,  0. +1.63317787e+16j])

>>> # Example of providing the optional output parameter illustrating
>>> # that what is returned is a reference to said parameter
>>> out1 = np.array([0], dtype='d')  # doctest: +SKIP
>>> out2 = np.tanh([0.1], out1)  # doctest: +SKIP
>>> out2 is out1  # doctest: +SKIP
True

>>> # Example of ValueError due to provision of shape mis-matched `out`
>>> np.tanh(np.zeros((3,3)),np.zeros((2,2)))  # doctest: +SKIP
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: operands could not be broadcast together with shapes (3,3) (2,2)
-
dask.array.tensordot(lhs, rhs, axes=2)
Compute tensor dot product along specified axes.
This docstring was copied from numpy.tensordot.
Some inconsistencies with the Dask version may exist.
Given two tensors, a and b, and an array_like object containing two array_like objects, (a_axes, b_axes), sum the products of a’s and b’s elements (components) over the axes specified by a_axes and b_axes. The third argument can be a single non-negative integer_like scalar, N; if it is such, then the last N dimensions of a and the first N dimensions of b are summed over.
Parameters: a, b : array_like
Tensors to “dot”.
axes : int or (2,) array_like
- integer_like If an int N, sum over the last N axes of a and the first N axes of b in order. The sizes of the corresponding axes must match.
- (2,) array_like Or, a list of axes to be summed over, first sequence applying to a, second to b. Both elements array_like must be of the same length.
Returns: output : ndarray
The tensor dot product of the input.
Notes
Three common use cases are:
- axes = 0 : tensor product \(a\otimes b\)
- axes = 1 : tensor dot product \(a\cdot b\)
- axes = 2 : (default) tensor double contraction \(a:b\)
When axes is integer_like, the sequence for evaluation will be: first the -Nth axis in a and 0th axis in b, and the -1th axis in a and Nth axis in b last.
When there is more than one axis to sum over - and they are not the last (first) axes of a (b) - the argument axes should consist of two sequences of the same length, with the first axis to sum over given first in both sequences, the second axis second, and so forth.
Examples
A “traditional” example:
>>> a = np.arange(60.).reshape(3,4,5)  # doctest: +SKIP
>>> b = np.arange(24.).reshape(4,3,2)  # doctest: +SKIP
>>> c = np.tensordot(a,b, axes=([1,0],[0,1]))  # doctest: +SKIP
>>> c.shape  # doctest: +SKIP
(5, 2)
>>> c  # doctest: +SKIP
array([[4400., 4730.],
       [4532., 4874.],
       [4664., 5018.],
       [4796., 5162.],
       [4928., 5306.]])
>>> # A slower but equivalent way of computing the same...
>>> d = np.zeros((5,2))  # doctest: +SKIP
>>> for i in range(5):  # doctest: +SKIP
...   for j in range(2):
...     for k in range(3):
...       for n in range(4):
...         d[i,j] += a[k,n,i] * b[n,k,j]
>>> c == d  # doctest: +SKIP
array([[ True,  True],
       [ True,  True],
       [ True,  True],
       [ True,  True],
       [ True,  True]])
An extended example taking advantage of the overloading of + and *:
>>> a = np.array(range(1, 9))  # doctest: +SKIP
>>> a.shape = (2, 2, 2)  # doctest: +SKIP
>>> A = np.array(('a', 'b', 'c', 'd'), dtype=object)  # doctest: +SKIP
>>> A.shape = (2, 2)  # doctest: +SKIP
>>> a; A  # doctest: +SKIP
array([[[1, 2],
        [3, 4]],
       [[5, 6],
        [7, 8]]])
array([['a', 'b'],
       ['c', 'd']], dtype=object)

>>> np.tensordot(a, A)  # third argument default is 2 for double-contraction  # doctest: +SKIP
array(['abbcccdddd', 'aaaaabbbbbbcccccccdddddddd'], dtype=object)

>>> np.tensordot(a, A, 1)  # doctest: +SKIP
array([[['acc', 'bdd'],
        ['aaacccc', 'bbbdddd']],
       [['aaaaacccccc', 'bbbbbdddddd'],
        ['aaaaaaacccccccc', 'bbbbbbbdddddddd']]], dtype=object)

>>> np.tensordot(a, A, 0)  # tensor product (result too long to incl.)  # doctest: +SKIP
array([[[[['a', 'b'],
          ['c', 'd']],
          ...

>>> np.tensordot(a, A, (0, 1))  # doctest: +SKIP
array([[['abbbbb', 'cddddd'],
        ['aabbbbbb', 'ccdddddd']],
       [['aaabbbbbbb', 'cccddddddd'],
        ['aaaabbbbbbbb', 'ccccdddddddd']]], dtype=object)

>>> np.tensordot(a, A, (2, 1))  # doctest: +SKIP
array([[['abb', 'cdd'],
        ['aaabbbb', 'cccdddd']],
       [['aaaaabbbbbb', 'cccccdddddd'],
        ['aaaaaaabbbbbbbb', 'cccccccdddddddd']]], dtype=object)

>>> np.tensordot(a, A, ((0, 1), (0, 1)))  # doctest: +SKIP
array(['abbbcccccddddddd', 'aabbbbccccccdddddddd'], dtype=object)

>>> np.tensordot(a, A, ((2, 1), (1, 0)))  # doctest: +SKIP
array(['acccbbdddd', 'aaaaacccccccbbbbbbdddddddd'], dtype=object)
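The relationship between the integer and sequence forms of axes can be checked directly. The following is a minimal sketch using NumPy, whose docstring this entry copies; the array shapes and variable names are arbitrary choices for illustration, and the same calls are assumed to behave equivalently on Dask arrays.

```python
import numpy as np

# Illustrative shapes: the last two axes of a match the first two axes of b.
a = np.arange(24.0).reshape(2, 3, 4)
b = np.arange(60.0).reshape(3, 4, 5)

# axes=2 contracts the last 2 axes of a with the first 2 axes of b ...
c_int = np.tensordot(a, b, axes=2)
# ... which is the same as naming those axes explicitly.
c_seq = np.tensordot(a, b, axes=([1, 2], [0, 1]))

# axes=1 on 2-D inputs is the ordinary matrix product.
m = np.arange(6.0).reshape(2, 3)
n = np.arange(12.0).reshape(3, 4)
mat = np.tensordot(m, n, axes=1)

# axes=0 is the tensor (outer) product: no axes are summed over.
outer = np.tensordot(m, n, axes=0)

print(c_int.shape, mat.shape, outer.shape)
```

With axes=0 the result keeps every axis of both inputs, so its shape is the concatenation of the two input shapes.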
-
dask.array.tile(A, reps)
Construct an array by repeating A the number of times given by reps.
This docstring was copied from numpy.tile.
Some inconsistencies with the Dask version may exist.
If reps has length d, the result will have dimension of max(d, A.ndim).
If A.ndim < d, A is promoted to be d-dimensional by prepending new axes. So a shape (3,) array is promoted to (1, 3) for 2-D replication, or shape (1, 1, 3) for 3-D replication. If this is not the desired behavior, promote A to d-dimensions manually before calling this function.
If A.ndim > d, reps is promoted to A.ndim by pre-pending 1’s to it. Thus for an A of shape (2, 3, 4, 5), a reps of (2, 2) is treated as (1, 1, 2, 2).
Note: Although tile may be used for broadcasting, it is strongly recommended to use numpy’s broadcasting operations and functions.
Parameters: A : array_like
The input array.
reps : array_like
The number of repetitions of A along each axis.
Returns: c : ndarray
The tiled output array.
See also
repeat
- Repeat elements of an array.
broadcast_to
- Broadcast an array to a new shape
Examples
>>> a = np.array([0, 1, 2])  # doctest: +SKIP
>>> np.tile(a, 2)  # doctest: +SKIP
array([0, 1, 2, 0, 1, 2])
>>> np.tile(a, (2, 2))  # doctest: +SKIP
array([[0, 1, 2, 0, 1, 2],
       [0, 1, 2, 0, 1, 2]])
>>> np.tile(a, (2, 1, 2))  # doctest: +SKIP
array([[[0, 1, 2, 0, 1, 2]],
       [[0, 1, 2, 0, 1, 2]]])

>>> b = np.array([[1, 2], [3, 4]])  # doctest: +SKIP
>>> np.tile(b, 2)  # doctest: +SKIP
array([[1, 2, 1, 2],
       [3, 4, 3, 4]])
>>> np.tile(b, (2, 1))  # doctest: +SKIP
array([[1, 2],
       [3, 4],
       [1, 2],
       [3, 4]])

>>> c = np.array([1,2,3,4])  # doctest: +SKIP
>>> np.tile(c,(4,1))  # doctest: +SKIP
array([[1, 2, 3, 4],
       [1, 2, 3, 4],
       [1, 2, 3, 4],
       [1, 2, 3, 4]])
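The two promotion rules for A and reps can be confirmed by comparing shapes. This is a small sketch with NumPy (the source of this docstring); the variable names are illustrative only.

```python
import numpy as np

# Case A.ndim < d: a shape (3,) array is promoted to (1, 3) before tiling.
a = np.array([0, 1, 2])
tiled_2d = np.tile(a, (2, 2))
manual = np.tile(a[np.newaxis, :], (2, 2))

# Case A.ndim > d: reps (2, 2) is promoted to (1, 1, 2, 2).
x = np.zeros((2, 3, 4, 5))
short_reps = np.tile(x, (2, 2))
full_reps = np.tile(x, (1, 1, 2, 2))

print(tiled_2d.shape, short_reps.shape)
```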
-
dask.array.topk(a, k, axis=-1, split_every=None)
Extract the k largest elements from a on the given axis, and return them sorted from largest to smallest. If k is negative, extract the -k smallest elements instead, and return them sorted from smallest to largest.
This performs best when k is much smaller than the chunk size. All results will be returned in a single chunk along the given axis.
Parameters: a: Array
Data being sorted
k: int
axis: int, optional
split_every: int >=2, optional
See reduce(). This parameter becomes very important when k is of the same order of magnitude as the chunk size or larger, because it prevents the whole input array, or a significant portion of it, from being loaded into memory at once, which would also hurt network transfer when running on distributed.
Returns: Selection of a with size abs(k) along the given axis.
Examples
>>> import numpy as np
>>> import dask.array as da
>>> x = np.array([5, 1, 3, 6])
>>> d = da.from_array(x, chunks=2)
>>> d.topk(2).compute()
array([6, 5])
>>> d.topk(-2).compute()
array([1, 3])
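For a multi-dimensional input, the selection happens independently along the chosen axis. The sketch below uses a small 2-D array with an arbitrary chunking chosen for illustration, and pairs topk with argtopk (listed in the function summary at the top of this page) to recover the positions of the selected values.

```python
import numpy as np
import dask.array as da

# Illustrative data: two rows, chunked so that each row spans two chunks.
x = np.array([[5, 1, 3, 6],
              [2, 9, 4, 7]])
d = da.from_array(x, chunks=(1, 2))

# Two largest values per row, sorted from largest to smallest.
largest = da.topk(d, 2, axis=-1).compute()

# Negative k: two smallest values per row, sorted from smallest to largest.
smallest = da.topk(d, -2, axis=-1).compute()

# Indices of the two largest values per row.
idx = da.argtopk(d, 2, axis=-1).compute()

print(largest)
print(smallest)
print(idx)
```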
-
dask.array.transpose(a, axes=None)
Permute the dimensions of an array.
This docstring was copied from numpy.transpose.
Some inconsistencies with the Dask version may exist.
Parameters: a : array_like
Input array.
axes : list of ints, optional
By default, reverse the dimensions, otherwise permute the axes according to the values given.
Returns: p : ndarray
a with its axes permuted. A view is returned whenever possible.
See also
moveaxis, argsort
Notes
Use transpose(a, argsort(axes)) to invert the transposition of tensors when using the axes keyword argument.
Transposing a 1-D array returns an unchanged view of the original array.
Examples
>>> x = np.arange(4).reshape((2,2))  # doctest: +SKIP
>>> x  # doctest: +SKIP
array([[0, 1],
       [2, 3]])

>>> np.transpose(x)  # doctest: +SKIP
array([[0, 2],
       [1, 3]])

>>> x = np.ones((1, 2, 3))  # doctest: +SKIP
>>> np.transpose(x, (1, 0, 2)).shape  # doctest: +SKIP
(2, 1, 3)
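The note about inverting a transposition with argsort can be illustrated as follows. This is a minimal NumPy sketch; the shape and the permutation are arbitrary example values.

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)
axes = (2, 0, 1)  # example permutation

# Forward transposition: axis i of the result is axis axes[i] of x.
y = np.transpose(x, axes)

# argsort(axes) gives the inverse permutation, which restores the original layout.
restored = np.transpose(y, np.argsort(axes))

print(y.shape, restored.shape)
```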
-
dask.array.tril(m, k=0)
Lower triangle of an array with elements above the k-th diagonal zeroed.
Parameters: m : array_like, shape (M, M)
Input array.
k : int, optional
Diagonal above which to zero elements. k = 0 (the default) is the main diagonal, k < 0 is below it and k > 0 is above.
Returns: tril : ndarray, shape (M, M)
Lower triangle of m, of same shape and data-type as m.
See also
triu
- upper triangle of an array
-
dask.array.triu(m, k=0)
Upper triangle of an array with elements below the k-th diagonal zeroed.
Parameters: m : array_like, shape (M, N)
Input array.
k : int, optional
Diagonal below which to zero elements. k = 0 (the default) is the main diagonal, k < 0 is below it and k > 0 is above.
Returns: triu : ndarray, shape (M, N)
Upper triangle of m, of same shape and data-type as m.
See also
tril
- lower triangle of an array
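The effect of k on both functions is easiest to see on a small matrix. The sketch below uses NumPy's tril and triu, on the assumption that the Dask versions follow the same semantics; the matrix is an arbitrary example.

```python
import numpy as np

m = np.arange(1, 10).reshape(3, 3)

lower = np.tril(m)               # keeps the main diagonal and everything below it
upper = np.triu(m)               # keeps the main diagonal and everything above it
strict_upper = np.triu(m, k=1)   # zeroes the main diagonal as well
strict_lower = np.tril(m, k=-1)  # zeroes the main diagonal as well

print(lower)
print(strict_upper)
```

Because tril(m) and triu(m, k=1) select disjoint parts of the matrix, their sum reproduces m.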
-
dask.array.trunc(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
This docstring was copied from numpy.trunc.
Some inconsistencies with the Dask version may exist.
Return the truncated value of the input, element-wise.
The truncated value of the scalar x is the nearest integer i which is closer to zero than x is. In short, the fractional part of the signed number x is discarded.
Parameters: x : array_like
Input data.
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.
**kwargs
For other keyword-only arguments, see the ufunc docs.
Returns: y : ndarray or scalar
The truncated value of each element in x. This is a scalar if x is a scalar.
Notes
New in version 1.3.0.
Examples
>>> a = np.array([-1.7, -1.5, -0.2, 0.2, 1.5, 1.7, 2.0])  # doctest: +SKIP
>>> np.trunc(a)  # doctest: +SKIP
array([-1., -1., -0.,  0.,  1.,  1.,  2.])
-
dask.array.unique(ar, return_index=False, return_inverse=False, return_counts=False)
Find the unique elements of an array.
This docstring was copied from numpy.unique.
Some inconsistencies with the Dask version may exist.
Returns the sorted unique elements of an array. There are three optional outputs in addition to the unique elements:
- the indices of the input array that give the unique values
- the indices of the unique array that reconstruct the input array
- the number of times each unique value comes up in the input array
Parameters: ar : array_like
Input array. Unless axis is specified, this will be flattened if it is not already 1-D.
return_index : bool, optional
If True, also return the indices of ar (along the specified axis, if provided, or in the flattened array) that result in the unique array.
return_inverse : bool, optional
If True, also return the indices of the unique array (for the specified axis, if provided) that can be used to reconstruct ar.
return_counts : bool, optional
If True, also return the number of times each unique item appears in ar.
New in version 1.9.0.
axis : int or None, optional (Not supported in Dask)
The axis to operate on. If None, ar will be flattened. If an integer, the subarrays indexed by the given axis will be flattened and treated as the elements of a 1-D array with the dimension of the given axis, see the notes for more details. Object arrays or structured arrays that contain objects are not supported if the axis kwarg is used. The default is None.
New in version 1.13.0.
Returns: unique : ndarray
The sorted unique values.
unique_indices : ndarray, optional
The indices of the first occurrences of the unique values in the original array. Only provided if return_index is True.
unique_inverse : ndarray, optional
The indices to reconstruct the original array from the unique array. Only provided if return_inverse is True.
unique_counts : ndarray, optional
The number of times each of the unique values comes up in the original array. Only provided if return_counts is True.
New in version 1.9.0.
See also
numpy.lib.arraysetops
- Module with a number of other functions for performing set operations on arrays.
Notes
When an axis is specified the subarrays indexed by the axis are sorted. This is done by making the specified axis the first dimension of the array and then flattening the subarrays in C order. The flattened subarrays are then viewed as a structured type with each element given a label, with the effect that we end up with a 1-D array of structured types that can be treated in the same way as any other 1-D array. The result is that the flattened subarrays are sorted in lexicographic order starting with the first element.
Examples
>>> np.unique([1, 1, 2, 2, 3, 3])  # doctest: +SKIP
array([1, 2, 3])
>>> a = np.array([[1, 1], [2, 3]])  # doctest: +SKIP
>>> np.unique(a)  # doctest: +SKIP
array([1, 2, 3])

Return the unique rows of a 2D array:

>>> a = np.array([[1, 0, 0], [1, 0, 0], [2, 3, 4]])  # doctest: +SKIP
>>> np.unique(a, axis=0)  # doctest: +SKIP
array([[1, 0, 0],
       [2, 3, 4]])

Return the indices of the original array that give the unique values:

>>> a = np.array(['a', 'b', 'b', 'c', 'a'])  # doctest: +SKIP
>>> u, indices = np.unique(a, return_index=True)  # doctest: +SKIP
>>> u  # doctest: +SKIP
array(['a', 'b', 'c'], dtype='<U1')
>>> indices  # doctest: +SKIP
array([0, 1, 3])
>>> a[indices]  # doctest: +SKIP
array(['a', 'b', 'c'], dtype='<U1')

Reconstruct the input array from the unique values:

>>> a = np.array([1, 2, 6, 4, 2, 3, 2])  # doctest: +SKIP
>>> u, indices = np.unique(a, return_inverse=True)  # doctest: +SKIP
>>> u  # doctest: +SKIP
array([1, 2, 3, 4, 6])
>>> indices  # doctest: +SKIP
array([0, 1, 4, ..., 1, 2, 1])
>>> u[indices]  # doctest: +SKIP
array([1, 2, 6, ..., 2, 3, 2])
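The three optional outputs can be requested together, and their relationships to the input are simple to verify. This is a short NumPy sketch on a 1-D array; the variable names are illustrative.

```python
import numpy as np

a = np.array([1, 2, 6, 4, 2, 3, 2])

u, first_idx, inverse, counts = np.unique(
    a, return_index=True, return_inverse=True, return_counts=True
)

# first_idx points at the first occurrence of each unique value in a.
from_index = a[first_idx]
# inverse indexes into u to rebuild the original array.
rebuilt = u[inverse]

print(u, first_idx, counts)
```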
-
dask.array.unravel_index(indices, shape, order='C')
This docstring was copied from numpy.unravel_index.
Some inconsistencies with the Dask version may exist.
Converts a flat index or array of flat indices into a tuple of coordinate arrays.
Parameters: indices : array_like
An integer array whose elements are indices into the flattened version of an array of dimensions shape. Before version 1.6.0, this function accepted just one index value.
shape : tuple of ints
The shape of the array to use for unraveling indices.
Changed in version 1.16.0: Renamed from dims to shape.
order : {‘C’, ‘F’}, optional
Determines whether the indices should be viewed as indexing in row-major (C-style) or column-major (Fortran-style) order.
New in version 1.6.0.
Returns: unraveled_coords : tuple of ndarray
Each array in the tuple has the same shape as the indices array.
See also
ravel_multi_index
Examples
>>> np.unravel_index([22, 41, 37], (7,6))  # doctest: +SKIP
(array([3, 6, 6]), array([4, 5, 1]))
>>> np.unravel_index([31, 41, 13], (7,6), order='F')  # doctest: +SKIP
(array([3, 6, 6]), array([4, 5, 1]))

>>> np.unravel_index(1621, (6,7,8,9))  # doctest: +SKIP
(3, 1, 4, 1)
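unravel_index and ravel_multi_index (see "See also") are inverses of each other for a given shape and order. The following NumPy sketch round-trips the flat indices from the examples above.

```python
import numpy as np

shape = (7, 6)

# Row-major (C) order: flat index = row * 6 + col.
flat_c = np.array([22, 41, 37])
coords_c = np.unravel_index(flat_c, shape)
back_c = np.ravel_multi_index(coords_c, shape)

# Column-major (F) order: flat index = row + col * 7.
flat_f = np.array([31, 41, 13])
coords_f = np.unravel_index(flat_f, shape, order='F')
back_f = np.ravel_multi_index(coords_f, shape, order='F')

print(coords_c, coords_f)
```

Both index sets address the same three cells; only the flattening order differs.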
-
dask.array.var(a, axis=None, dtype=None, keepdims=False, ddof=0, split_every=None, out=None)
Compute the variance along the specified axis.
This docstring was copied from numpy.var.
Some inconsistencies with the Dask version may exist.
Returns the variance of the array elements, a measure of the spread of a distribution. The variance is computed for the flattened array by default, otherwise over the specified axis.
Parameters: a : array_like
Array containing numbers whose variance is desired. If a is not an array, a conversion is attempted.
axis : None or int or tuple of ints, optional
Axis or axes along which the variance is computed. The default is to compute the variance of the flattened array.
New in version 1.7.0.
If this is a tuple of ints, a variance is performed over multiple axes, instead of a single axis or all the axes as before.
dtype : data-type, optional
Type to use in computing the variance. For arrays of integer type the default is float64; for arrays of float types it is the same as the array type.
out : ndarray, optional
Alternate output array in which to place the result. It must have the same shape as the expected output, but the type is cast if necessary.
ddof : int, optional
“Delta Degrees of Freedom”: the divisor used in the calculation is N - ddof, where N represents the number of elements. By default ddof is zero.
keepdims : bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
If the default value is passed, then keepdims will not be passed through to the var method of sub-classes of ndarray, however any non-default value will be. If the sub-class’ method does not implement keepdims any exceptions will be raised.
Returns: variance : ndarray, see dtype parameter above
If out=None, returns a new array containing the variance; otherwise, a reference to the output array is returned.
Notes
The variance is the average of the squared deviations from the mean, i.e., var = mean(abs(x - x.mean())**2).
The mean is normally calculated as x.sum() / N, where N = len(x). If, however, ddof is specified, the divisor N - ddof is used instead. In standard statistical practice, ddof=1 provides an unbiased estimator of the variance of a hypothetical infinite population. ddof=0 provides a maximum likelihood estimate of the variance for normally distributed variables.
Note that for complex numbers, the absolute value is taken before squaring, so that the result is always real and nonnegative.
For floating-point input, the variance is computed using the same precision the input has. Depending on the input data, this can cause the results to be inaccurate, especially for float32 (see example below). Specifying a higher-accuracy accumulator using the dtype keyword can alleviate this issue.
Examples
>>> a = np.array([[1, 2], [3, 4]])  # doctest: +SKIP
>>> np.var(a)  # doctest: +SKIP
1.25
>>> np.var(a, axis=0)  # doctest: +SKIP
array([1.,  1.])
>>> np.var(a, axis=1)  # doctest: +SKIP
array([0.25,  0.25])

In single precision, var() can be inaccurate:

>>> a = np.zeros((2, 512*512), dtype=np.float32)  # doctest: +SKIP
>>> a[0, :] = 1.0  # doctest: +SKIP
>>> a[1, :] = 0.1  # doctest: +SKIP
>>> np.var(a)  # doctest: +SKIP
0.20250003

Computing the variance in float64 is more accurate:

>>> np.var(a, dtype=np.float64)  # doctest: +SKIP
0.20249999932944759 # may vary
>>> ((1-0.55)**2 + (0.1-0.55)**2)/2  # doctest: +SKIP
0.2025
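The role of ddof and the treatment of complex input described in the Notes can be worked through by hand. This is a minimal NumPy sketch with arbitrary sample values.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
n = x.size

# Sum of squared deviations from the mean (mean is 2.5, so this is 5.0).
ssd = (np.abs(x - x.mean()) ** 2).sum()

population_var = np.var(x)       # divisor N (ddof=0)
sample_var = np.var(x, ddof=1)   # divisor N - 1

# Complex input: the absolute value is taken before squaring,
# so the variance is real and nonnegative.
z = np.array([1 + 1j, 1 - 1j])
complex_var = np.var(z)

print(population_var, sample_var, complex_var)
```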
-
dask.array.vdot(a, b)
This docstring was copied from numpy.vdot.
Some inconsistencies with the Dask version may exist.
Return the dot product of two vectors.
The vdot(a, b) function handles complex numbers differently than dot(a, b). If the first argument is complex the complex conjugate of the first argument is used for the calculation of the dot product.
Note that vdot handles multidimensional arrays differently than dot: it does not perform a matrix product, but flattens input arguments to 1-D vectors first. Consequently, it should only be used for vectors.
Parameters: a : array_like
If a is complex the complex conjugate is taken before calculation of the dot product.
b : array_like
Second argument to the dot product.
Returns: output : ndarray
Dot product of a and b. Can be an int, float, or complex depending on the types of a and b.
See also
dot
- Return the dot product without using the complex conjugate of the first argument.
Examples
>>> a = np.array([1+2j,3+4j])  # doctest: +SKIP
>>> b = np.array([5+6j,7+8j])  # doctest: +SKIP
>>> np.vdot(a, b)  # doctest: +SKIP
(70-8j)
>>> np.vdot(b, a)  # doctest: +SKIP
(70+8j)

Note that higher-dimensional arrays are flattened!

>>> a = np.array([[1, 4], [5, 6]])  # doctest: +SKIP
>>> b = np.array([[4, 1], [2, 2]])  # doctest: +SKIP
>>> np.vdot(a, b)  # doctest: +SKIP
30
>>> np.vdot(b, a)  # doctest: +SKIP
30
>>> 1*4 + 4*1 + 5*2 + 6*2  # doctest: +SKIP
30
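The difference from dot comes down to conjugating the first argument. The NumPy sketch below reuses the complex vectors from the example above to make that explicit.

```python
import numpy as np

a = np.array([1 + 2j, 3 + 4j])
b = np.array([5 + 6j, 7 + 8j])

v_ab = np.vdot(a, b)            # conjugates a before multiplying
manual = np.dot(np.conj(a), b)  # same computation spelled out with dot
plain = np.dot(a, b)            # no conjugation

# Swapping the arguments conjugates the result.
v_ba = np.vdot(b, a)

print(v_ab, plain, v_ba)
```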
-
dask.array.vstack(tup, allow_unknown_chunksizes=False)
Stack arrays in sequence vertically (row wise).
This docstring was copied from numpy.vstack.
Some inconsistencies with the Dask version may exist.
This is equivalent to concatenation along the first axis after 1-D arrays of shape (N,) have been reshaped to (1,N). Rebuilds arrays divided by vsplit.
This function makes most sense for arrays with up to 3 dimensions. For instance, for pixel-data with a height (first axis), width (second axis), and r/g/b channels (third axis). The functions concatenate, stack and block provide more general stacking and concatenation operations.
Parameters: tup : sequence of ndarrays
The arrays must have the same shape along all but the first axis. 1-D arrays must have the same length.
Returns: stacked : ndarray
The array formed by stacking the given arrays, will be at least 2-D.
See also
stack
- Join a sequence of arrays along a new axis.
hstack
- Stack arrays in sequence horizontally (column wise).
dstack
- Stack arrays in sequence depth wise (along third dimension).
concatenate
- Join a sequence of arrays along an existing axis.
vsplit
- Split array into a list of multiple sub-arrays vertically.
block
- Assemble arrays from blocks.
Examples
>>> a = np.array([1, 2, 3])  # doctest: +SKIP
>>> b = np.array([2, 3, 4])  # doctest: +SKIP
>>> np.vstack((a,b))  # doctest: +SKIP
array([[1, 2, 3],
       [2, 3, 4]])

>>> a = np.array([[1], [2], [3]])  # doctest: +SKIP
>>> b = np.array([[2], [3], [4]])  # doctest: +SKIP
>>> np.vstack((a,b))  # doctest: +SKIP
array([[1],
       [2],
       [3],
       [2],
       [3],
       [4]])
-
dask.array.where(condition[, x, y])
This docstring was copied from numpy.where.
Some inconsistencies with the Dask version may exist.
Return elements chosen from x or y depending on condition.
Note
When only condition is provided, this function is a shorthand for np.asarray(condition).nonzero(). Using nonzero directly should be preferred, as it behaves correctly for subclasses. The rest of this documentation covers only the case where all three arguments are provided.
Parameters: condition : array_like, bool
Where True, yield x, otherwise yield y.
x, y : array_like
Values from which to choose. x, y and condition need to be broadcastable to some shape.
Returns: out : ndarray
An array with elements from x where condition is True, and elements from y elsewhere.
Notes
If all the arrays are 1-D, where is equivalent to:
[xv if c else yv for c, xv, yv in zip(condition, x, y)]
Examples
>>> a = np.arange(10)  # doctest: +SKIP
>>> a  # doctest: +SKIP
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> np.where(a < 5, a, 10*a)  # doctest: +SKIP
array([ 0,  1,  2,  3,  4, 50, 60, 70, 80, 90])

This can be used on multidimensional arrays too:

>>> np.where([[True, False], [True, True]],  # doctest: +SKIP
...          [[1, 2], [3, 4]],
...          [[9, 8], [7, 6]])
array([[1, 8],
       [3, 4]])

The shapes of x, y, and the condition are broadcast together:

>>> x, y = np.ogrid[:3, :4]  # doctest: +SKIP
>>> np.where(x < y, x, 10 + y)  # both x and 10+y are broadcast  # doctest: +SKIP
array([[10,  0,  0,  0],
       [10, 11,  1,  1],
       [10, 11, 12,  2]])

>>> a = np.array([[0, 1, 2],  # doctest: +SKIP
...               [0, 2, 4],
...               [0, 3, 6]])
>>> np.where(a < 4, a, -1)  # -1 is broadcast  # doctest: +SKIP
array([[ 0,  1,  2],
       [ 0,  2, -1],
       [ 0,  3, -1]])
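The 1-D equivalence stated in the Notes, and the single-argument shorthand mentioned in the Note at the top of this entry, can both be checked in a few lines. This is a NumPy sketch with arbitrary example values.

```python
import numpy as np

condition = np.array([True, False, True, False])
x = np.array([1, 2, 3, 4])
y = np.array([10, 20, 30, 40])

# Three-argument form: pick from x where condition holds, else from y.
chosen = np.where(condition, x, y)

# Equivalent pure-Python formulation for 1-D inputs.
by_hand = [xv if c else yv for c, xv, yv in zip(condition, x, y)]

# One-argument form: same result as nonzero().
positions = np.where(condition)
same_as = np.asarray(condition).nonzero()

print(chosen, positions)
```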
-
dask.array.zeros(*args, **kwargs)
Blocked variant of zeros
Follows the signature of zeros exactly except that it also requires a keyword argument chunks=(…)
Original signature follows below.
zeros(shape, dtype=float, order='C')
Return a new array of given shape and type, filled with zeros.
Parameters: shape : int or tuple of ints
Shape of the new array, e.g., (2, 3) or 2.
dtype : data-type, optional
The desired data-type for the array, e.g., numpy.int8. Default is numpy.float64.
order : {‘C’, ‘F’}, optional, default: ‘C’
Whether to store multi-dimensional data in row-major (C-style) or column-major (Fortran-style) order in memory.
Returns: out : ndarray
Array of zeros with the given shape, dtype, and order.
See also
zeros_like
- Return an array of zeros with shape and type of input.
empty
- Return a new uninitialized array.
ones
- Return a new array setting values to one.
full
- Return a new array of given shape filled with value.
Examples
>>> np.zeros(5)
array([ 0.,  0.,  0.,  0.,  0.])

>>> np.zeros((5,), dtype=int)
array([0, 0, 0, 0, 0])

>>> np.zeros((2, 1))
array([[ 0.],
       [ 0.]])

>>> s = (2,2)
>>> np.zeros(s)
array([[ 0.,  0.],
       [ 0.,  0.]])

>>> np.zeros((2,), dtype=[('x', 'i4'), ('y', 'i4')])  # custom dtype
array([(0, 0), (0, 0)],
      dtype=[('x', '<i4'), ('y', '<i4')])
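The examples above show the NumPy function; the blocked variant additionally takes the chunks keyword. The sketch below is one way to call it, with a shape and chunk sizes picked purely for illustration.

```python
import numpy as np
import dask.array as da

# A 4x6 array of zeros split into 2x3 blocks (illustrative sizes).
z = da.zeros((4, 6), chunks=(2, 3), dtype=int)

# The block layout is recorded per axis; nothing is allocated until compute().
block_sizes = z.chunks
result = z.compute()

print(block_sizes)
print(result.shape, result.dtype)
```

The call returns a lazy Dask array rather than an ndarray; compute() materializes it as a NumPy array.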
-
dask.array.zeros_like(a, dtype=None, chunks=None)
Return an array of zeros with the same shape and type as a given array.
Parameters: a : array_like
The shape and data-type of a define these same attributes of the returned array.
dtype : data-type, optional
Overrides the data type of the result.
chunks : sequence of ints
The number of samples on each block. Note that the last block will have fewer samples if len(array) % chunks != 0.
Returns: out : ndarray
Array of zeros with the same shape and type as a.
See also
ones_like
- Return an array of ones with shape and type of input.
empty_like
- Return an empty array with shape and type of input.
zeros
- Return a new array setting values to zero.
ones
- Return a new array setting values to one.
empty
- Return a new uninitialized array.
-
dask.array.linalg.cholesky(a, lower=False)
Returns the Cholesky decomposition, \(A = L L^*\) or \(A = U^* U\) of a Hermitian positive-definite matrix A.
Parameters: a : (M, M) array_like
Matrix to be decomposed
lower : bool, optional
Whether to compute the upper or lower triangular Cholesky factorization. Default is upper-triangular.
Returns: c : (M, M) Array
Upper- or lower-triangular Cholesky factor of a.
-
dask.array.linalg.inv(a)
Compute the inverse of a matrix with LU decomposition and forward / backward substitutions.
Parameters: a : array_like
Square matrix to be inverted.
Returns: ainv : Array
Inverse of the matrix a.
-
dask.array.linalg.lstsq(a, b)
Return the least-squares solution to a linear matrix equation using QR decomposition.
Solves the equation a x = b by computing a vector x that minimizes the Euclidean 2-norm || b - a x ||^2. The equation may be under-, well-, or over- determined (i.e., the number of linearly independent rows of a can be less than, equal to, or greater than its number of linearly independent columns). If a is square and of full rank, then x (but for round-off error) is the “exact” solution of the equation.
Parameters: a : (M, N) array_like
“Coefficient” matrix.
b : (M,) array_like
Ordinate or “dependent variable” values.
Returns: x : (N,) Array
Least-squares solution. If b is two-dimensional, the solutions are in the K columns of x.
residuals : (1,) Array
Sums of residuals; squared Euclidean 2-norm for each column in b - a*x.
rank : Array
Rank of matrix a.
s : (min(M, N),) Array
Singular values of a.
-
dask.array.linalg.lu(a)
Compute the LU decomposition of a matrix.
Returns: p: Array, permutation matrix
l: Array, lower triangular matrix with unit diagonal.
u: Array, upper triangular matrix
Examples
>>> p, l, u = da.linalg.lu(x) # doctest: +SKIP
-
dask.array.linalg.norm(x, ord=None, axis=None, keepdims=False)
Matrix or vector norm.
This docstring was copied from numpy.linalg.norm.
Some inconsistencies with the Dask version may exist.
This function is able to return one of eight different matrix norms, or one of an infinite number of vector norms (described below), depending on the value of the ord parameter.
Parameters: x : array_like
Input array. If axis is None, x must be 1-D or 2-D.
ord : {non-zero int, inf, -inf, ‘fro’, ‘nuc’}, optional
Order of the norm (see table under Notes). inf means numpy’s inf object.
axis : {int, 2-tuple of ints, None}, optional
If axis is an integer, it specifies the axis of x along which to compute the vector norms. If axis is a 2-tuple, it specifies the axes that hold 2-D matrices, and the matrix norms of these matrices are computed. If axis is None then either a vector norm (when x is 1-D) or a matrix norm (when x is 2-D) is returned.
New in version 1.8.0.
keepdims : bool, optional
If this is set to True, the axes which are normed over are left in the result as dimensions with size one. With this option the result will broadcast correctly against the original x.
New in version 1.10.0.
Returns: n : float or ndarray
Norm of the matrix or vector(s).
Notes
For values of ord <= 0, the result is, strictly speaking, not a mathematical ‘norm’, but it may still be useful for various numerical purposes.
The following norms can be calculated:
ord | norm for matrices | norm for vectors |
None | Frobenius norm | 2-norm |
‘fro’ | Frobenius norm | – |
‘nuc’ | nuclear norm | – |
inf | max(sum(abs(x), axis=1)) | max(abs(x)) |
-inf | min(sum(abs(x), axis=1)) | min(abs(x)) |
0 | – | sum(x != 0) |
1 | max(sum(abs(x), axis=0)) | as below |
-1 | min(sum(abs(x), axis=0)) | as below |
2 | 2-norm (largest sing. value) | as below |
-2 | smallest singular value | as below |
other | – | sum(abs(x)**ord)**(1./ord) |
The Frobenius norm is given by [R142]:
\(||A||_F = [\sum_{i,j} abs(a_{i,j})^2]^{1/2}\)
The nuclear norm is the sum of the singular values.
References
[R142] G. H. Golub and C. F. Van Loan, Matrix Computations, Baltimore, MD, Johns Hopkins University Press, 1985, pg. 15
Examples
>>> from numpy import linalg as LA # doctest: +SKIP >>> a = np.arange(9) - 4 # doctest: +SKIP >>> a # doctest: +SKIP array([-4, -3, -2, ..., 2, 3, 4]) >>> b = a.reshape((3, 3)) # doctest: +SKIP >>> b # doctest: +SKIP array([[-4, -3, -2], [-1, 0, 1], [ 2, 3, 4]])
>>> LA.norm(a) # doctest: +SKIP 7.745966692414834 >>> LA.norm(b) # doctest: +SKIP 7.745966692414834 >>> LA.norm(b, 'fro') # doctest: +SKIP 7.745966692414834 >>> LA.norm(a, np.inf) # doctest: +SKIP 4.0 >>> LA.norm(b, np.inf) # doctest: +SKIP 9.0 >>> LA.norm(a, -np.inf) # doctest: +SKIP 0.0 >>> LA.norm(b, -np.inf) # doctest: +SKIP 2.0
>>> LA.norm(a, 1) # doctest: +SKIP 20.0 >>> LA.norm(b, 1) # doctest: +SKIP 7.0 >>> LA.norm(a, -1) # doctest: +SKIP -4.6566128774142013e-010 >>> LA.norm(b, -1) # doctest: +SKIP 6.0 >>> LA.norm(a, 2) # doctest: +SKIP 7.745966692414834 >>> LA.norm(b, 2) # doctest: +SKIP 7.3484692283495345
>>> LA.norm(a, -2) # doctest: +SKIP 0.0 >>> LA.norm(b, -2) # doctest: +SKIP 1.8570331885190563e-016 # may vary >>> LA.norm(a, 3) # doctest: +SKIP 5.8480354764257312 # may vary >>> LA.norm(a, -3) # doctest: +SKIP 0.0
Using the axis argument to compute vector norms:
>>> c = np.array([[ 1, 2, 3], # doctest: +SKIP ... [-1, 1, 4]]) >>> LA.norm(c, axis=0) # doctest: +SKIP array([ 1.41421356, 2.23606798, 5. ]) >>> LA.norm(c, axis=1) # doctest: +SKIP array([ 3.74165739, 4.24264069]) >>> LA.norm(c, ord=1, axis=1) # doctest: +SKIP array([ 6., 6.])
Using the axis argument to compute matrix norms:
>>> m = np.arange(8).reshape(2,2,2) # doctest: +SKIP >>> LA.norm(m, axis=(1,2)) # doctest: +SKIP array([ 3.74165739, 11.22497216]) >>> LA.norm(m[0, :, :]), LA.norm(m[1, :, :]) # doctest: +SKIP (3.7416573867739413, 11.224972160321824)
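As a quick check of the definitions in the Notes, the NumPy-only sketch below compares the Frobenius and nuclear norms against their formulas, and shows how keepdims=True leaves a size-one axis that broadcasts against the input. The variable names are illustrative and not part of the docstring.

```python
import numpy as np

b = (np.arange(9) - 4).reshape(3, 3).astype(float)

# Frobenius norm: square root of the sum of squared absolute entries.
fro_manual = np.sqrt(np.sum(np.abs(b) ** 2))
fro = np.linalg.norm(b, "fro")

# Nuclear norm: sum of the singular values.
singular_values = np.linalg.svd(b, compute_uv=False)
nuc_manual = singular_values.sum()
nuc = np.linalg.norm(b, "nuc")

# keepdims=True keeps the reduced axis with size one, so the norms
# broadcast against the original array (here: normalise each row).
c = np.array([[1.0, 2.0, 3.0], [-1.0, 1.0, 4.0]])
row_norms = np.linalg.norm(c, axis=1, keepdims=True)  # shape (2, 1)
unit_rows = c / row_norms
```

Without keepdims=True, row_norms would have shape (2,) and the division would need an explicit reshape.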
-
dask.array.linalg.qr(a)
Compute the qr factorization of a matrix.
Parameters: a : Array
Returns: q: Array, orthonormal
r: Array, upper-triangular
See also
numpy.linalg.qr
- Equivalent NumPy Operation
dask.array.linalg.tsqr
- Implementation for tall-and-skinny arrays
dask.array.linalg.sfqr
- Implementation for short-and-fat arrays
Examples
>>> q, r = da.linalg.qr(x) # doctest: +SKIP
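A minimal sketch of the call above on a tall-and-skinny input, assuming dask and NumPy are installed. The array sizes, the chunking, and the variable names are illustrative choices, not taken from the docstring.

```python
import numpy as np
import dask
import dask.array as da

rng = np.random.default_rng(0)
x_np = rng.standard_normal((1000, 10))

# Tall-and-skinny layout: several row blocks, a single column block.
x = da.from_array(x_np, chunks=(250, 10))

q, r = da.linalg.qr(x)            # lazy: builds a task graph only
q_np, r_np = dask.compute(q, r)   # evaluate both factors together

reconstructs = np.allclose(q_np @ r_np, x_np)
orthonormal = np.allclose(q_np.T @ q_np, np.eye(10))
upper_triangular = np.allclose(r_np, np.triu(r_np))
```

Computing q and r in one dask.compute call lets the two results share intermediate work instead of evaluating the graph twice.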
-
dask.array.linalg.solve(a, b, sym_pos=False)
Solve the equation a x = b for x. By default, use LU decomposition and forward / backward substitutions. When sym_pos is True, use Cholesky decomposition.
Parameters: a : (M, M) array_like
A square matrix.
b : (M,) or (M, N) array_like
Right-hand side matrix in a x = b.
sym_pos : bool
Assume a is symmetric and positive definite. If True, use Cholesky decomposition.
Returns: x : (M,) or (M, N) Array
Solution to the system a x = b. Shape of the return matches the shape of b.
-
dask.array.linalg.solve_triangular(a, b, lower=False)
Solve the equation a x = b for x, assuming a is a triangular matrix.
Parameters: a : (M, M) array_like
A triangular matrix
b : (M,) or (M, N) array_like
Right-hand side matrix in a x = b
lower : bool, optional
Use only data contained in the lower triangle of a. Default is to use upper triangle.
Returns: x : (M,) or (M, N) array
Solution to the system a x = b. Shape of return matches b.
-
dask.array.linalg.svd(a)
Compute the singular value decomposition of a matrix.
Returns: u: Array, unitary / orthogonal
s: Array, singular values in decreasing order (largest first)
v: Array, unitary / orthogonal
See also
np.linalg.svd
- Equivalent NumPy Operation
dask.array.linalg.tsqr
- Implementation for tall-and-skinny arrays
Examples
>>> u, s, v = da.linalg.svd(x) # doctest: +SKIP
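The following sketch runs the call above on a small tall-and-skinny array and checks the factors, assuming dask and NumPy are installed. It treats the third return value as the already-transposed factor, as numpy.linalg.svd does. The sizes and names are illustrative.

```python
import numpy as np
import dask
import dask.array as da

rng = np.random.default_rng(0)
x_np = rng.standard_normal((1000, 10))
x = da.from_array(x_np, chunks=(250, 10))  # single column of blocks

u, s, v = da.linalg.svd(x)
u_np, s_np, v_np = dask.compute(u, s, v)

# Scaling the columns of u by s and multiplying by v rebuilds the input.
reconstructs = np.allclose((u_np * s_np) @ v_np, x_np)

# Singular values come back largest first.
decreasing = bool(np.all(np.diff(s_np) <= 0))
```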
-
dask.array.linalg.svd_compressed(a, k, n_power_iter=0, seed=None, compute=False)
Randomly compressed rank-k thin Singular Value Decomposition.
This computes the approximate singular value decomposition of a large array. This algorithm is generally faster than the normal algorithm but does not provide exact results. One can balance between performance and accuracy with input parameters (see below).
Parameters: a: Array
Input array
k: int
Rank of the desired thin SVD decomposition.
n_power_iter: int
Number of power iterations, useful when the singular values decay slowly. Error decreases exponentially as n_power_iter increases. In practice, set n_power_iter <= 4.
compute : bool
Whether or not to compute data at each use. Recomputing the input while performing several passes reduces memory pressure, but means that we have to compute the input multiple times. This is a good choice if the data is larger than memory and cheap to recreate.
Returns: u: Array, unitary / orthogonal
s: Array, singular values in decreasing order (largest first)
v: Array, unitary / orthogonal
References
N. Halko, P. G. Martinsson, and J. A. Tropp. Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM Rev., Survey and Review section, Vol. 53, num. 2, pp. 217-288, June 2011 https://arxiv.org/abs/0909.4061
Examples
>>> u, s, vt = svd_compressed(x, 20) # doctest: +SKIP
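To illustrate the rank-k behaviour, the sketch below builds a matrix of exact rank 5 and recovers it with k=5, assuming dask and NumPy are installed. The matrix sizes, chunking, and seed are arbitrary choices for the illustration. On inputs that are not low rank, the result is only an approximation.

```python
import numpy as np
import dask
import dask.array as da

rng = np.random.default_rng(0)

# Product of (500, 5) and (5, 40) factors: a matrix of rank 5.
x_np = rng.standard_normal((500, 5)) @ rng.standard_normal((5, 40))
x = da.from_array(x_np, chunks=(100, 40))

u, s, vt = da.linalg.svd_compressed(x, k=5, seed=0)
u_np, s_np, vt_np = dask.compute(u, s, vt)

# Rank-5 reconstruction from the truncated factors.
approx = (u_np * s_np) @ vt_np
max_abs_error = np.abs(approx - x_np).max()
```

For matrices whose singular values decay slowly, raising n_power_iter is the knob the parameter list above describes for trading time against accuracy.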
-
dask.array.linalg.sfqr(data, name=None)
Direct Short-and-Fat QR
Currently, this is a quick hack for non-tall-and-skinny matrices which are one chunk tall and (unless they are one chunk wide) have chunks that are wider than they are tall.
Given
Q [R_1 R_2 …] = [A_1 A_2 …]
it computes the factorization Q R_1 = A_1, then computes the other R_k’s in parallel.
Parameters: data: Array
See also
dask.array.linalg.qr
- Main user API that uses this function
dask.array.linalg.tsqr
- Variant for tall-and-skinny case
-
dask.array.linalg.tsqr(data, compute_svd=False, _max_vchunk_size=None)
Direct Tall-and-Skinny QR algorithm
As presented in:
A. Benson, D. Gleich, and J. Demmel. Direct QR factorizations for tall-and-skinny matrices in MapReduce architectures. IEEE International Conference on Big Data, 2013. https://arxiv.org/abs/1301.1071
This algorithm is used to compute both the QR decomposition and the Singular Value Decomposition. It requires that the input array have a single column of blocks, each of which fits in memory.
Parameters: data: Array
compute_svd: bool
Whether to compute the SVD rather than the QR decomposition
_max_vchunk_size: Integer
Used internally in recursion to set the maximum row dimension of chunks in subsequent recursive calls.
See also
dask.array.linalg.qr
- Powered by this algorithm
dask.array.linalg.svd
- Powered by this algorithm
dask.array.linalg.sfqr
- Variant for short-and-fat arrays
Notes
With k blocks of size (m, n), this algorithm has memory use that scales as k * n * n.
The implementation here is the recursive variant due to the ultimate need for one “single core” QR decomposition. In the non-recursive version of the algorithm, given k blocks, after k QR decompositions of (m, n) blocks, there will be a “single core” QR decomposition that will have to work with a (k * n, n) matrix.
Here, recursion is applied as necessary to ensure that k * n is not larger than m (if m / n >= 2). In particular, this is done to ensure that single core computations do not have to work on blocks larger than (m, n).
Where blocks are irregular, the above logic is applied with the “height” of the “tallest” block used in place of m.
Consider use of the rechunk method to control this behavior. Taller blocks will reduce overall memory use (assuming that many of them still fit in memory at once).
-
dask.array.ma.average(a, axis=None, weights=None, returned=False)
Return the weighted average of array over the given axis.
This docstring was copied from numpy.ma.average.
Some inconsistencies with the Dask version may exist.
Parameters: a : array_like
Data to be averaged. Masked entries are not taken into account in the computation.
axis : int, optional
Axis along which to average a. If None, averaging is done over the flattened array.
weights : array_like, optional
The importance that each element has in the computation of the average. The weights array can either be 1-D (in which case its length must be the size of a along the given axis) or of the same shape as a. If weights=None, then all data in a are assumed to have a weight equal to one. If weights is complex, the imaginary parts are ignored.
returned : bool, optional
Flag indicating whether a tuple (result, sum of weights) should be returned as output (True), or just the result (False). Default is False.
Returns: average, [sum_of_weights] : (tuple of) scalar or MaskedArray
The average along the specified axis. When returned is True, return a tuple with the average as the first element and the sum of the weights as the second element. The return type is np.float64 if a is of integer type and floats smaller than float64, or the input data-type, otherwise. If returned, sum_of_weights is always float64.
Examples
>>> a = np.ma.array([1., 2., 3., 4.], mask=[False, False, True, True])  # doctest: +SKIP
>>> np.ma.average(a, weights=[3, 1, 0, 0])  # doctest: +SKIP
1.25
>>> x = np.ma.arange(6.).reshape(3, 2)  # doctest: +SKIP
>>> x  # doctest: +SKIP
masked_array(
  data=[[0., 1.],
        [2., 3.],
        [4., 5.]],
  mask=False,
  fill_value=1e+20)
>>> avg, sumweights = np.ma.average(x, axis=0, weights=[1, 2, 3],  # doctest: +SKIP
...                                 returned=True)
>>> avg  # doctest: +SKIP
masked_array(data=[2.6666666666666665, 3.6666666666666665],
             mask=[False, False],
       fill_value=1e+20)
-
dask.array.ma.filled(a, fill_value=None)
Return input as an array with masked data replaced by a fill value.
This docstring was copied from numpy.ma.filled.
Some inconsistencies with the Dask version may exist.
If a is not a MaskedArray, a itself is returned. If a is a MaskedArray and fill_value is None, fill_value is set to a.fill_value.
Parameters: a : MaskedArray or array_like
An input object.
fill_value : scalar, optional
Filling value. Default is None.
Returns: a : ndarray
The filled array.
See also
compressed
Examples
>>> x = np.ma.array(np.arange(9).reshape(3, 3), mask=[[1, 0, 0],  # doctest: +SKIP
...                                                   [1, 0, 0],
...                                                   [0, 0, 0]])
>>> x.filled()  # doctest: +SKIP
array([[999999,      1,      2],
       [999999,      4,      5],
       [     6,      7,      8]])
-
dask.array.ma.fix_invalid(a, fill_value=None)
Return input with invalid data masked and replaced by a fill value.
This docstring was copied from numpy.ma.fix_invalid.
Some inconsistencies with the Dask version may exist.
Invalid data means values of nan, inf, etc.
Parameters: a : array_like
Input array, a (subclass of) ndarray.
mask : sequence, optional (Not supported in Dask)
Mask. Must be convertible to an array of booleans with the same shape as data. True indicates a masked (i.e. invalid) data.
copy : bool, optional (Not supported in Dask)
Whether to use a copy of a (True) or to fix a in place (False). Default is True.
fill_value : scalar, optional
Value used for fixing invalid data. Default is None, in which case the a.fill_value is used.
Returns: b : MaskedArray
The input array with invalid entries fixed.
Notes
A copy is performed by default.
Examples
>>> x = np.ma.array([1., -1, np.nan, np.inf], mask=[1] + [0]*3)  # doctest: +SKIP
>>> x  # doctest: +SKIP
masked_array(data=[--, -1.0, nan, inf],
             mask=[ True, False, False, False],
       fill_value=1e+20)
>>> np.ma.fix_invalid(x)  # doctest: +SKIP
masked_array(data=[--, -1.0, --, --],
             mask=[ True, False,  True,  True],
       fill_value=1e+20)
>>> fixed = np.ma.fix_invalid(x)  # doctest: +SKIP
>>> fixed.data  # doctest: +SKIP
array([ 1.e+00, -1.e+00,  1.e+20,  1.e+20])
>>> x.data  # doctest: +SKIP
array([ 1., -1., nan, inf])
-
dask.array.ma.getdata(a)
Return the data of a masked array as an ndarray.
This docstring was copied from numpy.ma.getdata.
Some inconsistencies with the Dask version may exist.
Return the data of a (if any) as an ndarray if a is a MaskedArray, else return a as a ndarray or subclass (depending on subok) if not.
Parameters: a : array_like
Input MaskedArray, alternatively a ndarray or a subclass thereof.
subok : bool (Not supported in Dask)
Whether to force the output to be a pure ndarray (False) or to return a subclass of ndarray if appropriate (True, default).
See also
getmask
- Return the mask of a masked array, or nomask.
getmaskarray
- Return the mask of a masked array, or full array of False.
Examples
>>> import numpy.ma as ma  # doctest: +SKIP
>>> a = ma.masked_equal([[1,2],[3,4]], 2)  # doctest: +SKIP
>>> a  # doctest: +SKIP
masked_array(
  data=[[1, --],
        [3, 4]],
  mask=[[False,  True],
        [False, False]],
  fill_value=2)
>>> ma.getdata(a)  # doctest: +SKIP
array([[1, 2],
       [3, 4]])
Equivalently use the MaskedArray data attribute.
>>> a.data  # doctest: +SKIP
array([[1, 2],
       [3, 4]])
-
dask.array.ma.getmaskarray(a)
Return the mask of a masked array, or full boolean array of False.
This docstring was copied from numpy.ma.getmaskarray.
Some inconsistencies with the Dask version may exist.
Return the mask of arr as an ndarray if arr is a MaskedArray and the mask is not nomask, else return a full boolean array of False of the same shape as arr.
Parameters: arr : array_like (Not supported in Dask)
Input MaskedArray for which the mask is required.
See also
getmask
- Return the mask of a masked array, or nomask.
getdata
- Return the data of a masked array as an ndarray.
Examples
>>> import numpy.ma as ma  # doctest: +SKIP
>>> a = ma.masked_equal([[1,2],[3,4]], 2)  # doctest: +SKIP
>>> a  # doctest: +SKIP
masked_array(
  data=[[1, --],
        [3, 4]],
  mask=[[False,  True],
        [False, False]],
  fill_value=2)
>>> ma.getmaskarray(a)  # doctest: +SKIP
array([[False,  True],
       [False, False]])
Result when mask == nomask
>>> b = ma.masked_array([[1,2],[3,4]])  # doctest: +SKIP
>>> b  # doctest: +SKIP
masked_array(
  data=[[1, 2],
        [3, 4]],
  mask=False,
  fill_value=999999)
>>> ma.getmaskarray(b)  # doctest: +SKIP
array([[False, False],
       [False, False]])
-
dask.array.ma.masked_array(data, mask=False, fill_value=None, **kwargs)
An array class with possibly masked values.
This docstring was copied from numpy.ma.masked_array.
Some inconsistencies with the Dask version may exist.
Masked values of True exclude the corresponding element from any computation.
Construction:
x = MaskedArray(data, mask=nomask, dtype=None, copy=False, subok=True, ndmin=0, fill_value=None, keep_mask=True, hard_mask=None, shrink=True, order=None)
Parameters: data : array_like
Input data.
mask : sequence, optional
Mask. Must be convertible to an array of booleans with the same shape as data. True indicates a masked (i.e. invalid) data.
dtype : dtype, optional (Not supported in Dask)
Data type of the output. If dtype is None, the type of the data argument (data.dtype) is used. If dtype is not None and different from data.dtype, a copy is performed.
copy : bool, optional (Not supported in Dask)
Whether to copy the input data (True), or to use a reference instead. Default is False.
subok : bool, optional (Not supported in Dask)
Whether to return a subclass of MaskedArray if possible (True) or a plain MaskedArray. Default is True.
ndmin : int, optional (Not supported in Dask)
Minimum number of dimensions. Default is 0.
fill_value : scalar, optional
Value used to fill in the masked values when necessary. If None, a default based on the data-type is used.
keep_mask : bool, optional (Not supported in Dask)
Whether to combine mask with the mask of the input data, if any (True), or to use only mask for the output (False). Default is True.
hard_mask : bool, optional (Not supported in Dask)
Whether to use a hard mask or not. With a hard mask, masked values cannot be unmasked. Default is False.
shrink : bool, optional (Not supported in Dask)
Whether to force compression of an empty mask. Default is True.
order : {‘C’, ‘F’, ‘A’}, optional (Not supported in Dask)
Specify the order of the array. If order is ‘C’, then the array will be in C-contiguous order (last-index varies the fastest). If order is ‘F’, then the returned array will be in Fortran-contiguous order (first-index varies the fastest). If order is ‘A’ (default), then the returned array may be in any order (either C-, Fortran-contiguous, or even discontiguous), unless a copy is required, in which case it will be C-contiguous.
-
dask.array.ma.masked_equal(a, value)
Mask an array where equal to a given value.
This docstring was copied from numpy.ma.masked_equal.
Some inconsistencies with the Dask version may exist.
This function is a shortcut to masked_where, with condition = (x == value). For floating point arrays, consider using masked_values(x, value).
See also
masked_where
- Mask where a condition is met.
masked_values
- Mask using floating point equality.
Examples
>>> import numpy.ma as ma  # doctest: +SKIP
>>> a = np.arange(4)  # doctest: +SKIP
>>> a  # doctest: +SKIP
array([0, 1, 2, 3])
>>> ma.masked_equal(a, 2)  # doctest: +SKIP
masked_array(data=[0, 1, --, 3],
             mask=[False, False,  True, False],
       fill_value=2)
-
dask.array.ma.masked_greater(x, value, copy=True)
Mask an array where greater than a given value.
This function is a shortcut to masked_where, with condition = (x > value).
See also
masked_where
- Mask where a condition is met.
Examples
>>> import numpy.ma as ma
>>> a = np.arange(4)
>>> a
array([0, 1, 2, 3])
>>> ma.masked_greater(a, 2)
masked_array(data=[0, 1, 2, --],
             mask=[False, False, False,  True],
       fill_value=999999)
-
dask.array.ma.masked_greater_equal(x, value, copy=True)
Mask an array where greater than or equal to a given value.
This function is a shortcut to masked_where, with condition = (x >= value).
See also
masked_where
- Mask where a condition is met.
Examples
>>> import numpy.ma as ma
>>> a = np.arange(4)
>>> a
array([0, 1, 2, 3])
>>> ma.masked_greater_equal(a, 2)
masked_array(data=[0, 1, --, --],
             mask=[False, False,  True,  True],
       fill_value=999999)
-
dask.array.ma.masked_inside(x, v1, v2)
Mask an array inside a given interval.
This docstring was copied from numpy.ma.masked_inside.
Some inconsistencies with the Dask version may exist.
Shortcut to masked_where, where condition is True for x inside the interval [v1,v2] (v1 <= x <= v2). The boundaries v1 and v2 can be given in either order.
See also
masked_where
- Mask where a condition is met.
Notes
The array x is prefilled with its filling value.
Examples
>>> import numpy.ma as ma  # doctest: +SKIP
>>> x = [0.31, 1.2, 0.01, 0.2, -0.4, -1.1]  # doctest: +SKIP
>>> ma.masked_inside(x, -0.3, 0.3)  # doctest: +SKIP
masked_array(data=[0.31, 1.2, --, --, -0.4, -1.1],
             mask=[False, False,  True,  True, False, False],
       fill_value=1e+20)
The order of v1 and v2 doesn’t matter.
>>> ma.masked_inside(x, 0.3, -0.3)  # doctest: +SKIP
masked_array(data=[0.31, 1.2, --, --, -0.4, -1.1],
             mask=[False, False,  True,  True, False, False],
       fill_value=1e+20)
-
dask.array.ma.masked_invalid(a)
Mask an array where invalid values occur (NaNs or infs).
This docstring was copied from numpy.ma.masked_invalid.
Some inconsistencies with the Dask version may exist.
This function is a shortcut to masked_where, with condition = ~(np.isfinite(a)). Any pre-existing mask is conserved. Only applies to arrays with a dtype where NaNs or infs make sense (i.e. floating point types), but accepts any array_like object.
See also
masked_where
- Mask where a condition is met.
Examples
>>> import numpy.ma as ma  # doctest: +SKIP
>>> a = np.arange(5, dtype=float)  # doctest: +SKIP
>>> a[2] = np.nan  # doctest: +SKIP
>>> a[3] = np.inf  # doctest: +SKIP
>>> a  # doctest: +SKIP
array([ 0.,  1., nan, inf,  4.])
>>> ma.masked_invalid(a)  # doctest: +SKIP
masked_array(data=[0.0, 1.0, --, --, 4.0],
             mask=[False, False,  True,  True, False],
       fill_value=1e+20)
-
dask.array.ma.masked_less(x, value, copy=True)
Mask an array where less than a given value.
This function is a shortcut to masked_where, with condition = (x < value).
See also
masked_where
- Mask where a condition is met.
Examples
>>> import numpy.ma as ma
>>> a = np.arange(4)
>>> a
array([0, 1, 2, 3])
>>> ma.masked_less(a, 2)
masked_array(data=[--, --, 2, 3],
             mask=[ True,  True, False, False],
       fill_value=999999)
-
dask.array.ma.masked_less_equal(x, value, copy=True)
Mask an array where less than or equal to a given value.
This function is a shortcut to masked_where, with condition = (x <= value).
See also
masked_where
- Mask where a condition is met.
Examples
>>> import numpy.ma as ma
>>> a = np.arange(4)
>>> a
array([0, 1, 2, 3])
>>> ma.masked_less_equal(a, 2)
masked_array(data=[--, --, --, 3],
             mask=[ True,  True,  True, False],
       fill_value=999999)
-
dask.array.ma.masked_not_equal(x, value, copy=True)
Mask an array where not equal to a given value.
This function is a shortcut to masked_where, with condition = (x != value).
See also
masked_where
- Mask where a condition is met.
Examples
>>> import numpy.ma as ma
>>> a = np.arange(4)
>>> a
array([0, 1, 2, 3])
>>> ma.masked_not_equal(a, 2)
masked_array(data=[--, --, 2, --],
             mask=[ True,  True, False,  True],
       fill_value=999999)
-
dask.array.ma.masked_outside(x, v1, v2)
Mask an array outside a given interval.
This docstring was copied from numpy.ma.masked_outside.
Some inconsistencies with the Dask version may exist.
Shortcut to masked_where, where condition is True for x outside the interval [v1,v2] (x < v1)|(x > v2). The boundaries v1 and v2 can be given in either order.
See also
masked_where
- Mask where a condition is met.
Notes
The array x is prefilled with its filling value.
Examples
>>> import numpy.ma as ma  # doctest: +SKIP
>>> x = [0.31, 1.2, 0.01, 0.2, -0.4, -1.1]  # doctest: +SKIP
>>> ma.masked_outside(x, -0.3, 0.3)  # doctest: +SKIP
masked_array(data=[--, --, 0.01, 0.2, --, --],
             mask=[ True,  True, False, False,  True,  True],
       fill_value=1e+20)
The order of v1 and v2 doesn’t matter.
>>> ma.masked_outside(x, 0.3, -0.3)  # doctest: +SKIP
masked_array(data=[--, --, 0.01, 0.2, --, --],
             mask=[ True,  True, False, False,  True,  True],
       fill_value=1e+20)
-
dask.array.ma.masked_values(x, value, rtol=1e-05, atol=1e-08, shrink=True)
Mask using floating point equality.
This docstring was copied from numpy.ma.masked_values.
Some inconsistencies with the Dask version may exist.
Return a MaskedArray, masked where the data in array x are approximately equal to value, determined using isclose. The default tolerances for masked_values are the same as those for isclose.
For integer types, exact equality is used, in the same way as masked_equal.
The fill_value is set to value and the mask is set to nomask if possible.
Parameters: x : array_like
Array to mask.
value : float
Masking value.
rtol, atol : float, optional
Tolerance parameters passed on to isclose
copy : bool, optional (Not supported in Dask)
Whether to return a copy of x.
shrink : bool, optional
Whether to collapse a mask full of False to nomask.
Returns: result : MaskedArray
The result of masking x where approximately equal to value.
See also
masked_where
- Mask where a condition is met.
masked_equal
- Mask where equal to a given value (integers).
Examples
>>> import numpy.ma as ma  # doctest: +SKIP
>>> x = np.array([1, 1.1, 2, 1.1, 3])  # doctest: +SKIP
>>> ma.masked_values(x, 1.1)  # doctest: +SKIP
masked_array(data=[1.0, --, 2.0, --, 3.0],
             mask=[False,  True, False,  True, False],
       fill_value=1.1)
Note that mask is set to nomask if possible.
>>> ma.masked_values(x, 1.5)  # doctest: +SKIP
masked_array(data=[1. , 1.1, 2. , 1.1, 3. ],
             mask=False,
       fill_value=1.5)
For integers, the fill value will be different in general to the result of masked_equal.
>>> x = np.arange(5)  # doctest: +SKIP
>>> x  # doctest: +SKIP
array([0, 1, 2, 3, 4])
>>> ma.masked_values(x, 2)  # doctest: +SKIP
masked_array(data=[0, 1, --, 3, 4],
             mask=[False, False,  True, False, False],
       fill_value=2)
>>> ma.masked_equal(x, 2)  # doctest: +SKIP
masked_array(data=[0, 1, --, 3, 4],
             mask=[False, False,  True, False, False],
       fill_value=2)
-
dask.array.ma.masked_where(condition, a)
Mask an array where a condition is met.
This docstring was copied from numpy.ma.masked_where.
Some inconsistencies with the Dask version may exist.
Return a as an array masked where condition is True. Any masked values of a or condition are also masked in the output.
Parameters: condition : array_like
Masking condition. When condition tests floating point values for equality, consider using masked_values instead.
a : array_like
Array to mask.
copy : bool (Not supported in Dask)
If True (default) make a copy of a in the result. If False modify a in place and return a view.
Returns: result : MaskedArray
The result of masking a where condition is True.
See also
masked_values
- Mask using floating point equality.
masked_equal
- Mask where equal to a given value.
masked_not_equal
- Mask where not equal to a given value.
masked_less_equal
- Mask where less than or equal to a given value.
masked_greater_equal
- Mask where greater than or equal to a given value.
masked_less
- Mask where less than a given value.
masked_greater
- Mask where greater than a given value.
masked_inside
- Mask inside a given interval.
masked_outside
- Mask outside a given interval.
masked_invalid
- Mask invalid values (NaNs or infs).
Examples
>>> import numpy.ma as ma  # doctest: +SKIP
>>> a = np.arange(4)  # doctest: +SKIP
>>> a  # doctest: +SKIP
array([0, 1, 2, 3])
>>> ma.masked_where(a <= 2, a)  # doctest: +SKIP
masked_array(data=[--, --, --, 3],
             mask=[ True,  True,  True, False],
       fill_value=999999)
Mask array b conditional on a.
>>> b = ['a', 'b', 'c', 'd']  # doctest: +SKIP
>>> ma.masked_where(a == 2, b)  # doctest: +SKIP
masked_array(data=['a', 'b', --, 'd'],
             mask=[False, False,  True, False],
       fill_value='N/A',
            dtype='<U1')
Effect of the copy argument.
>>> c = ma.masked_where(a <= 2, a)  # doctest: +SKIP
>>> c  # doctest: +SKIP
masked_array(data=[--, --, --, 3],
             mask=[ True,  True,  True, False],
       fill_value=999999)
>>> c[0] = 99  # doctest: +SKIP
>>> c  # doctest: +SKIP
masked_array(data=[99, --, --, 3],
             mask=[False,  True,  True, False],
       fill_value=999999)
>>> a  # doctest: +SKIP
array([0, 1, 2, 3])
>>> c = ma.masked_where(a <= 2, a, copy=False)  # doctest: +SKIP
>>> c[0] = 99  # doctest: +SKIP
>>> c  # doctest: +SKIP
masked_array(data=[99, --, --, 3],
             mask=[False,  True,  True, False],
       fill_value=999999)
>>> a  # doctest: +SKIP
array([99,  1,  2,  3])
When condition or a contain masked values.
>>> a = np.arange(4)  # doctest: +SKIP
>>> a = ma.masked_where(a == 2, a)  # doctest: +SKIP
>>> a  # doctest: +SKIP
masked_array(data=[0, 1, --, 3],
             mask=[False, False,  True, False],
       fill_value=999999)
>>> b = np.arange(4)  # doctest: +SKIP
>>> b = ma.masked_where(b == 0, b)  # doctest: +SKIP
>>> b  # doctest: +SKIP
masked_array(data=[--, 1, 2, 3],
             mask=[ True, False, False, False],
       fill_value=999999)
>>> ma.masked_where(a == 3, b)  # doctest: +SKIP
masked_array(data=[--, 1, --, --],
             mask=[ True, False,  True,  True],
       fill_value=999999)
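The examples above are copied from NumPy and operate on NumPy arrays. A minimal sketch of the same idea on a chunked dask array, assuming dask is installed, might look like the following. The input values and chunk size are illustrative.

```python
import numpy as np
import dask.array as da

x = da.from_array(np.arange(6), chunks=3)

# Mask the even entries; the mask is applied block by block and lazily.
masked = da.ma.masked_where(x % 2 == 0, x)

# Replace masked entries with -1 and pull the result into NumPy.
filled = da.ma.filled(masked, -1).compute()

# Recover the boolean mask as a plain array.
mask = da.ma.getmaskarray(masked).compute()
```

Here filled holds -1 at the even positions and the original odd values elsewhere, while mask is True exactly where the condition held.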
-
dask.array.ma.set_fill_value(a, fill_value)
Set the filling value of a, if a is a masked array.
This docstring was copied from numpy.ma.set_fill_value.
Some inconsistencies with the Dask version may exist.
This function changes the fill value of the masked array a in place. If a is not a masked array, the function returns silently, without doing anything.
Parameters: a : array_like
Input array.
fill_value : dtype
Filling value. A consistency test is performed to make sure the value is compatible with the dtype of a.
Returns: None
Nothing returned by this function.
See also
maximum_fill_value
- Return the default fill value for a dtype.
MaskedArray.fill_value
- Return current fill value.
MaskedArray.set_fill_value
- Equivalent method.
Examples
>>> import numpy.ma as ma  # doctest: +SKIP
>>> a = np.arange(5)  # doctest: +SKIP
>>> a  # doctest: +SKIP
array([0, 1, 2, 3, 4])
>>> a = ma.masked_where(a < 3, a)  # doctest: +SKIP
>>> a  # doctest: +SKIP
masked_array(data=[--, --, --, 3, 4],
             mask=[ True,  True,  True, False, False],
       fill_value=999999)
>>> ma.set_fill_value(a, -999)  # doctest: +SKIP
>>> a  # doctest: +SKIP
masked_array(data=[--, --, --, 3, 4],
             mask=[ True,  True,  True, False, False],
       fill_value=-999)
Nothing happens if a is not a masked array.
>>> a = list(range(5))  # doctest: +SKIP
>>> a  # doctest: +SKIP
[0, 1, 2, 3, 4]
>>> ma.set_fill_value(a, 100)  # doctest: +SKIP
>>> a  # doctest: +SKIP
[0, 1, 2, 3, 4]
>>> a = np.arange(5)  # doctest: +SKIP
>>> a  # doctest: +SKIP
array([0, 1, 2, 3, 4])
>>> ma.set_fill_value(a, 100)  # doctest: +SKIP
>>> a  # doctest: +SKIP
array([0, 1, 2, 3, 4])
-
dask.array.overlap.overlap(x, depth, boundary)
Share boundaries between neighboring blocks
Parameters: x: da.Array
A dask array
depth: dict
The size of the shared boundary per axis
boundary: dict
The boundary condition on each axis. Options are ‘reflect’, ‘periodic’, ‘nearest’, ‘none’, or an array value. Such a value will fill the boundary with that value.
The depth input informs how many cells to overlap between neighboring blocks. For example, {0: 2, 2: 5} means share two cells in 0 axis, 5 cells in 2 axis. Axes missing from this input will not be overlapped.
Examples
>>> import numpy as np
>>> import dask.array as da
>>> x = np.arange(64).reshape((8, 8))
>>> d = da.from_array(x, chunks=(4, 4))
>>> d.chunks
((4, 4), (4, 4))
>>> g = da.overlap.overlap(d, depth={0: 2, 1: 1},
...                        boundary={0: 100, 1: 'reflect'})
>>> g.chunks
((8, 8), (6, 6))
>>> np.array(g)
array([[100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100],
       [100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100],
       [  0,   0,   1,   2,   3,   4,   3,   4,   5,   6,   7,   7],
       [  8,   8,   9,  10,  11,  12,  11,  12,  13,  14,  15,  15],
       [ 16,  16,  17,  18,  19,  20,  19,  20,  21,  22,  23,  23],
       [ 24,  24,  25,  26,  27,  28,  27,  28,  29,  30,  31,  31],
       [ 32,  32,  33,  34,  35,  36,  35,  36,  37,  38,  39,  39],
       [ 40,  40,  41,  42,  43,  44,  43,  44,  45,  46,  47,  47],
       [ 16,  16,  17,  18,  19,  20,  19,  20,  21,  22,  23,  23],
       [ 24,  24,  25,  26,  27,  28,  27,  28,  29,  30,  31,  31],
       [ 32,  32,  33,  34,  35,  36,  35,  36,  37,  38,  39,  39],
       [ 40,  40,  41,  42,  43,  44,  43,  44,  45,  46,  47,  47],
       [ 48,  48,  49,  50,  51,  52,  51,  52,  53,  54,  55,  55],
       [ 56,  56,  57,  58,  59,  60,  59,  60,  61,  62,  63,  63],
       [100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100],
       [100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100]])
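The chunk sizes reported by g.chunks above follow from simple arithmetic: each block grows by the depth on both sides of every overlapped axis. The pure-Python sketch below reproduces that calculation. The helper expanded_chunks is hypothetical and not part of dask, and it assumes every block is padded on both sides, as in the example above where each axis has a boundary condition other than ‘none’.

```python
def expanded_chunks(chunks, depth):
    """Chunk sizes after overlapping (hypothetical helper, not a dask API).

    chunks: tuple of per-axis tuples of block sizes, like Array.chunks.
    depth:  dict mapping axis -> number of shared cells; missing axes
            are left untouched.
    """
    return tuple(
        tuple(size + 2 * depth.get(axis, 0) for size in sizes)
        for axis, sizes in enumerate(chunks)
    )


# The example above: 4x4 blocks, depth 2 on axis 0 and depth 1 on axis 1.
grown = expanded_chunks(((4, 4), (4, 4)), {0: 2, 1: 1})

# An axis missing from depth keeps its original block sizes.
partial = expanded_chunks(((4, 4), (4, 4)), {0: 2})
```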
-
dask.array.overlap.map_overlap(x, func, depth, boundary=None, trim=True, **kwargs)
Map a function over blocks of the array with some overlap
We share neighboring zones between blocks of the array, then map a function, then trim away the neighboring strips.
Parameters: func: function
The function to apply to each extended block
depth: int, tuple, or dict
The number of elements that each block should share with its neighbors. If a tuple or dict then this can be different per axis. Asymmetric depths may be specified using a dict value of (-/+) tuples. Note that asymmetric depths are currently only supported when boundary is ‘none’.
boundary: str, tuple, dict
How to handle the boundaries. Values include ‘reflect’, ‘periodic’, ‘nearest’, ‘none’, or any constant value like 0 or np.nan
trim: bool
Whether or not to trim depth elements from each block after calling the map function. Set this to False if your mapping function already does this for you.
**kwargs:
Other keyword arguments valid in map_blocks
Examples
>>> import numpy as np
>>> import dask.array as da
>>> x = np.array([1, 1, 2, 3, 3, 3, 2, 1, 1])
>>> x = da.from_array(x, chunks=5)
>>> def derivative(x):
...     return x - np.roll(x, 1)
>>> y = x.map_overlap(derivative, depth=1, boundary=0)
>>> y.compute()
array([ 1,  0,  1,  1,  0,  0, -1, -1,  0])
>>> x = np.arange(16).reshape((4, 4))
>>> d = da.from_array(x, chunks=(2, 2))
>>> d.map_overlap(lambda x: x + x.size, depth=1).compute()
array([[16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31]])
>>> func = lambda x: x + x.size
>>> depth = {0: 1, 1: 1}
>>> boundary = {0: 'reflect', 1: 'none'}
>>> d.map_overlap(func, depth, boundary).compute()  # doctest: +NORMALIZE_WHITESPACE
array([[12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27]])
-
dask.array.overlap.trim_internal(x, axes, boundary=None)
Trim sides from each block
This couples well with the overlap operation, which may leave excess data on each block
See also
dask.array.chunk.trim, dask.array.map_blocks
-
dask.array.overlap.trim_overlap(x, depth, boundary=None)
Trim sides from each block.
This couples well with the map_overlap operation which may leave excess data on each block.
See also
-
dask.array.from_array(x, chunks='auto', name=None, lock=False, asarray=None, fancy=True, getitem=None, meta=None)
Create dask array from something that looks like an array
Input must have a .shape, .ndim, .dtype and support numpy-style slicing.
Parameters: x : array_like
chunks : int, tuple
How to chunk the array. Must be one of the following forms:
- A blocksize like 1000.
- A blockshape like (1000, 1000).
- Explicit sizes of all blocks along all dimensions like ((1000, 1000, 500), (400, 400)).
- A size in bytes, like “100 MiB” which will choose a uniform block-like shape
- The word “auto” which acts like the above, but uses a configuration value array.chunk-size for the chunk size
-1 or None as a blocksize indicate the size of the corresponding dimension.
name : str, optional
The key name to use for the array. Defaults to a hash of x. By default, hash uses python’s standard sha1. This behaviour can be changed by installing cityhash, xxhash or murmurhash. If installed, a large-factor speedup can be obtained in the tokenisation step. Use name=False to generate a random name instead of hashing (fast).
lock : bool or Lock, optional
If x doesn’t support concurrent reads then provide a lock here, or pass in True to have dask.array create one for you.
asarray : bool, optional
If True then call np.asarray on chunks to convert them to numpy arrays. If False then chunks are passed through unchanged. If None (default) then we use True if the __array_function__ method is undefined.
fancy : bool, optional
If x doesn’t support fancy indexing (e.g. indexing with lists or arrays) then set to False. Default is True.
meta : Array-like, optional
The metadata for the resulting dask array. This is the kind of array that will result from slicing the input array. Defaults to the input array.
Examples
>>> x = h5py.File('...')['/data/path'] # doctest: +SKIP
>>> a = da.from_array(x, chunks=(1000, 1000)) # doctest: +SKIP
If your underlying datastore does not support concurrent reads then include the lock=True keyword argument, or lock=mylock if you want multiple arrays to coordinate around the same lock.
>>> a = da.from_array(x, chunks=(1000, 1000), lock=True) # doctest: +SKIP
If your underlying datastore has a .chunks attribute (as h5py and zarr datasets do) then a multiple of that chunk shape will be used if you do not provide a chunk shape.
>>> a = da.from_array(x, chunks='auto') # doctest: +SKIP
>>> a = da.from_array(x, chunks='100 MiB') # doctest: +SKIP
>>> a = da.from_array(x) # doctest: +SKIP
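The accepted forms of the chunks argument can be compared side by side. The following is a minimal sketch that assumes a small in-memory NumPy array stands in for the datastore; the variable names are illustrative only.

```python
import numpy as np
import dask.array as da

# A small in-memory array stands in for a real datastore (assumption for illustration).
x = np.arange(16).reshape(4, 4)

by_blocksize = da.from_array(x, chunks=2)            # one block size for every dimension
by_blockshape = da.from_array(x, chunks=(2, 4))      # one block size per dimension
explicit = da.from_array(x, chunks=((3, 1), (4,)))   # explicit sizes of all blocks
whole_axis = da.from_array(x, chunks=(2, -1))        # -1 spans the full dimension

for label, arr in [
    ("blocksize", by_blocksize),
    ("blockshape", by_blockshape),
    ("explicit", explicit),
    ("whole axis", whole_axis),
]:
    print(label, arr.chunks)
```

The .chunks attribute always reports the explicit form, so it is a convenient way to check how a shorthand was expanded.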
-
dask.array.
from_delayed
(value, shape, dtype=None, meta=None, name=None) Create a dask array from a dask delayed value
This routine is useful for constructing dask arrays in an ad-hoc fashion using dask delayed, particularly when combined with stack and concatenate.
The dask array will consist of a single chunk.
Examples
>>> import numpy as np
>>> import dask
>>> import dask.array as da
>>> value = dask.delayed(np.ones)(5)
>>> array = da.from_delayed(value, (5,), dtype=float)
>>> array
dask.array<from-value, shape=(5,), dtype=float64, chunksize=(5,), chunktype=numpy.ndarray>
>>> array.compute()
array([1., 1., 1., 1., 1.])
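Because each from_delayed array is a single chunk, larger arrays are usually assembled with stack or concatenate. The sketch below assumes a hypothetical load_block helper that produces one block per call; in practice it might read a file.

```python
import numpy as np
import dask
import dask.array as da


def load_block(i):
    # Hypothetical loader used only for illustration: returns one row of data.
    return np.full(4, i, dtype=float)


# Wrap each lazy call as a single-chunk dask array of known shape and dtype.
blocks = [
    da.from_delayed(dask.delayed(load_block)(i), shape=(4,), dtype=float)
    for i in range(3)
]

stacked = da.stack(blocks)   # shape (3, 4), one chunk per delayed value
result = stacked.compute()
print(stacked.chunks)
print(result)
```

The shape and dtype passed to from_delayed are taken on trust, so they should match what the delayed function actually returns.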
-
dask.array.
from_npy_stack
(dirname, mmap_mode='r')¶ Load dask array from stack of npy files
See da.to_npy_stack for the full docstring.
Parameters: dirname: string
Directory of .npy files
mmap_mode: (None or ‘r’)
Read data in memory map mode
-
dask.array.
from_zarr
(url, component=None, storage_options=None, chunks=None, name=None, **kwargs)¶ Load array from the zarr storage format
See https://zarr.readthedocs.io for details about the format.
Parameters: url: Zarr Array or str or MutableMapping
Location of the data. A URL can include a protocol specifier like s3:// for remote data. Can also be any MutableMapping instance, which should be serializable if used in multiple processes.
component: str or None
If the location is a zarr group rather than an array, this is the subcomponent that should be loaded, something like 'foo/bar'.
storage_options: dict
Any additional parameters for the storage backend (ignored for local paths)
chunks: tuple of ints or tuples of ints
Passed to da.from_array; allows setting the chunks on initialisation, if the chunking scheme in the on-disc dataset is not optimal for the calculations to follow.
name : str, optional
An optional keyname for the array. Defaults to hashing the input
kwargs: passed to ``zarr.Array``.
-
dask.array.
from_tiledb
(uri, attribute=None, chunks=None, storage_options=None, **kwargs)¶ Load array from the TileDB storage format
See https://docs.tiledb.io for more information about TileDB.
Parameters: uri: TileDB array or str
Location of the data to load
attribute: str or None
Attribute selection (single-attribute view on multi-attribute array)
Returns: A Dask Array
Examples
>>> # create a tiledb array
>>> import tiledb, numpy as np, tempfile # doctest: +SKIP
>>> uri = tempfile.NamedTemporaryFile().name # doctest: +SKIP
>>> tiledb.from_numpy(uri, np.arange(0,9).reshape(3,3)) # doctest: +SKIP
<tiledb.libtiledb.DenseArray object at 0x...>
>>> # read back the array
>>> import dask.array as da # doctest: +SKIP
>>> tdb_ar = da.from_tiledb(uri) # doctest: +SKIP
>>> tdb_ar.shape # doctest: +SKIP
(3, 3)
>>> tdb_ar.mean().compute() # doctest: +SKIP
4.0
-
dask.array.
store
(sources, targets, lock=True, regions=None, compute=True, return_stored=False, **kwargs) Store dask arrays in array-like objects, overwrite data in target
This stores dask arrays into objects that support numpy-style setitem indexing. It stores values chunk by chunk so that it does not have to fill up memory. For best performance you can align the block size of the storage target with the block size of your array.
If your data fits in memory then you may prefer calling np.array(myarray) instead.
Parameters: sources: Array or iterable of Arrays
targets: array-like or Delayed or iterable of array-likes and/or Delayeds
These should support setitem syntax target[10:20] = ...
lock: boolean or threading.Lock, optional
Whether or not to lock the data stores while storing. Pass True (lock each file individually), False (don’t lock) or a particular threading.Lock object to be shared among all writes.
regions: tuple of slices or list of tuples of slices
Each region tuple in regions should be such that target[region].shape = source.shape for the corresponding source and target in sources and targets, respectively. If this is a tuple, the contents will be assumed to be slices, so do not provide a tuple of tuples.
compute: boolean, optional
If true compute immediately, return dask.delayed.Delayed otherwise
return_stored: boolean, optional
Optionally return the stored result (default False).
Examples
>>> x = ... # doctest: +SKIP
>>> import h5py # doctest: +SKIP
>>> f = h5py.File('myfile.hdf5', mode='a') # doctest: +SKIP
>>> dset = f.create_dataset('/data', shape=x.shape,
... chunks=x.chunks,
... dtype='f8') # doctest: +SKIP
>>> store(x, dset) # doctest: +SKIP
Alternatively store many arrays at the same time
>>> store([x, y, z], [dset1, dset2, dset3]) # doctest: +SKIP
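Any object that supports numpy-style setitem can act as a target, so the behaviour of store and of the regions argument can be tried without HDF5. This is a minimal sketch that assumes plain NumPy arrays as targets and the default threaded scheduler, so that writes land in the original arrays.

```python
import numpy as np
import dask.array as da

source = da.from_array(np.arange(16.0).reshape(4, 4), chunks=(2, 2))

# Store into a target of the same shape; each 2x2 block is written separately.
target = np.zeros((4, 4))
da.store(source, target, lock=False)

# Store into a sub-region of a larger target. The region is a tuple of slices
# chosen so that larger[region].shape == source.shape.
larger = np.full((6, 6), -1.0)
region = (slice(1, 5), slice(1, 5))
da.store(source, larger, regions=region, lock=False)

print(target)
print(larger)
```

lock=False is used here only because the in-memory targets need no coordination; file-backed targets that cannot handle concurrent writes should keep the default lock.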
-
dask.array.
to_hdf5
(filename, *args, **kwargs)¶ Store arrays in HDF5 file
This saves several dask arrays into several datapaths in an HDF5 file. It creates the necessary datasets and handles clean file opening/closing.
>>> da.to_hdf5('myfile.hdf5', '/x', x) # doctest: +SKIP
or
>>> da.to_hdf5('myfile.hdf5', {'/x': x, '/y': y}) # doctest: +SKIP
Optionally provide arguments as though to
h5py.File.create_dataset
>>> da.to_hdf5('myfile.hdf5', '/x', x, compression='lzf', shuffle=True) # doctest: +SKIP
This can also be used as a method on a single Array
>>> x.to_hdf5('myfile.hdf5', '/x') # doctest: +SKIP
See also
da.store, h5py.File.create_dataset
-
dask.array.
to_zarr
(arr, url, component=None, storage_options=None, overwrite=False, compute=True, return_stored=False, **kwargs)¶ Save array to the zarr storage format
See https://zarr.readthedocs.io for details about the format.
Parameters: arr: dask.array
Data to store
url: Zarr Array or str or MutableMapping
Location of the data. A URL can include a protocol specifier like s3:// for remote data. Can also be any MutableMapping instance, which should be serializable if used in multiple processes.
component: str or None
If the location is a zarr group rather than an array, this is the subcomponent that should be created/over-written.
storage_options: dict
Any additional parameters for the storage backend (ignored for local paths)
overwrite: bool
If given array already exists, overwrite=False will cause an error, where overwrite=True will replace the existing data. Note that this check is done at computation time, not during graph creation.
compute, return_stored: see ``store()``
kwargs: passed to the ``zarr.create()`` function, e.g., compression options
Raises: ValueError
If arr has unknown chunk sizes, which is not supported by Zarr.
-
dask.array.
to_npy_stack
(dirname, x, axis=0)¶ Write dask array to a stack of .npy files
This partitions the dask.array along one axis and stores each block along that axis as a single .npy file in the specified directory.
Examples
>>> x = da.ones((5, 10, 10), chunks=(2, 4, 4)) # doctest: +SKIP
>>> da.to_npy_stack('data/', x, axis=0) # doctest: +SKIP
The .npy files store numpy arrays for x[0:2], x[2:4], and x[4:5] respectively, as is specified by the chunk size along the zeroth axis:
$ tree data/
data/
|-- 0.npy
|-- 1.npy
|-- 2.npy
|-- info
The info file stores the dtype, chunks, and axis information of the array. You can load these stacks with the da.from_npy_stack function.
>>> y = da.from_npy_stack('data/') # doctest: +SKIP
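The round trip described above can be reproduced in a temporary directory. This sketch assumes a writable temporary location; the directory name "data" is arbitrary.

```python
import os
import tempfile

import dask.array as da

x = da.ones((5, 10, 10), chunks=(2, 4, 4))

with tempfile.TemporaryDirectory() as tmp:
    dirname = os.path.join(tmp, "data")
    da.to_npy_stack(dirname, x, axis=0)

    # One .npy file per block along axis 0, plus the metadata file.
    files = sorted(os.listdir(dirname))

    # Load the stack back lazily and reduce it while the files still exist.
    y = da.from_npy_stack(dirname)
    stack_chunks = y.chunks
    total = float(y.sum().compute())

print(files)
print(stack_chunks)
print(total)
```

Only the chunking along the stacking axis survives the round trip; the other axes come back as a single chunk each because every file holds a full slab.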
-
dask.array.
to_tiledb
(darray, uri, compute=True, return_stored=False, storage_options=None, **kwargs)¶ Save array to the TileDB storage format
Save ‘array’ using the TileDB storage manager, to any TileDB-supported URI, including local disk, S3, or HDFS.
See https://docs.tiledb.io for more information about TileDB.
Parameters: darray: dask.array
A dask array to write.
uri:
Any supported TileDB storage location.
storage_options: dict
Dict containing any configuration options for the TileDB backend. See https://docs.tiledb.io/en/stable/tutorials/config.html
compute, return_stored: see ``store()``
Returns: None
Unless return_stored is set to True (False by default)
Notes
TileDB only supports regularly-chunked arrays. TileDB tile extents correspond to form 2 of the dask chunk specification, and the conversion is done automatically for supported arrays.
Examples
>>> import dask.array as da, tempfile # doctest: +SKIP
>>> uri = tempfile.NamedTemporaryFile().name # doctest: +SKIP
>>> data = da.random.random((5, 5)) # doctest: +SKIP
>>> da.to_tiledb(data, uri) # doctest: +SKIP
>>> import tiledb # doctest: +SKIP
>>> tdb_ar = tiledb.open(uri) # doctest: +SKIP
>>> all(tdb_ar == data) # doctest: +SKIP
True
-
dask.array.fft.
fft_wrap
(fft_func, kind=None, dtype=None)¶ Wrap 1D, 2D, and ND real and complex FFT functions
Takes a function that behaves like one of the numpy.fft functions, together with a kind that names the numpy.fft function it corresponds to.
Supported kinds include:
- fft
- fft2
- fftn
- ifft
- ifft2
- ifftn
- rfft
- rfft2
- rfftn
- irfft
- irfft2
- irfftn
- hfft
- ihfft
Examples
>>> parallel_fft = fft_wrap(np.fft.fft)
>>> parallel_ifft = fft_wrap(np.fft.ifft)
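A wrapped function can then be applied to a dask array like any other dask.array.fft function. The sketch below assumes NumPy's own FFT routines as the wrapped functions and passes kind explicitly in the second call to show where it goes.

```python
import numpy as np
import dask.array as da

# Kind is inferred from the function name when it is not given.
parallel_fft = da.fft.fft_wrap(np.fft.fft)
# Kind can also be stated explicitly.
parallel_rfft = da.fft.fft_wrap(np.fft.rfft, kind="rfft")

data = np.arange(8.0)
x = da.from_array(data, chunks=8)   # the transformed axis is a single chunk

full = parallel_fft(x).compute()
half = parallel_rfft(x).compute()
print(full.shape, half.shape)
```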
-
dask.array.fft.
fft
(a, n=None, axis=None)¶ Wrapping of numpy.fft.fft
The axis along which the FFT is applied must consist of a single chunk. To change the array’s chunking use dask.array.Array.rechunk.
The numpy.fft.fft docstring follows below:
Compute the one-dimensional discrete Fourier Transform.
This function computes the one-dimensional n-point discrete Fourier Transform (DFT) with the efficient Fast Fourier Transform (FFT) algorithm [CT].
Parameters: a : array_like
Input array, can be complex.
n : int, optional
Length of the transformed axis of the output. If n is smaller than the length of the input, the input is cropped. If it is larger, the input is padded with zeros. If n is not given, the length of the input along the axis specified by axis is used.
axis : int, optional
Axis over which to compute the FFT. If not given, the last axis is used.
norm : {None, “ortho”}, optional
New in version 1.10.0.
Normalization mode (see numpy.fft). Default is None.
Returns: out : complex ndarray
The truncated or zero-padded input, transformed along the axis indicated by axis, or the last one if axis is not specified.
Raises: IndexError
if axes is larger than the last axis of a.
Notes
FFT (Fast Fourier Transform) refers to a way the discrete Fourier Transform (DFT) can be calculated efficiently, by using symmetries in the calculated terms. The symmetry is highest when n is a power of 2, and the transform is therefore most efficient for these sizes.
The DFT is defined, with the conventions used in this implementation, in the documentation for the numpy.fft module.
References
[CT] Cooley, James W., and John W. Tukey, 1965, “An algorithm for the machine calculation of complex Fourier series,” Math. Comput. 19: 297-301.
Examples
>>> np.fft.fft(np.exp(2j * np.pi * np.arange(8) / 8))
array([-2.33486982e-16+1.14423775e-17j, 8.00000000e+00-1.25557246e-15j,
2.33486982e-16+2.33486982e-16j, 0.00000000e+00+1.22464680e-16j,
-1.14423775e-17+2.33486982e-16j, 0.00000000e+00+5.20784380e-16j,
1.14423775e-17+1.14423775e-17j, 0.00000000e+00+1.22464680e-16j])
In this example, real input has an FFT which is Hermitian, i.e., symmetric in the real part and anti-symmetric in the imaginary part, as described in the numpy.fft documentation:
>>> import matplotlib.pyplot as plt
>>> t = np.arange(256)
>>> sp = np.fft.fft(np.sin(t))
>>> freq = np.fft.fftfreq(t.shape[-1])
>>> plt.plot(freq, sp.real, freq, sp.imag)
[<matplotlib.lines.Line2D object at 0x...>, <matplotlib.lines.Line2D object at 0x...>]
>>> plt.show()
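The single-chunk requirement on the transformed axis applies to the dask wrapper, not to NumPy. A minimal sketch, assuming a small array deliberately split along the axis to be transformed:

```python
import numpy as np
import dask.array as da

data = np.arange(32.0).reshape(4, 8)
x = da.from_array(data, chunks=(2, 4))   # axis 1 is split into two chunks

# Transforming along a multi-chunk axis is rejected.
try:
    da.fft.fft(x, axis=1)
    raised = False
except ValueError:
    raised = True

# Collapse axis 1 into one chunk; axis 0 keeps its chunking.
single = x.rechunk({1: -1})
result = da.fft.fft(single, axis=1).compute()

print(raised, single.chunks)
```

Axes that are not transformed may stay chunked, so each block along axis 0 is transformed independently.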
-
dask.array.fft.
fft2
(a, s=None, axes=None)¶ Wrapping of numpy.fft.fft2
The axes along which the FFT is applied must each consist of a single chunk. To change the array’s chunking use dask.array.Array.rechunk.
The numpy.fft.fft2 docstring follows below:
Compute the 2-dimensional discrete Fourier Transform
This function computes the n-dimensional discrete Fourier Transform over any axes in an M-dimensional array by means of the Fast Fourier Transform (FFT). By default, the transform is computed over the last two axes of the input array, i.e., a 2-dimensional FFT.
Parameters: a : array_like
Input array, can be complex
s : sequence of ints, optional
Shape (length of each transformed axis) of the output (s[0] refers to axis 0, s[1] to axis 1, etc.). This corresponds to n for fft(x, n). Along each axis, if the given shape is smaller than that of the input, the input is cropped. If it is larger, the input is padded with zeros. If s is not given, the shape of the input along the axes specified by axes is used.
axes : sequence of ints, optional
Axes over which to compute the FFT. If not given, the last two axes are used. A repeated index in axes means the transform over that axis is performed multiple times. A one-element sequence means that a one-dimensional FFT is performed.
norm : {None, “ortho”}, optional
New in version 1.10.0.
Normalization mode (see numpy.fft). Default is None.
Returns: out : complex ndarray
The truncated or zero-padded input, transformed along the axes indicated by axes, or the last two axes if axes is not given.
Raises: ValueError
If s and axes have different length, or axes not given and len(s) != 2.
IndexError
If an element of axes is larger than the number of axes of a.
See also
numpy.fft
- Overall view of discrete Fourier transforms, with definitions and conventions used.
ifft2
- The inverse two-dimensional FFT.
fft
- The one-dimensional FFT.
fftn
- The n-dimensional FFT.
fftshift
- Shifts zero-frequency terms to the center of the array. For two-dimensional input, swaps first and third quadrants, and second and fourth quadrants.
Notes
fft2 is just fftn with a different default for axes.
The output, analogously to fft, contains the term for zero frequency in the low-order corner of the transformed axes, the positive frequency terms in the first half of these axes, the term for the Nyquist frequency in the middle of the axes and the negative frequency terms in the second half of the axes, in order of decreasingly negative frequency.
See fftn for details and a plotting example, and numpy.fft for definitions and conventions used.
Examples
>>> a = np.mgrid[:5, :5][0]
>>> np.fft.fft2(a)
array([[ 50. +0.j , 0. +0.j , 0. +0.j , # may vary
0. +0.j , 0. +0.j ],
[-12.5+17.20477401j, 0. +0.j , 0. +0.j ,
0. +0.j , 0. +0.j ],
[-12.5 +4.0614962j , 0. +0.j , 0. +0.j ,
0. +0.j , 0. +0.j ],
[-12.5 -4.0614962j , 0. +0.j , 0. +0.j ,
0. +0.j , 0. +0.j ],
[-12.5-17.20477401j, 0. +0.j , 0. +0.j ,
0. +0.j , 0. +0.j ]])
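The statement that fft2 is fftn with a different default for axes can be checked directly with NumPy. This is a small sketch; the 3-D test array is arbitrary.

```python
import numpy as np

# The array from the example above: every entry of row i equals i.
a = np.mgrid[:5, :5][0]
zero_frequency = np.fft.fft2(a)[0, 0]   # the zero-frequency term is the sum of all entries

# On a 3-D array, fft2 transforms only the last two axes.
b = np.arange(60.0).reshape(3, 4, 5)
via_fft2 = np.fft.fft2(b)
via_fftn = np.fft.fftn(b, axes=(-2, -1))

print(zero_frequency)
print(np.allclose(via_fft2, via_fftn))
```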
-
dask.array.fft.
fftn
(a, s=None, axes=None)¶ Wrapping of numpy.fft.fftn
The axes along which the FFT is applied must each consist of a single chunk. To change the array’s chunking use dask.array.Array.rechunk.
The numpy.fft.fftn docstring follows below:
Compute the N-dimensional discrete Fourier Transform.
This function computes the N-dimensional discrete Fourier Transform over any number of axes in an M-dimensional array by means of the Fast Fourier Transform (FFT).
Parameters: a : array_like
Input array, can be complex.
s : sequence of ints, optional
Shape (length of each transformed axis) of the output (s[0] refers to axis 0, s[1] to axis 1, etc.). This corresponds to n for fft(x, n). Along any axis, if the given shape is smaller than that of the input, the input is cropped. If it is larger, the input is padded with zeros. If s is not given, the shape of the input along the axes specified by axes is used.
axes : sequence of ints, optional
Axes over which to compute the FFT. If not given, the last len(s) axes are used, or all axes if s is also not specified. Repeated indices in axes means that the transform over that axis is performed multiple times.
norm : {None, “ortho”}, optional
New in version 1.10.0.
Normalization mode (see numpy.fft). Default is None.
Returns: out : complex ndarray
The truncated or zero-padded input, transformed along the axes indicated by axes, or by a combination of s and a, as explained in the parameters section above.
Raises: ValueError
If s and axes have different length.
IndexError
If an element of axes is larger than the number of axes of a.
See also
numpy.fft
- Overall view of discrete Fourier transforms, with definitions and conventions used.
ifftn
- The inverse of fftn, the inverse n-dimensional FFT.
fft
- The one-dimensional FFT, with definitions and conventions used.
rfftn
- The n-dimensional FFT of real input.
fft2
- The two-dimensional FFT.
fftshift
- Shifts zero-frequency terms to centre of array
Notes
The output, analogously to fft, contains the term for zero frequency in the low-order corner of all axes, the positive frequency terms in the first half of all axes, the term for the Nyquist frequency in the middle of all axes and the negative frequency terms in the second half of all axes, in order of decreasingly negative frequency.
See numpy.fft for details, definitions and conventions used.
Examples
>>> a = np.mgrid[:3, :3, :3][0]
>>> np.fft.fftn(a, axes=(1, 2))
array([[[ 0.+0.j, 0.+0.j, 0.+0.j], # may vary
[ 0.+0.j, 0.+0.j, 0.+0.j],
[ 0.+0.j, 0.+0.j, 0.+0.j]],
[[ 9.+0.j, 0.+0.j, 0.+0.j],
[ 0.+0.j, 0.+0.j, 0.+0.j],
[ 0.+0.j, 0.+0.j, 0.+0.j]],
[[18.+0.j, 0.+0.j, 0.+0.j],
[ 0.+0.j, 0.+0.j, 0.+0.j],
[ 0.+0.j, 0.+0.j, 0.+0.j]]])
>>> np.fft.fftn(a, (2, 2), axes=(0, 1))
array([[[ 2.+0.j, 2.+0.j, 2.+0.j], # may vary
[ 0.+0.j, 0.+0.j, 0.+0.j]],
[[-2.+0.j, -2.+0.j, -2.+0.j],
[ 0.+0.j, 0.+0.j, 0.+0.j]]])
>>> import matplotlib.pyplot as plt
>>> [X, Y] = np.meshgrid(2 * np.pi * np.arange(200) / 12,
... 2 * np.pi * np.arange(200) / 34)
>>> S = np.sin(X) + np.cos(Y) + np.random.uniform(0, 1, X.shape)
>>> FS = np.fft.fftn(S)
>>> plt.imshow(np.log(np.abs(np.fft.fftshift(FS))**2))
<matplotlib.image.AxesImage object at 0x...>
>>> plt.show()
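The cropping and zero-padding controlled by s changes the output shape only along the transformed axes. A short sketch using NumPy, with s and axes always passed together:

```python
import numpy as np

a = np.mgrid[:3, :3, :3][0]   # shape (3, 3, 3)

# s smaller than the input: axes 0 and 1 are cropped to length 2.
cropped = np.fft.fftn(a, s=(2, 2), axes=(0, 1))

# s larger than the input: axes 0 and 1 are zero-padded to length 4.
padded = np.fft.fftn(a, s=(4, 4), axes=(0, 1))

# Padding by hand before transforming gives the same result.
manual = np.fft.fftn(np.pad(a, ((0, 1), (0, 1), (0, 0))), axes=(0, 1))

print(cropped.shape, padded.shape)
```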
-
dask.array.fft.
ifft
(a, n=None, axis=None)¶ Wrapping of numpy.fft.ifft
The axis along which the FFT is applied must consist of a single chunk. To change the array’s chunking use dask.array.Array.rechunk.
The numpy.fft.ifft docstring follows below:
Compute the one-dimensional inverse discrete Fourier Transform.
This function computes the inverse of the one-dimensional n-point discrete Fourier transform computed by fft. In other words, ifft(fft(a)) == a to within numerical accuracy. For a general description of the algorithm and definitions, see numpy.fft.
The input should be ordered in the same way as is returned by fft, i.e.,
- a[0] should contain the zero frequency term,
- a[1:n//2] should contain the positive-frequency terms,
- a[n//2 + 1:] should contain the negative-frequency terms, in increasing order starting from the most negative frequency.
For an even number of input points, A[n//2] represents the sum of the values at the positive and negative Nyquist frequencies, as the two are aliased together. See numpy.fft for details.
Parameters: a : array_like
Input array, can be complex.
n : int, optional
Length of the transformed axis of the output. If n is smaller than the length of the input, the input is cropped. If it is larger, the input is padded with zeros. If n is not given, the length of the input along the axis specified by axis is used. See notes about padding issues.
axis : int, optional
Axis over which to compute the inverse DFT. If not given, the last axis is used.
norm : {None, “ortho”}, optional
New in version 1.10.0.
Normalization mode (see numpy.fft). Default is None.
Returns: out : complex ndarray
The truncated or zero-padded input, transformed along the axis indicated by axis, or the last one if axis is not specified.
Raises: IndexError
If axes is larger than the last axis of a.
Notes
If the input parameter n is larger than the size of the input, the input is padded by appending zeros at the end. Even though this is the common approach, it might lead to surprising results. If a different padding is desired, it must be performed before calling ifft.
Examples
>>> np.fft.ifft([0, 4, 0, 0])
array([ 1.+0.j, 0.+1.j, -1.+0.j, 0.-1.j]) # may vary
Create and plot a band-limited signal with random phases:
>>> import matplotlib.pyplot as plt
>>> t = np.arange(400)
>>> n = np.zeros((400,), dtype=complex)
>>> n[40:60] = np.exp(1j*np.random.uniform(0, 2*np.pi, (20,)))
>>> s = np.fft.ifft(n)
>>> plt.plot(t, s.real, 'b-', t, s.imag, 'r--')
[<matplotlib.lines.Line2D object at ...>, <matplotlib.lines.Line2D object at ...>]
>>> plt.legend(('real', 'imaginary'))
<matplotlib.legend.Legend object at ...>
>>> plt.show()
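The input ordering expected by ifft is the one reported by np.fft.fftfreq, and the round trip through fft recovers the input. A minimal sketch with NumPy, assuming an arbitrary seeded random signal:

```python
import numpy as np

n = 8
freqs = np.fft.fftfreq(n)   # frequency of each bin, in cycles per sample

zero_term = freqs[0]            # a[0]: zero frequency
positive = freqs[1 : n // 2]    # a[1:n//2]: positive frequencies
nyquist = freqs[n // 2]         # a[n//2]: Nyquist bin (even n only)
negative = freqs[n // 2 + 1 :]  # a[n//2 + 1:]: negative frequencies, most negative first

rng = np.random.default_rng(0)
a = rng.standard_normal(n) + 1j * rng.standard_normal(n)
roundtrip = np.fft.ifft(np.fft.fft(a))

print(freqs)
print(np.allclose(roundtrip, a))
```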
-
dask.array.fft.
ifft2
(a, s=None, axes=None)¶ Wrapping of numpy.fft.ifft2
The axes along which the FFT is applied must each consist of a single chunk. To change the array’s chunking use dask.array.Array.rechunk.
The numpy.fft.ifft2 docstring follows below:
Compute the 2-dimensional inverse discrete Fourier Transform.
This function computes the inverse of the 2-dimensional discrete Fourier Transform over any number of axes in an M-dimensional array by means of the Fast Fourier Transform (FFT). In other words, ifft2(fft2(a)) == a to within numerical accuracy. By default, the inverse transform is computed over the last two axes of the input array.
The input, analogously to ifft, should be ordered in the same way as is returned by fft2, i.e. it should have the term for zero frequency in the low-order corner of the two axes, the positive frequency terms in the first half of these axes, the term for the Nyquist frequency in the middle of the axes and the negative frequency terms in the second half of both axes, in order of decreasingly negative frequency.
Parameters: a : array_like
Input array, can be complex.
s : sequence of ints, optional
Shape (length of each axis) of the output (s[0] refers to axis 0, s[1] to axis 1, etc.). This corresponds to n for ifft(x, n). Along each axis, if the given shape is smaller than that of the input, the input is cropped. If it is larger, the input is padded with zeros. If s is not given, the shape of the input along the axes specified by axes is used. See notes for issue on ifft zero padding.
axes : sequence of ints, optional
Axes over which to compute the FFT. If not given, the last two axes are used. A repeated index in axes means the transform over that axis is performed multiple times. A one-element sequence means that a one-dimensional FFT is performed.
norm : {None, “ortho”}, optional
New in version 1.10.0.
Normalization mode (see numpy.fft). Default is None.
Returns: out : complex ndarray
The truncated or zero-padded input, transformed along the axes indicated by axes, or the last two axes if axes is not given.
Raises: ValueError
If s and axes have different length, or axes not given and len(s) != 2.
IndexError
If an element of axes is larger than the number of axes of a.
Notes
ifft2 is just ifftn with a different default for axes.
See ifftn for details and a plotting example, and numpy.fft for definition and conventions used.
Zero-padding, analogously with ifft, is performed by appending zeros to the input along the specified dimension. Although this is the common approach, it might lead to surprising results. If another form of zero padding is desired, it must be performed before ifft2 is called.
Examples
>>> a = 4 * np.eye(4)
>>> np.fft.ifft2(a)
array([[1.+0.j, 0.+0.j, 0.+0.j, 0.+0.j], # may vary
[0.+0.j, 0.+0.j, 0.+0.j, 1.+0.j],
[0.+0.j, 0.+0.j, 1.+0.j, 0.+0.j],
[0.+0.j, 1.+0.j, 0.+0.j, 0.+0.j]])
-
dask.array.fft.
ifftn
(a, s=None, axes=None)¶ Wrapping of numpy.fft.ifftn
The axes along which the FFT is applied must each consist of a single chunk. To change the array’s chunking use dask.array.Array.rechunk.
The numpy.fft.ifftn docstring follows below:
Compute the N-dimensional inverse discrete Fourier Transform.
This function computes the inverse of the N-dimensional discrete Fourier Transform over any number of axes in an M-dimensional array by means of the Fast Fourier Transform (FFT). In other words, ifftn(fftn(a)) == a to within numerical accuracy. For a description of the definitions and conventions used, see numpy.fft.
The input, analogously to ifft, should be ordered in the same way as is returned by fftn, i.e. it should have the term for zero frequency in all axes in the low-order corner, the positive frequency terms in the first half of all axes, the term for the Nyquist frequency in the middle of all axes and the negative frequency terms in the second half of all axes, in order of decreasingly negative frequency.
Parameters: a : array_like
Input array, can be complex.
s : sequence of ints, optional
Shape (length of each transformed axis) of the output (s[0] refers to axis 0, s[1] to axis 1, etc.). This corresponds to n for ifft(x, n). Along any axis, if the given shape is smaller than that of the input, the input is cropped. If it is larger, the input is padded with zeros. If s is not given, the shape of the input along the axes specified by axes is used. See notes for issue on ifft zero padding.
axes : sequence of ints, optional
Axes over which to compute the IFFT. If not given, the last len(s) axes are used, or all axes if s is also not specified. Repeated indices in axes means that the inverse transform over that axis is performed multiple times.
norm : {None, “ortho”}, optional
New in version 1.10.0.
Normalization mode (see numpy.fft). Default is None.
Returns: out : complex ndarray
The truncated or zero-padded input, transformed along the axes indicated by axes, or by a combination of s or a, as explained in the parameters section above.
Raises: ValueError
If s and axes have different length.
IndexError
If an element of axes is larger than the number of axes of a.
See also
numpy.fft
- Overall view of discrete Fourier transforms, with definitions and conventions used.
fftn
- The forward n-dimensional FFT, of which ifftn is the inverse.
ifft
- The one-dimensional inverse FFT.
ifft2
- The two-dimensional inverse FFT.
ifftshift
- Undoes fftshift, shifts zero-frequency terms to beginning of array.
Notes
See numpy.fft for definitions and conventions used.
Zero-padding, analogously with ifft, is performed by appending zeros to the input along the specified dimension. Although this is the common approach, it might lead to surprising results. If another form of zero padding is desired, it must be performed before ifftn is called.
Examples
>>> a = np.eye(4)
>>> np.fft.ifftn(np.fft.fftn(a, axes=(0,)), axes=(1,))
array([[1.+0.j, 0.+0.j, 0.+0.j, 0.+0.j], # may vary
[0.+0.j, 1.+0.j, 0.+0.j, 0.+0.j],
[0.+0.j, 0.+0.j, 1.+0.j, 0.+0.j],
[0.+0.j, 0.+0.j, 0.+0.j, 1.+0.j]])
Create and plot an image with band-limited frequency content:
>>> import matplotlib.pyplot as plt
>>> n = np.zeros((200,200), dtype=complex)
>>> n[60:80, 20:40] = np.exp(1j*np.random.uniform(0, 2*np.pi, (20, 20)))
>>> im = np.fft.ifftn(n).real
>>> plt.imshow(im)
<matplotlib.image.AxesImage object at 0x...>
>>> plt.show()
-
dask.array.fft.
rfft
(a, n=None, axis=None)¶ Wrapping of numpy.fft.rfft
The axis along which the FFT is applied must consist of a single chunk. To change the array’s chunking use dask.array.Array.rechunk.
The numpy.fft.rfft docstring follows below:
Compute the one-dimensional discrete Fourier Transform for real input.
This function computes the one-dimensional n-point discrete Fourier Transform (DFT) of a real-valued array by means of an efficient algorithm called the Fast Fourier Transform (FFT).
Parameters: a : array_like
Input array
n : int, optional
Number of points along transformation axis in the input to use. If n is smaller than the length of the input, the input is cropped. If it is larger, the input is padded with zeros. If n is not given, the length of the input along the axis specified by axis is used.
axis : int, optional
Axis over which to compute the FFT. If not given, the last axis is used.
norm : {None, “ortho”}, optional
New in version 1.10.0.
Normalization mode (see numpy.fft). Default is None.
Returns: out : complex ndarray
The truncated or zero-padded input, transformed along the axis indicated by axis, or the last one if axis is not specified. If n is even, the length of the transformed axis is (n/2)+1. If n is odd, the length is (n+1)/2.
Raises: IndexError
If axis is larger than the last axis of a.
Notes
When the DFT is computed for purely real input, the output is Hermitian-symmetric, i.e. the negative frequency terms are just the complex conjugates of the corresponding positive-frequency terms, and the negative-frequency terms are therefore redundant. This function does not compute the negative frequency terms, and the length of the transformed axis of the output is therefore n//2 + 1.
When A = rfft(a) and fs is the sampling frequency, A[0] contains the zero-frequency term 0*fs, which is real due to Hermitian symmetry.
If n is even, A[-1] contains the term representing both positive and negative Nyquist frequency (+fs/2 and -fs/2), and must also be purely real. If n is odd, there is no term at fs/2; A[-1] contains the largest positive frequency (fs/2*(n-1)/n), and is complex in the general case.
If the input a contains an imaginary part, it is silently discarded.
Examples
>>> np.fft.fft([0, 1, 0, 0])
array([ 1.+0.j, 0.-1.j, -1.+0.j, 0.+1.j]) # may vary
>>> np.fft.rfft([0, 1, 0, 0])
array([ 1.+0.j, 0.-1.j, -1.+0.j]) # may vary
Notice how the final element of the fft output is the complex conjugate of the second element, for real input. For rfft, this symmetry is exploited to compute only the non-negative frequency terms.
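The output length n//2 + 1 and the Hermitian symmetry that rfft exploits can be verified for both even and odd n. A small sketch with NumPy, assuming a seeded random real signal:

```python
import numpy as np

rng = np.random.default_rng(0)
lengths = {}
symmetric = {}

for n in (8, 9):
    a = rng.standard_normal(n)   # purely real input
    full = np.fft.fft(a)
    half = np.fft.rfft(a)

    lengths[n] = len(half)
    # rfft returns the first n//2 + 1 terms of the full transform.
    assert np.allclose(half, full[: n // 2 + 1])
    # Hermitian symmetry: full[n - k] is the complex conjugate of full[k].
    symmetric[n] = np.allclose(full[1:][::-1], np.conj(full[1:]))

# For even n the last rfft term is the Nyquist term and is purely real.
nyquist_imag = np.fft.rfft(rng.standard_normal(8))[-1].imag

print(lengths, symmetric)
```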
-
dask.array.fft.
rfft2
(a, s=None, axes=None)¶ Wrapping of numpy.fft.rfft2
The axes along which the FFT is applied must each consist of a single chunk. To change the array’s chunking use dask.array.Array.rechunk.
The numpy.fft.rfft2 docstring follows below:
Compute the 2-dimensional FFT of a real array.
Parameters: a : array
Input array, taken to be real.
s : sequence of ints, optional
Shape of the FFT.
axes : sequence of ints, optional
Axes over which to compute the FFT.
norm : {None, “ortho”}, optional
New in version 1.10.0.
Normalization mode (see numpy.fft). Default is None.
Returns: out : ndarray
The result of the real 2-D FFT.
See also
rfftn
- Compute the N-dimensional discrete Fourier Transform for real input.
Notes
This is really just rfftn with different default behavior. For more details see rfftn.
-
dask.array.fft.
rfftn
(a, s=None, axes=None)¶ Wrapping of numpy.fft.rfftn
The axes along which the FFT is applied must each consist of a single chunk. To change the array’s chunking use dask.array.Array.rechunk.
The numpy.fft.rfftn docstring follows below:
Compute the N-dimensional discrete Fourier Transform for real input.
This function computes the N-dimensional discrete Fourier Transform over any number of axes in an M-dimensional real array by means of the Fast Fourier Transform (FFT). By default, all axes are transformed, with the real transform performed over the last axis, while the remaining transforms are complex.
Parameters: a : array_like
Input array, taken to be real.
s : sequence of ints, optional
Shape (length along each transformed axis) to use from the input (s[0] refers to axis 0, s[1] to axis 1, etc.). The final element of s corresponds to n for rfft(x, n), while for the remaining axes, it corresponds to n for fft(x, n). Along any axis, if the given shape is smaller than that of the input, the input is cropped. If it is larger, the input is padded with zeros. If s is not given, the shape of the input along the axes specified by axes is used.
axes : sequence of ints, optional
Axes over which to compute the FFT. If not given, the last len(s) axes are used, or all axes if s is also not specified.
norm : {None, “ortho”}, optional
New in version 1.10.0.
Normalization mode (see numpy.fft). Default is None.
Returns: out : complex ndarray
The truncated or zero-padded input, transformed along the axes indicated by axes, or by a combination of s and a, as explained in the parameters section above. The length of the last axis transformed will be s[-1]//2+1, while the remaining transformed axes will have lengths according to s, or unchanged from the input.
Raises: ValueError
If s and axes have different length.
IndexError
If an element of axes is larger than the number of axes of a.
Notes
The transform for real input is performed over the last transformation axis, as by rfft, then the transform over the remaining axes is performed as by fftn. The order of the output is as for rfft for the final transformation axis, and as for fftn for the remaining transformation axes.
See fft for details, definitions and conventions used.
Examples
>>> a = np.ones((2, 2, 2))
>>> np.fft.rfftn(a)
array([[[8.+0.j, 0.+0.j], # may vary
        [0.+0.j, 0.+0.j]],
       [[0.+0.j, 0.+0.j],
        [0.+0.j, 0.+0.j]]])
>>> np.fft.rfftn(a, axes=(2, 0))
array([[[4.+0.j, 0.+0.j], # may vary
        [4.+0.j, 0.+0.j]],
       [[0.+0.j, 0.+0.j],
        [0.+0.j, 0.+0.j]]])
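The output-shape rule from the Returns section (last transformed axis of length s[-1]//2+1, other axes following s) can be checked with a small NumPy sketch. The input values and the choice of s and axes are illustrative assumptions.

```python
import numpy as np

a = np.arange(24, dtype=float).reshape(2, 3, 4)

# All axes transformed; only the last axis is halved to 4 // 2 + 1 = 3.
full = np.fft.rfftn(a)

# Transform axes 1 and 2 only, zero-padding them to lengths 4 and 6.
# The last transformed axis then has length 6 // 2 + 1 = 4.
padded = np.fft.rfftn(a, s=(4, 6), axes=(1, 2))

# rfftn keeps the non-negative-frequency half of the fftn result
# along the last transformed axis.
matches_fftn = np.allclose(full, np.fft.fftn(a)[..., :3])

print(full.shape, padded.shape, matches_fftn)
```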
-
dask.array.fft.irfft(a, n=None, axis=None)
Wrapping of numpy.fft.irfft
The axis along which the FFT is applied must have only one chunk. To change the array’s chunking use dask.Array.rechunk.
The numpy.fft.irfft docstring follows below:
Compute the inverse of the n-point DFT for real input.
This function computes the inverse of the one-dimensional n-point discrete Fourier Transform of real input computed by rfft. In other words, irfft(rfft(a), len(a)) == a to within numerical accuracy. (See Notes below for why len(a) is necessary here.)
The input is expected to be in the form returned by rfft, i.e. the real zero-frequency term followed by the complex positive frequency terms in order of increasing frequency. Since the discrete Fourier Transform of real input is Hermitian-symmetric, the negative frequency terms are taken to be the complex conjugates of the corresponding positive frequency terms.
Parameters: a : array_like
The input array.
n : int, optional
Length of the transformed axis of the output. For n output points, n//2+1 input points are necessary. If the input is longer than this, it is cropped. If it is shorter than this, it is padded with zeros. If n is not given, it is taken to be 2*(m-1) where m is the length of the input along the axis specified by axis.
axis : int, optional
Axis over which to compute the inverse FFT. If not given, the last axis is used.
norm : {None, “ortho”}, optional
New in version 1.10.0.
Normalization mode (see numpy.fft). Default is None.
Returns: out : ndarray
The truncated or zero-padded input, transformed along the axis indicated by axis, or the last one if axis is not specified. The length of the transformed axis is n, or, if n is not given, 2*(m-1) where m is the length of the transformed axis of the input. To get an odd number of output points, n must be specified.
Raises: IndexError
If axis is larger than the last axis of a.
Notes
Returns the real valued n-point inverse discrete Fourier transform of a, where a contains the non-negative frequency terms of a Hermitian-symmetric sequence. n is the length of the result, not the input.
If you specify an n such that a must be zero-padded or truncated, the extra/removed values will be added/removed at high frequencies. One can thus resample a series to m points via Fourier interpolation by: a_resamp = irfft(rfft(a), m).
The correct interpretation of the Hermitian input depends on the length of the original data, as given by n. This is because each input shape could correspond to either an odd or even length signal. By default, irfft assumes an even output length, which puts the last entry at the Nyquist frequency, where it aliases with its symmetric counterpart. By Hermitian symmetry, the value is thus treated as purely real. To avoid losing information, the correct length of the real input must be given.
Examples
>>> np.fft.ifft([1, -1j, -1, 1j])
array([0.+0.j, 1.+0.j, 0.+0.j, 0.+0.j]) # may vary
>>> np.fft.irfft([1, -1j, -1])
array([0., 1., 0., 0.])
Notice how the last term in the input to the ordinary ifft is the complex conjugate of the second term, and the output has zero imaginary part everywhere. When calling irfft, the negative frequencies are not specified, and the output array is purely real.
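The note about odd output lengths can be made concrete with a short NumPy sketch. The five-point input is an arbitrary example; the point is that the default inverse assumes an even length, so n must be passed to recover an odd-length signal.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # odd-length real signal
spectrum = np.fft.rfft(a)                 # 5 // 2 + 1 = 3 complex terms

# Without n, the output length defaults to 2 * (3 - 1) = 4.
default_inverse = np.fft.irfft(spectrum)

# Passing the original length recovers the input.
exact_inverse = np.fft.irfft(spectrum, n=len(a))

print(default_inverse.shape, exact_inverse.shape)
print(np.allclose(exact_inverse, a))
```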
-
dask.array.fft.irfft2(a, s=None, axes=None)
Wrapping of numpy.fft.irfft2
The axis along which the FFT is applied must have only one chunk. To change the array’s chunking use dask.Array.rechunk.
The numpy.fft.irfft2 docstring follows below:
Compute the 2-dimensional inverse FFT of a real array.
Parameters: a : array_like
The input array
s : sequence of ints, optional
Shape of the real output to the inverse FFT.
axes : sequence of ints, optional
The axes over which to compute the inverse fft. Default is the last two axes.
norm : {None, “ortho”}, optional
New in version 1.10.0.
Normalization mode (see numpy.fft). Default is None.
Returns: out : ndarray
The result of the inverse real 2-D FFT.
See also
irfftn
- Compute the inverse of the N-dimensional FFT of real input.
Notes
This is really irfftn with different defaults. For more details see irfftn.
-
dask.array.fft.irfftn(a, s=None, axes=None)
Wrapping of numpy.fft.irfftn
The axis along which the FFT is applied must have only one chunk. To change the array’s chunking use dask.Array.rechunk.
The numpy.fft.irfftn docstring follows below:
Compute the inverse of the N-dimensional FFT of real input.
This function computes the inverse of the N-dimensional discrete Fourier Transform for real input over any number of axes in an M-dimensional array by means of the Fast Fourier Transform (FFT). In other words, irfftn(rfftn(a), a.shape) == a to within numerical accuracy. (The a.shape is necessary like len(a) is for irfft, and for the same reason.)
The input should be ordered in the same way as is returned by rfftn, i.e. as for irfft for the final transformation axis, and as for ifftn along all the other axes.
Parameters: a : array_like
Input array.
s : sequence of ints, optional
Shape (length of each transformed axis) of the output (s[0] refers to axis 0, s[1] to axis 1, etc.). s is also the number of input points used along this axis, except for the last axis, where s[-1]//2+1 points of the input are used. Along any axis, if the shape indicated by s is smaller than that of the input, the input is cropped. If it is larger, the input is padded with zeros. If s is not given, the shape of the input along the axes specified by axes is used, except for the last axis, which is taken to be 2*(m-1) where m is the length of the input along that axis.
axes : sequence of ints, optional
Axes over which to compute the inverse FFT. If not given, the last len(s) axes are used, or all axes if s is also not specified. Repeated indices in axes means that the inverse transform over that axis is performed multiple times.
norm : {None, “ortho”}, optional
New in version 1.10.0.
Normalization mode (see numpy.fft). Default is None.
Returns: out : ndarray
The truncated or zero-padded input, transformed along the axes indicated by axes, or by a combination of s or a, as explained in the parameters section above. The length of each transformed axis is as given by the corresponding element of s, or the length of the input in every axis except for the last one if s is not given. In the final transformed axis the length of the output when s is not given is 2*(m-1) where m is the length of the final transformed axis of the input. To get an odd number of output points in the final axis, s must be specified.
Raises: ValueError
If s and axes have different length.
IndexError
If an element of axes is larger than the number of axes of a.
Notes
See fft for definitions and conventions used.
See rfft for definitions and conventions used for real input.
The correct interpretation of the Hermitian input depends on the shape of the original data, as given by s. This is because each input shape could correspond to either an odd or even length signal. By default, irfftn assumes an even output length, which puts the last entry at the Nyquist frequency, where it aliases with its symmetric counterpart. When performing the final complex to real transform, the last value is thus treated as purely real. To avoid losing information, the correct shape of the real input must be given.
Examples
>>> a = np.zeros((3, 2, 2))
>>> a[0, 0, 0] = 3 * 2 * 2
>>> np.fft.irfftn(a)
array([[[1., 1.],
        [1., 1.]],
       [[1., 1.],
        [1., 1.]],
       [[1., 1.],
        [1., 1.]]])
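As with irfft, the shape argument matters when the last axis of the original data has odd length. The following NumPy sketch uses an arbitrary random 3x5 array (an assumption for illustration) to show the default and the explicit-shape inverse side by side.

```python
import numpy as np

rng = np.random.RandomState(0)
a = rng.standard_normal((3, 5))      # odd length along the last axis

spectrum = np.fft.rfftn(a)           # shape (3, 5 // 2 + 1) = (3, 3)

# Default: the last axis is assumed to have length 2 * (3 - 1) = 4.
default_inverse = np.fft.irfftn(spectrum)

# Supplying the original shape recovers the input.
exact_inverse = np.fft.irfftn(spectrum, s=a.shape, axes=(0, 1))

print(default_inverse.shape, exact_inverse.shape)
print(np.allclose(exact_inverse, a))
```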
-
dask.array.fft.hfft(a, n=None, axis=None)
Wrapping of numpy.fft.hfft
The axis along which the FFT is applied must have only one chunk. To change the array’s chunking use dask.Array.rechunk.
The numpy.fft.hfft docstring follows below:
Compute the FFT of a signal that has Hermitian symmetry, i.e., a real spectrum.
Parameters: a : array_like
The input array.
n : int, optional
Length of the transformed axis of the output. For n output points, n//2 + 1 input points are necessary. If the input is longer than this, it is cropped. If it is shorter than this, it is padded with zeros. If n is not given, it is taken to be 2*(m-1) where m is the length of the input along the axis specified by axis.
axis : int, optional
Axis over which to compute the FFT. If not given, the last axis is used.
norm : {None, “ortho”}, optional
Normalization mode (see numpy.fft). Default is None.
New in version 1.10.0.
Returns: out : ndarray
The truncated or zero-padded input, transformed along the axis indicated by axis, or the last one if axis is not specified. The length of the transformed axis is n, or, if n is not given, 2*m - 2 where m is the length of the transformed axis of the input. To get an odd number of output points, n must be specified, for instance as 2*m - 1 in the typical case.
Raises: IndexError
If axis is larger than the last axis of a.
Notes
hfft/ihfft are a pair analogous to rfft/irfft, but for the opposite case: here the signal has Hermitian symmetry in the time domain and is real in the frequency domain. So here it’s hfft for which you must supply the length of the result if it is to be odd.
- even: ihfft(hfft(a, 2*len(a) - 2)) == a, within roundoff error,
- odd: ihfft(hfft(a, 2*len(a) - 1)) == a, within roundoff error.
The correct interpretation of the Hermitian input depends on the length of the original data, as given by n. This is because each input shape could correspond to either an odd or even length signal. By default, hfft assumes an even output length, which puts the last entry at the Nyquist frequency, where it aliases with its symmetric counterpart. By Hermitian symmetry, the value is thus treated as purely real. To avoid losing information, the shape of the full signal must be given.
Examples
>>> signal = np.array([1, 2, 3, 4, 3, 2])
>>> np.fft.fft(signal)
array([15.+0.j, -4.+0.j, 0.+0.j, -1.-0.j, 0.+0.j, -4.+0.j]) # may vary
>>> np.fft.hfft(signal[:4]) # Input first half of signal
array([15., -4., 0., -1., 0., -4.])
>>> np.fft.hfft(signal, 6) # Input entire signal and truncate
array([15., -4., 0., -1., 0., -4.])
>>> signal = np.array([[1, 1.j], [-1.j, 2]])
>>> np.conj(signal.T) - signal # check Hermitian symmetry
array([[ 0.-0.j, -0.+0.j], # may vary
       [ 0.+0.j, 0.-0.j]])
>>> freq_spectrum = np.fft.hfft(signal)
>>> freq_spectrum
array([[ 1., 1.],
       [ 2., -2.]])
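The even and odd round-trip identities from the Notes can be checked numerically. This is a minimal NumPy sketch; the input values are an illustrative assumption, chosen so that the first and last terms are real and the same data works for both the even and the odd case.

```python
import numpy as np

# Half of a Hermitian-symmetric signal. The zero-frequency term is real,
# and the last term is also real so it can serve as a Nyquist term.
a = np.array([1.0, 2.0 + 1.0j, 3.0 - 1.0j, 4.0])

# Even full length: 2 * len(a) - 2 = 6 points.
even_round_trip = np.fft.ihfft(np.fft.hfft(a, 2 * len(a) - 2))

# Odd full length: 2 * len(a) - 1 = 7 points.
odd_round_trip = np.fft.ihfft(np.fft.hfft(a, 2 * len(a) - 1))

print(np.allclose(even_round_trip, a))
print(np.allclose(odd_round_trip, a))
```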
-
dask.array.fft.ihfft(a, n=None, axis=None)
Wrapping of numpy.fft.ihfft
The axis along which the FFT is applied must have only one chunk. To change the array’s chunking use dask.Array.rechunk.
The numpy.fft.ihfft docstring follows below:
Compute the inverse FFT of a signal that has Hermitian symmetry.
Parameters: a : array_like
Input array.
n : int, optional
Length of the inverse FFT, the number of points along transformation axis in the input to use. If n is smaller than the length of the input, the input is cropped. If it is larger, the input is padded with zeros. If n is not given, the length of the input along the axis specified by axis is used.
axis : int, optional
Axis over which to compute the inverse FFT. If not given, the last axis is used.
norm : {None, “ortho”}, optional
Normalization mode (see numpy.fft). Default is None.
New in version 1.10.0.
Returns: out : complex ndarray
The truncated or zero-padded input, transformed along the axis indicated by axis, or the last one if axis is not specified. The length of the transformed axis is n//2 + 1.
Notes
hfft/ihfft are a pair analogous to rfft/irfft, but for the opposite case: here the signal has Hermitian symmetry in the time domain and is real in the frequency domain. So here it’s hfft for which you must supply the length of the result if it is to be odd:
- even: ihfft(hfft(a, 2*len(a) - 2)) == a, within roundoff error,
- odd: ihfft(hfft(a, 2*len(a) - 1)) == a, within roundoff error.
Examples
>>> spectrum = np.array([ 15, -4, 0, -1, 0, -4])
>>> np.fft.ifft(spectrum)
array([1.+0.j, 2.+0.j, 3.+0.j, 4.+0.j, 3.+0.j, 2.+0.j]) # may vary
>>> np.fft.ihfft(spectrum)
array([ 1.-0.j, 2.-0.j, 3.-0.j, 4.-0.j]) # may vary
-
dask.array.fft.fftfreq(n, d=1.0, chunks='auto')
Return the Discrete Fourier Transform sample frequencies.
This docstring was copied from numpy.fft.fftfreq.
Some inconsistencies with the Dask version may exist.
The returned float array f contains the frequency bin centers in cycles per unit of the sample spacing (with zero at the start). For instance, if the sample spacing is in seconds, then the frequency unit is cycles/second.
Given a window length n and a sample spacing d:
f = [0, 1, ..., n/2-1, -n/2, ..., -1] / (d*n) if n is even
f = [0, 1, ..., (n-1)/2, -(n-1)/2, ..., -1] / (d*n) if n is odd
Parameters: n : int
Window length.
d : scalar, optional
Sample spacing (inverse of the sampling rate). Defaults to 1.
Returns: f : ndarray
Array of length n containing the sample frequencies.
Examples
>>> signal = np.array([-2, 8, 6, 4, 1, 0, 3, 5], dtype=float)
>>> fourier = np.fft.fft(signal)
>>> n = signal.size
>>> timestep = 0.1
>>> freq = np.fft.fftfreq(n, d=timestep)
>>> freq
array([ 0. , 1.25, 2.5 , ..., -3.75, -2.5 , -1.25])
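The even and odd cases of the frequency-bin formula can be built by hand and compared with the NumPy function. In this sketch, `fftfreq_by_hand` is a hypothetical helper written for illustration and is not part of NumPy or Dask.

```python
import numpy as np

def fftfreq_by_hand(n, d=1.0):
    """Build the frequency bins directly from the formula (hypothetical helper)."""
    # Non-negative bins: 0 .. n/2-1 for even n, 0 .. (n-1)/2 for odd n.
    non_negative = np.arange(0, (n - 1) // 2 + 1)
    # Negative bins: -n/2 .. -1 for even n, -(n-1)/2 .. -1 for odd n.
    negative = np.arange(-(n // 2), 0)
    return np.concatenate([non_negative, negative]) / (d * n)

even_bins = fftfreq_by_hand(8, d=0.1)
odd_bins = fftfreq_by_hand(7, d=0.1)

print(np.allclose(even_bins, np.fft.fftfreq(8, d=0.1)))
print(np.allclose(odd_bins, np.fft.fftfreq(7, d=0.1)))
```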
-
dask.array.fft.rfftfreq(n, d=1.0, chunks='auto')
Return the Discrete Fourier Transform sample frequencies (for usage with rfft, irfft).
This docstring was copied from numpy.fft.rfftfreq.
Some inconsistencies with the Dask version may exist.
The returned float array f contains the frequency bin centers in cycles per unit of the sample spacing (with zero at the start). For instance, if the sample spacing is in seconds, then the frequency unit is cycles/second.
Given a window length n and a sample spacing d:
f = [0, 1, ..., n/2-1, n/2] / (d*n) if n is even
f = [0, 1, ..., (n-1)/2-1, (n-1)/2] / (d*n) if n is odd
Unlike fftfreq (but like scipy.fftpack.rfftfreq) the Nyquist frequency component is considered to be positive.
Parameters: n : int
Window length.
d : scalar, optional
Sample spacing (inverse of the sampling rate). Defaults to 1.
Returns: f : ndarray
Array of length n//2 + 1 containing the sample frequencies.
Examples
>>> signal = np.array([-2, 8, 6, 4, 1, 0, 3, 5, -3, 4], dtype=float)
>>> fourier = np.fft.rfft(signal)
>>> n = signal.size
>>> sample_rate = 100
>>> freq = np.fft.fftfreq(n, d=1./sample_rate)
>>> freq
array([ 0., 10., 20., ..., -30., -20., -10.])
>>> freq = np.fft.rfftfreq(n, d=1./sample_rate)
>>> freq
array([ 0., 10., 20., 30., 40., 50.])
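A common use of rfftfreq is to label the bins returned by rfft. The sketch below locates the dominant frequency of a pure tone; the 100 Hz sample rate and the 12 Hz tone are assumptions chosen so that the tone falls exactly on a frequency bin.

```python
import numpy as np

sample_rate = 100                          # samples per second (assumed)
t = np.arange(sample_rate) / sample_rate   # one second of sample times
signal = np.sin(2 * np.pi * 12 * t)        # 12 Hz tone

magnitudes = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(signal.size, d=1.0 / sample_rate)

# The bin with the largest magnitude gives the dominant frequency in Hz.
peak_frequency = freqs[np.argmax(magnitudes)]
print(freqs.size, peak_frequency)
```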
-
dask.array.fft.fftshift(x, axes=None)
Shift the zero-frequency component to the center of the spectrum.
This docstring was copied from numpy.fft.fftshift.
Some inconsistencies with the Dask version may exist.
This function swaps half-spaces for all axes listed (defaults to all). Note that y[0] is the Nyquist component only if len(x) is even.
Parameters: x : array_like
Input array.
axes : int or shape tuple, optional
Axes over which to shift. Default is None, which shifts all axes.
Returns: y : ndarray
The shifted array.
See also
ifftshift
- The inverse of fftshift.
Examples
>>> freqs = np.fft.fftfreq(10, 0.1)
>>> freqs
array([ 0., 1., 2., ..., -3., -2., -1.])
>>> np.fft.fftshift(freqs)
array([-5., -4., -3., -2., -1., 0., 1., 2., 3., 4.])
Shift the zero-frequency component only along the second axis:
>>> freqs = np.fft.fftfreq(9, d=1./9).reshape(3, 3)
>>> freqs
array([[ 0., 1., 2.],
       [ 3., 4., -4.],
       [-3., -2., -1.]])
>>> np.fft.fftshift(freqs, axes=(1,))
array([[ 2., 0., 1.],
       [-4., 3., 4.],
       [-1., -3., -2.]])
-
dask.array.fft.ifftshift(x, axes=None)
The inverse of fftshift. Although identical for even-length x, the functions differ by one sample for odd-length x.
This docstring was copied from numpy.fft.ifftshift.
Some inconsistencies with the Dask version may exist.
Parameters: x : array_like
Input array.
axes : int or shape tuple, optional
Axes over which to calculate. Defaults to None, which shifts all axes.
Returns: y : ndarray
The shifted array.
See also
fftshift
- Shift zero-frequency component to the center of the spectrum.
Examples
>>> freqs = np.fft.fftfreq(9, d=1./9).reshape(3, 3)
>>> freqs
array([[ 0., 1., 2.],
       [ 3., 4., -4.],
       [-3., -2., -1.]])
>>> np.fft.ifftshift(np.fft.fftshift(freqs))
array([[ 0., 1., 2.],
       [ 3., 4., -4.],
       [-3., -2., -1.]])
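The one-sample difference between fftshift and ifftshift for odd-length input is easiest to see on a small array. This NumPy sketch uses a five-element range as an illustrative input.

```python
import numpy as np

x = np.arange(5)                    # odd length: [0, 1, 2, 3, 4]

shifted = np.fft.fftshift(x)        # rotates right by 5 // 2 = 2
unshifted = np.fft.ifftshift(x)     # rotates left by 5 // 2 = 2

# The two shifts differ for odd lengths, but they undo each other.
round_trip = np.fft.ifftshift(np.fft.fftshift(x))

print(shifted)
print(unshifted)
print(round_trip)
```

For even-length input the two rotations coincide, which is why the functions are interchangeable in that case.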
-
dask.array.random.beta(a, b, size=None)
This docstring was copied from numpy.random.mtrand.RandomState.beta.
Some inconsistencies with the Dask version may exist.
Draw samples from a Beta distribution.
The Beta distribution is a special case of the Dirichlet distribution, and is related to the Gamma distribution. It has the probability distribution function
\[f(x; \alpha, \beta) = \frac{1}{B(\alpha, \beta)} x^{\alpha - 1} (1 - x)^{\beta - 1},\]
where the normalization, B, is the beta function,
\[B(\alpha, \beta) = \int_0^1 t^{\alpha - 1} (1 - t)^{\beta - 1} dt.\]
It is often seen in Bayesian inference and order statistics.
Parameters: a : float or array_like of floats
Alpha, positive (>0).
b : float or array_like of floats
Beta, positive (>0).
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if a and b are both scalars. Otherwise, np.broadcast(a, b).size samples are drawn.
Returns: out : ndarray or scalar
Drawn samples from the parameterized beta distribution.
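As a quick sanity check, samples can be compared against the known mean of the Beta distribution, a / (a + b). This sketch uses NumPy's RandomState (the source of this docstring) rather than the Dask wrapper; the seed, parameters, and sample count are arbitrary assumptions.

```python
import numpy as np

rng = np.random.RandomState(42)   # fixed seed so the run is repeatable
a, b = 2.0, 5.0

samples = rng.beta(a, b, size=100_000)

expected_mean = a / (a + b)       # mean of Beta(a, b)
sample_mean = samples.mean()

print(expected_mean, sample_mean)
```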
-
dask.array.random.binomial(n, p, size=None)
This docstring was copied from numpy.random.mtrand.RandomState.binomial.
Some inconsistencies with the Dask version may exist.
Draw samples from a binomial distribution.
Samples are drawn from a binomial distribution with specified parameters, n trials and p probability of success, where n is an integer >= 0 and p is in the interval [0,1]. (n may be input as a float, but it is truncated to an integer in use.)
Parameters: n : int or array_like of ints
Parameter of the distribution, >= 0. Floats are also accepted, but they will be truncated to integers.
p : float or array_like of floats
Parameter of the distribution, >= 0 and <=1.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if n and p are both scalars. Otherwise, np.broadcast(n, p).size samples are drawn.
Returns: out : ndarray or scalar
Drawn samples from the parameterized binomial distribution, where each sample is equal to the number of successes over the n trials.
See also
scipy.stats.binom
- probability density function, distribution or cumulative density function, etc.
Notes
The probability density for the binomial distribution is
\[P(N) = \binom{n}{N}p^N(1-p)^{n-N},\]
where \(n\) is the number of trials, \(p\) is the probability of success, and \(N\) is the number of successes.
When estimating the standard error of a proportion in a population by using a random sample, the normal distribution works well unless the product p*n <=5, where p = population proportion estimate, and n = number of samples, in which case the binomial distribution is used instead. For example, a sample of 15 people shows 4 who are left handed, and 11 who are right handed. Then p = 4/15 = 27%. 0.27*15 = 4, so the binomial distribution should be used in this case.
References
[R144] Dalgaard, Peter, “Introductory Statistics with R”, Springer-Verlag, 2002.
[R145] Glantz, Stanton A. “Primer of Biostatistics.”, McGraw-Hill, Fifth Edition, 2002.
[R146] Lentner, Marvin, “Elementary Applied Statistics”, Bogden and Quigley, 1972.
[R147] Weisstein, Eric W. “Binomial Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/BinomialDistribution.html
[R148] Wikipedia, “Binomial distribution”, https://en.wikipedia.org/wiki/Binomial_distribution
Examples
Draw samples from the distribution:
>>> n, p = 10, .5 # number of trials, probability of each trial
>>> s = np.random.binomial(n, p, 1000)
# result of flipping a coin 10 times, tested 1000 times.
A real world example. A company drills 9 wild-cat oil exploration wells, each with an estimated probability of success of 0.1. All nine wells fail. What is the probability of that happening?
Let’s do 20,000 trials of the model, and count the number that generate zero positive results.
>>> sum(np.random.binomial(9, 0.1, 20000) == 0)/20000.
# answer = 0.38885, or about 39%.
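The simulated answer for the oil-well example can be compared with the exact value, since the probability that all nine independent wells fail is (1 - 0.1)^9. This sketch uses NumPy's RandomState with an arbitrary seed, which is an assumption for repeatability.

```python
import numpy as np

rng = np.random.RandomState(0)

# Exact probability of zero successes in 9 trials with p = 0.1.
exact = (1 - 0.1) ** 9

# Monte Carlo estimate from 20,000 simulated drilling campaigns.
estimate = np.mean(rng.binomial(9, 0.1, 20000) == 0)

print(exact, estimate)
```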
-
dask.array.random.chisquare(df, size=None)
This docstring was copied from numpy.random.mtrand.RandomState.chisquare.
Some inconsistencies with the Dask version may exist.
Draw samples from a chi-square distribution.
When df independent random variables, each with standard normal distributions (mean 0, variance 1), are squared and summed, the resulting distribution is chi-square (see Notes). This distribution is often used in hypothesis testing.
Parameters: df : float or array_like of floats
Number of degrees of freedom, must be > 0.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if df is a scalar. Otherwise, np.array(df).size samples are drawn.
Returns: out : ndarray or scalar
Drawn samples from the parameterized chi-square distribution.
Raises: ValueError
When df <= 0 or when an inappropriate size (e.g. size=-1) is given.
Notes
The variable obtained by summing the squares of df independent, standard normally distributed random variables:
\[Q = \sum_{i=1}^{\mathtt{df}} X^2_i\]
is chi-square distributed, denoted
\[Q \sim \chi^2_k,\]
where \(k\) is the number of degrees of freedom, df. The probability density function of the chi-squared distribution is
\[p(x) = \frac{(1/2)^{k/2}}{\Gamma(k/2)} x^{k/2 - 1} e^{-x/2},\]
where \(\Gamma\) is the gamma function,
\[\Gamma(x) = \int_0^{\infty} t^{x - 1} e^{-t} dt.\]
References
[R149] NIST “Engineering Statistics Handbook” https://www.itl.nist.gov/div898/handbook/eda/section3/eda3666.htm
Examples
>>> np.random.chisquare(2,4)
array([ 1.89920014, 9.00867716, 3.13710533, 5.62318272]) # random
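The definition in the Notes (a sum of df squared standard normal variables) can be checked against direct sampling. This sketch relies on the chi-square distribution having mean df and variance 2*df; the seed, df, and sample count are arbitrary assumptions.

```python
import numpy as np

rng = np.random.RandomState(1)
df = 3
n_samples = 100_000

# Construct chi-square variates by summing df squared standard normals.
normals = rng.standard_normal((n_samples, df))
constructed = (normals ** 2).sum(axis=1)

# Draw chi-square variates directly for comparison.
direct = rng.chisquare(df, size=n_samples)

print(constructed.mean(), direct.mean())   # both close to df
print(constructed.var(), direct.var())     # both close to 2 * df
```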
-
dask.array.random.choice(a, size=None, replace=True, p=None)
This docstring was copied from numpy.random.mtrand.RandomState.choice.
Some inconsistencies with the Dask version may exist.
Generates a random sample from a given 1-D array
New in version 1.7.0.
Parameters: a : 1-D array-like or int
If an ndarray, a random sample is generated from its elements. If an int, the random sample is generated as if a were np.arange(a)
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.
replace : boolean, optional
Whether the sample is with or without replacement
p : 1-D array-like, optional
The probabilities associated with each entry in a. If not given the sample assumes a uniform distribution over all entries in a.
Returns: samples : single item or ndarray
The generated random samples
Raises: ValueError
If a is an int and less than zero, if a or p are not 1-dimensional, if a is an array-like of size 0, if p is not a vector of probabilities, if a and p have different lengths, or if replace=False and the sample size is greater than the population size
See also
randint, shuffle, permutation
Examples
Generate a uniform random sample from np.arange(5) of size 3:
>>> np.random.choice(5, 3)
array([0, 3, 4]) # random
>>> #This is equivalent to np.random.randint(0,5,3)
Generate a non-uniform random sample from np.arange(5) of size 3:
>>> np.random.choice(5, 3, p=[0.1, 0, 0.3, 0.6, 0])
array([3, 3, 0]) # random
Generate a uniform random sample from np.arange(5) of size 3 without replacement:
>>> np.random.choice(5, 3, replace=False)
array([3,1,0]) # random
>>> #This is equivalent to np.random.permutation(np.arange(5))[:3]
Generate a non-uniform random sample from np.arange(5) of size 3 without replacement:
>>> np.random.choice(5, 3, replace=False, p=[0.1, 0, 0.3, 0.6, 0])
array([2, 3, 0]) # random
Any of the above can be repeated with an arbitrary array-like instead of just integers. For instance:
>>> aa_milne_arr = ['pooh', 'rabbit', 'piglet', 'Christopher']
>>> np.random.choice(aa_milne_arr, 5, p=[0.5, 0.1, 0.1, 0.3])
array(['pooh', 'pooh', 'pooh', 'Christopher', 'piglet'], # random
      dtype='<U11')
-
dask.array.random.exponential(scale=1.0, size=None)
This docstring was copied from numpy.random.mtrand.RandomState.exponential.
Some inconsistencies with the Dask version may exist.
Draw samples from an exponential distribution.
Its probability density function is
\[f(x; \frac{1}{\beta}) = \frac{1}{\beta} \exp(-\frac{x}{\beta}),\]
for x > 0 and 0 elsewhere. \(\beta\) is the scale parameter, which is the inverse of the rate parameter \(\lambda = 1/\beta\). The rate parameter is an alternative, widely used parameterization of the exponential distribution [R152].
The exponential distribution is a continuous analogue of the geometric distribution. It describes many common situations, such as the size of raindrops measured over many rainstorms [R150], or the time between page requests to Wikipedia [R151].
Parameters: scale : float or array_like of floats
The scale parameter, \(\beta = 1/\lambda\). Must be non-negative.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if scale is a scalar. Otherwise, np.array(scale).size samples are drawn.
Returns: out : ndarray or scalar
Drawn samples from the parameterized exponential distribution.
References
[R150] Peyton Z. Peebles Jr., “Probability, Random Variables and Random Signal Principles”, 4th ed, 2001, p. 57.
[R151] Wikipedia, “Poisson process”, https://en.wikipedia.org/wiki/Poisson_process
[R152] Wikipedia, “Exponential distribution”, https://en.wikipedia.org/wiki/Exponential_distribution
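Because the function takes the scale \(\beta\) rather than the rate \(\lambda\), a rate must be inverted before it is passed in. The following sketch uses NumPy's RandomState with an assumed rate of 4 events per unit time and checks that the sample mean is close to \(\beta = 1/\lambda\).

```python
import numpy as np

rng = np.random.RandomState(2)

rate = 4.0            # lambda: events per unit time (assumed value)
scale = 1.0 / rate    # beta: the parameter the function expects

samples = rng.exponential(scale, size=100_000)

# The mean of an exponential distribution equals its scale.
print(scale, samples.mean())
```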
-
dask.array.random.f(dfnum, dfden, size=None)
This docstring was copied from numpy.random.mtrand.RandomState.f.
Some inconsistencies with the Dask version may exist.
Draw samples from an F distribution.
Samples are drawn from an F distribution with specified parameters, dfnum (degrees of freedom in numerator) and dfden (degrees of freedom in denominator), where both parameters must be greater than zero.
The F distribution (also known as the Fisher distribution) is a continuous probability distribution that arises in ANOVA tests. Its random variate is the ratio of two chi-square variates.
Parameters: dfnum : float or array_like of floats
Degrees of freedom in numerator, must be > 0.
dfden : float or array_like of floats
Degrees of freedom in denominator, must be > 0.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if dfnum and dfden are both scalars. Otherwise, np.broadcast(dfnum, dfden).size samples are drawn.
Returns: out : ndarray or scalar
Drawn samples from the parameterized Fisher distribution.
See also
scipy.stats.f
- probability density function, distribution or cumulative density function, etc.
Notes
The F statistic is used to compare in-group variances to between-group variances. Calculating the distribution depends on the sampling, and so it is a function of the respective degrees of freedom in the problem. The variable dfnum is the number of samples minus one, the between-groups degrees of freedom, while dfden is the within-groups degrees of freedom, the sum of the number of samples in each group minus the number of groups.
References
[R153] Glantz, Stanton A. “Primer of Biostatistics.”, McGraw-Hill, Fifth Edition, 2002.
[R154] Wikipedia, “F-distribution”, https://en.wikipedia.org/wiki/F-distribution
Examples
An example from Glantz [R153], pp 47-40:
Two groups, children of diabetics (25 people) and children from people without diabetes (25 controls). Fasting blood glucose was measured; the case group had a mean value of 86.1, and the controls had a mean value of 82.2. Standard deviations were 2.09 and 2.49 respectively. Are these data consistent with the null hypothesis that the parents’ diabetic status does not affect their children’s blood glucose levels? Calculating the F statistic from the data gives a value of 36.01.
Draw samples from the distribution:
>>> dfnum = 1. # between group degrees of freedom
>>> dfden = 48. # within groups degrees of freedom
>>> s = np.random.f(dfnum, dfden, 1000)
The lower bound for the top 1% of the samples is:
>>> np.sort(s)[-10]
7.61988120985 # random
So there is about a 1% chance that the F statistic will exceed 7.62. The measured value is 36, so the null hypothesis is rejected at the 1% level.
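The F statistic and degrees of freedom quoted in this example can be approximately reproduced from the summary statistics with the standard one-way ANOVA formulas for equal group sizes. This is a sketch under that assumption; because the means and standard deviations are rounded, the result is close to, but not exactly, the 36.01 reported in the text.

```python
import numpy as np

n_per_group = 25
means = np.array([86.1, 82.2])   # case group, control group
stds = np.array([2.09, 2.49])
k = len(means)                   # number of groups

# Between-groups mean square: spread of the group means around the grand mean.
grand_mean = means.mean()        # valid because the group sizes are equal
ms_between = n_per_group * np.sum((means - grand_mean) ** 2) / (k - 1)

# Within-groups mean square: average of the group variances.
ms_within = np.mean(stds ** 2)

f_statistic = ms_between / ms_within
dfnum = k - 1                    # between-groups degrees of freedom
dfden = k * (n_per_group - 1)    # within-groups degrees of freedom

print(f_statistic, dfnum, dfden)
```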
-
dask.array.random.gamma(shape, scale=1.0, size=None)
This docstring was copied from numpy.random.mtrand.RandomState.gamma.
Some inconsistencies with the Dask version may exist.
Draw samples from a Gamma distribution.
Samples are drawn from a Gamma distribution with specified parameters, shape (sometimes designated “k”) and scale (sometimes designated “theta”), where both parameters are > 0.
Parameters: shape : float or array_like of floats
The shape of the gamma distribution. Must be non-negative.
scale : float or array_like of floats, optional
The scale of the gamma distribution. Must be non-negative. Default is equal to 1.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if shape and scale are both scalars. Otherwise, np.broadcast(shape, scale).size samples are drawn.
Returns: out : ndarray or scalar
Drawn samples from the parameterized gamma distribution.
See also
scipy.stats.gamma
- probability density function, distribution or cumulative density function, etc.
Notes
The probability density for the Gamma distribution is
\[p(x) = x^{k-1}\frac{e^{-x/\theta}}{\theta^k\Gamma(k)},\]
where \(k\) is the shape and \(\theta\) the scale, and \(\Gamma\) is the Gamma function.
The Gamma distribution is often used to model the times to failure of electronic components, and arises naturally in processes for which the waiting times between Poisson distributed events are relevant.
References
[R155] Weisstein, Eric W. “Gamma Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/GammaDistribution.html
[R156] Wikipedia, “Gamma distribution”, https://en.wikipedia.org/wiki/Gamma_distribution
Examples
Draw samples from the distribution:
>>> shape, scale = 2., 2.  # mean=4, std=2*sqrt(2)  # doctest: +SKIP
>>> s = np.random.gamma(shape, scale, 1000)  # doctest: +SKIP
Display the histogram of the samples, along with the probability density function:
>>> import matplotlib.pyplot as plt  # doctest: +SKIP
>>> import scipy.special as sps  # doctest: +SKIP
>>> count, bins, ignored = plt.hist(s, 50, density=True)  # doctest: +SKIP
>>> y = bins**(shape-1)*(np.exp(-bins/scale) /  # doctest: +SKIP
...                      (sps.gamma(shape)*scale**shape))
>>> plt.plot(bins, y, linewidth=2, color='r')  # doctest: +SKIP
>>> plt.show()  # doctest: +SKIP
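The comment in the example above states a mean of 4 and a standard deviation of 2*sqrt(2) for shape = scale = 2. As a rough check of the density given in the Notes, the sketch below integrates it numerically with the standard library only. The helper names `gamma_pdf` and `trapezoid` and the cut-off at 80 are assumptions made for illustration.

```python
import math

def gamma_pdf(x, k, theta):
    """Gamma density with shape k and scale theta, as given in the Notes."""
    return x ** (k - 1) * math.exp(-x / theta) / (theta ** k * math.gamma(k))

def trapezoid(f, a, b, steps):
    """Plain trapezoidal rule on [a, b] with `steps` equal sub-intervals."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

k, theta = 2.0, 2.0
upper = 80.0  # assumption: the tail beyond 80 is negligible for these parameters

area = trapezoid(lambda x: gamma_pdf(x, k, theta), 0.0, upper, 8000)
mean = trapezoid(lambda x: x * gamma_pdf(x, k, theta), 0.0, upper, 8000)
second = trapezoid(lambda x: x * x * gamma_pdf(x, k, theta), 0.0, upper, 8000)
std = math.sqrt(second - mean ** 2)

# Expected to be close to 1, k * theta = 4 and sqrt(k) * theta = 2 * sqrt(2).
print(area, mean, std)
```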
-
dask.array.random.geometric(p, size=None)
This docstring was copied from numpy.random.mtrand.RandomState.geometric.
Some inconsistencies with the Dask version may exist.
Draw samples from the geometric distribution.
Bernoulli trials are experiments with one of two outcomes: success or failure (an example of such an experiment is flipping a coin). The geometric distribution models the number of trials that must be run in order to achieve success. It is therefore supported on the positive integers, k = 1, 2, ....
The probability mass function of the geometric distribution is
\[f(k) = (1 - p)^{k - 1} p\]
where p is the probability of success of an individual trial.
Parameters: p : float or array_like of floats
The probability of success of an individual trial.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if p is a scalar. Otherwise, np.array(p).size samples are drawn.
Returns: out : ndarray or scalar
Drawn samples from the parameterized geometric distribution.
Examples
Draw ten thousand values from the geometric distribution, with the probability of an individual success equal to 0.35:
>>> z = np.random.geometric(p=0.35, size=10000) # doctest: +SKIP
How many trials succeeded after a single run?
>>> (z == 1).sum() / 10000.  # doctest: +SKIP
0.34889999999999999 #random
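The sampled fraction above should sit near the exact value f(1) = p. A small sketch that evaluates the probability mass function directly (standard library only; `geometric_pmf` is an illustrative helper, not a library function):

```python
def geometric_pmf(k, p):
    """P(first success occurs on trial k), for k = 1, 2, ..."""
    return (1.0 - p) ** (k - 1) * p

p = 0.35
first_try = geometric_pmf(1, p)

# Truncated sums; the neglected tail (1 - p)**200 is vanishingly small.
total = sum(geometric_pmf(k, p) for k in range(1, 201))
mean_trials = sum(k * geometric_pmf(k, p) for k in range(1, 201))

print(first_try, total, mean_trials)
```

Here `first_try` equals p, the probabilities sum to 1 up to rounding, and the mean number of trials comes out near 1 / p.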
-
dask.array.random.gumbel(loc=0.0, scale=1.0, size=None)
This docstring was copied from numpy.random.mtrand.RandomState.gumbel.
Some inconsistencies with the Dask version may exist.
Draw samples from a Gumbel distribution.
Draw samples from a Gumbel distribution with specified location and scale. For more information on the Gumbel distribution, see Notes and References below.
Parameters: loc : float or array_like of floats, optional
The location of the mode of the distribution. Default is 0.
scale : float or array_like of floats, optional
The scale parameter of the distribution. Default is 1. Must be non-negative.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if loc and scale are both scalars. Otherwise, np.broadcast(loc, scale).size samples are drawn.
Returns: out : ndarray or scalar
Drawn samples from the parameterized Gumbel distribution.
See also
scipy.stats.gumbel_l, scipy.stats.gumbel_r, scipy.stats.genextreme, weibull
Notes
The Gumbel (or Smallest Extreme Value (SEV) or the Smallest Extreme Value Type I) distribution is one of a class of Generalized Extreme Value (GEV) distributions used in modeling extreme value problems. The Gumbel is a special case of the Extreme Value Type I distribution for maximums from distributions with “exponential-like” tails.
The probability density for the Gumbel distribution is
\[p(x) = \frac{e^{-(x - \mu)/ \beta}}{\beta} e^{ -e^{-(x - \mu)/ \beta}},\]
where \(\mu\) is the mode, a location parameter, and \(\beta\) is the scale parameter.
The Gumbel (named for German mathematician Emil Julius Gumbel) was used very early in the hydrology literature, for modeling the occurrence of flood events. It is also used for modeling maximum wind speed and rainfall rates. It is a “fat-tailed” distribution - the probability of an event in the tail of the distribution is larger than if one used a Gaussian, hence the surprisingly frequent occurrence of 100-year floods. Floods were initially modeled as a Gaussian process, which underestimated the frequency of extreme events.
It is one of a class of extreme value distributions, the Generalized Extreme Value (GEV) distributions, which also includes the Weibull and Frechet.
The function has a mean of \(\mu + 0.57721\beta\) and a variance of \(\frac{\pi^2}{6}\beta^2\).
References
[R157] Gumbel, E. J., “Statistics of Extremes,” New York: Columbia University Press, 1958.
[R158] Reiss, R.-D. and Thomas, M., “Statistical Analysis of Extreme Values from Insurance, Finance, Hydrology and Other Fields,” Basel: Birkhauser Verlag, 2001.
Examples
Draw samples from the distribution:
>>> mu, beta = 0, 0.1  # location and scale  # doctest: +SKIP
>>> s = np.random.gumbel(mu, beta, 1000)  # doctest: +SKIP
Display the histogram of the samples, along with the probability density function:
>>> import matplotlib.pyplot as plt  # doctest: +SKIP
>>> count, bins, ignored = plt.hist(s, 30, density=True)  # doctest: +SKIP
>>> plt.plot(bins, (1/beta)*np.exp(-(bins - mu)/beta)  # doctest: +SKIP
...          * np.exp( -np.exp( -(bins - mu) /beta) ),
...          linewidth=2, color='r')
>>> plt.show()  # doctest: +SKIP
Show how an extreme value distribution can arise from a Gaussian process and compare to a Gaussian:
>>> means = []  # doctest: +SKIP
>>> maxima = []  # doctest: +SKIP
>>> for i in range(1000):  # doctest: +SKIP
...    a = np.random.normal(mu, beta, 1000)
...    means.append(a.mean())
...    maxima.append(a.max())
>>> count, bins, ignored = plt.hist(maxima, 30, density=True)  # doctest: +SKIP
>>> beta = np.std(maxima) * np.sqrt(6) / np.pi  # doctest: +SKIP
>>> mu = np.mean(maxima) - 0.57721*beta  # doctest: +SKIP
>>> plt.plot(bins, (1/beta)*np.exp(-(bins - mu)/beta)  # doctest: +SKIP
...          * np.exp(-np.exp(-(bins - mu)/beta)),
...          linewidth=2, color='r')
>>> plt.plot(bins, 1/(beta * np.sqrt(2 * np.pi))  # doctest: +SKIP
...          * np.exp(-(bins - mu)**2 / (2 * beta**2)),
...          linewidth=2, color='g')
>>> plt.show()  # doctest: +SKIP
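The mean and variance quoted in the Notes can be checked with a small simulation. The sketch below draws Gumbel values by inverting the cumulative distribution function that corresponds to the density above, using only the standard library. `gumbel_sample` and the seed are illustrative choices, not part of dask or NumPy.

```python
import math
import random
import statistics

def gumbel_sample(rng, mu, beta):
    """Draw one Gumbel(mu, beta) value by inverting the CDF exp(-exp(-(x - mu)/beta))."""
    u = rng.random()
    while u == 0.0:          # avoid log(0); random() returns values in [0, 1)
        u = rng.random()
    return mu - beta * math.log(-math.log(u))

rng = random.Random(12345)   # fixed seed so the run is repeatable
mu, beta = 0.0, 0.1
draws = [gumbel_sample(rng, mu, beta) for _ in range(100_000)]

sample_mean = statistics.fmean(draws)
sample_var = statistics.pvariance(draws)
print(sample_mean, mu + 0.57721 * beta)           # mean: mu + 0.57721 * beta
print(sample_var, math.pi ** 2 / 6 * beta ** 2)   # variance: pi^2 / 6 * beta^2
```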
-
dask.array.random.hypergeometric(ngood, nbad, nsample, size=None)
This docstring was copied from numpy.random.mtrand.RandomState.hypergeometric.
Some inconsistencies with the Dask version may exist.
Draw samples from a Hypergeometric distribution.
Samples are drawn from a hypergeometric distribution with specified parameters, ngood (ways to make a good selection), nbad (ways to make a bad selection), and nsample (number of items sampled, which is less than or equal to the sum ngood + nbad).
Parameters: ngood : int or array_like of ints
Number of ways to make a good selection. Must be nonnegative.
nbad : int or array_like of ints
Number of ways to make a bad selection. Must be nonnegative.
nsample : int or array_like of ints
Number of items sampled. Must be at least 1 and at most ngood + nbad.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if ngood, nbad, and nsample are all scalars. Otherwise, np.broadcast(ngood, nbad, nsample).size samples are drawn.
Returns: out : ndarray or scalar
Drawn samples from the parameterized hypergeometric distribution. Each sample is the number of good items within a randomly selected subset of size nsample taken from a set of ngood good items and nbad bad items.
See also
scipy.stats.hypergeom
- probability density function, distribution or cumulative density function, etc.
Notes
The probability density for the Hypergeometric distribution is
\[P(x) = \frac{\binom{g}{x}\binom{b}{n-x}}{\binom{g+b}{n}},\]
where \(0 \le x \le n\) and \(n-b \le x \le g\), for P(x) the probability of x good results in the drawn sample, g = ngood, b = nbad, and n = nsample.
Consider an urn with black and white marbles in it: ngood of them are black and nbad are white. If you draw nsample marbles without replacement, then the hypergeometric distribution describes the distribution of black marbles in the drawn sample.
Note that this distribution is very similar to the binomial distribution, except that in this case, samples are drawn without replacement, whereas in the Binomial case samples are drawn with replacement (or the sample space is infinite). As the sample space becomes large, this distribution approaches the binomial.
References
[R159] Lentner, Marvin, “Elementary Applied Statistics”, Bogden and Quigley, 1972.
[R160] Weisstein, Eric W. “Hypergeometric Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/HypergeometricDistribution.html
[R161] Wikipedia, “Hypergeometric distribution”, https://en.wikipedia.org/wiki/Hypergeometric_distribution
Examples
Draw samples from the distribution:
>>> ngood, nbad, nsamp = 100, 2, 10  # doctest: +SKIP
# number of good, number of bad, and number of samples
>>> s = np.random.hypergeometric(ngood, nbad, nsamp, 1000)  # doctest: +SKIP
>>> from matplotlib.pyplot import hist  # doctest: +SKIP
>>> hist(s)  # doctest: +SKIP
#   note that it is very unlikely to grab both bad items
Suppose you have an urn with 15 white and 15 black marbles. If you pull 15 marbles at random, how likely is it that 12 or more of them are one color?
>>> s = np.random.hypergeometric(15, 15, 15, 100000)  # doctest: +SKIP
>>> sum(s>=12)/100000. + sum(s<=3)/100000.  # doctest: +SKIP
#   answer = 0.003 ... pretty unlikely!
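The simulated answer for the urn question can be compared with the exact value from the probability formula in the Notes. A minimal sketch using `math.comb` from the standard library (`hypergeom_pmf` is an illustrative helper name):

```python
from math import comb

def hypergeom_pmf(x, ngood, nbad, nsample):
    """P(x good items) when drawing nsample items without replacement."""
    return comb(ngood, x) * comb(nbad, nsample - x) / comb(ngood + nbad, nsample)

ngood = nbad = nsample = 15
# "12 or more of one color" means at least 12 black, or at most 3 black.
tail = sum(hypergeom_pmf(x, ngood, nbad, nsample) for x in range(12, 16))
tail += sum(hypergeom_pmf(x, ngood, nbad, nsample) for x in range(0, 4))

print(round(tail, 4))
```

The exact probability is about 0.0028, consistent with the simulated answer of roughly 0.003.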
-
dask.array.random.laplace(loc=0.0, scale=1.0, size=None)
This docstring was copied from numpy.random.mtrand.RandomState.laplace.
Some inconsistencies with the Dask version may exist.
Draw samples from the Laplace or double exponential distribution with specified location (or mean) and scale (decay).
The Laplace distribution is similar to the Gaussian/normal distribution, but is sharper at the peak and has fatter tails. It represents the difference between two independent, identically distributed exponential random variables.
Parameters: loc : float or array_like of floats, optional
The position, \(\mu\), of the distribution peak. Default is 0.
scale : float or array_like of floats, optional
\(\lambda\), the exponential decay. Default is 1. Must be non-negative.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if loc and scale are both scalars. Otherwise, np.broadcast(loc, scale).size samples are drawn.
Returns: out : ndarray or scalar
Drawn samples from the parameterized Laplace distribution.
Notes
It has the probability density function
\[f(x; \mu, \lambda) = \frac{1}{2\lambda} \exp\left(-\frac{|x - \mu|}{\lambda}\right).\]
The first law of Laplace, from 1774, states that the frequency of an error can be expressed as an exponential function of the absolute magnitude of the error, which leads to the Laplace distribution. For many problems in economics and health sciences, this distribution seems to model the data better than the standard Gaussian distribution.
References
[R162] Abramowitz, M. and Stegun, I. A. (Eds.). “Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, 9th printing,” New York: Dover, 1972.
[R163] Kotz, Samuel, et al. “The Laplace Distribution and Generalizations,” Birkhauser, 2001.
[R164] Weisstein, Eric W. “Laplace Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/LaplaceDistribution.html
[R165] Wikipedia, “Laplace distribution”, https://en.wikipedia.org/wiki/Laplace_distribution
Examples
Draw samples from the distribution:
>>> loc, scale = 0., 1.  # doctest: +SKIP
>>> s = np.random.laplace(loc, scale, 1000)  # doctest: +SKIP
Display the histogram of the samples, along with the probability density function:
>>> import matplotlib.pyplot as plt  # doctest: +SKIP
>>> count, bins, ignored = plt.hist(s, 30, density=True)  # doctest: +SKIP
>>> x = np.arange(-8., 8., .01)  # doctest: +SKIP
>>> pdf = np.exp(-abs(x-loc)/scale)/(2.*scale)  # doctest: +SKIP
>>> plt.plot(x, pdf)  # doctest: +SKIP
Plot Gaussian for comparison:
>>> g = (1/(scale * np.sqrt(2 * np.pi)) *  # doctest: +SKIP
...      np.exp(-(x - loc)**2 / (2 * scale**2)))
>>> plt.plot(x,g)  # doctest: +SKIP
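The description above says the Laplace distribution represents the difference between two independent, identically distributed exponential random variables. That statement can be illustrated with a short simulation using only the standard library. The seed and variable names are arbitrary choices for this sketch.

```python
import random
import statistics

rng = random.Random(2024)     # fixed seed so the run is repeatable
loc, scale = 0.0, 1.0
rate = 1.0 / scale            # expovariate() takes the rate, i.e. 1 / scale

# Difference of two independent exponentials, shifted by loc.
draws = [
    loc + rng.expovariate(rate) - rng.expovariate(rate)
    for _ in range(100_000)
]

mean_abs_dev = statistics.fmean([abs(x - loc) for x in draws])
variance = statistics.pvariance(draws)

# For a Laplace(loc, scale) variable these are scale and 2 * scale**2.
print(mean_abs_dev, variance)
```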
-
dask.array.random.logistic(loc=0.0, scale=1.0, size=None)
This docstring was copied from numpy.random.mtrand.RandomState.logistic.
Some inconsistencies with the Dask version may exist.
Draw samples from a logistic distribution.
Samples are drawn from a logistic distribution with specified parameters, loc (location or mean, also median), and scale (>0).
Parameters: loc : float or array_like of floats, optional
Parameter of the distribution. Default is 0.
scale : float or array_like of floats, optional
Parameter of the distribution. Must be non-negative. Default is 1.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if loc and scale are both scalars. Otherwise, np.broadcast(loc, scale).size samples are drawn.
Returns: out : ndarray or scalar
Drawn samples from the parameterized logistic distribution.
See also
scipy.stats.logistic
- probability density function, distribution or cumulative density function, etc.
Notes
The probability density for the Logistic distribution is
\[P(x) = \frac{e^{-(x-\mu)/s}}{s(1+e^{-(x-\mu)/s})^2},\]
where \(\mu\) = location and \(s\) = scale.
The Logistic distribution is used in Extreme Value problems where it can act as a mixture of Gumbel distributions, in Epidemiology, and by the World Chess Federation (FIDE) where it is used in the Elo ranking system, assuming the performance of each player is a logistically distributed random variable.
References
[R166] Reiss, R.-D. and Thomas M. (2001), “Statistical Analysis of Extreme Values, from Insurance, Finance, Hydrology and Other Fields,” Birkhauser Verlag, Basel, pp 132-133.
[R167] Weisstein, Eric W. “Logistic Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/LogisticDistribution.html
[R168] Wikipedia, “Logistic-distribution”, https://en.wikipedia.org/wiki/Logistic_distribution
Examples
Draw samples from the distribution:
>>> loc, scale = 10, 1  # doctest: +SKIP
>>> s = np.random.logistic(loc, scale, 10000)  # doctest: +SKIP
>>> import matplotlib.pyplot as plt  # doctest: +SKIP
>>> count, bins, ignored = plt.hist(s, bins=50)  # doctest: +SKIP
# plot against distribution
>>> def logist(x, loc, scale):  # doctest: +SKIP
...     return np.exp((loc-x)/scale)/(scale*(1+np.exp((loc-x)/scale))**2)
>>> lgst_val = logist(bins, loc, scale)  # doctest: +SKIP
>>> plt.plot(bins, lgst_val * count.max() / lgst_val.max())  # doctest: +SKIP
>>> plt.show()  # doctest: +SKIP
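As a sanity check on the density in the Notes, the sketch below compares it with a numerical derivative of the logistic cumulative distribution function, 1 / (1 + e^{-(x - mu)/s}). It uses only the standard library; the helper names and the step size `h` are assumptions for illustration.

```python
import math

def logistic_pdf(x, loc, scale):
    """Density from the Notes: e^{-z} / (s * (1 + e^{-z})**2), with z = (x - loc) / s."""
    e = math.exp(-(x - loc) / scale)
    return e / (scale * (1.0 + e) ** 2)

def logistic_cdf(x, loc, scale):
    """Standard closed form of the logistic CDF."""
    return 1.0 / (1.0 + math.exp(-(x - loc) / scale))

loc, scale = 10.0, 1.0
h = 1e-5
checks = []
for x in (7.0, 10.0, 12.5):
    # Central difference approximation to the derivative of the CDF.
    slope = (logistic_cdf(x + h, loc, scale) - logistic_cdf(x - h, loc, scale)) / (2 * h)
    checks.append((x, logistic_pdf(x, loc, scale), slope))
    print(x, logistic_pdf(x, loc, scale), slope)
```

The two columns agree closely, and the density peaks at x = loc with value 1 / (4 * scale).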
-
dask.array.random.lognormal(mean=0.0, sigma=1.0, size=None)
This docstring was copied from numpy.random.mtrand.RandomState.lognormal.
Some inconsistencies with the Dask version may exist.
Draw samples from a log-normal distribution.
Draw samples from a log-normal distribution with specified mean, standard deviation, and array shape. Note that the mean and standard deviation are not the values for the distribution itself, but of the underlying normal distribution it is derived from.
Parameters: mean : float or array_like of floats, optional
Mean value of the underlying normal distribution. Default is 0.
sigma : float or array_like of floats, optional
Standard deviation of the underlying normal distribution. Must be non-negative. Default is 1.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if mean and sigma are both scalars. Otherwise, np.broadcast(mean, sigma).size samples are drawn.
Returns: out : ndarray or scalar
Drawn samples from the parameterized log-normal distribution.
See also
scipy.stats.lognorm
- probability density function, distribution, cumulative density function, etc.
Notes
A variable x has a log-normal distribution if log(x) is normally distributed. The probability density function for the log-normal distribution is:
\[p(x) = \frac{1}{\sigma x \sqrt{2\pi}} e^{(-\frac{(ln(x)-\mu)^2}{2\sigma^2})}\]
where \(\mu\) is the mean and \(\sigma\) is the standard deviation of the normally distributed logarithm of the variable.
A log-normal distribution results if a random variable is the product of a large number of independent, identically-distributed variables in the same way that a normal distribution results if the variable is the sum of a large number of independent, identically-distributed variables.
References
[R169] Limpert, E., Stahel, W. A., and Abbt, M., “Log-normal Distributions across the Sciences: Keys and Clues,” BioScience, Vol. 51, No. 5, May, 2001. https://stat.ethz.ch/~stahel/lognormal/bioscience.pdf
[R170] Reiss, R.D. and Thomas, M., “Statistical Analysis of Extreme Values,” Basel: Birkhauser Verlag, 2001, pp. 31-32.
Examples
Draw samples from the distribution:
>>> mu, sigma = 3., 1.  # mean and standard deviation  # doctest: +SKIP
>>> s = np.random.lognormal(mu, sigma, 1000)  # doctest: +SKIP
Display the histogram of the samples, along with the probability density function:
>>> import matplotlib.pyplot as plt  # doctest: +SKIP
>>> count, bins, ignored = plt.hist(s, 100, density=True, align='mid')  # doctest: +SKIP
>>> x = np.linspace(min(bins), max(bins), 10000)  # doctest: +SKIP
>>> pdf = (np.exp(-(np.log(x) - mu)**2 / (2 * sigma**2))  # doctest: +SKIP
...        / (x * sigma * np.sqrt(2 * np.pi)))
>>> plt.plot(x, pdf, linewidth=2, color='r')  # doctest: +SKIP
>>> plt.axis('tight')  # doctest: +SKIP
>>> plt.show()  # doctest: +SKIP
Demonstrate that the product of many random samples (drawn here from a normal distribution centered at 10) can be fit well by a log-normal probability density function.
>>> # Generate a thousand samples: each is the product of 100 random
>>> # values, drawn from a normal distribution.
>>> b = []  # doctest: +SKIP
>>> for i in range(1000):  # doctest: +SKIP
...    a = 10. + np.random.standard_normal(100)
...    b.append(np.prod(a))  # np.product was removed in NumPy 2.0
>>> b = np.array(b) / np.min(b)  # rescale so the smallest value is 1  # doctest: +SKIP
>>> count, bins, ignored = plt.hist(b, 100, density=True, align='mid')  # doctest: +SKIP
>>> sigma = np.std(np.log(b))  # doctest: +SKIP
>>> mu = np.mean(np.log(b))  # doctest: +SKIP
>>> x = np.linspace(min(bins), max(bins), 10000)  # doctest: +SKIP
>>> pdf = (np.exp(-(np.log(x) - mu)**2 / (2 * sigma**2))  # doctest: +SKIP
...        / (x * sigma * np.sqrt(2 * np.pi)))
>>> plt.plot(x, pdf, color='r', linewidth=2)  # doctest: +SKIP
>>> plt.show()  # doctest: +SKIP
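The description above stresses that mean and sigma describe the underlying normal distribution, not the log-normal values themselves. The sketch below illustrates the difference with the standard library's `random.lognormvariate`, which takes the same two parameters. It is a rough simulation with an arbitrary seed, not a statement about how dask or NumPy generate samples.

```python
import math
import random
import statistics

rng = random.Random(7)       # fixed seed so the run is repeatable
mu, sigma = 3.0, 1.0         # parameters of the underlying normal distribution
draws = [rng.lognormvariate(mu, sigma) for _ in range(100_000)]

logs = [math.log(x) for x in draws]
log_mean = statistics.fmean(logs)
log_std = statistics.pstdev(logs)
sample_median = statistics.median(draws)
sample_mean = statistics.fmean(draws)

print(log_mean, log_std)                              # close to mu and sigma
print(sample_median, math.exp(mu))                    # median is exp(mu)
print(sample_mean, math.exp(mu + sigma ** 2 / 2))     # mean is exp(mu + sigma^2 / 2)
```

The mean of the samples is noticeably larger than exp(mu) because of the long right tail.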
-
dask.array.random.logseries(p, size=None)
This docstring was copied from numpy.random.mtrand.RandomState.logseries.
Some inconsistencies with the Dask version may exist.
Draw samples from a logarithmic series distribution.
Samples are drawn from a log series distribution with specified shape parameter, 0 < p < 1.
Parameters: p : float or array_like of floats
Shape parameter for the distribution. Must be in the range (0, 1).
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if p is a scalar. Otherwise, np.array(p).size samples are drawn.
Returns: out : ndarray or scalar
Drawn samples from the parameterized logarithmic series distribution.
See also
scipy.stats.logser
- probability density function, distribution or cumulative density function, etc.
Notes
The probability density for the Log Series distribution is
\[P(k) = \frac{-p^k}{k \ln(1-p)},\]
where p = probability.
The log series distribution is frequently used to represent species richness and occurrence, first proposed by Fisher, Corbet, and Williams in 1943 [R172]. It may also be used to model the numbers of occupants seen in cars [R173].
References
[R171] Buzas, Martin A.; Culver, Stephen J., Understanding regional species diversity through the log series distribution of occurrences: BIODIVERSITY RESEARCH Diversity & Distributions, Volume 5, Number 5, September 1999, pp. 187-195(9).
[R172] Fisher, R.A., A.S. Corbet, and C.B. Williams. 1943. The relation between the number of species and the number of individuals in a random sample of an animal population. Journal of Animal Ecology, 12:42-58.
[R173] D. J. Hand, F. Daly, D. Lunn, E. Ostrowski, A Handbook of Small Data Sets, CRC Press, 1994.
[R174] Wikipedia, “Logarithmic distribution”, https://en.wikipedia.org/wiki/Logarithmic_distribution
Examples
Draw samples from the distribution:
>>> a = .6  # doctest: +SKIP
>>> s = np.random.logseries(a, 10000)  # doctest: +SKIP
>>> import matplotlib.pyplot as plt  # doctest: +SKIP
>>> count, bins, ignored = plt.hist(s)  # doctest: +SKIP
# plot against distribution
>>> def logseries(k, p):  # doctest: +SKIP
...     return -p**k/(k*np.log(1-p))
>>> plt.plot(bins, logseries(bins, a)*count.max()/  # doctest: +SKIP
...          logseries(bins, a).max(), 'r')
>>> plt.show()  # doctest: +SKIP
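The formula in the Notes defines a proper probability distribution over k = 1, 2, .... The following sketch sums it directly, using only the standard library. `logseries_pmf` is an illustrative helper, and the closed-form mean used for comparison is the standard result for the logarithmic distribution.

```python
import math

def logseries_pmf(k, p):
    """P(k) = -p**k / (k * ln(1 - p)) for k = 1, 2, ..."""
    return -(p ** k) / (k * math.log(1.0 - p))

p = 0.6
# Truncate the infinite sum; terms shrink like p**k, so 400 terms is ample here.
total = sum(logseries_pmf(k, p) for k in range(1, 401))
mean = sum(k * logseries_pmf(k, p) for k in range(1, 401))
closed_form_mean = -p / ((1.0 - p) * math.log(1.0 - p))

print(total)
print(mean, closed_form_mean)
```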
-
dask.array.random.negative_binomial(n, p, size=None)
This docstring was copied from numpy.random.mtrand.RandomState.negative_binomial.
Some inconsistencies with the Dask version may exist.
Draw samples from a negative binomial distribution.
Samples are drawn from a negative binomial distribution with specified parameters, n successes and p probability of success where n is > 0 and p is in the interval [0, 1].
Parameters: n : float or array_like of floats
Parameter of the distribution, > 0.
p : float or array_like of floats
Parameter of the distribution, >= 0 and <=1.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if n and p are both scalars. Otherwise, np.broadcast(n, p).size samples are drawn.
Returns: out : ndarray or scalar
Drawn samples from the parameterized negative binomial distribution, where each sample is equal to N, the number of failures that occurred before a total of n successes was reached.
Notes
The probability mass function of the negative binomial distribution is
\[P(N;n,p) = \frac{\Gamma(N+n)}{N!\Gamma(n)}p^{n}(1-p)^{N},\]
where \(n\) is the number of successes, \(p\) is the probability of success, \(N+n\) is the number of trials, and \(\Gamma\) is the gamma function. When \(n\) is an integer, \(\frac{\Gamma(N+n)}{N!\Gamma(n)} = \binom{N+n-1}{N}\), which is the more common form of this term in the pmf.
The negative binomial distribution gives the probability of N failures given n successes, with a success on the last trial.
If one throws a die repeatedly until the third time a “1” appears, then the probability distribution of the number of non-“1”s that appear before the third “1” is a negative binomial distribution.
References
[R175] Weisstein, Eric W. “Negative Binomial Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/NegativeBinomialDistribution.html [R176] Wikipedia, “Negative binomial distribution”, https://en.wikipedia.org/wiki/Negative_binomial_distribution Examples
Draw samples from the distribution:
A real world example. A company drills wild-cat oil exploration wells, each with an estimated probability of success of 0.1. What is the probability of having one success for each successive well, that is, what is the probability of a single success after drilling 5 wells, after 6 wells, etc.?
>>> s = np.random.negative_binomial(1, 0.1, 100000)  # doctest: +SKIP
>>> for i in range(1, 11):  # doctest: +SKIP
...    probability = sum(s<i) / 100000.
...    print(i, "wells drilled, probability of one success =", probability)
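For n = 1 the simulated probabilities above have a simple closed form, 1 - (1 - p)^i, which makes a useful cross-check. The sketch below evaluates the probability mass function from the Notes with the standard library only. `neg_binomial_pmf` and `rows` are names chosen for illustration.

```python
import math

def neg_binomial_pmf(failures, n, p):
    """P(N = failures) before the n-th success; lgamma allows non-integer n."""
    log_coeff = (math.lgamma(failures + n)
                 - math.lgamma(failures + 1)
                 - math.lgamma(n))
    return math.exp(log_coeff) * p ** n * (1.0 - p) ** failures

n, p = 1, 0.1
rows = []
for wells in range(1, 11):
    # A success within `wells` wells means fewer than `wells` failures come first.
    prob = sum(neg_binomial_pmf(k, n, p) for k in range(wells))
    closed_form = 1.0 - (1.0 - p) ** wells
    rows.append((wells, prob, closed_form))
    print(wells, "wells drilled, probability of one success =", round(prob, 4))
```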
-
dask.array.random.noncentral_chisquare(df, nonc, size=None)
This docstring was copied from numpy.random.mtrand.RandomState.noncentral_chisquare.
Some inconsistencies with the Dask version may exist.
Draw samples from a noncentral chi-square distribution.
The noncentral \(\chi^2\) distribution is a generalization of the \(\chi^2\) distribution.
Parameters: df : float or array_like of floats
Degrees of freedom, must be > 0.
Changed in version 1.10.0: Earlier NumPy versions required dfnum > 1.
nonc : float or array_like of floats
Non-centrality, must be non-negative.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if df and nonc are both scalars. Otherwise, np.broadcast(df, nonc).size samples are drawn.
Returns: out : ndarray or scalar
Drawn samples from the parameterized noncentral chi-square distribution.
Notes
The probability density function for the noncentral Chi-square distribution is
\[P(x;df,nonc) = \sum^{\infty}_{i=0} \frac{e^{-nonc/2}(nonc/2)^{i}}{i!} P_{Y_{df+2i}}(x),\]
where \(Y_{q}\) is the Chi-square with q degrees of freedom.
References
[R177] Wikipedia, “Noncentral chi-squared distribution” https://en.wikipedia.org/wiki/Noncentral_chi-squared_distribution
Examples
Draw values from the distribution and plot the histogram
>>> import matplotlib.pyplot as plt  # doctest: +SKIP
>>> values = plt.hist(np.random.noncentral_chisquare(3, 20, 100000),  # doctest: +SKIP
...                   bins=200, density=True)
>>> plt.show()  # doctest: +SKIP
Draw values from a noncentral chisquare with very small noncentrality, and compare to a chisquare.
>>> plt.figure()  # doctest: +SKIP
>>> values = plt.hist(np.random.noncentral_chisquare(3, .0000001, 100000),  # doctest: +SKIP
...                   bins=np.arange(0., 25, .1), density=True)
>>> values2 = plt.hist(np.random.chisquare(3, 100000),  # doctest: +SKIP
...                    bins=np.arange(0., 25, .1), density=True)
>>> plt.plot(values[1][0:-1], values[0]-values2[0], 'ob')  # doctest: +SKIP
>>> plt.show()  # doctest: +SKIP
Demonstrate how large values of non-centrality lead to a more symmetric distribution.
>>> plt.figure()  # doctest: +SKIP
>>> values = plt.hist(np.random.noncentral_chisquare(3, 20, 100000),  # doctest: +SKIP
...                   bins=200, density=True)
>>> plt.show()  # doctest: +SKIP
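One way to build intuition for the non-centrality parameter: for integer df, a noncentral chi-square variable is the sum of df squared unit-variance normals whose squared means add up to nonc. The sketch below uses that construction with the standard library only. It assumes an integer df and places all of the non-centrality on one component, which is one valid choice among many.

```python
import math
import random
import statistics

def noncentral_chisquare_sample(rng, df, nonc):
    """Sum of df squared unit-variance normals whose squared means add up to nonc.

    Assumes an integer df; all of the non-centrality is put on the first component.
    """
    total = rng.gauss(math.sqrt(nonc), 1.0) ** 2
    for _ in range(df - 1):
        total += rng.gauss(0.0, 1.0) ** 2
    return total

rng = random.Random(99)      # fixed seed so the run is repeatable
df, nonc = 3, 20.0
draws = [noncentral_chisquare_sample(rng, df, nonc) for _ in range(50_000)]

sample_mean = statistics.fmean(draws)
sample_var = statistics.pvariance(draws)
print(sample_mean, df + nonc)             # theoretical mean: df + nonc
print(sample_var, 2 * (df + 2 * nonc))    # theoretical variance: 2 * (df + 2 * nonc)
```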
-
dask.array.random.noncentral_f(dfnum, dfden, nonc, size=None)
This docstring was copied from numpy.random.mtrand.RandomState.noncentral_f.
Some inconsistencies with the Dask version may exist.
Draw samples from the noncentral F distribution.
Samples are drawn from an F distribution with specified parameters, dfnum (degrees of freedom in numerator) and dfden (degrees of freedom in denominator), where both parameters > 1. nonc is the non-centrality parameter.
Parameters: dfnum : float or array_like of floats
Numerator degrees of freedom, must be > 0.
Changed in version 1.14.0: Earlier NumPy versions required dfnum > 1.
dfden : float or array_like of floats
Denominator degrees of freedom, must be > 0.
nonc : float or array_like of floats
Non-centrality parameter, the sum of the squares of the numerator means, must be >= 0.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if dfnum, dfden, and nonc are all scalars. Otherwise, np.broadcast(dfnum, dfden, nonc).size samples are drawn.
Returns: out : ndarray or scalar
Drawn samples from the parameterized noncentral Fisher distribution.
Notes
When calculating the power of an experiment (power = probability of rejecting the null hypothesis when a specific alternative is true) the non-central F statistic becomes important. When the null hypothesis is true, the F statistic follows a central F distribution. When the null hypothesis is not true, it follows a non-central F distribution.
References
[R178] Weisstein, Eric W. “Noncentral F-Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/NoncentralF-Distribution.html
[R179] Wikipedia, “Noncentral F-distribution”, https://en.wikipedia.org/wiki/Noncentral_F-distribution
Examples
In a study, testing for a specific alternative to the null hypothesis requires use of the Noncentral F distribution. We need to calculate the area in the tail of the distribution that exceeds the value of the F distribution for the null hypothesis. We’ll plot the two probability distributions for comparison.
>>> dfnum = 3  # between group deg of freedom  # doctest: +SKIP
>>> dfden = 20  # within groups degrees of freedom  # doctest: +SKIP
>>> nonc = 3.0  # doctest: +SKIP
>>> nc_vals = np.random.noncentral_f(dfnum, dfden, nonc, 1000000)  # doctest: +SKIP
>>> NF = np.histogram(nc_vals, bins=50, density=True)  # doctest: +SKIP
>>> c_vals = np.random.f(dfnum, dfden, 1000000)  # doctest: +SKIP
>>> F = np.histogram(c_vals, bins=50, density=True)  # doctest: +SKIP
>>> import matplotlib.pyplot as plt  # doctest: +SKIP
>>> plt.plot(F[1][1:], F[0])  # doctest: +SKIP
>>> plt.plot(NF[1][1:], NF[0])  # doctest: +SKIP
>>> plt.show()  # doctest: +SKIP
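The tail area mentioned above, that is, the power of the test, can be estimated by simulation. The sketch below is a rough Monte Carlo version using only the standard library. It assumes integer degrees of freedom, builds F values as ratios of scaled chi-square sums, and uses an empirical 95th percentile of the central draws as the critical value. The function names, the seed, and the 5% level are illustrative assumptions.

```python
import math
import random

def chisquare_sample(rng, df, nonc=0.0):
    """Sum of df squared unit-variance normals (integer df assumed).

    Any non-centrality is placed on the first component.
    """
    total = rng.gauss(math.sqrt(nonc), 1.0) ** 2
    for _ in range(df - 1):
        total += rng.gauss(0.0, 1.0) ** 2
    return total

def f_sample(rng, dfnum, dfden, nonc=0.0):
    """Ratio of scaled chi-squares; nonc > 0 gives a noncentral F draw."""
    numerator = chisquare_sample(rng, dfnum, nonc) / dfnum
    denominator = chisquare_sample(rng, dfden) / dfden
    return numerator / denominator

rng = random.Random(314)     # fixed seed so the run is repeatable
dfnum, dfden, nonc = 3, 20, 3.0
n_draws = 20_000

central = sorted(f_sample(rng, dfnum, dfden) for _ in range(n_draws))
noncentral = [f_sample(rng, dfnum, dfden, nonc) for _ in range(n_draws)]

alpha = 0.05
critical = central[int((1 - alpha) * n_draws)]             # empirical critical value under the null
power = sum(x > critical for x in noncentral) / n_draws    # tail area under the alternative
print(critical, power)
```

The estimated power is the fraction of noncentral draws that exceed the critical value. It is larger than alpha because the noncentral distribution is shifted toward larger values.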
-
dask.array.random.normal(loc=0.0, scale=1.0, size=None)
This docstring was copied from numpy.random.mtrand.RandomState.normal.
Some inconsistencies with the Dask version may exist.
Draw random samples from a normal (Gaussian) distribution.
The probability density function of the normal distribution, first derived by De Moivre and 200 years later by both Gauss and Laplace independently [R181], is often called the bell curve because of its characteristic shape (see the example below).
The normal distribution occurs often in nature. For example, it describes the commonly occurring distribution of samples influenced by a large number of tiny, random disturbances, each with its own unique distribution [R181].
Parameters: loc : float or array_like of floats
Mean (“centre”) of the distribution.
scale : float or array_like of floats
Standard deviation (spread or “width”) of the distribution. Must be non-negative.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if loc and scale are both scalars. Otherwise, np.broadcast(loc, scale).size samples are drawn.
Returns: out : ndarray or scalar
Drawn samples from the parameterized normal distribution.
See also
scipy.stats.norm
- probability density function, distribution or cumulative density function, etc.
Notes
The probability density for the Gaussian distribution is
\[p(x) = \frac{1}{\sqrt{ 2 \pi \sigma^2 }} e^{ - \frac{ (x - \mu)^2 } {2 \sigma^2} },\]
where \(\mu\) is the mean and \(\sigma\) the standard deviation. The square of the standard deviation, \(\sigma^2\), is called the variance.
The function has its peak at the mean, and its “spread” increases with the standard deviation (the function reaches 0.607 times its maximum at \(\mu + \sigma\) and \(\mu - \sigma\) [R181]). This implies that numpy.random.normal is more likely to return samples lying close to the mean, rather than those far away.
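The factor 0.607 is exp(-1/2), the ratio of the density one standard deviation away from the mean to the density at the mean. A quick check with the standard library (`normal_pdf` is an illustrative helper that follows the formula above):

```python
import math

def normal_pdf(x, mu, sigma):
    """Gaussian density as written in the Notes."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

mu, sigma = 0.0, 0.1
peak = normal_pdf(mu, mu, sigma)
ratio = normal_pdf(mu + sigma, mu, sigma) / peak

print(ratio, math.exp(-0.5))  # both are about 0.607
```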
References
[R180] Wikipedia, “Normal distribution”, https://en.wikipedia.org/wiki/Normal_distribution
[R181] P. R. Peebles Jr., “Central Limit Theorem” in “Probability, Random Variables and Random Signal Principles”, 4th ed., 2001, pp. 51, 51, 125.
Examples
Draw samples from the distribution:
>>> mu, sigma = 0, 0.1  # mean and standard deviation  # doctest: +SKIP
>>> s = np.random.normal(mu, sigma, 1000)  # doctest: +SKIP
Verify the mean and the variance:
>>> abs(mu - np.mean(s))  # doctest: +SKIP
0.0  # may vary
>>> abs(sigma - np.std(s, ddof=1))  # doctest: +SKIP
0.1  # may vary
Display the histogram of the samples, along with the probability density function:
>>> import matplotlib.pyplot as plt # doctest: +SKIP >>> count, bins, ignored = plt.hist(s, 30, density=True) # doctest: +SKIP >>> plt.plot(bins, 1/(sigma * np.sqrt(2 * np.pi)) * # doctest: +SKIP ... np.exp( - (bins - mu)**2 / (2 * sigma**2) ), ... linewidth=2, color='r') >>> plt.show() # doctest: +SKIP
Two-by-four array of samples from N(3, 6.25):
>>> np.random.normal(3, 2.5, size=(2, 4)) # doctest: +SKIP array([[-4.49401501, 4.00950034, -1.81814867, 7.29718677], # random [ 0.39924804, 4.68456316, 4.99394529, 4.84057254]]) # random
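The 0.607 figure quoted in the Notes is \(e^{-1/2}\), the ratio of the density one standard deviation from the mean to the density at the mean. The sketch below, which assumes plain NumPy with a seeded RandomState (the seed and the helper name normal_pdf are illustrative and not part of any API), checks that ratio and the share of samples within one standard deviation:

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    # Gaussian density from the Notes section (hypothetical helper name).
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)

mu, sigma = 0.0, 0.1
# Density one standard deviation out, relative to the peak at the mean.
ratio = normal_pdf(mu + sigma, mu, sigma) / normal_pdf(mu, mu, sigma)

rng = np.random.RandomState(0)  # arbitrary seed, for repeatability only
s = rng.normal(mu, sigma, 100000)
# Fraction of draws within one standard deviation of the mean.
within_one_sigma = np.mean(np.abs(s - mu) < sigma)
print(ratio, within_one_sigma)
```

The ratio does not depend on mu or sigma, and roughly 68% of the draws are expected to land within one standard deviation of the mean.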
-
dask.array.random.
pareto
(a, size=None)¶ This docstring was copied from numpy.random.mtrand.RandomState.pareto.
Some inconsistencies with the Dask version may exist.
Draw samples from a Pareto II or Lomax distribution with specified shape.
The Lomax or Pareto II distribution is a shifted Pareto distribution. The classical Pareto distribution can be obtained from the Lomax distribution by adding 1 and multiplying by the scale parameter m (see Notes). The smallest value of the Lomax distribution is zero while for the classical Pareto distribution it is mu, where the standard Pareto distribution has location mu = 1. Lomax can also be considered as a simplified version of the Generalized Pareto distribution (available in SciPy), with the scale set to one and the location set to zero.
The Pareto distribution must be greater than zero, and is unbounded above. It is also known as the “80-20 rule”. In this distribution, 80 percent of the weights are in the lowest 20 percent of the range, while the other 20 percent fill the remaining 80 percent of the range.
Parameters: a : float or array_like of floats
Shape of the distribution. Must be positive.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if a is a scalar. Otherwise, np.array(a).size samples are drawn.
Returns: out : ndarray or scalar
Drawn samples from the parameterized Pareto distribution.
See also
scipy.stats.lomax
- probability density function, distribution or cumulative density function, etc.
scipy.stats.genpareto
- probability density function, distribution or cumulative density function, etc.
Notes
The probability density for the Pareto distribution is
\[p(x) = \frac{am^a}{x^{a+1}}\]where \(a\) is the shape and \(m\) the scale.
The Pareto distribution, named after the Italian economist Vilfredo Pareto, is a power law probability distribution useful in many real world problems. Outside the field of economics it is generally referred to as the Bradford distribution. Pareto developed the distribution to describe the distribution of wealth in an economy. It has also found use in insurance, web page access statistics, oil field sizes, and many other problems, including the download frequency for projects in Sourceforge [R182]. It is one of the so-called “fat-tailed” distributions.
References
[R182] (1, 2) Francis Hunt and Paul Johnson, On the Pareto Distribution of Sourceforge projects.
[R183] Pareto, V. (1896). Course of Political Economy. Lausanne.
[R184] Reiss, R.D., Thomas, M.(2001), Statistical Analysis of Extreme Values, Birkhauser Verlag, Basel, pp 23-30.
[R185] Wikipedia, “Pareto distribution”, https://en.wikipedia.org/wiki/Pareto_distribution
Examples
Draw samples from the distribution:
>>> a, m = 3., 2. # shape and mode # doctest: +SKIP
>>> s = (np.random.pareto(a, 1000) + 1) * m # doctest: +SKIP
Display the histogram of the samples, along with the probability density function:
>>> import matplotlib.pyplot as plt # doctest: +SKIP
>>> count, bins, _ = plt.hist(s, 100, density=True) # doctest: +SKIP
>>> fit = a*m**a / bins**(a+1) # doctest: +SKIP
>>> plt.plot(bins, max(count)*fit/max(fit), linewidth=2, color='r') # doctest: +SKIP
>>> plt.show() # doctest: +SKIP
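The shift from Lomax to classical Pareto described above can be checked numerically. This is a minimal sketch assuming plain NumPy and an arbitrary seed; the closed-form mean a * m / (a - 1), valid for a > 1, is a standard result that the docstring itself does not state:

```python
import numpy as np

a, m = 3.0, 2.0  # shape and scale, as in the example above
rng = np.random.RandomState(0)  # arbitrary seed

lomax = rng.pareto(a, 200000)   # Lomax (Pareto II): support starts at 0
classical = (lomax + 1) * m     # classical Pareto: support starts at m

# For a > 1 the classical Pareto mean is a * m / (a - 1).
expected_mean = a * m / (a - 1)
print(lomax.min(), classical.min(), classical.mean(), expected_mean)
```

The raw draws start at zero, while the transformed draws never fall below m.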
-
dask.array.random.
poisson
(lam=1.0, size=None)¶ This docstring was copied from numpy.random.mtrand.RandomState.poisson.
Some inconsistencies with the Dask version may exist.
Draw samples from a Poisson distribution.
The Poisson distribution is the limit of the binomial distribution for large N.
Parameters: lam : float or array_like of floats
Expectation of interval, must be >= 0. A sequence of expectation intervals must be broadcastable over the requested size.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if lam is a scalar. Otherwise, np.array(lam).size samples are drawn.
Returns: out : ndarray or scalar
Drawn samples from the parameterized Poisson distribution.
Notes
The Poisson distribution
\[f(k; \lambda)=\frac{\lambda^k e^{-\lambda}}{k!}\]For events with an expected separation \(\lambda\) the Poisson distribution \(f(k; \lambda)\) describes the probability of \(k\) events occurring within the observed interval \(\lambda\).
Because the output is limited to the range of the C int64 type, a ValueError is raised when lam is within 10 sigma of the maximum representable value.
References
[R186] Weisstein, Eric W. “Poisson Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/PoissonDistribution.html
[R187] Wikipedia, “Poisson distribution”, https://en.wikipedia.org/wiki/Poisson_distribution
Examples
Draw samples from the distribution:
>>> import numpy as np # doctest: +SKIP
>>> s = np.random.poisson(5, 10000) # doctest: +SKIP
Display histogram of the sample:
>>> import matplotlib.pyplot as plt # doctest: +SKIP
>>> count, bins, ignored = plt.hist(s, 14, density=True) # doctest: +SKIP
>>> plt.show() # doctest: +SKIP
Draw 100 values each for lambda 100 and 500:
>>> s = np.random.poisson(lam=(100., 500.), size=(100, 2)) # doctest: +SKIP
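As a hedged illustration of the broadcasting rule for lam, the sketch below (plain NumPy, arbitrary seed) draws one column per rate and also checks the standard property that a Poisson sample has mean and variance both close to lam:

```python
import numpy as np

rng = np.random.RandomState(0)  # arbitrary seed
lam = 5.0
s = rng.poisson(lam, 100000)
# Mean and variance of a Poisson distribution both equal lam.
sample_mean, sample_var = s.mean(), s.var()

# lam of shape (2,) broadcasts against size=(100, 2): one column per rate.
cols = rng.poisson(lam=(100.0, 500.0), size=(100, 2))
col_means = cols.mean(axis=0)
print(sample_mean, sample_var, cols.shape, col_means)
```

Each column of cols follows its own rate, so the column means sit near 100 and 500.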
-
dask.array.random.
power
(a, size=None)¶ This docstring was copied from numpy.random.mtrand.RandomState.power.
Some inconsistencies with the Dask version may exist.
Draws samples in [0, 1] from a power distribution with positive exponent a - 1.
Also known as the power function distribution.
Parameters: a : float or array_like of floats
Parameter of the distribution. Must be positive.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if a is a scalar. Otherwise, np.array(a).size samples are drawn.
Returns: out : ndarray or scalar
Drawn samples from the parameterized power distribution.
Raises: ValueError
If a <= 0.
Notes
The probability density function is
\[P(x; a) = ax^{a-1}, 0 \le x \le 1, a>0.\]The power function distribution is just the inverse of the Pareto distribution. It may also be seen as a special case of the Beta distribution.
It is used, for example, in modeling the over-reporting of insurance claims.
References
[R188] Christian Kleiber, Samuel Kotz, “Statistical size distributions in economics and actuarial sciences”, Wiley, 2003.
[R189] Heckert, N. A. and Filliben, James J. “NIST Handbook 148: Dataplot Reference Manual, Volume 2: Let Subcommands and Library Functions”, National Institute of Standards and Technology Handbook Series, June 2003. https://www.itl.nist.gov/div898/software/dataplot/refman2/auxillar/powpdf.pdf
Examples
Draw samples from the distribution:
>>> a = 5. # shape # doctest: +SKIP
>>> samples = 1000 # doctest: +SKIP
>>> s = np.random.power(a, samples) # doctest: +SKIP
Display the histogram of the samples, along with the probability density function:
>>> import matplotlib.pyplot as plt # doctest: +SKIP
>>> count, bins, ignored = plt.hist(s, bins=30) # doctest: +SKIP
>>> x = np.linspace(0, 1, 100) # doctest: +SKIP
>>> y = a*x**(a-1.) # doctest: +SKIP
>>> normed_y = samples*np.diff(bins)[0]*y # doctest: +SKIP
>>> plt.plot(x, normed_y) # doctest: +SKIP
>>> plt.show() # doctest: +SKIP
Compare the power function distribution to the inverse of the Pareto.
>>> from scipy import stats # doctest: +SKIP
>>> rvs = np.random.power(5, 1000000) # doctest: +SKIP
>>> rvsp = np.random.pareto(5, 1000000) # doctest: +SKIP
>>> xx = np.linspace(0,1,100) # doctest: +SKIP
>>> powpdf = stats.powerlaw.pdf(xx,5) # doctest: +SKIP
>>> plt.figure() # doctest: +SKIP
>>> plt.hist(rvs, bins=50, density=True) # doctest: +SKIP
>>> plt.plot(xx,powpdf,'r-') # doctest: +SKIP
>>> plt.title('np.random.power(5)') # doctest: +SKIP
>>> plt.figure() # doctest: +SKIP
>>> plt.hist(1./(1.+rvsp), bins=50, density=True) # doctest: +SKIP
>>> plt.plot(xx,powpdf,'r-') # doctest: +SKIP
>>> plt.title('inverse of 1 + np.random.pareto(5)') # doctest: +SKIP
>>> plt.figure() # doctest: +SKIP
>>> plt.hist(1./(1.+rvsp), bins=50, density=True) # doctest: +SKIP
>>> plt.plot(xx,powpdf,'r-') # doctest: +SKIP
>>> plt.title('inverse of stats.pareto(5)') # doctest: +SKIP
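The same comparison can be made without plotting. The sketch below assumes plain NumPy and an arbitrary seed; the inverse-CDF step (the density \(ax^{a-1}\) integrates to \(x^a\), so a uniform draw raised to 1/a follows the same law) is a derivation added here, not something the docstring states:

```python
import numpy as np

a = 5.0
n = 100000
rng = np.random.RandomState(0)  # arbitrary seed

direct = rng.power(a, n)
# Inverse-CDF sampling: the CDF is x**a on [0, 1], so U**(1/a) has this law.
via_uniform = rng.random_sample(n) ** (1.0 / a)
# Reciprocal of 1 + Lomax draws, the "inverse of the Pareto" relationship.
via_pareto = 1.0 / (1.0 + rng.pareto(a, n))

expected_mean = a / (a + 1)  # mean of the density a * x**(a - 1) on [0, 1]
print(direct.mean(), via_uniform.mean(), via_pareto.mean(), expected_mean)
```

All three samples should have means close to a / (a + 1).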
-
dask.array.random.
randint
(low, high=None, size=None, dtype='l')¶ This docstring was copied from numpy.random.mtrand.RandomState.randint.
Some inconsistencies with the Dask version may exist.
Return random integers from low (inclusive) to high (exclusive).
Return random integers from the “discrete uniform” distribution of the specified dtype in the “half-open” interval [low, high). If high is None (the default), then results are from [0, low).
Parameters: low : int or array-like of ints
Lowest (signed) integers to be drawn from the distribution (unless high=None, in which case this parameter is one above the highest such integer).
high : int or array-like of ints, optional
If provided, one above the largest (signed) integer to be drawn from the distribution (see above for behavior if high=None). If array-like, must contain integer values.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.
dtype : dtype, optional
Desired dtype of the result. All dtypes are determined by their name, i.e., ‘int64’, ‘int’, etc, so byteorder is not available and a specific precision may have different C types depending on the platform. The default value is ‘np.int’.
New in version 1.11.0.
Returns: out : int or ndarray of ints
size-shaped array of random integers from the appropriate distribution, or a single such random int if size not provided.
See also
random.random_integers
- similar to randint, only for the closed interval [low, high], and 1 is the lowest value if high is omitted.
Examples
>>> np.random.randint(2, size=10) # doctest: +SKIP
array([1, 0, 0, 0, 1, 1, 0, 0, 1, 0]) # random
>>> np.random.randint(1, size=10) # doctest: +SKIP
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
Generate a 2 x 4 array of ints between 0 and 4, inclusive:
>>> np.random.randint(5, size=(2, 4)) # doctest: +SKIP
array([[4, 0, 2, 1], # random
       [3, 2, 2, 0]])
Generate a 1 x 3 array with 3 different upper bounds:
>>> np.random.randint(1, [3, 5, 10]) # doctest: +SKIP
array([2, 2, 9]) # random
Generate a 1 x 3 array with 3 different lower bounds:
>>> np.random.randint([1, 5, 7], 10) # doctest: +SKIP
array([9, 8, 7]) # random
Generate a 2 x 4 array using broadcasting with dtype of uint8:
>>> np.random.randint([1, 3, 5, 7], [[10], [20]], dtype=np.uint8) # doctest: +SKIP
array([[ 8, 6, 9, 7], # random
       [ 1, 16, 9, 12]], dtype=uint8)
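Because the interval is half-open, an off-by-one error in high is easy to make. The sketch below (plain NumPy, arbitrary seed; the die-roll variables are purely illustrative) shows the bounds that actually occur:

```python
import numpy as np

rng = np.random.RandomState(0)  # arbitrary seed

# Only `low` given: values come from [0, low).
r1 = rng.randint(5, size=10000)
# Both bounds given: values come from [low, high); high itself never appears.
r2 = rng.randint(3, 7, size=10000)
# A closed interval [lo, hi] needs high = hi + 1.
lo, hi = 1, 6
dice = rng.randint(lo, hi + 1, size=10000)
print(r1.min(), r1.max(), r2.min(), r2.max(), dice.min(), dice.max())
```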
-
dask.array.random.
random
(size=None)¶ This docstring was copied from numpy.random.mtrand.RandomState.random_sample.
Some inconsistencies with the Dask version may exist.
Return random floats in the half-open interval [0.0, 1.0).
Results are from the “continuous uniform” distribution over the stated interval. To sample \(Unif[a, b), b > a\) multiply the output of random_sample by (b-a) and add a:
(b - a) * random_sample() + a
Parameters: size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.
Returns: out : float or ndarray of floats
Array of random floats of shape size (unless size=None, in which case a single float is returned).
Examples
>>> np.random.random_sample() # doctest: +SKIP
0.47108547995356098 # random
>>> type(np.random.random_sample()) # doctest: +SKIP
<class 'float'>
>>> np.random.random_sample((5,)) # doctest: +SKIP
array([ 0.30220482, 0.86820401, 0.1654503 , 0.11659149, 0.54323428]) # random
Three-by-two array of random numbers from [-5, 0):
>>> 5 * np.random.random_sample((3, 2)) - 5 # doctest: +SKIP
array([[-3.99149989, -0.52338984], # random
       [-2.99091858, -0.79479508],
       [-1.23204345, -1.75224494]])
-
dask.array.random.
random_sample
(size=None)¶ This docstring was copied from numpy.random.mtrand.RandomState.random_sample.
Some inconsistencies with the Dask version may exist.
Return random floats in the half-open interval [0.0, 1.0).
Results are from the “continuous uniform” distribution over the stated interval. To sample \(Unif[a, b), b > a\) multiply the output of random_sample by (b-a) and add a:
(b - a) * random_sample() + a
Parameters: size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.
Returns: out : float or ndarray of floats
Array of random floats of shape size (unless size=None, in which case a single float is returned).
Examples
>>> np.random.random_sample() # doctest: +SKIP
0.47108547995356098 # random
>>> type(np.random.random_sample()) # doctest: +SKIP
<class 'float'>
>>> np.random.random_sample((5,)) # doctest: +SKIP
array([ 0.30220482, 0.86820401, 0.1654503 , 0.11659149, 0.54323428]) # random
Three-by-two array of random numbers from [-5, 0):
>>> 5 * np.random.random_sample((3, 2)) - 5 # doctest: +SKIP
array([[-3.99149989, -0.52338984], # random
       [-2.99091858, -0.79479508],
       [-1.23204345, -1.75224494]])
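The rescaling formula (b - a) * random_sample() + a can be wrapped in a small helper. This is only a sketch under the assumption b > a; the name uniform_on is hypothetical and not part of NumPy or Dask:

```python
import numpy as np

def uniform_on(a, b, size, rng):
    # Hypothetical helper: rescale [0, 1) draws to [a, b), assuming b > a.
    return (b - a) * rng.random_sample(size) + a

rng = np.random.RandomState(0)  # arbitrary seed
x = uniform_on(-5.0, 0.0, (3, 2), rng)
big = uniform_on(2.0, 10.0, 100000, rng)
print(x.shape, big.min(), big.max(), big.mean())
```

The sample mean of big should sit near the midpoint (a + b) / 2 of the target interval.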
-
dask.array.random.
rayleigh
(scale=1.0, size=None)¶ This docstring was copied from numpy.random.mtrand.RandomState.rayleigh.
Some inconsistencies with the Dask version may exist.
Draw samples from a Rayleigh distribution.
The \(\chi\) and Weibull distributions are generalizations of the Rayleigh.
Parameters: scale : float or array_like of floats, optional
Scale, also equals the mode. Must be non-negative. Default is 1.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if scale is a scalar. Otherwise, np.array(scale).size samples are drawn.
Returns: out : ndarray or scalar
Drawn samples from the parameterized Rayleigh distribution.
Notes
The probability density function for the Rayleigh distribution is
\[P(x;scale) = \frac{x}{scale^2}e^{\frac{-x^2}{2 \cdotp scale^2}}\]The Rayleigh distribution would arise, for example, if the East and North components of the wind velocity had identical zero-mean Gaussian distributions. Then the wind speed would have a Rayleigh distribution.
References
[R190] Brighton Webs Ltd., “Rayleigh Distribution,” https://web.archive.org/web/20090514091424/http://brighton-webs.co.uk:80/distributions/rayleigh.asp
[R191] Wikipedia, “Rayleigh distribution” https://en.wikipedia.org/wiki/Rayleigh_distribution
Examples
Draw values from the distribution and plot the histogram:
>>> from matplotlib.pyplot import hist # doctest: +SKIP
>>> values = hist(np.random.rayleigh(3, 100000), bins=200, density=True) # doctest: +SKIP
Wave heights tend to follow a Rayleigh distribution. If the mean wave height is 1 meter, what fraction of waves are likely to be larger than 3 meters?
>>> meanvalue = 1 # doctest: +SKIP
>>> modevalue = np.sqrt(2 / np.pi) * meanvalue # doctest: +SKIP
>>> s = np.random.rayleigh(modevalue, 1000000) # doctest: +SKIP
The percentage of waves larger than 3 meters is:
>>> 100.*sum(s>3)/1000000. # doctest: +SKIP
0.087300000000000003 # random
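The wave-height estimate can be cross-checked two ways. The sketch below (plain NumPy, arbitrary seed) uses the closed-form tail probability \(e^{-x^2/(2\,scale^2)}\), obtained by integrating the density in the Notes, and also builds Rayleigh values from two independent zero-mean Gaussian components as in the wind-velocity description:

```python
import numpy as np

mean_height = 1.0
# The Rayleigh mean is scale * sqrt(pi / 2), so solve for the scale (the mode).
scale = np.sqrt(2 / np.pi) * mean_height

# Closed-form tail: P(X > x) = exp(-x**2 / (2 * scale**2)).
analytic_pct = 100 * np.exp(-3.0 ** 2 / (2 * scale ** 2))

# Simulate magnitudes from two independent zero-mean Gaussian components.
rng = np.random.RandomState(0)  # arbitrary seed
east = rng.normal(0.0, scale, 1000000)
north = rng.normal(0.0, scale, 1000000)
speed = np.hypot(east, north)
simulated_pct = 100 * np.mean(speed > 3)
print(analytic_pct, simulated_pct)
```

The analytic value is about 0.085 percent, and the simulated percentage should agree with it up to sampling noise.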
-
dask.array.random.
standard_cauchy
(size=None)¶ This docstring was copied from numpy.random.mtrand.RandomState.standard_cauchy.
Some inconsistencies with the Dask version may exist.
Draw samples from a standard Cauchy distribution with mode = 0.
Also known as the Lorentz distribution.
Parameters: size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.
Returns: samples : ndarray or scalar
The drawn samples.
Notes
The probability density function for the full Cauchy distribution is
\[P(x; x_0, \gamma) = \frac{1}{\pi \gamma \bigl[ 1+ (\frac{x-x_0}{\gamma})^2 \bigr] }\]and the Standard Cauchy distribution just sets \(x_0=0\) and \(\gamma=1\)
The Cauchy distribution arises in the solution to the driven harmonic oscillator problem, and also describes spectral line broadening. It also describes the distribution of values at which a line tilted at a random angle will cut the x axis.
When studying hypothesis tests that assume normality, seeing how the tests perform on data from a Cauchy distribution is a good indicator of their sensitivity to a heavy-tailed distribution, since the Cauchy looks very much like a Gaussian distribution, but with heavier tails.
References
[R192] NIST/SEMATECH e-Handbook of Statistical Methods, “Cauchy Distribution”, https://www.itl.nist.gov/div898/handbook/eda/section3/eda3663.htm
[R193] Weisstein, Eric W. “Cauchy Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/CauchyDistribution.html
[R194] Wikipedia, “Cauchy distribution” https://en.wikipedia.org/wiki/Cauchy_distribution
Examples
Draw samples and plot the distribution:
>>> import matplotlib.pyplot as plt # doctest: +SKIP
>>> s = np.random.standard_cauchy(1000000) # doctest: +SKIP
>>> s = s[(s>-25) & (s<25)] # truncate distribution so it plots well # doctest: +SKIP
>>> plt.hist(s, bins=100) # doctest: +SKIP
>>> plt.show() # doctest: +SKIP
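The "line tilted at a random angle" description can be made concrete. In the sketch below (plain NumPy, arbitrary seed) the line is assumed to pass through the point (0, 1) with an angle from the vertical that is uniform on (-pi/2, pi/2), so it crosses the x axis at tan(theta). Because the Cauchy mean is undefined, the comparison uses quartiles, whose exact values for the standard Cauchy are -1, 0 and 1:

```python
import numpy as np

rng = np.random.RandomState(0)  # arbitrary seed
n = 100000

s = rng.standard_cauchy(n)
# tan of a uniform angle on (-pi/2, pi/2) is also standard Cauchy.
theta = rng.uniform(-np.pi / 2, np.pi / 2, n)
t = np.tan(theta)

# The mean is undefined, so compare quartiles; the exact values are -1, 0, 1.
q_s = np.percentile(s, [25, 50, 75])
q_t = np.percentile(t, [25, 50, 75])
print(q_s, q_t)
```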
-
dask.array.random.
standard_exponential
(size=None)¶ This docstring was copied from numpy.random.mtrand.RandomState.standard_exponential.
Some inconsistencies with the Dask version may exist.
Draw samples from the standard exponential distribution.
standard_exponential is identical to the exponential distribution with a scale parameter of 1.
Parameters: size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.
Returns: out : float or ndarray
Drawn samples.
Examples
Output a 3x8000 array:
>>> n = np.random.standard_exponential((3, 8000)) # doctest: +SKIP
-
dask.array.random.
standard_gamma
(shape, size=None)¶ This docstring was copied from numpy.random.mtrand.RandomState.standard_gamma.
Some inconsistencies with the Dask version may exist.
Draw samples from a standard Gamma distribution.
Samples are drawn from a Gamma distribution with specified parameters, shape (sometimes designated “k”) and scale=1.
Parameters: shape : float or array_like of floats
Parameter, must be non-negative.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if shape is a scalar. Otherwise, np.array(shape).size samples are drawn.
Returns: out : ndarray or scalar
Drawn samples from the parameterized standard gamma distribution.
See also
scipy.stats.gamma
- probability density function, distribution or cumulative density function, etc.
Notes
The probability density for the Gamma distribution is
\[p(x) = x^{k-1}\frac{e^{-x/\theta}}{\theta^k\Gamma(k)},\]where \(k\) is the shape and \(\theta\) the scale, and \(\Gamma\) is the Gamma function.
The Gamma distribution is often used to model the times to failure of electronic components, and arises naturally in processes for which the waiting times between Poisson distributed events are relevant.
References
[R195] Weisstein, Eric W. “Gamma Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/GammaDistribution.html
[R196] Wikipedia, “Gamma distribution”, https://en.wikipedia.org/wiki/Gamma_distribution
Examples
Draw samples from the distribution:
>>> shape, scale = 2., 1. # shape and scale # doctest: +SKIP
>>> s = np.random.standard_gamma(shape, 1000000) # doctest: +SKIP
Display the histogram of the samples, along with the probability density function:
>>> import matplotlib.pyplot as plt # doctest: +SKIP
>>> import scipy.special as sps # doctest: +SKIP
>>> count, bins, ignored = plt.hist(s, 50, density=True) # doctest: +SKIP
>>> y = bins**(shape-1) * ((np.exp(-bins/scale))/ # doctest: +SKIP
... (sps.gamma(shape) * scale**shape))
>>> plt.plot(bins, y, linewidth=2, color='r') # doctest: +SKIP
>>> plt.show() # doctest: +SKIP
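The link to waiting times between Poisson events mentioned in the Notes can be sketched for an integer shape. The example assumes plain NumPy, an arbitrary seed, and a unit-rate process (so the gaps between events are standard exponential), and compares the two constructions through their first two moments:

```python
import numpy as np

rng = np.random.RandomState(0)  # arbitrary seed
k = 2  # integer shape, chosen so the waiting-time construction applies
n = 200000

g = rng.standard_gamma(k, n)
# Waiting time until the k-th event of a unit-rate Poisson process:
# the sum of k independent standard exponential gaps.
waits = rng.standard_exponential((n, k)).sum(axis=1)

# With scale = 1 the Gamma mean and variance both equal the shape k.
print(g.mean(), g.var(), waits.mean(), waits.var())
```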
-
dask.array.random.
standard_normal
(size=None)¶ This docstring was copied from numpy.random.mtrand.RandomState.standard_normal.
Some inconsistencies with the Dask version may exist.
Draw samples from a standard Normal distribution (mean=0, stdev=1).
Parameters: size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.
Returns: out : float or ndarray
A floating-point array of shape size of drawn samples, or a single sample if size was not specified.
See also
normal
- Equivalent function with additional loc and scale arguments for setting the mean and standard deviation.
Notes
For random samples from \(N(\mu, \sigma^2)\), use one of:
mu + sigma * np.random.standard_normal(size=...)
np.random.normal(mu, sigma, size=...)
Examples
>>> np.random.standard_normal() # doctest: +SKIP
2.1923875335537315 #random
>>> s = np.random.standard_normal(8000) # doctest: +SKIP
>>> s # doctest: +SKIP
array([ 0.6888893 , 0.78096262, -0.89086505, ..., 0.49876311, # random
       -0.38672696, -0.4685006 ]) # random
>>> s.shape # doctest: +SKIP
(8000,)
>>> s = np.random.standard_normal(size=(3, 4, 2)) # doctest: +SKIP
>>> s.shape # doctest: +SKIP
(3, 4, 2)
Two-by-four array of samples from \(N(3, 6.25)\):
>>> 3 + 2.5 * np.random.standard_normal(size=(2, 4)) # doctest: +SKIP
array([[-4.49401501, 4.00950034, -1.81814867, 7.29718677], # random
       [ 0.39924804, 4.68456316, 4.99394529, 4.84057254]]) # random
-
dask.array.random.
standard_t
(df, size=None)¶ This docstring was copied from numpy.random.mtrand.RandomState.standard_t.
Some inconsistencies with the Dask version may exist.
Draw samples from a standard Student’s t distribution with df degrees of freedom.
A special case of the hyperbolic distribution. As df gets large, the result resembles that of the standard normal distribution (standard_normal).
Parameters: df : float or array_like of floats
Degrees of freedom, must be > 0.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if df is a scalar. Otherwise, np.array(df).size samples are drawn.
Returns: out : ndarray or scalar
Drawn samples from the parameterized standard Student’s t distribution.
Notes
The probability density function for the t distribution is
\[P(x, df) = \frac{\Gamma(\frac{df+1}{2})}{\sqrt{\pi df} \Gamma(\frac{df}{2})}\Bigl( 1+\frac{x^2}{df} \Bigr)^{-(df+1)/2}\]The t test is based on an assumption that the data come from a Normal distribution. The t test provides a way to test whether the sample mean (that is the mean calculated from the data) is a good estimate of the true mean.
The derivation of the t-distribution was first published in 1908 by William Gosset while working for the Guinness Brewery in Dublin. Due to proprietary issues, he had to publish under a pseudonym, and so he used the name Student.
References
[R197] (1, 2) Dalgaard, Peter, “Introductory Statistics With R”, Springer, 2002.
[R198] Wikipedia, “Student’s t-distribution” https://en.wikipedia.org/wiki/Student’s_t-distribution
Examples
From Dalgaard page 83 [R197], suppose the daily energy intake for 11 women in kilojoules (kJ) is:
>>> intake = np.array([5260., 5470, 5640, 6180, 6390, 6515, 6805, 7515, # doctest: +SKIP
... 7515, 8230, 8770])
Does their energy intake deviate systematically from the recommended value of 7725 kJ?
We have 10 degrees of freedom, so is the sample mean within 95% of the recommended value?
>>> s = np.random.standard_t(10, size=100000) # doctest: +SKIP
>>> np.mean(intake) # doctest: +SKIP
6753.636363636364
>>> intake.std(ddof=1) # doctest: +SKIP
1142.1232221373727
Calculate the t statistic, setting the ddof parameter to the unbiased value so the divisor in the standard deviation will be degrees of freedom, N-1.
>>> t = (np.mean(intake)-7725)/(intake.std(ddof=1)/np.sqrt(len(intake))) # doctest: +SKIP
>>> import matplotlib.pyplot as plt # doctest: +SKIP
>>> h = plt.hist(s, bins=100, density=True) # doctest: +SKIP
For a one-sided t-test, how far out in the distribution does the t statistic appear?
>>> np.sum(s<t) / float(len(s)) # doctest: +SKIP
0.0090699999999999999 #random
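So the p-value is about 0.009: if the null hypothesis were true, a t statistic at least this far below zero would be expected in only about 0.9% of samples, so the data give evidence against the null hypothesis at the 1% significance level.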
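The whole calculation fits in one short script. This is a sketch assuming plain NumPy and an arbitrary seed; the variable names are illustrative, and the p-value is a Monte Carlo estimate rather than an exact tail probability:

```python
import numpy as np

intake = np.array([5260., 5470, 5640, 6180, 6390, 6515, 6805, 7515,
                   7515, 8230, 8770])
recommended = 7725.0
n = len(intake)

# One-sample t statistic; ddof=1 gives the unbiased (N - 1) divisor.
t_stat = (intake.mean() - recommended) / (intake.std(ddof=1) / np.sqrt(n))

# Monte Carlo estimate of the one-sided p-value P(T < t_stat) with
# n - 1 = 10 degrees of freedom.
rng = np.random.RandomState(0)  # arbitrary seed
s = rng.standard_t(n - 1, size=1000000)
p_one_sided = np.mean(s < t_stat)
print(t_stat, p_one_sided)
```

The t statistic is about -2.82, and the estimated p-value should land close to 0.009.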
-
dask.array.random.
triangular
(left, mode, right, size=None)¶ This docstring was copied from numpy.random.mtrand.RandomState.triangular.
Some inconsistencies with the Dask version may exist.
Draw samples from the triangular distribution over the interval [left, right].
The triangular distribution is a continuous probability distribution with lower limit left, peak at mode, and upper limit right. Unlike the other distributions, these parameters directly define the shape of the pdf.
Parameters: left : float or array_like of floats
Lower limit.
mode : float or array_like of floats
The value where the peak of the distribution occurs. The value must fulfill the condition left <= mode <= right.
right : float or array_like of floats
Upper limit, must be larger than left.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if left, mode, and right are all scalars. Otherwise, np.broadcast(left, mode, right).size samples are drawn.
Returns: out : ndarray or scalar
Drawn samples from the parameterized triangular distribution.
Notes
The probability density function for the triangular distribution is
\[\begin{split}P(x;l, m, r) = \begin{cases} \frac{2(x-l)}{(r-l)(m-l)}& \text{for $l \leq x \leq m$},\\ \frac{2(r-x)}{(r-l)(r-m)}& \text{for $m \leq x \leq r$},\\ 0& \text{otherwise}. \end{cases}\end{split}\]The triangular distribution is often used in ill-defined problems where the underlying distribution is not known, but some knowledge of the limits and mode exists. Often it is used in simulations.
References
[R199] Wikipedia, “Triangular distribution” https://en.wikipedia.org/wiki/Triangular_distribution
Examples
Draw values from the distribution and plot the histogram:
>>> import matplotlib.pyplot as plt # doctest: +SKIP
>>> h = plt.hist(np.random.triangular(-3, 0, 8, 100000), bins=200, # doctest: +SKIP
... density=True)
>>> plt.show() # doctest: +SKIP
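The piecewise density in the Notes translates directly into code. The helper name triangular_pdf below is hypothetical, and the sketch assumes plain NumPy, an arbitrary seed, and strictly l < m < r so that neither branch divides by zero. The mean (l + m + r) / 3 is a standard result that the docstring does not state:

```python
import numpy as np

def triangular_pdf(x, l, m, r):
    # Piecewise density from the Notes (hypothetical helper; needs l < m < r).
    x = np.asarray(x, dtype=float)
    rising = 2 * (x - l) / ((r - l) * (m - l))
    falling = 2 * (r - x) / ((r - l) * (r - m))
    inside = (x >= l) & (x <= r)
    return np.where(inside, np.where(x <= m, rising, falling), 0.0)

l, m, r = -3.0, 0.0, 8.0
peak = float(triangular_pdf(m, l, m, r))  # equals 2 / (r - l)

rng = np.random.RandomState(0)  # arbitrary seed
s = rng.triangular(l, m, r, 100000)
expected_mean = (l + m + r) / 3
print(peak, s.mean(), expected_mean)
```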
-
dask.array.random.
uniform
(low=0.0, high=1.0, size=None)¶ This docstring was copied from numpy.random.mtrand.RandomState.uniform.
Some inconsistencies with the Dask version may exist.
Draw samples from a uniform distribution.
Samples are uniformly distributed over the half-open interval [low, high) (includes low, but excludes high). In other words, any value within the given interval is equally likely to be drawn by uniform.
Parameters: low : float or array_like of floats, optional
Lower boundary of the output interval. All values generated will be greater than or equal to low. The default value is 0.
high : float or array_like of floats
Upper boundary of the output interval. All values generated will be less than high. The default value is 1.0.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if low and high are both scalars. Otherwise, np.broadcast(low, high).size samples are drawn.
Returns: out : ndarray or scalar
Drawn samples from the parameterized uniform distribution.
See also
randint
- Discrete uniform distribution, yielding integers.
random_integers
- Discrete uniform distribution over the closed interval [low, high].
random_sample
- Floats uniformly distributed over [0, 1).
random
- Alias for random_sample.
rand
- Convenience function that accepts dimensions as input, e.g., rand(2,2) would generate a 2-by-2 array of floats, uniformly distributed over [0, 1).
Notes
The probability density function of the uniform distribution is
\[p(x) = \frac{1}{b - a}\]
anywhere within the interval [a, b), and zero elsewhere.
When high == low, values of low will be returned. If high < low, the results are officially undefined and may eventually raise an error, i.e. do not rely on this function to behave when passed arguments satisfying that inequality condition.
Examples
Draw samples from the distribution:
>>> s = np.random.uniform(-1,0,1000) # doctest: +SKIP
All values are within the given interval:
>>> np.all(s >= -1) # doctest: +SKIP
True
>>> np.all(s < 0) # doctest: +SKIP
True
Display the histogram of the samples, along with the probability density function:
>>> import matplotlib.pyplot as plt # doctest: +SKIP
>>> count, bins, ignored = plt.hist(s, 15, density=True) # doctest: +SKIP
>>> plt.plot(bins, np.ones_like(bins), linewidth=2, color='r') # doctest: +SKIP
>>> plt.show() # doctest: +SKIP
-
dask.array.random.
vonmises
(mu, kappa, size=None)¶ This docstring was copied from numpy.random.mtrand.RandomState.vonmises.
Some inconsistencies with the Dask version may exist.
Draw samples from a von Mises distribution.
Samples are drawn from a von Mises distribution with specified mode (mu) and dispersion (kappa), on the interval [-pi, pi].
The von Mises distribution (also known as the circular normal distribution) is a continuous probability distribution on the unit circle. It may be thought of as the circular analogue of the normal distribution.
Parameters: mu : float or array_like of floats
Mode (“center”) of the distribution.
kappa : float or array_like of floats
Dispersion of the distribution, has to be >=0.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if mu and kappa are both scalars. Otherwise, np.broadcast(mu, kappa).size samples are drawn.
Returns: out : ndarray or scalar
Drawn samples from the parameterized von Mises distribution.
See also
scipy.stats.vonmises
- probability density function, distribution, or cumulative density function, etc.
Notes
The probability density for the von Mises distribution is
\[p(x) = \frac{e^{\kappa cos(x-\mu)}}{2\pi I_0(\kappa)},\]where \(\mu\) is the mode and \(\kappa\) the dispersion, and \(I_0(\kappa)\) is the modified Bessel function of order 0.
The von Mises distribution is named for Richard Edler von Mises, who was born in Austria-Hungary, in what is now Ukraine. He fled to the United States in 1939 and became a professor at Harvard. He worked in probability theory, aerodynamics, fluid mechanics, and philosophy of science.
References
[R200] Abramowitz, M. and Stegun, I. A. (Eds.). “Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, 9th printing,” New York: Dover, 1972.
[R201] von Mises, R., “Mathematical Theory of Probability and Statistics”, New York: Academic Press, 1964.
Examples
Draw samples from the distribution:
>>> mu, kappa = 0.0, 4.0 # mean and dispersion # doctest: +SKIP
>>> s = np.random.vonmises(mu, kappa, 1000) # doctest: +SKIP
Display the histogram of the samples, along with the probability density function:
>>> import matplotlib.pyplot as plt # doctest: +SKIP
>>> from scipy.special import i0 # doctest: +SKIP
>>> plt.hist(s, 50, density=True) # doctest: +SKIP
>>> x = np.linspace(-np.pi, np.pi, num=51) # doctest: +SKIP
>>> y = np.exp(kappa*np.cos(x-mu))/(2*np.pi*i0(kappa)) # doctest: +SKIP
>>> plt.plot(x, y, linewidth=2, color='r') # doctest: +SKIP
>>> plt.show() # doctest: +SKIP
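Because the samples are angles, summary statistics are better computed on the unit circle than on the real line. The sketch below assumes plain NumPy and an arbitrary seed; the circular mean and mean resultant length are standard directional statistics brought in here for illustration and are not part of the docstring:

```python
import numpy as np

mu, kappa = 0.5, 4.0  # mode and dispersion (illustrative values)
rng = np.random.RandomState(0)  # arbitrary seed
s = rng.vonmises(mu, kappa, 100000)

# Average angles on the unit circle rather than on the real line, so that
# values near -pi and pi are treated as neighbours.
circular_mean = np.arctan2(np.sin(s).mean(), np.cos(s).mean())
# Mean resultant length: near 0 for a flat distribution, near 1 when
# the samples concentrate around mu.
resultant = np.hypot(np.sin(s).mean(), np.cos(s).mean())
print(circular_mean, resultant)
```

The circular mean should recover mu, and a larger kappa pushes the resultant length toward 1.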
-
dask.array.random.wald(mean, scale, size=None)
This docstring was copied from numpy.random.mtrand.RandomState.wald.
Some inconsistencies with the Dask version may exist.
Draw samples from a Wald, or inverse Gaussian, distribution.
As the scale approaches infinity, the distribution becomes more like a Gaussian. Some references claim that the Wald is an inverse Gaussian with mean equal to 1, but this is by no means universal.
The inverse Gaussian distribution was first studied in relationship to Brownian motion. In 1956 M.C.K. Tweedie used the name inverse Gaussian because there is an inverse relationship between the time to cover a unit distance and distance covered in unit time.
Parameters: mean : float or array_like of floats
Distribution mean, must be > 0.
scale : float or array_like of floats
Scale parameter, must be > 0.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if mean and scale are both scalars. Otherwise, np.broadcast(mean, scale).size samples are drawn.
Returns: out : ndarray or scalar
Drawn samples from the parameterized Wald distribution.
Notes
The probability density function for the Wald distribution is
\[P(x;mean,scale) = \sqrt{\frac{scale}{2\pi x^3}}\,e^{\frac{-scale(x-mean)^2}{2\cdot mean^2 x}}\]
As noted above, the inverse Gaussian distribution first arose from attempts to model Brownian motion. It is also a competitor to the Weibull for use in reliability modeling and in modeling stock returns and interest rate processes.
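The density formula above can be evaluated directly. The sketch below uses only the standard library; `wald_pdf` is an illustrative name, not a Dask or NumPy function. At x = mean = scale = 1 the exponent vanishes, so the value reduces to 1/sqrt(2*pi).

```python
import math

def wald_pdf(x, mean, scale):
    """Wald (inverse Gaussian) density for x > 0, mean > 0, scale > 0."""
    coeff = math.sqrt(scale / (2.0 * math.pi * x ** 3))
    exponent = -scale * (x - mean) ** 2 / (2.0 * mean ** 2 * x)
    return coeff * math.exp(exponent)

# With x == mean the exponential factor is exp(0) == 1.
at_mean = wald_pdf(1.0, mean=1.0, scale=1.0)
print(at_mean, 1.0 / math.sqrt(2.0 * math.pi))
```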
References
[R202] Brighton Webs Ltd., Wald Distribution, https://web.archive.org/web/20090423014010/http://www.brighton-webs.co.uk:80/distributions/wald.asp
[R203] Chhikara, Raj S., and Folks, J. Leroy, “The Inverse Gaussian Distribution: Theory, Methodology, and Applications”, CRC Press, 1988.
[R204] Wikipedia, “Inverse Gaussian distribution” https://en.wikipedia.org/wiki/Inverse_Gaussian_distribution
Examples
Draw values from the distribution and plot the histogram:
>>> import matplotlib.pyplot as plt # doctest: +SKIP
>>> h = plt.hist(np.random.wald(3, 2, 100000), bins=200, density=True) # doctest: +SKIP
>>> plt.show() # doctest: +SKIP
-
dask.array.random.weibull(a, size=None)
This docstring was copied from numpy.random.mtrand.RandomState.weibull.
Some inconsistencies with the Dask version may exist.
Draw samples from a Weibull distribution.
Draw samples from a 1-parameter Weibull distribution with the given shape parameter a.
\[X = (-\ln(U))^{1/a}\]
Here, U is drawn from the uniform distribution over (0,1].
The more common 2-parameter Weibull, including a scale parameter \(\lambda\), is just \(X = \lambda(-\ln(U))^{1/a}\).
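The inverse-transform relation above can be sketched with the standard library alone. `weibull_sample` is a hypothetical helper, not the NumPy or Dask implementation; the comparison against the mean \(\lambda\,\Gamma(1 + 1/a)\) uses a standard property of the Weibull distribution that is not stated in this docstring.

```python
import math
import random

def weibull_sample(a, lam=1.0, rng=random):
    """Draw one Weibull variate as lam * (-ln(U)) ** (1 / a)."""
    # 1.0 - rng.random() lies in (0, 1], matching the interval for U above.
    u = 1.0 - rng.random()
    return lam * (-math.log(u)) ** (1.0 / a)

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [weibull_sample(5.0, lam=2.0, rng=rng) for _ in range(10_000)]
sample_mean = sum(samples) / len(samples)
expected_mean = 2.0 * math.gamma(1.0 + 1.0 / 5.0)  # lam * Gamma(1 + 1/a)
print(sample_mean, expected_mean)
```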
Parameters: a : float or array_like of floats
Shape parameter of the distribution. Must be nonnegative.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if a is a scalar. Otherwise, np.array(a).size samples are drawn.
Returns: out : ndarray or scalar
Drawn samples from the parameterized Weibull distribution.
See also
scipy.stats.weibull_max, scipy.stats.weibull_min, scipy.stats.genextreme, gumbel
Notes
The Weibull (or Type III asymptotic extreme value distribution for smallest values, SEV Type III, or Rosin-Rammler distribution) is one of a class of Generalized Extreme Value (GEV) distributions used in modeling extreme value problems. This class includes the Gumbel and Frechet distributions.
The probability density for the Weibull distribution is
\[p(x) = \frac{a}{\lambda}\left(\frac{x}{\lambda}\right)^{a-1}e^{-(x/\lambda)^a},\]
where \(a\) is the shape and \(\lambda\) the scale.
The function has its peak (the mode) at \(\lambda(\frac{a-1}{a})^{1/a}\).
When a = 1, the Weibull distribution reduces to the exponential distribution.
References
[R205] Waloddi Weibull, Royal Technical University, Stockholm, 1939 “A Statistical Theory Of The Strength Of Materials”, Ingeniorsvetenskapsakademiens Handlingar Nr 151, 1939, Generalstabens Litografiska Anstalts Forlag, Stockholm.
[R206] Waloddi Weibull, “A Statistical Distribution Function of Wide Applicability”, Journal Of Applied Mechanics ASME Paper 1951.
[R207] Wikipedia, “Weibull distribution”, https://en.wikipedia.org/wiki/Weibull_distribution
Examples
Draw samples from the distribution:
>>> a = 5. # shape # doctest: +SKIP
>>> s = np.random.weibull(a, 1000) # doctest: +SKIP
Display the histogram of the samples, along with the probability density function:
>>> import matplotlib.pyplot as plt # doctest: +SKIP
>>> x = np.arange(1,100.)/50. # doctest: +SKIP
>>> def weib(x,n,a): # doctest: +SKIP
...     return (a / n) * (x / n)**(a - 1) * np.exp(-(x / n)**a)
>>> count, bins, ignored = plt.hist(np.random.weibull(5.,1000)) # doctest: +SKIP
>>> x = np.arange(1,100.)/50. # doctest: +SKIP
>>> scale = count.max()/weib(x, 1., 5.).max() # doctest: +SKIP
>>> plt.plot(x, weib(x, 1., 5.)*scale) # doctest: +SKIP
>>> plt.show() # doctest: +SKIP
-
dask.array.random.zipf(a, size=None)
Standard distributions
-
dask.array.stats.ttest_ind(a, b, axis=0, equal_var=True)
Calculate the T-test for the means of two independent samples of scores.
This docstring was copied from scipy.stats.ttest_ind.
Some inconsistencies with the Dask version may exist.
This is a two-sided test for the null hypothesis that 2 independent samples have identical average (expected) values. This test assumes that the populations have identical variances by default.
Parameters: a, b : array_like
The arrays must have the same shape, except in the dimension corresponding to axis (the first, by default).
axis : int or None, optional
Axis along which to compute test. If None, compute over the whole arrays, a, and b.
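equal_var : bool, optional
If True (default), perform a standard independent two-sample test that assumes equal population variances [R208]. If False, perform Welch’s t-test, which does not assume equal population variances [R209].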
nan_policy : {‘propagate’, ‘raise’, ‘omit’}, optional (Not supported in Dask)
Defines how to handle when input contains nan. ‘propagate’ returns nan, ‘raise’ throws an error, ‘omit’ performs the calculations ignoring nan values. Default is ‘propagate’.
Returns: statistic : float or array
The calculated t-statistic.
pvalue : float or array
The two-tailed p-value.
Notes
We can use this test if we observe two independent samples from the same or different populations, e.g. exam scores of boys and girls or of two ethnic groups. The test measures whether the average (expected) value differs significantly across samples. If we observe a large p-value, for example larger than 0.05 or 0.1, then we cannot reject the null hypothesis of identical average scores. If the p-value is smaller than the threshold, e.g. 1%, 5% or 10%, then we reject the null hypothesis of equal averages.
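To make the difference between the equal-variance test and Welch's variant concrete, here is a minimal standard-library sketch of the two t-statistics. It uses the textbook pooled-variance and Welch formulas, which are not spelled out in this docstring; `t_statistics` and the sample values are made up for illustration, and p-values are omitted.

```python
import math
from statistics import mean, variance

def t_statistics(a, b):
    """Return (pooled-variance t, Welch t) for two independent samples."""
    n1, n2 = len(a), len(b)
    v1, v2 = variance(a), variance(b)  # unbiased sample variances
    diff = mean(a) - mean(b)
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    t_pooled = diff / math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    t_welch = diff / math.sqrt(v1 / n1 + v2 / n2)
    return t_pooled, t_welch

# Equal sample sizes: the two statistics coincide.
same_n = t_statistics([5.1, 4.9, 6.2, 5.8], [4.2, 5.5, 5.0, 6.1])
# Unequal sample sizes and variances: the two statistics differ.
diff_n = t_statistics([5.1, 4.9, 6.2, 5.8], [4.2, 5.5, 5.0, 6.1, 9.7, 1.3])
print(same_n, diff_n)
```

With equal sample sizes the two denominators are algebraically identical, so only the degrees of freedom (and hence the p-value) can differ between the two variants.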
References
[R208] https://en.wikipedia.org/wiki/T-test#Independent_two-sample_t-test
[R209] https://en.wikipedia.org/wiki/Welch%27s_t-test
Examples
>>> from scipy import stats # doctest: +SKIP
>>> np.random.seed(12345678) # doctest: +SKIP
Test with samples with identical means:
>>> rvs1 = stats.norm.rvs(loc=5,scale=10,size=500) # doctest: +SKIP
>>> rvs2 = stats.norm.rvs(loc=5,scale=10,size=500) # doctest: +SKIP
>>> stats.ttest_ind(rvs1,rvs2) # doctest: +SKIP
(0.26833823296239279, 0.78849443369564776)
>>> stats.ttest_ind(rvs1,rvs2, equal_var = False) # doctest: +SKIP
(0.26833823296239279, 0.78849452749500748)
ttest_ind underestimates p for unequal variances:
>>> rvs3 = stats.norm.rvs(loc=5, scale=20, size=500) # doctest: +SKIP
>>> stats.ttest_ind(rvs1, rvs3) # doctest: +SKIP
(-0.46580283298287162, 0.64145827413436174)
>>> stats.ttest_ind(rvs1, rvs3, equal_var = False) # doctest: +SKIP
(-0.46580283298287162, 0.64149646246569292)
When n1 != n2, the equal variance t-statistic is no longer equal to the unequal variance t-statistic:
>>> rvs4 = stats.norm.rvs(loc=5, scale=20, size=100) # doctest: +SKIP
>>> stats.ttest_ind(rvs1, rvs4) # doctest: +SKIP
(-0.99882539442782481, 0.3182832709103896)
>>> stats.ttest_ind(rvs1, rvs4, equal_var = False) # doctest: +SKIP
(-0.69712570584654099, 0.48716927725402048)
T-test with different means, variance, and n:
>>> rvs5 = stats.norm.rvs(loc=8, scale=20, size=100) # doctest: +SKIP
>>> stats.ttest_ind(rvs1, rvs5) # doctest: +SKIP
(-1.4679669854490653, 0.14263895620529152)
>>> stats.ttest_ind(rvs1, rvs5, equal_var = False) # doctest: +SKIP
(-0.94365973617132992, 0.34744170334794122)
-
dask.array.stats.ttest_1samp(a, popmean, axis=0, nan_policy='propagate')
Calculate the T-test for the mean of ONE group of scores.
This docstring was copied from scipy.stats.ttest_1samp.
Some inconsistencies with the Dask version may exist.
This is a two-sided test for the null hypothesis that the expected value (mean) of a sample of independent observations a is equal to the given population mean, popmean.
Parameters: a : array_like
sample observation
popmean : float or array_like
expected value in null hypothesis. If array_like, then it must have the same shape as a excluding the axis dimension
axis : int or None, optional
Axis along which to compute test. If None, compute over the whole array a.
nan_policy : {‘propagate’, ‘raise’, ‘omit’}, optional
Defines how to handle when input contains nan. ‘propagate’ returns nan, ‘raise’ throws an error, ‘omit’ performs the calculations ignoring nan values. Default is ‘propagate’.
Returns: statistic : float or array
t-statistic
pvalue : float or array
two-tailed p-value
Examples
>>> from scipy import stats # doctest: +SKIP
>>> np.random.seed(7654567) # fix seed to get the same result # doctest: +SKIP
>>> rvs = stats.norm.rvs(loc=5, scale=10, size=(50,2)) # doctest: +SKIP
Test if mean of random sample is equal to true mean, and different mean. We reject the null hypothesis in the second case and don’t reject it in the first case.
>>> stats.ttest_1samp(rvs,5.0) # doctest: +SKIP
(array([-0.68014479, -0.04323899]), array([ 0.49961383, 0.96568674]))
>>> stats.ttest_1samp(rvs,0.0) # doctest: +SKIP
(array([ 2.77025808, 4.11038784]), array([ 0.00789095, 0.00014999]))
Examples using axis and non-scalar dimension for population mean.
>>> stats.ttest_1samp(rvs,[5.0,0.0]) # doctest: +SKIP
(array([-0.68014479, 4.11038784]), array([ 4.99613833e-01, 1.49986458e-04]))
>>> stats.ttest_1samp(rvs.T,[5.0,0.0],axis=1) # doctest: +SKIP
(array([-0.68014479, 4.11038784]), array([ 4.99613833e-01, 1.49986458e-04]))
>>> stats.ttest_1samp(rvs,[[5.0],[0.0]]) # doctest: +SKIP
(array([[-0.68014479, -0.04323899],
       [ 2.77025808, 4.11038784]]), array([[ 4.99613833e-01, 9.65686743e-01],
       [ 7.89094663e-03, 1.49986458e-04]]))
-
dask.array.stats.ttest_rel(a, b, axis=0, nan_policy='propagate')
Calculate the T-test on TWO RELATED samples of scores, a and b.
This docstring was copied from scipy.stats.ttest_rel.
Some inconsistencies with the Dask version may exist.
This is a two-sided test for the null hypothesis that 2 related or repeated samples have identical average (expected) values.
Parameters: a, b : array_like
The arrays must have the same shape.
axis : int or None, optional
Axis along which to compute test. If None, compute over the whole arrays, a, and b.
nan_policy : {‘propagate’, ‘raise’, ‘omit’}, optional
Defines how to handle when input contains nan. ‘propagate’ returns nan, ‘raise’ throws an error, ‘omit’ performs the calculations ignoring nan values. Default is ‘propagate’.
Returns: statistic : float or array
t-statistic
pvalue : float or array
two-tailed p-value
Notes
Typical uses are scores of the same set of students in different exams, or repeated sampling from the same units. The test measures whether the average score differs significantly across samples (e.g. exams). If we observe a large p-value, for example greater than 0.05 or 0.1, then we cannot reject the null hypothesis of identical average scores. If the p-value is smaller than the threshold, e.g. 1%, 5% or 10%, then we reject the null hypothesis of equal averages. Small p-values are associated with large t-statistics.
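One way to see what the paired test computes: it is a one-sample t-test applied to the element-wise differences. The sketch below uses that standard identity with the standard library only; the helper names and the exam scores are invented for illustration, and the p-value is omitted.

```python
import math
from statistics import mean, stdev

def one_sample_t(values, popmean=0.0):
    """t = (sample mean - popmean) / (sample std / sqrt(n))."""
    n = len(values)
    return (mean(values) - popmean) / (stdev(values) / math.sqrt(n))

def paired_t(a, b):
    """Paired t-statistic: one-sample test of the differences against 0."""
    differences = [x - y for x, y in zip(a, b)]
    return one_sample_t(differences, 0.0)

exam1 = [72.0, 65.0, 88.0, 91.0, 54.0]  # made-up scores, first exam
exam2 = [75.0, 70.0, 85.0, 95.0, 60.0]  # same students, second exam
t_stat = paired_t(exam1, exam2)
print(t_stat)
```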
References
https://en.wikipedia.org/wiki/T-test#Dependent_t-test_for_paired_samples
Examples
>>> from scipy import stats # doctest: +SKIP
>>> np.random.seed(12345678) # fix random seed to get same numbers # doctest: +SKIP
>>> rvs1 = stats.norm.rvs(loc=5,scale=10,size=500) # doctest: +SKIP
>>> rvs2 = (stats.norm.rvs(loc=5,scale=10,size=500) + # doctest: +SKIP
...         stats.norm.rvs(scale=0.2,size=500))
>>> stats.ttest_rel(rvs1,rvs2) # doctest: +SKIP
(0.24101764965300962, 0.80964043445811562)
>>> rvs3 = (stats.norm.rvs(loc=8,scale=10,size=500) + # doctest: +SKIP
...         stats.norm.rvs(scale=0.2,size=500))
>>> stats.ttest_rel(rvs1,rvs3) # doctest: +SKIP
(-3.9995108708727933, 7.3082402191726459e-005)
-
dask.array.stats.chisquare(f_obs, f_exp=None, ddof=0, axis=0)
Calculate a one-way chi square test.
This docstring was copied from scipy.stats.chisquare.
Some inconsistencies with the Dask version may exist.
The chi square test tests the null hypothesis that the categorical data has the given frequencies.
Parameters: f_obs : array_like
Observed frequencies in each category.
f_exp : array_like, optional
Expected frequencies in each category. By default the categories are assumed to be equally likely.
ddof : int, optional
“Delta degrees of freedom”: adjustment to the degrees of freedom for the p-value. The p-value is computed using a chi-squared distribution with k - 1 - ddof degrees of freedom, where k is the number of observed frequencies. The default value of ddof is 0.
axis : int or None, optional
The axis of the broadcast result of f_obs and f_exp along which to apply the test. If axis is None, all values in f_obs are treated as a single data set. Default is 0.
Returns: chisq : float or ndarray
The chi-squared test statistic. The value is a float if axis is None or f_obs and f_exp are 1-D.
p : float or ndarray
The p-value of the test. The value is a float if ddof and the return value chisq are scalars.
See also
scipy.stats.power_divergence
Notes
This test is invalid when the observed or expected frequencies in each category are too small. A typical rule is that all of the observed and expected frequencies should be at least 5.
The default degrees of freedom, k-1, are for the case when no parameters of the distribution are estimated. If p parameters are estimated by efficient maximum likelihood then the correct degrees of freedom are k-1-p. If the parameters are estimated in a different way, then the dof can be between k-1-p and k-1. However, it is also possible that the asymptotic distribution is not a chisquare, in which case this test is not appropriate.
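For orientation, the test statistic itself is Pearson's sum of (observed - expected)^2 / expected. The sketch below reproduces the statistics 2.0 and 3.5 quoted in this function's examples; `chisquare_statistic` is an illustrative helper, not the SciPy or Dask function, and it does not compute the p-value.

```python
def chisquare_statistic(f_obs, f_exp=None):
    """Pearson chi-squared statistic for 1-D frequency lists."""
    if f_exp is None:
        # Uniform expectation: every category gets the mean observed count.
        expected = sum(f_obs) / len(f_obs)
        f_exp = [expected] * len(f_obs)
    return sum((o - e) ** 2 / e for o, e in zip(f_obs, f_exp))

obs = [16, 18, 16, 14, 12, 12]
stat_uniform = chisquare_statistic(obs)
stat_given = chisquare_statistic(obs, f_exp=[16, 16, 16, 16, 16, 8])
print(stat_uniform, stat_given)
```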
References
[R210] Lowry, Richard. “Concepts and Applications of Inferential Statistics”. Chapter 8. https://web.archive.org/web/20171022032306/http://vassarstats.net:80/textbook/ch8pt1.html
[R211] “Chi-squared test”, https://en.wikipedia.org/wiki/Chi-squared_test
Examples
When just f_obs is given, it is assumed that the expected frequencies are uniform and given by the mean of the observed frequencies.
>>> from scipy.stats import chisquare # doctest: +SKIP
>>> chisquare([16, 18, 16, 14, 12, 12]) # doctest: +SKIP
(2.0, 0.84914503608460956)
With f_exp the expected frequencies can be given.
>>> chisquare([16, 18, 16, 14, 12, 12], f_exp=[16, 16, 16, 16, 16, 8]) # doctest: +SKIP
(3.5, 0.62338762774958223)
When f_obs is 2-D, by default the test is applied to each column.
>>> obs = np.array([[16, 18, 16, 14, 12, 12], [32, 24, 16, 28, 20, 24]]).T # doctest: +SKIP
>>> obs.shape # doctest: +SKIP
(6, 2)
>>> chisquare(obs) # doctest: +SKIP
(array([ 2. , 6.66666667]), array([ 0.84914504, 0.24663415]))
By setting axis=None, the test is applied to all data in the array, which is equivalent to applying the test to the flattened array.
>>> chisquare(obs, axis=None) # doctest: +SKIP
(23.31034482758621, 0.015975692534127565)
>>> chisquare(obs.ravel()) # doctest: +SKIP
(23.31034482758621, 0.015975692534127565)
ddof is the change to make to the default degrees of freedom.
>>> chisquare([16, 18, 16, 14, 12, 12], ddof=1) # doctest: +SKIP
(2.0, 0.73575888234288467)
The calculation of the p-values is done by broadcasting the chi-squared statistic with ddof.
>>> chisquare([16, 18, 16, 14, 12, 12], ddof=[0,1,2]) # doctest: +SKIP
(2.0, array([ 0.84914504, 0.73575888, 0.5724067 ]))
f_obs and f_exp are also broadcast. In the following, f_obs has shape (6,) and f_exp has shape (2, 6), so the result of broadcasting f_obs and f_exp has shape (2, 6). To compute the desired chi-squared statistics, we use axis=1:
>>> chisquare([16, 18, 16, 14, 12, 12], # doctest: +SKIP
...           f_exp=[[16, 16, 16, 16, 16, 8], [8, 20, 20, 16, 12, 12]],
...           axis=1)
(array([ 3.5 , 9.25]), array([ 0.62338763, 0.09949846]))
-
dask.array.stats.power_divergence(f_obs, f_exp=None, ddof=0, axis=0, lambda_=None)
Cressie-Read power divergence statistic and goodness of fit test.
This docstring was copied from scipy.stats.power_divergence.
Some inconsistencies with the Dask version may exist.
This function tests the null hypothesis that the categorical data has the given frequencies, using the Cressie-Read power divergence statistic.
Parameters: f_obs : array_like
Observed frequencies in each category.
f_exp : array_like, optional
Expected frequencies in each category. By default the categories are assumed to be equally likely.
ddof : int, optional
“Delta degrees of freedom”: adjustment to the degrees of freedom for the p-value. The p-value is computed using a chi-squared distribution with k - 1 - ddof degrees of freedom, where k is the number of observed frequencies. The default value of ddof is 0.
axis : int or None, optional
The axis of the broadcast result of f_obs and f_exp along which to apply the test. If axis is None, all values in f_obs are treated as a single data set. Default is 0.
lambda_ : float or str, optional
lambda_ gives the power in the Cressie-Read power divergence statistic. The default is 1. For convenience, lambda_ may be assigned one of the following strings, in which case the corresponding numerical value is used:
String                 Value   Description
"pearson"               1      Pearson's chi-squared statistic. In this case, the function is equivalent to `stats.chisquare`.
"log-likelihood"        0      Log-likelihood ratio. Also known as the G-test [R214]_.
"freeman-tukey"        -1/2    Freeman-Tukey statistic.
"mod-log-likelihood"   -1      Modified log-likelihood ratio.
"neyman"               -2      Neyman's statistic.
"cressie-read"          2/3    The power recommended in [R216]_.
Returns: statistic : float or ndarray
The Cressie-Read power divergence test statistic. The value is a float if axis is None or if f_obs and f_exp are 1-D.
pvalue : float or ndarray
The p-value of the test. The value is a float if ddof and the return value stat are scalars.
Notes
This test is invalid when the observed or expected frequencies in each category are too small. A typical rule is that all of the observed and expected frequencies should be at least 5.
When lambda_ is less than zero, the formula for the statistic involves dividing by f_obs, so a warning or error may be generated if any value in f_obs is 0.
Similarly, a warning or error may be generated if any value in f_exp is zero when lambda_ >= 0.
The default degrees of freedom, k-1, are for the case when no parameters of the distribution are estimated. If p parameters are estimated by efficient maximum likelihood then the correct degrees of freedom are k-1-p. If the parameters are estimated in a different way, then the dof can be between k-1-p and k-1. However, it is also possible that the asymptotic distribution is not a chisquare, in which case this test is not appropriate.
This function handles masked arrays. If an element of f_obs or f_exp is masked, then data at that position is ignored, and does not count towards the size of the data set.
New in version 0.13.0.
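The statistic behind each lambda_ value can be sketched in a few lines. The formula used here, 2 / (lambda (lambda + 1)) * sum(obs * ((obs / exp)^lambda - 1)) with logarithmic limits at lambda = 0 and lambda = -1, is the usual Cressie-Read definition and is an assumption in the sense that this docstring does not state it; `power_divergence_statistic` is an illustrative name and no p-value is computed.

```python
import math

def power_divergence_statistic(f_obs, f_exp=None, lambda_=1.0):
    """Cressie-Read power divergence statistic for 1-D frequency lists."""
    if f_exp is None:
        # Uniform expectation: the mean of the observed frequencies.
        f_exp = [sum(f_obs) / len(f_obs)] * len(f_obs)
    pairs = list(zip(f_obs, f_exp))
    if lambda_ == 0:   # log-likelihood ratio (G-test) limit
        return 2.0 * sum(o * math.log(o / e) for o, e in pairs)
    if lambda_ == -1:  # modified log-likelihood ratio limit
        return 2.0 * sum(e * math.log(e / o) for o, e in pairs)
    total = sum(o * ((o / e) ** lambda_ - 1.0) for o, e in pairs)
    return 2.0 * total / (lambda_ * (lambda_ + 1.0))

obs = [16, 18, 16, 14, 12, 12]
pearson = power_divergence_statistic(obs, lambda_=1.0)
g_test = power_divergence_statistic(obs, lambda_=0)
print(pearson, g_test)
```

The two printed values should agree with the statistics 2.0 and 2.006573162632538 shown in the chisquare and power_divergence examples.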
References
[R212] Lowry, Richard. “Concepts and Applications of Inferential Statistics”. Chapter 8. https://web.archive.org/web/20171015035606/http://faculty.vassar.edu/lowry/ch8pt1.html
[R213] “Chi-squared test”, https://en.wikipedia.org/wiki/Chi-squared_test
[R214] “G-test”, https://en.wikipedia.org/wiki/G-test
[R215] Sokal, R. R. and Rohlf, F. J. “Biometry: the principles and practice of statistics in biological research”, New York: Freeman (1981)
[R216] Cressie, N. and Read, T. R. C., “Multinomial Goodness-of-Fit Tests”, J. Royal Stat. Soc. Series B, Vol. 46, No. 3 (1984), pp. 440-464.
Examples
(See chisquare for more examples.)
When just f_obs is given, it is assumed that the expected frequencies are uniform and given by the mean of the observed frequencies. Here we perform a G-test (i.e. use the log-likelihood ratio statistic):
>>> from scipy.stats import power_divergence # doctest: +SKIP
>>> power_divergence([16, 18, 16, 14, 12, 12], lambda_='log-likelihood') # doctest: +SKIP
(2.006573162632538, 0.84823476779463769)
The expected frequencies can be given with the f_exp argument:
>>> power_divergence([16, 18, 16, 14, 12, 12], # doctest: +SKIP
...                  f_exp=[16, 16, 16, 16, 16, 8],
...                  lambda_='log-likelihood')
(3.3281031458963746, 0.6495419288047497)
When f_obs is 2-D, by default the test is applied to each column.
>>> obs = np.array([[16, 18, 16, 14, 12, 12], [32, 24, 16, 28, 20, 24]]).T # doctest: +SKIP
>>> obs.shape # doctest: +SKIP
(6, 2)
>>> power_divergence(obs, lambda_="log-likelihood") # doctest: +SKIP
(array([ 2.00657316, 6.77634498]), array([ 0.84823477, 0.23781225]))
By setting axis=None, the test is applied to all data in the array, which is equivalent to applying the test to the flattened array.
>>> power_divergence(obs, axis=None) # doctest: +SKIP
(23.31034482758621, 0.015975692534127565)
>>> power_divergence(obs.ravel()) # doctest: +SKIP
(23.31034482758621, 0.015975692534127565)
ddof is the change to make to the default degrees of freedom.
>>> power_divergence([16, 18, 16, 14, 12, 12], ddof=1) # doctest: +SKIP
(2.0, 0.73575888234288467)
The calculation of the p-values is done by broadcasting the test statistic with ddof.
>>> power_divergence([16, 18, 16, 14, 12, 12], ddof=[0,1,2]) # doctest: +SKIP
(2.0, array([ 0.84914504, 0.73575888, 0.5724067 ]))
f_obs and f_exp are also broadcast. In the following, f_obs has shape (6,) and f_exp has shape (2, 6), so the result of broadcasting f_obs and f_exp has shape (2, 6). To compute the desired chi-squared statistics, we must use axis=1:
>>> power_divergence([16, 18, 16, 14, 12, 12], # doctest: +SKIP
...                  f_exp=[[16, 16, 16, 16, 16, 8],
...                         [8, 20, 20, 16, 12, 12]],
...                  axis=1)
(array([ 3.5 , 9.25]), array([ 0.62338763, 0.09949846]))
-
dask.array.stats.skew(a, axis=0, bias=True, nan_policy='propagate')
Compute the sample skewness of a data set.
This docstring was copied from scipy.stats.skew.
Some inconsistencies with the Dask version may exist.
For normally distributed data, the skewness should be about 0. For unimodal continuous distributions, a skewness value > 0 means that there is more weight in the right tail of the distribution. The function skewtest can be used to determine if the skewness value is close enough to 0, statistically speaking.
Parameters: a : ndarray
data
axis : int or None, optional
Axis along which skewness is calculated. Default is 0. If None, compute over the whole array a.
bias : bool, optional
If False, then the calculations are corrected for statistical bias.
nan_policy : {‘propagate’, ‘raise’, ‘omit’}, optional
Defines how to handle when input contains nan. ‘propagate’ returns nan, ‘raise’ throws an error, ‘omit’ performs the calculations ignoring nan values. Default is ‘propagate’.
Returns: skewness : ndarray
The skewness of values along an axis, returning 0 where all values are equal.
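The Fisher-Pearson coefficients defined in the Notes for this function (g_1 = m_3 / m_2^(3/2), with the sqrt(N(N-1)) / (N-2) correction when bias is False) can be checked by hand. This is a plain-Python sketch with illustrative names; unlike the documented function it does not special-case constant input, where m_2 is zero.

```python
import math

def central_moment(data, i):
    """Biased i-th central moment: mean of (x - mean)^i."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** i for x in data) / n

def skewness(data, bias=True):
    """Fisher-Pearson skewness g1, or the adjusted G1 when bias is False."""
    n = len(data)
    m2, m3 = central_moment(data, 2), central_moment(data, 3)
    g1 = m3 / m2 ** 1.5
    if bias:
        return g1
    return math.sqrt(n * (n - 1)) / (n - 2) * g1

symmetric = skewness([1, 2, 3, 4, 5])
right_tailed = skewness([2, 8, 0, 4, 1, 9, 9, 0])
print(symmetric, right_tailed)
```

The two results should match the values 0.0 and 0.2650554122698573 shown in this function's examples.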
Notes
The sample skewness is computed as the Fisher-Pearson coefficient of skewness, i.e.
\[g_1=\frac{m_3}{m_2^{3/2}}\]
where
\[m_i=\frac{1}{N}\sum_{n=1}^N(x[n]-\bar{x})^i\]
is the biased sample \(i\texttt{th}\) central moment, and \(\bar{x}\) is the sample mean. If bias is False, the calculations are corrected for bias and the value computed is the adjusted Fisher-Pearson standardized moment coefficient, i.e.
\[G_1=\frac{k_3}{k_2^{3/2}}= \frac{\sqrt{N(N-1)}}{N-2}\frac{m_3}{m_2^{3/2}}.\]
References
[R217] Zwillinger, D. and Kokoska, S. (2000). CRC Standard Probability and Statistics Tables and Formulae. Chapman & Hall: New York. 2000. Section 2.2.24.1
Examples
>>> from scipy.stats import skew # doctest: +SKIP
>>> skew([1, 2, 3, 4, 5]) # doctest: +SKIP
0.0
>>> skew([2, 8, 0, 4, 1, 9, 9, 0]) # doctest: +SKIP
0.2650554122698573
-
dask.array.stats.skewtest(a, axis=0, nan_policy='propagate')
Test whether the skew is different from the normal distribution.
This docstring was copied from scipy.stats.skewtest.
Some inconsistencies with the Dask version may exist.
This function tests the null hypothesis that the skewness of the population that the sample was drawn from is the same as that of a corresponding normal distribution.
Parameters: a : array
The data to be tested
axis : int or None, optional
Axis along which statistics are calculated. Default is 0. If None, compute over the whole array a.
nan_policy : {‘propagate’, ‘raise’, ‘omit’}, optional
Defines how to handle when input contains nan. ‘propagate’ returns nan, ‘raise’ throws an error, ‘omit’ performs the calculations ignoring nan values. Default is ‘propagate’.
Returns: statistic : float
The computed z-score for this test.
pvalue : float
a 2-sided p-value for the hypothesis test
Notes
The sample size must be at least 8.
References
[R218] R. B. D’Agostino, A. J. Belanger and R. B. D’Agostino Jr., “A suggestion for using powerful and informative tests of normality”, American Statistician 44, pp. 316-321, 1990.
Examples
>>> from scipy.stats import skewtest # doctest: +SKIP
>>> skewtest([1, 2, 3, 4, 5, 6, 7, 8]) # doctest: +SKIP
SkewtestResult(statistic=1.0108048609177787, pvalue=0.3121098361421897)
>>> skewtest([2, 8, 0, 4, 1, 9, 9, 0]) # doctest: +SKIP
SkewtestResult(statistic=0.44626385374196975, pvalue=0.6554066631275459)
>>> skewtest([1, 2, 3, 4, 5, 6, 7, 8000]) # doctest: +SKIP
SkewtestResult(statistic=3.571773510360407, pvalue=0.0003545719905823133)
>>> skewtest([100, 100, 100, 100, 100, 100, 100, 101]) # doctest: +SKIP
SkewtestResult(statistic=3.5717766638478072, pvalue=0.000354567720281634)
-
dask.array.stats.kurtosis(a, axis=0, fisher=True, bias=True, nan_policy='propagate')
Compute the kurtosis (Fisher or Pearson) of a dataset.
This docstring was copied from scipy.stats.kurtosis.
Some inconsistencies with the Dask version may exist.
Kurtosis is the fourth central moment divided by the square of the variance. If Fisher’s definition is used, then 3.0 is subtracted from the result to give 0.0 for a normal distribution.
If bias is False, then the kurtosis is calculated using k statistics to eliminate bias coming from biased moment estimators.
Use kurtosistest to see if result is close enough to normal.
Parameters: a : array
data for which the kurtosis is calculated
axis : int or None, optional
Axis along which the kurtosis is calculated. Default is 0. If None, compute over the whole array a.
fisher : bool, optional
If True, Fisher’s definition is used (normal ==> 0.0). If False, Pearson’s definition is used (normal ==> 3.0).
bias : bool, optional
If False, then the calculations are corrected for statistical bias.
nan_policy : {‘propagate’, ‘raise’, ‘omit’}, optional
Defines how to handle when input contains nan. ‘propagate’ returns nan, ‘raise’ throws an error, ‘omit’ performs the calculations ignoring nan values. Default is ‘propagate’.
Returns: kurtosis : array
The kurtosis of values along an axis. If all values are equal, return -3 for Fisher’s definition and 0 for Pearson’s definition.
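The definition above (fourth central moment divided by the squared variance, minus 3 under Fisher's convention) can be written out directly. `kurtosis_biased` is an illustrative helper covering only the bias=True case; it does not implement the k-statistics correction or the constant-input special case.

```python
def kurtosis_biased(data, fisher=True):
    """Biased kurtosis m4 / m2^2, minus 3 when Fisher's definition is used."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # biased variance
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    pearson = m4 / m2 ** 2
    return pearson - 3.0 if fisher else pearson

fisher_value = kurtosis_biased([1, 2, 3, 4, 5])
pearson_value = kurtosis_biased([1, 2, 3, 4, 5], fisher=False)
print(fisher_value, pearson_value)
```

For this input the Fisher value should match the -1.3 shown in this function's example.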
References
[R219] Zwillinger, D. and Kokoska, S. (2000). CRC Standard Probability and Statistics Tables and Formulae. Chapman & Hall: New York. 2000.
Examples
>>> from scipy.stats import kurtosis # doctest: +SKIP
>>> kurtosis([1, 2, 3, 4, 5]) # doctest: +SKIP
-1.3
-
dask.array.stats.kurtosistest(a, axis=0, nan_policy='propagate')
Test whether a dataset has normal kurtosis.
This docstring was copied from scipy.stats.kurtosistest.
Some inconsistencies with the Dask version may exist.
This function tests the null hypothesis that the kurtosis of the population from which the sample was drawn is that of the normal distribution: kurtosis = 3(n-1)/(n+1).
Parameters: a : array
array of the sample data
axis : int or None, optional
Axis along which to compute test. Default is 0. If None, compute over the whole array a.
nan_policy : {‘propagate’, ‘raise’, ‘omit’}, optional
Defines how to handle when input contains nan. ‘propagate’ returns nan, ‘raise’ throws an error, ‘omit’ performs the calculations ignoring nan values. Default is ‘propagate’.
Returns: statistic : float
The computed z-score for this test.
pvalue : float
The 2-sided p-value for the hypothesis test
Notes
Valid only for n>20. This function uses the method described in [R220].
References
[R220] See e.g. F. J. Anscombe, W. J. Glynn, “Distribution of the kurtosis statistic b2 for normal samples”, Biometrika, vol. 70, pp. 227-234, 1983.
Examples
>>> from scipy.stats import kurtosistest # doctest: +SKIP
>>> kurtosistest(list(range(20))) # doctest: +SKIP
KurtosistestResult(statistic=-1.7058104152122062, pvalue=0.08804338332528348)
>>> np.random.seed(28041990) # doctest: +SKIP
>>> s = np.random.normal(0, 1, 1000) # doctest: +SKIP
>>> kurtosistest(s) # doctest: +SKIP
KurtosistestResult(statistic=1.2317590987707365, pvalue=0.21803908613450895)
-
dask.array.stats.normaltest(a, axis=0, nan_policy='propagate')
Test whether a sample differs from a normal distribution.
This docstring was copied from scipy.stats.normaltest.
Some inconsistencies with the Dask version may exist.
This function tests the null hypothesis that a sample comes from a normal distribution. It is based on D’Agostino and Pearson’s [R221], [R222] test that combines skew and kurtosis to produce an omnibus test of normality.
Parameters: a : array_like
The array containing the sample to be tested.
axis : int or None, optional
Axis along which to compute test. Default is 0. If None, compute over the whole array a.
nan_policy : {‘propagate’, ‘raise’, ‘omit’}, optional
Defines how to handle when input contains nan. ‘propagate’ returns nan, ‘raise’ throws an error, ‘omit’ performs the calculations ignoring nan values. Default is ‘propagate’.
Returns: statistic : float or array
s^2 + k^2, where s is the z-score returned by skewtest and k is the z-score returned by kurtosistest.
pvalue : float or array
A 2-sided chi squared probability for the hypothesis test.
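A rough sketch of how the two z-scores combine, assuming the reference distribution is chi-squared with two degrees of freedom (one per z-score), in which case the survival function has the closed form exp(-x / 2). The two-degree-of-freedom choice is an assumption about the underlying implementation rather than something stated in this docstring, and `omnibus_statistic` is an illustrative name.

```python
import math

def omnibus_statistic(s, k):
    """Combine a skewness z-score s and a kurtosis z-score k."""
    k2 = s ** 2 + k ** 2
    # Survival function of a chi-squared variable with 2 degrees of freedom.
    pvalue = math.exp(-k2 / 2.0)
    return k2, pvalue

print(omnibus_statistic(0.0, 0.0))  # no departure from normality in either score
print(omnibus_statistic(3.0, 4.0))  # large z-scores give a tiny p-value
```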
References
[R221] D’Agostino, R. B. (1971), “An omnibus test of normality for moderate and large sample size”, Biometrika, 58, 341-348
[R222] D’Agostino, R. and Pearson, E. S. (1973), “Tests for departure from normality”, Biometrika, 60, 613-622
Examples
>>> from scipy import stats # doctest: +SKIP
>>> pts = 1000 # doctest: +SKIP
>>> np.random.seed(28041990) # doctest: +SKIP
>>> a = np.random.normal(0, 1, size=pts) # doctest: +SKIP
>>> b = np.random.normal(2, 1, size=pts) # doctest: +SKIP
>>> x = np.concatenate((a, b)) # doctest: +SKIP
>>> k2, p = stats.normaltest(x) # doctest: +SKIP
>>> alpha = 1e-3 # doctest: +SKIP
>>> print("p = {:g}".format(p)) # doctest: +SKIP
p = 3.27207e-11
>>> if p < alpha: # null hypothesis: x comes from a normal distribution # doctest: +SKIP
...     print("The null hypothesis can be rejected")
... else:
...     print("The null hypothesis cannot be rejected")
The null hypothesis can be rejected
-
dask.array.stats.f_oneway(*args)
Performs a 1-way ANOVA.
This docstring was copied from scipy.stats.f_oneway.
Some inconsistencies with the Dask version may exist.
The one-way ANOVA tests the null hypothesis that two or more groups have the same population mean. The test is applied to samples from two or more groups, possibly with differing sizes.
Parameters: sample1, sample2, … : array_like
The sample measurements for each group.
Returns: statistic : float
The computed F-value of the test.
pvalue : float
The associated p-value from the F-distribution.
Notes
The ANOVA test has important assumptions that must be satisfied in order for the associated p-value to be valid.
- The samples are independent.
- Each sample is from a normally distributed population.
- The population standard deviations of the groups are all equal. This property is known as homoscedasticity.
If these assumptions are not true for a given set of data, it may still be possible to use the Kruskal-Wallis H-test (scipy.stats.kruskal) although with some loss of power.
The algorithm is from Heiman [R224], pp. 394-7.
References
[R223] R. Lowry, “Concepts and Applications of Inferential Statistics”, Chapter 14, 2014, http://vassarstats.net/textbook/
[R224] G.W. Heiman, “Understanding research methods and statistics: An integrated introduction for psychology”, Houghton, Mifflin and Company, 2001.
[R225] G.H. McDonald, “Handbook of Biological Statistics”, One-way ANOVA. http://www.biostathandbook.com/onewayanova.html
Examples
>>> import scipy.stats as stats # doctest: +SKIP
[R225] Here are some data on a shell measurement (the length of the anterior adductor muscle scar, standardized by dividing by length) in the mussel Mytilus trossulus from five locations: Tillamook, Oregon; Newport, Oregon; Petersburg, Alaska; Magadan, Russia; and Tvarminne, Finland, taken from a much larger data set used in McDonald et al. (1991).
>>> tillamook = [0.0571, 0.0813, 0.0831, 0.0976, 0.0817, 0.0859, 0.0735, # doctest: +SKIP
...              0.0659, 0.0923, 0.0836]
>>> newport = [0.0873, 0.0662, 0.0672, 0.0819, 0.0749, 0.0649, 0.0835, # doctest: +SKIP
...            0.0725]
>>> petersburg = [0.0974, 0.1352, 0.0817, 0.1016, 0.0968, 0.1064, 0.105] # doctest: +SKIP
>>> magadan = [0.1033, 0.0915, 0.0781, 0.0685, 0.0677, 0.0697, 0.0764, # doctest: +SKIP
...            0.0689]
>>> tvarminne = [0.0703, 0.1026, 0.0956, 0.0973, 0.1039, 0.1045] # doctest: +SKIP
>>> stats.f_oneway(tillamook, newport, petersburg, magadan, tvarminne) # doctest: +SKIP
(7.1210194716424473, 0.00028122423145345439)
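To make the F-value concrete, here is a minimal pure-Python sketch of the textbook one-way ANOVA statistic (between-group mean square divided by within-group mean square). The helper name one_way_anova_f is hypothetical and is not part of SciPy or Dask; it only illustrates the quantity that f_oneway reports and does not compute the p-value.

```python
def one_way_anova_f(*groups):
    """Return the one-way ANOVA F statistic for two or more samples."""
    k = len(groups)                          # number of groups
    n_total = sum(len(g) for g in groups)    # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n_total

    ss_between = 0.0   # variation of group means around the grand mean
    ss_within = 0.0    # variation of observations around their group mean
    for g in groups:
        mean = sum(g) / len(g)
        ss_between += len(g) * (mean - grand_mean) ** 2
        ss_within += sum((x - mean) ** 2 for x in g)

    ms_between = ss_between / (k - 1)        # k - 1 degrees of freedom
    ms_within = ss_within / (n_total - k)    # N - k degrees of freedom
    return ms_between / ms_within


tillamook = [0.0571, 0.0813, 0.0831, 0.0976, 0.0817, 0.0859, 0.0735,
             0.0659, 0.0923, 0.0836]
newport = [0.0873, 0.0662, 0.0672, 0.0819, 0.0749, 0.0649, 0.0835, 0.0725]
petersburg = [0.0974, 0.1352, 0.0817, 0.1016, 0.0968, 0.1064, 0.105]
magadan = [0.1033, 0.0915, 0.0781, 0.0685, 0.0677, 0.0697, 0.0764, 0.0689]
tvarminne = [0.0703, 0.1026, 0.0956, 0.0973, 0.1039, 0.1045]

f_value = one_way_anova_f(tillamook, newport, petersburg, magadan, tvarminne)
print(round(f_value, 4))
```

The printed value should agree with the statistic in the example above, up to floating-point rounding.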
-
dask.array.stats.
moment
(a, moment=1, axis=0, nan_policy='propagate')¶ Calculate the nth moment about the mean for a sample.
This docstring was copied from scipy.stats.moment.
Some inconsistencies with the Dask version may exist.
A moment is a specific quantitative measure of the shape of a set of points. It is often used to calculate coefficients of skewness and kurtosis due to its close relationship with them.
Parameters: a : array_like
data
moment : int or array_like of ints, optional
order of central moment that is returned. Default is 1.
axis : int or None, optional
Axis along which the central moment is computed. Default is 0. If None, compute over the whole array a.
nan_policy : {‘propagate’, ‘raise’, ‘omit’}, optional
Defines how to handle when input contains nan. ‘propagate’ returns nan, ‘raise’ throws an error, ‘omit’ performs the calculations ignoring nan values. Default is ‘propagate’.
Returns: n-th central moment : ndarray or float
The appropriate moment along the given axis or over all values if axis is None. The denominator for the moment calculation is the number of observations, no degrees of freedom correction is done.
Notes
The k-th central moment of a data sample is:
\[m_k = \frac{1}{n} \sum_{i = 1}^n (x_i - \bar{x})^k\]
Where n is the number of samples and x-bar is the mean. This function uses exponentiation by squares [R226] for efficiency.
References
[R226] https://eli.thegreenplace.net/2009/03/21/efficient-integer-exponentiation-algorithms
Examples
>>> from scipy.stats import moment # doctest: +SKIP
>>> moment([1, 2, 3, 4, 5], moment=1) # doctest: +SKIP
0.0
>>> moment([1, 2, 3, 4, 5], moment=2) # doctest: +SKIP
2.0
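The formula in the Notes can be sketched in plain Python. The two helpers below (power_by_squaring and central_moment) are hypothetical names for illustration only; the real implementation works on whole arrays, while this sketch handles a flat list of numbers.

```python
def power_by_squaring(base, exponent):
    """Raise base to a non-negative integer exponent in O(log exponent) multiplications."""
    result = 1.0
    while exponent > 0:
        if exponent & 1:        # lowest bit set: fold the current base into the result
            result *= base
        base *= base            # square the base for the next bit
        exponent >>= 1
    return result


def central_moment(values, k=1):
    """k-th central moment: mean of (x_i - mean) ** k, dividing by n."""
    n = len(values)
    mean = sum(values) / n
    return sum(power_by_squaring(x - mean, k) for x in values) / n


print(central_moment([1, 2, 3, 4, 5], k=1))  # 0.0
print(central_moment([1, 2, 3, 4, 5], k=2))  # 2.0
```

The denominator is n, with no degrees-of-freedom correction, which matches the description of the return value above.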
-
dask.array.image.
imread
(filename, imread=None, preprocess=None)¶ Read a stack of images into a dask array
Parameters: filename: string
A globstring like ‘myfile.*.png’
imread: function (optional)
Optionally provide custom imread function. Function should expect a filename and produce a numpy array. Defaults to skimage.io.imread.
preprocess: function (optional)
Optionally provide custom function to preprocess the image. Function should expect a numpy array for a single image.
Returns: Dask array of all images stacked along the first dimension. All images will be treated as individual chunks.
Examples
>>> from dask.array.image import imread
>>> im = imread('2015-*-*.png') # doctest: +SKIP
>>> im.shape # doctest: +SKIP
(365, 1000, 1000, 3)
-
dask.array.gufunc.
apply_gufunc
(func, signature, *args, **kwargs)¶ Apply a generalized ufunc or similar python function to arrays.
signature determines if the function consumes or produces core dimensions. The remaining dimensions in given input arrays (*args) are considered loop dimensions and are required to broadcast naturally against each other.
In other terms, this function is like np.vectorize, but for the blocks of dask arrays. If the function itself shall also be vectorized use vectorize=True for convenience.
Parameters: func : callable
Function to call like func(*args, **kwargs) on input arrays (*args) that returns an array or tuple of arrays. If multiple arguments with non-matching dimensions are supplied, this function is expected to vectorize (broadcast) over axes of positional arguments in the style of NumPy universal functions [R227] (if this is not the case, set vectorize=True). If this function returns multiple outputs, output_core_dims has to be set as well.
signature: string
Specifies what core dimensions are consumed and produced by func. According to the specification of numpy.gufunc signature [R228]
*args : numeric
Input arrays or scalars to the callable function.
axes: List of tuples, optional, keyword only
A list of tuples with indices of axes a generalized ufunc should operate on. For instance, for a signature of "(i,j),(j,k)->(i,k)" appropriate for matrix multiplication, the base elements are two-dimensional matrices and these are taken to be stored in the two last axes of each argument. The corresponding axes keyword would be [(-2, -1), (-2, -1), (-2, -1)]. For simplicity, for generalized ufuncs that operate on 1-dimensional arrays (vectors), a single integer is accepted instead of a single-element tuple, and for generalized ufuncs for which all outputs are scalars, the output tuples can be omitted.
axis: int, optional, keyword only
A single axis over which a generalized ufunc should operate. This is a short-cut for ufuncs that operate over a single, shared core dimension, equivalent to passing in axes with entries of (axis,) for each single-core-dimension argument and () for all others. For instance, for a signature "(i),(i)->()", it is equivalent to passing in axes=[(axis,), (axis,), ()].
keepdims: bool, optional, keyword only
If this is set to True, axes which are reduced over will be left in the result as a dimension with size one, so that the result will broadcast correctly against the inputs. This option can only be used for generalized ufuncs that operate on inputs that all have the same number of core dimensions and with outputs that have no core dimensions, i.e., with signatures like "(i),(i)->()" or "(m,m)->()". If used, the location of the dimensions in the output can be controlled with axes and axis.
output_dtypes : Optional, dtype or list of dtypes, keyword only
Valid numpy dtype specification or list thereof. If not given, a call of func with a small set of data is performed in order to try to automatically determine the output dtypes.
output_sizes : dict, optional, keyword only
Optional mapping from dimension names to sizes for outputs. Only used if new core dimensions (not found on inputs) appear on outputs.
vectorize: bool, keyword only
If set to True, np.vectorize is applied to func for convenience. Defaults to False.
allow_rechunk: Optional, bool, keyword only
Allows rechunking, otherwise chunk sizes need to match and core dimensions are to consist only of one chunk. Warning: enabling this can increase memory usage significantly. Defaults to False.
**kwargs : dict
Extra keyword arguments to pass to func
Returns: Single dask.array.Array or tuple of dask.array.Array
References
[R227] https://docs.scipy.org/doc/numpy/reference/ufuncs.html
[R228] https://docs.scipy.org/doc/numpy/reference/c-api.generalized-ufuncs.html
Examples
>>> import dask.array as da
>>> import numpy as np
>>> def stats(x):
...     return np.mean(x, axis=-1), np.std(x, axis=-1)
>>> a = da.random.normal(size=(10,20,30), chunks=(5, 10, 30))
>>> mean, std = da.apply_gufunc(stats, "(i)->(),()", a)
>>> mean.compute().shape
(10, 20)
>>> def outer_product(x, y):
...     return np.einsum("i,j->ij", x, y)
>>> a = da.random.normal(size=( 20,30), chunks=(10, 30))
>>> b = da.random.normal(size=(10, 1,40), chunks=(5, 1, 40))
>>> c = da.apply_gufunc(outer_product, "(i),(j)->(i,j)", a, b, vectorize=True)
>>> c.compute().shape
(10, 20, 30, 40)
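To see how the loop and core dimensions in the second example combine, the same signature can be tried on in-memory NumPy arrays with np.vectorize, which accepts a signature argument of the same form. This is only a NumPy-level sketch of the shape logic, not a description of how Dask schedules the work across blocks.

```python
import numpy as np


def outer_product(x, y):
    # Only defined for a single pair of 1-D vectors.
    return np.einsum("i,j->ij", x, y)


# The signature marks the last axis of each input as a core dimension;
# all remaining (loop) dimensions are broadcast against each other.
vectorized = np.vectorize(outer_product, signature="(i),(j)->(i,j)")

a = np.random.normal(size=(20, 30))      # loop dims (20,),   core dim i = 30
b = np.random.normal(size=(10, 1, 40))   # loop dims (10, 1), core dim j = 40
c = vectorized(a, b)

print(c.shape)  # (10, 20, 30, 40)
```

The loop dimensions (20,) and (10, 1) broadcast to (10, 20), and the output core dimensions (30, 40) are appended, which gives the shape shown in the Dask example.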
-
dask.array.gufunc.
as_gufunc
(signature=None, **kwargs)¶ Decorator for dask.array.gufunc.
Parameters: signature : String
Specifies what core dimensions are consumed and produced by func. According to the specification of numpy.gufunc signature [R230]
axes: List of tuples, optional, keyword only
A list of tuples with indices of axes a generalized ufunc should operate on. For instance, for a signature of "(i,j),(j,k)->(i,k)" appropriate for matrix multiplication, the base elements are two-dimensional matrices and these are taken to be stored in the two last axes of each argument. The corresponding axes keyword would be [(-2, -1), (-2, -1), (-2, -1)]. For simplicity, for generalized ufuncs that operate on 1-dimensional arrays (vectors), a single integer is accepted instead of a single-element tuple, and for generalized ufuncs for which all outputs are scalars, the output tuples can be omitted.
axis: int, optional, keyword only
A single axis over which a generalized ufunc should operate. This is a short-cut for ufuncs that operate over a single, shared core dimension, equivalent to passing in axes with entries of (axis,) for each single-core-dimension argument and () for all others. For instance, for a signature "(i),(i)->()", it is equivalent to passing in axes=[(axis,), (axis,), ()].
keepdims: bool, optional, keyword only
If this is set to True, axes which are reduced over will be left in the result as a dimension with size one, so that the result will broadcast correctly against the inputs. This option can only be used for generalized ufuncs that operate on inputs that all have the same number of core dimensions and with outputs that have no core dimensions, i.e., with signatures like "(i),(i)->()" or "(m,m)->()". If used, the location of the dimensions in the output can be controlled with axes and axis.
output_dtypes : Optional, dtype or list of dtypes, keyword only
Valid numpy dtype specification or list thereof. If not given, a call of func with a small set of data is performed in order to try to automatically determine the output dtypes.
output_sizes : dict, optional, keyword only
Optional mapping from dimension names to sizes for outputs. Only used if new core dimensions (not found on inputs) appear on outputs.
vectorize: bool, keyword only
If set to True, np.vectorize is applied to func for convenience. Defaults to False.
allow_rechunk: Optional, bool, keyword only
Allows rechunking, otherwise chunk sizes need to match and core dimensions are to consist only of one chunk. Warning: enabling this can increase memory usage significantly. Defaults to False.
Returns: Decorator for pyfunc that itself returns a gufunc.
References
[R229] https://docs.scipy.org/doc/numpy/reference/ufuncs.html
[R230] https://docs.scipy.org/doc/numpy/reference/c-api.generalized-ufuncs.html
Examples
>>> import dask.array as da
>>> import numpy as np
>>> a = da.random.normal(size=(10,20,30), chunks=(5, 10, 30))
>>> @da.as_gufunc("(i)->(),()", output_dtypes=(float, float))
... def stats(x):
...     return np.mean(x, axis=-1), np.std(x, axis=-1)
>>> mean, std = stats(a)
>>> mean.compute().shape
(10, 20)
>>> a = da.random.normal(size=( 20,30), chunks=(10, 30))
>>> b = da.random.normal(size=(10, 1,40), chunks=(5, 1, 40))
>>> @da.as_gufunc("(i),(j)->(i,j)", output_dtypes=float, vectorize=True)
... def outer_product(x, y):
...     return np.einsum("i,j->ij", x, y)
>>> c = outer_product(a, b)
>>> c.compute().shape
(10, 20, 30, 40)
-
dask.array.gufunc.
gufunc
(pyfunc, **kwargs)¶ Binds pyfunc into dask.array.apply_gufunc when called.
Parameters: pyfunc : callable
Function to call like func(*args, **kwargs) on input arrays (*args) that returns an array or tuple of arrays. If multiple arguments with non-matching dimensions are supplied, this function is expected to vectorize (broadcast) over axes of positional arguments in the style of NumPy universal functions [R231] (if this is not the case, set vectorize=True). If this function returns multiple outputs, output_core_dims has to be set as well.
signature : String, keyword only
Specifies what core dimensions are consumed and produced by func. According to the specification of numpy.gufunc signature [R232]
axes: List of tuples, optional, keyword only
A list of tuples with indices of axes a generalized ufunc should operate on. For instance, for a signature of "(i,j),(j,k)->(i,k)" appropriate for matrix multiplication, the base elements are two-dimensional matrices and these are taken to be stored in the two last axes of each argument. The corresponding axes keyword would be [(-2, -1), (-2, -1), (-2, -1)]. For simplicity, for generalized ufuncs that operate on 1-dimensional arrays (vectors), a single integer is accepted instead of a single-element tuple, and for generalized ufuncs for which all outputs are scalars, the output tuples can be omitted.
axis: int, optional, keyword only
A single axis over which a generalized ufunc should operate. This is a short-cut for ufuncs that operate over a single, shared core dimension, equivalent to passing in axes with entries of (axis,) for each single-core-dimension argument and () for all others. For instance, for a signature "(i),(i)->()", it is equivalent to passing in axes=[(axis,), (axis,), ()].
keepdims: bool, optional, keyword only
If this is set to True, axes which are reduced over will be left in the result as a dimension with size one, so that the result will broadcast correctly against the inputs. This option can only be used for generalized ufuncs that operate on inputs that all have the same number of core dimensions and with outputs that have no core dimensions, i.e., with signatures like "(i),(i)->()" or "(m,m)->()". If used, the location of the dimensions in the output can be controlled with axes and axis.
output_dtypes : Optional, dtype or list of dtypes, keyword only
Valid numpy dtype specification or list thereof. If not given, a call of func with a small set of data is performed in order to try to automatically determine the output dtypes.
output_sizes : dict, optional, keyword only
Optional mapping from dimension names to sizes for outputs. Only used if new core dimensions (not found on inputs) appear on outputs.
vectorize: bool, keyword only
If set to True, np.vectorize is applied to func for convenience. Defaults to False.
allow_rechunk: Optional, bool, keyword only
Allows rechunking, otherwise chunk sizes need to match and core dimensions are to consist only of one chunk. Warning: enabling this can increase memory usage significantly. Defaults to False.
Returns: Wrapped function
References
[R231] https://docs.scipy.org/doc/numpy/reference/ufuncs.html
[R232] https://docs.scipy.org/doc/numpy/reference/c-api.generalized-ufuncs.html
Examples
>>> import dask.array as da
>>> import numpy as np
>>> a = da.random.normal(size=(10,20,30), chunks=(5, 10, 30))
>>> def stats(x):
...     return np.mean(x, axis=-1), np.std(x, axis=-1)
>>> gustats = da.gufunc(stats, signature="(i)->(),()", output_dtypes=(float, float))
>>> mean, std = gustats(a)
>>> mean.compute().shape
(10, 20)
>>> a = da.random.normal(size=( 20,30), chunks=(10, 30))
>>> b = da.random.normal(size=(10, 1,40), chunks=(5, 1, 40))
>>> def outer_product(x, y):
...     return np.einsum("i,j->ij", x, y)
>>> guouter_product = da.gufunc(outer_product, signature="(i),(j)->(i,j)", output_dtypes=float, vectorize=True)
>>> c = guouter_product(a, b)
>>> c.compute().shape
(10, 20, 30, 40)
-
dask.array.core.
map_blocks
(func, *args, name=None, token=None, dtype=None, chunks=None, drop_axis=[], new_axis=None, meta=None, **kwargs)¶ Map a function across all blocks of a dask array.
Parameters: func : callable
Function to apply to every block in the array.
args : dask arrays or other objects
dtype : np.dtype, optional
The dtype of the output array. It is recommended to provide this. If not provided, will be inferred by applying the function to a small set of fake data.
chunks : tuple, optional
Chunk shape of resulting blocks if the function does not preserve shape. If not provided, the resulting array is assumed to have the same block structure as the first input array.
drop_axis : number or iterable, optional
Dimensions lost by the function.
new_axis : number or iterable, optional
New dimensions created by the function. Note that these are applied after drop_axis (if present).
token : string, optional
The key prefix to use for the output array. If not provided, will be determined from the function name.
name : string, optional
The key name to use for the output array. Note that this fully specifies the output key name, and must be unique. If not provided, will be determined by a hash of the arguments.
**kwargs :
Other keyword arguments to pass to function. Values must be constants (not dask.arrays)
See also
dask.array.blockwise
- Generalized operation with control over block alignment.
Examples
>>> import dask.array as da
>>> x = da.arange(6, chunks=3)
>>> x.map_blocks(lambda x: x * 2).compute()
array([ 0, 2, 4, 6, 8, 10])
The da.map_blocks function can also accept multiple arrays.
>>> d = da.arange(5, chunks=2)
>>> e = da.arange(5, chunks=2)
>>> f = da.map_blocks(lambda a, b: a + b**2, d, e)
>>> f.compute()
array([ 0, 2, 6, 12, 20])
If the function changes shape of the blocks then you must provide chunks explicitly.
>>> y = x.map_blocks(lambda x: x[::2], chunks=((2, 2),))
You have a bit of freedom in specifying chunks. If all of the output chunk sizes are the same, you can provide just that chunk size as a single tuple.
>>> a = da.arange(18, chunks=(6,))
>>> b = a.map_blocks(lambda x: x[:3], chunks=(3,))
If the function changes the dimension of the blocks you must specify the created or destroyed dimensions.
>>> b = a.map_blocks(lambda x: x[None, :, None], chunks=(1, 6, 1),
...                  new_axis=[0, 2])
If chunks is specified but new_axis is not, then it is inferred to add the necessary number of axes on the left.
Map_blocks aligns blocks by block positions without regard to shape. In the following example we have two arrays with the same number of blocks but with different shape and chunk sizes.
>>> x = da.arange(1000, chunks=(100,))
>>> y = da.arange(100, chunks=(10,))
The relevant attribute to match is numblocks.
>>> x.numblocks
(10,)
>>> y.numblocks
(10,)
If these match (up to broadcasting rules) then we can map arbitrary functions across blocks
>>> def func(a, b):
...     return np.array([a.max(), b.max()])
>>> da.map_blocks(func, x, y, chunks=(2,), dtype='i8')
dask.array<func, shape=(20,), dtype=int64, chunksize=(2,), chunktype=numpy.ndarray>
>>> _.compute()
array([ 99, 9, 199, 19, 299, 29, 399, 39, 499, 49, 599, 59, 699, 69, 799, 79, 899, 89, 999, 99])
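The alignment by block position can be modelled with plain NumPy. The sketch below is an in-memory illustration only: split_into_blocks is a hypothetical helper, and real Dask builds a lazy task graph rather than a Python list of blocks.

```python
import numpy as np


def split_into_blocks(array, chunk_size):
    """Cut a 1-D array into consecutive blocks of chunk_size elements."""
    return [array[i:i + chunk_size] for i in range(0, len(array), chunk_size)]


def func(a, b):
    return np.array([a.max(), b.max()])


x_blocks = split_into_blocks(np.arange(1000), 100)   # 10 blocks of 100 elements
y_blocks = split_into_blocks(np.arange(100), 10)     # 10 blocks of 10 elements

# Blocks are paired by position (first with first, second with second, ...),
# regardless of how many elements each block holds.
result = np.concatenate([func(a, b) for a, b in zip(x_blocks, y_blocks)])

print(result[:6].tolist())  # [99, 9, 199, 19, 299, 29]
```

Each pair of blocks yields two values, which is why the Dask call above declares chunks=(2,) for the output.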
Your block function gets information about where it is in the array by accepting a special block_info keyword argument.
>>> def func(block, block_info=None):
...     pass
This will receive the following information:
>>> block_info # doctest: +SKIP
{0: {'shape': (1000,),
     'num-chunks': (10,),
     'chunk-location': (4,),
     'array-location': [(400, 500)]},
 None: {'shape': (1000,),
        'num-chunks': (10,),
        'chunk-location': (4,),
        'array-location': [(400, 500)],
        'chunk-shape': (100,),
        'dtype': dtype('float64')}}
For each argument and keyword argument that is a dask array (its position is the first-level key), you will receive the shape of the full array, the number of chunks of the full array in each dimension, the chunk location (for example, chunk index 4 in the first dimension), and the array location (for example the slice corresponding to 400:500). The same information is provided for the output, with the key None, plus the shape and dtype that should be returned.
These features can be combined to synthesize an array from scratch, for example:
>>> def func(block_info=None):
...     loc = block_info[None]['array-location'][0]
...     return np.arange(loc[0], loc[1])
>>> da.map_blocks(func, chunks=((4, 4),), dtype=np.float_)
dask.array<func, shape=(8,), dtype=float64, chunksize=(4,), chunktype=numpy.ndarray>
>>> _.compute()
array([0, 1, 2, 3, 4, 5, 6, 7])
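As a rough model of what each call receives, the following sketch builds block_info-style dictionaries by hand for a 1-D output and feeds them to the same func. The generator output_block_info is hypothetical; Dask constructs these dictionaries internally, and only the keys shown in the listing above are mirrored here.

```python
import numpy as np


def output_block_info(chunks):
    """Yield a block_info-style dict for each block of a 1-D output array."""
    shape = (sum(chunks),)
    start = 0
    for index, size in enumerate(chunks):
        yield {None: {"shape": shape,
                      "num-chunks": (len(chunks),),
                      "chunk-location": (index,),
                      "array-location": [(start, start + size)],
                      "chunk-shape": (size,)}}
        start += size


def func(block_info=None):
    # The output entry (key None) says which slice of the full array this block covers.
    loc = block_info[None]["array-location"][0]
    return np.arange(loc[0], loc[1])


blocks = [func(block_info=info) for info in output_block_info((4, 4))]
result = np.concatenate(blocks)
print(result.tolist())  # [0, 1, 2, 3, 4, 5, 6, 7]
```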
You may specify the key name prefix of the resulting task in the graph with the optional token keyword argument.
>>> x.map_blocks(lambda x: x + 1, token='increment') # doctest: +SKIP
dask.array<increment, shape=(100,), dtype=int64, chunksize=(10,), chunktype=numpy.ndarray>
-
dask.array.core.
blockwise
(func, out_ind, *args, name=None, token=None, dtype=None, adjust_chunks=None, new_axes=None, align_arrays=True, concatenate=None, meta=None, **kwargs)¶ Tensor operation: Generalized inner and outer products
A broad class of blocked algorithms and patterns can be specified with a concise multi-index notation. The blockwise function applies an in-memory function across multiple blocks of multiple inputs in a variety of ways. Many dask.array operations are special cases of blockwise including elementwise, broadcasting, reductions, tensordot, and transpose.
Parameters: func : callable
Function to apply to individual tuples of blocks
out_ind : iterable
Block pattern of the output, something like ‘ijk’ or (1, 2, 3)
*args : sequence of Array, index pairs
Sequence like (x, ‘ij’, y, ‘jk’, z, ‘i’)
**kwargs : dict
Extra keyword arguments to pass to function
dtype : np.dtype
Datatype of resulting array.
concatenate : bool, keyword only
If true concatenate arrays along dummy indices, else provide lists
adjust_chunks : dict
Dictionary mapping index to function to be applied to chunk sizes
new_axes : dict, keyword only
New indexes and their dimension lengths
Examples
2D embarrassingly parallel operation from two arrays, x, and y.
>>> z = blockwise(operator.add, 'ij', x, 'ij', y, 'ij', dtype='f8') # z = x + y # doctest: +SKIP
Outer product multiplying x by y, two 1-d vectors
>>> z = blockwise(operator.mul, 'ij', x, 'i', y, 'j', dtype='f8') # doctest: +SKIP
z = x.T
>>> z = blockwise(np.transpose, 'ji', x, 'ij', dtype=x.dtype) # doctest: +SKIP
The transpose case above is illustrative because it performs the same transposition both on each in-memory block, by calling np.transpose, and on the order of the blocks themselves, by switching the order of the index ij -> ji.
We can compose these same patterns with more variables and more complex in-memory functions
z = X + Y.T
>>> z = blockwise(lambda x, y: x + y.T, 'ij', x, 'ij', y, 'ji', dtype='f8') # doctest: +SKIP
Any index, like i, missing from the output index is interpreted as a contraction (note that this differs from Einstein convention; repeated indices do not imply contraction.) In the case of a contraction the passed function should expect an iterable of blocks on any array that holds that index. To receive arrays concatenated along contracted dimensions instead pass concatenate=True.
Inner product multiplying x by y, two 1-d vectors
>>> def sequence_dot(x_blocks, y_blocks):
...     result = 0
...     for x, y in zip(x_blocks, y_blocks):
...         result += x.dot(y)
...     return result
>>> z = blockwise(sequence_dot, '', x, 'i', y, 'i', dtype='f8') # doctest: +SKIP
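The contraction case can be checked on in-memory data. In this NumPy-only sketch the lists x_blocks and y_blocks stand in for the iterables of blocks that the function would receive for a contracted index; the final line shows the concatenated form, which is roughly what a function would see with concatenate=True.

```python
import numpy as np


def sequence_dot(x_blocks, y_blocks):
    # Receives one list of blocks per input because index i is contracted.
    result = 0
    for x, y in zip(x_blocks, y_blocks):
        result += x.dot(y)
    return result


x = np.arange(12.0)
y = np.ones(12)
x_blocks = np.split(x, 3)   # three blocks of four elements
y_blocks = np.split(y, 3)

block_result = sequence_dot(x_blocks, y_blocks)
full_result = np.concatenate(x_blocks).dot(np.concatenate(y_blocks))

print(block_result)  # 66.0
print(full_result)   # 66.0
```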
Add new single-chunk dimensions with the new_axes= keyword, including the length of the new dimension. New dimensions will always be in a single chunk.
>>> def f(x):
...     return x[:, None] * np.ones((1, 5))
>>> z = blockwise(f, 'az', x, 'a', new_axes={'z': 5}, dtype=x.dtype) # doctest: +SKIP
New dimensions can also be multi-chunk by specifying a tuple of chunk sizes. This has limited utility as is (because the chunks are all the same), but the resulting graph can be modified to achieve more useful results (see da.map_blocks).
>>> z = blockwise(f, 'az', x, 'a', new_axes={'z': (5, 5)}, dtype=x.dtype) # doctest: +SKIP
If the applied function changes the size of each chunk you can specify this with an adjust_chunks={...} dictionary holding a function for each index that modifies the dimension size in that index.
>>> def double(x):
...     return np.concatenate([x, x])
>>> y = blockwise(double, 'ij', x, 'ij',
...               adjust_chunks={'i': lambda n: 2 * n}, dtype=x.dtype) # doctest: +SKIP
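The bookkeeping behind adjust_chunks can be pictured as applying one function per index to every chunk size along that index. The sketch below is an assumption-laden illustration: the chunks dictionary keyed by index letter is a made-up representation, not Dask's internal data structure.

```python
import numpy as np


def double(x):
    return np.concatenate([x, x])   # doubles the block length along axis 0


# Hypothetical chunk sizes per index for a 2-D input labelled 'ij'.
chunks = {"i": (3, 3, 2), "j": (4,)}
adjust_chunks = {"i": lambda n: 2 * n}

# Indices without an entry in adjust_chunks keep their chunk sizes.
new_chunks = {
    index: tuple(adjust_chunks.get(index, lambda n: n)(size) for size in sizes)
    for index, sizes in chunks.items()
}
print(new_chunks)  # {'i': (6, 6, 4), 'j': (4,)}

block = np.zeros((3, 4))
print(double(block).shape)  # (6, 4)
```

The declared sizes have to match what the function really returns for each block, as the last two lines check for one block.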
Include literals by indexing with None
>>> y = blockwise(operator.add, 'ij', x, 'ij', 1234, None, dtype=x.dtype) # doctest: +SKIP
-
dask.array.core.
normalize_chunks
(chunks, shape=None, limit=None, dtype=None, previous_chunks=None)¶ Normalize chunks to tuple of tuples
This takes in a variety of input types and information and produces a full tuple-of-tuples result for chunks, suitable to be passed to Array or rechunk or any other operation that creates a Dask array.
Parameters: chunks: tuple, int, dict, or string
The chunks to be normalized. See examples below for more details
shape: Tuple[int]
The shape of the array
limit: int (optional)
The maximum block size to target in bytes, if freedom is given to choose
dtype: np.dtype
previous_chunks: Tuple[Tuple[int]] optional
Chunks from a previous array that we should use for inspiration when rechunking auto dimensions. If not provided but auto-chunking exists then auto-dimensions will prefer square-like chunk shapes.
Examples
Specify uniform chunk sizes
>>> normalize_chunks((2, 2), shape=(5, 6))
((2, 2, 1), (2, 2, 2))
Also passes through fully explicit tuple-of-tuples
>>> normalize_chunks(((2, 2, 1), (2, 2, 2)), shape=(5, 6))
((2, 2, 1), (2, 2, 2))
Cleans up lists to tuples
>>> normalize_chunks([[2, 2], [3, 3]])
((2, 2), (3, 3))
Expands integer inputs 10 -> (10, 10)
>>> normalize_chunks(10, shape=(30, 5))
((10, 10, 10), (5,))
Expands dict inputs
>>> normalize_chunks({0: 2, 1: 3}, shape=(6, 6))
((2, 2, 2), (3, 3))
The values -1 and None get mapped to full size
>>> normalize_chunks((5, -1), shape=(10, 10))
((5, 5), (10,))
Use the value “auto” to automatically determine chunk sizes along certain dimensions. This uses the limit= and dtype= keywords to determine how large to make the chunks. The term “auto” can be used anywhere an integer can be used. See array chunking documentation for more information.
>>> normalize_chunks(("auto",), shape=(20,), limit=5, dtype='uint8')
((5, 5, 5, 5),)
You can also use byte sizes (see dask.utils.parse_bytes) in place of “auto” to ask for a particular size
>>> normalize_chunks("1kiB", shape=(2000,), dtype='float32')
((250, 250, 250, 250, 250, 250, 250, 250),)
Respects null dimensions
>>> normalize_chunks((), shape=(0, 0))
((0,), (0,))
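The uniform-size cases above follow a simple rule: repeat the chunk size as often as it fits and append the remainder. A minimal pure-Python sketch of that rule is shown below; expand_chunks is a hypothetical helper that covers only integer, -1 and None inputs, and it makes no attempt at the "auto", byte-string or dict handling of the real function.

```python
def expand_chunks(chunks, shape):
    """Expand one chunk size per dimension into a full tuple of tuples."""
    if isinstance(chunks, int):
        chunks = (chunks,) * len(shape)     # 10 -> (10, 10)
    result = []
    for size, dim in zip(chunks, shape):
        if dim == 0:
            result.append((0,))             # null dimension
            continue
        if size is None or size == -1:
            size = dim                      # full size along this dimension
        size = min(size, dim)               # a chunk cannot exceed the dimension
        full, remainder = divmod(dim, size)
        result.append((size,) * full + ((remainder,) if remainder else ()))
    return tuple(result)


print(expand_chunks((2, 2), shape=(5, 6)))     # ((2, 2, 1), (2, 2, 2))
print(expand_chunks(10, shape=(30, 5)))        # ((10, 10, 10), (5,))
print(expand_chunks((5, -1), shape=(10, 10)))  # ((5, 5), (10,))
```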
Array Methods¶
-
class
dask.array.
Array
¶ Parallel Dask Array
A parallel nd-array comprised of many numpy arrays arranged in a grid.
This constructor is for advanced uses only. For normal use see the da.from_array function.
Parameters: dask : dict
Task dependency graph
name : string
Name of array in dask
shape : tuple of ints
Shape of the entire array
chunks: iterable of tuples
block sizes along each dimension
dtype : str or dtype
Typecode or data-type for the new Dask Array
meta : empty ndarray
empty ndarray created with same NumPy backend, ndim and dtype as the Dask Array being created (overrides dtype)
See also
-
all
(axis=None, out=None, keepdims=False)¶ This docstring was copied from numpy.ndarray.all.
Some inconsistencies with the Dask version may exist.
Returns True if all elements evaluate to True.
Refer to numpy.all for full documentation.
See also
numpy.all
- equivalent function
-
any
(axis=None, out=None, keepdims=False)¶ This docstring was copied from numpy.ndarray.any.
Some inconsistencies with the Dask version may exist.
Returns True if any of the elements of a evaluate to True.
Refer to numpy.any for full documentation.
See also
numpy.any
- equivalent function
-
argmax
(axis=None, out=None)¶ This docstring was copied from numpy.ndarray.argmax.
Some inconsistencies with the Dask version may exist.
Return indices of the maximum values along the given axis.
Refer to numpy.argmax for full documentation.
See also
numpy.argmax
- equivalent function
-
argmin
(axis=None, out=None)¶ This docstring was copied from numpy.ndarray.argmin.
Some inconsistencies with the Dask version may exist.
Return indices of the minimum values along the given axis of a.
Refer to numpy.argmin for detailed documentation.
See also
numpy.argmin
- equivalent function
-
argtopk
(k, axis=-1, split_every=None)¶ The indices of the top k elements of an array.
See
da.argtopk
for docstring
-
astype
(dtype, **kwargs)¶ Copy of the array, cast to a specified type.
Parameters: dtype : str or dtype
Typecode or data-type to which the array is cast.
casting : {‘no’, ‘equiv’, ‘safe’, ‘same_kind’, ‘unsafe’}, optional
Controls what kind of data casting may occur. Defaults to ‘unsafe’ for backwards compatibility.
- ‘no’ means the data types should not be cast at all.
- ‘equiv’ means only byte-order changes are allowed.
- ‘safe’ means only casts which can preserve values are allowed.
- ‘same_kind’ means only safe casts or casts within a kind, like float64 to float32, are allowed.
- ‘unsafe’ means any data conversions may be done.
copy : bool, optional
By default, astype always returns a newly allocated array. If this is set to False and the dtype requirement is satisfied, the input array is returned instead of a copy.
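The casting rules can be explored with NumPy directly, since the levels listed above are NumPy's. The sketch below checks a float64 to float32 conversion under each rule with np.can_cast and then shows a 'safe' cast being refused on an in-memory array; it is assumed here that the Dask method applies the same rules.

```python
import numpy as np

# Ask NumPy whether float64 -> float32 is permitted under each casting rule.
allowed = {
    casting: bool(np.can_cast(np.float64, np.float32, casting=casting))
    for casting in ("no", "equiv", "safe", "same_kind", "unsafe")
}
print(allowed)

# A narrowing cast is rejected when only value-preserving casts are allowed.
values = np.arange(3, dtype=np.float64)
try:
    values.astype(np.float32, casting="safe")
    refused = False
except TypeError:
    refused = True
print(refused)  # True
```

Only 'same_kind' and 'unsafe' permit this conversion, because float32 cannot represent every float64 value.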
-
blocks
¶ Slice an array by blocks
This allows blockwise slicing of a Dask array. You can perform normal Numpy-style slicing but now rather than slice elements of the array you slice along blocks so, for example, x.blocks[0, ::2] produces a new dask array with every other block in the first row of blocks.
You can index blocks in any way that could index a numpy array of shape equal to the number of blocks in each dimension, (available as array.numblocks). The dimension of the output array will be the same as the dimension of this array, even if integer indices are passed. This does not support slicing with np.newaxis or multiple lists.
Returns: A Dask array
Examples
>>> import dask.array as da
>>> x = da.arange(10, chunks=2)
>>> x.blocks[0].compute()
array([0, 1])
>>> x.blocks[:3].compute()
array([0, 1, 2, 3, 4, 5])
>>> x.blocks[::2].compute()
array([0, 1, 4, 5, 8, 9])
>>> x.blocks[[-1, 0]].compute()
array([8, 9, 0, 1])
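The 1-D examples above can be reproduced with a list of NumPy blocks, which may help to see that the index selects whole blocks rather than elements. The helper take_blocks is hypothetical and handles only a slice or a list of block positions.

```python
import numpy as np

x = np.arange(10)
blocks = [x[i:i + 2] for i in range(0, len(x), 2)]   # five blocks of two elements


def take_blocks(blocks, index):
    """Select whole blocks by slice or by list of positions, then concatenate them."""
    if isinstance(index, slice):
        selected = blocks[index]
    else:
        selected = [blocks[i] for i in index]
    return np.concatenate(selected)


print(take_blocks(blocks, slice(None, 3)).tolist())        # [0, 1, 2, 3, 4, 5]
print(take_blocks(blocks, slice(None, None, 2)).tolist())  # [0, 1, 4, 5, 8, 9]
print(take_blocks(blocks, [-1, 0]).tolist())               # [8, 9, 0, 1]
```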
-
choose
(choices, out=None, mode='raise')¶ This docstring was copied from numpy.ndarray.choose.
Some inconsistencies with the Dask version may exist.
Use an index array to construct a new array from a set of choices.
Refer to numpy.choose for full documentation.
See also
numpy.choose
- equivalent function
-
clip
(min=None, max=None, out=None, **kwargs)¶ This docstring was copied from numpy.ndarray.clip.
Some inconsistencies with the Dask version may exist.
Return an array whose values are limited to [min, max]. One of max or min must be given.
Refer to numpy.clip for full documentation.
See also
numpy.clip
- equivalent function
-
compute_chunk_sizes
()¶ Compute the chunk sizes for a Dask array. This is especially useful when the chunk sizes are unknown (e.g., when indexing one Dask array with another).
Notes
This function modifies the Dask array in-place.
Examples
>>> import dask.array as da
>>> import numpy as np
>>> x = da.from_array([-2, -1, 0, 1, 2], chunks=2)
>>> x.chunks
((2, 2, 1),)
>>> y = x[x <= 0]
>>> y.chunks
((nan, nan, nan),)
>>> y.compute_chunk_sizes() # in-place computation
dask.array<getitem, shape=(3,), dtype=int64, chunksize=(2,), chunktype=numpy.ndarray>
>>> y.chunks
((2, 1, 0),)
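Why the chunk sizes start out unknown can be seen by applying the mask block by block in NumPy: the length of each result depends on the data, so it is only available after evaluation. This is an in-memory sketch of the idea, not the Dask implementation.

```python
import numpy as np

x = np.array([-2, -1, 0, 1, 2])
chunks = (2, 2, 1)

# Split x into blocks that follow the chunk sizes.
boundaries = np.cumsum(chunks)[:-1]
blocks = np.split(x, boundaries)

# Boolean indexing is applied per block; each block may keep a different number of elements.
selected = [block[block <= 0] for block in blocks]
new_chunks = tuple(len(block) for block in selected)

print(new_chunks)  # (2, 1, 0)
```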
-
copy
()¶ Copy array. This is a no-op for dask.arrays, which are immutable
-
cumprod
(axis=None, dtype=None, out=None)¶ This docstring was copied from numpy.ndarray.cumprod.
Some inconsistencies with the Dask version may exist.
Return the cumulative product of the elements along the given axis.
Refer to numpy.cumprod for full documentation.
See also
numpy.cumprod
- equivalent function
-
cumsum
(axis=None, dtype=None, out=None)¶ This docstring was copied from numpy.ndarray.cumsum.
Some inconsistencies with the Dask version may exist.
Return the cumulative sum of the elements along the given axis.
Refer to numpy.cumsum for full documentation.
See also
numpy.cumsum
- equivalent function
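For illustration, a short sketch of cumsum and cumprod on a chunked 1-D array. The axis is passed explicitly here as a conservative choice; the input is a made-up example.

```python
import dask.array as da

# The values 1..5 split into chunks of two elements.
x = da.arange(1, 6, chunks=2)

# Cumulative reductions run across chunk boundaries.
running_sum = x.cumsum(axis=0).compute()
running_prod = x.cumprod(axis=0).compute()

print(running_sum)   # [ 1  3  6 10 15]
print(running_prod)  # [  1   2   6  24 120]
```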
-
dot
(b, out=None)¶ This docstring was copied from numpy.ndarray.dot.
Some inconsistencies with the Dask version may exist.
Dot product of two arrays.
Refer to numpy.dot for full documentation.
See also
numpy.dot
- equivalent function
Examples
>>> a = np.eye(2)  # doctest: +SKIP
>>> b = np.ones((2, 2)) * 2  # doctest: +SKIP
>>> a.dot(b)  # doctest: +SKIP
array([[2., 2.],
       [2., 2.]])
This array method can be conveniently chained:
>>> a.dot(b).dot(b)  # doctest: +SKIP
array([[8., 8.],
       [8., 8.]])
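The examples above use NumPy arrays because the docstring was copied from NumPy. A sketch of the same computation with Dask arrays might look like the following; the chunking of one element per block is an arbitrary choice for the example.

```python
import numpy as np
import dask.array as da

a = da.from_array(np.eye(2), chunks=1)
b = da.from_array(np.full((2, 2), 2.0), chunks=1)

# Both calls build a lazy task graph; compute() runs it.
product = a.dot(b).compute()
chained = a.dot(b).dot(b).compute()

print(product)  # all entries equal 2.0
print(chained)  # all entries equal 8.0
```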
-
flatten
([order])¶ This docstring was copied from numpy.ndarray.ravel.
Some inconsistencies with the Dask version may exist.
Return a flattened array.
Refer to numpy.ravel for full documentation.
See also
numpy.ravel
- equivalent function
ndarray.flat
- a flat iterator on the array.
-
itemsize
¶ Length of one array element in bytes
-
map_blocks
(*args, name=None, token=None, dtype=None, chunks=None, drop_axis=[], new_axis=None, meta=None, **kwargs)¶ Map a function across all blocks of a dask array.
Parameters: func : callable
Function to apply to every block in the array.
args : dask arrays or other objects
dtype : np.dtype, optional
The dtype of the output array. It is recommended to provide this. If not provided, will be inferred by applying the function to a small set of fake data.
chunks : tuple, optional
Chunk shape of resulting blocks if the function does not preserve shape. If not provided, the resulting array is assumed to have the same block structure as the first input array.
drop_axis : number or iterable, optional
Dimensions lost by the function.
new_axis : number or iterable, optional
New dimensions created by the function. Note that these are applied after drop_axis (if present).
token : string, optional
The key prefix to use for the output array. If not provided, will be determined from the function name.
name : string, optional
The key name to use for the output array. Note that this fully specifies the output key name, and must be unique. If not provided, will be determined by a hash of the arguments.
**kwargs :
Other keyword arguments to pass to function. Values must be constants (not dask.arrays)
See also
dask.array.blockwise
- Generalized operation with control over block alignment.
Examples
>>> import dask.array as da
>>> x = da.arange(6, chunks=3)
>>> x.map_blocks(lambda x: x * 2).compute()
array([ 0,  2,  4,  6,  8, 10])
The da.map_blocks function can also accept multiple arrays.
>>> d = da.arange(5, chunks=2)
>>> e = da.arange(5, chunks=2)
>>> f = da.map_blocks(lambda a, b: a + b**2, d, e)
>>> f.compute()
array([ 0,  2,  6, 12, 20])
If the function changes shape of the blocks then you must provide chunks explicitly.
>>> y = x.map_blocks(lambda x: x[::2], chunks=((2, 2),))
You have a bit of freedom in specifying chunks. If all of the output chunk sizes are the same, you can provide just that chunk size as a single tuple.
>>> a = da.arange(18, chunks=(6,))
>>> b = a.map_blocks(lambda x: x[:3], chunks=(3,))
If the function changes the dimension of the blocks you must specify the created or destroyed dimensions.
>>> b = a.map_blocks(lambda x: x[None, :, None], chunks=(1, 6, 1),
...                  new_axis=[0, 2])
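The example above creates dimensions with new_axis. For the opposite case, here is a minimal sketch of a block function that removes an axis, declared through drop_axis. The array `x` and the name `row_sums` are made up for this illustration, and the dropped axis is kept in a single chunk so that each block sees complete rows.

```python
import dask.array as da

# Two blocks of shape (2, 6); axis 1 is not split across blocks.
x = da.ones((4, 6), chunks=(2, 6))

# Each block goes from shape (2, 6) to shape (2,), so axis 1 is dropped.
row_sums = x.map_blocks(
    lambda block: block.sum(axis=1),
    drop_axis=1,
    dtype=x.dtype,
)

print(row_sums.shape)      # (4,)
print(row_sums.compute())  # [6. 6. 6. 6.]
```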
If chunks is specified but new_axis is not, then it is inferred to add the necessary number of axes on the left.
map_blocks aligns blocks by block positions without regard to shape. In the following example we have two arrays with the same number of blocks but with different shape and chunk sizes.
>>> x = da.arange(1000, chunks=(100,))
>>> y = da.arange(100, chunks=(10,))
The relevant attribute to match is numblocks.
>>> x.numblocks
(10,)
>>> y.numblocks
(10,)
If these match (up to broadcasting rules) then we can map arbitrary functions across blocks.
>>> import numpy as np
>>> def func(a, b):
...     return np.array([a.max(), b.max()])
>>> da.map_blocks(func, x, y, chunks=(2,), dtype='i8')
dask.array<func, shape=(20,), dtype=int64, chunksize=(2,), chunktype=numpy.ndarray>
>>> _.compute()
array([ 99,   9, 199,  19, 299,  29, 399,  39, 499,  49, 599,  59, 699,
        69, 799,  79, 899,  89, 999,  99])
Your block function can get information about where it is in the array by accepting a special block_info keyword argument.
>>> def func(block, block_info=None):
...     pass
This will receive the following information:
>>> block_info  # doctest: +SKIP
{0: {'shape': (1000,),
     'num-chunks': (10,),
     'chunk-location': (4,),
     'array-location': [(400, 500)]},
 None: {'shape': (1000,),
        'num-chunks': (10,),
        'chunk-location': (4,),
        'array-location': [(400, 500)],
        'chunk-shape': (100,),
        'dtype': dtype('float64')}}
For each argument and keyword argument that is a dask array (keyed by the argument's position), you will receive the shape of the full array, the number of chunks of the full array in each dimension, the chunk location (for example, index 4 is the fifth chunk along the first dimension), and the array location (for example, the slice corresponding to 400:500). The same information is provided for the output, with the key None, plus the shape and dtype that should be returned.
These features can be combined to synthesize an array from scratch, for example:
>>> def func(block_info=None):
...     loc = block_info[None]['array-location'][0]
...     return np.arange(loc[0], loc[1])
>>> da.map_blocks(func, chunks=((4, 4),), dtype=np.float64)
dask.array<func, shape=(8,), dtype=float64, chunksize=(4,), chunktype=numpy.ndarray>
>>> _.compute()
array([0, 1, 2, 3, 4, 5, 6, 7])
You may specify the key name prefix of the resulting task in the graph with the optional token keyword argument.
>>> x.map_blocks(lambda x: x + 1, token='increment')  # doctest: +SKIP
dask.array<increment, shape=(1000,), dtype=int64, chunksize=(100,), chunktype=numpy.ndarray>
-
map_overlap
(func, depth, boundary=None, trim=True, **kwargs)¶ Map a function over blocks of the array with some overlap
We share neighboring zones between blocks of the array, then map a function, then trim away the neighboring strips.
Parameters: func: function
The function to apply to each extended block
depth: int, tuple, or dict
The number of elements that each block should share with its neighbors. If a tuple or dict then this can be different per axis.
boundary: str, tuple, dict
How to handle the boundaries. Values include ‘reflect’, ‘periodic’, ‘nearest’, ‘none’, or any constant value like 0 or np.nan.
trim: bool
Whether or not to trim depth elements from each block after calling the map function. Set this to False if your mapping function already does this for you.
**kwargs:
Other keyword arguments valid in map_blocks.
Examples
>>> import numpy as np
>>> import dask.array as da
>>> x = np.array([1, 1, 2, 3, 3, 3, 2, 1, 1])
>>> x = da.from_array(x, chunks=5)
>>> def derivative(x):
...     return x - np.roll(x, 1)
>>> y = x.map_overlap(derivative, depth=1, boundary=0)
>>> y.compute()
array([ 1,  0,  1,  1,  0,  0, -1, -1,  0])
>>> x = np.arange(16).reshape((4, 4))
>>> d = da.from_array(x, chunks=(2, 2))
>>> d.map_overlap(lambda x: x + x.size, depth=1).compute()
array([[16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31]])
>>> func = lambda x: x + x.size
>>> depth = {0: 1, 1: 1}
>>> boundary = {0: 'reflect', 1: 'none'}
>>> d.map_overlap(func, depth, boundary).compute()  # doctest: +NORMALIZE_WHITESPACE
array([[12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27]])
-
max
(axis=None, out=None, keepdims=False, initial=<no value>, where=True)¶ This docstring was copied from numpy.ndarray.max.
Some inconsistencies with the Dask version may exist.
Return the maximum along a given axis.
Refer to numpy.amax for full documentation.
See also
numpy.amax
- equivalent function
-
mean
(axis=None, dtype=None, out=None, keepdims=False)¶ This docstring was copied from numpy.ndarray.mean.
Some inconsistencies with the Dask version may exist.
Returns the average of the array elements along given axis.
Refer to numpy.mean for full documentation.
See also
numpy.mean
- equivalent function
-
min
(axis=None, out=None, keepdims=False, initial=<no value>, where=True)¶ This docstring was copied from numpy.ndarray.min.
Some inconsistencies with the Dask version may exist.
Return the minimum along a given axis.
Refer to numpy.amin for full documentation.
See also
numpy.amin
- equivalent function
-
moment
(order, axis=None, dtype=None, keepdims=False, ddof=0, split_every=None, out=None)¶ Calculate the nth centralized moment.
Parameters: order : int
Order of the moment that is returned, must be >= 2.
axis : int, optional
Axis along which the central moment is computed. The default is to compute the moment of the flattened array.
dtype : data-type, optional
Type to use in computing the moment. For arrays of integer type the default is float64; for arrays of float types it is the same as the array type.
keepdims : bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original array.
ddof : int, optional
“Delta Degrees of Freedom”: the divisor used in the calculation is N - ddof, where N represents the number of elements. By default ddof is zero.
Returns: moment : ndarray
References
[R233] Pebay, Philippe (2008), “Formulas for Robust, One-Pass Parallel Computation of Covariances and Arbitrary-Order Statistical Moments”, Technical Report SAND2008-6212, Sandia National Laboratories.
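As a small sanity check of the definition, the second central moment with the default ddof of zero should agree with the variance. The sketch below uses a made-up four-element array.

```python
import numpy as np
import dask.array as da

x = da.from_array(np.array([1.0, 2.0, 3.0, 4.0]), chunks=2)

# Second central moment: mean of squared deviations from the mean.
m2 = float(x.moment(2).compute())
variance = float(x.var().compute())

print(m2)        # 1.25
print(variance)  # 1.25
```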
-
nbytes
¶ Number of bytes in array
-
nonzero
()¶ This docstring was copied from numpy.ndarray.nonzero.
Some inconsistencies with the Dask version may exist.
Return the indices of the elements that are non-zero.
Refer to numpy.nonzero for full documentation.
See also
numpy.nonzero
- equivalent function
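A brief sketch of nonzero on a 1-D Dask array follows. Because the number of non-zero elements is not known until the data is evaluated, the returned index arrays are expected to have unknown chunk sizes until they are computed. The input is a made-up example.

```python
import numpy as np
import dask.array as da

x = da.from_array(np.array([0, 3, 0, 5]), chunks=2)

# nonzero returns a tuple with one index array per dimension.
(idx,) = x.nonzero()

positions = idx.compute()
print(positions)  # [1 3]
```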
-
partitions
¶ Slice an array by partitions. Alias of dask array .blocks attribute.
This alias allows you to write agnostic code that works with both dask arrays and dask dataframes.
This allows blockwise slicing of a Dask array. You can perform normal Numpy-style slicing, but rather than slicing elements of the array you slice along blocks. For example, x.blocks[0, ::2] produces a new dask array with every other block in the first row of blocks.
You can index blocks in any way that could index a numpy array of shape equal to the number of blocks in each dimension (available as array.numblocks). The dimension of the output array will be the same as the dimension of this array, even if integer indices are passed. This does not support slicing with np.newaxis or multiple lists.
Returns: A Dask array
Examples
>>> import dask.array as da
>>> x = da.arange(10, chunks=2)
>>> x.partitions[0].compute()
array([0, 1])
>>> x.partitions[:3].compute()
array([0, 1, 2, 3, 4, 5])
>>> x.partitions[::2].compute()
array([0, 1, 4, 5, 8, 9])
>>> x.partitions[[-1, 0]].compute()
array([8, 9, 0, 1])
>>> all(x.partitions[:].compute() == x.blocks[:].compute())
True
-
prod
(axis=None, dtype=None, out=None, keepdims=False, initial=1, where=True)¶ This docstring was copied from numpy.ndarray.prod.
Some inconsistencies with the Dask version may exist.
Return the product of the array elements over the given axis
Refer to numpy.prod for full documentation.
See also
numpy.prod
- equivalent function
-
ravel
([order])¶ This docstring was copied from numpy.ndarray.ravel.
Some inconsistencies with the Dask version may exist.
Return a flattened array.
Refer to numpy.ravel for full documentation.
See also
numpy.ravel
- equivalent function
ndarray.flat
- a flat iterator on the array.
-
rechunk
(chunks='auto', threshold=None, block_size_limit=None)¶ See da.rechunk for docstring
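As an illustration of what rechunk changes, the sketch below merges small blocks into larger ones and inspects the resulting chunk layout. The shapes are arbitrary choices for the example.

```python
import dask.array as da

# An 8x8 array stored as sixteen 2x2 blocks.
x = da.ones((8, 8), chunks=(2, 2))

# Ask for blocks of shape (4, 8): two blocks stacked along the first axis.
y = x.rechunk((4, 8))

print(x.numblocks)  # (4, 4)
print(y.chunks)     # ((4, 4), (8,))
print(y.numblocks)  # (2, 1)
```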
-
repeat
(repeats, axis=None)¶ This docstring was copied from numpy.ndarray.repeat.
Some inconsistencies with the Dask version may exist.
Repeat elements of an array.
Refer to numpy.repeat for full documentation.
See also
numpy.repeat
- equivalent function
-
reshape
(shape, order='C')¶ This docstring was copied from numpy.ndarray.reshape.
Some inconsistencies with the Dask version may exist.
Returns an array containing the same data with a new shape.
Refer to numpy.reshape for full documentation.
See also
numpy.reshape
- equivalent function
Notes
Unlike the free function numpy.reshape, this method on ndarray allows the elements of the shape parameter to be passed in as separate arguments. For example, a.reshape(10, 11) is equivalent to a.reshape((10, 11)).
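A short sketch of both calling forms on a Dask array, assuming a small made-up input:

```python
import numpy as np
import dask.array as da

x = da.arange(12, chunks=4)

# The shape may be given as separate integers or as one tuple.
as_args = x.reshape(3, 4)
as_tuple = x.reshape((3, 4))

print(as_args.shape)  # (3, 4)
same = bool(np.array_equal(as_args.compute(), as_tuple.compute()))
print(same)  # True
```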
-
round
(decimals=0, out=None)¶ This docstring was copied from numpy.ndarray.round.
Some inconsistencies with the Dask version may exist.
Return a with each element rounded to the given number of decimals.
Refer to numpy.around for full documentation.
See also
numpy.around
- equivalent function
-
size
¶ Number of elements in array
-
squeeze
(axis=None)¶ This docstring was copied from numpy.ndarray.squeeze.
Some inconsistencies with the Dask version may exist.
Remove single-dimensional entries from the shape of a.
Refer to numpy.squeeze for full documentation.
See also
numpy.squeeze
- equivalent function
-
std
(axis=None, dtype=None, out=None, ddof=0, keepdims=False)¶ This docstring was copied from numpy.ndarray.std.
Some inconsistencies with the Dask version may exist.
Returns the standard deviation of the array elements along given axis.
Refer to numpy.std for full documentation.
See also
numpy.std
- equivalent function
-
store
(targets, lock=True, regions=None, compute=True, return_stored=False, **kwargs)¶ Store dask arrays in array-like objects, overwriting data in the target.
This stores dask arrays into objects that support numpy-style setitem indexing. It stores values chunk by chunk so that it does not have to fill up memory. For best performance you can align the block size of the storage target with the block size of your array.
If your data fits in memory then you may prefer calling np.array(myarray) instead.
Parameters: sources: Array or iterable of Arrays
targets: array-like or Delayed or iterable of array-likes and/or Delayeds
These should support setitem syntax target[10:20] = ...
lock: boolean or threading.Lock, optional
Whether or not to lock the data stores while storing. Pass True (lock each file individually), False (don’t lock) or a particular threading.Lock object to be shared among all writes.
regions: tuple of slices or list of tuples of slices
Each region tuple in regions should be such that target[region].shape = source.shape for the corresponding source and target in sources and targets, respectively. If this is a tuple, the contents will be assumed to be slices, so do not provide a tuple of tuples.
compute: boolean, optional
If true, compute immediately; otherwise return dask.delayed.Delayed.
return_stored: boolean, optional
Optionally return the stored result (default False).
Examples
>>> x = ... # doctest: +SKIP
>>> import h5py  # doctest: +SKIP
>>> f = h5py.File('myfile.hdf5', mode='a')  # doctest: +SKIP
>>> dset = f.create_dataset('/data', shape=x.shape,
...                         chunks=x.chunks,
...                         dtype='f8')  # doctest: +SKIP
>>> store(x, dset) # doctest: +SKIP
Alternatively store many arrays at the same time
>>> store([x, y, z], [dset1, dset2, dset3]) # doctest: +SKIP
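The h5py examples above need an external file. As a self-contained sketch, an in-memory NumPy array can stand in for the storage target, since it supports the same setitem syntax. The names `source` and `target` are illustrative, and the default local scheduler is assumed.

```python
import numpy as np
import dask.array as da

source = da.arange(10, chunks=5)

# Any object supporting target[a:b] = values can act as a target.
target = np.zeros(10, dtype=source.dtype)

# With the default compute=True the chunks are written immediately.
source.store(target)

print(target)  # [0 1 2 3 4 5 6 7 8 9]
```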
-
sum
(axis=None, dtype=None, out=None, keepdims=False, initial=0, where=True)¶ This docstring was copied from numpy.ndarray.sum.
Some inconsistencies with the Dask version may exist.
Return the sum of the array elements over the given axis.
Refer to numpy.sum for full documentation.
See also
numpy.sum
- equivalent function
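For illustration, a sketch of sum along each axis of a small 2-D array, including the effect of keepdims. The input values are made up.

```python
import numpy as np
import dask.array as da

# [[0, 1, 2],
#  [3, 4, 5]] stored as one block per row.
x = da.from_array(np.arange(6).reshape(2, 3), chunks=(1, 3))

column_totals = x.sum(axis=0).compute()
row_totals = x.sum(axis=1, keepdims=True).compute()

print(column_totals)     # [3 5 7]
print(row_totals.shape)  # (2, 1)
```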
-
swapaxes
(axis1, axis2)¶ This docstring was copied from numpy.ndarray.swapaxes.
Some inconsistencies with the Dask version may exist.
Return a view of the array with axis1 and axis2 interchanged.
Refer to numpy.swapaxes for full documentation.
See also
numpy.swapaxes
- equivalent function
-
to_dask_dataframe
(columns=None, index=None)¶ Convert dask Array to dask Dataframe
Parameters: columns: list or string
list of column names if DataFrame, single string if Series
index : dask.dataframe.Index, optional
An optional dask Index to use for the output Series or DataFrame.
The default output index depends on whether the array has any unknown chunks. If there are any unknown chunks, the output has None for all the divisions (one per chunk). If all the chunks are known, a default index with known divisions is created.
Specifying index can be useful if you’re conforming a Dask Array to an existing dask Series or DataFrame, and you would like the indices to match.
See also
-
to_delayed
(optimize_graph=True)¶ Convert into an array of dask.delayed objects, one per chunk.
Parameters: optimize_graph : bool, optional
If True [default], the graph is optimized before converting into dask.delayed objects.
See also
-
to_hdf5
(filename, datapath, **kwargs)¶ Store array in HDF5 file
>>> x.to_hdf5('myfile.hdf5', '/x') # doctest: +SKIP
Optionally provide arguments as though to h5py.File.create_dataset
>>> x.to_hdf5('myfile.hdf5', '/x', compression='lzf', shuffle=True) # doctest: +SKIP
See also
da.store, h5py.File.create_dataset
-
to_svg
(size=500)¶ Convert chunks from Dask Array into an SVG Image
Parameters: chunks: tuple
size: int
Rough size of the image
Returns: text: An svg string depicting the array as a grid of chunks
Examples
>>> x.to_svg(size=500) # doctest: +SKIP
-
to_tiledb
(uri, *args, **kwargs)¶ Save array to the TileDB storage manager
See function to_tiledb() for argument documentation.
See https://docs.tiledb.io for details about the format and engine.
-
to_zarr
(*args, **kwargs)¶ Save array to the zarr storage format
See https://zarr.readthedocs.io for details about the format.
See function to_zarr() for parameters.
-
topk
(k, axis=-1, split_every=None)¶ The top k elements of an array.
See da.topk for docstring.
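As a quick sketch of the behavior on a made-up array: the largest k elements come back sorted from largest to smallest. To the best of my understanding, a negative k selects the smallest elements instead, sorted from smallest to largest.

```python
import numpy as np
import dask.array as da

x = da.from_array(np.array([5, 1, 3, 6]), chunks=2)

largest_two = x.topk(2).compute()
smallest_two = x.topk(-2).compute()

print(largest_two)   # [6 5]
print(smallest_two)  # [1 3]
```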
-
trace
(offset=0, axis1=0, axis2=1, dtype=None, out=None)¶ This docstring was copied from numpy.ndarray.trace.
Some inconsistencies with the Dask version may exist.
Return the sum along diagonals of the array.
Refer to numpy.trace for full documentation.
See also
numpy.trace
- equivalent function
-
transpose
(*axes)¶ This docstring was copied from numpy.ndarray.transpose.
Some inconsistencies with the Dask version may exist.
Returns a view of the array with axes transposed.
For a 1-D array this has no effect, as a transposed vector is simply the same vector. To convert a 1-D array into a 2-D column vector, an additional dimension must be added. np.atleast_2d(a).T achieves this, as does a[:, np.newaxis]. For a 2-D array, this is a standard matrix transpose. For an n-D array, if axes are given, their order indicates how the axes are permuted (see Examples). If axes are not provided and a.shape = (i[0], i[1], ... i[n-2], i[n-1]), then a.transpose().shape = (i[n-1], i[n-2], ... i[1], i[0]).
Parameters: axes : None, tuple of ints, or n ints
- None or no argument: reverses the order of the axes.
- tuple of ints: i in the j-th place in the tuple means a’s i-th axis becomes a.transpose()’s j-th axis.
- n ints: same as an n-tuple of the same ints (this form is intended simply as a “convenience” alternative to the tuple form)
Returns: out : ndarray
View of a, with axes suitably permuted.
See also
ndarray.T
- Array property returning the array transposed.
ndarray.reshape
- Give a new shape to an array without changing its data.
Examples
>>> a = np.array([[1, 2], [3, 4]])  # doctest: +SKIP
>>> a  # doctest: +SKIP
array([[1, 2],
       [3, 4]])
>>> a.transpose()  # doctest: +SKIP
array([[1, 3],
       [2, 4]])
>>> a.transpose((1, 0))  # doctest: +SKIP
array([[1, 3],
       [2, 4]])
>>> a.transpose(1, 0)  # doctest: +SKIP
array([[1, 3],
       [2, 4]])
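The NumPy examples above cover the 2-D case. For a Dask array, a sketch with three dimensions shows that the chunk layout is permuted together with the shape. The shapes and chunk sizes are arbitrary choices.

```python
import dask.array as da

x = da.ones((2, 3, 4), chunks=(1, 3, 2))

# Axis 2 moves to the front, followed by axes 0 and 1.
y = x.transpose(2, 0, 1)

print(y.shape)            # (4, 2, 3)
print(y.chunks)           # ((2, 2), (1, 1), (3,))
print(x.transpose().shape)  # (4, 3, 2), the axis order reversed
```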
-
var
(axis=None, dtype=None, out=None, ddof=0, keepdims=False)¶ This docstring was copied from numpy.ndarray.var.
Some inconsistencies with the Dask version may exist.
Returns the variance of the array elements, along given axis.
Refer to numpy.var for full documentation.
See also
numpy.var
- equivalent function
-
view
(dtype=None, order='C')¶ Get a view of the array as a new data type
Parameters: dtype:
The dtype by which to view the array. The default, None, results in the view having the same data-type as the original array.
order: string
‘C’ or ‘F’ (Fortran) ordering
This reinterprets the bytes of the array under a new dtype. If that
dtype does not have the same size as the original array then the shape
will change.
Beware that both numpy and dask.array can behave oddly when taking
shape-changing views of arrays under Fortran ordering. Under some
versions of NumPy this function will fail when taking shape-changing
views of Fortran ordered arrays if the first dimension has chunks of
size one.
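A minimal sketch of the shape change described above, using C ordering and a made-up int32 array: a dtype of the same size leaves the shape alone, while a dtype of half the size doubles the length of the last axis.

```python
import numpy as np
import dask.array as da

x = da.from_array(np.array([1, 2, 3, 4], dtype="int32"), chunks=2)

# Same item size (4 bytes): the shape is unchanged.
as_unsigned = x.view("uint32")

# Half the item size (2 bytes): the last axis doubles in length.
as_halves = x.view("int16")

print(as_unsigned.shape)  # (4,)
print(as_halves.shape)    # (8,)
```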
-
vindex
¶ Vectorized indexing with broadcasting.
This is equivalent to numpy’s advanced indexing, using arrays that are broadcast against each other. This allows for pointwise indexing:
>>> import numpy as np
>>> import dask.array as da
>>> x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> x = da.from_array(x, chunks=2)
>>> x.vindex[[0, 1, 2], [0, 1, 2]].compute()
array([1, 5, 9])
Mixed basic/advanced indexing with slices/arrays is also supported. The order of dimensions in the result follows those proposed for ndarray.vindex: the subspace spanned by arrays is followed by all slices.
Note: vindex provides more general functionality than standard indexing, but it also has fewer optimizations and can be significantly slower.
-