esmraldi.sparsematrix

Module Contents

Classes

SparseMatrix

A sparse multidimensional array.

Functions

_find_start_end(mask)

delete(sparse, indices[, axis])

count_nonzero(arr)

zeros(shape[, dtype, order, like])

zeros_like(a[, dtype, order, subok, shape])

take(array, indices[, axis])

esmraldi.sparsematrix._find_start_end(mask)
esmraldi.sparsematrix.delete(sparse, indices, axis=0)
esmraldi.sparsematrix.count_nonzero(arr)
esmraldi.sparsematrix.zeros(shape, dtype=float, order='C', *, like=None)
esmraldi.sparsematrix.zeros_like(a, dtype=None, order='C', subok='True', shape=None)
esmraldi.sparsematrix.take(array, indices, axis=None, **kwargs)
class esmraldi.sparsematrix.SparseMatrix(coords, data=None, shape=None, has_duplicates=True, sorted=False, prune=False, cache=False, fill_value=None, idx_dtype=None, is_maybe_densify=True)

Bases: sparse.COO

A sparse multidimensional array.

This is stored in COO format. It depends on NumPy and scipy.sparse for computation, but supports arrays of arbitrary dimension.

Parameters
coords: numpy.ndarray (COO.ndim, COO.nnz)

An array holding the index locations of every value. Should have shape (number of dimensions, number of non-zeros).

data: numpy.ndarray (COO.nnz,)

An array of values. A scalar can also be supplied if the data is the same across all coordinates. If not given, defers to as_coo.

shape: tuple[int] (COO.ndim,)

The shape of the array.

has_duplicates: bool, optional

A value indicating whether the supplied value for coords has duplicates. Note that setting this to False when coords does have duplicates may result in undefined behaviour. See COO.sum_duplicates.

sorted: bool, optional

A value indicating whether the values in coords are sorted. Note that setting this to True when coords isn’t sorted may result in undefined behaviour. See COO.sort_indices.

prune: bool, optional

A flag indicating whether or not we should prune any fill-values present in data.

cache: bool, optional

Whether to enable caching for various operations. See COO.enable_caching.

fill_value: scalar, optional

The fill value for this array.

See also

DOK

A mostly write-only sparse array.

as_coo

Convert any given format to COO.

Examples

You can create COO objects from NumPy arrays.

>>> x = np.eye(4, dtype=np.uint8)
>>> x[2, 3] = 5
>>> s = COO.from_numpy(x)
>>> s
<COO: shape=(4, 4), dtype=uint8, nnz=5, fill_value=0>
>>> s.data  
array([1, 1, 1, 5, 1], dtype=uint8)
>>> s.coords  
array([[0, 1, 2, 2, 3],
       [0, 1, 2, 3, 3]])

COO objects support basic arithmetic and binary operations.

>>> x2 = np.eye(4, dtype=np.uint8)
>>> x2[3, 2] = 5
>>> s2 = COO.from_numpy(x2)
>>> (s + s2).todense()  
array([[2, 0, 0, 0],
       [0, 2, 0, 0],
       [0, 0, 2, 5],
       [0, 0, 5, 2]], dtype=uint8)
>>> (s * s2).todense()  
array([[1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]], dtype=uint8)

Binary operations support broadcasting.

>>> x3 = np.zeros((4, 1), dtype=np.uint8)
>>> x3[2, 0] = 1
>>> s3 = COO.from_numpy(x3)
>>> (s * s3).todense()  
array([[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 1, 5],
       [0, 0, 0, 0]], dtype=uint8)

COO objects also support dot products and reductions.

>>> s.dot(s.T).sum(axis=0).todense()   
array([ 1,  1, 31,  6], dtype=uint64)

You can use NumPy ufunc operations on COO arrays as well.

>>> np.sum(s, axis=1).todense()  
array([1, 1, 6, 1], dtype=uint64)
>>> np.round(np.sqrt(s, dtype=np.float64), decimals=1).todense()   
array([[ 1. ,  0. ,  0. ,  0. ],
       [ 0. ,  1. ,  0. ,  0. ],
       [ 0. ,  0. ,  1. ,  2.2],
       [ 0. ,  0. ,  0. ,  1. ]])

Operations that will result in a dense array will usually result in a different fill value, such as the following.

>>> np.exp(s)
<COO: shape=(4, 4), dtype=float16, nnz=5, fill_value=1.0>

You can also create COO arrays from coordinates and data.

>>> coords = [[0, 0, 0, 1, 1],
...           [0, 1, 2, 0, 3],
...           [0, 3, 2, 0, 1]]
>>> data = [1, 2, 3, 4, 5]
>>> s4 = COO(coords, data, shape=(3, 4, 5))
>>> s4
<COO: shape=(3, 4, 5), dtype=int64, nnz=5, fill_value=0>

If the data is same across all coordinates, you can also specify a scalar.

>>> coords = [[0, 0, 0, 1, 1],
...           [0, 1, 2, 0, 3],
...           [0, 3, 2, 0, 1]]
>>> data = 1
>>> s5 = COO(coords, data, shape=(3, 4, 5))
>>> s5
<COO: shape=(3, 4, 5), dtype=int64, nnz=5, fill_value=0>

Following scipy.sparse conventions, you can also pass these as a tuple with rows and columns:

>>> rows = [0, 1, 2, 3, 4]
>>> cols = [0, 0, 0, 1, 1]
>>> data = [10, 20, 30, 40, 50]
>>> z = COO((data, (rows, cols)))
>>> z.todense()  
array([[10,  0],
       [20,  0],
       [30,  0],
       [ 0, 40],
       [ 0, 50]])

You can also pass a dictionary or iterable of index/value pairs. Repeated indices imply summation:

>>> d = {(0, 0, 0): 1, (1, 2, 3): 2, (1, 1, 0): 3}
>>> COO(d)
<COO: shape=(2, 3, 4), dtype=int64, nnz=3, fill_value=0>
>>> L = [((0, 0), 1),
...      ((1, 1), 2),
...      ((0, 0), 3)]
>>> COO(L).todense()  
array([[4, 0],
       [0, 2]])

You can convert DOK arrays to COO arrays.

>>> from sparse import DOK
>>> s6 = DOK((5, 5), dtype=np.int64)
>>> s6[1:3, 1:3] = [[4, 5], [6, 7]]
>>> s6
<DOK: shape=(5, 5), dtype=int64, nnz=4, fill_value=0>
>>> s7 = s6.asformat('coo')
>>> s7
<COO: shape=(5, 5), dtype=int64, nnz=4, fill_value=0>
>>> s7.todense()  
array([[0, 0, 0, 0, 0],
       [0, 4, 5, 0, 0],
       [0, 6, 7, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])
Attributes
coords: numpy.ndarray (ndim, nnz)

An array holding the coordinates of every nonzero element.

data: numpy.ndarray (nnz,)

An array holding the values corresponding to COO.coords.

shape: tuple[int] (ndim,)

The dimensions of this array.

__add__(other)
__sub__(other)
__mul__(other)
__div__(other)
__truediv__(other)
__floordiv__(other)
__mod__(other)
__pow__(other)
__iadd__(other)
__isub__(other)
__imul__(other)
__idiv__(other)
__itruediv__(other)
__ifloordiv__(other)
__imod__(other)
__ipow__(other)
get_nd_iterable_indices(keys)
__getitem__(key)
__setitem__(key, value)
broadcast_to(shape)

Performs the equivalent of numpy.broadcast_to for COO. Note that this function returns a new array instead of a view.

Parameters
shape: tuple[int]

The shape to broadcast the data to.

Returns
COO

The broadcast sparse array.

Raises
ValueError

If the operand cannot be broadcast to the given shape.

See also

numpy.broadcast_to

NumPy equivalent function

__array_function__(func, types, args, kwargs)
__array_ufunc__(ufunc, method, *inputs, **kwargs)
view(dtype=None, type=None)
transpose(axes=None)

Returns a new array which has the order of the axes switched.

Parameters
axes: Iterable[int], optional

The new order of the axes compared to the previous one. Reverses the axes by default.

Returns
COO

The new array with the axes in the desired order.

See also

COO.T

A quick property to reverse the order of the axes.

numpy.ndarray.transpose

NumPy equivalent function.

Examples

We can change the order of the dimensions of any COO array with this function.

>>> x = np.add.outer(np.arange(5), np.arange(5)[::-1])
>>> x  
array([[4, 3, 2, 1, 0],
       [5, 4, 3, 2, 1],
       [6, 5, 4, 3, 2],
       [7, 6, 5, 4, 3],
       [8, 7, 6, 5, 4]])
>>> s = COO.from_numpy(x)
>>> s.transpose((1, 0)).todense()  
array([[4, 5, 6, 7, 8],
       [3, 4, 5, 6, 7],
       [2, 3, 4, 5, 6],
       [1, 2, 3, 4, 5],
       [0, 1, 2, 3, 4]])

Note that by default, this reverses the order of the axes rather than switching the last and second-to-last axes as required by some linear algebra operations.

>>> x = np.random.rand(2, 3, 4)
>>> s = COO.from_numpy(x)
>>> s.transpose().shape
(4, 3, 2)
flatten(order='C')

Returns a new COO array that is a flattened version of this array.

Returns
COO

The flattened output array.

Notes

The order parameter is provided only for compatibility with NumPy and isn’t actually supported.

Examples

>>> s = COO.from_numpy(np.arange(10))
>>> s2 = s.reshape((2, 5)).flatten()
>>> s2.todense()
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
reshape(shape, order='C')

Returns a new COO array that is a reshaped version of this array.

Parameters
shape: tuple[int]

The desired shape of the output array.

Returns
COO

The reshaped output array.

See also

numpy.ndarray.reshape

The equivalent NumPy function.

Notes

The order parameter is provided only for compatibility with NumPy and isn’t actually supported.

Examples

>>> s = COO.from_numpy(np.arange(25))
>>> s2 = s.reshape((5, 5))
>>> s2.todense()  
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])