ndarray

The NDArray class in minima is a container for n-dimensional data. In principle it can hold many kinds of elements, but for now it supports only f32 (32-bit floating-point) numbers.

source

default_device

 default_device ()

source

cpu_numpy

 cpu_numpy ()

Return numpy device


source

BackendDevice

 BackendDevice (name, mod)

A backend device that wraps the implementation module.
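As a rough sketch of what this wrapper looks like in use (the choice of NumPy as the implementation module and the name "cpu_numpy" mirror the cpu_numpy helper above, but the attribute names in the comment are assumptions, not guaranteed by this page):

import numpy as np

# Assumed usage: a BackendDevice pairs a human-readable name with the module
# that actually implements the array operations for that backend.
cpu = BackendDevice("cpu_numpy", np)
# cpu.name -> "cpu_numpy", cpu.mod -> the numpy module (attribute names are assumptions)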


source

NDArray

 NDArray (value:Union[ForwardRef('NDArray'),numpy.ndarray,Sequence],
          device:Optional[__main__.BackendDevice]=None)

NDArray represents an n-dimensional array with operations that can be performed on multiple devices. The class is an abstraction over numpy and other backend devices, providing a unified interface for interacting with arrays.

Use cases of this class include numerical operations, scientific computing, and machine learning.

value (Union[NDArray, numpy.ndarray, Sequence]): the value from which to create the NDArray.
device (Optional[BackendDevice], default None): the device on which the array computations are performed.
Returns: None
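As a quick illustration of the constructor described above, here is a minimal sketch. Only the NDArray and default_device names come from this page; the import path minima.ndarray is an assumption.

import numpy as np
from minima.ndarray import NDArray, default_device  # import path is an assumption

a = NDArray([1.0, 2.0, 3.0])                                # from a Python sequence
b = NDArray(np.arange(6, dtype=np.float32).reshape(2, 3))   # from a numpy array
c = NDArray(b, device=default_device())                     # from another NDArray, with an explicit device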

The concept of “strides” is crucial to understanding how multi-dimensional arrays are stored and accessed in memory. The term “stride” refers to the number of elements (or steps) you need to move in memory to go from one element to the next along a particular axis of an array.

In PyTorch and many other libraries that work with multi-dimensional arrays, a tensor's strides (reported by the stride() method in PyTorch) tell you how many elements you need to skip in memory to move one step along each dimension of the tensor.

The strides of a tensor are defined as a tuple of integers, where each integer represents the step size for the corresponding dimension.

Let’s look at an example:

import torch

x = torch.arange(24).reshape(2, 3, 4)
print(x.stride())  # outputs: (12, 4, 1)

Here, the tensor x has a shape of (2,3,4), and the stride is (12,4,1).

  • The first element of the stride tuple, 12, tells you that you need to step over 12 elements in memory to get from one element to the next along the first axis (axis=0, the one that has size 2). This makes sense, because there are 12 elements in each “block” of this dimension (3*4 = 12).

  • The second element, 4, says that you need to step over 4 elements in memory to move from one element to the next along the second axis (axis=1, the one that has size 3). This is because there are 4 elements in each “row” of this dimension.

  • The last element, 1, shows that you only need to step over 1 element in memory to move from one element to the next along the last axis (axis=2, the one that has size 4). This is because elements along this axis are contiguous in memory.
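The same arithmetic can be checked by hand: element (i, j, k) lives at flat position i*12 + j*4 + k*1. The NumPy sketch below mirrors the PyTorch example above; note that NumPy reports strides in bytes, so they are divided by the item size to get element strides.

import numpy as np

x = np.arange(24).reshape(2, 3, 4)
elem_strides = tuple(s // x.itemsize for s in x.strides)  # (12, 4, 1), in elements
i, j, k = 1, 2, 3
flat = i * elem_strides[0] + j * elem_strides[1] + k * elem_strides[2]
assert x.reshape(-1)[flat] == x[i, j, k]  # both are 23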

In conclusion, the concept of strides is critical for efficient storage and computations on multi-dimensional arrays, as it allows libraries like PyTorch to perform complex operations without needing to actually rearrange or copy any data in memory.

Let’s start with a simple 1D case. Imagine we have an array of size 10 and we want to select every second element. Instead of physically copying every second element to a new array, we could just create a new “view” of the array with a stride of 2. This means that to move to the next element in our sliced array, we jump over 2 elements in the original data.
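A quick PyTorch check of this idea; basic slicing returns a view, so the sliced tensor shares storage with the original and only its stride changes:

import torch

a = torch.arange(10)
b = a[::2]                             # every second element, as a view
print(b.stride(), b.storage_offset())  # (2,) 0
print(a.data_ptr() == b.data_ptr())    # True: same underlying storage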

For multi-dimensional arrays, the principle is the same but each dimension has its own stride. When you slice a tensor, you’re essentially creating a new tensor (a view) that starts from a different offset and possibly uses different strides.

Consider a 2D case: if you slice the first dimension (e.g., array[1:, :]), you’re changing the offset to start from the second element along that dimension. Essentially, you’re jumping over a number of elements equal to the stride of that dimension.

If you slice the second dimension (e.g., array[:, ::2] to select every second column), you’re not changing the offset, but you’re doubling the stride for the second dimension. This tells the tensor to skip one element in memory for every step in that dimension, giving you every second column.
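Both cases can be observed directly on a small tensor; the offsets and strides below are what PyTorch reports for these views:

import torch

m = torch.arange(12).reshape(3, 4)     # strides (4, 1), offset 0
r = m[1:, :]                           # skip one row: offset jumps by the stride of dim 0
print(r.storage_offset(), r.stride())  # 4 (4, 1)
c = m[:, ::2]                          # every second column: the stride of dim 1 doubles
print(c.storage_offset(), c.stride())  # 0 (4, 2)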

In summary, basic slicing doesn’t copy any data. Instead, it changes the starting point (offset) and the step taken along each dimension (stride) of the tensor. This makes slicing operations very efficient, even on large tensors.

import nbdev; nbdev.nbdev_export()