More NumPy#
Copying arrays#
Important
Simply using = does not make a copy. Much like with lists,
you just end up with multiple names pointing to the same ndarray
object.
Therefore, we need to understand whether two arrays, A and B, point to:
the same array, including shape and data/memory space
the same data/memory space, but perhaps different shapes (a view)
a separate copy of the data (i.e. stored completely separately in memory)
All of these are possible.
Let’s look at different ways to copy:
B = A
this is assignment. No copy is made. A and B point to the same data in memory and share the same shape, etc. They are just two different labels for the same object in memory. This is essentially equivalent to the C++ behavior of:
std::vector<double> A; auto& B = A;
B = A[:]
this is a view or shallow copy. The shape, stride, … info for A and B are stored independently, but both point to the same memory location for data. In some sense, you can think of B as containing a pointer to the data in A.
B = A.copy()
this is a deep copy. A completely separate object will be created in memory, with the elements from A copied into B's memory. After this, there is no connection between A and B.
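The three cases above can be told apart at runtime. The following is a minimal sketch, not part of the original notes: the names B_alias, B_view, and B_copy are illustrative, and it relies on the `is` operator and `np.shares_memory` to distinguish "same object", "same data", and "separate data".

```python
import numpy as np

A = np.arange(6)

B_alias = A          # assignment: a second name for the same object
B_view = A[:]        # view: a new ndarray object that shares A's data
B_copy = A.copy()    # deep copy: a new object with its own data

print(B_alias is A)                   # True
print(B_view is A)                    # False
print(np.shares_memory(A, B_view))    # True
print(np.shares_memory(A, B_copy))    # False

# a change made through A shows up in the alias and the view,
# but not in the deep copy
A[0] = 100
print(B_alias[0], B_view[0], B_copy[0])   # 100 100 0
```

Only the deep copy is insulated from later changes to A.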
Views#
Consider the following:
>>> import numpy as np
>>> q = np.array([[1, 2, 3, 2, 1],
...               [2, 4, 4, 4, 2],
...               [3, 4, 4, 4, 3],
...               [2, 4, 4, 4, 2],
...               [1, 2, 3, 2, 1]])
>>> a = q[1:4, 1:4]
>>> a
array([[4, 4, 4],
       [4, 4, 4],
       [4, 4, 4]])
>>> a[:, :] = 0
>>> q
array([[1, 2, 3, 2, 1],
       [2, 0, 0, 0, 2],
       [3, 0, 0, 0, 3],
       [2, 0, 0, 0, 2],
       [1, 2, 3, 2, 1]])
Here we:
create a view a into array q that just consists of the middle 3×3 array of elements.
zero out the elements of a
print out q and notice that the change we made to a is reflected in q.
A view shares the underlying data of the original array, but has separate metadata (size, shape, etc.). Views in NumPy allow us to do efficient operations on portions of arrays.
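The split between shared data and separate metadata can be inspected directly. This is a small sketch, assuming the same q as above. It uses the `base`, `flags.owndata`, `shape`, and `strides` attributes of ndarray; the extra array b is only there for contrast.

```python
import numpy as np

q = np.array([[1, 2, 3, 2, 1],
              [2, 4, 4, 4, 2],
              [3, 4, 4, 4, 3],
              [2, 4, 4, 4, 2],
              [1, 2, 3, 2, 1]])

a = q[1:4, 1:4]              # view of the middle 3x3 block

print(a.base is q)           # True
print(a.flags.owndata)       # False
print(q.shape, a.shape)      # (5, 5) (3, 3)

# the view steps through q's memory with the same strides,
# it just starts at a different offset and covers fewer elements
print(q.strides == a.strides)   # True

# for contrast, a copy of the same block owns its data
b = q[1:4, 1:4].copy()
b[:, :] = -1
print(b.base is None)        # True
print(q[2, 2])               # 4
```

Since no data is moved when a is created, taking a view costs about the same regardless of how large the block is.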
try it…
Create an array as:
a = np.arange(15)
Now create a view by slicing the entire array:
c = a[:]
If you reshape c to be 3×5, what happens to a?
Boolean indexing#
We can index arrays using expressions to avoid loops. This creates a
mask of True and False values that tells NumPy which elements to work
on.
Consider the following—we’ll zero out all the elements larger than 4:
>>> a = np.arange(12).reshape(3, 4)
>>> a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
>>> a[a > 4] = 0
>>> a
array([[0, 1, 2, 3],
       [4, 0, 0, 0],
       [0, 0, 0, 0]])
Notice how we indexed a with a > 4. We can print that
expression itself (evaluated on the original array, before the
assignment) to see what it looks like. This is the
mask used for indexing the array.
>>> a > 4
array([[False, False, False, False],
       [False,  True,  True,  True],
       [ True,  True,  True,  True]])
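The mask is an ordinary boolean array, so it can be stored and reused. The sketch below is illustrative; the names mask and selected are not from the original text. It also shows one point that connects back to copying: reading with a boolean mask returns a copy, while assigning through the mask modifies the original array.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

mask = a > 4                      # boolean array with the same shape as a
print(mask.dtype, mask.shape)     # bool (3, 4)
print(mask.sum())                 # 7

# reading with a mask gives a 1-D copy of the selected elements
selected = a[mask]
print(selected.tolist())          # [5, 6, 7, 8, 9, 10, 11]
selected[:] = 0                   # changes only the copy
print(a.max())                    # 11

# assigning through the mask changes a itself
a[mask] = 0
print(a.max())                    # 4
```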
Doing this explicitly with loops would involve something like:
for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        if a[i, j] > 4:
            a[i, j] = 0
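As a quick check that the two approaches agree, the following sketch applies the explicit loops and the mask to separate copies of the same starting array and compares the results. The names a_loop and a_mask are illustrative.

```python
import numpy as np

a_loop = np.arange(12).reshape(3, 4)
a_mask = a_loop.copy()            # independent copy with the same values

# explicit loops over every element
for i in range(a_loop.shape[0]):
    for j in range(a_loop.shape[1]):
        if a_loop[i, j] > 4:
            a_loop[i, j] = 0

# the same operation with boolean indexing
a_mask[a_mask > 4] = 0

print(np.array_equal(a_loop, a_mask))   # True
```

The mask version is shorter, and the element-by-element work is done inside NumPy rather than in the Python interpreter.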