NumPy Copying and Indexing

import numpy as np

Copying Arrays


Simply using = does not make a copy. Much like with lists, you just end up with multiple names pointing to the same ndarray object.

We therefore need to understand whether two arrays, A and B, point to:

  • the same array, including shape and data/memory space

  • the same data/memory space, but perhaps different shapes (a view)

  • a separate copy of the data (i.e. stored completely separately in memory)

All of these are possible.

  • B = A

    This is assignment. No copy is made. A and B point to the same data in memory and share the same shape, etc. They are just two different labels for the same object in memory.

  • B = A[:]

    This is a view or shallow copy. The shape info for A and B is stored independently, but both point to the same memory location for data.

  • B = A.copy()

    This is a deep copy. A completely separate object is created, with its data stored in a completely separate location in memory.
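
The three cases above can be told apart programmatically. The following is a minimal sketch; the variable names (B_assign, B_view, B_copy) are illustrative and not part of the original text, and np.shares_memory is used here as one convenient check rather than the only possible one.

```python
import numpy as np

A = np.arange(6)

B_assign = A        # assignment: a second name for the same object
B_view = A[:]       # view: a new array object that uses the same data buffer
B_copy = A.copy()   # deep copy: a new array object with its own data buffer

# Is each name literally the same Python object as A?
same_object = (B_assign is A, B_view is A, B_copy is A)

# Does each array overlap in memory with A?
shared_data = (np.shares_memory(A, B_assign),
               np.shares_memory(A, B_view),
               np.shares_memory(A, B_copy))

print(same_object)
print(shared_data)
```

Only the assignment gives the same object, while both the assignment and the view share data with A; the deep copy shares neither.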

Let’s look at some examples.

a = np.arange(10)
print(a)
[0 1 2 3 4 5 6 7 8 9]

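Here is assignment. We can use the is operator to test for identity, i.e. whether both names refer to the same object (this is a stronger condition than element-wise equality):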

b = a
b is a

Since b and a are the same object, changes to the shape of one are reflected in the other; no copy is made.

b.shape = (2, 5)
print(b)
[[0 1 2 3 4]
 [5 6 7 8 9]]
a.shape
(2, 5)
b is a
True
print(a)
[[0 1 2 3 4]
 [5 6 7 8 9]]

A shallow copy creates a new view into the array: the data is the same, but array properties such as the shape can be different.

a = np.arange(12)
c = a[:]
a.shape = (3, 4)
print(a)
print(c)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
[ 0  1  2  3  4  5  6  7  8  9 10 11]

Since both arrays use the same underlying memory, changing an element of one is reflected in the other.

c[1] = -1
print(a)
[[ 0 -1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]

Even slices into an array are just views, still pointing to the same memory.

d = c[3:8]
print(d)
[3 4 5 6 7]
d[:] = 0
print(a)
print(c)
print(d)
[[ 0 -1  2  0]
 [ 0  0  0  0]
 [ 8  9 10 11]]
[ 0 -1  2  0  0  0  0  0  8  9 10 11]
[0 0 0 0 0]

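There are several ways to check whether two names refer to the same array, whether one array is a view of another, and whether an array owns its data. For example, the is operator tests identity, and the base attribute of a view refers to the array whose memory it uses: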

print(c is a)
print(c.base is a)
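
The checks above can be extended with the flags.owndata attribute and np.shares_memory. This is a small self-contained sketch that rebuilds a, a view c, and a copy d (the names mirror the text, but the arrays are created fresh here) and reports what each check says.

```python
import numpy as np

a = np.arange(12)
c = a[:]         # view of a
d = a.copy()     # independent copy of a

# .base is the array a view borrows its memory from;
# it is None when the array owns its own data
print(c is a)            # False: c is a different array object
print(c.base is a)       # True: but it uses a's memory
print(d.base is None)    # True: d owns its data

# flags.owndata reports whether an array owns its data buffer
print(a.flags.owndata, c.flags.owndata, d.flags.owndata)   # True False True

# np.shares_memory checks whether two arrays overlap in memory
print(np.shares_memory(a, c), np.shares_memory(a, d))      # True False
```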

To make a copy of the array's data that you can modify independently of the original, you need a deep copy.

d = a.copy()
d[:, :] = 0.0
print(a)
print(d)

[[ 0 -1  2  0]
 [ 0  0  0  0]
 [ 8  9 10 11]]
[[0 0 0 0]
 [0 0 0 0]
 [0 0 0 0]]

Boolean Indexing

There are lots of fun ways to index arrays so that you access only those elements that meet a certain condition.

a = np.arange(12).reshape(3,4)
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

Here we set all the elements in the array that are > 4 to zero:

a[a > 4] = 0
array([[0, 1, 2, 3],
       [4, 0, 0, 0],
       [0, 0, 0, 0]])

Now we set all the zeros to -1:

a[a == 0] = -1
array([[-1,  1,  2,  3],
       [ 4, -1, -1, -1],
       [-1, -1, -1, -1]])
a == -1
array([[ True, False, False, False],
       [False,  True,  True,  True],
       [ True,  True,  True,  True]])
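
One point worth connecting to the copying discussion earlier: assigning through a boolean index modifies the array in place, but reading with a boolean index returns a new array rather than a view. The sketch below illustrates this; the name selected is hypothetical and chosen only for the example.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

selected = a[a > 4]    # reading with a boolean mask gives a new 1-d array (a copy)
selected[:] = -99      # this changes only the copy

print(np.shares_memory(a, selected))   # False
print(a.max())                         # 11: the original is untouched

a[a > 4] = 0           # assigning through the mask modifies a in place
print(a.max())                         # 4
```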

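If we have two tests, we cannot combine them with Python's and / or keywords, since those do not work element by element on arrays. Instead we use np.logical_and() or np.logical_or() (or, equivalently, the & and | operators with each test in parentheses).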

a = np.arange(12).reshape(3,4)
a[np.logical_and(a > 3, a <= 9)] = 0.0
array([[ 0,  1,  2,  3],
       [ 0,  0,  0,  0],
       [ 0,  0, 10, 11]])
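
As a brief illustration of combining tests, the sketch below builds the same mask with np.logical_and and with the & operator and checks that they agree. The variable names (mask_func, mask_op, either) are illustrative only.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

mask_func = np.logical_and(a > 3, a <= 9)
# Parentheses are required: & binds more tightly than > and <=
mask_op = (a > 3) & (a <= 9)

print(np.array_equal(mask_func, mask_op))   # True

# logical_or (or |) selects elements that satisfy either test
either = (a < 2) | (a > 9)
print(a[either])                            # [ 0  1 10 11]
```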

The test we use as an index is itself a boolean array with the same shape as a:

a > 4
array([[False, False, False, False],
       [False, False, False, False],
       [False, False,  True,  True]])
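
Because the test is an ordinary array, it can be stored and reused. The following sketch recreates the array from the example above and shows a few things one might do with the mask; the name mask is an assumption made for illustration.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
a[np.logical_and(a > 3, a <= 9)] = 0

mask = a > 4                   # boolean array with the same shape as a
print(mask.shape == a.shape)   # True
print(mask.sum())              # 2: each True counts as 1
print(a[mask])                 # [10 11]

# Row and column indices of the True entries
rows, cols = np.nonzero(mask)
print(rows, cols)              # [2 2] [2 3]
```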