NumPy Copying and Indexing

import numpy as np

Copying Arrays

Important

Simply using = does not make a copy. Much like with lists, you will just have multiple names pointing to the same ndarray object.

Therefore, we need to understand whether two arrays, A and B, point to:

  • the same array, including shape and data/memory space

  • the same data/memory space, but perhaps different shapes (a view)

  • a separate copy of the data (i.e. stored completely separately in memory)

All of these are possible.

  • B = A

    this is assignment. No copy is made. A and B point to the same data in memory and share the same shape, etc. They are just two different labels for the same object in memory.

  • B = A[:]

    this is a view, or shallow copy. The shape info for A and B is stored independently, but both point to the same memory location for the data.

  • B = A.copy()

    this is a deep copy. A completely separate object will be created in memory, with a completely separate location in memory.
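The three cases above can be told apart by modifying the original and seeing which names notice. The following is a minimal sketch, assuming a small illustrative array; the names B_assign, B_view, and B_copy are chosen here for illustration and do not appear elsewhere in these notes.

```python
import numpy as np

A = np.arange(6)

B_assign = A        # assignment: another name for the same object
B_view = A[:]       # view: new array object, same data buffer
B_copy = A.copy()   # deep copy: new array object, separate data buffer

A[0] = 99           # modify the data through the name A

# Only the assignment is the very same object
print(B_assign is A, B_view is A, B_copy is A)   # True False False

# The assignment and the view see the change; the deep copy does not
print(B_assign[0], B_view[0], B_copy[0])         # 99 99 0
```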

Let’s look at some examples.

a = np.arange(10)
print(a)
[0 1 2 3 4 5 6 7 8 9]

Here is assignment. We can use the is operator to test whether b and a are the same object (identity, not just equal values).

b = a
b is a
True

Since b and a are the same object, a change to the shape of one is reflected in the other; no copy is made.

b.shape = (2, 5)
print(b)
a.shape
[[0 1 2 3 4]
 [5 6 7 8 9]]
(2, 5)
b is a
True
print(a)
[[0 1 2 3 4]
 [5 6 7 8 9]]
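If the goal is a differently shaped array that leaves the original's shape alone, one option is reshape(), which returns a new array object. The sketch below assumes a freshly created, contiguous array, for which reshape() gives a view onto the same data rather than a copy.

```python
import numpy as np

a = np.arange(10)
b = a.reshape(2, 5)    # new array object; a keeps its own shape

print(b is a)                  # False
print(a.shape, b.shape)        # (10,) (2, 5)
print(np.shares_memory(a, b))  # True

b[0, 0] = 100          # the data is still shared, so a sees the change
print(a[0])            # 100
```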

A shallow copy creates a new view into the array: the data is the same, but array properties such as the shape can be different.

a = np.arange(12)
c = a[:]
a.shape = (3,4)

print(a)
print(c)
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
[ 0  1  2  3  4  5  6  7  8  9 10 11]

Since the underlying data is the same memory, changing an element of one is reflected in the other.

c[1] = -1
print(a)
[[ 0 -1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]

Even slices into an array are just views, still pointing to the same memory.

d = c[3:8]
print(d)
[3 4 5 6 7]
d[:] = 0
print(a)
print(c)
print(d)
[[ 0 -1  2  0]
 [ 0  0  0  0]
 [ 8  9 10 11]]
[ 0 -1  2  0  0  0  0  0  8  9 10 11]
[0 0 0 0 0]

There are several ways to check whether two arrays are the same object, whether one is a view of the other, and whether an array owns its own data:

print(c is a)
print(c.base is a)
print(c.flags.owndata)
print(a.flags.owndata)
False
True
False
True
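Another check that may be useful is np.shares_memory(), which reports whether two arrays overlap in memory. The sketch below wraps it in a small helper; describe() is a hypothetical function written for this illustration, not part of NumPy.

```python
import numpy as np

def describe(x, y):
    """Classify how y relates to x (hypothetical helper, not a NumPy function)."""
    if y is x:
        return "same object"
    if np.shares_memory(x, y):
        return "shares data"
    return "independent data"

a = np.arange(12)
c = a[:]        # view of a
d = c[3:8]      # slice of a view: still the same memory
e = a.copy()    # deep copy

results = [describe(a, a), describe(a, c), describe(a, d), describe(a, e)]
for r in results:
    print(r)
```

Here only the last case, the deep copy, is reported as independent data.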

To make a copy of the array's data that you can modify independently of the original, you need a deep copy.

d = a.copy()
d[:,:] = 0.0

print(a)
print(d)
[[ 0 -1  2  0]
 [ 0  0  0  0]
 [ 8  9 10 11]]
[[0 0 0 0]
 [0 0 0 0]
 [0 0 0 0]]

Boolean Indexing

There are lots of fun ways to index arrays to access only those elements that meet a certain condition.

a = np.arange(12).reshape(3,4)
a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

Here we set all the elements in the array that are > 4 to zero:

a[a > 4] = 0
a
array([[0, 1, 2, 3],
       [4, 0, 0, 0],
       [0, 0, 0, 0]])

And now we set all the zeros to -1:

a[a == 0] = -1
a
array([[-1,  1,  2,  3],
       [ 4, -1, -1, -1],
       [-1, -1, -1, -1]])
a == -1
array([[ True, False, False, False],
       [False,  True,  True,  True],
       [ True,  True,  True,  True]])

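If we have 2 tests, we cannot combine them with Python's and / or, which do not work element by element on arrays. Instead we use np.logical_and() or np.logical_or() (the element-wise operators & and | also work, provided each comparison is wrapped in parentheses).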

a = np.arange(12).reshape(3,4)
a[np.logical_and(a > 3, a <= 9)] = 0.0
a
array([[ 0,  1,  2,  3],
       [ 0,  0,  0,  0],
       [ 0,  0, 10, 11]])
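The same combined test can also be written with the element-wise operators & and |. This is a minimal sketch on a fresh array; the names mask_func, mask_op, and outside are illustrative.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

mask_func = np.logical_and(a > 3, a <= 9)
# Parentheses are required: & binds more tightly than > and <=
mask_op = (a > 3) & (a <= 9)

print(np.array_equal(mask_func, mask_op))   # True

# The complementary selection, written with |
outside = (a <= 3) | (a > 9)
print(a[outside])                           # [ 0  1  2  3 10 11]
```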

The test we index the array with returns a boolean array of the same shape as a:

a > 4
array([[False, False, False, False],
       [False, False, False, False],
       [False, False,  True,  True]])
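Because the test is an ordinary array, it can be stored and reused. The sketch below, which assumes a fresh 3x4 array rather than the modified one above, also ties back to copying: selecting with a boolean mask returns a new array with its own data, not a view, whereas assigning through the mask modifies the original in place.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
mask = a > 4               # boolean array with the same shape as a

selected = a[mask]         # 1-D array of the matching elements (a copy)
selected[:] = -1           # changes the copy only

print(mask.shape, mask.dtype)          # (3, 4) bool
print(selected.shape)                  # (7,)
print(np.shares_memory(a, selected))   # False
print(a.max())                         # 11

a[mask] = 0                # assignment through the mask changes a itself
print(a.max())                         # 4
```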