NumPy Copying and Indexing

Copying Arrays

Important

Simply using = does not make a copy. Much like with lists, you just end up with multiple names pointing to the same ndarray object.

Therefore, we need to understand whether two arrays, A and B, refer to:

  • the same array, including shape and data/memory space

  • the same data/memory space, but perhaps different shapes (a view)

  • a separate copy of the data (i.e. stored completely separately in memory)

All of these are possible.

  • B = A

    This is assignment. No copy is made. A and B are just two different labels for the same object, so they share the same data in memory, the same shape, etc.

  • B = A[:]

    This is a view, or shallow copy. The shape information for A and B is stored independently, but both point to the same memory location for the data.

  • B = A.copy()

    This is a deep copy. A completely separate array object is created, with its own data stored in a separate location in memory.
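The three cases above can be told apart programmatically. The following is a minimal sketch, assuming the illustrative names `B_assign`, `B_view`, and `B_copy` (they do not appear elsewhere in this page); it uses the `is` operator for object identity and `np.shares_memory()` to check whether two arrays overlap in memory.

```python
import numpy as np

A = np.arange(6)

B_assign = A         # assignment: a second name for the same object
B_view = A[:]        # view: a new array object over the same data
B_copy = A.copy()    # copy: a new array object with its own data

# Same object?  Same underlying memory?
print(B_assign is A, np.shares_memory(B_assign, A))
print(B_view is A, np.shares_memory(B_view, A))
print(B_copy is A, np.shares_memory(B_copy, A))
```

Only the assignment yields the same object, while both the assignment and the view share memory with `A`.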

Let’s look at some examples.

import numpy as np

a = np.arange(10)
print(a)

Here is assignment. We can use the is operator to test whether two names refer to the same object (this is identity, not just equality of values).

b = a
b is a
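To make the difference between identity and equality concrete, here is a small sketch using the illustrative names `x`, `y`, and `z` (chosen so they do not clash with the arrays in this walkthrough). A copy holds equal values but is not the same object.

```python
import numpy as np

x = np.arange(10)
y = x            # another name for the same object
z = x.copy()     # a different object holding equal values

print(y is x)                  # identity test
print(z is x)                  # identity test on the copy
print(np.array_equal(z, x))    # value comparison
```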

Since b and a are the same, changes to the shape of one are reflected in the other—no copy is made.

b.shape = (2, 5)
print(b)
a.shape
b is a
print(a)

A shallow copy creates a new view into the array: the data is the same, but array properties such as the shape can be different.

a = np.arange(12)
c = a[:]
a.shape = (3,4)

print(a)
print(c)

Since the underlying data occupies the same memory, changing an element of one is reflected in the other.

c[1] = -1
print(a)

Even slices into an array are just views, still pointing to the same memory

d = c[3:8]
print(d)
d[:] = 0
print(a)
print(c)
print(d)

There are several ways to check whether two arrays are the same object, whether one is a view of the other, and whether an array owns its own data:

print(c is a)
print(c.base is a)
print(c.flags.owndata)
print(a.flags.owndata)
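These checks can be combined into one small helper. The sketch below assumes a hypothetical function `describe_relation()` (it is not part of NumPy) built on the `is` operator and `np.shares_memory()`; the `owndata` flag is printed alongside for comparison.

```python
import numpy as np

def describe_relation(x, y):
    """Classify how two arrays relate (hypothetical helper, not a NumPy API)."""
    if x is y:
        return "same object"
    if np.shares_memory(x, y):
        return "shared data"
    return "independent"

a = np.arange(12)
c = a[:]        # view of a
d = a.copy()    # independent copy

for name, arr in [("a", a), ("c", c), ("d", d)]:
    # relation to a, and whether the array owns its data buffer
    print(name, describe_relation(arr, a), arr.flags.owndata)
```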

To make a copy of the array's data that you can modify independently of the original, you need a deep copy.

d = a.copy()
d[:,:] = 0.0

print(a)
print(d)
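The same idea applies to part of an array: calling `copy()` on a slice gives independent data, whereas the bare slice remains a view. A minimal sketch, with `row_view` and `row_copy` as illustrative names:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

row_view = a[0, :]           # slice: a view into a
row_copy = a[0, :].copy()    # explicit copy of the same slice

row_copy[:] = -1             # modifies only the copy
print(a[0, :])

row_view[:] = 100            # writes through to a
print(a[0, :])
```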

Boolean Indexing

There are lots of fun ways to index arrays to access only those elements that meet a certain condition

a = np.arange(12).reshape(3,4)
a

Here we set all the elements in the array that are > 4 to zero

a[a > 4] = 0
a

And now we set all the zeros to -1:

a[a == 0] = -1
a
a == -1
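Boolean indexing behaves differently from slicing with respect to copies: reading `a[mask]` produces a new 1-D array, while assigning to `a[mask]` modifies `a` in place. The sketch below illustrates this, with `mask`, `selected`, and `max_after_edit` as illustrative names.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
mask = a > 4

selected = a[mask]           # new 1-D array of the matching elements
selected[:] = -99            # changes only the extracted array
max_after_edit = a.max()     # a still holds its original values here

a[mask] = 0                  # assignment through the mask changes a itself
print(a)
```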

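If we have two tests, we need to combine them with np.logical_and() or np.logical_or(), or with the elementwise operators & and | (with each test in parentheses). Python's and / or keywords do not operate elementwise on arrays.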

a = np.arange(12).reshape(3,4)
a[np.logical_and(a > 3, a <= 9)] = 0.0
a
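As an alternative to the function form, the elementwise operators `&`, `|`, and `~` build the same masks. Each comparison needs its own parentheses because these operators bind more tightly than the comparisons. A short sketch, with `m1`, `m2`, and `outside` as illustrative names:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

m1 = np.logical_and(a > 3, a <= 9)   # function form
m2 = (a > 3) & (a <= 9)              # operator form of the same test
print(np.array_equal(m1, m2))

outside = (a <= 3) | (a > 9)         # elementwise "or"
print(np.array_equal(outside, ~m1))  # ~ negates a boolean mask
```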

The test that we index the array with returns a boolean array with the same shape as the original array:

a > 4
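Because the mask is an ordinary boolean array, it can be inspected and reduced like any other array. The following sketch (with `mask` as an illustrative name) checks its shape and dtype and counts the elements that pass the test.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
mask = a > 4

print(mask.shape, mask.dtype)   # same shape as a, boolean dtype
print(mask.sum())               # number of True entries (True counts as 1)
print(mask.sum(axis=1))         # number of True entries in each row
print(mask.any(), mask.all())   # does any / every element pass the test?
```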