NumPy Copying and Indexing
Copying Arrays
Important
Simply using = does not make a copy. Much like with lists, you will just have multiple names pointing to the same ndarray object.
Therefore, we need to understand whether two arrays, A and B, point to:
the same array, including shape and data/memory space
the same data/memory space, but perhaps different shapes (a view)
a separate copy of the data (i.e. stored completely separately in memory)
All of these are possible.
B = A
This is assignment. No copy is made. A and B point to the same data in memory and share the same shape, etc. They are just two different labels for the same object in memory.
B = A[:]
This is a view, or shallow copy. The shape information for A and B is stored independently, but both point to the same memory location for the data.
B = A.copy()
This is a deep copy. A completely separate object is created, with its own separate location in memory for the data.
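The three cases can also be told apart programmatically. The following is a minimal sketch, assuming the is operator for identity and np.shares_memory (a NumPy function that the text above does not introduce) for the data buffer; the names B_assign, B_view and B_copy are only illustrative.

```python
import numpy as np

A = np.arange(6)

B_assign = A          # assignment: a second name for the same object
B_view = A[:]         # view: a new array object using the same data buffer
B_copy = A.copy()     # copy: a new array object with its own data buffer

# For each case: is it the same object, and does it share data with A?
print(B_assign is A, np.shares_memory(B_assign, A))  # True True
print(B_view is A, np.shares_memory(B_view, A))      # False True
print(B_copy is A, np.shares_memory(B_copy, A))      # False False
```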
Let’s look at some examples.
import numpy as np

a = np.arange(10)
print(a)
Here is assignment. We can use the is operator to test whether two names refer to the same object (identity, not just equal values).
b = a
b is a
Since b and a are the same object, changes to the shape of one are reflected in the other. No copy is made.
b.shape = (2, 5)
print(b)
a.shape
b is a
print(a)
A shallow copy creates a new view into the array. The data is the same, but array properties such as the shape can be different.
a = np.arange(12)
c = a[:]
a.shape = (3,4)
print(a)
print(c)
Since the underlying data is the same memory, changing an element of one is reflected in the other.
c[1] = -1
print(a)
Even slices into an array are just views, still pointing to the same memory
d = c[3:8]
print(d)
d[:] = 0
print(a)
print(c)
print(d)
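A slice does not have to be contiguous to be a view. As a small sketch (the name evens is only illustrative, and np.shares_memory is a NumPy function that the text above does not introduce), a strided slice still writes through to the original array:

```python
import numpy as np

a = np.arange(12)
evens = a[::2]        # strided slice: every other element, still a view

print(np.shares_memory(a, evens))  # the slice uses a's data buffer

evens[:] = -1         # write through the view
print(a)              # the even-indexed entries of a are now -1
```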
There are several ways to check whether two arrays are the same object, whether one is a view of the other, and whether an array owns its data:
print(c is a)
print(c.base is a)
print(c.flags.owndata)
print(a.flags.owndata)
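These checks can be wrapped in a small function. The sketch below uses a hypothetical helper, relationship(), which is not part of NumPy, and relies on np.shares_memory to detect a shared data buffer.

```python
import numpy as np

def relationship(x, y):
    """Describe how two arrays are related (hypothetical helper, not a NumPy function)."""
    if x is y:
        return "same object"
    if np.shares_memory(x, y):
        return "share data (view)"
    return "independent"

a = np.arange(12)
print(relationship(a, a))          # same object
print(relationship(a, a[:]))       # share data (view)
print(relationship(a, a.copy()))   # independent
```

np.shares_memory performs an exact check, which can be slow for some complicated strided arrays; np.may_share_memory is a cheaper, bounds-only alternative that may report false positives.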
To make a copy of the data that you can modify independently of the original, you need a deep copy.
d = a.copy()
d[:,:] = 0.0
print(a)
print(d)
Boolean Indexing
There are lots of fun ways to index arrays to access only those elements that meet a certain condition
a = np.arange(12).reshape(3,4)
a
Here we set all the elements in the array that are > 4 to zero
a[a > 4] = 0
a
And now we set all the zeros to -1
a[a == 0] = -1
a
a == -1
If we have two tests, we need to combine them elementwise with np.logical_and() or np.logical_or()
a = np.arange(12).reshape(3,4)
a[np.logical_and(a > 3, a <= 9)] = 0.0
a
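For boolean arrays, the & and | operators give the same elementwise result as np.logical_and() and np.logical_or(). The sketch below compares the two spellings; the names mask_func, mask_op and either are only illustrative.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

mask_func = np.logical_and(a > 3, a <= 9)
# Parentheses are required: & binds more tightly than > and <=
mask_op = (a > 3) & (a <= 9)

print(np.array_equal(mask_func, mask_op))  # both spellings give the same mask

# Elements that fall outside the range 2..9
either = np.logical_or(a < 2, a > 9)
print(a[either])
```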
Our test that we index the array with returns a boolean array of the same shape:
a > 4
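The mask can be stored and inspected like any other array. The following sketch (the names mask and selected are illustrative) also ties back to copying: reading with a boolean mask returns a new 1-D array, not a view, so modifying the result leaves the original untouched.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
mask = a > 4

print(mask.shape, mask.dtype)      # same shape as a, boolean dtype
print(mask.sum())                  # number of elements that pass the test

selected = a[mask]                 # 1-D array holding the selected elements
print(selected)

selected[:] = 0                    # selected is a copy, so a is unchanged
print(np.shares_memory(a, selected))
print(a.max())
```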