# COMPSCI682 Help Session 1: Slicing and Broadcasting in Python

## 1. Python List and Numpy Array (ndarray)

### 1.1 List
- A list is an ordered collection of items. The items can be of mixed types: numbers, strings, other lists, NumPy arrays, etc.

In [1]:
my_list = [1, '2', [3]]
my_list += [4.01]
print('my_list:', my_list)

my_list: [1, '2', [3], 4.01]


In [2]:
my_list.append(4)
print('my_list:', my_list)

my_list: [1, '2', [3], 4.01, 4]


In [3]:
my_list.append([5,6])
my_list += [7,8,9]
print('my_list:', my_list)

my_list: [1, '2', [3], 4.01, 4, [5, 6], 7, 8, 9]


- When all items in a list have the same data type, we can apply math to each item with a list comprehension.

In [4]:
l1 = [0, 2, 3, 4]
s_l1 = [x**2 for x in l1]
print('s_l1:', s_l1)

# s_l1_long = []
# for i in range(0,len(l1)):
#     s_l1_long += [l1[i]**2]
# print('s_l1_long:', s_l1_long)

s_l1: [0, 4, 9, 16]


In [5]:
l1 = [0, 1, 2, 3, 4]
s_l1 = [x**2 for x in l1 if x % 2 == 0]
print('s_l1:', s_l1)

# s_l1_long = []
# for i in range(0,len(l1)):
#     if l1[i] % 2 == 0:
#         s_l1_long += [l1[i]**2]
# print('s_l1_long:', s_l1_long)

s_l1: [0, 4, 16]


### 1.2 NumPy Array

NumPy arrays are typically used to represent vectors, matrices, and higher-dimensional arrays, and they are optimized for numerical operations.

Advantages of NumPy arrays over lists:

**Size** - NumPy arrays take up less space than lists holding the same numbers.

**Performance** - operations on NumPy arrays are faster than the equivalent loops over lists.

**Functionality** - NumPy has optimized built-in functions, such as linear algebra operations.
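As a small illustration of the functionality point, the sketch below compares a list comprehension with the equivalent whole-array NumPy expression. The variable names are illustrative, and no timing or memory figures are claimed here.

```python
import numpy as np

values = [1.0, 2.0, 3.0, 4.0]

# Plain Python: elementwise math needs an explicit loop or comprehension.
doubled_list = [2 * v for v in values]

# NumPy: the same operation applies to the whole array at once.
arr = np.array(values)
doubled_arr = 2 * arr

# Built-in linear algebra: dot product of the vector with itself.
squared_norm = np.dot(arr, arr)

print(doubled_list)   # [2.0, 4.0, 6.0, 8.0]
print(doubled_arr)    # [2. 4. 6. 8.]
print(squared_norm)   # 30.0
```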

In [6]:
import numpy as np


### 1.3 len, size, shape, indexing
#### 1.3.1 List
- Nested lists can represent arrays. The example below uses a list to represent a 2x4 array.
- Indexing a nested list takes one pair of square brackets per dimension.





In [7]:
l = [[0, 1, 2, 3], [4, 5, 6, 7]]

In [8]:
print(l[1])

[4, 5, 6, 7]


In [9]:
print(l[1][2])

6


In [10]:
print(l[1, 2])  # TypeError: lists do not accept tuple indices; NumPy arrays do

TypeError: list indices must be integers or slices, not tuple

In [None]:
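len(l)   # 2: the number of rows
l.size   # AttributeError: a list has no attribute 'size'
l.shape  # AttributeError: a list has no attribute 'shape'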

#### 1.3.2 Numpy Array

In [11]:
a = np.arange(8).reshape((2,4))

In [12]:
print(a[1])
print(type(a[1]))

[4 5 6 7]
<class 'numpy.ndarray'>


In [13]:
print(a[1][2])
print(a[1, 2])

6
6


In [14]:
print('len(a) = ', len(a))
print('a.size = ', a.size)
print('a.shape = ', a.shape)

len(a) =  2
a.size =  8
a.shape =  (2, 4)


### 1.4 Transfer between List and Numpy Array

#### 1.4.1 List --> Numpy Array

In [15]:
l = [[0, 1, 2, 3], [4, 5, 6, 7]]
print(l)
print(type(l))
a = np.array(l)
print(a)
print(type(a))

[[0, 1, 2, 3], [4, 5, 6, 7]]
<class 'list'>
[[0 1 2 3]
 [4 5 6 7]]
<class 'numpy.ndarray'>


In [16]:
l = [[0, 1, 2, 3], [4, 5, 6, 7]]
a1 = np.asarray(l) 
print(a1)
print(type(a1))

[[0 1 2 3]
 [4 5 6 7]]
<class 'numpy.ndarray'>


#### 1.4.2 Numpy Array --> List

In [17]:
l = [[[0, 1, 2, 3],[2,3,4,5]], [[4, 5, 6, 7],[2,3,4,5]]]
a = np.array(l)
l1 = list(a)
print(type(l1))
print(l1)

l = [[0, 1, 2, 3], [4, 5, 6, 7]]
a = np.array(l)
l1 = list(a)
print(type(l1))
print(l1)

<class 'list'>
[array([[0, 1, 2, 3],
       [2, 3, 4, 5]]), array([[4, 5, 6, 7],
       [2, 3, 4, 5]])]
<class 'list'>
[array([0, 1, 2, 3]), array([4, 5, 6, 7])]


In [18]:
print(type(l1[0]))

<class 'numpy.ndarray'>


- `list(a)` converts only the outermost dimension, so the items are still NumPy arrays. Use `tolist()` to convert every level.

In [19]:
l2 = a.tolist()
print(type(l2))
print(l2)

<class 'list'>
[[0, 1, 2, 3], [4, 5, 6, 7]]
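The difference also shows up in the element types. A minimal sketch (variable names are illustrative): `list()` only iterates over the first axis, while `tolist()` converts every level to built-in Python objects.

```python
import numpy as np

arr = np.arange(6).reshape((2, 3))

outer_only = list(arr)    # a list of 1-D arrays, one per row
nested = arr.tolist()     # nested lists of Python ints

print(type(outer_only[0]))   # <class 'numpy.ndarray'>
print(type(nested[0]))       # <class 'list'>
print(type(nested[0][0]))    # <class 'int'>
```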


## 2. Slicing
- Slicing selects a range of indices from a sequence or array. The built-in `slice()` creates a slice object representing that range.

### 2.1 Basic usage

In [20]:
a = np.arange(4)
print(a)

[0 1 2 3]


[start : stop : step]
- start - integer index where the slice begins (inclusive)
- stop - integer index where the slice ends (exclusive): for a positive step, the last element taken has index at most stop - 1
- step - integer increment between consecutive indices
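The bracket notation is shorthand for a `slice` object. A brief sketch showing that the two forms select the same elements (the names `v` and `s` are illustrative):

```python
import numpy as np

v = np.arange(10)

s = slice(1, 8, 3)          # start=1, stop=8, step=3
print(v[s])                 # [1 4 7]
print(v[1:8:3])             # [1 4 7]

# Omitted fields are stored as None and take their defaults.
print(slice(None, 4))       # slice(None, 4, None)
print(v[slice(None, 4)])    # [0 1 2 3]
```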

In [21]:
a[1:2:1]

array([1])

- default step size = 1

In [22]:
a[1:2]

array([1])

In [23]:
a[0:4:3]

array([0, 3])

In [24]:
a[0::3]

array([0, 3])

In [25]:
a[:]

array([0, 1, 2, 3])

### 2.2 Negative numbers
- Python supports negative numbers for indexing into sequences such as strings, lists, and NumPy arrays

- -1 means the last element
- -2 is the next to last
- and so on.
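Negative values also work for the step, which walks the sequence backwards. A short sketch on an array and a string (the name `v` is illustrative):

```python
import numpy as np

v = np.arange(5)

print(v[-2])        # 3, the second-to-last element
print(v[::-1])      # [4 3 2 1 0], the whole array reversed
print(v[-1:1:-1])   # [4 3 2], from the last element down to index 2
print("hello"[-1])  # o, the same rule applies to strings and lists
```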

In [26]:
a = np.arange(4)
print(a)

[0 1 2 3]


In [27]:
a[-1]

3

In [28]:
a[-1:]

array([3])

In [29]:
a[-3:]

array([1, 2, 3])

- Let's see a 2-dimensional array.

In [30]:
a = np.arange(12).reshape(3,4)
print(a)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]


In [31]:
a[:, -2:]

array([[ 2,  3],
       [ 6,  7],
       [10, 11]])

In [32]:
a[:, 0:-1:2]

array([[ 0,  2],
       [ 4,  6],
       [ 8, 10]])

- Let's see a 3-dimensional array.

In [33]:
a = np.arange(24).reshape((2, 3, 4))
print(a)

[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]


In [34]:
a[1:2, 0:3, 1:3]

array([[[13, 14],
        [17, 18],
        [21, 22]]])

In [35]:
a[1, 0:3, 1:3]

array([[13, 14],
       [17, 18],
       [21, 22]])

In [36]:
a[0:2, -2, -1]

array([ 7, 19])

### 2.3 Change of dimensions

In [37]:
a = np.arange(24).reshape((6,4))
print(a.shape)

(6, 4)


In [38]:
print('----reshape----')
print(a.reshape((1,6,4)).shape)
print(a.reshape((6,1,4)).shape)
print(a.reshape((6,4,1)).shape)

print('----expand_dims----')
print(np.expand_dims(a,axis=0).shape)
print(np.expand_dims(a,axis=1).shape)
print(np.expand_dims(a,axis=2).shape)

----reshape----
(1, 6, 4)
(6, 1, 4)
(6, 4, 1)
----expand_dims----
(1, 6, 4)
(6, 1, 4)
(6, 4, 1)
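Indexing with `np.newaxis` (an alias for `None`) is a third way to insert a length-1 axis, and `reshape` accepts `-1` for one dimension that should be inferred. A minimal sketch with an illustrative array `m`:

```python
import numpy as np

m = np.arange(24).reshape((6, 4))

print(m[np.newaxis, :, :].shape)   # (1, 6, 4)
print(m[:, np.newaxis, :].shape)   # (6, 1, 4)
print(m[:, :, np.newaxis].shape)   # (6, 4, 1)

# -1 lets reshape infer the remaining dimension: 24 / 3 = 8.
print(m.reshape((3, -1)).shape)    # (3, 8)

# squeeze removes length-1 axes again.
print(np.expand_dims(m, axis=0).squeeze().shape)  # (6, 4)
```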


### 2.4 Modify values
[Python Tutor](http://www.pythontutor.com/) is useful for visualizing how Python executes code.

#### 2.4.1 List
##### 2.4.1.1 m = l

In [39]:
l = list(range(5))

In [40]:
m = l
print('same object? ', m is l) #checks whether m and l refer to "the same object"
print('m: ', m)
print('l: ', l)

same object?  True
m:  [0, 1, 2, 3, 4]
l:  [0, 1, 2, 3, 4]


In [41]:
m[0] = -1
print('m: ', m)
print('l: ', l)

m:  [-1, 1, 2, 3, 4]
l:  [-1, 1, 2, 3, 4]


##### 2.4.1.2 m = l[:]

In [42]:
l = list(range(5))
m = l[:]
print('same object? ', m is l) #checks whether m and l refer to "the same object"
print('m: ', m)
print('l: ', l)

same object?  False
m:  [0, 1, 2, 3, 4]
l:  [0, 1, 2, 3, 4]


In [43]:
m[0] = -1
print('m: ', m)
print('l: ', l)

m:  [-1, 1, 2, 3, 4]
l:  [0, 1, 2, 3, 4]


##### 2.4.1.3 m = list(l)

In [44]:
l = list(range(5))
m = list(l)
print('same object? ', m is l) #checks whether m and l refer to "the same object"
print('m: ', m)
print('l: ', l)

same object?  False
m:  [0, 1, 2, 3, 4]
l:  [0, 1, 2, 3, 4]


In [45]:
m[0] = -1
print('m: ', m)
print('l: ', l)

m:  [-1, 1, 2, 3, 4]
l:  [0, 1, 2, 3, 4]


#### 2.4.2 Numpy array

In [46]:
import numpy as np
a = np.arange(5)

##### 2.4.2.1 b = a

In [47]:
b = a
print('same object? ', a is b)
print('a: ', a)
print('b: ', b)

same object?  True
a:  [0 1 2 3 4]
b:  [0 1 2 3 4]


In [48]:
b[1] = -1
print('a: ', a)
print('b: ', b)

a:  [ 0 -1  2  3  4]
b:  [ 0 -1  2  3  4]


##### 2.4.2.2 b = a[:]

In [49]:
a = np.arange(4).reshape((2,2))
b = a[:]
print('same object? ', a is b)
print('a: ', a)
print('b: ', b)

same object?  False
a:  [[0 1]
 [2 3]]
b:  [[0 1]
 [2 3]]


In [50]:
b[1] = -2
print('a: ', a)
print('b: ', b)

a:  [[ 0  1]
 [-2 -2]]
b:  [[ 0  1]
 [-2 -2]]
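Unlike a list slice, a basic slice of a NumPy array is a view onto the same memory. One way to check this, sketched here with `np.shares_memory`, and to obtain an independent array with `copy()` (the names `x`, `view`, and `dup` are illustrative):

```python
import numpy as np

x = np.arange(5)
view = x[:]        # basic slicing: new array object, same data buffer
dup = x.copy()     # explicit copy: independent data

print(view is x)                    # False
print(np.shares_memory(x, view))    # True
print(np.shares_memory(x, dup))     # False

dup[0] = 99        # does not touch x
view[1] = -2       # writes through to x
print(x)           # [ 0 -2  2  3  4]
```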


##### 2.4.2.3 b = np.array(a)

In [51]:
a = np.arange(5)
b = np.array(a)
print('same object? ', a is b)
print('a: ', a)
print('b: ', b)

same object?  False
a:  [0 1 2 3 4]
b:  [0 1 2 3 4]


In [52]:
b[1] = -2
print('a: ', a)
print('b: ', b)

a:  [0 1 2 3 4]
b:  [ 0 -2  2  3  4]


### 2.5 Advanced indexing
- Indexing with boolean array
- Indexing with integer list / array

In [53]:
a = np.arange(12).reshape((3,4))
print(a)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]


In [54]:
idx = (a % 2 == 0)
print(idx)

[[ True False  True False]
 [ True False  True False]
 [ True False  True False]]


In [55]:
a[idx] = 5
print(a)

[[ 5  1  5  3]
 [ 5  5  5  7]
 [ 5  9  5 11]]


In [56]:
a = np.arange(12).reshape((3,4))
idx = (a[0]<3)
print(idx)
a[idx]  # IndexError: idx has 4 entries, but axis 0 of a has length 3

[ True  True  True False]


IndexError: ignored

In [57]:
print(a)
idx = (a[0]<3)
print(idx)
a[0, idx] = 100
print(a)


[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
[ True  True  True False]
[[100 100 100   3]
 [  4   5   6   7]
 [  8   9  10  11]]


In [58]:
a = np.arange(12).reshape((3,4))
b = a[1, a[0]<3]  # creates a copy of the data because the index is a boolean array
b[0] = -10

- Q. Does modifying "b" also modify "a"?

In [59]:
print(b[0])
print(a[1, a[0]<3][0])

-10
4
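The output above shows that the two are independent: indexing with a boolean or integer array returns a copy, whereas assigning through the same index writes into the original array. A minimal sketch (the names `grid`, `mask`, and `picked` are illustrative):

```python
import numpy as np

grid = np.arange(12).reshape((3, 4))
mask = grid[0] < 3                      # [ True  True  True False]

picked = grid[1, mask]                  # advanced indexing returns a copy
picked[0] = -10
print(grid[1])                          # [4 5 6 7], unchanged
print(np.shares_memory(grid, picked))   # False

grid[1, mask] = -10                     # assignment modifies grid in place
print(grid[1])                          # [-10 -10 -10   7]
```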


- Indexing on a 2-dim array 

In [60]:
a = np.arange(12).reshape((3,4))
print(a)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]


- Q. How do we access 1, 6, 11 in "a"?

In [61]:
a[[0,1], [1, 2], [2, 3]]  # IndexError: three index arrays, but "a" has only 2 dimensions

IndexError: ignored

In [62]:
a[[0, 1, 2], [1, 2, 3]]

array([ 1,  6, 11])
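With several integer index arrays, NumPy pairs the indices elementwise, so the result above contains `a[0, 1]`, `a[1, 2]`, and `a[2, 3]`. A common use of this pattern, sketched below with a hypothetical `scores` matrix and `labels` vector, is picking one entry per row.

```python
import numpy as np

scores = np.arange(12).reshape((3, 4))   # one row per example (illustrative)
labels = np.array([1, 2, 3])             # column to pick in each row

rows = np.arange(scores.shape[0])        # [0 1 2]
picked = scores[rows, labels]
print(picked)                            # [ 1  6 11]

# Equivalent to pairing the indices by hand.
by_hand = np.array([scores[r, c] for r, c in zip(rows, labels)])
print(np.array_equal(picked, by_hand))   # True
```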

In [63]:
a = np.arange(24).reshape((2, 3, 4))
print(a)
print(a[[0, 1], [1, 1], [1, 1]])

[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]
[ 5 17]


In [64]:
a = np.arange(5)
print(a)

[0 1 2 3 4]


In [65]:
ind = [2, 3, 1, 4, 0]
print(a[ind])

[2 3 1 4 0]


Suppose we want to modify the same positions in every row, say the 1st and 3rd entries (indices 0 and 2) along the last dimension.

In [66]:

a = np.arange(24).reshape((2, 3, 4))
print(a)
a[:, : ,[0, 2]] = 10
print(a)


[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]
[[[10  1 10  3]
  [10  5 10  7]
  [10  9 10 11]]

 [[10 13 10 15]
  [10 17 10 19]
  [10 21 10 23]]]


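- Example: pairwise Euclidean (L2) distances between the rows of two arrays, computed with SciPy's `cdist`.

In [67]: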
from scipy.spatial import distance
X = np.random.random((5000,3072))
Y = np.random.random((500,3072))
l2_dist = distance.cdist(X, Y, 'euclidean')
print(l2_dist.shape)
print(l2_dist)

(5000, 500)
[[22.72436978 22.30081732 22.30715942 ... 22.31939087 22.67177925
  22.67352269]
 [22.90378361 22.93328359 22.58244282 ... 22.5842375  23.06583346
  23.18932263]
 [22.4958348  22.57943911 22.81426585 ... 22.76861267 22.71157807
  22.86834135]
 ...
 [22.57269523 22.95367895 22.68607128 ... 22.75038732 23.25562061
  22.97985852]
 [22.31498996 21.95591962 22.52900591 ... 22.14882094 22.17903123
  22.43995973]
 [22.6627813  22.43508396 22.80824521 ... 22.49360153 22.95726492
  22.38657259]]
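For comparison, the same kind of distance matrix can be computed with NumPy alone. The sketch below is one possible approach, not necessarily how `cdist` works internally: it expands `||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y` and relies on the broadcasting covered in Section 3. The function name `pairwise_l2` and the small array sizes are illustrative assumptions.

```python
import numpy as np

def pairwise_l2(X, Y):
    """Euclidean distances between rows of X (n, d) and rows of Y (m, d)."""
    x_sq = np.sum(X**2, axis=1)[:, np.newaxis]   # shape (n, 1)
    y_sq = np.sum(Y**2, axis=1)[np.newaxis, :]   # shape (1, m)
    cross = X @ Y.T                              # shape (n, m)
    sq = x_sq + y_sq - 2 * cross                 # broadcasts to (n, m)
    return np.sqrt(np.maximum(sq, 0))            # clip tiny negatives from rounding

rng = np.random.default_rng(0)
X_small = rng.random((6, 8))
Y_small = rng.random((4, 8))
dists = pairwise_l2(X_small, Y_small)
print(dists.shape)   # (6, 4)
```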


## 3. Broadcasting
### 3.1 The basic idea
- Universal functions (ufuncs): functions that apply elementwise to arrays  
    Examples: np.add, np.power, np.greater, np.log, np.absolute  
- Universal functions that take two input arrays:  
    - Simplest case: the two input arrays have the same shape  
    - Two inputs with different shapes? Broadcasting!  
        Conceptually replicates values so that the shapes match  
        Avoids actually making redundant copies
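A brief sketch of some of the listed universal functions on two same-shape arrays (the arrays `p` and `q` are illustrative). The operators `+` and `>` on arrays correspond to `np.add` and `np.greater`.

```python
import numpy as np

p = np.array([1, 2, 3])
q = np.array([3, 2, 1])

print(np.add(p, q))        # [4 4 4], same as p + q
print(np.power(p, q))      # [1 4 3]
print(np.greater(p, q))    # [False False  True], same as p > q
print(np.absolute(p - q))  # [2 0 2]
```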
        
**A simple example:**

In [68]:
a = np.arange(12).reshape((3,4))
b = 1.1
c = np.arange(4)
d = np.arange(3)

print("a =",a)
print('----------------')
print("b =",b)
print('----------------')
print("c =",c)
print('----------------')
print("d =",d)
print('----------------')
print("a*b =\n",a * b)
print('----------------')
print("(a * b) + c =\n",(a * b) + c)
# print('----------------')
# d[:, np.newaxis] has shape (3, 1), so d[i] is added to every entry of row i
print("(a * b) + c + d=\n",(a * b) + c + d[:,np.newaxis])

a = [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
----------------
b = 1.1
----------------
c = [0 1 2 3]
----------------
d = [0 1 2]
----------------
a*b =
 [[ 0.   1.1  2.2  3.3]
 [ 4.4  5.5  6.6  7.7]
 [ 8.8  9.9 11.  12.1]]
----------------
(a * b) + c =
 [[ 0.   2.1  4.2  6.3]
 [ 4.4  6.5  8.6 10.7]
 [ 8.8 10.9 13.  15.1]]
(a * b) + c + d=
 [[ 0.   2.1  4.2  6.3]
 [ 5.4  7.5  9.6 11.7]
 [10.8 12.9 15.  17.1]]


Q. How can we add two inputs with different shapes?

In [69]:
b1 = np.arange(3).reshape((3,1))
print(b1)

[[0]
 [1]
 [2]]


In [70]:
b2 = np.arange(5).reshape((5,1)).T
print(b2)

[[0 1 2 3 4]]


In [71]:
b1_tile = np.tile(b1, (1,5))
print(b1_tile)

[[0 0 0 0 0]
 [1 1 1 1 1]
 [2 2 2 2 2]]


In [72]:
b2_tile = np.tile(b2, (3,1))
print(b2_tile)

[[0 1 2 3 4]
 [0 1 2 3 4]
 [0 1 2 3 4]]


In [73]:
print(b1_tile+b2_tile)

[[0 1 2 3 4]
 [1 2 3 4 5]
 [2 3 4 5 6]]


Q. Is there a simpler way?

In [74]:
print(b1)

[[0]
 [1]
 [2]]


In [75]:
print(b2)

[[0 1 2 3 4]]


In [76]:
print(b1+b2)

[[0 1 2 3 4]
 [1 2 3 4 5]
 [2 3 4 5 6]]
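Broadcasting gives the same result as tiling without materializing the repeated values. `np.broadcast_to` makes this visible: in the sketch below (array names are illustrative) it returns a view that shares memory with the original column.

```python
import numpy as np

col = np.arange(3).reshape((3, 1))
row = np.arange(5).reshape((1, 5))

tiled = np.tile(col, (1, 5))             # real copy, shape (3, 5)
virtual = np.broadcast_to(col, (3, 5))   # view, shape (3, 5)

print(np.array_equal(tiled, virtual))    # True
print(np.shares_memory(col, virtual))    # True
print(np.shares_memory(col, tiled))      # False
print(virtual.strides[1])                # 0, the column is reused, not copied
print((col + row).shape)                 # (3, 5)
```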


### 3.2 The broadcasting rule
**Example:**  

In [77]:
A = np.arange(2*4*3).reshape((2,4,1,3))
B = np.arange(5).reshape((5,1))
print('A.shape: ', A.shape)
print('B.shape: ', B.shape)

A.shape:  (2, 4, 1, 3)
B.shape:  (5, 1)


In [78]:
C = A + B

Q. What is the shape of C?

In [79]:
print('C.shape: ', C.shape)

C.shape:  (2, 4, 5, 3)



1. If one array has fewer dimensions, prepend 1's to its shape until both shapes have the same length. Then compare the shapes dimension by dimension, starting from the last.
    - A.shape: (2, 4, 1, 3)
    - B.shape: (**1**, **1**, 5, 1)
2. If one array has length 1 in a dimension, replicate its values along that dimension to match the other array.
    - A.shape: (2, 4, **5**, 3)
    - B.shape: (**2**, **4**, 5, **3**)
3. If the lengths in a dimension differ and neither is 1, an error is reported.
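The rule can be written out as a small helper. This is a sketch for illustration only (the function name `broadcast_shape` is hypothetical); NumPy performs this check internally.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if incompatible."""
    ndim = max(len(shape_a), len(shape_b))
    # Pad the shorter shape with leading 1s.
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    result = []
    for da, db in zip(a, b):
        if da == db or db == 1:
            result.append(da)      # sizes match, or b is stretched
        elif da == 1:
            result.append(db)      # a is stretched
        else:
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
    return tuple(result)

print(broadcast_shape((2, 4, 1, 3), (5, 1)))   # (2, 4, 5, 3)

# Cross-check against NumPy on real arrays.
C = np.zeros((2, 4, 1, 3)) + np.zeros((5, 1))
print(C.shape)                                  # (2, 4, 5, 3)
```

Calling `broadcast_shape((2, 4, 1, 3), (5, 2))` raises `ValueError`, because the last dimensions (3 and 2) differ and neither is 1.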

In [80]:
A = np.arange(2*4*3).reshape((2,4,1,3))
B = np.arange(10).reshape((5,2))
print('A.shape: ', A.shape)
print('B.shape: ', B.shape)

A.shape:  (2, 4, 1, 3)
B.shape:  (5, 2)


Q. What happens when we run the following code?

In [81]:
C = A + B

ValueError: operands could not be broadcast together with shapes (2,4,1,3) (5,2)