
Assignment 4 : L-Layer Deep Neural Network

Assignment:

1) Build a deep NN to recognize cat pictures (using the same dataset you have).

2) Estimate training and testing accuracies.

3) Use L=5; i.e., 4 hidden layers. Use number of hidden units 22,10,7,5

4) Use Relu activation for all hidden units and sigmoid for the output layer

5) Try again for L=7 (6 hidden layers); use # hidden units 30,22,10,7,5,3. Does this help?

Details

Adapt the same concepts explained in the previous assignments to an L-layer NN.

You will adjust your helper functions (initialization, Forprop, Backprop) so that they can be called multiple times inside a for loop over the layers.

The output layer has a different structure from the other layers: a different activation (sigmoid) and different dimensions.

Remember, the input is a (64,64,3) image, which is flattened to a vector of size (12288,1).

You take the sigmoid of the final linear unit. If it is greater than 0.5, you classify the image as a cat.
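The flattening and thresholding described above can be sketched as follows. This is a minimal illustration, not the required implementation: the helper names `flatten_images` and `classify` are hypothetical, the images are random stand-ins for the real dataset, and the division by 255 assumes pixel values in 0..255.

```python
import numpy as np

def flatten_images(images):
    """Reshape (m, 64, 64, 3) images into a (12288, m) matrix, one column per image."""
    m = images.shape[0]
    # Assumption: pixel values are 0..255 and are scaled to [0, 1].
    return images.reshape(m, -1).T / 255.0

def classify(AL, threshold=0.5):
    """Return 1 (cat) where the sigmoid output exceeds the threshold, else 0."""
    return (AL > threshold).astype(int)

# Random stand-in for the real dataset: 5 fake RGB images.
rng = np.random.RandomState(0)
fake_images = rng.randint(0, 256, size=(5, 64, 64, 3))
X = flatten_images(fake_images)
print(X.shape)                                # (12288, 5)
print(classify(np.array([[0.2, 0.5, 0.9]])))  # [[0 0 1]]
```

Note that an output of exactly 0.5 is classified as non-cat, since the rule is "greater than 0.5".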

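Equations:

1. Forward propagation:

Linear equation: $Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}$ (where $A^{[0]} = X$)

Activation: $A^{[l]} = g(Z^{[l]})$

2. Backward propagation: The three outputs $(dW^{[l]}, db^{[l]}, dA^{[l-1]})$ are computed using the input $dZ^{[l]}$.

Here are the formulas you need (with $m$ the number of examples):

$dW^{[l]} = \frac{1}{m} dZ^{[l]} A^{[l-1]T}$

$db^{[l]} = \frac{1}{m} \sum_{i=1}^{m} dZ^{[l](i)}$

$dA^{[l-1]} = W^{[l]T} dZ^{[l]}$

If $g$ is the activation function, compute $dZ^{[l]} = dA^{[l]} * g'(Z^{[l]})$.

To initialize backpropagation, you need the derivative of the cost with respect to $A^{[L]}$. Use the following code to compute this derivative:

dAL = -(np.divide(Y, AL) - np.divide(1 - Y, 1 - AL))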

Recommended Steps

1. Store the dimensions of the network in an array layer_dims. len(layer_dims) = L+1, and the array consists of the number of inputs/hidden units of each layer.

2. Initialize the parameters for an L-layer neural network.

3. Implement the "Forprop" module:

Complete the "linear part" of a layer's forward propagation step (resulting in $Z^{[l]}$).

Use the relu activation function for all layers except the output layer, which uses sigmoid.

Combine the previous two steps into a new "linear-activation" forward function.

Stack the "linear-relu" forward function L-1 times and add a "linear-sigmoid" at the end (for the output layer $L$). This gives you a new "deep-model_forward" function.

4. Compute the loss.

5. Implement the "Backprop" module:

Complete the "linear" part of a layer's backward propagation step.

Use the derivatives of relu and sigmoid accordingly (have separate functions for these).

Combine the previous two steps into a new "linear-activation" backward function.

Start with "linear-sigmoid" backward and then stack "linear-relu" backward L-1 times in a new "deep-model_backward" function.

6. Merge all the functions above into an L-layer model (to train your model).

7. Finally, update the parameters and compute the accuracies.

Recommended Functions:

1. Initialization: The initialization for a deeper L-layer neural network requires a for loop over the layers. Make sure that the dimensions match between consecutive layers. Use random initialization for the weight matrices: np.random.randn(rows, cols) * 0.01 (randn takes the dimensions as separate arguments rather than as a shape tuple). Use zeros initialization for the biases: np.zeros((rows, 1)).

Inputs:

layer_dims -- python array containing the dimensions of each layer (including the input layer); of length L+1

Outputs:

parameters -- python dictionary containing your initialized parameters "W1", "b1", ..., "WL", "bL":

Wl -- weight matrix of shape (layer_dims[l], layer_dims[l-1])

bl -- bias vector of shape (layer_dims[l], 1)
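One possible sketch of this initialization is shown below. The function name `initialize_parameters` matches the name used later in this handout; the `seed` argument is an added assumption so the example is reproducible, and the `layer_dims` values are the L=5 configuration from the assignment with a single output unit.

```python
import numpy as np

def initialize_parameters(layer_dims, seed=0):
    """Return {"W1", "b1", ..., "WL", "bL"} for the given layer sizes."""
    rng = np.random.RandomState(seed)  # assumption: fixed seed for reproducibility
    parameters = {}
    L = len(layer_dims) - 1            # number of layers, excluding the input
    for l in range(1, L + 1):
        # Small random weights, shape (current layer, previous layer).
        parameters["W" + str(l)] = rng.randn(layer_dims[l], layer_dims[l - 1]) * 0.01
        # Zero biases, shape (current layer, 1).
        parameters["b" + str(l)] = np.zeros((layer_dims[l], 1))
    return parameters

# L = 5: input of size 12288, hidden layers 22, 10, 7, 5, and one output unit.
layer_dims = [12288, 22, 10, 7, 5, 1]
params = initialize_parameters(layer_dims)
for name in sorted(params):
    print(name, params[name].shape)
```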

2. Implement the forward propagation (3 functions)

Function 1: Implement the linear part of a layer's forward propagation.

Inputs:

A -- activations from previous layer (or input data)

W -- weights matrix of shape (size of current layer, size of previous layer)

b -- bias vector of shape (size of the current layer, 1)

Outputs:

Z -- the input of the activation function

cache -- dictionary containing "A", "W" and "b"
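A minimal sketch of the linear step, assuming the name `linear_forward` and small hand-picked inputs chosen only for illustration:

```python
import numpy as np

def linear_forward(A, W, b):
    """Compute Z = W A + b and cache the inputs for the backward pass."""
    Z = W @ A + b                      # b is broadcast across the examples (columns)
    cache = {"A": A, "W": W, "b": b}
    return Z, cache

# Two features, two examples (one example per column).
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
W = np.array([[1.0, -1.0]])            # one unit in the current layer
b = np.array([[0.5]])
Z, cache = linear_forward(A, W, b)
print(Z)                               # [[-1.5 -1.5]]
```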

Function 2: Implement the activation part of a layer's forward propagation

Inputs:


11/1/2020 L-Layer_deep-NN

https://blackboard.udmercy.edu/bbcswebdav/pid-1563288-dt-content-rid-25593194_1/courses/17052_ELEE5940-02_2021/L-Layer_deep-NN.html 4/6

A_prev -- activations from previous layer (or input data): (size of previous layer, number of examples)

W -- weights matrix: numpy array of shape (size of current layer, size of previous layer)

b -- bias vector, numpy array of shape (size of the current layer, 1)

activation -- the activation to be used in this layer, stored as a text string: "sigmoid" or "relu"

Outputs:

A -- the output of the activation function

cache -- a python dictionary containing "linear_cache (A_prev, W, b)" and "activation_cache (Z)"; stored for computing the backward pass efficiently
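The linear-activation step might look like the sketch below. The helper names `sigmoid` and `relu` are assumptions, and the cache is returned here as a tuple (linear_cache, activation_cache), which is the form the backward function's inputs describe; a dictionary with those two entries would work just as well.

```python
import numpy as np

def sigmoid(Z):
    return 1.0 / (1.0 + np.exp(-Z))

def relu(Z):
    return np.maximum(0, Z)

def linear_activation_forward(A_prev, W, b, activation):
    """Compute A = g(W A_prev + b) for g in {"sigmoid", "relu"}."""
    Z = W @ A_prev + b
    if activation == "sigmoid":
        A = sigmoid(Z)
    elif activation == "relu":
        A = relu(Z)
    else:
        raise ValueError("activation must be 'sigmoid' or 'relu'")
    cache = ((A_prev, W, b), Z)        # (linear_cache, activation_cache)
    return A, cache

A_prev = np.array([[1.0], [-2.0]])     # two inputs, one example
W = np.array([[1.0, 1.0]])
b = np.array([[0.0]])                  # so Z = 1 - 2 = -1
A_relu, _ = linear_activation_forward(A_prev, W, b, "relu")
A_sig, cache = linear_activation_forward(A_prev, W, b, "sigmoid")
print(A_relu, A_sig)
```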

Function 3: Combine the two functions above together. Implement forward propagation for the "linear-relu" * (L-1) and one final "linear-sigmoid" computation

Inputs:

X -- data (train or test)

parameters -- output of initialize_parameters function

Outputs:

AL -- last activation output

caches -- list of caches containing: every cache of linear_activation_forward() (there are L of them; indexed by the layer index)
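The stacking of L-1 "linear-relu" steps followed by one "linear-sigmoid" step can be sketched as below. The compact `linear_activation_forward` is repeated only so the example runs on its own, and the tiny network sizes and random data are assumptions for illustration.

```python
import numpy as np

def linear_activation_forward(A_prev, W, b, activation):
    Z = W @ A_prev + b
    A = 1.0 / (1.0 + np.exp(-Z)) if activation == "sigmoid" else np.maximum(0, Z)
    return A, ((A_prev, W, b), Z)

def L_model_forward(X, parameters):
    """Run (linear-relu) * (L-1) followed by linear-sigmoid."""
    caches = []
    A = X
    L = len(parameters) // 2           # each layer has one W and one b
    for l in range(1, L):              # hidden layers 1 .. L-1
        A, cache = linear_activation_forward(
            A, parameters["W" + str(l)], parameters["b" + str(l)], "relu")
        caches.append(cache)
    AL, cache = linear_activation_forward(
        A, parameters["W" + str(L)], parameters["b" + str(L)], "sigmoid")
    caches.append(cache)
    return AL, caches

# Tiny illustrative network: 4 inputs, hidden layers of 3 and 2 units, 1 output.
rng = np.random.RandomState(1)
dims = [4, 3, 2, 1]
parameters = {}
for l in range(1, len(dims)):
    parameters["W" + str(l)] = rng.randn(dims[l], dims[l - 1]) * 0.01
    parameters["b" + str(l)] = np.zeros((dims[l], 1))
X = rng.randn(4, 6)                    # 6 examples
AL, caches = L_model_forward(X, parameters)
print(AL.shape, len(caches))           # (1, 6) 3
```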

3. Compute cost
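The handout does not spell out the cost formula. The dAL expression used to initialize backpropagation is the derivative of the binary cross-entropy loss, so the sketch below assumes that cost, averaged over the m examples. The name `compute_cost` and the sample values are illustrative.

```python
import numpy as np

def compute_cost(AL, Y):
    """Binary cross-entropy averaged over the examples (assumed cost)."""
    m = Y.shape[1]
    cost = -np.sum(Y * np.log(AL) + (1 - Y) * np.log(1 - AL)) / m
    return float(cost)

AL = np.array([[0.9, 0.2, 0.6]])       # predicted probabilities
Y = np.array([[1, 0, 1]])              # true labels
cost = compute_cost(AL, Y)
print(round(cost, 4))                  # 0.2798
```

If AL can reach exactly 0 or 1, the logarithm diverges; clipping AL slightly away from those values is a common safeguard.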

4. Implement Backpropagation: Compute the gradient of the loss function with respect to the network parameters. (Again 3 functions)

Function 1: Implement the linear portion of backward propagation for a single layer. In particular, suppose you have already calculated the derivative $dZ^{[l]}$. You want to get $(dW^{[l]}, db^{[l]}, dA^{[l-1]})$.

Inputs:

dZ -- Gradient of the cost with respect to the linear output (of current layer l)

cache -- (A_prev, W, b (linear cache), Z (activation cache)); from forward propagation in the current layer

Outputs:

dA_prev -- Gradient of the cost with respect to the activation (of the previous layer l-1), same shape as A_prev

dW -- Gradient of the cost with respect to W (current layer l), same shape as W

db -- Gradient of the cost with respect to b (current layer l), same shape as b
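A sketch of the linear backward step, using the standard gradients dW = dZ A_prev^T / m, db = row-wise mean of dZ, and dA_prev = W^T dZ. As a simplifying assumption, the function takes only the linear part of the cache, (A_prev, W, b), since Z is not needed for this step.

```python
import numpy as np

def linear_backward(dZ, linear_cache):
    """Return (dA_prev, dW, db) for one layer given dZ of that layer."""
    A_prev, W, b = linear_cache
    m = A_prev.shape[1]                            # number of examples
    dW = (dZ @ A_prev.T) / m
    db = np.sum(dZ, axis=1, keepdims=True) / m     # keepdims keeps shape (n, 1)
    dA_prev = W.T @ dZ
    return dA_prev, dW, db

# One input, one unit, two examples.
A_prev = np.array([[1.0, 2.0]])
W = np.array([[3.0]])
b = np.array([[0.0]])
dZ = np.array([[1.0, 1.0]])
dA_prev, dW, db = linear_backward(dZ, (A_prev, W, b))
print(dA_prev, dW, db)                 # [[3. 3.]] [[1.5]] [[1.]]
```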

Function 2: Remember, if $g$ is the activation function, we compute $dZ^{[l]} = dA^{[l]} * g'(Z^{[l]})$. Here, we want to implement the backward propagation for the "linear-activation" layer. $g$ is sigmoid for the final layer and relu for the other L-1 hidden layers.

Inputs:

dA -- (post)activation gradient for current layer l

cache -- tuple of values (linear_cache, activation_cache)

activation -- the activation to be used in this layer, stored as a text string: "sigmoid" or "relu"

Outputs:

dA_prev -- Gradient of the cost with respect to the activation (of the previous layer l-1), same shape as A_prev

dW -- Gradient of the cost with respect to W (current layer l), same shape as W

db -- Gradient of the cost with respect to b (current layer l), same shape as b
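The linear-activation backward step could be sketched as follows. The helper names `relu_backward` and `sigmoid_backward` are assumptions; each applies the elementwise rule dZ = dA * g'(Z), with g'(Z) = 1 where Z > 0 for relu and g'(Z) = s(1 - s) with s = sigmoid(Z) for sigmoid.

```python
import numpy as np

def relu_backward(dA, Z):
    return dA * (Z > 0)                # derivative of relu is 1 where Z > 0, else 0

def sigmoid_backward(dA, Z):
    s = 1.0 / (1.0 + np.exp(-Z))
    return dA * s * (1 - s)            # derivative of sigmoid is s * (1 - s)

def linear_activation_backward(dA, cache, activation):
    """Backward pass through one linear-activation layer."""
    linear_cache, Z = cache
    if activation == "relu":
        dZ = relu_backward(dA, Z)
    elif activation == "sigmoid":
        dZ = sigmoid_backward(dA, Z)
    else:
        raise ValueError("activation must be 'sigmoid' or 'relu'")
    A_prev, W, b = linear_cache
    m = A_prev.shape[1]
    dW = (dZ @ A_prev.T) / m
    db = np.sum(dZ, axis=1, keepdims=True) / m
    dA_prev = W.T @ dZ
    return dA_prev, dW, db

A_prev = np.array([[1.0, 2.0]])
W = np.array([[1.0]])
b = np.array([[-1.5]])
Z = W @ A_prev + b                     # [[-0.5, 0.5]]
dA = np.array([[1.0, 1.0]])
dA_prev, dW, db = linear_activation_backward(dA, ((A_prev, W, b), Z), "relu")
print(dA_prev, dW, db)                 # [[0. 1.]] [[1.]] [[0.5]]
```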


Function 3: Now you implement the backward function for the whole network. Remember: when you implemented the forward function, at each iteration you stored a cache which contains (A_l-1, W, b, and Z). In backpropagation, you will use those variables to compute the gradients. Therefore, in the backward function, you will iterate through all the hidden layers backward, starting from layer $L$. On each step, you will use the cached values for layer $l$ to backpropagate through layer $l$. To sum up, you need to (i) initialize backward propagation (compute dAL); (ii) implement the backward propagation for the Lth layer ("linear-sigmoid"); and then (iii) implement "linear-relu" * (L-1)

Inputs:

AL -- output of the forward propagation of last layer (L_model_forward())

Y -- true "label" vector

caches -- list of caches containing:

every cache of linear_activation_forward() with "relu" (these are caches[l], for l in range(L-1), i.e. l = 0...L-2)

the cache of linear_activation_forward() with "sigmoid" (this is caches[L-1])

Outputs:

grads -- A dictionary with the gradients

grads["dA"+ str(l)] = ...

grads["dW"+ str(l)] = ...

grads["db"+ str(l)] = ... for l=1,2,...L
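One way to sketch the full backward pass is shown below. The compact `forward` helper exists only to produce caches for the demonstration, the caches are assumed to be tuples ((A_prev, W, b), Z), and the network sizes, weights, and labels are made up. The dAL line is the initialization expression given in this handout.

```python
import numpy as np

def sigmoid(Z):
    return 1.0 / (1.0 + np.exp(-Z))

def forward(X, parameters):
    """Compact forward pass, included only so the example can run."""
    caches, A, L = [], X, len(parameters) // 2
    for l in range(1, L + 1):
        W, b = parameters["W" + str(l)], parameters["b" + str(l)]
        Z = W @ A + b
        caches.append(((A, W, b), Z))
        A = sigmoid(Z) if l == L else np.maximum(0, Z)
    return A, caches

def linear_activation_backward(dA, cache, activation):
    (A_prev, W, b), Z = cache
    if activation == "sigmoid":
        s = sigmoid(Z)
        dZ = dA * s * (1 - s)
    else:                              # relu
        dZ = dA * (Z > 0)
    m = A_prev.shape[1]
    return W.T @ dZ, (dZ @ A_prev.T) / m, np.sum(dZ, axis=1, keepdims=True) / m

def L_model_backward(AL, Y, caches):
    """Return grads with dA, dW, db for layers l = 1 .. L."""
    grads = {}
    L = len(caches)
    Y = Y.reshape(AL.shape)
    # (i) initialize backpropagation
    dA = -(np.divide(Y, AL) - np.divide(1 - Y, 1 - AL))
    # (ii) layer L is linear-sigmoid, (iii) layers L-1 .. 1 are linear-relu
    for l in range(L, 0, -1):
        grads["dA" + str(l)] = dA
        activation = "sigmoid" if l == L else "relu"
        dA, dW, db = linear_activation_backward(dA, caches[l - 1], activation)
        grads["dW" + str(l)] = dW
        grads["db" + str(l)] = db
    return grads

rng = np.random.RandomState(1)
dims = [4, 3, 2, 1]
parameters = {}
for l in range(1, len(dims)):
    parameters["W" + str(l)] = rng.randn(dims[l], dims[l - 1]) * 0.5
    parameters["b" + str(l)] = np.zeros((dims[l], 1))
X = rng.randn(4, 5)
Y = np.array([[1, 0, 1, 1, 0]])
AL, caches = forward(X, parameters)
grads = L_model_backward(AL, Y, caches)
print(sorted(grads))
```

A quick sanity check on such an implementation: for a sigmoid output with cross-entropy loss, dZ of the last layer simplifies to AL - Y, so the last-layer db should equal the mean of AL - Y.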

5. Update parameters using the gradient descent update rule (use a for loop over the layers).

Inputs:

parameters -- python dictionary containing your parameters

grads -- python dictionary containing your gradients, output of L_model_backward

learning_rate -- learning rate of the gradient descent update

Outputs:

parameters -- python dictionary containing your updated parameters

parameters["W" + str(l)] = ...

parameters["b" + str(l)] = ... for l=1,2,..L
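A minimal sketch of the gradient descent update, W := W - learning_rate * dW and b := b - learning_rate * db for every layer. The name `update_parameters` and the one-layer example values are assumptions; returning a new dictionary rather than modifying the input in place is a design choice, not a requirement.

```python
import numpy as np

def update_parameters(parameters, grads, learning_rate):
    """Apply one gradient descent step to every W and b."""
    L = len(parameters) // 2
    updated = {}
    for l in range(1, L + 1):
        updated["W" + str(l)] = parameters["W" + str(l)] - learning_rate * grads["dW" + str(l)]
        updated["b" + str(l)] = parameters["b" + str(l)] - learning_rate * grads["db" + str(l)]
    return updated

parameters = {"W1": np.array([[1.0, 2.0]]), "b1": np.array([[0.5]])}
grads = {"dW1": np.array([[0.5, -1.0]]), "db1": np.array([[1.0]])}
new_parameters = update_parameters(parameters, grads, learning_rate=0.1)
print(new_parameters["W1"], new_parameters["b1"])   # [[0.95 2.1 ]] [[0.4]]
```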

6. L-layer NN: Implement an L-layer neural network.

Inputs:

X -- data, (specifically train data examples matrix)

Y -- true "label" vector (train data)

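layers_dims -- python array containing the dimensions of each layer (the layer_dims array described under Initialization)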

learning_rate -- learning rate of GD

num_iterations -- number of iterations of GD

Outputs:

parameters -- parameters learnt by the model. Those can then be used to predict (and to compute test/train error).

7. Prediction: This function should predict the results of an L-layer neural network.

Inputs:

X -- data set of examples you would like to label (train or test)

parameters -- parameters of the trained model

Outputs:

Y_predicted -- predictions for the given dataset X
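Prediction and accuracy might be sketched as below. The `forward` helper is a compact stand-in for the full forward pass, `accuracy` is a hypothetical helper name, and the hand-set two-layer parameters are chosen so that the final linear output equals the input x, which makes the expected predictions easy to check by hand.

```python
import numpy as np

def forward(X, parameters):
    """Compact forward pass: relu for hidden layers, sigmoid for the output."""
    A, L = X, len(parameters) // 2
    for l in range(1, L + 1):
        Z = parameters["W" + str(l)] @ A + parameters["b" + str(l)]
        A = 1.0 / (1.0 + np.exp(-Z)) if l == L else np.maximum(0, Z)
    return A

def predict(X, parameters):
    """Label an example 1 (cat) when the sigmoid output is greater than 0.5."""
    AL = forward(X, parameters)
    return (AL > 0.5).astype(int)

def accuracy(Y_predicted, Y):
    """Fraction of examples whose prediction matches the label."""
    return float(np.mean(Y_predicted == Y))

# Hidden layer computes relu(x) and relu(-x); the output layer subtracts them,
# so the final linear output is x itself and the prediction is 1 when x > 0.
parameters = {
    "W1": np.array([[1.0], [-1.0]]), "b1": np.zeros((2, 1)),
    "W2": np.array([[1.0, -1.0]]),   "b2": np.zeros((1, 1)),
}
X = np.array([[-2.0, 3.0, 0.5]])
Y = np.array([[0, 1, 0]])
Y_predicted = predict(X, parameters)
print(Y_predicted)                     # [[0 1 1]]
print(accuracy(Y_predicted, Y))        # 2 of 3 correct
```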


Accuracy printed
