
COMP532 Machine Learning and BioInspired Optimization

COMP532 DEPARTMENT: Computer Science
SECOND SEMESTER EXAMINATIONS 2019/20
Machine Learning and BioInspired Optimization
INSTRUCTIONS TO CANDIDATES
1. Answer FOUR questions. If you answer more questions than the required number, the
marks awarded for the excess questions answered will be discarded (starting with your
lowest mark);
2. You should enter your answers in a word-processing document, e.g., MS Word or LaTeX
(Overleaf);
3. Some questions require a diagram. You can either sketch it on plain paper and then
scan/photograph it, or use any drawing software (e.g., MS Word, MS PowerPoint or Google
Drawings), and insert it into the word-processing document;
4. Having completed your answers, convert them to PDF and submit via VITAL before 10am, Friday
22nd May. It is strongly suggested that you submit your answers at least half an hour before
the deadline, i.e., by 9:30am on Friday 22nd May, to avoid missing the submission deadline;
5. You MUST complete the exam by the above stated ending time;
6. You can only submit ONCE;
7. Expected Writing Time: 3 hours.
8. Please do NOT include your name in your answer sheet.
9. This is an open-book assessment. Copying any material from another source, or colluding
with any other person in the preparation and production of this work will be considered
suspected academic misconduct and will be dealt with according to the University’s
Academic Integrity Policy.
10. The use of a calculator IS allowed.
1. Reinforcement Learning (25 marks)
(a) (8 marks) The following table is a list of action values at time t. Calculate the action
probabilities for a1, a2 and a3 according to Softmax Action Selection using the Boltzmann
distribution with τ = 2.
a_n        a1    a2    a3
Q_t(a_n)   -1     4     2
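For reference, a minimal Python sketch of the Boltzmann (softmax) selection rule assumed here, P(a_i) = exp(Q_t(a_i)/τ) / Σ_j exp(Q_t(a_j)/τ); the function name is illustrative and not part of the question.

```python
import math

def boltzmann_probabilities(q_values, tau):
    """Softmax (Boltzmann) action-selection probabilities:
    P(a_i) = exp(Q(a_i)/tau) / sum_j exp(Q(a_j)/tau)."""
    exps = [math.exp(q / tau) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]

# Example call with the action values from the table above and tau = 2.
print(boltzmann_probabilities([-1, 4, 2], tau=2))
```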
(b) (10 marks) Consider the 2 × 2 deterministic grid world in Figure 1(a). Allowable moves
are shown by arrows, and the numbers indicate the reward for performing each action.
At time step t, a Q-learning agent has computed Q values as given in Figure 1(b); e.g.,
Q(LowerLeft, Up) = 15 and Q(LowerLeft, Right) = 20. Show the changes in the Q
value estimates when the agent subsequently takes the path shown by the dotted line
in Figure 1(c) (the agent starts in the lower left cell) when learning rate α = 0.5 and
discount factor γ = 0.75. Clearly show how you computed your answer.
Figure 1: The deterministic grid world.
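For reference, a minimal sketch of the standard Q-learning backup used in part (b), Q(s,a) ← Q(s,a) + α[r + γ max_a' Q(s',a') − Q(s,a)]. The reward and the second state's Q values in the example call are placeholders, not values taken from Figure 1.

```python
def q_learning_update(Q, state, action, reward, next_state, alpha=0.5, gamma=0.75):
    """One Q-learning backup:
    Q(s,a) <- Q(s,a) + alpha * (reward + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[next_state].values(), default=0.0)
    Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])
    return Q[state][action]

# Illustrative call only: reward and the LowerRight entries are placeholders.
Q = {"LowerLeft": {"Up": 15.0, "Right": 20.0},
     "LowerRight": {"Up": 0.0, "Left": 0.0}}
q_learning_update(Q, "LowerLeft", "Right", reward=0.0, next_state="LowerRight")
```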
(c) (7 marks) Draw the backup diagram of the Temporal Difference learning method. In addition, explain bootstrapping and sampling using the diagram.
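A minimal TD(0) sketch may help frame the discussion of bootstrapping and sampling; the function and variable names here are illustrative.

```python
def td0_update(V, state, reward, next_state, alpha, gamma):
    """TD(0) backup: V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s)).
    Sampling: only the one next state actually observed is used (unlike DP,
    which averages over all possible successors).
    Bootstrapping: the target r + gamma * V(s') relies on the current estimate
    V(s') rather than on a complete sampled return (unlike Monte Carlo)."""
    V[state] += alpha * (reward + gamma * V[next_state] - V[state])
    return V[state]
```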
2. Multi-Agent Reinforcement Learning (25 marks)
(a) (15 marks) Below is the payoff matrix for the Prisoner's Dilemma game. (Row player: x;
Column player: y; C: Cooperate; D: Defect)
        C        D
C    (3, 3)   (0, 5)
D    (5, 0)   (1, 1)
(1) State the Nash Equilibrium of the game and explain your answer.
(2) Consider that at some time t, x = (0.3, 0.7) and y = (0.6, 0.4), where the first
argument in each vector is the probability of cooperating and the second argument is the
probability of defecting, for players x and y respectively.
According to the replicator dynamics, calculate ẋ1 and ẏ1, i.e., the change in the probability
of playing the first action (C), for both players. Show how you computed your answer.
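For reference, a sketch of the two-population replicator dynamics assumed here, ẋ_i = x_i[(Ay)_i − xᵀAy] and ẏ_j = y_j[(xᵀB)_j − xᵀBy], where A and B are the row and column players' payoff matrices read off the bimatrix above; the function and variable names are illustrative.

```python
def replicator_rate(i, strategy, fitness):
    """Replicator dynamics: rate_i = strategy_i * (fitness_i - average fitness),
    where the average is the expected payoff of the current mixed strategy."""
    avg = sum(p * f for p, f in zip(strategy, fitness))
    return strategy[i] * (fitness[i] - avg)

A = [[3, 0], [5, 1]]   # row player's payoffs from the matrix above
B = [[3, 5], [0, 1]]   # column player's payoffs from the matrix above
x = [0.3, 0.7]
y = [0.6, 0.4]

Ay = [sum(A[i][j] * y[j] for j in range(2)) for i in range(2)]   # (A y)_i
xB = [sum(x[i] * B[i][j] for i in range(2)) for j in range(2)]   # (x^T B)_j

x1_dot = replicator_rate(0, x, Ay)   # change in x's probability of playing C
y1_dot = replicator_rate(0, y, xB)   # change in y's probability of playing C
```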
(b) (5 marks) Fig. 2 is the direction field plot of the Rock-Paper-Scissors game. According
to the plot, is there a Nash equilibrium for the game? If yes, is the Nash equilibrium
evolutionarily stable? Explain your answer.
Figure 2: The direction field for the Rock-Paper-Scissors game.
(c) (5 marks) What are Markov Games, and how do they extend repeated games and
MDPs? Provide an (abstract) example.
3. Swarm Intelligence (25 marks)
(a) (15 marks) Consider the Travelling Salesman Problem shown in Fig. 3: there are 4
different cities, i.e., Liverpool (A), Manchester (B), Leeds (C) and Sheffield (D); the
edges represent the connections between the cities, and the numbers on the edges represent
the distances between them. A salesman travels over the graph aiming to find a
closed tour of minimal length visiting each city, using the Ant System. Take ηij = 1/dij,
where dij is the distance between cities i and j.
Figure 3: The map of connections between Liverpool, Manchester, Leeds and Sheffield.
(1) For one ant starting at Liverpool, calculate the values of p¹AB(1), p¹AC(1) and p¹AD(1).
Take parameter values α = 1, β = 3, ρ = 0.5, Q = 100, τ = 10⁻⁶.
(2) Consider that the ant took the tour A → B → C → D → A.
Calculate: τAB(2) and τAC(2).
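For reference, a sketch of the Ant System rules these calculations use: the transition probability p_ij ∝ τ_ij^α · η_ij^β over the allowed cities, and a pheromone update of the form τ_ij ← (1 − ρ)τ_ij + Q/L_k on the edges of ant k's tour of length L_k (conventions vary; some formulations treat ρ as trail persistence rather than evaporation). Function names are illustrative, and the distances from Figure 3 are not reproduced here.

```python
def transition_probs(current, allowed, tau, eta, alpha=1.0, beta=3.0):
    """Ant System transition rule:
    p_ij = (tau_ij**alpha * eta_ij**beta) / sum over allowed l of (tau_il**alpha * eta_il**beta)."""
    weights = {j: (tau[(current, j)] ** alpha) * (eta[(current, j)] ** beta)
               for j in allowed}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}

def pheromone_update(tau, tour, tour_length, rho=0.5, Q=100.0):
    """Pheromone update, assuming the evaporation form
    tau_ij <- (1 - rho) * tau_ij + Q / L_k on the edges used by ant k.
    Assumes tau is keyed by ordered city pairs in both directions."""
    for edge in tau:
        tau[edge] *= (1 - rho)
    for i, j in zip(tour, tour[1:] + tour[:1]):   # closed tour: wrap to start
        tau[(i, j)] += Q / tour_length
        tau[(j, i)] += Q / tour_length
    return tau
```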
(b) (5 marks) Draw a graph to show the velocity update of the Particle Swarm Optimisation
algorithm and give a brief explanation of your graph.
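As a reminder of what such a graph typically depicts, here is a sketch of the standard PSO velocity update v ← w·v + c1·r1·(pbest − x) + c2·r2·(gbest − x); the coefficient values below are illustrative, not prescribed by the question.

```python
import random

def pso_velocity_update(v, x, pbest, gbest, w=0.7, c1=1.5, c2=1.5):
    """Standard PSO velocity update (inertia + cognitive + social terms):
    v <- w*v + c1*r1*(pbest - x) + c2*r2*(gbest - x)."""
    r1, r2 = random.random(), random.random()
    return [w * vi + c1 * r1 * (pi - xi) + c2 * r2 * (gi - xi)
            for vi, xi, pi, gi in zip(v, x, pbest, gbest)]
```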
(c) (5 marks) What type of multi-agent learning is Swarm Intelligence, and what are its main
benefits?
4. Deep Learning (25 marks)
(a) (6 marks) Figure 4 is a network for the function y = wx + b. At the current time, x =
−3, w = 2, b = 3. Calculate the updated w by applying back-propagation.
Figure 4: The network.
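For reference, a sketch of a single back-propagation (gradient-descent) step for y = wx + b under a squared-error loss; the target value and learning rate used in the example call are placeholders, since the actual values appear in Figure 4.

```python
def backprop_step(x, w, b, target, lr):
    """One gradient step for y = w*x + b with loss L = 0.5 * (y - target)**2,
    so dL/dy = (y - target) and dL/dw = (y - target) * x."""
    y = w * x + b
    grad_w = (y - target) * x
    return w - lr * grad_w

# x = -3, w = 2, b = 3 as stated in the question; 'target' and 'lr' below are
# placeholder values, not taken from Figure 4.
new_w = backprop_step(x=-3, w=2, b=3, target=0.0, lr=0.1)
```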
(b) (6 marks) (1) In one convolution layer, we have 8 filters of size 3×3×3 applied to an input
volume of 64×64×3 with stride 1 and padding 1. What is the size of the output volume? Give
details of how you calculated the size of the output volume.
(6 marks) (2) Using this example, explain how to apply the convolution operation to the
input volume to get the output volume.
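For reference, the standard output-size formula for a convolution layer is (W − F + 2P)/S + 1 in each spatial dimension, with the output depth given by the number of filters; a small sketch with the question's numbers plugged in:

```python
def conv_output_side(w_in, f, stride, pad):
    """Spatial output size of a convolution layer: (W - F + 2P) // S + 1."""
    return (w_in - f + 2 * pad) // stride + 1

# 64x64x3 input, 3x3x3 filters, stride 1, padding 1; depth = number of filters.
side = conv_output_side(64, 3, stride=1, pad=1)   # spatial side of the output
depth = 8                                         # one output channel per filter
```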
(c) (7 marks) Draw a graph to explain the structure of a Deep Q-Network (DQN) that would be
suitable for an Atari games agent.
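As a point of reference for the structure being asked about, the DQN architecture of Mnih et al. (2015) for Atari is one common choice, summarised here as a plain data listing; the layer sizes are taken from that paper, not from the module notes.

```python
# The DQN of Mnih et al. (2015) for Atari, listed layer by layer.
dqn_layers = [
    ("input",  "84x84x4 stack of preprocessed greyscale frames"),
    ("conv1",  "32 filters, 8x8, stride 4, ReLU"),
    ("conv2",  "64 filters, 4x4, stride 2, ReLU"),
    ("conv3",  "64 filters, 3x3, stride 1, ReLU"),
    ("fc",     "512 fully connected units, ReLU"),
    ("output", "one linear Q-value output per valid action"),
]
```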
5. Artificial Immune Systems and DNA Computing (25 marks)
(a) (5 marks) Explain Clonal Selection in the Artificial Immune System.
(b) (5 marks) What are the common characteristics of Artificial Immune Systems and Reinforcement Learning? Give your explanations.
(c) (5 marks) What are the five characteristics of AIS in common with Swarm Systems?
Give your explanations.
(d) (5 marks) What are the common characteristics of the Parallel Problem Solving methods
we have covered, including single-/multi-agent learning, deep learning, swarm intelligence,
artificial immune systems and DNA computing? Give your explanations.
(e) (5 marks) What are the five steps of Adleman’s experiment in DNA computing to solve
the Traveling Salesman Problem (TSP)?