
COMP532 Machine Learning and BioInspired Optimization

COMP532 DEPARTMENT: Computer Science
SECOND SEMESTER EXAMINATIONS 2019/20
Machine Learning and BioInspired Optimization
INSTRUCTIONS TO CANDIDATES
1. Answer FOUR questions. If you answer more questions than the required number, the
marks awarded for the excess questions answered will be discarded (starting with your
lowest mark);
2. You should enter your answers in a word-processing document, e.g., MS Word or LaTeX
(Overleaf);
3. Some questions require a diagram. You can either sketch it on plain paper and then
scan/photograph it, or use any drawing software (e.g., MS Word, MS PowerPoint or Google
Drawings), and insert it into the word-processing document;
4. Having completed your answers, convert them to PDF and submit via VITAL before 10am, Friday
22nd May. It is strongly suggested that you submit your answers at least half an hour before
the deadline, i.e., by 9:30am on Friday 22nd May, to avoid missing the submission deadline;
5. You MUST complete the exam by the above stated ending time;
6. You can only submit ONCE;
7. Expected Writing Time: 3 hours.
8. Please do NOT include your name in your answer sheet.
9. This is an open-book assessment. Copying any material from another source, or colluding
with any other person in the preparation and production of this work will be considered
suspected academic misconduct and will be dealt with according to the University’s
Academic Integrity Policy.
10. The use of a calculator IS allowed.
1. Reinforcement Learning (25 marks)
(a) (8 marks) The following table is a list of action values at time t. Calculate the action
probabilities for a1, a2 and a3 according to Softmax Action Selection using the Boltzmann
distribution with τ = 2.
a_n        a1    a2    a3
Q_t(a_n)   -1     4     2
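For reference, a minimal Python sketch of the Boltzmann (softmax) selection rule assumed here, P(a_i) = exp(Q_t(a_i)/τ) / Σ_j exp(Q_t(a_j)/τ); the function name is illustrative and not part of the question.

```python
import math

def boltzmann_probabilities(q_values, tau):
    """Softmax (Boltzmann) action-selection probabilities:
    P(a_i) = exp(Q(a_i)/tau) / sum_j exp(Q(a_j)/tau)."""
    exps = [math.exp(q / tau) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]

# Example call with the action values from the table above and tau = 2.
print(boltzmann_probabilities([-1, 4, 2], tau=2))
```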
(b) (10 marks) Consider the 2 × 2 deterministic grid world in Figure 1(a). Allowable moves
are shown by arrows, and the numbers indicate the reward for performing each action.
At time step t, a Q-learning agent has computed Q values as given in Figure 1(b); e.g.,
Q(LowerLeft, Up) = 15 and Q(LowerLeft, Right) = 20. Show the changes in the Q
value estimates when the agent subsequently takes the path shown by the dotted line
in Figure 1(c) (the agent starts in the lower left cell) when learning rate α = 0.5 and
discount factor γ = 0.75. Clearly show how you computed your answer.
Figure 1: The deterministic grid world.
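For reference, a minimal sketch of the standard Q-learning backup used in part (b), Q(s,a) ← Q(s,a) + α[r + γ max_a' Q(s',a') − Q(s,a)]. The reward and the second state's Q values in the example call are placeholders, not values taken from Figure 1.

```python
def q_learning_update(Q, state, action, reward, next_state, alpha=0.5, gamma=0.75):
    """One Q-learning backup:
    Q(s,a) <- Q(s,a) + alpha * (reward + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[next_state].values(), default=0.0)
    Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])
    return Q[state][action]

# Illustrative call only: reward and the LowerRight entries are placeholders.
Q = {"LowerLeft": {"Up": 15.0, "Right": 20.0},
     "LowerRight": {"Up": 0.0, "Left": 0.0}}
q_learning_update(Q, "LowerLeft", "Right", reward=0.0, next_state="LowerRight")
```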
(c) (7 marks) Draw the backup diagram of the Temporal Difference learning method. In addition, explain bootstrapping and sampling using the diagram.
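A minimal TD(0) sketch may help frame the discussion of bootstrapping and sampling; the function and variable names here are illustrative.

```python
def td0_update(V, state, reward, next_state, alpha, gamma):
    """TD(0) backup: V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s)).
    Sampling: only the one next state actually observed is used (unlike DP,
    which averages over all possible successors).
    Bootstrapping: the target r + gamma * V(s') relies on the current estimate
    V(s') rather than on a complete sampled return (unlike Monte Carlo)."""
    V[state] += alpha * (reward + gamma * V[next_state] - V[state])
    return V[state]
```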
2. Multi-Agent Reinforcement Learning (25 marks)
(a) (15 marks) Below is the payoff matrix for the Prisoner's Dilemma game. (Row player: x;
Column player: y; C: Cooperate; D: Defect)
        C        D
C    (3, 3)   (0, 5)
D    (5, 0)   (1, 1)
(1) State the Nash Equilibrium of the game and explain your answer.
(2) Consider that at some time t, x = (0.3, 0.7) and y = (0.6, 0.4), where the first
argument in each vector is the probability of cooperating and the second argument is the
probability of defecting, for players x and y respectively.
According to the replicator dynamics, calculate ẋ1 and ẏ1, i.e., the change in the probability
of playing the first action (C), for both players. Show how you computed your answer.
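For reference, a sketch of the two-population replicator dynamics assumed here, ẋ_i = x_i[(Ay)_i − xᵀAy] and ẏ_j = y_j[(xᵀB)_j − xᵀBy], where A and B are the row and column players' payoff matrices read off the bimatrix above; the function and variable names are illustrative.

```python
def replicator_rate(i, strategy, fitness):
    """Replicator dynamics: rate_i = strategy_i * (fitness_i - average fitness),
    where the average is the expected payoff of the current mixed strategy."""
    avg = sum(p * f for p, f in zip(strategy, fitness))
    return strategy[i] * (fitness[i] - avg)

A = [[3, 0], [5, 1]]   # row player's payoffs from the matrix above
B = [[3, 5], [0, 1]]   # column player's payoffs from the matrix above
x = [0.3, 0.7]
y = [0.6, 0.4]

Ay = [sum(A[i][j] * y[j] for j in range(2)) for i in range(2)]   # (A y)_i
xB = [sum(x[i] * B[i][j] for i in range(2)) for j in range(2)]   # (x^T B)_j

x1_dot = replicator_rate(0, x, Ay)   # change in x's probability of playing C
y1_dot = replicator_rate(0, y, xB)   # change in y's probability of playing C
```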
(b) (5 marks) Fig. 2 is the direction field plot of the Rock-Paper-Scissors game. According
to the plot, is there a Nash equilibrium for the game? If yes, is the Nash equilibrium
evolutionarily stable? Explain your answer.
Figure 2: The direction field for the Rock-Paper-Scissors game.
(c) (5 marks) What are Markov Games, and how do they extend repeated games and
MDPs? Provide an (abstract) example.
3. Swarm Intelligence (25 marks)
(a) (15 marks) Consider the Travelling Salesman Problem shown in Fig. 3: there are 4
different cities, i.e., Liverpool (A), Manchester (B), Leeds (C) and Sheffield (D); the
edges represent the connections between the cities, and the numbers on the edges represent
the distances between them. A salesman travels over the graph aiming to find a
closed tour of minimal length visiting each city, using the Ant System. Take ηij = 1/dij,
where dij is the distance between cities i and j.
Figure 3: The map of connections between Liverpool, Manchester, Leeds and Sheffield.
(1) For one ant starting at Liverpool, calculate the values of p¹AB(1), p¹AC(1) and p¹AD(1).
Take parameter values α = 1, β = 3, ρ = 0.5, Q = 100, τ = 10⁻⁶.
(2) Consider that the ant took the tour A → B → C → D → A.
Calculate: τAB(2) and τAC(2).
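For reference, a sketch of the Ant System rules these calculations use: the transition probability p_ij ∝ τ_ij^α · η_ij^β over the allowed cities, and a pheromone update of the form τ_ij ← (1 − ρ)τ_ij + Q/L_k on the edges of ant k's tour of length L_k (conventions vary; some formulations treat ρ as trail persistence rather than evaporation). Function names are illustrative, and the distances from Figure 3 are not reproduced here.

```python
def transition_probs(current, allowed, tau, eta, alpha=1.0, beta=3.0):
    """Ant System transition rule:
    p_ij = (tau_ij**alpha * eta_ij**beta) / sum over allowed l of (tau_il**alpha * eta_il**beta)."""
    weights = {j: (tau[(current, j)] ** alpha) * (eta[(current, j)] ** beta)
               for j in allowed}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}

def pheromone_update(tau, tour, tour_length, rho=0.5, Q=100.0):
    """Pheromone update, assuming the evaporation form
    tau_ij <- (1 - rho) * tau_ij + Q / L_k on the edges used by ant k.
    Assumes tau is keyed by ordered city pairs in both directions."""
    for edge in tau:
        tau[edge] *= (1 - rho)
    for i, j in zip(tour, tour[1:] + tour[:1]):   # closed tour: wrap to start
        tau[(i, j)] += Q / tour_length
        tau[(j, i)] += Q / tour_length
    return tau
```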
(b) (5 marks) Draw a graph to show the velocity update of the Particle Swarm Optimisation
algorithm and give a brief explanation of your graph.
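As a reminder of what such a graph typically depicts, here is a sketch of the standard PSO velocity update v ← w·v + c1·r1·(pbest − x) + c2·r2·(gbest − x); the coefficient values below are illustrative, not prescribed by the question.

```python
import random

def pso_velocity_update(v, x, pbest, gbest, w=0.7, c1=1.5, c2=1.5):
    """Standard PSO velocity update (inertia + cognitive + social terms):
    v <- w*v + c1*r1*(pbest - x) + c2*r2*(gbest - x)."""
    r1, r2 = random.random(), random.random()
    return [w * vi + c1 * r1 * (pi - xi) + c2 * r2 * (gi - xi)
            for vi, xi, pi, gi in zip(v, x, pbest, gbest)]
```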
(c) (5 marks) What type of multi-agent learning is Swarm Intelligence, and what are its main
benefits?
4. Deep Learning (25 marks)
(a) (6 marks) Figure 4 is a network for the function y = wx + b. At the current time, x =
−3, w = 2, b = 3. Calculate the updated w by applying back-propagation.
Figure 4: The network.
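For reference, a sketch of a single back-propagation (gradient-descent) step for y = wx + b under a squared-error loss; the target value and learning rate used in the example call are placeholders, since the actual values appear in Figure 4.

```python
def backprop_step(x, w, b, target, lr):
    """One gradient step for y = w*x + b with loss L = 0.5 * (y - target)**2,
    so dL/dy = (y - target) and dL/dw = (y - target) * x."""
    y = w * x + b
    grad_w = (y - target) * x
    return w - lr * grad_w

# x = -3, w = 2, b = 3 as stated in the question; 'target' and 'lr' below are
# placeholder values, not taken from Figure 4.
new_w = backprop_step(x=-3, w=2, b=3, target=0.0, lr=0.1)
```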
(b) (6 marks) (1) In one convolution layer, we have 8 filters of size 3×3×3 applied to an input
volume of 64×64×3 with stride 1 and padding 1. What is the size of the output volume? Give
details of how you calculated the size of the output volume.
(6 marks) (2) Using this example, explain how to apply the convolution operation to the
input volume to get the output volume.
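For reference, the standard output-size formula for a convolution layer is (W − F + 2P)/S + 1 in each spatial dimension, with the output depth given by the number of filters; a small sketch with the question's numbers plugged in:

```python
def conv_output_side(w_in, f, stride, pad):
    """Spatial output size of a convolution layer: (W - F + 2P) // S + 1."""
    return (w_in - f + 2 * pad) // stride + 1

# 64x64x3 input, 3x3x3 filters, stride 1, padding 1; depth = number of filters.
side = conv_output_side(64, 3, stride=1, pad=1)   # spatial side of the output
depth = 8                                         # one output channel per filter
```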
(c) (7 marks) Draw a graph to explain the structure of a Deep Q-Network (DQN) that would be
suitable for an Atari games agent.
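As a point of reference for the structure being asked about, the DQN architecture of Mnih et al. (2015) for Atari is one common choice, summarised here as a plain data listing; the layer sizes are taken from that paper, not from the module notes.

```python
# The DQN of Mnih et al. (2015) for Atari, listed layer by layer.
dqn_layers = [
    ("input",  "84x84x4 stack of preprocessed greyscale frames"),
    ("conv1",  "32 filters, 8x8, stride 4, ReLU"),
    ("conv2",  "64 filters, 4x4, stride 2, ReLU"),
    ("conv3",  "64 filters, 3x3, stride 1, ReLU"),
    ("fc",     "512 fully connected units, ReLU"),
    ("output", "one linear Q-value output per valid action"),
]
```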
5. Artificial Immune Systems and DNA Computing (25 marks)
(a) (5 marks) Explain Clonal Selection in the Artificial Immune System.
(b) (5 marks) What are the common characteristics of Artificial Immune Systems and Reinforcement Learning? Give your explanations.
(c) (5 marks) What are the five characteristics of AIS in common with Swarm Systems?
Give your explanations.
(d) (5 marks) What are the common characteristics of the Parallel Problem Solving methods
we have covered, including single-/multi-agent learning, deep learning, swarm intelligence,
artificial immune systems and DNA computing? Give your explanations.
(e) (5 marks) What are the five steps of Adleman’s experiment in DNA computing to solve
the Traveling Salesman Problem (TSP)?