Question

11. You're helping some security analysts monitor a collection of networked computers, tracking the spread of an online virus. There are n computers in the system, labeled C1, C2, ..., Cn, and as input you're given a collection of trace data indicating the times at which pairs of computers communicated. Thus the data is a sequence of ordered triples (Ci, Cj, tk); such a triple indicates that Ci and Cj exchanged bits at time tk. There are m triples total. We'll assume that the triples are presented to you in sorted order of time. For purposes of simplicity, we'll assume that each pair of computers communicates at most once during the interval you're observing.

The security analysts you're working with would like to be able to answer questions of the following form: If the virus was inserted into computer Ca at time x, could it possibly have infected computer Cb by time y? The mechanics of infection are simple: if an infected computer Ci communicates with an uninfected computer Cj at time tk (in other words, if one of the triples (Ci, Cj, tk) or (Cj, Ci, tk) appears in the trace data), then computer Cj becomes infected as well, starting at time tk.

Infection can thus spread from one machine to another across a sequence of communications, provided that no step in this sequence involves a move backward in time. Thus, for example, if Ci is infected by time tk, and the trace data contains triples (Ci, Cj, tk) and (Cj, Cq, tr), where tk <= tr, then Cq will become infected via Cj. (Note that it is okay for tk to be equal to tr; this would mean that Cj had open connections to both Ci and Cq at the same time, and so a virus could move from Ci to Cq.) For example, suppose n = 4, the trace data consists of the triples

Explanation / Answer

Solution:

This trace data can be represented as an undirected graph whose vertices are the computers appearing in the trace data (C1, C2, ..., Ci, ..., Cj, ...), and in which the edge between two vertices Ci and Cj is labeled with the time tk at which they communicated.
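As a minimal sketch of this representation, the Python snippet below stores the graph as an adjacency list in which each neighbor is paired with the communication time. The function name `build_graph` and the sample trace are illustrative assumptions, not part of the original problem.

```python
from collections import defaultdict

def build_graph(triples):
    """Map each computer to a list of (neighbor, time) pairs.

    Each triple (ci, cj, t) becomes one undirected edge labeled t,
    so it is recorded in the adjacency list of both endpoints.
    """
    graph = defaultdict(list)
    for ci, cj, t in triples:
        graph[ci].append((cj, t))
        graph[cj].append((ci, t))
    return graph

# Hypothetical trace data, already sorted by time (made up for illustration).
trace = [("C1", "C2", 3), ("C2", "C3", 5), ("C3", "C4", 5), ("C1", "C4", 9)]
graph = build_graph(trace)
```

Because the input triples are sorted by time, every adjacency list built this way is also sorted by time.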

This problem can be solved by looking for a path between the two given computers and checking that the communications along it occur in nondecreasing order of time, all within the interval from x to y. Lying in the same connected component is necessary but not sufficient: the times along the path must never move backward. If such a path exists, the second computer may have been infected by the first, provided the first computer is already infected. This can be determined by performing a BFS from the infected vertex that only follows edges respecting the time order.
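The condition on a single path can be illustrated with a short Python check. This is only a sketch: the helper name `is_time_respecting` and its arguments are assumptions made for illustration, and it tests one given path rather than searching for one.

```python
def is_time_respecting(times, x, y):
    """Check one candidate path.

    times: communication times of the edges along the path, in path order.
    x, y:  the virus is inserted at time x and must arrive by time y.
    """
    if not times:
        # An empty path means source and target are the same computer.
        return True
    # Since the times must be nondecreasing, it is enough to compare
    # the first edge against x and the last edge against y.
    in_window = x <= times[0] and times[-1] <= y
    nondecreasing = all(a <= b for a, b in zip(times, times[1:]))
    return in_window and nondecreasing
```

Equal consecutive times are accepted, which matches the rule that a virus can pass through a computer holding two connections open at the same moment.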

Let C be the set of vertices reachable from a given vertex s in the graph. The algorithm can be stated formally as:

-----------------------------------------------------

Input: triples sorted by time; query (s = Ca, x, Cb, y)

Let C store all nodes reachable from s by a path whose times never decrease

Initialize: C = {s}

For each time t with x <= t <= y, in increasing order

   While there is a triple (u, v, t) or (v, u, t) where u is in C, but v is not in C

       Add v to C (flag v as infected at time t)

   End While

End For

If Cb is in C

   Answer yes: Cb could have been infected by time y

Else

   Answer no

End If