You're helping some security analysts monitor a collection of networked computers

Question

You're helping some security analysts monitor a collection of networked computers, tracking the spread of an online virus. There are n computers in the system, labeled C1, C2, ..., Cn, and as input you're given a collection of trace data indicating the times at which pairs of computers communicated. Thus the data is a sequence of ordered triples (Ci, Cj, tk); such a triple indicates that Ci and Cj exchanged bits at time tk. There are m triples total.

We'll assume that the triples are presented to you in sorted order of time. For purposes of simplicity, we'll assume that each pair of computers communicates at most once during the interval you're observing.

The security analysts you're working with would like to be able to answer questions of the following form: If the virus was inserted into computer Ca at time x, could it possibly have infected computer Cb by time y? The mechanics of infection are simple: if an infected computer Ci communicates with an uninfected computer Cj at time tk (in other words, if one of the triples (Ci, Cj, tk) or (Cj, Ci, tk) appears in the trace data), then computer Cj becomes infected as well, starting at time tk. Infection can thus spread from one machine to another across a sequence of communications, provided that no step in this sequence involves a move backward in time. Thus, for example, if Ci is infected by time tk, and the trace data contains triples (Ci, Cj, tk) and (Cj, Cq, tr), where tk ≤ tr, then Cq will become infected via Cj. (Note that it is okay for tk to be equal to tr; this would mean that Cj had open connections to both Ci and Cq at the same time, and so a virus could move from Ci to Cq.)
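The infection rule above can be checked mechanically for one proposed sequence of communications. The following is a minimal sketch, not part of the original problem statement: the function name `is_valid_chain` and the tuple layout are illustrative choices, and it assumes that a communication at exactly time x can already spread the virus.

```python
from typing import List, Tuple

# (computer, computer, time); the layout is an assumption made for this sketch
Triple = Tuple[str, str, int]


def is_valid_chain(chain: List[Triple], source: str, start_time: int) -> bool:
    """Return True if `chain` could carry the virus onward from `source`,
    given that the virus was inserted into `source` at `start_time`."""
    carrier = source        # computer that most recently received the virus
    last_time = start_time  # time of the previous step in the chain
    for u, v, t in chain:
        if t < last_time:
            return False    # a step backward in time is not allowed
        if carrier == u:
            carrier = v     # virus moves from u to v
        elif carrier == v:
            carrier = u     # triples are unordered, so v to u is also fine
        else:
            return False    # this communication does not involve the carrier
        last_time = t
    return True
```

Equal timestamps on consecutive steps are accepted, matching the note that tk may equal tr.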

For example, suppose n = 4, the trace data consists of the triples

(C1, C2, 4), (C2, C4, 8), (C3, C4, 8), (C1, C4, 12),

and the virus was inserted into computer C1 at time 2. Then C3 would be infected at time 8 by a sequence of three steps: first C2 becomes infected at time 4, then C4 gets the virus from C2 at time 8, and then C3 gets the virus from C4 at time 8. On the other hand, if the trace data were

(C2, C3, 8), (C1, C4, 12), (C1, C2, 14),

and again the virus was inserted into computer C1 at time 2, then C3 would not become infected during the period of observation: although C2 becomes infected at time 14, we see that C3 only communicates with C2 before C2 was infected. There is no sequence of communications moving forward in time by which the virus could get from C1 to C3 in this second example.
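Both examples can be replayed with a deliberately simple brute-force check. This is a sketch for verification only, not the efficient algorithm the question asks for: it repeats passes over the trace until no infection time improves, so it can take many passes. The name `earliest_infection_times` and the string labels are assumptions made for illustration.

```python
import math
from typing import Dict, List, Tuple

Triple = Tuple[str, str, int]  # (computer, computer, time)


def earliest_infection_times(
    trace: List[Triple], source: str, start_time: int
) -> Dict[str, float]:
    """Map each reachable computer to the earliest time it can be infected."""
    times: Dict[str, float] = {source: start_time}
    changed = True
    while changed:  # repeat until a full pass changes nothing
        changed = False
        for u, v, t in trace:
            for a, b in ((u, v), (v, u)):  # a triple works in both directions
                # a passes the virus to b at time t if a was infected by then
                # and t is earlier than anything recorded for b so far
                if times.get(a, math.inf) <= t < times.get(b, math.inf):
                    times[b] = t
                    changed = True
    return times


first = [("C1", "C2", 4), ("C2", "C4", 8), ("C3", "C4", 8), ("C1", "C4", 12)]
second = [("C2", "C3", 8), ("C1", "C4", 12), ("C1", "C2", 14)]
print(earliest_infection_times(first, "C1", 2))   # C3 appears with time 8
print(earliest_infection_times(second, "C1", 2))  # C3 does not appear
```

A query "could Cb be infected by time y" then reduces to checking whether Cb appears in the result with a time no later than y.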

Design an algorithm that answers questions of this type: given a collection of trace data, the algorithm should decide whether a virus introduced at computer Ca at time x could have infected computer Cb by time y. The algorithm should run in time O(m ). Also, prove the correctness of your algorithm.

Explanation / Answer

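Algorithm Computer Infection

Input: The list of triplets (Ci, Cj, tk), sorted by time; each means that computers Ci and Cj communicated at time tk. Let the triplets be T[0...m-1]. The query supplies Ca, x, Cb, and y.

Output: YES if Cb could be infected by time y, NO otherwise. Along the way, P records each infected computer together with the first time it was infected. Store P as an array indexed by computer number (P[k] = NONE while Ck is uninfected) so that each "is in P" test takes O(1) time.

Set P[k] = NONE for k = 1 to n;          // O(n) initialization
P[a] = x;                                 // Ca is infected from time x
if (Ca == Cb) return YES;

for (i = 0 to m-1) {                      // single pass over the sorted triplets
    Triplet (Cu, Cv, t) = T[i];
    if (t < x) continue;                  // before the virus was inserted
    if (t > y) break;                     // past the deadline; later triplets cannot matter
    if (Cu is in P and Cv is not in P) {
        Add Cv to P with time t;
    } else if (Cv is in P and Cu is not in P) {
        Add Cu to P with time t;
    }
    // otherwise both or neither are infected: ignore this triplet
    if (Cb is in P) return YES;
}
return NO;

End of Computer Infection;

Ties: triplets that share a timestamp must be handled as a group, because a single pass in input order can miss an infection. For instance, if (C3, C4, 8) is listed before (C2, C4, 8), then C3 is examined before C4 has been marked. For each group of equal-time triplets, spread the infection along the group's edges (a breadth-first search starting from the endpoints already in P) before moving to the next timestamp. This costs time linear in the size of the group, so the total work stays linear.

Correctness (sketch): by induction over the timestamps in increasing order, a computer is in P after time t has been processed exactly when some sequence of communications with non-decreasing times, all between x and t, leads from Ca to it. Any such sequence ending at time t extends one that reaches an already infected computer, and the group step follows every edge available at time t.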

Run time: The loop examines each of the m triplets once. Provided P is stored as an array indexed by computer number, each "is present in P" test and each insertion takes O(1) time, so the loop does O(m) work in total; initializing P adds O(n), giving O(m + n) overall. If P were instead an unsorted list that had to be scanned, each test could cost O(n) and the total would grow to O(mn), so the choice of data structure matters for the bound.
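One way to realize a linear-time pass, including the case where several triples share a timestamp, is sketched below. It is an illustration under stated assumptions rather than the answer's own code: computers are represented by integers 1..n, the trace is already sorted by time, and the name `could_infect` is hypothetical. Triples with equal times are grouped and the infection is spread within each group by breadth-first search.

```python
from collections import defaultdict, deque
from itertools import groupby
from typing import List, Tuple

Triple = Tuple[int, int, int]  # (i, j, t): C_i and C_j communicated at time t


def could_infect(n: int, trace: List[Triple], a: int, x: int, b: int, y: int) -> bool:
    """Decide whether a virus inserted into C_a at time x can reach C_b by time y.

    Assumes `trace` is sorted by time and computers are numbered 1..n.
    """
    if a == b:
        return x <= y
    infected = [False] * (n + 1)  # O(1) membership test; index 0 is unused
    infected[a] = True

    for t, group in groupby(trace, key=lambda triple: triple[2]):
        if t < x:
            continue  # communication happened before the virus was inserted
        if t > y:
            break     # past the deadline; the trace is sorted, so stop here
        # Build a small graph from the communications at this exact time.
        adjacency = defaultdict(list)
        for i, j, _ in group:
            adjacency[i].append(j)
            adjacency[j].append(i)
        # Breadth-first search from every endpoint that is already infected.
        queue = deque(c for c in adjacency if infected[c])
        while queue:
            c = queue.popleft()
            for neighbour in adjacency[c]:
                if not infected[neighbour]:
                    infected[neighbour] = True
                    queue.append(neighbour)
        if infected[b]:
            return True
    return False


first = [(1, 2, 4), (2, 4, 8), (3, 4, 8), (1, 4, 12)]
second = [(2, 3, 8), (1, 4, 12), (1, 2, 14)]
print(could_infect(4, first, 1, 2, 3, 8))    # True
print(could_infect(4, second, 1, 2, 3, 20))  # False
```

Each triple is placed in exactly one group and each group is searched once, so the work is O(n) for the array plus O(m) for the pass. The per-group graph uses a hash map, so the O(m) part holds in expectation.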