Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.

USING C MPI Workpool Application In this assignment, you are asked to write and

ID: 3678597 • Letter: U

Question

USING C

MPI Workpool Application

In this assignment, you are asked to write and execute a sequential program for the Monte Carlo pi calculation. Then you area asked to write an MPI program for the Monte Carlo pi calculation that implements a load-balancing workpool pattern, testing on your computer and on the cluster.

Monte Carlo pi calculation

The Monte Carlo algorithm for computing pi is well known and given in many parallel programming texts. It is a so-called embarrassingly parallel application particularly amenable to parallel implementation and is used more for demonstration purposes than as a good way to compute pi. However, it can lead to more important Monte Carlo applications. A circle is formed within a square. The circle has unit radius and the square has sides 2 x 2. The ratio of the area of the circle to the area of the square is given by pi(1^2 )/(2 x 2) = pi/4, as shown in Figure 1. Points within the square are chosen randomly and a score is kept of how many points happen to lie within the circle. The fraction of points within the circle will be pi/4, given a sufficient number of randomly selected points.

Figure 1 - Monte Carlo pi calculation

Usually in programs, only one quadrant is used as shown in Figure 2 so that the randomly selected points generated, xr, yr, are each between 0 and 1, and are counted as in the circle, if

2 Figure - Using one quadrant of circle

Workpool

The workpool pattern is like a master-slave pattern but has a task queue that provides load balancing. Figure 3 shows a workpool with the task queue. Individual tasks are given to the slaves. When a slave finishes a task and returns the result, it is given another task from the task queue, until the task queue is empty. At that point, the master waits until all outstanding results are returned. The termination condition is the task queue empty and all result collected.

Figure 3 - Workpool with a task queue

Implementing a Workpool for the Monte Carlo pi calculation

In the particular workpool of the Monte Carlo pi calculation you are asked to implement, the master process begins by sending a different number (called a “seed”) to each of the slaves as shown in Figure 4. Each slave uses that number as the starting seed for its random number generator. The slaves generate two random numbers as the coordinates of a point (x, y). If the point is within the quadrant of circle (i.e. (x^2) + (y^2) <= 1), a counter is incremented to indicate the point is within the quadrant of circle. Each slave repeats its actions with for a total of N points in each slave. The slaves then return how many of their N points are within the quadrant of circle to the master. When a slave returns its result, the master then gives the slave a new starting seed and the slave repeats its actions. The master continues to give starting seeds to a maximum of S seeds and then stops send out seeds. At that time,NxS random points will have been checked. Finally the master uses the results from the slaves to compute the approximation for pi, given: (pi/4)=( Total number of points within quadrant of circle/NxS)

Figure 4 - A Workpool implementation of Monte Carlo pi calculation

USING C

Task 1

First write a sequential program in C to perform the Monte Carlo pi calculation (not implementing a workpool). Just one process performs the calculation and the total number of points is input at the keyboard. The process generates a pair of random numbers, determines whether the point is within the quadrant of circle. The process continues to generate random points, and counts how many points are within the quadrant of circle until all the points have been checked. It is acceptable to generate the random numbers each between 0 and 1 not including 1 if it easier to generate random numbers. The error will be minimal with a large number of points.

The program is to output both the value of pi and its error from the exact value, and also the execution time (done by instrumenting the code). For the exact value of pi, use #define PI 3.14159265358979. Execute the program on your computer (VM or a native installation). Test with at least 10,000 points and other values.

Task 2

Now write an MPI program to implement a workpool version for the Monte Carlo pi calculation where S and N can be input at the keyboard. It is critical that you implement the workpool as described earlier. You will lose many points if you do not do so. When a slave returns its result, the master then gives the Slaves Master inside Starting seed for seed random sequence to slave Return number of N random points inside arc of circle Aggregate answers 4 slave a new starting seed and the slave repeats its actions, until all S seeds have been sent. Try to make sure that the seeds being sent from the master to the slaves do not appear in the random number sequence number being generated by the slaves. This is very important to avoid interrelated sequences. This is not easy to guarantee, but develop a reasonable approach and justify.

Execute the program on your computer (VM or a native installation). Test with 4 processes, and at least N * S = 10,000 and other values.

Some clues on Task 2: There are S tasks (seeds). Let us first assume S is larger than the number of slaves. One way to do a workpool is for the master to send one task to each of the slaves in turn and then wait for slaves to return results. As each slave returns a result, the master sends that slave a new task, until all S tasks have been sent. Then when the master receives a slave result, it sends a terminator message that the slave interprets for it to stop. The terminator message might be a unique tag or zero data. The destination can receive messages with any value of tag using MPI_ANY_TAG as the tag in MPI_Rec() and find out what tag has been received by checking the value of status.MPI_TAG. So one might have code such as:

One can accommodate when the number of slaves is greater than the number of tasks (an unlikely
situation) by terminating those slaves not needed after send the S tasks to the first S slaves, but this is
not strictly needed here as you are asked only to test with four processes. It is simple enough to add.

USING C

Total area = 4 2 Area = 2

Explanation / Answer

#include #include #include #include #include "mpi.h" #include #define PI 3.14159265358979 MPI_Status status; int n, s, S, rank, num_processes; struct timeval tic, toc, seed_time; void slave_io () { char slave_name[256]; gethostname(slave_name, 256); printf("Slave Process #%d [%s] ", rank, slave_name); int computations = 0; while(1) { // While true... int seed; // Get the seed from the master (or kill msg) MPI_Recv(&seed, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &status); if(status.MPI_TAG == 0) { // Kill slave process if TAG == 0 break; } else if(status.MPI_TAG == 200) { // Compute number of points in circle if TAG == 200 // Compute number of circumscribed points srand(seed); int count = 0, i = 0; for(i = 0; i