Consider the System availability(A) of a server cluster in terms of three parame
Question
Consider the system availability (A) of a server cluster in terms of three parameters: the mean time to failure (MTTF), the mean time to repair (MTTR), and a regular maintenance time (RMT). The MTTF reflects the average uptime between two adjacent natural failures. The MTTR is the downtime due to a natural failure. The RMT refers to scheduled downtime for hardware/software maintenance or updates.
a) Consider a cloud system with a required availability of A = 98%. If the MTTF is known to be two years (or 365 x 24 x 2 = 17,520 hours) and the MTTR is known to be 24 hours, what is the value of RMT, in hours per month, that you can schedule for this cloud system?
b) Consider a cloud cluster of three servers. The cluster is considered available (or acceptable with a satisfactory performance level) if at least k servers are operating normally, where k < 3. Assume that each server has an availability rate of p (or a failure rate of 1 – p). Derive a formula to calculate the total cluster availability A (i.e., the probability that the cluster is available satisfactorily). Note that A is a function of k and p.
c) Given that each server has an availability p = 0.98, what is the largest minimum number of servers that must be available to achieve a total cluster availability A that is higher than 96%? You have to check the effect of all possible values of k in part (b) in order to answer this question correctly.
Explanation / Answer
The foundation of fast-start recovery is the fast-start checkpointing architecture. Instead of conventional event-driven (that is, log-switch) checkpointing, which does bulk writes, fast-start checkpointing occurs incrementally. Each DBWn process periodically writes buffers to disk to advance the checkpoint position. The oldest modified blocks are written first to ensure that every write lets the checkpoint advance. Fast-start checkpointing eliminates bulk writes and the resultant I/O spikes that occur with conventional checkpointing.
With fast-start fault recovery, the Oracle database is opened for access by applications without having to wait for the undo, or rollback, phase to be completed. The rollback of data locked by uncommitted transactions is done dynamically on an as-needed basis. If a user process encounters a row locked by a crashed transaction, it rolls back just that row. The impact of rolling back the rows requested by a query is negligible.
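The idea that writing the oldest modified block first always advances the checkpoint can be illustrated with a toy model. The sketch below is not Oracle's implementation: the `BufferCache` class, its method names, and the notion of a "redo backlog" counted in log records are all assumptions made for illustration.

```python
from collections import OrderedDict


class BufferCache:
    """Toy buffer cache (hypothetical): dirty blocks kept in first-modified order."""

    def __init__(self):
        # block_id -> log position of the FIRST change not yet written to disk
        self.dirty = OrderedDict()
        self.log_pos = 0  # position of the most recent log record

    def modify(self, block_id):
        """Record a change; a block keeps the position of its oldest unwritten change."""
        self.log_pos += 1
        self.dirty.setdefault(block_id, self.log_pos)

    def redo_backlog(self):
        """Number of log records crash recovery would have to replay right now."""
        if not self.dirty:
            return 0
        oldest_first_change = next(iter(self.dirty.values()))
        return self.log_pos - oldest_first_change + 1

    def write_oldest(self, n=1):
        """Incremental checkpoint step: flush the n oldest dirty blocks."""
        for _ in range(min(n, len(self.dirty))):
            self.dirty.popitem(last=False)  # oldest entry first


cache = BufferCache()
for block in ["a", "b", "c", "a"]:
    cache.modify(block)

backlog_before = cache.redo_backlog()
cache.write_oldest(1)  # a single small write already shrinks the backlog
backlog_after = cache.redo_backlog()
```

In this model every call to `write_oldest` removes the block that pins the checkpoint position, so the backlog shrinks with each small write instead of only at a bulk checkpoint.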
Fast-start fault recovery is very fast, because undo data is stored in the database, not in the log files. Undoing a block does not require an expensive sequential scan of a log file. It is simply a matter of locating the right version of the data block within the database.
Fast-start recovery can greatly reduce mean time to recover (MTTR) with minimal effects on online application performance. Oracle continuously estimates the recovery time and automatically adjusts the checkpointing rate to meet the target recovery time.
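For the availability question itself, the three parts can be checked numerically. The sketch below rests on assumptions that the question does not state outright: for part (a) it models one failure cycle as A = MTTF / (MTTF + MTTR + RMT_total) and uses a 730-hour month (365 x 24 / 12); for part (b) it assumes the servers fail independently, which gives a binomial sum. The function names are illustrative.

```python
from math import comb

HOURS_PER_MONTH = 365 * 24 / 12  # 730 hours, assuming a 365-day year


def rmt_hours_per_month(a, mttf, mttr):
    """Part (a): monthly maintenance budget under the assumed model
    A = MTTF / (MTTF + MTTR + RMT_total) over one failure cycle."""
    cycle_hours = mttf / a                 # uptime plus all downtime in one cycle
    rmt_total = cycle_hours - mttf - mttr  # scheduled downtime in that cycle
    cycle_months = cycle_hours / HOURS_PER_MONTH
    return rmt_total / cycle_months


def cluster_availability(k, p, n=3):
    """Part (b): probability that at least k of n independent servers are up."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def largest_k(p, target, n=3):
    """Part (c): largest k whose cluster availability still exceeds the target."""
    candidates = [k for k in range(1, n + 1) if cluster_availability(k, p, n) > target]
    return max(candidates) if candidates else None


rmt = rmt_hours_per_month(0.98, 365 * 24 * 2, 24)
by_k = {k: cluster_availability(k, 0.98) for k in (1, 2, 3)}
best_k = largest_k(0.98, 0.96)
```

Under these assumptions the maintenance budget comes to roughly 13.6 hours per month. Written out, the formula for part (b) is A(k, p) = sum over i from k to 3 of C(3, i) p^i (1 – p)^(3 – i). With p = 0.98 this gives about 0.99999 for k = 1, 0.9988 for k = 2, and 0.9412 for k = 3 (shown only for comparison), so k = 2 is the largest value that keeps A above 96%.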