4.3 Scheduling · Foundations of Algorithms Using C++ Pseudocod

## 4.3 Scheduling Suppose a hair stylist has several customers waiting for different treatments (e.g., simple cut, cut with shampoo, permanent, hair coloring). The treatments don't all take the same amount of time, but the stylist knows how long each takes. A resonable goal would be to schedule the customers in such a way as to minimize the total time they spend both waiting and being served (getting treated). Such a schedule is considered optimal. The time spent both waiting and being served is called the ***time in the system.*** The problem of minimizing the total time in the system has many applications. For example, we may want to schedule users' access to a disk drive to minimize the total time they spend waiting and being served. Another scheduling problem occurs when each job (customer) takes the same amount of time to complete but has a deadline by which it must start to yield a profit associated with the job. The goal is to schedule the jobs to maximize the total profit. We will consider ***scheduling with deadlines*** after discussing the simpler scheduling problem of minimizing the total time in the system. ### 4.3.1 Minimizing Total Time in the System A simple solution to minimizing the total time in the system is to consider all possible schedules and take the minimum. This is illustrated in the following example. Example 4.2**![Start example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)**Suppose there are three jobs and the service times for these jobs are ![](https://box.kancloud.cn/aafea2d7004cefe304f9c7e8a190238e_312x20.jpg) The actual time units are not relevant to the problem. If we schedule them in the order 1, 2, 3, the times spent in the system for the three jobs are as follows: Job Time in the System 1 5 (service time) 2 5 (wait for job 1) + 10 (service time) 3 5 (wait for job 1) + 10 (wait for job 2) + 4 (service time) The total time in the system for this schedule is ![](https://box.kancloud.cn/f771d1f436996d1530303f69403e1686_377x68.jpg) This same method of computation yields the following list of all possible schedules and total times in the system: *Schedule* *Total Time in the System* \[1, 2, 3\] 5+(5+10)+(5+10+4) = 39 \[1, 3, 2\] 5+(5+4)+(5+4+10) = 33 \[2, 1, 3\] 10+(10+5)+(10+5+4) = 44 \[2, 3, 1\] 10+(10+4)+(10+4+5) = 43 \[3, 1, 2\] 4+(4+5)+(4+5+10) = 32 \[3, 2, 1\] 4+(4+10)+(4+10+5) = 37 Schedule \[3, 1, 2\] is optimal with a total time of 32. **![End example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)** Clearly, an algorithm that considers all possible schedules is factorial-time. Notice in the previous example that an optimal schedule occurs when the job with the smallest service time (job 3, with a service time of 4) is scheduled first, followed by the job with the second-smallest service time (job 1, with a service time of 5), followed finally by the job with the largest service time (job 2, with a service time of 10). Intuitively, it seems that such a schedule would be optimal because it gets the shortest jobs out of the way first. A high-level greedy algorithm for this apporach is as follows: ``` sort the jobs by service time in nondecreasing order; while ( the instance is not solved){ schedule the next job; // selection procedure and // feasibility check if ( there are no more jobs) // solution check the instance is solved; } ``` We wrote this algorithm in the general form of the greedy approach to show that it is indeed a greedy algorithm. However, clearly all the algorithm does is sort the jobs according to service time. Its time complexity is therefore given by ![](https://box.kancloud.cn/411a66bb18e13f4fe798bf531573d6a8_153x22.jpg) Although intuitively it seems that the schedule created by this algorithm is optimal, this supposition needs to be proved. The following theorem proves the stronger result that this schedule is the only optimal one. Theorem 4.3**![Start example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)**The only schedule that minimizes the total time in the system is one that schedules jobs in nondecreasing order by service time. Proof: For 1 ≤ *i ≤ n* − 1, let *t*i be the service time for the *i*th job scheduled in some particular optimal schedule (one that minimizes the total time in the system). We need to show that the schedule has the jobs scheduled in nondecreasing order by service time. We show this using proof by contradiction. If they are not scheduled in nondecreasing order, then for at least one *i* where 1 ≤ *i ≤ n* − 1, ![](https://box.kancloud.cn/c1fae24da21ff609ed4a3d29da9a7efb_77x20.jpg) We can rearrange our original schedule by interchanging the *i*th and (*i*+1)st jobs. By doing this, we have taken *ti* units off the time the (*i*+1)st job (in the original schedule) spends in the system. The reason is that it no longer waits while the *i*th job (in the original schedule) is being served. Similarly, we have added *ti*+1 units to the time the *i*th job (in the orignal schedule) spends in the system. Clearly, we have not changed the time that any other job spends in the system. Therefore, if *T* is the total time in the system in our original schedule and *T'* is the total time in the rearranged schedule, ![](https://box.kancloud.cn/5ba8780e7ffa2b6621907077cb3b40a8_151x23.jpg) Because *ti > ti*+1, ![](https://box.kancloud.cn/0dc484f93af618c2a9049883fb8550f7_63x23.jpg) which contradicts the optimality of our original schedule. **![End example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)** It is straightforward to generalize our algorithm to handle the Multiple-Server Scheduling problem. Suppose there are *m* servers. Order those servers in an arbitrary manner. Order the jobs again by service time in nondecreasing order. Let the first server serve the first job, the second server the second job,…, and the *m*th server the *m*th job. The first server will finish first because that server serves the job with the shortest service time. Therefore, the first server serves the (*m* + 1)st job. Similarly, the second server serves the (*m*+2)nd job, and so on. The scheme is as follows: ![](https://box.kancloud.cn/b1e8fb740dbf5c0f5e308fa04bcebe96_400x144.jpg) Clearly, the jobs end up being processed in the following order: ![](https://box.kancloud.cn/d0590b979af330b073bb72509ea14d93_360x20.jpg) That is, the jobs are processed in nondecreasing order by service time. ### 4.3.2 Scheduling with Deadlines In this scheduling problem, each job takes one unit of time to finish and has a deadline and a profit. If the job starts before or at its deadline, the profit is obtained. The goal is to schedule the jobs so as to maximize the total profit. Not all jobs have to be scheduled. We need not consider any schedule that has a job scheduled after its deadline because that schedule has the same profit as one that doesn't schedule the job at all. We call such a schedule ***impossible.*** The following example illustrates this problem. Example 4.3**![Start example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)**Suppose we have the following jobs, deadlines, and profits: *Job* *Deadline* *Profit* 1 2 30 2 1 35 3 2 25 4 1 40 When we say that job 1 has a deadline of 2, we mean that job 1 can start at time 1 or time 2. There is no time 0. Because job 2 has a deadline of 1, that job can start only at time 1. The possible schedules and total profits are as follows: *Schedule* \[1, 3\] \[2, 1\] \[2, 3\] \[3, 1\] \[4, 1\] \[4, 3\] *Total Profit* 30+25 = 55 35+30 = 65 35+25 = 60 25+30 = 55 40+30 = 70 40+25 = 65 Impossible schedules have not been listed. For example, schedule \[1,2\] is not possible, and is therefore not listed, because job I would start first at time 1 and take one unit of time to finish, causing job 2 to start at time 2. However, the deadline for job 2 is time 1. Schedule \[1,3\], for example, is possible because job 1 is started before its deadline, and job 3 is started at its deadline. We see that schedule \[4,1\] is optimal with a total profit of 70. **![End example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)** To consider all schedules, as is done in [Example 4.3](#ch04ex11), takes factorial time. Notice in the example that the job with the greatest profit (job 4) is included in the optimal schedule, but the job with the second-greatest profit (job 2) is not. Because both jobs have deadlines equal to 1, both cannot be scheduled. Of course, the one with the greatest profit is the one scheduled. The other job scheduled is job 1, because its profit is greater than that of job 3. This suggests that a reasonable greedy approach to solving the problem would be to first sort the jobs in nonincreasing order by profit, and next inspect each job in sequence and add it to the schedule if it is possible. Before we can create even a high-level algorithm for this approach, we need some definitions. A sequence is called a ***feasible sequence*** if all the jobs in the sequence start by their deadlines. For example, \[4,1\] is a feasible sequence in [Example 4.3](#ch04ex11), but \[1,4\] is not a feasible sequence. A set of jobs is called a ***feasible set*** if there exists at least one feasible sequence for the jobs in the set. In [Example 4.3](#ch04ex11), {1,4} is a feasible set because the scheduling sequence \[4,1\] is feasible, whereas {2,4} is not a feasible set because no scheduling sequence allows both jobs to start by their deadlines. Our goal is to find a feasible sequence with maximum total profit. We call such a sequence an ***optimal sequence*** and the set of jobs in the sequence an ***optimal set of jobs.*** We can now present high-level greedy algorithm for the Scheduling with Deadlines problem. ``` sort the jobs in nonincreasing order by profit; S = Ø while (the instance is not solved){ select next job; // selection procedure if (S is feasible with this job added) // feasibility check add this job to S; if (there are no more jobs) // solution check the instance is solved; } ``` The following example illustrates this algorithm. Example 4.4**![Start example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)**Suppose we have the following jobs, deadlines, and profits: *Job* *Deadline* *Profit* 1 3 40 2 1 35 3 1 30 4 3 25 5 1 20 6 3 15 7 2 10 We have already sorted the jobs before labeling them. The previous greedy algorithm does the following: 1. *S* is set to Ø. 2. *S* is set to {1} because the sequence \[1\] is feasible. 3. *S* is set to {1,2} because the sequence \[2,1\] is feasible. 4. {1,2, 3} is rejected because there is no feasible sequence for this set. 5. *S* is set to {1,2,4} because the sequence \[2,1,4\] is feasible. 6. {1,2,4,5}is rejected because there is no feasible sequence for this set. 7. {1,2,4,6}is rejected because there is no feasible sequence for this set. 8. {1,2,4,7}is rejected because there is no feasible sequence for this set. The final value of *S* is {1,2,4}, and a feasible sequence for this set is \[2,1,4\]. Because jobs 1 and 4 both have deadlines of 3, we could use the feasible sequence \[2,4,1\] instead. **![End example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)** Before proving that this algorithm always produces an optimal sequence, let's write a formal version of it. To do this, we need an efficient way to determine whether a set is feasible. To consider all possible sequences is not acceptable because it would take factorial time to do this. The following lemma enables us to check efficiently whether or not a set is feasible. Lemma 4.3**![Start example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)**Let *S* be a set of jobs. Then *S* is feasible if and only if the sequence obtained by ordering the jobs in *S* according to nondecreasing deadlines is feasible. Proof: Suppose *S* is feasible. Then there exists at least one feasible sequence for the jobs in *S*. In this sequence, suppose that job *x* is scheduled before job *y*, and job *y* has a smaller (earlier) deadline than job *x*. If we interchange these two jobs in the sequence, job *y* will still start by its deadline because it will have started even earlier. And, because the deadline for job *x* is larger than the deadline for job *y* and the new time slot given to job *x* was adequate for job *y*, job *x* will also start by its deadline. Therefore, the new sequence will still be feasible. We can prove that the ordered sequence is feasible by repeatedly using this fact while we do an Exchange Sort ([Algorithm 1.3](LiB0008.html#32)) on the original feasible sequence. In the other direction, of course, *S* is feasible if the ordered sequence is feasible. **![End example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)** Example 4.5**![Start example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)**Suppose we have the jobs in [Example 4.4](#ch04ex12). To determine whether {1,2,4,7) is feasible, [Lemma 4.3](#ch04ex13) says we need only check the feasibility of the sequence ![](https://box.kancloud.cn/ea5573a931e0dc34e5c9ab375d0d301c_102x68.jpg) The deadline of each job has been listed under the job. Because job 4 is not scheduled by its deadline, the sequence is not feasible. By [Lemma 4.3](#ch04ex13), the set is not feasible. **![End example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)** The algorithm follows. It is assumed that the jobs have already been sorted by profit in nonincreasing order, before being passed to the algorithm. Because the profits are needed only to sort the jobs, they are not listed as parameters of the algorithm. Algorithm 4.4: Scheduling with Deadlines**![Start example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)**Problem: Determine the schedule with maximum total profit given that each job has a profit that will be obtained only if the job is scheduled by its deadline. Inputs: *n*, the number of jobs, and array of integers *deadline*, indexed from 1 to *n*, where *deadline\[i\]* is the deadline for the *i*th job. The array has been sorted in nonincreasing order according to the profits associated with the jobs. Outputs: an optimal sequence *J* for the jobs. ``` void schedule (int n, const int deadline [], sequence_of_integer& j) { index i; sequence_of_integer K; J = [1]; for (i = 2; i <= n; i++){ K = J with i added according to nondecreasing values of deadline [i]; if (K is feasible) J = K; } } ``` **![End example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)** Before analyzing this algorithm, let's apply it. Example 4.6**![Start example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)**Suppose we have the jobs in [Example 4.4](#ch04ex12). Recall that they had the following deadlines: *Job* *Deadline* 1 3 2 1 3 1 4 3 5 1 6 3 7 2 [Algorithm 4.4](#ch04ex15) does the following: 1. *J* is set to \[1\]. 2. *K* is set to \[2, 1\] and is determined to be feasible. *J* is set to \[2, 1\] because *K* is feasible. 3. *K* is set to \[2, 3, 1\] and is rejected because it is not feasible. 4. *K* is set to \[2, 1, 4\] and is determined to be feasible. *J* is set to \[2, 1, 4\] because *K* is feasible. 5. *K* is set to \[2, 5, 1, 4\] and is rejected because it is not feasible. 6. *K* is set to \[2, 1, 6, 4\] and is rejected because it is not feasible. 7. *K* is set to \[2, 7, 1, 4\] and is rejected because it is not feasible. The final value of *J* is \[2, 1, 4\]. **![End example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)** **![Start Sidebar](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)**Analysis of Algorithm 4.4 Worst-Case Time Complexity (Scheduling with Deadlines)Basic operation: We need to do comparisons to sort the jobs, we need to do more comparisons when we set *K* equal to *J* with job *i* added, and we need to do comparisons to check if *K* is feasible. Therefore, a comparison instruction is the basic operation. Input size: *n*, the number of jobs. It takes a time of Θ (*n* lg *n*) to sort the jobs before passing them to the procedure. In each iteration of the **for**-*i* loop, we need to do at most *i* - 1 comparisons to add the *i*th job to *K*, and at most *i* comparisons to check if *K* is feasible. Therefore, the worst case is ![](https://box.kancloud.cn/ac0185f0b4c092ae7c3bac7a585b172b_275x53.jpg) The first equality is obtained in Example A.1 in Appendix. Because this time dominates the sorting time, ![](https://box.kancloud.cn/1c3eede4e37d10b65f9abd13b6ff25ad_130x25.jpg) **![End Sidebar](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)** Finally, we prove that the algorithm always gives an optimal solution. Theorem 4.4**![Start example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)**![](https://box.kancloud.cn/485bc67a7beadb016ac9d44f235f6e03_27x27.jpg) [Algorithm 4.4](#ch04ex15) always produces an optimal set of jobs. Proof: The proof is by induction on the number of jobs *n.* Induction base: Clearly, the theorem holds if there is one job. Induction hypothesis: Suppose the set of jobs that our algorithm obtains from the first *n* jobs is optimal for the first *n* jobs. Induction step: We need to show that the set of jobs our algorithm obtains from the first *n* + 1 jobs. To that end, let *A* be the set of jobs that our algorithm obtains from the first *n* + 1 jobs and let *B* be an optimal set of jobs obtained from the first *n* + 1 jobs. Furthermore, let job(*k*) be the *k*th job in our sorted list of jobs. There are two cases: **Case 1.** *B* does not include job (*n* + 1). In this case *B* is a set of jobs obtained from the first *n* jobs. However, by the induction hypothesis, *A* includes an optimal set of jobs obtained from the first *n* jobs. Therefore, the total profit of the jobs in *B* can be no greater than the total profit of the jobs in *A*, and *A* must be optimal. **Case 2.** *B* includes job (*n* + 1). Suppose *A* includes job (*n* + 1). Then ![](https://box.kancloud.cn/654406f262b292a41ef5c59427a3a817_400x20.jpg) Where *B*′ is a set obtained from the first *n* jobs and *A*′ is the set our algorithm obtains from the first *n* jobs. By the induction hypothesis, *A*′ is optimal for the first *n* jobs. Therefore, ![](https://box.kancloud.cn/57b06e24a38bc8a780734cfd387082d0_400x45.jpg) where *profit (n*+1) is the profit of job (*n*+1) and *profit (A)* is the total profit of the jobs in *A*. Since *B* is optimal for the first *n*+1 jobs, we can conclude that *A* is also. Suppose *A* does not include job (*n*+1). Consider the time slot occupied by job (*n*+1) in *B*. If that slot was available when our algorithm considered job (*n*+1), it would schedule that job. Therefore, that time slot must be given to some job, say job (*i*1), in *A.* If job (*i*1) is in *B*, then whatever slot it occupies in *B* must be occupied by some job, say job (*i*2), in *A* because otherwise our algorithm could have put job (*i*1) in that slot and job (*n*+1) in job (*i*1)'s slot. Clearly, job (*i*2) is not equal to job (*i*1) or job (*n*+1). If job (*i*2) is in *B*, then whatever time slot it occupies in *B* must be occupied by some job, say job (*i*3), in *A* because otherwise our algorithm could have put job (*i*2) in that slot, job (*i*1) in job (*i*2)'s slot, and job (*n*+1) in job (*i*1)'s slot. Clearly, job (*i*3) is not equal to job (*i*2), job (*i*1), or job (*n*+1). We can repeat this argument indefinitely. Since the schedules are finite, we must therefore eventually arrive at a job, say job (*ik*), which is in *A* and not in *B.* Otherwise, *A* ⊆ *B*, which means our algorithm could have scheduled job (*n*+1) with the jobs in *A.* We can modify *B* by placing job (*i*1) in job (*n*+1)'s slot, job (*i*2) in job (*i*1)'s slot, …, and job (*ik*) in job (*ik*-1)'s slot. We have effectively replaced job (*n*+1) by job (*ik*) in *B.* Because the jobs are sorted in nondecreasing order, job (*ik*)'s profit is at least as great as job (*n*+1)'s profit. So we have a set of jobs that has a total profit at least as great as the total profit of the jobs in *B.* However, this set is obtained from the first *n* jobs. Therefore, by the induction hypothesis, its total profit can be no greater than the total profit of the jobs in *A*, which means *A* is optimal. **![End example](https://box.kancloud.cn/e95df9d604ab2d89febe370ae4d88fb1_1x1.gif)** Using Disjoint Set Data Structure III, which is presented in [Appendix C](LiB0108.html#1464), it is possible to create a Θ (*n* lg *m*) version of procedure *schedule* (in [Algorithm 4.4](#ch04ex15)), where *m* is the minimum of *n* and the largest of the deadlines for the *n* jobs. Because the time to sort is still in Θ (*n*lg*n*), the entire algorithm is Θ (*n* lg *m*). This modification will be discussed in the exercises.