Integer programming model extensions for a multi-stage nurse rostering problem

In the variant of the well studied nurse rostering problem proposed in the Second International Nurse Rostering Competition, multiple stages have to be solved sequentially which are dependent on each other. We propose an integer programming model for this problem and show that a set of newly developed extensions in the form of additional constraints to deal with the incomplete information can significantly improve the quality of the generated solutions. We compare our solution approaches with the results obtained in the competition and show that the extended model achieves results competitive with the competition finalists.


Introduction
The automated generation of high quality staff schedules, in particular for hospitals, has been an important problem for over 40 years. Multiple variants and solution approaches exist [to be found e.g. in surveys by Ernst et al. (2004) and den Bergh et al. (2013)]. A survey focused on rostering problems in hospitals specifically was published by Burke et al. (2004). Ceschia et al. (2015) proposed a variant of the Nurse Rostering Problem for the Second International Nurse Rostering Competition (INRC-II). In contrast to previous problem variants, a multi-stage formulation is used, where solutions for individual weeks have to be produced by the solver sequentially, without information about the requirements of later weeks. Salassa and Vanden Berghe (2012)  This multi-stage setting poses two unique challenges for solvers: The dependencies between weeks make it necessary to take the solutions of previous weeks into account during the evaluation of the quality of a schedule. Publications treating this issue are by Glass and Knight (2010), Salassa and Vanden Berghe (2012) and Smet et al. (2016), who developed and formalized various strategies to consistently include the results of previous stages in the evaluation of the current schedule. Their findings have largely been included in the rules for constraint evaluation of the INRC-II.
Further, and not explored in the previously mentioned papers, is the fact that due to the incomplete information during all weeks but the last, the generated solution can no longer be guaranteed to be optimal even if each week is solved to optimality. More so, a naive model that is not adapted to this setting will produce imbalanced schedules that incur large penalties in later weeks as options are restricted excessively by the solutions of the previous weeks. It is therefore necessary to allow for the existence of future scheduling periods already during the solution of the current stage. While e.g. Salassa and Vanden Berghe (2012) successfully improve their results by regarding the solutions of past stages, they do not include any special mechanisms to account for the upcoming future stages. To the best of our knowledge, the same holds for the other works dealing with such multi-stage settings.
There have been 15 submissions to the INRC-II, with seven of these advancing to the final round. The results for all participants are available on the competition website, 1 while information about some solution approaches (Dang et al. 2016;Kheiri et al. 2016;Jin et al. 2016;Römer and Mellouli 2016) has been published as extended abstracts in the proceedings of PATAT2016. In particular, the winners of the competition used integer programming (IP) with a network-flow-based formulation (Römer and Mellouli 2016). One other publication dealing with this problem is by Santos et al. (2015), who used a weighted constraint satisfaction approach.
Several IP formulations have been previously used for nurse rostering problems, including the problem proposed in the First International Nurse Rostering Competition (INRC2010) (Haspeslagh et al. 2014). For this (single-stage) problem, Santos et al. (2016) provided an IP formulation and proposed techniques to improve the performance of IP solvers based on this formulation, by providing good dual and primal bounds. Valouxis et al. (2012) proposed a two phase approach for the INRC2010 problem. Integer programming formulations were proposed in both phases to assign nurses to working days and to shift types. A branch and price algorithm has been proposed for solving INRC2010 instances in Burke and Curtois (2014). In this competition, also several heuristic and hybrid algorithms have been applied to the instances. We refer the reader to Haspeslagh et al. (2014) for a comparison of different approaches. IP has also been used for other nurse rostering problems [e.g. Burke et al. (2010), Brucker et al. (2011. In this article, we investigate solving a new nurse rostering problem (proposed in the INRC-II) by Integer Programming. As in this problem a multi-stage formulation is used, the application of previous IP approaches is not sufficient to obtain good solutions that take into consideration future scheduling periods. Therefore, novel formulations are needed to cope with this new problem. We first propose a basic IP formulation (Sect. 3) for the INRC-II problem (as defined in Sect. 2). Our main contribution is the extension of this model with new additional constraints to account for the multi-stage setting (see Sect. 4). To the best of our knowledge this extension presents an original contribution to the literature.
We evaluate our formulations in Sect. 5, using the instances provided for the INRC-II and show that the additional constraints significantly improve the quality of the generated solutions. We also compare our model with the results of the finalists in the INRC-II. In this comparison, the results of the extended (IP) model are competitive (slightly better than the median).
This paper is an extension of work previously published in the proceedings of the 2016 conference on the Practice and Theory of Automated Timetabling Mischek and Musliu (2016), which was further based on the first author's master's thesis [see Mischek (2016)].

Problem definition
In this section, we give a short overview of the problem used in the INRC-II. A detailed description of the problem structure and all constraints can be found in the competition rules (Ceschia et al. 2015).
Instances are of either 4 or 8 weeks duration. For each week, a schedule has to be found by the solver, using only the information provided in a global scenario file, containing information about the nurses and their contracts, week data about the requirements of the current week and a history with data concerning the last assignments of the previous weeks and some global counters. Information about the following weeks, in particular about the covering requirements, is not available until the solution for the current week has been fixed by the solver.
In the following, a work stretch denotes a period of consecutive working days for a nurse. Rest stretch and shift stretch are analogously defined for periods of consecutive days off and assignments to the same shift, respectively.
There are four hard constraints that have to be fulfilled by any solution to be regarded as feasible: H1. Single assignment per day: Each nurse can only work a single shift using a single skill per day. H2. Under-staffing: The minimum number of nurses required for each shift and skill must be present. H3. Shift type successions: Nurses must not have shifts on two consecutive days that form a forbidden sequence. H4. Missing required skill: Nurses can only cover assignments for which they have the required skill.
Further, seven soft constraints are defined. Solutions should try to satisfy these constraints, but violating them only results in a penalty to the quality of the solution (weights are listed in the description of each constraint).

S1. Insufficient staffing for optimal coverage (30):
The number of nurses assigned to each shift and skill should not be smaller than the optimum staffing. The penalty is multiplied by the number of missing nurses. S2. Consecutive assignments (15/30): The length of each shift stretch (weight 15) and work stretch (weight 30) should be within the bounds defined for the shift type resp. the contract of the involved nurse. The penalty is multiplied by the number of missing or surplus assignments. S3. Consecutive days off (30): As before, the length of each rest stretch should be within the bounds defined in each nurse's contract. The penalty is multiplied by the number of missing or surplus days off. S4. Preferences (10): The requests of nurses for shifts (or days) off should be respected.

S5. Complete week-end (30):
Nurses with the complete-weekend constraint in their contract should either work both days of the weekend or none. S6. Total assignments (20): Over the whole planning horizon, each nurse's assignments should be within the bounds defined in their contract. S7. Total working week-ends (30): Over the whole planning horizon, each nurse should not work more than the maximum number of weekends defined in their contract. The complete information necessary to evaluate constraints S6 and S7 is available only after the solution for the last week has been fixed, although they should of course be respected by solvers during all weeks. All sequence constraints (H3, S2, S3) also use the border data from the solution for the previous week (this is provided in the history file).

Parameters
The first set of parameters contains values that stay the same over the whole planning horizon. These values are stored in the scenario file:

Decision variables
x d nsk = 1 if nurse n is assigned to shift s using skill k on day d, and 0 otherwise. The W n variable indicates that nurse n works at least one day of the weekend. The violation of soft constraints is measured using either non-negative or boolean surplus variables:

Objective function
The objective function is the weighted sum over all violations of each soft constraint:

Constraints
The following (in)equalities model the hard constraints, as described above.
For constraint H3, any forbidden shift sequence (u s 1 s 2 = 0) must not be assigned to the same nurse on consecutive days. This must be ensured both within the week (a) and at the boundary of this week with the previous one (i.e. on the first day of the week, b).
The remaining inequalities deal with the soft constraints. Each inequality can be deactivated by setting the appropriate surplus variable to a value greater than zero, which results in a corresponding penalty in the objective function.
S2 actually contains various different constraints that have to be modeled separately: consecutive assignments of the same shift (min (a)/ max (b)) and of work in general (min (c) / max (d)), both during and at the start of the week. For the minimum consecutive shifts constraints, all patterns that compose a sequence shorter than the required length are prevented. For example, if the minimum number of consecutive night shifts (N) is 4, the patterns {xNx, xNNx, xNNNx}, where x is any other shift or a day off, should not appear.
Since each pattern incurs a penalty proportional to the number of missing assignments, (in the example, xNx would incur a penalty of 45, while xNNNx would incur a penalty of 15) the surplus variables are weighted correspondingly, to ensure that a value of at least the number of missing assignments is necessary to deactivate the constraint.
Equations 8 and 9 model the case where a stretch starts at the beginning of the week or towards the end of the previous week.
The maximum consecutive shifts constraints is modeled like this: For each shift s with a maximum of σ + s consecutive assignments, each block of σ + s + 1 days must contain at least one day where s is not assigned. Note that contrary to the situation for S2a, violations of this constraint by more than one shift assignment result in multiple matches of the pattern and therefore it suffices to use boolean surplus variables.
As before, equations 10 model the case where a shift block started in the previous week.
The inequalities modelling the maximum and minimum length of work stretches (S2c, S2d) function analogously to those for shift stretches. The only difference is that an assignment to any shift counts towards the length of the work stretch. S2d S3 similarily contains two independent constraints: the minimum (a) and maximum (b) number of consecutive days off, again both during and at the start of the week. The equations modelling these constraints are again analoguous to those from constraints S2c and S2d, except that days of work and days off were swapped.
To model nurse requests for shifts or days off, any assignment to an unwanted shift incurs the penalty.
For the complete weekends constraint, first the additional helper variables W n are set if the nurse n works either of the days on the weekend. Equations 24 then ensure that if W n is set, and the complete weekend constraint is present for the nurse, both days of the weekend should have work assigned.
The constraint S6 (number of total assignments) is modeled slightly differently from the description given by Ceschia et al. (2015). Originally, these constraints were evaluated only after the schedules of all weeks were fixed. In our model, the penalties are calculated immediately and added to the objective function value of the week in which they arise. This does not change the overall quality of the whole schedule, so results are still comparable, although the intermediate quality value of the individual weeks might be different.
The equations for constraint S7 (maximum number of weekends worked) use the variable W n , set in equations 23.

Model extensions
While the basic model described in Sect. 3 yields feasible solutions that are optimal for each week (if given enough time), the connections between weeks are mostly ignored. Because the weeks are solved individually, solutions are favored that give slightly better results in earlier weeks, at the cost of having potentially much larger penalties in later weeks.
In order to take this into account and improve the overall solution quality, we propose the following extensions to the model, in the form of additional (soft) constraints.

Overstaffing
Looking at the total number of shifts for each nurse, one can see that nearly all nurses exceed their maximum number of assignments (compare Fig. 1), except for nurses with a full time contract. Even these nurses mostly have all their available shifts assigned. This is despite the fact that the total number of assignments is much larger than the amount needed to cover all shifts at the optimal level (1180 assigned versus 1029 needed for optimal staffing levels in the example instance).
However, since there is no penalty on overstaffing, there is no pressure to avoid unnecessary assignments. Indeed, in some cases it can seem advantageous to assign shifts above the optimal staffing levels in order to fulfil sequence constraints or the complete weekend constraints (S7).
However, as soon as the available assignments are used up, high penalties are unavoidable as other constraints (in particular cover constraints and sequence constraints) still have to be fulfilled.
This can also be seen from Fig. 2: In earlier weeks, far more shifts are assigned to nurses than necessary, while in later weeks, constraints S6 force solutions to be closer to the required staffing levels.
To avoid this situation, we introduced a new constraint penalizing overstaffing: S8*. Overstaffing: The number of nurses assigned to each shift and skill per day should not exceed the optimal coverage levels.
This constraint can be added to the basic IP model with the following inequalities: where C S8 * skd is a new surplus variable.

Average assignments
While the overstaffing constraints already reduce the number of excess assignments over the maximum, they do not differentiate between nurses with different contracts. As a result, nurses with part-time or half-time contracts have the same schedules as those with full-time contracts in the earlier weeks. Consequently, they are not available in later weeks without penalties, as their contracts are already maxed out. Ideally, nurses should be employed according to their contracts during all weeks, with full-time nurses having more assignments per week than other nurses. In part this is already done implicitly, because nurses with shorter contracts usually also have shorter work stretch lengths and longer rest stretch lengths.
To ensure that each nurse will be available until the last stage, their assignments should be distributed evenly across all stages.

S6*. Average assignments:
The total number of assignments up to the current week must be within the bounds defined in the contract, multiplied by the fraction of weeks that have already passed.
This extension generalizes constraints S6 to earlier weeks.
To give an example, if a nurse has a minimum of 10 assignments and a maximum of 22, then after stage 4 (of 8), they should have between 5 and 11 shifts assigned. Assuming that they already had 7 shifts assigned in stages 1-3, this constraint would require them to have between 0 and 4 assignments in stage 4.
If these constraints are satisfied during all weeks, it can be guaranteed that also constraints S6 are satisfied for the whole schedule.
C S6 * n are again new surplus variables. Here, fractional limits are rounded such that the limits are always integer numbers and the solutions satisfying S6* always also satisfy S6. We also experimented with different rounding schemes, but this did not influence the quality of the generated solutions.
An alternative version of constraints S6* can be formulated as

S6*b. Average assignments:
In each week, the remaining assignments (not yet used in previous weeks) should be divided equally among all remaining weeks.
Continuing the example above, the nurse in question would have between 3 and 15 assignments left to distribute over 5 weeks (stages 4-8). This means that according to constraint S6*b, they should have between 3 5 (rounded up to 1) and 3 shifts assigned during stage 4.
The difference between these two formulations becomes visible in case of an imbalance in preceding stages (i.e. too many or too few assignments): S6* tries to restore the balance (which might require unusual work or rest stretches), while S6*b ignores a global balance and works exclusively with the assignments remaining for the current and future stages. A further discussion of these two formulations can be found in Sect. 5.

Average working weekends
The same argument as above also applies to constraints S7, the maximum number of total weekends. Just like assignments in general, also weekends should not be used up in the early stages, but distributed across all weeks to preserve options.
Therefore, an analoguous constraint S7* can be defined: S7*. Average working weekends: In each week, the still available working weekends (not yet used in previous weeks) should be divided equally among all remaining weeks.
with the corresponding inequalities Note that since there is at most one working weekend per week and nurse, and the maximum number of working weekends is less than the number of weeks, the limit set for each week will either be 0 or 1.

Next week restrictions
In addition to the global constraints, solutions for different stages influence each other also at the boundary between weeks.
Since the staffing requirements for the next week are unknown in each stage, leaving more options to schedule nurses without conflicts is beneficial. If there are only few good assignments for the nurses with a certain skill, satisfying the cover constraints might become difficult if they do not match one of those options.
A common way for schedules to restrict the options for the next stage is via the sequence constraints. For example, let the minimum number of consecutive night shifts (σ − N ) be 4 and the proposed solution for this week end with a single night shift on Sunday for a nurse (and any other shift or a day off on Saturday, compare Fig. 3). Then we already know that any assignment for this nurse from Monday to Wednesday that is not a night shift, will inevitably incur a penalty (and depending on the rest of the schedule, assigning only night shifts on these three days could result in penalties of its own).
As another example, if the maximum number of consecutive night shifts is 5 and the proposed solution already contains a shift stretch of at least 5 night shifts in the days leading up to Sunday, this means that assigning a further night shift on Monday of the next week would incur a penalty for exceeding the maximum length.
The same reasoning applies to work and rest stretches.

S9*. Restriction of next week's assignments:
Options for next week's schedule should not be restricted. The penalty is calculated as the total number of shifts that cannot be assigned in the next week without violating at least one sequence constraint.
The equations to model this constraint are split into restrictions from shift (a), work (b) and rest (c) stretches, each regarding the minimum and maximum stretch length and with their own set of surplus variables.
Equations 34 detect blocks of the same shift s at the end of the week with a length b below the minimum shift sequence length (constraint S2a). If such a block is found, this means that for the first σ − s − b days of the next week, neither a day off, nor any of the other |S \ {s}| shifts can be assigned without penalty. In the equation, the left hand side of the inequality sums to b + 1 in such a case. To still satisfy the constraint, the surplus variable C S9 * a n must Sa Su Mo Tu We . . . -N N? N? N? . . .

Fig. 3
Assignment that heavily restricts options for the following week. Assuming σ − N = 4, a single night shift on sunday will cause a penalty in the next week if any shifts other than additional night shifts have to be assigned between monday and wednesday be assigned a value ≥ |S|(σ − s − b), resulting in a corresponding penalty in the objective function.
The case, where a long shift stretch of shift s blocks an assignment of s to the first day of the following week, is covered by Eq. 36. Any block of shift s at the end of the week with a length of at least σ + s violates the corresponding inequality, unless compensated for by a surplus variable.
The restrictions on work (b) and rest (c) stretches work analogously, except that the weights of the surplus variables differ according to the number of blocked assignments.

Unresolvable patterns
In the solutions generated for various instances, we found that violations of sequence constraints most commonly appeared at the boundaries between weeks. In many cases, this is the result of patterns similar to those shown in Fig. 4. In general, not checking the feasibility of completing a multi-shift work stretch in the next week can lead to situations where the last shift stretch can not be extended to the minimum length without violating the maximum work stretch length.
This leads to the following additional constraint: = 5, the maximum work stretch length is already reached but at least one more night shift at the beginning of the next week is required Mo Tu We Th Fr Sa Su Mo Fig. 5 The same pattern as for Fig. 4, split up into the parts matched by the constraints S10*. For this assignment, b = 3 S10*. Unresolvable Patterns: For work stretches at the end of the week, there should be a way to complete them in the next week without violating either the maximum work stretch length or the minimum shift stretch length.

Mo Tu We Th Fr Sa Su Mo
Assume a stretch of shift s is assigned to nurse n at the end of the week. Then an unresolvable pattern has the following structure: First, a block of at least w + n − σ − s shifts (that can be any type except a day off) is scheduled (A), followed by a single shift that is not s (B). Then, the remaining b days up to the end of the week are filled with assignments to shift s (C), where b < σ − s . To avoid a violation of the minimum shift stretch length, at least σ − s −b more days of shift s would be required at the start of the next week. However, together with parts (A) and (B), this would bring the total work stretch length to at least (w + n − σ − s ) + 1 + b + (σ − s − b) = w + n + 1, which exceeds the maximum work stretch length w + n . The different parts are visualized in Fig. 5 Equations 40 detect and penalize these patterns through the use of a further set of surplus variables.

Objective function
To have an impact on the generated solutions, the surplus variables for the added constraints have to be included in the objective function. The objective function f for the extended model is therefore where W S8 * , W S6 * , W S7 * , W S9 * and W S10 * are the weights of their corresponding constraints.
After a solution has been fixed, the actual penalty has to be recalculated using the objective function of the basic model f , to ensure that the penalties from the additional constraints of the extended model are not included in the final result.
Obviously, all constraints introduced in this section should be ignored in the last week, as there is no further week to influence.

Experimental results
All algorithms were implemented in Java 7, and we used the IBM ILOG CPLEX solver, 2 version 12.6.3, to solve the IP models. All experiments were performed on an Intel Xeon 2.33GHz PC, using a single thread. The time limit was set to the time alloted by the benchmarking script 3 provided for the INRC-II (on our machine, between 1 and 9 minutes per stage, depending on the instance size).
We trained the parameters of our models using the parameter-tuning framework irace (López-Ibáñez et al. 2011), over the set of late instances 4 published for the INRC-II. The models were then evaluated on the set of hidden instances. 5

Model extensions
We first evaluated the impact of adding each extension to the basic IP model individually. Due to the similarity in structure and purpose, constraints S6* (Average Assignments) and S7* (Average Weekends) are grouped together. For this comparison, the weight of each additional constraint was set to a value of 1 to ensure that the focus of the optimization still remains on the original constraints. The only exception is constraint S10* (Unresolvable patterns), since a violation of this constraint directly results in a violation of at least one shift stretch length constraint in the next week and thus warrants a weight of 15 (as if the violation had already occured). Figure 6 shows that each extension improves the quality of the solutions, with the largest impact achieved by S6*/S7* and S8*, those extensions dealing with the two global constraints S6 and S7.
Considering the two variants of S6*, there is next to no difference between the performance of S6* and S6*b.

Fig. 6
Performance of the basic model extended by each set of constraints individually. The baseline (value of 1) for each instance is the solution generated by the basic model without extensions further reducing the penalties of the generated solutions. This will be denoted as the extended model. Adding also constraints S9* increased the penalties again, even when S9* was assigned a much lower weight than the other extensions. This is probably connected to the fact that solving models including constraints S9* took nearly twice as long on average and thus optimal solutions could not be found for many stages. However, this cannot be the only reason, as results for models without S9* are also better even for instances where each stage could be solved to optimality.
To find optimal weights for the extensions S6*/S7* and S8* (the weight for S10* corresponds directly to the weight of the shifts stretch length constraints), we used IRACE. Both W S6 * (= W S7 * ) and W S8 * were varied between 0 and 20, with a precision of one significant digit after the decimal point. IRACE was run in parallel on 4 cores with a limit of 5000 iterations.
The best values reported by IRACE are W S6 * = 9.9 and W S8 * = 11.9. The two next best configurations are very similar and further experiments showed that the results do not vary significantly under small variations of the weights.
Considering solution times, CPLEX was able to solve most weeks to optimality, even for the larger instances. Over the whole set of hidden instances, using the extended model, an optimal solution could be found for 274 out of 360 weeks. The average gap between the best solution found and the final lower bound for the remaining 86 weeks was only 1.90%, indicating that substantial improvements are not to be expected even with much longer running times.

Final results
For the final evaluation, the extended model was used, with weights for the extensions as shown on Table 1. The exact results can be found on Table 2, see also Fig. 7 for comparison. Due to the extensions, the penalty incurred by the generated solutions is reduced by about 40% on average, in some cases even to less than half the penalty of the basic model. Further, there is no instance, where the extended model produced results that were not at least 20% better than those of the basic model.
Compared to those of the finalists in the INRC-II, the results of the extended model are competitive (slightly better than the median), although no new best known solutions could be found. The average rank over all instances is 3.45, placing our results firmly into the top half of the finalists.

Conclusions
In this paper, we have proposed and evaluated several original extensions of standard IP formulations for nurse rostering problems in order to deal with multi-stage settings, as described for the INRC-II.
We have shown that our extensions significantly improve upon the results of the basic model and achieve competitive results compared to the finalists in the competition.
The fact that our model could be solved to (near) optimality in most cases, even under the strict time limits imposed by the challenge, indicates that major improvements cannot be expected from varying solution techniques alone. Instead, future research should be focused on further modifications of the model to distribute the penalties more equally between weeks and avoid blocking options for later weeks. Techniques that try to predict the requirements of yet unknown weeks or distinguish between nurses of different skill sets and contracts could result in even better models.