At Stitch Fix, we often have to make decisions that involve uncertainty. For example, we choose the location of a new warehouse without knowing the exact population the warehouse will serve in the coming years. Or we purchase a certain quantity of a newly released fashion trend before we know how popular that trend will be. Or we promise to deliver a client’s curated items on a specific date not knowing if an unforeseen weather event will impact delivery time.

Our prediction of the future will always be incomplete and erroneous. Yet, as decision makers, we have to make a call despite imperfect data. We can either “best guess” uncertain values and hope that we aren’t too far off the mark, or we can optimize for the worst case and make a decision that works for every possibility. Planning for the worst case might make sense if you’re designing a bridge; however, in most business cases this leads to overly conservative decisions that are too expensive or inefficient.

In this post, we’ll introduce stochastic programming, a framework for making optimal decisions that involve uncertainty. Whereas classical optimization approaches modeling the real world assuming known parameters, stochastic programs allow us to explicitly treat uncertainty in our decision-making. We’ll also apply this framework to an operational problem at Stitch Fix.

### Flexible stylist schedules with uncertain demand

Let’s take the problem of stylist scheduling as an example. When a client requests a Fix, our word for a curated shipment of clothing, a stylist with skills appropriate to that client (aided by algorithms) selects the items to go out in that Fix. To ensure on-time delivery, each Fix must be styled within a specific time frame.

The number of Fix requests fluctuates from day to day and week to week, and it is critical for both Stitch Fix and our clients to have the right number of stylists available. Hiring and scheduling enough stylists to meet any possible client demand is too costly and would lead to idle resources. Failing to schedule enough stylists to meet demand could lead to Fixes arriving late or the opportunity cost of deferred demand.

In practice, Stitch Fix does not tell stylists when to work. Styling a Fix requires creativity and freedom, and we can best serve our clients by allowing stylists to choose the schedule that works best for them. For the purposes of this post we will present a simplified version of this problem. Let’s assume that stylists are provided specific schedules that take into account their preferences, and explore a method for finding the right allocation of stylists over time in the presence of uncertain client demand.

### Optimize now for minimal future recourse

It takes time to hire and train new stylists and therefore stylists’ schedules must be set well before client demand is realized. But, even with a set schedule, once we observe client demand we do have some flexibility to adjust. Flexing up or down both incur additional costs and are generally more expensive than scheduling time initially, but alleviates the risk of over or understaffing.

This is naturally framed as a two-stage optimization problem, among the most widely applied and studied stochastic programming models. In a two-stage stochastic problem, the decision maker first commits to an action (scheduling in our example) before knowing the outcome of some random event (demand). Then, after the outcome is known, the decision maker is able to take a recourse action to correct for the decision made in the first stage in light of new information. The optimal policy tells the decision maker what the best initial action is and what recourse actions will be necessary for each random outcome.

### Two-stage stochastic optimization framework

Let’s do a little bit of math and formulate our stochastic problem. First, we will need some notation. The ultimate decision we want to make is whether an individual stylist is scheduled to work during a given time frame. Let’s call the aggregate vector of such decisions \(\mathbf{x}\). The deterministic single-stage scheduling problem can be formulated as:

where \(\cal{F}\) is the feasible space, the set of decisions \(\mathbf{x}\) that respect the individual stylist work preferences and required staffing levels, and \(c\) is the cost of scheduling (think compensating stylists for working within a specific time frame).

There are two issues with this decision: (1) it does not
hedge against any future demand outcomes, and (2) it assumes we
have no way to correct the schedule once we observe the demand.
The two-stage formulation we are about to introduce will address
both of these by not minimizing only the cost of the initial committed
schedule, but minimizing the **total expected cost** of both
scheduling and recourse.

Once we make the “here and now” decision \(\mathbf{x}\) in the first stage, we can
make corrections in a **second-stage** (recourse) optimization problem after the uncertainty is revealed.
Let’s denote the randomness in the recourse data
by \(\omega\). In stochastic optimization, we assume
the probability distribution of \(\omega\) can be estimated from historical data.
The general two-stage
problem can then be written as:

where we minimize the expected recourse cost, \(Q\), **in addition** to the cost of
the initial decision while respecting the original flexibility requirements
in the schedule, \(\cal{F}\).

So where did the recourse decisions go? The optimal recourse \(Q(\mathbf{x}, \omega)\), is the solution to the following problem:

where \(q_\omega\) is the cost of the recourse action \(\mathbf{y}_\omega\) in the random event \(\omega\) and \(\cal{F}^\prime_\omega\) is the random feasible space for \(y\) whose shape changes with \(\mathbf{x}\) and \(\omega\).

## Back to our example

Let’s apply this framework to our scheduling problem! After setting the schedule for time \(t\) (call it \(x_t\)), we observe the realized demand (call it \(\delta_{t\omega}\)) and take a recourse action to adjust for any gap between these two.

In the random event \(\omega\), if the error \(x_t - \delta_{t\omega}\) is negative we are understaffed and would prefer to add stylists in order to meet client demand. Let’s call this decision \(y_{t\omega}^+\), which produces a new staff size at time \(t\) of \(x_t + y_{t\omega}^+\). Then, \(v_{t\omega} =\delta_{t\omega} - (x_t + y_{t\omega}^+)\) is the resulting unmet demand (if any) after the adjustment. Additional stylist time and the remaining error incur costs which we denote by \(q_{t\omega}^{+}\) and \(r_{t\omega}\), respectively.

Similarly, if the demand turns out to be lower than estimated and we are overstaffed, we can adjust the schedule by asking some stylists not to work. Let’s call this staff reduction \(y_{t\omega}^-\) and its associated cost \(q_{t\omega}^{-}\). The resulting actual excess is then given by \(u_{t\omega} = (x_t - y_{t\omega}^-) - \delta_{t\omega}\) with a respective additional underutilization cost \(h_{t\omega}\).

The second-stage problem aims to minimize the total expected cost of the recourse actions and the remaining mismatch between the demand and available staff. This can be modeled as:

where the first constraints capture the balancing relationships between the decisions and demand.
Note that we cannot be over and understaffed at the same time, so

\(y_{t\omega}^- + u_{t\omega}\) and \(y_{t\omega}^+ + v_{t\omega}\) cannot
be both positive for a given random event \(\omega\).

### How do we solve this in practice?

The expected value component in the problem we formulated can be a high-dimensional integral. How can we break this down and solve this problem numerically? We mentioned earlier that the uncertainty in the two-stage problem is assumed to come from a probability distribution we can estimate. This, under certain assumptions, allows us to specify a finite set \(S\) of realizations of the random variables, called “scenarios,” where each scenario has an associated probability \(p_s\). The expectation can be written as:

With this formulation we have converted a stochastic problem into a large deterministic problem.
This is called the *deterministic equivalent problem*. When we solve this problem,
we obtain a single optimal solution \(\mathbf{x}\) which we commit to first, as well as optimal
solutions \(\mathbf{y}_s\) for each scenario \(s \in S\).

But how should we construct the set of scenarios in the first place? Scenarios are built to approximate the original solution space, if we pick too few samples, then our obtained optimal solution may be poor, compared with the “true” optimum of the original problem. On the other hand, with too many scenarios the size of the resulting optimization problem explodes and becomes intractable. We basically have to trade off solution quality with tractability. One popular approach is to use Monte Carlo techniques to build an approximation of the expected value term. We are not going to discuss the quality of these approximations here but if you are interested, you may read about them here.

After careful scenario selection, if the size of the deterministic equivalent problem is small enough, you can solve it as a usual optimization problem. This blog post shows you how. High-quality deterministic equivalent problems used in real world applications are too large to solve in practice. The good news is that these problems have a special shape that we can use to decompose them into smaller subproblems. Going over details of this special shape and available decomposition algorithms is beyond the scope of this post. You can check out this introductory resource if you want to learn more.

### Final thoughts

By solving the stochastic optimization problem we are able to make optimal decisions in the face of uncertainty. But the benefits go beyond a single solution. By framing and solving this problem we consider a number of key business questions. What is the nature of the uncertainty that impacts our decisions? What actions are necessary, when, and what remedial actions are possible? Lastly (and possibly most importantly), how do we value the necessary trade-offs between those actions and the possible impact to the business, employees, and clients? Addressing these questions, clearly and explicitly, can have a dramatic impact on how the business operates.

## References

- Kibaek Kim and Sanjay Mehrotra. “A two-stage stochastic integer programming approach to integrated staffing and scheduling with application to nurse management.” Operations Research 63.6 (2015): 1431-1451.
- Alessandra Parisio and Colin Neil Jones. “A two-stage stochastic programming approach to employee scheduling in retail outlets with uncertain demand.” Omega 53 (2015): 97-103.