Be smarter. Be seetd.

Jesse Clark and Dave Spiegel
- San Francisco, CA

How to organize an office so everyone working there can be comfortable and productive is the topic of much discussion (and the subject of a previous post). One of the key questions is, how do we allocate people to desks (or offices)?

A common strategy is to seat people by their team or sub-team membership (for example, the Algorithms team at Stitch Fix is broadly classed into four sub-teams: Client, Merchandise, Styling, and Platform). This has obvious advantages but potentially misses spontaneous collaboration amongst people on different teams. Another strategy, which we have been employing, is to simply allocate people randomly. This has many benefits, the main one being that a random arrangement is unlikely to be anywhere near an extremum of any particular objective (neither the best possible configuration nor the worst).

If we are to do something other than random, how would it work? There are many considerations, which at a high level are quite subjective. Things like: how do we organise the different sub-teams? What about the fact that some people should be seated near each other? What if individuals prefer certain parts of the office? Although these decisions are subjective, once they have been made, we can bring a lot of objectivity to the problem.

To solve this problem at Stitch Fix, we have built upon our previous seating arrangement efforts and developed a new seating allocation tool - “seetd”. It works by casting the allocation of people to seats (or offices) as an optimization problem, with several different cost terms which we will discuss below. The ultimate goal is then to find an arrangement which minimises our overall cost function.

Teams and sub-teams

One big consideration in the allocation is how to arrange sub-teams. Do we want everyone to be evenly dispersed or do we want every sub-team to sit together? An initial model is to consider if neighbouring people are on the same team or not. The term is given by:

\[\sum_{i,k} J_{ik}\sigma_i \cdot \sigma_k,\]

where \(J_{ik} = 1\) if person \(i\) sits next to person \(k\) and \(\sigma\) is the vector representation of the person’s team (i.e. \(\sigma_i \cdot \sigma_k = 1\) if person \(i\) is on the same team as person \(k\) and 0 otherwise). If people are not neighbours, the term will be 0 (due to \(J_{ik}\)), and if they are, the term will only be non-zero if they share the same team vector \(\sigma\). Interestingly, this term has the exact same form as the Hamiltonian of a spin-glass (a disordered magnetic material), where the interaction between the magnetic spins \(S\) (team membership) of atoms at lattice points (think seats) \(i,k\) is given by:

\[H = - \sum_{ik} J_{ik}S_{i}S_{k}\]

where \(J_{ik}\) is the interaction term for the spins and the sum is taken over nearest neighbours.
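To make this concrete, here is a minimal sketch of computing this term in Python (the adjacency matrix J and the integer team labels teams are hypothetical inputs; a label comparison stands in for the dot product of team vectors):

  import numpy as np

  def team_term(J, teams):
    # J[i, k] = 1 when seats i and k are next to each other, 0 otherwise
    # sigma_i . sigma_k = 1 when the team labels match, 0 otherwise
    same_team = (teams[:, None] == teams[None, :]).astype(float)
    return np.sum(J * same_team)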

What does our team-term look like in practice? Below is an animation demonstrating what happens when we minimise the above cost term, which promotes moving members of the same team away from each other. In the animation, each block represents a person in a seat (ignoring the fact that some people may struggle to reach theirs) and the color represents team membership. Think of it as 4 tables of 9 people, with each table initially consisting of people on the same team. The initial state has all team members seated together, and the final state has them mixed.

[Animation: members of each team start seated together by table and end fully mixed]

In the end we actually modify the neighbour term to take into account not just nearest neighbours but all pairs of people, weighting them by their distance from one another. Now we have,

\[J_{ik} \rightarrow d_{ik}^{-1},\]

where

\[d_{ik} = ||\pmb{x}_i - \pmb{x}_k ||^{u}_{v},\]

and \(\pmb{x}_i\) is the location of person \(i\), \(\pmb{x}_k\) is the location of person \(k\), \(u\) modulates the strength of the distance term, and \(v\) determines which distance norm to use (\(v=2\) is Euclidean, \(v=1\) is Manhattan). Additionally, we also break the sum out by team. This allows us to have different interaction terms on a team-by-team basis (i.e. one team sitting together while the others are dispersed). It now looks as follows:

\[\sum_{m} \frac{\beta_m}{N_m} \sum_{ik \in \Omega_m} \left(d_{ik} +\epsilon_t\right)^{-1},\]

where \(\epsilon_t\) is a small positive constant (roughly analogous to a gravitational softening parameter). We scale by the number of people in a team (\(N_m\)), and take the sum over the set of team-members (\(\Omega_m\)) for team \(m\) which allows us to drop the dot product term (taking the sum over the team implies \(\sigma_i \cdot \sigma_k = 1 \forall i,k \in \Omega_m\)). The \(\beta\) values allow for pushing teams away from each other (\(\beta_m = 1\)), or making them agglomerate (\(\beta_m = -1\)).
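As a rough sketch, the per-team term could be computed as follows (assuming positions holds seat coordinates, teams holds one label per person, and betas maps each team to its \(\beta\); all names are illustrative):

  import numpy as np

  def team_cost(positions, teams, betas, u=1.0, v=2, eps_t=1e-2):
    # positions: (n, 2) seat coordinates; betas: team -> +1 or -1
    total = 0.0
    for m, beta in betas.items():
      members = np.where(teams == m)[0]
      diffs = positions[members, None, :] - positions[None, members, :]
      d = np.linalg.norm(diffs, ord=v, axis=-1) ** u
      iu = np.triu_indices(len(members), k=1)  # distinct pairs only
      total += (beta / len(members)) * np.sum(1.0 / (d[iu] + eps_t))
    return total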

Below, we can see what happens when we break the team-term out by sub-team. The animation has three of the teams with \(\beta = 1\), while one team has \(\beta = -1\). The opposite sign promotes agglomeration for that particular team, while the positive sign promotes separation. This agglomeration/dispersion process shares some similarities with urban segregation models, as discussed in a previous post.

[Animation: the three teams with \(\beta = 1\) disperse across the office while the team with \(\beta = -1\) clusters together]

Preferences and distances

The second term we are going to consider is people’s personal preferences. We could imagine that different people prefer different parts of the office. For example, people may prefer to sit near windows. We can construct a matrix that encapsulates seat assignments at a particular time.

We will do this by constructing an \(n \times n\) matrix \(X_{ij}\). Each row (\(i\)) of the matrix represents a person, and each column (\(j\)) a seat. The entries of \(X_{ij}\) dictate whether person \(i\) is sitting at location \(j\) (and accordingly take on a value of either 0 or 1). If we want one person per seat and one seat per person then we also require:

\[\sum_i X_{ij} = 1\] \[\sum_j X_{ij} = 1\]
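For instance, a random permutation yields a valid assignment matrix; a quick illustrative check:

  import numpy as np

  n = 36
  X = np.zeros((n, n))
  # a permutation places exactly one 1 in every row and every column
  X[np.arange(n), np.random.permutation(n)] = 1

  assert (X.sum(axis=0) == 1).all() and (X.sum(axis=1) == 1).all()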

We can now also encode people’s preferences for a particular seat with another \(n \times n\) matrix: \(r_{ij}\). This matrix doesn’t have the same constraints as \(X\) since people can have multiple preferences, although it may be necessary to constrain the sum of the matrix (meaning people only have a finite amount to allocate for preferences),

\[\sum_{j} r_{ij} \le r_{max}.\]

If we only consider that we want to maximise people’s preferences then we can develop a cost function that simply consists of maximising the following:

\[\sum_{i,j} r_{ij}X_{ij}\]

It should be noted that there needs to be heterogeneity in people’s preferences for this term to do anything; otherwise it just ends up being a constant. Maximising (or minimising) a function of this form is easily solved using linear programming. What happens if we want to consider more things?
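Before adding more, it is worth noting that maximising this single term over valid assignment matrices is the classic linear assignment problem. A minimal sketch using scipy (the preference matrix here is random, purely for illustration):

  import numpy as np
  from scipy.optimize import linear_sum_assignment

  r = np.random.rand(36, 36)  # stand-in preference matrix
  rows, cols = linear_sum_assignment(r, maximize=True)
  # person rows[i] is assigned seat cols[i]
  best_score = r[rows, cols].sum()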

One consideration is that people should be seated further away from their current location. For each person \(i\), we can fill a row of a matrix (\(d_{ij}\)) with the distance from each seat \(j\) to that person’s current location. The aim here is to maximise the distance from the current location, which can also be written as minimising

\[\sum_{i,j} X_{ij}\left(d_{ij} + \epsilon_d \right)^{-1}\]

where \(\epsilon_d\) is a small positive constant. It takes exactly the same form as the term for personal preferences, and these can be combined to yield:

\[\sum_{i,j} X_{ij}\left[\alpha_1r_{ij} + \alpha_2\left(d_{ij} + \epsilon_d \right)^{-1} \right]\]

where \(\alpha_l\) is a constant dictating the strength of term \(l\) (\(\alpha_1 = -1\) since we are minimising the combined expression but want preferences maximised).
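A sketch of this combined term (assuming r and d are the \(n \times n\) preference and distance matrices, and X is a 0/1 assignment matrix as above):

  import numpy as np

  def preference_cost(X, r, d, alpha_1=-1.0, alpha_2=1.0, eps_d=1e-2):
    # alpha_1 = -1 so that minimising this rewards satisfied preferences
    return np.sum(X * (alpha_1 * r + alpha_2 / (d + eps_d)))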

Below is an animation demonstrating what happens when we have a configuration that minimises the above cost term (with \(\alpha_1 = 0\)). In the animation, each block represents a person, just as in the previous example. At the initial stage everyone is ordered by team, and at the final stage the ordering persists, but we can see that the location of each team has moved. This is consistent with the cost term, which moves people as far from their original position as possible. The teams on diagonally opposing tables have swapped locations, as this is the configuration that minimises this cost term.

[Animation: teams keep their internal ordering while diagonally opposing tables swap locations]

New neighbours

The final factor that goes into our objective function is that people should be seated farther from their current neighbours in the new arrangement. We take our original seating configuration and calculate an \(n \times n\) matrix of distances between person \(i\) and person \(k\), given by

\[p_{ik} = ||\pmb{x}_i - \pmb{x}_k ||^{u}_{v},\]

where \(\pmb{x}_i, \pmb{x}_k\) are the locations of persons \(i\) and \(k\). The goal is then to minimise the following:

\[\left[\sum_{ik} |p_{ik}^{o} - p_{ik}| + \epsilon_p\right]^{-1},\]

where \(p_{ik}^{o}\) is the person-to-person distances of the original configuration and \(p_{ik}\) is the person-to-person distances for the current configuration.

This is minimised when the new distances differ as much as possible from the old ones: people who were close are moved further away, and people who were far away are moved closer.
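A sketch of this term (with p_old and p_new standing in for the person-to-person distance matrices before and after the move):

  import numpy as np

  def pairwise_distances(positions, u=1.0, v=2):
    diffs = positions[:, None, :] - positions[None, :, :]
    return np.linalg.norm(diffs, ord=v, axis=-1) ** u

  def neighbour_cost(p_old, p_new, eps_p=1e-2):
    # small when the new distances differ a lot from the old ones
    return 1.0 / (np.sum(np.abs(p_old - p_new)) + eps_p)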

Think global. Act local.

The next step is to construct our cost function out of the three terms as follows:

\[C_T = \lambda_1\sum_{m} \frac{\beta_m}{N_m} \sum_{ik \in \Omega_m} \left(d_{ik} +\epsilon_t\right)^{-1} + \lambda_2\sum_{ij} X_{ij}\left[\alpha_1r_{ij} + \alpha_2\left(d_{ij} + \epsilon_d \right)^{-1} \right] + \lambda_3 \left[\sum_{ik} |p_{ik}^{o} - p_{ik}| + \epsilon_p\right]^{-1}\]

where \(\lambda_l\) is the scalar weighting of term \(l\). One important factor is how to weight the relative strengths of the three terms. For our case, we wanted them to be roughly equal. To determine the appropriate \(\lambda\)’s, \(q = 1000\) random configurations were generated to get an average value for each term,

\[\bar C_l = \frac{1}{q}\sum_{s=1}^{q} C_{l,s}.\]

The scaling then becomes

\[\lambda_l = \bar C_l^{-1}\]

Although this is not perfect, it at least puts each term on an equal footing for random configurations. Depending on the layout, it may be necessary to alter the power scaling parameter \(u\) for the distances (i.e. \(u \lt 1\) to reduce its effect).
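The calibration itself is only a few lines; a sketch, assuming random_assignment generates a random valid configuration and terms is a list of the three cost-term functions (both hypothetical names):

  import numpy as np

  q = 1000
  samples = [random_assignment() for _ in range(q)]
  # each lambda is the inverse of that term's average over random configs
  lambdas = [1.0 / np.mean([t(s) for s in samples]) for t in terms]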

It might also be advantageous to include cost terms related to past configurations. This is easily done by modifying our cost function to now be:

\[C_{T}^{'} = C_T + \gamma C_{T-1}^{'}\]

which compares the current configuration to past configurations through the recursive term \(C_{T-1}^{'}\); the strength of these past terms is dictated by \(\gamma\) (generally \(\gamma \le 1\), so older configurations contribute less and less).
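One way to read the recursion is as a geometrically discounted sum over past configurations. A sketch, where cost is the total cost above and cost_against is a hypothetical helper that scores the current arrangement against a single past one:

  def cost_with_history(X, past_configs, gamma=0.5):
    # C'_T = C_T + gamma * C'_{T-1} unrolls into a discounted sum
    total = cost(X)
    for k, past in enumerate(past_configs, start=1):
      total += gamma ** k * cost_against(X, past)
    return total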

Optimization

So how do we actually get the seating assignments that minimise our cost function? We are going to solve this using a combination of simulated annealing (SA) and local optimization.

SA is a probabilistic optimization algorithm (and a variant on the Metropolis-Hastings algorithm) that can be very good at finding good solutions to some difficult problems. SA comes from statistical mechanics and is inspired by the way materials find low-energy atomic configurations. SA is good because it allows us to solve a huge range of problems that may be hard to optimize otherwise, and in this case, gives us the flexibility to add arbitrarily complex constraints to our seating in the future. It is able to find approximate global solutions because it sometimes moves in directions that increase the cost function, rather than getting stuck in the nearest local minimum.

SA works by changing current potential solutions into a neighbouring state. Rather than simply selecting neighbouring states that lower our cost function, SA accepts worse states with some finite probability. This property allows it to escape minima and explore more of the space. The acceptance probability is related to the difference in the cost function as well as a temperature parameter. A high temperature will result in worse moves being accepted more regularly, while a low temperature will rarely result in worse moves being accepted. A common strategy is to anneal the temperature as the iterations go on, decreasing the probability of accepting worse solutions with iteration number. To find better solutions, we also include a local search step, which allows us to improve a solution in some neighbourhood with relative ease (akin to basin hopping).

In the case here, we simply iterate through each seat and its immediate neighbours and accept any change that lowers the cost function. The local optimization part is performed after any new solution is accepted during the SA procedure. Here is some simple Python to demonstrate:

  import numpy as np

  cur_cost = cost(X)  # cost of the initial arrangement
  for it in range(iterations):
    # propose a neighbouring seating arrangement
    X_candidate = change_iterate(X)
    new_cost = cost(X_candidate)

    # always accept better moves; accept worse moves with finite probability
    accept = prob_f(new_cost, cur_cost, T) >= np.random.uniform(0, 1)

    if accept:
      X = local_optimize(X_candidate)  # greedy local search on acceptance
      cur_cost = cost(X)

    T *= cooling  # anneal the temperature (cooling < 1 is an assumed rate)

change_iterate is a function that moves the current seating arrangement into a neighbouring state. prob_f is a function that returns the probability of accepting the new state based on the cost of the candidate configuration, the cost of the current configuration, and the temperature. A common form is

\[\exp\left[-(C_{new} - C_{old})/T\right].\]

local_optimize acts as a greedy optimizer, moving into neighbouring states but only ever accepting configurations that lower the cost function. This process is repeated for a pre-defined number of iterations.
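A minimal prob_f consistent with this form (moves that lower the cost give a probability of at least 1 and so are always accepted):

  import numpy as np

  def prob_f(new_cost, cur_cost, T):
    return np.exp(-(new_cost - cur_cost) / T)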

Below is an animation of seetd in action, optimizing the cost function we built (ignoring people’s preferences). Each color represents an individual and each block is a seat (sub-team membership is not shown but exists). The green spaces are aisles (or empty seats).

[Animation: seetd iteratively rearranging individuals between seats, with green aisles left empty]

Conclusions

Although optimizing seating is something of a curiosity at first, it becomes apparent that the optimization shares a lot of similarities with real-world problems, from economics and logistics to urban segregation and magnetism. Here we have developed a fun tool to help us optimally allocate people. The full working code can be found here.

