Thoughtfully writing a blog post

Introducing our Hybrid lda2vec Algorithm

The goal of lda2vec is to make volumes of text useful to humans (not machines!) while still keeping the model simple to modify. It learns the powerful word representations in word2vec while jointly constructing human-interpretable LDA document representations.

Real-Time Event Visualization

Beautiful data visualizations reveal stories that mere numbers cannot tell. Using visualizations, we can get a sense of scale, speed, direction, and trend of the data. Additionally, we can draw the attention of the audience – the key to any successful presentation – in a way that’s impossible with dry tabulations. While a tabular view of new online signups is informative for tracking, a dynamic map would provide a more captivating view and reveal dimensions that a table cannot.

The Timeless Way of Building Software

In 1977, Christopher Alexander, an important 20th-century architect and prolific author, wrote a book, “A Pattern Language: Towns, Buildings, Construction”, followed in 1979 by another, “The Timeless Way of Building”. These were books about architectural thinking in which patterns, a pattern language, and a “way” of building were discussed in depth.

Unsharing the Database

At Stitch Fix, we are currently tackling a pretty common problem among fast-growing startups in the process of scaling. Our applications are overdependent on a shared database, and in order for us to uncouple the various engineering teams from one another and to grow our applications to the next level, we need to unshare it. This blog post will talk about the problems we are trying to solve, and the stepwise approach we are taking to solve them.

Sorry ARIMA, but I’m Going Bayesian

When people think of “data science” they probably think of algorithms that scan large datasets to predict a customer’s next move or interpret unstructured text. But what about models that utilize small, time-stamped datasets to forecast dry metrics such as demand and sales? Yes, I’m talking about good old time series analysis, an ancient discipline that hasn’t received the cool “data science” rebranding enjoyed by many other areas of analytics.

Managing Technical Change

One of the biggest challenges to growing an engineering team is dealing with technology choice. Some organizations stop time at the moment they chose their current stack, refusing to add any new technologies. Others allow everyone to use whatever they want, leading to an explosion of unmanageable technologies. Both of these are terrible, but what is the alternative?

Thought Experiments in the Browser

As data scientists, we work in concert with other members of an organization with the goal of making better decisions. This often involves finding trends and anomalies in historical data to guide future action. But in some cases, the best aid to decision-making is less about finding “the answer” in the data and more about developing a deeper understanding of the underlying problem. In this post we will focus on another tool that is often overlooked: interactive simulations by means of agent-based modeling.

Machine Learning To Kickstart Human Training

Stitch Fix values the input of both human experts and computer algorithms in our styling process. As we’ve pointed out before, this approach has a lot of benefits and so it’s no surprise that more and more technologies (like Tesla’s self-driving cars, Facebook’s chat bot, and’s augmented customer service) are also marrying computer and human workforces. Interest has been rising in how to optimize this type of hybrid algorithm. At Stitch Fix we have realized that well-trained humans are just as important for this as well-trained machines.

The Merchandising Calendar

Since Stitch Fix is a retail company at heart, we operate on the merchandising calendar. The merchandising calendar is used by the retail industry to account for sales, inventory, and payroll. It originated in the 1930s and became widely adopted by the 1940s. The primary reason for its creation was to guarantee the same number of weekends in comparable months, since a large percentage of retail sales occur on weekends. The calendar also ensures that the end date of every period falls on the same day of the week.

Assessing the Null Hypothesis — A Meta-Analysis (Ruminations on April 1st)

As statisticians and data scientists, we often set out to test the null hypothesis. We acquire some data, apply some statistical tests, and see what the p-value is. If we find a sufficiently-low p-value, we reject the null hypothesis, which is commonly referred to as .