More Human Humans

Machines are going to take over the world and leave us humans without jobs. This is the meme going around in mainstream business books on the topic of Artificial Intelligence (AI). This is understandable, as the number of things that machines can do better than humans is increasing: diagnosing medical conditions, analyzing legal documents, and making parole decisions, to name a few. But doing something better doesn’t necessarily make machines an alternative to humans. If machines and humans each contribute differently to a capability, then there is an opportunity to combine their unique talents to produce an outcome that is better than either could achieve alone. This has real potential to change not only how we work, but also how we understand our experience of being human.

Good Books for All Things Data

One of the greatest benefits of working among a diverse group of data scientists and data engineers at Stitch Fix is how much we can learn from our peers. Usually that means getting ad hoc help with specific questions from the resident expert(s). But it also means getting advice on how best to fill any gaps in our own skill sets or knowledge bases, or just what interesting data science materials to explore in our spare time. Our blog posts usually highlight the former; this post touches on the latter.

Introducing our Hybrid lda2vec Algorithm

The goal of lda2vec is to make volumes of text useful to humans (not machines!) while still keeping the model simple to modify. It learns the powerful word representations in word2vec while jointly constructing human-interpretable LDA document representations.

Real-Time Event Visualization

Beautiful data visualizations reveal stories that mere numbers cannot tell. Using visualizations, we can get a sense of scale, speed, direction, and trend of the data. Additionally, we can draw the attention of the audience – the key to any successful presentation – in a way that’s impossible with dry tabulations. While a tabular view of new online signups is informative for tracking, a dynamic map would provide a more captivating view and reveal dimensions that a table cannot.

The Timeless Way of Building Software

In 1977, Christopher Alexander, an important 20th-century architect and prolific author, wrote a book, “A Pattern Language: Towns, Buildings, Construction”, followed in 1979 by another, “The Timeless Way of Building”. These books about architectural thinking discuss patterns, a pattern language, and a “way” of building in depth.

Unsharing the Database

At Stitch Fix, we are currently tackling a pretty common problem among fast-growing startups in the process of scaling. Our applications are overdependent on a shared database, and in order for us to uncouple the various engineering teams from one another and to grow our applications to the next level, we need to unshare it. This blog post will talk about the problems we are trying to solve, and the stepwise approach we are taking to solve them.

Sorry ARIMA, but I’m Going Bayesian

When people think of “data science” they probably think of algorithms that scan large datasets to predict a customer’s next move or interpret unstructured text. But what about models that utilize small, time-stamped datasets to forecast dry metrics such as demand and sales? Yes, I’m talking about good old time series analysis, an ancient discipline that hasn’t received the cool “data science” rebranding enjoyed by many other areas of analytics.

Managing Technical Change

One of the biggest challenges to growing an engineering team is dealing with technology choice. Some organizations stop time at the moment they chose their current stack, refusing to add any new technologies. Others allow everyone to use whatever they want, leading to an explosion of unmanageable technologies. Both of these are terrible, but what is the alternative?

Thought Experiments in the Browser

As data scientists, we work in concert with other members of an organization with the goal of making better decisions. This often involves finding trends and anomalies in historical data to guide future action. But in some cases, the best aid to decision-making is less about finding “the answer” in the data and more about developing a deeper understanding of the underlying problem. In this post we will focus on another tool that is often overlooked: interactive simulations by means of agent-based modeling.

Machine Learning To Kickstart Human Training

Stitch Fix values the input of both human experts and computer algorithms in our styling process. As we’ve pointed out before, this approach has a lot of benefits and so it’s no surprise that more and more technologies (like Tesla’s self-driving cars, Facebook’s chat bot, and’s augmented customer service) are also marrying computer and human workforces. Interest has been rising in how to optimize this type of hybrid algorithm. At Stitch Fix we have realized that well-trained humans are just as important for this as well-trained machines.