A New Era of Creativity: Expert-in-the-loop Generative AI at Stitch Fix

Tianlin Duan
- San Francisco, CA

Generative AI has gained significant attention in recent years. Made possible by advances in deep learning and trained on previously unimaginable amounts of data, generative AI already powers many real-world use cases, from creating realistic images with systems like DALL-E 2 and Midjourney to generating human-like responses with ChatGPT.

At Stitch Fix, we are constantly exploring innovative ways to utilize the latest advancements in AI and ML to enhance the experiences of our clients. In this blog post, we will delve into our approach to generative AI, with a special focus on our text generation use cases. By combining algo-generated text with a human expert-in-the-loop approach, we aim to streamline tasks such as crafting engaging advertisement headlines and producing high-fidelity product descriptions.

Algo-generated Ad Headlines

Generative AI in the text space is powered by large language models (LLMs) that are pre-trained on vast amounts of data (for example, GPT-3 is pre-trained on hundreds of billions of tokens of text, much of it drawn from the web) and can understand and generate natural language. Once pre-trained, an LLM can generalize from very limited amounts of data and perform a wide variety of natural language tasks such as Q&A, translation, summarization, and text generation. This few-shot learning capability, which relies on only a few examples to make predictions, makes LLMs especially well suited for tasks that require creativity and originality, such as crafting compelling ad headlines.
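To make few-shot learning concrete, here is a minimal sketch of how a few-shot prompt for headlines might be assembled. The task wording, the example headlines, and the function name build_few_shot_prompt are hypothetical and only illustrate the general pattern of showing the model a handful of worked examples before the new request.

```python
def build_few_shot_prompt(examples, keywords):
    """Assemble a few-shot prompt: a task line, worked examples, then the new query.

    `examples` is a list of (style_keywords, headline) pairs.
    """
    lines = [
        "Write a short ad headline for an outfit with the given style keywords.",
        "",
    ]
    for example_keywords, headline in examples:
        lines.append(f"Keywords: {', '.join(example_keywords)}")
        lines.append(f"Headline: {headline}")
        lines.append("")
    # The final entry is left open so the model completes the headline.
    lines.append(f"Keywords: {', '.join(keywords)}")
    lines.append("Headline:")
    return "\n".join(lines)


# Hypothetical examples, invented for illustration only.
examples = [
    (["classic", "professional"], "Polished looks for every workday"),
    (["boho", "romantic"], "Soft layers, free spirit"),
]
prompt = build_few_shot_prompt(examples, ["effortless", "classic"])
print(prompt)
```

The resulting string would be sent to a text-completion model; the model call itself is omitted here because it depends on the provider and client library in use.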

Ad headlines are often the first interaction with potential clients, so it’s crucial to make them engaging. Traditional marketing requires a copywriter to write new headlines for every new ad asset, which can be time-consuming and costly, and may not always result in unique copy. Using a generative model such as GPT-3, we can quickly generate a large number of headlines tailored to our brand tone and messaging.

We achieve this by using a combination of latent style understanding (check out this blog and this blog to learn more about how we understand clients’ personal styles), word embeddings (read more about word embeddings in this blog and this blog), and few-shot learning. Our ad assets primarily consist of many outfit images that illustrate a wide range of styles we offer. First, we map the outfit and a set of style keywords (such as effortless, classic, romantic, professional, boho etc.) to the latent style space, and then find the style keywords closest to the outfit in that space. Next, we use GPT-3 to generate headlines based on the selected style keywords. Our human experts (copywriters) then review and edit the headlines generated by AI to ensure they capture the style of the outfit and align with the brand’s tone and messaging. This human expert-in-the-loop approach allows us to leverage the creativity and efficiency of generative AI while still maintaining human oversight. For our copywriters, reviewing headlines instead of writing new ones saves a significant amount of time and effort.
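The keyword selection step can be sketched as a nearest-neighbor lookup. This is an illustration under stated assumptions, not the production system: it assumes the outfit and each style keyword are already embedded as vectors in the same latent style space, it uses cosine similarity as the notion of "closest" (the actual distance measure is not specified here), and the three-dimensional vectors are toy values.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def closest_style_keywords(outfit_vector, keyword_vectors, top_k=3):
    """Return the top_k keywords whose vectors are most similar to the outfit."""
    ranked = sorted(
        keyword_vectors.items(),
        key=lambda item: cosine_similarity(outfit_vector, item[1]),
        reverse=True,
    )
    return [keyword for keyword, _ in ranked[:top_k]]


# Toy embeddings (assumed values; real latent style vectors would be learned).
keyword_vectors = {
    "effortless": [0.9, 0.1, 0.0],
    "classic": [0.7, 0.6, 0.1],
    "romantic": [0.1, 0.9, 0.2],
    "professional": [0.2, 0.3, 0.9],
    "boho": [0.0, 0.8, 0.6],
}
outfit_vector = [0.8, 0.3, 0.1]

selected = closest_style_keywords(outfit_vector, keyword_vectors, top_k=2)
print(selected)
```

The selected keywords would then be passed to the language model as the basis for headline generation, and the generated headlines would go to copywriters for review.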

Diagram illustrating how we generate ad headlines

We have since used the expert-in-the-loop approach for generating all ad headlines for Facebook and Instagram campaigns, and have continued to benefit from improved efficiency while not sacrificing quality. Additionally, we’ve been constantly seeking opportunities to extend this capability to other areas.

Algo-generated Product Descriptions

“A smart choice for everyday wear, this crew neck T-shirt is a versatile addition to your wardrobe, pairing well with jeans, leggings and shorts.”

With the success of algo-generated ad headlines, we have gained confidence in the potential of expert-in-the-loop generative AI for production use cases. Next, we set our sights on a problem of even greater scale and complexity: product descriptions.

Product descriptions play a crucial role in e-commerce and fashion retail websites. Well-written, accurate, and detailed descriptions can enhance the client experience, build trust, and improve search engine optimization. Our Freestyle offering, where clients can shop for individual items in their own personal shopping feed, benefits greatly from informative and compelling product descriptions on the product detail pages (PDP). Writing descriptions for hundreds of thousands of styles in inventory is a daunting task for human copywriters alone, and relying on the few-shot learning approach used for ad headlines results in generic, limited-quality descriptions. We needed a solution that could build on the generic large language model with few-shot learning and utilize expert-written examples to create a customized solution for our use case.

Enter fine-tuning!

Fine-tuning is the process of retraining a pre-trained base model on a smaller, task-specific dataset in order to adapt it to a specific use case. Through fine-tuning, the model learns the unique language, style, and requirements of the specific task, leading to improved performance compared to a generic pre-trained model. For our product description use case, we gathered a task-specific dataset by having our human experts write several hundred high-quality product descriptions (the “completion”, or training output) based on product attributes (the “prompt”, or training input). We then fine-tuned the base model on this task-specific dataset to teach the model our language, style, and template for high-quality product descriptions. This resulted in accurate and engaging descriptions tailored to meet the needs of our clients, all written in the Stitch Fix brand voice.
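A fine-tuning dataset of this shape can be sketched as a list of prompt/completion pairs serialized one JSON object per line. This is only an illustration: the attribute names, the "Description:" prompt suffix, and the helper names are assumptions, and the exact file format depends on the fine-tuning service being used.

```python
import json


def attributes_to_prompt(attributes):
    """Render product attributes as the training input (the "prompt")."""
    lines = [f"{name}: {value}" for name, value in attributes.items()]
    return "\n".join(lines) + "\nDescription:"


def build_training_record(attributes, expert_description):
    """Pair the rendered attributes with the expert-written text (the "completion")."""
    return {
        "prompt": attributes_to_prompt(attributes),
        # A leading space before the completion is a common convention, assumed here.
        "completion": " " + expert_description.strip(),
    }


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(record, ensure_ascii=False) for record in records)


# Hypothetical attributes paired with the sample description quoted in this post.
attributes = {"category": "T-shirt", "neckline": "crew neck", "occasion": "everyday"}
description = (
    "A smart choice for everyday wear, this crew neck T-shirt is a versatile "
    "addition to your wardrobe, pairing well with jeans, leggings and shorts."
)

records = [build_training_record(attributes, description)]
jsonl_text = to_jsonl(records)
print(jsonl_text)
```

With several hundred such records written by experts, the base model sees the same attribute-to-description pattern repeatedly, which is how it picks up the expected language, style, and template.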

Our fine-tuned algo solution offers substantial time savings and scales well without sacrificing the quality of descriptions. In fact, in a blind evaluation comparing our algo-generated product descriptions with human-written descriptions, the algo-generated descriptions achieved higher quality scores, demonstrating the efficacy of our fine-tuned solution.

An Expert-in-the-loop Approach

In both use cases above, we used an “expert-in-the-loop” approach that incorporates human expertise into the text generation and evaluation process. It combines the efficiency and scalability of algorithmic solutions with the quality and expertise of human experts. The result is a solution that not only delivers high-quality content but also continuously improves with each iteration.

In comparison to relying solely on human experts, the expert-in-the-loop approach is much more efficient and fun. Our copywriters have reported saving significant time and effort as they review and edit the algo-generated content instead of writing new content from scratch. They have also shared that the algo-generated content can be fun and even inspiring to work with, as there are often interesting expressions or angles that are not typical of human-generated content.

Compared to purely algorithmic solutions, the expert-in-the-loop approach ensures the quality and appropriateness of the client-facing content. Among all generative AI applications, text generation may be the area where human expertise is needed most. Natural language is complex and nuanced, and while algorithms are able to generate text, they often fall short when it comes to capturing the subtleties of human language such as tone and sentiment. That’s where human experts come in. Human experts can distinguish these nuances and choose the best expressions for the algorithm to learn from.

In our expert-in-the-loop approach, human experts work with us from the very beginning to define what a high-quality output should look like. For product descriptions, for example, the output should be original, unique, and sound natural and compelling. It should also make truthful statements about the product and align with our brand guidelines. As we iterate and improve the algorithm, our experts provide valuable insight into how to further improve the solution. For example, in ad headlines, our copywriters can discern if certain fashion-forward wording may not align with our brand messaging. This intelligence can help us further fine-tune the algorithm and improve the quality of the output through regular quality assurance checks. The expert-in-the-loop approach thus creates a positive feedback loop where human experts and algorithms work together to continually improve the quality of the generated content.
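One way to picture this feedback loop is as a review record that captures the expert's decision on each generated text, with approved or edited outputs collected as candidate examples for a later fine-tuning round. This is a hypothetical sketch: the Review class, its fields, and the idea of feeding expert edits directly back into the training set are assumptions made for illustration, not a description of the actual workflow.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Review:
    """An expert's decision on one piece of generated text (hypothetical schema)."""
    prompt: str
    generated: str
    approved: bool
    edited: Optional[str] = None  # the expert's rewrite, if they changed the text

    @property
    def final_text(self) -> Optional[str]:
        """Text that would be published: the edit if present, else the original."""
        if not self.approved:
            return None
        return self.edited if self.edited is not None else self.generated


def collect_training_examples(reviews: List[Review]) -> List[dict]:
    """Keep only approved outputs as prompt/completion pairs for future training."""
    return [
        {"prompt": review.prompt, "completion": review.final_text}
        for review in reviews
        if review.approved
    ]


# Invented review outcomes for illustration.
reviews = [
    Review("Keywords: classic", "Timeless pieces, made for you", approved=True),
    Review("Keywords: boho", "Edgy boho drip", approved=True, edited="Easy boho layers"),
    Review("Keywords: romantic", "Love is in the fabric", approved=False),
]
new_examples = collect_training_examples(reviews)
print(new_examples)
```

Rejected outputs are dropped here for simplicity; in practice they could also be kept as a record of wording the brand wants to avoid.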

As we look ahead and as the use of generative AI continues to grow and evolve, we are excited to explore the potential of generative AI in even more use cases across our business: assisting efficient styling, textual expression of style understanding, and much more.
