Polymorphic, Recursive Interfaces Using Go Generics
Go's generics solve a previously-unsolvable problem with interfaces, but awkwardly
Understanding recomposition in Jetpack Compose is critical to optimizing application performance. Here are some "gotchas" to avoid.
Here on the Styling team at Stitch Fix, we are lessening the amount of state a developer needs to have in their head at any given time by moving towards an application architecture that makes use of microservices and micro front ends, connected by a central API.
In a world of asynchronous communication, it’s more important than ever to create inclusive and remote-friendly collaboration, decision-making, alignment, and documentation processes.
An excerpt from an interview with Code Climate and Codecademy on how to transition from IC to engineering manager and tips for new managers.
A video recording of a talk from TSConf 2020 teaching how the use of strong types with TypeScript can prevent bugs in React apps.
When looking back at the various teams I’ve been part of throughout my career, I often think about what made certain teams successful and why others struggled. Learn how Stitch Fix teams are built for success.
How can a successful technology organization balance the need to innovate with the desire to maintain a lean, consistent tech stack?
This past December marked three years since I joined Stitch Fix Engineering. In that short time, I’ve witnessed the bulk of the growth that we’ve experienced as a company since our founding. For example, in December 2014 there were roughly 10 engineers and now we number nearly 100. It was a time when those of us at our SF headquarters used to be able to sit around a small conference room table for our all hands meetings. Similarly, we needed just a slightly larger table when our remote engineers (roughly 50% of our team) came to SF for a week during our once-quarterly Engineering summits.
The lights are dimmed, bingo cards are dealt, and the scent of popcorn (well, melted butter..) spreads through the corridors... It’s time for usability movie night, and everyone’s invited!
I’ve been lax in updating the “Patterns of Service-oriented Architecture” series, so here’s a new one on a time-honored and critical technique: Database Transactions. This is a powerful feature of most SQL databases that allows you to apply a series of changes to the database in an “all or nothing” type of approach.
Recently Increment published an issue about what it is like being a developer at a number of great companies like Fastly, Lyft, and DigitalOcean. We thought it would be fun to create a blog post answering the same questions about what it’s like at Stitch Fix.
At Stitch Fix we certainly have enough data that it qualifies as Big, but since we collect the data ourselves we focus on making it as Rich as possible.
In a system based on messages or events, there are numerous ways that system can fail, and the techniques needed to handle those failures are different. You can’t just switch messaging infrastructure or use a framework to address all failure points. It’s also important to understand where failures can occur, even if the underlying infrastructure is perfect. Much like acknowledging that there is no happy path, the way you plan for failure affects product decisions.
It seems like the endless stream of data in a terminal is the farthest thing from a living, breathing being. Looking through the Stitch Fix codebase, what comes to mind is structures, information systems, methods of exchange – not a group of finches on a faraway Galapagos island.
The opinion that software will soon dominate and radically change every aspect of everyone’s lives has become so commonplace, and repeated so frequently, that those of us in the tech industry treat it as a statement of fact. A less heralded but more concrete fact is that much of the productivity gains expected from the introduction of computer technologies have not been realized.
Next up in our “Patterns of Service-oriented Architecture” series we’ll talk about dealing with highly normalized data that spans many tables and services, or otherwise has a large object graph that reaches beyond just a simple database, by caching a denormalized version of it.
In this installment of our “Patterns of Service-oriented Architecture” series, we’re going to talk about a complex concept called idempotency, and a technique you can apply to your service design to ensure that requested work is only performed once.
This entry in our “Patterns of Service-oriented Architecture” is a very common one, but it bears discussion. It’s running code in a background process, instead of in a synchronous request a Consumer might be waiting on.
You may have heard the news that we’re sharing more information about Stitch Fix. We’ve also been sharing our expertise. We’ve been busy ourselves talking about how we build systems here at Stitch Fix. Here are some talks our team has delivered in the last 2-3 weeks.
This is the first installment in the “Patterns of Service-oriented Architecture” series of posts, and we’ll start off with a widely-applicable pattern called Asynchronous Transaction. It’s a simple pattern to use when your service must perform long-running tasks before giving a definitive result to its consumer.
This is the start of a blog series called “Patterns of Service-Oriented Architecture”, which is based on my experience at Stitch Fix (the first post is up if you want to get right to it!). Over the last four years, we’ve gone from a team of two developers and one Rails app, to almost 80 developers managing 40+ applications. These applications are a mixture of user-facing and headless services. While our technical architecture isn’t perfect, we’ve had relatively few major problems. Part of the reason for that is that we’ve done a decent job of identifying and re-applying patterns to solve similar technical challenges.
We’ve been trying to use the technology radar concept as a way to document both what technologies we have in use as well as our current thinking on direction. It’s viewed as documentation and not speculation, so our tech radar should never have stuff in it that’s not actually in production.
In cycling, a wheel that doesn't wobble is considered to be "true". We're happy with the email validation solution we've found here - it's *also* true, and did not involve re-inventing the wheel.
We could not have the luxury of continuous deployment if we didn’t have the discipline of a strong testing process. Tests help us ensure that each new version of the app works well, without having to go through an extensive manual QA process. Tests help us continuously deploy with confidence.
Today, I feel pretty comfortable with my relationship to Time. That's a new thing for me to be able to say. Most of my adult life has felt like a Thunderdome style battle to the death with Time.
Somewhere in a closet in Texas, a desk sits, isolated from the noise of children and clanging dishes. From here, I gather my wits and connect to the outside world. I reach out to other devs.
If there’s one thing we probably have in common, it’s that we both have a “1:1”. Some sort of regular check-in meeting with your manager. If you don’t, I urge you to fix that.
My team at Stitch Fix builds internal tools for our merchandisers, who are responsible for planning and buying inventory that delights our clients. Internally, we are known as the “Erch” team. It’s a funny name with roots deep in Stitch Fix history, but I’ll save that story for another time. Instead, I have a story of my own to share. Prior to joining the Erch Engineering team, I got my start at Stitch Fix on the Merchandising team.
It’s easy to know if you are under-engineering something, because you produce sloppy work. It’s much harder to know when you’re over-engineering. The root cause of this is expanding the problem at hand so that the solution is much more interesting than the solution to the actual problem.
In 1977, an important 20th-century architect and prolific author, Christopher Alexander, wrote a book, “A Pattern Language: Towns, Buildings, Construction” followed in 1979 by another book, “The Timeless Way of Building”. These were books about architectural thinking where patterns, a pattern language, and a “way” of building were discussed in depth.
At Stitch Fix, we are currently tackling a pretty common problem among fast-growing startups in the process of scaling. Our applications are overdependent on a shared database, and in order for us to uncouple the various engineering teams from one another and to grow our applications to the next level, we need to unshare it. This blog post will talk about the problems we are trying to solve, and the stepwise approach we are taking to solve them.
One of the biggest challenges to growing an engineering team is dealing with technology choice. Some organizations stop time at the moment they chose their current stack, refusing to add any new technologies. Others allow everyone to use whatever they want, leading to an explosion of unmanageable technologies. Both of these are terrible, but what is the alternative?
Since Stitch Fix is a retail company at heart, we operate on the merchandising calendar. The merchandising calendar is used by the retail industry for accounting of sales, inventory and payroll. It originated in the 1930s and became widely adopted by the 1940s. The primary reason for its creation was to guarantee the same number of weekends in comparable months since a large percentage of retail sales occur on weekends. Also the calendar ensures that any end date of a period falls on the same day of the week.
Rather than putting all domain logic into a single application monolith, modern software architectures tend to split functionality into multiple applications and services. There are definitely pros and cons to either design, and it is not the goal of this article to go into the details of what is better and why. However, it is generally agreed that one of the drawbacks of using services is the increased complexity they bring to the table. Specifically, there are many more points of failure, and building a robust system means expecting that nodes of your system may fail at any given point in time. These failures cannot be ignored and you cannot expect to be able to wait until your nodes are all green to operate normally. In order to address some of the added complexity that services introduce, I’ve compiled a checklist of important elements to consider:
Way back in 2013, I described Stitch Fix’s git flow, as a reaction to the popular post “A Successful Git Branching Model”. Recently, Jussi Judin posted a reaction using the classic considered harmful tag line. I read the post with enthusiasm, but found the process described still too complex. This felt like a good time to refresh my previous post with how Stitch Fix currently does branching and Git (it’s even simpler than it was in 2013).
This is a story I will one day tell my grandchildren. But as I have no grandchildren, you, dear reader, will have to do. Now sit down, pop this hard candy in your mouth and look interested.
Preboot is a feature provided by Heroku that can help you achieve zero-downtime deploys: when you push a new version of your code, there’s not even a split second in which your users experience your app being down.
Update (10/31/2016): We’ve written a newer blog post about how we test, integrate and deploy our iOS app. It complements the information here, and includes up-to-date details about our current process.
It’s often expedient to discuss the “happy path”, which is the ideal or most simple flow of logic through a system. While it’s a great tool for conversation and identifying requirements, I’ve found it more and more problematic when thinking through actual implementations, especially when there are distributed systems involved. It’s often better to plan and design for all paths from the start.
Update (10/31/2016): We’ve written a newer blog post about how we test, integrate and deploy our iOS app. It complements the information here, and includes up-to-date details about our current process.
I am a tech talk and blog post addict! In the evenings I frequently run on my treadmill and watch at least one tech talk. At our last offsite in San Francisco a few of my co-workers encouraged me to write a post about my favorite tech talks and articles from 2015. Well, here they are, in no particular order:
Stitch Fix recently released its first iPhone app! What started as a simple, single-storyboard application is now a complex application with entirely programmatic views. The transformation from storyboard to programmatic views was not straightforward–we experienced the good and the bad of storyboards, .xibs, programmatic views, and in-between hybrids.
Here at Stitch Fix, we have many different apps and services. As our infrastructure grows, so does the need to create and define more “micro-services” to centralize and isolate important shared behavior and data. Testing is a very important part of our development process. As we started creating more and more services, we realized that we had to change the way we think about integration tests when services are involved. Here’s a typical workflow that you may be familiar with:
I think it’s safe to say that on-call week is never something we software engineers eagerly anticipate. I know from past experience, my stress levels tend to uptick, my personal productivity expectations decline, and my eyes are always trained on my email, hoping the dreaded pagerduty alert will not jump from the shadows. But I have to ask myself, what’s driving these negative emotions? And more importantly, are they pointing to possible improvements we as software engineers should implement to make on call duties less of a burden? At Stitch Fix, we’ve acknowledged these concerns and recently implemented strategies to help make on call duties less stressful and more productive.
I know it’s trite to say that interviewing for a job is really stressful, but I hope you’ll cut me some slack. The experience is still a little fresh. This summer I dove headfirst into the gauntlet that is Job Hunting for the first time in six years. I’d heard nightmarish stories about what awaited me: riddles and puzzles, trick question whiteboard code exercises, and grueling panel interview sessions. I was, if you’re going to force me to be perfectly honest, scared out of my mind.
One of the great things about being a software engineer at Stitch Fix is that most of our applications are internal-facing. This means that many of our “customers” are also our co-workers and may even work in the same building as us. For example, I work on our engineering team that supports our merchandising organization - the people who plan, buy, design, and allocate our inventory. Our merchants work with us in our headquarters on Market St. in downtown San Francisco.
Update (10/31/2016): We’ve written a newer blog post about how we test, integrate and deploy our iOS app. It complements the information here, and includes up-to-date details about our current process.
We don’t have a single monolithic application—we have lots of special purpose applications. Initially, there were just a few, managed by a few developers, and we used RubyGems to share logic between them. Now, we have over 33 developers, are a much bigger business, and have a lot more code in production. So, we’ve turned to HTTP services. Instead of detailing the virtues of this architecture, I want to talk about how we do this with Rails using a shared gem that’s less than 1,000 lines of code called stitches.
I’ve been talking to and interviewing quite a few junior software engineers and Rails bootcamp graduates of late. Since testing is so central to our work here at Stitch Fix I usually ask them about it and TDD. Often people respond sheepishly that they don’t really write tests first but they know they should. One person told me that their bootcamp class read the “TDD is dead. Long Live Testing” post by DHH and took away the lesson that testing wasn’t required. I’m pretty much 100% certain that’s not the correct interpretation. Perhaps it points to a problem with TDD in Rails development culture: testing is a requirement and TDD is a choice, but the two are often conflated.
Facebook, Uber, Lyft, GitHub, Pandora - an impressive list of companies. And now Stitch Fix, a company devoted to disrupting the retail industry with innovative technology, and where I work as a software engineer, was in direct competition against these organizations.
We’ve given up on “fat models, skinny controllers” as a design style for our Rails apps—in fact we abandoned it before we started. Instead, we factor our code into special-purpose classes, commonly called service objects. We’ve thrashed on exactly how these classes should be written, so this post is going to outline what I think is the most successful way to create a service object.
A couple of years ago a developer friend and I were discussing our ideal engineering jobs. I didn’t have a specific idea of what mine would be but I knew that it would sound boring to most people. Less than a year later I found myself at Stitch Fix doing just that and I could not be happier.
I ❤ UNIX and using the command line; they help me solve problems at Stitch Fix. I’m not alone. Across the Data Science and Engineering teams, we’re constantly solving problems with UNIX and the command line.
Now you know all about strings and ints and how Hash tables are ordered (or not). You know how to talk to a JSON API and build a Twitter clone in 3 different languages (or maybe 3 different JavaScript frameworks). Your aptitude, technical knowhow and general pluckiness have helped you land your first real Software Development job. Congratulations. This is no easy feat. I know that you put in many hours of study and many more of frustration to get where you are today. By knowing how to write software, you’ve opened up for yourself a whole world of possibilities.
As mentioned previously, a lot of what we do at Stitch Fix is create software to run Stitch Fix. While we have a team dedicated to our (external) customers’ experience, most engineers are working on software for internal users.
Being a good developer means knowing both what to build and how to build it. At Stitch Fix, we don’t have a dedicated product team, so we spend more time than most figuring out what to build.
Working on a public-facing website can be a ton of fun. You get to show your work off to friends & family, see it get used by a ton of people, and feel the pride of working on what the world thinks of as your company’s “product”. But this is just the tip of the iceberg.
On my personal blog, I outlined some issues with Rails’ front-end support. A lot of discussion around that post had to do with the notion that Rails isn’t for building “ambitious” apps that are client-side heavy. The thinking further goes that for un-ambitious apps, “The Rails Way” (server-generated JavaScript) and JQuery is Just Fine™.
At Stitch Fix, we’re working on our sign-up flow. During this process, we debated whether we should have a single password field, or, as many sites do, two— the second one being a “password confirmation”.
The Grouper Engineering Blog posted a good description of what it’s like to host Resque on Heroku:
We call it “Old Admin”. A basic admin interface (similar to Rails’ Active Admin) built by an outside team before the current engineering team arrived. For a time it ran our whole business. One of the first tasks of our original engineers was to replace it with something more sophisticated. However, limited as it was in many ways, it inadvertently taught us some good lessons about designing internal systems.
We use Resque at Stitch Fix…a lot. For background processing or getting work out of the web request/response loop, Resque is our go-to technology.
The ability to quickly create and deploy an application is crucial to avoiding a monolithic architecture. I touched on this in my talk at RubyNation, as well as my post here, but a key part of that ability is to script the actual creation of the application so that it’s ready for your infrastructure and your team. At Stitch Fix, we use Ruby on Rails for most of our applications and, fortunately enough, Rails provides a handy feature called application templates that allows you to script the creation of a new application with whatever boilerplate you need.
Like most good Rails developers, we use presenters at Stitch Fix. We typically implement them using delegation, but I’ve been finding that the time savings of this approach over just making a struct-like class are negligible, and that it results in code that’s harder to change and harder to use.
Stitch Fix is the third startup I’ve worked at, and the second where I joined at a fairly early stage. So far, we’ve been able to avoid creating a single monolithic application, and have been consistently delivering and deploying solutions to our users. I believe this is because we’ve developed a set of “super powers” which have been extremely helpful, and I believe these powers will keep our team and technology healthy for years to come.
An old post from Linus Torvalds about using git popped up on Hacker News today, and reminded me of the complex gitflow workflow, as well as Scott Chacon’s post on “Github Flow”, both of which are excellent reads.
At Stitch Fix, we outsource pretty much all of our hosting and technical needs to Heroku or their add-ons. Given where we are now as a company, it makes total sense: we don’t need to hire an admin, we don’t need to administer actual boxes, and we can easily add/remove/change our technical infrastructure. If you are a small startup and you are messing with Linode slices, you are probably wasting time.
Tonight Jon Dean will be giving a lightning talk at the Pittsburgh Ruby User Group on why business logic in Rails applications should be moved out of the model and into pure Ruby classes. If you’re in the Pittsburgh area be sure to stop by and introduce yourself!
Jon Dean spoke at Magma Conf 2013 about Inter-Application Communication and RESTful APIs for a Service Oriented Architecture. The talk was intended to provide a basic overview of a few reasons to create a Service Architecture for your platform and some basic starting code for those just beginning to look into it. Take a look here for more information and the slides.