Software services

Asynchronous Design Critique: Giving Feedback

Feedback, in whichever form it takes, and whatever it may be called, is one of the most effective soft skills that we have at our disposal to collaboratively get our designs to a better place while growing our own skills and perspectives.

Feedback is also one of the most underestimated tools, and often by assuming that we’re already good at it, we settle, forgetting that it’s a skill that can be trained, grown, and improved. Poor feedback can create confusion in projects, bring down morale, and affect trust and team collaboration over the long term. Quality feedback can be a transformative force. 

Practicing our skills is surely a good way to improve, but the learning gets even faster when it’s paired with a good foundation that channels and focuses the practice. What are some foundational aspects of giving good feedback? And how can feedback be adjusted for remote and distributed work environments? 

On the web, we can identify a long tradition of asynchronous feedback: from the early days of open source, code was shared and discussed on mailing lists. Today, developers engage on pull requests, designers comment in their favorite design tools, project managers and scrum masters exchange ideas on tickets, and so on.

Design critique is often the name used for a type of feedback that’s provided to make our work better, collaboratively. So it shares a lot of the principles with feedback in general, but it also has some differences.

The content

The foundation of every good critique is the feedback’s content, so that’s where we need to start. There are many models that you can use to shape your content. The one that I personally like best—because it’s clear and actionable—is this one from Lara Hogan.

While this equation is generally used to give feedback to people, it also fits really well in a design critique because it ultimately answers some of the core questions that we work on: What? Where? Why? How? Imagine that you’re giving some feedback about some design work that spans multiple screens, like an onboarding flow: there are some pages shown, a flow blueprint, and an outline of the decisions made. You spot something that could be improved. If you keep the three elements of the equation in mind, you’ll have a mental model that can help you be more precise and effective.

Here is a comment that could be given as a part of some feedback, and it might look reasonable at a first glance: it seems to superficially fulfill the elements in the equation. But does it?

Not sure about the buttons’ styles and hierarchy—it feels off. Can you change them?

Observation for design feedback doesn’t just mean pointing out which part of the interface your feedback refers to, but it also refers to offering a perspective that’s as specific as possible. Are you providing the user’s perspective? Your expert perspective? A business perspective? The project manager’s perspective? A first-time user’s perspective?

When I see these two buttons, I expect one to go forward and one to go back.

Impact is about the why. Just pointing out a UI element might sometimes be enough if the issue may be obvious, but more often than not, you should add an explanation of what you’re pointing out.

When I see these two buttons, I expect one to go forward and one to go back. But this is the only screen where this happens, as before we just used a single button and an “×” to close. This seems to be breaking the consistency in the flow.

The question approach is meant to provide open guidance by eliciting the critical thinking in the designer receiving the feedback. Notably, in Lara’s equation she provides a second approach: request, which instead provides guidance toward a specific solution. While that’s a viable option for feedback in general, for design critiques, in my experience, defaulting to the question approach usually reaches the best solutions because designers are generally more comfortable in being given an open space to explore.

The difference between the two can be exemplified with, for the question approach:

When I see these two buttons, I expect one to go forward and one to go back. But this is the only screen where this happens, as before we just used a single button and an “×” to close. This seems to be breaking the consistency in the flow. Would it make sense to unify them?

Or, for the request approach:

When I see these two buttons, I expect one to go forward and one to go back. But this is the only screen where this happens, as before we just used a single button and an “×” to close. This seems to be breaking the consistency in the flow. Let’s make sure that all screens have the same pair of forward and back buttons.

At this point in some situations, it might be useful to integrate with an extra why: why you consider the given suggestion to be better.

When I see these two buttons, I expect one to go forward and one to go back. But this is the only screen where this happens, as before we just used a single button and an “×” to close. This seems to be breaking the consistency in the flow. Let’s make sure that all screens have the same two forward and back buttons so that users don’t get confused.

Choosing the question approach or the request approach can also at times be a matter of personal preference. A while ago, I was putting a lot of effort into improving my feedback: I did rounds of anonymous feedback, and I reviewed feedback with other people. After a few rounds of this work and a year later, I got a positive response: my feedback came across as effective and grounded. Until I changed teams. To my shock, my next round of feedback from one specific person wasn’t that great. The reason is that I had previously tried not to be prescriptive in my advice—because the people who I was previously working with preferred the open-ended question format over the request style of suggestions. But now in this other team, there was one person who instead preferred specific guidance. So I adapted my feedback for them to include requests.

One comment that I heard come up a few times is that this kind of feedback is quite long, and it doesn’t seem very efficient. No… but also yes. Let’s explore both sides.

No, this style of feedback is actually efficient because the length here is a byproduct of clarity, and spending time giving this kind of feedback can provide exactly enough information for a good fix. Also if we zoom out, it can reduce future back-and-forth conversations and misunderstandings, improving the overall efficiency and effectiveness of collaboration beyond the single comment. Imagine that in the example above the feedback were instead just, “Let’s make sure that all screens have the same two forward and back buttons.” The designer receiving this feedback wouldn’t have much to go by, so they might just apply the change. In later iterations, the interface might change or they might introduce new features—and maybe that change might not make sense anymore. Without the why, the designer might imagine that the change is about consistency… but what if it wasn’t? So there could now be an underlying concern that changing the buttons would be perceived as a regression.

Yes, this style of feedback is not always efficient because the points in some comments don’t always need to be exhaustive, sometimes because certain changes may be obvious (“The font used doesn’t follow our guidelines”) and sometimes because the team may have a lot of internal knowledge such that some of the whys may be implied.

So the equation above isn’t meant to suggest a strict template for feedback but a mnemonic to reflect and improve the practice. Even after years of active work on my critiques, I still from time to time go back to this formula and reflect on whether what I just wrote is effective.

The tone

Well-grounded content is the foundation of feedback, but that’s not really enough. The soft skills of the person who’s providing the critique can multiply the likelihood that the feedback will be well received and understood. Tone alone can make the difference between content that’s rejected or welcomed, and it’s been demonstrated that only positive feedback creates sustained change in people.

Since our goal is to be understood and to have a positive working environment, tone is essential to work on. Over the years, I’ve tried to summarize the required soft skills in a formula that mirrors the one for content: the receptivity equation.

Respectful feedback comes across as grounded, solid, and constructive. It’s the kind of feedback that, whether it’s positive or negative, is perceived as useful and fair.

Timing refers to when the feedback happens. To-the-point feedback doesn’t have much hope of being well received if it’s given at the wrong time. Questioning the entire high-level information architecture of a new feature when it’s about to ship might still be relevant if that questioning highlights a major blocker that nobody saw, but it’s way more likely that those concerns will have to wait for a later rework. So in general, attune your feedback to the stage of the project. Early iteration? Late iteration? Polishing work in progress? These all have different needs. The right timing will make it more likely that your feedback will be well received.

Attitude is the equivalent of intent, and in the context of person-to-person feedback, it can be

Asynchronous Design Critique: Getting Feedback

“Any comment?” is probably one of the worst ways to ask for feedback. It’s vague and open ended, and it doesn’t provide any indication of what we’re looking for. Getting good feedback starts earlier than we might expect: it starts with the request. 

It might seem counterintuitive to start the process of receiving feedback with a question, but that makes sense if we realize that getting feedback can be thought of as a form of design research. In the same way that we wouldn’t do any research without the right questions to get the insights that we need, the best way to ask for feedback is also to craft sharp questions.

Design critique is not a one-shot process. Sure, any good feedback workflow continues until the project is finished, but this is particularly true for design because design work continues iteration after iteration, from a high level to the finest details. Each level needs its own set of questions.

And finally, as with any good research, we need to review what we got back, get to the core of its insights, and take action. Question, iteration, and review. Let’s look at each of those.

The question

Being open to feedback is essential, but we need to be precise about what we’re looking for. Just saying “Any comment?”, “What do you think?”, or “I’d love to get your opinion” at the end of a presentation—whether it’s in person, over video, or through a written post—is likely to get a number of varied opinions or, even worse, get everyone to follow the direction of the first person who speaks up. And then… we get frustrated because vague questions like those can turn a high-level flows review into people instead commenting on the borders of buttons. Which might be a hearty topic, so it might be hard at that point to redirect the team to the subject that you had wanted to focus on.

But how do we get into this situation? It’s a mix of factors. One is that we don’t usually consider asking as a part of the feedback process. Another is how natural it is to just leave the question implied, expecting the others to be on the same page. Another is that in nonprofessional discussions, there’s often no need to be that precise. In short, we tend to underestimate the importance of the questions, so we don’t work on improving them.

The act of asking good questions guides and focuses the critique. It’s also a form of consent: it makes it clear that you’re open to comments and what kind of comments you’d like to get. It puts people in the right mental state, especially in situations when they weren’t expecting to give feedback.

There isn’t a single best way to ask for feedback. It just needs to be specific, and specificity can take many shapes. A model for design critique that I’ve found particularly useful in my coaching is the one of stage versus depth.

Stage” refers to each of the steps of the process—in our case, the design process. In progressing from user research to the final design, the kind of feedback evolves. But within a single step, one might still review whether some assumptions are correct and whether there’s been a proper translation of the amassed feedback into updated designs as the project has evolved. A starting point for potential questions could derive from the layers of user experience. What do you want to know: Project objectives? User needs? Functionality? Content? Interaction design? Information architecture? UI design? Navigation design? Visual design? Branding?

Here’re a few example questions that are precise and to the point that refer to different layers:

  • Functionality: Is automating account creation desirable?
  • Interaction design: Take a look through the updated flow and let me know whether you see any steps or error states that I might’ve missed.
  • Information architecture: We have two competing bits of information on this page. Is the structure effective in communicating them both?
  • UI design: What are your thoughts on the error counter at the top of the page that makes sure that you see the next error, even if the error is out of the viewport? 
  • Navigation design: From research, we identified these second-level navigation items, but once you’re on the page, the list feels too long and hard to navigate. Are there any suggestions to address this?
  • Visual design: Are the sticky notifications in the bottom-right corner visible enough?

The other axis of specificity is about how deep you’d like to go on what’s being presented. For example, we might have introduced a new end-to-end flow, but there was a specific view that you found particularly challenging and you’d like a detailed review of that. This can be especially useful from one iteration to the next where it’s important to highlight the parts that have changed.

There are other things that we can consider when we want to achieve more specific—and more effective—questions.

A simple trick is to remove generic qualifiers from your questions like “good,” “well,” “nice,” “bad,” “okay,” and “cool.” For example, asking, “When the block opens and the buttons appear, is this interaction good?” might look specific, but you can spot the “good” qualifier, and convert it to an even better question: “When the block opens and the buttons appear, is it clear what the next action is?”

Sometimes we actually do want broad feedback. That’s rare, but it can happen. In that sense, you might still make it explicit that you’re looking for a wide range of opinions, whether at a high level or with details. Or maybe just say, “At first glance, what do you think?” so that it’s clear that what you’re asking is open ended but focused on someone’s impression after their first five seconds of looking at it.

Sometimes the project is particularly expansive, and some areas may have already been explored in detail. In these situations, it might be useful to explicitly say that some parts are already locked in and aren’t open to feedback. It’s not something that I’d recommend in general, but I’ve found it useful to avoid falling again into rabbit holes of the sort that might lead to further refinement but aren’t what’s most important right now.

Asking specific questions can completely change the quality of the feedback that you receive. People with less refined critique skills will now be able to offer more actionable feedback, and even expert designers will welcome the clarity and efficiency that comes from focusing only on what’s needed. It can save a lot of time and frustration.

The iteration

Design iterations are probably the most visible part of the design work, and they provide a natural checkpoint for feedback. Yet a lot of design tools with inline commenting tend to show changes as a single fluid stream in the same file, and those types of design tools make conversations disappear once they’re resolved, update shared UI components automatically, and compel designs to always show the latest version—unless these would-be helpful features were to be manually turned off. The implied goal that these design tools seem to have is to arrive at just one final copy with all discussions closed, probably because they inherited patterns from how written documents are collaboratively edited. That’s probably not the best way to approach design critiques, but even if I don’t want to be too prescriptive here: that could work for some teams.

The asynchronous design-critique approach that I find most effective is to create explicit checkpoints for discussion. I’m going to use the term iteration post for this. It refers to a write-up or presentation of the design iteration followed by a discussion thread of some kind. Any platform that can accommodate this structure can use this. By the way, when I refer to a “write-up or presentation,” I’m including video recordings or other media too: as long as it’s asynchronous, it works.

Using iteration posts has many advantages:

  • It creates a rhythm in the design work so that the designer can review feedback from each iteration and prepare for the next.
  • It makes decisions visible for future review, and conversations are likewise always available.
  • It creates a record of how the design changed over time.
  • Depending on the tool, it might also make it easier to collect feedback and act on it.

These posts of course don’t mean that no other feedback approach should be used, just that iteration posts could be the primary rhythm for a remote design team to use. And other feedback approaches (such as live critique, pair designing, or inline comments) can build from there.

I don’t think there’s a standard format for iteration posts. But there are a few high-level elements that make sense to include as a baseline:

  1. The goal
  2. The design
  3. The list of changes
  4. The questions

Each project is likely to have a goal, and hopefully it’s something that’s already been summarized in a single sentence somewhere else, such as the client brief, the product manager’s outline, or the project owner’s request. So this is something that I’d repeat in every iteration post—literally copy and pasting it. The idea is to provide context and to repeat what’s essential to make each iteration post complete so that there’s no need to find information spread across multiple posts. If I want to know about the latest design, the latest iteration post will have all that I need.

This copy-and-paste part introduces another relevant concept: alignment comes from repetition. So having posts that repeat information is actually very effective toward making sure that everyone is on the same page.

The design is then the actual series of information-architecture outlines, diagrams, flows, maps, wireframes, screens, visuals, and any o

Designing for the Unexpected

I’m not sure when I first heard this quote, but it’s something that has stayed with me over the years. How do you create services for situations you can’t imagine? Or design products that work on devices yet to be invented?

Flash, Photoshop, and responsive design

When I first started designing websites, my go-to software was Photoshop. I created a 960px canvas and set about creating a layout that I would later drop content in. The development phase was about attaining pixel-perfect accuracy using fixed widths, fixed heights, and absolute positioning.

Ethan Marcotte’s talk at An Event Apart and subsequent article “Responsive Web Design” in A List Apart in 2010 changed all this. I was sold on responsive design as soon as I heard about it, but I was also terrified. The pixel-perfect designs full of magic numbers that I had previously prided myself on producing were no longer good enough.

The fear wasn’t helped by my first experience with responsive design. My first project was to take an existing fixed-width website and make it responsive. What I learned the hard way was that you can’t just add responsiveness at the end of a project. To create fluid layouts, you need to plan throughout the design phase.

A new way to design

Designing responsive or fluid sites has always been about removing limitations, producing content that can be viewed on any device. It relies on the use of percentage-based layouts, which I initially achieved with native CSS and utility classes:

.column-span-6 {
  width: 49%;
  float: left;
  margin-right: 0.5%;
  margin-left: 0.5%;
}


.column-span-4 {
  width: 32%;
  float: left;
  margin-right: 0.5%;
  margin-left: 0.5%;
}

.column-span-3 {
  width: 24%;
  float: left;
  margin-right: 0.5%;
  margin-left: 0.5%;
}

Then with Sass so I could take advantage of @includes to re-use repeated blocks of code and move back to more semantic markup:

.logo {
  @include colSpan(6);
}

.search {
  @include colSpan(3);
}

.social-share {
  @include colSpan(3);
}

Media queries

The second ingredient for responsive design is media queries. Without them, content would shrink to fit the available space regardless of whether that content remained readable (The exact opposite problem occurred with the introduction of a mobile-first approach).

Media queries prevented this by allowing us to add breakpoints where the design could adapt. Like most people, I started out with three breakpoints: one for desktop, one for tablets, and one for mobile. Over the years, I added more and more for phablets, wide screens, and so on. 

For years, I happily worked this way and improved both my design and front-end skills in the process. The only problem I encountered was making changes to content, since with our Sass grid system in place, there was no way for the site owners to add content without amending the markup—something a small business owner might struggle with. This is because each row in the grid was defined using a div as a container. Adding content meant creating new row markup, which requires a level of HTML knowledge.

Row markup was a staple of early responsive design, present in all the widely used frameworks like Bootstrap and Skeleton.

1 of 7
2 of 7
3 of 7
4 of 7
5 of 7
6 of 7
7 of 7

Another problem arose as I moved from a design agency building websites for small- to medium-sized businesses, to larger in-house teams where I worked across a suite of related sites. In those roles I started to work much more with reusable components. 

Our reliance on media queries resulted in components that were tied to common viewport sizes. If the goal of component libraries is reuse, then this is a real problem because you can only use these components if the devices you’re designing for correspond to the viewport sizes used in the pattern library—in the process not really hitting that “devices that don’t yet exist”  goal.

Then there’s the problem of space. Media queries allow components to adapt based on the viewport size, but what if I put a component into a sidebar, like in the figure below?

Container queries: our savior or a false dawn?

Container queries have long been touted as an improvement upon media queries, but at the time of writing are unsupported in most browsers. There are JavaScript workarounds, but they can create dependency and compatibility issues. The basic theory underlying container queries is that elements should change based on the size of their parent container and not the viewport width, as seen in the following illustrations.

One of the biggest arguments in favor of container queries is that they help us create components or design patterns that are truly reusable because they can be picked up and placed anywhere in a layout. This is an important step in moving toward a form of component-based design that works at any size on any device.

In other words, responsive components to replace responsive layouts.

Container queries will help us move from designing pages that respond to the browser or device size to designing components that can be placed in a sidebar or in the main content, and respond accordingly.

My concern is that we are still using layout to determine when a design needs to adapt. This approach will always be restrictive, as we will still need pre-defined breakpoints. For this reason, my main question with container queries is, How would we decide when to change the CSS used by a component? 

A component library removed from context and real content is probably not the best place for that decision. 

As the diagrams below illustrate, we can use container queries to create designs for specific container widths, but what if I want to change the design based on the image size or ratio?

In this example, the dimensions of the container are not what should dictate the design; rather, the image is.

It’s hard to say for sure whether container queries will be a success story until we have solid cross-browser support for them. Responsive component libraries would definitely evolve how we design and would improve the possibilities for reuse and design at scale. But maybe we will always need to adjust these components to suit our content.

CSS is changing

Whilst the container query debate rumbles on, there have been numerous advances in CSS that change the way we think about design. The days of fixed-width elements measured in pixels and floated div elements used to cobble layouts together are long gone, consigned to history along with table layouts. Flexbox and CSS Grid have revolutionized layouts for the web. We can now create elements that wrap onto new rows when they run out of space, not when the device changes.

.wrapper {
  display: grid;
  grid-template-columns: repeat(auto-fit, 450px);
  gap: 10px;
}

The repeat() function paired with auto-fit or auto-fill allows us to specify how much space each column should use while leaving it up to the browser to decide when to spill the columns onto a new line. Similar things can be achieved with Flexbox, as elements can wrap over multiple rows and “flex” to fill available space. 

.wrapper {
  display: flex;
  flex-wrap: wrap;
  justify-content: space-between;
}

.child {
  flex-basis: 32%;
  margin-bottom: 20px;
}

The biggest benefit of all this is you don’t need to wrap elements in container rows. Without rows, content isn’t tied to page markup in quite the same way, allowing for removals or additions of content without additional development.

This is a big step forward when it comes to creating designs that allow for evolving content, but the real game changer for flexible designs is CSS Subgrid. 

Remember the days of crafting perfectly alig

Voice Content and Usability

We’ve been having conversations for thousands of years. Whether to convey information, conduct transactions, or simply to check in on one another, people have yammered away, chattering and gesticulating, through spoken conversation for countless generations. Only in the last few millennia have we begun to commit our conversations to writing, and only in the last few decades have we begun to outsource them to the computer, a machine that shows much more affinity for written correspondence than for the slangy vagaries of spoken language.

Computers have trouble because between spoken and written language, speech is more primordial. To have successful conversations with us, machines must grapple with the messiness of human speech: the disfluencies and pauses, the gestures and body language, and the variations in word choice and spoken dialect that can stymie even the most carefully crafted human-computer interaction. In the human-to-human scenario, spoken language also has the privilege of face-to-face contact, where we can readily interpret nonverbal social cues.

In contrast, written language immediately concretizes as we commit it to record and retains usages long after they become obsolete in spoken communication (the salutation “To whom it may concern,” for example), generating its own fossil record of outdated terms and phrases. Because it tends to be more consistent, polished, and formal, written text is fundamentally much easier for machines to parse and understand.

Spoken language has no such luxury. Besides the nonverbal cues that decorate conversations with emphasis and emotional context, there are also verbal cues and vocal behaviors that modulate conversation in nuanced ways: how something is said, not what. Whether rapid-fire, low-pitched, or high-decibel, whether sarcastic, stilted, or sighing, our spoken language conveys much more than the written word could ever muster. So when it comes to voice interfaces—the machines we conduct spoken conversations with—we face exciting challenges as designers and content strategists.

Voice Interactions

We interact with voice interfaces for a variety of reasons, but according to Michael McTear, Zoraida Callejas, and David Griol in The Conversational Interface, those motivations by and large mirror the reasons we initiate conversations with other people, too (). Generally, we start up a conversation because:

  • we need something done (such as a transaction),
  • we want to know something (information of some sort), or
  • we are social beings and want someone to talk to (conversation for conversation’s sake).

These three categories—which I call transactional, informational, and prosocial—also characterize essentially every voice interaction: a single conversation from beginning to end that realizes some outcome for the user, starting with the voice interface’s first greeting and ending with the user exiting the interface. Note here that a conversation in our human sense—a chat between people that leads to some result and lasts an arbitrary length of time—could encompass multiple transactional, informational, and prosocial voice interactions in succession. In other words, a voice interaction is a conversation, but a conversation is not necessarily a single voice interaction.

Purely prosocial conversations are more gimmicky than captivating in most voice interfaces, because machines don’t yet have the capacity to really want to know how we’re doing and to do the sort of glad-handing humans crave. There’s also ongoing debate as to whether users actually prefer the sort of organic human conversation that begins with a prosocial voice interaction and shifts seamlessly into other types. In fact, in Voice User Interface Design, Michael Cohen, James Giangola, and Jennifer Balogh recommend sticking to users’ expectations by mimicking how they interact with other voice interfaces rather than trying too hard to be human—potentially alienating them in the process ().

That leaves two genres of conversations we can have with one another that a voice interface can easily have with us, too: a transactional voice interaction realizing some outcome (“buy iced tea”) and an informational voice interaction teaching us something new (“discuss a musical”).

Transactional voice interactions

Unless you’re tapping buttons on a food delivery app, you’re generally having a conversation—and therefore a voice interaction—when you order a Hawaiian pizza with extra pineapple. Even when we walk up to the counter and place an order, the conversation quickly pivots from an initial smattering of neighborly small talk to the real mission at hand: ordering a pizza (generously topped with pineapple, as it should be).

Alison: Hey, how’s it going?

Burhan: Hi, welcome to Crust Deluxe! It’s cold out there. How can I help you?

Alison: Can I get a Hawaiian pizza with extra pineapple?

Burhan: Sure, what size?

Alison: Large.

Burhan: Anything else?

Alison: No thanks, that’s it.

Burhan: Something to drink?

Alison: I’ll have a bottle of Coke.

Burhan: You got it. That’ll be $13.55 and about fifteen minutes.

Each progressive disclosure in this transactional conversation reveals more and more of the desired outcome of the transaction: a service rendered or a product delivered. Transactional conversations have certain key traits: they’re direct, to the point, and economical. They quickly dispense with pleasantries.

Informational voice interactions

Meanwhile, some conversations are primarily about obtaining information. Though Alison might visit Crust Deluxe with the sole purpose of placing an order, she might not actually want to walk out with a pizza at all. She might be just as interested in whether they serve halal or kosher dishes, gluten-free options, or something else. Here, though we again have a prosocial mini-conversation at the beginning to establish politeness, we’re after much more.

Alison: Hey, how’s it going?

Burhan: Hi, welcome to Crust Deluxe! It’s cold out there. How can I help you?

Alison: Can I ask a few questions?

Burhan: Of course! Go right ahead.

Alison: Do you have any halal options on the menu?

Burhan: Absolutely! We can make any pie halal by request. We also have lots of vegetarian, ovo-lacto, and vegan options. Are you thinking about any other dietary restrictions?

Alison: What about gluten-free pizzas?

Burhan: We can definitely do a gluten-free crust for you, no problem, for both our deep-dish and thin-crust pizzas. Anything else I can answer for you?

Alison: That’s it for now. Good to know. Thanks!

Burhan: Anytime, come back soon!

This is a very different dialogue. Here, the goal is to get a certain set of facts. Informational conversations are investigative quests for the truth—research expeditions to gather data, news, or facts. Voice interactions that are informational might be more long-winded than transactional conversations by necessity. Responses tend to be lengthier, more informative, and carefully communicated so the customer understands the key takeaways.

Voice Interfaces

At their core, voice interfaces employ speech to support users in reaching their goals. But simply because an interface has a voice component doesn’t mean that every user interaction with it is mediated through voice. Because multimodal voice interfaces can lean on visual components like screens as crutches, we’re most concerned in this book with pure voice interfaces, which depend entirely on spoken conversation, lack any visual component whatsoever, and are therefore much more nuanced and challenging to tackle.

Though voice interfaces have long been integral to the imagined future of humanity in science fiction, only recently have those lofty visions become fully realized in genuine voice interfaces.

Interactive voice response (IVR) systems

Though written conversational interfaces have been fixtures of computing for many decades, voice interfaces first emerged in the early 1990s with text-to-speech (TTS) dictation programs that recited written text aloud, as well as speech-enabled in-car systems that gave directions to a user-provided address. With the advent of interactive voice response (IVR) systems, intended as an alternative to overburdened customer service representatives, we became acquainted with the first true voice interfaces that engaged in authentic conversation.

IVR systems allowed organizations to reduce their reliance on call centers but soon became notorious for their clunkiness. Commonplace in the corporate world, these systems were primarily designed as metaphorical switchboards to guide customers to a real phone agent (“Say Reservations to book a flight or check an itinerary”); chances are you will enter a conversation with one when you call an airline or hotel conglomerate. Despite their functional issues and users’ frustration with their inability to speak to an actual human right away, IVR systems proliferated in the early 1990s across a variety of industries (, PDF).

While IVR systems are great for highly repetitive, monotonous conversations that generally don’t veer from a single format, they have a reputation for less scintillating conversation than we’re used to in real life (or even in science fiction).

Screen readers

Parallel to the evolution of IVR systems was the invention of the screen reader, a tool that transcribes visual content into synthesized speech. For Blind or visually impaired website users, it’s the predominant method of interacting with text, multimedia, or form elements. Screen readers represent perhaps the closest equivalent we have today to an out-of-the-box implementat

Sustainable Web Design, An Excerpt

In the 1950s, many in the elite running community had begun to believe it wasn’t possible to run a mile in less than four minutes. Runners had been attempting it since the late 19th century and were beginning to draw the conclusion that the human body simply wasn’t built for the task. 

But on May 6, 1956, Roger Bannister took everyone by surprise. It was a cold, wet day in Oxford, England—conditions no one expected to lend themselves to record-setting—and yet Bannister did just that, running a mile in 3:59.4 and becoming the first person in the record books to run a mile in under four minutes. 

This shift in the benchmark had profound effects; the world now knew that the four-minute mile was possible. Bannister’s record lasted only forty-six days, when it was snatched away by Australian runner John Landy. Then a year later, three runners all beat the four-minute barrier together in the same race. Since then, over 1,400 runners have officially run a mile in under four minutes; the current record is 3:43.13, held by Moroccan athlete Hicham El Guerrouj.

We achieve far more when we believe that something is possible, and we will believe it’s possible only when we see someone else has already done it—and as with human running speed, so it is with what we believe are the hard limits for how a website needs to perform.

Establishing standards for a sustainable web

In most major industries, the key metrics of environmental performance are fairly well established, such as miles per gallon for cars or energy per square meter for homes. The tools and methods for calculating those metrics are standardized as well, which keeps everyone on the same page when doing environmental assessments. In the world of websites and apps, however, we aren’t held to any particular environmental standards, and only recently have gained the tools and methods we need to even make an environmental assessment.

The primary goal in sustainable web design is to reduce carbon emissions. However, it’s almost impossible to actually measure the amount of CO2 produced by a web product. We can’t measure the fumes coming out of the exhaust pipes on our laptops. The emissions of our websites are far away, out of sight and out of mind, coming out of power stations burning coal and gas. We have no way to trace the electrons from a website or app back to the power station where the electricity is being generated and actually know the exact amount of greenhouse gas produced. So what do we do? 

If we can’t measure the actual carbon emissions, then we need to find what we can measure. The primary factors that could be used as indicators of carbon emissions are:

  1. Data transfer 
  2. Carbon intensity of electricity

Let’s take a look at how we can use these metrics to quantify the energy consumption, and in turn the carbon footprint, of the websites and web apps we create.

Data transfer

Most researchers use kilowatt-hours per gigabyte (kWh/GB) as a metric of energy efficiency when measuring the amount of data transferred over the internet when a website or application is used. This provides a great reference point for energy consumption and carbon emissions. As a rule of thumb, the more data transferred, the more energy used in the data center, telecoms networks, and end user devices.

For web pages, data transfer for a single visit can be most easily estimated by measuring the page weight, meaning the transfer size of the page in kilobytes the first time someone visits the page. It’s fairly easy to measure using the developer tools in any modern web browser. Often your web hosting account will include statistics for the total data transfer of any web application (Fig 2.1).

The nice thing about page weight as a metric is that it allows us to compare the efficiency of web pages on a level playing field without confusing the issue with constantly changing traffic volumes. 

Reducing page weight requires a large scope. By early 2020, the median page weight was 1.97 MB for setups the HTTP Archive classifies as “desktop” and 1.77 MB for “mobile,” with desktop increasing 36 percent since January 2016 and mobile page weights nearly doubling in the same period (Fig 2.2). Roughly half of this data transfer is image files, making images the single biggest source of carbon emissions on the average website. 

History clearly shows us that our web pages can be smaller, if only we set our minds to it. While most technologies become ever more energy efficient, including the underlying technology of the web such as data centers and transmission networks, websites themselves are a technology that becomes less efficient as time goes on.

You might be familiar with the concept of performance budgeting as a way of focusing a project team on creating faster user experiences. For example, we might specify that the website must load in a maximum of one second on a broadband connection and three seconds on a 3G connection. Much like speed limits while driving, performance budgets are upper limits rather than vague suggestions, so the goal should always be to come in under budget.

Designing for fast performance does often lead to reduced data transfer and emissions, but it isn’t always the case. Web performance is often more about the subjective perception of load times than it is about the true efficiency of the underlying system, whereas page weight and transfer size are more objective measures and more reliable benchmarks for sustainable web design. 

We can set a page weight budget in reference to a benchmark of industry averages, using data from sources like HTTP Archive. We can also benchmark page weight against competitors or the old version of the website we’re replacing. For example, we might set a maximum page weight budget as equal to our most efficient competitor, or we could set the benchmark lower to guarantee we are best in class. 

If we want to take it to the next level, then we could also start looking at the transfer size of our web pages for repeat visitors. Although page weight for the first time someone visits is the easiest thing to measure, and easy to compare on a like-for-like basis, we can learn even more if we start looking at transfer size in other scenarios too. For example, visitors who load the same page multiple times will likely have a high percentage of the files cached in their browser, meaning they don’t need to transfer all of the files on subsequent visits. Likewise, a visitor who navigates to new pages on the same website will likely not need to load the full page each time, as some global assets from areas like the header and footer may already be cached in their browser. Measuring transfer size at this next level of detail can help us learn even more about how we can optimize efficiency for users who regularly visit our pages, and enable us to set page weight budgets for additional scenarios beyond the first visit.

Page weight budgets are easy to track throughout a design and development process. Although they don’t actually tell us carbon emission and energy consumption analytics directly, they give us a clear indication of efficiency relative to other websites. And as transfer size is an effective analog for energy consumption, we can actually use it to estimate energy consumption too.

In summary, reduced data transfer translates to energy efficiency, a key factor to reducing carbon emissions of web products. The more efficient our products, the less electricity they use, and the less fossil fuels need to be burned to produce the electricity to power them. But as we’ll see next, since all web products demand some power, it’s important to consider the source of that electricity, too.

Carbon intensity of electricity

Regardless of energy efficiency, the level of pollution caused by digital products depends on the carbon intensity of the energy being used to power them. Carbon intensity is a term used to define the grams of CO2 produced for every kilowatt-hour of electricity (gCO2/kWh). This varies widely, with renewable energy sources and nuclear having an extremely low carbon intensity of less than 10 gCO2/kWh (even when factoring in their construction); whereas fossil fuels have very high carbon intensity of approximately 200–400 gCO2/kWh. 

Most electricity comes from national or state grids, where energy from a variety of different sources is mixed together with varying levels of carbon intensity. The distributed nature of the internet means that a single user of a website or app might be using energy from multiple different grids simultaneously; a website user in Paris uses electricity from the French national grid to power their home internet and devices, but the website’s data center could be in Dallas, USA, pulling electricity from the Texas grid, while the telecoms networks use energy from everywhere between Dallas and Paris.

We don’t have control over the full energy supply of web services, but we do have some control over where we host our projects. With a data center using a significant proportion of the energy of any website, locating the data center in an area with low carbon energy will tangibly reduce its carbon emissions. Danish startup Tomorrow reports and maps this user-contributed data, and a glance at their map shows how, for example, choosing a data center in France will have significantly lower carbon emissions than a data center in the Netherlands (Fig 2.3).

Skip to content