Summary:
A product designer’s hustle is commonly recognised as the ability to deliver big releases or launch features that hopefully bring in revenue for the business. However, being able to spot and grab opportunities to conduct quick validation tests is also an important strategic skill. Rapid validation exercises, executed opportunistically, increase the chances that new product releases meet the user needs identified through earlier studies. This in turn helps to meet organisational goals faster. Yet more often than not, the effort that goes into these tests goes unnoticed, even though it can make or break the product’s perceived usability.

I’m going to show you a behind-the-screens moment in a typical-ish day as a product designer at Ninja Van. Take a glimpse at our process, where we opportunistically ran quick tests whilst juggling tight timelines for a major release 🤹.

[1] Introduction: The design problem at hand

As the company grew, we found ourselves submerged in the hustle and bustle of designing for complex problems, with little time for testing existing solutions. One example is how we needed to adapt an existing core feature - the air waybill. The design problem was that we needed to accommodate additional functional information in a small amount of real estate. On top of that, we only had one week to settle on the design recommendations.

A little bit of context: An air waybill is like a parcel’s identity card 🪪. Each air waybill has a unique string of numbers containing the parcel’s identification, also known as the tracking ID. This ID is significant in the delivery process as it helps our logistics personnel differentiate one parcel from another, and know where to deliver each parcel.

*An air waybill’s importance in our delivery process*

The aforementioned design problem stemmed from business requirements to address new customers’ needs. For more context, the existing (and inefficient) measures used unique red stickers to carry the additional functional information without touching the air waybill. Each sticker consisted of:

An empty text field for denoting parent-child parcel numbering, for sets of parcels that should move together in the logistics process.
An empty text field for manually writing in the child parcel’s tracking ID.

Our Ninjas printed out the red stickers and distributed them to our new customers manually. In turn, our customers had to print out our existing air waybill and paste both the air waybill and the red sticker onto their parcels. However, customers tended to forget to write the tracking ID and the parent-child parcel numbering on the red stickers. This caused a slew of administrative problems when they handed over the parcels to us for delivery. Hence, we needed a better design.

*Existing measures using red stickers and their problems*

Due to the air waybill’s significance, it was clear that our design changes to the air waybill could have a detrimental impact on the logistical processes down the line if we were not careful. The clock was ticking, and conventional research methods promised insights too late to be of use. We knew we needed to at least weed out basic usability problems before the new design reached the development stage.

[2] The opportunity for quick research

There came a narrow window of opportunity, in which we squeezed in a round of guerrilla usability testing on our new air waybill designs. It was the quickest way to obtain actionable insights on the usability of the design, and to narrow down our design recommendations in time for delivery.

“Guerrilla usability testing should be performed when your project calls for quick and cheap testing, such as when you need to validate design assumptions early in the project life cycle, or in a project with a low budget.”

— Usability Geek, 2020

[3] Setting the stage for success

Through the guerrilla usability testing sessions, we set out to:

Evaluate the new functional information’s legibility on the air waybills.
Discover parent-child categorisation patterns fuelled by basic human intuition and interpretation of information, when testers were tasked with sorting the air waybills.
Evaluate if the testers were successful in identifying and grouping air waybills into correct sets of child parcels.

We wanted to test two key changes to existing air waybill formats:

We enlarged the last 4 digits of the tracking ID and positioned them at the upper right hand corner for higher visibility.

❓Why: Through past research, we found that many of our user groups utilised these numbers for quick parcel identification, in different parts of our delivery process. By making these numbers a focal point, we hypothesised that it would help our users to distinguish the parcels and take quicker actions.

We introduced a “sequence number / total number of child parcels in a set” indicator, for quick child-parcel identification and classification.

❓Why: We hypothesised that moving this information from the red stickers onto the air waybills would help our users quickly group related child parcels together.

*Information architectural changes to the air waybill design*

[4] Preparing the research tools and people

Our toolkit was simple:

First, the air waybills were printed out in A5 and A6 sizes - commonly used print sizes by our shippers. Conceptually, each air waybill represented a physical parcel.

Second, we pulled in some developers and product managers in the office to participate as testers 🧑‍💻. These participants had a spectrum of prior knowledge of the business context (from heavily involved to not involved at all). We were conscious that we were testing with a non-target user base. However, we prioritised weeding out basic usability errors first, before testing the iterated designs with actual users (sorters) in our warehouses.

Third, there was no booking of labs or scheduling of sessions weeks in advance. Instead, we laid out the printed air waybills on the biggest table we could find, in a randomised order.

*Visualisation of how we spread out the air waybills*

Fourth, we wrote instructions on a board for the testers to read through before they began the test. The instructions were intentionally vague, which forced the testers to group the air waybills based on their intuition.

*A photo of the instructions written for our testers*

Last but not least, as the testers finished their first attempt at categorising the air waybills, we gave more context and instructions to those who had zero context about the business problem. They were also given a chance to review and re-categorise their air waybills with their renewed understanding of the context.

[5] Diving into guerrilla usability testing

Since we rarely ran guerrilla tests in the office, many of our colleagues were intrigued by the setup. Amid the clacking of keyboards and muffled voices of Google Meet calls, our testing unfolded.

It was like watching live performances by the testers. We could almost see the gears in their minds turning as they figured out the task. We recorded unscripted reactions from the testers: they were first baffled by the vagueness of the instructions, but quickly pivoted to visually identifying patterns on the air waybills. There was an undeniable thrill when each tester eventually hit the eureka moment that helped them complete the task.

In each session, thorough notes were taken. We began to observe patterns emerging in the testers’ interactions with the air waybills.

A peek behind the screens 👀

Video compilation of the various interactions our testers had with the guerrilla usability test!

[6] The insights and learnings

Our testers looked out for common customer or parcel information to distinguish the air waybills efficiently. They first grouped together the air waybills with the same total number of child parcels. Then, they further divided the child parcels into each dedicated parent set, based on either the last 4 digits of the tracking ID or the customer address information.

*Visualisation of common groupings of air waybills*
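
For readers who think in code, the testers’ two-round approach can be sketched roughly as below. This is only an illustration of the behaviour we observed, not anything we built: the `Waybill` fields and the `two_round_sort` function are hypothetical names, and the sketch assumes the customer address is the second-round key (some testers used the last 4 digits of the tracking ID instead).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Waybill:
    # Hypothetical fields for illustration; not the real air waybill schema.
    tracking_id: str
    sequence: int  # position of this child parcel within its set
    total: int     # total number of child parcels in the set
    address: str   # customer address printed on the waybill


def two_round_sort(waybills):
    """Group waybills in two rounds: by set size, then by customer address."""
    # Round 1: pile together waybills that show the same total.
    piles = defaultdict(list)
    for wb in waybills:
        piles[wb.total].append(wb)

    # Round 2: within each pile, split into sets by customer address.
    sets = defaultdict(list)
    for total, pile in piles.items():
        for wb in pile:
            sets[(total, wb.address)].append(wb)
    return dict(sets)
```

Splitting the work into two passes mirrors what the testers did on the table: a coarse, easy-to-scan key first, then a finer key inside each pile.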

As we conversed organically, we discovered the decision-making process and strategy behind each categorisation approach. These mental models gave us an insight into how humans intuitively group complex sets of information using pattern identification 🧩.

The insight: Our information architecture changes worked well in giving a clear indication that a sorting pattern exists. The testers pointed out the information in the upper right hand corner as the first thing they referred to when sorting the air waybills. The additional business context given after their first attempt helped the testers sort the air waybills in the expected order, which indicated that training was important to set the tone for our real users.

“I don’t need to care about the sequence of the child parcels (air waybills), as long as I know these X belong and will move together to the same customer based on the customer address.”

— Dev A

Furthermore, the testers spontaneously adopted procedural decision-making strategies - they went through two rounds of sorting to reduce cognitive load 🧠.

The insight: We believed that this decision-making strategy would carry over to how the actual users work. As such, we made it a point to look out for similar behavioural patterns when we eventually tested with real users.

“(First round) I will group them into piles first - like piles of 5, piles of 3… this little pile here, I assume is individual parcels… (Second round) Actually at this point it will just go by the tracking ID.”

— Designer B

🥸 It wasn’t all serious faces; we kept the test sessions as engaging and enjoyable as possible. Humour found its way into our tests which continuously sent the testers and facilitators into uproars of laughter, drawing in wandering eyes from other colleagues in the same space.

Exasperated quotes: “This is difficult yo, it’s so small… sigh (bends down to zoom into the waybills)”

— Dev C

Our testers also suggested ways to improve our design. One suggestion drew on Amazon’s use of unique English phrases to give parcel deliveries a unique identification. We could also explore using unique symbols as quick visual cues for better classification of air waybills.

[7] Iterating and testing with real users

Armed with the learnings from the guerrilla usability tests, we were confident in our design’s fundamental usability. We iterated on the designs and subsequently tested the air waybills in the warehouses with the actual users. Our changes proved to be beneficial in helping our users quickly identify and categorise the air waybills and, in turn, group parcels into the correct deliveries. Hooray! 🦄

*The final design and other rejected iterations*

Results of the two tests:

Our users cited greater efficiency in sorting the parcels, as they did not need to rely on the problematic red stickers for parent-child parcel information.
Our initial hypothesis was sound - the enlarged last 4 digits of the tracking ID, positioned at the upper right hand corner, helped the actual users visually distinguish between the parcels that required special attention and normal parcels.
Even though hand-held scanners were commonly used to scan the parcels’ barcodes to obtain more thorough parcel information, the quick visual cue on the air waybill gave our users a faster way to identify child parcels.
The sequence number and total number of child parcels helped the users compile complete sets and set aside incomplete sets for separate follow up processes.
As a result, quicker actions can be taken to rectify or trace child parcels that might be missing or damaged along the way.
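
To make the last two points concrete, here is a minimal sketch of how the sequence number and total number let someone spot an incomplete set. It is an illustration only: `split_complete_sets` and the set identifiers are hypothetical, and it assumes sequence numbers run from 1 up to the total.

```python
def split_complete_sets(sets):
    """Separate complete sets from incomplete ones using sequence/total.

    `sets` maps a hypothetical set identifier to the list of
    (sequence, total) pairs read off the waybills found for that set.
    """
    complete, incomplete = {}, {}
    for set_id, labels in sets.items():
        total = labels[0][1]  # every waybill in a set shows the same total
        seen = {sequence for sequence, _ in labels}
        # Assumption: sequence numbers run from 1 to total inclusive.
        missing = sorted(set(range(1, total + 1)) - seen)
        if missing:
            incomplete[set_id] = missing  # which child parcels to trace
        else:
            complete[set_id] = sorted(seen)
    return complete, incomplete
```

For instance, a set whose waybills read 1 of 3 and 3 of 3 would land in `incomplete` with parcel 2 flagged for follow-up, which is the kind of check a sorter can now do by eye.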

[8] Reflection

Reflecting on our guerrilla usability testing journey, we learned that innovation can thrive even under constraints. By grasping the opportunity to obtain immediate feedback, we managed to take a small step forward towards creating better user experiences.

The experience also benefited team dynamics by bringing non-product folks into our design process. Inclusivity in the design and research process has encouraged more collaborative efforts in cross-functional teams.

Even those who did not participate in the test physically were able to lurk on the sidelines and peek over when our laughter and fluster piqued their curiosity once in a while 😶‍🌫️.

At the end of the day, seizing time to take incremental steps in testing helped us design better for our users and, in turn, achieve organisational goals faster.