
System Usability Scale: 10 Powerful Insights You Need Now

Ever wondered how to measure if a product is truly user-friendly? Enter the System Usability Scale (SUS) — a simple, reliable tool that reveals how usable your digital product really is.

What Is the System Usability Scale (SUS)?

The System Usability Scale, commonly known as SUS, is a 10-item questionnaire designed to evaluate the perceived usability of a system, product, or service. Developed in 1986 by John Brooke at Digital Equipment Corporation, SUS has become one of the most widely adopted usability assessment tools across industries — from software and websites to medical devices and mobile apps.

Origins and Development of SUS

The System Usability Scale was first introduced in 1986 during usability research at Digital Equipment Corporation. At the time, there was a growing need for a quick, standardized method to assess user experience without requiring complex lab setups or extensive time investment.

Brooke’s goal was to create a lightweight yet effective tool that could be applied across different types of systems, regardless of their function or interface. The result was a 10-question survey that could be administered quickly and scored objectively. Over the decades, SUS has stood the test of time due to its simplicity, reliability, and versatility.

Despite being developed before the modern web era, SUS remains relevant today because it focuses on overall usability perception rather than specific interface elements. This makes it adaptable to evolving technologies, including voice assistants, augmented reality interfaces, and AI-driven platforms.

Core Structure of the SUS Questionnaire

The System Usability Scale consists of 10 statements, each rated on a 5-point Likert scale ranging from “Strongly Disagree” to “Strongly Agree.” The questions alternate between positive and negative phrasing to reduce response bias.

Here are the standard SUS items:

  1. I think that I would like to use this system frequently.
  2. I found the system unnecessarily complex.
  3. I thought the system was easy to use.
  4. I think that I would need the support of a technical person to be able to use this system.
  5. I found the various functions in this system were well integrated.
  6. I thought there was too much inconsistency in this system.
  7. I would imagine that most people would learn to use this system very quickly.
  8. I found the system very cumbersome to use.
  9. I felt very confident using the system.
  10. I needed to learn a lot of things before I could get going with this system.

Notice how odd-numbered questions are positively worded (e.g., “easy to use”), while even-numbered ones are negatively worded (e.g., “unnecessarily complex”).

This counterbalancing helps minimize the effect of users who tend to agree with everything or disagree across the board.

“The beauty of the System Usability Scale lies in its simplicity — it doesn’t tell you *what* is wrong, but it tells you *that* something might be wrong.” — Jakob Nielsen, Nielsen Norman Group

Scoring and Interpretation of SUS Results

One of the most powerful aspects of the System Usability Scale is its straightforward scoring mechanism. While the questionnaire may look simple, the scoring process allows for meaningful quantitative analysis.

To calculate the SUS score:

  • For odd-numbered items: Subtract 1 from the user’s response (which ranges from 1 to 5).
  • For even-numbered items: Subtract the user’s response from 5.
  • Sum the converted values and multiply by 2.5 to get a final score between 0 and 100.

For example, if a user responds with a “4” on question 1 (positive), you compute: 4 – 1 = 3. If they respond with a “2” on question 2 (negative), you compute: 5 – 2 = 3. Repeat this for all 10 items, sum them, then multiply by 2.5.
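To make the arithmetic concrete, here is a minimal Python sketch of that scoring rule. It assumes responses arrive as a list of ten integers (1 to 5) in the standard item order; the function name is just illustrative.

```python
def sus_score(responses):
    """Compute a SUS score from one respondent's 10 answers (each 1-5).

    Items are assumed to be in the standard order: odd-numbered items
    are positively worded, even-numbered items negatively worded.
    """
    if len(responses) != 10 or any(r < 1 or r > 5 for r in responses):
        raise ValueError("Expected 10 responses, each between 1 and 5")

    total = 0
    for i, r in enumerate(responses, start=1):
        if i % 2 == 1:          # odd item: positive wording, contributes r - 1
            total += r - 1
        else:                   # even item: negative wording, contributes 5 - r
            total += 5 - r
    return total * 2.5

# Using the example above: a 4 on item 1 contributes 3, a 2 on item 2 contributes 3.
print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))  # -> 75.0
```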

The resulting score ranges from 0 (worst) to 100 (best). Based on large-scale analyses of SUS data, including the work of Bangor, Kortum, and Miller (2008), the average SUS score is approximately 68. Scores above 68 are considered above average, while those below it are below average.

Benchmarks based on qualitative interpretations include:

  • 90–100: Excellent
  • 80–89: Good
  • 70–79: Acceptable
  • 60–69: Poor
  • 50–59: Awkward
  • Below 50: Unacceptable

These benchmarks help teams contextualize their results and make informed decisions about whether a product meets usability expectations.
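If you want to label scores automatically, a small helper can encode the bands above. Note that these thresholds are the ones listed in this article; other sources use slightly different adjective scales, so treat the cut-offs as a convention rather than a standard.

```python
def sus_band(score):
    """Map a 0-100 SUS score to the qualitative bands listed above."""
    if score >= 90:
        return "Excellent"
    if score >= 80:
        return "Good"
    if score >= 70:
        return "Acceptable"
    if score >= 60:
        return "Poor"
    if score >= 50:
        return "Awkward"
    return "Unacceptable"

print(sus_band(75.0))  # -> "Acceptable"
```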

Why the System Usability Scale Is So Widely Used

The enduring popularity of the System Usability Scale isn’t accidental. Its widespread adoption stems from a combination of practical advantages that make it ideal for both academic research and real-world product development.

Speed and Efficiency in Usability Testing

One of the biggest strengths of the System Usability Scale is its brevity. Unlike full heuristic evaluations or in-depth cognitive walkthroughs, SUS takes less than 10 minutes to complete. This makes it easy to integrate into usability tests, beta programs, or post-task feedback sessions.

Because it’s so short, SUS can be used repeatedly throughout the design lifecycle — after wireframe testing, prototype validation, and final product release. Teams can track usability improvements over time with minimal disruption to user flow.

Its efficiency also makes it cost-effective. You don’t need specialized software or trained moderators to collect SUS data. A simple Google Form or embedded survey can yield reliable results from dozens or even hundreds of users.

Reliability and Validity Across Contexts

Despite its simplicity, the System Usability Scale has demonstrated strong psychometric properties. Numerous studies have confirmed its internal consistency (Cronbach’s alpha typically > 0.9), test-retest reliability, and construct validity.
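Cronbach’s alpha is not part of SUS itself, but if you want to check internal consistency on your own data, the standard formula is easy to compute. The sketch below assumes a NumPy array of converted item contributions (the 0-4 values produced by the odd/even adjustment described earlier), one row per respondent.

```python
import numpy as np

def cronbach_alpha(item_scores):
    """Cronbach's alpha for an (n_respondents x n_items) matrix.

    For SUS, pass the converted 0-4 item contributions so that
    all items point in the same direction.
    """
    items = np.asarray(item_scores, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)      # variance of each item across respondents
    total_variance = items.sum(axis=1).var(ddof=1)  # variance of respondents' total scores
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)
```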

Research published in the International Journal of Human-Computer Interaction shows that SUS performs consistently across diverse domains — including healthcare, finance, e-commerce, and education. Whether users are interacting with a mobile banking app or a hospital patient portal, SUS provides comparable, interpretable scores.

This cross-context reliability is rare among usability instruments. Many tools are tailored to specific industries or interaction styles, but SUS’s focus on general usability perception allows it to transcend these boundaries.

Universality and Language Adaptability

The System Usability Scale has been translated into over 30 languages, including Spanish, Chinese, Arabic, Russian, and Japanese. Many of these translations have undergone validation studies to check their cultural and linguistic accuracy.

This universality makes SUS an excellent choice for global product teams. If you’re launching a product in multiple regions, you can deploy the same validated questionnaire and compare results across markets.

Moreover, because SUS doesn’t reference specific UI components (like buttons or menus), it avoids localization pitfalls that plague more detailed usability checklists. The core concept of “ease of use” is universally understood, making SUS a truly global metric.

How to Administer the System Usability Scale Effectively

While the System Usability Scale is easy to use, administering it effectively requires attention to timing, context, and participant selection. Poor implementation can lead to misleading results, even if the tool itself is robust.

Best Practices for Timing and Context

When should you administer the SUS? The optimal moment is immediately after a user completes a set of representative tasks with your system.

For example, in a usability test where participants are asked to create an account, browse products, and make a purchase, the SUS should be given right after the last task. This ensures that their experience is fresh in memory, leading to more accurate responses.

Administering SUS too early (e.g., after only one task) may not reflect overall system usability. Administering it too late (e.g., a day after use) risks recall bias. Real-time or near-real-time collection is key.

If you’re using SUS in a longitudinal study (e.g., tracking usability over weeks), consider pairing it with shorter micro-surveys (like the Single Usability Metric) between full SUS assessments.

Selecting the Right Participants

The quality of your SUS data depends heavily on who completes it. Ideally, participants should represent your actual or target user base.

For consumer apps, this might mean recruiting individuals with varying levels of tech literacy. For enterprise software, you’d want domain experts (e.g., nurses for a hospital EMR system).

A common mistake is using only internal stakeholders (like developers or designers) to fill out SUS. While their input is valuable, they are not typical users and often overestimate usability due to familiarity.

Research suggests that even five users can uncover major usability issues (Nielsen, 1993), but for reliable SUS scoring, aim for at least 15–20 participants to achieve statistical stability. Larger samples (50+) allow for segmentation analysis (e.g., comparing SUS scores by age group or experience level).
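One way to gauge that statistical stability is to report a confidence interval around your mean SUS score rather than the mean alone. A rough sketch, assuming SciPy is available and using purely illustrative sample scores:

```python
import statistics
from math import sqrt
from scipy import stats

def sus_mean_ci(scores, confidence=0.95):
    """Mean SUS score with a t-based confidence interval."""
    n = len(scores)
    mean = statistics.mean(scores)
    sem = statistics.stdev(scores) / sqrt(n)          # standard error of the mean
    margin = stats.t.ppf((1 + confidence) / 2, df=n - 1) * sem
    return mean, mean - margin, mean + margin

# Hypothetical scores from 20 participants.
scores = [72.5, 65.0, 80.0, 77.5, 70.0, 62.5, 85.0, 75.0, 67.5, 72.5,
          60.0, 82.5, 70.0, 77.5, 65.0, 72.5, 80.0, 67.5, 75.0, 70.0]
print(sus_mean_ci(scores))
```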

Integrating SUS Into Your UX Research Workflow

The System Usability Scale shouldn’t be used in isolation. It works best when integrated into a broader user experience research strategy.

Consider combining SUS with:

  • Task success rates: Did users complete the task?
  • Time-on-task: How long did it take?
  • Qualitative feedback: What did users say during think-aloud sessions?
  • Net Promoter Score (NPS): Would they recommend the product?

For instance, a high SUS score paired with low task success indicates a disconnect between perceived and actual usability. Conversely, a low SUS score with high success might suggest users completed tasks but found the experience frustrating.
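A simple triage rule can make that pairing explicit. In the sketch below, the 68 cut-off is the commonly cited SUS average, while the 78% task-success threshold is an arbitrary illustration; tune both to your own benchmarks.

```python
def triage(sus_score, task_success_rate):
    """Flag disconnects between perceived (SUS) and actual (task success) usability.

    Thresholds are illustrative assumptions, not standards.
    """
    perceived_ok = sus_score >= 68
    actual_ok = task_success_rate >= 0.78
    if perceived_ok and not actual_ok:
        return "Users feel fine but fail tasks: check task flows and error states"
    if not perceived_ok and actual_ok:
        return "Tasks succeed but feel frustrating: look for friction and polish issues"
    if perceived_ok and actual_ok:
        return "Perceived and actual usability both look healthy"
    return "Both signals are weak: prioritise a deeper usability review"

print(triage(74.0, 0.55))
```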

Tools like MeasuringU offer SUS calculators and benchmarking databases to help interpret your results in context.

Comparing SUS With Other Usability Metrics

While the System Usability Scale is popular, it’s not the only usability assessment tool available. Understanding how SUS compares to alternatives helps you choose the right method for your goals.

SUS vs. SUPR-Q: Measuring Broader User Experience

The SUPR-Q (Standardized User Experience Percentile Rank Questionnaire) builds on SUS by measuring not just usability, but also credibility, loyalty, and appearance.

Where SUS gives you a single usability score, SUPR-Q provides four subscores that reflect different aspects of user experience. It’s particularly useful for websites where trust and visual appeal matter as much as functionality.

However, SUPR-Q requires a license and is less flexible than SUS, which is free to use. SUS remains the go-to for pure usability assessment, while SUPR-Q is better suited for holistic website evaluation.

Learn more about SUPR-Q at MeasuringU’s SUPR-Q page.

SUS vs. UMUX: A Simpler Alternative?

The UMUX (Usability Metric for User Experience) is a 4-item questionnaire designed to correlate with SUS while being much shorter. Its statements are keyed to the ISO 9241-11 definition of usability rather than to specific interface features.

While UMUX is faster to administer, it sacrifices some reliability and sensitivity compared to the full SUS. However, for quick pulse checks or mobile in-app surveys, UMUX can be a practical alternative.

Notably, UMUX-Lite — a 2-item version — correlates highly with SUS (r > 0.8), making it useful for large-scale monitoring where survey fatigue is a concern.
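If you collect both instruments from the same participants, checking the correlation on your own data is straightforward. The paired values below are made up for illustration, and both scales are assumed to have been rescaled to 0-100.

```python
import numpy as np

# Hypothetical paired measurements from the same participants:
# each row is (UMUX-Lite score, SUS score), both rescaled to 0-100.
paired = np.array([
    [70.8, 72.5], [58.3, 55.0], [87.5, 90.0], [62.5, 65.0],
    [75.0, 77.5], [54.2, 52.5], [83.3, 85.0], [66.7, 70.0],
])

r = np.corrcoef(paired[:, 0], paired[:, 1])[0, 1]
print(f"Pearson r between UMUX-Lite and SUS: {r:.2f}")
```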

Still, for in-depth analysis, the full System Usability Scale remains the gold standard.

SUS vs. NASA-TLX: Measuring Cognitive Load

The NASA-TLX (Task Load Index) measures perceived mental workload rather than usability. It’s commonly used in high-stakes environments like aviation, surgery, or military operations.

NASA-TLX assesses six dimensions: mental demand, physical demand, temporal demand, performance, effort, and frustration. While it overlaps with SUS in measuring frustration, it’s more focused on cognitive strain than overall ease of use.

For example, a system might score well on SUS (users find it easy to use) but poorly on NASA-TLX (it requires intense concentration). This makes NASA-TLX ideal for safety-critical systems, while SUS is better for general consumer products.
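For comparison with SUS’s single number, NASA-TLX is often reported in its simplified unweighted form (sometimes called Raw TLX), which averages the six subscale ratings; the full procedure adds pairwise weighting, which is omitted in this sketch.

```python
def raw_tlx(mental, physical, temporal, performance, effort, frustration):
    """Unweighted ("Raw TLX") workload score: the mean of the six
    subscale ratings, each on a 0-100 scale."""
    return (mental + physical + temporal + performance + effort + frustration) / 6

print(raw_tlx(70, 20, 55, 40, 65, 60))  # -> 51.67 (rounded)
```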

Explore NASA-TLX further at NASA’s official TLX resource page.

Common Misconceptions About the System Usability Scale

Despite its widespread use, the System Usability Scale is often misunderstood. Clarifying these misconceptions ensures you get the most value from your usability assessments.

Myth 1: SUS Tells You What to Fix

A common misconception is that SUS provides diagnostic insights — that is, it tells you *what* is wrong with your interface. In reality, SUS is a summative metric, not a formative one.

It gives you a score that reflects overall usability perception, but it doesn’t pinpoint specific issues. For example, a low SUS score might result from poor navigation, unclear labels, slow performance, or confusing workflows — but SUS alone won’t tell you which.

To identify root causes, pair SUS with qualitative methods like user interviews, heatmaps, or session recordings. SUS tells you *that* there’s a problem; other tools help you find *where*.

Myth 2: SUS Is Only for Digital Products

While SUS is most commonly used for software and websites, it’s not limited to digital interfaces. Researchers have successfully applied the System Usability Scale to physical products like medical devices, ATMs, and even household appliances.

For instance, a study on insulin pens used SUS to compare the usability of different models. Another applied it to evaluate the user-friendliness of voting machines.

As long as users interact with a system to achieve goals, SUS can assess their perception of its ease of use — whether it’s a touchscreen, a mechanical dial, or a voice command interface.

Myth 3: SUS Scores Are Absolute

Some teams treat SUS scores as absolute indicators of quality. A score of 75 is “good,” so they move on. But usability is relative.

A score of 75 might be excellent for a complex enterprise tool but poor for a consumer app where users expect instant intuitiveness. Always compare your SUS score against benchmarks — either industry standards or your own previous versions.

Tracking SUS over time is often more valuable than a single score. A rising trend indicates improvement, even if the absolute number isn’t “excellent” yet.

Advanced Applications of the System Usability Scale

Beyond basic usability testing, the System Usability Scale is being used in innovative ways across research and industry.

Using SUS in Academic Research

The System Usability Scale is one of the most cited tools in human-computer interaction (HCI) literature. Its standardized nature makes it ideal for comparative studies.

Researchers use SUS to evaluate the impact of design interventions — for example, comparing a traditional menu interface with a gesture-based one. Because SUS provides a single, comparable metric, it simplifies statistical analysis and cross-study synthesis.
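As a sketch of what that statistical analysis might look like, the snippet below compares SUS scores from two independent participant groups with Welch’s t-test. The scores are invented for illustration.

```python
from scipy import stats

# Hypothetical SUS scores from two independent groups of participants.
menu_interface = [62.5, 70.0, 65.0, 72.5, 60.0, 67.5, 75.0, 65.0, 70.0, 62.5]
gesture_interface = [77.5, 82.5, 75.0, 85.0, 80.0, 72.5, 87.5, 77.5, 82.5, 75.0]

# Welch's t-test avoids assuming equal variances between the groups.
t_stat, p_value = stats.ttest_ind(gesture_interface, menu_interface, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```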

Its presence in thousands of peer-reviewed papers has further validated its reliability and contributed to the development of normative databases. These databases allow researchers to contextualize their findings against global averages.

For access to academic resources on SUS, visit Taylor & Francis Online for Brooke’s original paper.

SUS in Product Development and Iteration

In agile and lean product environments, the System Usability Scale supports rapid iteration. Teams can run quick usability tests every sprint and track SUS scores to measure progress.

For example, a fintech startup might conduct a SUS assessment after each major feature release. If the score drops, they investigate what changed. If it rises, they validate their design decisions.

Some companies even embed SUS into their CI/CD pipelines — automatically triggering surveys after beta releases and visualizing trends in dashboards. This data-driven approach ensures usability doesn’t get sacrificed for speed.
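A dashboard of this kind can start as a few lines of scripting. The sketch below computes a mean SUS score per release from raw survey responses and flags drops beyond an illustrative threshold; the release labels and data are hypothetical.

```python
from statistics import mean

# Hypothetical SUS scores collected from beta-survey responses per release.
sus_by_release = {
    "v1.2": [65.0, 70.0, 62.5, 72.5, 67.5],
    "v1.3": [70.0, 75.0, 72.5, 77.5, 70.0],
    "v1.4": [67.5, 65.0, 70.0, 62.5, 72.5],
}

previous = None
for release, scores in sus_by_release.items():
    avg = mean(scores)
    note = ""
    if previous is not None and avg < previous - 2:   # illustrative "drop" threshold
        note = "  <- investigate what changed"
    print(f"{release}: mean SUS {avg:.1f}{note}")
    previous = avg
```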

Cross-Cultural and Inclusive Design Applications

As product teams strive for inclusivity, SUS is being used to evaluate usability across diverse user groups — including older adults, people with disabilities, and non-native speakers.

Studies have shown that SUS can effectively detect usability gaps in accessible design. For instance, a website might score 85 with able-bodied users but only 55 with screen reader users — highlighting the need for accessibility improvements.

By segmenting SUS results by demographic or ability, teams can ensure their products are usable by everyone, not just the average user.
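Segmenting the data is mostly a matter of keeping the group label next to each score. A minimal sketch, with hypothetical groups and scores echoing the example above:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (user group, SUS score) pairs from one study round.
results = [
    ("sighted", 85.0), ("sighted", 82.5), ("sighted", 87.5),
    ("screen reader", 55.0), ("screen reader", 52.5), ("screen reader", 60.0),
]

by_group = defaultdict(list)
for group, score in results:
    by_group[group].append(score)

for group, scores in by_group.items():
    print(f"{group}: mean SUS {mean(scores):.1f} (n={len(scores)})")
```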

Limitations and Criticisms of the System Usability Scale

No tool is perfect. While the System Usability Scale is powerful, it has limitations that users should be aware of.

Lack of Diagnostic Detail

As previously mentioned, SUS doesn’t explain *why* a score is high or low. It aggregates user sentiment into a single number, which is great for benchmarking but insufficient for root cause analysis.

This limitation means SUS should never be used in isolation. Always supplement it with qualitative feedback, behavioral analytics, or observational data.

For example, if SUS scores drop after a redesign, look at session recordings to see where users are struggling. Combine SUS with tools like Hotjar or FullStory to get the full picture.

Sensitivity to Task Design

SUS scores can be influenced by how tasks are framed during testing. If users are given overly simplistic tasks, they may rate the system higher than if they faced realistic challenges.

Conversely, if tasks are too difficult or poorly explained, users may blame the system unfairly. This means the context of use significantly impacts SUS outcomes.

To mitigate this, ensure tasks are realistic, representative, and clearly communicated. Pilot test your task scenarios before collecting SUS data.

Potential for Response Bias

Like all self-reported measures, SUS is vulnerable to response biases — such as acquiescence bias (tendency to agree), social desirability bias (wanting to please the researcher), or extreme responding.

The alternating positive/negative phrasing in SUS helps reduce acquiescence bias, but it doesn’t eliminate it. Anonymizing responses and assuring participants their feedback won’t affect them can help improve honesty.

Additionally, consider using attention checks or consistency checks in longer surveys to filter out low-quality responses.
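Because SUS alternates positive and negative wording, one cheap consistency check is to flag respondents whose answers barely vary across the ten items. The threshold below is an illustrative assumption, not a standard.

```python
def looks_like_straight_lining(responses, spread_threshold=1):
    """Flag a respondent who gives nearly identical answers to all 10 items.

    Because SUS alternates positive and negative wording, genuine answers
    should vary; a flat response pattern suggests the items were not read.
    """
    return max(responses) - min(responses) <= spread_threshold

print(looks_like_straight_lining([5] * 10))                         # True: suspicious
print(looks_like_straight_lining([4, 2, 5, 1, 4, 2, 4, 2, 5, 1]))   # False
```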

Frequently Asked Questions About the System Usability Scale

What is the ideal sample size for SUS?

A minimum of 15–20 participants is recommended for reliable SUS scoring. While smaller samples (5–8 users) can provide directional insights, larger samples improve statistical confidence and allow for segmentation analysis.

Can I modify the SUS questionnaire?

While you can rephrase SUS items for clarity, doing so is strongly discouraged. Modifying the wording compromises the validity of the score and prevents comparison with established benchmarks. If you need a customized tool, consider developing a new instrument or simply use SUS as-is.

Is the System Usability Scale free to use?

Yes, the System Usability Scale is free to use for both commercial and academic purposes. No license fee is required, though acknowledging the original work by John Brooke is expected when you use or report it.

How often should I run SUS tests?

Run SUS tests at key milestones: after major design changes, before product launches, and during regular UX audits. For agile teams, integrating SUS into sprint reviews can help maintain usability focus throughout development.

Can SUS be used for non-digital products?

Absolutely. SUS has been successfully applied to physical systems like medical devices, kiosks, and appliances. As long as users interact with a system to achieve goals, SUS can assess perceived usability.

The System Usability Scale remains one of the most trusted, efficient, and versatile tools for measuring perceived usability. Its simplicity belies its power — delivering actionable insights with minimal overhead. By understanding how to administer it correctly, interpret results wisely, and combine it with other methods, teams can ensure their products are not just functional, but truly user-friendly. Whether you’re a UX researcher, product manager, or developer, incorporating SUS into your workflow is a proven step toward building better experiences.


