7 Strategies to Benchmark SaaS Customers to Success

Customer benchmarking, the practice of identifying where a customer can improve or is already doing well by comparing it with other customers, helps Customer Success Managers (CSMs) deliver unique value to their customers. The comparative insights from benchmarking motivate customers to make changes that produce better outcomes with their solutions. I’ve written more about this link in a recent post.

SaaS customer success leaders publicly encourage greater adoption of this practice. Peter Armaly, a customer success expert with Oracle Marketing Cloud, argued in his presentation at the 2016 Customer Success Summit that it should be a foundational activity of CSMs. In her featured post earlier this year, Kia Puhm, a customer experience consultant and a former executive with Adobe, Eloqua and Blueprint, advocated using customer benchmarking in every QBR, annual renewal discussion and proactive strategic meeting with customers. At the TSIA World conference in May, Rachel Barger presented how Lithium Technologies’ CSMs use their benchmarking program to make recommendations to customers.

I’ve found that SaaS vendors use seven distinct strategies to empower CSMs with customer benchmarking. The first two are longstanding strategies that rely on third-party data. The other five strategies leverage the data that is a byproduct of the vendor’s customer relationships and usage of their solutions. The last two of these five progressively use artificial intelligence to further automate the task.

Strategies Defined

Strategy 1: Customer Benchmarking using Industry Surveys

CSMs can leverage the benchmarking work of industry associations and research firms to help customers set performance targets, identify areas for improvement, and recommend best practices. These independent organizations sponsor benchmarking surveys that are completed by representatives of companies in the industry. The sponsor creates the survey, collects the data and does the analysis. The aggregated, anonymized findings are accessible in published documents or online tools for their members and customers.

With this strategy, the sponsor does all the work, but self-reported data can be unreliable and not specific to the SaaS vendor’s solutions or even customers.

Strategy 2: Customer Benchmarking using Best-Practices Studies

CSMs can help customers strive for superior outcomes by comparing a customer’s practices with best practices. This approach studies a key business process of several companies that are perceived as the best in their industry and agree to participate. A third-party organization or the SaaS vendor sponsors an on-site study to collect mostly qualitative data on practices, key metrics and business context. The sponsor analyzes the data and reports what they’ve learned.

The sponsor does all the work for this strategy too, and superior practices can be found. But the SaaS vendor’s customers may see the practices of the best companies as unrealistic or irrelevant.

Strategy 3: Customer Benchmarking using Vendor Surveys

SaaS vendors can do their own benchmarking survey when their customer base is sufficiently large to obtain a representative sample. They analyze the aggregated data and share summary findings with their customers. A CSM collects data from an individual customer to compare how they’re doing versus the aggregated, anonymized results at a more granular level.

This strategy benefits from a survey that’s tailored to the vendor’s customer base, but it suffers from the same reliability drawbacks as Strategy 1.

Strategy 4: Customer Benchmarking using Data Scientists

When requested by CSMs, data scientists use their skills with statistics and modeling to mine the SaaS vendor’s data for deep, insightful correlations and comparisons. This is often a one-off solution, although procedures can be set up for recurring needs. The work involves advanced techniques such as regression analysis, stochastic frontier analysis and data envelopment analysis using sophisticated software tools.

Data scientists may find unexpected insights by analyzing the vendor’s own solution and customer relationship data, but the lack of scalability handicaps this strategy.

Strategy 5: Customer Benchmarking using Business Software Reports

By running reports on their aggregated customer data, CSMs can easily see how a specific customer compares with other customers on a given metric. A company’s adopted CRM, BI or CSM software platforms usually provide this capability. For example, customer success software platforms such as Amity and Gainsight generate a scorecard summary which lists the SaaS vendor’s customers and their performance on key metrics. A CSM sorts the list by a selected metric, calculates averages and filters the customers into a relevant group using various attributes.

CSMs can generate reports as needed with this strategy, but the reports contain data, not insights, and support only basic, manual analysis to find insights.
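As a rough illustration of the manual sort, average and filter steps described above, here is a minimal Python sketch. The customer names, the `segment` attribute and the `logins_per_user` metric are all hypothetical, and the code does not reproduce the interface of any of the platforms named above.

```python
from statistics import mean

# Hypothetical scorecard rows; names, segments and metric values are invented.
scorecard = [
    {"customer": "Acme", "segment": "mid-market", "logins_per_user": 14.0},
    {"customer": "Birch", "segment": "mid-market", "logins_per_user": 9.0},
    {"customer": "Cobalt", "segment": "enterprise", "logins_per_user": 21.0},
    {"customer": "Dune", "segment": "mid-market", "logins_per_user": 10.0},
]

def compare_to_peers(rows, customer, metric, segment):
    """Rank one customer within a segment and report the gap to the peer average."""
    # Filter the scorecard down to the chosen peer group.
    peers = [r for r in rows if r["segment"] == segment]
    # Sort the peer group by the selected metric, best first.
    ranked = sorted(peers, key=lambda r: r[metric], reverse=True)
    rank = [r["customer"] for r in ranked].index(customer) + 1
    peer_avg = mean(r[metric] for r in peers)
    value = next(r[metric] for r in peers if r["customer"] == customer)
    return {"rank": rank, "of": len(peers), "value": value,
            "peer_avg": peer_avg, "gap": value - peer_avg}

result = compare_to_peers(scorecard, "Birch", "logins_per_user", "mid-market")
print(result)
```

The output is still just data: a rank, an average and a gap. Deciding whether that gap matters, and why it exists, remains manual work for the CSM.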

Strategy 6: Customer Benchmarking using your SaaS Product

SaaS vendors can enable their customers to benchmark themselves within their own software solution, which can be used by their CSMs, too. SAP, Apptio, ServiceNow, InsightSquared, Samanage, IQNavigator, ADP, Zendesk and other companies offer this capability. Using the vendor’s anonymized, aggregate customer data, a customer can compare itself against other customers of the same solution on various metrics, and CSMs can compare them also. SaaS companies develop basic benchmarking features in-house (such as scorecards, rankings, averages and several peer groups) or embed technology from specialized benchmarking software vendors like OnlyBoth that do this along with automated, in-depth analysis and narrative reporting.

Customers can get answers to some benchmarking questions on their own. However, this strategy leaves out the internal vendor data that gives CSMs valuable benchmarking insights but isn’t available for customer viewing.

Strategy 7: Customer Benchmarking using Specialized Benchmarking Software

With advanced automation available from several software firms, CSMs can get many more actionable and deep comparative insights for each customer quickly. Waypoint Group’s TopBox benchmarking module automatically analyzes for significant correlations between customer feedback responses and outcome metrics such as NPS and customer health for several customer peer groups. The Customer Benchmarking Engine available from OnlyBoth automates the data analysis, insight discovery, and narrative reporting by using artificial intelligence and the SaaS vendor’s data.

CSMs benefit from automated analysis which produces many more and deeper insights for each customer, overcoming the limitations of business software reports and data scientist scalability. The challenge is that CSMs are called on to make more judgments on which insights to use and share.

For a brief guide to the pros, cons and requirements of each strategy, see the paper “Pioneering SaaS Customer Success Leaders Seek New Customer Benchmarking Strategies to Deliver Value to Customers.”

Final thoughts

According to research by TSIA, CSMs at 25 percent of SaaS vendors are already using customer benchmarking. These pioneers have recognized the opportunity to enhance customer relationships and sustain their journey with the insights that only they can provide, because only they have the solution-specific data. With the data and strategies that are now accessible, more customer success organizations are poised to adopt this practice.

Jim Berardone

How is my county doing? Benchmarking 3,143 U.S. counties by jobs, income, crime, health, education, faith, rent, ethnicity, geography, population, and more.

How is your county doing, where could it do better, what’s been changing, and among similar counties, who does best? OnlyBoth has launched a County Benchmarking Engine as its newest showcase application and public service, leveraging the availability of federal as well as private-sector data on all 3,143 U.S. counties. We collected data on 104 county attributes by these topics:

Here’s an example of an “Only, both”-style insight about Bronx County in New York City:

Only Bronx County in NY has both as high a median rent for a 3-bedroom unit ($2,293) and as low a median household income ($33,687).

The flip side, for a 1-bedroom apartment this time, is seen in Audubon County, Iowa (population 5,773), about 1,200 miles away:

Only Audubon County in IA has both as high a per-capita income ($30,714) and as low a median rent for a 1-bedroom unit ($474.00).
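One plausible way to test an “Only, both”-style claim is a simple dominance check: an entity qualifies if no other entity is at least as high on the first attribute and at least as low on the second. The Python sketch below assumes that reading. It is not a description of how the OnlyBoth engine works internally, and every figure except the Bronx values quoted above is invented.

```python
def is_only_both(entities, name, high_attr, low_attr):
    """True if no other entity is both >= on high_attr and <= on low_attr."""
    target = entities[name]
    for other_name, other in entities.items():
        if other_name == name:
            continue
        if (other[high_attr] >= target[high_attr]
                and other[low_attr] <= target[low_attr]):
            return False
    return True

# Bronx figures come from the insight above; the other rows are hypothetical.
counties = {
    "Bronx, NY": {"rent_3br": 2293, "income": 33687},
    "County A": {"rent_3br": 2500, "income": 61000},  # hypothetical
    "County B": {"rent_3br": 1100, "income": 30000},  # hypothetical
    "County C": {"rent_3br": 1000, "income": 35000},  # hypothetical
}

print(is_only_both(counties, "Bronx, NY", "rent_3br", "income"))
print(is_only_both(counties, "County C", "rent_3br", "income"))
```

In this toy data the Bronx passes the check, while County C fails because the Bronx has both a higher rent and a lower income.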

Notably, San Francisco County in CA is among the top-15 nationwide in all these 9 categories, in order of their appearance:

  1. 3rd-highest Asian population (33.0%)
  2. 10th-least male obesity (14.8%)
  3. 10th-most residents employed in services (71.8%)
  4. 8th-highest foreign-born population (35.5%)
  5. 11th-highest per-capita income ($49,986)
  6. 11th-least obesity (15.5%)
  7. 10th-biggest advantage, relative to its home state, in per-capita income (+67.1%)
  8. 12th-least female obesity (16.1%)
  9. 13th-least owner-occupied housing (36.6%)

but residents pay for it with the highest median rents in the whole country, for a 0-bedroom (studio/efficiency) unit ($2,072), a 1-bedroom unit ($2,610), and a 3-bedroom unit ($4,250).

Revisiting New York City, let’s consider this time New York County (i.e., Manhattan). It’s doing well on many dimensions, although unsurprisingly its rents are high. But there’s a residual problem:

Of the 103 counties that have at least $75,459 in median household income, only New York County in NY has decades-long substantial child poverty.

Let’s close out on a high note. Many of the data attributes involve changes over time, so the engine benchmarks not only on characteristics during a single time period, but also on changes or trends from one period to the next. Elsewhere we’ve called that space-time benchmarking. Here’s a good outcome for the populous Fairfax County in Virginia, adjacent to Washington, DC:

Fairfax County in VA is one of only 3 counties that improved or maxed out on all the obesity and diabetes change metrics (there are 4 of these, and each county needs at least 3 with actual values to qualify).

Fairfax County in VA improved or maxed out on these:

change over 4 years in the prevalence of female obesity = -2.2% (from 22.5% to 20.3%)

change over 4 years in the prevalence of male obesity = -0.5% (from 20.4% to 19.9%)

change over 4 years in the prevalence of obesity = -1.4% (from 21.5% to 20.1%)

change over 9 years in the prevalence of diabetes = -0.4% (from 6.6% to 6.2%)
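The qualification rule in this insight (all change metrics improved, with at least 3 of the 4 having actual values) can be sketched in a few lines of Python. This is a simplified reading: it assumes that lower prevalence is better, it uses `None` for a missing value, and it does not model the “maxed out” case, which the text above does not define.

```python
from typing import Optional, Sequence

def qualifies(changes: Sequence[Optional[float]], min_known: int = 3) -> bool:
    """All known changes must be improvements (negative), with enough known values.

    None marks a missing value. "Maxed out" cases are not modeled here.
    """
    known = [c for c in changes if c is not None]
    return len(known) >= min_known and all(c < 0 for c in known)

# Fairfax County changes in percentage points, from the from/to values above.
fairfax = [round(20.3 - 22.5, 1), round(19.9 - 20.4, 1),
           round(20.1 - 21.5, 1), round(6.2 - 6.6, 1)]
print(fairfax, qualifies(fairfax))
```

A county with only two known change values, or with any known value that worsened, would not qualify under this rule.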

As we’ve seen, it’s now possible, after a few days of software configuration, to create significant economic or social value by generating numerous performance insights on thousands of benchmarked entities, and later to refresh those insights with updated data by figuratively pushing a button.

Try the County Benchmarking Engine yourself at county.onlyboth.com.

Raul Valdes-Perez

Why Your Customer Success Managers Need Comparative Insights for Each Customer

“If you’re not getting better, you’re getting worse.”

What do executives mean when they use this phrase? They know that if their company stays as it is, it will eventually get worse compared to its competitors. Their competitors’ success forces them to change how they perform, and vice versa. Hence, their company has to keep getting better, even if it isn’t sick.

Take a look at Salesforce.com in 2004. They had 20,000 customers of their cloud-based CRM software, up from about 6,000 a few years earlier. They were rapidly on their way to reaching unicorn status of a $1B market cap. But they were losing 8% of their subscribers each month. Yikes! That’s a big problem when a recurring revenue business depends on growing revenue from its customer base. Salesforce responded by changing from reactively to proactively managing its existing customer relationships. They eventually succeeded in reducing customer churn to 1% per month and in stemming the loss of customers to their competitors. Their competitors had to follow suit to keep up. Now, proactive customer success management is one of the hottest movements in business today.

What does this mean for your Customer Success?
All of your customers need to keep improving, too, or they’ll fall behind. And your customers know this. Consequently, they need to know how they’re doing and where they can improve versus others like them. These comparative insights motivate improvement actions inside their companies, which leads to achieving their desired business outcomes — including staying competitive.

If your company offers cloud-based, recurring revenue solutions such as a SaaS subscription, then you have data that’s a byproduct of your customer relationships and their usage of your solutions. As your customers are buying the same solutions for similar business reasons, this data contains a rich source of comparative insights for each customer – insights they can’t get anywhere else.

Your customers can use these unique insights to improve. That’s important to you, too. If they’re not improving, then their outcomes, health and loyalty won’t improve, and you won’t achieve the retention and upselling rates that your company needs to grow.

Comparative insights help Customer Success Managers do their job better.
Today, your CSMs likely use customer data to gain a full picture of an account and to evaluate a customer’s likelihood for renewing, buying more or advocating your solution. This customer insight serves your company well, whereas comparative insights serve your customers well. With unique comparative insights, CSMs can deliver more value to each customer in each interaction. They’ll get a better response to their proactive outreaches, have more strategic conversations with key stakeholders during quarterly business reviews and, most importantly, provoke customer actions to improve their behaviors with your solutions.

Generating comparative insights
The process of identifying areas where one can improve by comparing to others is called benchmarking. With your own data, your CSMs can use benchmarking, too. You can create scorecards with a spreadsheet or CSM software product to rank customers and do simple performance comparisons on individual metrics. If you have relevant benchmarks or targets, CSMs can draw additional insights. But the really noteworthy, action-provoking insights are much harder to find. You’ll need data analysts or data scientists who can perform deep, comparative analysis involving complex correlations, comparisons and clustering across multiple dimensions. Unfortunately, they’re in short supply and CSMs have to wait in line to get what they need.

As a result, CSMs settle for rankings and benchmarks without the deeper analysis, which has limited value. It’s like they have Consumer Reports’ published rankings and performance ratings of cars but without the written analysis for each.

An easy, scalable way to get comparative insights
I’m quite excited that, just last week, my company OnlyBoth announced a new customer analytics solution that can help CSMs take full advantage of the unique insights that exist in their company’s data. OnlyBoth’s Customer Benchmarking Engine uses artificial intelligence to completely automate the searching, analysis and reporting of comparative insights in data. CSMs can simply enter a customer name and get numerous comparative insights in seconds. The software performs a massive, sophisticated data analysis that would take many data scientists many months to do.

Here’s an example of a deep, comparative insight the engine found in the data we gave it. This insight is written up by the software using its natural language generation technology.

[Image: example insight generated by the Customer Benchmarking Engine]

Rhynyx is amongst a group of 55 customers who are using a vendor’s HR software suite more actively than their other customers. However, Rhynyx is not keeping pace with their use of the recruiting module, a core sticky function of the vendor’s HR suite. A CSM can use this insight to proactively engage Rhynyx to examine the root causes and take action so they can achieve similar outcomes to those 55 peer customers.

You can see more examples and learn about our software here.

As someone who sits at the intersection of customer success leadership and benchmarking innovation, I hope to share more about this important but under-discussed topic in future posts. If you have experiences and lessons learned with using benchmarking and comparative insights for Customer Success, I’d love to hear from you.

Jim Berardone

Benchmarking the Tax Systems of 195 Countries

Taxation at the national level is controversial. Economists gather data and form opinions, and so do politicians. Factual, comparative insights on worldwide tax systems are needed. So we applied an automated Benchmarking Engine (taxes.onlyboth.com) to tax data on 195 countries, uncovering 7,617 insights or about 39 insights each, in perfect English, all automated.

Paying the Tax (Collector), by Pieter Brueghel the Younger

The U.S. Agency for International Development publishes a fascinating Collecting Taxes Database on the tax systems of the world’s countries. The database expresses 33 attributes, both metrics and traits, relating to tax rates, efficiency in collecting the revenue that the rates target, diversity in sources of tax revenue (VAT, personal income, corporate, etc.), tax administration, and so on.

We downloaded the latest available version (2012-2013) as well as an earlier 2009-2010 version, in order also to express changes over a three-year interval and enable benchmarking on trends.

The U.S. corporate-tax system is controversial because of its very high rate.  Does the engine find any noteworthy insights relating to corporate taxation?  Indeed it does, which we’ll quote at length:

USA has the lowest corporate income tax productivity (0.07) of the 32 nations with at least 9.3% personal income tax collection as a percentage of GDP (USA is at 11.8%). That 0.07 compares to an average of 0.24 and standard deviation of 0.24 across the 32 nations.

Reaching the average of 0.24 would imply a total increase of 5.9% (absolute) in corporate income tax collection as a percentage of GDP.

USA has these standings among those 32 nations:
corporate income tax collection as a percentage of GDP = 2.6% (8th-least)
corporate income tax rate = 35.0% (most overall)

trailed France (0.08), Malawi (0.08), Austria (0.09), and Belgium (0.09), and others, ending with Algeria (0.86).

1 out of the other 31 nations was ruled out due to missing, unknown, or not-applicable values for corporate income tax productivity, i.e., Angola.

Let’s interpret this. First, it says that the U.S. has the highest corporate income tax rate (35%) in the world. However, this high rate leads to a low revenue outcome, as indicated by the low 0.07 productivity score. The Collecting Taxes Database calculates this corporate-income-tax productivity by “dividing the ratio of total corporate income tax revenues to GDP by the general corporate income tax rate.”
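To make the quoted definition concrete, here is a small Python sketch that applies it to the U.S. figures from the insight above (2.6% of GDP collected, 35.0% rate). The function and variable names are ours, chosen for illustration only.

```python
def cit_productivity(cit_revenue_pct_gdp: float, cit_rate_pct: float) -> float:
    """Corporate income tax revenue as a share of GDP, divided by the tax rate."""
    return cit_revenue_pct_gdp / cit_rate_pct

# U.S. values quoted in the insight above.
usa = cit_productivity(2.6, 35.0)
print(round(usa, 2))

# Revenue share of GDP implied by reaching the peer-average productivity of 0.24.
implied_revenue = 0.24 * 35.0
print(round(implied_revenue - 2.6, 1))
```

With these rounded inputs the productivity comes out at 0.07, and the implied increase is about 5.8 percentage points of GDP. That is close to the 5.9% in the quoted insight, which was presumably computed from unrounded data.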

Not only is productivity low, but it’s the lowest of the 32 nations that collect a significant amount of personal income tax in relation to GDP (at least 9.3% of gross domestic product). It’s lower than France, Malawi, and others. Here’s a plot:

[Plot: U.S. corporate income tax productivity]

A separate insight reveals that the U.S. has the lowest corporate-income-tax productivity of the 17 nations with an agriculture sector as a percentage of GDP of at most 1.6% (the U.S. is at 1.2%).

Now let’s click on What’s best in class? to see what the U.S. could aspire to, as shown by nations that are similar, i.e., whose overall values in the database are most similar to the U.S.  It turns out that Hong Kong and South Korea do best among the 20 countries most like the U.S.

Hong Kong has the highest corporate income tax productivity (0.32) among the 20 nations most similar to USA (with 0.07) that likewise have a high-income economy.

Next with 0.18 is South Korea.

USA is 35th best among the 40 nations with applicable values and that have a high-income economy, which range from a worst of 0.02 (Bahrain) to a best of 0.99 (Qatar), with an average of 0.20 and standard deviation of 0.21.

Among all 164 nations with applicable values, the overall average is 0.15 and standard deviation is 0.16. Best is Qatar, with 0.99.

As is typical of a benchmarking engine, we can leave it to human experts – economists and political leaders in this case – to figure out whether the U.S. has a tax problem, what’s causing it, what are possible solutions, and which solution is best.  Our aim has been to provide this Taxes Benchmarking Engine as a public service and as a showcase of what automated benchmarking can do, as was done earlier for college financials, hospitals, and nursing homes.

Benchmarking need not be taxing. Simply enter any country at taxes.onlyboth.com and see how it’s doing, where it could improve, what’s trending, and what’s best in class.

Raul Valdes-Perez

 

 

Benchmarking 15,665 Nursing Homes

Today OnlyBoth launches as a public service what is likely the largest benchmarking analysis ever conducted, as measured in terms of readable language output.

The U.S. has about 1.4 million residents of nursing homes and 15,665 Medicare or Medicaid certified nursing homes. The federal government, through its regulatory powers and reimbursement function, collects performance data on all of these nursing homes, which cry out for comparison in order to understand how each is doing and where each falls short, compared to all peers or their subsets, without bias.

We downloaded data from the federal Nursing Home Compare website and spent a couple of days consolidating the information and configuring our Benchmarking Engine, then pushed a button (figuratively!) and waited just a day and a half. The output consists of 642,192 insights, totaling more than two Encyclopaedia Britannicas in terms of English words contained in grammatical, to-the-point sentences and paragraphs. Enter a nursing home and browse the insights at http://nursing.onlyboth.com.

What did the benchmarking engine find? At one extreme, the engine found the most things to say about Signature Healthcare at Saint Francis in Memphis, TN, although most of these were not complimentary. There is clearly room for improvement there.

We often say that a massive analysis, as only an automated benchmarking engine can do, can find specific areas where even the best can improve. We discussed such a case – Stanford University Hospital – while launching our earlier hospitals benchmarking engine.

Let’s revisit sunny California. The US News & World Report lists Edgemoor Hospital in Santee, CA as the top nursing home in California, due to its top five-star rating in all the major categories. Could even Edgemoor improve? The engine reveals several areas for improvement, this one for example:

Edgemoor Hospital in Santee, CA has the most facility-reported incidents (9) of all the 200 nursing homes that have the top rating in each of overall, health inspection, quality measures, staffing, and registered-nurse staffing (5 total). Those 9 represent 18.8% of the total across the 200 nursing homes, whose average is 0.2.

Now let’s consider Bridgepoint Sub-Acute and Rehab Capitol Hill in Washington, DC, which is the nursing home nearest Capitol Hill, where Congress meets. This facility does especially well in bladder and bowel control among low-risk, long-stay residents, as compared to other for-profit facilities that are located within a hospital. The engine found six specific areas that need improvement, the first of which is this:

Bridgepoint Sub-Acute and Rehab Capitol Hill in Washington, DC has the 3rd-most high-risk long-stay residents with pressure ulcers (24.8%) among the 1,982 Mid-Atlantic nursing homes. That 24.8% compares to an average of 6.4% across the 1,982 nursing homes.

Another noteworthy insight is that the facility has “the most severe deficiencies on the health survey (5) of the 718 nursing homes that have the top rating in each of quality measures, staffing, and registered-nurse staffing.”

This application is launched as a public service as well as a technology showcase, benchmarking well over triple the number of entities in our previous largest application, which covered 4,803 hospitals. The relevance to business is this: the advent of cloud services, the internet of things, and other means for collecting customer performance data will enable the automated benchmarking of business processes, generating tremendous economic value by benchmarking 10,000 entities with the same amount of work as benchmarking 10 entities.

Our goal is universal betterment by providing persuasive, motivating insights that pinpoint what is going well and where improvement is sorely needed and is achievable. Benchmarking Engines will do for business benchmarking what Search Engines did for information seeking, assigning to computers what they do better – massive comparisons – and to people what they do better – evaluating and following up, as appropriate – on benchmarking insights.

Raul Valdes-Perez

 

Relaunch of Hospitals Benchmarking Engine

OnlyBoth benchmarks U.S. hospitals both as a public service and as a visible demonstration of the power of an automated Benchmarking Engine. This enables hospital stakeholders to instantly discover in perfect English how they’re doing, not compared to absolute standards or arbitrary peers, but to all peers and groups.

We launched our first version this summer. Today we relaunched our hospitals benchmarking engine based on fresh data and technical advances:

  1. updated Hospital Compare dataset from Medicare.gov, now on 4,803 hospitals
  2. new hospital attributes relating to hospital performance and geography
  3. better expression of key types of insights
  4. improved heuristics leading to more insights per hospital
  5. addition of data on hospital networks, enabling intra-network comparisons

1.  We have refreshed the data in the hospital application based on a late-September data release at the Hospital Compare data download page. This new release also contains new hospital attributes, as discussed below.

2.  Since geography is an important determinant of peer groups, we’ve added attributes that enable grouping East Coast, Southern, and Western states. We’ve also added two new attributes from the updated Hospital Compare data that relate to deaths or unplanned readmission due to coronary artery bypass grafting (CABG) surgery, and five new attributes that express hospital-readmission ratios for various afflictions.

3.  A key type of insight expresses how entities that are within an elite peer group fall short along some key dimension. For example, our recent Harvard Business Review article, which explains why benchmarking is done wrong and how to do it right, gives this example of Stanford Hospital:

None of the other 344 hospitals with as many patients who reported YES, they would definitely recommend the hospital (85%) as Stanford Hospital in Stanford, CA also has as few patients who reported that the area around their room was always quiet at night (41%). That is, among those 344 hospitals, it has the fewest patients who reported that the area around their room was always quiet at night.

As the saying goes, this was too clever by half. After considering feedback from surveying users, this insight now appears, with the refreshed data, like this:

Stanford Hospital in Stanford, CA has the fewest patients who reported that the area around their room was always quiet at night (40%) among the 811 hospitals with at least 80% of patients who reported YES, they would definitely recommend the hospital (Stanford Hospital is at 84%). That 40% compares to an average of 69.4% and standard deviation of 10.7% across the 811 hospitals.

Of course, this improvement affects thousands of insights, and millions in the future.
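The revised form of the insight (a fixed threshold defines the peer group, then the entity is compared with the group’s average and standard deviation) can be sketched in Python as follows. This is only an illustration under our own assumptions: the Stanford row uses the figures quoted above, the other hospitals are invented, and we use the population standard deviation because the text does not say which variant the engine reports.

```python
from statistics import mean, pstdev

def lowest_in_threshold_group(rows, name, group_attr, threshold, metric):
    """Describe where `name` stands on `metric` among rows with group_attr >= threshold."""
    group = [r for r in rows if r[group_attr] >= threshold]
    values = [r[metric] for r in group]
    target = next(r for r in group if r["name"] == name)
    return {
        "group_size": len(group),
        "is_lowest": target[metric] == min(values),
        "mean": mean(values),
        "stdev": pstdev(values),  # assumption: population standard deviation
    }

hospitals = [
    {"name": "Stanford Hospital", "recommend": 84, "quiet": 40},  # from the text
    {"name": "Hospital A", "recommend": 90, "quiet": 70},  # hypothetical
    {"name": "Hospital B", "recommend": 82, "quiet": 65},  # hypothetical
    {"name": "Hospital C", "recommend": 75, "quiet": 30},  # hypothetical, below threshold
    {"name": "Hospital D", "recommend": 81, "quiet": 75},  # hypothetical
]

summary = lowest_in_threshold_group(hospitals, "Stanford Hospital",
                                    "recommend", 80, "quiet")
print(summary)
```

Note that Hospital C has an even lower quiet-at-night score, but it falls outside the peer group because fewer than 80% of its patients would recommend it.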

4.  We’ve improved the heuristics that enable finding valuable needles within the huge haystack that results from taking multiple slices out of a dataset of half a million hospital attribute values. Our new Hospitals Benchmarking Engine contains 522,142 insights, or around 109 insights per hospital, compared to the previous 101 per hospital. The key benchmarking question – Where can this hospital improve? – has seen a 4% increase in answers per hospital.

5.  For a hospital-network executive, it’s valuable to benchmark individual hospitals against others in the network, especially because knowledge transfer of good practices can happen more easily when two entities have the same owner. We’ve added a parent attribute that for now includes four networks:  UPMC, Kaiser Foundation, Texas Health Resources, and NYC Health and Hospitals. We’ll add other hospital networks over time.

We expect that this hospitals application, and the diffusion of benchmarking engines in general, will further the goal of enabling universal betterment through data-driven comparison with peers, greatly simplified in terms of human work, but greatly expanded in terms of action-provoking insights.

Raul Valdes-Perez

Avoiding Tunnel Vision in Peer Comparisons

Comparing yourself to peers – also known as benchmarking – lets you understand how you’re doing, identify performance gaps and opportunities to improve, and highlight peer achievements that you could emulate, or your own achievements to be celebrated. As long as data is available, peer comparison can potentially accomplish all of these. The opportunities for peer comparison are greatly increasing due to cloud and other services that generate data as a by-product of serving customers.

The problem is that peer comparison as generally practiced suffers from Tunnel Vision and so misses a lot, to everyone’s detriment. To understand why, let’s first consider an analogy to search engines.

An information seeker, before there were search engines, might have gone to consult a librarian on, say, computers and heard “That’s technology, so look in the Technology books section, over in the back, by the right.” But there’s plenty of material on computers that’s catalogued elsewhere, e.g., automation’s impact on employment and job training, the philosophical question of whether computers in principle could do everything that people do, cognitive modeling of human reasoning using computers, computer history, and so on. The point is that looking only in the Technology section is an example of Tunnel Vision, or maybe bookshelf vision. Search engines changed that.

So where’s the Tunnel Vision in peer comparisons? It’s almost universal practice that the benchmarker chooses one or two organizational goals, then picks a few key metrics (key performance indicators) relevant to those goals, and finally selects several peer groups from a limited set. The outputs are then the mean, median, distribution, or high-percentile values for those peer groups on those metrics. The conclusion is that the organization may or may not have a problem, which may or may not be addressable. The flaw in all this is that organizations have many goals and subgoals, and many metrics that could reveal performance gaps, especially if a very large set of peer groups could also be explored. But our human inability to explore many paths in parallel imposes this Tunnel Vision, for the same reason that pre-search-engines information seekers went looking in one or two sections of the library.

As an example of peer-group selection, suppose you wanted to compare the U.S. against other nations. What would be the right peer group?  Here are some that make sense: democracies; the Anglosphere; constitutional republics; large countries; developed countries; OECD or NATO members; the western hemisphere; non-tropical countries; largely monolingual countries; business-friendly economies; and even baseball-playing nations. Moreover, peer groups could be formed dynamically, e.g., countries at least as big as the U.S. in population or territory. And what would be the right metrics? The mind boggles at the number of interesting possibilities, all of which may have available data. As already pointed out, standard practice is to first specify an overarching goal, which then drives the choice of metrics and peer group. (Some web examples of standard benchmarking outputs are here, there, and elsewhere.) But what if the goal is to understand broadly how you’re doing and where you could improve? Tunnel Vision is caused by over-specific goals, limited metrics, and biased peer groups, all part of standard benchmarking practice which is made obsolete in the face of exploring all interesting metrics and potential peer groups that could lead to operational improvements.

Let’s run some numbers to show the scope of Tunnel Vision. Suppose there are 10 attributes with yes/no values and another 10 attributes that can take on any of five different values, plus one attribute that can take on 50 values, e.g., a U.S. state. There are theoretically 2^10 x 5^10 x 50 = 500 billion peer groups. Even if we include only peer groups whose attribute values match those of the specific individual to be benchmarked, so that each of the 21 attributes is either used or left out, the number would be 2^21 = 2.1 million peer groups.
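
The two counts above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the variable names are illustrative, and the second count assumes that each of the 21 attributes is either used (pinned to the individual’s own value) or left out of the peer group.

```python
import math

# Attribute layout from the example: 10 yes/no attributes,
# 10 five-valued attributes, and 1 fifty-valued attribute (e.g., U.S. state).
value_counts = [2] * 10 + [5] * 10 + [50]

# Peer groups that pin every attribute to one of its possible values.
fully_specified = math.prod(value_counts)

# Peer groups consistent with one individual: each attribute is either
# included (at the individual's own value) or ignored, giving 2 choices each.
matching_individual = 2 ** len(value_counts)

print(f"{fully_specified:,}")      # 500,000,000,000
print(f"{matching_individual:,}")  # 2,097,152
```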

Let’s move from the abstract to the concrete. Here are two (accurate) peer comparisons that are arguably insightful:

1. St Anthony Community Hospital in Warwick, NY has the lowest average time patients spent in the emergency department before they were seen by a healthcare professional of all the church-owned hospitals in the mid-Atlantic.

2. Macalester College in Saint Paul, MN has the highest total student-service expenses of any big-city private college that doesn’t offer graduate degrees.

Note the peer groups: (1) church-owned; mid-Atlantic; and (2) big-city; private; doesn’t offer graduate degrees. Now consider an imaginary peer comparison that uses four attributes to form a noteworthy peer group:

3. Cumulus Inc. is the most profitable of all the B2B, cloud-based, venture-backed companies that have at least 200 customers.

We see that considering more peer groups leads to uncovering more valuable benchmarking insights. Since the number of possible peer groups is vast, and benchmarking has seen little automation, this means that Tunnel Vision is necessarily widespread.

But the Tunnel Vision gets much worse! Peer groups can be formed, not just by picking non-numeric (aka symbolic) attributes, but also by dynamically determining numeric thresholds. Here’s a revealing (and true) insight that contains a dynamically-formed peer group:

None of the other 344 hospitals with as many patients who reported YES, they would definitely recommend the hospital (85%) as Stanford Hospital in Stanford, CA also has as few patients who reported that the area around their room was always quiet at night (41%).

That is, among those 344 hospitals, it has the fewest patients who reported that the area around their room was always quiet at night.
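
To make the idea of a dynamically formed peer group concrete, here is a minimal Python sketch. It is not the engine that produced the insight above. The `Hospital` class, its field names, and the four sample rows are all made up for illustration. The peer group is defined by a numeric threshold taken from the target itself: every other hospital that scores at least as high as the target on one metric.

```python
from dataclasses import dataclass

@dataclass
class Hospital:
    name: str
    recommend_pct: float  # % of patients who would definitely recommend
    quiet_pct: float      # % who said the area around their room was always quiet

def lowest_among_dynamic_peers(target, hospitals):
    """Form the peer group of other hospitals whose recommend_pct is at least
    the target's, then check whether the target has the lowest quiet_pct.
    Returns (is_lowest, peer_count)."""
    peers = [h for h in hospitals
             if h is not target and h.recommend_pct >= target.recommend_pct]
    is_lowest = all(h.quiet_pct > target.quiet_pct for h in peers)
    return is_lowest, len(peers)

# Hypothetical data, not real survey results.
hospitals = [
    Hospital("A", 85, 41),
    Hospital("B", 90, 60),
    Hospital("C", 86, 55),
    Hospital("D", 70, 30),  # excluded: below A's recommend_pct threshold
]
result = lowest_among_dynamic_peers(hospitals[0], hospitals)
print(result)  # (True, 2)
```

Note that hospital D has an even lower quiet score than A, yet it does not spoil the insight, because the threshold on the first metric keeps it out of the peer group.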

This is clearly a provocative insight. One can imagine a hospital CEO reacting in one of these ways:

  1. We’re profitable, prestigious, and have great weather. What’s a little nocturnal noise?
  2. There’s been night-time construction next door for the last year, and it’s almost done, so the problem will solve itself.
  3. I can’t think of any reason why we should be at the bottom of this elite peer group. I’ll forward this paragraph to our chief of operations to investigate and report back what may be happening.

This peer-comparison insight wouldn’t be found by today’s conventional benchmarking methods. Instead, what may be found is along these lines: The average value for this quantity among 309 California hospitals with known values is 51.5% with a standard deviation of 9.5%, so Stanford Hospital is about 1 standard deviation below average. The reader can judge which of the two insights is the more action-provoking, not just for the single individual in charge, but for the entire team that needs to be roused to act on and address performance gaps.
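
For reference, the “about 1 standard deviation below average” figure follows directly from the numbers quoted in the paragraph above. The variable names in this short Python check are illustrative.

```python
# Figures quoted above for the "always quiet at night" metric.
california_mean = 51.5   # average across 309 California hospitals (%)
california_std = 9.5     # standard deviation (%)
stanford_value = 41.0    # Stanford Hospital's value (%)

# Standard score: how many standard deviations the value lies from the mean.
z_score = (stanford_value - california_mean) / california_std
print(round(z_score, 2))  # -1.11
```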

So far, we’ve used some math to highlight the Tunnel Vision problem and shown specific examples, real or fictitious, of what is being missed. As our last step, let’s report the results of actual software experiments.

The website hospitals.onlyboth.com showcases the results of applying an automated benchmarking engine to data on 4,813 U.S. hospitals described by 94 attributes, mostly downloaded from the Hospital Compare website at Medicare.gov. A combinatorial exploration of peer comparisons among the 4,813 hospitals turns up 98,296 benchmarking insights that survive the software’s quality, noteworthiness, and anti-redundancy filters, or about 20 per hospital. In this hospitals experiment, insights were required to place a hospital in the top or bottom ten within a peer group of sufficient size.
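
The general shape of such a combinatorial exploration can be sketched in Python. This is a simplified illustration under stated assumptions, not the engine behind the website: the function `peer_insights`, its parameters, and the sample records are hypothetical, and the sketch applies only a minimum peer-group size, with no noteworthiness or anti-redundancy filtering.

```python
from itertools import combinations

def peer_insights(records, symbolic_attrs, metrics,
                  top_k=10, min_size=30, max_attrs=2):
    """Enumerate peer groups built from up to max_attrs symbolic attributes
    and report records ranked in the top or bottom top_k on each metric.
    Returns tuples (name, metric, 'top' or 'bottom', rank, peer_group)."""
    insights = []
    for r in range(1, max_attrs + 1):
        for attrs in combinations(symbolic_attrs, r):
            # Group records that share the same values on the chosen attributes.
            groups = {}
            for rec in records:
                key = tuple((a, rec[a]) for a in attrs)
                groups.setdefault(key, []).append(rec)
            for key, members in groups.items():
                if len(members) < min_size:
                    continue  # peer group too small to be meaningful
                for metric in metrics:
                    ranked = sorted(members, key=lambda rec: rec[metric])
                    for i, rec in enumerate(ranked[:top_k]):
                        insights.append((rec["name"], metric, "bottom", i + 1, key))
                    for i, rec in enumerate(reversed(ranked[-top_k:])):
                        insights.append((rec["name"], metric, "top", i + 1, key))
    return insights

# Tiny made-up dataset; thresholds are lowered so the example produces output.
records = [
    {"name": "H1", "state": "NY", "owner": "church", "wait_min": 12},
    {"name": "H2", "state": "NY", "owner": "church", "wait_min": 30},
    {"name": "H3", "state": "NY", "owner": "public", "wait_min": 25},
    {"name": "H4", "state": "CA", "owner": "church", "wait_min": 18},
]
insights = peer_insights(records, ["state", "owner"], ["wait_min"],
                         top_k=1, min_size=2)
for insight in insights:
    print(insight)
```

Even this toy version shows how quickly the search space grows: every added attribute multiplies the number of groupings to rank, which is why filtering for size and noteworthiness matters before anything is shown to a person.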

The results contain 522 different peer groups, formed by combining the hospital dataset’s 24 non-numeric attributes in various ways. As noted above, the number of peer groups is much, much larger if one counts, not the attributes used, but the diverse ways to combine attribute values. For example, the attribute “state” can either be used or not, so there are two alternatives there, but the number of state values is 50 (or more, including non-state territories), implying many more alternatives. The number of peer groups becomes still larger when accounting for dynamically-formed peer groups based on numeric thresholds.

Of course, the engine explored more peer groups than appear in the end results, which are those found to be large and noteworthy enough to bring to human attention. Also, each peer group appears in many insights, since it is combined with the available metrics. On average, each of the 522 peer groups enables over 900 individual hospital insights, by further combining each peer group and metric with different hospitals.

To summarize, Tunnel Vision in peer comparisons (that is, in benchmarking for understanding and improvement) is widespread, and it misses a vast number of noteworthy and action-provoking insights that could help improve organizational performance. Without automation, there aren’t enough people and time in the world to explore what’s outside the Tunnel, select the best insights, and bring them to human attention. Software automation is the way forward.

Raul Valdes-Perez