A Comparison of 23 Healthcare Comparison Websites

This study examines 23 prominent healthcare comparison websites that U.S. consumers, and other healthcare participants, can use to find and evaluate healthcare providers in terms of, for example, their quality, patient experiences, and cost. Our emphasis is on free website tools that offer side-by-side comparison of two or more providers, in a way that informs consumer choice but also provides value to other users within the large and complex landscape of healthcare.

Our goals in carrying out this empirical study are these:

  • Inform potential users about the landscape of website tools, analogous to how each tool informs users about the healthcare landscape.
  • Inform website-tool designers about the scope of existing features that could be offered by their tools, and perhaps prompt thoughts about non-existent features that should be invented and added.
  • Contribute to transparency about the comparison tools themselves. Data on healthcare providers contributes to transparency and helps drive improvement via several alternative pathways; similarly, data on comparison websites enhances tool transparency and can drive improvement among comparison tools.

In science it is understood that when numerous similar entities exist, it helps to overlay a structure based on identifying the aspects on which they differ; such is our intent here. Often such structure leads further to devising categories that help make sense of the landscape, much as medical science forms categories out of individual diseases.


Evaluation Engines Recognized as Finalist for Healthiest Communities Data Challenge

At the recent Health Datapalooza held March 27-28, 2019 in Washington DC, OnlyBoth’s submission to the Healthiest Communities Data Challenge was recognized as one of the three finalists, all equal co-winners of the Challenge.  We submitted our portfolio of four distinct comparative-analytics engines – benchmarking, comparison, discovery, and scoring – to extract maximum value from the rich county-level dataset on Social Determinants of Health, made available to the pool of 30 contestants, to which we added Census data on county populations.  The resulting engines are accessible via Benchmine.com, or directly at sdoh.onlyboth.com.

Our supporting partners in this submission were HealthBegins and the Allegheny County Health Department.  The benchmarking engine, used to deeply assess the comparative performance of a single county against others nearby and nationwide, was augmented by HealthBegins with recommended actions to take for certain performance deficiencies. For example, here is a comparative deficiency for Bronx County:

Bronx County, New York has the most adults who don’t eat enough daily fruits & vegetables (77.60%) among the 64 counties with at least 876,764 in population (Bronx County, New York is at 1,471,160). That 77.60% compares to an average of 73.58% and standard deviation of 2.90% across those 64 counties. […]  Among those 64 counties, it also has the most adult diabetes (12.3%).

which can be addressed by the recommendations seen by clicking on Taking Action.

Institutions and communities can take a specific, multi-pronged approach to increase daily consumption of fruits and vegetables among adults.  The CDC Guide to Strategies to Increase the Consumption of Fruits and Vegetables describes each of the following strategies in detail. […]

Want to evaluate your city or state?  Let’s say Boston.  Assess the social determinants of health in Suffolk County or Middlesex County.  Then do a side-by-side comparison of the 7 counties surrounding Boston.  Then examine the biggest achievements and improvement opportunities (i.e., deficiencies) across all Massachusetts counties.

Migrating to California, here are two interesting comparative insights for San Francisco County and San Mateo County.

San Francisco County, California has the most violent crimes per 100,000 population (702.66) of the 78 counties with at least $78,621 in median household income (San Francisco County, California is at $81,294). […]

Only San Mateo County, California has both such a high median household income ($93,623) and such a high natural amenities index (8.19). […]
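The threshold-style insights quoted above ("the most X of the N counties with at least T in Y") share a pattern that can be sketched in a few lines. This is only an illustrative sketch, not OnlyBoth's actual engine: the `County` type, the `widest_peer_group` function, and the toy data are all hypothetical, and using the population standard deviation is an assumption, since the post does not say which variant is reported.

```python
from statistics import mean, pstdev
from typing import NamedTuple


class County(NamedTuple):
    name: str
    population: int
    measure: float  # e.g. percent of adults with low fruit/vegetable intake


# Hypothetical toy data, invented purely for illustration.
COUNTIES = [
    County("Alpha", 1_500_000, 77.6),
    County("Beta", 2_100_000, 73.0),
    County("Gamma", 900_000, 71.5),
    County("Delta", 400_000, 80.2),
    County("Epsilon", 1_200_000, 74.9),
]


def widest_peer_group(rows, target_name):
    """Find the lowest population threshold (hence the largest peer group)
    for which the target still has the highest measure among all rows at
    or above that threshold. Ties count as standing out.
    Returns None if the target never stands out."""
    target = next(r for r in rows if r.name == target_name)
    # Candidate thresholds: observed populations no larger than the target's.
    thresholds = sorted(
        {r.population for r in rows if r.population <= target.population},
        reverse=True,
    )
    best = None
    for threshold in thresholds:
        peers = [r for r in rows if r.population >= threshold]
        if any(p.measure > target.measure for p in peers):
            break  # lowering the threshold further only adds more peers
        values = [p.measure for p in peers]
        best = {
            "threshold": threshold,
            "peer_count": len(peers),
            "mean": mean(values),
            "stdev": pstdev(values),  # population std dev (an assumption)
        }
    return best


print(widest_peer_group(COUNTIES, "Alpha"))
```

In the toy data, "Alpha" stands out among the four counties with at least 900,000 in population. Lowering the threshold further would admit "Delta", whose measure is higher, so the search stops there.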

We at OnlyBoth are pleased to show that AI-style comparative analytics can enable unprecedented transparency in both healthcare and health, and thus help drive performance improvements and inform consumer choice, all for the public good.

Raul Valdes-Perez


Scope of Innovation of Healthcare Benchmarking Engines

The engines at BenchMine.com, powered by Artificial Intelligence methods and principles of User Experience design, show these technology-enabled advances over the benchmarking status quo.

1. A user experience based on selecting questions to be answered and getting noteworthy insights as answers, rather than pushing lots of data and dashboards without a clear sense of what is being answered and what is noteworthy. The first-encounter UI poses these questions:
o How is this provider doing? (i.e., where does it stand out positively or neutrally?)
o Where could it improve? (where does it stand out negatively?)
o Where has it changed? (over the last year or two, what changes stand out?)
o What’s best in class? (what are top achievements on specific measures by similar providers?)
o Where does it stand in its county? (or other geography, based on scoring the insights found)

2. Insights are written as perfectly readable, and shareable, English sentences, rather than dashboards. This key novelty led to our trademarked “A sentence is worth 1,000 data.®” and addresses a problem identified in the National Academy of Medicine article Fostering Transparency in Outcomes, Quality, Safety, and Costs, that “Research has demonstrated that many of the current public reports make it cognitively burdensome for the audience to understand the data.” We believe that dashboards are fine to alert that a warehouse is on fire or your car is nearly out of gas, but not to motivate thoughtful deliberations on performance improvement.

3. Calculating provider latitude & longitude, which enables benchmarking each provider against others nearby, e.g., within 20 or 50 miles, or other distances selected by the user.

4. Insights are supplemented with highly-related facts which help the user understand the significance or scope of the stand-out behavior or outcome. These addenda are also written in precise English.

5. Peer groups are not limited to the usual state, national, and perhaps a pre-defined cohort. Instead, the engine does a massive search for peer groups, expressed as a simple combination of data attributes, in which the benchmarked provider stands out. Geographic proximity can be one of these attributes, alone or with others.

6. An especially novel type of benchmarking insight involves aligning two numeric measures. One measure expresses the stand-out behavior, while the second forms the peer group, possibly in combination with symbolic attributes. For example, “In Texas, Park Plaza Hospital in Houston, TX has the lowest nurse-communication rating (2 stars) of the 88 hospitals with as high a doctor-communication rating (4 stars).”

7. By specifying any known algebraic relationships among measures, the engine can insert action-oriented remarks such as the one italicized in this nursing-homes insight (see it online):  “Carroll Manor Nursing & Rehab in Washington, DC has the fewest total nurse staffing hours per resident per day (2.05) of all the 724 nursing homes that are located within a hospital. That 2.05 is 57% lower than the average of 4.8 across those 724 nursing homes. Reaching the average of 4.8 would imply an extra 80.9 nursing staff per day, assuming an 8-hour workday.”

8. Input data can be numeric, symbolic, yes/no, and even set-valued, which gives rise to innovative comparisons like this: “Of the 1,488 hospitals that have at least 4 stars as an overall hospital rating, Shasta Regional Medical Center in Redding, CA is one of just 2 that have a 1-star rating in each of cleanliness, communication about medicines, doctor communication, and quietness (4 total).”

9. As discussed in an AHRQ report on usage of hospital evaluation websites, consumers and healthcare professionals often need different content. So, we have introduced a “Switch Audience” toggle, visible to the user when an insight contains content that appeals to one but not the other, which lets users declare their roles. See the difference by switching the audience to “professional” at this insight on emergency-room wait times and noticing the paragraph that begins with “Note that …”.

10. The final novelty is automation, so that many provider measures can be assessed with the same (human) effort, addressing this point by Dr. Robert Brook: “… quality must be measured in a comprehensive way in order to motivate an institution or physician to provide high-quality care. […] if just a few measures are used to assess quality, the quality of care delivered across all patients in all diseases will be distorted, emphasizing those things that are being measured. Fortunately, we have many well-tested comprehensive quality of care measures that can help prevent this distortion.” Moreover, automation enables introducing measures that express a change over time, and not just the last measurement period, so providers can be compared on how they’ve improved or gotten worse. For example: “… has the biggest plunge in cleanliness rating over one year (-2 stars) of the 1,193 hospitals on the East Coast.”
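As a rough illustration of item 3 above, once each provider has a latitude and longitude, a "nearby" peer group can be formed with the standard haversine great-circle distance. This is a minimal sketch under stated assumptions, not the engine's actual method: the `Provider` type, the `nearby_peers` function, and the coordinates are hypothetical, and the Earth is approximated as a sphere with a mean radius of 3,958.8 miles.

```python
import math
from typing import NamedTuple

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; spherical approximation


class Provider(NamedTuple):
    name: str
    lat: float  # degrees
    lon: float  # degrees


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def nearby_peers(target, providers, radius_miles):
    """Return the other providers within `radius_miles` of `target`,
    ordered from nearest to farthest."""
    with_distance = [
        (haversine_miles(target.lat, target.lon, p.lat, p.lon), p)
        for p in providers
        if p.name != target.name
    ]
    within = [(d, p) for d, p in with_distance if d <= radius_miles]
    return [p for d, p in sorted(within, key=lambda pair: pair[0])]


# Hypothetical providers with made-up coordinates.
PROVIDERS = [
    Provider("A", 40.44, -80.00),
    Provider("B", 40.50, -79.90),  # a few miles from A
    Provider("C", 40.80, -80.30),  # roughly 30 miles from A
    Provider("D", 39.95, -75.17),  # hundreds of miles from A
]

target = PROVIDERS[0]
print([p.name for p in nearby_peers(target, PROVIDERS, 20)])
print([p.name for p in nearby_peers(target, PROVIDERS, 50)])
```

The resulting peer list can then be fed to whatever comparison is being made, so the radius behaves like any other peer-group attribute selected by the user.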

Raul Valdes-Perez

Why Comparing Healthcare Providers Needs Automation

I’ve lived for years in the same area of Pittsburgh, whose streets don’t follow a grid design since it’s hilly and pre-dates the automobile. Sometimes before driving to a familiar destination, I’ll check Google Maps, which alerts me to a favored route that I didn’t even know existed. I act on the suggestion, which usually turns out great. Is this unique to mapping, or can this happen in other domains of reasoning and discovery? How about healthcare?

Solution Spaces and Artificial Intelligence

Automated mapping helps me discover new routes not because I’m spatially challenged, but because the software explores side streets which motorists like me don’t consider. Instead, motorists tend to consider the larger, familiar streets that head toward their destination. Using Artificial Intelligence (AI) concepts, we say that mapping software searches for solutions within a larger space of possibilities than people do. In chess play, software considers piece sacrifices which none but top players will ever think of. It also occurs in scientific research. This should happen in healthcare, too, where there are huge potential gains for many stakeholders and rich data sets are publicly reported.

[Continue reading at LinkedIn Pulse …]



Unprecedented Data-Driven Performance Transparency in Healthcare, starting with Nursing Homes

I am proud to announce that, as part of OnlyBoth’s strong focus on healthcare during 2018, we just launched a web-based Nursing Homes benchmarking engine that deeply leverages the latest, rich data on 15,646 nursing homes published in January 2018 by Medicare’s Nursing Home Compare.  The press release is here, and the new “front door” to the engine is at benchmine.com.

One of our goals is to bring the ultimate performance transparency to healthcare sectors, leveraging initially the tremendous work done by Medicare’s contractors, nursing-home inspectors, and nursing homes themselves to contribute data for public access.

To further this goal, we have chosen to make the service simple, quick, and especially affordable. To evaluate a single home, users pay $9 one-time with a credit card or Paypal. To evaluate any of the 15,646 nursing homes, pay $39. To perform queries that go across all nursing homes, pay $99. These payments give access to one quarterly edition of the engine. We expect to create new editions every quarter, using the latest published data. Read here about the features available at different price points, which support various roles within the nursing home industry.

Finally, I’ll emphasize that the benchmarking engine, which discovers comparative insights worth knowing and writes them up in perfect English, without injecting biased opinion anywhere, generates more words in its nursing-home application – around 80 million – within insightful sentences than are contained in the entire Oxford English Dictionary or the Encyclopedia Britannica.

But don’t let that volume scare you. Just as Google’s search engine stores nearly all the world’s web content, but brings you a manageable number of results, worth knowing, that are relevant to your query, so does a benchmarking engine!

Raul Valdes-Perez

Benchmarking the CDC 500 Cities on 28 Health Measures

The CDC’s 500 Cities Project recently published 28 health measures on the 500 largest U.S. cities (see them listed or mapped). The measures cover various resident behaviors, afflictions, medication, and screening. We at OnlyBoth downloaded the data and set up a cities benchmarking engine to answer these standard comparative questions: How is this city doing?, Where could it improve?, and What’s best in class?

Just enter any of the 500 cities at 500cities.onlyboth.com. Then click on a left-side question to discover noteworthy peer groups in which your selection is near the top or bottom. Or, set up a fencemarking query and click Go at the bottom to, for example, learn the top insights among all 121 California cities, or to uncover comparatively-high binge drinking there (guess who?).

To appreciate this technology and its simplifying potential to motivate human and customer progress, compare to how standard dashboards have been applied to the 500 Cities data. Or, to understand why dashboards aren’t really up to the task of comparative performance evaluation, check out Why San Mateo Daily Journal Really Doesn’t Like California’s Education Dashboards.

Lastly, if you also wish to benchmark counties, read here.

A sentence is worth 1,000 data.®

Raul Valdes-Perez


Where Does Automated Customer Benchmarking Make Sense?

A customer benchmarking engine is an emerging technology which uses an artificial intelligence approach to automate the reasoning that underlies data-driven benchmarking. Its benefits are discussed here, there, and elsewhere. Briefly, it uncovers comparative insights on customers which empower customer-focused employees to be more proactive, or which are shown directly to those customers as a premium information service. The business benefits include churn reduction, market differentiation, extra revenue, and deeper customer relationships.

But, automated customer benchmarking doesn’t always make sense. So where does it? Here I’ll summarize the criteria that we’ve learned from clients, trials, conferences, discussions, and analysis.

Data. A single organization collects data on its business-customers’ traits, behaviors, business outcomes, and feedback, e.g., via surveys. Lack of data on customer outcomes narrows the scope of the insights, which may still have internal value for account management. Also, the organization should not be contractually prohibited from performing comparative analysis across customers, appropriately anonymized if the resulting insights are to be shown to customers. Evidently, the data shouldn’t be wrong or mostly missing.

Motivation. The organization should be B2B because consumers (B2C) are generally less motivated to improve, because they are less driven by external stakeholders. The same lack of strong motivation may be found if the B2B organization serves very small businesses, which are less prone to carry out performance analysis: if they are tiny but making money, then life is good, and if they’re losing money, there are more-urgent issues to address. Think of your small neighborhood restaurant, for example.

Also, the business process that the organization supports with its services should not be seen as a utility, meaning that customers only care that the service be available when they need it, and little or nothing more. Think of an internet connectivity service, for example.

A strong positive indicator of motivation is when customers themselves ask the vendor organization how they’re doing compared to other customers, where they could improve, etc.

Comparability. In principle, benchmarking only makes sense if the benchmarked entities are comparable. It makes little sense to benchmark an elephant against an armchair and an airplane. Comparable doesn’t mean identical or even similar, just productively worthy of comparison. For example, a business consultancy that brings the smartest people in the world to fix whatever problem you have, whether it’s a leaky roof, runny nose, or buggy software, won’t have comparable customers. An HR SaaS company does have comparable customers, even if its customers range from the Fortune 500 to startups and in between, because HR has common elements across companies of any size or industry: employee motivation, compensation, tenure, promotion, recruiting, dismissal, etc. Comparability is a judgment call, but most B2B vendor organizations do have comparable customers, otherwise it would be hard for them to scale their business.

Scale. A customer benchmarking engine is a powerful tool that scales beautifully with the number of customers. But, just as a search engine is probably overkill if you only possess 50 documents, or a receptionist is overkill if you have 5 employees, benchmarking 50 customers likely isn’t worth the trouble, even though the engine will do its job. Given the tradeoffs, we believe that about 150 is the right minimum number of customers for automated benchmarking to make sense.

It’s worth citing some false disqualifiers which are wrongly believed to invalidate customer benchmarking, automated or not. (1) Customers need not be concentrated by industry or segment, much less be competitors, since one is benchmarking the customer’s business process that is supported by the vendor organization’s service, not benchmarking the customer’s overall market performance. (2) The data suitable for benchmarking is rarely scarce. For example, if a given metric (employee satisfaction, say) is potentially insightful, then so is the quarterly change in that metric, since it expresses a trend. Ditto for the change when compared to the same quarter last year. Thus, the insightful metrics are easily tripled, based on changes over time, as we’ve discussed elsewhere. (3) Data need not be perfect; it never is. And the end-result of imperfect data is not a plane crash, but a misleading insight, which tends to be caught and discarded before significant action is undertaken.
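Point (2) above, that each metric yields two more via its quarterly and year-over-year changes, can be sketched as follows. The function name `add_change_metrics` and the sample values are hypothetical, chosen only to illustrate the idea.

```python
def add_change_metrics(quarterly_values):
    """Given one metric's quarterly values (oldest first), return one row
    per quarter holding the raw value plus two derived metrics:
    the change versus the previous quarter and the change versus the
    same quarter one year (four quarters) earlier.
    A derived metric is None when there is not enough history."""
    rows = []
    for i, value in enumerate(quarterly_values):
        rows.append({
            "value": value,
            "qoq_change": value - quarterly_values[i - 1] if i >= 1 else None,
            "yoy_change": value - quarterly_values[i - 4] if i >= 4 else None,
        })
    return rows


# Hypothetical employee-satisfaction scores for five consecutive quarters.
scores = [70, 72, 71, 75, 78]
for row in add_change_metrics(scores):
    print(row)
```

Each of the three columns can then be benchmarked across customers in its own right, which is how one raw metric becomes three.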

Now let’s summarize the four qualifiers (data, motivation, comparability, and scale) in a single brief sentence:  A customer benchmarking engine makes sense for B2B organizations that generate rich data on their 150+ non-tiny customers as a by-product of their non-utility-like, repeatable service.

Who are these organizations?  B2B SaaS (software as a service), Industrial Internet of Things, BPO (business process outsourcing), Managed Services Providers, and Third-Party Administrators are generally good matches if they fit the other criteria.

Automation doesn’t always make business sense, especially when the enabling technology lies outside one’s own organization, which always involves a coordination cost. But automation scales well and can enable things or insights that don’t yet exist. Apart from the benefits discussed elsewhere, this article shares what we’ve learned about where the emerging technology of customer benchmarking engines makes sense.

Raul Valdes-Perez