Introductions to single person, n-of-1 trials

A good article in MIT Sloan Review about patient innovation mentions the rise of n-of-1 trials:

Fortunately, very low-cost approaches exist and are being developed to make it practical for patients — both individuals and groups — to carry out high-quality, ethically appropriate trials. Many of them involve a trial design called “n of 1,” in which trials are of a single patient, or “aggregated n of 1” for multiple patients.

And another in Discover Magazine:

One way to correct for the gaps the gold standard leaves in our knowledge is the “N of 1” trial, where the number of participants (N) is one instead of hundreds or thousands of volunteers. That one person works with the doctor to test a narrow hypothesis — for example, “I think drinking milk will make me feel sick. Am I right?” There are still controls. Ideally, there are still placebos. But at the end, what you get is a patient-specific, individualized answer. It’s a process shown — by controlled clinical trials, no less — to improve patient outcomes. And scientists working with these studies today, including Saeed, are almost invariably enthusiastic about N of 1’s potential.

The medium is the medicine: a novel history

Medicine’s history is often presented as a long sequence of discoveries, all made in laboratories. In fact, though, the biggest changes in US medicine over the last 200 years were propelled by forces beyond medicine, specifically, in media. How medicine’s stakeholders communicated in different eras ⁠— in formats including medicine shows, newspapers, cars, telephones, medical journals, and TVs ⁠— determined what, and how much, was communicated. Where information flows, medicine follows: now social media, biomonitors, and machine learning are ushering in a new age, one of patient-generated medicine.

In pop histories, medicine occurs in a vacuum. The last two centuries are portrayed as a self-contained timeline of heroic, rational progress. Brilliant, persistent men (and a few women) conduct increasingly refined experiments. Theories and practices improve. Medical history is made inside labs, clinics and operating theaters; the outside world is silent.

Excerpts from one timeline give an example of this kind of history:

  • 1844 Dr. Horace Wells uses nitrous oxide as an anesthetic
  • 1847 Ignaz Semmelweis discovers how to prevent the transmission of puerperal fever (wash hands!)
  • 1870 Robert Koch and Louis Pasteur establish the germ theory of disease
  • 1879 First vaccine developed for cholera
  • 1899 Felix Hoffman develops aspirin
  • 1922 Insulin first used to treat diabetes
  • 1928 Sir Alexander Fleming discovers penicillin
  • 1950 John Hopps invented the first cardiac pacemaker
  • 1955 Jonas Salk develops the first polio vaccine
  • 1978 First test-tube baby is born

Unfortunately, timelines of discoveries don’t account for the most important changes in medicine. No notice is taken of changes in the roles of patients, manufacturers, doctors, promoters, researchers, or of changes in what counts as medical knowledge. These changes can only be understood by examining the media through which medical knowledge and relationships have been organized and communicated.

Though few people actually read his idea-glutted books, most college students know how to invoke Marshall McLuhan’s aphorism “the medium is the message” when they’re running out of things to scribble in an essay. The most common interpretation of McLuhan: the tools we communicate with dictate the shape of the content conveyed. Amphitheaters create comedies and tragedies; books enforce specialized linear narratives; TV portrays universalist visual poetry. (Tech observer Clive Thompson concisely summarizes this view.)

McLuhan’s aphorism stretches further, though.

First, for McLuhan, a medium wasn’t just a tool for mass communication like newspapers, TV, movies, books, and radio. (Though pundits, journalists and academics flatter themselves that McLuhan was thinking only about their careers.) In the first paragraph of his epoch-making Understanding Media: The Extensions of Man, published in 1964, McLuhan defines media as “any extension of ourselves.” And his discussion of media in ensuing chapters includes telephones, rifles, sports, cities, comics, theater, clocks, roads, cars, airplanes, and money. Really, for McLuhan, any human construct that plays a role in how people relate to knowledge, the world, or (most importantly) each other, is a medium.

Second, for McLuhan, a medium doesn’t only shape its own contents or immediate environment. Its effects are pervasive, seeping out to influence how we generally live and work. “The medium… shapes and controls the scale and form of human association and action.” We make the tool, then the tool remakes us. When the medium is an ocean, we grow gills and swim; when it’s a sky, we sprout wings and fly. In McLuhan’s metaphorical gender-fender-bender, “man becomes the sex organs of the machine world.”

According to McLuhan, “the effects of technology do not occur at the level of opinions or concepts, but alter sense ratios or patterns of perception steadily and without any resistance.” What counts as reality varies, depending on what we’re counting with.

To liberate “medium” from its common definition as press/entertainment, it helps to invoke the word’s meaning in a biology lab: “a substance, such as agar, in which bacteria or other microorganisms are grown for scientific purposes.” In this sense, we’re bacteria floating in a petri dish, barely able to distinguish ourselves from the medium that supports and nourishes us. Close and ubiquitous, the medium is invisible; its properties become ours.

Common sight in a high school lab: a petri dish filled with agar medium and bacteria blooms.

Considered in terms of periods of ascendent media, the history of US medicine breaks into at least five often fuzzy and sometimes overlapping eras.

1700-1900

In the 18th and 19th centuries, Americans frequented medicine shows for both amusement and health remedies. The horse-drawn entertainers travelled muddy paths between homesteads and villages to sell each proprietor’s own hand-mixed nostrums. (Before launching his eponymous circus, PT Barnum floundered for a few years as a medicine pitchman. He traversed New England peddling Proler’s Bear Grease — guaranteed to grow hair! — from a gaudy wagon.)

Most people relied on home remedies — concoctions literally made and consumed in the home. According to Paul Starr’s history of American medicine, Scotsman William Buchan’s book Domestic Medicine was a favorite resource, going through at least 30 editions. Buchan’s book emphasized diet and simple preventative measures—exercise, fresh air, cleanliness. Buchan was avowedly populist. In his discussion of smallpox vaccination, Buchan argued that doctors often retarded medical progress. In his words, “the fears, the jealousies, the prejudices, and the opposite interests of the faculty [elite physicians] are, and ever will be, the most effectual obstacles to the progress of any salutary discovery.” Doctors should be consulted only in extremis, argued Buchan.

Doctors came in many stripes, as Starr documents. After early attempts, state by state, to institute medical licensure at the beginning of the 19th century, a wave of Jacksonian populism wiped away formal accreditation in most states. Anyone could practice medicine. For aspiring doctors, both medical societies and medical schools offered means of establishing credibility with patients. Competing with each other for fees, the two institutions lowered standards and boosted the supply of doctors. Many doctors had second careers. Farming was a popular second job; one doctor moonlighted robbing stagecoaches.

1850-1900

During the Civil War, the adoption in publishing of cheap wood pulp triggered a newspaper boom largely funded by ads for “patent medicines.” Patent medicines accounted for one third to one half of all ads in US newspapers in the nineteenth century. As one chronicle of the era puts it, “Newspapers made the patent medicine business, which in turn supported the newspapers.” Patent medicines were often national brands, mass produced elsewhere in the country and transported by rail. (Later, many of these brands shed their medicinal claims and, with commercial foundations deeply planted, remain familiar today, most notably Coca Cola, which was originally peddled as an intelligence booster.)

Doctoring was still hit or miss. Though microbes were identified as a source of illness in 1870, most infections remained untreatable until the advent of antibiotics in the 1940s. New instruments like the stethoscope, ophthalmoscope, spirometer, electrocardiogram and laryngoscope “added a highly persuasive rhetoric to the authority of medicine,” but did little to prolong life. Most health gains in the period resulted from advances in public health—plumbing, sewers, nutrition, rising incomes, water purification—rather than interventions by individual doctors.

1900-1920

At the turn of the 20th century, doctors were among the first to adopt telephones and cars, which made doctoring far more efficient and lucrative. (Google graph of word frequency in a cross section of scanned books.)

Before 1900, “many people thought of medicine as an inferior profession, or at least a career with inferior prospects,” according to Starr. The average American doctor earned less than “an ordinary mechanic,” riding miles each day on horseback to see just a handful of patients.

The status of doctors changed dramatically in the first decade of the 20th century, when cars, telephones and urbanization made the practice of medicine more efficient and, therefore, more lucrative.

Some nuggets that help make these changes more tangible:

  • Starr reports that “the first rudimentary telephone exchange on record, built in 1877, connected the Capital Avenue Drugstore in Hartford Ct, with twenty one local doctors.” Drugstores had long functioned as message boards for physicians, and many patients, still phoneless, visited a drugstore to connect with a doctor.
  • Doctors were early adopters of automobiles, writes Starr. The Journal of the American Medical Association published several auto supplements between 1906 and 1912. Doctors reported that, compared with a horse, house calls using a car took half as much time and were 60% cheaper. (In Arrowsmith, Sinclair Lewis’ satire of medicine, the eponymous protagonist’s best friend leaves med school in 1908 to sell cars, and a med school professor advises students that patients’ tonsils are essentially a currency for acquiring an automobile.)
  • The share of the population living in settlements of more than 2500 rose from less than 30% in 1880 to 45% in 1910, notes Starr. In the northeast, population density in settlements soared from 200 people per square mile in 1890 to 450 in 1910. In addition to density, cities brought single people lacking family support; together these factors spawned hospitals, argues Starr. In turn hospitals facilitated specialization, capital investment and knowledge sharing. Hospitals also bolstered the authority of local medical associations, since these groups often dictated which doctors could admit patients.

At the turn of the century, many doctors’ incomes doubled, and their social status and political clout climbed accordingly. In step with unionization in other trades, doctors reorganized and strengthened the American Medical Association; its membership rose from 8,000 in 1900 to 70,000 in 1910, writes Starr. Doctors used their new clout to limit competition — halving the number of medical schools, marginalizing non-accredited healers, and lobbying to end direct-to-consumer advertising by patent medicine makers.

Doctors were frequently quoted in Samuel Hopkins Adams’ famous 1905-1906 series about patent medicines, The Great American Fraud, in Collier’s Weekly. The opening paragraph marked the beginning of the end of patent medicines:

Gullible America will spend this year some seventy-five millions of dollars in the purchase of patent medicines. In consideration of this sum it will swallow huge quantities of alcohol, an appalling amount of opiates and narcotics, a wide assortment of varied drugs ranging from powerful and dangerous heart depressants to insidious liver stimulants; and, far in excess of all other ingredients, undiluted fraud. For fraud, exploited by the skillfulest of advertising bunco men, is the basis of the trade. Should the newspapers, the magazines and the medical journals refuse their pages to this class of advertisements, the patent-medicine business in five years would be as scandalously historic as the South Sea Bubble, and the nation would be the richer not only in lives and money, but in drunkards and drug-fiends saved.

Samuel Hopkins Adams, Collier’s Weekly, October 5, 1905

The AMA, in turn, converted the series into a pamphlet, printing and distributing 500,000 copies. (Some newspapers, dependent on patent medicine ads, later refused to run ads for books by Adams.) Public outcry and AMA lobbying led to passage of the Pure Food and Drug Act of 1906, which strengthened the role of the Bureau of Chemistry, precursor of the FDA, in policing claims by drug makers.

1920-2000

Drug makers’ access to consumers through ads was increasingly restricted as the 20th century progressed. In response, manufacturers focused (and continue to focus) on cultivating doctors as lucrative allies in flogging drugs. Over the last 100 years, drug makers have perfected a number of methods.

At the simplest level, drug companies promote their wares to doctors by advertising in medical journals.

They also, today, pay an army of an estimated 100,000 salespeople to talk with doctors. Drug companies track doctors’ prescription habits using a database licensed, in part, from the AMA. Using extensive notes and “finely titrated doses of friendship,” the salesforce targets physicians with meals, flattery, honoraria, free samples, sports tickets, drinks, business consulting, support for favorite little league teams, and sundry other gifts. Physicians are susceptible to influence, says one former sales rep, “because they are overworked, overwhelmed with information and paperwork, and feel under appreciated.” Conveniently for the drug companies, physicians believe themselves to be righteously impervious to influence, despite abundant evidence to the contrary. Perversely, one Canadian study found that “the more gifts a doctor takes, the more likely that doctor is to believe that the gifts have had no effect.”

Drug companies have also increasingly funded training for doctors.

Then there are the spokes-doctors (known as speakers) who are collectively paid billions each year, according to amazing data compiled by watchdog Propublica, to explain products to their credulous peers while being proctored by company sales reps. One doctor describes being called on the carpet after he deviated from the company-created Powerpoint deck. “The manager’s message couldn’t be clearer: I was being paid to enthusiastically endorse their drug. Once I stopped doing that, I was of little value to them, no matter how much ‘medical education’ I provided.”

Whether it’s relayed by salespeople or spokes-doctors, the ostensible content of drug marketing is published research. This too has been “doctored.” Over time, drug makers have become adept at funding or cherry picking results favorable to their products, publishing the same research in multiple guises (called salami slicing), even ghost writing articles; by century’s end, analysts concluded that, in the words of one former BMJ editor, “medical journals are an extension of the marketing arm of pharmaceutical companies.”

1997-2019

In 1983, the FDA shut down an ibuprofen ad directed at consumers ⁠— an antique featuring the company’s suit-clad British CEO and a chalkboard ⁠— that aired briefly on TV. Over time, though, court rulings made it clear that drug makers’ First Amendment rights would be upheld if they challenged the FDA.

So in 1997, when the FDA changed its rules to allow drug companies to promote products directly to consumers in TV ads without including a detailed list of side effects, medicine jibed in a new direction again.

Critics say drug companies increasingly divert effort from finding new treatments for illnesses towards “disease mongering”— finding and promoting new disorders, preferably chronic, that can be treated with previously FDA-approved drugs. Expensive “me-too” drugs, patented for a trivial improvement but heavily marketed, also soak up company resources. Companies like IQVIA ($27 billion valuation) mine data seeking new market niches for existing drugs. Shyness gets rebranded as “social anxiety disorder,” treatable with Paxil, a $3 billion/year antidepressant. In progressive revisions of psychiatry’s handbook, the Diagnostic and Statistical Manual, committees of researchers, most of whom are funded by drug companies, both loosened treatment criteria and ramped up the number of psychiatric disorders from 106 in edition 1 in 1952 to 297 in edition 5. Almost all those disorders lack biomarkers or diagnostic verifiability.

Gradually, the white-coated doctors who traditionally served as drug company CEOs have departed, their places taken by businessmen in suits. Today, just three doctors remain among the CEOs of the world’s biggest 15 drug companies; the others include former sales reps (3), lawyers (2), and economists (2), with the remainder scattered across marketing, biz dev, operations, investment banking, and consulting. (One is a woman, one is African American.)

Trying to put the genie back in the bottle, in 2015 the AMA called for a ban on direct-to-consumer advertising. No dice, Aladdin.

Today

Many health challenges have been overcome since 1900. Infant mortality is down 90% and maternal mortality is down 99%, according to the CDC. Vaccines have eliminated smallpox and polio, and slashed measles infections by 99.8%. Many cancers are now chronic conditions rather than death sentences. (Medical triumphs notwithstanding, public health advances get credit for 25 years of the 30 year lengthening in US life spans between 1900 and 2000.)

The challenges that remain are complex and, often, multivariate: Alzheimer’s, Parkinson’s, arthritis, and psoriasis; conditions like type 2 diabetes and heart disease, whose etiology is sometimes behavioral; and orphan diseases, guerilla fighters holed up out of sight of medicine’s regular brigades and firepower.

Beyond the obscurity, intractability or complexity of each lingering disorder or illness, the infrastructure of medicine itself has erected barriers to innovating and solving patients’ problems.

Drug and device manufacturers focus on milking their past winners and funding potential blockbusters. The FDA’s rigorous clinical trial requirements make micro-experiments on treatments for orphan disorders risky and uneconomical. Doctors are overworked and demoralized, stretched by demanding corporate bosses and electronic record keeping.

The electronic medical record (EMR) software mandated in 2009 is, no surprise since you’re still reading this essay, reshaping care. Atul Gawande noted that “a system that promised to increase my mastery over my work has, instead, increased my work’s mastery over me.”  More cynically, “EMRs no longer seem to even pretend to be about patient care,” writes Dr. Judy Stone. “The goal is to optimize billing through upcoding. You do that, in part, by ‘documenting’ more, through check boxes and screens that you can’t skip.” 

Forged by medical school and residency and, often, relying on pharma reps for updates, the average doctor is 17 years behind current best practices. As our understanding of medical conditions becomes more complex, and as the medical organizations that address them also, in parallel, become more complex, effecting change becomes harder and harder. Read Elizabeth Rosenthal’s An American Sickness, and you’ll see that—as with software systems created by multiple authors for myriad users, and grown bloated over scores of years—entropy and inertia prevail. The problem is evergreen:

We are still far from understanding how healthcare practice can be improved rapidly, comprehensively, at large scale, and sustainably. In fact, this observation has been made several times in previous decades.

Knowledge translation in health: how implementation science could contribute more

Now another era is dawning, that of patient-generated medicine. Early in the 21st century, three media of unprecedented volume and reach are leading changes in medicine’s trajectory:

  • Social media is empowering patients with niche medical conditions to network and pursue collective action; in parallel, powerful new targeted ad technology makes it possible for companies to target these patients’ needs. These communities range from disease-independent platforms like PatientsLikeMe to Facebook groups for people with migraines or psoriasis. Podcasts and communities run by Peter Attia and Rhonda Patrick are organizing new constituencies for longevity and athletic health. Millions of members of Strava, Peloton, Zwift and the Potterhead Run Club focus on community and healthy competition. Programmers share code on Github. Laypeople and pros trade insights at Twitter hashtags like #LCHF. (All this is apparently invisible to tech pundits like Kara Swisher, who recently summed up social media as “a chaos machine that ginned up a world of socially acceptable sadism.”)
  • Thanks to biomonitors, many people now have up-to-the-minute access to giant volumes of their own health data, with details that are far more specific than those accessed by the average family practitioner. Moore’s law is making technology, both hardware and software, exponentially cheaper, smaller and/or more powerful. Affordable tools include real time glucometers, wifi-enabled oximeters, cloud-synched heart monitors (for data ranging from heart rate to EKG to heart rate variability), motion detectors, speech tracking tools… and many more. (ElektraLabs catalogs 600+ medical-grade connected biomonitors.) A full genome decoding, which cost $2.7 billion in 2003, was on sale (on Facebook, of course) for $189 on Black Friday 2019. Software (like ours!) uses n-of-1 protocols to power personal tests of drug efficacy, deconstructing drug makers’ efforts to peddle “average effects.” Dr. Eric Topol, director of the Scripps Translational Science Institute, argues that increasing patient access to more data and more insights will end the information asymmetries that generate medicine’s deep seated “extreme paternalism.” He argues that “just as Gutenberg democratized reading, so there is a chance that smartphones will democratize medicine.”
  • Machine learning is the only tool capable of processing the mind-boggling volume of data being generated by and about humans. Machine learning can trawl through billions of data points and detect patterns far subtler than anything human computation or senses might notice. The acceleration of machine learning is powered, in part, by Moore’s law: one vital computing resource for machine learning is doubling every 3 months. But the growth of machine learning is also founded on new theories and algorithms. As we shall see, machine learning not only threatens doctors’ preeminence as the high priests of human health, but potentially destabilizes the conception of knowledge underlying the modern idea of progress.

Patient communities may provide a vehicle for evading the gridlock, pulling innovation through that previously needed to be pushed. With patients convening and advocating for themselves, perhaps doctors will be able to transition from being expedition leaders to being sherpas.

In some cases, these communities have helped fund research or performed the research themselves. For example, parents of eight children with a rare genetic disorder called NGLY1 coalesced around a single blog post and rallied to coordinate knowledge sharing and research for their children. Parents of two of those patients noted in an article in Nature:

Until very recently, the fragmented distribution of patients across institutions hindered the discovery of new rare diseases. Clinicians working with a single, isolated patient could steadily eliminate known disorders but do little more. Families would seek clinicians with the longest history and largest clinic volume to increase their chances of finding a second case… This challenge can be circumvented by tools already created for and by the Internet and social media.

Nineteen months after the initial report by Need et al., five viable approaches to treatment are under active consideration, thanks to relentless digging by afflicted families. One parent found a compound that seems to have measurably raised the quality of life in one NGLY1 child. Another parent read about a novel (but relevant) fluorescent assay and shared it with the NGLY1 team. The team had not heard about it, but it has become a fundamental tool in the functional analysis of NGLY1. One parent has formed and funded a multi-institutional network of researchers to tackle specific projects. The capabilities of parents and the social media are frequently underestimated; we are here to say: join us!

Online communities of sleep apnea sufferers provide a case study of how these two burgeoning media categories—social media and biosensors—will alter, for better or worse, medicine’s trajectory. (Sleep apnea is a nightly series of breathing interruptions that disrupt sleep, elevate cortisol levels and cause headaches and daytime sleepiness. Doctors traditionally prescribe a CPAP, a machine that continuously pushes air into a patient’s nose or mouth, to reduce sleep apnea. Meanwhile, humans’ understanding of a CPAP’s actual efficacy remains iffy.)

One product of patient generated medicine is a profusion of products created by companies leveraging new channels to communicate with patients. Because apnea sufferers commune online and can be tracked with ad technologies, companies are targeting them with hundreds of new promotions and products.

As of November 1, 2019, Somniflex is running 47 ads for its ‘mouth tape’ product.
One company, Fisher Wallace, is even running ads soliciting investors.

Another harbinger of the new age lies in open source software built by and for apnea sufferers. In 2011 one apnea sufferer, Mark Watkins, started working on software called Sleepyhead to decode the data collected by a patient’s CPAP machine.

“As time progressed, I became increasingly disgusted at how the CPAP industry is using and abusing people, and it became apparent there was a serious need for a freely available, data focused, all-in-one CPAP analysis tool” — Mark Watkins, creator of SleepyHead software

‘I’m Possibly Alive Because It Exists:’ Why Sleep Apnea Patients Rely on a CPAP Machine Hacker

By reverse-engineering the data locked up on various brands and models of CPAP machines, Watkins gave users various views of their own data in charts and graphs.

Watkins’ code has been promoted and discussed in numerous social media communities that focus on apnea and CPAPs. More than 78,000 sufferers belong to a forum called Apneaboard. On an average night, 2,000 members are posting or lurking. Other boards like CPAPtalk have tens of thousands of members. On Facebook, a group of 1,000 focuses on central sleep apnea, a rare apnea subset. Without these groups’ attention and cheering, it’s unlikely Watkins’ software would have thrived.

Users say they get far better insight from Sleepyhead than from their own doctors. And academics now rely on Watkins’ code to compare various brands of CPAP machines, brands which try to keep their data in walled gardens.

Ironically, members of the CPAPtalk community, agitating for a more aggressive development path, ultimately launched an offshoot of Watkins’ code (technically a “fork”) called OSCAR, and Watkins abandoned his project in early 2019.

As humans frantically write software to graph and analyze the data pouring out of biomonitors and demanded by specific communities, whether of athletes or sufferers of orphan diseases, we’re also seeing comprehensive attempts to digest data using machine learning. While machine learning (also called deep learning or neural nets or, more generically, artificial intelligence) sounds like just a new flavor of software, it is radically different. It’s theory-agnostic, approaching data with no hypotheses or assumptions or models in hand. It plows through oceans of data and fishes up patterns, insights, anomalies. It offers no theories about what it’s found.

In his recent book Everyday Chaos, technologist and philosopher David Weinberger suggests that “the rise of machine learning is one of the significant disruptions in our history.”

Each of us experiences a version of this phenomenon every day. Google’s tens of thousands of engineers are famous for designing competition-thrashing algorithms to serve up useful information in response to people’s searches. In 1999, Google’s solution was a single simple rule—using the number of inbound links to a web page as a proxy for that page’s authority. Eventually, Google’s engineers wrote software that factored in hundreds of signals to help make this judgement. But a funny thing happened in 2015. Seeking to optimize its responses to searches it had never seen before — apparently 15% of Google searches are unique?!— Google did some testing and determined that machine learning, named RankBrain, did a better job than most of its massive, intricate, traditional human-built rules, to the dismay of some of its engineers. Google now leans on RankBrain when responding to unique searches. This means the managers of the world’s most influential processor of human information can’t explain the black box that powers 15% of its interactions with humans.

As Weinberger notes, choosing to rely on machine learning yields a radically different view of the universe in comparison to the positivist/progressive assumptions that guided the last 300 years of science and medicine. We’ve believed that rationality and experimentation carry us steadily forward into a better world. Humans will figure things out. This optimism is dogma in US medicine.

But this view is breaking down. “As our predictions in multiple fields get better by being able to include more data and track more interrelationships, we’re coming to realize that even systems ruled by relatively simple laws can be so complex that they’re subject to… cascades and to other sorts of causal weirdness,” writes Weinberger.

Confronting the fact that machine learning often produces insights and connections we can’t explain, “we’re beginning to accept that the true complexity of the world far outstrips the laws and models we devise to explain it.” We’re “losing our naive confidence that we can understand how things happen.”

This view is profoundly disruptive to both the theory and practice of medicine. Doctors rely on models and hypotheses to operate; their long-trained capacity to explain, even when they’re unable to cure, shores up their social authority. Now doctors are being bettered by machines that invoke no models or hypotheses to do their work, and offer no explanations for their findings.

Physicians’ egos aside, the new paradigm has practical ramifications. In light of the fact that thousands of variables may affect a condition, napalming the tangled jungle of a particular disorder with one or even one dozen chemicals will soon be seen as vast overkill, both ineffective and a waste of resources.

Historically, doctors have fought change—whether the advent of the stethoscope or professional nursing or home pregnancy tests—that threatened to dilute their authority. It remains to be seen how doctors will relate to machine learning. When I mentioned Aging.ai to my own doctor, he was unfamiliar with the site. I explained that its neural net had processed hundreds of thousands of anonymized blood tests and found that a high BUN score is the single best predictor of early death. (This metric is of particular interest to me because my BUN is high, on par with an average 95-year-old’s.) He scoffed, “Ah, don’t worry about it, that’s just software.” 

A new medium—whether medicine shows, telephones, medical associations or Facebook groups—congregates people and creates an order; it enables and shapes expectations. The new audience inspires or demands new products.

For better or worse, how information flows controls where medicine goes.

  • When entertainment and drug making were local, Americans depended on nostrums made at home or sold by horse-drawn medicine shows.
  • When railroads and newspapers transformed the creation and promotion of goods, Americans were inundated with patent medicines.
  • When doctors rose in status because of telephones, cars, urbanization and unionization, they gained a monopoly on medical standards and on dispensing medicines.
  • When, in turn, drug manufacturers used research and medical journals as tools to sell to doctors, medicine was steered by what the companies chose to fund and promote.
  • When televised drug ads were unleashed, consumers were sold on new sub-ailments. When drug companies co-opted the process of drafting diagnostic manuals, diagnoses multiplied.
  • Now social media, biomonitors and machine learning are ushering in a new age, one in which individual patients, rather than institutions, can drive innovation, or quackery, in directions they choose.

What are n-of-1 trials, and what are they good for?

Definition of an n-of-1 trial

An n-of-1 trial is an experiment conducted for a single person in which treatment blocks are randomly rotated, symptoms are systematically logged, and results are statistically analyzed. Since many treatments work differently for different individuals, n-of-1 trials help determine a treatment’s efficacy for a specific individual. N-of-1 trials are typically used for chronic conditions and are not considered appropriate for acute illnesses.

Examples of n-of-1 trials

October 2019: An unblinded study of ~200 patients with chronic pain showed that patients using an n-of-1 protocol used significantly less pain medication. The study allowed the n-of-1 users to objectively compare the efficacy of medications, both NSAIDs and opioids, versus non-pharmaceutical interventions like yoga and physical therapy.

2014: N-of-1 (Single-Patient) Trials for Statin-Related Myalgia found that patients on or off statins experienced no difference in pain.

2005: Double blinded n-of-1 trials for 71 patients with chronic pain comparing over-the-counter medications found that 65% changed their treatments.

Examples of n-of-1 therapies

In addition to being a term of art for evaluating common treatments for their efficacy on a single individual (as above), “n-of-1” is increasingly used to describe drug or genomic solutions crafted to fit a single patient’s needs. A good example is the recent case of Mila, a 6-year-old with a rare, fatal genetic condition, who was treated by doctors who crafted a drug specific to her genetic mutation.

Which strategy works best to disrupt a social media addiction?

In an article in Harvard Business Review, author Sarah Peck sums up the relative success of four strategies she tested to break her own social media addiction.

  • No social media for 30 days, which was ‘easier than expected.’ Result: after the month was over, Peck discovered her phone was her addiction enabler.
  • Allowed social sites on the computer in the afternoons only — not in the mornings, or after dinner. Result: “I got so much more done on my biggest projects by having dedicated focus hours, and also knowing that there was a scheduled break in my day coming up.”
  • Created a “happy hour” in which she was free to browse. Result: “Strangely, consolidating all of my social media use into a single hour made it seem less exciting.”
  • Avoid social media (and the phone!) every Saturday. Result: “A day free of the Internet is a great way to do a pattern reset if you notice (as I have) personal productivity dips by Friday.”

In the end, Peck concluded that each approach had its merits and that ANY consciousness of social media addiction led to greater productivity and satisfaction.

Peck’s takeaway: “These experiments helped me realize that at the heart of my cravings around the social internet are deep connections with friends, access to new ideas and information, or time to zone out and relax after a hard day. Each of these components can be satisfied with other things beyond social media, and more effectively.”

This is what innovation (finally) sounds like

The disconnect between the price trajectories of healthcare versus consumer electronics — the first doubling every 10 years, the second halving every three years — is one of the most glaring of the 21st century.
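To make the gap concrete, here is a back-of-the-envelope sketch that compounds the two rates quoted above. The function name `relative_price` and the 30-year horizon are illustrative assumptions, not figures drawn from any dataset.

```python
def relative_price(years: float, period: float, factor: float) -> float:
    """Price relative to today after `years`, if it multiplies by `factor` every `period` years."""
    return factor ** (years / period)

horizon = 30  # years; an arbitrary horizon chosen for illustration

healthcare = relative_price(horizon, period=10, factor=2.0)   # doubling every 10 years
electronics = relative_price(horizon, period=3, factor=0.5)   # halving every 3 years

print(f"healthcare: {healthcare:.0f}x today's price")
print(f"electronics: 1/{1 / electronics:.0f} of today's price")
```

Under these assumed rates, three decades take healthcare to 8 times its starting price while electronics fall to 1/1024 of theirs.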

In coming months, that gap will start to close for people with hearing loss, as FDA-approved over-the-counter hearing aids appear on the market.

Traditional hearing aids cost up to $3,000 apiece. They’ve required a prescription, plus an audiologist’s fitting and tuning, yet were not covered by most insurance plans or Medicare. Of an estimated 30-40 million Americans with hearing loss, fewer than 1 in 4 have hearing aids. (Source.)

New FDA-standard over-the-counter hearing aids will cost just hundreds of dollars. (Want lots more facts, figures and prognostications? Read my essays about the impact of technology innovation on medical devices and services.)

While the trade group for hearing aid manufacturers (portentously hosted at hearing.org!) lobbied for OTC hearing aids to be limited to people with mild hearing loss, the legislation included people with moderate hearing loss.

Currently you can buy a “personal sound amplification product,” known as a PSAP, in a pharmacy, but quality varies greatly. A study in 2017 found that for people with hearing loss, a prescription device boosted comprehension from 75% of words to 88%. Four of five PSAPs tested boosted comprehension to 81.4% to 87.4%. (One did worse than nothing!)

The process began in 2017, when the FDA Reauthorization Act of 2017 passed Congress and was signed into law. The act instructed the FDA to permit OTC hearing aids by 2019.

In March, CVS closed 30 in-store “hearing centers” across the country in recognition of the shifting market.

Meanwhile, stripped of their role as gatekeepers to hearing aids, audiologists are scrambling to reposition themselves. “Dependence on the audiogram is old-school and leads to a one-size-fit-all approach to hearing care,” wrote Barbara Weinstein, a professor of audiology, in a trade publication for audiologists. “And with the passage into law of Section 709 of the FDA Reauthorization Act of 2017 (FDARA) that includes the Over-the-Counter (OTC) Hearing Aid Act, we must abandon this way of thinking. A maturing of our philosophy as audiologists should emerge from the disruptions presented by the FDARA, with an emphasis on the value added of engaging with audiologists.”

Critics argue that the “tuning” currently touted by audiologists is an artifact of a pre-digital age. “It’s true that we don’t go in and individualize each one, but [professional fitting] is an ancient byproduct of a time 30 years ago when [hearing aids] really needed to be tuned,” Christian Gormsen, CEO of Eargo, which makes hearing aids and sells them directly to consumers, told WebMD.

For now the only quality standard for OTC hearing aids will be the FDA’s, meaning manufacturers will strive to be “good enough” and probably compete only on price and aesthetics. If Underwriters Labs ever introduced testing and benchmarking for consumer medical devices (oximeters, hearing aids, heart rate monitors, sleep monitors), we might see companies put more effort into achieving objective excellence on metrics and utilities beyond size and design.

Best practices for personal experiments, aka n-of-1 trials, are well documented but rarely used

Though the effectiveness of many treatments varies widely across individuals, treatments are rarely rigorously evaluated on a personal basis. Instead, pressed for time, physicians rely on trial and error testing. But protocols for personal experiments are simple, and their benefits are well documented.

“We spend so much effort on precision in diagnosis, yet we have very little precision about treatment,” observed Dr. Eric Larson, a professor of medicine at the University of Washington. 

Experts say that, even for widely-studied drugs, “the trial and error approach to medicine is not cost efficient because several of the most prominent drugs only work on one-third to one-half of patients. Thus, many patients are subjected to the cost, inconvenience, side effects, and potential adverse reactions of taking medicine that will have little to no clinical benefit.”

Responding to this problem in the late 1980s, medical researchers at McMaster University in Canada determined that many treatments for chronic conditions could be successfully evaluated on a person-by-person basis. (You can read a great history of that work here.) They advocated applying routine research techniques ⁠— standardized reporting, randomized treatment blocks, blinding, placebos and statistical rigor ⁠— to individual testing.

For chronic conditions, single patient trials are “not only ethical but arguably obligatory to undertake,” one researcher argued in 2002.  

Though clearly needed, single person experiments are still relatively underutilized. “The apparent simplicity of this study design has caused it to be enthusiastically touted in some research fields and yet overlooked, underutilized, misunderstood, or erroneously implemented in other fields,” note Jean Slutsky and Scott R. Smith of the US Agency for Healthcare Research and Quality.

The agency’s user guide to personal experiments, published in 2014, is an invaluable resource on best practices in single person experiments.

The user guide defines n-of-1 trials simply as “multiple crossover trials, usually randomized and often blinded, conducted in a single patient.” Best practices include:

  • “Treatments to be assessed in n-of-1 trials should have relatively rapid onset and washout (i.e., few lasting carryover effects).”
  • “Regimens requiring complex dose titration (e.g., loop diuretics in patients with comorbid congestive heart failure and chronic kidney disease) are not well suited for n-of-1 trials.”
  • Blinding with the help of a compounding pharmacist or trusted friend can reduce placebo effects. Unfortunately, “even for drug trials, few community practitioners have access to a compounding pharmacy that can safely and securely prepare medications to be compared in matching capsules.”
  • “One-time exposure to AB or BA offers limited protection against other forms of systematic error (particularly maturation and time-by-treatment interactions) and virtually no protection against random error. To defend against random error (the possibility that outcomes are affected by unmeasured, extraneous factors such as diet, social interactions, physical activity, stress, and the tendency of symptoms to wax and wane over time), the treatment sequences need to be repeated (ABAB, ABBA, ABABAB, ABBAABBA, etc.).”
  • “For practical purposes, washout periods may not be necessary when treatment effects (e.g., therapeutic half-lives) are short relative to the length of the treatment periods. Since treatment half-lives are often not well characterized and vary among individuals, the safest course may be to choose treatment lengths long enough to accommodate patients with longer than average treatment half-lives and to take frequent (e.g., daily) outcome measurements.”
  • “In n-of-1 trials, systematic assessment of outcomes may well be the single most important design element. …  In the systematic review by Gabler et al., approximately half of the trials reported using a t-test or other simple statistical criterion (44%), while 52 percent reported using a visual/graphical comparison alone. Of the 60 trials (56%) reporting on more than one individual, 26 trials (43%) reported on a pooled analysis. Of these, 23 percent used Bayesian methodology, while the rest used frequentist approaches to combining the data.”
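The repeated sequences the guide describes (ABAB, ABBA and so on) can be generated mechanically. The sketch below is one possible approach, not the guide’s own procedure: it randomizes the order of A and B within each pair of blocks, so both treatments get the same number of days. The function name `make_schedule` and its parameters are hypothetical.

```python
import random
from typing import List, Optional

def make_schedule(n_pairs: int, block_days: int, seed: Optional[int] = None) -> List[str]:
    """Return a day-by-day treatment schedule built from randomized AB/BA pairs."""
    rng = random.Random(seed)
    days: List[str] = []
    for _ in range(n_pairs):
        pair = ["A", "B"]
        rng.shuffle(pair)  # randomize the order within each pair of blocks
        for treatment in pair:
            days.extend([treatment] * block_days)
    return days

# Three AB/BA pairs of 7-day blocks: 42 days in total, 21 on each treatment.
schedule = make_schedule(n_pairs=3, block_days=7, seed=1)

# Collapse the daily schedule to one letter per block, e.g. a string like "ABBAAB".
blocks = "".join(schedule[i] for i in range(0, len(schedule), 7))
print(blocks)
```

Randomizing within pairs, rather than shuffling all blocks freely, keeps the two treatments balanced over time. That balance is one way to limit the time-related errors the guide warns about.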

As noted above, the benefits of n-of-1 trials include reducing the cost, inconvenience, side effects, and potential adverse reactions of taking medicine that has little to no clinical benefit. A protocol counters the influence of numerous biases and errors. In studies of n-of-1 trials for some chronic conditions, a personal treatment experiment resulted in a treatment change for up to two thirds of individuals.

The most important result, though, may be increasing the participants’ sense of playing an important role in their own treatment journey.

To learn more about creating your own n-of-1 test for your own symptoms, engage with the GuideBot below!


Why aren’t n-of-1 experiments already common for chronic conditions?

Because many treatments’ effects vary depending on the individual, researchers argue that effectiveness should be evaluated per patient. Yet the simple evaluation method that’s been tested and refined for 30 years, called an n-of-1 trial, is rarely used. Why?

Most top selling US drugs for chronic conditions offer above average relief for only a fraction of treated patients. So, for nearly 30 years, researchers have advocated single patient treatment experiments, also known as n-of-1 trials.

An n-of-1 trial uses standard scientific methodology⁠—including placebos, blinding, random block rotation, crossovers, washouts and statistical analysis⁠—to determine a drug’s efficacy for an individual patient. Their rigor and specificity mean that n-of-1 trials “facilitate finely graded individualized care, enhance therapeutic precision, improve patient outcomes, and reduce costs,” according to a 2013 review of 2,154 single patient trials.

N-of-1 trials counter cognitive biases—confirmation bias, a desire to please the doctor, the placebo effect, the natural progression of an illness, to name a few—that can commonly skew the unstructured drug sampling, aka trial and error method, that most clinicians rely on.

To date, though, every attempt to provide off-the-shelf n-of-1 services that the average doctor might use for a patient has failed.

“We spend so much effort on precision in diagnosis, yet we have very little precision about treatment,” says Eric Larson, Executive Director and Senior Investigator, Kaiser Permanente Washington Health Research Institute.

Why? The excellent 2008 paper “What Ever Happened to N-of-1 Trials? Insiders’ Perspectives and a Look to the Future” summarizes the explanations.

1) The average physician’s toolbox doesn’t include the necessary statistical hammers and nails to do a meaningful analysis. “In clinical practice, physicians use evidence and experience to generate a list of treatment options. Patients and physicians move down the list based on trial and error.”

2) N-of-1 trials require physicians to squeeze another process and script into their already-packed days. “Clinicians must explain the idea to their patients, see them regularly throughout the trial, and evaluate the results jointly at the trial’s end. … Taking the time to sell an unfamiliar concept—let alone follow through on the logistics—might be untenable.”

(A 2017 history of n-of-1 trials offers a more prosaic take: “‘did the treatment help?’ is too easy, and has too much face validity, compared to the more onerous substitution of a formal N of 1 trial.”)

3) Beyond the pragmatic challenges, N-of-1 trials force physicians out of their “cure-dispenser” role, the study noted. “While clinicians understand that the practice of medicine involves uncertainty, they are not necessarily comfortable with it. Recommending that a patient enroll in an n-of-1 trial acknowledges this uncertainty, suggesting that the physician does not really know what is best. It is much easier—and arouses far less cognitive dissonance—to simply write a prescription.”

Gordon Guyatt, who pioneered research at McMaster University demonstrating the empirical superiority of n-of-1 trials, explains the resistance to the n-of-1 innovation: “[Physicians] tend to have ways that they’ve learned to operate where they’re comfortable, and it’s hard to move outside these ways. It’s hard to do something extra that’s going to take time, and energy, and initiative.”

Most drugs aren't effective for most people.
“Number Needed to Treat” for one patient to benefit, illustrated for the top ten drugs in the US.  Source.

Some argue N-of-1 trials are long overdue. “Every day, millions of people are taking medications that will not help them,” argues Nicholas J. Schork, director of human biology at the J. Craig Venter Institute in La Jolla, California, USA. “The top ten highest-grossing drugs in the United States help between 1 in 25 and 1 in 4 of the people who take them (see chart below.) For some drugs, such as statins — routinely used to lower cholesterol — as few as 1 in 50 may benefit. There are even drugs that are harmful to certain ethnic groups because of the bias towards white Western participants in classical clinical trials.” Schork believes the time is ripe for n-of-1 trials. “Physicians are having to become more acutely aware of the unique circumstance of each patient — something most people have long called for.”
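The “Number Needed to Treat” (NNT) figures quoted above translate directly into the share of patients helped: an NNT of 25 means one patient in 25 benefits. A minimal sketch of that arithmetic follows. The function name `fraction_helped` is hypothetical, and the three NNT values are simply the ones quoted by Schork.

```python
def fraction_helped(nnt: float) -> float:
    """Share of treated patients who benefit, given a Number Needed to Treat."""
    if nnt <= 0:
        raise ValueError("NNT must be positive")
    return 1.0 / nnt

quoted = [
    ("top-ten drugs, best case", 4),
    ("top-ten drugs, worst case", 25),
    ("statins, low end", 50),
]

for label, nnt in quoted:
    helped = fraction_helped(nnt)
    print(f"{label}: NNT {nnt} -> {helped:.0%} helped, {1 - helped:.0%} not helped")
```

Read this way, even the best case in the quote leaves three out of four patients taking a drug without benefit.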


Why heterogeneous treatment effects matter, but are often ignored

Though doctors and drug makers tout “average” effects, many treatments deliver a smorgasbord of results—substantial benefits for some people, little benefit for many, and harm for a few. Why don’t we hear more about this variability?

Roughly a century ago, modern medicine got off to a roaring start. Pasteur discovered the bacterial origin of many diseases in 1870. Handwashing reduced deaths from surgery and childbirth. Antibiotics cured many infections. Vaccines eradicated smallpox and sharply reduced polio and chickenpox. Starting in the 1920s, insulin significantly prolonged some lives.

Since then, though, momentum gradually has slowed. Humans are vastly complex biological systems. We’ve discovered that, for chronic conditions like Parkinson’s, arthritis, Alzheimer’s, IBS, migraines and psoriasis, the effects of a given treatment vary depending on the patient. As one analysis in 2011 put it, “the development of medical interventions that work ubiquitously (or under most circumstances) for the majority of common chronic conditions is exceptionally difficult and all too often has proven fruitless.”

Researchers call a treatment’s variable effectiveness per individual its “heterogeneous treatment effect,” or HTE.

Though HTE is what’s meaningful for most patients⁠— nobody is actually average, right?⁠—it’s usually unmentioned in publicity for treatments.

This is because, as Richard L. Kravitz, Naihua Duan, and Joel Braslow explained in a 2004 paper, treatment effect averages are the “primary focus on clinical studies in recent decades.” The average effect is what researchers look for, what the FDA approves, what drug companies promote, what doctors base prescriptions on, and what patients expect.

But, as the 2004 paper noted, the modest average effects everyone focuses on may, in fact, mask “a mixture of substantial benefits for some, little benefit for many, and harm for a few.”
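A toy calculation shows how this masking works. The subgroup shares and effect sizes below are invented purely for illustration. They are not taken from the 2004 paper.

```python
# Hypothetical subgroups: (share of patients, change in symptom score).
# Positive numbers mean improvement; negative numbers mean harm.
subgroups = {
    "substantial benefit": (0.20, 10.0),
    "little benefit": (0.70, 1.0),
    "harm": (0.10, -5.0),
}

# Shares must account for every patient.
total_share = sum(share for share, _ in subgroups.values())
assert abs(total_share - 1.0) < 1e-9

# The population average is the share-weighted mean of the subgroup effects.
average_effect = sum(share * effect for share, effect in subgroups.values())

print(f"average effect: {average_effect:+.1f} points")
for name, (share, effect) in subgroups.items():
    print(f"  {name}: {share:.0%} of patients, {effect:+.1f} points")
```

The reported average here is a modest +2.2 points, yet no patient in this made-up population actually experiences a 2.2-point change.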

There’s a second error that results from focusing on a drug study’s reported average effects, they argued. Heterogeneity “may be dramatically underestimated” because “by convenience, randomized control trial are characterized by narrow inclusion criteria and recruitment.”

In fact, “nonrepresentativeness is probably the rule rather than the exception.” Put simply, the people in a drug study often don’t reflect a population’s full diversity of race, sex or health conditions.

Kravitz, Duan, and Braslow explained that though the generalizability of large drug trials is often relatively weak, average findings are often quickly distilled into treatment guidelines, and these, in turn, can too easily creep into rigid treatment standards. Harm can result. For example, a drug trial for a diuretic called spironolactone seemed to show a 35% benefit for the average patient, but when included in treatment standards resulted in a four-fold increase in hospitalizations and no reduction in all-cause mortality. Significant groups had been underrepresented in the original trial.

Attention to HTE probably won’t change any time soon. Unfortunately, “the pharmaceutical industry currently has little direct incentive to collect data on risk, responsiveness, and vulnerability that would better inform individual treatment decisions,” according to Kravitz, Duan, and Braslow. In fact, mass market economics incentivize one-size-fits-all treatments, since doctors prescribing based on “average” results create far larger markets for drug makers.

An additional problem when evaluating HTE: the deck is stacked in favor of positive results in most published randomized control trials.

Eleven cognitive biases and statistical errors that can interfere with finding the best treatment

Most people’s tests of potential treatments for chronic conditions involve haphazard cycling through doses and brands, spotty symptom diaries, and no statistical analysis of results. This lack of rigor introduces numerous cognitive biases.

When there’s no one obvious best treatment for a chronic condition, humans conduct informal experiments, observing symptoms as they move through a list of treatments until one treatment seems to be effective. Doctors refer to this strategy as trials of therapy; detractors call it trial and error.

The handbook Design and Implementation of N-of-1 Trials: A User’s Guide calls this casual approach “haphazard,” noting that “it is easy for both patient and clinician to be misled about the true effects of a particular therapy.” Other researchers call it “hardly personalized, precise, or data-driven.”

Research into human decision-making shows that an unstructured approach to treatment testing has numerous pitfalls.

People often see patterns in random events.

“We’re emotional creatures that look for signals in a sea of mostly noise,” notes medical podcaster Dr. Peter Attia. “We like to see things as we wish them to be and we sometimes consciously or unconsciously act in a manner to coax those things to our wishes. … Without a framework, in this case, the scientific method, we’re far too likely to see what we want to see rather than the alternative.”

Picking a treatment using flawed, biased or incomplete data can result in under-optimized choices or, longer term, treatment churning.

Numerous cognitive biases and statistical errors affect unstructured treatment experiments. They include:

  1. expectation bias (I find more of whatever I expect to find)
  2. recency bias (how I feel today while visiting the doctor is more vivid than how I felt last week)
  3. placebo/nocebo effects (positive or negative effects may arise from psychic rather than biological factors)
  4. sunk cost effect (I stick with a bad treatment because I’ve already invested significant time/emotion in the treatment)
  5. clustering illusion (I attribute significance to small clusters or trends in datasets that are, in fact, only random)
  6. physician pleasing (seeking the approval of a health care provider, I may over-report positive effects)
  7. inconsistent standards (I use varying metrics to evaluate treatments’ effects, making a systematic comparison impossible)
  8. rushed conformity (if rushed to give feedback, I default to an answer that’s socially desirable)
  9. granularity bias (I can’t see small differences hidden in large data sets)
  10. natural progression (my condition may organically increase or diminish, making the last medication tested seem more — or less — effective)
  11. incomplete sampling (I don’t gather enough data for a statistically valid conclusion ⁠— perhaps only reporting how I’m feeling the day I visit the doctor)
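The clustering illusion (item 5) is easy to demonstrate with a simulation. The sketch below assumes a deliberately simple model that is not from any cited study: each day is a “good” or “bad” day with equal probability, and no treatment has any effect. The names `longest_run`, `n_days` and `n_sims` are illustrative.

```python
import random
from typing import List

def longest_run(values: List[bool]) -> int:
    """Length of the longest streak of identical consecutive values."""
    best = 0
    current = 0
    previous = None
    for value in values:
        current = current + 1 if value == previous else 1
        best = max(best, current)
        previous = value
    return best

rng = random.Random(0)   # fixed seed so the run is repeatable
n_days, n_sims = 30, 10_000

# Simulate many 30-day symptom logs in which good and bad days are pure coin flips.
streaks = [
    longest_run([rng.random() < 0.5 for _ in range(n_days)])
    for _ in range(n_sims)
]

share = sum(streak >= 5 for streak in streaks) / n_sims
print(f"random 30-day logs containing a streak of 5+ identical days: {share:.0%}")
```

Under this model, well over half of the purely random logs contain a streak of five or more identical days, exactly the kind of run that is tempting to credit to whatever treatment was being taken at the time.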

You can see how some of these biases play out in a typical clinical interaction based on casual experimentation, as portrayed in the N-of-1 User’s Guide:

Take for example Mr. J, who presents to Dr. Alveolus with a nagging dry cough of 2 months duration that is worse at night. After ruling out drug effects and infection, Dr. Alveolus posits perennial (vasomotor) rhinitis with postnasal drip as the cause of Mr. J’s cough and prescribes diphenhydramine 25 mg each night. The patient returns in a week and notes that he’s a little better, but the “cough is still there.” Dr. Alveolus increases the diphenhydramine dose to 50 mg, but the patient retreats to the lower dose after 3 days because of intolerable morning drowsiness with the higher dose. He returns complaining of the same symptoms 2 weeks later; the doctor prescribes cetirizine 10 mg (a nonsedating antihistamine). Mr. J fills the prescription but doesn’t return for followup until 6 months later because he feels better. “How did the second pill I prescribed work out for you,” Dr. Alveolus asks. “I think it helped,” Mr. J replies, “but after a while the cough seemed to get better so I stopped taking it. Now it’s worse again, and I need a refill.”

Introduction to N-of-1 Trials: Indications and Barriers (Chapter 1)

(If you’re a glutton for punishment, here’s a list of 28 cognitive biases which can skew a doctor’s diagnosis, and dozens of biases affecting research.)

A systematic experimental protocol helps avoid these biases. A rigorous experiment includes:

  • enough data points per treatment (at least 30 is a common rule of thumb) to support a meaningful statistical comparison
  • consistent metrics for measuring symptoms
  • daily symptom reports (ideal)
  • standard length treatment blocks, randomized
  • blinding (ideal)
  • a statistical analysis of logs at the experiment’s conclusion
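As an illustration of the final step, here is a minimal sketch of one possible analysis: a permutation test on daily symptom scores from two treatments. It is an assumption-laden example, not the method used by any particular tool. The data are synthetic, the function names are hypothetical, and the test treats each day’s score as independent, which real symptom logs may violate.

```python
import random
from typing import Sequence

def mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)

def permutation_test(a: Sequence[float], b: Sequence[float],
                     n_perm: int = 10_000, seed: int = 0) -> float:
    """Two-sided permutation p-value for a difference in mean scores."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign the treatment labels at random
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value above zero.
    return (hits + 1) / (n_perm + 1)

# Synthetic daily pain scores (lower is better), 30 days per treatment.
data_rng = random.Random(42)
treatment_a = [data_rng.gauss(5.0, 1.0) for _ in range(30)]
treatment_b = [data_rng.gauss(4.0, 1.0) for _ in range(30)]

p_value = permutation_test(treatment_a, treatment_b)
print(f"mean A: {mean(treatment_a):.2f}, mean B: {mean(treatment_b):.2f}, p = {p_value:.4f}")
```

A small p-value suggests the gap between the two treatments is unlikely to be chance alone. Because consecutive days are often correlated, a more careful analysis might compare block averages instead of individual days.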

If you’re interested in this approach, WhichWorksBest’s online software makes systematic personal experiments simple and affordable. For more background, read this information about single person (n-of-1) experiments, or chat with our GuideBot at the bottom of the page!

Why the name (and question mark)?

In the spring of 2019, we had been talking for nearly a year about building a software toolkit to help automate what many people already do informally⁠—track symptoms and statistically evaluate the effects of various treatments. As the idea started to jell, we started casting about for names for the toolkit.

Though there are a variety of scientific names for the methods that the WWB toolkit borrows from⁠— single person trial, n-of-1 trial⁠— none of them inspired any enthusiasm.

One day, headed to the car for a brief vacation, I realized I’d forgotten my blood thinner medications. (I had a pulmonary embolism last fall.) Walking out of my home holding two bottles of best selling blood thinners, Xarelto and Eliquis, one in each hand, I was thinking about the wooziness that sometimes seemed to occur while I was on the blood thinners.

I weighed both bottles and asked myself out loud, “which works best?”

I knew, from reading and conversations, that this exact question is incredibly common for people struggling to find the right solution⁠—behavior, medicine, supplements⁠—for chronic health conditions. Luckily, the domain name was available!

WhichWorksBest is a creation of Pressflex LLC, which has been around since 1998. The name WhichWorksBest? ties into the fairly literal, “just the facts” naming conventions that Pressflex has used for other ventures.

For example, Pressflex was founded to help newspapers and magazines get online flexibly and affordably. Blogads, which we launched in 2002, was an automated service to help bloggers sell ads. Pullquote, launched in 2011, helps people store and share interesting quotes. AdBiblio, launched in 2014, helps book publishers advertise online. Racery, also launched in 2014, helps companies and charities build and host virtual races.

One remaining puzzle. Very few companies (or zero?) have a question mark in their logo. If WhichWorksBest? ever ends up in a headline, this may confuse readers. But the question mark also conveys the confusion and questioning that is at the heart of WhichWorksBest?
