The experiment that became known as the Elephant Man trial began one spring morning, in 2006, when clinicians at London’s Northwick Park Hospital infused six healthy young men with an experimental drug. Developers hoped to market TGN-1412, a genetically engineered monoclonal antibody, as a treatment for lymphocytic leukemia and rheumatoid arthritis. But, a little more than an hour after the infusion, the men grew restless. “They began tearing their shirts off complaining of fever,” one trial participant, who received a placebo, told a London tabloid. “Some screamed out that their heads were going to explode. After that they started fainting, vomiting and writhing around in their beds.” The heads of some of the subjects swelled to elephantine proportions. Within sixteen hours, all six were in the intensive-care unit suffering from multiple organ failure. They had narrowly survived a potentially fatal inflammatory response known as a cytokine storm.
The trial grabbed headlines and sent a “shock wave” through the scientific community, as one of the developers of the drug later wrote. A subsequent review found a few sloppy medical records and an underqualified physician associated with the study, but nothing that could explain a central mystery: the drug had already been tested on rodents and monkeys. Lab animals had tolerated doses that—after adjusting for the animals’ weights—were five hundred times greater than the ones that nearly killed the young men. Why did animal experiments fail to warn scientists that TGN-1412 was dangerous?
Because so many of our genes are shared with other vertebrates, scientists have generally assumed that whatever harms lab animals is likely to harm humans, too. The Food and Drug Administration requires preclinical tests, traditionally on two species of non-human animals, before drugs can be tested on people. Yet a 2014 analysis of more than two thousand drugs found that animal tests were “highly inconsistent” predictors of toxic responses in humans and “little better than what would result merely by chance.” More than eighty per cent of novel drugs fail in Phase I and Phase II trials—when they’re first tried in healthy volunteers and patients—and others fail in Phase III, the large-scale efficacy trials; as of 2009, these unsuccessful human trials were consuming seventy-five per cent of drug-research and development costs. Fifteen per cent of drugs, including blockbuster remedies for conditions such as depression and arthritis, turn out to have dangerous toxicities even after they’re approved by the F.D.A.
When lab-animal studies fail to predict human responses, scientists typically scrutinize them for mistakes (maybe lab workers contaminated cell lines; perhaps they failed to authenticate reagents) or blame the differences between species. “A mouse is not a person” has become a running joke. The problems with animal experimentation, however, go deeper than that: some studies of standardized lab animals can’t even be replicated on identically standardized lab animals. In 2012, a Nature paper revealed that scientists at Amgen, a multibillion-dollar biotech company, had spent a decade trying to repeat landmark animal studies and had succeeded only eleven per cent of the time. The following year, at an N.I.H. review board meeting, Elias Zerhouni, a pharmaceutical executive who had directed the N.I.H. during the Bush Administration, likened science’s reliance on lab-animal research to a mass hallucination. “We all drank the Kool-Aid on that one, me included,” he said. “It’s time we stopped dancing around the problem.” (Later, after an outcry from advocates of the biomedical-research industry, Zerhouni walked back his comments.)
The global animal-testing industry is worth billions of dollars and counting. Scientists experiment on some hundred and twenty million lab mice and rats per year. But, as the industry continues to grow, problematic results continue to emerge. Last May, European scientists reported in the journal PLOS Biology that they had conducted an identical experiment on identical mice in three separate labs. They found that the mice behaved differently in each setting, a result that they could only attribute to Rumsfeldian “interactions between known but also unknown factors we are not even aware of.” Can animal experiments still be trusted?
Scientists have been experimenting on animals for centuries to solve anatomical and physiological mysteries. In the twentieth century, researchers used animals to calibrate therapeutic doses: one “rabbit unit,” for example, was the amount of insulin required to produce convulsions in a rabbit. However, animals from the same species varied in their responses to drugs, in part because scientists acquired them from pet breeders and hobbyists. One study in the forties found that a batch of diphtheria antitoxin protected some guinea pigs from the disease, but not others, depending on whether they’d been reared on green vegetables or beets. The British Medical Journal published an article with the title “Wanted—standard guinea-pigs.”
Many mid-century scientists viewed lab animals as lower creatures, even automatons; some hoped to breed them into “pure” and “uniform” animals, as the geneticist Clarence Cook Little put it during a congressional hearing in 1937. They assumed that variation between animals was determined by genes and germs, so they bred mouse siblings with one another, shielded the offspring from a range of microbes, and then repeated the process for many generations. (James A. Reyniers, who was later nominated for the Nobel Prize in Physiology or Medicine, went so far as to surgically remove animals from the wombs of their mothers and rear them in airtight steel chambers; in 1949, Life published photographs of monkeys in his lab and declared, “The research possibilities are virtually limitless.”)
Commercial suppliers marketed lab animals to all manner of scientists—geneticists, immunologists, neuroscientists, oncologists—in thick catalogues that described their technical specifications as though they were test tubes or Bunsen burners. Standards for the certification and transportation of lab animals were codified by UNESCO. Experiments on standardized lab animals spread across the globe and led to new insights into human biology, accelerated the development of breakthrough medical products such as vaccines and cancer drugs, and earned lab-animal researchers dozens of Nobel Prizes.
Animal experiments rested on the notion that humans and other mammals are kindred creatures, but for many scientists that kinship was solely physical, not mental. They tended to dismiss the idea that animals have minds and emotions that are comparable to our own, as Charles Darwin argued in the nineteenth century, or that “each and every living thing is a subject that lives in its own world,” as the Estonian biologist Jakob Johann von Uexküll wrote, in 1934. Such beliefs were even caricatured as symptomatic of “zoophil-psychosis,” a supposed psychiatric condition defined in 1909 as “an inordinate and exaggerated sympathy for the lower animals” and the “delusion that they are persecuted by man.”
This may be why puzzling irregularities in early studies did not prevent lab-animal experiments from becoming an industry standard. A 1954 Nature paper, for example, reported that when scientists injected inbred mice with sedatives, the inbred mice took wildly different times to fall into a stupor, whereas hybrid mice reacted to the drugs within a more predictable window of time. Just because two mice have near-identical genes does not mean that they will develop the same physical traits, the authors wrote; they may even be “strikingly more variable” than genetically diverse mice. That same year, another paper reported that lab animals with nearly indistinguishable genes had dramatically different skeletal structures—a finding that British geneticist Hans Grüneberg vaguely blamed on “intangible factors” and “accidents of development.” But as long as animal studies were unlocking new biomedical insights and therapies, there were few incentives to contemplate the lives of lab mice.
The idiosyncrasies of lab animals garnered new attention after the explosive Amgen paper in Nature, in 2012. In a wave of subsequent papers, other scientists described failures to reproduce published research in medicine, psychology, and many other fields. In 2014, as concern about a “replication crisis” grew, a cover story in the medical journal The BMJ declared animal research a “shaky basis for predicting human benefits.” A growing body of evidence was suggesting that a variety of subtle, uncontrolled factors affected lab animals’ bodies and behaviors.
Rodents respond differently to experimental drugs depending on the levels of phytoestrogens in their chow—levels that can vary between different batches from the same vender. Their microbiomes, which contribute to their immune function, vary from vender to vender and from lab to lab. Many lab mice today come from an inbred strain known as C57BL/6, or Black 6, which originated with a pair who were mated in the nineteen-tens or twenties. Yet “there is no such thing as a Black 6 mouse,” Joseph Garner, a professor of comparative medicine at Stanford, argued recently when we spoke via Zoom. “There’s the Black 6 mouse in my lab, on my diet, in my cages, with my noise exposure, my light exposure, and my technician. And literally in the lab down the hall, the Black 6 mouse is different.” The dream of scientists like Little, of animals that had completely lost their individuality, never came true.
Standardized laboratory conditions turn out to affect the animals that scientists are trying to study, potentially distorting the results. According to a recent meta-analysis co-published by Georgia Mason, the director of the Campbell Centre for the Study of Animal Welfare, at the University of Guelph, who mentored Garner, the standard lab-mouse cage—a plastic container the size of a shoebox—sickens its inhabitants and increases their risk of death. These cages can make their inhabitants cognitively pessimistic, mess up their sleep, and reduce their physiological resilience, compared with rodents who are given the opportunity to burrow, explore, and exercise. Researchers have also found that mice experience a spike in stress hormones when their cages are moved, and their behavior can change depending on the height at which their cages are stacked. The ambient temperature in lab-animal facilities, though comfortable for humans, inflicts chronic thermal stress on rodents; Cindy Buckmaster, a former director of the Center for Comparative Medicine at Baylor College of Medicine, compared their experience to that of a human unclothed in forty-five-degree-Fahrenheit weather. Imagine a study in which subjects are chronically cold, sleep-deprived, inbred, and held captive in cramped conditions. If the subjects were human, the scientific establishment would dismiss such a study as not only unethical but also irrelevant to normal human biology. Yet, if the subjects were non-human, the study could be treated as perfectly valid.
Jeffrey Mogil is a neuroscientist at McGill University who studies pain perception. In 2010, he and his collaborators filmed mice before and after they received shots of pain-inducing acetic acid. They used the footage to develop a “Mouse Grimace Scale,” which uses mouse facial expressions to measure their level of pain. Then, in 2014, one of his postdocs told him about a strange occurrence in the lab. The postdoc had administered a pain-inducing chemical to lab mice, but the mice had failed to lick themselves in response. Then he turned his back to depart, and they started licking. “They were just waiting for me to leave the room,” he told Mogil.
The mice’s pain response, Mogil said, seemed to be more than a mindless reflex: they seemed to adjust it in response to a human’s presence. “People at meetings for a number of years had sort of whispered about this,” Mogil told me. In a series of subsequent experiments, his team observed fewer “pain behaviors” when a man—or even a T-shirt that a man had worn—was nearby. An editorial accompanying these findings noted their “extremely wide-ranging implications for physiological and behavioral research.” When Mogil went back and analyzed his past work, he found that in all his experiments, the mice had shown a higher threshold for pain when handled by male researchers. If that pattern holds more widely, animal studies of painkillers or drugs with painful side effects could contain systematic errors, simply because of the makeup of a laboratory’s staff.