Why don’t we get more scientific breakthroughs?
Daniel Lemire's blog
It is absolutely clear to me that large language models represent the most significant scientific breakthrough of the past fifty years. The nature of that breakthrough has far-reaching implications for what is happening in science today. And I believe that the entire scientific establishment is refusing to acknowledge it.
We often excuse our slow progress with tired clichés like “all the low-hanging fruit has been picked.” It is an awfully convenient excuse if you run a scientific institution that pretends to lead the world in research—but in reality is mired in bureaucracy, stagnation and tradition.
A quick look at the world around us tells a different story: progress is possible, and even moderately easy, as everyday experience shows. I have been programming in Python for twenty years and even wrote a book about it. Managing dependencies has always been a painful, frustrating process—seemingly unsolvable. The best anyone could manage was to set up a virtual environment. Yes, it was clumsy and awkward, as anyone who has programmed in Python knows, but that was the state of the art after decades of effort by millions of Python developers. Then, in 2024, a single tool called uv appeared and suddenly made the Python ecosystem feel sane, bringing it in line with the elegance of Go or JavaScript runtimes. In retrospect, the solution seems almost obvious.
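For readers who have not lived through it, here is a rough sketch of the contrast. The first block is the traditional virtual-environment workflow; the second uses uv's documented project commands. The package name and script name are illustrative placeholders, not part of the original post.

```sh
# The old way: a virtual environment plus pip, managed by hand.
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt   # and hope the pinned versions still resolve
python script.py

# The uv way: one tool owns the project metadata, the lockfile, and the environment.
uv init myproject && cd myproject # creates a pyproject.toml for the project
uv add requests                   # records the dependency and updates the lockfile
uv run python script.py           # runs inside the managed environment, no activation step
```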
NASA has twice the budget of SpaceX. Yet SpaceX has launched more missions to orbit in the past decade than NASA managed in the previous fifty years. The difference is not money; it is culture, agility, and a willingness to embrace new ideas.
Large language models have answered many profound scientific questions, and one of the deepest concerns the very nature of language itself. For generations, the prevailing view was that human language depends on a vast set of logical rules that the brain applies unconsciously. That rule-based paradigm dominated much of twentieth-century linguistics and even shaped the early web. We spent an entire decade chasing the dream of the Semantic Web, convinced that if we all shared formal, machine-readable metadata, rule engines would deliver web-scale intelligence. Thanks to large language models, we now know that language does not need to be rule-based at all. Verbal intelligence does not require explicit rules.
It is a tremendous scientific insight that overturns decades of established thinking.
A common objection is that I am conflating engineering with science: large language models, the argument goes, are just engineering. I invite you to examine the history of science more closely. Scientific progress has always depended on the tools we build.
You need a seaworthy boat before you can sail to distant islands, observe wildlife, and formulate the theory of natural selection. Measuring the Earth’s radius with the precision achieved by the ancient Greeks required both sophisticated engineering and non-trivial mathematics. Einstein’s insights into relativity emerged in an era when people routinely experienced relative motion on trains; the phenomenon was staring everyone in the face.
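To make the Greek example concrete, here is a small sketch of Eratosthenes' circumference estimate. The figures (a 7.2-degree shadow angle at Alexandria, roughly 5,000 stadia between Syene and Alexandria) are conventional textbook values used for illustration, not claims from this post.

```python
import math

# Eratosthenes' estimate of the Earth's size (illustrative round numbers).
# At noon on the solstice the Sun is directly overhead at Syene, while at
# Alexandria it casts a shadow at about 7.2 degrees -- 1/50 of a full circle.
shadow_angle_deg = 7.2
syene_to_alexandria_stadia = 5_000  # the surveyed distance between the two cities

fraction_of_circle = shadow_angle_deg / 360.0                    # 0.02, i.e. 1/50
circumference_stadia = syene_to_alexandria_stadia / fraction_of_circle
radius_stadia = circumference_stadia / (2 * math.pi)

print(f"circumference ~ {circumference_stadia:,.0f} stadia")     # ~ 250,000
print(f"radius        ~ {radius_stadia:,.0f} stadia")            # ~ 39,800
```

The arithmetic is trivial; the hard part was the engineering of surveying the distance and measuring a shadow angle reliably.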
The tidy, linear model of scientific progress—professors thinking deep thoughts in ivory towers, then handing blueprints to engineers—is indefensible. Fast ships and fast trains are not just consequences of scientific discovery; they are also wellsprings of it. Real progress is messy, iterative, and deeply intertwined with the tools we build. Large language models are the latest, most dramatic example of that truth.
So what does this breakthrough tell us about science? I believe we need to rethink our entire approach to scientific research. We need to embrace agility, experimentation, and a willingness to challenge established paradigms. The bureaucratization of science has been a death sentence for progress.