Why This Conversation Now
We are at the peak of the northern-hemisphere summer vacation period. This is when inboxes are skimmed, attention spans are shorter, and anything heavier than a quick update risks drawing yawns, if not being sent straight to the spam folder. Yet here we are, choosing to tackle a subject with massive long-term impact, hoping it catches the attention of those focused on the mathematical potential of the transformer phenomenon.
Lest the title suggest that this is another optimistic piece proclaiming the transformative quantum potential of AI, allow us to pour cold water on that notion early. This is not about quantum computing making large language models vastly more powerful. We are not believers in the promised quantum capabilities: we fully agree that quantum hardware will continue to improve rapidly, but the real bottleneck stems from the unknowable, and hence in our eyes vastly dreamt-up, potential of the new type of mathematics that quantum phenomena give rise to. We are outliers in this belief, and would love to discuss it some day in another lean period, once we are assured that the piece will not get us pummelled!
The subject here is different: how transformer-based models, the engines behind modern LLMs, are becoming not just powerful tools for generating answers but potential laboratories for understanding complexity itself. They excel at turning sprawling, chaotic information into coherent, context-appropriate results. As we study their internal workings more closely, they could reveal patterns and principles that apply far beyond AI.
Understanding these models is not optional. We still do not fully know how transformers operate at a fine-grained level. We know the architecture, the mathematics, and the training process, but we cannot yet map exactly how and why they arrive at particular outputs. Much current research is aimed at changing that, motivated by immediate needs such as safety, reliability, bias mitigation, and regulatory compliance. In pursuing these goals, we may also discover that the same understanding has implications for neuroscience, climate science, biology, and other fields where complexity resists easy explanation.
From Quantum Analogy to Complex Systems
Schrödinger’s famous cat, both alive and dead until observed, is a metaphor for quantum superposition, a system existing in multiple possible states until an act of measurement collapses it into one.
An LLM, before it generates the next word, sits in a similar-sounding condition. Inside the model is a probability distribution across thousands of possible tokens. None is chosen yet; all coexist in potential. When decoding begins, one is selected and made real. The most telling similarity is that, however well we know a model’s weights, there is no practical way to predict what it will answer to a particular question without running it.
Quantum decoherence may work nothing like an LLM’s “actualization”, and no serious scientist we know of has even conjectured that quantum physicists might learn about collapse from the workings of transformers, but there could just be something there. And if there is, please remember where you read it first. More seriously, the analogy is still useful because it points to a shared pattern: a system holding many possible futures, then committing to one.
Quantum superposition is grounded in physical amplitudes and interference patterns; transformer probabilities are statistical, learned from vast training datasets. Quantum collapse is still a physical mystery; transformer “collapse” is the result of a deterministic or stochastic algorithm.
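To make that last distinction concrete, here is a minimal sketch of both routes to “collapse”. The toy vocabulary and logits are ours for illustration, not drawn from any real model:

```python
# A toy next-token "collapse": deterministic (greedy) vs stochastic (sampled).
# Vocabulary and logits are illustrative only.
import numpy as np

rng = np.random.default_rng(0)

vocab = ["alive", "dead", "asleep", "purring"]   # toy vocabulary
logits = np.array([2.1, 1.9, 0.3, -1.0])         # model's raw scores

def softmax(x, temperature=1.0):
    z = (x - x.max()) / temperature
    p = np.exp(z)
    return p / p.sum()

probs = softmax(logits)  # the "superposition": all tokens coexist in potential

# Deterministic collapse: always commit to the most probable token.
greedy = vocab[int(np.argmax(probs))]

# Stochastic collapse: sample in proportion to the probabilities,
# so repeated runs can commit to different tokens.
sampled = vocab[rng.choice(len(vocab), p=probs)]

print(probs, greedy, sampled)
```

Greedy decoding is fully deterministic; sampling with a temperature is where the apparent “choice” enters.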
In both cases, the nature of that transition from possibility to coherence is critical to understanding the system’s behaviour. In transformers, this transition is mediated by self-attention mechanisms that route context-specific information across the entire input.
Where the analogy breaks down, transformers offer a unique advantage: full observability. Every weight, every attention score, every activation is open to inspection. We can stop the process mid-generation, rewind it, and see how different choices would play out. No physical system, from a neuron to a quantum particle, offers that kind of access.
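As a sketch of what that access looks like in practice, the snippet below uses the open-source Hugging Face transformers library to expose every layer’s activations and attention maps in a single forward pass. The model name is merely an example; any small causal language model would do:

```python
# Full observability in practice: inspect hidden states, attention maps,
# and the pre-collapse next-token distribution of a small language model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "gpt2"  # example model; any small causal LM works for illustration
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

inputs = tok("The cat is both alive and", return_tensors="pt")

with torch.no_grad():
    out = model(**inputs, output_attentions=True, output_hidden_states=True)

# Every intermediate state is exposed: one hidden-state tensor per layer
# (plus the embedding layer), and per-layer attention maps.
print(len(out.hidden_states))    # e.g. 13 for a 12-layer model
print(out.attentions[0].shape)   # (batch, heads, seq_len, seq_len)

# The full next-token distribution, before any "collapse" has happened:
probs = torch.softmax(out.logits[0, -1], dim=-1)
top = torch.topk(probs, 5)
print([(tok.decode(i), round(p.item(), 3)) for i, p in zip(top.indices, top.values)])
```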
Observability and the Path to Understanding
In natural systems, the very act of observation is constrained. We cannot watch every neuron firing in real time in the human brain. We cannot map every atom’s trajectory in a folding protein. We cannot measure a galaxy evolving over millions of years. Quantum systems change when measured, obscuring their unobserved states.
Transformers present none of these barriers. They are engineered systems: we can record and replay every step of their reasoning process, compare multiple runs under different conditions, and modify their architecture to test hypotheses.
This matters because we do not yet fully understand why transformers work as well as they do. Their capabilities have outpaced our theoretical grasp. That gap is driving an active area of research focused on interpretability and mechanistic transparency. Much of it is aimed at practical ends: making systems less prone to error, bias, or harmful output; ensuring they can be trusted in high-stakes domains; meeting upcoming regulatory requirements.
But this same work is building a detailed picture of how a complex, distributed system turns noisy, ambiguous input into coherent output. That picture could have value far beyond AI. It could act as a reference model for how complex systems in nature — biological, climatic, even social — resolve into stable patterns.
Case Studies in Chaotic Domains
Neuroscience: From Inspiration to Blueprint
Artificial neural networks began as loose imitations of the brain. Now, transformers are feeding concrete ideas back into neuroscience. Their attention-driven architecture offers a plausible computational model for how brains might dynamically route information across distributed regions. This challenges the traditional view that specific functions are tied to fixed locations. In both brain and model, meaning emerges from patterns of interaction spread across the whole network.
Transformers also offer something biology cannot: the ability to run controlled, repeatable interventions. Disabling specific attention heads in a model can mimic functional impairments, producing patterns that neuroscientists can compare to clinical data. This is already influencing computational psychiatry, offering new ways to think about disorders of cognition and perception.
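A self-contained sketch of such a “lesion” experiment, using random weights and illustrative shapes rather than a trained model, might look like this:

```python
# Multi-head self-attention with one head "lesioned" (zeroed out),
# analogous to comparing intact and impaired function in the clinic.
# Weights and dimensions are illustrative only.
import numpy as np

rng = np.random.default_rng(1)
seq, d_model, n_heads = 6, 16, 4
d_head = d_model // n_heads

x = rng.normal(size=(seq, d_model))  # token representations
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))

def attention_output(x, head_mask):
    # Project into per-head queries, keys, and values.
    q = (x @ Wq).reshape(seq, n_heads, d_head)
    k = (x @ Wk).reshape(seq, n_heads, d_head)
    v = (x @ Wv).reshape(seq, n_heads, d_head)
    # Scaled dot-product scores, then softmax over keys.
    scores = np.einsum("qhd,khd->hqk", q, k) / np.sqrt(d_head)
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)
    out = np.einsum("hqk,khd->qhd", weights, v)
    out *= head_mask.reshape(1, n_heads, 1)  # the lesion: silence masked heads
    return out.reshape(seq, d_model)

healthy = attention_output(x, np.ones(n_heads))
lesioned = attention_output(x, np.array([1, 1, 0, 1]))  # disable head 2

# How much does the representation change when one head is silenced?
print(np.linalg.norm(healthy - lesioned))
```

Unlike a biological lesion, this intervention is exact, reversible, and repeatable across any number of inputs.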
Diagnostics and Medicine: Understanding the Diagnoser
In medical imaging, Vision Transformers have pushed accuracy to new levels, often exceeding older architectures. More importantly, their internal maps show which features of an image drove a diagnosis. Text-based diagnostic models can reveal the sequence of hypotheses they considered before producing an answer.
These capabilities make it possible to study the act of diagnosis itself. By deliberately introducing biased data, researchers can watch how cognitive biases emerge in the model’s reasoning chain. By studying highly accurate models, they can see how those biases are avoided. This turns the diagnostic process into something measurable and improvable, for both AI and human clinicians.
Climate Science: Learning Emergent Dynamics
Climate and weather systems are chaotic, with countless interlinked variables. Traditional models simulate the underlying physics directly, which is computationally expensive and limited in resolution. Transformer-based climate models learn from large datasets to predict emergent patterns directly, capturing long-range dependencies like how Pacific sea-surface temperatures influence rainfall continents away.
These models run orders of magnitude faster than numerical simulations, enabling massive ensembles of forecasts. That speed allows scientists to explore hypothetical scenarios and test climate theories far more quickly than before, accelerating both prediction and understanding.
Protein Folding: The Shortcut and Its Limits
AlphaFold2, using a transformer-based Evoformer module, can predict a protein’s final 3D structure with remarkable accuracy. But it does not simulate the folding pathway itself — the dynamic journey from chain to structure. Despite this, analysing its internal attention maps reveals which amino acid interactions are most important for stability, guiding experiments and deepening our knowledge of protein behaviour.
Other Fields: Extending the Pattern
The same dynamic is playing out in other domains. In astrophysics, transformers are modelling time-series data to uncover causal relationships. In epidemiology, they integrate heterogeneous data streams for better outbreak forecasting. In evolutionary biology, protein language models discover conserved motifs that reflect deep biological principles. In linguistics, they quantify the structural differences between dialects with unprecedented precision.
From Prediction to Understanding
High predictive accuracy is valuable, but it does not automatically advance theory. The deeper opportunity lies in using transformers to understand the mechanism: how a complex system moves from inputs to outputs, from noise to order.
This is why mechanistic interpretability work matters so much. It is not just about making AI safer or more compliant. It is about building a map of the processes by which distributed systems self-organise into coherent outcomes. And once we have that map, we can compare it to the processes at work in natural systems, borrowing insights in both directions.
Transformers are uniquely suited for this because they combine power, observability, and manipulability. They can be studied at a level of detail impossible for any natural system. They can be altered and rerun until hypotheses are confirmed or rejected. They can be retrained to explore how changes in architecture or data affect behaviour. This is a laboratory we have never had before.
Why GenInnov Is an Innovation, Not an AI, Fund
For their ability to distil all sorts of noise and chaos into order, whether as well-written text or coherent images, we have repeatedly hailed transformers as a mathematical, rather than a merely technical or technological, revolution.
Just as the internet turned out not to be mainly about email or static web pages, the first use cases hailed as revolutionary when that era began, we believe the truly revolutionary aspects of the transformer, or GenAI, era are not chatbots or agents.
In the evolution diagram published in an earlier note, we observed that the next two stages are many times more important. One is physical or embodied AI, where we use the newfound ability to inject intelligence into anything inanimate to change the world around us, robotics being the obvious example. The even bigger stage is a world of what we might call a trillion super-Einsteins: an explosion of distributed, augmented intelligence applied across every discipline.
These super-Einsteins are already taking shape: biotech firms mapping the grammar of biology to find new drug targets; climate-tech companies turning high-resolution forecasts into actionable risk analytics; neurotech start-ups decoding brain signals with transformer-inspired architectures; materials-science ventures using AI surrogates to design and test new compounds rapidly. Each of these innovation fields is likely to keep progressing exponentially in the decades ahead.
This is why GenInnov is not an AI fund in the narrow sense. We are an innovation fund. Our name stands for Generative Innovation.