I.
Mornings used to begin with coffee. Now there is a detour to be made en route to the kitchen. The aluminum is already warm before the sun is up. The Mac is whirring when I sit down, and it must have been at it for hours, running the errands I set in motion the night before and then tried to forget about. There is a low vibration in the desk, the faint sense of things happening while you were asleep, a private party you were not invited to.
I touch the trackpad, and the screen bleeds to life, pulling me mid-stream into a workflow that never actually slept. Hoping against hope, I open the folders. Several agents finished. Of course, several did not. The old ones are finally obedient; the new ones I keep inventing are still half-feral. They seem to have gotten lost in the digital underbrush around 3 a.m. I refire them. This is not how it was supposed to work.
What we are trying to do is simple on paper. Dozens of stocks, each with its own secret weather. Before the markets wake, before the noise arrives, I want the overnight distilled, not the press releases written for robots, but the filings, the earnings-call asides that only matter three months later, the supply-chain whispers. The kind of briefing a brilliant but slightly obsessive team of analysts would hand you if they never slept, formatted precisely to the idiosyncrasies of my own brain. Underneath it all, the quiet scanner tracing relative moves, volume divergences, the tiny fractures in stocks that usually breathe in unison. At GenInnov, we don't trade the signals. I just look at the twitches because they may tell me where to point the flashlight.
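If that sounds abstract, the scan itself is embarrassingly small. Something like the sketch below, which is a toy rather than the production version: the tickers, the windows and the thresholds are placeholders, and the price and volume frames come from whatever feed you already pay for.

```python
# A toy version of the overnight scan: flag pairs of usually-correlated stocks
# whose recent relative moves or volumes have started to diverge. Tickers,
# windows and thresholds are placeholders; `prices` and `volumes` are daily
# DataFrames with one column per ticker.
import pandas as pd

PAIRS = [("NVDA", "AMD"), ("TSM", "ASML")]   # names that usually breathe in unison
LONG, SHORT = 120, 10                        # lookback windows, in trading days

def divergences(prices: pd.DataFrame, volumes: pd.DataFrame) -> list[dict]:
    rets = prices.pct_change().dropna()
    flags = []
    for a, b in PAIRS:
        long_corr = rets[a].tail(LONG).corr(rets[b].tail(LONG))
        short_corr = rets[a].tail(SHORT).corr(rets[b].tail(SHORT))
        vol_ratio = volumes[a].tail(SHORT).mean() / volumes[a].tail(LONG).mean()
        if long_corr - short_corr > 0.4 or vol_ratio > 2.0:   # arbitrary thresholds
            flags.append({"pair": (a, b),
                          "long_corr": round(long_corr, 2),
                          "short_corr": round(short_corr, 2),
                          "volume_ratio": round(vol_ratio, 2)})
    return flags   # not trade signals, just places to point the flashlight
```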
And here is the part that still surprises me, even now. It is actually working. Not perfectly, not without the 3 a.m. abandonments and the half-feral agents, but working in a way that would have seemed faintly delusional even two months ago. I have almost stopped going through emails. Not reduced them, not triaged them better. Almost stopped. Scores of emails collapse into one clean conversation, mine, on my terms, in the order I find useful, with the comparisons and contrasts I wanted, not the ones a publisher or a platform decided I should want. All things Nvidia from dozens of sources are brought together, not only without repetition but also with contrasts and disagreements made apparent. My agents are at their best hunting for details buried on the third page, or in the Substack I would never have read, and admonishing me about whatever does not square with my views. After all, that is what I designed them to do.
I have my own newspaper now. Several, in fact. My own briefings. My own formats.
The separate matter, the one I try not to examine too closely, is that I am not entirely sure what I like. The preferences I have encoded in the agents are the ones I thought were best for me yesterday; today is a new day.
And yet, as I sit here orchestrating and perfecting this daily cognitive symphony, I am back in the nineties, staring at a spinning wheel while the machine exhales hot air like an old radiator. I cannot believe I am watching the spinner in 2026. Didn't I literally write the other day that hardware as we know it is dead?
I haven't admitted this yet: I am running out of machine. Not metaphorically. Literally. The CPU is gasping. The heat is different. The noise is different. For the first time in, and I actually stop to count, which is itself a small vertigo, two decades, I can feel the physical edge of what my hardware can do, and I am on the wrong side of it.
One large model provider quietly lifts three figures from my account every month, the way a decent water company sends the bill. You pay it. You don't argue. I am resigned to the fact that my overall spend on AI and silicon is climbing some vertical asymptote, and I will soon be paying similar amounts to a few more. Yet even that is not enough, or not the right kind of enough. Because what I need now is not just more intelligence or beautiful, cold, centralised data-centre compute. I need more parallel lanes on my home machine at 5 a.m. without the chassis sounding like it is reconsidering its life choices.
I bought new servers for the office not long ago. It felt, at the time, like getting ahead of the problem. The team already needs more machines doing work we can finally stop doing ourselves. I used to be the typical finance guy whose only need was more monitors. Now, in the supposed golden age of the cloud, I need more CPUs. Not just in the office but even at home. Not just upgrades. Actual, new machines over and above what I have.
I am already shopping for a new Mac mini.
II.
Half a world away, the digital ecosystem has quietly inverted. In China, they are raising lobsters.
This is not aquaculture. "Raising a lobster" is the colloquialism that has suddenly emerged for running a localised, highly customised agent swarm on a personal machine. The name comes from OpenClaw, the open-weight model currently tearing through the mainland. The great model wars of 2026 were universally predicted to be fought in the stratosphere, a clash of trillion-parameter cloud titans battling for enterprise dominance. Instead, the front line has collapsed directly into the living room.
The phenomenon is as baffling as it is pervasive. A platform-scale model running on individual hardware was meant to be a developer's plaything. But the use cases flooding out of Hangzhou and Chengdu tell an entirely different story. It is the exhausted 996 office worker running a local swarm to auto-negotiate minor digital tasks and manage dual-phone identities. It is the suburban housewife dedicating a humming desktop to an agent that arbitrates grocery algorithms, scrapes underground tutoring schedules, and generates perfectly calibrated responses for a dozen neighbourhood committee group chats. Raw, heavy-duty intelligence, domesticated.
The institutions noticed. Within weeks, every major platform, Alibaba, Baidu, Tencent, ByteDance, was shipping its own OpenClaw variant, each claiming superior local performance, lower memory overhead, better Mandarin idiom. Municipal governments in Shenzhen and Chengdu announced hardware subsidies to help residents buy the machines needed to run the models. Other cities quietly recommended restraint, citing data leakage risks and electricity grid strain. Thousands of consultants materialised overnight, offering OpenClaw implementation packages for households, small businesses, schools. Some charged by the agent. The ecosystem had its own economy before anyone had properly decided whether it was useful.
To an outside observer, this localised frenzy presents a profound puzzle. The instinct elsewhere is still to subscribe to centralised omnipotence, paying monthly tolls to access intelligence hosted in a distant desert. In China, the sudden instinct is to trap the intelligence in a metal box on the dining table and put it to work on Tuesday's grocery list.
Queries fed into premium large language models regarding this discrepancy yield confident, sterile analyses about data sovereignty. When pushed, the models eventually offer a more cynical hypothesis: perhaps raising digital lobsters is simply the new robot dog mania. A few years ago, the affluent bought mechanical quadrupeds because their neighbours had them. Now, social pressure dictates that one must have a localised agent hunting for micro-efficiencies, whether one needs it or not. Performative compute.
Perhaps. But status symbols still require silicon. If millions of people run localised, platform-scale models just to keep up with the Wangs, the thermal dynamics remain exactly the same. Which explains why there are no Mac minis left on the shelves in Shenzhen.
And then, this week, quietly, almost as a footnote, the three largest cloud providers raised prices simultaneously, some categories by up to thirty percent, within hours of each other. After last year, when coordinated moves by the major memory producers were waved away until they were impossible to ignore, one might have expected this to be the story everyone discussed with bated breath. Instead, the discourse moved on to whether the lobster needed a harness.
III.
The pilgrimage occurred exactly on schedule, as it does, as it always does, regardless of what else is happening in the world. Straits have been tense. Certain skies have grown complicated. The price of crude is doing something that people will later describe, in retrospect, as obvious. None of this appeared to alter the conference calendar. The faithful assembled in California, the proceedings commenced, and the ritual of collective attention was offered, as annually contracted, to Jensen Huang.
The cheering was familiar. The leather jacket was familiar. The talk of AI factories, of a new computing era, of inference as the next frontier, was familiar in the way that a favourite sermon is familiar, its power residing not in novelty but in repetition. One had heard the AI factory framing before. One had heard the inference pivot before, in fact as early as 2023, though the room received it with the fresh astonishment of revelation. This is not a criticism. It is simply an observation about the nature of conviction.
But one had come looking for something new. And something new, it turned out, was hiding in the substrate.
The memory hierarchy, that patient, unglamorous ladder of latency, had acquired a new rung. Consider the ladder as it was assembled: magnetic disks at the bottom, flash drives displacing spinning platters, DRAM as the working layer, on-chip SRAM caches as the fast buffer, and most recently HBM, stacked high-bandwidth memory grafted directly to the accelerator. Each addition was, in its moment, a marginal fix. Each one, in retrospect, changed everything. Now SRAM is being discussed not as cache, nor as a displacement, but as the newest element of the architecture: not the fast scratch space between the real memory and the real processor, but a primary design constraint around which entire chips are being organised.
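To appreciate why each rung mattered, it helps to keep rough orders of magnitude in mind. These are textbook figures, not vendor specifications, and real parts vary widely:

```python
# Rough, order-of-magnitude access latencies for each rung of the ladder.
# Textbook figures, not vendor specs; actual parts vary widely.
LATENCY_NS = {
    "magnetic disk seek":       10_000_000,  # ~10 ms
    "flash (NVMe) read":           100_000,  # ~100 microseconds
    "DRAM access":                     100,  # ~100 ns
    "HBM access":                      100,  # latency near DRAM, vastly more bandwidth
    "on-chip SRAM (L1 cache)":           1,  # ~1 ns
}
```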
Meanwhile, the processors were developing similar identity crises. The LPU joined a growing collection of acronyms that had once sounded like marketing and now demanded to be taken seriously. Purpose-built, latency-obsessed, positioned to orchestrate.
And then Huang said it plainly. The CPU, he declared, is coming back.
The statement deserved a pause it did not entirely receive. This is not nostalgia. The CPU he described bore the name of its ancestor while sharing almost none of its character, a central coordinator for an ecosystem of hyper-specialised accelerators, an agent-native conductor for the age of distributed swarms. The old CPU had been a generalist in a world without specialists. This one is a generalist in a world of nothing but.
Recently, the memory-makers have announced memories with on-chip processors. It is not a surprise that processors want their own memory. Copper is giving ground to optics at one level, holding firm at another. The clean taxonomy of the previous fifty years, CPU here, accelerator there, storage somewhere else, is being retired, because as good as current hardware is compared with that of a few years ago, better by a factor of a hundred or even a thousand depending on what you compare, there is an urgent need to improve it another hundredfold or thousandfold in the next few years. And that is the only way to a trillion dollars in business.
It is worth re-reading the numbers in the previous paragraph. Not the trillion-dollar revenue ambitions, but the pace of improvement. We are not talking about an increase of a hundred percent, but of a hundred times.
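A back-of-envelope illustration, assuming, purely for the arithmetic, that "the next few years" means five:

```python
# A hundred percent better means twice as good. A hundred times better is another animal.
# Purely illustrative arithmetic, assuming a five-year horizon.
horizon_years = 5
required_annual_factor = 100 ** (1 / horizon_years)  # ~2.5x improvement every year, compounded
moores_law_factor = 2 ** (horizon_years / 2)         # ~5.7x over the same span, doubling every two years
print(round(required_annual_factor, 2), round(moores_law_factor, 2))
```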
Clearly, somebody is worried about the tens of minutes those agents are taking for the tasks I, and the folks in China, suddenly deem very important.
IV.
Somebody is worried about those tens of minutes. I know because I am somebody, and I am worried about them too. Not in the abstract. In the specific, thermal, audible way of a machine that is being asked to do too much by someone who keeps inventing new things to ask.
We are building things we were never supposed to build. It started, as these things do, with a small frustration. A vendor whose CRM did not quite bend the way we needed. An accounting reconciliation that lost twenty minutes of a human being's dignity every evening. One grumbles. One opens a model. One does not stop. Now we are developing our own order management system. Our own portfolio management system. We are sitting in meetings, seriously, with straight faces, discussing whether to rewrite our product liquidity criteria because our own code can finally map to our exact paranoia rather than the other way around. Three years ago, this would have sounded delusional. Now it sounds like Thursday.
Andrej Karpathy recently open-sourced a 630-line script, dropped a markdown file with a research goal, and went to sleep. By morning, his single GPU had run fifty experiments, rewritten its own architecture, and iterated. He bought a new Mac mini specifically to give it breathing room. Somewhere in Shenzhen, someone is trying to buy the same machine.
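That is not his script, and no attempt to reconstruct it; just a caricature of the shape of such a loop, with the model call and the training run stubbed out as hypothetical helpers.

```python
# The shape of an overnight research loop, in caricature. `propose_config` and
# `run_experiment` are hypothetical stand-ins for a model call and a training run.
import json, time
from pathlib import Path

goal = Path("goal.md").read_text() if Path("goal.md").exists() else "reduce validation loss"
log = Path("results.jsonl")

def propose_config(goal: str, history: list[dict]) -> dict:
    """Ask a model for the next experiment to try. Stub: just varies one knob."""
    return {"attempt": len(history), "lr": 3e-4 / (1 + len(history))}

def run_experiment(config: dict) -> dict:
    """Train and evaluate with this config. Stub: returns a made-up metric."""
    return {"val_loss": 1.0 / (1 + config["attempt"])}

history: list[dict] = []
deadline = time.time() + 8 * 3600            # stop roughly when the human wakes up
while time.time() < deadline and len(history) < 50:
    config = propose_config(goal, history)   # the model decides what to try next
    result = {"config": config, "metrics": run_experiment(config)}
    history.append(result)
    with log.open("a") as f:                 # every experiment logged for the morning read
        f.write(json.dumps(result) + "\n")
```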
An entrepreneur sequenced his dying dog's tumour and built a personalised vaccine for a trial of one. A founder trained a local agent on sixty of his own emails until it replied in his exact register, his exact greetings, his French when the situation required French. He will not use the cloud version. Only this one, he says, truly sounds like me.
I read these and feel something between envy and recognition. Because what each of them did is structurally identical to what we are doing, what the lobster farmers are doing, what the architects of silicon are doing at the atomic level. We are all, in our different registers, refusing the standard tool. We are all saying: almost right is the least acceptable state.
The orthodox objection is obvious and correct. None of these scales. We are duplicating effort on a planetary level, building slightly different wheels because we want the spokes painted our own colour.
And yet.
At the end of the day, I am me. We are we.
That is not a slogan. It is a force. The gains from extreme fit are now larger than the penalties from missing scale. Standardisation was never the ideal. It was the compromise we accepted because extreme customisation was not possible. Now it is. And when the cost of being yourself collapses, it turns out an extraordinary number of people would very much like to be exactly that. Themselves. Fully. Without apology. Without the vendor's drop-down menu standing between their instinct and their output.
I am me. We are we. The Chinese housewife with her lobster. The sleeping researcher with his overnight loop. The father who refused a prognosis. Our team on a Thursday, deploying something no one else will ever use, that fits our hand like it was cast from it.
This is a beautiful, almost embarrassingly human thing.
And an absolutely catastrophic thing for anyone trying to forecast how much silicon the world actually needs.
The Demand We Are Not Measuring
The investment world keeps staring at supply. Fab utilisation. Order backlogs. Hyperscaler capex guidance parsed to the last syllable. Demand, meanwhile, is modelled on subscriptions, seat licences, API call volumes. Tidy proxies. Comforting arithmetic. They measure how many people have access to intelligence. They do not measure what those people are doing with it.
What is changing is not the headcount. It is the depth, the persistence, the relentlessness. Not a query. Not a conversation. A recursive loop that does not clock out. Agents calling agents. Workflows rewriting themselves overnight. Private systems running continuously on machines that were not built for this and are beginning to say so. For the first time in a generation, we are waiting again. The spinner is back. The fan is loud. The machine is warm before the sun is up.
Millions of private, idiosyncratic compute islands are forming quietly at the edge, each refusing to share, each insisting on its own stack, its own logic, its own version of the morning briefing. None of this appears cleanly in hyperscaler guidance or chipmaker order books. Those numbers capture provisioning. They miss the compounding intensity of personalised, always-on, structurally duplicated demand.
And we are still early. The current strain on silicon reflects a world that has barely learned to walk with this intelligence. As familiarity deepens, expectation accelerates. What feels ambitious today becomes routine within months, and the tasks that replace it will be harder, more personal, more urgent. There is also the matter of speed. The same instinct that drives customisation drives impatience. Waiting minutes for an agent feels unacceptable to someone who waited seconds for a search result. Faster answers require more silicon, not less. Each solved problem generates three requirements that did not previously exist, each of them time-sensitive, each of them bespoke.
The complexity of what we ask does not plateau. It compounds, shaped by the same individualising instinct that is already making the machines creak.
The order books are full. They are probably still too small.




