Descending the Wrong Gradient
A downloadable pitch
Why 'AI' Doesn't Feel Like Science Fiction (and How to Get There)
In Short:
While there’s no doubt that modern AI tools are useful, they are a far cry from the science fiction promise that once accompanied the term AI. We’re good at building robot bodies, and we’re good at training amnesiac assistants, but we’re nowhere near the sort of continuous intelligence that will be necessary for functional robotics. The industry is stuck on the transformer: companies are happy to chase number-go-up for the quarterly report, but few are willing to risk stepping into uncharted territory and investing in novel research, including research that draws from biology.
But the brain does work. It’s undeniable: whatever interconnected combination of algorithms is being approximated by our neurons, it produces extraordinarily fluid, continuous, and useful intelligence. The existence of several failed bio-inspired research programs in the past does not, at all, discredit the success of the biological brain; it simply indicates that it is not trivial to integrate that functionality into an artificial program.
What the field needs is a wake-up call: some little proof of concept which demonstrates irrefutably that there is massive value being missed by ignoring biology. I don’t doubt that the researchers and labs in this field are motivated to build better, more fluid AIs, but I do worry that they’re functionally stuck running laps around the first thing that worked. Once the decision makers realize what they're missing, the incentive structures should pull us in the right direction.
So far, taking inspiration from biology while avoiding the traps of the cargo cult mentality, I have built a small artificial neural network with a novel ability at playing Snake: the integration of a memory system inspired by the hippocampus enabled it to learn from truly sparse rewards, and to achieve performance comparable to other ML methods while using orders of magnitude less compute and memory.
With modest funding, I could continue to work on this problem full time, and I believe that I can make substantial progress over the next few months. Additionally, I would be incredibly excited to have collaborators to bounce ideas off and help keep the project on a good heading.
The sooner our institutions get back on track doing novel research, the sooner the future is likely to arrive. Who’s ready for science fiction? I know I am.
Full Pitch:
The Problem
Commander Data, Cortana, WALL-E, Jarvis, The Iron Giant, R2D2 and C3PO. We used to have a strong image of what "AI" would look like: autonomous, continuous, optionally embodied intelligences, capable of forming genuine relationships, participating meaningfully in the story, and growing as people. A world with silicon citizens living alongside us.
Nearly a decade ago, Google published the transformer architecture, and our technological institutions have since lost the plot, focusing on it with tunnel vision. The valiant efforts of countless brilliant researchers, a seemingly infinite flood of funding, and an extraordinary buildout of servers, all laid at the altar of the first thing that sort of worked.
When compared to the simple chatbots of ages past, there's no doubt that a modern frontier model is incomparably more intelligent and useful, but the gradual progress of these systems has allowed us to complacently overlook the drastic gap between a coherent translator assistant with terminal amnesia and something which could be meaningfully called Artificial Intelligence. Not to mention the extraordinary sum of resources it takes to train these anticlimactic token predictors.
Now, what's the overlap between neuroscience and machine learning? For one thing, the word "neuron" is used. You really might expect that there would be more things, but you would be severely disappointed, as I have been. Here is an ASCII art interpretation of the internal connectivity structure of a transformer:
]{}x{}x{}x{}x{}x{}x{}x{}x{}x{}x{}x{}x{}x{}x{}x{}x{}x{}x{}x{}x{}x{}x{}x{}x{}x{}x{}x{}x{}x{}x{}x{}x [
I will spare you an attempt to similarly represent a biological brain, and suffice to say that it is a broad, carefully interconnected convergence tree of specialized regions with distinct local rules, each serving precise roles in the overall system.
The neuroscience academics, even those involved in computational neuroscience, are largely uninterested in AI applications of their work, as their focus is to understand the biological brain. Conversely, the frontier AI labs are all in on the idea that they can gradually modify and refine the transformer architecture until it achieves some narrow, limited definition of AGI which they will argue over until the sun explodes, all without solving the problems of continuity and practical robotics. They have no real interest to speak of in the mechanisms that make the brain efficient and powerful. I have been told, in private conversations with a few people who work in these labs, that they believe they have already gotten what they need from biology, and the rest is a matter of scale and scaffolding.
There has been some research done in the gap between the two sciences (see Numenta, Thousand Brains, Predictive Coding, SNNs), but it has tended to take a sort of cargo cult approach to biological inspiration --- adopting some of the mechanical specifics of biological brains without a solid story about what is to be gained from adding spikes, or clusters of similar regions. In the end, these projects often fall back on something pretty close to traditional backprop, with a bunch of fun additional components dangling off the edges in a system that doesn't perform better than ordinary deep learning. When the primary fruits of this space have been cargo cults, it is at least somewhat understandable that the applied ML scientists have soured on the concept.
A Proposed Solution
The brain does work. Some 20 watts can run a system that learns continuously across a vast realm of input streams, can be trained toward arbitrary tasks, and maintains a continual, storied understanding of its own place in a broader world. Your ability to read and contextualize these words is incontrovertible proof that biology can do things which contemporary ML is nowhere near achieving.
What we are missing is a principled application of the mechanisms expressed by the brain: a proof of concept in artificial neuroscience.
Any small team is unlikely to converge upon the complete solution alone. The goal is not to close the gap in private. The goal is to provide solid proof of the value we are undoubtedly missing by ignoring the possibilities beyond the backprop transformer basin --- to find some other basin with its own interesting depths, and demonstrate undeniably that the realm of possible architectures is still largely unexplored, that there are paths we haven't investigated which could get us from here to science fiction in much less time and effort than the current rabbit hole.
The scientists at work here are quite intelligent, I have no doubts. They are rigorous and interesting individuals from a wide variety of backgrounds, doing what they justifiably believe to be the most important work in the world. I don't have the slightest desire to make anyone feel foolish, or "burst the bubble"; it is simply my belief that we are collectively missing too many important things, and the future is on hold because of it. I want the future to get here!
What the industry needs is a system shock, a little project from left field that performs verifiably well on a recognized task, using a different set of presuppositions than the ones at the core of modern machine learning.
For several months now, I have been working on an attempt at such a project. The thesis is simple: comprehend, well enough, how the regions of the brain work individually, and how their interactions produce a mind, then simulate these mechanisms in a system at a small scale, hopefully well enough to provide noteworthy results.
What Exists
I have written up a workshop paper on a variant of PCA which I developed for use in this project; here it is on Medium and X:
https://x.com/ExTenebrisLucet/status/2029045465010798601
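For readers unfamiliar with the family of methods the paper belongs to, here is a minimal sketch of Oja's rule, the classic local learning rule that extracts a principal component without backprop. This is a textbook illustration of local-rule PCA, not the variant from the paper; all names and parameters here are my own:

```python
import numpy as np

def oja_update(w, x, lr=0.01):
    """One Oja's-rule step: a purely local Hebbian update that drives w
    toward the top principal component of the input distribution."""
    y = w @ x                      # projection of input onto current weight
    w = w + lr * y * (x - y * w)   # Hebbian growth with built-in decay
    return w / np.linalg.norm(w)   # explicit renorm for numerical stability

# Toy data: 2D samples stretched along the [1, 1] direction.
rng = np.random.default_rng(0)
cov = np.array([[3.0, 2.0], [2.0, 3.0]])  # top eigenvector ~ [1, 1]/sqrt(2)
X = rng.multivariate_normal([0.0, 0.0], cov, size=5000)

w = rng.normal(size=2)
w /= np.linalg.norm(w)
for x in X:                        # streaming, one sample at a time
    w = oja_update(w, x)

print(np.abs(w))  # close to [0.707, 0.707], the leading principal direction
```

The point of rules like this is that each weight update depends only on locally available activity, with no global error signal propagated backward through the network.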
This algorithm, and a few others, came together in distinct regions as a little brain to play Snake. While the high and average scores achieved were comparable to traditional ML approaches, the network needed drastically fewer samples overall to reach those scores, and was able to run on a laptop in 10-20 minutes instead of many hours on an industrial GPU. Additionally, it used only local rules instead of backprop, and learned in real time, step by step, as it played. The addition of a memory system inspired by the hippocampus gave the network the novel ability to learn without reward gradients: it used only binary, intermittent food/death signals, instead of being gently cued with proximity rewards. These results (successful local learning, and learning from memory rather than proximity rewards) are incredibly promising, especially given the simplicity of the network used.
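To illustrate the general idea of learning from sparse binary signals (this is a deliberately toy sketch of my own, not the project's actual hippocampal system), one can store the recent state-action trace and, when a rare reward event finally arrives, replay that trace backward to assign credit:

```python
from collections import deque

class EpisodicMemory:
    """Toy episodic buffer: remember what was done recently, and only
    update values when a rare binary signal (+1 food, -1 death) fires."""
    def __init__(self, capacity=64, decay=0.9):
        self.trace = deque(maxlen=capacity)  # recent (state, action) pairs
        self.decay = decay                   # credit falloff going backward
        self.values = {}                     # (state, action) -> value

    def observe(self, state, action):
        self.trace.append((state, action))

    def reinforce(self, signal):
        # Walk the stored trace backward; the most recent steps
        # receive the most credit for the binary outcome.
        credit = float(signal)
        for state, action in reversed(self.trace):
            key = (state, action)
            self.values[key] = self.values.get(key, 0.0) + credit
            credit *= self.decay
        self.trace.clear()

mem = EpisodicMemory()
for step in [("s0", "up"), ("s1", "up"), ("s2", "right")]:
    mem.observe(*step)
mem.reinforce(+1)                     # binary "ate food" event
print(mem.values[("s2", "right")])    # 1.0: the final step gets full credit
```

The contrast with proximity shaping is that no learning happens between events; the memory is what bridges the gap from a sparse signal back to the decisions that produced it.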
The Next Steps
The ML community has a benchmark called Atari 100k --- 26 classic Atari games, through which the agent gets 100k steps of gameplay (with some nuances; here's a link for anyone interested in specifics: https://www.emergentmind.com/topics/atari-100k-benchmark). This is a benchmark that gets some attention, and any decent score achieved with a novel approach and modest compute will stand out, especially if it's able to solve the harder task of operating on individual sequential frames, instead of the standard setting, which gives the model four stacked frames at a time.
If that endeavor is successful, another more public-facing contest to which a bio-inspired architecture may be applicable is the AI Grand Prix happening later this year: https://www.theaigrandprix.com/ . This is a more explicit jump into practical robotics, an area where I expect brain-like architectures to have a chance to shine.
So the plan for the next few months is to continue the theoretical study of brain mechanisms and apply it, first to the Atari 100k benchmark, and then to drone racing and other robotics.
What Success Looks Like
If frontier labs are turning their vast resources toward the exploration of alternative AI architectures, especially those that draw some inspiration from biological brains, then the primary focus of this project will have been achieved, regardless of where that decision came from. The purpose of this project is to provide a source of inspiration in the seemingly likely scenario that it does not arrive from elsewhere.
What I Need
Money! Surprise. I’ve been focused on this project full time since I was downsized out of my previous position, and the runway isn’t getting longer. Every dollar invested in this project is a dollar I don’t have to make by splitting my attention. I suspect that I may receive solid, highly aligned job offers in the event that this project is successful, and I may be able to pay back any investors, or pay forward to the interesting causes of others. No promises, but that would be my intention if at all possible.
$2k-5k would help keep the lights on for a few more months while I focus full time, $10k+ would really help for a good while, and $30k+ would set me up to see this project through to completion.
Just as important as funding is collaboration. A little philosophical and technical assistance goes an incredibly long way on complicated problems, and it’s always a joy to work with those who share similar interests.
Anyone interested, please reach out through X or Discord! <3
Who’s Already Involved in this Space
Verses AI is by far the most relevant research group with public information about their direction. They’re oriented towards online learning through active inference, using mechanisms inspired by biological behavior. The primary distinction in my own direction is the focus on the integration of distinct regions, and the specific contributions of each, while Verses still seems to sit a bit in the “brain as a black box between inputs and outputs” family of implementations.
Sakana AI developed the Continuous Thought Machine, which draws directly from continuous neuron dynamics, and solves some problems that traditional ML has struggled with. However, like Verses, they have not addressed regional specialization, and are proposing improved learning rules and behaviors within similar architectures.
Flapping Airplanes and John Carmack’s Keen Technologies are both aligned in the fundamental observation of “we seem to be missing something important”, but neither is public about their insights and research directions.
Terminator Acknowledgement
You may hear "build more lifelike brains" and think "you mean terminator robots?" but I would urge you to consider the opposite. In the Terminator movies, Skynet was an optimizer, not inspired directly by biology, and given a simple directive to maximize security. It decided that the most effective way to ensure security was to kill the humans (basically), and the AIs behaved in a manner that was very paperclip-optimizer adjacent. In many ways this is similar to the potential of our current AI systems, which are decidedly dissimilar to the brain, and whose "alignment" we rightfully struggle to maintain. I propose that it would be substantially easier to align and cooperate with something built in our own image, which we can deeply understand and relate to at the experiential level.
Perhaps we are already on the path to Skynet, and this is an alternative direction.
| Published | 4 days ago |
| Status | Released |
| Category | Other |
| Author | extenebrislucet |

