• Recursively self-improving AI won’t work without these functions

    The prefrontal cortex and other brain areas provide functions that current test-time and agentic AI does not seem to have.
    1) Recursion: the ability to hold an overarching problem, notice a sub-problem within it, and enter that sub-problem to solve it.
    2) Parallelism: the ability to run two or more threads of thought at the same time, keeping the overarching problem in mind while working on the sub-problem. To be clear, I don’t mean parallelism in the neural-net sense; I mean parallelism in the content of the computation.
    3) Temporal planning: the ability to create and store sequences of actions that solve a problem.
    4) Meta-thinking: the ability to notice all of these modes of thinking, work out the most efficient way to do them, and store that as memories/data in itself.
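    To make these four functions concrete, here is a minimal sketch in Python of an agent loop that has all of them in toy form. Everything in it (Goal, Agent, try_directly, find_subproblem) is hypothetical scaffolding invented for illustration, not any existing framework’s API.

```python
from dataclasses import dataclass

@dataclass
class Goal:
    description: str
    parent: "Goal | None" = None  # parent kept in scope: the overarching problem stays in view

class Agent:
    def __init__(self) -> None:
        self.plans: dict[str, list[str]] = {}   # temporal planning: stored action sequences
        self.reuse_counts: dict[str, int] = {}  # meta-thinking: tracks which stored plans pay off

    def try_directly(self, goal: Goal) -> "list[str] | None":
        """Stand-in for a one-shot attempt; returns a plan or None if stuck."""
        if "simple" in goal.description:
            return [f"do({goal.description})"]
        return None

    def find_subproblem(self, goal: Goal) -> Goal:
        """Stand-in for noticing a sub-problem inside the current goal."""
        return Goal(f"simple part of {goal.description}", parent=goal)

    def solve(self, goal: Goal) -> list[str]:
        # Meta-thinking: reuse a stored plan instead of redoing the work.
        if goal.description in self.plans:
            self.reuse_counts[goal.description] = self.reuse_counts.get(goal.description, 0) + 1
            return self.plans[goal.description]
        plan = self.try_directly(goal)
        if plan is None:
            # Recursion: descend into a sub-problem...
            sub = self.find_subproblem(goal)
            plan = self.solve(sub)
            # ...while the parent goal is still in scope ("parallelism" in the
            # content of the computation, not in the hardware):
            plan = plan + [f"finish({goal.description}, using {sub.description})"]
        self.plans[goal.description] = plan  # temporal planning: store the sequence
        return plan

agent = Agent()
print(agent.solve(Goal("write a report")))
print(agent.solve(Goal("write a report")))  # second call reuses the stored plan
print(agent.reuse_counts)                   # the meta layer noticed the repetition
```

    The point of the sketch is the structure, not the stubs: the recursive call gives descent into sub-problems, the parent pointer keeps the overarching problem in scope, the stored plans are temporal planning, and the reuse counter is the seed of meta-thinking.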

    From what I can tell, current agentic/test-time AI lacks these properties; it is essentially a “flat” computation that cannot introspect on or optimize itself. In theory you can paper over all of these gaps by putting solutions to them in the training data or the test-time data, but that defeats the purpose of AI in the first place: we want it to learn these things without us having to explicitly tell it. Without these functions, I believe the AI will have failure modes in all of them. Specifically, it will 1) repeat tasks it has already done, without knowing it is doing so and with no way to optimize them; 2) fail to notice sub-problems within the parent problem/task, and thus fail to solve it; and 3) be unable to integrate what it has learned from the failures above into a meta-thinking framework that would let it improve across all of these areas. A sketch of the first failure mode follows below.
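    To make failure mode 1) concrete, here is a minimal sketch of the kind of repetition check a flat computation lacks: a log of task signatures that lets the system notice it is about to redo work, which is the precondition for optimizing it away. The names (normalize, WorkLog) are hypothetical, not any real system’s API.

```python
import hashlib

def normalize(task: str) -> str:
    """Reduce a task description to a signature for duplicate detection."""
    return hashlib.sha256(" ".join(task.lower().split()).encode()).hexdigest()

class WorkLog:
    def __init__(self) -> None:
        self.seen: dict[str, str] = {}  # signature -> cached result

    def run(self, task: str, execute) -> str:
        sig = normalize(task)
        if sig in self.seen:
            # The system *knows* it has done this before and can reuse or optimize.
            return self.seen[sig]
        result = execute(task)
        self.seen[sig] = result
        return result

log = WorkLog()
print(log.run("Summarize the report", lambda t: f"did: {t}"))
print(log.run("summarize  the report", lambda t: f"did: {t}"))  # caught as a repeat
```

    A flat loop without the log would execute both calls in full and never know the second was a repeat.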

    The last point is the importance of data, in humans too. Even though we are smart, we are still ‘rate limited’ by the data currently available, and this is going to be true for AI as well. It is highly unlikely that current data somehow covers all of the universe, that all the right abstractions and models already exist in it, and, even more crucially, that the _potential_ for them to exist is present in the current data. In other words, there are models of the world that are not possible to create from our current data, somewhat like trying to create modern physics in ancient Greece.