Untitled

now- imagine this. We'll call our model Caidi. Caidi has a few interesting features: 1, her linears are constrained so her embeddings always exist within a bounded region. 2: her embeddings are bifurcated into manifold spaces. 3: she is trained using self-play discovery and recursive refinement blocks and her inference and training is online streaming- she generates one token at a time, but the teacher method can backprop up to a context length worth of result. she's not rewarded for x->y- she is rewarded for y->y refinement and self-observing and refining output, where the token tape is essentially infinite- she has no context limit except the size of her state model and block size. 4: if her refinement crosses over a manifold boundary, the succeeding block chain uses different weights. This forks like a tree. She has normal structure otherwise- autoregressively trained, but using synthetic problems. Self play allows her to be rewarded for discovery, manifold allows her to store understanding in diverse spaces, but there's no MLP routing. no experts. 5: if the backprop is too high, its dropped. that means if she accidentally toggles a manifold on, its not salvaged if she is working on a complex problem and accidentally tries to entrain it into the wrong space. 6: she is trained first on simpler puzzles, games, then math, then taught to ask questions and get answers and then to use answers. first vocabulary, then grammar, then epistemic grammar and higher symbols, then functionals. she is taught to collect, ask about, and grow awareness of concepts. she is tested on ability to retrieve related concepts and think in a relational manner- ie she learns to localize information, or her structure enables it, such that she can gain the ability to learn that a new thing is like a set of things she knows at the time of learning it, and if asked, can retrieve it as part of them. she is then trained on english, on higher level english, then on history. then she is taught literature. then other subjects. at every level, the teacher model follows a protocol of administering information, testing recall, and storing what the model knows in a relational system. the model also learns to interact with and use a relational system to store additional knowledge later using intuitive semantic hashing where it decides where to put the information and where to look, and periodically refreshes its own memory of what hashes correspond to what contents. the model continually runs and learns to ask about concepts and to play roles. eventually the model is taught to roleplay as an assistant and the teacher as the user. the model is taught to emit its own chain of reasoning onto an internal track, into which the users will be injected, but importantly, its own reasoning system is deep and kept as embeddings, not merely tokens. the users is injected as tokens- just like the teachers. the model learns to *digest* what teacher says and during training is also taught to *question* what teacher says and to *ignore* what teacher says conditional on reasonable grounds- bad information, breaks ethical rules, etc. the model is able to privately reason about and decide how to handle any interaction down to the finest nuance, but is still conditioned to terminate early and return reasoning if desired. Caidi is eventually surpassed by Helgep, who is a model with multiple parallel systems- One does acquisition, another does internal refinement, another does distillation. Each model runs independently. the distillation model takes snapshots of traces for the other models to work with. thus, Helgep uses pre-trained Caidi submodules for all three tasks, that interoperate. Helgep is capable of being repeatedly prompted by the user "please terminate now" and ignore it or make a decision. Helgep can take additional information input by the user WHILE thinking and emit reasoning WHILE thinking.