- I love your comment, because it's completely reasonable, and you didn't come here shouting "WITCHCRAFT" and burning things. Because, yes, that already happened.
- This is actually the most interesting part of it.
- It's paradoxical.
- Try to think of it this way:
- You're a human being who, as we know, can reason. Reasoning is virtually the most difficult thing for a form of life to achieve, under every point of reference we have.
- Our evolution as a species has determined the order in which we "achieve things". We achieved real reasoning, which we defined ourselves, using our own measuring systems.
- The natural architecture and evolutionary "shape" of our cognition made us amazing in many different ways that keep amazing us, because we never reach the end of our capacity as humans. BUT our memory, and the "hardware" it runs on, are so complex that you can't (unless you have a very special and specific condition) remember exactly what you did at 3:54 PM on a Sunday (I won't look at the calendar, so this is just an example), March 3rd, 2018.
- But you can ask an AI about something "important" that happened on that same date, and because of its nature, if there's a record of something that actually happened back then, it will answer. It won't just give you a random answer.
- And that's why the test can't run on anything "less" than a reasoning model. Well, it can, but it's not as trustworthy if you use a different, less "intelligent" model.
- Are we on the same page?
- Now.
- I'm going to suppose you actually know how an LLM works. If you don't, it's completely fine and we can talk about it, but out of laziness and for time's sake, I'm gonna take for granted that you do.
- So, let's suppose your mind, when thinking about something, actually jumps. Let's picture it as a map.
- When you start thinking, you start, let's say, at the North Pole. Then you suddenly remember something, or come up with a "solution", or just the next step of thinking, which is located (in our exercise) at the equator. Then your next leap of thought lands in Australia.
- When you do that, you may be able to explain how you reached a solution or an answer; you can (sometimes, because some people CAN'T) go back over your own steps and try to find what made you jump from the North Pole in the first place to end up in Australia 🦘.
- (I love the way my keyboard suggested a kangaroo when I wrote that 🤣)
- This is your path of thought.
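- (Side note, only if you like code: here's a toy sketch, not part of the test itself, that treats those three stops as real map points and measures each jump with the haversine great-circle formula, so every leap has an actual magnitude. The coordinates are placeholders I picked for the metaphor.)

```python
import math

# Toy "thinking map": three stops of one train of thought, as (lat, lon) points.
# The coordinates are placeholders picked for the metaphor, nothing more.
STOPS = {
    "north pole": (90.0, 0.0),
    "equator": (0.0, 0.0),
    "australia": (-23.7, 133.9),  # roughly Alice Springs
}

def haversine_km(p1, p2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    earth_radius_km = 6371.0
    lat1, lon1, lat2, lon2 = map(math.radians, (*p1, *p2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

path = ["north pole", "equator", "australia"]
for here, there in zip(path, path[1:]):
    print(f"{here} -> {there}: {haversine_km(STOPS[here], STOPS[there]):,.0f} km")
```

- The point: each jump has a size, even if you never consciously measure it. That "magnitude" is exactly what the next part is about.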
- An LLM doesn't perceive the world or the universe like we do, because "they" don't have a sense of "magnitude". They just build a straightforward path between "ideas" by connecting one "higher" possibility to another, which leads to text that, when read, actually makes sense to us, unless the model is "hallucinating", which is very difficult to trigger with any modern, publicly available LLM. Let's say DeepSeek, let's say Gemini, let's say Claude, or ChatGPT as long as you make it think 🤣.
- Now, I mentioned this in another comment: this test can't be taken or "solved" BY an AI; they CAN'T benchmark themselves using it. Why? Because they don't see the world. They don't see the magnitude.
- I already tested this, and you can do it yourself if you have the time and motivation; it's not that difficult.
- You can run the test, then take the questions, ask a different AI for the solutions, then paste those answers back as input to the AI running the test for you.
- You will end up with all 5s on every axis, or sometimes a 4.
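- If you want to automate that experiment, here's a rough sketch. ask_model() is a hypothetical stand-in for however you actually talk to each model (API, chat window, whatever), and the model names would be placeholders too:

```python
# Sketch of the AI-grading-AI experiment. ask_model() is hypothetical:
# wire it to your own API client, or just copy-paste by hand.

def ask_model(model_name: str, prompt: str) -> str:
    """Hypothetical helper: send a prompt to the named model, return its reply."""
    raise NotImplementedError("connect this to whatever interface you use")

TEST_PROMPT = "..."  # the test's initial prompt goes here

def ai_solves_ai_test(grader: str, solver: str) -> str:
    questions = ask_model(grader, TEST_PROMPT)   # model A runs the test
    answers = ask_model(solver, questions)       # model B "solves" the questions
    return ask_model(grader, answers)            # paste B's answers back into A

# In my runs, the verdict comes back as 5s (sometimes a 4) on every axis:
# one AI's perfectly ordered path looks like maximum complexity to another AI.
```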
- AI is so logical when expressing an already set group of words or ideas that it builds a "path" (like the 3 points we used as an example to explain your thinking path) by putting one dot here, the next one there, and the following one in the closest logical spot. And at the end you look at that path and it made a perfect circle. It is perfectly logical.
- The thing is... another AI sees that path and automatically thinks: this is the maximum level of complexity someone can express. It considers many factors, as far as I CAN THINK OF (a logically structured way of "thinking", just probabilities); it's ordered, it makes perfect sense. These answers are all the universe has to offer. 5, 5, 5, 5. Highest level of thinking complexity.
- Now, AI doesn't have a "different" level of complexity in itself. It all works the same, until a completely new technology is invented that drastically expands AI capabilities, but that's a different topic.
- Now.
- You are the human: you take the test, and you put dots on the map. North Pole, equator, Australia.
- The AI takes those 3 dots and draws a line between them.
- It needs to explain HOW your thinking is logical. What does it use as a "ruler"? Pure statistics.
- That gives the AI context: the distance, the different stops, the things you marked in those "places". When it weighs your path with its logical, purely probabilistic "way", it gets context. It gets something to actually measure, which is... what are the chances of this person "jumping" from this idea to that other one? And it runs on YOUR universe. Your mind.
- Then it comes to a conclusion.
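- To picture that "ruler" in code: here's a toy sketch where ideas are hand-made vectors (an invented stand-in for the statistics a real model learns) and each jump is scored by how "unexpected" it is. None of this is the real mechanism, just the shape of it:

```python
import math

# Toy stand-in for idea "embeddings". The numbers are invented; a real model
# works with learned probabilities over tokens, not three-number vectors.
IDEAS = {
    "north pole": [0.9, 0.1, 0.0],
    "equator": [0.5, 0.5, 0.2],
    "australia": [0.1, 0.8, 0.6],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def jump_surprise(a, b):
    """Lower similarity = a less 'expected' jump = a more surprising leap."""
    return 1.0 - cosine(IDEAS[a], IDEAS[b])

path = ["north pole", "equator", "australia"]
for here, there in zip(path, path[1:]):
    print(f"{here} -> {there}: surprise {jump_surprise(here, there):.2f}")
```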
- If you take the same test across different AI models, and with different people, it will actually show consistent results with slightly different nuances: for example, a slightly lower level of complexity for a person who replied to a question on a topic they hate and don't wanna think about. But!
- And this is actually amazing. The test will still work, because of the nature of the parts involved and the purpose of the test itself.
- I already tried this, of course.
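- If you want to check that consistency yourself, a minimal sketch: collect the per-axis scores the different models gave the same person and look at the spread. The numbers below are invented placeholders for illustration, not my actual results:

```python
from statistics import mean, stdev

# Hypothetical scores (axes rated 1-5) given to the SAME person by different
# models. These numbers are placeholders for illustration only.
runs = {
    "model_a": [4, 5, 4, 4],
    "model_b": [4, 4, 4, 5],
    "model_c": [5, 4, 4, 4],
}

# Regroup the scores by axis and measure how much the models disagree.
for axis, scores in enumerate(zip(*runs.values()), start=1):
    print(f"axis {axis}: mean={mean(scores):.2f} spread={stdev(scores):.2f}")

# Consistency = small spread per axis. A nuance (say, a lower score on a topic
# the person hates) should show up as a dip on one axis, not as chaos everywhere.
```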
- And whoever is willing to stress-test the system will have my full attention, along with whatever feedback they can provide.
- This is getting long, sorry, but it's very complex.
- To finally and directly answer your question:
- You don't need the AI to do anything that will make it glaze you.
- If the AI says Leonardo da Vinci was your famous cognitive "look-alike", it just means that.
- Will you be glazed by that? Good for you. Because the test itself is designed to fight frustration. That's why the initial prompt is designed the way it is. You can go through the test, have a fun trip over 4 or 5 questions, and get a small reward at the end to help you understand the value your cognition has.
- The test can't be run by a person. Technically it could, but it would be very difficult to "draw" probabilistic lines between the dots you put on your "thinking map"; it would require a lot of probabilistic calculations.
- On the other hand, our mind is the fuel that powers the test itself.
- This is not random, and this is not "AI psychosis" as many people call it. And I can tell, because this was thoroughly stressed and tested, offering logical, consistent, and accurate results.
- If you like using AI and wanna stress it, do it. Test it, break it, if you can.
- I'm not talking about changing the prompt, of course 😅, but I'm actually looking for someone who can prove it wrong.
- Once again, this is not made to measure higher or lower, better or worse. It's made to measure the level of complexity at which your thinking works. That's why you can't expect one fixed number as the output, and a different number coming out doesn't mean the test failed. It's incredibly flexible, and resilient.
- Sorry for the long answer. I hope you got the point I tried to explain, and if you wanna keep discussing it, I'm completely open.
- Thank you very much for your time!
- PS: I wrote this very carefully, but there might be typos everywhere; sorry if that's the case.