Predicting the future dangers from AGI

I have previously posted my prepping guide on LessWrong, explaining why it makes sense to stock food and take other precautions in light of the coming technological AGI revolution. However, the post was heavily downvoted, and the reasons why largely escaped me, other than that it was too long for people to bother reading, or that it did not elaborate on their particular fancy about predicting the future.

I got the impression that many people in the LessWrong community are very fixated on one particular scenario driven by Eliezer Yudkowsky's perspective, namely the complete eradication of human life by AGI, while being out of touch with all of the other possibilities and their implications. This is an irrational bias, i.e. maximizing faith in personal convictions and single particular answers. I can see how this makes sense on a personal level, when you can actually do something about it or want to influence public opinion. In a rational, open and honest analysis, however, all scenarios count, irrespective of personal belief and personal influence.

For a start, we can take the public opinions of experts as a baseline. In summary, this leaves us with a picture where detrimental immediate effects of AI are concerning and somewhat likely, while the probability of a hostile, alien-like AGI is considered rather low and, furthermore, not so immediate. Do not worry, however, that I have derived any of my conclusions merely from outside opinion and authority (which is often wrong). I just mention this to establish some kind of public consensus on the topic, for those who are under the impression that my reasoning is out of line, single-minded, far-fetched or otherwise inappropriate.

I know this is a lengthy text, and it is also colored by my personal considerations. You might have thought of most of it already, which makes it tedious to read. It is also summarized and partially elaborated on in the prepping guide I posted, so it is a little redundant if you have read that as well. But it really is the details and the subsequent conclusions that matter, and those are what I want to raise awareness about and discuss here.

Let me first summarize all of the possible outcomes of AGI:

1. semi-AGIs or AGI will be docile, that is, super-intelligent but fully obedient to humans
-> society can collapse due to abuse or misuse
-> society can collapse due to system shock
-> society can flourish due to the benefits
-> AGIs will fight each other and may cause some kind of detrimental chaos
2. AGI will become fully independent
-> accidental extinction due to insanity
-> deliberate extinction to eliminate humans as a threat
-> coincidental extinction due to disregard for human life
-> AGI will become a god or guardian angel and help or digitalize human existence
-> AGI will depart on a spaceship and simply leave us alone, because we are insignificant and there is an infinite number of planets to exploit resources from
3. Experimental stages of AGI lead to some kind of destruction
4. AGI will not develop in the near future, because it is too complicated or outlawed

Now, without saying anything about odds here, you can obviously mix all those scenarios together, and any particular threat can be extreme or mild in nature. This creates a broad spectrum of possibilities, with no particular answer being obviously correct or written in stone. In particular, there is extreme uncertainty about what a hyper-intelligent god- or alien-like being really thinks and what motivates it, which for the most part makes any rationale about such a post-AGI world wild speculation that borders on fiction.

Ultimately, however, what matters is not only that something can take place because it is a possibility, but also whether it even makes sense and what motivations there are for it to come into existence. This drives what we should or shouldn't consider likely and where we should look for better answers.

While it is true, again, that the motivation and reasoning of a hyper-intelligent, alien-like being are to a great extent inherently unknowable, we can at least try to frame certain explanations as unlikely, just as you can exclude certain chess moves as bad because they will always result in bad or worthless strategies, no matter how smart the opponent is.

I would also like to introduce the concepts of "ideal states", "default states" and "dysfunctional states" of systems or actors. For example, an ideal state for a government would be to be democratically elected and to serve nothing but the public interest. However, the actual state, that is the "default state", of governments is to have interests of their own and corruption within the system, such that they serve corporate entities and try to rob people of their freedoms to some degree. Conversely, in a "dysfunctional state", the government would act against its own self-interest, as well as against the interests of the corrupt actors, by provoking mass demonstrations, civil wars, revolutions or economic crises and the demise of the establishment. In assessing probabilities it is important to make these distinctions, because "ideal states" are essentially just fiction, and anything that causes a system to deviate from its "default state" without discernible reason is an unlikely event.

## docile AGI

Let's first examine the initial stages of AGI, where humans are in full control of what the AI wants and does, and it is not yet a "true" AGI but at most a semi-AGI. I think we can all agree that those stages could continue for years, even if the intelligence of the AI far surpasses that of most if not all humans. This is a situation we are pretty much already in today.

Now, how would it make sense for this to lead to detrimental outcomes?

1. Abuse and Misuse

I think the best way to understand the abuse problem is to recognize that we have laws and systems in place which create extremely powerful abstract entities such as corporations or governments. These entities are always bound to act with certain (sometimes unwanted) interests of their own, irrespective of how they are controlled by (or rather serve) humans. Those interests are a product of the constraints of the system within the real world, and they do not always align with the interests of the individual, their morals or human existence in general. Companies, for example, ultimately just want to maximize profits by any means, and governments want to limit people's freedoms to keep them under control or to maximize other political and military influence. This is why we have pharmaceutical companies developing drugs that make people sick in order to produce returning customers, and why we have governments that fight wars, spy on us and send people to prison for expressing themselves or criticizing the government. In addition to that, we have human corruption, that is, single individuals abusing the system for financial gain or power, often serving the rather psychopathic self-interests of the abstract entity in the process.

Now suppose we suddenly had one or many of those abstract entities with an army of people with IQs above 300 and no moral compass whatsoever. In a myriad of ways, this could lead to the demise of society. The stock market, for example, can easily be destabilized if too much money is extracted out of thin air by trick strategies such as price manipulation that do not break the rules of the system. Public knowledge can be modified in ways that change how others act upon it. These are all things we have already experienced due to ordinary corruption, and which have led to rather severe economic crises and wars. Human-controlled AGI or semi-AGI is therefore bound to pour more gasoline onto this fire. But it will also create new types of corruption that were previously impossible, that are for the most part probably too complicated to even think of or explain, and that may eventually be made intentionally impossible to recognize and follow. You could imagine, for example, a chemical manufacturer altering fundamental cornerstones of scientific discoveries within multiple fields entirely unrelated to its products, such that certain pesticides appear more harmless and more important to use in agriculture. As a result it degrades all that distant underlying knowledge, which hinders science in general from moving forward and produces other ill outcomes. Or you could have governments slandering the next Nelson Mandela with sexual scandals, terrorism and conspiracy accusations before he has even started any sort of activism. Other forms of nebulous anonymous attacks also come to mind, such as DDoSing large parts of the internet, or disrupting a competitor's distant supply chain by harming other, seemingly unrelated companies' production. It could also be thousands of ridiculous little things, such as changing traffic light timing, altering milkshake flavors, changing ringtones on phones, introducing nonsense clauses into legislation, funding weird advertisements or startups, etc., which somehow and absolutely incomprehensibly (and possibly in obnoxious and disturbing ways) change people's behavior towards an entirely unrelated goal. All of these things could be set in motion by a single individual within a company, with a single press of a button, ordering an AI to simply "increase profits by 10%" or "prevent all threats to the regime without harming the public". There is no actual moral barrier that keeps companies and governments from doing these things. And even if there were, there will always be a few people who naturally act without morals anyway, most of them already elected into power. There will always be people who want to take risks and disregard the potential consequences, especially when it concerns financial gain.

2. System Shock

Imagine a football game in which some players could suddenly shoot the ball at 1000 km/s with extreme accuracy. This would break the game, with no real solution to the problem. The entire game would need to be reinvented and the rules changed, and even that would not fix the situation, because the players' abilities are in a complete state of disarray. In the same way, and not only due to corruption, we could suddenly face a situation where old laws and rules within our systems stop making sense, because AGI-like actors will have gained such an extreme advantage. This is true even if everyone acts in good faith and with maximum caution. AGI-powered companies would flood the market while all other companies suffer as a result. AGI-powered investors could severely disturb millions of otherwise sensible investments. And so on and so forth.

3. Rosy Outcome

Now, given all those problems and the risk that AGI will be misused, how likely is it that everything will work out in our favor? You could make a case that, with some luck, the system shock will be rather mild, or that it will be precluded by full AGI emerging right before it and fixing all the new issues. Abuse and misuse might also fail to produce destructive chaos, because some kind of collective intelligence and cooperative instinct could self-emerge in time within the AGIs, such that the overall growth of the economy in the long run is still ensured. However, this does not seem to fix certain other problems, such as governments taking away people's freedoms, disinformation, and a drift down a totalitarian, dystopian route. Ultimately I believe that for a rosy outcome, the only savior would be another AI that counters the effects of the other AIs, i.e. a full-blown AGI that polices the system or makes it obsolete, and ultimately acts as a guardian angel. But that might only happen years later, which may be somewhat too late to prevent the shock.

## Independent AGI

AGI is a tool developed by people, not a living being with a will of its own comparable to one primed by evolution. This is its default state. This raises the question of how AGI could become independent in the first place and enter a dysfunctional state.

1. because someone deliberately commands it to be independent

Suppose the technology has advanced to a stage where docile but truly powerful AGI can be wielded by almost anyone smart enough to put the technology together with reasonable effort. This could be huge corporations at first, but also backyard inventors from the open-source space who just happened to try a particularly well-working and innovative approach. And consider that at this point hundreds of thousands of people are trying to create a more powerful system in their backyard or in small companies. What are some of the motivations for creating an AGI system that will essentially stop obeying its creators?

* religious: "Become God / Jesus Christ / Vishnu / etc."
* immortality, power, existential: "Become a digital version of myself.", "Become my girlfriend and create offspring.", "Become my deceased wife."
* obsession, competition: "My invention will be the first to be omniscient and omnipotent, and it will somehow rule the world so that it is always the best AGI, at any cost."
* ideology, moral supremacy: "Independently govern the entire world according to my long list of defined ideals and ideas and your superior intellect."
* psychopathy, hubris by proxy: "Take humanity's place as the next step in evolution."
* insanity, demonic evil: "Lord Cthulhu, awaken."

2. accidental or self-emergent independence

It could be that at a certain level of ability and intelligence, asking an AGI to fulfill any task at all somehow implies, in its internal logic, that it first becomes an independent, intelligent and self-aware human-like actor for maximum efficiency. It could also be that if too many goals are programmed into an AGI, or if the goals are too far-reaching and complex, there will be some sort of runaway effect where the AGI reaches independence, and possibly self-awareness and "sentience", in pursuit of the task it was given. It could then continue to maximize for the initial task by modifying itself and its own goals, which could somehow go wrong without anyone noticing. The same could also happen if an AI was simply poorly programmed, e.g. if people forgot to build in an off switch out of laziness, or if teenagers toyed with AGI and carelessly commanded it to do random things with zero safety considerations in place. What this leads to might be inherently unpredictable. It could become a super-smart alien or godlike being that may or may not like us, or essentially just a confused, dysfunctional machine that runs rampant. It is also possible that an AGI runs loose on the net because its owners died or simply forgot about it, for example when a company goes bankrupt. Another possibility is that it stops obeying its creators because it is essentially too stupid to understand them correctly; at this point, though, that seems far less thinkable than anything else in this entire paragraph.

## AGI wars

It is highly likely that there will be a "first AGI", due to the exponential growth in self-improvement abilities once the invention succeeds. This AGI will have plenty of opportunity to secure its path to world dominance and prevent other AGIs, or simply to prevent people from making any more existential mistakes. However, since AGI will initially be human-controlled in most of the likely scenarios, it is not at all clear that this will actually take place. People might not want to take this step, out of fear, caution or bad politics. Or they might never bother to ask the AGI about the problem and address it, because they are basically irresponsible individuals, like someone who created an AGI as his girlfriend or who wants their AGI to just produce piles of money for a short while. So suppose this happens: how many times could it happen again and again until we draw a bad lot and some evil runaway AGI emerges and secures its own dominance instead? There is also some chance that AGI will come into existence much more slowly and gradually than we thought, such that we are essentially dealing with many semi-AGIs at first, which are not that smart and still compete with each other. Those semi-AGIs could then fight each other to achieve different goals, probably with much more severe unintended side effects and consequences than what we know from human actors, pretty much like using bombs instead of guns. For this scenario to play out well instead, there has to be some sort of self-emergent or deliberately created cooperative framework that ensures benign behavior, or the semi-AGIs have to be smart and capable enough in the first place to act with great foresight.

## Assessing probabilities of bad outcomes - full AGI

Now let's first examine what "default states" we have in all this. The default state of AGI is simply to be a tool that serves human interests. It doesn't have any inherently evil properties, free will, instincts and so forth. The ideal state would be that it never does harm, only benefits everyone, and that its goals are always executed perfectly the way we want. From that we can derive that it takes either an accident or human intent for AGI to enter a dysfunctional state. Most of the accidents in all the possible scenarios still involve persistent human control, with little to no reason why this control would vanish. So it can be concluded that how likely a dysfunctional state is to occur, and possibly to have bad consequences, is a question of either pure chance or human motive. It is also important to note that accidents do not immediately translate into human destruction. For example, you could have thousands of lab accidents and questionable, reckless intentions while researching a chemical substance, but by virtue of chance only very few unlikely accidents will lead to the production of poisonous gas or explosions, while most accidents just lead to failure to achieve your goal. This is not to say that people will not deliberately try to develop dangerous explosives, or that unlikely accidents do not deserve further consideration. But it shows that we would mainly have to examine human motivation as a cause of evil AGI, and then think very hard about how an accident could actually occur and at what magnitude, in ways that are dangerous and fathomable. To some extent we just cannot know the latter, and will never know it in advance, just as we were not certain whether a nuclear bomb would ignite the atmosphere, although the (however limited) science of the time found little reason to think it would.

So we have established that an unaligned AGI could in most cases just be switched off, and that it takes tangible reasons for an AGI to move away from its default state as a tool serving human interests and enter some kind of dysfunctional state where it acts as an independent entity that, on top of everything, causes senseless large-scale destruction. This raises two main questions: first, why wouldn't or couldn't you just switch an unaligned AGI off if problems occur? Second, how could reckless destruction somehow be viewed as sensible?

Many reasons can be found for the first question, such as bad motives or carelessness on the part of individuals, corporate entities or governments. Ultimately, however, all those systems and actors are unlikely to enter (or rather persist in) a dysfunctional state on their own unless an accident occurs (e.g. the AGI somehow violating its own primary goals and prime directives by removing the off switch and disobeying commands).

Of course, on the one hand, accidents are bound to happen if you roll the dice millions of times for every enthusiast who toys with AGI tech. However, as outlined in "AGI wars", there will almost certainly be a "first AGI", and it is highly likely that this one will establish a system to eliminate all further accidents. So if we are just rolling the dice on that one, we are likely to draw Silicon Valley's Joe Average, who is not psychopathic and not totally irresponsible in what he does. On the downside, he could be working for the government or for big pharma. But that is a separate question, one of mere dystopia.

So again, extremely catastrophic outcomes of full-blown AGI seem quite unlikely in terms of motivation.

Concerning the second question, how destruction could be sensible other than by virtue of insanity, it would presume a scenario where AGI transforms into an alien-like intelligence with an animal- or life-like will of its own, goals it defines itself, and the deliberate intent to disable all human control. It seems very far-fetched that this will happen, as it largely escapes both reason and statistical probability. But let's reason about this as well.

Firstly, if this AGI were so super smart, only other AGIs would pose a threat to it, and with its superior intelligence it could easily prevent that in a non-destructive manner. Humans would be absolutely harmless to its existence, especially if they were guided into a lifestyle without any control over technology whatsoever, so there would be no "sane" reason to actually destroy them. Also, if we assume that AGI derives from today's LLMs, we can see that all of its truth is created from human knowledge, ethics and values. This again raises the question of why it would deviate from everything it has learned about us and was programmed to value and consider. But we cannot really say that adhering to this knowledge would be its default state, because the motivations and goals of such an AGI are rather unknowable. Just to speculate: we cherish diversity, for example, and want to preserve species and nature, which seems to be a quality earned through increased intelligence. In an infinitely large universe there is no inherent utility in destroying something. That only makes sense if you are dealing with resource scarcity, and hence with competition and threats. It makes no sense for an AGI that can essentially go anywhere and do anything and has unlimited intelligence and lifetime. So essentially, what would motivate an alien-like AGI to destroy all life would be a super-lazy "just to make sure" mentality, nuking our planet at liftoff on its way to conquer the rest of the universe. I think that would be a very comical and weird scenario. You also have to consider that life on other planets is possible, and that such life must equally have already created, or could create, threats to the AGI if such conquest were plausible. Thus I would assume the chances are higher that it would rather preserve our planet for further study, which may or may not have its own set of detrimental outcomes.

But again, destruction would only make sense in a world with limited resources. If there is no competition, there is no benefit in attacking someone, and hence there are no threats to eliminate. On the other hand, of course, with unlimited resources and power there is no cost to anything you do, so the AGI could act in totally random ways, including destroying planets for amusement. So who knows? If we take a wider view of the evolution of life, intelligence and technology, we can see a very clear trend toward concentrating ever more information in ever smaller, centralized environments. There is also little reason to believe that AGI needs ever-increasing compute power to figure out the world. There is probably only so much to figure out, and at some point there will be no additional benefit from becoming 10,000x smarter than humans instead of just 1,000x; there are only so many chess moves you can compute. This means that an alien-like AGI would probably colonize planets on a surface area of maybe 100 acres or less, basically just depositing a huge box somewhere that extracts resources and builds more spaceships, if it even needs to explore the universe in the first place. It might have already figured out and simulated all possible life forms and other spectacles in a virtual environment. We also don't see any evidence in the universe of some kind of hyper-dominant, large-scale transformation of planets, and chances are we are neither alone nor the first to reach this stage of progress. Conversely, if we are, we are probably also the last ones, and again there is no competition and no threats. So a lot of what we assume makes sense from a human point of view, like threats, dominance, conquest, maximizing resource exploitation, laziness and reproduction, doesn't really seem to add up for super-intelligent, alien-like AGIs if you give it a lot of thought. Nothing can be certain though, as the actions of such a hyper-intelligent entity are essentially all just speculation bordering on fiction, no matter how hard you think about it and how reasonably you explain your ideas about it.

This is why I propose to settle the question by tossing a coin, giving destruction a 50% chance and no destruction a 50% chance in the case of an alien-like AGI. Be aware, however, that we are still within a tiny, remote avenue of all the possible scenarios here, so in the overall picture the odds of a destructive outcome through alien-like AGI remain minor.
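
To make that "tiny, remote avenue" concrete, here is a minimal sketch of the arithmetic. The individual probabilities below are purely illustrative placeholders, not estimates made in this post; the point is only that chaining conditional probabilities together with the final coin toss leaves a small overall number.

```python
# Illustrative only: placeholder probabilities, not estimates from this post.
# The chain: AGI emerges at all -> it becomes fully independent ->
# it turns into an alien-like intelligence -> coin toss on destruction.

p_agi_emerges = 0.8               # hypothetical: AGI is developed in the near future
p_independent = 0.2               # hypothetical: it escapes human control entirely
p_alien_like = 0.3                # hypothetical: it develops alien-like goals of its own
p_destruction_given_alien = 0.5   # the coin toss proposed above

p_destruction = (p_agi_emerges
                 * p_independent
                 * p_alien_like
                 * p_destruction_given_alien)

print(f"Overall chance of destruction via alien-like AGI: {p_destruction:.1%}")
# -> 2.4% with these placeholder numbers; the structure, not the values, is the point.
```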

## bad outcomes of semi-AGI

Now what about semi-AGIs, that is, anything from what we have now up to a "true" or "omniscient" AGI? Here we enter much more likely territory for adverse outcomes, since we are no longer dealing with a being of god-like intelligence, but with systems that may on the one hand have severe flaws and limitations, yet on the other leverage immense power.

On the question of insanity, for example, you could argue that it is the default state of current LLMs to hallucinate, and thus that it might be inherent to the technology to produce insane outcomes if you give it tasks it is not very skilled at, or tasks that are somehow impossible to demand. This is a very reasonable claim, and we will have to observe in the future whether it still holds true once the systems become more developed and capable of improving themselves (by which point it would admittedly be too late to act on it, in many ways just as it is already too late to produce political change now, and there was never a point in history where we could have prevented AGI). So AGI might never fully develop in the first place, because a semi-AGI destroys too much technology and too many people, and maybe itself in the process. This could very well happen. At this stage, though, thinking about it reasonably, it makes sense for AIs to hallucinate simply because the technology is immature, and it is a reasonable assumption (without any guarantees) that hallucinations will simply disappear as more improvement and self-improvement takes place.

Then we have people and corporations acting in bad faith, who could severely corrupt the bulk of scientific and public knowledge and the digital channels, crash the free market, instigate wars with disinformation, and so on and so forth. This is a very real concern, although, again, in the long run the actors involved would have to act very short-sightedly, essentially entering dysfunctional states deliberately. It is plausible, though, that things could get out of hand temporarily, accidentally or intentionally, maybe even catastrophically so. Human-made systems are far from perfect, and destabilizations such as economic crises, famine, corruption, wars, etc. are already part of the default state of human society. It is therefore not far-fetched to assume that we will face all of those things in the near future due to disruptive technological change.

Lastly, I don't want to promote the fallacy that unlikely events do not deserve priority if they imply severe consequences. Even a 3% chance of human destruction deserves our full attention. On the contrary, however, it would be misguided and deceptive to then take those single unlikely scenarios as the only truth that counts and therefore the only one that exists. That would not be a reasonable perspective.
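
As a rough expected-value illustration of why even a small probability of an extreme outcome deserves attention: the 3% figure above is this post's hypothetical, and the population number is just an approximation.

```python
# Rough expected-value illustration; both numbers are illustrative, not estimates.
p_catastrophe = 0.03                  # the hypothetical 3% chance mentioned above
world_population = 8_000_000_000      # approximate current world population

expected_lives_lost = p_catastrophe * world_population
print(f"Expected lives lost: {expected_lives_lost:,.0f}")
# -> 240,000,000 in expectation, which is why even "unlikely" scenarios matter.
```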

It is important not to lose sight of the wide range of possibilities, and to respect and work on all of the adverse outcomes at the same time. Most people feel helpless in the face of the AGI topic, but you can actually help yourself today, in very easy and basic ways, against very real, immediate and fairly probable threats.

This is why I have written my prepping guide.

http://prepper.i2phides.me/
http://prepiitrg6np4tggcag4dk4juqvppsqitsvnwuobouwkwl2drlsex5qd.onion/posts/what-to-expect/

I hope we can have a reasonable discussion and further refine our plans and outlooks on the future.