- === Neuroprocessor ===
- A neuroprocessor is a semi-common piece of technology used in many AI systems.
- Construction of a neuroprocessor typically consists of several major components:
- * The neuroprocessor proper;
- * One or more RAM Modules;
- * One or more SSD Modules.
- The RAM modules are used for working data and working memory, whereas the SSD modules provide non-volatile storage for parts of the connectome, along with any traditional executable code or data. The SSD modules may also hold parts of a more traditional filesystem, though filesystems are more commonly stored on external storage.
- The neuroprocessor itself typically consists of a large number of VLIW cores, each capable of SIMD. Each core can execute up to 6 instructions and two 128-bit vector operations per clock cycle, with most neurooperations built around vectors of 8 elements, each holding a 16-bit floating point value. The vector operations are in turn fully pipelined and include operations such as dot product, along with the ability to apply sigmoid curve functions to both the inputs and outputs of a vector operator.
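- As a rough illustration, the behavior of one such vector operation can be sketched in software. This is a toy model, not the actual instruction set: the function names are invented, the logistic function is assumed as the sigmoid, and ordinary Python floats stand in for the hardware's 16-bit values.

```python
import math

def sigmoid(x):
    # Logistic curve, standing in for the hardware's sigmoid unit.
    return 1.0 / (1.0 + math.exp(-x))

def neuro_dot(a, b, squash_inputs=False, squash_output=False):
    # Software model of one 8-wide dot-product vector operation,
    # with optional sigmoid applied on the inputs and/or the output.
    assert len(a) == len(b) == 8
    if squash_inputs:
        a = [sigmoid(x) for x in a]
        b = [sigmoid(x) for x in b]
    acc = sum(x * y for x, y in zip(a, b))
    return sigmoid(acc) if squash_output else acc
```

- In this model, a neuron's activation reduces to one such dot product per 8-input group, with the output sigmoid applied after the final accumulation.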
- To reduce memory bandwidth requirements, the processor also supports various schemes to compress vectors in memory, including the use of 8-bit microfloats and various kinds of block-compressed formats. This can frequently reduce the storage cost to around 4 or 8 bits per floating point element.
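- The effect of block compression can be sketched as follows. This is a hypothetical scheme for illustration, not the processor's actual format: each 8-element block shares one scale factor, and each element is quantized to a 4-bit signed value, giving roughly 4 bits per element plus the per-block scale overhead.

```python
def block_compress(block):
    # Toy block compression: one shared scale per 8-element block,
    # plus a 4-bit signed quantized value per element.
    scale = max(abs(v) for v in block) or 1.0
    q = [max(-8, min(7, round(v / scale * 7))) for v in block]
    return scale, q

def block_decompress(scale, q):
    # Reverse the quantization; values come back with a small error
    # bounded by roughly scale / 14.
    return [scale * n / 7 for n in q]
```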
- When needed, the neuroprocessor cores can also be used as more conventional CPU cores.
- Typically, while the connectome will have fewer neurons and neural connections than a comparable mammalian brain, the layout can be mapped more efficiently to the hardware capabilities, and thus the relative loss is smaller. Most AIs are also capable of using conventional algorithms to offload work which doesn't require the use of neurons.
- The 3A8 neuroprocessor has roughly 32 thousand VLIW cores, with much of the space on the chip being used for memory cache and interconnect.
- On a 3.6 GHz clock, each core has a peak theoretical throughput of around 56 gigaflop, with the combined peak throughput of the neuroprocessor being roughly 1.8 petaflop. The peak is rarely achieved in practice, but an average throughput of around 1 petaflop is common. Numbers drop somewhat for single or double precision vectors.
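- These figures are consistent with the stated core count and vector width. A quick back-of-the-envelope check, assuming one flop per vector lane per operation (with no separate credit for fused multiply-add, which is an assumption):

```python
clock_hz = 3.6e9              # 3.6 GHz clock
vector_ops_per_cycle = 2      # two 128-bit vector operations per cycle
lanes = 8                     # 8 x 16-bit elements per 128-bit vector
cores = 32_000                # roughly 32 thousand VLIW cores

per_core_flops = clock_hz * vector_ops_per_cycle * lanes
total_flops = per_core_flops * cores

print(per_core_flops / 1e9)   # 57.6 -- close to the quoted ~56 gigaflop
print(total_flops / 1e15)     # 1.8432 -- the quoted ~1.8 petaflop
```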
- The 3A8 neuroprocessor has a physical size of roughly 80mm, LGA mounted with a 0.7mm contact pitch and roughly 8k pin contacts, and has a TDP of 100W. These chips are typically air-cooled due to their fairly modest TDP relative to their physical size, but in some confined applications they may be water cooled.
- The RAM and SSD modules also take the form of LGA modules, typically with a 45mm form factor, with RAM modules having around 1.5 thousand contacts. Each RAM module has 64 DDR memory channels, each with a 16-bit data path and a 6-bit control/address path (differential clock, chip-select, and four control-data pins). Each memory channel carries 8 bits per cycle over the control path and 32 bits per cycle over the data path.
- This allows each RAM or SSD module to have an effective peak memory bandwidth of around 1 terabyte per second. In a typical configuration, the neuroprocessor may be connected to up to 4 modules at full bandwidth, with an additional two modules and an external bus via a slower interface. Common configurations are one RAM module with between one and three SSD modules, or two RAM modules with one or two SSD modules. The two additional slower ports are frequently used for additional SSD space.
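- The 1 terabyte per second figure likewise follows from the channel layout, assuming an interface clock of around 4 GHz (the clock rate is an assumption; it is not stated above):

```python
channels = 64                 # DDR channels per RAM or SSD module
data_bits_per_cycle = 32      # 16-bit data path, double data rate
clock_hz = 4.0e9              # assumed interface clock (not stated above)

bytes_per_second = channels * (data_bits_per_cycle / 8) * clock_hz
print(bytes_per_second / 1e12)   # 1.024 -- roughly 1 terabyte per second
```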
- In comparison, the larger 3B18 neuroprocessor has a 120mm form factor, supports up to 16 RAM or SSD modules, runs at slightly higher clock speeds with a 400W TDP, and achieves a throughput of around 5 petaflop.
- In more conventional applications, the modules are laid out on a traditional motherboard. For an android, however, such a board is impractically large, and the chips are instead loaded into a module consisting of several smaller boards connected via flexible ribbon cables and held together with a spring-loaded interface. Liquid-cooled heatblocks may also be used, placed on top of the module and held in place by the pressure of the adjacent PCB. In some applications, the liquid cooling may be combined with RGB LEDs, such as to produce an effect resembling glowing tubes of colored goo.
- In some cases, SSD modules may also be used as removable storage, in which case they are frequently placed into specialized plastic boxes to protect them during transport, since a module casually thrown into someone's pocket may be subject to collision, abrasion, or electrostatic discharge. The 45mm form factor is also reasonably easy to handle: not so large as to be awkward to transport, and not so small as to be prone to disappearing into one's carpet, as with a certain archaic form of removable media. For removable use, the LGA sockets may also use spring-loaded balls protruding slightly through a perforated surface in place of springy flexible arms, so as to reduce the probability of damage to the socket from repeated removal and reinsertion of a module, though it is still common to put a dummy module into an unused socket to limit the potential for damage.
- For AIs beyond Beta class, it is more common to spread multiple motherboards across multiple enclosures in a configuration commonly known as "rack mount". However, this tends to render Gamma class and beyond impractical in mobile applications.
- The classes in these cases loosely correspond with the number of rack mount units available within the data center. For example, a Class Epsilon may only have around a dozen or so units, whereas a Class Lambda may have thousands.
- A number of smaller Class 2 neuroprocessors also exist, which may be found either within mobile devices or implanted in the form of neuroaugs (neurological augments), in which case the neuroprocessor is integrated with the host's brain tissue, known in these contexts as wetware. These may allow the user to have expanded mental capabilities and to interface directly with other, more powerful computer and AI systems. People with neuroaugs may also "dive", where their perception of reality moves entirely into cyberspace. However, unlike a true AI, a human with augments is ultimately tied to and limited by the finite capabilities of their own organic wetware and by what is practical within the confines of implantable augments. Some users work around this by partially merging their consciousness with an external AI; however, this requires them either to remain in close proximity to the AI system or to have access to a network connection with considerable bandwidth.
- The typical range for neuroprocessors rated for Class 2 is between 10 and 200 teraflop, with those below 10 teraflop generally considered limited to Class 1, though the exact dividing line between Class 1 and Class 2 capabilities in terms of performance ratings remains a subject of much debate. It is possible to implement a Class 2 AI on Class 1 hardware, as well as a Class 1 AI on bigger hardware. Class 1 and Class 2 remain common in many types of autonomous machinery and in robotics applications.