atreyu187

Dreamcast Guides : Power VR GFX Chipset

Jan 4th, 2013
  1. The NEC PowerVR chip consists of a true-colour RAMDAC, and a hardware 3D engine based on a Tile Accelerator. The Tile Accelerator divides the scene into 32 by 32 pixel tiles, which can be rendered individually. Each tile is then rendered into an internal 32 by 32 pixel frame buffer in register memory before it is copied to the main frame buffer. As rendering is done to the internal frame buffer, the fill rate is very high. Also, no texel data is actually fetched from texture VRAM until the tile is copied to the frame buffer, which means that the texture fill rate is not affected by overpainting at all.
  2.  
  3.  
  4.  
  5.  
  6. The following diagram shows the principle by which the hardware 3D engine works:
  7.  
  8. There are two stages, which can be run in parallel (provided you have dual sets of buffers of course). During the Binning stage, the Tile Accelerator is fed graphic primitives (either using DMA or directly by the CPU using the Store Queues or direct writes), which it will compile to an internal format. While doing this, it will register in which tiles this primitive might be visible by putting it in one or more tile bins. (If it's not visible in any tile, it can be completely clipped of course.) During the rendering stage, the ISP/TSP will read the lists created by the Tile Accelerator, and for each tile render the primitives visible for that tile into its internal framebuffer, before writing it out to the right place in the VRAM framebuffer, where the RAMDAC can display it.
  9. For a double buffering strategy that allows you to run both stages simultaneously (but for different frames, i.e. binning frame N+1 while rendering frame N), you need double sets of buffers for the display list and the tile bins, as well as double frame buffers to keep rendering artifacts from being visible on the screen. The following diagram shows which tile bin set and frame buffer to use to avoid conflict:
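The binning idea can be illustrated with a small sketch. This is not how the Tile Accelerator is known to work internally; it only shows one simple way to decide which 32 by 32 pixel tiles a primitive might touch, using its screen-space bounding box. The type tile_range_t and the function tiles_for_bbox are hypothetical names made up for this example.

```c
#define TILE_SIZE 32  /* tiles are 32 by 32 pixels */

/* Hypothetical helper type: an inclusive range of tile columns and rows. */
typedef struct {
    int x0, y0, x1, y1;
} tile_range_t;

/*
 * Work out which tiles a bounding box (inclusive pixel coordinates) may
 * touch on a screen that is tiles_w by tiles_h tiles large.
 * Returns 0 if the box is entirely off screen (the primitive can be
 * clipped completely), otherwise 1 with *out filled in.
 */
int tiles_for_bbox(int min_x, int min_y, int max_x, int max_y,
                   int tiles_w, int tiles_h, tile_range_t *out)
{
    int screen_w = tiles_w * TILE_SIZE;
    int screen_h = tiles_h * TILE_SIZE;

    /* Entirely outside the screen: not visible in any tile. */
    if (max_x < 0 || max_y < 0 || min_x >= screen_w || min_y >= screen_h)
        return 0;

    /* Clamp the box to the screen before converting to tile indices. */
    if (min_x < 0) min_x = 0;
    if (min_y < 0) min_y = 0;
    if (max_x >= screen_w) max_x = screen_w - 1;
    if (max_y >= screen_h) max_y = screen_h - 1;

    out->x0 = min_x / TILE_SIZE;
    out->y0 = min_y / TILE_SIZE;
    out->x1 = max_x / TILE_SIZE;
    out->y1 = max_y / TILE_SIZE;
    return 1;
}
```

A bounding box test is conservative: a thin diagonal triangle will be put in some tiles it never actually covers, which costs a little work at render time but never loses a primitive.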
  10. Frame #   Bin to TB #   Render from TB #   Render to FB #   Display FB #
  11. 1         1             -                  -                -
  12. 2         2             1                  1                -
  13. 3         1             2                  2                1
  14. 4         2             1                  1                2
  15. 5         1             2                  2                1
  16. 6         2             1                  1                2
  17. etc. As you can see, there is a two frame latency, i.e. frame 1 will not be visible on screen until frame 3 is being generated.
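The buffer rotation in the table is regular enough to be computed. The following is a minimal sketch that reproduces the table rows for any frame number starting at 1; the names frame_slots_t and slots_for_frame are hypothetical, and 0 is used here to mean "nothing to do yet" for the empty cells of the first two frames.

```c
/* Which tile bin set (TB) and frame buffer (FB) each stage uses.
 * Values are 1 or 2, or 0 when the stage is idle for that frame. */
typedef struct {
    int bin_tb;     /* TB set the Tile Accelerator bins into */
    int render_tb;  /* TB set the renderer reads from */
    int render_fb;  /* frame buffer the renderer writes to */
    int display_fb; /* frame buffer shown on screen */
} frame_slots_t;

/* frame counts from 1, as in the table. */
frame_slots_t slots_for_frame(unsigned frame)
{
    frame_slots_t s;

    /* Binning alternates 1, 2, 1, 2, ... */
    s.bin_tb = (int)((frame - 1) & 1u) + 1;

    /* Rendering uses what was binned during the previous frame. */
    s.render_tb = (frame >= 2) ? (int)(frame & 1u) + 1 : 0;
    s.render_fb = s.render_tb;

    /* Display shows what was rendered during the previous frame. */
    s.display_fb = (frame >= 3) ? (int)((frame - 1) & 1u) + 1 : 0;

    return s;
}
```

Note that display_fb first becomes non-zero at frame 3, which is the two frame latency mentioned above.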
  18.  
  19.  
  20.  
  21. There are 8 megabytes of video memory, located in memory area 1 (see the memory map). This memory is organized as two banks of 32×1Mbit each, and depending on the value of address bit 24 they can be accessed either sequentially as 32 bit memory or in parallel as 64 bit memory. In both cases, you get 8 megabytes of contiguous address space, but the correspondence of address to memory cell is slightly different, as this figure shows:
  22. 32 bit interface (each location holds bits 0 ... 31):
  23. 0xA57FFFFC   Bank 2
  24. .
  25. .
  26. .
  27. 0xA5400000
  28. 0xA53FFFFC   Bank 1
  29. .
  30. .
  31. .
  32. 0xA5000000
  33. 64 bit interface:
  34. 0xA47FFFF8   Bank 1     Bank 2
  35. .
  36. .
  37. .
  38. 0xA4000000
  39.              0 ... 31   32 ... 63
  40. So, the bytes 0xA4000000-0xA4000003 correspond to 0xA5000000-0xA5000003, 0xA4000004-0xA4000007 to 0xA5400000-0xA5400003, 0xA4000008-0xA400000B to 0xA5000004-0xA5000007 and so on. Both interfaces can handle writes of 16 bits and up; 8-bit writes are not possible. Reads of any width, including 8-bit, are possible, though.
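The correspondence between the two interfaces can be written down as a small address conversion. This sketch assumes that the "and so on" above continues the same pattern for the whole 8 megabytes, i.e. that consecutive 32-bit words in the 64 bit view alternate between bank 1 and bank 2. The function name vram64_to_vram32 and the macro names are made up for this example.

```c
#include <stdint.h>

#define VRAM_64BIT_BASE 0xA4000000u  /* 64 bit interface, P2 area */
#define VRAM_32BIT_BASE 0xA5000000u  /* 32 bit interface, P2 area */
#define VRAM_BANK_SIZE  0x00400000u  /* 4 megabytes per bank */

/* Convert a CPU address in the 64 bit view to the address of the same
 * memory cell in the 32 bit view. */
uint32_t vram64_to_vram32(uint32_t addr64)
{
    uint32_t off  = addr64 - VRAM_64BIT_BASE; /* 0 .. 0x7FFFFF */
    uint32_t bank = (off >> 2) & 1u;          /* 0 = bank 1, 1 = bank 2 */
    uint32_t word = off >> 3;                 /* word index inside the bank */
    uint32_t byte = off & 3u;                 /* byte inside the 32-bit word */

    return VRAM_32BIT_BASE + bank * VRAM_BANK_SIZE + word * 4u + byte;
}
```

For example, vram64_to_vram32(0xA4000004) gives 0xA5400000, matching the second pair of ranges listed above.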
  41.  
  42. In the following register descriptions, an address specification using the 32 bit interface will be referred to as a 32 bit address, and an address specification using the 64 bit interface as a 64 bit address, although this should not be mistaken as the width of the actual address since both types of addresses are really 23 bits wide.
  43.  
  44.  
  45.  
  46. The addresses given here are to the P2 area, as the registers should of course be accessed without cache. The register descriptions are partly based on research done by bITmASTER and maiwe.
  47.  
  48.  
  49. 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
  50. Red Green Blue
  51. This register sets the solid colour displayed around the main display area.
  52.  
  53.  
  54. 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
  55. C COL SD DE
  56. C - Clock double enable
  57. Setting this bit doubles the pixel clock, giving a scan rate suitable for VGA monitors.
  58. COL - Colour mode select
  59. Selects the frame buffer pixel colour mode (all colour modes are little endian, e.g. in RGB888 the blue byte comes first)
  60. Value   Colour mode   Bytes per pixel
  61. 0 0     RGB555        2
  62. 0 1     RGB565        2
  63. 1 0     RGB888        3
  64. 1 1     RGB888        4
  65. SD - Scan Double enable
  66. Setting this bit makes each scan line be sent twice, allowing low resolutions in VGA mode.
  67. DE - Display Enable
  68. This bit must be set for any graphics to be displayed. If it is set to zero, only the border colour will be visible.
  69.  
  70.  
  71. 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
  72. 32-bit Address
  73. This sets the address in the video RAM of the first pixel displayed (top left). Address 0 means the first byte of the video RAM bank 1 (usually accessed as A5000000 from the CPU). The address must be longword aligned. This register is used for noninterlaced screens and the long field of interlaced screens.
  74.  
  75.  
  76. 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
  77. 32-bit Address
  78. Same as A05F8050, but used for the short field of interlaced screens.
  79.  
  80.  
  81. 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
  82. Modulo Lines per field Pixel data per line
  83. This register determines how much pixel data to display each field, and the modulo between each line of data.
  84. Modulo
  85. The number of 32-bit words to skip between each line, plus 1. I.e. a value of 1 means the lines are stored immediately after each other in memory.
  86. Lines per field
  87. How many lines of pixels to fetch and display each field, minus 1. Since this is per field and not per frame, it should be set to half the total vertical resolution (minus 1) in interlaced mode.
  88. Pixel data per line
  89. The number of 32-bit words of pixel data to fetch and display each line, minus 1. If you want X pixels per line, and each pixel is Y bytes, X*Y/4-1 is the correct value to write.
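As a worked example of the three fields, here is a minimal sketch that computes the values to write for a given screen mode. It only computes the field values; it does not pack them into the register, since the bit positions are not given in the text. The names display_size_fields_t and display_size_fields, and the skip_words parameter, are hypothetical.

```c
/* Field values for the display size register, before any bit packing. */
typedef struct {
    unsigned modulo;          /* 32-bit words skipped between lines, plus 1 */
    unsigned lines_per_field; /* lines fetched per field, minus 1 */
    unsigned words_per_line;  /* 32-bit words fetched per line, minus 1 */
} display_size_fields_t;

/*
 * width, height: resolution in pixels (height is per frame)
 * bytes_per_pixel: 2, 3 or 4 depending on the colour mode
 * interlaced: non-zero for interlaced modes
 * skip_words: 32-bit words to skip between lines (0 = lines are
 *             stored immediately after each other)
 */
display_size_fields_t display_size_fields(unsigned width, unsigned height,
                                          unsigned bytes_per_pixel,
                                          int interlaced,
                                          unsigned skip_words)
{
    display_size_fields_t f;
    unsigned lines = interlaced ? height / 2 : height;

    f.modulo = skip_words + 1;
    f.lines_per_field = lines - 1;
    f.words_per_line = width * bytes_per_pixel / 4 - 1; /* X*Y/4-1 */
    return f;
}
```

For a non-interlaced 640 by 480 screen in RGB565 with contiguous lines, this gives modulo 1, lines per field 479 and pixel data per line 319.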
  90.  
  91.  
  92. 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
  93. Top Bottom
  94. This register defines two rasterlines on the screen, which when they are passed by the raster beam will generate a raster event (which optionally causes an interrupt). The rasterline for the "Top" raster event is typically set just above the display area, and the rasterline for the "Bottom" raster event just below the display area.
  95.  
  96.  
  97. 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
  98. VO BC I HP VP
  99. VO - Video Output enable
  100. Set to 1 to enable video output.
  101. I - Interlace
  102. Set to 1 to enable interlaced video.
  103. BC - Broadcast standard
  104. Used to select type of colour sync for composite video
  105. Value   Broadcast standard
  106. 0 0     NTSC
  107. 0 1     PAL
  108. 1 0     PAL-M (?)
  109. 1 1     PAL-N (?)
  110. HP - H-sync polarity
  111. Set to 1 for positive H-sync, 0 for negative H-sync.
  112. VP - V-sync polarity
  113. Set to 1 for positive V-sync, 0 for negative V-sync.
  114.  
  115.  
  116. 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
  117. Start Stop
  118. This register selects the horizontal range in which the border colour is displayed. Left and right of this range, the border is displayed as black.
  119. Start
  120. The number of pixels from the horizontal sync where border display starts.
  121. Stop
  122. The number of pixels from the horizontal sync where border display ends.
  123.  
  124.  
  125. 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
  126. Vertical Horizontal
  127. This register selects the total number of lines and "pixels" (including lace) between each retrace. The horizontal and vertical refresh rate are determined by this register, and the pixel clock. For 50Hz (PAL), set V=624 H=863. For 60Hz (NTSC/VGA), set V=524 H=857. (Halve the V value for non-interlaced PAL/NTSC screens.)
  128.  
  129.  
  130. 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
  131. Start Stop
  132. This register selects the vertical range in which the border colour is displayed. Above and below this range, the border is displayed as black.
  133. Start
  134. The number of scanlines from the vertical sync where border display starts.
  135. Stop
  136. The number of scanlines from the vertical sync where border display ends.
  137.  
  138.  
  139. 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
  140. N LR
  141. Misc additional video settings.
  142. N
  143. Unknown. Set to 22.
  144. LR
  145. Low-res; setting this bit makes each pixel be output twice, effectively giving a 320 pixel horizontal resolution.
  146.  
  147.  
  148. 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
  149. Horizontal pos
  150. This register sets the distance from the horizontal sync to where pixel display starts.
  151.  
  152.  
  153. 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
  154. Vertical pos 2 Vertical pos 1
  155. This register sets the distance (in scanlines) from the vertical sync to where pixel display starts.
  156. Vertical pos 1
  157. This value is used for noninterlaced screens and the long fields of interlaced screens
  158. Vertical pos 2
  159. This value is used for the short fields of interlaced screens
  160.  
  161.  
  162. The addresses given here are to the P2 area, as the registers should of course be accessed without cache.
  163.  
  164.  
  165. 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
  166. 32-bit Address
  167. This sets the address of the Tile Bin array to which the Tile Accelerator should perform its binning. 64 bytes of memory per tile will be used at the video memory address pointed out by this register.
  168.  
  169.  
  170. 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
  171. 32-bit Address
  172. This sets the address of the compiled Display list buffer to which the Tile Accelerator should output the processed primitives. The amount of memory needed depends on how large the scene is.
  173.  
  174.  
  175. 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
  176. Vertical Horizontal
  177. The size of the Tile Bin array in rows and columns.
  178. Vertical
  179. How many tiles high the Tile Bin array is, minus 1. Each tile is 32 pixels high.
  180. Horizontal
  181. How many tiles wide the Tile Bin array is, minus 1. Each tile is 32 pixels wide.
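To tie the Tile Bin registers together, here is a small sketch computing the size fields and the Tile Bin array memory for a given resolution, using the 32 pixel tile size and the 64 bytes per tile stated above. Rounding the resolution up to a whole number of tiles is an assumption of this example, and the function names are hypothetical.

```c
#define TILE_SIZE      32u  /* each tile is 32 by 32 pixels */
#define BYTES_PER_TILE 64u  /* Tile Bin array memory used per tile */

/* Number of tiles needed to cover a given number of pixels
 * (rounded up; an assumption for sizes that are not multiples of 32). */
unsigned tiles_needed(unsigned pixels)
{
    return (pixels + TILE_SIZE - 1) / TILE_SIZE;
}

/* Value for the Horizontal or Vertical field: tile count minus 1. */
unsigned tile_size_field(unsigned pixels)
{
    return tiles_needed(pixels) - 1;
}

/* Bytes of video memory used by the Tile Bin array. */
unsigned tile_bin_array_bytes(unsigned width, unsigned height)
{
    return tiles_needed(width) * tiles_needed(height) * BYTES_PER_TILE;
}
```

For a 640 by 480 screen this gives 20 by 15 tiles, so the fields are Horizontal = 19 and Vertical = 14, and the Tile Bin array takes 19200 bytes.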
  182.  
  183.  
  184. The addresses given here are to the P2 area, as the registers should of course be accessed without cache.
  185.  
  186.  
  187. 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
  188. 32-bit Address
  189. The address of the compiled Display list created by the Tile Accelerator which contains the primitives for the scene.
  190.  
  191.  
  192. 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
  193. 32-bit Address
  194. The address of a structure describing the location and clipping(?) of each tile on the screen, as well as pointers to the respective Tile Bin buffers. This structure has to be created before any rendering can be done, but can be reused in subsequent renders using the same set of Tile Bin buffers.
  195.  
  196.  
  197. 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
  198. Modulo
  199. The modulo of the frame buffer to which rendering is to take place, in bytes / 8.
  200.  
  201.  
  202. 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
  203. TH D COL
  204. The pixel format of the frame buffer to which rendering is to take place.
  205. TH - Alpha threshold
  206. Set this to control the alpha threshold level when output colour mode is ARGB1555.
  207. D - Dither enable
  208. Setting this bit enables dithering in highcolour modes.
  209. COL - Colour mode select
  210. Selects the frame buffer pixel colour mode (all colour modes are little endian, e.g. in RGB888 the blue byte comes first)
  211. Value   Colour mode   Bytes per pixel
  212. 0 0 0   RGB555        2
  213. 0 0 1   RGB565        2
  214. 0 1 0   ARGB4444      2
  215. 0 1 1   ARGB1555      2
  216. 1 0 1   RGB888        4
  217. 1 1 0   ARGB8888      4
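The render modulo and the colour mode table fit together as in the following sketch. It assumes that the lines of the frame buffer are stored directly after each other, so that one line is simply width times bytes per pixel. COL values not listed in the table are reported as 0 bytes rather than guessed. The function names are hypothetical.

```c
/* Bytes per pixel for a render COL value, following the table above.
 * Returns 0 for values the table does not list. */
unsigned render_bytes_per_pixel(unsigned col)
{
    switch (col) {
    case 0: /* RGB555   */
    case 1: /* RGB565   */
    case 2: /* ARGB4444 */
    case 3: /* ARGB1555 */
        return 2;
    case 5: /* RGB888 (4 bytes per pixel) */
    case 6: /* 32-bit mode with alpha */
        return 4;
    default:
        return 0;
    }
}

/* Value for the render modulo register: line length in bytes / 8.
 * Assumes lines are contiguous in the frame buffer. */
unsigned render_modulo(unsigned width, unsigned bytes_per_pixel)
{
    return width * bytes_per_pixel / 8;
}
```

For a 640 pixel wide RGB565 frame buffer, a line is 1280 bytes and the modulo to write is 160.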
  218.  
  219.  
  220. 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
  221. 32-bit Address
  222. The address of the frame buffer to which rendering is to take place. The coordinates for the individual tiles will be added as an offset to this base address.
  223.  
  224.  
  225. 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
  226. COL
  227. The format of the entries in the palettes used by CLUT mode textures.
  228. COL - Colour mode select
  229. Value   Colour mode   Bytes per entry
  230. 0 0     ARGB1555      2
  231. 0 1     RGB565        2
  232. 1 0     ARGB4444      2
  233. 1 1     ARGB8888      4
  234.  
  235. Note that each palette entry always occupies 4 bytes of address space, even if only two bytes are used.
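Since every palette entry occupies 4 bytes of address space regardless of format, the offset of an entry depends only on its index. A minimal sketch, with hypothetical function names, and with offsets relative to wherever the palette starts (the palette base address is not given in this text):

```c
#include <stdint.h>

/* Byte offset of palette entry 'index' from the start of the palette.
 * Each entry takes 4 bytes of address space in every colour mode. */
uint32_t palette_entry_offset(unsigned index)
{
    return (uint32_t)index * 4u;
}

/* How many of those 4 bytes actually carry colour data, following the
 * COL table above (only the two lowest bits of col are looked at). */
unsigned palette_entry_used_bytes(unsigned col)
{
    return ((col & 3u) == 3u) ? 4u : 2u; /* only ARGB8888 uses all 4 */
}
```

So in the 16-bit formats, entry 1 still starts at offset 4, not at offset 2.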
  236. Dreamcast Programming by Marcus Comstedt
  237. Last modified: Wed Apr 25 12:39:02 MEST 2001