- PH3NOM
- Post subject: Re: 3D Z Clipping For the PVR + OpenGL
- Posted: Sun Mar 09, 2014 7:12 pm
- DC Developer
- Joined: Fri Jun 18, 2010 7:29 pm
- Posts: 359
- moribus wrote:
- Any news about this project? :)
- Thanks for the interest guys!
- Yes, I am still at work on the finishing touches.
- I decided to re-write some of the lighting code in pure SH4 assembly to get some speed gains.
- Good news: the lighting code now delivers over twice the throughput of the previous version, which was written in C with inline asm.
- sh4_light.S
- Code:
- !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
- ! SH4 Assembly Light Code (C) 2014 Josh PH3NOM Pearson
- ! Computes Diffuse and Attenuation Factors
- !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
- .globl _sh4_light_3f
- !void sh4_light_3f( void *glLight, void *vertex6f, void *out )
- !fr0 = vertex x = dr0 = fv0
- !fr1 = vertex y
- !fr2 = vertex z = dr2
- !fr3 = vertex w
- !fr4 = light x = dr4 = fv4
- !fr5 = light y
- !fr6 = light z = dr6
- !fr7 = light w
- !fr8 = normal x = dr8 = fv8
- !fr9 = normal y
- !fr10 = normal z = dr10
- !fr11 = normal w
- !fr12 = misc x = dr12 = fv12
- !fr13 = misc y
- !fr14 = misc z = dr14
- !fr15 = misc w
- !r0 = boolean flag1
- !r1 = boolean flag2
- !r2 = boolean flag3
- !r3 =
- !r4 = [arg][void*] glLight Structure
- !r5 = [arg][void*] float3 Vertex Position / Normal
- !r6 = [arg][void*] float3 Output Write Address FOR D, A light factors
- _sh4_light_3f:
- mov #0, r1 ! boolean flag2
- fmov @r5+, fr0 ! load vertex to fv0
- fmov @r5+, fr1 !
- fmov @r5+, fr2 !
- fldi0 fr3 ! load 0 for w
- fmov @r4+, fr4 ! load light position to fv4
- fmov @r4+, fr5 !
- fmov @r4+, fr6 !
- fmov @r4+, fr7 !
- fcmp/gt fr15, fr7 ! light w component set = spot light
- bt .SPOTL1s
- bf .LIGHT1s
- .SPOTL1s: ! Spotlight Calculations
- fmov fr0, fr8 ! copy vertex position to fv8
- fmov fr1, fr9 !
- fmov fr2, fr10 !
- fldi0 fr11 ! load 0 for w
- fsub fr4, fr8 ! vertex-to-light vector
- fsub fr5, fr9 !
- fsub fr6, fr10 !
- fipr fv8, fv8 ! Normalize vertex-to-light vector
- fsqrt fr11
- fcmp/gt fr3, fr11 ! Check for divide-by-zero
- bt .DIV3Fsl
- bf .Csl
- .DIV3Fsl:
- fdiv fr11, fr8
- fdiv fr11, fr9
- fdiv fr11, fr10
- .Csl:
- mov #1, r1 ! flag input light has read next 4 float vector
- fmov @r4+, fr12 ! load spot light direction to fv12
- fmov @r4+, fr13 !
- fmov @r4+, fr14 !
- fldi0 fr11
- fldi0 fr15
- fipr fv12, fv8 ! fr11 now holds the cosDir of vertex-to-light
- fmov @r4+, fr15 ! fr15 now holds spot light cutoff
- fcmp/gt fr15, fr11 ! if cosDir > spotCutOff, vertex gets no light
- bf .RET0s
- .LIGHT1s:
- fsub fr0, fr4 ! fv4 = L = normalize(light pos - vertex pos)
- fsub fr1, fr5
- fsub fr2, fr6
- fldi0 fr7
- fipr fv4, fv4
- fsqrt fr7
- fcmp/gt fr3, fr7 ! Check for divide-by-zero
- bt .DIV3Fl
- bf .Cl
- .DIV3Fl:
- fdiv fr7, fr4
- fdiv fr7, fr5
- fdiv fr7, fr6
- .Cl:
- fmov @r5+, fr8 ! fv8 = N = vertex normal
- fmov @r5+, fr9
- fmov @r5, fr10
- fldi0 fr11 ! load 0 for N w
- fmov fr7, fr3 ! store L vector length to fr3
- fldi0 fr7 ! load 0 for L w
- fipr fv8, fv4 ! N dot L
- fcmp/gt fr3, fr7 ! fr7 = Diffuse Mag >= 0 ?
- bf .RET0s ! Diffuse Mag < 0, return 0
- fmov fr3, fr7 ! restore L vector length from fr3
- fldi0 fr3
- mov #0, r2 ! Compute Attenuation Factor
- cmp/gt r2, r1 ! check boolean flag for light read pos
- bf .READATT1 ! this means light is not a spot light
- bt .READATT0 ! this means light is a spot light
- .READATT1:
- fmov @r4+, fr12 ! read past spot light direction x
- fmov @r4+, fr12 ! read past spot light direction y
- fmov @r4+, fr12 ! read past spot light direction z
- fmov @r4+, fr12 ! read past spot light CutOff
- .READATT0:
- fmov @r4+, fr12 ! load Kc light attenuation factor
- fmov @r4+, fr13 ! load Kl light attenuation factor
- fmov @r4+, fr14 ! load Kq light attenuation factor
- fmov @r4+, fr15 ! light exponent - not implemented
- ! fr13 = Attenuation = 1.0f / (light->Kc + light->Kl * d + light->Kq * d * d);
- fmul fr7, fr13 ! light->Kl * d
- fmul fr7, fr14 ! light->Kq * d * d
- fmul fr7, fr14
- fadd fr13, fr12
- fadd fr14, fr12
- fldi1 fr13
- fdiv fr12, fr13 ! finish Attenuation calculation
- .RET0s:
- fmov fr7, @r6 ! Write D(Diffuse) factor to output
- fmov @r6+, fr7 ! Move write address
- fmov fr13, @r6 ! write A(Attenuation) factor to output
- rts
- nop
- !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
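For anyone cross-checking the attenuation output against a plain C build, here is a minimal sketch of the formula quoted in the listing's comment, `1.0f / (light->Kc + light->Kl * d + light->Kq * d * d)`. The `atten_coeffs` struct and the `light_attenuation` name are hypothetical and only serve the example; the real glLight structure layout is not shown in this post.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for the three attenuation constants.
   This is NOT the real glLight structure, which is not shown here. */
typedef struct {
    float Kc; /* constant term  */
    float Kl; /* linear term    */
    float Kq; /* quadratic term */
} atten_coeffs;

/* A = 1 / (Kc + Kl*d + Kq*d*d), where d is the vertex-to-light distance.
   The caller is assumed to supply coefficients that keep the
   denominator non-zero. */
static float light_attenuation(const atten_coeffs *k, float d)
{
    assert(k != NULL);
    return 1.0f / (k->Kc + k->Kl * d + k->Kq * d * d);
}
```

With Kc = 1, Kl = 0.5, Kq = 0.5 and d = 2 the denominator is 1 + 1 + 2 = 4, so the factor is 0.25. The assembly reaches the same value with two `fmul` instructions for the quadratic term followed by a single `fdiv`.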
- EDIT:
- As that was the first piece of assembly I have ever written, I have since found some problems with that code.
- I have corrected the branching errors and optimized a few things.
- Here is my current code, which produces correct results and is much faster than the C/inline asm implementation:
- Code:
- !int sh4_l3f( void * glLight, void * vertex6f, void * out );
- !fv0 = vertex position
- !fv4 = light position
- !fv8 = spotlight,normal
- !fv12 = misc
- !r0 = return value
- !r1 = boolean flag r1
- !r4 = [arg][void*] glLight Structure
- !r5 = [arg][void*] float6f Vertex Position / Normal
- !r6 = [arg][void*] float3 Output Write Address FOR D, A light factors
- .globl _sh4_l3f
- _sh4_l3f:
- mov #0, r0 ! set return value r0 to 0
- mov #0, r1 ! set boolean flag r1 to 0
- fschg ! toggle FPSCR.SZ: fmov now transfers 64-bit register pairs
- fmov @r5+, dr0 ! load vertex x,y position into fv0
- fmov @r4+, dr4 ! load light position x,y,z,w into fv4
- fmov @r4+, dr6
- fschg ! toggle FPSCR.SZ back: fmov transfers single 32-bit floats again
- fmov @r5+, fr2 ! load vertex z position to fr2
- fldi0 fr15 ! load 0 to fr15 to use for fcmp/gt 0
- fcmp/gt fr15, fr7 ! check light w component - fr7 set to 1 = gl spot light
- bf .LIGHT1
- .SPOTLIGHT1: ! Handle Spot Light Calculations
- fschg ! toggle FPSCR.SZ: fmov now transfers 64-bit register pairs
- fmov dr0, dr8 ! copy vertex position to fv8 to hold normalized P->L vector
- fmov dr2, dr10
- fschg ! toggle FPSCR.SZ back: fmov transfers single 32-bit floats again
- fsub fr4, fr8 ! sub3f light position from vertex position for P->L vector
- fsub fr5, fr9
- fsub fr6, fr10
- fipr fv8, fv8 ! normalize P->L vector
- fsqrt fr11
- fcmp/gt fr15, fr11
- bf .SPOT1 ! P->L vector length is not greater than 0 - skip division
- fdiv fr11, fr8 ! div3f for P->L normalization
- fdiv fr11, fr9
- fdiv fr11, fr10
- .SPOT1: ! branch div3f for P->L normalization
- mov #1, r1 ! set boolean flag r1 to 1 - indicates the light pointer will have moved past the spot direction and cutoff
- fmov @r4+, fr12 ! load spot light direction
- fmov @r4+, fr13
- fmov @r4+, fr14
- fldi0 fr11 ! load 0 for P->L vector w component
- fipr fv12, fv8 ! P->L dot Light Dir || fr11 now holds light cosDir
- fmov @r4+, fr15 ! load light cutOff to fr15
- fcmp/gt fr15, fr11 ! If cosDir (fr11, the fipr result) > cutOff, vertex is outside of spot light, return 0
- bt .RETURN0
- .LIGHT1: ! process vertex lighting
- fsub fr0, fr4 ! transform Light position into L vector ( normalize(Lp-Vp) )
- fsub fr1, fr5
- fsub fr2, fr6
- fldi0 fr7 ! load 0 for L vector w component
- fldi0 fr15 ! load 0 to fr15 for fcmp/gt 0 comparison
- fipr fv4, fv4 ! normalize L vector
- fsqrt fr7
- fmov fr7, fr3 ! copy L vector length to fr3
- fcmp/gt fr15, fr7
- bf .LIGHTN ! L vector length is not greater than 0 - skip division
- fdiv fr7, fr4 ! div3f for L normalization
- fdiv fr7, fr5
- fdiv fr7, fr6
- .LIGHTN: ! branch past L w division - load normal
- fmov @r5+, fr8 ! load vertex normal to fv8
- fmov @r5+, fr9
- fmov @r5+, fr10
- fldi0 fr11 ! load 0 for vertex normal w component
- fldi0 fr7 ! load 0 for L w component
- fipr fv8, fv4 ! N dot L || fr7 now holds Diffuse Magnitude
- fcmp/gt fr15, fr7
- bf .RETURN0 ! Diffuse Mag < 0 - Return 0
- fmov fr7, @r6 ! write Diffuse Mag to output
- add #4, r6
- cmp/gt r0, r1 ! check if spot light read past vector3f light dir
- bt .READATTEN1
- add #16, r4 ! if light is not spot light, read past spot factors
- .READATTEN1: ! compute attenuation factors
- fmov @r4+, fr12 ! Kc load light attenuation factors to fv12
- fmov @r4+, fr13 ! Kl
- fmov @r4+, fr14 ! Kq
- fldi1 fr15 ! load 1 to fr15
- fmul fr3, fr13 ! perform attenuation calculations
- fmul fr3, fr14
- fmul fr3, fr14
- fadd fr13, fr12
- fadd fr14, fr12
- fdiv fr12, fr15 ! 1.0f / (Kc + Kl * d + Kq * d * d)
- fmov fr15, @r6 ! Write Attenuation Factor to output
- .RETURN1: ! vertex receives light, diffuse and attenuation are written to output
- mov #1, r0
- rts
- nop
- .RETURN0: ! vertex receives no light, return 0
- fldi0 fr3
- fmov fr3, @r6
- add #4, r6
- fmov fr3, @r6
- mov #0, r0
- rts
- nop
- !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!