Convert to ps_1_4 #58

Merged
merged 8 commits into from
Nov 23, 2017

elishacloud
Contributor

@elishacloud elishacloud commented Nov 23, 2017

Ok, I believe this is finally ready to check in. This has a few different updates in it, but the biggest change is the code to convert to ps_1_4. Let me describe the other changes first.

Sorry for the wall of text!

Code Changes:

1. Added some spaces to make the code more readable

See here and here for an example.

2. Changed the way Instruction/Arithmetic count is computed:

This is a relatively small update: I changed from int to size_t and from atoi to strtoul. It also no longer walks the string with a pointer and uses substr and c_str instead. See here and here.
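
As a rough illustration of that pattern (not the actual d3d8to9 code), here is a minimal sketch that pulls a count out of a disassembly comment like the one shown later in this description. The helper name `ParseCountBefore` and the idea of searching for a label are assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper (not from d3d8to9): returns the number that directly
// precedes `label` in a disassembly comment such as
// "// approximately 12 instruction slots used (4 texture, 8 arithmetic)".
// Returns 0 when the label is missing or no number precedes it.
std::size_t ParseCountBefore(const std::string &text, const std::string &label)
{
    const std::size_t labelPos = text.find(label);
    if (labelPos == std::string::npos || labelPos == 0)
        return 0;

    // Skip the spaces between the number and the label.
    const std::size_t end = text.find_last_not_of(' ', labelPos - 1);
    if (end == std::string::npos)
        return 0;

    // Walk backwards over the digits of the number.
    std::size_t begin = end + 1;
    while (begin > 0 && text[begin - 1] >= '0' && text[begin - 1] <= '9')
        --begin;
    if (begin > end)
        return 0; // the character before the label is not a digit

    // substr + c_str + strtoul instead of pointer arithmetic + atoi.
    return std::strtoul(text.substr(begin, end - begin + 1).c_str(), nullptr, 10);
}
```

Calling `ParseCountBefore(comment, "arithmetic")` on the comment above would give the arithmetic count, and passing `"texture"` would give the texture count.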

3. Checks the destination register before updating the constant modifier:

Some games use a pixel shader that uses the same register for both the source and destination register along with a modifier on the constant. Example:

  + mul r0.w, 1-r0.w, 1-c0.w

Notice how r0 is used for both the source and destination.

The code used to change it to:

    mov r0, c0 /* added line*/
  + mul r0.w, 1-r0.w, 1-r0.w /* changed c0 to r0 */

This is obviously wrong because the added mov overwrites r0 before the instruction reads it, and I think it was a contributing factor to issue #44. Updated to fix this here.
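
A check along these lines could be sketched as follows. This is only an illustration under assumptions: the helper name `DestinationIsAlsoSource` is made up, and it assumes the co-issue marker `+` is separated from the opcode by whitespace, as in the listings here.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper (not from d3d8to9): true when the destination register
// of one ps_1_x assembly line also appears among its source operands.
bool DestinationIsAlsoSource(const std::string &line)
{
    std::istringstream stream(line);
    std::string opcode, dest;
    stream >> opcode;
    if (opcode == "+") // co-issue marker, the real opcode follows
        stream >> opcode;
    if (!(stream >> dest))
        return false;

    // Reduce "r0.w," to the bare register name "r0".
    dest = dest.substr(0, dest.find_first_of(".,"));
    if (dest.empty())
        return false;

    // Everything after the destination, minus a trailing /* ... */ comment.
    std::string sources;
    std::getline(stream, sources);
    sources = sources.substr(0, sources.find("/*"));

    for (std::size_t pos = sources.find(dest); pos != std::string::npos;
         pos = sources.find(dest, pos + 1))
    {
        const std::size_t next = pos + dest.size();
        // Reject partial matches such as "r1" inside "r10".
        if (next >= sources.size() ||
            !std::isdigit(static_cast<unsigned char>(sources[next])))
            return true;
    }
    return false;
}
```

For the `mul r0.w, 1-r0.w, 1-c0.w` line above this returns true, which is the case where the destination register must not be reused to hold the constant.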

4. Added code to convert to ps_1_4:

The logic here is fairly straightforward. Rather than describing it all (since you can just review the code), let me put a few notes about the logic:

  • It only tries to convert to ps_1_4 if it is unable to fix all the constant modifiers with an exact conversion. Only three games I tested had this issue: Silent Hill 2, Star Wars Republic Commando and True Crime New York City. More about these games in the testing section.
  • It only converts ps_1_0, ps_1_1, ps_1_2 and ps_1_3.
  • It won't attempt conversion if unsupported instructions are used, namely: texbem, texcoord, texm3x3, texreg2rgb, etc.
  • The ps_1_4 conversion is done using a temporary string and assembly is tested before updating the SourceCode string. If assembly fails it keeps the previous logic.
  • The SourceCode and ArithmeticCount variables used for ps_1_4 are created before the constant modifiers are updated, because the ps_1_4 conversion has better logic for handling them. However, the check to see if ps_1_4 conversion is needed is done after the constant modifiers are updated, because there is no reason to convert to ps_1_4 if the modifiers can be fixed without it. The ps_1_4 conversion code is more complex and has a higher chance of causing an issue.
  • ps_1_4 has the same arithmetic count limit of 8; however, it has two phases and can use 8 arithmetic instructions in each phase.
  • ps_1_4 allows 6 temporary registers, not 2 like the other ps_1_x versions. However, texture registers cannot be used outside of the texld type instructions, so some of the temporary registers need to be used for the texture registers.
  • Most of the arithmetic instructions need to be in the second phase. However, at least one instruction needs to be in the first phase. In some cases, if the texld instruction or the mov instruction is put in the first phase it will not assemble, so the code prefers to add all these instructions to the second phase.
  • If the code is unable to put all instructions in the second phase it creates a def instruction with an unused constant to ensure that there is at least one instruction in the first phase.
  • All def instructions must be in the first phase.
  • All texld instructions must come before any instruction that uses any arithmetic count, but only in that phase.
  • All temporary registers are renumbered so that they do not collide with the numbers being used for the texture registers. For example, if t1 is used then r1 would need to be changed to something like r2 or r3.
  • Texture registers always use the same number as their corresponding temporary registers. For example r0 is always used to hold texture t0, if t0 exists.
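
The first two gating notes (version and unsupported instructions) could be sketched like this. It is a simplified illustration, not the merged code: the function name is hypothetical, the unsupported list contains only the four instructions named above (the real list is longer, as the "etc." indicates), and plain substring search is an assumption that would also match text inside comments.

```cpp
#include <string>

// Hypothetical helper (not from d3d8to9): decides whether a ps_1_4
// conversion is worth attempting for a disassembled pixel shader.
bool CanAttemptPs14Conversion(const std::string &source)
{
    // Only ps_1_0 through ps_1_3 are converted.
    static const char *const versions[] = { "ps_1_0", "ps_1_1", "ps_1_2", "ps_1_3" };
    bool versionOk = false;
    for (const char *version : versions)
    {
        if (source.find(version) != std::string::npos)
            versionOk = true;
    }
    if (!versionOk)
        return false;

    // Incomplete list: only the instructions named in the description.
    // A substring match on "texm3x3" also covers its variants.
    static const char *const unsupported[] = { "texbem", "texcoord", "texm3x3", "texreg2rgb" };
    for (const char *name : unsupported)
    {
        if (source.find(name) != std::string::npos)
            return false;
    }
    return true;
}
```

In this sketch a shader that fails the check would simply keep the previous constant modifier handling, which matches the fallback behavior described in the notes.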

Basic conversion logic:

  • The ps_1_4 conversion code will break the SourceCode up line-by-line and evaluate each line independently.
  • There is special code handling ps_x_x lines and the def and tex instructions.
    • ps_x_x lines are ignored.
    • def lines are added to phase 1.
    • tex lines are converted to texld.
  • There is special code to handle modifiers on constants.
    • If the same constant is used more than once with modifiers then the code will try and find an unused register and assign the constant to that register and then replace all the constants with that register.
    • Otherwise the code will just try to use a register that has not been used yet and replace only this one instance of the constant. This allows the register to be reused later.
  • Most of the remaining code has to do with converting texture registers into temporary registers.
    • Once a texture register is used for the last time then the destination register needs to be changed to match the texture register number. However, if the destination register is used later on in the code then it will only update the destination register the very last time it is used, so in this case it gets stored in a vector until it is needed. All updates to a register need to be done before assigning it back to the texture register.
    • If a texture register is used for the last time and the destination register is being used for the first time, then the destination register gets changed to match the texture register number, but only if it is not already a texture register number.
    • Co-issued commands are counted as part of the current line so that when checking if the texture is used it will ignore the next line if it is a co-issued line.
    • If a texture register is used for the last time and there are multiple different texture registers being used in the same line, then it will error out and stop the conversion. I could not figure out how to convert to ps_1_4 in this case. Note: as far as I can tell, the ENBSeries converter seems to crash in this case. More details are in the Star Wars Republic Commando part of the testing section below.
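
The first step of that logic, sorting lines into buckets, might look roughly like the sketch below. It only covers the special handling of ps_x_x, def and tex lines. It does not rename registers or handle constant modifiers, so its output is not yet a valid ps_1_4 shader. The `SplitShader` struct and `SplitLines` function are hypothetical names for this example.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for the three kinds of lines.
struct SplitShader
{
    std::vector<std::string> phase1; // def lines, kept for the first phase
    std::vector<std::string> texld;  // tex lines rewritten as texld
    std::vector<std::string> other;  // remaining lines, left untouched here
};

// Hypothetical helper (not from d3d8to9): evaluates each line independently.
SplitShader SplitLines(const std::string &source)
{
    SplitShader result;
    std::istringstream stream(source);
    std::string line;
    while (std::getline(stream, line))
    {
        std::istringstream words(line);
        std::string opcode, reg;
        if (!(words >> opcode))
            continue; // blank line
        if (opcode.compare(0, 3, "ps_") == 0)
            continue; // the version line is ignored
        if (opcode == "def")
            result.phase1.push_back(line);
        else if (opcode == "tex" && (words >> reg) && reg.size() > 1 && reg[0] == 't')
            // "tex t0" becomes "texld r0, t0": rN always holds texture tN.
            result.texld.push_back("texld r" + reg.substr(1) + ", " + reg);
        else
            result.other.push_back(line);
    }
    return result;
}
```

A later pass would then walk `other`, replace texture registers with their matching temporary registers and renumber the remaining temporaries as described above.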

Testing:

I tested this with the following games:

  • Grand Theft Auto 3
  • Grand Theft Auto Vice City
  • Haegemonia Legions of Iron
  • Haegemonia The Solon Heritage
  • Hitman 2 Silent Assassin
  • Max Payne
  • Max Payne 2
  • Need for Speed III
  • Need For Speed Hot Pursuit 2
  • Rayman 3
  • Serious Sam The First Encounter
  • Serious Sam The Second Encounter
  • Silent Hill 2
  • Star Wars Republic Commando
  • Star Wars Starfighter
  • True Crime New York City
  • Warcraft 3

However because ps_1_4 conversion is only attempted if an exact conversion cannot be done without it, only three of the games actually use this code. All other games either don't use pixel shaders or can do exact conversion without needing ps_1_4.

1. Silent Hill 2

This should be a complete fix for Silent Hill 2 pixel shader issues. I was able to do an exact conversion. It also fixes issue #56.

Here is what the conversion looks like for one of the pixel shaders:

Previous d3d8to9 conversion

    ps_1_1
    tex t0
    tex t1
    tex t2
    dp3_sat r0, t2_bx2, c1 /* removed modifier _bx2 */
    dp3 r1.xyz, t2_bx2, v1_bx2
  + sub r0.w, r0, c2.w /* changed 'add' to 'sub' removed modifier - */
    mul_sat r1.xyz, r1, t0.w
    mul_sat r0.xyz, r0.w, r1
    mul r1.xyz, t1, t0.w
    mad_sat r0.xyz, r0, c2, v0
    mul r0.xyz, r0, t0
  + mov r0.w, t0.w
    mad r0.xyz, r1, c3, r0

ENBSeries conversion

    ps_1_4
    texld r1, t1
    texld r2, t2
    mov r3, c1
    phase
    texld r0, t0
    dp3_sat r3, r2_bx2, r3_bx2
    dp3 r2.xyz, r2_bx2, v1_bx2
  + sub r3.w, r3.x, c2.w
    mul_sat r2.xyz, r2, r0.w
    mul_sat r3.xyz, r3.w, r2
    mul r2.xyz, r1, r0.w
    mad_sat r3.xyz, r3, c2, v0
    mul r3.xyz, r3, r0
    mad r0.xyz, r2, c3, r3

Updated d3d8to9 conversion

    ps_1_4 /* converted */
    mov r3, c1
    phase
    texld r2, t2
    texld r1, t1
    texld r0, t0
    dp3_sat r3, r2_bx2, r3_bx2
    dp3 r2.xyz, r2_bx2, v1_bx2
  + sub r3.w, r3, c2.w /* changed 'add' to 'sub' removed modifier - */
    mul_sat r2.xyz, r2, r0.w
    mul_sat r3.xyz, r3.w, r2
    mul r2.xyz, r1, r0.w
    mad_sat r3.xyz, r3, c2, v0
    mul r3.xyz, r3, r0
  + mov r3.w, r0.w
    mad r0.xyz, r2, c3, r3

2. True Crime New York City

This update also fixes True Crime New York City pixel shader issues. Previous fixes for #44 would simply remove constant modifiers.

The ENBSeries converter also tries to convert pixel shaders to ps_1_4, but seems to crash with this game.

Previous d3d8to9 conversion

    ps_1_1
    tex t0
    tex t1
    mul r0, v0, t0
    mul r1, v1, t1
    mul r0.xyz, r0, r0.w
    lrp r0.xyz, c0.w, r1, r0
  + mul r0.w, 1-r0.w, c0.w /* removed modifier 1- */

Updated d3d8to9 conversion

    ps_1_4 /* converted */
    def c1, 0, 0, 0, 0 /* added line */
    phase
    texld r1, t1
    texld r0, t0
    mov r2, c0
    mul r0, v0, r0
    mul r1, v1, r1
    mul r0.xyz, r0, r0.w
    lrp r0.xyz, c0.w, r1, r0
  + mul r0.w, 1-r0.w, 1-r2.w

3. Star Wars Republic Commando

Unfortunately, the code is not able to convert Star Wars Republic Commando to ps_1_4 because the pixel shaders here are already using 8 arithmetic instructions in phase 2 but they also require the mov instruction to be in phase 2. In addition, the pixel shaders here use more than one texture register in the same instruction, which I am not sure how to convert to ps_1_4. Any attempted conversion here seems to cause incorrect texture lighting.

The ps_1_4 conversion errors out for this game and it continues using the previous logic. However, the game is fully playable and has no ill side effects (that I could find), other than the modifier being removed from a constant. See the pixel shader code below.

The ENBSeries converter also crashes with this game. It seems that there is no easy way to convert these pixel shaders to ps_1_4.

Note: Of the games I tested above, this is the only one for which I am unable to do an exact conversion from the d3d8 pixel shader to the d3d9 pixel shader.

Previous/current d3d8to9 conversion

    ps_1_1
    tex t0
    tex t1
    tex t2
    tex t3
    dp3_sat r0, t1_bx2, t2_bx2
    dp3_sat r1, t1_bx2, t3_bx2
    mad r0.xyz, r0, v1, v0
    sub_x2_sat r1, r1.w, c0 /* changed 'mad' to 'sub' removed r1.w removed modifier - */
    mul_sat r1, r1, v1
    mul r1, r1, t1.w
    lrp r0, t0.w, t0, r0
    mad_sat r0, r0, t0, r1

Error when trying to convert to ps_1_4

> Failed to convert shader to ps_1_4
> Dumping translated shader assembly:

    ps_1_4 /* converted */
    mov r4, c0
    phase
    texld r3, t3
    texld r2, t2
    texld r1, t1
    texld r0, t0
    dp3_sat r2, r1_bx2, r2_bx2
    dp3_sat r3, r1_bx2, r3_bx2
    mad r2.xyz, r2, v1, v0
    mad_x2_sat r3, r3.w, r3.w, -r4
    mul_sat r3, r3, v1
    mul r3, r3, r1.w
    lrp r2, r0.w, r0, r2
    mad_sat r2, r2, r0, r3

// approximately 12 instruction slots used (4 texture, 8 arithmetic)

> Failed to reassemble shader:

D:\Games\Star Wars - Republic Commando\GameData\System\memory(11,5): error X5119: Read of uninitialized component(*) in r4: r/x/0 g/y/1 b/z/2 *a/w/3. Note that an unfortunate effect of the phase marker earlier in the shader is that the moment it is encountered in certain hardware, values previously written to alpha in any r# register, including the one noted here, are lost. In order to read alpha from an r# register after the phase marker, write to it first.

@crosire
Owner

crosire commented Nov 23, 2017

Amazing work as always!

@crosire crosire merged commit b873c6d into crosire:master Nov 23, 2017
@elishacloud elishacloud deleted the Convert-to-ps_1_4 branch November 23, 2017 20:00