Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | shader: Remove old shader management | ReinUsesLisp | 2021-07-23 | 28 | -4919/+0 |
| | |||||
* | Review 1 | Kelebek1 | 2021-02-15 | 1 | -1/+1 |
| | |||||
* | Implement texture offset support for TexelFetch and TextureGather and add offsets for Tlds | Kelebek1 | 2021-02-15 | 2 | -2/+10 |
| | | | | Formatting | ||||
* | video_core: Reimplement the buffer cache | ReinUsesLisp | 2021-02-13 | 1 | -0/+1 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reimplement the buffer cache using cached bindings and page level granularity for modification tracking. This also drops the usage of shared pointers and virtual functions from the cache. - Bindings are cached, allowing work to be skipped when the game changes few bits between draws. - OpenGL Assembly shaders no longer copy when a region has been modified from the GPU to emulate constant buffers; instead, GL_EXT_memory_object is used to alias sub-buffers within the same allocation. - OpenGL Assembly shaders stream constant buffer data using glProgramBufferParametersIuivNV, from NV_parameter_buffer_object. In theory this should save one hash table resolve inside the driver compared to glBufferSubData. - A new OpenGL stream buffer is implemented based on fences for drivers other than Nvidia's proprietary one, due to their low performance on partial glBufferSubData calls synchronized with 3D rendering (which some games use a lot). - Most optimizations are shared between APIs now, allowing Vulkan to cache more bindings than before, skipping unnecessary work. This commit adds the necessary infrastructure to use Vulkan objects from OpenGL. Overall, it improves performance and fixes some bugs present in the old cache. There are still some edge cases hit by some games that harm performance on some vendors; these are planned to be fixed in later commits. | ||||
* | half_set: Resolve -Wmaybe-uninitialized warnings | Lioncash | 2020-12-30 | 1 | -7/+7 |
| | |||||
* | video_core: Rewrite the texture cache | ReinUsesLisp | 2020-12-30 | 2 | -32/+35 |
| | | | | | | | | | | | | | | The current texture cache has several points that hurt maintainability and performance. It's easy to break unrelated parts of the cache when doing minor changes. The cache can easily forget valuable information about the cached textures by CPU writes or simply by its normal usage. This commit aims to address those issues. | ||||
* | video_core: Remove unnecessary enum class casting in logging messages | Lioncash | 2020-12-07 | 9 | -48/+38 |
| | | | | | | | fmt now automatically prints the numeric value of an enum class member by default, so we don't need to use casts any more. Reduces the line noise a bit. | ||||
* | video_core: Resolve more variable shadowing scenarios pt.3 | Lioncash | 2020-12-05 | 1 | -3/+4 |
| | | | | | Cleans out the rest of the occurrences of variable shadowing and makes any further occurrences of shadowing compiler errors. | ||||
* | video_core: Resolve more variable shadowing scenarios pt.2 | Lioncash | 2020-12-05 | 2 | -10/+10 |
| | | | | | | | Migrates the video core code closer to enabling variable shadowing warnings as errors. This primarily sorts out shadowing occurrences within the Vulkan code. | ||||
* | Merge pull request #3681 from lioncash/component | Rodrigo Locatti | 2020-11-24 | 1 | -2/+2 |
|\ | | | | | decoder/image: Fix incorrect G24R8 component sizes in GetComponentSize() | ||||
| * | decode/image: Fix typo in assert in GetComponentSize() | Lioncash | 2020-04-16 | 1 | -3/+3 |
| | | |||||
| * | decoder/image: Fix incorrect G24R8 component sizes in GetComponentSize() | Lioncash | 2020-04-16 | 1 | -2/+2 |
| | | | | | | | | The components' sizes were mismatched. This corrects that. | ||||
* | | Merge pull request #4854 from ReinUsesLisp/cube-array-shadow | bunnei | 2020-11-06 | 1 | -1/+0 |
|\ \ | | | | | | | shader: Partially implement texture cube array shadow | ||||
| * | | shader: Partially implement texture cube array shadow | ReinUsesLisp | 2020-10-28 | 1 | -1/+0 |
| | | | | | | | | | | | | | | | | | | | | | | | | This implements texture cube arrays with shadow comparisons but doesn't fix the asserts related to it. Fixes out of bounds reads on swizzle constructors and makes them use bounds checked ::at instead of the unsafe operator[]. | ||||
* | | | shader/arithmetic: Implement FCMP immediate + register variant | ReinUsesLisp | 2020-10-28 | 1 | -1/+2 |
|/ / | | | | | | | Trivially add the encoding for this. | ||||
* | | shader/texture: Implement CUBE texture type for TMML and fix arrays | ReinUsesLisp | 2020-10-08 | 1 | -19/+22 |
| | | | | | | | | | | | | | | | | TMML takes an array argument that has no known meaning; it appears as the first component in gpr8, followed by s, t and r. Skip this component when arrays are being used. Also implement CUBE texture types. - Used by Pikmin 3: Deluxe Demo. | ||||
* | | arithmetic_integer_immediate: Make use of std::move where applicable | Lioncash | 2020-09-24 | 1 | -16/+19 |
| | | | | | | | | | | Same behavior, minus any redundant atomic reference count increments and decrements. | ||||
* | | Merge pull request #4672 from lioncash/narrowing | Rodrigo Locatti | 2020-09-17 | 1 | -1/+1 |
|\ \ | | | | | | | decoder/texture: Eliminate narrowing conversion in GetTldCode() | ||||
| * | | decoder/texture: Eliminate narrowing conversion in GetTldCode() | Lioncash | 2020-09-17 | 1 | -1/+1 |
| | | | | | | | | | | | | The assignment was previously truncating a u64 value to a bool. | ||||
* | | | decode/image: Eliminate switch fallthrough in DecodeImage() | Lioncash | 2020-09-17 | 1 | -0/+1 |
|/ / | | | | | | | | | Fortunately this didn't result in any issues, given the block that code was falling through to would immediately break. | ||||
* | | video_core: Enforce -Werror=switch | ReinUsesLisp | 2020-09-16 | 2 | -4/+13 |
| | | | | | | | | This forces us to fix all -Wswitch warnings in video_core. | ||||
* | | shader/memory: Amend UNIMPLEMENTED_IF_MSG without a message | Lioncash | 2020-08-14 | 1 | -1/+2 |
| | | | | | | | | | | We need to provide a message for this variant of the macro, so we can simply log out the type being used. | ||||
* | | General: Tidy up clang-format warnings part 2 | Lioncash | 2020-08-13 | 1 | -3/+3 |
| | | |||||
* | | Merge pull request #4391 from lioncash/nrvo | bunnei | 2020-07-24 | 3 | -20/+20 |
|\ \ | | | | | | | video_core: Allow copy elision to take place where applicable | ||||
| * | | video_core: Allow copy elision to take place where applicable | Lioncash | 2020-07-21 | 3 | -20/+20 |
| | | | | | | | | | | | | | | | Removes const from some variables that are returned from functions, as this allows the move assignment/constructors to execute for them. | ||||
* | | | Merge pull request #4361 from ReinUsesLisp/lane-id | Rodrigo Locatti | 2020-07-21 | 1 | -2/+1 |
|\ \ \ | | | | | | | | | decode/other: Implement S2R.LaneId | ||||
| * | | | decode/other: Implement S2R.LaneId | ReinUsesLisp | 2020-07-16 | 1 | -2/+1 |
| |/ / | | | | | | | | | | | | | | | | This maps to host's thread id. - Fixes graphical issues on Paper Mario. | ||||
* / / | video_core: Rearrange pixel format names | ReinUsesLisp | 2020-07-13 | 1 | -27/+27 |
|/ / | | | | | | | | | | Normalizes pixel format names to match Vulkan names. Before this commit, pixel format names followed no convention, leading to confusion and potential bugs. | ||||
* | | Merge pull request #4147 from ReinUsesLisp/hset2-imm | bunnei | 2020-06-27 | 1 | -21/+67 |
|\ \ | | | | | | | shader/half_set: Implement HSET2_IMM | ||||
| * | | shader/half_set: Implement HSET2_IMM | ReinUsesLisp | 2020-06-23 | 1 | -21/+67 |
| | | | | | | | | | | | | | | | | | | Add HSET2_IMM. Due to the complexity of the encoding avoid using BitField unions and read the relevant bits from the code itself. This is less error prone. | ||||
* | | | decode/image: Implement B10G11R11F | Morph | 2020-06-20 | 1 | -9/+17 |
|/ / | | | | | | | - Used by Kirby Star Allies | ||||
* | | shader/texture: Join separate image and sampler pairs offline | ReinUsesLisp | 2020-06-05 | 1 | -18/+37 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Games using D3D idioms can join images and samplers when a shader executes, instead of baking them into a combined sampler image. This is also possible on Vulkan. One approach would be to use separate samplers on Vulkan and leave this unimplemented on OpenGL, but we can't do this because there's no consistent way of determining which constant buffer holds a sampler and which one an image. We could in theory find the first bit and if it's in the TIC area, it's an image; but this falls apart when an image or sampler handle uses an index of zero. The approach used is to track a LOP.OR operation (this is done at an IR level, not at an ISA level), track again the constant buffers used as source and store this pair. Then, outside of shader execution, join the sampler and image pair with a bitwise or operation. This approach won't work on games that truly use separate samplers in a meaningful way. For example, pooling textures in a 2D array and determining at runtime what sampler to use. This invalidates OpenGL's disk shader cache :) - Used mostly by D3D ports to Switch | ||||
* | | Merge pull request #4016 from ReinUsesLisp/invocation-info | LC | 2020-06-02 | 1 | -1/+1 |
|\ \ | | | | | | | shader/other: Fix hardcoded value in S2R INVOCATION_INFO | ||||
| * | | shader/other: Fix hardcoded value in S2R INVOCATION_INFO | ReinUsesLisp | 2020-05-30 | 1 | -1/+1 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Geometry shaders built from Nvidia's compiler check for bits[16:23] to be less than or equal to 0 with VSETP to default to a "safe" value of 0x8000'0000 (safe from hardware's perspective). To avoid hitting this path in the shader, return 0x00ff'0000 from S2R INVOCATION_INFO. This seems to be the maximum number of vertices a geometry shader can emit in a primitive. | ||||
* | | | shader/other: Implement MEMBAR.CTS | ReinUsesLisp | 2020-05-27 | 1 | -2/+12 |
|/ / | | | | | | | | | This silences an assertion we were hitting and uses workgroup memory barriers when the game requests it. | ||||
* | | Merge pull request #3981 from ReinUsesLisp/bar | bunnei | 2020-05-26 | 1 | -0/+5 |
|\ \ | | | | | | | shader/other: Implement BAR.SYNC 0x0 | ||||
| * | | shader/other: Implement BAR.SYNC 0x0 | ReinUsesLisp | 2020-05-22 | 1 | -0/+5 |
| | | | | | | | | | | | | | | | Trivially implement this particular case of BAR. Unless games use OpenCL or CUDA barriers, we shouldn't hit any other case here. | ||||
* | | | Merge pull request #3980 from ReinUsesLisp/red-op | bunnei | 2020-05-26 | 1 | -2/+1 |
|\ \ \ | | | | | | | | | shader/memory: Implement non-addition operations in RED | ||||
| * | | | shader/memory: Implement non-addition operations in RED | ReinUsesLisp | 2020-05-22 | 1 | -2/+1 |
| |/ / | | | | | | | | | | Trivially implement these instructions. They are used in Astral Chain. | ||||
* / / | shader/other: Implement thread comparisons (NV_shader_thread_group) | ReinUsesLisp | 2020-05-22 | 1 | -0/+21 |
|/ / | | | | | | | | | | | | | | | | | | | | | Hardware S2R special registers match gl_Thread*MaskNV. We can trivially implement these using Nvidia's extension on OpenGL or naively stubbing them with the ARB instructions to match. This might cause issues if the host device warp size doesn't match Nvidia's. That said, this is unlikely on proper shaders. Refer to the attached url for more documentation about these flags. https://www.khronos.org/registry/OpenGL/extensions/NV/NV_shader_thread_group.txt | ||||
* | | shader_ir: Separate float-point comparisons in ordered and unordered | ReinUsesLisp | 2020-05-09 | 1 | -6/+6 |
| | | | | | | | | | | This allows us to use native SPIR-V instructions without having to manually check for NAN. | ||||
* | | Merge pull request #3693 from ReinUsesLisp/clean-samplers | bunnei | 2020-05-02 | 2 | -94/+116 |
|\ \ | | | | | | | shader/texture: Support multiple unknown sampler properties | ||||
| * | | shader/texture: Support multiple unknown sampler properties | ReinUsesLisp | 2020-04-23 | 1 | -51/+74 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows deducing some properties from the texture instruction before asking the runtime. By doing this we can handle type mismatches in some instructions from the renderer instead of the shader decoder. Fixes texelFetch issues with games using 2D texture instructions on a 1D sampler. | ||||
| * | | shader_ir: Turn classes into data structures | ReinUsesLisp | 2020-04-23 | 2 | -59/+58 |
| | | | |||||
* | | | shader/arithmetic_integer: Fix tracking issue in temporary | ReinUsesLisp | 2020-04-28 | 1 | -4/+0 |
| | | | | | | | | | | | | | | | This temporary is not needed as we mark Rd.CC + IADD.X as unimplemented. It caused issues when tracking global buffers. | ||||
* | | | shader/arithmetic_integer: Fix edge case and mark IADD.X Rd.CC as unimplemented | ReinUsesLisp | 2020-04-26 | 1 | -1/+6 |
| | | | | | | | | | | | | | | | IADD.X Rd.CC requires some extra logic that is not currently implemented. Abort when this is hit. | ||||
* | | | shader/arithmetic_integer: Change IAdd to UAdd to avoid signed overflow | ReinUsesLisp | 2020-04-26 | 1 | -2/+2 |
| | | | | | | | | | | | | | | | | | | Signed integer addition overflow might be undefined behavior. It's free to change operations to UAdd and use unsigned integers to avoid potential bugs. | ||||
* | | | shader/arithmetic_integer: Implement IADD.X | ReinUsesLisp | 2020-04-26 | 1 | -0/+6 |
| | | | | | | | | | | | | | | | IADD.X takes the carry flag and adds it to the result. This is generally used to emulate 64-bit operations with 32-bit registers. | ||||
* | | | shader/arithmetic_integer: Implement CC for IADD | ReinUsesLisp | 2020-04-26 | 1 | -3/+19 |
| | | | |||||
* | | | decode/register_set_predicate: Implement CC | ReinUsesLisp | 2020-04-26 | 1 | -9/+14 |
| | | | | | | | | | | | | | | | | | | P2R CC takes the state of condition codes and puts them into a register. We already have this implemented for PR (predicates). This commit implements CC over that. | ||||
* | | | decode/register_set_predicate: Use move for shared pointers | ReinUsesLisp | 2020-04-26 | 1 | -16/+17 |
| | | | | | | | | | | | | Avoid atomic counters used by shared pointers. | ||||
* | | | Merge pull request #3734 from ReinUsesLisp/half-float-mods | bunnei | 2020-04-25 | 1 | -14/+37 |
|\ \ \ | | | | | | | | | decode/arithmetic_half: Fix HADD2 and HMUL2 absolute and negation bits | ||||
| * | | | decode/arithmetic_half: Fix HADD2 and HMUL2 absolute and negation bits | ReinUsesLisp | 2020-04-23 | 1 | -14/+37 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The encoding for negation and absolute value was wrong. Extracting is now done manually. Similar instructions having different encodings is the rule, not the exception. To keep sanity and readability I preferred to extract the desired bit manually. This is implemented against nxas: https://github.com/ReinUsesLisp/nxas/blob/8dbc38995711cc12206aa370145a3a02665fd989/table.h#L68 That is itself tested against nvdisasm (Nvidia's official disassembler). | ||||
* | | | | Merge pull request #3749 from ReinUsesLisp/lea-imm | bunnei | 2020-04-24 | 1 | -2/+2 |
|\ \ \ \ | |_|/ / |/| | | | shader/arithmetic_integer: Fix LEA_IMM encoding | ||||
| * | | | shader/arithmetic_integer: Fix LEA_IMM encoding | ReinUsesLisp | 2020-04-21 | 1 | -2/+2 |
| |/ / | | | | | | | | | | | | | | | | | | | The operand order in LEA_IMM was flipped compared to nvdisasm. Fix that using nxas as reference: https://github.com/ReinUsesLisp/nxas/blob/8dbc38995711cc12206aa370145a3a02665fd989/table.h#L122 | ||||
* | | | decode/memory: Resolve unused variable warning | Lioncash | 2020-04-17 | 1 | -1/+1 |
| | | | | | | | | | | | | Only the first element of the returned pair is ever used. | ||||
* | | | decode/texture: Resolve unused variable warnings. | Lioncash | 2020-04-17 | 1 | -5/+7 |
| | | | | | | | | | | | | | | | | | | | | | | | | Some variables aren't used, so we can remove these. Unfortunately, diagnostics are still reported on structured bindings even when annotated with [[maybe_unused]], so we need to unpack the elements that we want to use manually. | ||||
* | | | decode/texture: Collapse loop down into std::generate | Lioncash | 2020-04-17 | 1 | -3/+1 |
| | | | | | | | | | | | | Same behavior, less code. | ||||
* | | | decode/texture: Eliminate trivial missing field initializer warnings | Lioncash | 2020-04-17 | 1 | -3/+4 |
|/ / | | | | | | | We can just specify the initializers. | ||||
* | | Merge pull request #3673 from lioncash/extra | bunnei | 2020-04-17 | 1 | -3/+8 |
|\ \ | | | | | | | CMakeLists: Specify -Wextra on linux builds | ||||
| * | | CMakeLists: Specify -Wextra on linux builds | Lioncash | 2020-04-16 | 1 | -3/+8 |
| |/ | | | | | | | | | | | | | | | | | | | | Allows reporting more cases where logic errors may exist, such as implicit fallthrough cases, etc. We currently ignore unused parameters, since we currently have many cases where this is intentional (virtual interfaces). While we're at it, we can also tidy up any existing code that causes warnings. This also uncovered a few bugs. | ||||
* / | decode/shift: Remove unused variable within Shift() | Lioncash | 2020-04-16 | 1 | -1/+0 |
|/ | | | | | Removes a redundant variable that is already satisfied by the IsFull() utility function. | ||||
* | Merge pull request #3612 from ReinUsesLisp/red | Fernando Sahmkow | 2020-04-15 | 1 | -43/+57 |
|\ | | | | | shader/memory: Implement RED.E.ADD and minor changes to ATOM | ||||
| * | shader/memory: Implement RED.E.ADD | ReinUsesLisp | 2020-04-06 | 1 | -1/+15 |
| | | | | | | | | | | | | | | | | Implements a reduction operation. It's an atomic operation that doesn't return a value. This commit introduces another primitive because some shading languages might have a primitive for reduction operations. | ||||
| * | shader/memory: Add "using std::move" | ReinUsesLisp | 2020-04-06 | 1 | -11/+13 |
| | | |||||
| * | shader/memory: Minor fixes in ATOM | ReinUsesLisp | 2020-04-06 | 1 | -32/+30 |
| | | |||||
* | | shader/arithmetic: Add FCMP_CR variant | ReinUsesLisp | 2020-04-15 | 1 | -1/+2 |
| | | | | | | | | Adds another variant of FCMP. | ||||
* | | Merge pull request #3619 from ReinUsesLisp/i2i | Mat M | 2020-04-13 | 1 | -13/+100 |
|\ \ | | | | | | | shader/conversion: Implement I2I sign extension, saturation and selection | ||||
| * | | shader/conversion: Implement I2I sign extension, saturation and selection | ReinUsesLisp | 2020-04-07 | 1 | -13/+100 |
| | | | | | | | | | | | | | | | | | | | | | | | | Reimplements I2I adding sign extension, saturation (clamp source value to the destination), selection and destination sizes that are not 32 bits wide. It doesn't implement CC yet. | ||||
* | | | Merge pull request #3633 from ReinUsesLisp/clean-texdec | Mat M | 2020-04-13 | 1 | -14/+0 |
|\ \ \ | | | | | | | | | shader/texture: Remove type mismatches management from shader decoder | ||||
| * | | | shader/texture: Remove type mismatches management from shader decoder | ReinUsesLisp | 2020-04-10 | 1 | -14/+0 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit e22816a5bb we handle type mismatches from the CPU. We don't need to hack our shader decoder due to game bugs anymore. Removed in this commit. | ||||
* | | | | Merge pull request #3578 from ReinUsesLisp/vmnmx | Fernando Sahmkow | 2020-04-12 | 1 | -0/+58 |
|\ \ \ \ | |/ / / |/| | | | shader/video: Partially implement VMNMX | ||||
| * | | | shader/video: Partially implement VMNMX | ReinUsesLisp | 2020-04-12 | 1 | -0/+58 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implements the common usages for VMNMX. Inputs with a different size than 32 bits are not supported and sign mismatches aren't supported either. VMNMX works as follows: It grabs Ra and Rb and applies a maximum/minimum on them (this is defined by .MX), taking the input sign into account. This result can then be saturated. After the intermediate result is calculated, it applies another operation on it using Rc. These operations are merges, accumulations or another min/max pass. This instruction allows a more flexible implementation of GCN's min3 and max3 instructions (for instance). | ||||
* | | | | Merge pull request #3601 from ReinUsesLisp/some-shader-encodings | bunnei | 2020-04-09 | 1 | -3/+9 |
|\ \ \ \ | | | | | | | | | | | video_core/shader: Add some instruction and S2R encodings | ||||
| * | | | | shader/other: Add error message for some S2R registers | ReinUsesLisp | 2020-04-04 | 1 | -0/+6 |
| | | | | | |||||
| * | | | | shader_bytecode: Rename MOV_SYS to S2R | ReinUsesLisp | 2020-04-04 | 1 | -3/+3 |
| | |_|/ | |/| | | |||||
* | | | | Merge pull request #3489 from namkazt/patch-2 | Rodrigo Locatti | 2020-04-07 | 1 | -11/+349 |
|\ \ \ \ | |_|_|/ |/| | | | shader: implement SULD.D bits32/64 | ||||
| * | | | address nit. | Nguyen Dac Nam | 2020-04-07 | 1 | -1/+1 |
| | | | | |||||
| * | | | Apply suggestions from code review | Nguyen Dac Nam | 2020-04-07 | 1 | -9/+9 |
| | | | | | | | | | | | | Co-Authored-By: Rodrigo Locatti <reinuseslisp@airmail.cc> | ||||
| * | | | shader_decode: SULD.D using std::pair instead of out parameter | namkazy | 2020-04-06 | 1 | -17/+13 |
| | | | | |||||
| * | | | shader_decode: SULD.D avoid duplicate code block. | namkazy | 2020-04-06 | 1 | -39/+2 |
| | | | | |||||
| * | | | shader_decode: SULD.D fix conversion error. | namkazy | 2020-04-06 | 1 | -3/+3 |
| | | | | |||||
| * | | | shader_decode: SULD.D implement bits64 and reverse shader ir init method to removed shader stage. | namkazy | 2020-04-06 | 1 | -35/+92 |
| | | | | |||||
| * | | | silent warning (conversion error) | namkazy | 2020-04-05 | 1 | -3/+2 |
| | | | | |||||
| * | | | shader_decode: SULD.D -> SINT actually same as UNORM. | namkazy | 2020-04-05 | 1 | -5/+4 |
| | | | | |||||
| * | | | shader_decode: SULD.D fix decode SNORM component | namkazy | 2020-04-05 | 1 | -10/+9 |
| | | | | |||||
| * | | | clang-format | namkazy | 2020-04-05 | 1 | -2/+2 |
| | | | | |||||
| * | | | shader_decode: get sampler descriptor from registry. | namkazy | 2020-04-05 | 1 | -77/+93 |
| | | | | |||||
| * | | | tweaking. | namkazy | 2020-04-05 | 1 | -3/+3 |
| | | | | |||||
| * | | | cleanup unuse params | namkazy | 2020-04-05 | 1 | -8/+6 |
| | | | | |||||
| * | | | cleanup debug code. | namkazy | 2020-04-05 | 1 | -14/+3 |
| | | | | |||||
| * | | | reimplement get component type, uncomment mistaken code | namkazy | 2020-04-05 | 1 | -18/+93 |
| | | | | |||||
| * | | | remove disable optimize | namkazy | 2020-04-05 | 1 | -2/+0 |
| | | | | |||||
| * | | | [wip] reimplement SULD.D | namkazy | 2020-04-05 | 1 | -22/+229 |
| | | | | |||||
| * | | | clang-fix | Nguyen Dac Nam | 2020-04-05 | 1 | -1/+1 |
| | | | | |||||
| * | | | shader: image - import PredCondition | Nguyen Dac Nam | 2020-04-05 | 1 | -0/+1 |
| | | | | |||||
| * | | | shader: SULD.D bits32 implement more complexer method. | Nguyen Dac Nam | 2020-04-05 | 1 | -4/+28 |
| | | | | |||||
| * | | | shader: SULD.D import StoreType | Nguyen Dac Nam | 2020-04-05 | 1 | -0/+1 |
| | | | | |||||
| * | | | shader: implement SULD.D bits32 | Nguyen Dac Nam | 2020-04-05 | 1 | -11/+27 |
| |/ / | |||||
* | | | Merge pull request #3592 from ReinUsesLisp/ipa | Fernando Sahmkow | 2020-04-06 | 1 | -15/+21 |
|\ \ \ | |/ / |/| | | shader_decompiler: Remove FragCoord.w hack and change IPA implementation | ||||
| * | | shader_decompiler: Remove FragCoord.w hack and change IPA implementation | ReinUsesLisp | 2020-04-02 | 1 | -15/+21 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Credits go to gdkchan and Ryujinx. The pull request used for this can be found here: https://github.com/Ryujinx/Ryujinx/pull/1082 yuzu was already using the header for interpolation, but it was missing the FragCoord.w multiplication described in the linked pull request. This commit finally removes the FragCoord.w == 1.0f hack from the shader decompiler. While we are at it, this commit renames some enumerations to match Nvidia's documentation (linked below) and fixes component declaration order in the shader program header (z and w were swapped). https://github.com/NVIDIA/open-gpu-doc/blob/master/Shader-Program-Header/Shader-Program-Header.html | ||||
* | | | shader/memory: Silence no return value warning | ReinUsesLisp | 2020-04-02 | 1 | -0/+3 |
|/ / | | | | | | | Silences a warning about control paths not all returning a value. | ||||
* | | Merge pull request #3561 from ReinUsesLisp/f2f-conversion | Fernando Sahmkow | 2020-03-31 | 1 | -5/+10 |
|\ \ | | | | | | | shader/conversion: Fix F2F rounding operations with different sizes | ||||
| * | | shader/conversion: Fix F2F rounding operations with different sizes | ReinUsesLisp | 2020-03-26 | 1 | -5/+10 |
| |/ | | | | | | | | | | | | | | | | Rounding operations only matter when the conversion size of source and destination is the same, i.e. .F16.F16, .F32.F32 and .F64.F64. When there is a mismatch (.F16.F32), these bits are used for IEEE rounding. We don't emulate this because GLSL and SPIR-V don't support configuring it per operation. | ||||
* | | Merge pull request #3577 from ReinUsesLisp/lea | Fernando Sahmkow | 2020-03-31 | 1 | -11/+4 |
|\ \ | | | | | | | shader/lea: Fix LEA implementation | ||||
| * | | shader/lea: Simplify generated LEA code | ReinUsesLisp | 2020-03-28 | 1 | -3/+2 |
| | | | |||||
| * | | shader/lea: Fix op_a and op_b usages | ReinUsesLisp | 2020-03-27 | 1 | -2/+2 |
| | | | | | | | | | | | | They were swapped. | ||||
| * | | shader/lea: Remove const and use move when possible | ReinUsesLisp | 2020-03-27 | 1 | -11/+5 |
| |/ | |||||
* | | clang-format | Nguyen Dac Nam | 2020-03-31 | 1 | -2/+1 |
| | | |||||
* | | shader_decode: fix by suggestion | Nguyen Dac Nam | 2020-03-31 | 1 | -27/+22 |
| | | |||||
* | | clang-format | namkazy | 2020-03-30 | 1 | -3/+3 |
| | | |||||
* | | shader_decode: ATOM/ATOMS: add function to avoid code repetition | namkazy | 2020-03-30 | 1 | -70/+39 |
| | | |||||
* | | shader_decode: implement ATOM operation for S32 and U32 | Nguyen Dac Nam | 2020-03-30 | 1 | -6/+39 |
| | | |||||
* | | clang-format | namkazy | 2020-03-30 | 1 | -3/+3 |
| | | |||||
* | | shader_decode: implement ATOMS instr partial. | Nguyen Dac Nam | 2020-03-30 | 1 | -10/+42 |
|/ | |||||
* | xmad: fix clang build error | makigumo | 2020-03-23 | 1 | -4/+5 |
| | |||||
* | Merge pull request #3505 from namkazt/patch-8 | bunnei | 2020-03-19 | 1 | -15/+48 |
|\ | | | | | shader_decode: implement XMAD mode CSfu | ||||
| * | nit & remove some optional param | Nguyen Dac Nam | 2020-03-13 | 1 | -10/+11 |
| | | |||||
| * | shader_decode: implement XMAD mode CSfu | Nguyen Dac Nam | 2020-03-13 | 1 | -9/+41 |
| | | |||||
* | | Merge pull request #3502 from namkazt/patch-3 | Rodrigo Locatti | 2020-03-16 | 1 | -21/+48 |
|\ \ | | | | | | | shader_decode: Reimplement BFE instructions | ||||
| * | | clang-format | Nguyen Dac Nam | 2020-03-14 | 1 | -2/+1 |
| | | | |||||
| * | | nit | Nguyen Dac Nam | 2020-03-14 | 1 | -1/+1 |
| | | | |||||
| * | | clang-format | Nguyen Dac Nam | 2020-03-13 | 1 | -4/+8 |
| | | | |||||
| * | | Apply suggestions from code review | Nguyen Dac Nam | 2020-03-13 | 1 | -5/+5 |
| | | | | | | | | | Co-Authored-By: Mat M. <mathew1800@gmail.com> | ||||
| * | | shader_decode: BFE add ref of reverse parallel method. | Nguyen Dac Nam | 2020-03-13 | 1 | -0/+3 |
| | | | |||||
| * | | shader_decode: implement BREV on BFE | Nguyen Dac Nam | 2020-03-13 | 1 | -6/+25 |
| | | | | | | | | Implement bit reversal following the parallel method described at: https://graphics.stanford.edu/~seander/bithacks.html#ReverseParallel | ||||
| * | | shader_decode: Reimplement BFE instructions | Nguyen Dac Nam | 2020-03-13 | 1 | -25/+27 |
| |/ | |||||
* / | video_core: Rename "const buffer locker" to "registry" | ReinUsesLisp | 2020-03-09 | 1 | -2/+3 |
|/ | |||||
* | shader: FMUL switch to using LUT (#3441) | Nguyen Dac Nam | 2020-02-27 | 1 | -19/+14 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | * shader: add FmulPostFactor LUT table * shader: FMUL apply LUT * Update src/video_core/engines/shader_bytecode.h Co-Authored-By: Mat M. <mathew1800@gmail.com> * nit: mistype * clang-format & add missing import * shader: remove post factor LUT. * shader: move post factor LUT to function and fix incorrect order. * clang-format * shader: FMUL: add static to post factor LUT * nit: typo Co-authored-by: Mat M. <mathew1800@gmail.com> | ||||
* | Merge pull request #3440 from namkazt/patch-6 | bunnei | 2020-02-26 | 1 | -36/+58 |
|\ | | | | | shader: implement LOP3 fast replace for old function | ||||
| * | nit: add const to where it need. | Nguyen Dac Nam | 2020-02-21 | 1 | -14/+14 |
| | | |||||
| * | shader: implement LOP3 fast replace for old function | Nguyen Dac Nam | 2020-02-21 | 1 | -36/+58 |
| | | | | | | ref: https://devtalk.nvidia.com/default/topic/1070081/cuda-programming-and-performance/reverse-lut-for-lop3-lut/ | ||||
* | | shader/texture: Fix illegal 3D texture assert | ReinUsesLisp | 2020-02-21 | 1 | -1/+1 |
|/ | | | | | Fix typo in the illegal 3D texture assert logic. We care about catching arrayed 3D textures or 3D shadow textures, not regular 3D textures. | ||||
* | Merge pull request #3415 from ReinUsesLisp/texture-code | bunnei | 2020-02-20 | 1 | -43/+28 |
|\ | | | | | shader/texture: Allow 2D shadow arrays and simplify code | ||||
| * | shader/texture: Allow 2D shadow arrays and simplify code | ReinUsesLisp | 2020-02-15 | 1 | -43/+28 |
| | | | | | | | | | | | | | | Shadow sampler 2D arrays are supported on OpenGL, so there's no reason to forbid these. Enable textureLod usage on these. Minor style changes. | ||||
* | | shader_conversion: I2F : add Assert for case src_size is Short | Nguyen Dac Nam | 2020-02-19 | 1 | -0/+3 |
| | | |||||
* | | fix warning | Nguyen Dac Nam | 2020-02-19 | 1 | -1/+1 |
| | | |||||
* | | clang-format fix | Nguyen Dac Nam | 2020-02-19 | 1 | -1/+1 |
| | | |||||
* | | shader_conversion: add conversion I2F for Short | Nguyen Dac Nam | 2020-02-19 | 1 | -9/+6 |
|/ | |||||
* | Merge pull request #3379 from ReinUsesLisp/cbuf-offset | bunnei | 2020-02-14 | 2 | -3/+3 |
|\ | | | | | shader/decode: Fix constant buffer offsets | ||||
| * | shader/decode: Fix constant buffer offsets | ReinUsesLisp | 2020-02-05 | 2 | -3/+3 |
| | | | | | | | | | | | Some instances were using cbuf34.offset instead of cbuf34.GetOffset(). This returned an invalid offset. Address those instances and rename offset to "shifted_offset" to avoid future bugs. | ||||
* | | Merge pull request #3369 from ReinUsesLisp/shf | bunnei | 2020-02-08 | 1 | -11/+102 |
|\ \ | |/ |/| | shader/shift: Implement SHF | ||||
| * | shader/shift: Implement SHIFT_RIGHT_{IMM,R} | ReinUsesLisp | 2020-02-02 | 1 | -26/+58 |
| | | | | | | | | Shifts a pair of registers to the right and returns the low register. | ||||
| * | shader/shift: Implement SHF_LEFT_{IMM,R} | ReinUsesLisp | 2020-02-02 | 1 | -10/+69 |
| | | | | | | | | Shifts a pair of registers to the left and returns the high register. | ||||
* | | Merge pull request #3357 from ReinUsesLisp/bfi-rc | bunnei | 2020-02-04 | 1 | -2/+5 |
|\ \ | | | | | | | shader/bfi: Implement register-constant buffer variant | ||||
| * | | shader/bfi: Implement register-constant buffer variant | ReinUsesLisp | 2020-01-27 | 1 | -2/+5 |
| | | | | | | | | | | | | | | | It's the same as the variant that was implemented, but it takes the operands from another source. | ||||
* | | | Merge pull request #3356 from ReinUsesLisp/fcmp | bunnei | 2020-02-04 | 1 | -1/+10 |
|\ \ \ | | | | | | | | | shader/arithmetic: Implement FCMP | ||||
| * | | | shader/arithmetic: Implement FCMP | ReinUsesLisp | 2020-01-27 | 1 | -1/+10 |
| |/ / | | | | | | | | | | | | | Compares the third operand with zero, then selects between the first and second. | ||||
* | | | Merge pull request #3337 from ReinUsesLisp/vulkan-staged | bunnei | 2020-02-03 | 1 | -3/+6 |
|\ \ \ | | | | | | | | | yuzu: Implement Vulkan frontend | ||||
| * | | | shader/other: Fix skips for SYNC and BRK | ReinUsesLisp | 2020-01-29 | 1 | -2/+2 |
| | | | | |||||
| * | | | shader/other: Stub S2R LaneId | ReinUsesLisp | 2020-01-29 | 1 | -1/+4 |
| |/ / | |||||
* | | | shader: Remove curly braces initializers on shared pointers | ReinUsesLisp | 2020-02-02 | 2 | -3/+3 |
| | | | |||||
* | | | Merge pull request #3282 from FernandoS27/indexed-samplers | bunnei | 2020-02-02 | 1 | -37/+73 |
|\ \ \ | | | | | | | | | Partially implement Indexed samplers in general and specific code in GLSL | ||||
| * | | | Shader_IR: Address feedback. | Fernando Sahmkow | 2020-01-25 | 1 | -1/+2 |
| | | | | |||||
| * | | | Shader_IR: Change name of TrackSampler function so it does not confuse with the type. | Fernando Sahmkow | 2020-01-24 | 1 | -1/+1 |
| | | | | |||||
| * | | | Shader_IR: Propagate bindless index into the GL compiler. | Fernando Sahmkow | 2020-01-24 | 1 | -16/+24 |
| | | | | |||||
| * | | | Shader_IR: deduce size of indexed samplers | Fernando Sahmkow | 2020-01-24 | 1 | -4/+5 |
| | | | | |||||
| * | | | Shader_IR: Setup Indexed Samplers on the IR | Fernando Sahmkow | 2020-01-24 | 1 | -20/+46 |
| |/ / | |||||
* | | | Merge pull request #3347 from ReinUsesLisp/local-mem | bunnei | 2020-01-30 | 1 | -30/+55 |
|\ \ \ | |_|/ |/| | | shader/memory: Implement LDL.S16, LDS.S16, STL.S16 and STS.S16 | ||||
| * | | shader/memory: Implement STL.S16 and STS.S16 | ReinUsesLisp | 2020-01-25 | 1 | -3/+10 |
| | | | |||||
| * | | shader/memory: Implement unaligned LDL.S16 and LDS.S16 | ReinUsesLisp | 2020-01-25 | 1 | -5/+3 |
| | | | |||||
| * | | shader/memory: Move unaligned load/store to functions | ReinUsesLisp | 2020-01-25 | 1 | -18/+27 |
| | | | |||||
| * | | shader/memory: Implement LDL.S16 and LDS.S16 | ReinUsesLisp | 2020-01-25 | 1 | -12/+23 |
| |/ | |||||
* / | shader/memory: Implement ATOM.ADD | ReinUsesLisp | 2020-01-26 | 1 | -1/+21 |
|/ | | | | | | | | | | | | | ATOM operates atomically on global memory. For now only add ATOM.ADD since that's what was found in commercial games. This asserts for ATOM.ADD.S32 (handling the others as unimplemented), although ATOM.ADD.U32 shouldn't be any different. This change forces us to change the default type on SPIR-V storage buffers from float to uint. We could also alias the buffers, but it's simpler for now to just use uint. While we are at it, abstract the code to avoid repetition. | ||||
* | Merge pull request #3273 from FernandoS27/txd-array | bunnei | 2020-01-24 | 1 | -5/+12 |
|\ | | | | | Shader_IR: Implement TXD Array. | ||||
| * | Shader_IR: Implement TXD Array. | Fernando Sahmkow | 2020-01-04 | 1 | -5/+12 |
| | | | | | | | | | | This commit extends the compilation of TXD to support array samplers on TXD. | ||||
* | | shader/memory: Implement ATOMS.ADD.U32 | ReinUsesLisp | 2020-01-16 | 1 | -0/+19 |
| | | |||||
* | | Merge pull request #3287 from ReinUsesLisp/ldg-stg-16 | bunnei | 2020-01-14 | 1 | -33/+51 |
|\ \ | | | | | | | shader_ir/memory: Implement u16 and u8 for STG and LDG | ||||
| * | | shader_ir/memory: Implement u16 and u8 for STG and LDG | ReinUsesLisp | 2020-01-09 | 1 | -33/+51 |
| |/ | | | | | | | | | | | | | Using the same technique we used for u8 on LDG, implement u16. In the case of STG, load memory and insert the value we want to set into it with bitfieldInsert. Then set that value. | ||||
* / | shader_ir/texture: Simplify AOFFI code | ReinUsesLisp | 2020-01-09 | 1 | -10/+6 |
|/ | |||||
* | Merge pull request #3239 from ReinUsesLisp/p2r | bunnei | 2020-01-01 | 1 | -16/+44 |
|\ | | | | | shader/p2r: Implement P2R Pr | ||||
| * | shader/p2r: Implement P2R Pr | ReinUsesLisp | 2019-12-20 | 1 | -1/+15 |
| | | | | | | | | | | P2R dumps predicate or condition codes state to a register. This is useful for unit testing. | ||||
| * | shader/r2p: Refactor P2R to support P2R | ReinUsesLisp | 2019-12-20 | 1 | -16/+30 |
| | | |||||
* | | Merge pull request #3228 from ReinUsesLisp/ptp | bunnei | 2019-12-27 | 1 | -33/+75 |
|\ \ | | | | | | | shader/texture: Implement AOFFI and PTP for TLD4 and TLD4S | ||||
| * | | shader/texture: Implement TLD4.PTP | ReinUsesLisp | 2019-12-16 | 1 | -18/+56 |
| | | | |||||
| * | | shader/texture: Enable arrayed TLD4 | ReinUsesLisp | 2019-12-16 | 1 | -1/+0 |
| | | | |||||
| * | | shader/texture: Implement AOFFI for TLD4S | ReinUsesLisp | 2019-12-16 | 1 | -13/+18 |
| | | | |||||
| * | | shader/texture: Remove unnecesary parenthesis | ReinUsesLisp | 2019-12-16 | 1 | -2/+2 |
| | | | |||||
* | | | Merge pull request #3235 from ReinUsesLisp/ldg-u8 | bunnei | 2019-12-22 | 1 | -6/+32 |
|\ \ \ | |_|/ |/| | | shader/memory: Implement LDG.U8 and unaligned U8 loads | ||||
| * | | shader/memory: Implement LDG.U8 and unaligned U8 loads | ReinUsesLisp | 2019-12-18 | 1 | -6/+32 |
| |/ | | | | | | | | | | | | | | | | | | | | | | | LDG can load single bytes instead of full integers or packs of integers. These have the advantage of loading bytes that are not aligned to 4 bytes. To emulate these, this commit gets the byte being referenced (by doing "address & 3") and then uses that to extract the byte from the loaded integer: result = bitfieldExtract(loaded_integer, (address % 4) * 8, 8) | ||||
* | | Merge pull request #3234 from ReinUsesLisp/i2f-u8-selector | bunnei | 2019-12-20 | 1 | -2/+13 |
|\ \ | | | | | | | shader/conversion: Implement byte selector in I2F | ||||
| * | | shader/conversion: Implement byte selector in I2F | ReinUsesLisp | 2019-12-18 | 1 | -2/+13 |
| |/ | | | | | | | | | | | | | I2F's byte selector is used to choose what bytes to convert to float. e.g. if the input is 0xaabbccdd and the selector is ".B3" it will convert 0xaa. The default (when it's not shown in nvdisasm) is ".B0", in that example the default would convert 0xdd to float. | ||||
* / | shader/texture: Properly shrink unused entries in size mismatches | ReinUsesLisp | 2019-12-18 | 1 | -4/+9 |
|/ | | | | | | When an image format mismatches we were inserting zeroes to the texture itself. This was not handling cases where the mismatch uses fewer coordinates than the guest shader code. Address that by resizing the vector. | ||||
* | Shader_IR: Correct TLD4S Depth Compare. | Fernando Sahmkow | 2019-12-12 | 1 | -5/+12 |
| | |||||
* | Shader_Ir: Correct TLD4S encoding and implement f16 flag. | Fernando Sahmkow | 2019-12-12 | 1 | -9/+12 |
| | |||||
* | Shader_Ir: default failed tracks on bindless samplers to null values. | Fernando Sahmkow | 2019-12-12 | 1 | -22/+75 |
| | |||||
* | shader: Implement MEMBAR.GL | ReinUsesLisp | 2019-12-10 | 1 | -0/+6 |
| | | | | Implement using memoryBarrier in GLSL and OpMemoryBarrier on SPIR-V. | ||||
* | shader_ir/other: Implement S2R InvocationId | ReinUsesLisp | 2019-12-10 | 1 | -0/+2 |
| | |||||
* | shader: Keep track of shaders using warp instructions | ReinUsesLisp | 2019-12-10 | 1 | -0/+3 |
| | |||||
* | shader_ir/memory: Implement patch stores | ReinUsesLisp | 2019-12-10 | 1 | -16/+18 |
| | |||||
* | Merge pull request #3109 from FernandoS27/new-instr | bunnei | 2019-12-07 | 2 | -7/+65 |
|\ | | | | | Implement FLO & TXD Instructions on GPU Shaders | ||||
| * | Shader_IR: Address Feedback | Fernando Sahmkow | 2019-11-18 | 2 | -10/+8 |
| | | |||||
| * | Shader_IR: Implement TXD instruction. | Fernando Sahmkow | 2019-11-14 | 1 | -7/+49 |
| | | |||||
| * | Shader_IR: Implement FLO instruction. | Fernando Sahmkow | 2019-11-14 | 1 | -0/+18 |
| | | |||||
* | | shader/texture: Handle TLDS texture type mismatches | ReinUsesLisp | 2019-11-23 | 1 | -1/+10 |
| | | | | | | | | | | | | | | | | | | | | Some games like "Fire Emblem: Three Houses" bind 2D textures to offsets used by instructions of 1D textures. To handle the discrepancy this commit uses the texture type from the binding and modifies the emitted code IR to build a valid backend expression. E.g.: Bound texture is 2D and instruction is 1D, the emitted IR samples a 2D texture in the coordinate ivec2(X, 0). | ||||
* | | shader/texture: Deduce texture buffers from locker | ReinUsesLisp | 2019-11-23 | 1 | -61/+41 |
| | | | | | | | | | | Instead of specializing shaders to separate texture buffers from 1D textures, use the locker to deduce them while they are being decoded. | ||||
* | | shader/other: Reduce DEPBAR log severity | ReinUsesLisp | 2019-11-20 | 1 | -1/+1 |
|/ | | | | | | While DEPBAR is stubbed it doesn't change anything from our end. Shading languages handle what this instruction does implicitly. We are not getting anything out of this log except noise. | ||||
* | shader_ir/warp: Implement FSWZADD | ReinUsesLisp | 2019-11-08 | 1 | -0/+9 |
| | |||||
* | gl_shader_decompiler: Reimplement shuffles with platform agnostic intrinsics | ReinUsesLisp | 2019-11-08 | 1 | -33/+35 |
| | |||||
* | shader/decode: Reduce severity of arithmetic rounding warnings | ReinUsesLisp | 2019-11-07 | 6 | -15/+17 |
| | |||||
* | shader/arithmetic: Reduce RRO stub severity | ReinUsesLisp | 2019-11-07 | 1 | -1/+2 |
| | |||||
* | shader/texture: Remove NODEP warnings | ReinUsesLisp | 2019-11-07 | 1 | -35/+0 |
| | | | | | These warnings don't offer meaningful information while decoding shaders. Remove them. | ||||
* | Merge pull request #3039 from ReinUsesLisp/cleanup-samplers | Rodrigo Locatti | 2019-11-06 | 2 | -54/+55 |
|\ | | | | | shader/node: Unpack bindless texture encoding | ||||
| * | shader/node: Unpack bindless texture encoding | ReinUsesLisp | 2019-10-30 | 2 | -54/+55 |
| | | | | | | | | | | | | | | | | | | Bindless textures were using u64 to pack the buffer and offset from where they come from. Drop this in favor of separated entries in the struct. Remove the usage of std::set in favor of std::list (it's not std::vector to avoid reference invalidations) for samplers and images. | ||||
* | | Shader_IR: Fix regression on TLD4 | Fernando Sahmkow | 2019-10-31 | 1 | -4/+3 |
| | | | | | | | | | | | | Originally on the last commit I thought TLD4 acted the same as TLD4S and didn't have a mask. It actually does have a component mask. This commit corrects that. | ||||
* | | Shader_IR: Fix TLD4 and add Bindless Variant. | Fernando Sahmkow | 2019-10-30 | 1 | -8/+24 |
|/ | | | | | | This commit fixes an issue where not all 4 results of tld4 were being written, the color component was defaulted to red, among other things. It also implements the bindless variant. | ||||
* | Merge pull request #2976 from FernandoS27/cache-fast-brx-rebased | Rodrigo Locatti | 2019-10-26 | 1 | -20/+50 |
|\ | | | | | Implement Fast BRX, fix TXQ and addapt the Shader Cache for it | ||||
| * | Shader_IR: Address Feedback. | Fernando Sahmkow | 2019-10-26 | 1 | -22/+16 |
| | | |||||
| * | Shader_IR: allow lookup of texture samplers within the shader_ir for instructions that don't provide it | Fernando Sahmkow | 2019-10-25 | 1 | -18/+54 |
| | | |||||
* | | Merge pull request #3013 from FernandoS27/tld4s-fix | Rodrigo Locatti | 2019-10-26 | 1 | -4/+4 |
|\ \ | |/ |/| | Shader_Ir: Fix TLD4S from using a component mask. | ||||
| * | Shader_Ir: Fix TLD4S from using a component mask. | Fernando Sahmkow | 2019-10-22 | 1 | -4/+4 |
| | | | | | | | | | | | | TLD4S always outputs 4 values, the previous code checked a component mask and omitted those values that weren't part of it. This commit corrects that and makes sure all 4 values are set. | ||||
* | | video_core/shader: Resolve instances of variable shadowing | Lioncash | 2019-10-24 | 6 | -11/+12 |
| | | | | | | | | Silences a few -Wshadow warnings. | ||||
* | | shader_ir/memory: Ignore global memory when tracking fails | ReinUsesLisp | 2019-10-22 | 1 | -16/+23 |
|/ | | | | | | | | | | | When constant buffer tracking fails and we are blasting through asserts, ignore global memory operations instead of invoking undefined behaviour. In the case of LDG this means filling the destination registers with zeroes; for STG this means ignoring the instruction as a whole. The default behaviour is still to abort execution on failure. | ||||
* | shader/half_set_predicate: Fix HSETP2 for constant buffers | ReinUsesLisp | 2019-10-07 | 1 | -0/+2 |
| | | | | | HSETP2 when used with a constant buffer parses the second operand type as F32. This is not configurable. | ||||
* | shader/half_set_predicate: Reduce DEBUG_ASSERT to LOG_DEBUG | ReinUsesLisp | 2019-10-07 | 1 | -1/+2 |
| | |||||
* | Merge pull request #2869 from ReinUsesLisp/suld | bunnei | 2019-09-24 | 1 | -60/+77 |
|\ | | | | | shader/image: Implement SULD and fix SUATOM | ||||
| * | gl_shader_decompiler: Use uint for images and fix SUATOM | ReinUsesLisp | 2019-09-21 | 1 | -37/+29 |
| | | | | | | | | | | | | In the process remove implementation of SUATOM.MIN and SUATOM.MAX as these require a distinction between U32 and S32. These have to be implemented with imageCompSwap loop. | ||||
| * | shader/image: Implement SULD and remove irrelevant code | ReinUsesLisp | 2019-09-21 | 1 | -24/+49 |
| | | | | | | | | | | * Implement SULD as float. * Remove conditional declaration of GL_ARB_shader_viewport_layer_array. | ||||
* | | Merge pull request #2878 from FernandoS27/icmp | Rodrigo Locatti | 2019-09-21 | 1 | -0/+29 |
|\ \ | |/ |/| | shader_ir: Implement ICMP | ||||
| * | Shader_IR: ICMP corrections and fixes | Fernando Sahmkow | 2019-09-21 | 1 | -6/+9 |
| | | |||||
| * | Shader_IR: Implement ICMP. | Fernando Sahmkow | 2019-09-20 | 1 | -0/+26 |
| | | |||||
* | | Merge pull request #2855 from ReinUsesLisp/shfl | bunnei | 2019-09-20 | 1 | -0/+47 |
|\ \ | |/ |/| | shader_ir/warp: Implement SHFL for Nvidia devices | ||||
| * | shader_ir/warp: Implement SHFL | ReinUsesLisp | 2019-09-17 | 1 | -0/+47 |
| | | |||||
* | | Merge pull request #2784 from ReinUsesLisp/smem | bunnei | 2019-09-18 | 1 | -19/+29 |
|\ \ | |/ |/| | shader_ir: Implement shared memory | ||||
| * | shader_ir: Implement LD_S | ReinUsesLisp | 2019-09-05 | 1 | -10/+13 |
| | | | | | | | | Loads from shared memory. | ||||
| * | shader_ir: Implement ST_S | ReinUsesLisp | 2019-09-05 | 1 | -9/+16 |
| | | | | | | | | | | This instruction writes to a memory buffer shared with threads within the same work group. It is known as "shared" memory in GLSL. | ||||
* | | shader/image: Implement SUATOM and fix SUST | ReinUsesLisp | 2019-09-11 | 1 | -21/+71 |
| | | |||||
* | | Merge pull request #2823 from ReinUsesLisp/shr-clamp | bunnei | 2019-09-10 | 1 | -6/+13 |
|\ \ | | | | | | | shader/shift: Implement SHR wrapped and clamped variants | ||||
| * | | shader/shift: Implement SHR wrapped and clamped variants | ReinUsesLisp | 2019-09-04 | 1 | -6/+13 |
| | | | | | | | | | | | | | | | | | | Nvidia defaults to wrapped shifts, but this is undefined behaviour on OpenGL's spec. Explicitly mask/clamp according to what the guest shader requires. | ||||
* | | | gl_shader_decompiler: Keep track of written images and mark them as modified | ReinUsesLisp | 2019-09-06 | 1 | -21/+19 |
| |/ |/| | |||||
* | | half_set_predicate: Fix predicate assignments | ReinUsesLisp | 2019-09-04 | 1 | -10/+9 |
|/ | |||||
* | Merge pull request #2812 from ReinUsesLisp/f2i-selector | bunnei | 2019-09-04 | 1 | -6/+16 |
|\ | | | | | shader_ir/conversion: Implement F2I and F2F F16 selector | ||||
| * | shader_ir/conversion: Split int and float selector and implement F2F H1 | ReinUsesLisp | 2019-08-28 | 1 | -18/+16 |
| | | |||||
| * | shader_ir/conversion: Implement F2I F16 Ra.H1 | ReinUsesLisp | 2019-08-28 | 1 | -4/+16 |
| | | |||||
* | | Merge pull request #2811 from ReinUsesLisp/fsetp-fix | bunnei | 2019-09-04 | 1 | -4/+5 |
|\ \ | | | | | | | float_set_predicate: Add missing negation bit for the second operand | ||||
| * | | float_set_predicate: Add missing negation bit for the second operand | ReinUsesLisp | 2019-08-28 | 1 | -4/+5 |
| |/ | |||||
* | | video_core: Silent miscellaneous warnings (#2820) | Rodrigo Locatti | 2019-08-30 | 5 | -5/+0 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * texture_cache/surface_params: Remove unused local variable * rasterizer_interface: Add missing documentation commentary * maxwell_dma: Remove unused rasterizer reference * video_core/gpu: Sort member declaration order to silent -Wreorder warning * fermi_2d: Remove unused MemoryManager reference * video_core: Silent unused variable warnings * buffer_cache: Silent -Wreorder warnings * kepler_memory: Remove unused MemoryManager reference * gl_texture_cache: Add missing override * buffer_cache: Add missing include * shader/decode: Remove unused variables | ||||
* | | Merge pull request #2758 from ReinUsesLisp/packed-tid | bunnei | 2019-08-29 | 1 | -0/+7 |
|\ \ | | | | | | | shader/decode: Implement S2R Tic | ||||
| * | | shader/decode: Implement S2R Tic | ReinUsesLisp | 2019-07-22 | 1 | -0/+7 |
| | | | |||||
* | | | shader_ir: Implement VOTE | ReinUsesLisp | 2019-08-21 | 1 | -0/+55 |
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement VOTE using Nvidia's intrinsics. Documentation about these can be found here https://developer.nvidia.com/reading-between-threads-shader-intrinsics Instead of using portable ARB instructions I opted to use Nvidia intrinsics because these are the closest we have to how Tegra X1 hardware renders. To stub VOTE on non-Nvidia drivers (including nouveau) this commit simulates a GPU with a warp size of one, returning what is meaningful for the instruction being emulated: * anyThreadNV(value) -> value * allThreadsNV(value) -> value * allThreadsEqualNV(value) -> true ballotARB, also known as "uint64_t(activeThreadsNV())", emits VOTE.ANY Rd, PT, PT; on nouveau's compiler. This doesn't match exactly to Nvidia's code VOTE.ALL Rd, PT, PT; Which is emulated with activeThreadsNV() by this commit. In theory this shouldn't really matter since .ANY, .ALL and .EQ affect the predicates (set to PT on those cases) and not the registers. | ||||
* | | Merge pull request #2777 from ReinUsesLisp/hsetp2-fe3h-fix | bunnei | 2019-08-21 | 1 | -1/+1 |
|\ \ | | | | | | | half_set_predicate: Fix HSETP2_C constant buffer offset | ||||
| * | | half_set_predicate: Fix HSETP2_C constant buffer offset | ReinUsesLisp | 2019-08-04 | 1 | -1/+1 |
| | | | |||||
* | | | Merge pull request #2753 from FernandoS27/float-convert | bunnei | 2019-08-21 | 1 | -5/+25 |
|\ \ \ | | | | | | | | | Shader_Ir: Implement F16 Variants of F2F, F2I, I2F. | ||||
| * | | | Shader_Ir: Implement F16 Variants of F2F, F2I, I2F. | Fernando Sahmkow | 2019-07-20 | 1 | -5/+25 |
| | |/ | |/| | | | | | | | | | | This commit takes care of implementing the F16 Variants of the conversion instructions and makes sure conversions are done. | ||||
* | | | Merge pull request #2778 from ReinUsesLisp/nop | bunnei | 2019-08-18 | 1 | -0/+6 |
|\ \ \ | | | | | | | | | shader_ir: Implement NOP | ||||
| * | | | shader_ir: Implement NOP | ReinUsesLisp | 2019-08-04 | 1 | -0/+6 |
| | |/ | |/| | |||||
* / | | decode/half_set_predicate: Fix predicates | ReinUsesLisp | 2019-07-26 | 1 | -3/+3 |
|/ / | |||||
* | | Merge pull request #2743 from FernandoS27/surpress-assert | bunnei | 2019-07-25 | 5 | -13/+20 |
|\ \ | |/ |/| | Downgrade and suppress a series of GPU asserts and debug messages. | ||||
| * | Shader_Ir: Change Debug Asserts for Log Warnings | Fernando Sahmkow | 2019-07-20 | 3 | -10/+17 |
| | | |||||
| * | Shader_Ir: correct clang format | Fernando Sahmkow | 2019-07-18 | 1 | -2/+2 |
| | | |||||
| * | Shader_Ir: Downgrade precision and rounding asserts to debug asserts. | Fernando Sahmkow | 2019-07-18 | 5 | -10/+10 |
| | | | | | | | | | | | | This commit reduces the severity of asserts for FP precision and rounding, as these are well known and have little to no consequence for the GPU's accuracy. | ||||
* | | shader/half_set_predicate: Fix HSETP2 implementation | ReinUsesLisp | 2019-07-20 | 1 | -17/+14 |
| | | |||||
* | | shader/half_set_predicate: Implement missing HSETP2 variants | ReinUsesLisp | 2019-07-20 | 1 | -13/+29 |
| | | |||||
* | | Merge pull request #2738 from lioncash/shader-ir | bunnei | 2019-07-18 | 3 | -30/+30 |
|\ \ | |/ |/| | shader-ir: Minor cleanup-related changes | ||||
| * | shader_ir: Rename Get/SetTemporal to Get/SetTemporary | Lioncash | 2019-07-17 | 3 | -30/+30 |
| | | | | | | | | | | | | This is more accurate in terms of describing what the functions are actually doing. Temporal relates to time, not the setting of a temporary itself. | ||||
* | | Merge pull request #2740 from lioncash/bra | Fernando Sahmkow | 2019-07-17 | 1 | -1/+1 |
|\ \ | |/ |/| | shader/decode/other: Correct branch indirect argument within BRA handling | ||||
| * | shader/decode/other: Correct branch indirect argument within BRA handling | Lioncash | 2019-07-16 | 1 | -1/+1 |
| | | | | | | | | | | This appears to have been a copy/paste error introduced within 8a6fc529a968e007f01464abadd32f9b5eb0a26c | ||||
* | | shader: Allow tracking of indirect buffers without variable offset | ReinUsesLisp | 2019-07-15 | 3 | -23/+10 |
|/ | | | | | | While changing this code, simplify tracking code to allow returning the base address node, this way callers don't have to manually rebuild it on each invocation. | ||||
* | Merge pull request #2692 from ReinUsesLisp/tlds-f16 | Fernando Sahmkow | 2019-07-14 | 1 | -1/+7 |
|\ | | | | | shader/texture: Add F16 support for TLDS | ||||
| * | shader/texture: Add F16 support for TLDS | ReinUsesLisp | 2019-07-07 | 1 | -1/+7 |
| | | |||||
* | | shader_ir: Unify blocks in decompiled shaders. | Fernando Sahmkow | 2019-07-09 | 1 | -7/+23 |
| | | |||||
* | | shader_ir: Implement BRX & BRA.CC | Fernando Sahmkow | 2019-07-09 | 1 | -4/+38 |
| | | |||||
* | | Delete decode_integer_set.cpp | Tobias | 2019-07-07 | 1 | -0/+0 |
|/ | |||||
* | decode/texture: Address feedback | ReinUsesLisp | 2019-06-24 | 1 | -0/+1 |
| | |||||
* | shader_ir: Fix image copy rebase issues | Fernando Sahmkow | 2019-06-21 | 1 | -2/+7 |
| | |||||
* | shader: Implement bindless images | ReinUsesLisp | 2019-06-21 | 1 | -2/+28 |
| | |||||
* | shader: Decode SUST and implement backing image functionality | ReinUsesLisp | 2019-06-21 | 1 | -0/+89 |
| | |||||
* | shader: Implement texture buffers | ReinUsesLisp | 2019-06-21 | 1 | -0/+44 |
| | |||||
* | shader: Split SSY and PBK stack | ReinUsesLisp | 2019-06-07 | 1 | -10/+8 |
| | | | | | | | | | | | Hardware testing revealed that SSY and PBK push to a different stack, allowing code like this: SSY label1; PBK label2; SYNC; label1: PBK; label2: EXIT; | ||||
* | shader: Use shared_ptr to store nodes and move initialization to file | ReinUsesLisp | 2019-06-06 | 26 | -8/+34 |
| | | | | | | | | | Instead of having a vector of unique_ptr stored in a vector and returning star pointers to this, use shared_ptr. While changing initialization code, move it to a separate file when possible. This is a first step to allow code analysis and node generation beyond the ShaderIR class. | ||||
* | Merge pull request #2446 from ReinUsesLisp/tid | bunnei | 2019-05-29 | 1 | -14/+28 |
|\ | | | | | shader: Implement S2R Tid{XYZ} and CtaId{XYZ} | ||||
| * | shader: Implement S2R Tid{XYZ} and CtaId{XYZ} | ReinUsesLisp | 2019-05-20 | 1 | -14/+28 |
| | | |||||
* | | Merge pull request #2485 from ReinUsesLisp/generic-memory | bunnei | 2019-05-25 | 1 | -27/+55 |
|\ \ | | | | | | | shader/memory: Implement generic memory stores and loads (ST and LD) | ||||
| * | | shader/memory: Implement ST (generic memory) | ReinUsesLisp | 2019-05-21 | 1 | -21/+35 |
| | | | |||||
| * | | shader/memory: Implement LD (generic memory) | ReinUsesLisp | 2019-05-21 | 1 | -7/+21 |
| |/ | |||||
* | | shader/decode/*: Add missing newline to files lacking them | Lioncash | 2019-05-23 | 18 | -18/+18 |
| | | | | | | | | Keeps the shader code file endings consistent. | ||||
* | | shader/decode/*: Eliminate indirect inclusions | Lioncash | 2019-05-23 | 6 | -1/+5 |
| | | | | | | | | | | | | | | Amends cases where we were using things that were indirectly being satisfied through other headers. This way, if those headers change and eliminate dependencies on other headers in the future, we don't have cascading compilation errors. | ||||
* | | shader/decode/memory: Remove left in debug pragma | Lioncash | 2019-05-22 | 1 | -2/+0 |
|/ | |||||
* | Merge pull request #2441 from ReinUsesLisp/al2p | bunnei | 2019-05-19 | 2 | -10/+33 |
|\ | | | | | shader: Implement AL2P and ALD.PHYS | ||||
| * | shader_ir/other: Implement IPA.IDX | ReinUsesLisp | 2019-05-03 | 1 | -5/+8 |
| | | |||||
| * | shader_ir/memory: Assert on non-32 bits ALD.PHYS | ReinUsesLisp | 2019-05-03 | 1 | -0/+3 |
| | | |||||
| * | shader: Add physical attributes commentaries | ReinUsesLisp | 2019-05-03 | 1 | -1/+1 |
| | | |||||
| * | shader_ir/memory: Implement physical input attributes | ReinUsesLisp | 2019-05-03 | 1 | -3/+6 |
| | | |||||
| * | shader: Remove unused AbufNode Ipa mode | ReinUsesLisp | 2019-05-03 | 2 | -6/+3 |
| | | |||||
| * | shader_ir/memory: Emit AL2P IR | ReinUsesLisp | 2019-05-03 | 1 | -0/+17 |
| | | |||||
* | | video_core/shader/decode/texture: Remove unused variable from GetTld4Code() | Lioncash | 2019-05-10 | 1 | -1/+0 |
| | | |||||
* | | shader/decode/texture: Remove unused variable | Lioncash | 2019-05-04 | 1 | -1/+0 |
|/ | | | | This isn't used anywhere, so we can get rid of it. | ||||
* | Merge pull request #2435 from ReinUsesLisp/misc-vc | bunnei | 2019-04-29 | 1 | -1/+1 |
|\ | | | | | shader_ir: Miscellaneous fixes | ||||
| * | shader_ir/texture: Fix sampler const buffer key shift | ReinUsesLisp | 2019-04-26 | 1 | -1/+1 |
| | | |||||
* | | Merge pull request #2322 from ReinUsesLisp/wswitch | bunnei | 2019-04-29 | 2 | -5/+7 |
|\ \ | | | | | | | video_core: Silent -Wswitch warnings | ||||
| * | | video_core: Silent -Wswitch warnings | ReinUsesLisp | 2019-04-18 | 2 | -5/+7 |
| | | | |||||
* | | | Merge pull request #2423 from FernandoS27/half-correct | bunnei | 2019-04-29 | 2 | -15/+16 |
|\ \ \ | |_|/ |/| | | Corrections on Half Float operations: HADD2 HMUL2 and HFMA2 | ||||
| * | | Corrections Half Float operations on const buffers and implement saturation. | Fernando Sahmkow | 2019-04-21 | 2 | -15/+16 |
| | | | |||||
* | | | Merge pull request #2407 from FernandoS27/f2f | bunnei | 2019-04-20 | 1 | -16/+53 |
|\ \ \ | |/ / |/| | | Do some corrections in conversion shader instructions. | ||||
| * | | Do some corrections in conversion shader instructions. | Fernando Sahmkow | 2019-04-16 | 1 | -16/+53 |
| |/ | | | | | | | | | | | Corrects encodings for I2F, F2F, I2I and F2I Implements Immediate variants of all four conversion types. Add assertions to unimplemented stuffs. | ||||
* | | Merge pull request #2409 from ReinUsesLisp/half-floats | bunnei | 2019-04-20 | 5 | -36/+32 |
|\ \ | | | | | | | shader_ir/decode: Miscellaneous fixes to half-float decompilation | ||||
| * | | shader_ir/decode: Fix half float pre-operations and remove MetaHalfArithmetic | ReinUsesLisp | 2019-04-16 | 5 | -29/+21 |
| | | | | | | | | | | | | | | | | | | | | | Operations done before the main half float operation (like HAdd) were managing a packed value instead of the unpacked one. Adding an unpacked operation allows us to drop the per-operand MetaHalfArithmetic entry, simplifying the code overall. | ||||
| * | | shader_ir/decode: Implement half float saturation | ReinUsesLisp | 2019-04-16 | 1 | -4/+2 |
| | | | |||||
| * | | shader_ir/decode: Reduce severity of unimplemented half-float FTZ | ReinUsesLisp | 2019-04-16 | 3 | -3/+9 |
| |/ | |||||
* | | Merge pull request #2348 from FernandoS27/guest-bindless | bunnei | 2019-04-18 | 1 | -19/+94 |
|\ \ | | | | | | | Implement Bindless Textures on Shader Decompiler and GL backend | ||||
| * | | Adapt Bindless to work with AOFFI | Fernando Sahmkow | 2019-04-08 | 1 | -7/+18 |
| | | | |||||
| * | | Move ConstBufferAccessor to Maxwell3d, correct mistakes and clang format. | Fernando Sahmkow | 2019-04-08 | 1 | -1/+2 |
| | | | |||||
| * | | Fix TMML | Fernando Sahmkow | 2019-04-08 | 1 | -5/+7 |
| | | | |||||
| * | | Refactor GetTextureCode and GetTexCode to use an optional instead of optional parameters | Fernando Sahmkow | 2019-04-08 | 1 | -23/+24 |
| | | | |||||
| * | | Implement TXQ_B | Fernando Sahmkow | 2019-04-08 | 1 | -2/+8 |
| | | | |||||
| * | | Implement TMML_B | Fernando Sahmkow | 2019-04-08 | 1 | -5/+10 |
| | | | |||||
| * | | Corrections to TEX_B | Fernando Sahmkow | 2019-04-08 | 1 | -4/+5 |
| | | | |||||
| * | | Unify both sampler types. | Fernando Sahmkow | 2019-04-08 | 1 | -10/+12 |
| | | | |||||
| * | | Implement Bindless Samplers and TEX_B in the IR. | Fernando Sahmkow | 2019-04-08 | 1 | -6/+52 |
| | | | |||||
* | | | Merge pull request #2315 from ReinUsesLisp/severity-decompiler | bunnei | 2019-04-17 | 1 | -4/+5 |
|\ \ \ | | | | | | | | | shader_ir/decode: Reduce the severity of common assertions | ||||
| * | | | shader_ir/memory: Reduce severity of LD_L cache management and log it | ReinUsesLisp | 2019-04-03 | 1 | -2/+2 |
| | | | | |||||
| * | | | shader_ir/memory: Reduce severity of ST_L cache management and log it | ReinUsesLisp | 2019-04-03 | 1 | -2/+3 |
| | | | | |||||
* | | | | shader_ir: Implement STG, keep track of global memory usage and flush | ReinUsesLisp | 2019-04-14 | 1 | -35/+74 |
| |_|/ |/| | | |||||
* | | | Correct XMAD mode, psl and high_b on different encodings. | Fernando Sahmkow | 2019-04-08 | 1 | -9/+30 |
| |/ |/| | |||||
* | | shader_ir/decode: Silent implicit sign conversion warning | Mat M | 2019-03-31 | 1 | -2/+2 |
| | | | | | | Co-Authored-By: ReinUsesLisp <reinuseslisp@airmail.cc> | ||||
* | | shader_ir/decode: Implement AOFFI for TEX and TLD4 | ReinUsesLisp | 2019-03-30 | 1 | -24/+88 |
|/ | |||||
* | shader/decode: Remove extras from MetaTexture | ReinUsesLisp | 2019-02-26 | 1 | -14/+23 |
| | |||||
* | shader/decode: Split memory and texture instructions decoding | ReinUsesLisp | 2019-02-26 | 2 | -493/+525 |
| | |||||
* | Merge pull request #2118 from FernandoS27/ipa-improve | bunnei | 2019-02-25 | 2 | -3/+14 |
|\ | | | | | shader_decompiler: Improve Accuracy of Attribute Interpolation. | ||||
| * | shader_decompiler: Improve Accuracy of Attribute Interpolation. | Fernando Sahmkow | 2019-02-14 | 2 | -3/+14 |
| | | |||||
* | | gl_shader_decompiler: Re-implement TLDS lod | ReinUsesLisp | 2019-02-12 | 1 | -1/+1 |
|/ | |||||
* | Merge pull request #2108 from FernandoS27/fix-cc | bunnei | 2019-02-12 | 1 | -2/+2 |
|\ | | | | | Fix incorrect value for CC bit in IADD | ||||
| * | Fix incorrect value for CC bit in IADD | Fernando Sahmkow | 2019-02-11 | 1 | -2/+2 |
| | | |||||
* | | Merge pull request #2109 from FernandoS27/fix-f2i | bunnei | 2019-02-12 | 1 | -3/+3 |
|\ \ | | | | | | | Corrected F2I None mode to RoundEven. | ||||
| * | | Corrected F2I None mode to RoundEven. | Fernando Sahmkow | 2019-02-11 | 1 | -3/+3 |
| |/ | |||||
* | | shader_ir: Remove F4 prefix to texture operations | ReinUsesLisp | 2019-02-07 | 1 | -8/+7 |
| | | | | | | | | | | | | This was originally included because texture operations returned a vec4. These operations now return a single float and the F4 prefix doesn't mean anything. | ||||
* | | shader_ir: Clean texture management code | ReinUsesLisp | 2019-02-07 | 1 | -96/+58 |
|/ | | | | | | | | Previous code relied on GLSL parameter order (something that's always ill-formed on an IR design). This approach passes spatial coordinates through operation nodes and array and depth compare values in the texture metadata. It still contains an "extra" vector containing generic nodes for bias and component index (for example) which is still a bit ill-formed but it should be better than the previous approach. | ||||
* | Merge pull request #2083 from ReinUsesLisp/shader-ir-cbuf-tracking | bunnei | 2019-02-07 | 25 | -34/+34 |
|\ | | | | | shader/track: Add a more permissive global memory tracking | ||||
| * | shader_ir: Rename BasicBlock to NodeBlock | ReinUsesLisp | 2019-02-03 | 25 | -33/+32 |
| | | | | | | | | It's not always used as a basic block. Rename it for consistency. | ||||
| * | shader_ir: Pass decoded nodes as a whole instead of per basic blocks | ReinUsesLisp | 2019-02-03 | 25 | -26/+27 |
| | | | | | | | | | | | | | | | | | | Some games call LDG at the top of a basic block, making the tracking heuristic fail. This commit lets the heuristic use the decoded nodes as a whole instead of per basic block. This may lead to some false positives but allows the heuristic to track cases it previously couldn't. | ||||
* | | Merge pull request #2081 from ReinUsesLisp/lmem-64 | bunnei | 2019-02-05 | 1 | -12/+43 |
|\ \ | | | | | | | shader_ir/memory: Add LD_L 64 bits loads | ||||
| * | | shader_ir/memory: Add ST_L 64 and 128 bits stores | ReinUsesLisp | 2019-02-03 | 1 | -3/+11 |
| | | | |||||
| * | | shader_ir/memory: Add LD_L 128 bits loads | ReinUsesLisp | 2019-02-03 | 1 | -7/+19 |
| | | | |||||
| * | | shader_bytecode: Rename BytesN enums to BitsN | ReinUsesLisp | 2019-02-03 | 1 | -4/+4 |
| | | | |||||
| * | | shader_ir/memory: Add LD_L 64 bits loads | ReinUsesLisp | 2019-02-03 | 1 | -6/+17 |
| |/ | |||||
* | | Merge pull request #2082 from FernandoS27/txq-stl | bunnei | 2019-02-05 | 1 | -6/+9 |
|\ \ | |/ |/| | Fix TXQ not using the component mask. | ||||
| * | Fix TXQ not using the component mask. | Fernando Sahmkow | 2019-02-03 | 1 | -6/+9 |
| | | |||||
* | | shader_ir: Unify constant buffer offset values | ReinUsesLisp | 2019-01-30 | 13 | -21/+23 |
|/ | | | | | | Constant buffer values on the shader IR were using different offsets depending on whether the access was direct or indirect. cbuf34 has a non-multiplied offset while cbuf36 has a multiplied one. On shader decoding this commit multiplies it by four on cbuf34 queries. | ||||
* | shader_decode: Implement LDG and basic cbuf tracking | ReinUsesLisp | 2019-01-30 | 1 | -0/+49 |
| | |||||
* | shader_ir: Fixup clang build | ReinUsesLisp | 2019-01-16 | 1 | -4/+6 |
| | |||||
* | shader_decode: Fixup XMAD | ReinUsesLisp | 2019-01-15 | 1 | -1/+1 |
| | |||||
* | shader_ir: Pass to decoder functions basic block's code | ReinUsesLisp | 2019-01-15 | 25 | -25/+25 |
| | |||||
* | shader_decode: Improve zero flag implementation | ReinUsesLisp | 2019-01-15 | 13 | -73/+53 |
| | |||||
* | shader_ir: Remove composite primitives and use temporals instead | ReinUsesLisp | 2019-01-15 | 1 | -145/+149 |
| | |||||
* | shader_decode: Use proper primitive names | ReinUsesLisp | 2019-01-15 | 2 | -8/+8 |
| | |||||
* | shader_decode: Use BitfieldExtract instead of shift + and | ReinUsesLisp | 2019-01-15 | 5 | -46/+18 |
| | |||||
* | shader_ir: Remove Ipa primitive | ReinUsesLisp | 2019-01-15 | 1 | -3/+2 |
| | |||||
* | shader_ir: Remove RZ and use Register::ZeroIndex instead | ReinUsesLisp | 2019-01-15 | 1 | -6/+11 |
| | |||||
* | shader_decode: Implement TEXS.F16 | ReinUsesLisp | 2019-01-15 | 1 | -13/+25 |
| | |||||
* | shader_decode: Fixup R2P | ReinUsesLisp | 2019-01-15 | 1 | -2/+3 |
| | |||||
* | shader_decode: Fixup WriteLogicOperation zero comparison | ReinUsesLisp | 2019-01-15 | 1 | -1/+1 |
| | |||||
* | shader_decode: Fixup PSET | ReinUsesLisp | 2019-01-15 | 1 | -2/+3 |
| | |||||
* | shader_decode: Fixup clang-format | ReinUsesLisp | 2019-01-15 | 2 | -2/+4 |
| | |||||
* | video_core: Implement IR based geometry shaders | ReinUsesLisp | 2019-01-15 | 1 | -0/+25 |
| | |||||
* | shader_decode: Implement VMAD and VSETP | ReinUsesLisp | 2019-01-15 | 1 | -0/+120 |
| | |||||
* | shader_decode: Implement HSET2 | ReinUsesLisp | 2019-01-15 | 1 | -1/+43 |
| | |||||
* | shader_decode: Rework HSETP2 | ReinUsesLisp | 2019-01-15 | 1 | -3/+5 |
| | |||||
* | shader_decode: Implement R2P | ReinUsesLisp | 2019-01-15 | 1 | -1/+28 |
| | |||||
* | shader_decode: Implement CSETP | ReinUsesLisp | 2019-01-15 | 1 | -14/+37 |
| | |||||
* | shader_decode: Implement PSET | ReinUsesLisp | 2019-01-15 | 1 | -1/+16 |
| | |||||
* | shader_decode: Implement HFMA2 | ReinUsesLisp | 2019-01-15 | 1 | -1/+53 |
| | |||||
* | shader_decode: Implement POPC | ReinUsesLisp | 2019-01-15 | 1 | -0/+10 |
| | |||||
* | shader_decode: Implement TLDS (untested) | ReinUsesLisp | 2019-01-15 | 1 | -8/+61 |
| | |||||
* | shader_decode: Update TLD4 reflecting #1862 changes | ReinUsesLisp | 2019-01-15 | 1 | -52/+49 |
| | |||||
* | shader_ir: Fixup TEX and TEXS and partially fix TLD4 decompiling | ReinUsesLisp | 2019-01-15 | 1 | -50/+49 |
| | |||||
* | shader_decode: Fixup FSET | ReinUsesLisp | 2019-01-15 | 1 | -2/+2 |
| | |||||
* | shader_decode: Implement IADD32I | ReinUsesLisp | 2019-01-15 | 1 | -0/+11 |
| | |||||
* | video_core: Return safe values after an assert hits | ReinUsesLisp | 2019-01-15 | 6 | -8/+12 |
| | |||||
* | shader_decode: Implement FFMA | ReinUsesLisp | 2019-01-15 | 1 | -1/+36 |
| | |||||
* | shader_ir: Fixup file inclusions and clang-format | ReinUsesLisp | 2019-01-15 | 1 | -1/+1 |
| | |||||
* | shader_decode: Fixup clang-format | ReinUsesLisp | 2019-01-15 | 2 | -3/+2 |
| | |||||
* | shader_decode: Implement LEA | ReinUsesLisp | 2019-01-15 | 1 | -0/+55 |
| | |||||
* | shader_decode: Implement IADD3 | ReinUsesLisp | 2019-01-15 | 1 | -0/+61 |
| | |||||
* | shader_decode: Implement LOP3 | ReinUsesLisp | 2019-01-15 | 1 | -0/+60 |
| | |||||
* | shader_decode: Implement ST_L | ReinUsesLisp | 2019-01-15 | 1 | -0/+17 |
| | |||||
* | shader_decode: Implement LD_L | ReinUsesLisp | 2019-01-15 | 1 | -0/+18 |
| | |||||
* | shader_decode: Implement HSETP2 | ReinUsesLisp | 2019-01-15 | 1 | -1/+37 |
| | |||||
* | shader_decode: Implement HADD2 and HMUL2 | ReinUsesLisp | 2019-01-15 | 1 | -1/+48 |
| | |||||
* | shader_decode: Implement HADD2_IMM and HMUL2_IMM | ReinUsesLisp | 2019-01-15 | 1 | -1/+28 |
| | |||||
* | shader_decode: Implement MOV_SYS | ReinUsesLisp | 2019-01-15 | 1 | -0/+27 |
| | |||||
* | shader_decode: Implement IMNMX | ReinUsesLisp | 2019-01-15 | 1 | -0/+16 |
| | |||||
* | shader_decode: Implement F2F_C | ReinUsesLisp | 2019-01-15 | 1 | -2/+10 |
| | |||||
* | shader_decode: Implement I2I | ReinUsesLisp | 2019-01-15 | 1 | -0/+26 |
| | |||||
* | shader_decode: Implement BRA internal flag | ReinUsesLisp | 2019-01-15 | 1 | -4/+8 |
| | |||||
* | shader_decode: Implement ISCADD | ReinUsesLisp | 2019-01-15 | 1 | -0/+15 |
| | |||||
* | shader_decode: Implement XMAD | ReinUsesLisp | 2019-01-15 | 1 | -1/+85 |
| | |||||
* | shader_decode: Implement PBK and BRK | ReinUsesLisp | 2019-01-15 | 1 | -1/+22 |
| | |||||
* | shader_decode: Implement LOP | ReinUsesLisp | 2019-01-15 | 1 | -0/+15 |
| | |||||
* | shader_decode: Implement SEL | ReinUsesLisp | 2019-01-15 | 1 | -0/+8 |
| | |||||
* | shader_decode: Implement IADD | ReinUsesLisp | 2019-01-15 | 1 | -1/+28 |
| | |||||
* | shader_decode: Implement ISETP | ReinUsesLisp | 2019-01-15 | 1 | -1/+30 |
| | |||||
* | shader_decode: Implement BFI | ReinUsesLisp | 2019-01-15 | 1 | -1/+22 |
| | |||||
* | shader_decode: Implement ISET | ReinUsesLisp | 2019-01-15 | 1 | -1/+27 |
| | |||||
* | shader_decode: Implement LD_C | ReinUsesLisp | 2019-01-15 | 1 | -0/+31 |
| | |||||
* | shader_decode: Implement SHL | ReinUsesLisp | 2019-01-15 | 1 | -0/+8 |
| | |||||
* | shader_decode: Implement SHR | ReinUsesLisp | 2019-01-15 | 1 | -1/+26 |
| | |||||
* | shader_decode: Implement LOP32I | ReinUsesLisp | 2019-01-15 | 1 | -1/+67 |
| | |||||
* | shader_decode: Implement BFE | ReinUsesLisp | 2019-01-15 | 1 | -1/+25 |
| | |||||
* | shader_decode: Implement FSET | ReinUsesLisp | 2019-01-15 | 1 | -1/+36 |
| | |||||
* | shader_decode: Implement F2I | ReinUsesLisp | 2019-01-15 | 1 | -0/+37 |
| | |||||
* | shader_decode: Implement I2F | ReinUsesLisp | 2019-01-15 | 1 | -0/+23 |
| | |||||
* | shader_decode: Implement F2F | ReinUsesLisp | 2019-01-15 | 1 | -1/+37 |
| | |||||
* | shader_decode: Stub DEPBAR | ReinUsesLisp | 2019-01-15 | 1 | -0/+4 |
| | |||||
* | shader_decode: Implement SSY and SYNC | ReinUsesLisp | 2019-01-15 | 1 | -0/+19 |
| | |||||
* | shader_decode: Implement PSETP | ReinUsesLisp | 2019-01-15 | 1 | -1/+21 |
| | |||||
* | shader_decode: Implement TMML | ReinUsesLisp | 2019-01-15 | 1 | -3/+45 |
| | |||||
* | shader_decode: Implement TEX and TXQ | ReinUsesLisp | 2019-01-15 | 1 | -0/+219 |
| | |||||
* | shader_decode: Implement TEXS (F32) | ReinUsesLisp | 2019-01-15 | 1 | -0/+199 |
| | |||||
* | shader_decode: Implement FSETP | ReinUsesLisp | 2019-01-15 | 1 | -1/+33 |
| | |||||
* | shader_decode: Partially implement BRA | ReinUsesLisp | 2019-01-15 | 1 | -0/+12 |
| | |||||
* | shader_decode: Implement IPA | ReinUsesLisp | 2019-01-15 | 1 | -0/+12 |
| | |||||
* | shader_decode: Implement EXIT | ReinUsesLisp | 2019-01-15 | 1 | -1/+32 |
| | |||||
* | shader_decode: Implement ST_A | ReinUsesLisp | 2019-01-15 | 1 | -0/+30 |
| | |||||
* | shader_decode: Implement LD_A | ReinUsesLisp | 2019-01-15 | 1 | -1/+39 |
| | |||||
* | shader_decode: Implement FADD32I | ReinUsesLisp | 2019-01-15 | 1 | -0/+12 |
| | |||||
* | shader_decode: Implement FMUL32_IMM | ReinUsesLisp | 2019-01-15 | 1 | -0/+10 |
| | |||||
* | shader_decode: Implement MOV32_IMM | ReinUsesLisp | 2019-01-15 | 1 | -1/+9 |
| | |||||
* | shader_decode: Stub RRO_C, RRO_R and RRO_IMM | ReinUsesLisp | 2019-01-15 | 1 | -0/+9 |
| | |||||
* | shader_decode: Implement FMNMX_C, FMNMX_R and FMNMX_IMM | ReinUsesLisp | 2019-01-15 | 1 | -0/+18 |
| | |||||
* | shader_decode: Implement MUFU | ReinUsesLisp | 2019-01-15 | 1 | -0/+29 |
| | |||||
* | shader_decode: Implement FADD_C, FADD_R and FADD_IMM | ReinUsesLisp | 2019-01-15 | 1 | -0/+15 |
| | |||||
* | shader_decode: Implement FMUL_C, FMUL_R and FMUL_IMM | ReinUsesLisp | 2019-01-15 | 1 | -0/+42 |
| | |||||
* | shader_decode: Implement MOV_C and MOV_R | ReinUsesLisp | 2019-01-15 | 1 | -1/+23 |
| | |||||
* | shader_ir: Initial implementation | ReinUsesLisp | 2019-01-15 | 25 | -0/+576 |