Age | Commit message | Author | |
---|---|---|---|
2021-02-22 | Update README.md (#42) (tag: v2) | Andrzej Janik | |
2021-02-22 | Make misc fixes (#41) | Andrzej Janik | |
* Update ze_loader.lib to the newest version * Export _ptsz/_ptds for which we have legacy stream implementations * Stop producing build logs if we are not looking at them anyway | |||
2021-02-21 | Add zluda_redirect.dll to CI builds (#40) | Andrzej Janik | |
2021-02-21 | Improve CI (#39) | Andrzej Janik | |
* Use official GPU driver packages for building on Linux * Start building on Windows * Start uploading artifacts | |||
2021-02-20 | Improve ZLUDA injection (#37) | Andrzej Janik | |
Improve the injector and redirector so it's no longer necessary to manually mess with files if the application links nvcuda.dll. Additionally, inject into child processes | |||
2021-01-26 | Fix signed integer conversion (#36) | Andrzej Janik | |
This fixes the last remaining bug preventing an end-to-end GeekBench run, so also update the GeekBench results in the README | |||
2021-01-23 | Add script for replaying dumped kernel (#34) | Andrzej Janik | |
zluda_dump can already create traces of GPU execution; this script can replay those traces. Additionally, added just enough code in core ZLUDA to support simple PyCUDA execution | |||
2021-01-16 | Add a library for dumping kernel arguments before and after launch (#18) | Andrzej Janik | |
2021-01-15 | Prevent linker from stripping exports on Linux (#33) | Andrzej Janik | |
2021-01-08 | Add empty implementation of cuDeviceGetLuid (#30) | Andrzej Janik | |
This function is required by recent versions of CUDA runtime on Windows | |||
2021-01-08 | Regenerate SPIR-V tests (#29) | Andrzej Janik | |
In one of the previous commits we made a change to mark ld/st as aligned. This change was not propagated to test files | |||
2021-01-08 | Improve build procedure and instructions (#28) | Andrzej Janik | |
Fixes issues pointed out in #27: * spirv_tools-sys was built in non-test profiles * By default ZLUDA dll has a wrong name * We relied on third-party OpenCL installation on Windows * We encouraged building debug configuration * We didn't provide build information for developers (cmake, python, submodules) | |||
2021-01-03 | Fix Windows ZLUDA injector (#26) | Andrzej Janik | |
Fix various bugs in injector and redirector, make them more robust and enable building them by default | |||
2021-01-03 | Merge commit '4b96dbc8f49c5ae00c96935e0b576df88a5d8af9' | Andrzej Janik | |
2021-01-03 | Squashed 'ext/detours/' changes from 39aa864..36b69b9 | Andrzej Janik | |
36b69b9 Make Detours MinGW Clang-compatible git-subtree-dir: ext/detours git-subtree-split: 36b69b971888b2ca0c5913563bae011efaa4a42e | |||
2021-01-03 | Merge commit 'dabc40cb19bf4e297c32284d26c74adbd6775e49' as 'ext/detours' | Andrzej Janik | |
2021-01-03 | Squashed 'ext/detours/' content from commit 39aa864 | Andrzej Janik | |
git-subtree-dir: ext/detours git-subtree-split: 39aa864d2985099c8d847e29a5fb86618039b9c4 | |||
2020-12-29 | Add building only CI (#25) | Takeshi Watanabe | |
Testing isn't working yet because some tests require a live Intel GPU and a live NVIDIA GPU | |||
2020-12-12 | Fix builtins generation, mark ld/st as aligned (#22) | Andrzej Janik | |
Two changes: * Fixes to builtins generation that I forgot to include in #21 * Marking of ld/st as aligned - this gives a big performance boost in GeekBench SFFT | |||
2020-12-11 | Fix SPIR-V code generation for PTX special registers (#21) | Andrzej Janik | |
We currently map PTX special registers (%ntid, %tid, etc.) directly to SPIR-V builtins with type OpTypeVector %uint 4. This is wrong and leads to silent corruption, which breaks e.g. Depth of Field in GeekBench | |||
2020-12-09 | Refactor how vectors are handled (#20) | vosen | |
Current code has a problem with handling vector members: "b.x" in "mov.u32 a, b.x". This functionality has been kinda tacked-on and has annoying issues: * vector member support is limited to being the source of movs (so "add.u32 a.x, b.x, c.y" will not work) * the width of "b" in "b.x" is not known, which led to some "interesting" workarounds * passes can either convert all member accesses to other member accesses or to temporaries. No way to convert some member accesses to temporaries (which we need for an important fix). This commit solves all this | |||
2020-11-29 | Merge pull request #15 from nilsmartel/patch-2 | vosen | |
Fix small typo | |||
2020-11-29 | Merge pull request #14 from ritschwumm/patch-1 | vosen | |
fix typo in readme | |||
2020-11-27 | Fix small typo | Nils Martel | |
2020-11-27 | fix typo in readme | ritschwumm | |
2020-11-24 | Update wording, add license | Andrzej Janik | |
2020-11-23 | Update README with links to GeekBench results (tag: v1) | Andrzej Janik | |
2020-11-23 | Append short project name to the device if there's not enough space for long name | Andrzej Janik | |
2020-11-23 | Change wording slightly | Andrzej Janik | |
2020-11-23 | Add graph with Geekbench results | Andrzej Janik | |
2020-11-23 | Add README and rebuild .spv library | Andrzej Janik | |
2020-11-23 | Remove temporary file | Andrzej Janik | |
2020-11-23 | Rename everything | Andrzej Janik | |
2020-11-23 | Throw away useless stuff | Andrzej Janik | |
2020-11-22 | Fix typo in selp | Andrzej Janik | |
2020-11-22 | Add 8bit memset | Andrzej Janik | |
2020-11-21 | Fix linking with shl/shr, add memset on host and support __assertfail | Andrzej Janik | |
2020-11-21 | Fix problems with linking | Andrzej Janik | |
2020-11-20 | Fix buggy handling of u8 shared memory | Andrzej Janik | |
2020-11-19 | Implement stateless-to-stateful optimization | Andrzej Janik | |
2020-11-14 | Support more property queries | Andrzej Janik | |
2020-11-12 | Add back erroneously removed functionality | Andrzej Janik | |
2020-11-12 | Refactor host code to use one big lock | Andrzej Janik | |
2020-11-07 | Append project URL to device name and add few missing CUDA v1 functions | Andrzej Janik | |
2020-11-07 | Fix ftz behavior slightly | Andrzej Janik | |
2020-11-06 | Implement instructions bfe, rem, xor | Andrzej Janik | |
2020-11-05 | Implement instructions clz, brev, popc | Andrzej Janik | |
2020-11-05 | Fix same width float-to-float conversions | Andrzej Janik | |
2020-11-05 | Fix issues with .param/.local and implement sin, cos, ex2, lg2 | Andrzej Janik | |
2020-11-01 | Implement neg instruction | Andrzej Janik | |