Age | Commit message | Author
2024-04-29 | Correctly report emulated wave32 CUDA device [wave32_report_fix] | Andrzej Janik
2024-04-28 | Build improvements (#206) | Andrzej Janik
    * Allow to create .zip package on Windows
    * Allow to create .tar.gz package on Linux
    * Add configuration for post-build Github CI
2024-04-14 | Rewrite surface implementation to more accurately support unofficial CUDA semantics (#203) | Andrzej Janik
    This fixes black screen in some CompuBench tests (TV-L1 Optical Flow) and other apps that use CUDA surfaces incorrectly
2024-04-06 | Implement sad instruction (#198) | Andrzej Janik
2024-04-05 | Fix buggy carry flags when mixing subc/sub.cc with addc/add.cc (#197) | Andrzej Janik
2024-04-05 | Implement mad.hi.cc (#196) | NyanCatTW1
2024-03-29 | Support old PTX compression scheme (#188) | Andrzej Janik
2024-03-28 | Add Blender 4.2 support (#184) | Andrzej Janik
    Redo primary context and fix various long-standing bugs around this API
2024-03-17 | Disable even more optional LLVM components (#179) | Andrzej Janik
2024-03-17 | Fix reported build errors (#178) | Andrzej Janik
2024-03-08 | Update README.md (#166) | Ikko Eltociear Ashimine
    underying -> underlying
2024-02-26 | Fix adrenalin software link (#139) | Seb Ospina
    The link that should be for AMD Adrenalin was pointing to ROCm linux info
2024-02-16 | Update llama.cpp support (#102) | Andrzej Janik
    Add sign extension support to prmt, allow set.<op>.f16x2.f16x2, add more BLAS mappings
2024-02-15 | Update README.md (#100) | Ikko Eltociear Ashimine
    uderlying -> underlying
2024-02-15 | Add troubleshooting/debugging instructions (#91) | Andrzej Janik
2024-02-15 | Fixed typo in readme (#89) | ManInDark
2024-02-13 | Fixing typo in README.md (#63) | Arna13
2024-02-13 | Tidy up some English in ARCHITECTURE.md (#61) | Sean McLemon
2024-02-11 | Nobody expects the Red Team [v3] | Andrzej Janik
    Too many changes to list, but broadly:
    * Remove Intel GPU support from the compiler
    * Add AMD GPU support to the compiler
    * Remove Intel GPU host code
    * Add AMD GPU host code
    * More device instructions. From 40 to 68
    * More host functions. From 48 to 184
    * Add proof of concept implementation of OptiX framework
    * Add minimal support of cuDNN, cuBLAS, cuSPARSE, cuFFT, NCCL, NVML
    * Improve ZLUDA launcher for Windows
2021-02-28 | Search for a new developer (#44) | Andrzej Janik
2021-02-22 | Update README.md (#42) [v2] | Andrzej Janik
2021-02-22 | Make misc fixes (#41) | Andrzej Janik
    * Update ze_loader.lib to the newest version
    * Export _ptsz/_ptds for which we have legacy stream implementations
    * Stop producing build logs if we are not looking at them anyway
2021-02-21 | Add zluda_redirect.dll to CI builds (#40) | Andrzej Janik
2021-02-21 | Improve CI (#39) | Andrzej Janik
    * Use official GPU driver packages for building on Linux
    * Start building on Windows
    * Start uploading artifacts
2021-02-20 | Improve ZLUDA injection (#37) | Andrzej Janik
    Improve injector&redirector so it's no longer required to manually mess with files if the application links nvcuda.dll. Additionally inject into child processes
2021-01-26 | Fix signed integer conversion (#36) | Andrzej Janik
    This fixes the last remaining bug preventing end-to-end GeekBench run, so also update Geekbench results in README
2021-01-23 | Add script for replaying dumped kernel (#34) | Andrzej Janik
    zluda_dump can already create traces of GPU execution, this script can replay those traces. Additionally, added just enough code in core ZLUDA to support simple PyCUDA execution
2021-01-16 | Add a library for dumping kernels arguments before and after launch (#18) | Andrzej Janik
2021-01-15 | Prevent linker from stripping exports on Linux (#33) | Andrzej Janik
2021-01-08 | Add empty implementation of cuDeviceGetLuid (#30) | Andrzej Janik
    This function is required by recent versions of CUDA runtime on Windows
2021-01-08 | Regenerate SPIR-V tests (#29) | Andrzej Janik
    In one of the previous commits we made a change to mark ld/st as aligned. This change was not propagated to test files
2021-01-08 | Improve build procedure and instructions (#28) | Andrzej Janik
    Fixes issues pointed out in #27:
    * spirv_tools-sys was built in non-test profiles
    * By default ZLUDA dll has a wrong name
    * We relied on third-party OpenCL installation on Windows
    * We encouraged building debug configuration
    * We didn't provide build information for developers (cmake, python, submodules)
2021-01-03 | Fix Windows ZLUDA injector (#26) | Andrzej Janik
    Fix various bugs in injector and redirector, make them more robust and enable building them by default
2021-01-03 | Merge commit '4b96dbc8f49c5ae00c96935e0b576df88a5d8af9' | Andrzej Janik
2021-01-03 | Squashed 'ext/detours/' changes from 39aa864..36b69b9 | Andrzej Janik
    36b69b9 Make Detours MinGW Clang-compatible
    git-subtree-dir: ext/detours
    git-subtree-split: 36b69b971888b2ca0c5913563bae011efaa4a42e
2021-01-03 | Merge commit 'dabc40cb19bf4e297c32284d26c74adbd6775e49' as 'ext/detours' | Andrzej Janik
2021-01-03 | Squashed 'ext/detours/' content from commit 39aa864 | Andrzej Janik
    git-subtree-dir: ext/detours
    git-subtree-split: 39aa864d2985099c8d847e29a5fb86618039b9c4
2020-12-29 | Add building only CI (#25) | Takeshi Watanabe
    Testing isn't working yet because some tests require live Intel GPU and live NVIDIA GPU
2020-12-12 | Fix builtins generation, mark ld/st as aligned (#22) | Andrzej Janik
    Two changes:
    * Fixes to builtins generation that I forgot to include in #21
    * Marking of ld/st as aligned - this gives a big performance boost in GeekBench SFFT
2020-12-11 | Fix SPIR-V code generation for PTX special registers (#21) | Andrzej Janik
    We currently directly map PTX special registers: %ntid, %tid, etc. to SPIR-V builtins with type OpTypeVector %uint 4. This is wrong and leads to a silent corruption, which fails e.g. Depth of Field in GeekBench
2020-12-09 | Refactor how vectors are handled (#20) | vosen
    Current code has a problem with handling vector members: "b.x" in "mov.u32 a, b.x". This functionality has been kinda tacked-on and has annoying issues:
    * vector members support is only limited to being source of movs (so "add.u32 a.x, b.x, c.y" will not work)
    * the width of "b" in "b.x" is not known, which led to some "interesting" workarounds
    * passes can either convert all member accesses to other member accesses or to temporaries. No way to convert some member accesses to temporaries (which we need for an important fix)
    This commit solves all this
2020-11-29 | Merge pull request #15 from nilsmartel/patch-2 | vosen
    Fix small typo
2020-11-29 | Merge pull request #14 from ritschwumm/patch-1 | vosen
    fix typo in readme
2020-11-27 | Fix small typo | Nils Martel
2020-11-27 | fix typo in readme | ritschwumm
2020-11-24 | Update wording, add license | Andrzej Janik
2020-11-23 | Update README with links to GeekBench results [v1] | Andrzej Janik
2020-11-23 | Append short project name to the device if there's not enough space for long name | Andrzej Janik
2020-11-23 | Change wording slightly | Andrzej Janik
2020-11-23 | Add graph with Geekbench results | Andrzej Janik