ZLUDA - CUDA on AMD GPUs

Age	Commit message	Author
2024-04-22	Attempt to fix bpermute on wave64	Andrzej Janik

2024-04-14	Rewrite surface implementation to more accurately support unofficial CUDA semantics (#203)	Andrzej Janik
	This fixes black screen in some CompuBench tests (TV-L1 Optical Flow) and other apps that use CUDA surfaces incorrectly
2024-04-06	Implement sad instruction (#198)	Andrzej Janik

2024-04-05	Fix buggy carry flags when mixing subc/sub.cc with addc/add.cc (#197)	Andrzej Janik

2024-04-05	Implement mad.hi.cc (#196)	NyanCatTW1

2024-03-29	Support old PTX compression scheme (#188)	Andrzej Janik

2024-03-28	Add Blender 4.2 support (#184)	Andrzej Janik
	Redo primary context and fix various long-standing bugs around this API
2024-03-17	Disable even more optional LLVM components (#179)	Andrzej Janik

2024-03-17	Fix reported build errors (#178)	Andrzej Janik

2024-03-08	Update README.md (#166)	Ikko Eltociear Ashimine
	underying -> underlying
2024-02-26	Fix adrenalin software link (#139)	Seb Ospina
	The link that should be for AMD Adrenalin was pointing to ROCm linux info
2024-02-16	Update llama.cpp support (#102)	Andrzej Janik
	Add sign extension support to prmt, allow set.<op>.f16x2.f16x2, add more BLAS mappings
2024-02-15	Update README.md (#100)	Ikko Eltociear Ashimine
	uderlying -> underlying
2024-02-15	Add troubleshooting/debugging instructions (#91)	Andrzej Janik

2024-02-15	Fixed typo in readme (#89)	ManInDark

2024-02-13	Fixing typo in README.md (#63)	Arna13

2024-02-13	Tidy up some English in ARCHITECTURE.md (#61)	Sean McLemon

2024-02-11	Nobody expects the Red Team	Andrzej Janik
	Too many changes to list, but broadly:
	* Remove Intel GPU support from the compiler
	* Add AMD GPU support to the compiler
	* Remove Intel GPU host code
	* Add AMD GPU host code
	* More device instructions. From 40 to 68
	* More host functions. From 48 to 184
	* Add proof of concept implementation of OptiX framework
	* Add minimal support of cuDNN, cuBLAS, cuSPARSE, cuFFT, NCCL, NVML
	* Improve ZLUDA launcher for Windows
2021-02-28	Search for a new developer (#44)	Andrzej Janik

2021-02-22	Update README.md (#42)	Andrzej Janik

2021-02-22	Make misc fixes (#41)	Andrzej Janik
	* Update ze_loader.lib to the newest version
	* Export _ptsz/_ptds for which we have legacy stream implementations
	* Stop producing build logs if we are not looking at them anyway
2021-02-21	Add zluda_redirect.dll to CI builds (#40)	Andrzej Janik

2021-02-21	Improve CI (#39)	Andrzej Janik
	* Use official GPU driver packages for building on Linux
	* Start building on Windows
	* Start uploading artifacts
2021-02-20	Improve ZLUDA injection (#37)	Andrzej Janik
	Improve injector and redirector so it's no longer required to manually mess with files if the application links nvcuda.dll. Additionally, inject into child processes
2021-01-26	Fix signed integer conversion (#36)	Andrzej Janik
	This fixes the last remaining bug preventing end-to-end GeekBench run, so also update Geekbench results in README
2021-01-23	Add script for replaying dumped kernel (#34)	Andrzej Janik
	zluda_dump can already create traces of GPU execution; this script can replay those traces. Additionally, added just enough code in core ZLUDA to support simple PyCUDA execution
2021-01-16	Add a library for dumping kernels arguments before and after launch (#18)	Andrzej Janik

2021-01-15	Prevent linker from stripping exports on Linux (#33)	Andrzej Janik

2021-01-08	Add empty implementation of cuDeviceGetLuid (#30)	Andrzej Janik
	This function is required by recent versions of CUDA runtime on Windows
2021-01-08	Regenerate SPIR-V tests (#29)	Andrzej Janik
	In one of the previous commits we made a change to mark ld/st as aligned. This change was not propagated to test files
2021-01-08	Improve build procedure and instructions (#28)	Andrzej Janik
	Fixes issues pointed out in #27:
	* spirv_tools-sys was built in non-test profiles
	* By default ZLUDA dll has a wrong name
	* We relied on third-party OpenCL installation on Windows
	* We encouraged building debug configuration
	* We didn't provide build information for developers (cmake, python, submodules)
2021-01-03	Fix Windows ZLUDA injector (#26)	Andrzej Janik
	Fix various bugs in injector and redirector, make them more robust and enable building them by default
2021-01-03	Merge commit '4b96dbc8f49c5ae00c96935e0b576df88a5d8af9'	Andrzej Janik

2021-01-03	Squashed 'ext/detours/' changes from 39aa864..36b69b9	Andrzej Janik
	36b69b9 Make Detours MinGW Clang-compatible git-subtree-dir: ext/detours git-subtree-split: 36b69b971888b2ca0c5913563bae011efaa4a42e
2021-01-03	Merge commit 'dabc40cb19bf4e297c32284d26c74adbd6775e49' as 'ext/detours'	Andrzej Janik

2021-01-03	Squashed 'ext/detours/' content from commit 39aa864	Andrzej Janik
	git-subtree-dir: ext/detours git-subtree-split: 39aa864d2985099c8d847e29a5fb86618039b9c4
2020-12-29	Add building only CI (#25)	Takeshi Watanabe
	Testing isn't working yet because some tests require live Intel GPU and live NVIDIA GPU
2020-12-12	Fix builtins generation, mark ld/st as aligned (#22)	Andrzej Janik
	Two changes:
	* Fixes to builtins generation that I forgot to include in #21
	* Marking of ld/st as aligned - this gives a big performance boost in GeekBench SFFT
2020-12-11	Fix SPIR-V code generation for PTX special registers (#21)	Andrzej Janik
	We currently directly map PTX special registers (%ntid, %tid, etc.) to SPIR-V builtins with type OpTypeVector %uint 4. This is wrong and leads to silent corruption, which fails e.g. Depth of Field in GeekBench
2020-12-09	Refactor how vectors are handled (#20)	vosen
	Current code has a problem with handling vector members: "b.x" in "mov.u32 a, b.x". This functionality has been kinda tacked-on and has annoying issues:
	* vector member support is limited to being the source of movs (so "add.u32 a.x, b.x, c.y" will not work)
	* the width of "b" in "b.x" is not known, which led to some "interesting" workarounds
	* passes can either convert all member accesses to other member accesses or to temporaries. No way to convert some member accesses to temporaries (which we need for an important fix)
	This commit solves all this
2020-11-29	Merge pull request #15 from nilsmartel/patch-2	vosen
	Fix small typo
2020-11-29	Merge pull request #14 from ritschwumm/patch-1	vosen
	fix typo in readme
2020-11-27	Fix small typo	Nils Martel

2020-11-27	fix typo in readme	ritschwumm

2020-11-24	Update wording, add license	Andrzej Janik

2020-11-23	Update README with links to GeekBench results	Andrzej Janik

2020-11-23	Append short project name to the device if there's not enough space for long name	Andrzej Janik

2020-11-23	Change wording slightly	Andrzej Janik

2020-11-23	Add graph with Geekbench results	Andrzej Janik

2020-11-23	Add README and rebuild .spv library	Andrzej Janik