Releases: EnzymeAD/Reactant.jl
v0.2.31
v0.2.30
Reactant v0.2.30
Merged pull requests:
- feat: use parameter shardings from XLA (#743) (@avik-pal)
- feat: JLL changes to expose HloModule (#749) (@avik-pal)
- [IFRT] add ifrt-proxy server and client bindings (#750) (@mofeing)
- fix: ordering of arguments need to be according to device (#753) (@avik-pal)
- Support tracing of `rem` with only one operand being a `ConcreteRNumber` (#754) (@giordano)
- Fix for ocean (#756) (@wsmoses)
- Bump to 0.2.30 (#757) (@glwagner)
- Fix implementation of `mod` (#758) (@giordano)
- [ReactantCUDAExt] Remove extra method (#760) (@giordano)
Closed issues:
v0.2.29
Reactant v0.2.29
Merged pull requests:
- Format code of branch "main" (#729) (@github-actions[bot])
- fix: prevent method ambiguity for CartesianIndex{1} (#730) (@avik-pal)
- [GHA] Some improvement to CI setup (#731) (@giordano)
- fix: improve generated mlir for wrapped arrays (#732) (@avik-pal)
- fix `Type(value)` instead of `type(value)` (#733) (@jumerckx)
- fix: don't expand all ranges by default (#737) (@avik-pal)
- ci: add cpp format check (#739) (@avik-pal)
- feat: sharding via IFRT (#740) (@avik-pal)
- fix: unqualified Sharding access (#741) (@avik-pal)
- Force tracing of type to act as noop (#747) (@wsmoses)
- Support for dicts (#748) (@wsmoses)
Closed issues:
v0.2.28
v0.2.27
Reactant v0.2.27
Merged pull requests:
- Format code of branch "main" (#711) (@github-actions[bot])
- feat: overload ifelse for more types (#712) (@avik-pal)
- fix: multi-device execution and sharding [take III] (#713) (@avik-pal)
- Replace capture maps with `Holded` wrapper (#715) (@mofeing)
- refactor: split XLA.jl into multiple files (#716) (@avik-pal)
- feat: enable async on CPU (#717) (@avik-pal)
- [ReactantExtra] IFRT bindings (round 4) (#718) (@mofeing)
- [ReactantExtra] feat: OpSharding bindings for Julia (#721) (@avik-pal)
- [ReactantExtra] fix: build on mac (#722) (@avik-pal)
- Update WORKSPACE (#723) (@avik-pal)
- Fix jll (#724) (@wsmoses)
Closed issues:
- shardy functions not visible on macos (#714)
v0.2.26
Reactant v0.2.26
Merged pull requests:
- `@trace` function calls (#366) (@jumerckx)
- chore: missing upstream optimization passes (#624) (@avik-pal)
- feat: shardy and multi device execution (#637) (@avik-pal)
- Regenerate MLIR Bindings (#686) (@github-actions[bot])
- Misc fixes (#687) (@wsmoses)
- dict value fix (#688) (@wsmoses)
- [deps] Some improvements to the `build_local.jl` script (#689) (@giordano)
- Multiple device error (#690) (@wsmoses)
- feat: API changes for multi-device execution [ReactantExtra JLL changes] (#692) (@avik-pal)
- Wrapping RCReferences (#697) (@hhkit)
- Ref ptr fix (#698) (@wsmoses)
- Add GPUCompiler and LLVM as deps to CUDA extension and run CUDA tests on macOS (#700) (@giordano)
- vendor optimize (#703) (@wsmoses)
- [ReactantExtra] Stop removing references to `hardware_interference_size` (#704) (@giordano)
- Update Project.toml (#705) (@wsmoses)
- JLL related fixups (#706) (@wsmoses)
- Regenerate MLIR Bindings (#708) (@github-actions[bot])
- Format code of branch "main" (#709) (@github-actions[bot])
- fix: don't trace val (#710) (@avik-pal)
Closed issues:
v0.2.25
Reactant v0.2.25
Merged pull requests:
- make `similar` return empty tensors. (#632) (@jumerckx)
- Use `LLVMOpenMP_jll` to call OpenMP functions (#673) (@giordano)
- [ReactantCUDAExt] Skip precompile load on Julia v1.11.3 (#675) (@giordano)
- Regenerate MLIR Bindings (#680) (@github-actions[bot])
- [ReactantExtra] Add argument to `ClientCompile` to pass CUDA data dir (#683) (@giordano)
- CUDA: fix gc issues (#685) (@wsmoses)
Closed issues:
v0.2.24
v0.2.23
Reactant v0.2.23
Merged pull requests:
- Regenerate MLIR Bindings (#627) (@github-actions[bot])
- [CI] Add workflow to clean up docs previews (#628) (@giordano)
- fix: build error with shardy (#629) (@avik-pal)
- [ReactantExtra] Improvements to BUILD file to compile CUDA for aarch64 (#631) (@giordano)
- fix cuda abi setting (#633) (@wsmoses)
- Format code of branch "main" (#634) (@github-actions[bot])
- [tests] Replace random custom type numbers with fixed set of numbers (#636) (@giordano)
- Add IR dumping (#638) (@wsmoses)
- [ReactantExtra] Bump XLA version (#640) (@giordano)
- TPU profiler (#642) (@Pangoraw)
- Applehw (#643) (@wsmoses)
- Regenerate MLIR Bindings (#644) (@github-actions[bot])
- feat: add dispatch for KA get_backend (#645) (@avik-pal)
- Use `xla/stream_executor/cuda:cuda_compute_capability_proto_cc_impl` only on non CUDA (#646) (@giordano)
- CPU backend (#647) (@wsmoses)
- docs: add shardy to docs (#648) (@avik-pal)
- chore: generate shardy c wrappers (#650) (@avik-pal)
- Regenerate MLIR Bindings (#651) (@github-actions[bot])
- chore: missing header files in API (#652) (@avik-pal)
- feat: the big jll PR (#653) (@avik-pal)
- [CI] Fix path of previews directory in PreviewCleanup workflow (#656) (@giordano)
- Detect TPU using PCI devices (#659) (@Pangoraw)
- Replace `trim` -> `strip` (#661) (@giordano)
- Silence various warnings in tests (#662) (@giordano)
- Feature: allow colon indexing of traced vectors (#664) (@floffy-f)
- Format code of branch "main" (#665) (@github-actions[bot])
- Regenerate MLIR Bindings (#666) (@github-actions[bot])
- KA ext (#667) (@wsmoses)
- [docs] Add information about configuration on GPU and TPU systems (#668) (@giordano)
- Fix ntuple traced type issue on unionall (#669) (@wsmoses)
Closed issues:
v0.2.22
Reactant v0.2.22
Merged pull requests:
- [CI] Move tests on aarch64 linux to GitHub Actions (#543) (@giordano)
- feat: multi GPU support (#587) (@avik-pal)
- feat: expose gpu memory allocation options (#589) (@avik-pal)
- Fix condition to skip CUDA tests on aarch64 (#592) (@giordano)
- feat: add the new optimization passes (#595) (@avik-pal)
- feat: support lowering custom fp types (#596) (@avik-pal)
- Update ReactantCUDAExt.jl (#597) (@wsmoses)
- Add convert (#598) (@wsmoses)
- feat: support dynamic indexing for reshaped arrays (#601) (@avik-pal)
- Fix dense elements attribute in `Enzyme.autodiff` #593 (#604) (@mofeing)
- feat: overload LinearAlgebra.kron (#607) (@avik-pal)
- feat: more indexing support (#608) (@avik-pal)
- feat: forward more base ops to chlo (#611) (@avik-pal)
- Add hermetic cuda getter (#612) (@wsmoses)
- [tests] Always skip CUDA tests on non-CUDA machines (#615) (@giordano)
- Typed rounding (#619) (@wsmoses)
- Regenerate MLIR Bindings (#621) (@github-actions[bot])
- feat: build the shardy dialect (#622) (@avik-pal)
- feat: support more set indexing (#625) (@avik-pal)
- Add bound optimizations (#626) (@wsmoses)
Closed issues: