Releases · AI-Hypercomputer/kithara · GitHub

25 Mar 15:59

Kithara v0.0.10 Release Latest


Highlights

Llama 3.1 is now supported: try it out with the "hf://meta-llama/Llama-3.1-8B" model handle :)
MaxText model inference with KV cache: MaxTextModel.generate("hi") is now much faster!
Serve with vLLM on TPUs or GPUs: check our docs to see how to serve Kithara-tuned models with vLLM
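
For readers wondering why a KV cache speeds up generation, here is a minimal pure-Python sketch of the general idea. It is not Kithara's or MaxText's implementation; the single-head attention, the helper names (`matvec`, `attend`, `decode_without_cache`, `decode_with_cache`) and the toy weights are all assumptions made for illustration.

```python
import math
import random

def matvec(W, x):
    """Project vector x with matrix W (given as a list of rows)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def attend(q, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(q))
    scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
    top = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]

def decode_without_cache(xs, Wq, Wk, Wv):
    """Re-project keys and values for the whole prefix at every step."""
    outs = []
    for t in range(len(xs)):
        keys = [matvec(Wk, x) for x in xs[: t + 1]]
        values = [matvec(Wv, x) for x in xs[: t + 1]]
        outs.append(attend(matvec(Wq, xs[t]), keys, values))
    return outs

def decode_with_cache(xs, Wq, Wk, Wv):
    """Project only the newest token and append it to the cache."""
    keys, values, outs = [], [], []
    for x in xs:
        keys.append(matvec(Wk, x))
        values.append(matvec(Wv, x))
        outs.append(attend(matvec(Wq, x), keys, values))
    return outs

# Toy inputs (hypothetical): 6 token embeddings of width 4.
random.seed(0)
dim, steps = 4, 6
xs = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(steps)]
Wq, Wk, Wv = ([[random.gauss(0, 1) for _ in range(dim)] for _ in range(dim)]
              for _ in range(3))

uncached = decode_without_cache(xs, Wq, Wk, Wv)
cached = decode_with_cache(xs, Wq, Wk, Wv)
```

Both functions return the same outputs, but for n tokens the uncached loop performs n(n+1) key/value projections while the cached loop performs 2n, which is where the speedup in autoregressive decoding comes from.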

What's Changed

Support llama3.1 models by @chandrasekhard2
Add vllm guide by @richardsliu in #3
Add disk storage documentation by @richardsliu in #4
Patch llama3.1 model support by @wenxindongwork in #5
Fix Llama3.1 MaxText saving by @wenxindongwork in #6
Support uploading models to HuggingFace Hub by @wenxindongwork in #7
Adding multiple benchmarks for LoRA & full parameter fine-tuning by @manavgarg in #9
Support running inference with KV cache on Kithara's MaxText models. by @wenxindongwork in #10

Contributors

manavgarg, richardsliu, and 2 other contributors
