Use float32 for LoRA weights to avoid the risk of underflow and overflow. #22559
james77777778 wants to merge 3 commits into keras-team:master
Conversation
Code Review
This pull request updates the LoRA implementation across several layers—including Convolutional, Dense, EinsumDense, and Embedding—to ensure that LoRA weights are initialized as float32 to prevent numerical instability. It also introduces explicit casting to the appropriate variable or compute dtypes during kernel composition and forward passes. A critical issue was identified in the EinsumDense layer where a trailing comma incorrectly converts the LoRA update into a tuple, which will cause a TypeError during tensor operations.
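As a hypothetical illustration of the trailing-comma issue flagged above (variable names are illustrative, not the actual Keras internals): a stray trailing comma wraps the LoRA update in a 1-tuple instead of leaving it a tensor, which then fails in downstream tensor ops.

```python
import numpy as np

# Illustrative LoRA factors (names are made up for this sketch).
lora_kernel_a = np.ones((4, 2), dtype=np.float32)
lora_kernel_b = np.ones((2, 3), dtype=np.float32)

delta_ok = lora_kernel_a @ lora_kernel_b    # ndarray of shape (4, 3)
delta_bad = lora_kernel_a @ lora_kernel_b,  # trailing comma -> 1-tuple!

print(type(delta_ok).__name__)   # ndarray
print(type(delta_bad).__name__)  # tuple
```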
Codecov Report ✅ All modified and coverable lines are covered by tests.

```
@@            Coverage Diff            @@
##           master   #22559     +/-  ##
========================================
  Coverage   82.95%   82.96%
========================================
  Files         596      596
  Lines       69252    69259       +7
  Branches    10814    10814
========================================
+ Hits        57451    57458       +7
  Misses       8973     8973
  Partials     2828     2828
```
PR rebased. The openvino test failure should be unrelated to this PR.
Thanks for the PR; it's very well written. Please check my minor comment about this.
```python
        "lora is already enabled. This can only be done once per layer."
    )
self._tracker.unlock()
```
|
|
The PR itself is fine; it's just worth adding a note that users should merge the LoRA weights into the kernel before deploying for inference.
Sure. The comments have been updated:
```python
# LoRA weights should be float32 to avoid the risk of underflow or
# overflow during fine-tuning.
# When deploying the model, these weights should be merged with the
# original kernel while maintaining the original kernel's dtype.
```
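The deployment advice above can be sketched as follows (a minimal NumPy sketch, not Keras' actual merge code; `scale`, `lora_a`, and `lora_b` are illustrative): the LoRA factors live in float32, the update is composed in float32, and the merged result is cast back to the base kernel's dtype.

```python
import numpy as np

rng = np.random.default_rng(0)
kernel = rng.standard_normal((8, 8)).astype(np.float16)  # low-precision base kernel
lora_a = rng.standard_normal((8, 2)).astype(np.float32)  # LoRA factors kept in float32
lora_b = rng.standard_normal((2, 8)).astype(np.float32)
scale = 1.0 / 2  # alpha / rank, illustrative values

delta = scale * (lora_a @ lora_b)  # composed in float32 for stability
# Merge in float32, then cast back to the original kernel's dtype.
merged = (kernel.astype(np.float32) + delta).astype(kernel.dtype)
print(merged.dtype)  # float16
```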
Add notes for deploying with lora weights.
Description
As reported in keras-team/keras-hub#2629
We should use high precision (float32) for LoRA weights to stabilize fine-tuning.
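A quick illustration of why (illustrative only, not from the PR): gradient updates on the order of 1e-8 underflow to zero in float16, whose smallest positive subnormal is about 6e-8, but are representable in float32.

```python
import numpy as np

update = 1e-8  # a tiny fine-tuning update
print(np.float16(update))  # 0.0 -- the update vanishes in float16
print(np.float32(update))  # 1e-08 -- preserved in float32
```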
References:
- huggingface/peft: `autocast_adapter_dtype` (impl)

Contributor Agreement
Please check all boxes below before submitting your PR for review: