[Bug] Mixed Precision (float16) numerical instability in GroupNormalization with small epsilon #22586

@amadhan882

Description

While using keras.mixed_precision.set_global_policy("mixed_float16"), the GroupNormalization layer produces NaN outputs when the epsilon parameter is set to a value smaller than float16 can represent (e.g., 1e-12). Since the smallest positive normalized float16 value is approximately 6.1e-5 (and the smallest subnormal approximately 6e-8), such an epsilon underflows to exactly zero when cast to float16. The variance term is then unguarded, and with large inputs (whose squares also overflow float16) the normalization collapses to NaN.
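The underflow and overflow can be reproduced directly in NumPy, independent of Keras:

```python
import numpy as np

# An epsilon below the float16 subnormal range (~6e-8) flushes to zero,
# so `variance + epsilon` no longer guards against division by zero.
eps = np.float16(1e-12)
print(eps)         # 0.0
print(eps == 0.0)  # True

# Large activations overflow when squared in float16 arithmetic,
# which poisons the variance computation with inf.
x = np.float16(65000.0)
print(x * x)       # inf
```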

PoC

import os
os.environ["KERAS_BACKEND"] = "tensorflow"
import keras
import numpy as np

keras.mixed_precision.set_global_policy("mixed_float16")
inputs = keras.Input(shape=(16, 16, 64))
# Epsilon 1e-12 causes NaN in FP16
x = keras.layers.GroupNormalization(groups=8, epsilon=1e-12)(inputs)
model = keras.Model(inputs, x)

extreme_data = np.random.uniform(low=60000, high=70000, size=(1, 16, 16, 64)).astype("float32")
output = model(extreme_data)
print(f"NaN detected: {np.any(np.isnan(output))}")


Observed Result

NaN detected: True

Expected Behavior

The layer should either:

  • Automatically clip epsilon to a safe minimum (e.g., 1e-7) when the compute dtype is float16, or

  • Perform the internal normalization math in float32 regardless of the policy (upcasting), casting back to float16 for the output.
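The upcasting option can be sketched in NumPy roughly as follows. `safe_group_norm` is a hypothetical helper for illustration, not an existing Keras API; it assumes channels-last input and performs all reductions in float32 before casting back to the input dtype:

```python
import numpy as np

def safe_group_norm(x, groups, epsilon=1e-12):
    # Hypothetical sketch: upcast to float32 so the mean/variance
    # reductions and the epsilon guard stay numerically meaningful,
    # then cast the normalized result back to the compute dtype.
    orig_dtype = x.dtype
    x32 = x.astype("float32")
    n, h, w, c = x32.shape
    g = x32.reshape(n, h, w, groups, c // groups)
    mean = g.mean(axis=(1, 2, 4), keepdims=True)
    var = g.var(axis=(1, 2, 4), keepdims=True)
    out = (g - mean) / np.sqrt(var + epsilon)
    return out.reshape(n, h, w, c).astype(orig_dtype)
```

With the extreme inputs from the PoC above, this sketch stays finite even with epsilon=1e-12, because neither the squaring nor the epsilon addition ever happens in float16.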

Actual Behavior

The layer produces NaN values as soon as it is exposed to large input values or a sub-representable epsilon in mixed_float16 mode.

Notes

  • This issue arises from float16 precision limits rather than user misuse.
  • Other normalization layers (e.g., BatchNormalization) internally upcast to float32 to avoid similar issues.
