Skip to content

Remove backend-specific strides + dilation_rate restriction from DepthwiseConv and SeparableConv#22598

Merged
hertschuh merged 7 commits intokeras-team:masterfrom
ssam18:fix/depthwise-separable-conv-strides-dilation-validation
Apr 17, 2026
Merged

Remove backend-specific strides + dilation_rate restriction from DepthwiseConv and SeparableConv#22598
hertschuh merged 7 commits intokeras-team:masterfrom
ssam18:fix/depthwise-separable-conv-strides-dilation-validation

Conversation

@ssam18
Copy link
Copy Markdown
Contributor

@ssam18 ssam18 commented Apr 1, 2026

The strides > 1 + dilation_rate > 1 combination is not universally unsupported. It only fails on the TensorFlow backend. The previous validation in BaseDepthwiseConv.init and BaseSeparableConv.init was raising a ValueError for all backends, which was too strict. This PR removes that blanket restriction and its associated docstring notes and tests, allowing JAX, PyTorch, and other backends to use the combination freely. Also adds compute_output_shape validation in BaseSeparableConv.build() so invalid output dimensions are still caught at build time regardless of backend.

Contributor Agreement

  • I am a human, and not a bot.
  • I will be responsible for responding to review comments in a timely manner.
  • I will work with the maintainers to push this PR forward until submission.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces validation in depthwise_conv and separable_conv to prevent the simultaneous use of strides and dilation rates greater than one, which is not supported. The implementation raises a ValueError when these conditions are met, and the tests have been updated to assert this behavior. Feedback focuses on improving maintainability by extracting the duplicated max-value calculation logic into a helper function and ensuring that the type-checking in tests is consistent with the implementation by supporting both lists and tuples.

Comment thread keras/src/ops/nn.py Outdated
Comment thread keras/src/ops/nn_test.py Outdated
Comment thread keras/src/ops/nn_test.py Outdated
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 1, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82.94%. Comparing base (a44f5be) to head (f0d804e).

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #22598      +/-   ##
==========================================
- Coverage   82.95%   82.94%   -0.01%     
==========================================
  Files         596      596              
  Lines       69200    69196       -4     
  Branches    10806    10804       -2     
==========================================
- Hits        57402    57398       -4     
  Misses       8969     8969              
  Partials     2829     2829              
Flag Coverage Δ
keras 82.76% <ø> (-0.01%) ⬇️
keras-jax 58.76% <ø> (-0.01%) ⬇️
keras-numpy 54.59% <ø> (-0.01%) ⬇️
keras-openvino 59.38% <ø> (-0.01%) ⬇️
keras-tensorflow 60.31% <ø> (-0.01%) ⬇️
keras-torch 59.10% <ø> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@keerthanakadiri keerthanakadiri added the stat:awaiting keras-eng Awaiting response from Keras engineer label Apr 2, 2026
Copy link
Copy Markdown
Collaborator

@hertschuh hertschuh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What it looks like to me is that it was working on all backends except Tensorflow. So I would argue that we should do the reverse, i.e. remove the check on BaseDepthwiseConv and BaseSeparableConv and accept the fact that it doesn't work on Tensorflow. I think this check was inherited from Keras 2, which was TensorFlow only.

@hertschuh hertschuh added stat:awaiting response from contributor and removed stat:awaiting keras-eng Awaiting response from Keras engineer labels Apr 3, 2026
@ssam18
Copy link
Copy Markdown
Contributor Author

ssam18 commented Apr 3, 2026

@hertschuh Good point! I have reversed the approach and removed the global validation from depthwise_conv and separable_conv. The TF-specific skip is restored in the correctness tests, and the shape tests are back to asserting the correct output shapes for backends that support the combination.

@hertschuh
Copy link
Copy Markdown
Collaborator

@ssam18

This PR is now empty.

@SamareshSingh
Copy link
Copy Markdown
Contributor

Forgot to push the changes. Will do that soon

@ssam18
Copy link
Copy Markdown
Contributor Author

ssam18 commented Apr 7, 2026

I am seeing failure from OpenVino that is not related to the change in this PR.
@hertschuh can you re-run the flaky test_fft2 test?

Copy link
Copy Markdown
Collaborator

@hertschuh hertschuh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple very small tweaks:

Also, can you rebase?

Comment thread keras/src/layers/convolutional/base_depthwise_conv.py Outdated
Comment thread keras/src/layers/convolutional/base_separable_conv.py Outdated
ssam18 added 6 commits April 15, 2026 15:11
…le_conv ops

The low-level keras.ops.depthwise_conv and keras.ops.separable_conv functions lacked the same strides > 1 + dilation_rate > 1 validation that already exists in the high-level DepthwiseConv2D and SeparableConv2D layers. This caused the symbolic (static) shape inference to return incorrect shapes when both strides > 1 and dilation_rate > 1 are used together, since the formula-based shape computation
disagrees with what TF's depthwise/separable conv ops actually produce.

Also added the validation to both ops and updated the tests to assert a ValueError instead of silently skipping the unsupported combination on the TF backend.
…s+dilation

The static and dynamic shape tests were asserting on output shapes for strides=2 + dilation_rate=2 combinations, which are now correctly rejected
with a ValueError. Also extract _get_seq_max helper to avoid duplicated logic in depthwise_conv and separable_conv, and use (list, tuple) consistently
in the test isinstance checks.
As per reviewer feedback, strides+dilation works on all backends except
TensorFlow. Remove the ValueError validation added to depthwise_conv()
and separable_conv() in nn.py, restore the original shape assertions in
static/dynamic shape tests, and restore the TF-specific skip in the
correctness tests.
…parableConv as it only applies to TensorFlow
@ssam18 ssam18 force-pushed the fix/depthwise-separable-conv-strides-dilation-validation branch from 83893d1 to e39ebe8 Compare April 15, 2026 20:14
Copy link
Copy Markdown
Collaborator

@hertschuh hertschuh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update the PR description and title? It's no longer what this is.

A couple of things don't conform to the code format:

Comment thread keras/src/layers/convolutional/separable_conv_test.py
Comment thread keras/src/layers/convolutional/separable_conv_test.py Outdated
@ssam18 ssam18 changed the title Fix incorrect static shape in depthwise_conv and separable_conv when strides > 1 and dilation_rate > 1 Remove backend-specific strides + dilation_rate restriction from DepthwiseConv and SeparableConv Apr 15, 2026
@ssam18
Copy link
Copy Markdown
Contributor Author

ssam18 commented Apr 17, 2026

Can you update the PR description and title? It's no longer what this is.

A couple of things don't conform to the code format:

It is done. Please check

Copy link
Copy Markdown
Collaborator

@hertschuh hertschuh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@google-ml-butler google-ml-butler bot added kokoro:force-run ready to pull Ready to be merged into the codebase labels Apr 17, 2026
@hertschuh hertschuh merged commit b98faa7 into keras-team:master Apr 17, 2026
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kokoro:force-run ready to pull Ready to be merged into the codebase size:M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants