Skip to content

fix: RecordPreparer incorrectly injected __typename into embedded concrete-typed fields#1109

Open
marcdaniels-toast wants to merge 5 commits intoblock:mainfrom
marcdaniels-toast:fix/record-preparer-typename-injection
Open

fix: RecordPreparer incorrectly injected __typename into embedded concrete-typed fields#1109
marcdaniels-toast wants to merge 5 commits intoblock:mainfrom
marcdaniels-toast:fix/record-preparer-typename-injection

Conversation

@marcdaniels-toast
Copy link
Copy Markdown
Contributor

@marcdaniels-toast marcdaniels-toast commented Apr 5, 2026

Summary

RecordPreparer was incorrectly managing __typename for embedded fields because the old approach tracked it as a type-level property — but whether __typename belongs at a given position is fundamentally contextual (the same concrete type may need __typename when indexed in a shared abstract-type index, but must omit it when embedded as a concrete field elsewhere).

Fix: Replace type-level requires_typename tracking with a mapping-based approach. mapping_properties is now threaded recursively through prepare_value_for_indexing, and __typename is included at a given position if and only if the index mapping has it there. When __typename is absent from the input record but the mapping requires it (e.g. a concrete subtype being indexed into a shared union/interface index), it is injected.

The test schema is also updated to give Person and Company their own dedicated indexes (people and companies), making them types that are both directly indexed and embedded in another indexed type (Manufacturer.ceo). This exercises the previously untested case Myron noted in the issue. Once index inheritance is fully implemented, the index can be moved up to the NamedInventor interface, at which point the acceptance test will also directly exercise the bug fix path.

Test Plan

  • Unit test (record_preparer_spec.rb): A new example directly demonstrates the bug and fix. It builds a schema where Person shares a union index (requiring __typename at root) and is also embedded as Manufacturer.ceo (where __typename must be absent). The test verifies that preparing a root Person record injects __typename, while preparing a Manufacturer record with an embedded Person ceo does not.

  • Acceptance test (search_spec.rb): The existing "supports fetching interface fields" test is extended to index a Manufacturer with a ceo: Person and query it end-to-end via named_entities. This confirms the full round-trip works correctly for a type that is both a root indexed type and embedded in another indexed type.

Fixes #1097

…crete-typed fields

When a concrete type (e.g. Person) was indexed in a shared union index, it was added
to @types_requiring_typename. The old prepare_value_for_indexing used that set for both
root and embedded contexts, causing __typename to be injected into embedded fields like
Manufacturer.ceo even though no __typename is needed or expected there.

Fix by splitting the set into @types_with_typename_required_in_json_schema (used for
embedded processing) and @types_requiring_typename (superset, used only at root level
in prepare_for_index).

Also adds Person and Company as directly indexed types in the test schema, exercising
the case where a type is both a root indexed type and embedded in another indexed type.

Generated with Claude Code
…company

Now that Person and Company have their own indexes, the validation in
common_project_specs picks them up and tries to validate them as indexed
records -- which they aren't. They're embedded-only factories used for
union fields (Widget.inventor) and concrete embedded fields (Manufacturer.ceo).

Rename them to :embedded_person/:embedded_company to make the intent clear,
and add them to ignored_factories in common_project_specs_spec.rb.

Generated with Claude Code
Comment thread Rakefile Outdated
Comment thread elasticgraph-indexer/lib/elastic_graph/indexer/record_preparer.rb Outdated
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following up on my comment above about it being better to include __typename based on the index mapping, I suspect there's a scenario which would fail with your implementation and pass with an implementation that consults the index mapping. Can you come up with a test that covers that?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been having a hard time trying to build an example to prove a real flaw exists today because things seem to just work out due to the "clever" combination of type and index mapping information. But as you said the approach is unsound because it's inappropriately using global type information for a problem that requires positional awareness. It may even break in the future if current invariants were to change. I get your idea of threading the index mapping during recursion but I think we want something similar to the current lookup mechanism so we aren't doing anything expensive during record processing. Does that seem like a good thing to look into as part of this improvement or do you already know it won't really be possible?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not able to trigger an error with the type-based approach. The example I tried after we met was this:

Manufacturer {
  inventor: Inventor
  ceo: Person
}

Where Person inherits an index from its parent type. So I'm trying to force a situation where there are two sites for the same type: one which requires typename and one which does not. But even at a type level I can't imagine this because an embedded type is either abstract (will always require typename) or concrete (will never require it). And the new handling for the root-level took its typename requirement from the index mapping.

Anyway I just went forward with the fix to use only index mapping information.

Comment thread spec_support/lib/elastic_graph/spec_support/factories/widgets.rb Outdated
…urce of truth

Replace type-level `requires_typename` tracking with a mapping-based approach:
`__typename` is included at a given position if and only if the index mapping
has it there, threading `mapping_properties` recursively through
`prepare_value_for_indexing`. This correctly handles types that are both
embedded concretely (where `__typename` should be omitted) and indexed via a
shared abstract-type index (where it must be present).

Also consolidate duplicate `:embedded_xyz`/`:indexed_xyz` factory definitions
into single `:person` and `:company` factories (both backed by `:indexed_type`),
with a `HashAsEmbedded` refinement providing `.as_embedded` to strip
indexed-type-only keys. Fix non-deterministic `Array#sample` → `Faker::Base.sample`
in Rakefile test-data generation.

Generated with Claude Code
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

RecordPreparer Bug: Context-Unaware __typename Addition

2 participants