@NikitaHerndlhofer NikitaHerndlhofer commented Sep 14, 2025

Summary

This PR adds comprehensive support for propertyOrdering in the Google provider, allowing developers to control the order of properties in JSON responses from Gemini models. The implementation supports both a simple array format for basic use cases and a nested object format for complex structures.

Usage Examples

Simple Array Format

For basic root-level property ordering, you can use a simple array:

import { google } from '@ai-sdk/google';
import { generateObject } from 'ai';
import { z } from 'zod';

const result = await generateObject({
  model: google('gemini-2.0-flash'),
  providerOptions: {
    google: {
      propertyOrdering: ['name', 'age', 'email'], // Clean and simple!
    },
  },
  schema: z.object({
    name: z.string(),
    age: z.number(),
    email: z.string(),
  }),
  prompt: 'Generate a person profile',
});

// Output will have properties in the specified order:
// { "name": "...", "age": ..., "email": "..." }

Complex Nested Object Ordering

For nested structures, use the object format with dot notation:

const result = await generateObject({
  model: google('gemini-2.0-flash'),
  providerOptions: {
    google: {
      propertyOrdering: {
        '': ['name', 'profile', 'preferences'], // Empty string for root
        'profile': ['bio', 'settings', 'contacts'],
        'profile.settings': ['theme', 'notifications'], // dot notation for nested objects
        'profile.settings.notifications': ['email', 'push'],
        'profile.contacts': ['address', 'phone'],
        'profile.contacts.address': ['street', 'city'],
        'preferences': ['language', 'timezone'],
      },
    },
  },
  schema: z.object({
    name: z.string(),
    profile: z.object({
      bio: z.string(),
      settings: z.object({
        theme: z.string(),
        notifications: z.object({
          email: z.boolean(),
          push: z.boolean(),
        }),
      }),
      contacts: z.object({
        address: z.object({
          street: z.string(),
          city: z.string(),
        }),
        phone: z.string(),
      }),
    }),
    preferences: z.object({
      language: z.string(),
      timezone: z.string(),
    }),
  }),
  prompt: 'Generate a comprehensive user profile',
});

Technical Implementation

Key Changes

  1. Enhanced Type System:

    // Supports both formats:
    propertyOrdering: z.union([
      z.array(z.string()), // Simple: ['name', 'age', 'email']
      z.record(z.string(), z.array(z.string())), // Nested: { '': [...], 'profile': [...] }
    ]).optional();
  2. Smart Format Normalization:

    • Automatically converts simple array format to object format internally
    • Maintains full backward compatibility
    • Enables clean API for common use cases
  3. Enhanced convertJSONSchemaToOpenAPISchema function:

    • Support for both Record<string, string[]> and string[] formats
      • string[] for simple root-level property ordering and Record<string, string[]> for nested property ordering.
    • Internal normalization logic to convert between formats
    • Recursive property ordering for nested objects
    • Path tracking using dot notation ('parent.child')
  4. OpenAPI Schema Generation:
    The function converts property ordering definitions into OpenAPI-compliant schemas:

    // Input
    { '': ['name', 'profile'], 'profile': ['bio', 'settings'] }
    
    // Generated OpenAPI Schema
    {
      "type": "object",
      "properties": { /* ... */ },
      "propertyOrdering": ["name", "profile"]  // ← Automatically included
    }
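The format normalization described in item 2 could look roughly like the sketch below. The names `PropertyOrdering` and `normalizePropertyOrdering` are hypothetical and chosen for illustration; this is not the code from the PR.

```typescript
// Hypothetical type covering both accepted formats.
type PropertyOrdering = string[] | Record<string, string[]>;

// Convert the simple array format into the object format.
// The empty-string key '' stands for the root object.
function normalizePropertyOrdering(
  ordering: PropertyOrdering | undefined,
): Record<string, string[]> | undefined {
  if (ordering === undefined) return undefined;
  return Array.isArray(ordering) ? { '': ordering } : ordering;
}

const fromArray = normalizePropertyOrdering(['name', 'age', 'email']);
const fromObject = normalizePropertyOrdering({ '': ['name'], profile: ['bio'] });
console.log(fromArray, fromObject);
```

With this shape, the rest of the conversion only ever has to deal with the object format, and an omitted option stays `undefined`.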

Path-based Property Ordering

The implementation supports:

  • Root level: '' (empty string) or simple array format
  • Nested objects: 'parent.child' dot notation
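A minimal sketch of how path-based lookup with dot notation could be applied recursively. `SchemaNode` and `applyPropertyOrdering` are hypothetical names, the schema shape is reduced to what the example needs, and arrays and `allOf`/`anyOf`/`oneOf` are deliberately left out, so this should not be read as the PR's actual implementation.

```typescript
// Reduced JSON-schema-like shape (an assumption for this sketch).
interface SchemaNode {
  type?: string;
  properties?: Record<string, SchemaNode>;
  propertyOrdering?: string[];
}

// Walk the schema and attach propertyOrdering wherever the
// ordering map has an entry for the current dot-notation path.
function applyPropertyOrdering(
  schema: SchemaNode,
  ordering: Record<string, string[]>,
  path = '', // '' is the root
): SchemaNode {
  if (!schema.properties) return { ...schema };

  const properties: Record<string, SchemaNode> = {};
  for (const [key, child] of Object.entries(schema.properties)) {
    const childPath = path === '' ? key : `${path}.${key}`;
    properties[key] = applyPropertyOrdering(child, ordering, childPath);
  }

  const result: SchemaNode = { ...schema, properties };
  const order = ordering[path];
  if (order) result.propertyOrdering = order;
  return result;
}

const ordered = applyPropertyOrdering(
  {
    type: 'object',
    properties: {
      name: { type: 'string' },
      profile: {
        type: 'object',
        properties: { bio: { type: 'string' }, settings: { type: 'string' } },
      },
    },
  },
  { '': ['name', 'profile'], profile: ['bio', 'settings'] },
);
console.log(JSON.stringify(ordered, null, 2));
```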

Testing

  • 18/18 tests passing
  • ✅ Test coverage for both array and object formats
  • ✅ Tests for nested object property ordering
  • ✅ Tests for backward compatibility
  • ✅ Edge cases handled (arrays, allOf, anyOf, oneOf)
  • ✅ All existing functionality preserved

Test Cases Added:

  1. should add propertyOrdering when provided - Complex nested object test
  2. should work without propertyOrdering - Backward compatibility test
  3. should support simple array format for root-level property ordering - Simple array format test

Documentation Updates

README.md

  • Comprehensive "Property Ordering for Structured Output" section
  • Examples for both simple array and object formats
  • Real-world usage examples
  • Links to Google's official documentation

Type Documentation

  • Enhanced JSDoc comments with usage examples
  • Clear documentation of both supported formats

API Design

The API provides excellent developer experience with progressive complexity:

// ✅ Simple case - clean and intuitive
propertyOrdering: ['name', 'age', 'email']

// ✅ Complex case - full control over nested structures
propertyOrdering: {
  '': ['name', 'profile'],
  'profile': ['bio', 'settings'],
  'profile.settings': ['theme', 'notifications']
}

Breaking Changes

None. This is a purely additive feature with full backward compatibility.


This implementation enables developers to leverage Google's property ordering feature for more consistent, reliable, and high-quality structured outputs from Gemini models, with an API that scales from simple to complex use cases.

commit b79b3e56351537215b6e2a7d1ed895bb3d93d549
Author: Nikita Herndlhofer <[email protected]>
Date:   Sun Sep 14 18:47:56 2025 +0100

    Added support to provide just an array for simple root-level ordering

commit 391159394a6f4aaa67b29e44a1234773dc5afa3b
Author: Nikita Herndlhofer <[email protected]>
Date:   Sun Sep 14 18:40:03 2025 +0100

    Added patch changeset

commit 5fc0a50964868ee73c835561e3263f2aa2dfb53b
Author: Nikita Herndlhofer <[email protected]>
Date:   Sun Sep 14 17:33:52 2025 +0100

    Add property ordering support
@gr2m
Collaborator

gr2m commented Sep 15, 2025

can you share why this is needed? I looked through https://ai.google.dev/gemini-api/docs/structured-output, and I understand what it does, but what is the use case for it?

@NikitaHerndlhofer
Author

NikitaHerndlhofer commented Sep 15, 2025

can you share why this is needed? I looked through https://ai.google.dev/gemini-api/docs/structured-output, and I understand what it does, but what is the use case for it?

Because LLMs generate output token by token, every token is affected by all previous tokens. That means that in structured outputs the order of the properties matters. For example, say you are building a CoT pipeline and have defined the following structure:

reasoning: z.array(z.string()),
result: z.string()

Then you always want the LLM to generate the 'reasoning' property first and the 'result' property last.

This is a really important feature for many use cases in more or less complex AI pipelines used in production.
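For illustration, a response schema for that CoT case could carry the ordering explicitly, as sketched below. The object uses the lowercase OpenAPI-style shape shown earlier in this PR, and `cotResponseSchema` is a hypothetical name; how the schema is passed to the provider is not shown here.

```typescript
// Sketch of a response schema that asks for 'reasoning' before 'result'.
// The name and exact shape are assumptions for illustration.
const cotResponseSchema: {
  type: string;
  properties: Record<string, unknown>;
  required: string[];
  propertyOrdering: string[];
} = {
  type: 'object',
  properties: {
    reasoning: { type: 'array', items: { type: 'string' } },
    result: { type: 'string' },
  },
  required: ['reasoning', 'result'],
  // Generation order: reasoning first, result last.
  propertyOrdering: ['reasoning', 'result'],
};

console.log(JSON.stringify(cotResponseSchema, null, 2));
```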

@gr2m gr2m self-assigned this Sep 16, 2025
@gr2m
Collaborator

gr2m commented Sep 16, 2025

@NikitaHerndlhofer I have a few more questions here.

When you define the schema, like in your example

const result = await generateObject({
  model: google('gemini-2.0-flash'),
  schema: z.object({
    name: z.string(),
    age: z.number(),
    email: z.string(),
  }),
  prompt: 'Generate a person profile',
});

console.log(Object.keys(result.object));
  1. Do the keys come back in a different order? I cannot reproduce that problem.
  2. If they come back in a different order, can you check that the schema JSON is sent to the server in the order it was defined? If not, we do have a method that lets you define your own schema without relying on zod and its schema-to-JSON functionality.

@NikitaHerndlhofer
Author

@gr2m
Because JSON deserialization does not keep the property order, you should really look at the raw generated response of the LLM; I think it's result.response.body.
Note the property order there, then set the propertyOrdering option and see how the order changes.

It's easier to see when you name your properties like

propertyOne
fieldTwo
attributeLast
number5

Then the LLM will be more inconsistent in its output.

So the best way to test this would be:

  1. Set a defined seed
  2. Set temperature to 0
  3. Run the inference
  4. Look at the property order
  5. Set a different propertyOrdering setting
  6. Run the generation again and compare the results
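Steps 4 and 6 could be automated with something like the sketch below. It assumes the raw response text of each run has already been obtained as a string (how is not shown), and `topLevelKeyOrder` and `sameOrder` are hypothetical helper names. It relies on `JSON.parse` keeping insertion order for keys that are not integer-like, which holds for the property names suggested above.

```typescript
// Return the top-level keys of a raw JSON object string, in the order
// they appear in the text (valid for non-integer-like keys).
function topLevelKeyOrder(rawJson: string): string[] {
  const parsed: unknown = JSON.parse(rawJson);
  if (parsed === null || typeof parsed !== 'object' || Array.isArray(parsed)) {
    throw new Error('expected a JSON object at the top level');
  }
  return Object.keys(parsed);
}

// Compare two key sequences position by position.
function sameOrder(a: string[], b: string[]): boolean {
  return a.length === b.length && a.every((key, i) => key === b[i]);
}

// Made-up raw bodies standing in for two runs with different settings.
const runA = '{"propertyOne":"a","fieldTwo":"b","attributeLast":"c","number5":"d"}';
const runB = '{"attributeLast":"c","number5":"d","propertyOne":"a","fieldTwo":"b"}';

console.log(topLevelKeyOrder(runA));
console.log(sameOrder(topLevelKeyOrder(runA), topLevelKeyOrder(runB)));
```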

@NikitaHerndlhofer
Author

To answer your specific questions:

  1. They do, but you're looking at result.object, which will give inconsistent results.
  2. I wrote a test for that, and generally it should send the JSON schema in the same order, because we use things like Object.entries, which should behave like for...in as defined here, but this depends on zod and on the developer.

I hope this answers your questions
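As a small illustration of the Object.entries point: for a plain object, both Object.entries and for...in visit own string keys in insertion order, with integer-like keys moved to the front in ascending order. The object below is made up for the example.

```typescript
// Made-up object: two ordinary keys plus two integer-like keys.
const shape: Record<string, string> = {
  reasoning: 'array',
  result: 'string',
  '2': 'second',
  '1': 'first',
};

const viaEntries = Object.entries(shape).map(([key]) => key);

const viaForIn: string[] = [];
for (const key in shape) {
  viaForIn.push(key);
}

console.log(viaEntries); // [ '1', '2', 'reasoning', 'result' ]
console.log(viaForIn);   // same order as viaEntries
```

So a schema whose property names are ordinary identifiers keeps its declaration order, while integer-like names would be reordered.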

@gr2m
Collaborator

gr2m commented Sep 16, 2025

The raw response body is not relevant if the parsing through zod and your provided schema is what defines the order. And if you don't want to depend on zod, you can use your own custom schema: https://ai-sdk.dev/docs/reference/ai-sdk-core/json-schema#jsonschema

Can you provide an end-to-end example using AI SDK where the resulting object keys have a different order than the schema you provide?

@NikitaHerndlhofer
Author

It's relevant because it directly affects the LLM's response: LLMs generate responses token by token. The raw response is how you can inspect that.

@gr2m
Collaborator

gr2m commented Sep 16, 2025

For now we don't see a need for it. You can do the sorting outside the AI SDK as needed. We will watch out for other users running into an issue with the order of keys. Thanks Nikita
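Sorting outside the AI SDK could be as small as the sketch below. `reorderKeys` is a hypothetical helper, not part of the AI SDK. Note that it only changes the key order of the already parsed object, not the order in which the model generated the properties.

```typescript
// Own-property check that is safe for keys like 'toString'.
const hasOwn = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

// Return a shallow copy of obj whose keys follow `order`;
// keys not listed in `order` keep their relative order at the end.
function reorderKeys<T extends Record<string, unknown>>(obj: T, order: string[]): T {
  const out: Record<string, unknown> = {};
  for (const key of order) {
    if (hasOwn(obj, key)) out[key] = obj[key];
  }
  for (const key of Object.keys(obj)) {
    if (!hasOwn(out, key)) out[key] = obj[key];
  }
  return out as T;
}

const sorted = reorderKeys(
  { result: 'done', extra: 1, reasoning: ['step 1', 'step 2'] },
  ['reasoning', 'result'],
);
console.log(Object.keys(sorted)); // [ 'reasoning', 'result', 'extra' ]
```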

@gr2m gr2m closed this Sep 16, 2025
@NikitaHerndlhofer
Author

NikitaHerndlhofer commented Sep 16, 2025

https://www.dsdev.in/order-of-fields-in-structured-output-can-hurt-llms-output

https://discuss.ai.google.dev/t/output-order-in-structured-outputs/53765

@gr2m
I'd ask you to please reconsider.

Please have a look at the example here
#6541

This will also directly affect streaming the outputs.

If you have comments on the code or the approach itself, please share them; there is a need for this feature in production use cases.
Or please implement this yourself; it's a small change.

@gr2m
Collaborator

gr2m commented Sep 17, 2025

Please have a look at the example here
#6541

I did. When I run this example

const schema = z.object({
  tagFirst: z.string(),
  tag2: z.object({
    childTagFirst: z.string(),
    childTag1: z.object({
      subChildTagFirst: z.string(),
      subchildTagRandomPosition: z.string(),
      subChildTag1: z.string(),
    }),
    childTag2: z.string(),
    childTagSecond: z.string(),
  }),
  tag3: z.string(),
  tagLast: z.string(),
  tag1: z.string(),
});

const prompt = `
You are a helpful assistant that generates tags for a post.
generate tags!
`;


const result = streamObject({
  model: google('gemini-2.0-flash-001'),
  temperature: 0,
  seed: 42,
  schema,
  prompt,
});

result.partialObjectStream.pipeTo(
  new WritableStream({
    write(chunk) {
      console.log(chunk);
    },
  }),
);

The logged chunks are exactly as expected, following the order in the schema:

{ tagFirst: 'example_tag_1', tag2: {} }
{ tagFirst: 'example_tag_1', tag2: { childTagFirst: 'child_tag_a' } }
{
  tagFirst: 'example_tag_1',
  tag2: {
    childTagFirst: 'child_tag_a',
    childTag1: {
      subChildTagFirst: 'sub_child_tag_i',
      subchildTagRandomPosition: 'sub'
    }
  }
}
{
  tagFirst: 'example_tag_1',
  tag2: {
    childTagFirst: 'child_tag_a',
    childTag1: {
      subChildTagFirst: 'sub_child_tag_i',
      subchildTagRandomPosition: 'sub_child_tag_ii',
      subChildTag1: 'sub_child_tag_iii'
    }
  }
}
{
  tagFirst: 'example_tag_1',
  tag2: {
    childTagFirst: 'child_tag_a',
    childTag1: {
      subChildTagFirst: 'sub_child_tag_i',
      subchildTagRandomPosition: 'sub_child_tag_ii',
      subChildTag1: 'sub_child_tag_iii'
    },
    childTag2: 'child_tag_b',
    childTagSecond: 'child_tag_c'
  },
  tag3: 'example_tag_3',
  tagLast: 'example_tag_'
}
{
  tagFirst: 'example_tag_1',
  tag2: {
    childTagFirst: 'child_tag_a',
    childTag1: {
      subChildTagFirst: 'sub_child_tag_i',
      subchildTagRandomPosition: 'sub_child_tag_ii',
      subChildTag1: 'sub_child_tag_iii'
    },
    childTag2: 'child_tag_b',
    childTagSecond: 'child_tag_c'
  },
  tag3: 'example_tag_3',
  tagLast: 'example_tag_last',
  tag1: 'example_tag_2'
}
