Skip to content

Conversation

candleindark
Copy link
Member

@candleindark candleindark commented Jul 8, 2025

This PR is a replacement of #2396, which can't be properly tested since it is coming from another repo.

This PR makes the instance config as defined in dandischema.conf, per dandi/dandi-schema#294, available through the api/info/ endpoint. The instance config, presented as serialized JSON object, can be reconstructed by calling model_validate or model_validate_json on dandischema.conf.Config.

TODOs:

  • Remove the "tmp: temporarily set the dandischema dependency to the devendorize branch" commit which is a temporary measure for testing this solution.
  • Remove the "tmp: temporarily pin dandischema to the devendorize-init-config-with-field-name branch" commit which is a temporary measure for testing this solution.
  • Figure out what env vars to set in the "CI for backend" GH workflow to set the most appropriate instance config for testing.
    • Decided to set no env vars to set the instance config, so a schema without any vendor specific information is used for testing. This schema is more relaxed compared to any vendor specific schema.

Release Notes

This PR makes the instance config as defined in dandischema.conf, per dandi/dandi-schema#294, available through the api/info/ endpoint.

yarikoptic and others added 14 commits May 15, 2025 09:52
Metadata lifecycle inspired by the one we created for BIDS:
see bids-standard/bids-website#626
- Add new API endpoints to serialize JSONSchema at runtime
- Update info endpoint to use local schema URL instead of GitHub
- Use TypeAdapter for proper Pydantic schema generation
- Add tests for schema endpoints
- Add implementation documentation

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Add visual representation of the new schema flow
- Highlight key differences from the current approach
- Show elimination of dependency on external schema repository

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
The default name django uses is "testserver" and then URL becomes
http://testserver, but DJANGO itself does not consider it to be a valid URL
because it lacks TLD.  So whenever we add "schema_url" pointing to our server
(in tests just "testserver") DJANGO starts raising validation errors.

See encode/django-rest-framework#9705
for more information/rationale.
to concetrate the testing logic and also to accent on the differences (parameters of the test)
* origin/master:
  Correct text on Github sign in / sign up button in tests
Extends test to compare local runtime-generated schema against static GitHub
schema content. This test is expected to fail, demonstrating that vendorized
schemas differ from static schemas due to runtime customizations.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Refactors schema generation to match dandischema approach and adds type
safety.
Changes from Published models to regular models to match GitHub static schemas:
- Use Dandiset instead of PublishedDandiset
- Use Asset instead of PublishedAsset
- Update test parameterization accordingly

Test now shows minimal vendorization difference (repository default value)
demonstrating the expected runtime customization vs static schema difference.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Add logic to normalize vendorization differences before schema comparison:
- Adjust repository default value from runtime config to match GitHub schema
- Test now passes, confirming schemas are equivalent after vendorization adjustment
- Demonstrates successful runtime schema generation with expected customizations

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Extends the schema API to support both regular and published models:

New endpoints:
- /api/schema/latest/published-dandiset/
- /api/schema/<version>/published-dandiset/
- /api/schema/latest/published-asset/
- /api/schema/<version>/published-asset/

Refactoring improvements:
- Extract common logic into _schema_view_impl() helper function
- Update URL patterns and view exports

Testing enhancements:
- Extend parametrized tests to cover published schema endpoints

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
In d41f24e I committed a little rushed
tune up by claude and it replaced "current state" diagram with the actually
implemented, and then for implemented it just removed having jsonschema
serializations altogether.

Since we did merges into this branch, rebasing is "tricky", so I decided to
just push a fixup commit.
@candleindark candleindark changed the title Have the /info/ endpoint provide instance config info Have the api/info/ endpoint provide instance config info Jul 8, 2025
@candleindark candleindark changed the title Have the api/info/ endpoint provide instance config info Have the api/info/ endpoint provide instance config info (duplicate) Jul 8, 2025
@candleindark candleindark changed the title Have the api/info/ endpoint provide instance config info (duplicate) Have the api/info/ endpoint provide instance config info Jul 8, 2025
@candleindark
Copy link
Member Author

@yarikoptic This PR is ready to be merged after removing one temporary commit indicated in the TODO list in the top post.

The following is the an example output from the api/info endpoint.

{
  "instance_config": {
    "instance_name": "EMBER-DANDI",
    "doi_prefix": "10.60533"
  },
  "schema_version": "0.6.10",
  "schema_url": "http://127.0.0.1:8000/api/schema/latest/dandiset/",
  "version": "0.10.0.post21+g1a480108.d20250709",
  "services": {
    "api": {
      "url": "http://localhost:8000/api"
    },
    "webui": {
      "url": "http://localhost:8085"
    },
    "jupyterhub": {
      "url": "https://hub.dandiarchive.org/"
    }
  },
  "cli-minimal-version": "0.60.0",
  "cli-bad-versions": []
}

The reasons for the remaining failures in the CI are the following.

  1. "CI for backend / test (test) ": Incorrect assumption made in a test introduced in Local runtime schema serialization endpoints #2386. For more information, visit Local runtime schema serialization endpoints #2386 (comment).
  2. "Test Integration with dandi-cli": Tests in dandi-cli need to be updated to accommodate Vendor-Configurable Metadata Models dandi-schema#294. The failures here are no different than the remaining test failures in Vendor-Configurable Metadata Models dandi-schema#294. These failures will be address in the de/revendorization of dandi-cli.

@candleindark candleindark marked this pull request as ready for review July 9, 2025 06:16
schema_url = f'http://{TEST_SERVER_NAME}{reverse("schema-dandiset-latest")}'

assert resp.json() == {
'instance_config': {'instance_name': 'DANDI', 'doi_prefix': '10.80507'},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mvandenburgh could you please look at this PR and this change to the /info/ API endpoint output?
It is to be included in the larger

The point here is to expose configuration relevant to the instance of dandi-archive, so client and others could properly discover instance names etc.

Those two variables are the ones we are to define ATM but there could be more later on. But now I started to wonder if we should just add them flat, instead of nesting into instance_config? wdyt @candleindark

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But now I started to wonder if we should just add them flat, instead of nesting into instance_config?

I prefer have all the instance config variables nested in the instance_config field in the JSON response at the `api/info/ endpoint. In that way,

  1. The value of the instance_config field can be readily used construct a dandischema.conf.Config define in Vendor-Configurable Metadata Models dandi-schema#294.
  2. Validation of the value of the instance_config field is delegated to the dandischema.conf.Config Pydantic model. On the other hand, transferring the definition of the variables in the instance_config field to the upper level of the api/info/ response would requires redefinition of the fields in dandischema.conf.Config using Django's lingo, which is a duplication of logic that can easily get out of hand when add more fields to dandischema.conf.Config in the future.

@yarikoptic
Copy link
Member

please rebase to resolve conflicts and ensure that there is no new test fails

@candleindark candleindark force-pushed the rf-jsonschema-runtime-instance-config branch from 1a48010 to bb14ef5 Compare July 28, 2025 16:03
@candleindark
Copy link
Member Author

New failures are showing up. Possibly due the changes in #2386 and dandi/dandi-schema#294. Putting this PR in draft now but will attempt after dandi/dandi-schema#312 is merged which will also affect this PR.

@candleindark candleindark marked this pull request as draft July 28, 2025 16:25
candleindark and others added 4 commits September 18, 2025 11:04
@candleindark candleindark force-pushed the rf-jsonschema-runtime-instance-config branch from 25c120a to d0a0241 Compare September 24, 2025 05:47
@candleindark candleindark force-pushed the rf-jsonschema-runtime-instance-config branch from d0a0241 to 3867de8 Compare September 24, 2025 05:57
ports:
- 9000:9000
env:
DJANGO_DANDI_INSTANCE_NAME: DANDI
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: so this is needed since otherwise some tests would start failing with DANDI-ADHOC name.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which tests? I can run the tests just fine locally. We shouldn't set this env var to DANDI anywhere besides production.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jjnesbitt - Which tests? I can run the tests just fine locally.

You are right. I have already relaxed ID_PATTERN defined in dandischema.models in case of the instance name defaults, so setting DJANGO_DANDI_INSTANCE_NAME to "DANDI" is no longer needed.

The setting of the DJANGO_DANDI_INSTANCE_NAME env var is now removed.

@yarikoptic
Copy link
Member

@jjnesbitt do you see anything to be changed in this PR or should we absorb into that

@jjnesbitt
Copy link
Member

jjnesbitt commented Sep 26, 2025

@jjnesbitt do you see anything to be changed in this PR or should we absorb into that

I left one comment. Besides that, I'd like to see the multiple tmp commits squashed into one commit that we can drop once we're ready to merge into master. Also, the temp dandi-schema branch being pointed at needs to be fixed (devendorize-init-config-with-field-name has been merged into devendorize).

As mentioned in #2386 (comment), I think this can become the sole PR on the archive side to implement the "devendorize" work.

What is the timeline for dandi/dandi-schema#294 being merged?

@candleindark candleindark force-pushed the rf-jsonschema-runtime-instance-config branch 2 times, most recently from e2df66d to b6104d2 Compare September 26, 2025 20:55
@candleindark
Copy link
Member Author

candleindark commented Sep 26, 2025

@jjnesbitt - I left one comment.

Responded.

@jjnesbitt - Besides that, I'd like to see the multiple tmp commits squashed into one commit that we can drop once we're ready to merge into master.

Done

@jjnesbitt - Also, the temp dandi-schema branch being pointed at needs to be fixed (devendorize-init-config-with-field-name has been merged into devendorize).

Fixed.

What is the timeline for dandi/dandi-schema#294 being merged?

dandi/dandi-schema#294 is complete. If it has any unforeseen changes, they should be small. The timeline for dandi/dandi-schema#294 to be merged depends on dependent changes on dandi-archive. I.e., if all the needed changes in dandi-archive to work with dandi/dandi-schema#294 are ready, we can merge dandi/dandi-schema#294 at the same time when the changes in dandi-archive are merged.

@yarikoptic Is there any other considerations in the merging process?

@yarikoptic
Copy link
Member

I think we should list a clear order of PR merges across all repos in e.g.

@candleindark please adjust original issue description with clear stages, as e.g. we should get this and #2386 get merged/released/tested, infrastructure ones merged, and then do final testing of dandi/dandi-schema#294 and merge/release it too, IIRC the steps correctly.

@candleindark candleindark force-pushed the rf-jsonschema-runtime branch 6 times, most recently from eb2e1c2 to 7134490 Compare September 29, 2025 19:04
@candleindark
Copy link
Member Author

@candleindark please adjust original issue description with clear stages, as e.g. we should get this and #2386 get merged/released/tested, infrastructure ones merged, and then do final testing of dandi/dandi-schema#294 and merge/release it too, IIRC the steps correctly.

Done

@candleindark candleindark force-pushed the rf-jsonschema-runtime-instance-config branch from aee1394 to b6104d2 Compare October 3, 2025 06:57
Base automatically changed from rf-jsonschema-runtime to master October 7, 2025 15:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

4 participants