Add a permissions system #2543

evansd · 2025-09-19T15:56:28Z

This adds a simple, generic permissions system to ehrQL. In this first instance this is only used to restrict access to certain tables, but I anticipate extending it to e.g. gate access to Event Level Data and for the various opt-out controls.

ehrQL table definitions can now be annotated with metadata using an internal _meta class. This can include a required_permission attribute e.g.

@table
class some_table(EventFrame):
    class _meta:
        required_permission = "some_restricted_dataset"

A check at the documentation build stage ensures that every restricted table references the required permissions somewhere in the table documentation.
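A minimal sketch of what such a build-time check might look like. The function name `find_undocumented_permissions` and the shape of the `tables` mapping are assumptions made for illustration, not the code in this PR.

```python
def find_undocumented_permissions(tables):
    """Return names of restricted tables whose docs never mention their permission.

    `tables` maps a table name to a (required_permission, docstring) pair.
    This shape is hypothetical, chosen to keep the sketch self-contained.
    """
    missing = []
    for name, (permission, docstring) in sorted(tables.items()):
        if permission is None:
            continue  # unrestricted tables need no permission text
        if permission not in (docstring or ""):
            missing.append(name)
    return missing


tables = {
    "appointments": ("appointments", "Requires the `appointments` permission."),
    "wl_clockstops": ("waiting_list", "Clock stops for waiting list pathways."),
    "patients": (None, "One row per patient."),
}
undocumented = find_undocumented_permissions(tables)
```

A docs build could fail whenever the returned list is non-empty, which would also catch a permission name that has been mistyped in the documentation.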

In Production

The RAP Controller already supplies an EHRQL_PERMISSIONS environment variable with a JSON-encoded list of permissions that the current job has. If a dataset or measure definition uses a restricted table without the relevant permission then this results in an immediate error, before any queries are executed. For example:

EHRQLPermissionError: You do not currently have all the permissions needed for this action.

Missing permissions are:

    * appointments: required for access to the `tpp.appointments` table
    * waiting_list: required for access to the `tpp.wl_clockstops` and `tpp.wl_openpathways` tables

If you think this is a mistake and that you should have these permissions please contact OpenSAFELY support.

ehrQL then exits with a specific status code which the RAP Agent can identify and use to inform the user that their code failed due to a permissions error.
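The flow described above could be sketched roughly as follows. `EHRQL_PERMISSIONS` and the `EHRQLPermissionError` name come from this description; the `check_permissions` and `run` helpers, the shape of `required`, and the exit code value are assumptions for illustration only.

```python
import json

# Hypothetical value: the actual status code ehrQL uses is not given here.
PERMISSIONS_EXIT_CODE = 13


class EHRQLPermissionError(Exception):
    """Raised when a definition needs permissions the job does not have."""


def check_permissions(required, environ):
    """Raise if any permission in `required` is absent from the environment.

    `required` maps a permission name to the tables that need it (assumed shape).
    """
    granted = set(json.loads(environ.get("EHRQL_PERMISSIONS", "[]")))
    missing = {p: t for p, t in required.items() if p not in granted}
    if missing:
        details = "\n".join(
            f"    * {perm}: required for access to {', '.join(tables)}"
            for perm, tables in sorted(missing.items())
        )
        raise EHRQLPermissionError(f"Missing permissions are:\n\n{details}")


def run(required, environ):
    """Return a process exit code; no queries run if the check fails."""
    try:
        check_permissions(required, environ)
    except EHRQLPermissionError as exc:
        print(exc)
        return PERMISSIONS_EXIT_CODE
    return 0


required = {
    "appointments": ["tpp.appointments"],
    "waiting_list": ["tpp.wl_clockstops", "tpp.wl_openpathways"],
}
exit_code = run(required, {"EHRQL_PERMISSIONS": '["appointments"]'})
```

Because the check runs before anything else, a distinct exit code is enough for a caller to tell a permissions failure apart from other errors.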

Locally

Running the same code locally currently results in a warning in the logs:

Some of the tables or features you are using require special permission to use with real
patient data. The permissions needed are:

    * appointments: required for access to the `tpp.appointments` table
    * waiting_list: required for access to the `tpp.wl_clockstops` and `tpp.wl_openpathways` tables   

You can continue to work on your code using dummy data by “claiming” the required permissions:

    from ehrql import claim_permissions
    claim_permissions("appointments", "waiting_list")

The intention is that once this change has been circulated and users have had time to make the necessary changes we can make this a hard error rather than a warning so it will be impossible to miss in future.

See the documentation preview here:
https://evansd-table-permissions.databuilder.pages.dev/reference/language/#permissions

Note that the claim_permissions() function acts globally, rather than being a dataset/measure specific method. This seems like a better fit with the project-wide nature of permissions and less fiddly for the user who just has to stick the relevant line somewhere in their Python file.
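To illustrate the "acts globally" design, here is a rough sketch using module-level state. Only the `claim_permissions(...)` call signature comes from this PR; the `_claimed_permissions` set and the `unclaimed` helper are hypothetical and may not match the real implementation.

```python
# Module-level state: one set of claims for the whole project (assumed design).
_claimed_permissions = set()


def claim_permissions(*permissions):
    """Record permissions the user asserts they hold, for local dummy-data runs."""
    for permission in permissions:
        if not isinstance(permission, str):
            raise TypeError(f"Permissions must be strings, got: {permission!r}")
    _claimed_permissions.update(permissions)


def unclaimed(required):
    """Return the required permissions that have not been claimed, sorted."""
    return sorted(set(required) - _claimed_permissions)


claim_permissions("appointments")
still_needed = unclaimed(["appointments", "waiting_list"])
```

With this shape the call can sit anywhere in the user's Python file, since it does not need a reference to a particular dataset or measure object.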

Note also the deliberate decision to use strings for permissions rather than a dedicated enum type. The autocomplete benefits of enums are largely obviated by the fact that the error message gives the user exactly the code they need to copy. And the typo-resistance is less important given that mistyped permissions will result in immediate permissions failures anyway. Attempting to use a dedicated Permission type turned out to be architecturally awkward and made for a more cumbersome user story as well, so strings felt like the better compromise.

Closes: #2514 #2515

cloudflare-workers-and-pages · 2025-09-19T16:22:01Z

Deploying databuilder-docs with Cloudflare Pages

Latest commit:	`a476462`
Status:	✅ Deploy successful!
Preview URL:	https://9e5fc93d.databuilder.pages.dev
Branch Preview URL:	https://evansd-table-permissions.databuilder.pages.dev

suzannehamilton

I've just had a look to see the shape of the change, so these comments are fairly bikesheddy rather than a thorough review!

suzannehamilton · 2025-09-24T15:36:52Z

docs/includes/generated_docs/language__permissions.md

+
+    from ehrql import claim_permissions
+
+    claim_permissions("some_permission", "another_permission")


Will researchers know what values to use? Would it be worth linking to the schema docs below and/or including a couple of real examples like "waiting_list" and "isaric"?

My thought was that the first time researchers would encounter this function would be from the warning (later error) message they get locally which will tell them exactly what names to use. And I worried that the real names might be a bit cryptic to anyone not familiar with the particular tables in question. I wanted to have something in the docstring, but I don't think it's ever going to serve as a comprehensive introduction to the topic.

suzannehamilton · 2025-09-24T16:05:58Z

ehrql/query_model/nodes.py

 class SelectTable(ManyRowsPerPatientFrame):
    name: str
    schema: TableSchema
+    required_permission: str | None = None


Is it safer for the default permission to be some placeholder value rather than None, so that all tables have to override it? Or is this something we'll definitely consider when we add a new table?

I think the default here, in the query model, has to be None otherwise every single table we ever construct (which we do many times in tests) needs to be updated.

If we're worried about forgetting we could have a separate check that looks at every table in our official schemas and make sure they all explicitly specify a required_permission attribute, which may be None for many tables.
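A sketch of the kind of separate check suggested here, assuming table classes carry an inner `_meta` class as in the PR description. The helper name and the example classes are hypothetical.

```python
def tables_missing_explicit_permission(table_classes):
    """Return names of table classes that do not set `required_permission`.

    Setting it to None counts as explicit; omitting it does not.
    """
    missing = []
    for cls in table_classes:
        meta = cls.__dict__.get("_meta")
        if meta is None or "required_permission" not in vars(meta):
            missing.append(cls.__name__)
    return missing


# Hypothetical stand-ins for schema table classes
class patients:
    class _meta:
        required_permission = None  # explicitly unrestricted


class appointments:
    class _meta:
        required_permission = "appointments"


class medications:
    pass  # nobody decided either way


forgotten = tables_missing_explicit_permission([patients, appointments, medications])
```

This keeps the query model default as None while still forcing a deliberate choice for every table in the official schemas.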

suzannehamilton · 2025-09-24T16:08:30Z

ehrql/permissions.py

+            f"permissions assigned by the OpenSAFELY team. For more information see:\n"
+            f"https://docs.opensafely.org/ehrql/reference/language/#permissions"
+        )
+        # For the inital rollout of this feature we issue a warning locally but continue


Typo: inital

suzannehamilton · 2025-09-24T16:10:34Z

ehrql/permissions.py

+            f"https://docs.opensafely.org/ehrql/reference/language/#permissions"
+        )
+        # For the inital rollout of this feature we issue a warning locally but continue
+        # running. Eventually we want to make this a hard error so that it can't be


👍

Is doing this part of #2515 or should we make a new issue to make sure we don't forget it?

I was going to wait for feedback from the co-pilots before creating an issue for this because I'm not quite sure what steps we need to do first before we can reasonably make this a hard error. But yeah, I think it's a separate step which warrants a separate ticket.

bloodearnest

This broadly makes sense to me. Interesting to see how it is plumbed in to the table definitions and qm.

Left a couple of questions, more for my own understanding of future intent.

bloodearnest · 2025-09-30T07:52:35Z

tests/functional/test_generate_measures.py

+        required_permission = "special_perm"
+
+
+@function_body_as_string


lol, I had not seen this decorator before! Evil/Genius hack! I love it.

Thanks, I was quite pleased with that. Makes defining test fixtures much more pleasant I think.

bloodearnest · 2025-09-30T07:54:53Z

ehrql/query_language.py

        raise Error("Schema class must subclass either `PatientFrame` or `EventFrame`")

+    try:
+        table_name = cls._meta.table_name


I think that using an inner class is fine, we at least have the precedent of Django to follow, folks should be somewhat familiar.

But for my own education, could you give an example of a speculative meta inner class method that would be used for future dummy data?

Good question. I haven't got as far as thinking what the API might look like. But there are a couple of tables which are currently special-cased in the dummy data generator:

ehrql/ehrql/dummy_data/generator.py

Lines 203 to 223 in 29f56df

    def rows_for_patients(self, table_info):
        row = {
            "date_of_birth": self.date_of_birth,
            "date_of_death": self.date_of_death,
        }
        # Apply any FirstOfMonth constraints
        for key, value in row.items():
            if key in table_info.columns and value is not None:
                if table_info.columns[key].get_constraint(Constraint.FirstOfMonth):
                    row[key] = value.replace(day=1)
        return [row]

    def rows_for_practice_registrations(self, table_info):
        # TODO: Generate more interesting registration histories; for now, we just
        # assume that every patient is permanently registered with a single practice
        # from birth
        row = {
            "start_date": self.events_start,
            "end_date": None,
        }
        return [row]

And my thought was that we can move some of this logic to the table definitions themselves, and then apply it more broadly to different kinds of table.

bloodearnest · 2025-09-30T07:57:05Z

ehrql/permissions.py

+
+
+def parse_permissions(environ):
+    return set(environ.get("EHRQL_PERMISSIONS", "").split(","))


AIUI, this supports a single-purpose list, e.g. dataset permissions.

If we want to do T1OO or old-patient-id flags, presumably we'd need to either have a separate env var for each, or have a more structured value for EHRQL_PERMISSIONS. This is obviously out of scope for this PR, but I was wondering how you had been thinking to handle that when we get to it?

Well, strictly speaking I think we can still handle those things with just a flat list of permissions e.g. t1oo,untagged_patient_ids,event_level_data. There's nothing specifically about datasets in the permissions, that just happens to be the only thing which is currently checked.

But I do agree that we may want more structured data here and I wondered about switching to a JSON list rather than comma-separated strings. I figured it would be easy enough to do that later and I didn't want to over-complicate things. But now you've raised it, maybe we would be better off tackling this now. It would be a trivial change in the controller.

evansd · 2025-10-01T08:17:35Z

Prompted by Simon's question, I've switched the format of the EHRQL_PERMISSIONS variable to be JSON rather than comma-separated strings. This gives us a bit more future-proofing and generally seems more robust. This means we need to merge and deploy this RAP Controller change first:

Use JSON to encode ehrQL permissions job-runner#1190
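To make the difference between the two encodings concrete, here is a small side-by-side sketch. `parse_comma_separated` reproduces the function body from the diff quoted in the review thread; `parse_json` is only an assumed shape for the JSON version, not the merged code.

```python
import json


def parse_comma_separated(environ):
    # Body as quoted in the review thread above
    return set(environ.get("EHRQL_PERMISSIONS", "").split(","))


def parse_json(environ):
    # Assumed JSON equivalent: the variable holds a list such as '["a", "b"]'
    return set(json.loads(environ.get("EHRQL_PERMISSIONS", "[]")))


# With the variable unset, splitting an empty string yields [""], so the
# comma-separated version returns a set containing one empty string.
unset_old = parse_comma_separated({})
unset_new = parse_json({})
```

A JSON value also leaves room to move from a flat list to something more structured later without inventing a new delimiter scheme.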

We need to be able to do this when enforcing permissions. This is a slight improvement on the status quo I think in that we now at least have a proper type to represent a collection of measures and some consistency between `Dataset()` and `Measure()` in that both now have a `_compile()` method. Measures are still in a strange half-state in being a lot like query model objects, but not actually being part of the query model. But sorting that all out is beyond the scope of this PR.

We do this via a Django-ish approach of defining an inner class. There are other approaches we could take, but the big advantage of using a class is that it makes it easy to include methods among the metadata which I think we'll want to do for dummy data generation purposes. In the first instance the only metadata property we support is supplying a table name which differs from the class name.
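As a rough illustration of the inner-class approach, the sketch below reads metadata from a `_meta` class and rejects unrecognised attribute names, since a misspelled attribute on an inner class would otherwise be silently ignored. The `read_table_meta` helper and the `ALLOWED_META_ATTRS` set are hypothetical names, not the ones used in ehrQL.

```python
ALLOWED_META_ATTRS = {"table_name", "required_permission"}


def read_table_meta(cls):
    """Collect metadata from an inner `_meta` class, rejecting unknown names."""
    meta = cls.__dict__.get("_meta")
    attrs = {}
    if meta is not None:
        # Skip dunder entries that Python adds to every class namespace
        attrs = {k: v for k, v in vars(meta).items() if not k.startswith("__")}
    unknown = sorted(set(attrs) - ALLOWED_META_ATTRS)
    if unknown:
        raise TypeError(
            f"Unknown _meta attributes on {cls.__name__}: {', '.join(unknown)}"
        )
    return {
        # Fall back to the class name when no explicit table name is given
        "table_name": attrs.get("table_name", cls.__name__),
        "required_permission": attrs.get("required_permission"),
    }


class some_table:
    class _meta:
        required_permission = "some_restricted_dataset"


class typo_table:
    class _meta:
        required_permision = "oops"  # deliberate typo


meta = read_table_meta(some_table)
```

Calling `read_table_meta(typo_table)` raises a `TypeError` naming the misspelled attribute rather than quietly treating the table as unrestricted.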

When referencing this in the user docs I've tried to use some suitable generic text which doesn't suggest that you can just ask for permission and expect to be granted it. We may well want to tweak this on a per-table basis though, depending on the exact reason for the restriction.

I had previously tried to auto-generate the documentation here, but I think we're better off having the flexibility to explain different permissions in different ways. However I don't want us to be able to forget to document them, or to typo the permission name, so we now enforce this at documentation build time.

This is now superseded by the ehrQL permissions system: * opensafely-core/ehrql#2543 Note that this PR should not be merged until the related `research-action` PR has been merged, which removes the one usage of the `opensafely check` command: * opensafely-core/research-action#115 Note that we deliberately leave the `repository_permissions.yaml` file in place to avoid triggering errors on clients which haven't yet updated and are still trying to fetch this file. I've set a reminder to remove this file in a couple of weeks.

evansd force-pushed the evansd/table-permissions branch from a2b83f1 to 5643572 Compare September 19, 2025 16:20

github-actions bot deployed to databuilder-docs (Preview) September 19, 2025 16:21 View deployment

evansd force-pushed the evansd/table-permissions branch from 5643572 to 28a9cce Compare September 19, 2025 16:30

github-actions bot deployed to databuilder-docs (Preview) September 19, 2025 16:31 View deployment

evansd force-pushed the evansd/table-permissions branch from 28a9cce to 2e033e3 Compare September 19, 2025 16:56

github-actions bot deployed to databuilder-docs (Preview) September 19, 2025 16:57 View deployment

evansd force-pushed the evansd/table-permissions branch from 2e033e3 to 733b48b Compare September 22, 2025 11:50

github-actions bot deployed to databuilder-docs (Preview) September 22, 2025 11:51 View deployment

evansd force-pushed the evansd/table-permissions branch from 733b48b to 5faa3ad Compare September 22, 2025 12:27

github-actions bot deployed to databuilder-docs (Preview) September 22, 2025 12:27 View deployment

evansd added a commit to opensafely-core/job-runner that referenced this pull request Sep 22, 2025

Handle some more specific ehrQL exit codes

9cc0aa8

See: * opensafely-core/ehrql#2543

evansd mentioned this pull request Sep 22, 2025

Handle some more specific ehrQL exit codes opensafely-core/job-runner#1188

Merged

evansd marked this pull request as ready for review September 22, 2025 12:46

evansd force-pushed the evansd/table-permissions branch from 5faa3ad to 39465cd Compare September 22, 2025 16:38

github-actions bot deployed to databuilder-docs (Preview) September 22, 2025 16:39 View deployment

suzannehamilton reviewed Sep 24, 2025

evansd force-pushed the evansd/table-permissions branch from 39465cd to 68083ea Compare September 25, 2025 09:57

github-actions bot deployed to databuilder-docs (Preview) September 25, 2025 09:58 View deployment

evansd added a commit to opensafely-core/job-runner that referenced this pull request Sep 26, 2025

Handle some more specific ehrQL exit codes

695aafb

See: * opensafely-core/ehrql#2543

bloodearnest approved these changes Sep 30, 2025

evansd force-pushed the evansd/table-permissions branch from 68083ea to c1a57c0 Compare October 1, 2025 08:15

github-actions bot deployed to databuilder-docs (Preview) October 1, 2025 08:16 View deployment

evansd force-pushed the evansd/table-permissions branch from c1a57c0 to d8caaf8 Compare October 1, 2025 09:05

github-actions bot deployed to databuilder-docs (Preview) October 1, 2025 09:06 View deployment

evansd added a commit to opensafely-core/job-runner that referenced this pull request Oct 1, 2025

Handle some more specific ehrQL exit codes

dde0a3e

See: * opensafely-core/ehrql#2543

evansd mentioned this pull request Oct 1, 2025

Make local missing permissions a hard error rather than a warning #2553

Open

evansd added a commit to opensafely-core/job-runner that referenced this pull request Oct 1, 2025

Handle some more specific ehrQL exit codes

0ac19a3

See: * opensafely-core/ehrql#2543

Fix incorrect docstring

d360c78

evansd added 16 commits October 1, 2025 12:39

Small refactor to make future testing easier

b97fd13

Move indent() helper to generic log utils module

6a7a74e

Use table metadata to specify all non-standard table names

2be46ee

This is slightly cleaner than the previous approach.

Add validation checks for inner metadata class

974f564

This offsets what is, to my mind, the biggest downside of the inner class approach which is silent failure on typos.

Support specifying permissions in table metadata

d30e134

feat: Enforce permissions when running against real data

51b4c4f

Add functional tests for permissions enforcement

9932797

feat: Allow users to "claim" permissions locally

5c22276

Add functional tests for permissions claims

098f65b

Add documentation

444c17a

Use exit codes to indicate different kinds of error

148118c

It's helpful if we can surface the error type to the user directly in Job Server, as we do for certain specific kinds of database error.

Run just generate-docs

a476462

evansd force-pushed the evansd/table-permissions branch from d8caaf8 to a476462 Compare October 1, 2025 11:39

github-actions bot deployed to databuilder-docs (Preview) October 1, 2025 11:40 View deployment

evansd enabled auto-merge October 1, 2025 11:41

evansd merged commit 7e5a1af into main Oct 1, 2025
9 checks passed

evansd deleted the evansd/table-permissions branch October 1, 2025 11:56

evansd added a commit to opensafely-core/research-action that referenced this pull request Oct 2, 2025

Remove "Check Datasets" step

134b694

This is now superseded by the ehrQL permissions system: * opensafely-core/ehrql#2543

evansd mentioned this pull request Oct 2, 2025

Remove "Check Datasets" step opensafely-core/research-action#115

Merged

evansd mentioned this pull request Oct 2, 2025

Remove check command opensafely-core/opensafely-cli#364

Merged

evansd added a commit to opensafely-core/research-action that referenced this pull request Oct 3, 2025

Remove "Check Datasets" step

184b542

This is now superseded by the ehrQL permissions system: * opensafely-core/ehrql#2543

evansd added a commit to opensafely-core/research-action that referenced this pull request Oct 3, 2025

Remove "Check Datasets" step

2ebff32

This is now superseded by the ehrQL permissions system: * opensafely-core/ehrql#2543

