Skip to content

Conversation

marslanabdulrauf
Copy link
Contributor

@marslanabdulrauf marslanabdulrauf commented Sep 18, 2025

What are the relevant tickets?

https://github.com/mitodl/hq/issues/6962

Description (What does it do?)

This PR:

  1. adds receivers to create Github repo(if enabled) on course creation signal
  2. creates Github repo(if enabled) on course re-run creation

Screenshot

image

How can this be tested?

  1. Install and configure ol_openedx_git_auto_export
  2. Run Migration in CMS: tutor dev exec cms ./manange.py cms migrate

Testing management command

  1. You must have a course with giturl set in its advance settings -- It shouldn't have gitrepo entry in admin
  2. Run Management command tutor dev exec cms ./manage.py cms migrate_giturl
  3. Verify that the giturl is migrate to CourseGitRepo model
  4. Make sure that giturl is migrated for re-run courses as well

Auto repo creation testing

  1. Install and configure ol_openedx_git_auto_export
  2. Make sure to:
    • Enable ENABLE_AUTO_GITHUB_REPO_CREATION
    • Set GITHUB_ORG_API_URL
    • Set GITHUB_ACCESS_TOKEN -- This token should have repo read/write and pull/push access.
  3. Create new course and verify from your organization that a new repository is created.
  4. Check admin and see if the entry is added for this newly created course
  5. Create re-run of any course and repeat steps 3 and 4 to verify
  6. Make any change and publish to make sure you are getting latest course changes/commits

@marslanabdulrauf marslanabdulrauf marked this pull request as ready for review September 19, 2025 12:05
"Course published with auto-export enabled. Starting export... (course id: %s)", # noqa: E501
course_key,
)
# HACK: To create auto git repo for Re-runs as it does not emit COURSE_CREATED signal # noqa: E501 FIX004
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it a bug that reruns don't emit COURSE_CREATED signal?

Is there a different signal emitted when a rerun is created? @asadali145

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a different signal emitted when a rerun is created? @asadali145

I am not sure about a specific signal, but we listen for the post_save signal on changes in CourseRerunState model. It actually tracks the status of reruns, like in_progress, failed, and completed. When a rerun state is completed, then we update our course sync mappings.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I discussed this with @marslanabdulrauf and @arslanashraf7; reruns do not trigger the COURSE_CREATED signal, which looks like a bug to me. The reason is that course reruns follow a completely different code path, and that never sends a COURSE_CREATED signal.

We can listen for CourseRerunState post save here like we do for the course sync plugin.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @asadali145 I will update it to catch course re-run creation through post_save signal 👍

@pdpinch
Copy link
Member

pdpinch commented Sep 19, 2025

When we create a repo, I want to make sure that it is set to private in Github. Course exports are considered sensitive since they can contain exam solutions. This is critical, so should be in this PR.

If possible, I'd also like to set the repo description at creation time, with a string like f"{display_name}, exported from {studio_url}" so repo users can easily see which repo goes with which course and can link to it in Studio (this could be a follow-up PR).

@cachob anything else we learned recently about creating repos?

@marslanabdulrauf
Copy link
Contributor Author

The repository is set to Private.

I can update the repository description, current set to very minimal description, can be seen from attached SS as well:

image

}
payload = {
"name": course_id_slugified,
"description": f"Git repository for {course_id!s}",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do something like f"{display_name}, exported from {studio_url}" so repo users can easily see which repo goes with which course and can link to it in Studio?


def listen_for_course_created(**kwargs):
course_id = kwargs.get("course").course_key
course_id_slugified = slugify(str(course_id))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give an example of the output of slugify for a typical course key?

I'm concerned that we might be dropping all the +s, which act as delimiters and make it easier to read and parse.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Github converts all +, / or : to - and there could be other characters, I haven't tested it for all of them. So I slugify it. Slugify returns course-v1xprogh101101_1 for course-v1:xpro+gh101+101_1

image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested it with all special characters that I could type from my laptop and other than ., - and _ it converts all of them into -
Let me add a formatter to convert course_id into Github supported format, it will increase readability

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also drop the course-v1- from the start of the repo name? It will be the same for all repos. (unless maybe someday there is a course-v2).

@marslanabdulrauf
Copy link
Contributor Author

@pdpinch I updated the repository name and added a screenshot of repo page in PR description

@asadali145 asadali145 self-assigned this Sep 22, 2025
Copy link
Contributor

@asadali145 asadali145 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a high-level code review. I will proceed with the testing in the next review.

}
# Set when ENABLE_AUTO_GITHUB_REPO_CREATION is true
GITHUB_ORG_API_URL = "https://api.github.com/orgs/<GITHUB_ORG_NAME>" # For GitHub Enterprise, change the URL accordingly
GITHUB_ACCESS_TOKEN = "<GITHUB_PERSONAL_ACCESS_TOKEN>" # Token must have repo creation permissions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we list down the steps and exact permissions required?

----------------------
- Open studio then course and go to advanced settings.
- Choose field GIT URL and add you OLX git repo. For example ``[email protected]:<GITHUB_USERNAME>/edx4edxlite.git``.
- Open studio `admin <http://studio.local.openedx.io:8001/admin/ol_openedx_git_auto_export/coursegitrepo/>`_
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Open studio `admin <http://studio.local.openedx.io:8001/admin/ol_openedx_git_auto_export/coursegitrepo/>`_
- Open studio admin at `/ol_openedx_git_auto_export/coursegitrepo/`

- Open studio then course and go to advanced settings.
- Choose field GIT URL and add you OLX git repo. For example ``[email protected]:<GITHUB_USERNAME>/edx4edxlite.git``.
- Open studio `admin <http://studio.local.openedx.io:8001/admin/ol_openedx_git_auto_export/coursegitrepo/>`_
- Add your course_id and in GIT URL add you OLX git repo. For example ``[email protected]:<GITHUB_USERNAME>/edx4edxlite.git``.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Add your course_id and in GIT URL add you OLX git repo. For example ``[email protected]:<GITHUB_USERNAME>/edx4edxlite.git``.
- Add your course_id and in the GIT URL, add your OLX git repo. For example ``[email protected]:<GITHUB_USERNAME>/edx4edxlite.git``.

list_per_page = 20


admin.site.register(CourseGitRepo, CourseGitRepoAdmin)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use the @admin.register decorator here.

Comment on lines 84 to 89
"""Create a GitHub repository for the given course key.
Args:
course_key (CourseKey): The course key for which to create the repository.
Returns:
str or None: The SSH URL of the created repository, or None if creation failed.
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Create a GitHub repository for the given course key.
Args:
course_key (CourseKey): The course key for which to create the repository.
Returns:
str or None: The SSH URL of the created repository, or None if creation failed.
"""
"""
Create a GitHub repository for the given course key.
Args:
course_key (CourseKey): The course key for which to create the repository.
Returns:
str or None: The SSH URL of the created repository, or None if creation failed.
"""

return repo_name.replace("course-v1-", "")


def create_github_repo(course_key):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should move the repo creation to a Celery task.

)
return None

# SignalHandler.course_published is called before COURSE_CREATED signal
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a valid comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I faced issues but its not valid now as we have moved repo creation to post_save signal

"description": repo_desc,
"private": True,
"has_issues": False,
"has_project": False,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct key is has_projects

"auto_init": True,
}
response = requests.post(url, headers=headers, json=payload, timeout=30)
if response.status_code != 201: # noqa: PLR2004
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can use restframework status.201

@asadali145
Copy link
Contributor

We discussed this in the morning, and I am posting here on behalf of @marslanabdulrauf as he is out sick and could not post this comment.

@pdpinch Do we want to auto-create GitHub repos for all the new courses, including UAI and non-UAI courses?

If the goal is to create repositories only for UAI courses, would it make sense to use UAI_ as the standard prefix for repo creation logic, or did you have a different approach in mind?

CC: @marslanabdulrauf @arslanashraf7

@pdpinch
Copy link
Member

pdpinch commented Sep 24, 2025

Do we want to auto-create GitHub repos for all the new courses, including UAI and non-UAI courses?

All courses should get a repo, UAI and non-UAI.

In particular, we need to have version control of all UAI_SOURCE courseruns.

@marslanabdulrauf
Copy link
Contributor Author

Thanks for clarifying @pdpinch

I have also updated migration command to:

  1. Migrate giturl from advance settings to our custom managed table CourseGithubRepository
  2. Create new repo for courses with duplicate giturl
  3. Create new repo if the course does not have any giturl

Copy link
Contributor

@asadali145 asadali145 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I have noted that we mostly use git instead of GitHub, because most of the code is specific to GitHub. Therefore, we should use GitHub in our naming conventions and documentation.

I have tested it, and it looks good.


class Command(BaseCommand):
help = """
Migrate git URLs from courses advanced settings to CourseGithubRepository model
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Migrate git URLs from courses advanced settings to CourseGithubRepository model
Migrate git URLs from the course advanced settings to the CourseGithubRepository model

from opaque_keys.edx.django.models import CourseKeyField


class CourseGithubRepository(TimeStampedModel):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
class CourseGithubRepository(TimeStampedModel):
class CourseGitHubRepository(TimeStampedModel):

Model to store Git repository information for courses.
"""

course_id = CourseKeyField(max_length=255, unique=True, primary_key=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
course_id = CourseKeyField(max_length=255, unique=True, primary_key=True)
course_key = CourseKeyField(max_length=255, unique=True, primary_key=True)

Comment on lines 5 to 6
creation, and rerun events to automatically export course content to Git repositories
and create GitHub repositories as needed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
creation, and rerun events to automatically export course content to Git repositories
and create GitHub repositories as needed.
creation, and rerun events to automatically create GitHub repositories and export course content to them.

Comment on lines 133 to 137
else:
LOGGER.error(
"Failed to retrieve SSH URL for GitHub repository for course %s",
course_key,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will still go to sentry and I think we should add more information here to why we could not get the URL.

return git_repo_export_dir


def github_repo_name_format(course_id_str):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def github_repo_name_format(course_id_str):
def github_repo_name_format(course_key_str):

- Maximum length is 100 characters
Args:
course_id_str (str): The course ID string to format
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
course_id_str (str): The course ID string to format
course_id_str (str): The course key string to format

Comment on lines 70 to 72
repo_name = "".join(
char if char.isalnum() or char in "-_." else "-" for char in course_id_str
).strip("-")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be done with RE,

Suggested change
repo_name = "".join(
char if char.isalnum() or char in "-_." else "-" for char in course_id_str
).strip("-")
repo_name = re.sub(r"[^A-Za-z0-9_.-]", "-", course_id_str).strip("-")

----------------------
- Open studio then course and go to advanced settings.
- Choose field GIT URL and add you OLX git repo. For example ``[email protected]:<GITHUB_USERNAME>/edx4edxlite.git``.
- Open studio admin at `/ol_openedx_git_auto_export/coursegitrepo/`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Open studio admin at `/ol_openedx_git_auto_export/coursegitrepo/`
- Open studio admin at `/ol_openedx_git_auto_export/coursegitrepo/`

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, can you please fix a typo at line 43 in work authentication

----------------------
- Open studio then course and go to advanced settings.
- Choose field GIT URL and add you OLX git repo. For example ``[email protected]:<GITHUB_USERNAME>/edx4edxlite.git``.
- Open studio admin at `/ol_openedx_git_auto_export/coursegitrepo/`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to update this URL path.

2. Extract git URLs from the course modulestore data
3. Create CourseGitHubRepository records for courses with git URLs
4. Handle duplicate git URLs by creating new GitHub repositories
5. Report on courses without git URLs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is invalid as we are creating the repos for the courses with no git URL as well.


def has_delete_permission(self, request, obj=None): # noqa: ARG002
"""
Disable delete permission for CourseGitHubRepository objects.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also mention the reason for disabling the deletion, i.e. to avoid stale repos

Comment on lines 4 to 5
This module defines the CourseGitHubRepository model which stores the mapping between
OpenedX courses and their GitHub repositories for automated export functionality.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should remove this to be able to add more models in the future. These should go the model docs.

from opaque_keys.edx.django.models import CourseKeyField


class CourseGitHubRepository(TimeStampedModel):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to admit it here that this was better named CourseGitRepository, I mixed it with our repo creation but now I think it was better named before. If repo creation is not enabled then this would still work for any git repo or without the integration. My bad!

Copy link
Contributor

@asadali145 asadali145 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Thanks for making all the changes.

@arslanashraf7 You can have another quick look over the changes now.

export_to_git(course_module.id, course_module.giturl, user=user)
course_repo = CourseGitRepository.objects.get(course_key=course_key)
if course_repo.is_export_enabled:
LOGGER.debug(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
LOGGER.debug(
LOGGER.info(

Copy link
Contributor

@arslanashraf7 arslanashraf7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just skimmed through the code and left some comments. I did no functionality test it assuming @asadali145 has already did the testing so I did not want to put double effort on testing if not needed.

GITHUB_ACCESS_TOKEN = "GITHUB_ACCESS_TOKEN" # noqa: S105

COURSE_RERUN_STATE_SUCCEEDED = "succeeded"
REPOSITORY_NAME_MAX_LENGTH = 100
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this was some decision-making on this? We should probably mention this in the README as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is max limit by GitHub itself.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, let's add a code comment above and mention that this is the max limit from GitHub. That would be helpful for future readers.

Examples:
# Migrate all courses
python manage.py migrate_giturl
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use --all instead of setting the default command run to run for all courses.

courses = CourseOverview.objects.all().order_by("created")
if course_keys:
courses = courses.filter(id__in=course_keys)
seen_giturls = set()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
seen_giturls = set()
processed_giturls = set()

)
)

async_create_github_repo.delay(str(course.id), export_course=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should wait on tasks to complete to represent the actual command result. Currently, the command will post status of the start of the jobs it would not wait if the task failed or succeeded.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repo creation can take time and we might have 500+ repos to create. @asadali145, I think, added estimation time somewhere that it could take 30+ minutes for this management command to complete.

def plugin_settings(settings):
"""Settings for the git auto export plugin.""" # noqa: D401
settings.GIT_REPO_EXPORT_DIR = "/openedx/export_course_repos"
settings.GITHUB_ORG_API_URL = "https://api.github.com/orgs/edx"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should default it to mitodl org.

github_repo_name_format,
)

LOGGER = get_task_logger(__name__)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: We are using log or logger naming convention mostly.

course_module.id,
)
except CourseGitRepository.DoesNotExist:
LOGGER.exception(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NOTE: This will be logged on Sentry. I don't know if we want to log it there since we are handling it by creating a repo.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually after migration command, we don't expect any course without Github url, this exception is added if we get here we should verify that why this course is missing repository


course_key = CourseKey.from_string(course_key_str)
course_id_slugified = github_repo_name_format(str(course_key))
if not settings.FEATURES.get(ENABLE_AUTO_GITHUB_REPO_CREATION):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion, This check is best placed before starting a task. e.g. in case of command we might end up starting quite a lot tasks but they would all get added in queue even knowing that they will not succeed.

GITHUB_ACCESS_TOKEN = "GITHUB_ACCESS_TOKEN" # noqa: S105

COURSE_RERUN_STATE_SUCCEEDED = "succeeded"
REPOSITORY_NAME_MAX_LENGTH = 100
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, let's add a code comment above and mention that this is the max limit from GitHub. That would be helpful for future readers.


def handle(self, *args, **options): # noqa: ARG002
if not is_auto_repo_creation_enabled():
return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of a simple return here, we should print an error saying that the repo creation is disabled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have that logging in the function itself

return

course_keys = options.get("course_keys", [])
all_courses = options.get("all", False)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: We used store_true. The options would have a default of False if this arg was not provided.

Suggested change
all_courses = options.get("all", False)
all_courses = options.get("all")

self.style.ERROR(
"Error: You must specify either course keys or use the --all flag.\n" # noqa: E501
"Examples:\n"
" python manage.py migrate_giturl --all\n"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have seen issues with hardcoded file names in the past. If we are going to write examples, we should mention them in the command help or use the dynamic file naming in our print statements. i.e. command_name = os.path.basename(__file__).replace(".py", "")

Suggested change
" python manage.py migrate_giturl --all\n"
f" python manage.py {command_name} --all\n"

Comment on lines 116 to 132
if giturl and giturl not in processed_giturls:
processed_giturls.add(giturl)
migration_logs["existing_repos"].append(
{"course_id": str(course.id), "giturl": giturl}
)
CourseGitRepository.objects.get_or_create(
course_key=course.id,
defaults={"git_url": giturl},
)
continue

if giturl and giturl in processed_giturls:
migration_logs["duplicate_repos"].append(
{"course_id": str(course.id), "duplicate_giturl": giturl}
)
else:
migration_logs["no_giturl"].append({"course_id": str(course.id)})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be simpler, less if conditionals.

Suggested change
if giturl and giturl not in processed_giturls:
processed_giturls.add(giturl)
migration_logs["existing_repos"].append(
{"course_id": str(course.id), "giturl": giturl}
)
CourseGitRepository.objects.get_or_create(
course_key=course.id,
defaults={"git_url": giturl},
)
continue
if giturl and giturl in processed_giturls:
migration_logs["duplicate_repos"].append(
{"course_id": str(course.id), "duplicate_giturl": giturl}
)
else:
migration_logs["no_giturl"].append({"course_id": str(course.id)})
if not giturl:
migration_logs["no_giturl"].append({"course_id": str(course.id)})
elif giturl not in processed_giturls:
processed_giturls.add(giturl)
migration_logs["existing_repos"].append(
{"course_id": str(course.id), "giturl": giturl}
)
CourseGitRepository.objects.get_or_create(
course_key=course.id,
defaults={"git_url": giturl},
)
continue
migration_logs["duplicate_repos"].append(
{"course_id": str(course.id), "duplicate_giturl": giturl}
)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good but it has minor logical issue when we don't have giturl. We will add it in no_giturl as well as in duplicate_repo which is not correct.

I will fix and update the code.

Comment on lines +113 to +120
course_module = modulestore().get_course(course.id, depth=1)
giturl = course_module.giturl
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is prone to error. We should use try: except approach so that one bad course doesn't break the whole command.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This giturl is string field, we should always get this in any case.

Comment on lines 183 to 187
self.stdout.write(f"\n✅ EXISTING REPOSITORIES ({total_existing}):")
for repo in logs["existing_repos"]:
self.stdout.write(
self.style.SUCCESS(f" • {repo['course_id']}: {repo['giturl']}")
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be a utility method, that takes a list, a flag representing it the log style, and the message. We can use this method in every other place below.

from opaque_keys.edx.django.models import CourseKeyField


class CourseGitRepository(TimeStampedModel):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are disabling the deletion in Django Admin above. Should we disable the delete directly in the model so that it is protected against a django shell delete as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not add full delete protection. As admins can remove repos and then can delete the entry from shell.

headers = {
"Accept": "application/vnd.github+json",
"Authorization": f"Bearer {settings.GITHUB_ACCESS_TOKEN}",
"X-GitHub-Api-Version": "2022-11-28",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is any specific reason for this header and it's value since it says 2022, we should add a code comment for future readers about what & why it is like this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its from the Github official docs. This is the latest version

image


# Create new repository for courses with duplicate or no giturl
self.stdout.write(f"Processing repository creation for {course.id}...")
task = async_create_github_repo.delay(str(course.id), export_course=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an anti pattern. Calling each task in the for loop and waiting for its response still defeats the purpose of using celery for running parallel background tasks. We should batch our requests to celery let it run the tasks in parallel, and return the results back to the commands. In both cases, the command will wait. But in the latter, it will run the tasks in parallel and wait instead of running them in series and waiting.

We can use Celery Groups to group the tasks, wait on their response and print the results. You can see some examples of these in this search

@marslanabdulrauf marslanabdulrauf force-pushed the marslan/6962-git-auto-export-edx-integration branch from 7d832f2 to e730e76 Compare October 13, 2025 22:06
@marslanabdulrauf marslanabdulrauf force-pushed the marslan/6962-git-auto-export-edx-integration branch from e730e76 to 4b43be4 Compare October 13, 2025 22:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants