Updated CourseInfo uses multithreading instead of multiprocessing #147
base: main
Conversation
from .abstract import EnhanceBase
from db.Models import DepartmentDistribution, ClassDistribution, Libed, Session, and_
import requests
import httpx
Not imported in pipenv
Also, we already use aiohttp for RMP. I would encourage using that so we keep our library surface area low. Check the RMP folder for what that looks like.
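A rough sketch of what the fetch could look like with aiohttp; the endpoint URL and the fetch_courses name are placeholders, and the real pattern should mirror whatever the RMP code already does:

import aiohttp

async def fetch_courses(dept: str, campus: str) -> list[dict]:
    # Placeholder endpoint and params -- copy the session handling from the RMP enhancer.
    async with aiohttp.ClientSession() as session:
        async with session.get(
            "https://example.edu/api/courses",
            params={"dept": dept, "campus": campus},
        ) as response:
            response.raise_for_status()
            data = await response.json()
            return data.get("courses", [])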
finally:
    session.close()

async def enhance_helper(self, dept_dist: DepartmentDistribution, semaphore: asyncio.Semaphore) -> None:
does not fit abstract class structure.
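I'm not sure exactly what EnhanceBase pins down, but one way to keep the public surface unchanged is to create the semaphore inside enhance() and keep the helper private (sketch, method names assumed):

async def enhance(self, dept_dists: list[DepartmentDistribution]) -> None:
    # The semaphore stays an implementation detail rather than a public parameter.
    semaphore = asyncio.Semaphore(4)
    await asyncio.gather(
        *(self._enhance_helper(dept_dist, semaphore) for dept_dist in dept_dists)
    )

async def _enhance_helper(self, dept_dist: DepartmentDistribution, semaphore: asyncio.Semaphore) -> None:
    async with semaphore:
        ...  # fetch and process one department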
with Pool() as pool:
    pool.map(self.enhance_helper, dept_dists)

semaphore = asyncio.Semaphore(9)  # Limit concurrent tasks to something under 5 due to rate limiting
Comment conflicts with implementation
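Either lower the limit or fix the comment, e.g. (assuming the rate limit really does require fewer than 5 concurrent requests):

semaphore = asyncio.Semaphore(4)  # Limit concurrent tasks due to rate limiting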
campus_str = str(campus)
def _process_course_data(self, courses: list[dict], dept: str, campus: str) -> None:
    session = Session()
    try:
If even one course fails, the entire batch will be considered faulty.
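A sketch of per-course error handling so a single bad record gets skipped instead of sinking the whole department; _upsert_course is a hypothetical name for the per-course work done in _process_course_data:

for course in courses:
    try:
        self._upsert_course(session, course, dept, campus)
    except (KeyError, ValueError) as e:
        # Skip just this course instead of treating the whole batch as faulty.
        print(f"Skipping course in {dept}: {e}")
session.commit()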
    response.raise_for_status()
    req = response.json()
    courses = req.get("courses", [])
except httpx.HTTPStatusError as e:
I'd like to be able to tell from our logs whether this is a client error (4xx, e.g. forbidden) or a server error (5xx).
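For example (sketch; if this moves to aiohttp, the analogous exception is aiohttp.ClientResponseError with a .status attribute):

except httpx.HTTPStatusError as e:
    status = e.response.status_code
    kind = "client error (4xx)" if status < 500 else "server error (5xx)"
    print(f"Course fetch for {dept} failed with a {kind}: {status}")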
campus = dept_dist.campus
campus_str = str(campus)
def _process_course_data(self, courses: list[dict], dept: str, campus: str) -> None:
    session = Session()
Sessions should be created for every granular change. You're batching a massive amount of data into one commit.
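A sketch of more granular commits, one short-lived session per course instead of a single big batch (again with a hypothetical _upsert_course helper):

for course in courses:
    session = Session()
    try:
        self._upsert_course(session, course, dept, campus)
        session.commit()
    except Exception:
        session.rollback()
    finally:
        session.close()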
# Only process UMNTC and UMNRO campuses
if campus_str not in ["UMNTC", "UMNRO"]:
    return
courses = []
Do you think this is necessary? courses will be assigned a value if the try succeeds, and if it fails the function should just return as a no-op.
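That is, something like this sketch, returning from the except branch so the pre-initialization isn't needed:

try:
    response.raise_for_status()
    req = response.json()
    courses = req.get("courses", [])
except ValueError:
    print("Json malformed, icky!")
    return  # nothing to process, so just bail out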
    courses = req.get("courses", [])
except ValueError:
    print("Json malformed, icky!")
await asyncio.to_thread(self._process_course_data, courses, dept, campus)
This spawns a separate thread to run a synchronous function. Can you make _process_course_data async and just await it instead? That would reduce thread overhead.
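Roughly this shape (sketch; it assumes the body of _process_course_data won't block the event loop with long synchronous DB calls):

async def _process_course_data(self, courses: list[dict], dept: str, campus: str) -> None:
    ...  # same logic as today, just defined as a coroutine

# inside enhance_helper, instead of asyncio.to_thread(...):
await self._process_course_data(courses, dept, campus)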
def enhance(self, dept_dists: list[DepartmentDistribution]) -> None:
async def enhance(self, dept_dists: list[DepartmentDistribution]) -> None:
    """Enhance the data for a list of department distributions in a multiprocessing pool."""
Update comment.
In addition, you want to change the signature and logic of CourseDogEnhance. Even though we no longer use it, I would like to have backwards compatibility.
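One way to keep backwards compatibility is a thin synchronous wrapper around the new coroutine (sketch; enhance_async is a name I'm inventing here):

def enhance(self, dept_dists: list[DepartmentDistribution]) -> None:
    # Old synchronous entry point, preserved for existing callers.
    asyncio.run(self.enhance_async(dept_dists))

async def enhance_async(self, dept_dists: list[DepartmentDistribution]) -> None:
    ...  # the new asyncio-based implementation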
Once you merge in the changes from #148, can you also add the following to the pipenv so we force Python 3.14t usage?
With the free-threaded 3.14t build, the threads this asyncio code spawns (e.g. via asyncio.to_thread) can actually run in parallel instead of being serialized by the GIL.