-
Notifications
You must be signed in to change notification settings - Fork 4
feat(websoc): implement related courses from section code range #104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat(websoc): implement related courses from section code range #104
Conversation
Just confirmed; this PR as is does not fulfill the written feature request from the issue. Remember the flag was meant to set the following behavior: query Websoc {
websoc(
query: {
includeRelatedCourses: "true"
year: "2025"
quarter: Winter
sectionCodes: "35610"
}
) {
schools {
departments {
courses {
courseNumber
sections {
sectionType
sectionCode
}
}
}
}
}
} We would expect it to return every ICS 33 section (including the other lecture and the labs) this term, since ICS 33 is a "related course" to a section within the set of codes we specified. However it only returns the section we originally requested. |
Also, please make sure to update the title of your PR to comply with Conventional Commits. |
…string to a boolean and Preprocess `courseIds` to split a comma-separated string into an array.
…a subQuery passed into buildQuery
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why the new courseIds
parameter is necessary. It duplicates existing parameters and isn't documented so a user would have no idea what it does differently from the existing options or that it's only associated with related courses (and not meaningful otherwise) and mutually exclusive with sectionCodes
. Remember that the spec says that you should be able to accomplish related courses without any additional parameters (or forced lack thereof).
You also accumulate SQL data into JS and then use it to request SQL again, which usually incurs excess computing, memory, and network cost. It can and should be done purely in SQL. Off the top of my head a self join or subquery can be effective.
Hey Dante, Thanks for the feedback. I introduced the courseIds parameter as a bonus of my implementation based on Andrew’s guidance.
I realize that in the current implementation the courseIds value gets overwritten when both sectionCodes and Regarding the accumulation of SQL data in JavaScript before making another query: I understand the performance concern and the desire for a more SQL-centric solution. My current approach was designed to clearly separate query construction from execution. However, I'm open to refactoring the implementation to incorporate more logic directly in SQL—perhaps using a self-join or a subquery like you mentioned. Could you please share any specific suggestions or examples of how you'd like to see this implemented? Let me know if this approach works or if further adjustments are needed. |
Thanks for the clarification. I'd like to hear more of your approach and broader context, if you don't mind. Doing so would allow me to deeply understand and comment directly on the code.
What about existing
From what I see so far, the first DB query which is de-duplicated via
I agree that this is a good priority to have. I also think there may be a way to both have the cake and eat it by building conditions which also include a SQL sub-query or CTE when necessary. I argue that this would still be an acceptable separation of concerns since that expression isn't executed by the query building code.
The latter seems reasonable; to be clear I understand you want to make the inclusion of the But "impelementation to merge the courseIds" confuses me; can you clarify?
Of course. At a high level, you can reframe the First, when related courses functionality is not required; this case is the existing code from before this PR. Second, when related courses functionality is needed; this is the new case. You have the choice of when this case should trigger (see above). In this second case, you should wrap the conditions from case 1 in a sub-query, CTE, etc. This inner expression should fetch according to the conditions from case 1, but The queries in these two cases might look like this: -- case 1
SELECT something FROM websoc_section -- INNER JOIN something ON something... more joins, etc.
WHERE -- conditions
;
-- case 2
SELECT something FROM websoc_section -- INNER JOIN...
WHERE course_id IN (
SELECT course_id FROM websoc_section -- INNER JOIN something...
WHERE -- conditions
); Let me know if you have questions on any of the above. |
Thanks for the response Dante. Here some details answering your questions.
For sure, the basic approach for the step-through was: Just wanted to make sure I fully understand your feedback. The issue seems to be that the "subquery" I described isn’t actually a true SQL subquery but rather a separate database query executed in JavaScript. If I’m understanding correctly, the expectation is to handle this logic within the main SQL query itself, rather than fetching courseIds separately in JavaScript and passing them into buildQuery. Andrew initially mentioned that while the initial approach avoided a single SQL query due to the complexity of architecting it, he agrees that moving everything into SQL would likely bring performance benefits. Given that, I’ll be looking into refactoring this into a fully SQL-based solution..
I hadn’t considered department and courseNumber when implementing courseIds. Andrew and I didn’t plan to change those parameters, so at first, it seemed like courseIds might not have a distinct purpose—especially since department + courseNumber already allow course lookups. However, after discussing this with Andrew, he pointed out that the existing combination of department and courseNumber only allows looking up one department at a time, whereas courseIds could provide additional flexibility. Since this is new functionality that doesn’t exist yet, we can choose to keep it. I think the key question now is how to handle cases where users provide conflicting parameters (courseIds vs. department + courseNumber). I’ll need to determine how to prioritize those conditions in buildQuery. Let me know if you have any preferences on that!
Yeah, so the problem was when I had the three flags— I am also assuming this problem would also change due to the change in subquery implementation that I need to refactor. I had some quick follow up questions as well regarding this refactor:
I wanted to give you a heads-up that I should have the new implementation ready within the next week or so. I have my ICS 51 midterm this Wednesday, so I’m pretty packed with studying until then. Thanks for your time and detailed feedback! I’ll work on the refactors and reach out if I have any questions. |
That agrees with my understanding.
It should be in SQL, not necessarily the last query you do or any other query you might call the "main" query. But it should not cross the SQL-JS barrier and go back in again without a very good reason (e.g. you've proven it's actually faster).
Excellent.
They are not mutually exclusive. Of course we are working under the SemVer constraint that we will have no breaking changes. If they are present together, I argue disjunction (logical OR) is most appropriate.
Your decision to disallow the presence of all three parameters is mostly a concern of your validation step. Your further code, including SQL queries, can assume that invariant is upheld where applicable. As particularly applied to SQL, it would probably *reduce* the amount of cases you have to consider when building queries.
Prefer the ORM, since we have it. As I understand the problem statement this should be feasible unless you discover yet another uncovered corner case in Drizzle. If you have to use raw SQL, minimize its scope (e.g. if you can't express a single function, use raw SQL for only that function).
You should never attempt a join on the human-readable course IDs unless you have a very very good reason. You should also never expose the UUIDs. Your understanding is correct.
A CTE is used here to assemble an array of specializations for each major. Here, we Two sub-queries are used here. The first is
No worries. I'm told this isn't urgent. |
<!--- Provide a general summary of your changes in the Title above --> ## Description Introduces a new "websocSectionEnrollmentLive" table to the schema, purposed to keep track of updated enrollment data based on continuous scrapes. Updated the websoc scraper to populate new table every 5 minutes. The table holds short term data, clearing out all entries older than an hour every scrape. Added REST endpoint functionality for the new table, which retrieves enrollment snapshots based on a `since` timestamp. - The API returns: - A "from" snapshot representing the latest known state before `since` (if available). - A "to" snapshot representing the latest known state. - If no significant changes are detected, only "to" is returned. ## Related Issue Closes [#59] ## Motivation and Context Allows developers to create accurate applications of UCI's enrollment information, helping student's quarterly enrollment processes go smoother and be more communicative. ## How Has This Been Tested? the schema and scraper were tested across numerous scapes in 5 minute intervals, and the data was validated for consistency. ## Screenshots (if appropriate): ## Types of changes <!--- What types of changes does your code introduce? Put an `x` in all the boxes that apply: --> - [ ] Bug fix (non-breaking change which fixes an issue) - [X] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Checklist: - [X] My code involves a change to the database schema. - [ ] My code requires a change to the documentation. --------- Co-authored-by: Sanskar Mishra <[email protected]> Co-authored-by: Jordan <[email protected]> Co-authored-by: Jordan Yee <[email protected]> Co-authored-by: Dante Dam <[email protected]> Co-authored-by: Andrew Wang <[email protected]>
…ing a helper function
…ithub.com:icssc/anteater-api into 18_related_courses_from _section_course_code_range
Description
includeRelatedCourses
flag in thewebsoc.ts
schema. The flag now accepts"true"
or"false"
as strings, which are transformed into a boolean.courseIds
flag in the same schema.courseIds
is now accepted as a comma-separated string and preprocessed into an array of strings.buildQuery
function inservices/websoc.ts
to build conditions for related courses based on section code ranges. When includeRelatedCourses is enabled along withsectionCodes
, a subquery fetches the related course IDs, and the query is constructed to filter based on these IDs.sectionCodes
andcourseIds
are provided withincludeRelatedCourses
, since the flag is only intended to work with sectionCodes.Related Issue
Completes #18
Motivation and Context
This change enhances the WebSoc API by providing a more flexible and user-friendly way to fetch related courses. The
includeRelatedCourses
flag now supports multiple boolean representations and ensures that related courses are dynamically fetched based on section code ranges. Additionally, the bonus handling forcourseIds
allows users to input a comma-separated string that is automatically transformed into an array, enforcing consistency and preventing ambiguous queries.How Has This Been Tested?
Manually tested the REST and GraphQL endpoints using various scenarios:
sectionCodes
withincludeRelatedCourses
:"true"
.courseIds
.sectionCodes
andcourseIds
withincludeRelatedCourses
, which correctly trigger the error response.Screenshots (if appropriate):
Types of changes
Checklist: