Update activities in postgres periodically #2668
Merged
11 commits
7025b4d Format file
437d3ab Ignore empty options in orderBy of conversations/query
1c2262b Remove unused file
cd8f349 Do not write/use activities from postgres from data-sink-worker
d18faf7 Sync activities from questdb to postgres periodically
218a934 Create index on activities(updatedAt)
68d2f8a Make sure we always update `updatedAt` when changing activities
671b310 Clean up modifying code of activities in postgres
657c99b Use timestamp too when finding already existing activity
bef53ef Clean up unused modifying sql code of activities
3a42be8 Update activities in merging-entity-worker via questdb
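The commits above move activity writes to QuestDB and copy changed rows into Postgres on a schedule, using `updatedAt` as the cursor. The following is a minimal sketch of that pattern, not the PR's implementation: in-memory arrays and a `Map` stand in for the two databases, and the names `fetchBatch` and `syncAll` are hypothetical.

```typescript
interface Activity {
  id: string
  updatedAt: string // ISO-8601 timestamp in a fixed format, so string comparison matches time order
  body: string
}

// Hypothetical stand-in for "SELECT ... WHERE updatedAt > cursor ORDER BY updatedAt LIMIT n".
function fetchBatch(source: Activity[], cursor: string, limit: number): Activity[] {
  return source
    .filter((a) => a.updatedAt > cursor)
    .sort((a, b) => (a.updatedAt < b.updatedAt ? -1 : a.updatedAt > b.updatedAt ? 1 : 0))
    .slice(0, limit)
}

// Copies every activity newer than `cursor` into `target`, one batch at a time.
function syncAll(
  source: Activity[],
  target: Map<string, Activity>,
  cursor: string,
  limit: number,
): { inserted: number; updated: number } {
  const result = { inserted: 0, updated: 0 }

  while (true) {
    const batch = fetchBatch(source, cursor, limit)
    if (batch.length === 0) {
      break
    }

    for (const activity of batch) {
      if (target.has(activity.id)) {
        result.updated++
      } else {
        result.inserted++
      }
      target.set(activity.id, activity)
    }

    // Advance the cursor to the newest row seen so far.
    cursor = batch[batch.length - 1].updatedAt
  }

  return result
}
```

Because the cursor comes from the data itself, a run that is interrupted can resume from the last `updatedAt` it wrote.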
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,141 @@ | ||
| import cronGenerator from 'cron-time-generator' | ||
|
|
||
| import { DbStore, getDbConnection } from '@crowd/data-access-layer/src/database' | ||
| import { IDbActivityCreateData } from '@crowd/data-access-layer/src/old/apps/data_sink_worker/repo/activity.data' | ||
| import ActivityRepository from '@crowd/data-access-layer/src/old/apps/data_sink_worker/repo/activity.repo' | ||
| import { QueryExecutor, formatQuery, pgpQx } from '@crowd/data-access-layer/src/queryExecutor' | ||
| import { Logger, logExecutionTimeV2, timer } from '@crowd/logging' | ||
| import { getClientSQL } from '@crowd/questdb' | ||
| import { PlatformType } from '@crowd/types' | ||
|
|
||
| import { DB_CONFIG } from '@/conf' | ||
|
|
||
| import { CrowdJob } from '../../types/jobTypes' | ||
|
|
||
| async function decideUpdatedAt(pgQx: QueryExecutor, maxUpdatedAt?: string): Promise<string> { | ||
| if (!maxUpdatedAt) { | ||
| const result = await pgQx.selectOne('SELECT MAX("updatedAt") AS "maxUpdatedAt" FROM activities') | ||
| return result?.maxUpdatedAt | ||
| } | ||
|
|
||
| return maxUpdatedAt | ||
| } | ||
|
|
||
| async function getTotalActivities(qdbQx: QueryExecutor, whereClause: string): Promise<number> { | ||
| const { totalActivities } = await qdbQx.selectOne( | ||
| `SELECT COUNT(1) AS "totalActivities" FROM activities WHERE ${whereClause}`, | ||
| ) | ||
| return totalActivities | ||
| } | ||
|
|
||
| function createWhereClause(updatedAt: string): string { | ||
| return formatQuery('"updatedAt" > $(updatedAt)', { updatedAt }) | ||
| } | ||
|
|
||
| async function syncActivitiesBatch( | ||
| activityRepo: ActivityRepository, | ||
| activities: IDbActivityCreateData[], | ||
| ) { | ||
| const result = { | ||
| inserted: 0, | ||
| updated: 0, | ||
| } | ||
|
|
||
| for (const activity of activities) { | ||
| const existingActivity = await activityRepo.existsWithId(activity.id) | ||
|
|
||
| if (existingActivity) { | ||
| await activityRepo.rawUpdate(activity.id, { | ||
| ...activity, | ||
| platform: activity.platform as PlatformType, | ||
| }) | ||
| result.updated++ | ||
| } else { | ||
| await activityRepo.rawInsert(activity) | ||
| result.inserted++ | ||
| } | ||
| } | ||
|
|
||
| return result | ||
| } | ||
|
|
||
| export async function syncActivities(logger: Logger, maxUpdatedAt?: string) { | ||
| logger.info(`Syncing activities from ${maxUpdatedAt}`) | ||
|
|
||
| const qdb = await getClientSQL() | ||
| const db = await getDbConnection({ | ||
| host: DB_CONFIG.writeHost, | ||
| port: DB_CONFIG.port, | ||
| database: DB_CONFIG.database, | ||
| user: DB_CONFIG.username, | ||
| password: DB_CONFIG.password, | ||
| }) | ||
|
|
||
| const pgQx = pgpQx(db) | ||
| const qdbQx = pgpQx(qdb) | ||
| const activityRepo = new ActivityRepository(new DbStore(logger, db, undefined, true), logger) | ||
|
|
||
| let updatedAt = await logExecutionTimeV2( | ||
| () => decideUpdatedAt(pgQx, maxUpdatedAt), | ||
| logger, | ||
| 'decide updatedAt', | ||
| ) | ||
|
|
||
| const whereClause = createWhereClause(updatedAt) | ||
|
|
||
| const totalActivities = await logExecutionTimeV2( | ||
| () => getTotalActivities(qdbQx, whereClause), | ||
| logger, | ||
| 'get total activities', | ||
| ) | ||
|
|
||
| let counter = 0 | ||
|
|
||
| const t = timer(logger, `sync ${totalActivities} activities`) | ||
| // eslint-disable-next-line no-constant-condition | ||
| while (true) { | ||
| const result = await logExecutionTimeV2( | ||
| // eslint-disable-next-line @typescript-eslint/no-loop-func | ||
| () => | ||
| qdbQx.select( | ||
| ` | ||
| SELECT * | ||
| FROM activities | ||
| WHERE "updatedAt" > $(updatedAt) | ||
| ORDER BY "updatedAt" | ||
| LIMIT 1000; | ||
| `, | ||
| { updatedAt }, | ||
| ), | ||
| logger, | ||
| `getting activities with updatedAt > ${updatedAt}`, | ||
| ) | ||
|
|
||
| if (result.length === 0) { | ||
| break | ||
| } | ||
|
|
||
| const batchTimer = timer(logger) | ||
| const { inserted, updated } = await syncActivitiesBatch(activityRepo, result) | ||
| batchTimer.end(`Inserting ${inserted} and updating ${updated} activities`) | ||
|
|
||
| counter += inserted + updated | ||
| const pct = Math.round((counter / totalActivities) * 100) | ||
| logger.info(`synced ${counter} activities out of ${totalActivities}. That's ${pct}%`) | ||
|
|
||
| updatedAt = result[result.length - 1].updatedAt | ||
| } | ||
|
|
||
| t.end() | ||
| } | ||
|
|
||
| const job: CrowdJob = { | ||
| name: 'Sync Activities', | ||
| // every day | ||
| cronTime: cronGenerator.every(1).days(), | ||
| onTrigger: async (logger: Logger) => { | ||
| await syncActivities(logger) | ||
| }, | ||
| } | ||
|
|
||
| export default job | ||
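One property of the loop in this file is worth spelling out: the cursor advances with a strict `"updatedAt" >` comparison, so if several rows share the same `updatedAt` and a batch boundary falls between them, the rows after the boundary would not be fetched. Whether that can occur depends on the timestamp resolution in QuestDB, which the diff does not show. A common alternative is to page on an `(updatedAt, id)` pair. The sketch below illustrates the idea with in-memory data; `Row`, `Cursor`, `fetchPage` and `collectAll` are hypothetical names, and no particular SQL syntax is assumed.

```typescript
interface Row {
  id: string
  updatedAt: string
}

interface Cursor {
  updatedAt: string
  id: string
}

// Plain code-unit string comparison, used for both ordering and filtering
// so the two always agree.
function cmp(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0
}

// True when `row` sorts strictly after the cursor in (updatedAt, id) order.
function isAfter(row: Row, cursor: Cursor): boolean {
  const byTime = cmp(row.updatedAt, cursor.updatedAt)
  return byTime > 0 || (byTime === 0 && cmp(row.id, cursor.id) > 0)
}

// Stand-in for a query ordered by (updatedAt, id) with a LIMIT.
function fetchPage(rows: Row[], cursor: Cursor, limit: number): Row[] {
  return rows
    .filter((r) => isAfter(r, cursor))
    .sort((a, b) => cmp(a.updatedAt, b.updatedAt) || cmp(a.id, b.id))
    .slice(0, limit)
}

// Walks all pages and returns the ids in the order they were fetched.
function collectAll(rows: Row[], start: Cursor, limit: number): string[] {
  const seen: string[] = []
  let cursor = start

  while (true) {
    const page = fetchPage(rows, cursor, limit)
    if (page.length === 0) {
      break
    }
    seen.push(...page.map((r) => r.id))
    const last = page[page.length - 1]
    cursor = { updatedAt: last.updatedAt, id: last.id }
  }

  return seen
}
```

With a page size of 2 and three rows that share one timestamp, the third row is still picked up on the next page because the id breaks the tie.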
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,18 @@ | ||
| import { getServiceChildLogger } from '@crowd/logging' | ||
|
|
||
| import { syncActivities } from '../jobs/syncActivities' | ||
|
|
||
| const logger = getServiceChildLogger('syncActivities') | ||
|
|
||
| setImmediate(async () => { | ||
| const updatedAt = process.argv[2] | ||
|
|
||
| if (!updatedAt) { | ||
| logger.error('No updatedAt provided') | ||
| process.exit(1) | ||
| } | ||
|
|
||
| await syncActivities(logger, updatedAt) | ||
|
|
||
| process.exit(0) | ||
| }) | ||
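The script above only checks that an `updatedAt` argument is present before passing it to `syncActivities`. A small sketch of a stricter check follows, assuming the argument is meant to be a parseable timestamp; `parseUpdatedAtArg` is a hypothetical helper and is not part of the PR.

```typescript
// Reads the first CLI argument (argv[2]) and returns it as a normalized ISO-8601 string.
// Throws when the argument is missing or cannot be parsed as a date.
function parseUpdatedAtArg(argv: string[]): string {
  const raw = argv[2]
  if (!raw) {
    throw new Error('No updatedAt provided')
  }

  const millis = Date.parse(raw)
  if (Number.isNaN(millis)) {
    throw new Error(`updatedAt is not a valid timestamp: ${raw}`)
  }

  return new Date(millis).toISOString()
}
```

Failing early on a malformed value avoids starting a sync whose query would match nothing or fail at the database.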
Empty file.
1 change: 1 addition & 0 deletions
backend/src/database/migrations/V1730386050__activities-updated-at-index.sql
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| CREATE INDEX CONCURRENTLY IF NOT EXISTS activities_updated_at ON activities ("updatedAt"); |