Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
59 changes: 59 additions & 0 deletions .github/workflows/check_db_migration_confict.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,59 @@
name: Check DB migration conflict
on:
push:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this run only on pushes to master or is the intention to use this for release branches also?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was more thinking of feature branches, but this could be useful to release branches, too. We could potentially do some name matching here but since this workflow is relatively simple and fast, I think running on all branches is safer.

paths:
- "superset/migrations/**"

jobs:
check_db_migration_conflict:
name: Check DB migration conflict
runs-on: ubuntu-20.04
steps:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@v2
- name: Check and notify
uses: actions/github-script@v3
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure secrets will be available on push events from forked repositories.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think push events only triggers for the base repo. Forked repos will run the push event with their own secret, which does not have access to the base repo---in this case, the forked repo is considered its own base repo.

Either way, it shouldn't matter because we only want to run the workflow in the base branch anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GITHUB_TOKEN provided to forks is read-only. I believe a write token is required to publish a comment on a PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is, when this workflow is run in a fork's push event, it runs in the forked repo itself. It will check PRs within the fork. So it will be write token, but it's only writable to the PRs in the fork repo---which is why it worked in my test PR.

When this is merged into the base repo, then the push event will run on both the base repo and forks. When it's running on the base repo's push event, which is triggered only on PR merge and direct push, it will use a GITHUB_TOKEN that has write access to the base repo, because the code is considered safe since they are in the repo already.

Does this make sense?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, the confusion was related to push events running on PRs from forked repos, but looking at recent action runs, this doesn't appear to be the case. The GitHub Actions docs leave something to be desired.

script: |
// API reference: https://octokit.github.io/rest.js
const currentBranch = context.ref.replace('refs/heads/', '');

// Find all pull requests to current branch
const opts = github.pulls.list.endpoint.merge({
owner: context.repo.owner,
repo: context.repo.repo,
base: context.ref,
state: 'open',
sort: 'updated',
per_page: 100,
});
const pulls = await github.paginate(opts);
if (pulls.length > 0) {
console.log(`Found ${pulls.length} open PRs for base branch "${currentBranch}"`)
}

for (const pull of pulls) {
const listFilesOpts = await github.pulls.listFiles.endpoint.merge({
owner: context.repo.owner,
repo: context.repo.repo,
pull_number: pull.number,
});
const files = await github.paginate(listFilesOpts);
if (
files.some(x => x.contents_url.includes('/contents/superset/migrations'))
) {
console.log(`PR #${pull.number} "${pull.title}" also added db migration`)
await github.issues.createComment({
issue_number: context.issue.number,
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: pull.number,
body:
`⚠️ @${pull.user.login} Your base branch \`${currentBranch}\` has just ` +
'also updated `superset/migrations`.\n' +
'\n' +
'❗ **Please consider rebasing your branch to avoid db migration conflicts.**',
});
}
}
39 changes: 35 additions & 4 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -1001,7 +1001,7 @@ Submissions will be considered for submission (or removal) on a case-by-case bas

When two DB migrations collide, you'll get an error message like this one:

```
```text
alembic.util.exc.CommandError: Multiple head revisions are present for
given argument 'head'; please specify a specific target
revision, '<branchname>@head' to narrow to a specific head,
Expand All @@ -1016,15 +1016,46 @@ To fix it:
superset db heads
```

This should list two or more migration hashes.
This should list two or more migration hashes. E.g.

```bash
1412ec1e5a7b (head)
67da9ef1ef9c (head)
```

2. Pick one of them as the parent revision, open the script for the other revision
and update `Revises` and `down_revision` to the new parent revision. E.g.:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also updating the contribution guide on how to avoid adding a new empty db migration merge file.


```diff
--- a/67da9ef1ef9c_add_hide_left_bar_to_tabstate.py
+++ b/67da9ef1ef9c_add_hide_left_bar_to_tabstate.py
@@ -17,14 +17,14 @@
"""add hide_left_bar to tabstate

Revision ID: 67da9ef1ef9c
-Revises: c501b7c653a3
+Revises: 1412ec1e5a7b
Create Date: 2021-02-22 11:22:10.156942

"""

# revision identifiers, used by Alembic.
revision = "67da9ef1ef9c"
-down_revision = "c501b7c653a3"
+down_revision = "1412ec1e5a7b"

import sqlalchemy as sa
from alembic import op
```

1. Create a new merge migration
Alternatively you may also run `superset db merge` to create a migration script
just for merging the heads.

```bash
superset db merge {HASH1} {HASH2}
```

1. Upgrade the DB to the new checkpoint
3. Upgrade the DB to the new checkpoint
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you should be able to always use number 1. and it will automatically increment it on the next bullet (avoids the need for updating all bullets when inserting/deleting bullets)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I am aware of this markdown feature, just thought it reads nicer in the source code. The numbers doesn't have to be always 1 either, so one can also just choose not to update them while inserting/deleting bullets. I happen to think it's not a big deal to update it here.

E.g. this also works:

1.
2.
2.
5.


```bash
superset db upgrade
Expand Down