1020 spider chi design #1023

Open
wants to merge 5 commits into
base: main

Conversation


hamma95 commented Nov 13, 2021

Summary

Issue: #1020

Replace "ISSUE_NUMBER" with the number of your issue so that GitHub will link this pull request with the issue and make review easier.

Checklist

All checks are run in GitHub Actions. You'll be able to see the results of the checks at the bottom of the pull request page after it's been opened, and you can click on any of the specific checks listed to see the output of each step and debug failures.

  • Tests are implemented
  • All tests are passing
  • Style checks run (see documentation for more details)
  • Style checks are passing
  • Code comments from template removed

Questions

  1. I used a third-party library (w3lib) to conveniently remove HTML tags. I don't know if that's okay, which is why I haven't added it to the requirements files yet. I could implement the functionality without the library, but the remove_tags function is much more convenient and could be useful for other spiders too.

  2. The test for the links is marked xfail because there was a Unicode character in the result, which the actual result represents with ASCII characters. Should I change the ASCII to Unicode, or keep the ASCII?

  3. In the links field, some links may share the same href but have different titles, like in this example. Some titles are more descriptive than others, or contain more information, such as the Zoom password. Should I keep the duplicate hrefs or remove them?
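For question 3, one possible way to remove the duplicates is sketched below. It is only an illustration under stated assumptions: each link is taken to be a dict with `href` and `title` keys, "more descriptive" is approximated by the longer title, and `dedupe_links` is a hypothetical helper name, not something that exists in the spider.

```python
def dedupe_links(links):
    """Keep one link per href, preferring the longest title.

    Assumption: each link is a dict with "href" and "title" keys.
    """
    best = {}
    for link in links:
        href = link["href"]
        title = link.get("title", "")
        # Dicts preserve insertion order, so the first time an href is seen
        # fixes its position; later duplicates only replace the title.
        if href not in best or len(title) > len(best[href]["title"]):
            best[href] = {"href": href, "title": title}
    return list(best.values())


# Hypothetical sample data, not taken from the scraped page.
links = [
    {"href": "https://example.com/zoom", "title": "Zoom"},
    {"href": "https://example.com/agenda.pdf", "title": "Agenda"},
    {"href": "https://example.com/zoom", "title": "Zoom (password: hypothetical)"},
]
deduped = dedupe_links(links)
```

Title length is a crude proxy. If a shorter title carries the information that matters, joining the distinct titles for one href would lose nothing, at the cost of longer titles.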

Include any questions you have about what you're working on.
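For question 1, if adding w3lib turns out not to be acceptable, tag removal could be approximated with the standard library alone. The sketch below uses `html.parser.HTMLParser`. `strip_tags` and `_TextExtractor` are hypothetical names, and this is not a drop-in replacement for w3lib's `remove_tags`: it also decodes character references and keeps the text inside `<script>` and `<style>` elements.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect only the text nodes of an HTML fragment."""

    def __init__(self):
        # convert_charrefs=True decodes references such as &amp; into text.
        super().__init__(convert_charrefs=True)
        self._chunks = []

    def handle_data(self, data):
        self._chunks.append(data)

    def text(self):
        return "".join(self._chunks)


def strip_tags(html):
    """Return the fragment's text with all tags dropped (hypothetical helper)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.text()


# Hypothetical sample input, not taken from the scraped page.
plain = strip_tags("<p>Design <b>Review</b> Committee</p>")
```

The trade-off is a few extra lines in the spider against one fewer dependency. If several spiders need the same behaviour, the library call is probably the simpler option.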

…ultiple meetings, but it was just multiple descriptions for multiple agendas, and just one meeting. Also fixed the location to always be virtual, because the addresses on the page are not the meeting locations.