[DealabsBridge - HotUKDealsBridge - MydealsBridge] Update Groups #2083

Merged: 2 commits merged on Jul 27, 2021

Conversation

sysadminstory
Contributor

The bridges have been updated with the newest "groups" available on each
website.

'Huile moteur' => 'huile-moteur',
'Hygiène corporelle' => 'hygiene-corporelle',
Contributor

Links with hygiene-corporelle become invalid.

'Hygiène de la maison' => 'hygiene-de-la-maison',
'Hygiène des bébés' => 'hygiene-des-bebes',
'Image, son & vidéo' => 'image-son-video',
Contributor

Links with image-son-video become invalid.

'Magasins d\'usine' => 'magasins-usine',
'Magazines' => 'magazines',
'Maillots de bain' => 'maillots-de-bain',
'Maillots de football' => 'maillots-de-football',
'Maison & Jardin' => 'maison-et-jardin',
Contributor

Same

'Microsoft Office' => 'microsoft-office',
'Microsoft Surface' => 'microsoft-surface',
Contributor

Same. Didn't check other values that were removed

Contributor

@em92 em92 left a comment

I started checking whether old feed links are still valid for the values that were removed. I found some that became invalid.
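
For anyone repeating this check, the removed values can be listed mechanically by comparing the old and new group arrays. The sketch below is only an illustration: `removed_groups` is a hypothetical helper, the "old" entries are taken from the diff excerpts above, and the "new" list is a made-up subset, not the actual list from this pull request.

```python
def removed_groups(old, new):
    """Return the entries of `old` whose slug no longer appears in `new`."""
    new_slugs = set(new.values())
    return {name: slug for name, slug in old.items() if slug not in new_slugs}


# Entries copied from the diff excerpts in this review.
old_groups = {
    "Hygiène corporelle": "hygiene-corporelle",
    "Hygiène de la maison": "hygiene-de-la-maison",
    "Image, son & vidéo": "image-son-video",
}

# Assumption for illustration only: pretend just one entry was kept.
new_groups = {
    "Hygiène de la maison": "hygiene-de-la-maison",
}

removed = removed_groups(old_groups, new_groups)
for name, slug in removed.items():
    print(f"{slug} ({name}) was removed; check feeds that use it")
```

Each slug reported this way still has to be checked by hand or by script against the website, since a removed value can still resolve.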

@sysadminstory
Contributor Author

Some of the existing links (like 'image-son-video') are redirected to another URL on the website, and some (like 'microsoft-surface') don't exist anymore.
But some (like 'maison-et-jardin') still exist, even though I thought I had crawled the whole website with a script...

I'll check whether some of them are still visible in the "Similar Groups" box.
I must check my crawling script again!
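
The three outcomes described in this comment (redirected, gone, still existing) can be told apart from the HTTP result of requesting each old group page. The following is a minimal sketch under stated assumptions: `classify` is a hypothetical helper, the `example.invalid` URLs and the redirect target are invented, and treating 404/410 as "gone" is an assumption about how the website responds, not something confirmed in this thread.

```python
from urllib.parse import urlsplit


def classify(slug, status, final_url):
    """Classify an old group slug from the result of fetching its page.

    status    -- HTTP status code of the final response
    final_url -- URL reached after following any redirects
    """
    if status in (404, 410):
        return "gone"
    if status != 200:
        return "error"
    # Compare the last path segment of the final URL with the requested slug.
    last_segment = urlsplit(final_url).path.rstrip("/").rsplit("/", 1)[-1]
    return "ok" if last_segment == slug else "redirected"


# Invented results, shaped like the three cases mentioned above.
results = [
    ("image-son-video", 200, "https://example.invalid/groupe/some-other-group"),
    ("microsoft-surface", 404, "https://example.invalid/groupe/microsoft-surface"),
    ("maison-et-jardin", 200, "https://example.invalid/groupe/maison-et-jardin"),
]
report = {slug: classify(slug, status, url) for slug, status, url in results}
print(report)
```

Keeping the classification separate from the HTTP client makes it easy to test without network access; the fetching itself could be done with any client that reports the final URL.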

…uble check

Groups were extracted from:
- Website menu and the links to group categories
- Group categories: "Popular Groups" and "More similar groups" links
- Group page breadcrumb
- Group page similar groups
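
The crawling script itself is not part of this pull request, so the following is only a sketch of how group links from several page sources could be collected into one name-to-slug mapping. The `/groupe/` URL prefix, the `GroupLinkParser` class, the `extract_groups` helper and the sample HTML are all assumptions made for illustration.

```python
from html.parser import HTMLParser

GROUP_PREFIX = "/groupe/"  # assumed URL prefix of group pages


class GroupLinkParser(HTMLParser):
    """Collect {display name: slug} from <a> tags that point at group pages."""

    def __init__(self):
        super().__init__()
        self.groups = {}   # insertion-ordered: first sighting wins
        self._slug = None  # slug of the group link currently being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if href.startswith(GROUP_PREFIX):
            self._slug = href[len(GROUP_PREFIX):].strip("/")
            self._text = []

    def handle_data(self, data):
        if self._slug is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._slug is not None:
            name = "".join(self._text).strip()
            if name and self._slug:
                self.groups.setdefault(name, self._slug)
            self._slug = None


def extract_groups(pages):
    """Parse every HTML page (menu, category page, group page) in turn."""
    parser = GroupLinkParser()
    for html in pages:
        parser.reset()  # clears parser state only; collected groups are kept
        parser.feed(html)
        parser.close()
    return parser.groups


# Invented sample pages standing in for the menu and a group page.
menu_html = '<nav><a href="/groupe/huile-moteur">Huile moteur</a> <a href="/about">About</a></nav>'
group_html = (
    '<ul><li><a href="/groupe/image-son-video">Image, son &amp; vidéo</a></li>'
    '<li><a href="/groupe/huile-moteur">Huile moteur</a></li>'
    '<li><a href="/groupe/maillots-de-bain">Maillots de bain</a></li></ul>'
)
groups = extract_groups([menu_html, group_html])
print(groups)
```

Using `setdefault` means a group seen through several sources is recorded once, at the position where it was first found.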
@sysadminstory
Contributor Author

Hello!

I double-checked: I extended my crawling script to use more "sources". Some of the existing groups (like 'maison-et-jardin') still work, but they are not on the website anymore. I think those are abandoned groups: the last deal in this group is from 2019.

The group order changed a bit because I fixed a duplicate-entries issue in my crawling script and did not need to use the sort command to fix them.

IMHO, the feeds this pull request will break are already broken, or have not returned anything new for a while.
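
To illustrate why the order can change once duplicates are handled inside the script rather than by sorting, here is a small sketch. It is not the author's script: `dedupe_keep_order` is a hypothetical helper, the crawl order is invented, and the sorted variant is only an assumption about what a sort-based cleanup would produce.

```python
def dedupe_keep_order(pairs):
    """Drop repeated slugs, keeping the first occurrence in crawl order."""
    seen = set()
    unique = []
    for name, slug in pairs:
        if slug not in seen:
            seen.add(slug)
            unique.append((name, slug))
    return unique


# Invented crawl result: the same group is reached through two links.
crawled = [
    ("Magazines", "magazines"),
    ("Huile moteur", "huile-moteur"),
    ("Magazines", "magazines"),
]

in_crawl_order = dedupe_keep_order(crawled)  # order of first discovery
sorted_unique = sorted(set(crawled))         # what a sort-based cleanup yields

print(in_crawl_order)
print(sorted_unique)
```

Both results contain the same entries; only their order differs, which matches the kind of reordering visible in the diff.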

@sysadminstory sysadminstory requested a review from em92 May 3, 2021 22:26
@sysadminstory
Contributor Author

Did I miss something? :)

@sysadminstory
Contributor Author

Hello!

Am I missing some changes?

@em92 em92 merged commit 2689f5f into RSS-Bridge:master Jul 27, 2021
@em92
Contributor

em92 commented Jul 27, 2021

gj!

@em92
Contributor

em92 commented Jul 27, 2021

Btw, I just pushed 877707f, so you can add your crawling scripts to the contrib/ directory.

@sysadminstory sysadminstory deleted the dealabs-update-groups branch July 27, 2021 20:00
2 participants