Sitemaps for multi languages #1
Comments
Thanks a lot for the tips, we will definitely look into it shortly. |
For a better fix, there may be hints of another solution that someone who knows how to code may find helpful: http://trac.transposh.org/browser/trunk/WordPress/plugin/transposh/wp/transposh_3rdparty.php From line 223 there seems to be another fix (plus lines 43-44); it looks like it needs manual patching too. |
Thank you! We will take a look. |
To add to the already listed links, I found several others that may help, listed below: Transposh |
Did you try using https://wordpress.org/plugins/google-xml-sitemaps-v3-for-qtranslate/, is there anything wrong with it? Is there a complete solution which works for Yoast SEO and qTranslate-X, which we could put in the code? |
I have now looked over the Yoast SEO code; there is no filter that can easily be used to generate entries for each language. Making it work will be almost as much work as writing a new plugin. Why don't we simply use the already existing plugin Google XML Sitemaps v3 for qTranslate, which works fine for many people, including myself? |
Hi John, thanks for looking the code over. Not sure if you got my last link: https://wordpress.org/plugins/wp-seo-yoast-integration-mq-translate/developers/ As for Google XML Sitemaps v3 for qTranslate, I have used it in the past and then had issues at one stage (I can't recall what they were), but the plugin isn't maintained and there are several issues people have with it, including:
I understand that for over 80% of users it should suffice, but on the other hand, if your plugin could add Yoast's sitemaps it would be an added benefit. Maybe, if it's hard to hook directly into Yoast itself, commented-out code could be inserted into your plugin together with a manually applied fix. Like I said, I'm not good at coding, but I got it working for my site manually. Something like: http://trac.transposh.org/browser/trunk/WordPress/plugin/transposh/wp/transposh_3rdparty.php from line 223, but they apparently left out one bracket: https://wordpress.org/support/topic/tranposh-make-wordpress-seo-by-yoast-sitemap-xml-pages-blank?replies=3 Hope that helps. Another approach, which may be better (not sure), is to try to incorporate https://wordpress.org/plugins/wp-seo-yoast-integration-mq-translate/ (https://github.com/rufein/wp-seo-yoast-integration-mq-translate) into your plugin. |
Why is it not working with qtx? I would think qtx is no different from mq- at database level. Have you tried to turn on option "Compatibility Functions"? I am sure if it needs adjustments, it would be very little to make it work with qtx.
Yes, I understand how to change Yoast code, and it is not much, but the problem is that it cannot be done in an encapsulated and updatable way. The way https://wordpress.org/plugins/wp-seo-yoast-integration-mq-translate/developers/ did it, it is a copy of Yoast code, subsequently modified. This way we lose the ability to take advantage of Yoast's future improvements and commit ourselves to updating our code every time Yoast updates theirs. This does not go along with WP's main policy and design. People do this out of desperation just to fix their own site, but making it publicly available and supported would be a lot of trouble and high maintenance. I would rather try to convince Yoast to put in a couple of filters that we could hook our little code on. I know that he does not like qTranslate and will refuse to do it for the sake of qtx, but if we design it in a way that any multilingual plugin can hook into, and submit a pull request at his place, he might be more cooperative. |
I have just tried https://wordpress.org/plugins/wp-seo-yoast-integration-mq-translate/developers/, it appears to be a copy of mq- as well. This is simply unmaintainable ... |
I was looking into Yoast code again, all we need to ask them is to put code:
right at the beginning of function Then we can make filter 'wpseo_sitemap_entry' to return an array of urls for each language, instead of a single url, which is normally expected. That is all. This would make Yoast SEO compatible-ready with any multilingual plugin. What would you think? |
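The idea in the comment above, a filter that turns one sitemap entry into one entry per enabled language, can be sketched roughly as follows. This is an illustration in Python rather than the plugin's PHP; the names `localize_url` and `expand_entry`, the entry layout, and the `/xx/` path-prefix URL scheme with an unprefixed default language are assumptions made for the example, not the actual Yoast or qTranslate-X API.

```python
from urllib.parse import urlsplit, urlunsplit


def localize_url(url, lang, default_lang, hide_default=True):
    """Prefix the URL path with /<lang>/ (assumed path-prefix URL mode)."""
    if hide_default and lang == default_lang:
        return url  # the default language keeps the unprefixed URL
    parts = urlsplit(url)
    tail = parts.path if parts.path.startswith("/") else "/" + parts.path
    return urlunsplit(
        (parts.scheme, parts.netloc, "/" + lang + tail, parts.query, parts.fragment)
    )


def expand_entry(entry, languages, default_lang):
    """Return one copy of a sitemap entry per language, each with its own URL."""
    return [
        dict(entry, loc=localize_url(entry["loc"], lang, default_lang))
        for lang in languages
    ]


# Hypothetical entry; the keys are placeholders for whatever the filter receives.
entry = {"loc": "https://example.com/about/", "pri": 0.8}
for localized in expand_entry(entry, ["en", "de", "da"], default_lang="en"):
    print(localized["loc"])
```

Everything except the URL is copied unchanged into each language's entry, which is also why images end up repeated per language unless they are filtered separately.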
I will submit code needed for filter hopefully within next 48 hours and then you could test this idea on your site. Is that ok? |
Image part will be the same for all language urls, is that correct? We will simply copy it? |
Can you write me directly through https://qtranslatexteam.wordpress.com/contact-us/, I need to discuss something off line. |
Saw this already in place Yoast/wordpress-seo#2579 may help |
I committed the changes. Please, use the latest qtx, https://github.com/qTranslate-Team/qtranslate-x, and the latest integration plugin, https://github.com/qTranslate-Team/wp-seo-qtranslate-x. After installing them, all should function as before, no changes should be observed, except a few improvements unrelated to sitemaps. Then insert the code listed above into yoast file /wp-content/plugins/wordpress-seo/inc/class-sitemaps.php right after the line
Do not duplicate the line itself, of course. Here is a copy of the code above to be inserted into
Please, let me know if it does what you need. Thanks! P.S. If you use QTranslate Slug, please try it too. |
Hi John, It works nearly perfectly. I ran it on my test environment and got: Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 15728640 bytes) in E:\Websites\Backup\2015\InstantWP_4.4.2\iwpserver\htdocs\wordpress\wp-includes\wp-db.php on line 1285. I commented out the images, as there were 10 for each language, which is probably what caused the memory errors, and it worked fine after that.
I tested tags and they worked fine, but none of my tags are translated. So they work fine if your tag is, e.g., TAG: en:word de:word da:word. I guess they would be replicated if you had en:word de:wort da:ord; you would probably get the number of active languages times the number of words. Another recommendation for users would be to set "Max entries per sitemap" in the Yoast sitemap settings. I don't have qslug set up; I will try adding it now. I can't imagine it would work, but I will test it. |
Another potential issue may be other users site url structure like: |
I am not sure how images need to be treated. Don't they need to be listed for each language? They come with "title" and "alt", which I translated for each language. Would that be correct, if there were no db error? Looks like your sitemap with images gets so big that it cannot even fit in a database entry? The code for filter 'wpseo_sitemap_entry' is in file Maybe we should use a different approach and try to generate a separate sitemap for each language, like "page-sitemap-en.xml", "page-sitemap-xx.xml", etc? Each sitemap will then have approximately the same size as a single-language map? I am not sure how hard that would be to implement. Would that work in general for search engines? Or should pages in different languages be listed next to each other? I am not sure about the sitemap requirements. BTW, page /wp-admin/admin.php?page=wpseo_xml has the option "Max entries per sitemap". Have you tried to make it smaller? |
I think the way it's setup now is correct, except images are not filtered by language, if you made separate language pages maintenance may be harder if they changed their code. Yes, that's what I meant there are already settings in yoast for Max entries per sitemap etc. was recommended for other users. Was just trying out qtrans slug, doesn't play nicely with my site, anyway once others use this they will provide better feedback as to issues they are having. |
Images are listed, but in my case there are 10 images per URL per language. I set "Max entries per sitemap" right down (to 5) to assist the generation of the sitemap with images; this may help other users with memory issues. |
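To see why lowering "Max entries per sitemap" helps, here is a rough illustration in Python of how the number of URLs grows with the number of languages and how a per-sitemap limit splits them into smaller pages. The `paginate` helper and the URL pattern are hypothetical; this only assumes the setting caps how many entries go into one sitemap file and is not Yoast's code.

```python
def paginate(urls, max_entries):
    """Split a list of URLs into consecutive pages of at most max_entries."""
    if max_entries < 1:
        raise ValueError("max_entries must be at least 1")
    return [urls[i:i + max_entries] for i in range(0, len(urls), max_entries)]


languages = ("en", "de", "da")
post_count = 12

# Every post contributes one URL per language.
urls = [f"/{lang}/post-{n}/" for lang in languages for n in range(post_count)]
pages = paginate(urls, max_entries=5)

print(len(urls), "URLs split into", len(pages), "sitemap pages")
```

Each page is generated on its own request, so a smaller limit means less data (URLs plus their images) held in memory at once.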
Basically, that image loop needs to run (URL images) divided by (number of languages) times. Hmm, that will not work either if a post has a different image in one language, or more images than in another language. It may be easier just to disable images by commenting them out, unless you could add a break point/unique identifier for $img['src'] per language or, like you said earlier, make separate pages per language, which would get tricky and be harder to maintain. Images are not really needed, and I get no errors from Google sitemaps when submitting without them. The alternative is Google XML Sitemaps v3 for qTranslate, which also doesn't do images; having said that, it would be nice to have them ;) Overall, as it stands now it would help many users. It's a pretty straightforward patch even if wp-seo don't add your updated function; if they did, users would still need to manually comment out images as it stands now. |
I now understand the problem with images. We can use filter 'wpseo_sitemap_urlimages' to filter out duplicated entries, with no additional code at Yoast. I added image filtering in the latest https://github.com/qTranslate-Team/wp-seo-qtranslate-x. But I discovered a problem: Yoast does not pass home_url through the filter 'wpseo_sitemap_entry', nor through any other filter, so it stays in the default language only in sitemaps. In fact, after looking a bit more, I now suspect that making separate sitemaps per language might be possible without modifying Yoast code. I'll research a bit more. The question is, would that be ok for search engines? I do not see a reason why it would not be, but it would be good if you could research this issue to be sure? |
This http://wplang.org/sitemaps-multiple-languages-wordpress/ suggests separate sitemap per Lang |
How about this way? I committed new changes. Please, download again the latest qtx, https://github.com/qTranslate-Team/qtranslate-x, and the latest integration plugin, https://github.com/qTranslate-Team/wp-seo-qtranslate-x. No need to change Yoast code. |
Yes, I got it, I will put some answers here, because other people may have input too:
This is a real problem, unrelated to Yoast. Those duplicates of images used to come from different languages on a page, but I do not have information about which image belongs to which language in order to pick the correct "title", "caption", and "alt". I currently simply take the first out of all with the same "src". If title, caption and alt are entered on the "Edit Media" screen per image, then I can extract a proper language. This can be a lot of additional work for the admin if those entries are not filled in on the "Edit Media" page. It can probably be automated with a script. Or I have to completely reimplement the image extraction code. Let us think a bit more about the best way.
I am not sure what is meant here.
It does not fit in the way it is done now, but it may be a good idea. I did not mean to use /xx/sitemap_index.xml other than for testing. The search robots will always go to /sitemap.xml, which gets redirected to /xx/sitemap_index.xml of the default language. I can probably do "/xx-sitemap_index.xml" to show a one-language sitemap, if it is important. |
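The image handling described in this comment (keep the first of all images sharing the same "src") can be sketched like this. It is written in Python for illustration only; the function name `dedupe_images` and the sample data are made up, and the real filtering lives in the plugin's PHP code.

```python
def dedupe_images(images):
    """Keep the first image seen for each 'src', preserving the original order."""
    seen = set()
    unique = []
    for img in images:
        if img["src"] in seen:
            continue  # a later duplicate, e.g. the same file found in another language
        seen.add(img["src"])
        unique.append(img)
    return unique


# Hypothetical images collected from a post that holds several languages.
images = [
    {"src": "/uploads/a.jpg", "title": "Harbour", "alt": "Harbour view"},
    {"src": "/uploads/a.jpg", "title": "Hafen", "alt": "Hafenblick"},
    {"src": "/uploads/b.jpg", "title": "Bridge", "alt": "Old bridge"},
]
print([img["title"] for img in dedupe_images(images)])
```

The limitation mentioned above is visible here: the German title and alt text of the first image are dropped, because nothing in the entry says which language it came from.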
On image attributes: how about now? Update to the latest from https://github.com/qTranslate-Team/wp-seo-qtranslate-x. I translated $p->post_content in filter 'wpseo_xml_sitemap_post_url', and the search for images later runs on the already translated text, which includes only one language. Do you think it would be a good idea to fetch title, caption and "alt" from the image properties entered on "Edit Media", if those attributes come empty from Yoast code? |
I can confirm that:
That sounds like a great idea. Seems to be working pretty nicely now, only enhancement is to make xx/sitemap_index.xml or "/xx-sitemap_index.xml" filtered so only relevant language urls are shown, if that's even possible.
As it stands now, it beats my sitemap :) and Google XML Sitemaps v3 for qTranslate, so good job. |
On sitemap structure. Here is my understanding: I think robots try the name "sitemap.xml", which gets redirected to "sitemap_index.xml" with the default language active. That sitemap has all languages. Robots will never hit any "i18n-index-sitemap.xml" unless you submit those names to Google manually, and if you do, then whatever you submitted is the only thing that will get hit. If you submit all "i18n-index-sitemap.xml" and do not submit "sitemap_index.xml", then you are fine. I am not sure if this theory is correct. I thought that robots do not browse the site if there is "sitemap.htm". If it is there, then they will only browse what is listed in sitemap.xml. I have never seen two-level sitemap indices, and I am not sure if they are allowed. Probably yes. I can replace the main sitemap_index with an index pointing to all language index maps. Could you research to make sure that this would be a good idea? |
I've updated to hierarchical sitemaps, just to test the idea, but I think we need to revert to the previous schema. It looks like sitemap indices are not designed to list other indices. I could not find an example of that on internet. Try the latest version just out of curiosity. In fact, with the current setup you have now, you may create file sitemap.xml manually and list "i18n-index" urls for each language. Then sitemap_index will be hidden from robots, but will still work if typed in manually. |
OK, back to the sitemap. I think the way it's set up now, with sitemap_index.xml showing a list of all i18n-index langs, is correct; the structure is like the "Best Sitemap Structure" image listed above. The only thing is that, displayed like this, only sitemap_index.xml should be visible, not /(lang)/sitemap_index.xml, but I could be wrong. One issue I did see is that in my case: Maybe if case (wp-admin/options-general.php?page=qtranslate-x#general) Hide URL language information for default language = TRUE I will ask for advice regarding the hierarchy of sitemaps: |
I would not worry about redirections. Robots do not use cookies, and for them, if there is no /en/, the default language is assumed with no redirection, as you want it to be. It is only during human testing that it picks up the active language and sometimes does redirections to switch the language properly. I do not see an issue here. What is wrong with the last setup is that the "Sitemap Index" link to the parent index does not appear on index maps and is wrong on sitemaps. This link comes from the style file "main-sitemap.xsl", as far as I understand. The style file is referenced in the .xml file; you can find it in the source. Replacing it with the correct link on sitemap pages is probably doable with hack-like code. However, I do not understand why this link does not appear on the language index pages. Following the logic, it should appear, but it does not, which makes me think that index pages cannot point to index pages again. I feel more comfortable with the old sitemap_index with all languages listed in one index, unless we figure out the proper theory on this. I could not find an example of two-level sitemap indices so far. I am afraid that there is a reason for that. As I mentioned before, you may still submit i18n-index* urls to Google manually and that would be fine as well. Also, if we put a flat sitemap_index.xml with all languages by default, you may still manually create an actual sitemap.xml file in place with an index of indices, and that will hide sitemap_index.xml from the robots. Let us see what you can find out from people. If indices of indices are allowed, then we need to figure out the proper syntax for that. Please try to prepare the files manually so that they work correctly with the "Sitemap Index" link to the parent index, and I will try to do the same in PHP code. |
The problem with title & meta is really strange. I have no idea how that can happen. Could you try deactivating the plugin (then the line "plugins/wp-seo-qtranslate-x/i18n-config.json" will disappear from option "Configuration Files"), removing all the files from the file system, then putting in all files freshly downloaded from GitHub again, making sure the folder name is "wp-seo-qtranslate-x", activating again (the line "plugins/wp-seo-qtranslate-x/i18n-config.json" should be back) and seeing if it makes a difference? Please remove the additional folders and copies you created; they do not help to figure out the problem and can be very confusing. After that fresh install, please send me the configuration shown in "Configuration Inspector": /wp-admin/options-general.php?page=qtranslate-x&config_inspector=show. |
I did see a solid answer that any structure is acceptable on manual submission of maps, but I did not see an answer to two-level indices files read by robots. |
Could you please figure out the syntax of two-level indices files and send me how you would wish them to be for your site, for example. There is no problem with manual submission of maps to google either way, exactly as Irena says, it is already working in both setups. But I do not understand how to list them properly in the top level sitemap.xml. |
I'll read more too ... |
This is what I will likely do with my sitemap setup: In .htaccess (only because all search engines already use the sitemap_index.xml URL for my site) Link in footer.php Submit individual language sitemap urls to Google Webmaster Tools, apply localization where possible: Optionally, add to the end of the robots.txt file |
Yes, that is ok, but we also need a default solution for everybody that requires no additional setup. We need to figure out the syntax for two-level sitemap indices. If you look in the source of all sitemaps, you will see that "index.xml" has the xml entity "<sitemapindex>", which lists "<sitemap>" items. This type of file I call an "index map". Other files, called "sitemap", have the xml entity "<urlset>" with "<url>" items. In my theory, a "<sitemap>" item can only point to a "sitemap" and cannot point to an "index map", or there is another syntax to point an item to an index map from another parent index map. I've never seen an example of a "<sitemap>" entry pointing to an index file within "<sitemapindex>". This is what I would like to clarify. Maybe it is ok to point a "<sitemap>" entry to another index map instead of a sitemap, but then why does the style file "main-sitemap.xsl" work differently in a sitemap and in an index map? In a sitemap file it produces a link to the parent index, in red, "Sitemap Index", but in an index map this link does not appear, although we should have it. Maybe all we need to do is figure out a different syntax for the style file referred to within an index map file. Your solution with the .htaccess file will work for you in any case, but we also need to worry about people who did not dig into it that deeply and will not know that they need to do additional configuration. Manually submitting maps to Google is fine, but there are many other search engines, which might not be as important, and one cannot manually submit all sitemaps to all search engines. We need to make sure that any search engine robot will figure out all appropriate sitemaps automatically, starting from the standard "/sitemap.xml". I hope this clarifies what I wrote previously too briefly. |
Yes, I've read http://www.sitemaps.org/protocol.html; there is no mention that an index map can be listed under a "<sitemap>" tag inside another index. They either assume that it is self-evidently allowed, or it means that it is disallowed. Not sure how to find out for sure. |
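For reference, the two file types distinguished above can be sketched as follows: an "index map" is a `<sitemapindex>` whose `<sitemap>` items point to sitemaps, and a "sitemap" is a `<urlset>` whose `<url>` items point to pages, as defined by the sitemaps.org protocol. The Python below is only an illustration; the helper names and the example URLs are placeholders, not output of Yoast or the integration plugin.

```python
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_index(sitemap_urls):
    """Build a <sitemapindex> whose <sitemap> items point at plain sitemaps."""
    root = ET.Element("sitemapindex", xmlns=NS)
    for url in sitemap_urls:
        ET.SubElement(ET.SubElement(root, "sitemap"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


def build_urlset(page_urls):
    """Build a <urlset> whose <url> items point at actual pages."""
    root = ET.Element("urlset", xmlns=NS)
    for url in page_urls:
        ET.SubElement(ET.SubElement(root, "url"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


# Flat (one-level) layout: the index lists sitemaps only, never other indices.
print(build_index(["https://example.com/post-sitemap.xml",
                   "https://example.com/page-sitemap.xml"]))
print(build_urlset(["https://example.com/about/",
                    "https://example.com/de/about/"]))
```

In this flat layout every `<loc>` inside the index refers to a `<urlset>` file, which is the arrangement the thread eventually settles on.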
Yeah, Leslie, you are so good with finding the stuff 👍 That is probably the answer. Have you tried to validate your top sitemap_index.xml, which lists other index maps on google sitemap validator? |
Thanks. No I was not joking, I did not hit that one when I tried to search, although I was not trying hard enough. You got lucky to put better string to search then, so it is indeed good. |
It seems like commit 1545425 (hierarchical sitemaps) should be removed to prevent nested sitemap index errors. Then users will be able to set up sitemaps in one of 2 ways:
1) Simple - Using the default sitemap_index.xml
Advantages:
Disadvantages:
OR 2) Advanced - Submitting each language's i18n-index-sitemap.xml (i.e. not using sitemap_index.xml)
Advantages:
Disadvantages:
For both types optional:
|
I agree. The only thing is that we do not need to add a redirect in .htaccess /sitemap_index.xml to /i18n-index-sitemap.xml, if we manually submitted i18n-index-sitemap.xml to a search engine. Flat sitemap_index.xml with all languages does not hurt to have in any case (it is virtual anyway, not an actual file on file system, as you probably notice). In your case, since it is already submitted, search engines will continue to work correctly picking up all languages through this file, until you change it in search engine console to make it even better with separate maps. Search engines, which you did not configure manually, will still find all languages through default sitemap.xml, which is also virtual and gets redirected to flat sitemap_index.xml with all languages. People should always be advised to modify robots.txt with all i18n-index-sitemap.xml. Then robots, which pay attention to this configuration, will also work correctly. So, I do not see a reason to bother with additional redirection. I will revert to flat sitemap_index shortly. |
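As a small illustration of the robots.txt advice above, the per-language `Sitemap:` lines could be generated like this. The sketch is in Python; the helper name and, in particular, the `/<lang>/i18n-index-sitemap.xml` URL pattern are assumptions for the example, so check the URLs your own site actually serves before copying anything.

```python
def robots_sitemap_lines(base_url, languages, filename="i18n-index-sitemap.xml"):
    """Return one robots.txt 'Sitemap:' line per language (assumed URL pattern)."""
    base = base_url.rstrip("/")
    return [f"Sitemap: {base}/{lang}/{filename}" for lang in languages]


# Hypothetical site and language list.
for line in robots_sitemap_lines("https://example.com/", ["en", "de", "da"]):
    print(line)
```

Robots that honour the `Sitemap:` directive would then discover every language index without any manual submission.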
I checked in what is hopefully the final version in terms of sitemaps. Please review the notes on page "/wp-admin/admin.php?page=wpseo_xml". I hope all human links in sitemaps are also tuned correctly, depending on the referrer page; try to break it by clicking all the links. Please also test all other features that you use. Thanks a lot for all your help. |
If you kept your browser open all this time, please use Ctrl+F5 to refresh cache. |
Discovered some browser caching problems with xsl files and just updated again; please download it again if you have already done so. |
Seems like it's all working fine, good job, and thanks for adding it. I guess we can only be sure once it's live. I assume the next biggie everyone will want is working Page Analysis :) FYI:
https://yoast.com/dev-blog/yoast-seo-breaking-api-changes/ |
Yes, page analysis, I am afraid we will need Yoast cooperation for that ... Did you have a chance to test all other stuff?
|
Just had a major issue when putting the latest GitHub versions on my live site. I reverted to the current WP version in the meantime, until I figure out what's wrong. My test server has the same issue. Index.php
Generates this:
Seems like the_content isn't working and adding class="qtranxs-available-languages-message qtranxs-available-languages-message-no" In default language, it generates correctly eg:
qTranslate-Team/qtranslate-x@fa98566 is the reason, if I delete
in qtranslate_frontend.php then the site works normally again, maybe related issue qTranslate-Team/qtranslate-x#271 |
For me, the latest version's sitemap works just fine for all languages. I still have not added anything to the robots.txt file; I am testing on a live site. |
@Lesrad: yeah, this is what I meant by "test all other stuff". Yes that line appears to be a big trouble, it breaks "more" tags severely and unrecoverably. I think the filter "qtranxf_postsFilter" was created to fix "more" tag problem and I thought that it would be called with filter='display' from the_content, but for some reason WP calls it in 'raw'. That line was added for one of the first version of sitemaps, but then it was done in a different way as far as I can now tell. When this line is commented out, do all sitemap continue to work correctly including proper image attributes translation? If so, I will comment it out permanently. It also broke "get_the_excerpt", and it may show more problems later. Could you please re-test all without that line, and if all is ok, I will undo that line. |
Lesrad, that person you are referring to is me :) I don't know the rules for sharing the site here, so I just sent a message to the qtranslatexteam email (I guess that's John and Gunu :) ). It works, and I think it is nearly perfect or even perfect, hehe. All post_sitemaps, page_sitemaps, category_sitemaps and tag_sitemaps are shown the proper way, each for its own language; my favorite, tags, are shown in all 3 languages :) But I did not notice any error, so I guess I will need to check if I have one now. |
I guess, we are ready to close this issue, we can still write into a closed issue, or we can re-open it, if needed. If new problems are discovered, I would suggest to open a new issue, since this one is already exceptionally long. We can cite it from new issue if needed. Thank you very much to all of you for invaluable help. |
On some pages I noticed that q-translate slugs in other languages are not read by Yoast, which gives the message: Only the first language version says yes, but I don't think it is a problem; just letting you know. |
Sorry for the delayed response, got busy with some personal stuff. Here are some screenshots that may help others; John, feel free to reuse them if you wish. Submit Sitemap to Google: https://www.google.com/webmasters/tools/home?hl=en
Submit Sitemap to Bing: https://www.bing.com/webmaster/configure/sitemaps/home
Robots (Add/Edit to sites root directory: robots.txt)
Footer (wp-theme: footer.php)
|
A sitemap is only made for one language when enabling sitemaps with this plugin addon. It would be great if the plugin could generate a sitemap for all languages.
I used to manually hack wp-seo to make it work for my site including sitemap generation, see this post maybe it will give you some tips: https://wordpress.org/support/topic/mqtranslate-and-multilingual-seo-with-yoast-seo/page/2?replies=64
but I'm not a coder so I'm sure you guys could make a better version.