Duplicate entries for tags with spÉcial characters #623

Closed
benbonnet opened this issue Jan 12, 2015 · 27 comments

@benbonnet

I've updated everything, but the problem remains.

For example, if the tag "Numérique" is saved when it already exists, acts_as_taggable breaks.
The same goes for "numerique".

Is there a way to handle this?

@seuros
Collaborator

seuros commented Jan 12, 2015

You have to provide either a test case or a backtrace.
If you are using MySQL, check your encoding.

@benbonnet
Author

Sorry about that.

Here is a gist of what occurs: https://gist.github.com/bbnnt/7aadc59bea0edb24630d
"Numérique" exists and saves, but "Numerique" neither exists nor saves; it breaks instead. As far as I can understand, it is because of the key, but I really don't know how to solve it.

The table's encoding is UTF-8 Unicode and its collation is utf8_general_ci.

@benbonnet changed the title from "Duplicate entries for tags with spécial characters" to "Duplicate entries for tags with spÉcial characters" on Jan 12, 2015
@seuros
Collaborator

seuros commented Jan 12, 2015

I can't reproduce this bug.

@seuros
Collaborator

seuros commented Jan 12, 2015

Can you reproduce it with a fresh database/app ?

@benbonnet
Author

Yes :/
I ran rake db:drop db:create db:migrate.

Then I tried again, starting with only the commands that you'll see in the gist / capture:
https://gist.github.com/bbnnt/d5a5489182ac7bfe9be8
(I've highlighted the commands to make it clearer here: http://cl.ly/image/1F391b0n3O1x?_ga=1.72731347.1360774124.1418737841 )

I checked again: the table has the encoding specified above by default.

@inket

inket commented Feb 16, 2015

Could reproduce this bug. It also happens with the Japanese space -> " "

@benbonnet
Author

Too bad it is not such an issue for the maintainer, as the logs provided pretty much show that it does occur.

@seuros
Collaborator

seuros commented Feb 16, 2015

@bbnnt: "Too bad it is not such an issue for the maintainer, as the logs provided pretty much show that it does occur"
That's not nice of you. This is an open-source project; you can step up, fix it, and send a PR.
Your argument would be valid if I had refused to merge a fix or ignored one.

@seuros seuros added the bug!! label Feb 16, 2015
@benbonnet
Author

@seuros that was not an argument, more or less just a way to get a reply from you (:
I'll look into this.

@seuros
Collaborator

seuros commented Feb 16, 2015

👍

@bf4
Collaborator

bf4 commented Feb 16, 2015

@bbnnt This project is really hard to maintain. It's a really old code base with lots of bugs, frequently reported issues, and none of the maintainers, as far as I know, use it anymore.

@benbonnet
Author

@bf4 damn, I'm getting old-schooled! Would you recommend another gem with similar functionality?

@seuros
Collaborator

seuros commented Feb 17, 2015

@bbnnt The fact that we don't use this gem doesn't mean we are using a better alternative. It simply means we don't need it.

@bf4
Collaborator

bf4 commented Feb 17, 2015 via email

@rikettsie
Contributor

@bbnnt the problem is related to the COLLATION actually applied to the 'name' column when a new tag name is about to be stored.
A tag 'name' is stored by the gem as a 'binary encoded string', but if the collation of that column is not 'utf8_bin', comparisons are not made properly, so the uniqueness constraint expressed by the index 'index_tags_on_name' generates the error you experienced.

As a quick workaround, you could alter the 'name' column of the 'tags' table, e.g. in MySQL:

ALTER TABLE tags MODIFY name VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_bin;
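
To see why the collation matters, here is a rough Ruby sketch of the two comparison styles. The helpers ci_key and bin_key are hypothetical names made up for this illustration; ci_key only approximates an accent- and case-insensitive collation such as utf8_general_ci and is not MySQL's actual algorithm.

```ruby
# Approximation of an accent- and case-insensitive collation:
# decompose characters, drop combining marks, then downcase.
# (Illustration only; MySQL's utf8_general_ci uses its own rules.)
def ci_key(name)
  name.unicode_normalize(:nfd).gsub(/\p{Mn}/, "").downcase
end

# A binary collation such as utf8_bin compares the raw bytes.
def bin_key(name)
  name.b
end

names = ["Numérique", "Numerique", "numerique"]

# Under the insensitive comparison all three names collide,
# so a unique index on the column would reject the 2nd and 3rd.
puts names.map { |n| ci_key(n) }.uniq.size   # => 1

# Under the binary comparison they are three distinct values.
puts names.map { |n| bin_key(n) }.uniq.size  # => 3
```

If the application looks tags up by exact string but the unique index compares them insensitively, the lookup misses the existing row and the following INSERT hits the index, which matches the error described in this thread.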

@benbonnet
Author

@rikettsie wow it seems to have solved the problem (:
thx a lot

@rikettsie
Contributor

You are welcome ;-)

@seuros
Collaborator

seuros commented Feb 26, 2015

@rikettsie: Could you update the README and send a PR?

rikettsie added a commit to rikettsie/acts-as-taggable-on that referenced this issue Feb 26, 2015
seuros added a commit that referenced this issue Feb 26, 2015
#623: updated README (added suggestion for manual column alteration).
rikettsie added a commit to rikettsie/acts-as-taggable-on that referenced this issue Feb 26, 2015
seuros added a commit that referenced this issue Mar 1, 2015
#623: solve matching of binary-encoded strings with MySql (via rake rule or initializer setting)
@Morred

Morred commented Apr 27, 2015

@seuros @rikettsie @bbnnt Since the change_collation_for_tag_names migration now gets copied over together with the other migrations when calling 'rake acts_as_taggable_on_engine:install:migrations', is it still necessary to set 'ActsAsTaggableOn.force_binary_collation = true' in the initializer, as described in the README?
If I do so, it throws a MySQL error when running 'rake db:migrate', which doesn't happen when I do not set it in the initializer.

@carlosescri

@Morred it seems that it's already included in the migrations. After running ./bin/rake db:migrate part of the output was:

== 20150427105348 ChangeCollationForTagNames: migrating =======================
-- execute("ALTER TABLE tags MODIFY name varchar(255) CHARACTER SET utf8 COLLATE utf8_bin;")
   -> 0.0512s
== 20150427105348 ChangeCollationForTagNames: migrated (0.0522s) ==============

@rikettsie
Contributor

@Morred can you provide the error you obtain while migrating?
@carlosescri, yes the behaviour is the same as setting force_binary_collation = true. Having the parameter exposed is useful because one can switch it to false/true as preferred independently of migration.

@Morred

Morred commented Apr 30, 2015

@carlosescri Yes, it's in the migrations that get copied over when you run "rake acts_as_taggable_on_engine:install:migrations".

@rikettsie If I set ActsAsTaggableOn.force_binary_collation = true in the initializer, I get this error when running "rake db:migrate":

-- execute("ALTER TABLE tags MODIFY name varchar(255) CHARACTER SET utf8 COLLATE utf8_bin;")
rake aborted!
ActiveRecord::StatementInvalid: Mysql2::Error: Table 'foo_development.tags' doesn't exist: ALTER TABLE tags MODIFY name varchar(255) CHARACTER SET utf8 COLLATE utf8_bin;
/Users/laura/foo/bar/config/initializers/acts_as_taggable_on.rb:1:in `<top (required)>'
/Users/laura/foo/bar/config/environment.rb:5:in `<top (required)>'
Mysql2::Error: Table 'foo_development.tags' doesn't exist
/Users/laura/foo/bar/config/initializers/acts_as_taggable_on.rb:1:in `<top (required)>'
/Users/laura/foo/bar/config/environment.rb:5:in `<top (required)>'
Tasks: TOP => db:migrate => environment

This happens only if the migrations that come from "rake acts_as_taggable_on_engine:install:migrations" haven't run yet, and therefore the tags table doesn't exist yet. If I remove the line in the initializer, the migrations run without a problem, and if I re-add the line after that, I can also run it without issues once the table exists. So I guess it tries to run the initializer before the actual migrations for some reason?

@rikettsie
Contributor

Yes @Morred, you should install and run the migrations before adding the force_binary_collation parameter to the initializer file (otherwise the parameter is applied when the application environment loads for the first time, while the table does not exist yet).
I can fix this edge case.
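
One way to handle this case is to ignore the setting when the ALTER statement fails, so that booting the environment never aborts. The sketch below is not the gem's actual code: TaggingConfig and the injected runner are hypothetical, and the runner stands in for the database connection so the example runs without MySQL.

```ruby
# Hypothetical configuration object, for illustration only.
class TaggingConfig
  attr_reader :force_binary_collation

  # runner is any callable that executes a SQL string.
  def initialize(runner)
    @runner = runner
    @force_binary_collation = false
  end

  # Try to switch the column to a binary collation. If the statement
  # fails (for example because the tags table does not exist yet),
  # log a warning and leave the setting disabled instead of raising.
  def force_binary_collation=(value)
    @force_binary_collation = false
    return unless value

    @runner.call("ALTER TABLE tags MODIFY name VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_bin;")
    @force_binary_collation = true
  rescue StandardError => e
    warn "force_binary_collation ignored: #{e.message}"
  end
end

# Simulates a fresh checkout where the migrations have not run yet.
missing_table = ->(_sql) { raise "Table 'tags' doesn't exist" }
config = TaggingConfig.new(missing_table)
config.force_binary_collation = true
puts config.force_binary_collation  # => false, and nothing was raised
```

With this shape, a freshly cloned project can run 'rake db:migrate' with the initializer line already in place; the setting takes effect on the next boot, once the table exists.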

rikettsie added a commit to rikettsie/acts-as-taggable-on that referenced this issue Apr 30, 2015
seuros added a commit that referenced this issue Apr 30, 2015
#623 collation parameter is ignored if it generates an exception.
@seuros seuros closed this as completed Apr 30, 2015
@Morred

Morred commented May 1, 2015

@rikettsie That would be great. If it's only myself, I can run the migrations and then add that line, but if there are more people working on a project and somebody git clones the thing and then tries to run the migrations, they would have to remove the line, run the migrations and then add the line back into the initializer, which is somewhat inconvenient. A fix would be very much appreciated!

@rikettsie
Contributor

@Morred, I fixed it yesterday and the diff was merged into master. It is ready for the next version.

@Morred

Morred commented May 2, 2015

@rikettsie Awesome, thanks!

markedmondson pushed a commit to markedmondson/acts-as-taggable-on that referenced this issue Jun 18, 2015
* master: (26 commits)
  mbleigh#623 collation parameter is ignored if it generates an exception.
  Update release date for 3.5.0
  Update README.md
  Changing ActsAsTaggable to ActsAsTaggableOn
  version 3.5.0
  Add context constraint to find_related_* methods. Fixes mbleigh#628
  added rake rule and a config parameter to force binary collation (MySql)
  added migration and rake task
  mbleigh#623: added manual column alter suggestion to fix special characters in tags (MySql only)
  version bump
  Add context constraint to find_related_* methods. Fixes mbleigh#628
  Fixing typo in docs with strong typing
  Namespaced TagList usage in a customer parser
  version bump, test on ruby 2.2, remove rails edge from matrix.
  clears column cache on reset_column_information resolves mbleigh#621
  Use the new build env on Travis
  Update README.md
  sha_prefix should not be random
  Fix milestones link
  Meet interface expectation for active record.
  ...
@neverhoodboy

neverhoodboy commented Feb 20, 2020

@bbnnt the problem is related to the COLLATION actually applied to the 'name' column when a new tag name is about to be stored.
A tag 'name' is stored by the gem as a 'binary encoded string', but if the collation of that column is not 'utf8_bin', comparisons are not made properly, so the uniqueness constraint expressed by the index 'index_tags_on_name' generates the error you experienced.

As a quick workaround, you could alter the 'name' column of the 'tags' table, e.g. in MySQL:

ALTER TABLE tags MODIFY name VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_bin;

For anyone using the character set of utf8mb4, the migration should be something like:

ALTER TABLE tags MODIFY name VARCHAR(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;

And in the case of MySQL 5.6:

ALTER TABLE tags MODIFY name VARCHAR(191) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
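
The comment above does not say where 191 comes from. A likely explanation, stated here as an assumption, is the 767-byte index key prefix limit that applies to InnoDB tables in the row formats MySQL 5.6 uses by default: utf8mb4 needs up to 4 bytes per character, while utf8 needs up to 3. The small sketch below just does that arithmetic; the constant and method names are made up for the example.

```ruby
# Assumed InnoDB index key prefix limit (COMPACT/REDUNDANT row formats).
INDEX_PREFIX_LIMIT_BYTES = 767

# Maximum bytes one character can take in each MySQL character set.
MAX_BYTES_PER_CHAR = { "utf8" => 3, "utf8mb4" => 4 }.freeze

# Longest VARCHAR length whose full value still fits in the index prefix.
def max_indexable_chars(charset)
  INDEX_PREFIX_LIMIT_BYTES / MAX_BYTES_PER_CHAR.fetch(charset)
end

puts max_indexable_chars("utf8")     # => 255
puts max_indexable_chars("utf8mb4")  # => 191
```

This is why VARCHAR(255) fits under utf8 but has to shrink to VARCHAR(191) under utf8mb4 when the column carries a unique index and the limit applies.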

tekniklr pushed a commit to tekniklr/acts-as-taggable-on that referenced this issue Mar 19, 2021
tekniklr pushed a commit to tekniklr/acts-as-taggable-on that referenced this issue Mar 19, 2021
mbleigh#623: updated README (added suggestion for manual column alteration).
tekniklr pushed a commit to tekniklr/acts-as-taggable-on that referenced this issue Mar 19, 2021
mbleigh#623: solve matching of binary-encoded strings with MySql (via rake rule or initializer setting)
tekniklr pushed a commit to tekniklr/acts-as-taggable-on that referenced this issue Mar 19, 2021
tekniklr pushed a commit to tekniklr/acts-as-taggable-on that referenced this issue Mar 19, 2021
mbleigh#623 collation parameter is ignored if it generates an exception.

8 participants