Comments
-
Greg Hurrell
Created, edited
Been thinking about this and I think the right way to implement this is the following:
We have a model, final name to be decided, called TagCorrection, TagTransform, TagTransformation, TagAlias or similar.
It basically has two attributes: the original tag name and what it should be corrected/transformed to.
original    | corrected
--------------------------
upgrade     | updates
upgrades    | updates
update      | updates
unittesting | unit.testing
This table is consulted at two times:
- When tags are added to a record, i.e. each tag is checked to see if it should be transformed before being added
- When new entries are added to the table (or an existing entry is modified), we run through all tags in the database and check for names which should be transformed
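A minimal plain-Ruby sketch of the first lookup (at tagging time), assuming a space-separated tag list. The `CORRECTIONS` hash and the `correct_tag_list` helper are hypothetical stand-ins: in the application the hash would be replaced by the corrections model, and this is not the real `parse_tag_list`.

```ruby
# Hypothetical in-memory stand-in for the corrections table (original => corrected).
CORRECTIONS = {
  'upgrade'     => 'updates',
  'upgrades'    => 'updates',
  'update'      => 'updates',
  'unittesting' => 'unit.testing'
}.freeze

# Split a space-separated tag list, replace any name that has a correction,
# and drop duplicates that the corrections may have introduced.
def correct_tag_list(tag_list, corrections = CORRECTIONS)
  tag_list.split.map { |name| corrections.fetch(name, name) }.uniq
end

p correct_tag_list('foo update')          # => ["foo", "updates"]
p correct_tag_list('upgrade updates bar') # => ["updates", "bar"]
```

Because every name in the list is looked up against the same table, a database-backed version could fetch all matching corrections at once rather than one per tag.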
To illustrate those two scenarios:
First, we create a new wiki article "Updating to Foo 2.0", and mistakenly tag it with "foo update". "foo" is checked in the correction table and not found, so it gets added without modification. "update" is found in the table, so it gets transformed to "updates" before being added. The place where this check would be performed would be in the parse_tag_list method, most likely. It would add a single database query to the process, because we could query for all tags in one go.
Second, with a lot of articles in the database tagged with "foo", the "Foo" project changes its name to "Bar". So we add foo -> bar to our corrections table. At that point we check the "tags" table for a tag called "foo", and rename it to "bar". If "bar" already exists, validation will fail and the database-level constraint will kick in anyway, so we have to be ready to handle that case (most likely by iterating through all "foo" tags and retagging the associated objects with the existing "bar" tag).
-
Greg Hurrell
I've made some more notes about the finer details, but no time to enter them here now so will have to do that after the weekend.
One thing I will note now though is that I did think about a "non-destructive" version of this operation (the operation is destructive because it's irreversible; if you change 1,000 records with tag "foo" to "bar" then there's no way to go back to tagging them as "foo" without doing it for each one, unless you can change all "bar"-tagged items to "foo" again, which might not be an option).
The solution would be to add a new column to the tags table which would be like an "alias" column. Given tag "foo" with alias "3", you'd look up the tag with id 3 to find out what the authoritative name for the tag was.
While this is nice and non-destructive, it could potentially add a bunch of queries, especially on pages with a lot of tags.
In the end, I don't think the trade-off is worth it, and the optimal solution is to just go with the "destructive" updates and trust the admin to know what he's doing when managing tags.
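To make the cost of the rejected alias approach concrete, here is a small plain-Ruby sketch. `AliasedTag`, `TAGS_BY_ID` and `resolve` are hypothetical names used only for illustration; each hop through `alias_id` stands for what would be an extra query against a real tags table.

```ruby
# Hypothetical tag record: alias_id points at the authoritative tag, or is nil.
AliasedTag = Struct.new(:id, :name, :alias_id)

TAGS_BY_ID = [
  AliasedTag.new(1, 'foo', 3),   # "foo" is an alias for tag 3
  AliasedTag.new(2, 'qux', nil),
  AliasedTag.new(3, 'bar', nil)
].map { |tag| [tag.id, tag] }.to_h

# Follow alias_id until an authoritative tag is reached.
# Returns the authoritative tag and the number of extra lookups it took.
def resolve(tag, tags_by_id)
  lookups = 0
  seen = [tag.id]
  while tag.alias_id
    tag = tags_by_id.fetch(tag.alias_id)
    lookups += 1
    raise ArgumentError, 'alias cycle' if seen.include?(tag.id)
    seen << tag.id
  end
  [tag, lookups]
end

tag, lookups = resolve(TAGS_BY_ID[1], TAGS_BY_ID)
puts "#{tag.name} (extra lookups: #{lookups})" # prints "bar (extra lookups: 1)"
```

A page showing many aliased tags would pay that extra lookup for each of them unless the aliases were preloaded.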
-
Greg Hurrell
Ok, so some of those "finer details" I mentioned earlier.
First up, all this logic can be neatly packed into a TagTransform observer which fires on after_save and/or after_create.
Now, let's look at the "renaming a tag" case again.
Case 1: tag "foo" renamed to tag "bar" (new tag)
As noted above, if "bar" doesn't already exist, the operation succeeds and all is easy.
Case 2: tag "foo" renamed to tag "baz" (existing tag)
When we try to save the renamed tag we'll get an ActiveRecord::RecordInvalid exception (i.e. if we do save! rather than save).
Even if it weren't for the uniqueness validation, we'd fail at the level of the database constraint (ActiveRecord::RecordNotUnique).
So in this case, what do we do?
If the save fails due to a validation error and the error is on the "name" attribute and it is about uniqueness, then we can proceed to look at the taggings table.
We can't just do an UPDATE ALL to change all taggings that point at "old_tag" to point at "new_tag" instead, because this table too has a uniqueness constraint, and if it gets triggered by any of the changed taggings then all of the changes will be rolled back.
This can happen, say, when an item is already tagged with "baz", and we try to change the "foo" tag on it to "baz" as well.
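To make the collision concrete, here is a plain-Ruby sketch in which taggings are (tag, item) pairs and duplicate pairs are forbidden. It is not the application's code: `Tagging`, `DuplicateTagging`, `repoint_all`, `repoint_each` and `retag` are hypothetical names, and `DuplicateTagging` merely plays the role of ActiveRecord::RecordNotUnique. The sketch shows an all-or-nothing repoint failing, and one possible row-by-row fallback that drops taggings whose renamed twin already exists.

```ruby
require 'set'

# Hypothetical stand-in for a row in the taggings table.
Tagging = Struct.new(:tag, :item)

# Plays the role of ActiveRecord::RecordNotUnique in this sketch.
class DuplicateTagging < StandardError; end

# All-or-nothing repoint: raises (changing nothing) if any renamed
# tagging would duplicate an existing (tag, item) pair.
def repoint_all(taggings, from, to)
  updated = taggings.map { |t| t.tag == from ? Tagging.new(to, t.item) : t }
  raise DuplicateTagging if updated.uniq.size != updated.size
  updated
end

# Row-by-row repoint: a tagging whose renamed twin already exists is dropped.
def repoint_each(taggings, from, to)
  existing = taggings.reject { |t| t.tag == from }.to_set
  taggings.each_with_object([]) do |t, result|
    if t.tag != from
      result << t
    else
      renamed = Tagging.new(to, t.item)
      result << renamed if existing.add?(renamed) # add? returns nil for a duplicate
    end
  end
end

# Try the fast path first, fall back to the slow path on a collision.
def retag(taggings, from, to)
  repoint_all(taggings, from, to)
rescue DuplicateTagging
  repoint_each(taggings, from, to)
end

taggings = [
  Tagging.new('foo', 'article-1'),
  Tagging.new('baz', 'article-1'), # article-1 already carries "baz"
  Tagging.new('foo', 'article-2')
]
p retag(taggings, 'foo', 'baz').map(&:to_a)
# => [["baz", "article-1"], ["baz", "article-2"]]
```

Here article-1 ends up with a single "baz" tagging, and article-2 is repointed normally.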
So I guess the approach here is to first try UPDATE ALL (for speed), and if it fails with an ActiveRecord::RecordNotUnique exception then iterate painstakingly through each tagging that needs to be changed. For each which fails (with the same exception), we just destroy the tagging, because we know a duplicate already exists.
The alternative is to find out the affected taggables and loop through them, using the high-level tag method to just retag them all over again, triggering the standard logic which gets applied whenever creating new tags.
-
Greg Hurrell
Another idea to play with:
- Instead of having a separate table, add a new column to the tags table containing the desired transformation.
Or:
- Instead of having a table with two columns (for tag name and correction), use some kind of Active Record association (i.e. the correction "belongs to" the tag)
Or:
- A combination of these two ideas, in which the extra column contains a tag ID rather than a string
Will need to think through the implications of each of these ideas.
-
Greg Hurrell
Would have liked to knock this one off today but unfortunately was busy with other stuff. Did at least get some refactoring and clean up of the tagging code done though, which will lay some groundwork for this feature:
b4a94e4 Refactor 'save_pending_tags' callback for efficiency
63d5e18 Make counter cache decrement properly after untagging
2545e27 Reformat ActsAsTaggable specs
efacb1f Be more specific about ignored exception in add_tag method
d93802f Tidy up parse_tag_list implementation
-
Greg Hurrell
A new use case: Apple has officially rebranded Mac OS X as OS X.
Want all those items tagged with "mac.os.x" to be redirected to "os.x".
In this case, simply renaming the tag might be easiest. That might break some existing links.
-
Greg Hurrell
Status changed:
- From: open
- To: closed