Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Forward slash is valid url character but removed #98

Closed
kmturley opened this issue Dec 6, 2020 · 14 comments
Closed

Forward slash is valid url character but removed #98

kmturley opened this issue Dec 6, 2020 · 14 comments

Comments

@kmturley
Copy link

kmturley commented Dec 6, 2020

Slugify is designed to support valid url characters. However if I pass in a forward slash:

slugify(`Something/Another thing`, { lower: true })

The result is:

somethinganother-thing

Where I would expect:

something/another-thing

This can of course be mitigated by changing the remove RegEx adding \/:

console.log(slugify(`Something/Another thing`, { lower: true, remove: /[^\w\s$*_+~.()'"!\-:@\/]+/g }));

But this is not a very elegant solution. Ideally there would be an ignore/exclude character feature?

console.log(slugify(`Something/Another thing`, { lower: true, exclude: '/' }));
@JuanFML
Copy link

JuanFML commented Dec 10, 2020

Hi! I am new here, I was wondering if I could try to fix this issue?

@simov
Copy link
Owner

simov commented Dec 10, 2020

Generally speaking I don't think adding additional exclude option is the way to go. It simply doesn't make sense because you have a remove regex by default, that excludes characters from being available in the result slug, and then you have an exclude option that excludes from the excluded characters. You see what I'm saying? So there might be an elegant way to solve this but I'm not sure what it is.

@Trott
Copy link
Collaborator

Trott commented Dec 10, 2020

This is probably an over-engineered and inelegant solution, but I can't tell so here it is anyway:

It would be a breaking change, but perhaps the strict option can be deprecated in favor of a mode setting that affects the defaults of things like the remove regex. slugify can ship with a strict mode and loose (or sloppy or whatever you want to call it) mode, but people could add their own and perhaps even ship them as separate packages, effectively being a plugin:

const slugify = require('slugify')
const slashesMode = require('slugify-plugin-slashes-mode')

slugify.addMode(slashesMode) 
slugify.mode = 'slashes'

// Or maybe `.use()` or `.useMode()` to do both of the above two lines at once?

@simov
Copy link
Owner

simov commented Dec 10, 2020

Thanks @Trott, so now we are building a wrapper for regular expressions, although I really do see your point. I like the idea of having the strict mode by default in a next major. I think the main confusion stems from the fact that you have to deal with the existing remove regex by default. I imagine people feeling much more comfortable about crafting their own remove regex from scratch but I might be wrong.

@kmturley
Copy link
Author

kmturley commented Dec 10, 2020

I was thinking of two potential solutions:

  1. Input of multiple characters (include/exclude lists), converted to a single RegEx
    slugify(Something/Another thing, { lower: true, exclude: ['/'] })

  2. Input of multiple RegExes, combined to a single RegEx
    slugify(Something/Another thing, { lower: true, exclude: ['\/'] })

There are some attempts to combine or simplify RegExs:
https://stackoverflow.com/questions/869809/combine-regexp
https://github.com/spadgos/regex-combiner
https://github.com/gogeorge/EasyRegex
https://github.com/aaditmshah/regex

Not sure if that is a reliable solution

@Trott
Copy link
Collaborator

Trott commented Dec 10, 2020

Thanks @Trott, so now we are building a wrapper for regular expressions, although I really do see your point.

It would also allow one to alter other settings so the possibilities become pretty substantial. A mode plugin could alter the charmap, so there could be an easy way to add Pinyin for users who need that, but it can be omitted by default for users who don't need thousands of characters increasing the size of their library.

@simov
Copy link
Owner

simov commented Dec 11, 2020

@Trott that makes sense, but it will also require complete overhaul of how the module works.

@kmturley thanks for your suggestions.

@Trott
Copy link
Collaborator

Trott commented Aug 27, 2021

This is perhaps more maintainable and understandable than passing in a remove regular expression.

function slugifyWithSlashes(myString) {
 return myString.split('/').map((val) => slugify(val)).join('/')
}

@Trott
Copy link
Collaborator

Trott commented Aug 28, 2021

Repeating what I wrote in #131 (comment) here:

I always considered a slug to be a URL element, not a path, so keeping slashes is very much unexpected to me.

To me, the canonical use case is "I have a title of an article and I want to convert that article title to a component in a URL."

So I might have a blog at www.example.com/my-awesome-blog. And if I post something today, it might end up in www.example.com/my-awesome-blog/2021/08/28/some-slug-value-here.

So if I have an article titled "Better A/B Testing", I want a slug like better-ab-testing or Better-AB-Testing or better-a-b-testing but most certainly not better-a/b-testing.

The use case where keeping slashes is expected seems unusual to me. Can you explain how this came up for you?

@kmturley
Copy link
Author

The use case is when constructing a url with multiple slug parts:

county: Los Angeles
city: Culver City

With the current functionality you could do:

`${slugify(location.county)}/${slugify(location.city)}`

But with the suggested feature it would be:

slugify(`${location.county)}/${location.city}`)

This is an example for just two url parts, but there could be n number

@Trott
Copy link
Collaborator

Trott commented Aug 28, 2021

This is an example for just two url parts, but there could be n number

Interesting. For what it's worth, I'd be inclined to put all the URL parts into an array and use map() and join():

urlParts = ['Los Angeles',  'Culver City', 'California', 'United States of America', '99122', 'Earth']

urlParts.map(slugify).join('/')

Being able to run slugify() once may be more performant, but I don't think there's too situations out there where the bottleneck is slugify().

@kmturley
Copy link
Author

Sure, the performance benefit is negligible. But one RegEx is more performant than a loop. For me it's the convenience and to make code more DRY.

Forward slashes are very closely related to slugs so either way, some folks will need the workaround! I will understand if you don't feel it's a feature slugify should support.

@Trott
Copy link
Collaborator

Trott commented Aug 29, 2021

By the way, the use case suggested here would result in a bug in user code if a county or city (or any other string supplied) contained a slash in its name, whereas using an array with map() and join() or any of the other alternatives avoids that footgun entirely. I understand the motivation, but the more I think about this, the more I think this is a feature we probably don't want.

@kmturley
Copy link
Author

Fair enough, closing!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants