Implement titlecase function. Fix #19465. #19469
@@ -403,6 +403,25 @@ julia> lowercase("STRINGS AND THINGS")
lowercase(s::AbstractString) = map(lowercase, s)

"""
    titlecase(s::AbstractString)

Returns `s` with all initial characters converted to titlecase.
Could you define "initial character"? Maybe also explain what's exactly titlecase.
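For reference, titlecase and uppercase differ only for a handful of characters, such as the Latin digraphs. A minimal sketch of the distinction, assuming Julia 1.x, where `titlecase` and `uppercase` both accept a `Char`; the code points are written as escapes so the example does not depend on fonts or encoding, and the variable names are purely illustrative:

```julia
# Latin digraph "lj" as a single code point (U+01C9).
lj = '\u01c9'

# Titlecase capitalizes only the first letter of the digraph (U+01C8),
# whereas uppercase capitalizes both letters (U+01C7).
lj_title = titlecase(lj)
lj_upper = uppercase(lj)

# For most characters the two mappings coincide.
a_title, a_upper = titlecase('a'), uppercase('a')
```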
```
"""
function titlecase(s::AbstractString)
    words = split(lowercase(s))
Why call `lowercase`? That's not mentioned in the docstring.
This is the behaviour I know from e.g. Python. The question is whether we want this.
This might be more useful
julia> str = "ISS - international space station"
julia> titlecase(str)
"ISS - International Space Station"
than this
julia> str = "ISS - international space station"
julia> titlecase(str)
"Iss - International Space Station"
Indeed, the first one sounds more useful, and you can always get the second one by calling `lowercase` first (while the contrary isn't possible).
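The trade-off above can be sketched with a single-pass helper that titlecases the first character of each whitespace-separated word and leaves everything else untouched. `titlecase_sketch` is a hypothetical name used for illustration, not necessarily the code in this PR, and the sketch assumes Julia 1.x and that only whitespace separates words. Base's own `titlecase` may behave differently depending on the Julia version.

```julia
# Hypothetical helper: titlecase the first character of every word,
# keep all other characters (including whitespace) exactly as given.
function titlecase_sketch(s::AbstractString)
    startword = true          # true while the next non-space char begins a word
    buf = IOBuffer()
    for c in s
        if isspace(c)
            print(buf, c)     # whitespace is copied verbatim
            startword = true
        else
            print(buf, startword ? titlecase(c) : c)
            startword = false
        end
    end
    return String(take!(buf))
end

str = "ISS - international space station"
preserved = titlecase_sketch(str)             # acronym stays intact
lowered   = titlecase_sketch(lowercase(str))  # Python-like result
```

Because the helper never lowercases, the caller chooses between the two behaviours by optionally calling `lowercase` first.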
Thanks for the feedback @nalimilan and @stevengj.
@test titlecase('ǉ') == 'ǈ'
@test titlecase("ljubljana") == "Ljubljana"
@test titlecase("aBc ABC") == "ABc ABC"
@test titlecase("abcD EFG\n\thij") == "AbcD EFG\n\tHij"
Add a couple of spaces after D to make sure multiple spaces are preserved.
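To illustrate why such a test matters: a split-and-join approach silently collapses runs of whitespace. The sketch below assumes Julia 1.x (`uppercasefirst` did not exist under that name when this PR was written), and `titlecase_split` is a hypothetical name for illustration only.

```julia
# Hypothetical split-based variant: split() with no delimiter splits on any
# whitespace and drops empty fields, so the original spacing is lost.
titlecase_split(s::AbstractString) = join(map(uppercasefirst, split(s)), " ")

input = "abcD   EFG\n\thij"   # three spaces after D, then newline + tab
collapsed = titlecase_split(input)
```

Here `collapsed` no longer contains the three spaces, the newline, or the tab, which is exactly what a test with multiple spaces would catch.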
@@ -403,6 +403,31 @@ julia> lowercase("STRINGS AND THINGS")
lowercase(s::AbstractString) = map(lowercase, s)

"""
    titlecase(s::AbstractString)
You'll also need to add a placeholder to `doc/stdlib/strings.rst` for `titlecase` to be included in the manual, and re-run `julia doc/genstdlib.jl`.
Needs a NEWS.md item, too.
fdbfac9 to 30b4b7f
Done.
LGTM.
@@ -62,6 +62,8 @@ Library improvements

* New `accumulate` and `accumulate!` functions, which generalize `cumsum` and `cumprod`. Also known as a [scan](https://en.wikipedia.org/wiki/Prefix_sum) operation ([#18931]).

* New `titlecase` function, which capitalizes the first character of each word within a string ([#19469]).
Should `doc/NEWS-update.jl` be run in order to create the link?
Yup. Done.
No, we are holding off on running that... we will update a bunch of links at once in a separate PR. Updating the NEWS links in individual PRs creates too many merge conflicts.
Clarify docstring. Single-pass version without call to lowercase. Test subsequent spaces. NEWS.md entry and RST docs. Add NEWS.md links.
(@helgee, no need to rebase and squash the commits in the future. GitHub does that for you when we hit the "squash and merge" button.)
@stevengj Good to know. IIRC Tony asked me to squash last time. Shall I revert the NEWS.md link changes?
GitHub added the feature relatively recently, so last time it probably was necessary. No need to revert anything though.
Travis failure seems to be an unrelated problem with cmake on OSX that has been appearing in several PRs.
Hmm, maybe we should be following the algorithm in UTR#21.
That algorithm does not handle the desired exceptions (e.g. acronyms) though.
Indeed, I'm torn. Anyway, better to follow the other rules from that algorithm unless we have reasons not to.
Yes, I agree that we shouldn't convert to lowercase, since you could always do
AFAICT the currently implemented behaviour is wrong:
julia> lowercase("OΔΥΣΣΕΥΣ")
"oδυσσευσ" # wrong
"oδυσσευς" # would be correct
i.e. the lowercase sigma at the end is the non-final form σ but should be the final form ς instead. EDIT: Python handles it correctly.
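A rough sketch of what a context-aware lowercase could look like, assuming Julia 1.x. `lowercase_final_sigma` is a hypothetical helper, and the rule used here (a sigma preceded by a letter and not followed by one becomes final) is a deliberate simplification of the Unicode Final_Sigma condition, which also accounts for case-ignorable characters.

```julia
# Hypothetical helper: per-character lowercase, then patch word-final sigmas.
function lowercase_final_sigma(s::AbstractString)
    chars = collect(lowercase(s))
    for i in eachindex(chars)
        chars[i] == 'σ' || continue
        prev_letter = i > 1 && isletter(chars[i - 1])
        next_letter = i < length(chars) && isletter(chars[i + 1])
        # Simplified rule: inside or at the start of a word keep σ,
        # at the end of a word use the final form ς.
        if prev_letter && !next_letter
            chars[i] = 'ς'
        end
    end
    return String(chars)
end

word = lowercase_final_sigma("ΣΟΦΟΣ")   # initial sigma stays σ, last becomes ς
```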
Good point; we should consider a fix to all of them, but for now the simple
I will open a new issue for the special case conventions. Is there anything left to do here?