Rename occursin to something more reasonable #30368

Closed
musm opened this issue Dec 12, 2018 · 7 comments
Labels
search & find The find* family of functions strings "Strings!"

Comments

@musm
Contributor

musm commented Dec 12, 2018

Every time I encounter this function I visually parse it as occur_sin, and I personally find it ugly.
Judging from Slack, a lot of people seem to share this sentiment. It might be good to rename this function.

I searched for other programming languages that use the name occursin. I found some usages in Java where it was spelled occursIn, which would translate more closely to occurs_in in Julia (however, I'm not a fan of the underscore, and it is something that Base Julia generally tries to avoid).

Stefan suggested the following in the thread:

matches would still be reasonable, or even search

and later on suggested ismatch.

@JeffBezanson JeffBezanson added the strings "Strings!" label Dec 12, 2018
@nalimilan nalimilan added the search & find The find* family of functions label Dec 14, 2018
@mbauman
Member

mbauman commented Dec 14, 2018

History:

In short: English doesn't have a good short word for subsequence matching. We've tried many different ways of naming this operation.

@StefanKarpinski
Member

Thought: Perl and Ruby use ~= for matching. Not sure this is worth that bit of ASCII real estate, but I thought I'd mention it.

@ararslan
Member

That looks like it should be an updating ~, i.e. x ~= y would be x = x ~ y.

@c42f
Member

c42f commented Feb 5, 2019

I kind of just want to write in for this, even though I know it's inconsistent. What about a wrapper for sequences which reinterprets them as a matchable set of subsequences, allowing us to consistently use in?

Simplistic example:

struct SubSeqs # All sub sequences of a string
    str::String
end
Base.in(pat, ss::SubSeqs) = occursin(pat, ss.str)

Thence

julia> "foo" in SubSeqs("asdf foo x")
true

Unfortunately, it's not a completely clean decomposition of the problem once you come to matching regexes or other pattern types, as these patterns have their own notion of whether the pattern must match the whole string or merely a subsequence. I'm not sure yet whether that's a fatal problem.

@cormullion
Contributor

at least within is a real word... 😂

@KristofferC
Member

We put back contains.

@o314
Contributor

o314 commented Jul 15, 2020

ismatch may have been a better choice, since it could provide a unary predicate for the filter and find APIs, e.g.:

ismatch(r::Regex) = (s) -> match(r, string(s)) !== nothing # could go in Base

using Test
@test filter(ismatch(r"[c-e]"), 'a':'z') == ['c','d','e']

This does not work as well with contains.

EDIT:

julia> VERSION
v"1.3.1"

julia> methods(occursin)
# 6 methods for generic function "occursin":
[1] occursin(delim::UInt8, buf::Base.GenericIOBuffer{Array{UInt8,1}}) in Base at iobuffer.jl:469
[2] occursin(delim::UInt8, buf::Base.GenericIOBuffer) in Base at iobuffer.jl:475
[3] occursin(needle::Union{AbstractChar, AbstractString}, haystack::AbstractString) in Base at strings/search.jl:530
[4] occursin(r::Regex, s::SubString; offset) in Base at regex.jl:177
[5] occursin(r::Regex, s::AbstractString; offset) in Base at regex.jl:172
[6] occursin(pattern::Tuple, r::Test.LogRecord) in Test at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.3\Test\src\logging.jl:211

BTW, occursin seems to target only strings and regexes. ismatch may be a more generic and coherent fit.
