Add common helper methods to the String class (Contains(StringComparison), Replace(StringComparison), Left, Truncate, TrimPrefix) #14831

Open
GSPP opened this issue Jul 12, 2015 · 11 comments
Labels
api-needs-work API needs work before it is approved, it is NOT ready for implementation area-System.Runtime

@GSPP

GSPP commented Jul 12, 2015

The String class is missing some helper methods for string manipulation that are commonly needed. Of course developers can add their own extension methods, but that should not be required. Here I propose a list of methods that I think are easily understood and commonly required. The list is ordered starting with what I consider the least controversial.

  1. bool Contains(this string str, string value, StringComparison comparisonType). We have no Contains version that takes a StringComparison. The workaround with IndexOf is awkward and feels like magic.
  2. Replace(..., StringComparison). Right now, Replace always uses Ordinal.
  3. string Left(this string str, int count) as well as Right. Left is equivalent to str.Substring(0, count), whereas Right is str.Substring(str.Length - count, count).
  4. Truncate(int maxLength). This is not equivalent to Substring(0, maxLength) because Substring throws if the string is shorter than maxLength.
  5. bool IsValidIndex(this string str, int index) tests whether the argument can be used to obtain a character from the indexer. That sometimes comes in handy. Also bool IsValidRange(this string str, int index, int count). Both can be useful for Debug.Assert assertions.
  6. string TrimPrefix(this string str, string prefix) as well as TrimPostfix. It turns out that when working with externally generated strings (ETL processes) it is very common to need to remove a prefix. For example, to get a JIRA issue number as an int you do int.Parse("PROJ-2317".TrimPrefix("PROJ-")). Replace(..., "PROJ-", "") is not equivalent and semantically wrong. The TrimStart method cannot be used here because it trims a set of characters, not a prefix string. Further helper methods would be EnsurePrefix/Postfix.
  7. Some helpers to work on strings that might be null or empty. This comes up all the time.
        public static string EmptyToNull(this string str)
        {
            return string.IsNullOrEmpty(str) ? null : str;
        }
        public static string NullToEmpty(this string str)
        {
            return str ?? string.Empty;
        }
        public static bool IsNullOrEmpty(this string str)
        {
            return string.IsNullOrEmpty(str);
        }
        public static bool IsNullOrWhitespace(this string str)
        {
            return string.IsNullOrWhiteSpace(str);
        }
  8. SplitLines(). Returns the string split into lines. Need to define how to deal with OS-specific line endings. Probably, this should just be str.Replace("\r", "").Split('\n') which seems to work for everything.
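To make items 1, 3, 4 and 6 concrete, here is a minimal sketch of how they might look as extension methods. The class name StringHelperSketch is made up for illustration, and using ordinal comparison in TrimPrefix is an assumption (the proposal does not fix the comparison rules). On runtimes that already ship an instance Contains(string, StringComparison), the instance method takes precedence over the extension.

```csharp
using System;

// Hypothetical helper class; the name and the exact semantics are illustrative only.
public static class StringHelperSketch
{
    // Item 1: wraps the IndexOf-based workaround.
    public static bool Contains(this string str, string value, StringComparison comparisonType)
    {
        if (str == null) throw new ArgumentNullException(nameof(str));
        return str.IndexOf(value, comparisonType) >= 0;
    }

    // Item 3: first `count` characters; throws if count is out of range, like Substring.
    public static string Left(this string str, int count)
    {
        if (str == null) throw new ArgumentNullException(nameof(str));
        return str.Substring(0, count);
    }

    // Item 3: last `count` characters; throws if count is out of range, like Substring.
    public static string Right(this string str, int count)
    {
        if (str == null) throw new ArgumentNullException(nameof(str));
        return str.Substring(str.Length - count, count);
    }

    // Item 4: unlike Substring, does not throw when the string is shorter than maxLength.
    public static string Truncate(this string str, int maxLength)
    {
        if (str == null) throw new ArgumentNullException(nameof(str));
        if (maxLength < 0) throw new ArgumentOutOfRangeException(nameof(maxLength));
        return str.Length <= maxLength ? str : str.Substring(0, maxLength);
    }

    // Item 6: removes the prefix once if present (ordinal comparison is an assumption).
    public static string TrimPrefix(this string str, string prefix)
    {
        if (str == null) throw new ArgumentNullException(nameof(str));
        if (prefix == null) throw new ArgumentNullException(nameof(prefix));
        return str.StartsWith(prefix, StringComparison.Ordinal)
            ? str.Substring(prefix.Length)
            : str;
    }
}
```

With these, int.Parse("PROJ-2317".TrimPrefix("PROJ-")) yields 2317, and "PROJ-PROJ-1".TrimPrefix("PROJ-") yields "PROJ-1", whereas Replace would strip both occurrences.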

I can see some issues with some of these proposals. They are not perfect but I want to get the discussion going. I feel the BCL has a blind spot here. The list certainly starts out with a few no-brainers.

@JonHanna
Contributor

Most of these have very well-known idiomatic solutions that people generally learn very early on in programming. This isn't to say they wouldn't be useful, but it does reduce how useful they would be.

1 Is definitely an example of this.

2 I think would certainly be useful if done well (catching the trickier edge-cases around e.g. "A pint of WEISSBIER.".Replace("weißbier", "ale") == "A pint of ale."). It would be worse than useless otherwise.

3, 4, 5 are again common idioms.

6 is a common idiom with regular expressions.

7 has some semantic nicety IMO but little else. I'd note that many people don't like accepting null as the first parameter of extension methods. (I personally don't like not being able to accept null as the this of a non-virtual method and certainly don't see anything wrong with null on an extension method, but the opinion definitely exists).

8 Text files that use U+000D as a line-ending aren't unheard of, though much rarer since Mac OS 9 stopped being current. Text files using U+2028 and U+2029 to distinguish lines and paragraphs will likely become more common. If you were to do something like this, you should probably break on all mandatory-break sequences in UAX 14, which would be U+000A, U+000B, U+000C, U+000D when followed by anything other than U+000A, U+0085, U+2028 and U+2029.
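As a rough sketch of what breaking on that set of characters could look like (LineSplitSketch is a made-up class name, and treating exactly these code points as terminators simply follows the list above rather than any shipped API):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; splits on the mandatory-break characters listed above.
public static class LineSplitSketch
{
    public static string[] SplitLines(this string str)
    {
        if (str == null) throw new ArgumentNullException(nameof(str));

        var lines = new List<string>();
        int start = 0;
        for (int i = 0; i < str.Length; i++)
        {
            char c = str[i];
            bool isBreak = c == '\n' || c == '\v' || c == '\f' || c == '\r'
                        || c == '\u0085' || c == '\u2028' || c == '\u2029';
            if (!isBreak) continue;

            lines.Add(str.Substring(start, i - start));

            // Treat CR LF as a single line break.
            if (c == '\r' && i + 1 < str.Length && str[i + 1] == '\n') i++;
            start = i + 1;
        }

        // Text after the last break; this is an empty string if the input ends
        // with a terminator, which matches what str.Split('\n') would return.
        lines.Add(str.Substring(start));
        return lines.ToArray();
    }
}
```

Whether a trailing terminator should produce a final empty line is a design choice the proposal would have to settle; this sketch keeps it.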

@GSPP
Author

GSPP commented Jul 12, 2015

Regarding idioms: The idiom should not be str.Length <= maxLength ? str : str.Substring(0, maxLength). The idiom should be str.Truncate(maxLength). I see your point, though.

Note that in this expression I have referenced str and maxLength twice. This does not work if one of them is side-effecting, expensive or syntactically long. For example, str could be CallWebService(). Now you need to introduce a local variable which you might not have wanted. Turning this computation into a method call evaluates each of those expressions exactly once and passes the results by value. This provides a lot of convenience.

(2) This is true, I'm impressed :) The problem is that the match length is not necessarily stringToFind.Length! If I remember correctly I have failed to find a way to do this with the existing .NET Framework APIs.

(6) should not require regex, for reasons of performance and simplicity. The string to search for must be escaped. I'm also not sure whether this will behave correctly in all cases (for example, what about CurrentCultureIgnoreCase semantics or Unicode surrogate pairs?). This should just work.

(7) Agreed. I only propose taking this as null for this extremely high frequency use case.

(8) My best idea would be to define that this method behaves identically to StreamReader (whatever that is).

Good feedback, thanks.

@svick
Contributor

svick commented Jul 12, 2015

FYI, 6 has already been proposed: #14386.

@juliusfriedman
Contributor

@karelz
Member

karelz commented Nov 18, 2016

We need a formal API proposal. Please also include usage info with a justification of why it is useful to add (it should be widely needed/used).

BTW: There is overlap with #14504.

@GSPP
Author

GSPP commented Nov 18, 2016

What more details are needed?

@karelz
Member

karelz commented Nov 18, 2016

Formal API proposal: see example, or general API review process

@GrabYourPitchforks
Member

This issue hasn't been touched in 3+ years. Some of the shipping APIs (String.Contains, String.Replace) have now been updated to take StringComparison parameters.

Is there still interest in addressing the remainder of the proposal?

@am11
Member

am11 commented Jan 21, 2020

I have personally written a Truncate extension method in multiple .NET projects with this signature:

// inspired by Ruby https://apidock.com/rails/String/truncate
public static string Truncate(this string subject, int length, string omission = "...")
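For illustration only, one possible body for that signature could look like the sketch below. It assumes the Rails-style convention that the omission counts toward length; the comment above only gives the signature, so the class name TruncateSketch and everything inside the method are guesses rather than the actual implementation.

```csharp
using System;

// Hypothetical body for the signature above; the semantics are an assumption.
public static class TruncateSketch
{
    public static string Truncate(this string subject, int length, string omission = "...")
    {
        if (subject == null) throw new ArgumentNullException(nameof(subject));
        if (omission == null) throw new ArgumentNullException(nameof(omission));
        if (length < 0) throw new ArgumentOutOfRangeException(nameof(length));

        // Short enough already: return unchanged, without the omission marker.
        if (subject.Length <= length) return subject;

        // Reserve room for the omission so the result is at most `length` characters,
        // unless the omission alone is longer than `length`.
        int keep = Math.Max(0, length - omission.Length);
        return subject.Substring(0, keep) + omission;
    }
}
```

Under these assumptions, "Once upon a time in a world far far away".Truncate(27) gives "Once upon a time in a wo...". Edge cases such as cutting through a surrogate pair are not handled here.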

@GrabYourPitchforks
Member

If there's interest in a Truncate method I'd recommend opening a separate issue for that in the runtime repo. There's probably enough nuance in that one method alone that it's deserving of its own issue.

@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 5.0 milestone Jan 31, 2020
@GSPP
Author

GSPP commented Feb 3, 2020

I have created an issue for Truncate as requested: #31655

I would suggest that the team go over the list in this issue and decide which items are worth pursuing further. Without this decision, this issue might keep lingering with no action taken.

I'm willing to open issues for those that are deemed worthy in the style of the one that I just opened. I'm unlikely to make the code contribution but I can cleanly write up the APIs.

@maryamariyan maryamariyan added the untriaged New issue has not been triaged by the area owner label Feb 23, 2020
@joperezr joperezr removed the untriaged New issue has not been triaged by the area owner label Jul 6, 2020
@ericstj ericstj modified the milestones: 5.0.0, Future Aug 5, 2020
@GSPP GSPP changed the title Add common helper methods to the String class Add common helper methods to the String class (Contains(StringComparison), Replace(StringComparison), Left, Truncate, TrimPrefix) Nov 6, 2020