substring does not support UTF-16 #35798

Closed
maxlapides opened this issue Jan 29, 2019 · 12 comments
Labels: area-core-library, library-core

Comments

@maxlapides

"🍕".substring(0, 1);
> "�"

This issue came up because a user entered an emoji into a data field. Attempting to render Text("🍕".substring(0, 1)) in Flutter results in:

flutter: ══╡ EXCEPTION CAUGHT BY RENDERING LIBRARY ╞═════════════════════════════════════════════════════════
flutter: The following ArgumentError was thrown during performLayout():
flutter: Invalid argument(s): string is not well-formed UTF-16
flutter:
flutter: When the exception was thrown, this was the stack:
flutter: #0      ParagraphBuilder.addText (dart:ui/text.dart:1157:7)
flutter: #1      TextSpan.build 
package:flutter/…/painting/text_span.dart:172
flutter: #2      TextPainter.layout 
package:flutter/…/painting/text_painter.dart:352
flutter: #3      RenderParagraph._layoutText 
...
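
For context, here is a minimal sketch of why this happens, using only dart:core. Dart strings are sequences of UTF-16 code units, and 🍕 (U+1F355) needs two of them, so substring(0, 1) returns only the first half of the surrogate pair. The hexCodeUnits helper is a hypothetical name introduced for illustration.

```dart
/// Hypothetical helper: the UTF-16 code units of [s] as lowercase hex strings.
List<String> hexCodeUnits(String s) =>
    s.codeUnits.map((u) => u.toRadixString(16)).toList();

void main() {
  const pizza = '🍕'; // U+1F355, outside the Basic Multilingual Plane

  // length and substring count UTF-16 code units, not code points.
  print(pizza.length); // 2
  print(pizza.runes.length); // 1

  // The two code units form a surrogate pair.
  print(hexCodeUnits(pizza)); // [d83c, df55]

  // substring(0, 1) keeps only the high surrogate, which is not
  // well-formed UTF-16 on its own.
  final half = pizza.substring(0, 1);
  print(hexCodeUnits(half)); // [d83c]
}
```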
@maxlapides
Author

❯ flutter doctor -v
[✓] Flutter (Channel beta, v1.0.0, on Mac OS X 10.14.2 18C54, locale en-US)
    • Flutter version 1.0.0 at /Users/maxlapides/flutter
    • Framework revision 5391447fae (9 weeks ago), 2018-11-29 19:41:26 -0800
    • Engine revision 7375a0f414
    • Dart version 2.1.0 (build 2.1.0-dev.9.4 f9ebf21297)

[✓] Android toolchain - develop for Android devices (Android SDK 28.0.3)
    • Android SDK at /Users/maxlapides/Library/Android/sdk
    • Android NDK location not configured (optional; useful for native profiling support)
    • Platform android-28, build-tools 28.0.3
    • ANDROID_HOME = /Users/maxlapides/Library/Android/sdk
    • Java binary at: /Applications/Android Studio.app/Contents/jre/jdk/Contents/Home/bin/java
    • Java version OpenJDK Runtime Environment (build 1.8.0_152-release-1248-b01)
    • All Android licenses accepted.

[✓] iOS toolchain - develop for iOS devices (Xcode 10.1)
    • Xcode at /Applications/Xcode.app/Contents/Developer
    • Xcode 10.1, Build version 10B61
    • ios-deploy 1.9.4
    • CocoaPods version 1.6.0.beta.1

[✓] Android Studio (version 3.3)
    • Android Studio at /Applications/Android Studio.app/Contents
    • Flutter plugin version 31.3.3
    • Dart plugin version 182.5124
    • Java version OpenJDK Runtime Environment (build 1.8.0_152-release-1248-b01)

[✓] VS Code (version 1.30.2)
    • VS Code at /Applications/Visual Studio Code.app/Contents
    • Flutter extension version 2.22.1

[✓] Connected device (2 available)
    • SAMSUNG SM G920A • 03157df3c5558209                     • android-arm64 • Android 5.1.1 (API 22)
    • iPhone XS        • 9CFC4B23-BE03-4DC9-B1CD-5E1226F5A183 • ios           • iOS 12.1 (simulator)

• No issues found!

@kevmoo kevmoo added area-core-library SDK core library issues (core, async, ...); use area-vm or area-web for platform specific libraries. library-core labels Jan 29, 2019
@kevmoo
Member

kevmoo commented Jan 29, 2019

CC @lrhn

@maxlapides
Author

Here's my workaround for now:

String.fromCharCode(str.runes.first)

@liamappelbe
Contributor

A substring method that works on code points rather than UTF-16 code units might be inefficient. For example, taking a substring near the end of a very long string would be slow, because it would have to iterate through most of the string, counting code points.

The substring documentation doesn't make it clear that the method operates on code units. Can we update it to explain this, similarly to the good explanation given on the [] operator?
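
To illustrate the cost described here, the following is a rough sketch of a code-point-indexed substring. codePointSubstring is a hypothetical helper, not part of dart:core; it has to walk the string from the start to find each offset, whereas the built-in substring can index code units directly.

```dart
/// Hypothetical helper, not part of dart:core: a substring whose [start] and
/// [end] are code point offsets. It walks the runes from the beginning of the
/// string, so its cost grows with the position of the requested range.
String codePointSubstring(String s, int start, [int? end]) {
  final it = s.runes.iterator;
  final out = StringBuffer();
  var index = 0; // number of code points consumed so far
  while (it.moveNext()) {
    if (end != null && index >= end) break;
    if (index >= start) out.writeCharCode(it.current);
    index++;
  }
  return out.toString();
}

void main() {
  const s = 'ab🍕cd';
  // Code-unit substring: returns one code unit, half of the pizza.
  print(s.substring(2, 3).length); // 1
  // Code-point substring: returns the whole pizza.
  print(codePointSubstring(s, 2, 3)); // 🍕
  print(codePointSubstring(s, 3)); // cd
}
```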

@sgon00

sgon00 commented May 14, 2019

I am having the exact same problem. In my case, I want to programmatically delete (backspace) text in an input that may contain emojis. So far, my workaround is as follows:

var s = "abc😀";
var sRunes = s.runes;
print(String.fromCharCodes(sRunes, 0, sRunes.length-1));

Note that this requires making sure users do not enter emojis that have length 4 (in UTF-16 code units), since those consist of more than one code point.
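
A small sketch of why that restriction is needed, assuming a hypothetical backspaceByRune helper that wraps the workaround above: some emojis are made of several code points, so removing one rune leaves part of the emoji behind.

```dart
/// Hypothetical helper wrapping the workaround above: drops the last rune.
String backspaceByRune(String s) {
  final runes = s.runes.toList();
  if (runes.isEmpty) return s;
  return String.fromCharCodes(runes, 0, runes.length - 1);
}

void main() {
  // One code point (two UTF-16 code units): removed cleanly.
  print(backspaceByRune('abc😀')); // abc

  // A flag is two regional-indicator code points (four code units).
  const flag = '\u{1F1FA}\u{1F1F8}'; // 🇺🇸
  print(flag.length); // 4

  // Only the second indicator is removed; the first one is left behind.
  final rest = backspaceByRune('abc$flag');
  print(rest == 'abc\u{1F1FA}'); // true
}
```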

@sgon00

sgon00 commented May 14, 2019

FYI, here is another workaround of mine, which is currently broken: https://stackoverflow.com/a/56135774/348719

@elMuso

elMuso commented Jun 29, 2019

I have found another workaround. It is more resource-expensive, but so far it works with all emojis.

In my particular case I wanted to take a substring (e.g. from index 0 to 16) and count each emoji as a single character, but I was getting only half of the text because of the emojis in it.

My workaround is this:

Create a function called runeSubstring() like this one:

String runeSubstring({required String input, int start = 0, int? end}) {
  String finalString = ''; // initialize the result
  List<int> individualRunes = input.runes.toList(); // convert the string to a list of runes
  for (final rune in individualRunes.sublist(start, end)) { // "substring" the list
    finalString += String.fromCharCode(rune); // convert each rune back to a string and append it
  }
  return finalString; // return the substring
}

and then use it like this:

// Not a raw string, so that the \u escapes are interpreted as the emoji.
String example = 'Example \ud83d\ude13  Example \ud83d\ude13';
String result = runeSubstring(input: example, start: 0, end: 10);
String resultEnd = runeSubstring(input: example, start: 11); // from 11 to the end

If the text is really big, you should call input.runes.toList() outside the function, since otherwise you pay the cost of converting the text to runes and then to a list every time the function is called.

@rakudrama
Member

See #28404 and dart-lang/language#34, long discussions about making correct String manipulation easier. We will probably close this issue as being one example of the bigger issue.

@LiteCatDev String.fromCharCodes takes an Iterable of rune values, so you can simplify your code to:

String runeSubstring({required String input, int start = 0, int? end}) {
  return String.fromCharCodes(input.runes.toList().sublist(start, end));
}

@truongsinh

@mit-mit
Member

mit-mit commented Nov 14, 2019

dart-lang/language#685 is our current attempt at supporting this via a package; closing the present issue in favor of that

@kristijorgji

I need to solve a similar issue and have posted the problem here:

https://stackoverflow.com/questions/68518125/flutter-dart-how-to-trim-if-special-characters-are-present

Please share your input, thanks.

@mit-mit
Member

mit-mit commented Jul 26, 2021

Use package:characters: https://pub.dev/documentation/characters/latest/
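
A brief sketch of what that can look like, assuming the characters package has been added as a dependency in pubspec.yaml. The function names firstCharacter, characterSubstring and backspace are hypothetical wrappers introduced for illustration; the calls they wrap operate on grapheme clusters (user-perceived characters) rather than code units or code points.

```dart
import 'package:characters/characters.dart';

/// Hypothetical wrapper: the first user-perceived character of [s].
String firstCharacter(String s) => s.characters.first;

/// Hypothetical wrapper: a substring indexed by user-perceived characters.
String characterSubstring(String s, int start, [int? end]) =>
    s.characters.getRange(start, end).toString();

/// Hypothetical wrapper: removes the last user-perceived character.
String backspace(String s) => s.characters.skipLast(1).toString();

void main() {
  print(firstCharacter('🍕 party')); // 🍕
  print(characterSubstring('ab🍕cd', 2, 3)); // 🍕

  // A flag made of two code points is removed as a single character.
  const flag = '\u{1F1FA}\u{1F1F8}'; // 🇺🇸
  print(backspace('abc$flag')); // abc
}
```

Note that firstCharacter throws a StateError on an empty string, as Iterable.first does.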
