problems with windows-1251 codepage in .bas (cyrrilic variables names) #246

Closed
tst32 opened this issue Jul 1, 2021 · 4 comments

tst32 commented Jul 1, 2021

It seems that when I exported the sources, something weird happened with the module .bas code.
image
I opened the code in Notepad++ as well as in the VBA debugger to show the problem. This problem does not happen with forms or anything else, only with .bas modules.
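
For readers who cannot see the screenshot, here is a minimal sketch of the kind of garbling that appears when Windows-1251 bytes are read with a different single-byte code page. It is written in Python purely for illustration. The identifier `Сумма` is a made-up example, and Windows-1252 is only an assumed stand-in for whatever encoding was wrongly applied in this case.

```python
# Hypothetical Cyrillic identifier; not taken from the reporter's project.
name = "Сумма"

# A .bas file saved under the Windows-1251 code page stores one byte per letter.
raw = name.encode("cp1251")

# Reading those bytes with the wrong single-byte code page (assumed here to be
# Windows-1252) yields unrelated Latin characters instead of Cyrillic.
garbled = raw.decode("cp1252")

# Reading them with the matching code page restores the identifier.
restored = raw.decode("cp1251")

print(garbled)
print(restored)
```

The bytes on disk are the same in both cases; only the code page used to interpret them differs.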

@tst32 tst32 changed the title problems with windows-1251 codepage in .bas problems with windows-1251 codepage in .bas (cyrrilic variables names) Jul 1, 2021
@joyfullservice
Owner

@tst32 - Thank you for including the screenshot. It helps to see the additional details included there.

In the source code of the add-in, you can take a look at the modEncoding module, and especially the last function GetSystemEncoding(). This is where we attempt to map the local code page number to the corresponding string used when converting between the local encoding and UTF-8.

'---------------------------------------------------------------------------------------
' Procedure : GetSystemEncoding
' Author    : Adam Waller
' Date      : 3/8/2021
' Purpose   : Return the current encoding type used for non-UTF-8 text files.
'           : (Such as VBA code modules.)
'           : https://docs.microsoft.com/en-us/windows/win32/intl/code-page-identifiers
'           : https://documentation.help/MS-Office-VB/ofhowConstants.htm
'           : * Note that using utf-8 as a default system encoding may not work
'           : correctly with some extended characters in VBA code modules. The VBA IDE
'           : does not support Unicode characters, and requires code pages to display
'           : extended/non-English characters. See Issues #60, #186, #180
'---------------------------------------------------------------------------------------
'
Public Function GetSystemEncoding() As String
    
    Static lngEncoding As Long
    
    ' Call API to determine active code page, caching return value.
    If lngEncoding = 0 Then lngEncoding = GetACP
    Select Case lngEncoding
    
        ' Language encoding mappings can be defined here
        Case msoEncodingISO88591Latin1:     GetSystemEncoding = "iso-8859-1"
        Case msoEncodingWestern:            GetSystemEncoding = "windows-1252"
        
        ' *In Windows 10, this is a checkbox in Region settings for
        ' "Beta: Use Unicode UTF-8 for worldwide language support"
        Case msoEncodingUTF8:               GetSystemEncoding = "utf-8"
        
        ' Any other language encoding not defined above
        Case Else
            ' Attempt to autodetect the language based on the content.
            ' (Note that this does not work as well on code as it does
            '  with normal written language. See issue #186)
            GetSystemEncoding = "_autodetect_all"
    End Select
    
End Function

I have not seen a source online that maps the constants to their string representations, but after looking at this more today, I was able to combine a couple of sources to come up with what I believe is a comprehensive list.
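
As a rough illustration of the idea (not the list from the add-in), the mapping can be pictured as a lookup from the numeric code page to its encoding name, with the autodetect string as the fallback. The sketch below is in Python for brevity. The names `CODE_PAGE_NAMES` and `encoding_name` are hypothetical, and the table is a small subset of well-known Windows code page identifiers.

```python
# Illustrative subset only; the add-in's actual list lives in
# GetSystemEncoding() and is not reproduced here.
CODE_PAGE_NAMES = {
    1250: "windows-1250",   # Central European
    1251: "windows-1251",   # Cyrillic
    1252: "windows-1252",   # Western European
    28591: "iso-8859-1",    # Latin-1
    65001: "utf-8",         # Unicode (UTF-8)
}

# Fallback string taken from the Case Else branch of the function above.
AUTODETECT = "_autodetect_all"


def encoding_name(code_page: int) -> str:
    """Return the encoding name for a code page, or the autodetect marker."""
    return CODE_PAGE_NAMES.get(code_page, AUTODETECT)


print(encoding_name(1251))   # explicit mapping, no autodetection needed
print(encoding_name(99999))  # unknown code page falls back to autodetect
```

With an explicit entry for 1251, a Cyrillic system no longer depends on content-based autodetection, which the comments in the function note works poorly on code.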

joyfullservice added a commit that referenced this issue Jul 1, 2021
This should resolve issues like what is described in #246 where the autodetect fails to accurately detect the language.
@joyfullservice
Owner

I have pushed an update to the dev branch which should resolve this issue, but please be aware that there are a couple of other outstanding issues that need to be resolved before we can roll out version 3.4.x for general release. In the meantime, you can patch your version using the following steps:

  1. Open the add-in file
  2. Replace the GetSystemEncoding() function in modEncoding with the new version from e30ef71
  3. Save and close the add-in
  4. Open the (modified) add-in again, and click to install

This will install this update on your system. Let me know if this resolves the problem for you!

@joyfullservice joyfullservice added this to the Release 3.4.0 milestone Jul 1, 2021
@tst32
Author

tst32 commented Jul 5, 2021

Thanks! I followed your recommendations (edited modEncoding.bas), and it works like a charm!
image

@tst32 tst32 closed this as completed Jul 5, 2021
@joyfullservice
Owner

That's awesome! Thanks for posting to confirm that the update solved the problem. This should be helpful for anyone else that is using a non-English codepage with VBA.

joyfullservice added a commit that referenced this issue Jan 26, 2023
This change ensures that VBA code modules are imported using an 8-bit code page, even if the system encoding is using UTF-8 by default. We are currently hard coding Windows-1252 which supports English, French, German, and most other Western European languages, and is the most widely used codepage. This should allow people to use the Windows 10 option for Beta Unicode support. See #180, #246, #377
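
The rule described in that commit message can be sketched as follows. This is a Python illustration of the stated behavior only, not the add-in's VBA code. The helper name `module_encoding` and its parameters are hypothetical.

```python
UTF8_CODE_PAGE = 65001          # Windows code page identifier for UTF-8
FALLBACK_8BIT = "windows-1252"  # code page the commit message says is hard-coded


def module_encoding(system_code_page: int, system_encoding: str) -> str:
    """Pick the encoding for VBA code modules (hypothetical helper).

    Code modules need an 8-bit code page, so a UTF-8 system default is
    replaced by the fallback; any other system encoding is kept as-is.
    """
    if system_code_page == UTF8_CODE_PAGE:
        return FALLBACK_8BIT
    return system_encoding


print(module_encoding(65001, "utf-8"))        # UTF-8 system: use the fallback
print(module_encoding(1251, "windows-1251"))  # Cyrillic system: unchanged
```

Under this sketch, a system on code page 1251 keeps its Cyrillic mapping, and only systems with the UTF-8 option enabled are redirected to Windows-1252.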