theory/encode-zapcp1252

Encode/ZapCP1252 version 0.40

Have you ever processed a Web form submission, assuming that the incoming text was encoded in ISO-8859-1 (Latin-1), only to end up with a bunch of junk because someone pasted in content from Microsoft Word? This happens because Microsoft uses a superset of the Latin-1 encoding called "Windows Western" or "CP1252", which assigns printable characters to the bytes 0x80 through 0x9F that Latin-1 reserves for control codes. So most things will come out right, but a few (curly quotes, em dashes, ellipses, and the like) will not. The differences are well known; Wikipedia has a nice chart documenting them.

Of course, that won't really help you. So this library's module, Encode::ZapCP1252, provides subroutines for removing Windows Western Gremlins from strings: zap_cp1252 turns them into ASCII approximations, and fix_cp1252 turns them into their appropriate UTF-8 equivalents:

use Encode::ZapCP1252;

my $clean_latin1 = zap_cp1252 $latin1_text;  # gremlins become ASCII approximations
my $fixed_utf8   = fix_cp1252 $utf8_text;    # gremlins become UTF-8 equivalents
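To see where the gremlins come from, it can help to decode the same bytes both ways. The following is a minimal sketch that uses only the core Encode module, not Encode::ZapCP1252 itself; the sample string and the variable names are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical sample: bytes as a Word paste might deliver them.
# 0x93/0x94 are curly double quotes, 0x96 an en dash, 0x85 an ellipsis in CP1252.
my $bytes = "\x93Hello\x94 \x96 done\x85";

# Decode the same bytes under each assumption about the encoding.
my $as_latin1 = decode('iso-8859-1', $bytes);
my $as_cp1252 = decode('cp1252',     $bytes);

# Both encodings are single-byte, so positions line up one to one.
for my $i (0 .. length($bytes) - 1) {
    my $latin1_cp = ord substr $as_latin1, $i, 1;
    my $cp1252_cp = ord substr $as_cp1252, $i, 1;
    next if $latin1_cp == $cp1252_cp;    # plain ASCII decodes identically

    printf "byte 0x%02X: Latin-1 gives U+%04X, CP1252 gives U+%04X\n",
        ord(substr $bytes, $i, 1), $latin1_cp, $cp1252_cp;
}
```

Only the four bytes in the 0x80 to 0x9F range are reported. Read as Latin-1 they decode to invisible C1 control characters, while read as CP1252 they decode to the punctuation the author actually typed.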

Installation

To install this module, type the following:

perl Build.PL
./Build
./Build test
./Build install

Or, if you don't have Module::Build installed, type the following:

perl Makefile.PL
make
make test
make install

Copyright and Licence

Copyright (c) 2005-2020 David E. Wheeler. Some Rights Reserved.

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
