unicode - Check valid string encoding with PHP's intl (ICU) features -
Using the features available in PHP's intl extension (a wrapper around ICU), how do I go about checking the validity of a string's encoding (e.g. checking that it is valid UTF-8)?
I know this can be done with mbstring, iconv(), and PCRE, but I'm specifically interested in intl for this question.
UConverter can be used since PHP 5.5. A manual page doesn't exist; see https://wiki.php.net/rfc/uconverter for the API.
    // Replace each invalid byte sequence with U+FFFD (static API).
    function replace_invalid_byte_sequence($str)
    {
        return UConverter::transcode($str, 'UTF-8', 'UTF-8');
    }

    // Same as above, using a converter instance.
    function replace_invalid_byte_sequence2($str)
    {
        return (new UConverter('UTF-8', 'UTF-8'))->convert($str);
    }

    // A string is valid UTF-8 if the UTF-8 to UTF-8 round trip leaves it unchanged.
    function utf8_check_encoding($str)
    {
        return $str === UConverter::transcode($str, 'UTF-8', 'UTF-8');
    }

    function utf8_check_encoding2($str)
    {
        return $str === (new UConverter('UTF-8', 'UTF-8'))->convert($str);
    }

    // Table 3-8. Use of U+FFFD in UTF-8 Conversion
    // (http://www.unicode.org/versions/unicode6.1.0/ch03.pdf)
    $str = "\x61"."\xf1\x80\x80"."\xe1\x80"."\xc2"."\x62"."\x80"."\x63"
          ."\x80"."\xbf"."\x64";
    $expected = 'a���b�c��d';

    // Every element is expected to be true.
    var_dump([
        $expected === replace_invalid_byte_sequence($str),
        $expected === replace_invalid_byte_sequence2($str)
    ], [
        false === utf8_check_encoding($str),
        false === utf8_check_encoding2($str)
    ]);
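The test above only feeds the check an invalid string. As a rough sketch of how the same round-trip comparison might be used on both well-formed and malformed input, assuming the intl extension is loaded (the helper name is_valid_utf8_intl is made up for this illustration and is not part of intl):

```php
<?php
// Hypothetical helper (not part of intl): a string counts as valid UTF-8
// when a UTF-8 to UTF-8 round trip through ICU returns identical bytes.
// Assumes PHP 5.5+ with the intl extension loaded.
function is_valid_utf8_intl($str)
{
    return $str === UConverter::transcode($str, 'UTF-8', 'UTF-8');
}

// Well-formed: "é" encoded as the two bytes C3 A9.
var_dump(is_valid_utf8_intl("caf\xc3\xa9"));  // bool(true)

// Malformed: a lone ISO-8859-1 byte E9 in the middle of the string.
var_dump(is_valid_utf8_intl("caf\xe9s"));     // bool(false)

// Malformed: overlong encoding of "/" (C0 AF).
var_dump(is_valid_utf8_intl("\xc0\xaf"));     // bool(false)
```

The comparison depends on the converter substituting U+FFFD for invalid sequences, so any malformed input comes back with different bytes than it went in with.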