Checking whether a string is UTF-8 encoded in PHP

  Author: bea

 // Returns 1 if $string is well-formed UTF-8, 0 if it is not,
 // and false if preg_match() itself fails.
 function is_utf8($string) {
     return preg_match('%^(?:
           [\x09\x0A\x0D\x20-\x7E]            # ASCII
         | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
         |  \xE0[\xA0-\xBF][\x80-\xBF]        # excluding overlongs
         | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
         |  \xED[\x80-\x9F][\x80-\xBF]        # excluding surrogates
         |  \xF0[\x90-\xBF][\x80-\xBF]{2}     # planes 1-3
         | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
         |  \xF4[\x80-\x8F][\x80-\xBF]{2}     # plane 16
     )*$%xs', $string);
 }
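Its accuracy is essentially the same as mb_detect_encoding: on a given input the two are either both right or both wrong.
Encoding detection can never be 100% accurate, but this function is good enough for most practical needs.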
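
To make the comparison concrete, here is a minimal sketch that runs the regex check next to mb_detect_encoding in strict mode on a few byte strings. The sample inputs and labels are made up for illustration and are not from the original post. The mbstring call is guarded because the extension may not be installed. The last sample shows why detection cannot be perfect: the bytes C3 A9 are well-formed UTF-8 ("é"), but the same two bytes are also plausible Latin-1 text.

```php
<?php
// Sketch only: the regex is the one from the article; the samples are hypothetical.
function is_utf8($string) {
    return preg_match('%^(?:
          [\x09\x0A\x0D\x20-\x7E]            # ASCII
        | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
        |  \xE0[\xA0-\xBF][\x80-\xBF]        # excluding overlongs
        | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
        |  \xED[\x80-\x9F][\x80-\xBF]        # excluding surrogates
        |  \xF0[\x90-\xBF][\x80-\xBF]{2}     # planes 1-3
        | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
        |  \xF4[\x80-\x8F][\x80-\xBF]{2}     # plane 16
    )*$%xs', $string);
}

// Byte strings written with hex escapes so the file encoding does not matter.
$samples = [
    'plain ASCII'     => "hello",
    'UTF-8 Chinese'   => "\xE4\xB8\xAD\xE6\x96\x87",
    'GBK Chinese'     => "\xD6\xD0\xCE\xC4",
    'overlong slash'  => "\xC0\xAF",
    'ambiguous C3 A9' => "\xC3\xA9",
];

foreach ($samples as $label => $bytes) {
    // Strict mode makes mb_detect_encoding return false for invalid input.
    $mb = function_exists('mb_detect_encoding')
        ? var_export(mb_detect_encoding($bytes, 'UTF-8', true) !== false, true)
        : 'n/a (mbstring not loaded)';
    printf("%-16s regex=%d  mb_detect_encoding=%s\n", $label, is_utf8($bytes), $mb);
}
```

Note that the ASCII class in the regex only admits tab, LF, CR, and the printable range, so a string containing other control bytes (for example "\x1B") is rejected by is_utf8 even though those bytes are legal in UTF-8; the two checks may disagree on such input.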

