Truncating a string that contains Chinese characters in PHP, counting each Chinese character as two bytes

<?php
header("Content-type:text/html;charset=utf-8");

/**
*Truncate a string, counting each Chinese character as two bytes and each letter as one byte
*The page encoding must be UTF-8
*/
function esub($str, $length = 0,$ext = "..."){

    if($length < 1){
        return $str;
    }

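    // Compute the string's width: (bytes + characters) / 2 gives 2 per Chinese character (3 bytes in UTF-8) and 1 per ASCII letter
    $strlen = (strlen($str) + mb_strlen($str,"UTF-8")) / 2;
    // Use <= so a string that already fits exactly is returned unchanged, without the suffix
    if($strlen <= $length){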
        return $str;
    }

    if(mb_check_encoding($str,"UTF-8")){
        $str = mb_strcut(mb_convert_encoding($str, "GBK","UTF-8"), 0, $length, "GBK");
        $str = mb_convert_encoding($str, "UTF-8", "GBK");

    }else{

        return "Unsupported document encoding";
    }
    
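    // Strip trailing punctuation. rtrim() matches single bytes, so a multi-byte character list
    // can cut a trailing UTF-8 character in half; a regex with the /u flag matches whole characters.
    $str = preg_replace('/[ ,.。\-—(【、;‘“?《<@]+\z/u', '', $str);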
    return $str.$ext;
}

$str = "L对每个人都说还好";

var_dump(esub($str,9));

Program output: string 'L对每个人...' (length=16)
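The length calculation in esub() relies on UTF-8 storing a common Chinese character in three bytes: (3 + 1) / 2 = 2 for a Chinese character and (1 + 1) / 2 = 1 for an ASCII letter. The sketch below checks that arithmetic against mbstring's own mb_strwidth(), and also tries mb_strimwidth(), a built-in function that performs a similar width-based cut. The helper name display_width is hypothetical and only for illustration. It assumes the input is valid UTF-8 made up of 1-byte and 3-byte characters.

```php
<?php
// Hypothetical helper: the same width formula used inside esub().
// Assumption: $str is valid UTF-8 containing only 1-byte (ASCII) and
// 3-byte (e.g. common Chinese) characters.
function display_width($str)
{
    return (int) ((strlen($str) + mb_strlen($str, "UTF-8")) / 2);
}

$str = "L对每个人都说还好";

// 1 ASCII letter (1 byte) + 8 Chinese characters (3 bytes each)
$bytes = strlen($str);                 // 25
$chars = mb_strlen($str, "UTF-8");     // 9
$width = display_width($str);          // (25 + 9) / 2 = 17
$mbWidth = mb_strwidth($str, "UTF-8"); // mbstring's own width calculation

var_dump($bytes, $chars, $width, $mbWidth);

// mb_strimwidth() counts the trim marker toward the width,
// so 9 columns of text plus "..." (3 columns) means passing 12.
$short = mb_strimwidth($str, 0, 12, "...", "UTF-8");
var_dump($short);
```

With the sample string, mb_strimwidth() should produce the same text as esub($str, 9). It does not strip trailing punctuation, however, so it is not a drop-in replacement for the function above.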

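This function does not take GB2312-encoded files into account, because some functions produce different output depending on the file's encoding. The original post links to an explanation of the reason.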
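As one illustration of such a difference (not necessarily the one the linked explanation covers), byte-oriented functions such as strlen() return different values for the same character depending on its encoding. This is a minimal sketch, assuming the file is saved as UTF-8 and that mbstring is available. It uses GBK as a stand-in for GB2312, since esub() itself converts to GBK.

```php
<?php
$utf8 = "对";                                        // assumed: this file is saved as UTF-8
$gbk  = mb_convert_encoding($utf8, "GBK", "UTF-8");  // same character, re-encoded as GBK

$utf8Bytes = strlen($utf8);   // 3 bytes in UTF-8
$gbkBytes  = strlen($gbk);    // 2 bytes in GBK
var_dump($utf8Bytes, $gbkBytes);

// The character count is 1 in both cases when the right encoding is named.
var_dump(mb_strlen($utf8, "UTF-8"), mb_strlen($gbk, "GBK"));

// The (bytes + characters) / 2 formula from esub() only yields 2
// for the 3-byte UTF-8 form of the character.
$utf8Width = ($utf8Bytes + mb_strlen($utf8, "UTF-8")) / 2;  // (3 + 1) / 2 = 2
$gbkWidth  = ($gbkBytes + mb_strlen($gbk, "GBK")) / 2;      // (2 + 1) / 2 = 1.5
var_dump($utf8Width, $gbkWidth);
```

So the width arithmetic in esub() would need to change before the function could accept input in another encoding.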

Original post: https://www.cnblogs.com/praglody/p/6706475.html