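1. The key point first:
Different encodings use a different number of bytes per character, and even within UTF-8 the size of a Chinese character is not fixed: it takes 3 bytes for characters in the Basic Multilingual Plane and 4 bytes for characters in the supplementary planes (for example CJK Extension B). The 2-byte UTF-8 range, U+0080 to U+07FF, contains no Chinese characters.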
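As a minimal sketch of the point above, the following standalone program measures UTF-8 lengths directly. The class name `Utf8LengthSketch` and the helper `utf8Length` are illustrative names, not part of the original test. It assumes the supplementary character is built from its code point with `Character.toChars`, because a string literal such as `"0x20001"` is only seven ASCII characters.

```java
import java.nio.charset.StandardCharsets;

class Utf8LengthSketch {
    // Returns the number of bytes the given text occupies when encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // U+540D, written as an escape so the result does not depend on the source file encoding.
        String bmp = "\u540D";
        // U+20001 lies outside the BMP, so it is built from its code point (a surrogate pair in Java).
        String supplementary = new String(Character.toChars(0x20001));

        System.out.println("U+540D  UTF-8 bytes: " + utf8Length(bmp));           // 3
        System.out.println("U+20001 UTF-8 bytes: " + utf8Length(supplementary)); // 4
        // The literal text "0x20001" is plain ASCII, one byte per character.
        System.out.println("\"0x20001\" UTF-8 bytes: " + utf8Length("0x20001")); // 7
    }
}
```

Using `StandardCharsets.UTF_8` instead of the charset name `"UTF-8"` avoids the checked `UnsupportedEncodingException`; that is a design choice of this sketch, not something the original code does.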
2. The source code:
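@Test
public void test1() throws UnsupportedEncodingException {
    // U+540D, a character in the Basic Multilingual Plane
    String a = "名";
    System.out.println("UTF-8 encoded length: " + a.getBytes("UTF-8").length);
    System.out.println("GBK encoded length: " + a.getBytes("GBK").length);
    System.out.println("GB2312 encoded length: " + a.getBytes("GB2312").length);
    System.out.println("==========================================");
    // Build the character U+20001 from its code point; the literal "0x20001"
    // would only be seven ASCII characters.
    String c = new String(Character.toChars(0x20001));
    System.out.println("UTF-8 encoded length: " + c.getBytes("UTF-8").length);
    System.out.println("GBK encoded length: " + c.getBy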