BytesWritable length problem (extra trailing space)

While using BytesWritable to merge small files, I found that the length did not match the original content: some extra spaces showed up.

Test code

@Test
public void test() {
    String str = "aaa";

    BytesWritable v = new BytesWritable();
    v.set(str.getBytes(), 0, str.getBytes().length);

    System.out.println("*" + new String(v.getBytes()) + "*");
}

Result: the output contains one extra character after aaa that looks like a space.

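Looking at the BytesWritable source shows that the backing array is resized when data is copied in, and that the real content length is stored in the size field. getBytes() returns the whole backing array, so the unused tail (zero bytes) ends up in the string.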

public void set(byte[] newData, int offset, int length) {
    setSize(0);
    setSize(length);
    System.arraycopy(newData, offset, bytes, 0, size);
}
public void setSize(int size) {
    if (size > getCapacity()) {
        // Avoid overflowing the int too early by casting to a long.
        long newSize = Math.min(Integer.MAX_VALUE, (3L * size) / 2L);
        setCapacity((int) newSize);
    }
    this.size = size;
}
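
To see where the extra byte comes from, the growth rule in setSize() can be reproduced on its own. The following is only a sketch that does not depend on Hadoop: the class CapacityGrowthDemo and the helper grownCapacity are hypothetical names made up for illustration, and the starting capacity of 0 assumes a freshly constructed, empty BytesWritable.

```java
import java.nio.charset.StandardCharsets;

// Stand-alone sketch, no Hadoop dependency.
class CapacityGrowthDemo {

    // Hypothetical helper that mirrors the growth rule of setSize():
    // when the requested size exceeds the capacity, grow to 1.5x the size.
    static int grownCapacity(int currentCapacity, int requestedSize) {
        if (requestedSize > currentCapacity) {
            long newSize = Math.min(Integer.MAX_VALUE, (3L * requestedSize) / 2L);
            return (int) newSize;
        }
        return currentCapacity;
    }

    public static void main(String[] args) {
        int size = "aaa".getBytes(StandardCharsets.UTF_8).length;
        // Assumption: the buffer starts out empty (capacity 0).
        int capacity = grownCapacity(0, size);
        System.out.println("size=" + size + ", capacity=" + capacity);
    }
}
```

For "aaa" the size is 3 and (3 * 3) / 2 = 4 in integer arithmetic, so the backing array holds 4 bytes while only 3 are valid. That accounts for the single extra character in the output.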

Since the length is known, just pass it in when converting.

@Test
public void test() {
    String str = "aaa";

    BytesWritable v = new BytesWritable();
    v.set(str.getBytes(), 0, str.getBytes().length);

    // getSize() is deprecated; use getLength() instead
    System.out.println("*" + new String(v.getBytes(), 0, v.getLength()) + "*");
}
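
The same effect can be reproduced without Hadoop by padding a byte array by hand. This is a minimal sketch with hypothetical names (PaddedBufferDemo, decodeValid); the 4-byte buffer stands in for what getBytes() returns and the length 3 for what getLength() returns. UTF-8 is passed explicitly here as an assumption, whereas the test code above relies on the platform default charset.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Stand-alone sketch, no Hadoop dependency.
class PaddedBufferDemo {

    // Decode only the first `length` bytes of the buffer, ignoring the padding.
    static String decodeValid(byte[] buffer, int length) {
        return new String(buffer, 0, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] data = "aaa".getBytes(StandardCharsets.UTF_8);
        // Simulate the over-allocated backing array: 3 valid bytes plus 1 zero byte.
        byte[] buffer = Arrays.copyOf(data, 4);

        String whole = new String(buffer, StandardCharsets.UTF_8);
        String valid = decodeValid(buffer, data.length);

        System.out.println(whole.length());        // 4: the zero byte is decoded too
        System.out.println((int) whole.charAt(3)); // 0: a NUL character, not a real space
        System.out.println("*" + valid + "*");     // *aaa*
    }
}
```

BytesWritable also provides copyBytes(), which returns a copy of the data trimmed to exactly the valid length. It can be used instead when a correctly sized byte[] is needed rather than a String.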


http://hadoop.apache.org/docs/r2.9.2/api/org/apache/hadoop/io/BytesWritable.html

Original post: https://www.cnblogs.com/jhxxb/p/10795875.html