Java Hodgepodge

---------------------------------------------------------

ArrayList<String> arrayList = new ArrayList<String>(10);
System.out.println(arrayList.size());

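Take a guess: what does this print? Will it be 10?

Answer: 0

The reason is that new ArrayList<String>(10) only sets the initial capacity. The list's internal array gets room for 10 String references and arrayList is not null, but at this point it holds no elements, so size() is 0.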
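
To make the capacity-versus-size distinction concrete, here is a minimal sketch. The class name CapacityVsSize and its helper methods are made up for illustration; only ArrayList itself comes from the JDK.

```java
import java.util.ArrayList;
import java.util.List;

class CapacityVsSize {
    // Creates a list with the given initial capacity, adds `adds` elements,
    // and returns size(). Capacity never shows up in size().
    static int sizeAfter(int initialCapacity, int adds) {
        List<String> list = new ArrayList<>(initialCapacity);
        for (int i = 0; i < adds; i++) {
            list.add("item" + i);
        }
        return list.size();
    }

    // set() checks the index against size(), not capacity,
    // so it fails on a freshly created list.
    static boolean setOnEmptyThrows() {
        List<String> list = new ArrayList<>(10);
        try {
            list.set(0, "x");
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sizeAfter(10, 0));
        System.out.println(sizeAfter(10, 3));
        System.out.println(setOnEmptyThrows());
    }
}
```

The initial capacity only saves a few internal array re-allocations when you already know roughly how many elements you will add.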

---------------------------------------------------------

Why do people say the JDK's String indexOf() method is inefficient and should be avoided? Today I went to find out.

Here's the source!

static int indexOf(char[] source, int sourceOffset, int sourceCount,
            char[] target, int targetOffset, int targetCount,
            int fromIndex) {
        if (fromIndex >= sourceCount) {
            return (targetCount == 0 ? sourceCount : -1);
        }
        if (fromIndex < 0) {
            fromIndex = 0;
        }
        if (targetCount == 0) {
            return fromIndex;
        }

        char first = target[targetOffset];
        int max = sourceOffset + (sourceCount - targetCount);

        for (int i = sourceOffset + fromIndex; i <= max; i++) {
            /* Look for first character. */
            if (source[i] != first) {
                while (++i <= max && source[i] != first);
            }

            /* Found first character, now look at the rest of v2 */
            if (i <= max) {
                int j = i + 1;
                int end = j + targetCount - 1;
                for (int k = targetOffset + 1; j < end && source[j]
                        == target[k]; j++, k++);

                if (j == end) {
                    /* Found whole string. */
                    return i - sourceOffset;
                }
            }
        }
        return -1;
    }

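How tidy!

The first thing you see is two nested for loops. That has to be where the problem is! Unlike the KMP algorithm, this is plainly a brute-force scan over the indexes, with a worst case of O(m*n). Why not use KMP, which runs in O(m+n)? My first guess was lack of time, but a likelier reason is that for typical short strings the naive scan needs no preprocessing or extra memory and is fast in practice.

From now on I will try to avoid indexOf() where long, highly repetitive inputs are possible.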
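
For comparison, here is a rough sketch of the KMP idea mentioned above. It is my own illustration, not JDK code; the class KmpSearch and its method names are hypothetical.

```java
class KmpSearch {
    // failure[i] = length of the longest proper prefix of pattern[0..i]
    // that is also a suffix of pattern[0..i].
    static int[] buildFailure(String pattern) {
        int[] failure = new int[pattern.length()];
        int k = 0;
        for (int i = 1; i < pattern.length(); i++) {
            while (k > 0 && pattern.charAt(i) != pattern.charAt(k)) {
                k = failure[k - 1];
            }
            if (pattern.charAt(i) == pattern.charAt(k)) {
                k++;
            }
            failure[i] = k;
        }
        return failure;
    }

    // Returns the index of the first occurrence of pattern in text, or -1.
    static int indexOf(String text, String pattern) {
        if (pattern.isEmpty()) {
            return 0;
        }
        int[] failure = buildFailure(pattern);
        int k = 0; // number of pattern characters matched so far
        for (int i = 0; i < text.length(); i++) {
            // On a mismatch, fall back in the pattern; i never moves backwards.
            while (k > 0 && text.charAt(i) != pattern.charAt(k)) {
                k = failure[k - 1];
            }
            if (text.charAt(i) == pattern.charAt(k)) {
                k++;
            }
            if (k == pattern.length()) {
                return i - k + 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOf("abcabcabd", "abcabd"));
    }
}
```

The text index never moves backwards, which is where the O(m+n) bound comes from; the price is building the failure table for every pattern.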

While I was at it, I also peeked at the implementation of substring():

Here's the source!

public String substring(int beginIndex) {
        if (beginIndex < 0) {
            throw new StringIndexOutOfBoundsException(beginIndex);
        }
        int subLen = value.length - beginIndex;
        if (subLen < 0) {
            throw new StringIndexOutOfBoundsException(subLen);
        }
        return (beginIndex == 0) ? this : new String(value, beginIndex, subLen);
    }


public String(char value[], int offset, int count) {
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count < 0) {
            throw new StringIndexOutOfBoundsException(count);
        }
        // Note: offset or count might be near -1>>>1.
        if (offset > value.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }
        this.value = Arrays.copyOfRange(value, offset, offset+count);
    }

No wonder it is fairly efficient: it relies on Arrays.copyOfRange().
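
A small sketch of what the source above implies in practice. The class SubstringDemo is hypothetical, and returning this for substring(0) is an implementation detail of the code shown, not something I would rely on as a documented guarantee.

```java
class SubstringDemo {
    // Per the source above, substring(0) returns the same String object.
    static boolean zeroOffsetReturnsSame(String s) {
        return s.substring(0) == s;
    }

    // Any other offset builds a new String from a copied range of chars.
    static String tail(String s, int beginIndex) {
        return s.substring(beginIndex);
    }

    public static void main(String[] args) {
        System.out.println(zeroOffsetReturnsSame("hello"));
        System.out.println(tail("hello", 2));
    }
}
```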

---------------------------------------------------------

A very old interview question:

Count how many times the substring str2 occurs in the string str1. For example:

String str1 = "sgiccomcocmcomomameadjcomfjecomcomcomcijr";
String str2 = "com";

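The approaches I came up with:

// brute-force matching
// KMP algorithm
// substring
// split
// RegEx (may be the best)

Of these, RegEx (regular-expression matching) is the most efficient, as far as I understand it. It is also very easy to implement: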

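// Requires java.util.regex.Matcher and java.util.regex.Pattern.
String str1 = "sgiccomcocmcomomameadjcomfjecomcomcomcijr";
String str2 = "com";
int times = 0;
// Pattern.quote() makes str2 match literally even if it contains regex metacharacters.
Pattern pattern = Pattern.compile(Pattern.quote(str2));
Matcher matcher = pattern.matcher(str1);
// find() moves past each match, so overlapping occurrences are not counted.
while (matcher.find()) {
    times++;
    System.out.println(matcher.group());
}
System.out.println(times);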
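
For comparison, here is a sketch of the same count done with plain indexOf() and no regex. The class and method names are made up for illustration, and like the regex version it counts non-overlapping occurrences.

```java
class CountOccurrences {
    // Counts non-overlapping occurrences of target in text.
    static int count(String text, String target) {
        if (target.isEmpty()) {
            return 0; // an empty target would otherwise loop forever
        }
        int times = 0;
        int from = 0;
        while ((from = text.indexOf(target, from)) != -1) {
            times++;
            from += target.length(); // skip past this match
        }
        return times;
    }

    public static void main(String[] args) {
        String str1 = "sgiccomcocmcomomameadjcomfjecomcomcomcijr";
        String str2 = "com";
        System.out.println(count(str1, str2)); // prints 6
    }
}
```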

---------------------------------------------------------

7/30

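I was just exploring the KMP algorithm when an experienced colleague came over to see what I was up to. That led to a different topic: regular expressions are not necessarily efficient! As he put it, Java code has to be interpreted by the JVM, and a regular expression needs an interpreter of its own (roughly speaking), so it may well be slower.

Also, the String class has its own matches() method, so I compared it with matches() in Pattern.

The code: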

// Requires java.util.regex.Pattern.
public static void main(String[] args){
        String string = "ddenfj#@fe_dw.comw";
        // Backslashes must be doubled inside a Java string literal.
        String regex = "^([\\.a-zA-Z0-9_-])+@([a-zA-Z0-9_-])+((\\.[a-zA-Z0-9_-]{2,3}){1,2})$";
        int count = 100000;
        long before = System.currentTimeMillis();
        Runtime runtime = Runtime.getRuntime();
        // totalMemory() is only a rough indicator: it reports heap growth, not exact usage.
        long r1 = runtime.totalMemory();
        // String.matches() compiles the regex again on every call.
        for (int i = 0; i < count; i++) {
            string.matches(regex);
        }
        long r2 = runtime.totalMemory();
        long end = System.currentTimeMillis();
        System.out.println("the memory of match() in String is = " + (r2-r1)/1024);
        System.out.println("the time of match() in String is = " + (end-before));
        
        //Pattern: compile once, reuse for every match
        Pattern pattern = Pattern.compile(regex);
        before = System.currentTimeMillis();
        r1 = runtime.totalMemory();
        for (int i = 0; i < count; i++) {
            pattern.matcher(string).matches();
        }
        r2 = runtime.totalMemory();
        end = System.currentTimeMillis();
        System.out.println("the memory of match() in Pattern is = " + (r2-r1)/1024);
        System.out.println("the time of match() in Pattern is = " + (end-before));
    }

With count=100000, the results were (memory in KB, time in ms):

the memory of match() in String is = 92416
the time of match() in String is = 327
the memory of match() in Pattern is = 0
the time of match() in Pattern is = 26

The result surprised me! Compared with matches() in the Pattern class, matches() in the String class is practically unusable!

The reason: String's matches() calls Pattern.matches(regex, this), which is implemented as follows:

public static boolean matches(String regex, CharSequence input) {
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(input);
        return m.matches();
    }

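The Pattern.compile(regex) line is the culprit: executing it on every call recompiles the pattern each time, which costs time and allocates a lot of memory.

The method's documentation even says so: If a pattern is to be used multiple times, compiling it once and reusing it will be more efficient than invoking this method each time.

In short, do not use String's matches() for a pattern that is matched repeatedly!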
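
One common way to follow that advice is to keep the compiled Pattern in a static final field. A minimal sketch, where the class EmailCheck and the method looksLikeEmail are names I made up, and the regex is the one from the test above:

```java
import java.util.regex.Pattern;

class EmailCheck {
    // Compiled once when the class is loaded, then reused for every call.
    private static final Pattern EMAIL = Pattern.compile(
            "^([\\.a-zA-Z0-9_-])+@([a-zA-Z0-9_-])+((\\.[a-zA-Z0-9_-]{2,3}){1,2})$");

    static boolean looksLikeEmail(String input) {
        // A Matcher is cheap to create; only compile() is the expensive step.
        return EMAIL.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeEmail("ddenfj#@fe_dw.comw"));
        System.out.println(looksLikeEmail("a.b@example.com"));
    }
}
```

A compiled Pattern is immutable and safe to share between threads; a Matcher is not, so create a new one per call as above.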

---------------------------------------------------------

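Today a colleague asked: can different versions of the same jar exist side by side? If so, which one gets loaded?

I ran an experiment: I put poi-3.10-beta2.jar and poi-3.10-FINAL.jar (which I renamed to chris.jar) in the lib folder. Compilation succeeded, which shows the two can coexist.

But which jar is actually used? The newest one? And how would "newest" be decided: by version number?

In the end it turned out that the compiler still used the class files from chris.jar.

Conclusion: jars of different versions can coexist, but nothing picks the "newest" one by class timestamps or by the information under META-INF. A class is looked up along the classpath in order, and the first jar that contains it wins; in my test that was presumably chris.jar. Relying on this is fragile, so it is better to keep only one version on the classpath.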
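
To check at runtime which jar a class really came from, one option is to ask the class for its code source. A minimal sketch; the class WhichJar and the method locationOf are hypothetical names, while getProtectionDomain() and getCodeSource() are standard JDK calls.

```java
import java.security.CodeSource;

class WhichJar {
    // Returns the jar or directory a class was loaded from, as a URL string.
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            // Typically the case for JDK classes such as java.lang.String.
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        System.out.println(locationOf(WhichJar.class));
        System.out.println(locationOf(String.class));
    }
}
```

In the experiment above, passing a POI class (for example org.apache.poi.ss.usermodel.Workbook, assuming it is on the classpath) should show whether it came from chris.jar or from poi-3.10-beta2.jar.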

---------------------------------------------------------

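Work when clear-headed, read when muddled, sleep when furious, think when alone; be a happy person: read, travel, work hard, look after your body and your mood, and become the best version of yourself -- let's encourage each other.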