How the GC reclaims SoftReference objects

While reading the Fresco source code, I came across this class:

/**
 * To eliminate the possibility of some of our objects causing an OutOfMemoryError when they are
 * not used, we reference them via SoftReferences.
 * What is a SoftReference?
 * <a href="http://developer.android.com/reference/java/lang/ref/SoftReference.html"></a>
 * <a href="http://docs.oracle.com/javase/7/docs/api/java/lang/ref/SoftReference.html"></a>
 * A Soft Reference is a reference that is cleared when its referent is not strongly reachable and
 * there is memory pressure. SoftReferences as implemented by Dalvik blindly treat every second
 * SoftReference as a WeakReference every time a garbage collection happens, - i.e. clear it unless
 * there is something else referring to it:
 * <a href="https://goo.gl/Pe6aS7">dalvik</a>
 * <a href="https://goo.gl/BYaUZE">art</a>
 * It will however clear every SoftReference if we don't have enough memory to satisfy an
 * allocation after a garbage collection.
 * <p>
 * This means that as long as one of the soft references stays alive, they all stay alive. If we
 * have two SoftReferences next to each other on the heap, both pointing to the same object, then
 * we are guaranteed that neither will be cleared until we otherwise would have thrown an
 * OutOfMemoryError. Since we can't strictly guarantee the location of objects on the heap, we use
 * 3 just to be on the safe side.
 * TLDR: It's a reference that's cleared if and only if we otherwise would have encountered an OOM.
 */
public class OOMSoftReference<T> {

    SoftReference<T> softRef1;
    SoftReference<T> softRef2;
    SoftReference<T> softRef3;

    public OOMSoftReference() {
        softRef1 = null;
        softRef2 = null;
        softRef3 = null;
    }

    public void set(@Nonnull T hardReference) {
        softRef1 = new SoftReference<T>(hardReference);
        softRef2 = new SoftReference<T>(hardReference);
        softRef3 = new SoftReference<T>(hardReference);
    }

    @Nullable
    public T get() {
        return (softRef1 == null ? null : softRef1.get());
    }

    public void clear() {
        if (softRef1 != null) {
            softRef1.clear();
            softRef1 = null;
        }
        if (softRef2 != null) {
            softRef2.clear();
            softRef2 = null;
        }
        if (softRef3 != null) {
            softRef3.clear();
            softRef3 = null;
        }
    }
}


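The class name looked odd at first: everyone dreads OOM, so why put it in a name?
With that question in mind I read the code and found three SoftReferences pointing to the same object, which is even stranger. Why?
The Javadoc is long, but it boils down to one question: what is a SoftReference, and how does it behave?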

SoftReference



The Android developer documentation describes SoftReference as follows:

Soft reference objects, which are cleared at the discretion of the garbage collector in response to memory demand.

Suppose that the garbage collector determines at a certain point in time that an object is softly reachable. At that time it may choose to clear atomically all soft references to that object and all soft references to any other softly-reachable objects from which that object is reachable through a chain of strong references. At the same time or at some later time it will enqueue those newly-cleared soft references that are registered with reference queues.

All soft references to softly-reachable objects are guaranteed to have been cleared before the virtual machine throws an OutOfMemoryError. Otherwise no constraints are placed upon the time at which a soft reference will be cleared or the order in which a set of such references to different objects will be cleared. Virtual machine implementations are, however, encouraged to bias against clearing recently-created or recently-used soft references.

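Note the last paragraph: every soft reference to a softly reachable object is guaranteed to be cleared before an OutOfMemoryError is thrown, but the specification places no constraint on when a given SoftReference is cleared or in what order. Virtual machine implementations are only encouraged to avoid clearing recently created or recently used SoftReferences.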
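The two behaviors that can be observed deterministically are easy to check with the standard `java.lang.ref.SoftReference` API. The following is a minimal sketch, not taken from Fresco; the class name `SoftReferenceBasics` and the helper `isAlive` are made up for illustration. Clearing under memory pressure is deliberately not shown, because its timing depends on the VM.

```java
import java.lang.ref.SoftReference;

public class SoftReferenceBasics {
    /** Hypothetical helper: true while the referent can still be obtained. */
    static boolean isAlive(SoftReference<?> ref) {
        return ref.get() != null;
    }

    public static void main(String[] args) {
        byte[] payload = new byte[1024];                       // strong reference
        SoftReference<byte[]> ref = new SoftReference<>(payload);

        // While the referent is strongly reachable, a GC must not clear the reference.
        System.gc();
        System.out.println(ref.get() == payload);              // prints "true"

        // clear() drops the referent explicitly, independent of memory pressure.
        ref.clear();
        System.out.println(isAlive(ref));                      // prints "false"
    }
}
```

What happens to a referent that is only softly reachable is left to the collector, which is exactly the part the rest of this article looks at.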



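At this point it seems likely that the three SoftReferences have something to do with how the GC reclaims SoftReferences, although exactly how is still unclear. Conveniently, the Javadoc above contains two links, labeled dalvik and art, that point into the source code of the Dalvik and ART virtual machines, which supports that guess. So let's look at how the virtual machine actually reclaims SoftReferences.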

Virtual Machine GC

Following the links in the Javadoc, let's look at the GC policy for SoftReferences, starting with the Dalvik implementation:

/*
 * Walks the reference list marking any references subject to the
 * reference clearing policy.  References with a black referent are
 * removed from the list.  References with white referents biased
 * toward saving are blackened and also removed from the list.
 */
static void preserveSomeSoftReferences(Object **list)
{
    assert(list != NULL);
    GcMarkContext *ctx = &gDvm.gcHeap->markContext;
    size_t referentOffset = gDvm.offJavaLangRefReference_referent;
    Object *clear = NULL;
    size_t counter = 0;
    while (*list != NULL) {
        // Dequeue the head of the list
        Object *ref = dequeuePendingReference(list);
        // Read the object this reference points to
        Object *referent = dvmGetFieldObject(ref, referentOffset);
        if (referent == NULL) {
            /* Referent was cleared by the user during marking. */
            continue;
        }
        // Check whether the referent is already marked
        bool marked = isMarked(referent, ctx);
        // True for every other unmarked referent: counter only advances on
        // unmarked referents, so the 1st, 3rd, 5th, ... of them are saved
        if (!marked && ((++counter) & 1)) {
            /* Referent is white and biased toward saving, mark it. */
            // Mark the referent
            markObject(referent, ctx);
            marked = true;
        }
        // References whose referent is still unmarked go to the head of the clear list
        if (!marked) {
            /* Referent is white, queue it for clearing. */
            enqueuePendingReference(ref, &clear);
        }
    }
    // The clear list becomes the new list
    *list = clear;
    /*
     * Restart the mark with the newly black references added to the
     * root set.
     */
    processMarkStack(ctx);
}

MarkSweep.cpp

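As is well known, Dalvik's GC algorithm is Mark-Sweep: the Mark phase marks reachable objects, and the Sweep phase reclaims the objects that were left unmarked.
A first read suggests what the function above does: it walks all pending SoftReferences, marks the referent of every other reference whose referent is still unmarked (the 1st, 3rd, 5th, ...), and puts the remaining ones on the clear list.

To confirm this, let's look at where the function is called: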

/*
 * Process reference class instances and schedule finalizations.
 */
void dvmHeapProcessReferences(Object **softReferences, bool clearSoftRefs,
                              Object **weakReferences,
                              Object **finalizerReferences,
                              Object **phantomReferences)
{
    assert(softReferences != NULL);
    assert(weakReferences != NULL);
    assert(finalizerReferences != NULL);
    assert(phantomReferences != NULL);
    /*
     * Unless we are in the zygote or required to clear soft
     * references with white references, preserve some white
     * referents.
     */
    /*
     * If this is not the zygote process and clearSoftRefs is not true, call
     * preserveSomeSoftReferences to mark the referents of every other
     * SoftReference whose referent is still white.
     */
    if (!gDvm.zygote && !clearSoftRefs) {
        preserveSomeSoftReferences(softReferences);
    }
    /*
     * Clear all remaining soft and weak references with white
     * referents.
     */
    clearWhiteReferences(softReferences);
    clearWhiteReferences(weakReferences);
    /*
     * Preserve all white objects with finalize methods and schedule
     * them for finalization.
     */
    enqueueFinalizerReferences(finalizerReferences);
    /*
     * Clear all f-reachable soft and weak references with white
     * referents.
     */
    clearWhiteReferences(softReferences);
    clearWhiteReferences(weakReferences);
    /*
     * Clear all phantom references with white referents.
     */
    clearWhiteReferences(phantomReferences);
    /*
     * At this point all reference lists should be empty.
     */
    assert(*softReferences == NULL);
    assert(*weakReferences == NULL);
    assert(*finalizerReferences == NULL);
    assert(*phantomReferences == NULL);
}


Now the clearWhiteReferences function:

/*
 * Unlink the reference list clearing references objects with white
 * referents.  Cleared references registered to a reference queue are
 * scheduled for appending by the heap worker thread.
 */
static void clearWhiteReferences(Object **list)
{
    assert(list != NULL);
    GcMarkContext *ctx = &gDvm.gcHeap->markContext;
    size_t referentOffset = gDvm.offJavaLangRefReference_referent;
    while (*list != NULL) {
        Object *ref = dequeuePendingReference(list);
        Object *referent = dvmGetFieldObject(ref, referentOffset);
        // A reference whose referent is still unmarked is cleared
        if (referent != NULL && !isMarked(referent, ctx)) {
            /* Referent is white, clear it. */
            clearReference(ref);
            if (isEnqueuable(ref)) {
                enqueueReference(ref);
            }
        }
    }
    assert(*list == NULL);
}


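The code above shows that preserveSomeSoftReferences keeps the referents of some SoftReferences alive while the rest are garbage collected, and the deciding factor is parity: the position of a reference among those whose referents are still unmarked.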

Now back to this part of the Javadoc:

 This means that as long as one of the soft references stays alive, they all stay alive. If we
 have two SoftReferences next to each other on the heap, both pointing to the same object, then
 we are guaranteed that neither will be cleared until we otherwise would have thrown an
 OutOfMemoryError. Since we can't strictly guarantee the location of objects on the heap, we use
 3 just to be on the safe side.


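This is where it clicks. The comment says that if two SoftReferences to the same object are adjacent (one in an odd slot, one in an even slot), one of them is always on the preserved side, so the referent gets marked and neither reference is cleared. The position of a SoftReference cannot be controlled, though, so "to be on the safe side" the class uses three SoftReferences (why not 10?) to make it as unlikely as possible that the GC reclaims the object.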
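The parity policy and the adjacency argument can be checked with a small simulation. This is a simplified model in Java, not Dalvik code: the class `SoftRefParitySim` and the method `clearedRefs` are hypothetical, referents are represented by integer ids, and the model ignores that marking one referent can also mark other referents reachable from it.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SoftRefParitySim {
    /**
     * Simulates one reference-processing pass.
     * refs[i] is the id of the referent that soft reference i points to.
     * Returns the indices of the soft references that would be cleared.
     */
    static List<Integer> clearedRefs(int[] refs, boolean clearSoftRefs) {
        Set<Integer> marked = new HashSet<>();   // referents marked so far
        int counter = 0;
        if (!clearSoftRefs) {
            // Models preserveSomeSoftReferences: mark every other unmarked referent.
            for (int referent : refs) {
                if (!marked.contains(referent) && ((++counter) & 1) == 1) {
                    marked.add(referent);
                }
            }
        }
        // Models clearWhiteReferences: clear references whose referent is unmarked.
        List<Integer> cleared = new ArrayList<>();
        for (int i = 0; i < refs.length; i++) {
            if (!marked.contains(refs[i])) {
                cleared.add(i);
            }
        }
        return cleared;
    }

    public static void main(String[] args) {
        // Four distinct referents: every second reference is cleared.
        System.out.println(clearedRefs(new int[] {1, 2, 3, 4}, false)); // [1, 3]
        // Two adjacent references to referent 7: neither is cleared.
        System.out.println(clearedRefs(new int[] {9, 7, 7}, false));    // []
        // The same two references separated by another one: both are cleared.
        System.out.println(clearedRefs(new int[] {9, 7, 8, 7}, false)); // [1, 3]
        // With clearSoftRefs set, nothing is preserved.
        System.out.println(clearedRefs(new int[] {7, 7}, true));        // [0, 1]
    }
}
```

The third case shows why adjacency matters: two references to the same object that both land on the unlucky parity are both cleared, which is the risk the third SoftReference is meant to reduce.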

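So now we basically understand how OOMSoftReference got its name: its purpose is to keep the object from being reclaimed by the GC. But doesn't that make an OOM more likely? Not really, for the reason mentioned earlier: all SoftReferences are guaranteed to be cleared before an OOM is thrown.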

GC_BEFORE_OOM

Let's keep digging; the truth is in the code. The function below is called whenever memory has to be allocated for an object:

/* Try as hard as possible to allocate some memory.
 */
static void *tryMalloc(size_t size)
{
    void *ptr;

//TODO: figure out better heuristics
//    There will be a lot of churn if someone allocates a bunch of
//    big objects in a row, and we hit the frag case each time.
//    A full GC for each.
//    Maybe we grow the heap in bigger leaps
//    Maybe we skip the GC if the size is large and we did one recently
//      (number of allocations ago) (watch for thread effects)
//    DeflateTest allocs a bunch of ~128k buffers w/in 0-5 allocs of each other
//      (or, at least, there are only 0-5 objects swept each time)
    // Try to allocate; return on success
    ptr = dvmHeapSourceAlloc(size);
    if (ptr != NULL) {
        return ptr;
    }

    /*
     * The allocation failed.  If the GC is running, block until it
     * completes and retry.
     */
    // A GC is already running: wait for it
    if (gDvm.gcHeap->gcRunning) {
        /*
         * The GC is concurrently tracing the heap.  Release the heap
         * lock, wait for the GC to complete, and retrying allocating.
         */
        dvmWaitForConcurrentGcToComplete();
    } else {
      /*
       * Try a foreground GC since a concurrent GC is not currently running.
       */
      // Note that the argument here is false
      gcForMalloc(false);
    }

    // Try to allocate again; return on success
    ptr = dvmHeapSourceAlloc(size);
    if (ptr != NULL) {
        return ptr;
    }

    /* Even that didn't work;  this is an exceptional state.
     * Try harder, growing the heap if necessary.
     */
    // Try to allocate, growing the heap if needed; return on success
    ptr = dvmHeapSourceAllocAndGrow(size);
    if (ptr != NULL) {
        size_t newHeapSize;

        newHeapSize = dvmHeapSourceGetIdealFootprint();
//TODO: may want to grow a little bit more so that the amount of free
//      space is equal to the old free space + the utilization slop for
//      the new allocation.
        LOGI_HEAP("Grow heap (frag case) to "
                "%zu.%03zuMB for %zu-byte allocation",
                FRACTIONAL_MB(newHeapSize), size);
        return ptr;
    }

    /* Most allocations should have succeeded by now, so the heap
     * is really full, really fragmented, or the requested size is
     * really big.  Do another GC, collecting SoftReferences this
     * time.  The VM spec requires that all SoftReferences have
     * been collected and cleared before throwing an OOME.
     */
//TODO: wait for the finalizers from the previous GC to finish
    LOGI_HEAP("Forcing collection of SoftReferences for %zu-byte allocation",
            size);
    // Note that the argument here is true
    gcForMalloc(true);
    // Try to allocate, growing the heap if needed; return on success
    ptr = dvmHeapSourceAllocAndGrow(size);
    if (ptr != NULL) {
        return ptr;
    }
//TODO: maybe wait for finalizers and try one last time

    LOGE_HEAP("Out of memory on a %zd-byte allocation.", size);
//TODO: tell the HeapSource to dump its state
    dvmDumpThread(dvmThreadSelf(), false);

    return NULL;
}


Now the gcForMalloc function:

/* Do a full garbage collection, which may grow the
 * heap as a side-effect if the live set is large.
 */
static void gcForMalloc(bool clearSoftReferences)
{
    if (gDvm.allocProf.enabled) {
        Thread* self = dvmThreadSelf();
        gDvm.allocProf.gcCount++;
        if (self != NULL) {
            self->allocProf.gcCount++;
        }
    }
    /* This may adjust the soft limit as a side-effect.
     */
    // If clearSoftReferences is true the GC type is GC_BEFORE_OOM, otherwise GC_FOR_MALLOC
    const GcSpec *spec = clearSoftReferences ? GC_BEFORE_OOM : GC_FOR_MALLOC;
    dvmCollectGarbageInternal(spec);
}

/*
 * Initiate garbage collection.
 *
 * NOTES:
 * - If we don't hold gDvm.threadListLock, it's possible for a thread to
 *   be added to the thread list while we work.  The thread should NOT
 *   start executing, so this is only interesting when we start chasing
 *   thread stacks.  (Before we do so, grab the lock.)
 *
 * We are not allowed to GC when the debugger has suspended the VM, which
 * is awkward because debugger requests can cause allocations.  The easiest
 * way to enforce this is to refuse to GC on an allocation made by the
 * JDWP thread -- we have to expand the heap or fail.
 */
void dvmCollectGarbageInternal(const GcSpec* spec)
{
    GcHeap *gcHeap = gDvm.gcHeap;
    u4 gcEnd = 0;
    u4 rootStart = 0 , rootEnd = 0;
    u4 dirtyStart = 0, dirtyEnd = 0;
    size_t numObjectsFreed, numBytesFreed;
    size_t currAllocated, currFootprint;
    size_t percentFree;
    int oldThreadPriority = INT_MAX;

    /* The heap lock must be held.
     */

    if (gcHeap->gcRunning) {
        LOGW_HEAP("Attempted recursive GC");
        return;
    }

    // Trace the beginning of the top-level GC.
    if (spec == GC_FOR_MALLOC) {
        ATRACE_BEGIN("GC (alloc)");
    } else if (spec == GC_CONCURRENT) {
        ATRACE_BEGIN("GC (concurrent)");
    } else if (spec == GC_EXPLICIT) {
        ATRACE_BEGIN("GC (explicit)");
    } else if (spec == GC_BEFORE_OOM) {
        ATRACE_BEGIN("GC (before OOM)");
    } else {
        ATRACE_BEGIN("GC (unknown)");
    }

    .............................
...................................

    /*
     * All strongly-reachable objects have now been marked.  Process
     * weakly-reachable objects discovered while tracing.
     */
    dvmHeapProcessReferences(&gcHeap->softReferences,
                             spec->doPreserve == false,
                             &gcHeap->weakReferences,
                             &gcHeap->finalizerReferences,
                             &gcHeap->phantomReferences);

#if defined(WITH_JIT)
    /*
     * Patching a chaining cell is very cheap as it only updates 4 words. It's
     * the overhead of stopping all threads and synchronizing the I/D cache
     * that makes it expensive.
     *
     * Therefore we batch those work orders in a queue and go through them
     * when threads are suspended for GC.
     */
    dvmCompilerPerformSafePointChecks();
#endif

    LOGD_HEAP("Sweeping...");

    dvmHeapSweepSystemWeaks();

    /*
     * Live objects have a bit set in the mark bitmap, swap the mark
     * and live bitmaps.  The sweep can proceed concurrently viewing
     * the new live bitmap as the old mark bitmap, and vice versa.
     */
    dvmHeapSourceSwapBitmaps();

    .............................
..................................
}


Finally, GC_BEFORE_OOM:


static const GcSpec kGcBeforeOomSpec = {
    false,  /* isPartial */
    false,  /* isConcurrent */
    false,  /* doPreserve is false */
    "GC_BEFORE_OOM"
};

const GcSpec *GC_BEFORE_OOM = &kGcBeforeOomSpec;



After reading the code above, it is fairly clear why "all SoftReferences are guaranteed to be cleared before an OOM is thrown".

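The reason: when a process keeps requesting memory and the request still cannot be satisfied (after several attempts, with the heap unable to grow any further), Dalvik triggers a GC of type GC_BEFORE_OOM. A GC of this type does not preserve any SoftReferences, so every object that is reachable only through SoftReferences is reclaimed.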
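The escalation order in tryMalloc can be summarized with a toy model. The sketch below is a Java illustration under strong assumptions, not a port of the Dalvik code: `AllocLadderSim`, `ToyHeap`, and all of their fields are hypothetical, sizes are abstract units, a preserving GC is assumed to keep half of the softly reachable memory, and the branch that waits for a concurrent GC is left out.

```java
public class AllocLadderSim {
    /** Toy heap; every name here is hypothetical. */
    static final class ToyHeap {
        int limit;           // current heap limit
        final int maxLimit;  // the heap cannot grow beyond this
        int strongUsed;      // held by strongly reachable objects
        int softUsed;        // reachable only through SoftReferences

        ToyHeap(int limit, int maxLimit, int strongUsed, int softUsed) {
            this.limit = limit;
            this.maxLimit = maxLimit;
            this.strongUsed = strongUsed;
            this.softUsed = softUsed;
        }

        /** Allocates within the current limit. */
        boolean alloc(int size) {
            if (strongUsed + softUsed + size > limit) {
                return false;
            }
            strongUsed += size;
            return true;
        }

        /** Allocates, raising the limit up to maxLimit if necessary. */
        boolean allocAndGrow(int size) {
            int needed = strongUsed + softUsed + size;
            if (needed > maxLimit) {
                return false;
            }
            limit = Math.max(limit, needed);
            strongUsed += size;
            return true;
        }

        /** Assumption: a preserving GC keeps half of the soft memory. */
        void gc(boolean clearSoftRefs) {
            softUsed = clearSoftRefs ? 0 : softUsed - softUsed / 2;
        }
    }

    /** Returns the step that satisfied the allocation, or "OOM". */
    static String tryMalloc(ToyHeap heap, int size) {
        if (heap.alloc(size)) return "fast path";
        heap.gc(false);                                   // like GC_FOR_MALLOC
        if (heap.alloc(size)) return "after GC_FOR_MALLOC";
        if (heap.allocAndGrow(size)) return "after growing heap";
        heap.gc(true);                                    // like GC_BEFORE_OOM
        if (heap.allocAndGrow(size)) return "after GC_BEFORE_OOM";
        return "OOM";
    }

    public static void main(String[] args) {
        // 40 strong + 60 soft in a heap that cannot grow; request 50 units.
        System.out.println(tryMalloc(new ToyHeap(100, 100, 40, 60), 50));
    }
}
```

In this model the softly reachable memory is given up completely only on the last step, right before the allocation would otherwise fail.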

Conclusions



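That solves the mystery of the three SoftReferences. As for why Fresco does this, my guess is that Fresco wants to manage memory allocation and release itself as much as possible, so it keeps objects from being reclaimed and avoids reallocating them, which makes them behave like a cache pool. Why not use a strong reference, then? Because this way Fresco manages the memory itself and can still rely on the GC to free the softly referenced objects when system memory runs short, avoiding a real OOM.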

I won't go into ART's reclamation mechanism here. It is largely similar; interested readers can explore it on their own.

https://zhuanlan.zhihu.com/p/24720906

Original article: https://www.cnblogs.com/softidea/p/9600542.html