(Original) TensorFlow does not automatically release GPU memory after a function finishes

Please credit the source when reposting:

http://www.cnblogs.com/darkknightzh/p/7608916.html

References:

https://stackoverflow.com/questions/39758094/clearing-tensorflow-gpu-memory-after-model-execution

https://github.com/tensorflow/tensorflow/issues/1727#issuecomment-285815312

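In TensorFlow, once the GPU has been configured inside a function and TF has allocated GPU memory, that memory is not released when the function returns (Torch7 seems to behave the same way...). The second reference explains: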

As for the original problem, currently the Allocator in the GPUDevice belongs to the ProcessState, which is essentially a global singleton. The first session using GPU initializes it, and frees itself when the process shuts down. Even if a second session chooses a different GPUOptions, it would not take effect.

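In other words, once the first session has initialized the GPU, a second session that tries to initialize it with different GPU options has no effect, even if the first session's memory has already been released.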

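In the first reference, Oli Blum points out that the only way to release the GPU memory is to "use processes and shut them down after the computation". The code is as follows (see also the first reference):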

import multiprocessing

import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (tf.Session, tf.placeholder)


def run_tensorflow():
    n_input = 10000
    n_classes = 1000

    # Create model: a single linear layer (matrix multiply, no activation)
    def multilayer_perceptron(x, weight):
        layer_1 = tf.matmul(x, weight)
        return layer_1

    # Store the layer weights
    weights = tf.Variable(tf.random_normal([n_input, n_classes]))

    x = tf.placeholder("float", [None, n_input])
    y = tf.placeholder("float", [None, n_classes])
    pred = multilayer_perceptron(x, weights)

    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
    optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost)

    init = tf.global_variables_initializer()

    with tf.Session() as sess:
        sess.run(init)

        for i in range(100):
            batch_x = np.random.rand(10, n_input)
            batch_y = np.random.rand(10, n_classes)
            sess.run([optimizer, cost], feed_dict={x: batch_x, y: batch_y})

    print("finished doing stuff with tensorflow!")


if __name__ == "__main__":
    # option 1: execute code in an extra process
    p = multiprocessing.Process(target=run_tensorflow)
    p.start()
    p.join()

    # wait until user presses enter key (Python 3; use raw_input() on Python 2)
    input()

    # option 2: just execute the function
    run_tensorflow()

    # wait until user presses enter key
    input()
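Option 1 in the listing runs the function in a child process but cannot hand a result back to the caller. Below is a minimal sketch of a reusable wrapper for that case. The names run_in_subprocess and _child_main are hypothetical helpers, not part of TensorFlow or of the referenced answer. The sketch assumes the "fork" start method (Linux or macOS), a picklable return value, and a parent process that has not created a TensorFlow session itself.

```python
import multiprocessing


def _child_main(conn, fn, args, kwargs):
    # Runs inside the child process: send back (ok, payload) and close the pipe.
    try:
        conn.send((True, fn(*args, **kwargs)))
    except Exception as exc:
        conn.send((False, repr(exc)))
    finally:
        conn.close()


def run_in_subprocess(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) in a short-lived child process and return its result.

    Hypothetical helper: everything the child allocated (including GPU memory
    held by a framework) is returned to the system when the child exits.
    """
    ctx = multiprocessing.get_context("fork")  # assumption: POSIX platform
    parent_conn, child_conn = ctx.Pipe(duplex=False)
    proc = ctx.Process(target=_child_main, args=(child_conn, fn, args, kwargs))
    proc.start()
    # The parent keeps only the read end, so recv() sees EOF if the child dies.
    child_conn.close()
    try:
        ok, payload = parent_conn.recv()
    except EOFError:
        ok, payload = False, "child exited without returning a result"
    finally:
        parent_conn.close()
        proc.join()
    if not ok:
        raise RuntimeError(payload)
    return payload
```

With this in place, a call such as run_in_subprocess(run_tensorflow) would replace the three lines of option 1, and any value returned by the function (for example a final loss converted to a plain Python float) would be passed back to the parent.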

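When run_tensorflow is run through multiprocessing.Process, the GPU memory is released automatically once the child process exits. When run_tensorflow is called directly, the memory is not released. Note that this function does very little computation, so on a fast GPU you may not be able to observe the memory being allocated, used, and released during the multiprocessing.Process run, and it can look as if nothing ran at all...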
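Because the allocation can come and go too quickly to catch by eye, it may help to read the memory usage programmatically before and after each option. The following is a rough sketch, assuming an NVIDIA GPU with the nvidia-smi tool on the PATH. The function names parse_used_memory_mib and query_used_memory_mib are made up for this illustration.

```python
import subprocess


def parse_used_memory_mib(smi_output):
    """Parse nvidia-smi CSV output: one integer (MiB) per non-empty line, one line per GPU."""
    return [int(line.strip()) for line in smi_output.splitlines() if line.strip()]


def query_used_memory_mib():
    """Ask nvidia-smi for the memory currently in use on each GPU, in MiB."""
    output = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        universal_newlines=True,
    )
    return parse_used_memory_mib(output)
```

Printing query_used_memory_mib() before option 1, after p.join(), and again after the direct call in option 2 should make the difference visible: the value is expected to return to its starting level after the child process exits, and to stay elevated after the direct call until the script itself ends.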