Using the Python interface; another option is TF-TRT, where the optimized model is still a pb file. The optimization mainly consists of layer fusion and similar graph rewrites, so the speedup is not dramatic: on the two networks I tested, it was roughly 10%.
Since the output is still a pb, it can continue to be served with TF Serving.
keras/tf model -> pb model -> (TRT-optimized model)
Alternatively, if the model is already a SavedModel, it can be converted directly with saved_model_cli and then used with TF Serving as before.
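A minimal conversion sketch using the TF 1.14+ Python API. The directory names (my_saved_model, my_saved_model_trt) are placeholders; running this requires a TensorFlow build with TensorRT support and a GPU:

```python
# TF-TRT conversion sketch (TF 1.x API, tensorflow>=1.14).
from tensorflow.python.compiler.tensorrt import trt_convert as trt

def optimize_saved_model(input_dir, output_dir, precision="FP16"):
    """Fuse supported subgraphs of a SavedModel into TensorRT ops.

    The result is still a SavedModel, so TF Serving can load it directly.
    """
    converter = trt.TrtGraphConverter(
        input_saved_model_dir=input_dir,
        precision_mode=precision)   # "FP32", "FP16", or "INT8"
    converter.convert()             # rewrites the graph, fusing layers into TRT engines
    converter.save(output_dir)

if __name__ == "__main__":
    # Placeholder paths -- replace with real SavedModel directories.
    optimize_saved_model("my_saved_model", "my_saved_model_trt")
```

The saved_model_cli route mentioned above is roughly equivalent: `saved_model_cli convert --dir my_saved_model --output_dir my_saved_model_trt --tag_set serve tensorrt` (TF 1.13+; flags may differ across versions).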
References:
https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#usage-example
https://github.com/srihari-humbarwadi/TensorRT-for-keras
https://github.com/jeng1220/KerasToTensorRT
https://github.com/NVIDIA-AI-IOT/tf_trt_models
https://github.com/WeJay/TensorRTkeras
https://github.com/tensorflow/tensorrt/tree/master/tftrt/examples/image-classification
https://github.com/NVIDIA-AI-IOT/tf_trt_models/blob/master/examples/classification/classification.ipynb
https://developer.ibm.com/linuxonpower/2019/08/05/using-tensorrt-models-with-tensorflow-serving-on-wml-ce/
Discussion forum:
https://devtalk.nvidia.com/default/board/304/tensorrt/
There is also a C++ API, which I have not used yet:
https://zhuanlan.zhihu.com/p/85365075
https://zhuanlan.zhihu.com/p/86827710
http://manaai.cn/aicodes_detail3.html?id=48