Common Deep Learning Code Snippets

Logging

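import logging
import os
import time

import logzero
from logzero import logger

logzero.loglevel(logging.DEBUG)  # log everything from DEBUG upward
# Log file: <output_dir>/logs/bert_<unix timestamp>.log
logdir = os.path.join(args.output_dir, "logs")
os.makedirs(logdir, exist_ok=True)
logzero.logfile(os.path.join(logdir, f"bert_{int(time.time())}.log"))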
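
Where adding the logzero dependency is not an option, roughly the same setup can be sketched with the standard library alone. The function name `setup_logger`, the logger name, the `prefix` argument, and the format string are illustrative assumptions, not part of the snippet above.

```python
import logging
import os
import time


def setup_logger(output_dir, name="train", prefix="bert"):
    """Return (logger, logfile): a DEBUG-level logger writing to console and a timestamped file."""
    logdir = os.path.join(output_dir, "logs")
    os.makedirs(logdir, exist_ok=True)
    logfile = os.path.join(logdir, f"{prefix}_{int(time.time())}.log")

    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()  # avoid duplicate output if this function is called twice

    formatter = logging.Formatter(
        "[%(levelname)s %(asctime)s %(module)s:%(lineno)d] %(message)s"
    )
    handlers = (
        logging.StreamHandler(),                        # console
        logging.FileHandler(logfile, encoding="utf-8"),  # log file
    )
    for handler in handlers:
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger, logfile
```

A call such as `logger, path = setup_logger(args.output_dir)` followed by `logger.debug("start training")` would then write the message both to the console and to the file at `path`.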

Optimizer and Gradient Clipping

# Split parameters into two groups: biases and LayerNorm parameters get no weight decay
param_optimizer = list(self.model.named_parameters())
no_decay = ['bias', 'LayerNorm.bias', 'LayerNorm.weight']
optimizer_grouped_parameters = [
    {'params': [p for n, p in param_optimizer if not any(nd in n for nd in no_decay)], 'weight_decay': 0.01},
    {'params': [p for n, p in param_optimizer if any(nd in n for nd in no_decay)], 'weight_decay': 0.0},
]
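
The grouping above relies on substring matching against parameter names. A minimal sketch of that logic in plain Python, with strings standing in for tensors, may make the behavior easier to check. The helper name `split_by_weight_decay` and the parameter names in the example are hypothetical.

```python
def split_by_weight_decay(named_params,
                          no_decay=("bias", "LayerNorm.bias", "LayerNorm.weight"),
                          weight_decay=0.01):
    """Split (name, param) pairs into a decayed group and a non-decayed group."""
    decay, skip = [], []
    for name, param in named_params:
        # A parameter skips weight decay if any no_decay pattern occurs in its name
        if any(nd in name for nd in no_decay):
            skip.append(param)
        else:
            decay.append(param)
    return [
        {"params": decay, "weight_decay": weight_decay},
        {"params": skip, "weight_decay": 0.0},
    ]


# Hypothetical BERT-style parameter names; the strings stand in for tensors
named = [
    ("encoder.layer.0.attention.self.query.weight", "W_q"),
    ("encoder.layer.0.attention.self.query.bias", "b_q"),
    ("encoder.layer.0.attention.output.LayerNorm.weight", "ln_w"),
    ("encoder.layer.0.attention.output.LayerNorm.bias", "ln_b"),
]
groups = split_by_weight_decay(named)
print(groups[0]["params"])  # ['W_q']
print(groups[1]["params"])  # ['b_q', 'ln_w', 'ln_b']
```

Because the match is on substrings, `'bias'` already covers `'LayerNorm.bias'`, and a model whose normalization layers are named differently (for example `layer_norm`) would not be caught by the `LayerNorm.weight` pattern. With real tensors, the resulting list of dicts is the per-group format that PyTorch optimizers such as `torch.optim.AdamW` accept as their first argument.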

        
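optimizer.zero_grad()
loss, hidden = model(data, hidden, targets)
loss.backward()

# Clip the global gradient norm in place before the update.
# clip_grad_norm (without the trailing underscore) is deprecated.
torch.nn.utils.clip_grad_norm_(model.parameters(), args.clip)
optimizer.step()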
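
To illustrate what clipping by gradient norm does in the step above, here is a plain-Python sketch of the idea: compute one L2 norm over all gradients together and, if it exceeds the threshold, rescale every gradient by the same factor. This is a simplified illustration on lists of floats, not PyTorch's implementation; the function name `clip_by_global_norm` and the `eps` default are assumptions.

```python
import math


def clip_by_global_norm(grads, max_norm, eps=1e-6):
    """Rescale a list of gradient vectors so their joint L2 norm is at most max_norm.

    Returns (clipped_grads, total_norm), where total_norm is the norm before clipping.
    """
    total_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    # Scale down only when the norm exceeds max_norm; never scale up
    scale = min(1.0, max_norm / (total_norm + eps))
    clipped = [[g * scale for g in grad] for grad in grads]
    return clipped, total_norm


# Two "parameters" whose gradients have a joint norm of sqrt(3^2 + 4^2) = 5
grads = [[3.0, 0.0], [0.0, 4.0]]
clipped, norm = clip_by_global_norm(grads, max_norm=1.0)
new_norm = math.sqrt(sum(g * g for grad in clipped for g in grad))
print(norm)                       # 5.0
print(new_norm <= 1.0)            # True
```

All gradients are scaled by the same factor, so the direction of the overall update is preserved and only its magnitude is limited. This is why clipping is placed after `loss.backward()` and before `optimizer.step()`.
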
Original source: https://www.cnblogs.com/JohnRain/p/11615395.html