1. Conclusion first

nn.BCEWithLogitsLoss is equivalent to nn.Sigmoid followed by nn.BCELoss, combined into a single layer.

2. Formula breakdown

* BCEWithLogitsLoss

$$
\text{Loss} = \{ l_1, \dots, l_N \}, \quad l_n = -\left[ y_n \cdot \log(\sigma(x_n)) + (1 - y_n) \cdot \log(1 - \sigma(x_n)) \right]
$$

$$
\sigma(x) = \frac{1}{1 + \exp(-x)}
$$

* BCELoss

$$
\text{Loss} = \{ l_1, \dots, l_N \}, \quad l_n = -\left[ y_n \cdot \log(x_n) + (1 - y_n) \cdot \log(1 - x_n) \right]
$$
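The two formulas differ only in where the sigmoid is applied: BCELoss expects x_n to already be a probability in (0, 1), while BCEWithLogitsLoss takes a raw logit and applies σ itself. A minimal pure-Python sketch of the per-element terms may make this concrete. The function names are illustrative, not PyTorch APIs, and the inputs are the second row of logits and labels from the experiment in section 3.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic function: sigma(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def bce_term(p: float, y: float) -> float:
    """BCELoss term l_n for one probability p in (0, 1) and one label y."""
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))

def bce_with_logits_term(x: float, y: float) -> float:
    """BCEWithLogitsLoss term l_n for one raw logit x and one label y."""
    s = sigmoid(x)
    return -(y * math.log(s) + (1.0 - y) * math.log(1.0 - s))

# Second row of the logits and labels from the experiment in section 3.
logits = [-1.5376, -0.2599, -0.9680]
labels = [0.0, 1.0, 1.0]

probs = [sigmoid(x) for x in logits]
from_probs = [bce_term(p, y) for p, y in zip(probs, labels)]
from_logits = [bce_with_logits_term(x, y) for x, y in zip(logits, labels)]

print(probs)        # probabilities after the sigmoid
print(from_probs)   # per-element losses via sigmoid + BCE
print(from_logits)  # per-element losses via the logits formula
```

Both lists of per-element losses agree, which is the equivalence stated in section 1 written out by hand.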

3. Experiment code
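
    import torch
    import torch.nn as nn

    # Randomly initialize the labels: a batch of 2 samples, 3 labels each
    label = torch.empty((2, 3)).random_(2)
    # sampled labels, second row: [0., 1., 1.]

    # Randomly initialize x, standing in for the model's raw predictions (logits)
    x = torch.randn((2, 3))
    # tensor([[-0.6117,  0.1446,  0.0415],
    #         [-1.5376, -0.2599, -0.9680]])

    sigmoid = nn.Sigmoid()
    x1 = sigmoid(x)  # squash into the (0, 1) interval
    # tensor([[0.3517, 0.5361, 0.5104],
    #         [0.1769, 0.4354, 0.2753]])

    # Sigmoid followed by BCELoss
    bceloss = nn.BCELoss()
    bceloss(x1, label)  # tensor(0.6812)

    # BCEWithLogitsLoss applied directly to the logits
    bce_with_logits_loss = nn.BCEWithLogitsLoss()
    bce_with_logits_loss(x, label)  # tensor(0.6812)
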
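The formulas in section 2 describe a set of per-element losses, while the calls above return a single scalar. That is because both classes default to reduction="mean". The sketch below, which assumes PyTorch is installed, uses reduction="none" to expose the per-element terms and checks that their mean matches the default output. The logits are the sample values shown above; the first label row is an assumed placeholder, since only the second row appears in the text.

```python
import torch
import torch.nn as nn

# Logits copied from the sample output in the text.
x = torch.tensor([[-0.6117, 0.1446, 0.0415],
                  [-1.5376, -0.2599, -0.9680]])
# First row is an assumed placeholder; second row is from the text.
label = torch.tensor([[0., 1., 0.],
                      [0., 1., 1.]])

# Per-element losses l_n, matching the formulas in section 2.
per_elem = nn.BCEWithLogitsLoss(reduction="none")(x, label)
via_sigmoid = nn.BCELoss(reduction="none")(torch.sigmoid(x), label)

# Default reduction ("mean") averages the per-element losses.
mean_loss = nn.BCEWithLogitsLoss()(x, label)

print(per_elem.shape)
print(torch.allclose(per_elem, via_sigmoid, atol=1e-6))
print(torch.allclose(per_elem.mean(), mean_loss))
```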
4. Numerical stability with log-sum-exp

    x = torch.tensor(1e+10)
    x1 = sigmoid(x)  # tensor(1.)
    label = torch.tensor(1.)
    bceloss(x1, label)  # tensor(0.)
    bce_with_logits_loss(x, label)  # tensor(0.)
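In the case above both losses are 0, because the saturated sigmoid happens to agree with the label. The difference shows up when a saturated prediction is wrong: σ(x) rounds to exactly 0 or 1, and the naive formula then has to evaluate log(0). Substituting σ into the section 2 formula and simplifying gives a standard rewrite that avoids this:

$$
l_n = \max(x_n, 0) - x_n \cdot y_n + \log\left(1 + \exp(-|x_n|)\right)
$$

The pure-Python sketch below compares the naive and rewritten forms. The function names are illustrative, and whether PyTorch uses exactly this expression internally is an assumption; its documentation only states that the fused layer uses the log-sum-exp trick.

```python
import math

def naive_bce_with_logits(x: float, y: float) -> float:
    """Sigmoid followed by BCE, evaluated literally; returns inf on failure."""
    try:
        p = 1.0 / (1.0 + math.exp(-x))
        return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    except (ValueError, OverflowError):
        # log(0) or exp overflow: the literal formula breaks down here.
        return math.inf

def stable_bce_with_logits(x: float, y: float) -> float:
    """Rewritten form: max(x, 0) - x*y + log(1 + exp(-|x|))."""
    return max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))

# For a moderate logit the two forms agree.
print(naive_bce_with_logits(0.5, 1.0), stable_bce_with_logits(0.5, 1.0))

# For a confident but wrong prediction the naive form fails,
# while the rewritten form stays finite.
print(naive_bce_with_logits(50.0, 0.0), stable_bce_with_logits(50.0, 0.0))
```

In PyTorch the naive path does not raise. nn.BCELoss clamps its log outputs to be no smaller than -100, so the loss saturates at 100 instead of reflecting the true magnitude of the logit.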
