A Summary of Loss Functions



I spent almost a week going through a lot of loss functions, and the analysis is still not very thorough. I will keep updating this post as I gain experience using them, and additions and discussion are welcome.

Log Loss

L(Y, P(Y|X)) = -\log P(Y|X)

• The basic idea is to maximize the probability of the observed outcome, updating the parameters by maximum likelihood estimation (see the sketch below).
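A minimal NumPy sketch of this (the function name and the batch/label layout are my own, not from the post):

```python
import numpy as np

def log_loss(probs, labels):
    """Negative log-likelihood -log P(y|x), averaged over a batch.

    probs:  (N, C) predicted class probabilities
    labels: (N,)   integer class indices
    """
    eps = 1e-12  # guard against log(0)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked + eps).mean())

# A confident correct prediction gives a small loss:
print(log_loss(np.array([[0.9, 0.1]]), np.array([0])))  # ~0.105
```

Maximizing the likelihood of the data is exactly minimizing this quantity, which is why gradient descent on log loss implements maximum likelihood estimation.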

Logistic Loss

• Logistic loss can be viewed as the special case of log loss for binary classification: L(y, p) = -[y\log p + (1-y)\log(1-p)].

KL Divergence Loss

• Differs from the logistic (cross-entropy) loss only by a constant: KL(p\|q) = H(p, q) - H(p), and the target entropy H(p) does not depend on the model parameters (checked numerically in the sketch below).
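A quick NumPy check of that identity (the example distributions are my own):

```python
import numpy as np

p = np.array([0.7, 0.3])  # target distribution
q = np.array([0.6, 0.4])  # model distribution

kl            = float((p * np.log(p / q)).sum())
cross_entropy = float(-(p * np.log(q)).sum())
entropy_p     = float(-(p * np.log(p)).sum())

# KL(p||q) = H(p, q) - H(p); H(p) is independent of the model,
# so minimizing KL and minimizing cross-entropy are equivalent.
assert np.isclose(kl, cross_entropy - entropy_p)
```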

Exponential Loss

L(y, f(x)) = \exp[-y\,f(x)]

• Mainly used in AdaBoost.

Hinge Loss

L(y, f(x)) = \max(0,\ 1 - y\,f(x))

• Mainly used in SVMs (see the sketch below).
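A minimal sketch, assuming raw scores f(x) and labels in {-1, +1}:

```python
import numpy as np

def hinge_loss(scores, labels):
    """max(0, 1 - y * f(x)), averaged; labels must be in {-1, +1}."""
    return float(np.maximum(0.0, 1.0 - labels * scores).mean())

# A correct prediction outside the margin (score 2.0) costs nothing;
# one inside the margin (0.3) and one misclassified (-0.5) both pay.
print(hinge_loss(np.array([2.0, 0.3, -0.5]), np.array([1, 1, 1])))
```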

Focal Loss

L(p_t) = -\alpha_t (1 - p_t)^\gamma \log(p_t)

• Paper: https://arxiv.org/pdf/1708.02002.pdf
• Mitigates class imbalance: \alpha_t assigns a different weight to each class.
• Focuses the loss on misclassified examples: the (1 - p_t)^\gamma factor gives the most badly misclassified examples the most weight (see the sketch below).
• \gamma can be adjusted dynamically during training.
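A minimal sketch of the formula above, where p_t is the predicted probability of the true class (the defaults α_t = 0.25, γ = 2 follow the paper):

```python
import numpy as np

def focal_loss(p_t, alpha_t=0.25, gamma=2.0):
    """-alpha_t * (1 - p_t)^gamma * log(p_t), element-wise."""
    p_t = np.clip(p_t, 1e-12, 1.0)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

# The (1 - p_t)^gamma factor shrinks the easy example (p_t = 0.9)
# far more than the hard, misclassified one (p_t = 0.1):
print(focal_loss(np.array([0.9, 0.1])))  # ~[0.00026, 0.466]
```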

Large-Margin Softmax Loss

L_i = -\log\left(\frac{e^{\|w_{y_i}\|\,\|x_i\|\,\psi(\theta_{y_i})}}{e^{\|w_{y_i}\|\,\|x_i\|\,\psi(\theta_{y_i})} + \sum_{j\neq y_i} e^{\|w_j\|\,\|x_i\|\cos(\theta_j)}}\right)

• Here \psi(\theta_{y_i}) is a monotonically decreasing function controlled by a parameter m; the larger m is, the harder the model must work to find fine-grained features that separate the classes.
• One usable choice, implemented in the sketch below:
  \psi(\theta) = (-1)^k \cos(m\theta) - 2k, \quad \theta \in \left[\frac{k\pi}{m}, \frac{(k+1)\pi}{m}\right]
• Paper: https://arxiv.org/pdf/1612.02295.pdf
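A sketch of that piecewise ψ, which stitches the cos(mθ) pieces into a function that really is monotonically decreasing on [0, π]:

```python
import numpy as np

def psi(theta, m):
    """psi(theta) = (-1)^k * cos(m * theta) - 2k
    on theta in [k*pi/m, (k+1)*pi/m], k = 0 .. m-1."""
    k = np.minimum(np.floor(theta * m / np.pi).astype(int), m - 1)
    return (-1.0) ** k * np.cos(m * theta) - 2.0 * k

theta = np.linspace(0.0, np.pi, 5)
print(psi(theta, m=4))  # [ 1. -1. -3. -5. -7.] -- strictly decreasing
```

Plain cos(mθ) oscillates on [0, π]; the (-1)^k sign flip and the -2k offset glue the pieces together continuously so a larger angle always means a strictly smaller logit.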

Center Loss

L_c = \frac{1}{2}\sum_{i=1}^{m}\|x_i - c_{y_i}\|^2
\Delta c_j = \frac{\sum_{i=1}^m \delta(y_i = j)\,(c_j - x_i)}{1 + \sum_{i=1}^m \delta(y_i = j)}
c_j^{t+1} = c_j^t - \alpha\,\Delta c_j^t

• c_{y_i} is the learned center of class y_i; the centers are updated once per mini-batch with rate \alpha (see the sketch below).
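A minimal NumPy sketch of both the loss and the batch center update above (the array shapes and in-place update are my own choices):

```python
import numpy as np

def center_loss_step(x, y, centers, alpha=0.5):
    """L_c = 1/2 * sum_i ||x_i - c_{y_i}||^2, then update each center
    with the batch delta from the formulas above.

    x: (m, d) features, y: (m,) labels, centers: (K, d), updated in place.
    """
    loss = 0.5 * float(((x - centers[y]) ** 2).sum())
    for j in range(len(centers)):
        mask = y == j
        # The 1 + count denominator keeps delta zero-safe for classes
        # that do not appear in this batch.
        delta = (centers[j] - x[mask]).sum(axis=0) / (1.0 + mask.sum())
        centers[j] -= alpha * delta
    return loss

feats, labels = np.random.randn(4, 2), np.array([0, 0, 1, 1])
centers = np.zeros((2, 2))
print(center_loss_step(feats, labels, centers))
```

In the center-loss paper this term is added to the usual softmax loss with a small weight, rather than used on its own.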

Triplet Loss

L = \sum_i^N \left[\|f(x_i^a) - f(x_i^p)\|^2 - \|f(x_i^a) - f(x_i^n)\|^2 + \alpha\right]_+

• The subscript + means the bracketed term is set to 0 whenever it is negative.
• The model input is a triplet: [anchor, positive sample, negative sample].
• Goal: make the anchor-positive distance smaller than the anchor-negative distance by at least the margin \alpha (see the sketch below).
• Paper: https://arxiv.org/pdf/1503.03832.pdf
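A minimal sketch, assuming already-embedded anchors/positives/negatives of shape (N, d):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """sum_i [ ||a_i - p_i||^2 - ||a_i - n_i||^2 + margin ]_+"""
    d_pos = ((anchor - positive) ** 2).sum(axis=1)  # anchor-positive distances
    d_neg = ((anchor - negative) ** 2).sum(axis=1)  # anchor-negative distances
    return float(np.maximum(0.0, d_pos - d_neg + margin).sum())

a = np.array([[0.0, 0.0]])
p = np.array([[0.1, 0.0]])   # close to the anchor
n = np.array([[1.0, 0.0]])   # far from the anchor
print(triplet_loss(a, p, n))  # 0.0: the margin is already satisfied
```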

Soft Distillation Softmax Loss

L = -\log\left(\frac{e^{z_i/T}}{\sum_j e^{z_j/T}}\right)

• T is the distillation temperature: raising T softens the output distribution, so the student can learn from the teacher's relative probabilities across the wrong classes (see the sketch below).
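A quick sketch of the temperature-scaled softmax inside this loss (the example logits are my own):

```python
import numpy as np

def soft_targets(logits, T):
    """softmax(z / T): higher T spreads probability mass over the
    non-target classes, which is what distillation transfers."""
    z = logits / T
    z = z - z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = np.array([6.0, 2.0, 1.0])
print(soft_targets(logits, T=1.0))  # sharply peaked
print(soft_targets(logits, T=5.0))  # much softer targets for the student
```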

Soft-Margin Softmax Loss

L = -\log\left(\frac{e^{w_{y_i}^T x_i - m}}{e^{w_{y_i}^T x_i - m} + \sum_{j\neq y_i} e^{w_j^T x_i}}\right)

Angular Softmax Loss

L_i = -\log\left(\frac{e^{\|x_i\|\,\psi(\theta_{y_i})}}{e^{\|x_i\|\,\psi(\theta_{y_i})} + \sum_{j\neq y_i} e^{\|x_i\|\cos(\theta_j)}}\right)

L2-constrained Softmax Loss

L = -\log\left(\frac{e^{w_{y_i}^T f(x_i) + b_{y_i}}}{\sum_j e^{w_j^T f(x_i) + b_j}}\right)
\text{subject to}\ \|f(x_i)\|_2 = \alpha

Large Margin Cosine Loss

L_i = -\log\left(\frac{e^{s(\cos(\theta_{y_i}, i) - m)}}{e^{s(\cos(\theta_{y_i}, i) - m)} + \sum_{j\neq y_i} e^{s\cos(\theta_j, i)}}\right)

Additive Margin Softmax Loss

L_i = -\log\left(\frac{e^{s(w_{y_i}^T f(x_i) - m)}}{e^{s(w_{y_i}^T f(x_i) - m)} + \sum_{j\neq y_i} e^{s\,w_j^T f(x_i)}}\right)
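A minimal sketch of this loss; as in the AM-Softmax paper, both the features and the weight columns are L2-normalized so w^T f is a cosine similarity (the shapes and the defaults s = 30, m = 0.35 follow the paper):

```python
import numpy as np

def am_softmax_loss(feats, weights, labels, s=30.0, m=0.35):
    """Subtract the margin m from the target-class cosine logit,
    scale everything by s, then take the usual cross-entropy."""
    feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    weights = weights / np.linalg.norm(weights, axis=0, keepdims=True)
    cos = feats @ weights                       # (N, C) cosine logits
    rows = np.arange(len(labels))
    cos[rows, labels] -= m                      # additive margin on the target
    z = s * cos
    z -= z.max(axis=1, keepdims=True)           # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return float(-log_probs[rows, labels].mean())

feats = np.random.randn(4, 8)       # 4 samples, 8-dim embeddings
weights = np.random.randn(8, 3)     # 3 classes
print(am_softmax_loss(feats, weights, np.array([0, 1, 2, 0])))
```

Setting m = 0 recovers a plain normalized softmax, which makes the margin's effect easy to ablate.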

Angular Triplet Loss

L = \left[\|x_a - x_p\|^2 - 4\tan^2\alpha\,\|x_n - x_c\|^2\right]_+

COCO Loss

L^{revise} = -\sum_i \log\left(\frac{e^{c(f^i, c_{l_i})}}{\sum_m e^{c(f^i, c_m)}}\right)

Large-Margin Gaussian Mixture Loss

L_{cls} = -\log\frac{\mathcal{N}(x_i; \mu_{z_i}, \Sigma_{z_i})\,p(z_i)}{\sum_{k=1}^{K}\mathcal{N}(x_i; \mu_k, \Sigma_k)\,p(k)}
L_{lkd} = -\log\mathcal{N}(x_i; \mu_{z_i}, \Sigma_{z_i})
L_{GM} = L_{cls} + L_{lkd}

Contextual Loss

d_{ij} = 1 - \frac{(x_i - \mu_y)\cdot(y_j - \mu_y)}{\|x_i - \mu_y\|_2\,\|y_j - \mu_y\|_2}
\mu_y = \frac{1}{N}\sum_j y_j
w_{ij} = \exp\left(\frac{1 - \tilde{d}_{ij}}{h}\right), \quad \tilde{d}_{ij} = \frac{d_{ij}}{\min_k d_{ik} + \alpha}
CX_{ij} = w_{ij} \big/ \sum_k w_{ik}
L_{CX} = -\log CX(f(x), f(y))
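A sketch assembling these steps for two feature sets of shape (N, d); the final aggregation (average over j of the best CX_ij) is the one used in the contextual-loss paper, since the last line above leaves it implicit:

```python
import numpy as np

def contextual_loss(x, y, h=0.5, eps=1e-5):
    """x, y: (N, d) feature sets, e.g. f(x) and f(y) from a CNN layer."""
    mu = y.mean(axis=0)                                   # mu_y
    xc, yc = x - mu, y - mu
    xn = xc / (np.linalg.norm(xc, axis=1, keepdims=True) + eps)
    yn = yc / (np.linalg.norm(yc, axis=1, keepdims=True) + eps)
    d = 1.0 - xn @ yn.T                                   # cosine distances d_ij
    d_tilde = d / (d.min(axis=1, keepdims=True) + eps)    # row-normalized
    w = np.exp((1.0 - d_tilde) / h)                       # affinities w_ij
    cx = w / w.sum(axis=1, keepdims=True)                 # CX_ij
    return float(-np.log(cx.max(axis=0).mean() + eps))    # L_CX

x = np.random.randn(16, 32)
print(contextual_loss(x, x))                         # matching sets -> ~0
print(contextual_loss(x, np.random.randn(16, 32)))   # mismatched -> higher
```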


 
