hunto 发布的最佳帖子

hunto

前言

移动端人脸关键点识别项目目前已经取得了不错的进展，大体的系统框架也已确定，从现在起我会陆续发表一系列的文章详细介绍移动端人脸关键点实现的技术，内容涵盖Object Detection、Face Landmark、移动端CNN模型、模型量化加速、神经网络边缘计算框架的对比与选择等，为目前网络上没有的对实时人脸关键点任务的综述，干货满满。

移动端实时人脸关键点识别需要解决的几大任务有：人脸定位 (Face Detection)、人脸关键点检测 (Face Landmark)、移动端网络计算，因此我们的文章也从这几个方面开始展开。

文章列表

Object Detection

移动端实时人脸关键点识别综述1.0 - Object Detection综述
移动端实时人脸关键点识别综述1.1 - Object Detection综述之Faster RCNN
移动端实时人脸关键点识别综述1.2 - Object Detection综述之YOLO
移动端实时人脸关键点识别综述1.3 - Object Detection综述之SSD
移动端实时人脸关键点识别综述1.4 - Object Detection综述之MTCNN

Face Landmark

移动端实时人脸关键点识别综述2.0 - Landmark综述
移动端实时人脸关键点识别综述2.1 - Landmark综述之数据处理
移动端实时人脸关键点识别综述2.2 - 移动端基础图像模型
移动端实时人脸关键点识别综述2.4 - Landmark综述之Loss
移动端实时人脸关键点识别综述2.3 - Landmark综述之网络的改进

Network Computation & Optimization

移动端实时人脸关键点识别综述3.0 - 神经网络边缘计算综述
移动端实时人脸关键点识别综述3.1 - 神经网络边缘计算框架的选择
移动端实时人脸关键点识别综述3.2 - 模型量化
移动端实时人脸关键点识别综述4 - Tips & Tricks

DianFace demo 1	DianFace demo 2

hunto

代码又不会打，模型又写不好，卷积又不会卷积，只有整理基础资源才能维持得了生活这样子

相关资源帖：留一个可靠点的停词表-miracle

网上的各种中文汉字表各不相同，各有千秋，这里分享一份我认为不错的汉字表，这种表见仁见智了，有的可能你觉得不太合适的只管自己剃了去用就好~ 如果没有找到其他合适的表，不妨试试这个：

一 乙 二 十 丁 厂 七 卜 人 入 八 九 几 儿 了 力 乃 刀 又 三 于 干 亏 士 工 土 才 寸 下 大 丈 与 万 上 小 口 巾 山 千 乞 川 亿 个 勺 久 凡 及 夕 丸 么 广 亡 门 义 之 尸 弓 己 已 子 卫 也 女 飞 刃 习 叉 马 乡 丰 王 井 开 夫 天 无 元 专 云 扎 艺 木 五 支 厅 不 太 犬 区 历 尤 友 匹 车 巨 牙 屯 比 互 切 瓦 止 少 日 中 冈 贝 内 水 见 午 牛 手 毛 气 升 长 仁 什 片 仆 化 仇 币 仍 仅 斤 爪 反 介 父 从 今 凶 分 乏 公 仓 月 氏 勿 欠 风 丹 匀 乌 凤 勾 文 六 方 火 为 斗 忆 订 计 户 认 心 尺 引 丑 巴 孔 队 办 以 允 予 劝 双 书 幻 玉 刊 示 末 未 击 打 巧 正 扑 扒 功 扔 去 甘 世 古 节 本 术 可 丙 左 厉 右 石 布 龙 平 灭 轧 东 卡 北 占 业 旧 帅 归 且 旦 目 叶 甲 申 叮 电 号 田 由 史 只 央 兄 叼 叫 另 叨 叹 四 生 失 禾 丘 付 仗 代 仙 们 仪 白 仔 他 斥 瓜 乎 丛 令 用 甩 印 乐 句 匆 册 犯 外 处 冬 鸟 务 包 饥 主 市 立 闪 兰 半 汁 汇 头 汉 宁 穴 它 讨 写 让 礼 训 必 议 讯 记 永 司 尼 民 出 辽 奶 奴 加 召 皮 边 发 孕 圣 对 台 矛 纠 母 幼 丝 式 刑 动 扛 寺 吉 扣 考 托 老 执 巩 圾 扩 扫 地 扬 场 耳 共 芒 亚 芝 朽 朴 机 权 过 臣 再 协 西 压 厌 在 有 百 存 而 页 匠 夸 夺 灰 达 列 死 成 夹 轨 邪 划 迈 毕 至 此 贞 师 尘 尖 劣 光 当 早 吐 吓 虫 曲 团 同 吊 吃 因 吸 吗 屿 帆 岁 回 岂 刚 则 肉 网 年 朱 先 丢 舌 竹 迁 乔 伟 传 乒 乓 休 伍 伏 优 伐 延 件 任 伤 价 份 华 仰 仿 伙 伪 自 血 向 似 后 行 舟 全 会 杀 合 兆 企 众 爷 伞 创 肌 朵 杂 危 旬 旨 负 各 名 多 争 色 壮 冲 冰 庄 庆 亦 刘 齐 交 次 衣 产 决 充 妄 闭 问 闯 羊 并 关 米 灯 州 汗 污 江 池 汤 忙 兴 宇 守 宅 字 安 讲 军 许 论 农 讽 设 访 寻 那 迅 尽 导 异 孙 阵 阳 收 阶 阴 防 奸 如 妇 好 她 妈 戏 羽 观 欢 买 红 纤 级 约 纪 驰 巡 寿 弄 麦 形 进 戒 吞 远 违 运 扶 抚 坛 技 坏 扰 拒 找 批 扯 址 走 抄 坝 贡 攻 赤 折 抓 扮 抢 孝 均 抛 投 坟 抗 坑 坊 抖 护 壳 志 扭 块 声 把 报 却 劫 芽 花 芹 芬 苍 芳 严 芦 劳 克 苏 杆 杠 杜 材 村 杏 极 李 杨 求 更 束 豆 两 丽 医 辰 励 否 还 歼 来 连 步 坚 旱 盯 呈 时 吴 助 县 里 呆 园 旷 围 呀 吨 足 邮 男 困 吵 串 员 听 吩 吹 呜 吧 吼 别 岗 帐 财 针 钉 告 我 乱 利 秃 秀 私 每 兵 估 体 何 但 伸 作 伯 伶 佣 低 你 住 位 伴 身 皂 佛 近 彻 役 返 余 希 坐 谷 妥 含 邻 岔 肝 肚 肠 龟 免 狂 犹 角 删 条 卵 岛 迎 饭 饮 系 言 冻 状 亩 况 床 库 疗 应 冷 这 序 辛 弃 冶 忘 闲 间 闷 判 灶 灿 弟 汪 沙 汽 沃 泛 沟 没 沈 沉 怀 忧 快 完 宋 宏 牢 究 穷 灾 良 证 启 评 补 初 社 识 诉 诊 词 译 君 灵 即 层 尿 尾 迟 局 改 张 忌 际 陆 阿 陈 阻 附 妙 妖 妨 努 忍 劲 鸡 驱 纯 纱 纳 纲 驳 纵 纷 纸 纹 纺 驴 纽 奉 玩 环 武 青 责 现 表 规 抹 拢 拔 拣 担 坦 押 抽 拐 拖 拍 者 顶 拆 拥 抵 拘 势 抱 垃 拉 拦 拌 幸 招 坡 披 拨 择 抬 其 取 苦 若 茂 苹 苗 英 范 直 茄 茎 茅 林 枝 杯 柜 析 板 松 枪 构 杰 述 枕 丧 或 画 卧 事 刺 枣 雨 卖 矿 码 厕 奔 奇 奋 态 欧 垄 妻 轰 顷 转 斩 轮 软 到 非 叔 肯 齿 些 虎 虏 肾 贤 尚 旺 具 果 味 昆 国 昌 畅 明 易 昂 典 固 忠 咐 呼 鸣 咏 呢 岸 岩 帖 罗 帜 岭 凯 败 贩 购 图 钓 制 知 垂 牧 物 乖 刮 秆 和 季 委 佳 侍 供 使 例 版 侄 侦 侧 凭 侨 佩 货 依 的 迫 质 欣 征 往 爬 彼 径 所 舍 金 命 斧 爸 采 受 乳 贪 念 贫 肤 肺 肢 肿 胀 朋 股 肥 服 胁 周 昏 鱼 兔 狐 忽 狗 备 饰 饱 饲 变 京 享 店 夜 庙 府 底 剂 郊 废 净 盲 放 刻 育 闸 闹 郑 券 卷 单 炒 炊 炕 炎 炉 沫 浅 法 泄 河 沾 泪 油 泊 沿 泡 注 泻 泳 泥 沸 波 泼 泽 治 怖 性 怕 怜 怪 学 宝 宗 定 宜 审 宙 官 空 帘 实 试 郎 诗 肩 房 诚 衬 衫 视 话 诞 询 该 详 建 肃 录 隶 居 届 刷 屈 弦 承 孟 孤 陕 降 限 妹 姑 姐 姓 始 驾 参 艰 线 练 组 细 驶 织 终 驻 驼 绍 经 贯 奏 春 帮 珍 玻 毒 型 挂 封 持 项 垮 挎 城 挠 政 赴 赵 挡 挺 括 拴 拾 挑 指 垫 挣 挤 拼 挖 按 挥 挪 某 甚 革 荐 巷 带 草 茧 茶 荒 茫 荡 荣 故 胡 南 药 标 枯 柄 栋 相 查 柏 柳 柱 柿 栏 树 要 咸 威 歪 研 砖 厘 厚 砌 砍 面 耐 耍 牵 残 殃 轻 鸦 皆 背 战 点 临 览 竖 省 削 尝 是 盼 眨 哄 显 哑 冒 映 星 昨 畏 趴 胃 贵 界 虹 虾 蚁 思 蚂 虽 品 咽 骂 哗 咱 响 哈 咬 咳 哪 炭 峡 罚 贱 贴 骨 钞 钟 钢 钥 钩 卸 缸 拜 看 矩 怎 牲 选 适 秒 香 种 秋 科 重 复 竿 段 便 俩 贷 顺 修 保 促 侮 俭 俗 俘 信 皇 泉 鬼 侵 追 俊 盾 待 律 很 须 叙 剑 逃 食 盆 胆 胜 胞 胖 脉 勉 狭 狮 独 狡 狱 狠 贸 怨 急 饶 蚀 饺 饼 弯 将 奖 哀 亭 亮 度 迹 庭 疮 疯 疫 疤 姿 亲 音 帝 施 闻 阀 阁 差 养 美 姜 叛 送 类 迷 前 首 逆 总 炼 炸 炮 烂 剃 洁 洪 洒 浇 浊 洞 测 洗 活 派 洽 染 济 洋 洲 浑 浓 津 恒 恢 恰 恼 恨 举 觉 宣 室 宫 宪 突 穿 窃 客 冠 语 扁 袄 祖 神 祝 误 诱 说 诵 垦 退 既 屋 昼 费 陡 眉 孩 除 险 院 娃 姥 姨 姻 娇 怒 架 贺 盈 勇 怠 柔 垒 绑 绒 结 绕 骄 绘 给 络 骆 绝 绞 统 耕 耗 艳 泰 珠 班 素 蚕 顽 盏 匪 捞 栽 捕 振 载 赶 起 盐 捎 捏 埋 捉 捆 捐 损 都 哲 逝 捡 换 挽 热 恐 壶 挨 耻 耽 恭 莲 莫 荷 获 晋 恶 真 框 桂 档 桐 株 桥 桃 格 校 核 样 根 索 哥 速 逗 栗 配 翅 辱 唇 夏 础 破 原 套 逐 烈 殊 顾 轿 较 顿 毙 致 柴 桌 虑 监 紧 党 晒 眠 晓 鸭 晃 晌 晕 蚊 哨 哭 恩 唤 啊 唉 罢 峰 圆 贼 贿 钱 钳 钻 铁 铃 铅 缺 氧 特 牺 造 乘 敌 秤 租 积 秧 秩 称 秘 透 笔 笑 笋 债 借 值 倚 倾 倒 倘 俱 倡 候 俯 倍 倦 健 臭 射 躬 息 徒 徐 舰 舱 般 航 途 拿 爹 爱 颂 翁 脆 脂 胸 胳 脏 胶 脑 狸 狼 逢 留 皱 饿 恋 桨 浆 衰 高 席 准 座 脊 症 病 疾 疼 疲 效 离 唐 资 凉 站 剖 竞 部 旁 旅 畜 阅 羞 瓶 拳 粉 料 益 兼 烤 烘 烦 烧 烛 烟 递 涛 浙 涝 酒 涉 消 浩 海 涂 浴 浮 流 润 浪 浸 涨 烫 涌 悟 悄 悔 悦 害 宽 家 宵 宴 宾 窄 容 宰 案 请 朗 诸 读 扇 袜 袖 袍 被 祥 课 谁 调 冤 谅 谈 谊 剥 恳 展 剧 屑 弱 陵 陶 陷 陪 娱 娘 通 能 难 预 桑 绢 绣 验 继 球 理 捧 堵 描 域 掩 捷 排 掉 堆 推 掀 授 教 掏 掠 培 接 控 探 据 掘 职 基 著 勒 黄 萌 萝 菌 菜 萄 菊 萍 菠 营 械 梦 梢 梅 检 梳 梯 桶 救 副 票 戚 爽 聋 袭 盛 雪 辅 辆 虚 雀 堂 常 匙 晨 睁 眯 眼 悬 野 啦 晚 啄 距 跃 略 蛇 累 唱 患 唯 崖 崭 崇 圈 铜 铲 银 甜 梨 犁 移 笨 笼 笛 符 第 敏 做 袋 悠 偿 偶 偷 您 售 停 偏 假 得 衔 盘 船 斜 盒 鸽 悉 欲 彩 领 脚 脖 脸 脱 象 够 猜 猪 猎 猫 猛 馅 馆 凑 减 毫 麻 痒 痕 廊 康 庸 鹿 盗 章 竟 商 族 旋 望 率 着 盖 粘 粗 粒 断 剪 兽 清 添 淋 淹 渠 渐 混 渔 淘 液 淡 深 婆 梁 渗 情 惜 惭 悼 惧 惕 惊 惨 惯 寇 寄 宿 窑 密 谋 谎 祸 谜 逮 敢 屠 弹 随 蛋 隆 隐 婚 婶 颈 绩 绪 续 骑 绳 维 绵 绸 绿 琴 斑 替 款 堪 搭 塔 越 趁 趋 超 提 堤 博 揭 喜 插 揪 搜 煮 援 裁 搁 搂 搅 握 揉 斯 期 欺 联 散 惹 葬 葛 董 葡 敬 葱 落 朝 辜 葵 棒 棋 植 森 椅 椒 棵 棍 棉 棚 棕 惠 惑 逼 厨 厦 硬 确 雁 殖 裂 雄 暂 雅 辈 悲 紫 辉 敞 赏 掌 晴 暑 最 量 喷 晶 喇 遇 喊 景 践 跌 跑 遗 蛙 蛛 蜓 喝 喂 喘 喉 幅 帽 赌 赔 黑 铸 铺 链 销 锁 锄 锅 锈 锋 锐 短 智 毯 鹅 剩 稍 程 稀 税 筐 等 筑 策 筛 筒 答 筋 筝 傲 傅 牌 堡 集 焦 傍 储 奥 街 惩 御 循 艇 舒 番 释 禽 腊 脾 腔 鲁 猾 猴 然 馋 装 蛮 就 痛 童 阔 善 羡 普 粪 尊 道 曾 焰 港 湖 渣 湿 温 渴 滑 湾 渡 游 滋 溉 愤 慌 惰 愧 愉 慨 割 寒 富 窜 窝 窗 遍 裕 裤 裙 谢 谣 谦 属 屡 强 粥 疏 隔 隙 絮 嫂 登 缎 缓 编 骗 缘 瑞 魂 肆 摄 摸 填 搏 塌 鼓 摆 携 搬 摇 搞 塘 摊 蒜 勤 鹊 蓝 墓 幕 蓬 蓄 蒙 蒸 献 禁 楚 想 槐 榆 楼 概 赖 酬 感 碍 碑 碎 碰 碗 碌 雷 零 雾 雹 输 督 龄 鉴 睛 睡 睬 鄙 愚 暖 盟 歇 暗 照 跨 跳 跪 路 跟 遣 蛾 蜂 嗓 置 罪 罩 错 锡 锣 锤 锦 键 锯 矮 辞 稠 愁 筹 签 简 毁 舅 鼠 催 傻 像 躲 微 愈 遥 腰 腥 腹 腾 腿 触 解 酱 痰 廉 新 韵 意 粮 数 煎 塑 慈 煤 煌 满 漠 源 滤 滥 滔 溪 溜 滚 滨 粱 滩 慎 誉 塞 谨 福 群 殿 辟 障 嫌 嫁 叠 缝 缠 静 碧 璃 墙 撇 嘉 摧 截 誓 境 摘 摔 聚 蔽 慕 暮 蔑 模 榴 榜 榨 歌 遭 酷 酿 酸 磁 愿 需 弊 裳 颗 嗽 蜻 蜡 蝇 蜘 赚 锹 锻 舞 稳 算 箩 管 僚 鼻 魄 貌 膜 膊 膀 鲜 疑 馒 裹 敲 豪 膏 遮 腐 瘦 辣 竭 端 旗 精 歉 熄 熔 漆 漂 漫 滴 演 漏 慢 寨 赛 察 蜜 谱 嫩 翠 熊 凳 骡 缩 慧 撕 撒 趣 趟 撑 播 撞 撤 增 聪 鞋 蕉 蔬 横 槽 樱 橡 飘 醋 醉 震 霉 瞒 题 暴 瞎 影 踢 踏 踩 踪 蝶 蝴 嘱 墨 镇 靠 稻 黎 稿 稼 箱 箭 篇 僵 躺 僻 德 艘 膝 膛 熟 摩 颜 毅 糊 遵 潜 潮 懂 额 慰 劈 操 燕 薯 薪 薄 颠 橘 整 融 醒 餐 嘴 蹄 器 赠 默 镜 赞 篮 邀 衡 膨 雕 磨 凝 辨 辩 糖 糕 燃 澡 激 懒 壁 避 缴 戴 擦 鞠 藏 霜 霞 瞧 蹈 螺 穗 繁 辫 赢 糟 糠 燥 臂 翼 骤 鞭 覆 蹦 镰 翻 鹰 警 攀 蹲 颤 瓣 爆 疆 壤 耀 躁 嚼 嚷 籍 魔 灌 蠢 霸 露 囊 罐 匕 刁 丐 歹 戈 夭 仑 讥 冗 邓 艾 夯 凸 卢 叭 叽 皿 凹 囚 矢 乍 尔 冯 玄 邦 迂 邢 芋 芍 吏 夷 吁 吕 吆 屹 廷 迄 臼 仲 伦 伊 肋 旭 匈 凫 妆 亥 汛 讳 讶 讹 讼 诀 弛 阱 驮 驯 纫 玖 玛 韧 抠 扼 汞 扳 抡 坎 坞 抑 拟 抒 芙 芜 苇 芥 芯 芭 杖 杉 巫 杈 甫 匣 轩 卤 肖 吱 吠 呕 呐 吟 呛 吻 吭 邑 囤 吮 岖 牡 佑 佃 伺 囱 肛 肘 甸 狈 鸠 彤 灸 刨 庇 吝 庐 闰 兑 灼 沐 沛 汰 沥 沦 汹 沧 沪 忱 诅 诈 罕 屁 坠 妓 姊 妒 纬 玫 卦 坷 坯 拓 坪 坤 拄 拧 拂 拙 拇 拗 茉 昔 苛 苫 苟 苞 茁 苔 枉 枢 枚 枫 杭 郁 矾 奈 奄 殴 歧 卓 昙 哎 咕 呵 咙 呻 咒 咆 咖 帕 账 贬 贮 氛 秉 岳 侠 侥 侣 侈 卑 刽 刹 肴 觅 忿 瓮 肮 肪 狞 庞 疟 疙 疚 卒 氓 炬 沽 沮 泣 泞 泌 沼 怔 怯 宠 宛 衩 祈 诡 帚 屉 弧 弥 陋 陌 函 姆 虱 叁 绅 驹 绊 绎 契 贰 玷 玲 珊 拭 拷 拱 挟 垢 垛 拯 荆 茸 茬 荚 茵 茴 荞 荠 荤 荧 荔 栈 柑 栅 柠 枷 勃 柬 砂 泵 砚 鸥 轴 韭 虐 昧 盹 咧 昵 昭 盅 勋 哆 咪 哟 幽 钙 钝 钠 钦 钧 钮 毡 氢 秕 俏 俄 俐 侯 徊 衍 胚 胧 胎 狰 饵 峦 奕 咨 飒 闺 闽 籽 娄 烁 炫 洼 柒 涎 洛 恃 恍 恬 恤 宦 诫 诬 祠 诲 屏 屎 逊 陨 姚 娜 蚤 骇 耘 耙 秦 匿 埂 捂 捍 袁 捌 挫 挚 捣 捅 埃 耿 聂 荸 莽 莱 莉 莹 莺 梆 栖 桦 栓 桅 桩 贾 酌 砸 砰 砾 殉 逞 哮 唠 哺 剔 蚌 蚜 畔 蚣 蚪 蚓 哩 圃 鸯 唁 哼 唆 峭 唧 峻 赂 赃 钾 铆 氨 秫 笆 俺 赁 倔 殷 耸 舀 豺 豹 颁 胯 胰 脐 脓 逛 卿 鸵 鸳 馁 凌 凄 衷 郭 斋 疹 紊 瓷 羔 烙 浦 涡 涣 涤 涧 涕 涩 悍 悯 窍 诺 诽 袒 谆 祟 恕 娩 骏 琐 麸 琉 琅 措 捺 捶 赦 埠 捻 掐 掂 掖 掷 掸 掺 勘 聊 娶 菱 菲 萎 菩 萤 乾 萧 萨 菇 彬 梗 梧 梭 曹 酝 酗 厢 硅 硕 奢 盔 匾 颅 彪 眶 晤 曼 晦 冕 啡 畦 趾 啃 蛆 蚯 蛉 蛀 唬 唾 啤 啥 啸 崎 逻 崔 崩 婴 赊 铐 铛 铝 铡 铣 铭 矫 秸 秽 笙 笤 偎 傀 躯 兜 衅 徘 徙 舶 舷 舵 敛 翎 脯 逸 凰 猖 祭 烹 庶 庵 痊 阎 阐 眷 焊 焕 鸿 涯 淑 淌 淮 淆 渊 淫 淳 淤 淀 涮 涵 惦 悴 惋 寂 窒 谍 谐 裆 袱 祷 谒 谓 谚 尉 堕 隅 婉 颇 绰 绷 综 绽 缀 巢 琳 琢 琼 揍 堰 揩 揽 揖 彭 揣 搀 搓 壹 搔 葫 募 蒋 蒂 韩 棱 椰 焚 椎 棺 榔 椭 粟 棘 酣 酥 硝 硫 颊 雳 翘 凿 棠 晰 鼎 喳 遏 晾 畴 跋 跛 蛔 蜒 蛤 鹃 喻 啼 喧 嵌 赋 赎 赐 锉 锌 甥 掰 氮 氯 黍 筏 牍 粤 逾 腌 腋 腕 猩 猬 惫 敦 痘 痢 痪 竣 翔 奠 遂 焙 滞 湘 渤 渺 溃 溅 湃 愕 惶 寓 窖 窘 雇 谤 犀 隘 媒 媚 婿 缅 缆 缔 缕 骚 瑟 鹉 瑰 搪 聘 斟 靴 靶 蓖 蒿 蒲 蓉 楔 椿 楷 榄 楞 楣 酪 碘 硼 碉 辐 辑 频 睹 睦 瞄 嗜 嗦 暇 畸 跷 跺 蜈 蜗 蜕 蛹 嗅 嗡 嗤 署 蜀 幌 锚 锥 锨 锭 锰 稚 颓 筷 魁 衙 腻 腮 腺 鹏 肄 猿 颖 煞 雏 馍 馏 禀 痹 廓 痴 靖 誊 漓 溢 溯 溶 滓 溺 寞 窥 窟 寝 褂 裸 谬 媳 嫉 缚 缤 剿 赘 熬 赫 蔫 摹 蔓 蔗 蔼 熙 蔚 兢 榛 榕 酵 碟 碴 碱 碳 辕 辖 雌 墅 嘁 踊 蝉 嘀 幔 镀 舔 熏 箍 箕 箫 舆 僧 孵 瘩 瘟 彰 粹 漱 漩 漾 慷 寡 寥 谭 褐 褪 隧 嫡 缨 撵 撩 撮 撬 擒 墩 撰 鞍 蕊 蕴 樊 樟 橄 敷 豌 醇 磕 磅 碾 憋 嘶 嘲 嘹 蝠 蝎 蝌 蝗 蝙 嘿 幢 镊 镐 稽 篓 膘 鲤 鲫 褒 瘪 瘤 瘫 凛 澎 潭 潦 澳 潘 澈 澜 澄 憔 懊 憎 翩 褥 谴 鹤 憨 履 嬉 豫 缭 撼 擂 擅 蕾 薛 薇 擎 翰 噩 橱 橙 瓢 蟥 霍 霎 辙 冀 踱 蹂 蟆 螃 螟 噪 鹦 黔 穆 篡 篷 篙 篱 儒 膳 鲸 瘾 瘸 糙 燎 濒 憾 懈 窿 缰 壕 藐 檬 檐 檩 檀 礁 磷 了 瞬 瞳 瞪 曙 蹋 蟋 蟀 嚎 赡 镣 魏 簇 儡 徽 爵 朦 臊 鳄 糜 癌 懦 豁 臀 藕 藤 瞻 嚣 鳍 癞 瀑 襟 璧 戳 攒 孽 蘑 藻 鳖 蹭 蹬 簸 簿 蟹 靡 癣 羹 鬓 攘 蠕 巍 鳞 糯 譬 霹 躏 髓 蘸 镶 瓤 矗

hunto

嵌入式又不会写，板子又焊不好，只有搬运整理基础工具才能维持得了生活这样子

这里存放了去年自己适配的STM32F1和F7系列的FreeRTOS系统纯净Keil工程。可以直接clone下来使用。

Github:Hunto/FreeRTOS

目前已退坑嵌入式，只有要用的时候调一下以前做的东西了

hunto

代码又不会打，模型又写不好，卷积又不会卷积，只有搬运整理基础教程才能维持得了生活这样子

Numpy简介

Numpy是Python中用于科学计算的核心库。它提供了高性能的多维数组对象，以及相关工具。

1. 数组Arrays

一个numpy数组是一个由不同数值组成的网格。网格中的数据都是同一种数据类型，可以通过非负整型数的元组来访问。维度的数量被称为数组的阶，数组的大小是一个由整型数构成的元组，可以描述数组不同维度上的大小。

我们可以从列表创建数组，然后利用方括号访问其中的元素：

import numpy as np

a = np.array([1, 2, 3])  # Create a rank 1 array
print type(a)            # Prints "<type 'numpy.ndarray'>"
print a.shape            # Prints "(3,)"
print a[0], a[1], a[2]   # Prints "1 2 3"
a[0] = 5                 # Change an element of the array
print a                  # Prints "[5, 2, 3]"

b = np.array([[1,2,3],[4,5,6]])   # Create a rank 2 array
print b                           # 显示一下矩阵b
print b.shape                     # Prints "(2, 3)"
print b[0, 0], b[0, 1], b[1, 0]   # Prints "1 2 4"

Numpy还提供了很多其他创建数组的方法：

       p { color: red }
import numpy as np

a = np.zeros((2,2))  # Create an array of all zeros
print a              # Prints "[[ 0.  0.]
                     #          [ 0.  0.]]"

b = np.ones((1,2))   # Create an array of all ones
print b              # Prints "[[ 1.  1.]]"

c = np.full((2,2), 7) # Create a constant array
print c               # Prints "[[ 7.  7.]
                      #          [ 7.  7.]]"

d = np.eye(2)        # Create a 2x2 identity matrix
print d              # Prints "[[ 1.  0.]
                     #          [ 0.  1.]]"

e = np.random.random((2,2)) # Create an array filled with random values
print e                     # Might print "[[ 0.91940167  0.08143941]
                            #               [ 0.68744134  0.87236687]]"

2. 访问数组

Numpy提供了多种访问数组的方法。

2.1 切片

和Python列表类似，numpy数组可以使用切片语法。因为数组可以是多维的，所以你必须为每个维度指定好切片。

       p { color: red }
import numpy as np

# Create the following rank 2 array with shape (3, 4)
# [[ 1  2  3  4]
#  [ 5  6  7  8]
#  [ 9 10 11 12]]
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])

# Use slicing to pull out the subarray consisting of the first 2 rows
# and columns 1 and 2; b is the following array of shape (2, 2):
# [[2 3]
#  [6 7]]
b = a[:2, 1:3]

# A slice of an array is a view into the same data, so modifying it
# will modify the original array.
print a[0, 1]   # Prints "2"
b[0, 0] = 77    # b[0, 0] is the same piece of data as a[0, 1]
print a[0, 1]   # Prints "77"

你可以同时使用整型和切片语法来访问数组。但是，这样做会产生一个比原数组低阶的新数组。需要注意的是，这里和MATLAB中的情况是不同的：

       p { color: red }
import numpy as np

# Create the following rank 2 array with shape (3, 4)
# [[ 1  2  3  4]
#  [ 5  6  7  8]
#  [ 9 10 11 12]]
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])

# Two ways of accessing the data in the middle row of the array.
# Mixing integer indexing with slices yields an array of lower rank,
# while using only slices yields an array of the same rank as the
# original array:
row_r1 = a[1, :]    # Rank 1 view of the second row of a  
row_r2 = a[1:2, :]  # Rank 2 view of the second row of a
print row_r1, row_r1.shape  # Prints "[5 6 7 8] (4,)"
print row_r2, row_r2.shape  # Prints "[[5 6 7 8]] (1, 4)"

# We can make the same distinction when accessing columns of an array:
col_r1 = a[:, 1]
col_r2 = a[:, 1:2]
print col_r1, col_r1.shape  # Prints "[ 2  6 10] (3,)"
print col_r2, col_r2.shape  # Prints "[[ 2]
                            #          [ 6]
                            #          [10]] (3, 1)"

2.2 整型数组访问

当我们使用切片语法访问数组时，得到的总是原数组的一个子集。整型数组访问允许我们利用其它数组的数据构建一个新的数组：

       p { color: red }
import numpy as np

a = np.array([[1,2], [3, 4], [5, 6]])

# An example of integer array indexing.
# The returned array will have shape (3,) and
print a[[0, 1, 2], [0, 1, 0]]  # Prints "[1 4 5]"

# The above example of integer array indexing is equivalent to this:
print np.array([a[0, 0], a[1, 1], a[2, 0]])  # Prints "[1 4 5]"

# When using integer array indexing, you can reuse the same
# element from the source array:
print a[[0, 0], [1, 1]]  # Prints "[2 2]"

# Equivalent to the previous integer array indexing example
print np.array([a[0, 1], a[0, 1]])  # Prints "[2 2]"

整型数组访问语法还有个有用的技巧，可以用来选择或者更改矩阵中每行中的一个元素：

       p { color: red }
import numpy as np

# Create a new array from which we will select elements
a = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])

print a  # prints "array([[ 1,  2,  3],
         #                [ 4,  5,  6],
         #                [ 7,  8,  9],
         #                [10, 11, 12]])"

# Create an array of indices
b = np.array([0, 2, 0, 1])

# Select one element from each row of a using the indices in b
print a[np.arange(4), b]  # Prints "[ 1  6  7 11]"

# Mutate one element from each row of a using the indices in b
a[np.arange(4), b] += 10

print a  # prints "array([[11,  2,  3],
         #                [ 4,  5, 16],
         #                [17,  8,  9],
         #                [10, 21, 12]])

2.3 布尔型数组访问

布尔型数组访问可以让你选择数组中任意元素。通常，这种访问方式用于选取数组中满足某些条件的元素，举例如下：

       p { color: red }
import numpy as np

a = np.array([[1,2], [3, 4], [5, 6]])

bool_idx = (a > 2)  # Find the elements of a that are bigger than 2;
                    # this returns a numpy array of Booleans of the same
                    # shape as a, where each slot of bool_idx tells
                    # whether that element of a is > 2.

print bool_idx      # Prints "[[False False]
                    #          [ True  True]
                    #          [ True  True]]"

# We use boolean array indexing to construct a rank 1 array
# consisting of the elements of a corresponding to the True values
# of bool_idx
print a[bool_idx]  # Prints "[3 4 5 6]"

# We can do all of the above in a single concise statement:
print a[a > 2]     # Prints "[3 4 5 6]"

为了教程的简洁，有很多数组访问的细节我们没有详细说明，可以查看文档。

3. 数据类型

每个Numpy数组都是数据类型相同的元素组成的网格。Numpy提供了很多的数据类型用于创建数组。当你创建数组的时候，Numpy会尝试猜测数组的数据类型，你也可以通过参数直接指定数据类型，例子如下：

       p { color: red }
import numpy as np

x = np.array([1, 2])  # Let numpy choose the datatype
print x.dtype         # Prints "int64"

x = np.array([1.0, 2.0])  # Let numpy choose the datatype
print x.dtype             # Prints "float64"

x = np.array([1, 2], dtype=np.int64)  # Force a particular datatype
print x.dtype                         # Prints "int64"

更多细节查看文档。

4. 数组计算

基本数学计算函数会对数组中元素逐个进行计算，既可以利用操作符重载，也可以使用函数方式：

       p { color: red }
import numpy as np

x = np.array([[1,2],[3,4]], dtype=np.float64)
y = np.array([[5,6],[7,8]], dtype=np.float64)

# Elementwise sum; both produce the array
# [[ 6.0  8.0]
#  [10.0 12.0]]
print x + y
print np.add(x, y)

# Elementwise difference; both produce the array
# [[-4.0 -4.0]
#  [-4.0 -4.0]]
print x - y
print np.subtract(x, y)

# Elementwise product; both produce the array
# [[ 5.0 12.0]
#  [21.0 32.0]]
print x * y
print np.multiply(x, y)

# Elementwise division; both produce the array
# [[ 0.2         0.33333333]
#  [ 0.42857143  0.5       ]]
print x / y
print np.divide(x, y)

# Elementwise square root; produces the array
# [[ 1.          1.41421356]
#  [ 1.73205081  2.        ]]
print np.sqrt(x)

和MATLAB不同，*是元素逐个相乘，而不是矩阵乘法。在Numpy中使用dot来进行矩阵乘法：

       p { color: red }
import numpy as np

x = np.array([[1,2],[3,4]])
y = np.array([[5,6],[7,8]])

v = np.array([9,10])
w = np.array([11, 12])

# Inner product of vectors; both produce 219
print v.dot(w)
print np.dot(v, w)

# Matrix / vector product; both produce the rank 1 array [29 67]
print x.dot(v)
print np.dot(x, v)

# Matrix / matrix product; both produce the rank 2 array
# [[19 22]
#  [43 50]]
print x.dot(y)
print np.dot(x, y)

Numpy提供了很多计算数组的函数，其中最常用的一个是sum：

       p { color: red }
import numpy as np

x = np.array([[1,2],[3,4]])

print np.sum(x)  # Compute sum of all elements; prints "10"
print np.sum(x, axis=0)  # Compute sum of each column; prints "[4 6]"
print np.sum(x, axis=1)  # Compute sum of each row; prints "[3 7]"

想要了解更多函数，可以查看文档。

除了计算，我们还常常改变数组或者操作其中的元素。其中将矩阵转置是常用的一个，在Numpy中，使用T来转置矩阵：

       p { color: red }
import numpy as np

x = np.array([[1,2], [3,4]])
print x    # Prints "[[1 2]
           #          [3 4]]"
print x.T  # Prints "[[1 3]
           #          [2 4]]"

# Note that taking the transpose of a rank 1 array does nothing:
v = np.array([1,2,3])
print v    # Prints "[1 2 3]"
print v.T  # Prints "[1 2 3]"

Numpy还提供了更多操作数组的方法，请查看文档。

hunto

前排顶fy大大

hunto

可以在运行py的时候指定显卡id
如:

CUDA_VISIBLE_DEVICES=0 python train.py

也可以指定多个GPU:

CUDA_VISIBLE_DEVICES=0,1,2 python train.py

当然，直接在shell里export也是可以的

export CUDA_VISIBLE_DEVICES=0,1,2

hunto

前言

近期在项目组工作中，使用TextCNN对文本分类取得了不错的准确率，为了更清晰地了解TextCNN的结构，特翻译TensorFlow实现的TextCNN一文。

一、什么是TextCNN

TextCNN 是利用卷积神经网络对文本进行分类的算法，由 Yoon Kim 在《Convolutional Neural Networks for Sentence Classification》中提出.

1. Model

0_1522310664141_1dee9b0b-8773-4ce4-ab86-8d9462ff9949-image.png
<center>图1 TextCNN结构图</center>

第一层将单词嵌入到低维矢量中。下一层使用多个过滤器大小对嵌入的单词向量执行卷积。例如，一次滑动3，4或5个单词。接下来，将卷积层的结果最大池化为一个长特征向量，添加dropout正则，并使用softmax对结果进行分类。

二、一步一步带你实现简单的TextCNN

代码使用dennybritz实现的[Github]cnn-text-classification-tf，下面将对其代码内容进行详解

1. TextCNN类

TextCNN类的初始化方法如下：

class TextCNN(object):
    def __init__(self,sequence_length, num_classes, vocab_size,
      embedding_size, filter_sizes, num_filters,l2_reg_lambda=0.0):

各参数的介绍：

参数名	意义	例
sequence_length	句子长度--输入句子的最大长度	`20`
num_classes	分类数	`2400`
vocab_size	单词数	`3000`
embedding_size	单词嵌入维度	`128`
filter_sizes	卷积过滤器的大小	`[3, 4, 5]` -- 使用大小分别为3,4,5的过滤器，每个过滤器有一个与之对应的`num_filters`，本例共有3*[`num_filters`]个过滤器
num_filters	每个过滤器大小的过滤器数量
l2_reg_lambda	l2正则权值	0.0

1.1 Input Placeholders

# Placeholders for input, output and dropout
self.input_x = tf.placeholder(tf.int32, [None, sequence_length], name="input_x")
self.input_y = tf.placeholder(tf.float32, [None, num_classes], name="input_y")
self.dropout_keep_prob = tf.placeholder(tf.float32, name="dropout_keep_prob")

这里，我们创建了placeholder变量作为训练的输入和测试的输入，placeholder的第二项变量中第一维为batch_size，None意味着该维度可为任意值，使用None将该维度交给网络自由决定。
将神经元保存在dropout层中的概率也作为网络的输入，因为我们在测试时不使用dropout。

1.2 Embedding层

Embedding层将单词向量使用更低维向量表示。

with tf.device('/cpu:0'), tf.name_scope("embedding"):
    W = tf.Variable(
        tf.random_uniform([vocab_size, embedding_size], -1.0, 1.0),
        name="W")
    self.embedded_chars = tf.nn.embedding_lookup(W, self.input_x)
    self.embedded_chars_expanded = tf.expand_dims(self.embedded_chars, -1)

这里我们使用了一些特殊的特性：

tf.device('/cpu:0')将Embedding操作交给cpu执行。默认情况下TensorFlow会将该操作交给gpu执行（前提是有gpu），但是当前embedding在gpu中执行会报错。
tf.name_scope("embedding"):本操作将embedding加入到命名空间(name scope)中。命名空间将所有操作加入到名为embedding的顶层节点中，因此在使用TensorBoard进行网络可视化时能有一个良好的层次结构。

W是我们在训练中学习的嵌入矩阵。我们使用随机均匀分布来初始化它。 tf.nn.embedding_lookup创建实际的嵌入操作。嵌入操作的结果是形状为[None，sequence_length，embedding_size]的三维张量。
TensorFlow的卷积转换操作具有对应于批次，宽度，高度和通道的尺寸的4维张量。我们嵌入的结果不包含通道尺寸，所以我们手动添加，留下一层shape为[None，sequence_length，embedding_size，1]。

1.3 Convolution and Max-Pooling Layers

下面开始构建卷积层，再进行max-pooling。因为每个卷积产生不同形状的张量，因此为他们中的每一个创建一个层，然后合并结果为一个大的特征向量。

pooled_outputs = []  # 池化输出结果
for i, filter_size in enumerate(filter_sizes):
    # 遍历多个filter_size
    with tf.name_scope("conv-maxpool-%s" % filter_size):
        # Convolution Layer
        filter_shape = [filter_size, embedding_size, 1, num_filters]
        W = tf.Variable(tf.truncated_normal(filter_shape, stddev=0.1), name="W")
        b = tf.Variable(tf.constant(0.1, shape=[num_filters]), name="b")
        conv = tf.nn.conv2d(
            self.embedded_chars_expanded,
            W,
            strides=[1, 1, 1, 1],
            padding="VALID",
            name="conv")
        # Apply nonlinearity
        h = tf.nn.relu(tf.nn.bias_add(conv, b), name="relu")
        # Max-pooling over the outputs
        pooled = tf.nn.max_pool(
            h,
            ksize=[1, sequence_length - filter_size + 1, 1, 1],
            strides=[1, 1, 1, 1],
            padding='VALID',
            name="pool")
        pooled_outputs.append(pooled)
 
# Combine all the pooled features
num_filters_total = num_filters * len(filter_sizes)
self.h_pool = tf.concat(3, pooled_outputs)
self.h_pool_flat = tf.reshape(self.h_pool, [-1, num_filters_total])

这里，W是过滤器矩阵，h是将非线性应用于卷积输出的结果。每个过滤器在整个嵌入中滑动，但是它涵盖的字数有所不同。 VALID填充意味着我们在没有填充边缘的情况下将过滤器滑过我们的句子，执行给我们输出形状[1，sequence_length - filter_size + 1,1,1]的窄卷积。在特定过滤器大小的输出上执行最大值池将留下一张张量的形状[batch_size，1，num_filters]。这本质上是一个特征向量，其中最后一个维度对应于我们的特征。一旦我们从每个过滤器大小得到所有的汇总输出张量，我们将它们组合成一个长形特征向量[batch_size，num_filters_total]。在tf.reshape中使用-1可以告诉TensorFlow在可能的情况下平坦化维度。

0_1522310619733_20fffba7-e40f-4304-b1ed-c7bc37bff401-image.png
0_1522310635786_73355d77-a508-4b86-858d-a4a9c5a55214-image.png

1.4 Dropout层

Dropout是使卷积神经网络正则化的最受欢迎的方法，Dropout的想法很简单：Dropout层随机“禁用”神经元的一部分，这可以防止神经元共同适应并迫使他们独立学习有用的特征。神经元中启用的比例是由初始化参数中的dropout_keep_prob决定的，训练时我们将它定义为0.5，而在测试时定义为1（禁用Dropout）。

# Add dropout
with tf.name_scope("dropout"):
    self.h_drop = tf.nn.dropout(self.h_pool_flat, self.dropout_keep_prob)

1.5 评估和预测

使用从max-pooling中得到的特征向量（带Dropout），我们可以通过矩阵乘法生成预测并选择得分最高的分类，我们使用softmax将原分数转换为归一化概率，但它并不会改变预测结果。

with tf.name_scope("output"):
    W = tf.Variable(tf.truncated_normal([num_filters_total, num_classes], stddev=0.1), name="W")
    b = tf.Variable(tf.constant(0.1, shape=[num_classes]), name="b")
    self.scores = tf.nn.xw_plus_b(self.h_drop, W, b, name="scores")
    self.predictions = tf.argmax(self.scores, 1, name="predictions")

其中，tf.nn.xw_plus是一个实现 $Wx+b$ 矩阵乘法的一个封装方法。

1.6 loss和准确率计算

我们可以使用1.5得到的score来定义loss function。分类问题的标准损失方程为交叉熵损失方程。

# Calculate mean cross-entropy loss
with tf.name_scope("loss"):
    losses = tf.nn.softmax_cross_entropy_with_logits(self.scores, self.input_y)
    self.loss = tf.reduce_mean(losses)

其中，tf.nn.softmax_cross_entropy_with_logits是一个对每个分类计算交叉熵损失的封装方法，通过score和正确分类作为参数，我们可以得到每一类的loss，对其求平均值，可以得到平均损失。

我们也定义了准确率函数。

# Calculate Accuracy
with tf.name_scope("accuracy"):
    correct_predictions = tf.equal(self.predictions, tf.argmax(self.input_y, 1))
    self.accuracy = tf.reduce_mean(tf.cast(correct_predictions, "float"), name="accuracy")

1.7 网络的可视化

到这里，我们已经完成了网络的构建，为了得到网络的可视化图，我们可以使用TensorBoard对网络进行可视化。
0_1522310547663_b067f939-22de-417b-b25f-69db46d39497-image.png
<center>图4 网络可视化图</center>

1.8 训练

FLAGS = tf.flags.FLAGS
with tf.Graph().as_default():
    session_conf = tf.ConfigProto(
        allow_soft_placement=FLAGS.allow_soft_placement,
        log_device_placement=FLAGS.log_device_placement
    )
    sess = tf.Session(config=session_conf)
    with sess.as_default():

显式创建graph便于训练结束后释放资源，

1.9 实例化CNN和最小化损失

当我们实例化我们的TextCNN模型时，所有定义的变量和操作将被放置在上面创建的默认图和会话中。

cnn = TextCNN(
    sequence_length=x_train.shape[1],
    num_classes=y_train.shape[1],
    vocab_size=len(vocab_processor.vocabulary)
    embedding_size=FLAGS.num_filters,
    filter_sizes = map(int, FLAGS.filter_sizes.split(",")),
    num_filters = FLAGS.num_filters)

接下来，我们定义如何优化网络的损失函数。 TensorFlow有几个内置优化器。我们正在使用Adam优化器。

# Define Training procedure
global_step = tf.Variable(0,name="global_step",trainable=False)
optimizer = tf.train.AdamOptimizer(1e-4)
grads_and_vars = optimizer.compute_gradients(cnn.loss)
train_op = optimizer.apply_gradients(grads_and_vars,global_step=global_step)

1.10 概览（SUMMARIES）

TensorFlow有一个概述（summaries），可以在训练和评估过程中跟踪和查看各种数值。例如，您可能希望跟踪您的损失和准确性随时间的变化。您还可以跟踪更复杂的数值，例如图层激活的直方图。 summaries是序列化对象，并使用SummaryWriter写入磁盘。

# Output directory for models and summaries
timestamp = str(int(time.time()))
out_dir = os.path.abspath(os.path.join(os.path.curdir, "runs", timestamp))
print("Writing to {}\n".format(out_dir))

# Summaries for loss and accuracy
loss_summary = tf.scalar_summary("loss", cnn.loss)
acc_summary = tf.scalar_summary("accuracy", cnn.accuracy)

# Train Summaries
train_summary_op = tf.merge_summary([loss_summary, acc_summary])
train_summary_dir = os.path.join(out_dir, "summaries", "train")
train_summary_writer = tf.train.SummaryWriter(train_summary_dir, sess.graph_def)

# Dev summaries
dev_summary_op = tf.merge_summary([loss_summary, acc_summary])
dev_summary_dir = os.path.join(out_dir, "summaries", "dev")
dev_summary_writer = tf.train.SummaryWriter(dev_summary_dir, sess.graph_def)

在这里，我们分别跟踪培训和评估的总结。在我们的情况下，这些数值是相同的，但是您可能只有在训练过程中跟踪的数值（如参数更新值）。 tf.merge_summary是将多个摘要操作合并到可以执行的单个操作中的便利函数。

1.11 CHECKPOINTING

通常使用TensorFlow的另一个功能是checkpointing- 保存模型的参数以便稍后恢复。Checkpoints 可用于在以后的时间继续训练，或使用 early stopping选择最佳参数设置。使用Saver对象创建 Checkpoints。

# Checkpointing
checkpoint_dir = os.path.abspath(os.path.join(out_dir, "checkpoints"))
checkpoint_prefix = os.path.join(checkpoint_dir, "model")
# Tensorflow assumes this directory already exists so we need to create it
if not os.path.exists(checkpoint_dir):
    os.makedirs(checkpoint_dir)
saver = tf.train.Saver(tf.all_variables())

1.12 初始化变量

在训练模型前，我们需要初始化变量。

# Initialize all variables
sess.run(tf.global_variables_initializer())

1.13 定义单步训练函数

现在我们来定义一个训练步骤的函数，评估一批数据上的模型并更新模型参数。

def train_step(x_batch,y_batch):
    """
        A single training step
        """
    feed_dict = {
        cnn.input_x:x_batch,
        cnn.input_y:y_batch,
        cnn.dropout_keep_prob:FLAGS.dropout_keep_prob
    }
    _,step,summaries,loss,accuracy = sess.run(
        [train_op,global_step,train_summary_op,cnn.loss,cnn.accuracy],feed_dict
    )
    time_str = datetime.datetime.now().isoformat()
    print("{}:step{},loss{:g},acc{:g}".format(time_str,step,loss,accuracy))
    train_summary_writer.add_summary(summaries,step)

def dev_step(x_batch, y_batch, writer=None):
    """
    Evaluates model on a dev set
    """
    feed_dict = {
        cnn.input_x: x_batch,
        cnn.input_y: y_batch,
        cnn.dropout_keep_prob: 1.0
    }
    step, summaries, loss, accuracy = sess.run(
        [global_step, dev_summary_op, cnn.loss, cnn.accuracy],
        feed_dict)
    time_str = datetime.datetime.now().isoformat()
    print("{}: step {}, loss {:g}, acc {:g}".format(time_str, step, loss, accuracy))
    if writer:
        writer.add_summary(summaries, step)

1.14 TRAINING LOOP

下面写训练循环。

# Generate batches
batches = data_helpers.batch_iter(
    zip(x_train, y_train), FLAGS.batch_size, FLAGS.num_epochs)
# Training loop. For each batch...
for batch in batches:
    x_batch, y_batch = zip(*batch)
    train_step(x_batch, y_batch)
    current_step = tf.train.global_step(sess, global_step)
    if current_step % FLAGS.evaluate_every == 0:
        print("\nEvaluation:")
        dev_step(x_dev, y_dev, writer=dev_summary_writer)
        print("")
    if current_step % FLAGS.checkpoint_every == 0:
        path = saver.save(sess, checkpoint_prefix, global_step=current_step)
        print("Saved model checkpoint to {}\n".format(path))

2. 模型可优化内容

使用word2vec词向量初始化embedding层，为了达到提升，你需要使用300维以上的词向量。
限制最后一层权重向量的L2范数，就像原始文献一样。你可以通过定义一个新的操作，在每次训练步骤之后更新权重值。
将L2正则化添加到网络以防止过拟合，同时也提高dropout比率。（Github上的代码已经包括L2正则化，但默认情况下禁用）
添加权重更新和图层操作的直方图summaries，并在TensorBoard中进行可视化。

后续内容待更

参考文献

hunto

2018.5.5更新：使用word2vec作为输入进行训练

使用Word2Vec

使用word2vec训练其实早已写完代码，一直没有整理上来。先将修改后的代码放到这里，以后有时间再介绍修改过程。

text_cnn.py

import tensorflow as tf
import numpy as np
class TextCNN(object):

   def __init__(
     self, sequence_length, num_classes, vocab_size,
     embedding_size, filter_sizes, num_filters, l2_reg_lambda=0.0):

       # Placeholders for input, output and dropout
       self.input_x = tf.placeholder(tf.int32, [None, sequence_length], name="input_x")
       self.input_y = tf.placeholder(tf.float32, [None, num_classes], name="input_y")
       self.dropout_keep_prob = tf.placeholder(tf.float32, name="dropout_keep_prob")
       self.learning_rate = tf.placeholder(tf.float32, name='learning_rate')

       # Keeping track of l2 regularization loss (optional)
       l2_loss = tf.constant(0.0)

       # Embedding layer
       with tf.device('/cpu:0'), tf.name_scope("embedding"):
           self.W = tf.Variable(
               tf.random_uniform([vocab_size, embedding_size], -1.0, 1.0),
               name="W")
           self.embedded_chars = tf.nn.embedding_lookup(self.W, self.input_x)
           self.embedded_chars_expanded = tf.expand_dims(self.embedded_chars, -1)

       # Create a convolution + maxpool layer for each filter size
       pooled_outputs = []
       for i, filter_size in enumerate(filter_sizes):
           with tf.name_scope("conv-maxpool-%s" % filter_size):
               # Convolution Layer
               filter_shape = [filter_size, embedding_size, 1, num_filters]
               W = tf.Variable(tf.truncated_normal(filter_shape, stddev=0.1), name="W")
               b = tf.Variable(tf.constant(0.1, shape=[num_filters]), name="b")
               conv = tf.nn.conv2d(
                   self.embedded_chars_expanded,
                   W,
                   strides=[1, 1, 1, 1],
                   padding="VALID",
                   name="conv")
               # Apply nonlinearity
               h = tf.nn.relu(tf.nn.bias_add(conv, b), name="relu")
               # Maxpooling over the outputs
               pooled = tf.nn.max_pool(
                   h,
                   ksize=[1, sequence_length - filter_size + 1, 1, 1],
                   strides=[1, 1, 1, 1],
                   padding='VALID',
                   name="pool")
               pooled_outputs.append(pooled)

       # Combine all the pooled features
       num_filters_total = num_filters * len(filter_sizes)
       self.h_pool = tf.concat(pooled_outputs, 3)
       self.h_pool_flat = tf.reshape(self.h_pool, [-1, num_filters_total])

       # Add dropout
       with tf.name_scope("dropout"):
           self.h_drop = tf.nn.dropout(self.h_pool_flat, self.dropout_keep_prob)

       # Final (unnormalized) scores and predictions
       with tf.name_scope("output"):
           W = tf.get_variable(
               "W",
               shape=[num_filters_total, num_classes],
               initializer=tf.contrib.layers.xavier_initializer())
           b = tf.Variable(tf.constant(0.1, shape=[num_classes]), name="b")
           l2_loss += tf.nn.l2_loss(W)
           l2_loss += tf.nn.l2_loss(b)
           self.scores = tf.nn.xw_plus_b(self.h_drop, W, b, name="scores")
           self.predictions = tf.argmax(self.scores, 1, name="predictions")

       # CalculateMean cross-entropy loss
       with tf.name_scope("loss"):
           losses = tf.nn.softmax_cross_entropy_with_logits(logits=self.scores, labels=self.input_y)
           self.loss = tf.reduce_mean(losses) + l2_reg_lambda * l2_loss

       # Accuracy
       with tf.name_scope("accuracy"):
           correct_predictions = tf.equal(self.predictions, tf.argmax(self.input_y, 1))
           self.accuracy = tf.reduce_mean(tf.cast(correct_predictions, "float"), name="accuracy")

data_helpers.py

import numpy as np
import re
import jieba
import itertools
from collections import Counter
from tensorflow.contrib import learn
from gensim.models import Word2Vec


def clean_str(string):
    """
    Tokenization/string cleaning for all datasets except for SST.
    Original taken from https://github.com/yoonkim/CNN_sentence/blob/master/process_data.py
    """
    string = re.sub(r"[^A-Za-z0-9(),!?\'\`]", " ", string)
    string = re.sub(r"\'s", " \'s", string)
    string = re.sub(r"\'ve", " \'ve", string)
    string = re.sub(r"n\'t", " n\'t", string)
    string = re.sub(r"\'re", " \'re", string)
    string = re.sub(r"\'d", " \'d", string)
    string = re.sub(r"\'ll", " \'ll", string)
    string = re.sub(r",", " , ", string)
    string = re.sub(r"!", " ! ", string)
    string = re.sub(r"\(", " \( ", string)
    string = re.sub(r"\)", " \) ", string)
    string = re.sub(r"\?", " \? ", string)
    string = re.sub(r"\s{2,}", " ", string)
    return string.strip().lower()


def load_data_and_labels(train_file_path, test_file_path):
    """
    获取训练与测试数据
    :param train_file_path:
    :param test_file_path:
    :return: 分词后的训练句子列表, 训练分类列表, 分词后的测试句子列表, 测试分类列表
    """

    # Load data from files
    x_train = list()  # 训练数据
    y_train_data = list()  # y训练分类数据
    x_test = list()  # 测试数据
    y_test_data = list()  # y测试分类数据
    y_labels = list()  # 分类集

    # 读取训练数据
    with open(train_file_path, 'r', encoding='utf-8') as train_file:
        for line in train_file.read().split('\n'):
            sp = line.split('||')
            if len(sp) != 2:
                continue
            x_train.append(' '.join(jieba.cut(sp[0])))
            y_train_data.append(sp[1])

    # 读取测试数据
    with open(test_file_path, 'r', encoding='utf-8') as test_file:
        for line in test_file.read().split('\n'):
            sp = line.split('||')
            if len(sp) != 2:
                continue
            x_test.append(' '.join(jieba.cut(sp[0])))
            y_test_data.append(sp[1])

    # 构建分类列表
    for item in y_train_data:
        if item not in y_labels:
            y_labels.append(item)

    labels_len = len(y_labels)
    print('分类数为： ', labels_len)

    # 构建训练y
    y_train = np.zeros((len(y_train_data), labels_len), dtype=np.int)
    for index in range(len(y_train_data)):
        y_train[index][y_labels.index(y_train_data[index])] = 1

    # 构建测试y
    y_test = np.zeros((len(y_test_data), labels_len), dtype=np.int)
    for index in range(len(y_test_data)):
        y_test[index][y_labels.index(y_test_data[index])] = 1

    return [x_train, y_train, x_test, y_test, y_labels]


def load_train_dev_data(train_file_path, test_file_path):
    x_train_text, y_train, x_test_text, y_test, _ = load_data_and_labels(train_file_path, test_file_path)
    # Load data
    print("Loading data...")

    # Build vocabulary
    max_train_document_length = max([len(x.split(" ")) for x in x_train_text])
    max_test_document_length = max([len(x.split(" ")) for x in x_test_text])
    max_document_length = max_test_document_length \
        if max_test_document_length > max_train_document_length \
        else max_train_document_length

    # 使用VocabularyProcessor处理输入
    vocab_processor = learn.preprocessing.VocabularyProcessor(max_document_length)
    x_train = np.array(list(vocab_processor.fit_transform(x_train_text)))
    x_test = np.array(list(vocab_processor.fit_transform(x_test_text)))

    # Randomly shuffle data -- 随机搅乱数据
    np.random.seed(10)
    shuffle_indices = np.random.permutation(np.arange(len(y_train)))
    x_train = x_train[shuffle_indices]
    y_train = y_train[shuffle_indices]

    print("Vocabulary Size: {:d}".format(len(vocab_processor.vocabulary_)))
    print("Train/Dev split: {:d}/{:d}".format(len(y_train), len(y_test)))
    return x_train, y_train, x_test, y_test, vocab_processor


def load_embedding_vectors_word2vec(vocabulary, filename, binary):
    word2vec_model = Word2Vec.load(filename)
    embedding_vectors = np.random.uniform(-0.25, 0.25, (len(vocabulary), 200))

    for word in word2vec_model.wv.vocab:
        idx = vocabulary.get(word)
        if idx != 0:
            embedding_vectors[idx] = word2vec_model[word]

    return embedding_vectors


def batch_iter(data, batch_size, num_epochs, shuffle=True):
    """
    Generates a batch iterator for a dataset.
    """
    data = np.array(data)
    data_size = len(data)
    num_batches_per_epoch = int((len(data)-1)/batch_size) + 1
    for epoch in range(num_epochs):
        # Shuffle the data at each epoch
        if shuffle:
            shuffle_indices = np.random.permutation(np.arange(data_size))
            shuffled_data = data[shuffle_indices]
        else:
            shuffled_data = data
        for batch_num in range(num_batches_per_epoch):
            start_index = batch_num * batch_size
            end_index = min((batch_num + 1) * batch_size, data_size)
            yield shuffled_data[start_index:end_index]

train.py

#! /usr/bin/env python

import tensorflow as tf
import numpy as np
import os
import time
import datetime
import data_helpers
from text_cnn import TextCNN
from tensorflow.contrib import learn
import yaml

# Parameters
# ==================================================

# Data loading params
tf.app.flags.DEFINE_float("dev_sample_percentage", .1, "Percentage of the training data to use for validation")
tf.app.flags.DEFINE_string("train_file", "../data/train_data.txt", "Train file source.")
tf.app.flags.DEFINE_string("test_file", "../data/test_data.txt", "Test file source.")


# Model Hyperparameters
tf.app.flags.DEFINE_integer("embedding_dim", 128, "Dimensionality of character embedding (default: 128)")
tf.app.flags.DEFINE_string("filter_sizes", "3,4,5", "Comma-separated filter sizes (default: '3,4,5')")
tf.app.flags.DEFINE_integer("num_filters", 128, "Number of filters per filter size (default: 128)")
tf.app.flags.DEFINE_float("dropout_keep_prob", 0.5, "Dropout keep probability (default: 0.5)")
tf.app.flags.DEFINE_float("l2_reg_lambda", 0.0, "L2 regularization lambda (default: 0.0)")

# Training parameters
tf.app.flags.DEFINE_integer("batch_size", 128, "Batch Size (default: 64)")
tf.app.flags.DEFINE_integer("num_epochs", 100, "Number of training epochs (default: 200)")
tf.app.flags.DEFINE_integer("evaluate_every", 100, "Evaluate model on dev set after this many steps (default: 100)")
tf.app.flags.DEFINE_integer("checkpoint_every", 1000, "Save model after this many steps (default: 100)")
tf.app.flags.DEFINE_integer("num_checkpoints", 5, "Number of checkpoints to store (default: 5)")
# Misc Parameters
tf.app.flags.DEFINE_boolean("allow_soft_placement", True, "Allow device soft device placement")
tf.app.flags.DEFINE_boolean("log_device_placement", False, "Log placement of ops on devices")

FLAGS = tf.app.flags.FLAGS
print("\nParameters:")
for attr, value in sorted(FLAGS.__flags.items()):
    print("{}={}".format(attr.upper(), value))
print("")


# Data Preparation
# ==================================================

# Load data
with open("config.yml", 'r') as ymlfile:
    cfg = yaml.load(ymlfile)

print("Loading data...")
x_train, y_train, x_test, y_test, vocab_processor = data_helpers.load_train_dev_data(FLAGS.train_file, FLAGS.test_file)

embedding_name = cfg['word_embeddings']['default']
embedding_dimension = cfg['word_embeddings'][embedding_name]['dimension']


# Training
# ==================================================

with tf.Graph().as_default():
    session_conf = tf.ConfigProto(
      allow_soft_placement=FLAGS.allow_soft_placement,
      log_device_placement=FLAGS.log_device_placement)
    sess = tf.Session(config=session_conf)

    with sess.as_default():
        cnn = TextCNN(
            sequence_length=x_train.shape[1],
            num_classes=y_train.shape[1],
            vocab_size=len(vocab_processor.vocabulary_),
            embedding_size=embedding_dimension,
            filter_sizes=list(map(int, FLAGS.filter_sizes.split(","))),
            num_filters=FLAGS.num_filters,
            l2_reg_lambda=FLAGS.l2_reg_lambda)

        cnn.learning_rate = 0.01
        # Define Training procedure
        global_step = tf.Variable(0, name="global_step", trainable=False)
        optimizer = tf.train.AdamOptimizer(cnn.learning_rate)
        grads_and_vars = optimizer.compute_gradients(cnn.loss)
        train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step)

        # Keep track of gradient values and sparsity (optional)
        grad_summaries = []
        for g, v in grads_and_vars:
            if g is not None:
                grad_hist_summary = tf.summary.histogram("{}/grad/hist".format(v.name), g)
                sparsity_summary = tf.summary.scalar("{}/grad/sparsity".format(v.name), tf.nn.zero_fraction(g))
                grad_summaries.append(grad_hist_summary)
                grad_summaries.append(sparsity_summary)
        grad_summaries_merged = tf.summary.merge(grad_summaries)

        # Output directory for models and summaries
        timestamp = str(int(time.time()))
        out_dir = os.path.abspath(os.path.join(os.path.curdir, "runs", timestamp))
        print("Writing to {}\n".format(out_dir))

        # Summaries for loss and accuracy
        loss_summary = tf.summary.scalar("loss", cnn.loss)
        acc_summary = tf.summary.scalar("accuracy", cnn.accuracy)

        # Train Summaries
        train_summary_op = tf.summary.merge([loss_summary, acc_summary, grad_summaries_merged])
        train_summary_dir = os.path.join(out_dir, "summaries", "train")
        train_summary_writer = tf.summary.FileWriter(train_summary_dir, sess.graph)

        # Dev summaries
        dev_summary_op = tf.summary.merge([loss_summary, acc_summary])
        dev_summary_dir = os.path.join(out_dir, "summaries", "dev")
        dev_summary_writer = tf.summary.FileWriter(dev_summary_dir, sess.graph)

        # Checkpoint directory. Tensorflow assumes this directory already exists so we need to create it
        checkpoint_dir = os.path.abspath(os.path.join(out_dir, "checkpoints"))
        checkpoint_prefix = os.path.join(checkpoint_dir, "model")
        if not os.path.exists(checkpoint_dir):
            os.makedirs(checkpoint_dir)
        saver = tf.train.Saver(tf.global_variables(), max_to_keep=FLAGS.num_checkpoints)

        # Write vocabulary
        vocab_processor.save(os.path.join(out_dir, "vocab"))

        # Initialize all variables
        sess.run(tf.global_variables_initializer())

        vocabulary = vocab_processor.vocabulary_
        initW = data_helpers.load_embedding_vectors_word2vec(vocabulary,
                                                             cfg['word_embeddings']['word2vec']['path'],
                                                             cfg['word_embeddings']['word2vec']['binary'])
        print(initW.shape)
        sess.run(cnn.W.assign(initW))


        def train_step(x_batch, y_batch):
            """
            A single training step
            """
            feed_dict = {
              cnn.input_x: x_batch,
              cnn.input_y: y_batch,
              cnn.dropout_keep_prob: FLAGS.dropout_keep_prob
            }
            _, step, summaries, loss, accuracy = sess.run(
                [train_op, global_step, train_summary_op, cnn.loss, cnn.accuracy],
                feed_dict)
            time_str = datetime.datetime.now().isoformat()
            print("{}: step {}, loss {:g}, acc {:g}".format(time_str, step, loss, accuracy))
            train_summary_writer.add_summary(summaries, step)

        def dev_step(x_batch, y_batch, writer=None):
            """
            Evaluates model on a dev set
            """
            feed_dict = {
              cnn.input_x: x_batch,
              cnn.input_y: y_batch,
              cnn.dropout_keep_prob: 1.0
            }
            step, summaries, loss, accuracy = sess.run(
                [global_step, dev_summary_op, cnn.loss, cnn.accuracy],
                feed_dict)
            time_str = datetime.datetime.now().isoformat()
            print("{}: step {}, loss {:g}, acc {:g}".format(time_str, step, loss, accuracy))
            if writer:
                writer.add_summary(summaries, step)
            if step % FLAGS.batch_size == 0:
                print('epoch ', step % FLAGS.batch_size)

        # Generate batches
        batches = data_helpers.batch_iter(
            list(zip(x_train, y_train)), FLAGS.batch_size, FLAGS.num_epochs)
        # Training loop. For each batch...
        for batch in batches:
            x_batch, y_batch = zip(*batch)
            train_step(x_batch, y_batch)
            current_step = tf.train.global_step(sess, global_step)
            if current_step % FLAGS.evaluate_every == 0:
                print("\nEvaluation:")
                dev_step(x_test, y_test, writer=dev_summary_writer)
                print("")
            if current_step % FLAGS.checkpoint_every == 0:
                path = saver.save(sess, checkpoint_prefix, global_step=current_step)
                print("Saved model checkpoint to {}\n".format(path))

hunto

Object Detection

0_1557480608311_d6e66d3d-0afa-4ebc-a85d-1e8d7b0be4c9-image.png

Object Detection领域目前主要有几种主流方法，一类是Faster RCNN为代表的two-stage模型，一类是以YOLO、SSD为代表的one-stage模型。在速度方面，one-stage网络具有绝对的优势，而在精确度上，Faster RCNN效果会更好。
同时，对于单目标检测，也有一些多级网络做到了非常快的速度，例如MTCNN(Multi-task Cascaded Convolutional Networks)，目前我们的项目也是使用的MTCNN进行人脸定位。
在之后的内容中，我会选取几个Object Detection模型进行讲解。
Object Detection更基础的介绍可见 DianAI培训3 - CNN for Object Detection。

Multi-stage 级联网络

Faster RCNN
MTCNN

One-stage 单级网络

YOLO
SSD

模型效果比较

0_1557480631929_36c41b68-97ab-41bf-89c8-ff00b8be6607-image.png

hunto

References

RCNN的发展

在说Faster RCNN之前，我们先简单介绍一下RCNN、FastRCNN。RCNN算法是由RCNN->Fast RCNN->Faster RCNN逐步发展来的，且效果和速度为递增关系，因此可以直接学习Faster RCNN，它的前辈已经淘汰了。
CNN作为特征提取器，对图片特征有很好提取效果，因此有了使用CNN进行物体检测的想法，我们可以采用滑动窗口的方法，将图片resize成不同大小，使用同一卷积网络对图片进行卷积，得到很多个窗口的分类概率及坐标回归，但这样的网络计算量显然太大了，如果我要识别的物体非常小，就需要把图片resize成很大，让卷积网络滑过无数个窗口。

0_1557220221685_d8f8dd9c-ccf8-4689-a27a-b1b5b7683087-image.png

上面的滑动窗口的方法很粗暴，绝大多数窗口都是没有物体的，这样对计算造成了很大的浪费，我们是否能在将图片输入网络之前先确定一些有可能是物体的框？当然可以。

Selective Search
对于一张图片，我们可以按照颜色、纹理等特征划分成多个部分，将每一个部分作为一个备选框输入网络，可以极大地减少计算量。本部分我们不再详细介绍。
0_1557221033321_c09a617a-a24b-4db6-9658-bd7a9b96a67a-image.png

RCNN
RCNN就是用了这样的Region Proposal思想构建的。先使用selective search得到备选框，再按框将图片裁剪出来输入网络中，得到物体的图片特征，再通过SVM和线性回归得到分类以及框的偏移量，这样的好处是训练非常简单，我们只需要训练好一个模型作为判别器，判断框中的物体分类即可。
0_1557219833094_1ba352fc-f959-40fe-b7ab-24a4b5a127cf-image.png

Fast RCNN
既然可以用CNN提取图片特征，为什么不直接写一个网络预测物体分类和坐标回归？于是有了Fast RCNN。
0_1557221488429_f668c7ad-1ea1-44de-9bcc-599ac8db857c-image.png

当然Fast RCNN也有一个问题，selective search算法耗时很长。可不可以用CNN来做region proposal？于是就有了Faster RCNN。

RCNN、Fast RCNN、Faster RCNN的区别
0_1557223064315_834f88e3-3144-4033-ad12-58b41a776cc0-image.png

	使用方法	缺点	改进
R-CNN(Region-based Convolutional Neural Networks)	1、SS提取RP；2、CNN提取特征；3、SVM分类；4、BB盒回归。	1、训练步骤繁琐（微调网络+训练SVM+训练bbox）；2、训练、测试均速度慢；3、训练占空间	1、从DPM HSC的34.3%直接提升到了66%（mAP）；2、引入RP+CNN
Fast R-CNN(Fast Region-based Convolutional Neural Networks)	1、SS提取RP；2、CNN提取特征；3、softmax分类；4、多任务损失函数边框回归。	1、依旧用SS提取RP(耗时2-3s，特征提取耗时0.32s)；2、无法满足实时应用，没有真正实现端到端训练测试；3、利用了GPU，但是区域建议方法是在CPU上实现的。	1、由66.9%提升到70%；2、每张图像耗时约为3s。
Faster R-CNN(Fast Region-based Convolutional Neural Networks)	1、RPN提取RP；2、CNN提取特征；3、softmax分类；4、多任务损失函数边框回归。	1、还是无法达到实时检测目标；2、获取region proposal，再对每个proposal分类计算量还是比较大。	1、提高了检测精度和速度；2、真正实现端到端的目标检测框架；3、生成建议框仅需约10ms。

Faster RCNN

0_1557219043179_5f8ed854-54b5-43b9-b541-99135acfc378-image.png

Faster RCNN总体流程

Conv Layers: 使用基础CNN网络提取图像的feature map
Region Proposal Network: 使用RPN判断feature map的anchors是否为物体，同时修正box坐标
Roi Pooling: 该层收集输入的feature maps和proposals，综合这些信息后提取proposal feature maps，送入后续全连接层判定目标类别
Classification: 利用proposal feature maps计算proposal的类别，同时再次bounding box regression获得检测框最终的精确位置

Region Proposal Networks

Faster RCNN将feature map的每一个点作为中心点，为其成k（默认k=9）个anchors，anchors按大小分为3组，每组框的长宽比为[2, 1, 0.5]。个anchors，RPN会对每个anchors都输出对应的是否物体概率及框坐标，因此每个点都会有k * 2个scores输出、k * 4个坐标回归输出。

0_1557224218021_dd5f7820-79b9-445b-adb5-28d45904c496-image.png

其实RPN最终就是在原图尺度上，设置了密密麻麻的候选Anchor。然后用cnn去判断哪些Anchor是里面有目标的foreground anchor，哪些是没目标的backgroud。所以，仅仅是个二分类而已！

那么anchor一共有多少个呢？
若原图为 $800 \times 600$ , 使用VGG网络下采样 $16$ 倍，每个点的anchor数为9，那么总anchor数为：

$N_{anchors} = ceil(800\ /\ 16)\times ceil(600\ /\ 16)\times 9 = 50 \times 38 \times 9 = 17100$

0_1557224354880_2bfddfb2-e00a-4863-a77e-75a1867c38c1-image.png

在得到所有anchors的概率及坐标后，可使用nms(Non-Maximum Suppression)操作将重合度较高的框合并为同一个、将概率较低的框舍弃。

NMS
介绍NMS算法前首先要了解衡量框重合度的IOU公式：
IOU(Intersection over Union)的全称为交并比

0_1557224388102_5d43478d-2b36-4867-83ba-46f9a0b24052-image.png

若框A的面积为SA，框B的面积为SB，两框重合面积为SI，则
$IOU = SI\ /\ (SA + SB - SI)$

非极大值抑制的主要操作步骤为：

对候选框按照分类概率进行筛选，概率低于某个阈值的视为非该物体直接剔除
对每一个分类的box进行排序，得到从大到小的box列表。如：A, B, C, D, E
从最大的框开始，分别与其后的框计算IOU，当两框IOU大于某个阈值时，将概率小的框舍弃。例如：A与C的IOU=0.9 > 阈值0.7，将C舍弃，剩余框为A, B, D, E
再按照顺序继续遍历，从B开始计算B与D、E的IOU
对每个分类均要执行一次上述过程

当然，使用nms方法会存在将重合的两个同类物体合并为一个的问题，也有soft nms等算法可以缓解此问题。

通过上述操作后，我们就可以得到可能为物体的备选框了

ROI Pooling

为什么需要ROI Pooling？
通常图像分类网络的输入大小都是固定的（例如224x224），那么要将一个长宽比为1:2的图片输入网络中，我们的两种做法是：裁剪、拉伸。

0_1557224404580_ea56b27f-20c3-469f-8db2-cdb482fd8f88-image.png

但这样会破坏图像原有的结构信息，因此Faster RCNN使用了ROI Pooling来处理不同长宽比的图片。其原理是，对输入矩阵的宽高维度使用不同大小的池化，例如160wx320h的图片，可以对w使用大小为10的池化，对h使用大小为20的池化，最终得到16x16的矩阵。

0_1557224412334_29ccd88b-d6f0-44f3-aa18-0fde32f2d13e-image.png

Classification

最后再将ROI Pooling得到的feature输入CNN网络中，得到其分类与最终坐标回归。

hunto

先挖坑，不知道想不想填

hunto

导读
这是我在商汤实习期间以共同一作身份发表的一篇CVPR 2020 poster论文，本论文提出了基于贪心超网络的One-Shot NAS方法，显著提升了超网络直接在大规模数据集上的搜索训练效率，并在标准ImageNet数据集上取得了300M FLOPs量级的SOTA。GreedyNAS论文通过提出一种贪心的超网络结构采样训练方法，改善了训练得到的超网络对结构的评估能力，进而帮助搜索算法得到精度更高的结构。

论⽂地址：
GreedyNAS: Towards Fast One-Shot NAS with Greedy Supernet
https://arxiv.org/abs/2003.11236

0_1588074736999_64221c54-237e-488c-ba3e-3c998f53edeb-image.png

动机与背景

在目前的神经结构搜索领域中，One-Shot NAS方法由于其搜索开销小被广泛应用，这些方法使用一个权重共享的超网络（supernet）作为不同网络结构的性能评估器，因此，supernet的训练对搜索结果的好坏至关重要。然而，目前的方法一般采用了一个基本的假设，即supernet中每一个结构是同等重要的，supernet应该对每个结构进行准确评估或相对排序。然而，supernet中所包含的结构量级（搜索空间的size）是非常巨大的（如 $7^{21}$ ），因此准确的评估对于supernet来说是非常困难的，导致supernet中结构的表现与其真实表现相关性很差 [1]。
在本篇论文中，我们提出一种贪心超网络来减轻supernet的评估压力，使得supernet更加贪心地注重于有潜力的好结构，而不是全体。具体而言，在supernet训练过程中，我们提出了一种多路径拒绝式采样方法(multi-path sampling with rejection)来进行路径滤波 (path filtering)，使得有潜力的好结构得到训练。通过这种方法，supernet的训练从整个搜索空间贪心地缩小到了有潜力的结构组成的空间中，因此训练的效率得到了提升。同时，为了进一步增大有潜力结构的采样概率与提高训练效率，我们基于exploration and exploitation准则，使用一个经验池存储评估过的“好”结构，用来加强贪心度并为后续的搜索提供好的初始点。本论文搜索出的结构在ImageNet (mobile setting) 下取得了 SOTA(state-of-the-art) 的结果。

主要思路

巨大的搜索空间带来的评估压力使supernet难以准确地区分结构的好坏，由于所有结构的权重都是在supernet中高度共享的，如果一个差的结构被训练到了，好的结构的权重也会受到干扰。这样的干扰会削弱好网络的最终评估精度，影响网络的搜索结果。同时，对差的结构进行训练相当于对权重进行了没有必要的更新，降低supernet的训练效率。

多路径拒绝式采样

针对上述问题，一个直接的想法就是基于贪心策略，在训练过程中只训练好的结构。但很显然，我们并不知道一个随机初始化的搜索空间中哪些结构是好的。假设对于一个supernet，我们考虑其搜索空间A的一个完备划分，即：
0_1588075268506_图片 1.png
搜索空间可以如上划分为好的空间与差的空间，且好空间中每一个结构的ACC均大于差空间，即

于是，一个理想的采样策略是直接在好空间中进行采样即可。然而根据上面的不等式，确定所有结构中哪些是来自好空间需要遍历整个搜索空间，计算开销是无法接受的。为了解决这个问题，我们首先考虑从全空间中进行的一个均匀采样，那么每个path来自好空间的概率为：
0_1588075429483_b262a41a-b77e-4500-857c-52cfa88ee6aa-image.png
为了得到需要的来自于好空间的结构，我们进一步考虑一个多维的Bernoulli实验，那么有如下的结论：
0_1588075398469_1624a551-a3be-4578-bdc8-0ba89cecbc75-image.png
我们取m = 10和m = 20，对定理1的概率进行绘制，如Figure 2所示，可见这种采样下得到来自好空间中的path的概率是很高的。
0_1588075472078_40204171-b28a-44f3-874a-0a39919b5f9a-image.png
于是，我们可以对采样到的多个结构进行评估筛选的方法提升采样到“好”结构的概率，即每次采样m个结构，从中选取评估指标最高的k个结构进行训练。但是，对结构进行评估需要在验证集上计算其ACC，这样会增加非常多计算量（我们的ImageNet验证集大小为50k）。为了减少评估网络的消耗，我们从验证集中随机选取了一小部分（如1000张图）组成小验证集，并使用在小验证集上的loss作为结构的排序指标。使用小验证集进行评估，在保证评估准确性的前提下，相较uniform sampling方法只增加了很少的计算代价，详见论文实验部分。

基于exploration and exploitation策略的路径候选池

0_1588075543485_d5afa174-64b1-4034-9027-e058d95d6e01-image.png

在前面提到的路径滤波中，我们通过评估可以区分出较好的结构，为了进一步提升训练效率，受蒙特卡洛树搜索(Monte Carlo tree search)[4] 和 deep Q-learning[5] 中常用的exploration and exploitation策略启发，我们提出使用一个路径候选池用于存放训练过程中评估过的“好”结构，并进行重复利用。具体而言，候选池可以看作是一个固定大小的有序队列，其只会存储所有评估过结构中得分前n(候选池大小) 的结构。

有了候选池的帮助，我们可以选择从搜索空间中或候选池中采样结构。从候选池中采样的结构是好结构的概率更高，但可能会牺牲结构的多样性。为了平衡exploration与exploitation，我们采用ϵ-采样策略，即以一定的概率从整个搜索空间A或候选池P中采样结构α：
0_1588075607276_cd709e1a-82f8-4702-b11b-9850243f4d21-image.png
在网络刚开始训练时评估过的结构较少，候选池中存储的结构是好结构的可信度不高，因此从候选池中采样的概率 ϵ 在开始时设为0，并线性增加至一个较高的值（在实验中，我们发现0.8是一个较优的值）。若候选池中的结构都来自好空间，通过使用候选池，定理1中好网络的采样概率q提升为：
0_1588075644777_05e3a48b-5162-497b-8503-18f3e921ac7d-image.png
因此，采样10个结构，至少有5个好结构的概率由88.38%提升至99.36% 。
0_1588075677144_d109ad36-1dc4-467a-aa89-9fdf2daa23e1-image.png

基于候选池的early stopping策略
候选池的更新为supernet的训练情况提供了一个很好的参考。若候选池发生的更新（顺序变换、进出）较少，可以认为超网络中较好的网络维持着一个相对稳定的排序，这也说明此时的supernet已经是一个较好的性能评估器，因此训练进程可以提前结束，而不需要训练至网络完全收敛。
在实际使用中，我们会比较当前候选池P与t轮迭代前的候选池P_t的差异度，若差异度低于某个数值（我们的实验使用0.08），训练停止。差异度的定义如下：
0_1588075721391_2cbb8f42-9504-441b-8ac4-7546bf017020-image.png

0_1588075732982_e8db8583-5d39-4d4b-a577-d1cc3266bab3-image.png

基于候选池的搜索策略
Supernet训练结束后，我们可以使用验证集的ACC评估结构的好坏。本文使用 NSGA-II 进化算法[3] 进行结构搜索。我们在进化算法中使用候选池中的结构进行population的初始化，相较于随机初始化，借助于候选池能够使进化算法有一个更好的初始，提升搜索效率及最终的精度。
如 Figure 3 所示，我们在同一个训练好的supernet上使用了随机初始化与候选池初始化两种方式进行搜索，使用候选池初始化搜索到的结构的准确率平均会比随机初始化要高。
0_1588075774826_7dff6fbb-a87c-4cb9-9c2c-c82534d1f978-image.png

实验结果

为了与目前的one-shot方法进行对比，我们首先在与 ProxylessNAS[6] 一样的 MobileNetV2 搜索空间上进行结构搜索，结果见 Table 1 。
0_1588075812295_65e888e8-8db5-490d-99a0-03abacfcc42f-image.png
同时为了进一步提升网络性能，我们在加入了SE的更大搜索空间上进行搜索，结果见 Table 2 。
0_1588075825554_4a814333-1d3a-4b86-8174-99b7febe6111-image.png

Ablation Study

多路径拒绝式采样中的路径评估相关性

我们对随机初始化、uniform sampling、greedy方法训练得到的supernet下的小验证集指标与完整验证集ACC相关性进行了评估，如 Table 3 所示。可见在小验证集上使用loss相比ACC会得到更高的相关性，我们的贪心方法训练出的supernet有着更好的相关性。

0_1588075873676_61a001f4-17b5-4541-8d03-fa4027f1382d-image.png

我们对不同大小小验证集与完整验证集的相关性进行了评估（Figure 4 左图），同时对uniform-sampling算法在不同迭代轮数下的相关性作了评估（Figure 4 右图）。可以看出我们的算法在较小的验证集大小上仍能保持较高的相关性，在精度与效率的权衡下，我们最终选取 1000 作为小验证集大小。
0_1588075915796_ec7bef92-daf3-46b8-845b-bfc2fbca683a-image.png

对多路径贪心采样及候选池效果的评估

我们在MobileNetV2的search space下评估了多路径贪心采样及候选池的效果，如 Table 4 所示。
0_1588075887013_cad70d33-b276-483d-9db8-c472d9cbd807-image.png

总结

超网络训练是单分支One-Shot NAS 方法的关键。与目前方法的对所有分支一视同仁不同，我们的方法贪心地注重于有潜力的好分支的训练。这种贪心地分支滤波可以通过我们提出的多分支采样策略被高效地实现。我们提出的 GreedyNAS 在准确率和训练效率上均展现出了显著的优势。

Reference

[1] Christian Sciuto, Kaicheng Yu, Martin Jaggi, Claudiu Musat, and Mathieu Salzmann. Evaluating the search phase of neural architecture search. arXiv preprint arXiv:1902.08142, 2019.
[2] Zichao Guo, Xiangyu Zhang, Haoyuan Mu, Wen Heng, Zechun Liu, Yichen Wei, and Jian Sun. Single path oneshot neural architecture search with uniform sampling. arXiv preprint arXiv:1904.00420, 2019.
[3] Kalyanmoy Deb, Amrit Pratap, Sameer Agarwal, and TAMT Meyarivan. A fast and elitist multiobjective genetic algorithm: Nsga-ii. IEEE transactions on evolutionary computation, 6(2):182–197, 2002.
[4] Levente Kocsis and Csaba Szepesv´ari. Bandit based montecarlo planning. In European conference on machine learning, pages 282–293. Springer, 2006.
[5] Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. Playing atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602, 2013.
[6] Han Cai, Ligeng Zhu, and Song Han. Proxylessnas: Direct neural architecture search on target task and hardware. arXiv preprint arXiv:1812.00332, 2018.