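According to a preliminary count, CVPR 2022 accepted about 26 papers directly related to OCR, 4 more than last year (CVPR 2021, 22 papers), which suggests that research interest in this area continues to grow. The 26 papers cover document image processing, scene text detection, scene text recognition, end-to-end text recognition, table analysis and structure recognition, document image understanding, font generation, handwritten mathematical expression recognition, self-supervised text recognition, OCR+CV applications, and other directions. Details follow (papers marked * have released code; papers marked # have released a dataset):
Note: CVPR 2022 papers can be downloaded from the official open-access site: https://openaccess.thecvf.com/CVPR2022?day=all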
Text image processing (scene text segmentation, text super-resolution, document image dewarping): 4 papers
- Xixi Xu, Zhongang Qi, Jianqi Ma, Honglun Zhang, Ying Shan, Xiaohu Qie, BTS: A Bi-Lingual Benchmark for Text Segmentation in the Wild (#)
- Jianqi Ma, Zhetong Liang, Lei Zhang, A Text Attention Network for Spatial Deformation Robust Scene Text Image Super-Resolution (*)
- Chuhui Xue, Zichen Tian, Fangneng Zhan, Shijian Lu, Song Bai, Fourier Document Restoration for Robust Document Dewarping and Recognition (#)
- Xiangwei Jiang, Rujiao Long, Nan Xue, Zhibo Yang, Cong Yao, Gui-Song Xia, Revisiting Document Image Dewarping by Grid Regularization (*)
Scene text detection: 3 papers
- Jingqun Tang, Wenqing Zhang, Hongye Liu, MingKun Yang, Bo Jiang, Guanglong Hu, Xiang Bai, Few Could Be Better Than All: Feature Sampling and Grouping for Scene Text Detection
- Sibo Song, Jianqiang Wan, Zhibo Yang, Jun Tang, Wenqing Cheng, Xiang Bai, Cong Yao, Vision-Language Pre-Training for Boosting Scene Text Detectors (*)
- Shangbang Long, Siyang Qin, Dmitry Panteleev, Alessandro Bissacco, Yasuhisa Fujii, Michalis Raptis, Towards End-to-End Unified Scene Text Detection and Layout Analysis (*#)
Scene text recognition: 2 papers
- Chang Liu, Chun Yang, Xu-Cheng Yin, Open-Set Text Recognition via Character-Context Decoupling (*#)
- Caiyuan Zheng, Hui Li, Seon-Min Rhee, Seungju Han, Jae-Joon Han, Peng Wang, Pushing the Performance Limit of Scene Text Recognizer Without Human Annotation
End-to-end text recognition: 3 papers
- Mingxin Huang, Yuliang Liu, Zhenghao Peng, Chongyu Liu, Dahua Lin, Shenggao Zhu, Nicholas Yuan, Kai Ding, Lianwen Jin, SwinTextSpotter: Scene Text Spotting Via Better Synergy Between Text Detection and Text Recognition (*)
- Xiang Zhang, Yongwen Su, Subarna Tripathi, Zhuowen Tu, Text Spotting Transformer (*)
- Yair Kittenplon, Inbal Lavi, Sharon Fogel, Yarin Bar, R. Manmatha, Pietro Perona, Towards Weakly-Supervised Text Spotting using a Multi-Task Transformer
Document image understanding: 3 papers
- Zhangxuan Gu, Changhua Meng, Ke Wang, Jun Lan, Weiqiang Wang, Ming Gu, Liqing Zhang, XYLayoutLM: Towards Layout-Aware Multimodal Networks for Visually-Rich Document Understanding
- Ali Furkan Biten, Ron Litman, Yusheng Xie, Srikar Appalaraju, R. Manmatha, LaTr: Layout-Aware Transformer for Scene-Text VQA
- Yihao Ding, Zhe Huang, Runlin Wang, YanHang Zhang, Xianru Chen, Yuzhong Ma, Hyunsuk Chung, Soyeon Caren Han, V-Doc: Visual Questions Answers With Documents (#)
Table analysis and structure recognition: 3 papers
- Brandon Smock, Rohith Pesala, Robin Abraham, PubTables-1M: Towards Comprehensive Table Extraction From Unstructured Documents (*#)
- Hao Liu, Xin Li, Bing Liu, Deqiang Jiang, Yinsong Liu, Bo Ren, Neural Collaborative Graph Machines for Table Structure Recognition
- Ahmed Nassar, Nikolaos Livathinos, Maksym Lysak, Peter Staar, TableFormer: Table Structure Understanding with Transformers
Data synthesis: 4 papers
- Wei Liu, Fangyue Liu, Fei Ding, Qian He, Zili Yi, XMP-Font: Self-Supervised Cross-Modality Pre-training for Few-Shot Font Generation
- Yuxin Kong, Canjie Luo, Weihong Ma, Qiyuan Zhu, Shenggao Zhu, Nicholas Yuan, Lianwen Jin, Look Closer to Supervise Better: One-Shot Font Generation via Component-Based Discriminator
- Licheng Tang, Yiyang Cai, Jiaming Liu, Zhibin Hong, Mingming Gong, Minhu Fan, Junyu Han, Jingtuo Liu, Errui Ding, Jingdong Wang, Few-Shot Font Generation by Learning Fine-Grained Local Styles
- Yizhi Wang, Guo Pu, Wenhan Luo, Yexin Wang, Pengfei Xiong, Hongwen Kang, Zhouhui Lian, Aesthetic Text Logo Synthesis via Content-Aware Layout Inferring
Others (handwritten mathematical expression recognition, self-supervised text representation learning, new OCR+CV applications): 4 papers
- Ye Yuan, Xiao Liu, Wondimu Dikubab, Hui Liu, Zhilong Ji, Zhongqin Wu, Xiang Bai, Syntax-Aware Network for Handwritten Mathematical Expression Recognition (*#)
- Canjie Luo, Lianwen Jin, Jingdong Chen, SimAN: Exploring Self-Supervised Representation Learning of Scene Text via Similarity-Aware Normalization
- Mengjun Cheng, Yipeng Sun, Longchao Wang, Xiongwei Zhu, Kun Yao, Jie Chen, Guoli Song, Junyu Han, Jingtuo Liu, Errui Ding, Jingdong Wang, ViSTA: Vision and Scene Text Aggregation for Cross-Modal Retrieval
- Hao Wang, Junchao Liao, Tianheng Cheng, Zewen Gao, Hao Liu, Bo Ren, Xiang Bai, Wenyu Liu, Knowledge Mining with Scene Text for Fine-Grained Recognition (*#)