CN117671714A - Intelligent archive management method and system for building full life cycle - Google Patents

Intelligent archive management method and system for building full life cycle

Info

Publication number
CN117671714A
Authority
CN
China
Prior art keywords
file
archive
image
text
archives
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202311712970.XA
Other languages
Chinese (zh)
Inventor
邵俊豪
陈丰
丁洁
刘玲
施秋红
邹积珉
万源
张峰
宋扬
戴岽丞
庙丹明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Zhongbo Communication Co ltd
Original Assignee
Jiangsu Zhongbo Communication Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Zhongbo Communication Co ltd
Priority to CN202311712970.XA
Publication of CN117671714A
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/413 Classification of content, e.g. text, photographs or tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/247 Thesauruses; Synonyms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/18 Extraction of features or characteristics of the image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/191 Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19173 Classification techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an intelligent archive management method and system covering the full life cycle, and relates to the technical field of archive management. The method comprises the following steps: collecting archival data and creating a primary archive repository; organizing the archives in the primary archive repository, and generating and storing a final archive repository; establishing retrieval rules for archive retrieval and sharing; recording the operation history and access history of archives, and marking and tracing them; establishing archiving requirements and performing archiving and backup; and setting archive destruction rules and periodically destroying and updating archives. By collecting, appraising, organizing, storing, retrieving, compiling statistics on and destroying archive information resources, the invention establishes archive management over the full cycle from collection to destruction, thereby achieving integrated use and management of archive resources, improving the efficiency and quality of archive management, and promoting information sharing and collaboration.

Description

A method and system for intelligent archive management covering the full life cycle

Technical field

The invention belongs to the technical field of archive management, and in particular relates to an intelligent archive management method and system covering the full life cycle.

Background

Intelligent archives apply intelligent and digital technologies to archive management, making the management of and interaction with archives more convenient and promoting the sharing and use of information. Existing archive management, however, addresses only the organization and consultation of archives. It does not establish a full-life-cycle management process covering archival materials from their creation to their destruction. As a result, archive management is inefficient, archive security cannot be guaranteed, and information sharing and interaction are hindered.

Summary of the invention

The purpose of the invention is to provide an intelligent archive management method and system covering the full life cycle, achieved through the following technical solutions:

In a first aspect, an embodiment of this application provides an intelligent archive management method covering the full life cycle, comprising the following steps:

collecting archival data and creating a primary archive repository;

organizing the archives in the primary archive repository, and generating and storing a final archive repository;

establishing retrieval rules for archive retrieval and sharing;

recording the operation history and access history of archives, and marking and tracing them;

establishing archiving requirements and performing archiving and backup;

setting archive destruction rules and periodically destroying and updating archives;

wherein collecting archival data includes online collection, offline collection and automatic collection;

creating the primary archive repository specifically comprises converting paper archives into electronic archives by scanning;

the archive organization includes data classification, document image processing and document text processing;

the retrieval rules include keyword retrieval, fuzzy retrieval, full-text retrieval, associative retrieval and synonym retrieval;

the full-text retrieval uses OCR technology, and the synonym retrieval uses natural-language semantic analysis;

the archiving requirements are established from the operation history and the access history, and are used to archive and back up infrequently accessed archives.
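As an illustration of three of the retrieval rules listed above, the following minimal Python sketch shows keyword, fuzzy and synonym retrieval over already-extracted text. It is not the patented implementation: the `docs` mapping (archive ID to text, assumed to come from an OCR step), the `SYNONYMS` table (a hand-written stand-in for semantic analysis) and the 0.8 similarity threshold are all assumptions made for this example.

```python
from difflib import SequenceMatcher

# Hypothetical synonym table; the source derives synonyms by semantic analysis.
SYNONYMS = {"blueprint": {"drawing", "plan"}}


def keyword_search(docs, term):
    """Return IDs of archives whose text contains the term as a substring."""
    return [doc_id for doc_id, text in docs.items() if term in text]


def fuzzy_search(docs, term, threshold=0.8):
    """Return IDs of archives with at least one word similar to the term."""
    hits = []
    for doc_id, text in docs.items():
        words = text.split()
        if any(SequenceMatcher(None, term, w).ratio() >= threshold for w in words):
            hits.append(doc_id)
    return hits


def synonym_search(docs, term):
    """Return IDs of archives containing the term or one of its synonyms."""
    terms = {term} | SYNONYMS.get(term, set())
    return [doc_id for doc_id, text in docs.items()
            if any(t in text.split() for t in terms)]


# Assumed sample data: text as it might look after OCR of scanned archives.
docs = {
    "A1": "foundation drawing for block three",
    "A2": "acceptance report",
    "A3": "structural blueprint revision",
}
print(keyword_search(docs, "report"))
print(fuzzy_search(docs, "fondation"))
print(synonym_search(docs, "blueprint"))
```

Fuzzy matching here tolerates the misspelling "fondation", and the synonym lookup returns an archive that mentions only "drawing". Word splitting on whitespace would not suit Chinese text, which needs a dedicated segmenter.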

Preferably, connecting the final archive repository to a blockchain comprises the following steps:

establishing blockchain nodes in the blockchain and creating a smart contract;

recording the archive information of the final archive repository in the blockchain nodes;

establishing an access control mechanism based on the smart contract.

Preferably, the access control mechanism is specifically as follows:

dividing the archives in the final archive repository into a number of archive blocks containing archive information;

encrypting the archive blocks and setting a verification method;

assigning access rights to visitors based on the smart contract;

if a visitor accesses an archive block, matching and verifying the visitor's access rights.

Preferably, performing document image processing on the archival data comprises the following steps:

obtaining archive images from paper archival materials, the archive images including a reference image and an image to be inspected;

extracting a plurality of reference statistical features from the reference image;

encoding the reference statistical features to generate a reference statistical feature vector;

encoding the reference image to generate a reference image feature vector;

optimizing the feature encoding of the reference image feature vector based on the reference statistical feature vector, to obtain an optimized reference image feature matrix;

applying the same image encoding and optimization to the image to be inspected, to obtain an optimized inspection image feature matrix;

measuring the optimized reference image feature matrix against the optimized inspection image feature matrix, to obtain a metric feature vector;

passing the metric feature vector through a classifier to obtain a classification result;

wherein the classification result indicates whether the image quality of the image to be inspected meets a predetermined requirement.

Preferably, the document image processing further includes image deskewing, stain removal, edge filling, cropping and rotation;

the image deskewing specifically comprises the following steps:

reading the archive image to be corrected and preprocessing it;

using a Fourier transform to obtain the tilt angle α of the preprocessed archive image;

determining the outer boundary line and the tilt angle β of the archive image by boundary fitting, and correcting the tilt of the archive image with a spatial affine transformation;

extracting the list of character images from the archive image, matching those character images against preset radical components and/or character components to determine whether the archive image is upside down, and flipping any inverted image, so as to obtain an archive image whose text is upright and whose tilt angle meets the specification;

wherein the preprocessing includes cropping of white margins and grayscale conversion.

Preferably, the document text processing includes text cleaning, text standardization, word segmentation, text conversion and sensitive information processing.

Preferably, the document image processing and the document text processing further include an image-text comparison, specifically:

obtaining an archive image through the document image processing;

obtaining archive text through the document text processing;

comparing the archive image with the archive text to obtain a comparison result;

the comparison result indicating the degree to which the image and the text of the archive match.

Preferably, archives are destroyed and updated periodically, specifically: a time limit for archive destruction is set, and each archive to be destroyed is judged against the archive destruction rules; if the archive complies with the rules, it is destroyed and the final archive repository is updated.

Preferably, the archive storage process further includes verifying archive information, specifically comprising the following steps:

obtaining the set of associated archive storage areas that corresponds, on a first archive retrieval path, to the first archive verification keyword of the archive storage area to be verified;

obtaining a second archive verification keyword from the set of associated archive storage areas, and identifying, within a preset time, the first archive parsing sequence associated with the first archive verification keyword on the first archive retrieval path and the second archive parsing sequence associated with the second archive retrieval path corresponding to the second archive verification keyword;

parsing the archive information in the first archive parsing sequence and in the second archive parsing sequence with a pre-configured archive parsing framework, to obtain a first archive information set from the first sequence and a second archive information set from the second sequence;

determining, from the first archive information set and the second archive information set respectively, the associated archive information related to the archive-writing problem of the archive storage area to be verified, and binding the associated archive information to that storage area;

sending and recording the associated archive information together with the verification status of the archive storage area to be verified.

In a second aspect, an embodiment of this application provides an intelligent archive management system covering the full life cycle, comprising an archive collection module, an archive organization module, an archive retrieval module, an archive recording module, an archive storage module and an archive destruction module, communicatively connected in sequence;

the archive collection module is used to collect archival data and create a primary archive repository;

the archive organization module organizes the archives in the primary archive repository and generates a final archive repository;

the archive retrieval module is used to establish retrieval rules and perform archive retrieval and sharing;

the archive recording module is used to record the operation history and access history of archives, and to mark and trace them;

the archive storage module is used to establish archiving requirements, perform archiving and backup, and store the final archive repository;

the archive destruction module is used to set archive destruction rules and periodically destroy and update archives.

The beneficial effects of the invention are as follows:

(1) By collecting, appraising, organizing, storing, retrieving, compiling statistics on and destroying archive information resources, the invention establishes archive management over the full cycle from collection to destruction, thereby achieving integrated use and management of archive resources, improving the efficiency and quality of archive management, and promoting information sharing and collaboration.

(2) The invention combines blockchain with archive management, which can ensure the security, privacy protection and transparency of archives and improve the reliability and traceability of archive information.

(3) By verifying archive information, the invention greatly improves verification efficiency, avoids the misjudgments that arise from manual one-by-one checking, and improves the accuracy of archive analysis and verification.

(4) The invention uses deep-learning-based artificial intelligence detection to extract high-dimensional implicit feature distribution information from the paper archive image to be inspected and from the reference paper archive image. A distance metric then measures the difference between the implicit features of the two images, and this difference is used to assess the quality of the image to be inspected. In this way the quality of scanned paper archive images can be checked intelligently and accurately, to judge whether the clarity of a scanned image meets the needs of subsequent applications.

(5) By fitting the border of the scanned archive image and applying a Fourier transform, the invention corrects the tilt of the archive image. This addresses cases in which a straight line cannot be fitted accurately because the paper has been damaged by long storage, and it screens out information unrelated to the page content, which improves the accuracy of tilt correction. Where an image is inverted, inversion detection ensures that the archive text is upright, which improves processing efficiency and facilitates application in digital archive management and archiving.

Brief description of the drawings

For better understanding and implementation, the technical solution of this application is described in detail below with reference to the accompanying drawings.

Figure 1 is a flow chart of the steps of an intelligent archive management method covering the full life cycle, provided by an embodiment of this application;

Figure 2 is a schematic structural diagram of an intelligent archive management system covering the full life cycle, provided by an embodiment of this application.

Detailed description

To further explain the technical means adopted by the invention to achieve its intended purpose, and their effects, exemplary embodiments are described in detail here, examples of which are shown in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with this application. Rather, they are merely examples of methods and systems consistent with some aspects of this application, as detailed in the appended claims.

The terminology used in this application is for the purpose of describing particular embodiments only and is not intended to limit the application. As used in this application and the appended claims, the singular forms "a", "said" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein covers any and all possible combinations of one or more of the associated listed items.

The specific implementations, features and effects of the invention are described in detail below with reference to the accompanying drawings and preferred embodiments.

Embodiment 1

Referring to Figure 1, an embodiment of this application provides an intelligent archive management method covering the full life cycle, comprising the following steps:

collecting archival data and creating a primary archive repository;

organizing the archives in the primary archive repository, and generating and storing a final archive repository;

establishing retrieval rules for archive retrieval and sharing;

recording the operation history and access history of archives, and marking and tracing them;

establishing archiving requirements and performing archiving and backup;

setting archive destruction rules and periodically destroying and updating archives;

wherein collecting archival data includes online collection, offline collection and automatic collection;

creating the primary archive repository specifically comprises converting paper archives into electronic archives by scanning;

the archive organization includes data classification, document image processing and document text processing;

the retrieval rules include keyword retrieval, fuzzy retrieval, full-text retrieval, associative retrieval and synonym retrieval;

the full-text retrieval uses OCR technology, and the synonym retrieval uses natural-language semantic analysis;

the archiving requirements are established from the operation history and the access history, and are used to archive and back up infrequently accessed archives.
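The last step, deriving archiving requirements from the access history, can be sketched as follows. This is only one possible reading: the source does not define "infrequently accessed", so the one-year window, the threshold of two accesses, and the function name `select_for_archiving` are assumptions made for this example.

```python
from collections import Counter
from datetime import datetime, timedelta


def select_for_archiving(archive_ids, access_log, now,
                         window_days=365, min_accesses=2):
    """Return IDs accessed fewer than min_accesses times within the window.

    access_log is a list of (archive_id, timestamp) pairs, as an archive
    recording step might produce. Window and threshold are assumed values.
    """
    cutoff = now - timedelta(days=window_days)
    counts = Counter(aid for aid, ts in access_log if ts >= cutoff)
    return sorted(aid for aid in archive_ids if counts[aid] < min_accesses)


# Assumed sample access history.
log = [
    ("A1", datetime(2023, 6, 1)),
    ("A1", datetime(2023, 12, 1)),
    ("A2", datetime(2021, 1, 1)),
    ("A3", datetime(2023, 11, 1)),
]
print(select_for_archiving(["A1", "A2", "A3"], log, datetime(2024, 1, 1)))
```

Archives returned by the function would then be moved to archival storage and backed up; how that is done is outside this sketch.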

In one embodiment provided by this application, connecting the final archive repository to a blockchain comprises the following steps:

establishing blockchain nodes in the blockchain and creating a smart contract;

recording the archive information of the final archive repository in the blockchain nodes;

establishing an access control mechanism based on the smart contract.

Further, the access control mechanism is specifically as follows:

dividing the archives in the final archive repository into a number of archive blocks containing archive information;

encrypting the archive blocks and setting a verification method;

assigning access rights to visitors based on the smart contract;

if a visitor accesses an archive block, matching and verifying the visitor's access rights.
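The access control steps above can be sketched in plain Python as follows. This is a simplified stand-in, not on-chain code: the `AccessContract` class only imitates the role the source assigns to the smart contract, SHA-256 digests are an assumed choice of verification method, and block encryption is omitted because the Python standard library provides no symmetric cipher. All names are hypothetical.

```python
import hashlib
import hmac


def split_into_blocks(data, size):
    """Divide archive bytes into fixed-size archive blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class AccessContract:
    """Plain-Python stand-in for the smart contract (assumption)."""

    def __init__(self):
        self._grants = {}   # visitor -> set of permitted block indices
        self._digests = []  # SHA-256 digest recorded for each block

    def register_blocks(self, blocks):
        """Record a digest per block so later reads can be verified."""
        self._digests = [hashlib.sha256(b).hexdigest() for b in blocks]

    def grant(self, visitor, block_index):
        """Assign a visitor the right to read one block."""
        self._grants.setdefault(visitor, set()).add(block_index)

    def can_read(self, visitor, block_index):
        """Match the visitor's rights against the requested block."""
        return block_index in self._grants.get(visitor, set())

    def verify(self, block_index, data):
        """Check that block content matches the recorded digest."""
        digest = hashlib.sha256(data).hexdigest()
        return hmac.compare_digest(digest, self._digests[block_index])


blocks = split_into_blocks(b"abcdefghij", 4)
contract = AccessContract()
contract.register_blocks(blocks)
contract.grant("alice", 1)
print(contract.can_read("alice", 1), contract.can_read("bob", 1))
print(contract.verify(1, blocks[1]))
```

In a real deployment the grants and digests would live in contract state on the chain, and the blocks themselves would be encrypted before storage.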

In one embodiment provided by this application, performing document image processing on the archival data comprises the following steps:

obtaining archive images from paper archival materials, the archive images including a reference image and an image to be inspected;

extracting a plurality of reference statistical features from the reference image;

encoding the reference statistical features to generate a reference statistical feature vector;

encoding the reference image to generate a reference image feature vector;

optimizing the feature encoding of the reference image feature vector based on the reference statistical feature vector, to obtain an optimized reference image feature matrix;

applying the same image encoding and optimization to the image to be inspected, to obtain an optimized inspection image feature matrix;

measuring the optimized reference image feature matrix against the optimized inspection image feature matrix, to obtain a metric feature vector;

passing the metric feature vector through a classifier to obtain a classification result;

wherein the classification result indicates whether the image quality of the image to be inspected meets a predetermined requirement.
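The overall shape of this check (features from a reference, features from a candidate, a distance, then a pass/fail decision) can be illustrated with a much simpler stand-in. The sketch below does not use the learned encoders or the classifier described in the source. It substitutes three hand-crafted statistics and a fixed distance threshold, and the feature choice, the threshold of 20.0 and the tiny sample images are all assumptions.

```python
import math


def features(img):
    """Return (mean, standard deviation, mean absolute horizontal gradient).

    img is a list of rows of grayscale values in the range 0..255.
    """
    pixels = [p for row in img for p in row]
    mean = sum(pixels) / len(pixels)
    std = math.sqrt(sum((p - mean) ** 2 for p in pixels) / len(pixels))
    grads = [abs(row[i + 1] - row[i])
             for row in img for i in range(len(row) - 1)]
    return (mean, std, sum(grads) / len(grads))


def quality_ok(reference, candidate, max_distance=20.0):
    """Accept the candidate if its features lie close to the reference's.

    A fixed Euclidean-distance threshold stands in for the trained
    classifier; max_distance is an assumed value.
    """
    return math.dist(features(reference), features(candidate)) <= max_distance


# Assumed toy images: a high-contrast scan and a washed-out one.
sharp = [[0, 255, 0, 255], [0, 255, 0, 255]]
blurred = [[120, 135, 120, 135], [120, 135, 120, 135]]
print(quality_ok(sharp, sharp), quality_ok(sharp, blurred))
```

The washed-out image has the same mean brightness as the reference but far lower contrast and edge strength, so it falls outside the threshold.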

Further, the document image processing also includes image deskewing, stain removal, edge filling, cropping and rotation;

the image deskewing specifically comprises the following steps:

reading the archive image to be corrected and preprocessing it;

using a Fourier transform to obtain the tilt angle α of the preprocessed archive image;

determining the outer boundary line and the tilt angle β of the archive image by boundary fitting, and correcting the tilt of the archive image with a spatial affine transformation;

extracting the list of character images from the archive image, matching those character images against preset radical components and/or character components to determine whether the archive image is upside down, and flipping any inverted image, so as to obtain an archive image whose text is upright and whose tilt angle meets the specification;

wherein the preprocessing includes cropping of white margins and grayscale conversion.
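The boundary-fitting and affine-correction step can be sketched as follows, under the assumption that points along one page edge have already been detected. The sketch fits a least-squares line to those points to estimate the tilt angle β, then rotates the points back by that angle (a rotation being a special case of an affine transform). The Fourier-based estimate of α and the upside-down check are not covered, and the function names are hypothetical.

```python
import math


def fit_tilt_angle(points):
    """Fit a least-squares line through edge points; return its tilt in degrees."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    return math.degrees(math.atan2(sxy, sxx))


def rotate_points(points, angle_deg):
    """Rotate points about the origin by angle_deg (counter-clockwise)."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


# Assumed input: points sampled along a page edge tilted by 5 degrees.
edge = [(x, math.tan(math.radians(5)) * x) for x in range(0, 100, 10)]
beta = fit_tilt_angle(edge)
corrected = rotate_points(edge, -beta)
print(round(beta, 3))
```

On a real scan the same rotation would be applied to the pixel grid rather than to a handful of points, typically with an image library's affine warp.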

In one embodiment provided by this application, the document text processing includes text cleaning, text standardization, word segmentation, text conversion and sensitive information processing.
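A minimal sketch of such a text pipeline follows. The source names the stages but not their rules, so every rule here is an assumption: cleaning removes control characters, standardization and conversion are approximated by Unicode NFKC normalization plus lowercasing, sensitive information is taken to mean 18-character ID numbers, and segmentation is a simple regular expression that would need replacing with a proper segmenter for Chinese text.

```python
import re
import unicodedata

# Assumed pattern: 17 digits followed by a digit or X (an 18-character ID).
ID_PATTERN = re.compile(r"\b\d{17}[\dXx]\b")


def clean(text):
    """Drop control and format characters, then collapse whitespace."""
    kept = "".join(ch for ch in text
                   if unicodedata.category(ch)[0] != "C" or ch in "\n\t")
    return re.sub(r"\s+", " ", kept).strip()


def normalize(text):
    """Fold full-width forms to half-width (NFKC) and lowercase."""
    return unicodedata.normalize("NFKC", text).lower()


def mask_sensitive(text):
    """Replace ID-like numbers with a placeholder."""
    return ID_PATTERN.sub("[REDACTED]", text)


def tokenize(text):
    """Split into word tokens, keeping the placeholder intact."""
    return re.findall(r"\[REDACTED\]|\w+", text)


def process(text):
    """Run cleaning, normalization, masking and segmentation in order."""
    return tokenize(mask_sensitive(normalize(clean(text))))


print(process("  Ｐｌａｎ  No. 12\x00 ID 11010519491231002X "))
```

Masking runs before segmentation so that an ID number is never split into tokens that could leak part of it.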

Further, the document image processing and the document text processing also include an image-text comparison, specifically:

obtaining an archive image through the document image processing;

obtaining archive text through the document text processing;

comparing the archive image with the archive text to obtain a comparison result;

the comparison result indicating the degree to which the image and the text of the archive match.
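One way to express the matching degree as a number is sketched below. The source does not say how the comparison is computed, so this example assumes that the archive image has already been turned into a string by an OCR step and scores it against the processed text with a sequence-similarity ratio from the standard library. The function name and the whitespace handling are assumptions.

```python
from difflib import SequenceMatcher


def match_degree(ocr_text, processed_text):
    """Return a similarity score in [0, 1], ignoring whitespace differences."""
    a = "".join(ocr_text.split())
    b = "".join(processed_text.split())
    return SequenceMatcher(None, a, b).ratio()


print(match_degree("Project  A-12", "Project A-12"))
print(match_degree("Project A-12", "Project A-13"))
```

A score near 1 suggests the stored text faithfully reflects the scanned page; a low score would flag the archive for manual review.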

In one embodiment provided by this application, archives are destroyed and updated periodically, specifically: a time limit for archive destruction is set, and each archive to be destroyed is judged against the archive destruction rules; if the archive complies with the rules, it is destroyed and the final archive repository is updated.
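The periodic destruction check can be sketched as follows, assuming the destruction rule is a simple retention period counted in years from the creation date. The source does not specify the rule, and the record fields `created` and `retention_years` and the function names are hypothetical.

```python
from datetime import date


def add_years(d, years):
    """Return d shifted by whole years, mapping 29 February to 28 February."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def due_for_destruction(record, today):
    """An archive is due once its assumed retention period has elapsed."""
    return today >= add_years(record["created"], record["retention_years"])


def purge(repository, today):
    """Return the updated repository and the sorted IDs that were destroyed."""
    kept = {k: r for k, r in repository.items()
            if not due_for_destruction(r, today)}
    destroyed = sorted(set(repository) - set(kept))
    return kept, destroyed


# Assumed sample repository.
repo = {
    "A1": {"created": date(2010, 1, 1), "retention_years": 10},
    "A2": {"created": date(2020, 2, 29), "retention_years": 30},
}
kept, destroyed = purge(repo, date(2024, 1, 1))
print(destroyed, list(kept))
```

Returning a new dictionary rather than deleting in place keeps the original repository available until the destruction has been confirmed and logged.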

In one embodiment provided by this application, the archive storage process further includes verifying archive information, specifically comprising the following steps:

obtaining the set of associated archive storage areas that corresponds, on a first archive retrieval path, to the first archive verification keyword of the archive storage area to be verified;

obtaining a second archive verification keyword from the set of associated archive storage areas, and identifying, within a preset time, the first archive parsing sequence associated with the first archive verification keyword on the first archive retrieval path and the second archive parsing sequence associated with the second archive retrieval path corresponding to the second archive verification keyword;

parsing the archive information in the first archive parsing sequence and in the second archive parsing sequence with a pre-configured archive parsing framework, to obtain a first archive information set from the first sequence and a second archive information set from the second sequence;

determining, from the first archive information set and the second archive information set respectively, the associated archive information related to the archive-writing problem of the archive storage area to be verified, and binding the associated archive information to that storage area;

sending and recording the associated archive information together with the verification status of the archive storage area to be verified.

综上所述,本发明通过对档案信息资源进行收集、鉴定、整理、存储、检索统计和销毁,建立从档案收集到档案销毁的全周期的档案管理,从而实现对档案资源的一体化利用和管理;进而提高档案管理效率和质量,促进信息共享和协同。In summary, the present invention establishes a full-cycle archive management from archive collection to archive destruction by collecting, identifying, sorting, storing, retrieving statistics and destroying archive information resources, thereby achieving integrated utilization and destruction of archive resources. management; thereby improving the efficiency and quality of archives management and promoting information sharing and collaboration.

本发明将区块链与档案管理结合在一起,能够确保档案的安全性、隐私保护和透明度,提高档案信息的可靠性和可追溯性。The invention combines blockchain with archive management, which can ensure the security, privacy protection and transparency of archives, and improve the reliability and traceability of archive information.

本发明通过对档案信息进行查验,极大提高档案查验效率,还可以避免由人工进行一一查验误判,提高档案分析查验的准确性。The present invention greatly improves the efficiency of file inspection by checking the file information. It can also avoid misjudgments caused by manual inspection one by one and improve the accuracy of file analysis and inspection.

本发明通过采用基于深度学习的人工智能检测技术来提取出待检测纸质档案图像和参考纸质档案图像中的高维隐含特征分布信息,进一步再通过距离度量工具来度量所述待检测纸质档案图像隐含特征和所述参考纸质档案图像隐含特征之间的特征差异性,并以此来进行所述待检测纸质档案图像的质量评估。这样,能够智能且准确地对于扫描后的纸质档案图像进行质量检测,以此来判断扫描后的档案图像清晰度是否满足后续的应用需求。The present invention uses artificial intelligence detection technology based on deep learning to extract high-dimensional implicit feature distribution information in the paper archive image to be detected and the reference paper archive image, and further uses a distance measurement tool to measure the paper archive to be detected The feature difference between the implicit features of the image and the implicit features of the reference paper archive image is used to evaluate the quality of the paper archive image to be detected. In this way, the quality of the scanned paper archive image can be intelligently and accurately inspected to determine whether the clarity of the scanned archive image meets subsequent application requirements.

The invention corrects the tilt of an archive image by fitting the border of the scanned archive image and applying a Fourier transform. This handles cases where straight lines cannot be fitted accurately because the paper has been damaged by long storage, and it masks out information irrelevant to the layout content, improving the accuracy of tilt correction. For images that are upside down, inversion detection ensures that the archive text is upright, which improves archive processing efficiency and facilitates use in digital archive management and archiving.
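
The boundary-fitting and affine-correction parts of this procedure can be sketched as below. The sketch makes several assumptions: border points along one roughly horizontal page edge have already been detected, a least-squares line fit is used as the fitting method, and the Fourier-transform angle estimate and the inversion detection are not shown. The function names are hypothetical, and the sign of the angle depends on the coordinate convention (image coordinates usually have the y axis pointing down).

```python
import math


def fit_tilt_angle(points):
    """Fit a least-squares line through border points; return its tilt in degrees."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two border points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("border points are vertically aligned")
    # slope = sxy / sxx; atan2 returns the angle of that slope.
    return math.degrees(math.atan2(sxy, sxx))


def rotate_point(point, angle_deg, center=(0.0, 0.0)):
    """Rotate a point counter-clockwise by angle_deg around center."""
    theta = math.radians(angle_deg)
    dx, dy = point[0] - center[0], point[1] - center[1]
    return (
        center[0] + dx * math.cos(theta) - dy * math.sin(theta),
        center[1] + dx * math.sin(theta) + dy * math.cos(theta),
    )
```

Rotating every pixel coordinate by the negative of the fitted angle brings the fitted edge back to horizontal; a real implementation would apply the same affine transform to the whole image with an image-processing library.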

Embodiment 2

Referring to Figure 2, this embodiment of the present application provides a full-life-cycle smart archive management system, comprising an archive collection module, an archive organization module, an archive retrieval module, an archive recording module, an archive storage module, and an archive destruction module, communicatively connected in sequence;

The archive collection module is used to collect archive materials and create a primary archive repository;

The archive organization module organizes archives based on the primary archive repository and generates a final archive repository;

The archive retrieval module is used to establish retrieval rules and perform archive retrieval and sharing;

The archive recording module is used to record the operation history and access history of archives and to mark and trace them;

The archive storage module is used to establish archiving requirements, perform archiving and backup, and store the final archive repository;

The archive destruction module is used to set archive destruction rules and periodically perform destruction and updating.
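
The interplay between the recording module and the storage module can be sketched as follows: claim 1 states that archiving requirements are based on the operation and access history and target infrequently accessed archives. The concrete rule used here (fewer than `min_accesses` accesses within `window_days`) and the function name `select_for_archiving` are assumptions made for illustration only.

```python
from collections import Counter
from datetime import datetime, timedelta


def select_for_archiving(archive_ids, access_log, now,
                         window_days=365, min_accesses=1):
    """Return IDs accessed fewer than min_accesses times in the recent window.

    access_log is a list of (archive_id, timestamp) pairs, as might be
    produced by the recording module.
    """
    cutoff = now - timedelta(days=window_days)
    recent = Counter(aid for aid, ts in access_log if ts >= cutoff)
    return [aid for aid in archive_ids if recent[aid] < min_accesses]
```

The selected archives would then be handed to the storage module for archiving and backup; the window and threshold are tunable policy parameters rather than values given in the source.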

In the above embodiments, each embodiment is described with its own emphasis. For parts not detailed or recorded in one embodiment, refer to the relevant descriptions of the other embodiments.

Specifically, the system provides three collection modes (online, offline, and automatic) and can perform "four-property" detection on collected archives (in Chinese archival practice this term conventionally covers authenticity, integrity, usability, and security). It provides organizing and archiving, batch attachment of electronic files, batch scanning, and archive receiving functions. The system supports linked scanning with scanners and image processing of electronic files (deskewing, stain removal, edge filling, cropping, rotation, and so on), and provides automated tools for producing high-definition images, including text darkening, red-seal darkening, black-spot removal, and background color replacement.

The system can also add, delete, and modify archive data and supports routine archive maintenance. For archive data it further provides batch modification, batch deletion, printing of various forms (various catalogs), data import, data export, custom design, and batch download. It also supports publishing system announcements and a data recycle bin.

For archive destruction, the system meets the management requirements for retention-or-destruction appraisal during archive custody: according to the appraisal plan, it processes appraised archives for continued retention or destruction, prints an inventory of appraised documents, and supports expiration reminders.
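
A minimal sketch of the appraisal and expiration-reminder logic is given below. The record layout (a `retain_until` date per archive), the 30-day reminder window, and the function name `appraise` are assumptions for illustration; the source does not define the appraisal plan in this detail.

```python
from datetime import date


def appraise(records, today, reminder_days=30):
    """Sort archives into expired, expiring soon, and retained.

    Each record is a dict with an "id" and a "retain_until" date.
    """
    result = {"expired": [], "expiring_soon": [], "retained": []}
    for record in records:
        days_left = (record["retain_until"] - today).days
        if days_left < 0:
            result["expired"].append(record["id"])        # candidate for destruction
        elif days_left <= reminder_days:
            result["expiring_soon"].append(record["id"])  # trigger a reminder
        else:
            result["retained"].append(record["id"])
    return result
```

Archives in the `expired` group would still go through the appraisal decision (continued retention or destruction) before anything is deleted; the sketch only classifies them.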

The system provides multiple retrieval modes, including integrated retrieval, card retrieval, advanced retrieval, and full-text retrieval. It provides OCR recognition, enabling full-text retrieval of electronic archive content; supports associated retrieval across different archive types; and has natural semantic analysis capability, enabling synonym retrieval, together with a synonym maintenance function.
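
Synonym retrieval over full text can be sketched with an inverted index and a maintained synonym table. This is a simplified illustration under assumptions: the text is taken as already extracted (OCR is not shown), tokens are split on whitespace (Chinese text would need a word segmenter), and a hand-maintained dictionary stands in for natural semantic analysis. The names `build_index` and `search` are hypothetical.

```python
from collections import defaultdict


def build_index(documents):
    """Build an inverted index mapping each token to the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search(index, query, synonyms):
    """Return sorted IDs of documents matching the query or any of its synonyms."""
    term = query.lower()
    terms = {term}
    terms.update(s.lower() for s in synonyms.get(term, []))
    hits = set()
    for t in terms:
        hits |= index.get(t, set())
    return sorted(hits)
```

With a synonym table such as `{"contract": ["agreement"]}`, a query for "contract" also returns documents that only mention "agreement"; editing that table corresponds to the synonym maintenance function.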

The system provides archivists with borrowing management, including borrowing registration, borrowing authorization approval (for a single file or for multiple files), and permission levels set at approval (view only, or view and print). Browsing, downloading, and printing of electronic archives must support custom watermarks (static text watermarks as well as dynamic system time and accessing user). The system supports masked printing of sensitive parts of electronic archives, borrowing and return management for physical archives, and overdue reminders. It also supports pushing archive information in two modes: active push and push on request.
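
The two permission levels and the dynamic watermark can be sketched as follows. The level names `view` and `view_and_print`, the watermark layout, and the function names are assumptions chosen for illustration; the source only states that such levels and watermark fields exist.

```python
from datetime import datetime

# Actions allowed at each approved permission level (names are assumed).
PERMISSIONS = {
    "view": {"view"},
    "view_and_print": {"view", "print"},
}


def is_allowed(granted_level, action):
    """Check whether an approved permission level covers the requested action."""
    return action in PERMISSIONS.get(granted_level, set())


def watermark_text(static_text, user, now):
    """Combine the static watermark with the accessing user and system time."""
    return f"{static_text} | {user} | {now:%Y-%m-%d %H:%M:%S}"
```

An unknown permission level grants nothing, so a request is denied by default unless an approval explicitly covers it.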

The system supports archive self-evaluation and expert re-evaluation. Users can evaluate their own archive management, and the system generates a self-evaluation score from that evaluation. Experts can re-evaluate the units under review as circumstances require. The system can generate an assessment result report based on the assessment, self-evaluation, and re-evaluation.

Those skilled in the art will clearly understand that, for convenience and brevity of description, the above division into functional units and modules is used only as an example. In practice, the functions described above may be assigned to different functional units and modules as needed; that is, the internal structure of the device may be divided into different functional units or modules to perform all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit; an integrated unit may be implemented in hardware or as a software functional unit. In addition, the specific names of the functional units and modules serve only to distinguish them from one another and are not intended to limit the scope of protection of the present application. For the specific working processes of the units and modules in the above system, refer to the corresponding processes in the foregoing method embodiments, which are not repeated here.

Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.

The above are merely preferred embodiments of the present invention and do not limit the invention in any form. Although the invention has been disclosed above by way of preferred embodiments, these are not intended to limit it. Any person skilled in the art may, without departing from the scope of the technical solution of the present invention, use the technical content disclosed above to make minor changes or modifications resulting in equivalent embodiments. Any simple modification, equivalent change, or refinement made to the above embodiments in accordance with the technical essence of the invention, without departing from the content of its technical solution, still falls within the scope of the technical solution of the present invention.

Claims (10)

1. A full-life-cycle smart archive management method, characterized by comprising the following steps:
collecting archive materials and creating a primary archive repository;
organizing archives based on the primary archive repository, generating a final archive repository, and storing it;
performing archive retrieval and sharing by establishing retrieval rules;
recording the operation history and access history of archives, and marking and tracing them;
establishing archiving requirements and performing archiving and backup;
setting archive destruction rules and periodically performing destruction and updating;
wherein collecting archive materials includes online collection, offline collection, and automatic collection;
creating the primary archive repository specifically comprises converting paper archives into electronic archives by scanning;
the archive organization includes material classification, document image processing, and document text processing;
the retrieval rules include keyword retrieval, fuzzy retrieval, full-text retrieval, associated retrieval, and synonym retrieval;
wherein the full-text retrieval uses OCR recognition technology, and the synonym retrieval uses natural semantic analysis technology;
the archiving requirements are established based on the operation history and the access history, and are used to archive and back up infrequently accessed archives.

2. The full-life-cycle smart archive management method according to claim 1, characterized in that the final archive repository is connected to a blockchain, comprising the following steps:
establishing blockchain nodes in the blockchain and creating a smart contract;
recording the archive information in the final archive in the blockchain nodes;
establishing an access control mechanism based on the smart contract.

3. The full-life-cycle smart archive management method according to claim 2, characterized in that the access control mechanism specifically comprises:
dividing the archives in the final archive repository into a number of archive blocks containing archive information;
encrypting the archive blocks and setting a verification method;
assigning access rights to visitors based on the smart contract;
if a visitor accesses an archive block, matching and verifying the access rights.

4. The full-life-cycle smart archive management method according to claim 1, characterized in that document image processing is performed on the archive materials, comprising the following steps:
obtaining archive images from paper archive materials, the archive images including a reference image and an image to be detected;
extracting a plurality of reference statistical features from the reference image;
performing feature encoding on the reference statistical features to generate a reference statistical feature vector;
performing image encoding on the reference image to generate a reference image feature vector;
optimizing the feature encoding of the reference image feature vector based on the reference statistical feature vector to obtain an optimized reference image feature matrix;
performing image encoding optimization on the image to be detected to obtain an optimized detection image feature matrix;
measuring the optimized reference image feature matrix against the optimized detection image feature matrix to obtain a metric feature vector;
passing the metric feature vector through a classifier to obtain a classification result;
wherein the classification result indicates whether the image quality of the image to be detected meets a predetermined requirement.

5. The full-life-cycle smart archive management method according to claim 4, characterized in that the document image processing further includes image deskewing, image stain removal, image edge filling, image cropping, and image rotation;
the image deskewing specifically comprises the following steps:
reading the archive image to be corrected and preprocessing it;
obtaining the tilt angle α of the preprocessed archive image to be corrected using a Fourier transform;
determining the outer boundary lines and the tilt angle β of the archive image to be corrected by boundary fitting, and correcting the tilt of the archive image to be corrected using a spatial affine transformation;
extracting a list of character images from the archive image to be corrected, matching the character images against preset radical components and/or character components to determine whether the archive image is upside down, and flipping an inverted archive image to obtain an archive image whose text is upright and whose tilt angle meets the specification requirements;
wherein the preprocessing includes white-margin cropping and grayscale conversion.

6. The full-life-cycle smart archive management method according to claim 1, characterized in that the document text processing includes text cleaning, text normalization, word segmentation, text conversion, and sensitive information processing.

7. The full-life-cycle smart archive management method according to claim 6, characterized in that the document image processing and the document text processing further include image-text comparison, specifically:
obtaining an archive image through the document image processing;
obtaining archive text through the document text processing;
comparing the archive image with the archive text to obtain a comparison result;
the comparison result indicating the degree of matching between the image and the text of the archive.

8. The full-life-cycle smart archive management method according to claim 1, characterized in that archives are periodically destroyed and updated, specifically: setting a time limit for archive destruction; evaluating an archive to be destroyed against the archive destruction rules; and, if the archive to be destroyed complies with the archive destruction rules, destroying it and updating the final archive repository.

9. The full-life-cycle smart archive management method according to claim 1, characterized in that the archive storage process further includes inspecting archive information, specifically comprising the following steps:
obtaining the set of associated archive storage areas, under a first archive retrieval path, for a first archive inspection keyword corresponding to the archive storage area to be inspected;
obtaining a second archive inspection keyword from the set of associated archive storage areas, and identifying, within a preset time, a first archive parsing sequence associated with the first archive inspection keyword under the first archive retrieval path and a second archive parsing sequence associated with a second archive retrieval path corresponding to the second archive inspection keyword;
parsing the archive information in the first archive parsing sequence and in the second archive parsing sequence, respectively, based on a preconfigured archive parsing framework, to obtain a first archive information set from the first archive parsing sequence and a second archive information set from the second archive parsing sequence;
determining, from the first archive information set and the second archive information set respectively, associated archive information related to an archive writing problem of the archive storage area to be inspected, and establishing a binding relationship between the associated archive information and the archive storage area to be inspected;
sending and recording the associated archive information and the inspection status of the archive storage area to be inspected.

10. A full-life-cycle smart archive management system, applied to the full-life-cycle smart archive management method according to any one of claims 1 to 9, characterized by comprising an archive collection module, an archive organization module, an archive retrieval module, an archive recording module, an archive storage module, and an archive destruction module, communicatively connected in sequence;
the archive collection module is used to collect archive materials and create a primary archive repository;
the archive organization module organizes archives based on the primary archive repository and generates a final archive repository;
the archive retrieval module is used to establish retrieval rules and perform archive retrieval and sharing;
the archive recording module is used to record the operation history and access history of archives and to mark and trace them;
the archive storage module is used to establish archiving requirements, perform archiving and backup, and store the final archive repository;
the archive destruction module is used to set archive destruction rules and periodically perform destruction and updating.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311712970.XA CN117671714A (en) 2023-12-13 2023-12-13 Intelligent archive management method and system for building full life cycle

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311712970.XA CN117671714A (en) 2023-12-13 2023-12-13 Intelligent archive management method and system for building full life cycle

Publications (1)

Publication Number Publication Date
CN117671714A true CN117671714A (en) 2024-03-08

Family

ID=90080597

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311712970.XA Withdrawn CN117671714A (en) 2023-12-13 2023-12-13 Intelligent archive management method and system for building full life cycle

Country Status (1)

Country Link
CN (1) CN117671714A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119338335A (en) * 2024-12-19 2025-01-21 北京汉龙思琪数码科技有限公司 A standardized management system for the entire digital process of archives
CN120045520A (en) * 2025-04-24 2025-05-27 济南国韵电子技术有限公司 Archives full life cycle management system and method based on cloud computing

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113111032A (en) * 2021-04-20 2021-07-13 河南水利与环境职业学院 Archive management system data archiving method and system
CN113190502A (en) * 2021-01-26 2021-07-30 云南电网有限责任公司信息中心 Archive management method based on deep learning
CN114706960A (en) * 2022-06-06 2022-07-05 济南市干部人事档案服务中心 File information checking method based on cloud computing and file checking terminal
CN115620303A (en) * 2022-10-13 2023-01-17 杭州京胜航星科技有限公司 Personnel file intelligent management system
CN115619656A (en) * 2022-09-19 2023-01-17 郑州大学 Digital file deviation rectifying method and system
CN116308114A (en) * 2023-01-10 2023-06-23 上海付正信息科技有限公司 Comprehensive file management system
CN117113199A (en) * 2023-10-23 2023-11-24 浙江星汉信息技术股份有限公司 File security management system and method based on artificial intelligence


Similar Documents

Publication Publication Date Title
US8571317B2 (en) Systems and methods for automatically processing electronic documents using multiple image transformation algorithms
US20210150338A1 (en) Identification of fields in documents with neural networks without templates
CN117671714A (en) Intelligent archive management method and system for building full life cycle
JP2794085B2 (en) Document image storage / processing method
US20090116755A1 (en) Systems and methods for enabling manual classification of unrecognized documents to complete workflow for electronic jobs and to assist machine learning of a recognition system using automatically extracted features of unrecognized documents
US8023155B2 (en) Imaging system with quality audit capability
CN114218467B (en) Digital archive management method and system
CN113901817A (en) Document classification method and device, computer equipment and storage medium
Tornés et al. Receipt dataset for document forgery detection
Rusiñol et al. Symbol spotting in digital libraries
CN118038478A (en) Intelligent form identification, intelligent merging and intelligent submitting method and system
CN116229493B (en) Cross-modal picture text named entity recognition method and system and electronic equipment
US12175786B2 (en) Systems, methods, and devices for automatically converting explanation of benefits (EOB) printable documents into electronic format using artificial intelligence techniques
Girgensohn et al. Automatic rights management for photocopiers
JP6855711B2 (en) Information processing equipment and information processing programs
Ning et al. Design of an automated data entry system for hand-filled forms
CN114820211B (en) Method, device, computer equipment and storage medium for checking and verifying quality of claim data
CN117251526B (en) Conference file digital management system, method and electronic equipment
Zaripov et al. Methods for Recognizing the Structure of Mounting Schemes in Railway Automation and Remote Control Systems
Varshaneya et al. Information Retrieval for Aviation Applications
Pack Enhancing Document Layout Analysis on Historical Newspapers: Visual Representation, Pseudo-Ground-Truth, and Downscaling
CN115392209A (en) Method, equipment and medium for automatically generating civil case legal documents
Dimov Rapid and Reliable Content Based Image Retrieval
CN117351501A (en) Information input method, device, equipment and storage medium
CN120126161A (en) Document comparison method, device, computer equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20240308