Disclosure of Invention
The invention provides an AI-based electronic file intelligent management method and system for solving the problems mentioned in the background art.
The invention provides an AI-based electronic file intelligent management method, which comprises the following steps:
S1, carrying out multi-level feature extraction on file content based on a convolutional neural network and a recurrent neural network, and training an adaptive classification model based on the extracted features;
S2, constructing a user portrait based on a deep neural network according to the historical operation record and behavior habit of the user, introducing a reinforcement learning mechanism, and dynamically adjusting a recommendation strategy according to the user portrait and the current context;
S3, automatically generating an intelligent folder according to the similarity of file contents and the use habit of a user, dynamically adjusting a file organization structure, combining a natural language processing technology, realizing a semantic-based search function, and optimizing search result sequencing by reinforcement learning;
S4, modeling the file access pattern through a deep neural network and monitoring abnormal access behaviors in real time, wherein, if abnormal behavior is found, the system triggers an early warning mechanism and automatically takes response measures according to a preset security policy.
Further, the step S1 includes:
S11, preprocessing the file content before feature extraction, wherein the preprocessing comprises:
text preprocessing, namely performing word segmentation, stop-word removal and stemming on a text file;
image preprocessing, namely performing scaling, graying and edge detection on an image file;
audio preprocessing, namely performing spectrum analysis and Mel-frequency cepstral coefficient (MFCC) extraction on an audio file;
S12, carrying out multi-level feature extraction on the preprocessed file content by using a convolutional neural network and a recurrent neural network, comprising:
performing a convolution operation on the preprocessed text through the convolutional neural network to extract local features of the text, and modeling the text sequence with the recurrent neural network to capture the context dependencies of the text;
extracting features of the image through a pre-trained convolutional neural network to obtain high-level semantic information of the image;
performing a convolution operation on the MFCC features of the audio through a convolutional neural network, and capturing the temporal features of the audio with a recurrent neural network;
S13, constructing a self-adaptive classification model based on the extracted multi-level features, training the classification model through a large number of labeled file data, inputting the extracted features into the trained self-adaptive classification model, and outputting the classification label of each file.
Further, the step S13 includes:
fusing the features from the three modalities of text, image and audio based on a cross-modal feature fusion mechanism, and dynamically adjusting the weight of each modality's features during fusion through an attention mechanism or a gating mechanism;
on the basis of feature fusion, further integrating the local feature extraction capability of the convolutional neural network and the context/temporal modeling capability of the recurrent neural network;
based on the result of feature fusion, performing hyperparameter tuning and model architecture search through a deep neural network, and integrating the prediction results of a plurality of base classifiers through a random forest;
collecting and organizing a large-scale file data set containing rich label information, and increasing data diversity through data augmentation techniques;
where labeled data is limited, introducing a semi-supervised or unsupervised learning method and using unlabeled data to improve the performance of the model;
based on a user feedback mechanism, continuously optimizing and iterating the model using feedback on its classification results in actual use, selecting samples for manual labeling according to user feedback by means of active learning, and retraining the model to form a closed-loop feedback system.
Further, fusing the features from the three modalities of text, image and audio based on the cross-modal feature fusion mechanism, and dynamically adjusting the weight of each modality's features during fusion through an attention mechanism or a gating mechanism, comprises:
unifying the dimensions of the features extracted from the text, image and audio, and normalizing the features through Z-score standardization;
concatenating the preliminarily aligned and normalized text, image and audio features to form a joint feature vector containing the information of all modalities;
based on a feature fusion network, dynamically learning the relevance and importance among the features of different modalities through an attention mechanism or a self-attention mechanism;
controlling the information flow of the features of different modalities during fusion through a gating mechanism;
selecting an optimal fusion scheme by comparing the evaluation results under different fusion strategies, and further optimizing the fused features according to the evaluation results.
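By way of illustration only, the normalization, weighting and concatenation steps above might be sketched as follows in Python. This is a minimal sketch under stated assumptions, not the claimed implementation: the function names, the example feature values and the raw attention scores are hypothetical, the scores would be learned by the fusion network in practice, and the gating step is omitted for brevity.

```python
import math

def zscore(vec):
    """Standardize one feature vector to zero mean and unit variance."""
    mean = sum(vec) / len(vec)
    var = sum((v - mean) ** 2 for v in vec) / len(vec)
    std = math.sqrt(var) or 1.0  # avoid dividing by zero for constant vectors
    return [(v - mean) / std for v in vec]

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(features, scores):
    """Weight each normalized modality vector and concatenate the results.

    features: dict modality -> vector (assumed already projected to a common dimension)
    scores:   dict modality -> raw attention score (hypothetical; learned in practice)
    """
    names = sorted(features)
    weights = softmax([scores[n] for n in names])
    joint = []
    for name, w in zip(names, weights):
        joint.extend(w * v for v in zscore(features[name]))
    return dict(zip(names, weights)), joint

# Hypothetical per-modality feature vectors and attention scores.
features = {"text": [0.2, 0.9, 0.4], "image": [10.0, 30.0, 20.0], "audio": [5.0, 5.0, 5.0]}
scores = {"text": 2.0, "image": 1.0, "audio": 0.0}
weights, joint = fuse(features, scores)
```

In this sketch a modality with a higher score contributes more strongly to the joint vector, which is the role the attention mechanism plays in the fusion step.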
Further, the step S2 includes:
S21, acquiring a historical operation record of a user from a server background, and performing feature extraction on user behavior data through a deep neural network to construct a user portrait;
S22, taking the user portrait and the current context as a state space of reinforcement learning, and defining an action space of a recommendation system;
S23, rewarding or penalizing the recommendation system through a reward function according to user behaviors, wherein the user behaviors comprise whether the user clicks a recommended file and the dwell time;
S24, training the recommendation strategy through a reinforcement learning algorithm, so that the system dynamically adjusts the recommendation strategy according to the user portrait and the context.
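As a rough illustration of steps S22 to S24, the following Python sketch uses a tabular epsilon-greedy policy, which is a deliberately simplified stand-in for the reinforcement learning algorithm (the text does not name a specific algorithm). The `Recommender` class, the reward formula, the 60-second dwell cap, the file names and the simulated user are all assumptions made for the example.

```python
import random
from collections import defaultdict

def reward(clicked, dwell_seconds):
    """Hypothetical reward: 1 for a click plus up to 1 more for dwell time (capped at 60 s)."""
    if not clicked:
        return 0.0
    return 1.0 + min(dwell_seconds, 60.0) / 60.0

class Recommender:
    """Tabular epsilon-greedy policy over (state, action) value estimates."""

    def __init__(self, actions, epsilon=0.2, lr=0.5, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.lr = lr
        self.q = defaultdict(float)   # (state, action) -> estimated reward
        self.rng = random.Random(seed)

    def recommend(self, state):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)                    # explore
        return max(self.actions, key=lambda a: self.q[(state, a)])  # exploit

    def update(self, state, action, r):
        key = (state, action)
        self.q[key] += self.lr * (r - self.q[key])  # move estimate toward observed reward

agent = Recommender(["report.docx", "budget.xlsx", "photo.png"])
state = ("analyst", "weekday-morning")   # user-portrait segment plus current context
for _ in range(200):
    choice = agent.recommend(state)
    clicked = choice == "budget.xlsx"    # simulated user who only opens the spreadsheet
    agent.update(state, choice, reward(clicked, 30.0))
best = max(agent.actions, key=lambda a: agent.q[(state, a)])
```

The state here is a small tuple for readability; in the scheme described above it would be the high-dimensional combination of user portrait and context.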
Further, the step S22 includes:
collecting and integrating current multi-dimensional context information, and, starting from the user portrait constructed in step S21, dynamically updating the user portrait through an online learning technique using real-time user behavior data;
combining the integrated context information with the dynamically updated user portrait to form a high-dimensional state space;
defining an action space of the recommendation system, and, on the basis of the action space, dynamically adjusting the recommended action according to the user's specific situation based on a context-aware strategy.
Further, the step S3 includes:
S31, calculating the similarity of file contents using cosine similarity, and analyzing the user's usage habits in combination with the user's historical operation records;
S32, automatically generating intelligent folders according to the similarity of file contents and the user's usage habits, and dynamically adjusting the file organization structure;
S33, performing semantic understanding of the query sentence input by the user through a GPT pre-trained language model, constructing an index of the file contents, and supporting both keyword-based and natural-language search;
S34, ranking the search results through a reinforcement learning algorithm so that relevant files are displayed to the user first.
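The similarity calculation of S31 and a very simple form of the folder generation of S32 can be sketched as follows. This is an illustrative sketch only: it represents each file by raw word counts rather than learned semantic features, and the greedy grouping rule, the 0.5 threshold and the file names are assumptions introduced for the example.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def group_files(files, threshold=0.5):
    """Greedy grouping: put each file into the first folder whose seed file is similar enough."""
    vectors = {name: Counter(text.lower().split()) for name, text in files.items()}
    folders = []  # each folder is a list of file names; the first entry is the seed
    for name in files:
        for folder in folders:
            if cosine(vectors[name], vectors[folder[0]]) >= threshold:
                folder.append(name)
                break
        else:
            folders.append([name])  # no similar folder found, start a new one
    return folders

# Hypothetical file contents.
files = {
    "q1.txt": "quarterly budget report revenue",
    "q2.txt": "quarterly budget report costs",
    "trip.txt": "holiday photos beach trip",
}
folders = group_files(files)
```

The two quarterly reports share three of four terms, so they land in one folder, while the unrelated file starts its own.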
Further, the step S32 includes:
on the basis of calculating file content similarity by cosine similarity, performing semantic understanding of the file content through a deep semantic analysis technique, and constructing a multi-dimensional file similarity matrix in combination with file types and content features, so as to capture deeper associations among files;
performing in-depth analysis of the user's historical operation records, and identifying latent patterns in the user's file usage through cluster analysis and association rule mining;
based on file content similarity and user behavior patterns, automatically placing related files into the same folder according to a preset intelligent folder generation algorithm, and introducing a user feedback mechanism that allows the user to name, adjust or delete the automatically generated folders;
dynamically adjusting the file organization structure according to the user's usage habits and workflow, and intelligently recommending or hiding related folders according to the user's current working situation based on a context-awareness technique;
personalizing and optimizing the intelligent folders in combination with the user's personal preferences and working style, and continuously adjusting and optimizing the folder generation rules and display strategies through a reinforcement learning algorithm according to the user's usage feedback on the intelligent folders (such as click rate and dwell time).
Further, the step S4 includes:
S41, extracting characteristics of a file access record of a user, modeling a file access mode of the user through a deep neural network, and capturing normal access behavior characteristics of the user;
S42, monitoring the user's file access behavior in real time, comparing it with the normal access behavior predicted by the model, and judging the access behavior to be abnormal if it deviates from the normal pattern;
S43, if abnormal behavior is found, immediately triggering an early warning mechanism, sending early warning information to the user or an administrator, and automatically taking response measures according to a preset security policy.
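The flow of S41 to S43 (learn a normal pattern, compare new behavior against it, respond) can be illustrated with the short Python sketch below. It substitutes a simple per-feature mean and standard deviation for the deep neural network model, purely to keep the example small; the feature names, the threshold of 3.0 and the response action names are hypothetical.

```python
import math

def fit_baseline(history):
    """Learn the mean and standard deviation of each access feature from normal records."""
    baseline = {}
    for key in history[0]:
        values = [rec[key] for rec in history]
        mean = sum(values) / len(values)
        std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
        baseline[key] = (mean, std or 1.0)  # guard against zero spread
    return baseline

def anomaly_score(record, baseline):
    """Largest absolute z-score across the monitored features."""
    return max(abs(record[k] - m) / s for k, (m, s) in baseline.items())

def respond(record, baseline, threshold=3.0):
    """Return the response measures for one access event (hypothetical security policy)."""
    if anomaly_score(record, baseline) > threshold:
        return ["alert_admin", "lock_file", "write_log"]
    return []

# Hypothetical history of normal accesses: hour of day and files opened per minute.
history = [{"hour": h, "files_per_min": f}
           for h, f in [(9, 2), (10, 3), (11, 2), (14, 4), (15, 3)]]
baseline = fit_baseline(history)
normal = respond({"hour": 10, "files_per_min": 3}, baseline)
burst = respond({"hour": 3, "files_per_min": 40}, baseline)
```

A learned model would replace `fit_baseline` and `anomaly_score`, while the thresholding and policy lookup in `respond` would stay structurally similar.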
The invention provides an AI-based electronic file intelligent management system, which comprises:
the feature extraction module is used for carrying out multi-level feature extraction on file contents based on a convolutional neural network and a recurrent neural network, and training an adaptive classification model based on the extracted features;
The strategy adjustment module is used for constructing a user portrait based on the deep neural network according to the historical operation record and the behavior habit of the user, introducing a reinforcement learning mechanism, and dynamically adjusting a recommendation strategy according to the user portrait and the current context;
the ordering optimization module is used for automatically generating an intelligent folder according to the similarity of file contents and the use habit of a user, dynamically adjusting a file organization structure, combining a natural language processing technology, realizing a semantic-based search function, and optimizing search result ordering by reinforcement learning;
The abnormality processing module is used for modeling the file access mode through the deep neural network, monitoring abnormal access behaviors in real time, triggering an early warning mechanism by the system if the abnormal behaviors are found, and simultaneously automatically taking response measures according to a preset security policy.
The beneficial effects of the invention are as follows. Multi-level features can be extracted from multi-modal files such as text, images and audio through the convolutional neural network and the recurrent neural network, so that the internal information of the files is fully mined and the accuracy of file classification is improved. Feature fusion mechanisms (such as cross-modal feature fusion and attention mechanisms) effectively integrate data of different modalities and improve the accuracy and robustness of the classification model. The adaptive classification model can be continuously optimized on a large amount of labeled data, so that classification results keep improving.
Through reinforcement learning, the recommendation strategy is adjusted according to the user's historical behavior and current context information, achieving personalized recommendation. Combining the user portrait with real-time behavior data captures user preferences and needs accurately; the dynamically updated user portrait and context awareness make recommendations better match the user's actual work needs and improve the user experience of the file management system.
Intelligent folders are automatically generated by combining similarity analysis of file contents with the user's usage habits, and the file organization structure is dynamically adjusted according to user behavior. The semantics-based file organization reduces the user's search cost and the difficulty of file management, and the user feedback mechanism and personalized optimization make the generated folders better fit the user's workflow, which enhances the flexibility and practicability of the file system. Based on natural language processing, users can quickly locate the desired files through semantic understanding and keyword search.
Reinforcement learning is used to optimize the ranking of search results so that highly relevant files are displayed first, improving search efficiency and accuracy. Because the search ranking is dynamically adjusted, the system is continuously optimized according to user behavior and feedback. Through deep learning modeling and real-time monitoring of file access patterns, abnormal access behaviors can be identified efficiently and early warnings issued in time. The system automatically takes appropriate security measures according to the security policy to protect file data; abnormal behavior detection improves system security and also avoids risks caused by human oversight.
Detailed Description
The preferred embodiments of the present invention will be described below with reference to the accompanying drawings, it being understood that the preferred embodiments described herein are for illustration and explanation of the present invention only, and are not intended to limit the present invention.
In one embodiment of the present invention, as shown in FIG. 1, an AI-based electronic file intelligent management method includes:
S1, carrying out multi-level feature extraction on file content based on a convolutional neural network and a recurrent neural network, and training an adaptive classification model based on the extracted features;
S2, constructing a user portrait based on a deep neural network according to the historical operation record and behavior habit of the user, introducing a reinforcement learning mechanism, and dynamically adjusting a recommendation strategy according to the user portrait and the current context (such as time, place, equipment state and the like);
S3, automatically generating an intelligent folder according to the similarity of file contents and the use habit of a user, dynamically adjusting a file organization structure, combining a natural language processing technology, realizing a semantic-based search function, and optimizing search result sequencing by reinforcement learning;
S4, modeling the file access pattern through a deep neural network and monitoring abnormal access behaviors in real time, wherein, if abnormal behavior is found, the system triggers an early warning mechanism and automatically takes response measures according to a preset security policy.
The working principle of this technical scheme is as follows. A convolutional neural network (CNN) is used to process image and audio content: low-level features of the multimedia content (such as edges, textures and audio spectra) are extracted through convolutional and pooling layers and are then combined into high-level semantic features through fully connected layers. A recurrent neural network (RNN) or one of its variants (such as LSTM or GRU) is used to process text content: its recurrent structure captures the sequential dependencies in the text (such as sentence structure and paragraph continuity) and thereby extracts the semantic features of the text. The extracted semantic features are input into an adaptive classification model, which learns through training to adjust its classification strategy according to the complexity of the file content. For example, for complex files containing multiple media types the model may employ a more complex classification strategy, while for files with more uniform content a simpler strategy may be employed. During training, the model continuously adjusts its parameters according to labeled data (files with known classes) until a preset classification accuracy is reached.
Based on a deep neural network, the user's historical operation records and behavior habits are analyzed, features such as preferences, interests and usage habits are extracted, and these features are combined into a user portrait representing the user's characteristics and needs. A reinforcement learning mechanism is introduced to dynamically adjust the recommendation strategy according to the user portrait and the current context (such as time, location and equipment state). The reinforcement learning model continually tries different recommendation strategies and evaluates their effect from user feedback (such as clicks, browsing time and saving). Through continuous iteration and optimization, the model finds the recommendation strategy best suited to the user and provides personalized file recommendations.
Intelligent folders are automatically generated according to the similarity of file contents and the user's usage habits: a clustering algorithm (such as K-means or DBSCAN) aggregates similar files to form a folder, and the folder name may be generated automatically using natural language processing to reflect the subject or content of the files in it. An intelligent folder may dynamically adjust its structure according to the user's usage habits and updates to file content; for example, files that are frequently used or accessed together may be organized more closely. A semantic-based search function is implemented in combination with natural language processing: the user searches by entering a natural language query, and the system matches the semantics of the query against the semantics of the file contents to find the most relevant files. Reinforcement learning is used to optimize the ranking of search results so that the most relevant files are listed first.
The file access pattern is modeled through a deep neural network, which learns the normal access pattern, including features such as access time, access frequency, access device and accessing user. File access behavior is monitored in real time and compared with the normal access pattern. If abnormal behavior (such as unauthorized access or frequent data transmission) is found, an early warning mechanism is triggered and response measures are automatically taken according to a preset security policy; these may include locking the file to prevent further access, writing a log for subsequent analysis, and notifying an administrator for manual intervention.
The beneficial effects of this technical scheme are as follows. Multi-level feature extraction on file contents through the convolutional neural network (CNN) and the recurrent neural network (RNN) captures the semantic features of text, images, audio and other multimedia content. The adaptive classification model adjusts its classification strategy according to the complexity of the file content, which improves classification accuracy. The method handles many kinds of file content, including text, images and audio, which widens the application range of electronic file management, and the deep learning models can process large numbers of files efficiently, which improves the efficiency of file management.
Building the user portrait from historical operation records and behavior habits gives a more accurate understanding of the user's needs and preferences. The reinforcement learning mechanism dynamically adjusts the recommendation strategy according to the user portrait and the current context, improving user satisfaction and experience. Personalized recommendation shortens the time users spend looking for files and improves work efficiency; at the same time, recommending files the user is likely to be interested in helps optimize the use of storage resources and improves storage efficiency.
Intelligent folders are generated automatically so that files are organized more conveniently. The semantic-based search function lets users search in natural language, and reinforcement learning optimizes the ranking of search results so that the most relevant files are placed first, further improving search accuracy.
Modeling the file access pattern through the deep neural network allows abnormal access behaviors such as unauthorized access and frequent data transmission to be monitored in real time. Once abnormal behavior is found, the system immediately triggers the early warning mechanism and automatically takes response measures according to the preset security policy, such as locking the file, writing a log and notifying an administrator, which effectively protects the security and integrity of the files. The real-time monitoring and early warning mechanism discovers potential security threats in time and gives the administrator enough time to handle the abnormal behavior. The preset security policy handles abnormal behavior automatically, which reduces the delay and errors of manual intervention and improves the efficiency and accuracy of emergency response.
In one embodiment of the present invention, the step S1 includes:
S11, preprocessing the file content before feature extraction, wherein the preprocessing comprises:
text preprocessing, namely performing word segmentation, stop-word removal and stemming on a text file;
image preprocessing, namely performing scaling, graying and edge detection on an image file;
audio preprocessing, namely performing spectrum analysis and Mel-frequency cepstral coefficient (MFCC) extraction on an audio file;
S12, carrying out multi-level feature extraction on the preprocessed file content by using a convolutional neural network and a recurrent neural network, comprising:
performing a convolution operation on the preprocessed text through the convolutional neural network to extract local features of the text, and modeling the text sequence with the recurrent neural network to capture the context dependencies of the text;
extracting features of the image through a pre-trained convolutional neural network to obtain high-level semantic information of the image;
performing a convolution operation on the MFCC features of the audio through a convolutional neural network, and capturing the temporal features of the audio with a recurrent neural network;
S13, constructing a self-adaptive classification model based on the extracted multi-level features, training the classification model through a large number of labeled file data, inputting the extracted features into the trained self-adaptive classification model, and outputting the classification label of each file.
The working principle of this technical scheme is as follows. The preprocessing of the file content includes:
Text preprocessing:
Word segmentation: splitting a text file into individual words or phrases for subsequent processing.
Stop-word removal: removing words that are common in text but contribute little to its meaning, such as "have".
Stemming: reducing words to their base form, for example reducing "running" to "run", so as to reduce the influence of word-form diversity on subsequent processing.
Image preprocessing:
Scaling: adjusting the image size to meet the requirements of subsequent processing or model input.
Graying: converting a color image into a grayscale image, which simplifies the image information and reduces the amount of computation.
Edge detection: detecting edges in the image through an algorithm and extracting the structural information of the image.
Audio preprocessing:
Spectrum analysis: converting the audio signal into a spectral representation in order to analyze the frequency content of the audio.
Mel-frequency cepstral coefficient (MFCC) extraction: extracting MFCCs from the audio signal; MFCCs are a feature commonly used in audio signal processing and reflect the speech characteristics of the audio.
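The text preprocessing steps listed above can be illustrated with a short Python sketch. It is a simplified example under stated assumptions: the stop-word list is a small illustrative set, the suffix-stripping `stem` function is a crude stand-in for a real stemming algorithm, and whitespace-delimited English is assumed, so no dedicated word segmenter is needed.

```python
import re

# Illustrative stop-word list; a real system would use a fuller, language-specific list.
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "and", "to", "in", "have"}

def stem(word):
    """Very rough suffix stripping; a real system would use a proper stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    # Collapse a doubled final consonant left by suffix removal, e.g. "runn" -> "run".
    if len(word) > 2 and word[-1] == word[-2] and word[-1] not in "aeiou":
        word = word[:-1]
    return word

def preprocess(text):
    """Segment into lowercase word tokens, drop stop words, then stem what remains."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [stem(t) for t in tokens if t not in STOP_WORDS]

tokens = preprocess("The users are running reports and have indexed documents")
```

The resulting token list is what would be passed on to embedding and feature extraction.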
The preprocessed text is convolved by a convolutional neural network to extract local features of the text, such as character or word combination patterns, and the text sequence is modeled by a recurrent neural network to capture the context dependencies of the text, namely the semantic relations among words. Features are extracted from the preprocessed image by a pre-trained convolutional neural network (such as VGG or ResNet) to obtain high-level semantic information of the image, such as the shape, color and texture of objects. The preprocessed audio MFCC features are convolved by the CNN to extract local features of the audio, and the RNN captures the temporal features of the audio, namely how the audio signal changes over time.
An adaptive classification model is constructed based on the extracted multi-level features. The classification model is trained with a large amount of labeled file data, comprising files of known class and their corresponding features; during training the model's parameters are continuously adjusted to reduce classification error and improve classification accuracy. The extracted features are then input into the trained adaptive classification model, which outputs a classification label for each file. The classification label represents the class or subject of the file and can be used for organizing, searching and managing files.
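To make the division of labor between the convolution and the recurrent modeling concrete, the following Python sketch applies a one-dimensional convolution to a short sequence and then passes the result through a single recurrent unit. It is a toy illustration only: the kernel, the weights `w_in` and `w_rec`, and the input values are fixed, hypothetical numbers, whereas in a trained network they would be learned and the inputs would be embedding or MFCC vectors.

```python
import math

def conv1d(sequence, kernel):
    """Slide the kernel over the sequence (valid padding, stride 1), as a CNN layer does."""
    k = len(kernel)
    return [sum(sequence[i + j] * kernel[j] for j in range(k))
            for i in range(len(sequence) - k + 1)]

def rnn_final_state(inputs, w_in=0.5, w_rec=0.8):
    """Run a one-unit recurrent cell over the inputs and return its last hidden state."""
    h = 0.0
    for x in inputs:
        h = math.tanh(w_in * x + w_rec * h)  # new state depends on input and previous state
    return h

signal = [1.0, 2.0, 3.0, 4.0, 2.0]        # e.g. one embedding or MFCC channel over time
local = conv1d(signal, [1.0, 0.0, -1.0])  # local differences between neighbouring positions
summary = rnn_final_state(local)
```

The convolution only ever sees a window of three positions, while the recurrent state carries information across the whole sequence, so reordering the inputs changes the final state.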
The beneficial effects of this technical scheme are as follows. Preprocessing the file content removes noise and simplifies the data, so that subsequent feature extraction is more accurate. For example, word segmentation, stop-word removal and stemming in text preprocessing reduce the diversity of word forms and make text features more consistent; scaling, graying and edge detection in image preprocessing highlight the key information of the image and reduce interference from irrelevant information; and spectrum analysis and MFCC extraction in audio preprocessing capture the key features of the audio signal and provide a basis for subsequent analysis. Preprocessing also reduces the volume of data to be processed later, which improves overall processing efficiency; for example, graying an image reduces the amount of computation and speeds up image feature extraction.
Multi-level feature extraction on the preprocessed file content with the convolutional neural network (CNN) and the recurrent neural network (RNN) captures the rich features of different file types such as text, images and audio. The CNN is good at extracting local features, such as character combinations in text, edges and textures in images, and spectral features in audio; the RNN captures the context dependencies of sequence data, such as semantic coherence in text and temporal features in audio. Multi-level feature extraction yields deep features of the file content that are more abstract and general, which improves the generalization ability of the classification model.
The adaptive classification model can automatically adjust its classification strategy according to the complexity of the file content, for example by adjusting model parameters or selecting different feature combinations. This adaptability lets the model handle file classification tasks of different types and difficulty. Training the classification model on a large amount of labeled file data lets it learn the mapping between file features and classification labels, so that the trained model outputs accurate classification labels for the input features and the accuracy of file classification is improved. The adaptive classification model classifies files automatically, which reduces the workload of manual classification and improves the efficiency of file management. Accurate classification results also help users find the required files faster and make file retrieval more convenient.
In one embodiment of the present invention, the step S13 includes:
S131, fusing the features from the three modalities of text, image and audio based on a cross-modal feature fusion mechanism, and dynamically adjusting the weight of each modality's features during fusion through an attention mechanism or a gating mechanism;
S132, on the basis of feature fusion, further integrating the local feature extraction capability of the convolutional neural network and the context/temporal modeling capability of the recurrent neural network;
S133, based on the result of feature fusion, performing hyperparameter tuning and model architecture search (such as tuning of network width and depth) through a deep neural network, and integrating the prediction results of a plurality of base classifiers through a random forest;
S134, collecting and organizing a large-scale file data set containing rich label information, and increasing data diversity through data augmentation techniques;
S135, where labeled data is limited, introducing a semi-supervised or unsupervised learning method and using unlabeled data to improve the performance of the model;
S136, based on a user feedback mechanism, continuously optimizing and iterating the model using feedback on its classification results in actual use, selecting the most informative samples for manual labeling according to user feedback by means of active learning, and retraining the model to form a closed-loop feedback system.
The working principle of the above technical scheme is as follows. Text, image and audio features have different data structures and representations, so a cross-modal feature fusion mechanism is needed to fuse them. During fusion, the weight of each modality's features is dynamically adjusted through an attention mechanism or a gating mechanism, so as to make full use of the complementarity between modalities and reduce the influence of noise and redundant information.
For text, word embedding (such as Word2Vec or BERT) is combined with a CNN-RNN architecture: the word embedding preserves the semantic information of words, the CNN extracts local features of the text, and the RNN captures the structural dependencies and contextual relations of sentences. For images, an advanced CNN model such as a residual network (ResNet) is introduced; the extracted features contain the high-level semantic information of the image and are combined with the global context information of the image. For audio, spectral features are extracted by processing Mel-frequency cepstral coefficients (MFCC) with a deep CNN, and a long short-term memory network (LSTM) or gated recurrent unit (GRU) is combined to capture the temporal dependencies of the audio. On the basis of the fused features, hyperparameter tuning and model architecture search are performed through a deep neural network to find the optimal model configuration.
A random forest reduces the overfitting risk of a single model by constructing multiple decision trees and combining their predictions. A large-scale file data set containing rich label information is collected and organized, covering various files such as text, images and audio. Data enhancement techniques (such as synonym replacement for text, rotation/cropping for images, and shifting for audio) increase data diversity and generate more training samples, helping the model learn more robust feature representations and improving its generalization ability. Where labeled data is limited, a semi-supervised or unsupervised learning method is introduced and unlabeled data is used to improve the performance of the model. Semi-supervised learning uses unlabeled data to improve generalization; for example, a self-training method uses the model to predict unlabeled data and adds the high-confidence predictions to the training set as new labeled data. Unsupervised learning discovers the internal structure and patterns of the data through methods such as cluster analysis and thereby guides the labeling process; for example, clustering results can be used for a preliminary division of data categories to speed up labeling.
Based on the user feedback mechanism, the model is continuously optimized and iterated through feedback on its classification results in practical application; active learning selects the most informative samples according to user feedback for manual labeling, and the model is retrained to form a closed-loop feedback system. Active learning makes efficient use of limited labeling resources by selecting the most representative samples for labeling, and, combined with user feedback, allows the model to be continuously adjusted and optimized so that it better fits the requirements and changes of practical application.
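The sample selection performed by active learning can be illustrated with a minimal sketch. It assumes that "most informative" is measured by the entropy of the classifier's predicted class probabilities, which is one common criterion and is not specified by the embodiment; the function names, file names and probability values are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy (natural log) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_for_labeling(predictions, budget):
    """Return the ids of the `budget` samples with the most uncertain predictions.

    predictions: dict mapping a sample id to its predicted class probabilities.
    """
    ranked = sorted(predictions, key=lambda sid: entropy(predictions[sid]), reverse=True)
    return ranked[:budget]

# Hypothetical classifier outputs for four unlabeled files (three classes each)
predictions = {
    "report.docx": [0.98, 0.01, 0.01],
    "scan_017.png": [0.40, 0.35, 0.25],
    "memo.txt": [0.70, 0.20, 0.10],
    "call.wav": [0.34, 0.33, 0.33],
}

# The two files the model is least sure about are sent for manual labeling
to_label = select_for_labeling(predictions, budget=2)
print(to_label)
```

Files whose predicted distribution is close to uniform are selected first, so the limited labeling effort is spent where the model is least certain; the newly labeled samples would then be added to the training set before retraining.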
The beneficial effects of the above technical scheme are as follows. The cross-modal feature fusion mechanism makes full use of information from the text, image and audio modalities and improves the comprehensiveness and accuracy of the classification model. Dynamically adjusting the weight of each modality's features with an attention mechanism or a gating mechanism copes with the heterogeneity between modal features and further improves the fusion effect, so that the model understands the diversity and complexity of file contents more accurately and classification accuracy and robustness are improved. Combining the local feature extraction capability of the convolutional neural network with the context/time-sequence modeling capability of the recurrent neural network mines the feature information of file contents in depth. Adopting a targeted processing strategy for each file type (text, image, audio) lets the model capture the detailed features and temporal relations of file contents more finely, improving the granularity and accuracy of classification. Hyperparameter tuning and model architecture search through a deep neural network find the optimal model configuration and improve the performance of the model.
Integrating the prediction results of a plurality of base classifiers with a random forest further reduces the overfitting risk of the model and improves classification stability, so that the model adapts better to different data distributions and task requirements. Collecting and organizing a large-scale file data set with rich label information provides sufficient data support for model training, so that the model learns richer feature representations. Introducing semi-supervised or unsupervised learning where labeled data is limited uses unlabeled data to improve model performance, reduces the cost of data labeling, and achieves a better classification effect with limited labeled data. Continuously optimizing and iterating the model based on the user feedback mechanism keeps the model suited to practical application requirements. Using active learning to select the most informative samples for manual labeling and retraining further improves model performance, makes efficient use of limited labeling resources, and improves classification accuracy and user experience.
In one embodiment of the present invention, the step S131 includes:
performing dimension unification on the features extracted from text, images and audio, and normalizing the features through Z-score standardization;
concatenating the preliminarily aligned and normalized text, image and audio features to form a joint feature vector containing the information of all modalities;
based on a feature fusion network, dynamically learning the relevance and importance between different modal features through an attention mechanism or a self-attention mechanism;
controlling the information flow of different modal features during fusion through a gating mechanism;
evaluating the quality of the fused features with a plurality of evaluation indexes, selecting the optimal fusion scheme by comparing the evaluation results under different fusion strategies, and further optimizing the fused features according to the evaluation results, wherein the quality of the fused features is evaluated by the following formula:
Wherein the four weight coefficients set the relative importance of the scoring terms; the feature consistency score is a penalty term that computes the sum of inconsistencies between different modal features; the formula uses the mutual information between the i-th and j-th modal features, with N representing the number of modalities; the discrimination score is determined by the inter-class divergence and the intra-class divergence, where a larger inter-class divergence and a smaller intra-class divergence indicate better discrimination of the features; P represents the improvement score of the features on task performance; and the minimum and maximum values of the performance score are used to normalize P to the [0, 1] interval.
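Two of the quantities defined above can be illustrated with a minimal sketch. It assumes that the discrimination score takes the form of a ratio of between-class scatter to within-class scatter, which is one common way to express "larger inter-class divergence and smaller intra-class divergence is better"; it does not reproduce the combined evaluation formula of the embodiment, and all function names and sample values are hypothetical.

```python
def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def discrimination_score(features_by_class):
    """Inter-class divergence divided by intra-class divergence (assumed form)."""
    all_vectors = [v for vs in features_by_class.values() for v in vs]
    global_mean = mean(all_vectors)
    inter = 0.0
    intra = 0.0
    for vectors in features_by_class.values():
        class_mean = mean(vectors)
        inter += len(vectors) * sq_dist(class_mean, global_mean)
        intra += sum(sq_dist(v, class_mean) for v in vectors)
    return inter / intra if intra > 0 else float("inf")

def normalize_performance(p, p_min, p_max):
    """Min-max normalization of the performance score P to the [0, 1] interval."""
    return (p - p_min) / (p_max - p_min)

# Hypothetical fused features: well separated classes versus overlapping classes
tight = {"a": [[0, 0], [0, 1]], "b": [[10, 10], [10, 11]]}
loose = {"a": [[0, 0], [0, 8]], "b": [[2, 2], [2, 10]]}

print(discrimination_score(tight), discrimination_score(loose))
print(normalize_performance(0.85, p_min=0.5, p_max=1.0))
```

A fusion strategy whose fused features give the higher discrimination score separates the classes more clearly, and the normalized performance score lets strategies evaluated on different scales be compared.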
The working principle of the above technical scheme is as follows. Features extracted from text, images and audio often have different dimensions and scales, so dimension unification is performed on them. Dimension unification methods include feature scaling (such as min-max scaling and Z-score standardization), principal component analysis (PCA), and autoencoders; these methods map features of different dimensions into the same dimensional space while preserving as much of the original feature information as possible. Z-score standardization is a common feature scaling method: by computing the mean and standard deviation of each feature, it converts feature values to a form with mean 0 and standard deviation 1, eliminating scale differences between features.
The preliminarily aligned and normalized text, image and audio features are concatenated to form a joint feature vector containing the information of all modalities. Concatenation extends the feature vectors of the different modalities along the dimension axis, so that the joint feature vector simultaneously contains information from every modality; forming the joint feature vector is a key step in cross-modal feature fusion. The feature fusion network is a neural network structure capable of learning the relevance and importance between different modal features. Through an attention mechanism or a self-attention mechanism, it dynamically captures the interactions between modal features and adjusts the feature weights according to the strength of those interactions. The gating mechanism controls the information flow of the different modal features during fusion, ensuring that important features are fully used and that redundant or noisy features are suppressed.
The quality of the fused features is evaluated with a plurality of evaluation indexes, whose selection depends on the specific application scenario and task requirements; for example, classification accuracy and F1 score are commonly used in classification tasks. According to the evaluation results, the optimal fusion scheme is selected and the fused features are further optimized as required.
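The normalization, weighting and concatenation steps described above can be sketched as follows. This is a minimal illustration and not the fusion network of the embodiment: in a trained network the relevance scores would be learned, whereas here they are fixed hypothetical numbers, and all function names and feature values are assumptions.

```python
import math

def standardize_columns(rows):
    """Z-score each feature column across samples (mean 0, standard deviation 1)."""
    stats = []
    for col in zip(*rows):
        mu = sum(col) / len(col)
        sigma = math.sqrt(sum((x - mu) ** 2 for x in col) / len(col))
        stats.append((mu, sigma))
    return [
        [(x - mu) / sigma if sigma > 0 else 0.0 for x, (mu, sigma) in zip(row, stats)]
        for row in rows
    ]

def softmax(scores):
    """Turn raw relevance scores into attention weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(modal_vectors, relevance_scores):
    """Scale each modality vector by its attention weight, then concatenate."""
    weights = softmax(relevance_scores)
    joint = []
    for vec, w in zip(modal_vectors, weights):
        joint.extend(w * x for x in vec)
    return joint, weights

# Two samples of a hypothetical text feature with very different scales per column
text_rows = [[1.0, 200.0], [3.0, 400.0]]
standardized = standardize_columns(text_rows)

# One sample: text (2 dims), image (1 dim) and audio (1 dim) features after normalization
joint, weights = fuse([standardized[0], [0.5], [2.0]], relevance_scores=[2.0, 1.0, 0.0])
print(standardized, weights, joint)
```

After standardization both text columns lie on the same scale, and the softmax weights let the modality with the highest relevance score contribute most to the joint feature vector.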
The beneficial effects of the above technical scheme are as follows. Mapping features of different modalities into the same dimensional space through feature scaling, principal component analysis (PCA) or an autoencoder, and normalizing them, effectively resolves the heterogeneity and scale differences between features, so that features of different modalities can be compared and fused within the same mathematical framework and a sound basis is provided for the subsequent fusion. Concatenating the preliminarily aligned and normalized text, image and audio features yields a joint feature vector that reflects the diversity and complexity of file contents and provides a rich information source for subsequent feature fusion and classification. Dynamically learning the relevance and importance between modal features through an attention or self-attention mechanism achieves effective fusion, highlights important features, suppresses redundant or noisy features, and improves the performance and robustness of the model. Controlling the information flow of different modal features through a gating mechanism (such as the gating structures in a GRU or LSTM) dynamically adjusts the weight of each feature for the specific task. Evaluating the fused features with a plurality of evaluation indexes and selecting the optimal fusion scheme optimizes the fusion quality. The method is suitable for fusing text, image and audio features, can be extended to other modalities such as video and sensor data, and can be applied flexibly to cross-modal tasks such as multimedia retrieval, sentiment analysis and event detection.
With respect to the formula: by computing the sum of inconsistencies between different modal features, the penalty term measures and optimizes the consistency of feature fusion, helps reduce information conflicts between modalities, and improves the accuracy and reliability of the fused features. By computing the inter-class and intra-class divergence, the formula evaluates the discrimination of the features; the larger the inter-class divergence and the smaller the intra-class divergence, the better the discrimination, which is crucial to the accuracy of classification tasks. The performance improvement score quantifies the contribution of the features to the final task performance, and the minimum and maximum performance scores normalize P to the [0, 1] interval so that different features or fusion strategies can be compared fairly. The weight coefficients allow the importance of the different scoring terms to be adjusted for different application scenarios and requirements. The formula thus combines feature consistency, discrimination and performance improvement into a comprehensive view of fusion quality, which helps select the optimal fusion scheme and further optimize the fused features according to the evaluation results.
In one embodiment of the present invention, the step S2 includes:
S21, acquiring the historical operation records of a user from a server background, and performing feature extraction on the user behavior data through a deep neural network to construct a user portrait;
S22, taking the user portrait and the current context (such as time, place and device state) as the state space of reinforcement learning, and defining the action space of the recommendation system, such as recommending files and adjusting the recommendation order;
S23, rewarding or penalizing the recommendation system according to user behavior through a reward function, wherein the user behavior includes whether a recommended file is clicked and the dwell time;
S24, training the recommendation strategy through a reinforcement learning algorithm, so that the system dynamically adjusts the recommendation strategy according to the user portrait and the context.
The working principle of the above technical scheme is as follows. The system first obtains the user's historical operation records from the server background; these records include key information such as file access frequency, editing time and file type preference. A deep neural network then extracts features from this user behavior data. Because a deep neural network automatically learns complex patterns and features in data, it extracts the key information in user behavior effectively, and the system builds the user portrait from the extracted features. The user portrait, which includes the user's interest preferences, working habits and the like, is a comprehensive description and summary of the user's behavior pattern.
The user portrait and the current context (such as time, place and device state) serve as the state space of reinforcement learning. The state space reflects the current environment and the state of the user and is the basis on which the reinforcement learning algorithm makes decisions. The action space of the recommendation system contains all actions the system may take, such as recommending files or adjusting the recommendation order; an action is the decision the recommendation system makes for the current state.
A reward function rewards or penalizes the recommendation system according to user behavior. The reward function is a core component of the reinforcement learning algorithm and determines the reward value obtained after the recommendation system takes an action. User behavior includes key indicators such as whether a recommended file is clicked and the dwell time. When the user clicks and carefully reads a file provided by the recommendation system, the system gives a positive reward (+1 point), indicating that the recommendation system has made a correct decision. If the user ignores the recommended file, the system gives a slight negative reward (-0.1 point), indicating that the decision strategy needs improvement. This reward design motivates the recommendation system to provide files that better meet user needs, improving user satisfaction and the performance of the recommendation system.
The recommendation strategy is trained with a reinforcement learning algorithm, which gradually finds the optimal recommendation strategy through continuous trial and error. During training, the system dynamically adjusts the recommendation strategy according to the user portrait and the context. Through continuous iteration and optimization, the recommendation system gradually adapts to the behavior patterns and changing needs of different users, and the trained system recommends files intelligently according to the user portrait and the current context.
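The reward design and the trial-and-error training can be sketched as follows. The reward values +1 and -0.1 come from the description above; everything else is an assumption made for illustration: the embodiment does not name a specific reinforcement learning algorithm, so tabular Q-learning is used here, and the dwell-time threshold, the neutral reward for a brief click, and all class, state and action names are hypothetical.

```python
import random

CLICK_AND_READ_REWARD = 1.0  # from the description: user clicks and reads the file
IGNORED_REWARD = -0.1        # from the description: user ignores the recommendation
MIN_DWELL_SECONDS = 30       # hypothetical threshold for "reads carefully"

def reward(clicked, dwell_seconds):
    """Map observed user behavior to a reward value."""
    if clicked and dwell_seconds >= MIN_DWELL_SECONDS:
        return CLICK_AND_READ_REWARD
    if not clicked:
        return IGNORED_REWARD
    return 0.0  # assumption: a click with a short dwell time is treated as neutral

class QRecommender:
    """Tabular Q-learning over (state, action) pairs with epsilon-greedy selection."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = {}
        self.rng = random.Random(seed)

    def value(self, state, action):
        return self.q.get((state, action), 0.0)

    def choose(self, state):
        """Explore with probability epsilon, otherwise pick the best-valued action."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.value(state, a))

    def update(self, state, action, r, next_state):
        """Standard Q-learning update toward r + gamma * max_a Q(next_state, a)."""
        best_next = max(self.value(next_state, a) for a in self.actions)
        old = self.value(state, action)
        self.q[(state, action)] = old + self.alpha * (r + self.gamma * best_next - old)

# Hypothetical state: a coarse user portrait label plus the current context
state = ("analyst", "workday_morning")
agent = QRecommender(["recommend_report", "recommend_music"], epsilon=0.0)
for _ in range(50):
    agent.update(state, "recommend_report", reward(True, 120), state)
    agent.update(state, "recommend_music", reward(False, 0), state)
print(agent.choose(state))
```

After repeated feedback the action that earned positive rewards has the higher value for this state, so the greedy choice recommends it; a practical system would use a richer state representation than the two labels shown here.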
The beneficial effects of the above technical scheme are as follows. Extracting features from the user's historical operation records through a deep neural network and constructing a user portrait, which includes interest preferences, working habits and the like, allows the recommendation system to understand the personalized needs of the user in depth and to provide personalized recommendations, improving recommendation accuracy and user satisfaction. The user finds files of interest or need more easily, which improves work efficiency and experience. Using the current context (such as time, place and device state) as part of the reinforcement learning state space allows the recommendation system to take more environmental factors into account and to adjust the recommendation strategy dynamically to the user's current environment, for example recommending work-related files during working hours and entertainment content during leisure time. Such context-aware recommendation better matches the actual needs of users. Training the recommendation strategy through a reinforcement learning algorithm allows the system to adjust the strategy dynamically according to the user portrait and the context, and to keep learning and optimizing it as user behavior changes and new needs arise, which makes the recommendation system more intelligent and flexible. The reward function, which rewards or penalizes the recommendation system according to user behavior, motivates the system to provide files that better meet user needs, improves recommendation accuracy and user satisfaction, and promotes the self-optimization and improvement of the recommendation system.
In one embodiment of the present invention, the step S22 includes:
collecting and integrating current multidimensional context information, and dynamically updating the user portraits through an online learning technology based on the user portraits constructed in the step S21 by combining with real-time user behavior data;
combining the integrated context information with the dynamically updated user representation to form a high-dimensional state space;
defining the action space of the recommendation system and, on the basis of the action space, dynamically adjusting the recommended actions according to the user's specific situation based on a context-aware strategy.
The working principle of the above technical scheme is as follows. The system first collects and integrates the current multidimensional context information, including time (for example, working habits differ between workdays and weekends and between morning and evening), place (the office, the home or being on the move affects file access preferences), device state (screen size, whether peripherals are connected, battery level and the like affect the convenience of reading or editing), and network environment (large files may be favored on a high-speed network, while lightweight documents are recommended on a low-speed network). Based on the user portrait constructed in S21 and combined with real-time user behavior data (such as newly added file accesses, editing behavior and preference changes), the system dynamically updates the user portrait through an online learning technique, so that the portrait reflects the user's latest interests and preferences in real time.
The integrated context information is combined with the dynamically updated user portrait to form a high-dimensional state space. The state space contains both the basic characteristics of the user (such as interest preferences and working habits) and the current context information (such as time, place, device state and network environment), which provides a rich context for the recommendation strategy and makes recommendations more accurate and personalized.
The action space first includes basic file recommendation operations, such as recommending specific files or folders, which meet the basic needs of users. Beyond these, the action space introduces finer and more personalized action options: for example, recommending a collection of related files based on a prediction of the user's current workflow (such as templates and commonly used charts when a report is being written), intelligent ordering based on file type (listing frequently edited files first), and providing a content-based summary preview or a quick-edit entry. These options are intended to make recommendations more practical and to improve the user experience. The action space also includes an immediate response mechanism to user feedback: the system adjusts the subsequent recommendation strategy according to whether the user views the recommended content, which allows it to keep learning and optimizing the recommendation strategy.
On the basis of the action space, the recommended actions are dynamically adjusted to the user's specific situation based on a context-aware strategy. For example, short summaries of important files are recommended first shortly before a meeting, while more references and in-depth reading material may be recommended during a long period of continuous work. Such adjustments increase the relevance and practicality of the recommended content and further enhance the user experience.
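The construction of the state described above can be sketched as follows. The embodiment does not specify the online learning technique or the encoding of the context, so this illustration assumes an exponential moving average for the portrait update and a simple numeric / one-hot encoding for the context; all function names, category lists and values are hypothetical.

```python
def update_portrait(portrait, observed, rate=0.2):
    """Blend newly observed behavior features into the portrait (moving average)."""
    return [(1 - rate) * p + rate * o for p, o in zip(portrait, observed)]

def encode_context(hour, is_weekend, location, battery_level, network):
    """Encode time, place, device state and network environment as numbers."""
    locations = ["office", "home", "mobile"]  # hypothetical location categories
    networks = ["fast", "slow"]               # hypothetical network categories
    return (
        [hour / 23.0, 1.0 if is_weekend else 0.0, battery_level]
        + [1.0 if location == name else 0.0 for name in locations]
        + [1.0 if network == name else 0.0 for name in networks]
    )

def build_state(portrait, context):
    """Concatenate the user portrait and the context into one state vector."""
    return list(portrait) + list(context)

# Hypothetical portrait: preference for documents, images and audio files
portrait = [0.8, 0.1, 0.1]
# The user has just opened an image file, so the portrait shifts toward images
portrait = update_portrait(portrait, observed=[0.0, 1.0, 0.0])

context = encode_context(hour=9, is_weekend=False, location="office",
                         battery_level=0.5, network="fast")
state = build_state(portrait, context)
print(state)
```

The resulting vector combines slowly changing user characteristics with the current situation, and it would serve as the input on which a recommendation policy selects an action.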
The beneficial effects of the above technical scheme are as follows. By collecting and integrating multidimensional context information such as time, place, device state and network environment, the system understands the user's current environment and situation more comprehensively, so that the recommendation system predicts the user's behavior and needs more accurately and recommendation accuracy and relevance are improved. Dynamically updating the user portrait constructed in S21 through an online learning technique, combined with real-time user behavior data, allows the system to follow changes in user interests and behavior and keeps recommendations timely and personalized. Combining the integrated context information with the dynamically updated user portrait forms a high-dimensional state space that provides a rich context for the recommendation strategy, so that the system considers both the basic characteristics of the user and the current situation. In addition to basic file recommendation operations, finer and more personalized action options are introduced, such as recommending related file collections according to the workflow and intelligent ordering based on file type, which makes recommendations more practical, helps users find the required files more easily, and improves work efficiency. The action space includes an immediate response mechanism to user feedback, such as adjusting the subsequent recommendation strategy according to whether the user views the recommended content, which allows the system to keep learning and optimizing the recommendation strategy and to adapt better to changes in user needs. Based on the context-aware strategy, the recommended actions are adjusted to the user's specific situation, for example recommending summaries of important files before a meeting and recommending more references during long periods of work. Such context-aware recommendation strategies significantly enhance the user experience.
In one embodiment of the present invention, the step S3 includes:
S31, calculating the similarity of file contents by utilizing cosine similarity, and analyzing the use habit of the user by combining the historical operation record of the user, wherein the use habit comprises common file combinations and access frequency, and the similarity of the file contents is calculated by the following formula:
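$$\mathrm{sim}(A, B) = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^{2}}\,\sqrt{\sum_{i=1}^{n} B_i^{2}}}$$
Wherein sim(A, B) represents the similarity score between files A and B, A_i and B_i respectively represent the feature values of files A and B in the i-th dimension, and n represents the dimension of the feature vector.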
S32, automatically generating an intelligent folder according to the similarity of file contents and the use habit of a user, and dynamically adjusting the file organization structure;
S33, performing semantic understanding of the query sentences input by the user through a GPT pre-trained language model, constructing an index of the file contents, and supporting both keyword-based and natural-language search;
S34, sorting the search results through a reinforcement learning algorithm to enable the most relevant files to be displayed to the user preferentially.
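The similarity calculation of S31 and a simple form of the folder generation of S32 can be sketched as follows. The cosine similarity follows the standard definition; the grouping rule is an assumption made for illustration (a greedy, threshold-based grouping on content similarity only, without the usage-habit signal that the embodiment also takes into account), and the function names, file names, feature values and threshold are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors of equal dimension."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention for an all-zero vector
    return dot / (norm_a * norm_b)

def group_into_folders(features, threshold=0.9):
    """Greedy grouping: a file joins the first folder whose seed file is similar enough."""
    folders = []  # list of (seed_vector, [file names])
    for name, vec in features.items():
        for seed, members in folders:
            if cosine_similarity(seed, vec) >= threshold:
                members.append(name)
                break
        else:
            # No existing folder is similar enough, so this file starts a new one
            folders.append((vec, [name]))
    return [members for _, members in folders]

# Hypothetical content feature vectors for three files
features = {
    "budget_q1.xlsx": [1.0, 0.1, 0.0],
    "budget_q2.xlsx": [0.9, 0.2, 0.0],
    "holiday.jpg": [0.0, 0.1, 1.0],
}
folders = group_into_folders(features)
print(folders)
```

The two budget files point in nearly the same direction in feature space and are placed in one folder, while the image file starts its own; rerunning the grouping as files are added or changed is one simple way to keep the structure up to date.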
The working principle of this technical scheme is as follows. The contents of two files can be represented as two vectors in a vector space, and the cosine of the angle between the two vectors measures their similarity: the closer the cosine value is to 1, the more similar the contents of the two files, and the closer it is to 0, the greater the difference between them. By analyzing the user's historical operation records, such as opening, editing, and saving files, the system learns which file combinations the user commonly works with and how frequently each file is accessed. This information helps the system understand the user's preferences and behavior patterns and thereby provide more personalized service. Based on the similarity of file contents and the user's usage habits, the system automatically generates intelligent folders and groups similar files together. As the user works and the files change, the system dynamically adjusts the file organization structure, for example by adding new files to a suitable folder and updating the contents of existing folders, so that the intelligent folders remain accurate and practical.
A GPT pre-trained language model performs semantic understanding of the query sentences entered by the user, and an index is built over the file contents to support both keyword-based and natural-language search. GPT is a pre-trained language model with strong language understanding and generation capability. Through the GPT model, the system interprets the semantics and intent of the user's query and, drawing on the same understanding capability, builds the index over the file contents, which includes extracting key information from each file and generating a summary, so that files related to the query can be retrieved quickly. The user searches for files by entering keywords or natural-language queries, and the system matches the semantics of the query against the file-content index and returns the most relevant files.
Reinforcement learning is a machine learning method that learns an optimal behavior policy through continual trial and error. When ranking search results, the reinforcement learning algorithm optimizes the ranking policy according to the user's historical search behavior and feedback information; the system ranks the results by relevance and displays the most relevant files to the user first. The algorithm continually adjusts the ranking policy according to feedback such as click behavior and dwell time, improving the accuracy of the search results and user satisfaction.
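The cosine similarity measure described above can be illustrated with a minimal sketch. It assumes a plain bag-of-words representation of each file; the function names `term_vector` and `cosine_similarity` and the sample texts are hypothetical and are not part of the described method, which may use any vector representation of file content.

```python
import math
from collections import Counter


def term_vector(text):
    """Lower-case bag-of-words term counts for a piece of text (assumed representation)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty file is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical file contents used only for illustration.
doc_a = term_vector("quarterly sales report for the north region")
doc_b = term_vector("quarterly sales report for the south region")
doc_c = term_vector("holiday photo album")

sim_ab = cosine_similarity(doc_a, doc_b)  # six of seven terms shared, close to 1
sim_ac = cosine_similarity(doc_a, doc_c)  # no shared terms, equal to 0
```

In this sketch the two report files score close to 1 and would be candidates for the same intelligent folder, while the unrelated file scores 0.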
The beneficial effects of this technical scheme are as follows. The cosine similarity algorithm measures the similarity between file contents, helping the user quickly find similar or related files; it accounts for the literal similarity of file contents and, to a certain extent, also captures their semantic similarity, which raises the intelligence level of file management. The user's usage habits, including common file combinations and access frequency, are analyzed in depth from the historical operation records; this personalized analysis helps the system better understand user needs and provide a file management scheme better suited to the user's habits. Intelligent folders are generated automatically according to the similarity of file contents and the user's usage habits so that similar files are grouped together, which reduces the time the user spends organizing files manually and makes the file organization more orderly and clear. The file organization structure is adjusted dynamically to follow changes in user needs and file contents; this keeps the organization timely and accurate, helps the user quickly find the required files, and improves working efficiency. The GPT pre-trained language model understands the semantics and intent of the query sentences entered by the user, so user queries are matched against file contents more accurately and the accuracy and relevance of the search results improve. Both the traditional keyword-based search mode and the natural-language search mode are supported, and this diversity makes searching more flexible. Ranking the search results with the reinforcement learning algorithm displays the most relevant files to the user first, improving the accuracy of the search results and user satisfaction. Search results can also be displayed according to the user's habits and preferences so that they meet the user's personalized needs, which improves the search experience.
In one embodiment of the present invention, the step S32 includes:
On the basis of calculating file content similarity by cosine similarity, carrying out semantic understanding on file content by a deep semantic analysis technology, constructing a multi-dimensional file similarity matrix by combining file types and content characteristics, and capturing deeper association among files;
Deep analysis is carried out on the historical operation records of the user, and potential patterns in the user's file usage are identified through cluster analysis and association rule mining technology;
Based on the similarity of file contents and a user behavior mode, automatically classifying related files into the same folder according to a preset intelligent folder generation algorithm, and introducing a user feedback mechanism to allow a user to name, adjust or delete the automatically generated folder;
Dynamically adjusting a file organization structure according to the use habit and the workflow of a user, and intelligently recommending or hiding related folders according to the current working situation of the user based on a context awareness technology;
The intelligent folders are personalized and optimized by combining the personal preferences and working style of the user, and, based on the reinforcement learning algorithm, the generation rules and display strategies of the folders are continually adjusted and optimized according to the user's usage feedback on the intelligent folders (such as click-through rate and dwell time).
The working principle of this technical scheme is as follows. A deep semantic analysis technology is introduced to understand file contents at a finer semantic level and capture deep associations among files, and a multi-dimensional file similarity matrix is built by combining file types (documents, pictures, code, and the like) with content characteristics (keywords, topics, sentiment, and the like). The user's historical operation records are analyzed in depth, with attention to common file combinations, access frequency, and file usage habits in different situations; cluster analysis and association rule mining technology are applied to identify potential patterns in the user's file usage, such as sets of files related to a specific task and frequently accessed file paths. Based on the file content similarity and the user behavior patterns, related files are automatically grouped into the same folder according to a preset intelligent folder generation algorithm, and a user feedback mechanism is introduced to allow the user to name, adjust, or delete the automatically generated folders. The file organization structure is adjusted dynamically according to the user's usage habits and workflow, for example by changing the display order of the folders or automatically sorting files. Using context awareness technology, the system recommends or hides related folders according to the user's current working situation (such as the set of files being edited). Finally, the intelligent folders are personalized by combining the user's personal preferences and working style, and the reinforcement learning algorithm continually adjusts the folder generation rules and display strategies according to the user's usage feedback, such as click-through rate and dwell time.
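The identification of common file combinations can be sketched as a simple co-occurrence count over access sessions, which is the first step of association rule mining (finding frequent item pairs). This is only an illustrative sketch: the function name `frequent_pairs`, the session format, the sample file names, and the support threshold are all assumptions rather than details taken from the described method.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(sessions, min_support):
    """Return file pairs that co-occur in at least `min_support` of all sessions.

    `sessions` is a list of lists of file names opened together (assumed format).
    The result maps each frequent pair to its support (fraction of sessions).
    """
    pair_counts = Counter()
    for files in sessions:
        # Deduplicate and sort so each pair has one canonical ordering.
        for pair in combinations(sorted(set(files)), 2):
            pair_counts[pair] += 1
    total = len(sessions)
    return {
        pair: count / total
        for pair, count in pair_counts.items()
        if count / total >= min_support
    }


# Hypothetical access sessions used only for illustration.
sessions = [
    ["budget.xlsx", "forecast.xlsx", "notes.txt"],
    ["budget.xlsx", "forecast.xlsx"],
    ["logo.png", "notes.txt"],
    ["budget.xlsx", "forecast.xlsx", "logo.png"],
]

pairs = frequent_pairs(sessions, min_support=0.5)
```

Here only the pair of spreadsheet files is opened together in at least half of the sessions, so it is the only candidate for grouping into one intelligent folder on behavioral grounds; in the scheme above such evidence would be combined with the content similarity matrix.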
The beneficial effects of this technical scheme are as follows. Deep semantic analysis technologies such as BERT or other Transformer-based models allow file contents to be understood in finer detail and deeper associations among files to be captured, which makes file classification and search more accurate and raises the intelligence level of file management. Combining file types and content characteristics into a multi-dimensional file similarity matrix supports a comprehensive evaluation of the similarity among files; this multi-dimensional evaluation makes file classification more reasonable and helps users find the required files quickly. Further analysis of the user's historical operation records reveals the user's file usage habits in different contexts and identifies potential patterns of file usage, such as sets of files related to a specific task and frequently accessed file paths. Based on the mined user behavior patterns, the system can recommend files or folders the user is likely to need; this personalized recommendation reduces the time spent searching for files and improves working efficiency. Intelligent folders are generated automatically from the file content similarity and the user behavior patterns, and related files are grouped into the same folder, which reduces the time users spend organizing files manually and makes the file organization structure clearer. The file organization structure is adjusted dynamically according to the user's usage habits and workflow, for example by changing the display order of the folders or automatically sorting files, and related folders are recommended or hidden based on context awareness technology. The intelligent folders are further personalized according to the user's personal preferences and working style, for example by adjusting the icons and colors of the folders, which improves the user's visual experience and satisfaction. Based on the reinforcement learning algorithm, the generation rules and display strategies of the folders are continually adjusted and optimized according to the user's usage feedback on the intelligent folders, so that the system keeps learning and improving.
In one embodiment of the present invention, the step S4 includes:
S41, extracting characteristics of a file access record of a user, modeling a file access mode of the user through a deep neural network, and capturing normal access behavior characteristics of the user;
S42, monitoring the file access behaviors of the user in real time, comparing the file access behaviors with the normal access behaviors predicted by the model, and judging the file access behaviors as abnormal behaviors if the access behaviors of the user deviate from the normal mode;
S43, if abnormal behavior is found, immediately triggering an early warning mechanism, sending early warning information to a user or an administrator, and automatically taking response measures according to a preset safety strategy.
The working principle of this technical scheme is as follows. The system first collects the user's file access records, including key information such as access time, access frequency, and the types of files accessed. Features are extracted from these records and processed by a deep neural network (DNN) so that the user's normal access behavior characteristics are captured more accurately. The deep neural network is trained on the extracted features to build a model that reflects the user's normal file access behavior and can predict the file access behavior the user is likely to exhibit in a given situation. The system then monitors the user's file access behavior in real time, including the files accessed, the access time, and the access mode, and continuously feeds this real-time data into the previously built model for dynamic comparison and analysis. The monitored access behavior is compared with the normal access behavior predicted by the model; if it deviates from the normal pattern, for example through unauthorized access or frequent data transmission, the system immediately identifies it as abnormal. An early warning mechanism is then triggered at once, and early warning information is sent to the user or administrator in time to alert them to the potential security risk. At the same time, the system automatically takes response measures according to the preset security policy and records detailed log information about the event.
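The monitor, compare, and respond loop described above can be sketched as follows. Note that this sketch deliberately replaces the deep neural network with a much simpler statistical baseline (a z-score on the number of file accesses per hour) so that it stays short and self-contained; it is not the model described in this embodiment. The function names, the threshold of 3 standard deviations, the sample history, and the action names are all hypothetical.

```python
import statistics


def fit_baseline(history):
    """Summarize normal behavior as the mean and standard deviation of past access counts."""
    return statistics.mean(history), statistics.pstdev(history)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value that deviates from the baseline by more than `threshold` standard deviations."""
    mean, stdev = baseline
    if stdev == 0:
        # With no variation in the history, any different value counts as a deviation.
        return value != mean
    return abs(value - mean) / stdev > threshold


def respond(value, baseline):
    """Return the response measures of an assumed preset security policy, or none if normal."""
    if not is_anomalous(value, baseline):
        return []
    return ["lock_files", "write_log", "notify_admin"]


# Hypothetical hourly file-access counts for one user.
history = [10, 12, 11, 9, 13, 10, 11, 12]
baseline = fit_baseline(history)

normal_actions = respond(12, baseline)  # within the usual range, no action
alert_actions = respond(60, baseline)   # far outside the usual range, policy triggered
```

In a fuller implementation the baseline would be replaced by the trained model's prediction of normal behavior, and the returned action names would be dispatched to the actual locking, logging, and notification components.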
The beneficial effects of this technical scheme are as follows. Modeling the user's file access pattern with a deep neural network captures the characteristics of the user's normal access behavior. The user's file access behavior is monitored in real time and compared with the normal behavior predicted by the model; once access behavior deviating from the normal pattern is found, such as unauthorized access or frequent data transmission, the system immediately judges it to be abnormal. This real-time monitoring and anomaly detection mechanism identifies potential security threats quickly and stops malicious behavior in time, improving the security of file access. Once abnormal behavior is found, the early warning mechanism is triggered immediately to send early warning information to the user or administrator, and response measures such as locking files, recording logs, and notifying the administrator are taken automatically according to the preset security policy, preventing the abnormal behavior from causing damage. The timeliness and accuracy of the early warning and response mechanism reduce security risk and protect the user's data. The automated monitoring and early warning mechanism reduces the need for manual intervention: administrators can concentrate on handling early warning information and abnormal situations instead of spending a large amount of time manually monitoring users' file access behavior. The detailed information recorded by the system helps users and administrators understand and analyze abnormal events. The system can quickly resolve security problems the user may encounter, which strengthens the user's trust in the system's security and improves user satisfaction and loyalty. Personalized security policies can be formulated according to the user's access behavior characteristics and security requirements, so that they better meet the user's needs and improve the user experience. Data security and privacy protection are compliance requirements in many industries and organizations; the real-time monitoring and early warning mechanism enables an organization to find and handle potential security risks in time and helps ensure that data security and privacy protection meet relevant regulations and standards. The system also records and generates detailed audit logs and reports, which are important for an organization's compliance audits and internal investigations.
In one embodiment of the present invention, as shown in fig. 2, an AI-based electronic file intelligent management system is provided, the system comprising:
the feature extraction module is used for carrying out multi-level feature extraction on file contents based on a convolutional neural network and a recurrent neural network, and training an adaptive classification model based on the extracted features;
The strategy adjustment module is used for constructing a user portrait based on the deep neural network according to the historical operation record and the behavior habit of the user, introducing a reinforcement learning mechanism, and dynamically adjusting a recommendation strategy according to the user portrait and the current context;
the ordering optimization module is used for automatically generating an intelligent folder according to the similarity of file contents and the use habit of a user, dynamically adjusting a file organization structure, combining a natural language processing technology, realizing a semantic-based search function, and optimizing search result ordering by reinforcement learning;
The abnormality processing module is used for modeling the file access mode through the deep neural network, monitoring abnormal access behaviors in real time, triggering an early warning mechanism by the system if the abnormal behaviors are found, and simultaneously automatically taking response measures according to a preset security policy.
The working principle of this technical scheme is as follows. A convolutional neural network processes image and audio content: low-level features of the multimedia content are extracted by structures such as convolutional layers and pooling layers and are then combined into high-level semantic features by a fully connected layer. A recurrent neural network or a variant thereof processes text content: its recurrent structure captures the sequential dependencies in the text and thereby extracts the semantic features of the text. The extracted semantic features are input into an adaptive classification model, which, through training, automatically adjusts its classification strategy according to the complexity of the file content. For example, the model may adopt a more complex classification strategy for files containing multiple media types and a simpler one for files with a single type of content. During training, the model continually adjusts its parameters according to the labeled data until a preset classification accuracy is reached.
The user's historical operation records and behavior habits are analyzed by a deep neural network to extract characteristics such as preferences, interests, and usage habits, and these characteristics are combined into a user portrait that represents the user's individual traits and needs. A reinforcement learning mechanism is introduced to adjust the recommendation strategy dynamically according to the user portrait and the current context: the reinforcement learning model keeps trying different recommendation strategies and evaluates their effect from the user's feedback, and through continual iteration and optimization it eventually finds the recommendation strategy best suited to the user and provides personalized file recommendations.
Intelligent folders are generated automatically according to the similarity of file contents and the user's usage habits: a clustering algorithm gathers similar files into a folder, and the folder name can be generated automatically using natural language processing technology. For semantic search, natural language processing technology is used to understand the semantics of the user's query and match it against the file contents to find the most relevant files, and reinforcement learning optimizes the ranking of the search results so that the most relevant files appear first.
The file access pattern is modeled by a deep neural network, which learns the normal access behavior pattern. File access behavior is monitored in real time and compared with the normal access pattern; if abnormal behavior is found, the early warning mechanism is triggered and response measures are taken automatically according to the preset security policy, which may include locking files to prevent further access, recording logs for subsequent analysis, and notifying an administrator for manual intervention.
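The idea of trying different strategies and evaluating them from user feedback can be illustrated with an epsilon-greedy bandit, one of the simplest reinforcement learning schemes. This is a sketch under stated assumptions, not the reinforcement learning algorithm of the described system: the class name `EpsilonGreedyRanker`, the strategy names, and the reward encoding (1.0 for a click, 0.0 for no click) are hypothetical.

```python
import random


class EpsilonGreedyRanker:
    """Chooses among candidate ranking or recommendation strategies from click feedback."""

    def __init__(self, strategies, epsilon=0.1, seed=0):
        self.strategies = list(strategies)
        self.epsilon = epsilon                               # probability of exploring
        self.counts = {s: 0 for s in self.strategies}        # times each strategy was rewarded
        self.values = {s: 0.0 for s in self.strategies}      # running mean reward per strategy
        self.rng = random.Random(seed)

    def choose(self):
        """Explore a random strategy with probability epsilon, otherwise exploit the best one."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.strategies)
        return max(self.strategies, key=lambda s: self.values[s])

    def update(self, strategy, reward):
        """Fold one feedback signal into the running mean reward of the strategy."""
        self.counts[strategy] += 1
        n = self.counts[strategy]
        self.values[strategy] += (reward - self.values[strategy]) / n


# Hypothetical strategies; epsilon=0.0 disables exploration so the example is deterministic.
ranker = EpsilonGreedyRanker(["by_relevance", "by_recency"], epsilon=0.0)
ranker.update("by_recency", 1.0)  # the user clicked a result ranked by recency
ranker.update("by_recency", 0.0)  # the user ignored the next result list
best = ranker.choose()            # the strategy with the highest mean reward so far
```

With a non-zero `epsilon` the ranker occasionally tries a strategy that is not currently the best, which corresponds to the trial-and-error behavior described above; richer feedback such as dwell time could be mapped to the reward value.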
The beneficial effects of this technical scheme are as follows. Multi-level feature extraction on file contents through a convolutional neural network and a recurrent neural network captures the semantic features of multimedia content such as text, images, and audio, and the adaptive classification model automatically adjusts its classification strategy according to the complexity of the file content, which improves classification accuracy. The method handles a variety of file contents, including text, images, and audio, which broadens the application range of electronic file management, and processing by deep learning models allows a large number of files to be handled efficiently, improving the working efficiency of file management. A user portrait built from the user's historical operation records and behavior habits gives a more accurate picture of the user's needs and preferences. With the reinforcement learning mechanism, the recommendation strategy is adjusted dynamically according to the user portrait and the current context, so that personalized file recommendations are provided and user satisfaction and experience improve. Personalized recommendation reduces the time users spend searching for files; in addition, recommending files the user is likely to be interested in can optimize the use of storage resources and reduce unnecessary storage waste. Semantic search based on natural language processing returns results that better match the user's intent, and optimizing the ranking of search results through reinforcement learning further improves search accuracy. Modeling the file access pattern through the deep neural network allows abnormal access behavior, such as unauthorized access and frequent data transmission, to be monitored in real time. Once abnormal behavior is found, the system immediately triggers the early warning mechanism and automatically takes response measures according to the preset security policy, such as locking the file, recording logs, and notifying an administrator, which safeguards the security and integrity of the files. The real-time monitoring and early warning mechanism discovers potential security threats in time and gives the administrator sufficient time to respond, while the preset security policy handles abnormal behavior automatically, reducing the delay and errors of manual intervention and improving the efficiency and accuracy of emergency response.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.