Detailed Description
To help those skilled in the art better understand the technical solutions of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, and not all, of the embodiments of the present invention. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in other sequences than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example one
Fig. 1 is a flowchart of a sample source augmentation method according to an embodiment of the present invention, where the method is applicable to augmentation of images in a sample source used in a deep learning model, and the method may be performed by a sample source augmentation device, which may be implemented in hardware and/or software, and the sample source augmentation device may be configured in a computer device. As shown in fig. 1, the method includes:
S110, determining a sample source to be augmented, wherein the sample source to be augmented is applied to training of a deep learning model.
A sample source is a set of images used for training; the deep learning model learns from the images in the sample source, which improves the performance of the model. In deep learning, the number of samples has a great influence on the result: if a sufficient number of samples cannot be provided, the performance and effect of the deep learning model suffer. For example, in a factory scenario where defective pictures among quality inspection pictures are to be identified, many factories can currently provide only a small number of defective pictures. This lowers the quality of the trained model and affects its subsequent application, so an augmentation operation needs to be performed on the limited sample source data to increase the data volume and thereby improve the performance and quality of the model. Augmenting the sample source means augmenting the images in the sample source: a series of changes is applied to the training images to generate similar but different training samples, thereby enlarging the scale of the training data set.
S120, obtaining a corresponding sample image according to the image in the sample source to be augmented.
The deep learning model is trained for application in a specific service scene, so sample images related to that scene are used as the training set and the verification set during training. The images in the sample source to be augmented are therefore processed to obtain sample images suitable for the specific service scene.
In an embodiment of the present invention, the obtaining a corresponding sample image according to the image in the sample source to be augmented includes: determining a region to be identified of each image; marking the region to be identified of each image with an annotation frame; and labeling the region to be identified of each image with a label to obtain the sample image.
For example, if the business scenario in which the deep learning model is applied is identifying defect locations in quality inspection images in a factory, the selected sample source to be augmented consists of a number of quality inspection images. For each quality inspection image in the sample source to be augmented, the quality defect region is marked with an annotation frame to obtain a sample image in which the quality defect region is annotated; here the region to be identified is the quality defect region in the image. Preferably, the quality defect region can be labeled using LabelImg, a graphical image annotation tool written in Python that uses Qt for its graphical interface. The annotations are saved as XML files in PASCAL VOC format, the format used by ImageNet. The resulting sample images are used to train the deep learning model to determine quality defect regions by means of annotation frames.
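By way of illustration only, an annotation saved in PASCAL VOC format could be read back as follows. This is a minimal sketch using the Python standard library; the file name, the label "scratch", the coordinates, and the helper name read_voc_boxes are hypothetical and do not come from the embodiment.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation for one quality inspection image, in PASCAL VOC layout.
VOC_XML = """<annotation>
  <filename>part_0001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>scratch</name>
    <bndbox><xmin>120</xmin><ymin>80</ymin><xmax>260</xmax><ymax>150</ymax></bndbox>
  </object>
</annotation>"""


def read_voc_boxes(xml_text):
    """Return (filename, [(label, xmin, ymin, xmax, ymax), ...])."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        label = obj.findtext("name")          # the label of the region
        bb = obj.find("bndbox")               # the annotation frame
        coords = tuple(int(bb.findtext(tag))
                       for tag in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((label, *coords))
    return root.findtext("filename"), boxes


filename, boxes = read_voc_boxes(VOC_XML)
```

Each returned tuple pairs the label with the annotation frame, which corresponds to the two markings (frame and label) applied to the region to be identified.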
S130, determining a custom augmentation scheme corresponding to the sample source to be augmented.
S140, applying the custom augmentation scheme to the sample image to obtain an augmented sample source.
Most existing sample source augmentation methods write a dedicated algorithm for a specific sample source. Such implementations are inflexible: the augmentation dimensions and modes are hard-coded in the algorithm, the available augmentation dimensions are limited, and the sample source cannot be augmented dynamically according to training results to improve the performance and quality of the deep learning model. By formulating a custom augmentation scheme corresponding to the sample source to be augmented, the augmentation scheme can be tailored to user requirements, different schemes can be used for different sample sources and application scenes, and the data augmentation effect and the model training quality are improved.
As mentioned above, the augmentation of the sample source is essentially the image augmentation of the sample image in the sample source, and the sample image is augmented through the customized augmentation scheme, so that the sample image generates similar but different sample images according to the customized augmentation scheme, thereby enlarging the scale of the training set and the verification set used by the deep learning model training and improving the performance and the quality of the model.
According to the technical solution of this embodiment of the invention, a sample source to be augmented is determined, a corresponding sample image is obtained according to the image in the sample source to be augmented, a custom augmentation scheme corresponding to the sample source to be augmented is determined, and the custom augmentation scheme is applied to the sample image to obtain an augmented sample source. The solution provided by the invention allows the augmentation scheme of a sample source to be customized for different sample sources and scenes, and the sample images in the sample source are augmented through the configured custom augmentation scheme, thereby improving the augmentation effect of the sample source.
Example two
Fig. 2 is a flowchart of a method for determining a customized augmentation scheme according to a second embodiment of the present invention. As shown in fig. 2, the method includes:
S210, determining an augmentation mode of the customized augmentation scheme, wherein the augmentation mode is online augmentation or offline augmentation.
In an embodiment of the present invention, when an augmentation mode of a custom augmentation scheme is online augmentation, the applying the custom augmentation scheme to the sample image to obtain an augmented sample source includes: inputting the sample source to be amplified into a preset deep learning model for training; in the deep learning model training process, according to a target augmentation rule included in the user-defined augmentation scheme, augmenting the sample image in the sample source to be augmented to obtain the augmented sample source.
Online augmentation has the advantages that the augmented data do not need to be generated and stored in advance, which saves storage space, that it is highly flexible, and that the amount of data seen during training is theoretically unlimited. Its drawback is that the images used for training differ in every epoch. Online augmentation therefore works well for classification and other tasks that are not sensitive to data changes, but brings little improvement for tasks that are sensitive to data changes, such as character recognition. In particular, when the augmented images differ greatly from the originals, the model may not converge well; in that case, offline augmentation can be tried, or the probability of using the original image during augmentation can be increased.
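The online mode described above can be sketched as follows, assuming NumPy arrays stand in for sample images. The names online_batches, random_brightness, and p_original are hypothetical; p_original models the probability of using the original image, and no model training is shown.

```python
import numpy as np


def online_batches(images, augment, epochs, p_original=0.5, seed=0):
    """Yield (epoch, batch); every image is re-augmented in every epoch.

    Nothing is written to disk. With probability p_original the untouched
    image is used, which is the mitigation for poor convergence.
    """
    rng = np.random.default_rng(seed)
    for epoch in range(epochs):
        batch = [img if rng.random() < p_original else augment(img, rng)
                 for img in images]
        yield epoch, np.stack(batch)


def random_brightness(img, rng):
    # Hypothetical target augmentation rule: random brightness scaling.
    factor = rng.uniform(0.7, 1.3)
    scaled = img.astype(np.float32) * factor
    return np.clip(np.rint(scaled), 0, 255).astype(np.uint8)


images = [np.full((4, 4), 100, dtype=np.uint8) for _ in range(3)]
shapes = [batch.shape
          for _, batch in online_batches(images, random_brightness, epochs=2)]
```

Setting p_original to 1.0 disables augmentation entirely, while 0.0 augments every image in every epoch.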
In an embodiment of the present invention, when the augmentation mode of the custom augmentation scheme is offline augmentation, the applying the custom augmentation scheme to the sample image to obtain an augmented sample source includes: augmenting the sample image in the sample source to be augmented according to a target augmentation rule included in the custom augmentation scheme to obtain the augmented sample source; and inputting the augmented sample source into a preset deep learning model for training.
Offline augmentation has the advantages that the augmented data can be inspected visually, so developers can control the effect of the augmentation, and that the amount of augmented data is finite, so the effect of the data augmentation on model performance can be evaluated well. Its drawbacks are that the augmented data must be generated in advance, which occupies more storage space, and that it is less flexible.
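For comparison, the offline mode can be sketched as materializing the augmented samples on disk before training. This is only an illustrative sketch: images are saved as .npy files to keep the example dependency-free, and the names augment_offline, random_flip, and the file naming pattern are hypothetical.

```python
import tempfile
from pathlib import Path

import numpy as np


def augment_offline(images, augment, copies, out_dir, seed=0):
    """Write `copies` augmented variants of each image; return their paths."""
    rng = np.random.default_rng(seed)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, img in enumerate(images):
        for k in range(copies):
            path = out_dir / f"img{i:04d}_aug{k}.npy"
            np.save(path, augment(img, rng))   # stored, so it can be inspected
            paths.append(path)
    return paths


def random_flip(img, rng):
    # Hypothetical target augmentation rule: mirror half of the time.
    return img[:, ::-1] if rng.random() < 0.5 else img


images = [np.arange(12, dtype=np.uint8).reshape(3, 4) for _ in range(2)]
with tempfile.TemporaryDirectory() as tmp:
    paths = augment_offline(images, random_flip, copies=3, out_dir=tmp)
    n_files = len(list(Path(tmp).glob("*.npy")))
```

The stored files are what make the augmented data inspectable, at the cost of the extra storage noted above.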
Whether the customized augmentation scheme uses offline augmentation or online augmentation can be decided according to actual service requirements or hardware conditions; for example, when the storage space is sufficient, offline augmentation can be selected to ensure the augmentation effect.
S220, selecting the target augmentation rule of the user-defined augmentation scheme from preset augmentation rules, and configuring the target augmentation rule.
The scheme of the invention supports the user to configure the augmentation rule in a customized manner and customizes the augmentation scheme aiming at different sample sources and scenes.
In an embodiment of the present invention, the augmentation rules include a dimension augmentation rule, where the dimension augmentation rule comprises at least one augmentation dimension and each augmentation dimension corresponds to one attribute of the image; correspondingly, when the target augmentation rule is the dimension augmentation rule, the configuring the target augmentation rule includes: selecting at least one target augmentation dimension from the at least one augmentation dimension; and configuring the attribute value of the attribute corresponding to each target augmentation dimension.
The attributes are intrinsic parameters of the image; each augmentation dimension corresponds to a different parameter of the image, and an augmentation dimension is configured by modifying that parameter. The augmentation dimensions include, but are not limited to: horizontal mirroring, vertical mirroring, rotation by specific angles, affine transformation, brightness transformation, contrast transformation, gamma transformation, blurring, noise, sharpening, saturation, elastic deformation, and the like. The different augmentation dimensions can be configured individually or in superposition. After the target augmentation dimensions are selected, the attribute value of each target augmentation dimension is set, for example the variation interval of brightness or the variation interval of saturation.
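One possible way to represent such a configuration is a mapping from each selected target augmentation dimension to its attribute values, applied in superposition. The sketch below assumes NumPy images and covers only three of the listed dimensions; the function names, the DIMENSIONS registry, and the default intervals are hypothetical.

```python
import numpy as np


def horizontal_mirror(img, rng):
    return img[:, ::-1]


def brightness(img, rng, interval=(0.8, 1.2)):
    # The attribute value is a variation interval for the brightness factor.
    factor = rng.uniform(*interval)
    scaled = img.astype(np.float32) * factor
    return np.clip(np.rint(scaled), 0, 255).astype(np.uint8)


def gamma(img, rng, interval=(0.8, 1.25)):
    g = rng.uniform(*interval)
    return np.rint(255.0 * (img / 255.0) ** g).astype(np.uint8)


# Hypothetical registry: augmentation dimension name -> implementation.
DIMENSIONS = {"horizontal_mirror": horizontal_mirror,
              "brightness": brightness,
              "gamma": gamma}


def apply_dimension_rule(img, config, seed=0):
    """Apply the selected dimensions one after another (superposition)."""
    rng = np.random.default_rng(seed)
    for name, attrs in config.items():
        img = DIMENSIONS[name](img, rng, **attrs)
    return img


# Two target dimensions; the degenerate interval (1.0, 1.0) keeps brightness fixed.
config = {"horizontal_mirror": {}, "brightness": {"interval": (1.0, 1.0)}}
img = np.array([[0, 50, 100], [150, 200, 250]], dtype=np.uint8)
out = apply_dimension_rule(img, config)
```

Because the configuration is plain data, it can be stored and changed without modifying the augmentation code itself.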
In an embodiment of the present invention, the augmentation rules include an editing augmentation rule, where the editing augmentation rule comprises image cropping and image aliasing; correspondingly, when the target augmentation rule is the editing augmentation rule, the configuring the target augmentation rule includes: configuring the region for image cropping and/or configuring the effect of image aliasing.
Cropping a sample image yields a new sample image without adjusting any attribute values of the image. After cropping, the sample image contains fewer identifiable elements. For example, in the original sample image the deep learning model can identify elements such as the beach, blue sky, and seawater, and can determine that the image shows a beach scene. After the sample image is cropped, for example with a large part of the beach removed, the deep learning model has to recognize the scene from fewer elements, which improves its recognition capability.
Aliasing sample images means mixing different sample images: the gray values of pixels in the grayscale image of one sample image can be adjusted to the gray values of another sample image, or different sample images can be overlaid with different transparencies, to obtain a blended sample image. The image aliasing effect is determined by the adjustment applied to the gray values and by the transparency used when the sample images are overlaid. Blending reduces the integrity and accuracy of the images, so the deep learning model improves its recognition capability by recognizing and verifying the blended sample images.
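A configured cropping region can be applied as in the sketch below. Shifting and clipping the annotation frames into the cropped image is an assumption added for illustration and is not described in the embodiment; the function name, labels, and coordinates are hypothetical.

```python
import numpy as np


def crop_with_boxes(img, boxes, region):
    """Crop img to region=(x0, y0, x1, y1) and move boxes into the crop.

    Boxes are (label, xmin, ymin, xmax, ymax); boxes that fall entirely
    outside the region are dropped (an illustrative assumption).
    """
    x0, y0, x1, y1 = region
    cropped = img[y0:y1, x0:x1]
    kept = []
    for label, bx0, by0, bx1, by1 in boxes:
        nx0, ny0 = max(bx0, x0) - x0, max(by0, y0) - y0
        nx1, ny1 = min(bx1, x1) - x0, min(by1, y1) - y0
        if nx1 > nx0 and ny1 > ny0:
            kept.append((label, nx0, ny0, nx1, ny1))
    return cropped, kept


img = np.zeros((480, 640, 3), dtype=np.uint8)
boxes = [("scratch", 120, 80, 260, 150), ("dent", 600, 400, 630, 470)]
cropped, kept = crop_with_boxes(img, boxes, region=(100, 50, 300, 250))
```

In this example the second frame lies outside the cropping region, so only the first one remains in the new sample image.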
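The transparency-based overlay can be sketched as a weighted sum of two images of the same size. The function name blend and the parameter alpha (the weight of the first image) are hypothetical names chosen for this illustration.

```python
import numpy as np


def blend(img_a, img_b, alpha):
    """Overlay two same-sized images: alpha * img_a + (1 - alpha) * img_b."""
    if img_a.shape != img_b.shape:
        raise ValueError("images must have the same shape")
    mixed = (alpha * img_a.astype(np.float32)
             + (1.0 - alpha) * img_b.astype(np.float32))
    return np.clip(np.rint(mixed), 0, 255).astype(np.uint8)


a = np.full((2, 2), 200, dtype=np.uint8)
b = np.full((2, 2), 100, dtype=np.uint8)
mixed = blend(a, b, alpha=0.25)   # 0.25 * 200 + 0.75 * 100 = 125
```

Here alpha plays the role of the configurable aliasing effect: values near 1 keep mostly the first image, values near 0 mostly the second.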
In an embodiment of the present invention, the augmentation rule includes: a random augmentation rule; correspondingly, when the target augmentation rule is the random augmentation rule, the configuring the target augmentation rule includes: randomly selecting one of the dimension augmentation rules and the editing augmentation rules; when the dimension augmentation rule is selected randomly, at least one augmentation dimension is selected randomly from at least one augmentation dimension included in the dimension augmentation rule, and attribute values corresponding to the augmentation dimensions are configured randomly; when the editing augmentation rule is randomly selected, the region for image cropping is randomly configured and/or the effect of image aliasing is randomly configured.
For example, on the basis of the two kinds of augmentation rules above, the way in which a sample image is augmented can be chosen at random through the random augmentation rule. When the random augmentation rule is selected, one rule is randomly selected from the dimension augmentation rule and the editing augmentation rule. If the dimension augmentation rule is randomly selected, the number of augmentation dimensions to use is randomly determined, that number of augmentation dimensions is randomly selected from the dimension augmentation rule, and the attribute value configured for each selected augmentation dimension is randomly determined; the resulting augmentation scheme is applied to the sample image. If the editing augmentation rule is randomly selected, the cropping region and the aliasing effect of the sample image are randomly determined, and the resulting augmentation scheme is applied to the sample image. The advantage of obtaining the augmentation scheme through the random augmentation rule is that, when the augmentation scheme does not need to be strongly targeted, the configuration steps required by the other augmentation rules are avoided and the augmentation scheme is determined more efficiently.
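The random selection steps above can be sketched with the standard library as follows. The dimension names, their value intervals, and the dictionary layout of the resulting scheme are hypothetical; the sketch only builds the scheme and does not apply it to an image.

```python
import random

# Hypothetical augmentation dimensions and the interval of each attribute value.
DIMENSION_RANGES = {
    "brightness": (0.7, 1.3),
    "contrast": (0.7, 1.3),
    "saturation": (0.5, 1.5),
    "gamma": (0.8, 1.25),
}


def random_scheme(width, height, rng):
    """Randomly build one augmentation scheme as a plain dict."""
    if rng.choice(["dimension", "editing"]) == "dimension":
        # Random number of dimensions, then random dimensions, then random values.
        count = rng.randint(1, len(DIMENSION_RANGES))
        names = rng.sample(sorted(DIMENSION_RANGES), count)
        values = {n: rng.uniform(*DIMENSION_RANGES[n]) for n in names}
        return {"rule": "dimension", "dimensions": values}
    # Editing rule: random cropping region and random blending transparency.
    x0 = rng.randint(0, width - 1)
    x1 = rng.randint(x0 + 1, width)
    y0 = rng.randint(0, height - 1)
    y1 = rng.randint(y0 + 1, height)
    return {"rule": "editing", "crop": (x0, y0, x1, y1),
            "alpha": rng.uniform(0.0, 1.0)}


rng = random.Random(0)
schemes = [random_scheme(640, 480, rng) for _ in range(20)]
```

Passing a seeded random.Random instance keeps the random choices reproducible, which may help when comparing training runs.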
The scheme of the embodiment of the invention is preferably implemented as follows: a server is built, and web services, algorithm services, and storage services are established and managed over a local area network; by maintaining the storage service, front-end web service, algorithm service, back-end service, and the like, functions such as uploading and labeling of sample sources, publishing of offline augmentation, and online augmentation are realized. A custom augmentation scheme corresponding to a sample source can be configured in the management system, and the augmentation dimensions of the sample source can be configured dynamically for a specific scene according to specific requirements. Offline augmentation can be configured for a sample source, in which case new augmented sample source pictures are generated from the existing sample source according to the configured augmentation rule. Online augmentation can also be configured, in which case the sample source is augmented during training.
The implementation of the scheme of the invention can comprise hardware and software, wherein the hardware is server equipment and the like, and the software system comprises a front-end web service, an algorithm service, and a back-end service. The front-end web service realizes dynamic and flexible configuration of the augmentation scheme, the back-end service is responsible for storing the augmentation scheme, offline augmentation, and the like, and the algorithm service is responsible for online augmentation, model training, and the like. When an augmentation rule configured at the front end is published, the data set is augmented according to the rule using the OpenCV framework to obtain the final augmented data set. Random augmentation of the data set is implemented with RandAugment, image cropping of the data set is performed with GridMask, and image aliasing augmentation is performed with Mixup and CutMix.
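As an illustration of one of the named techniques, a CutMix-style operation pastes a rectangular patch of one image into another. The sketch below is written with NumPy rather than OpenCV and uses a fixed patch instead of a randomly sampled one; it is not the implementation of the embodiment, and the names cutmix, box, and lam are hypothetical.

```python
import numpy as np


def cutmix(img_a, img_b, box):
    """Paste box=(x0, y0, x1, y1) of img_b into a copy of img_a.

    Returns the mixed image and lam, the fraction of pixels that still
    come from img_a (commonly used to weight the two labels).
    """
    x0, y0, x1, y1 = box
    mixed = img_a.copy()
    mixed[y0:y1, x0:x1] = img_b[y0:y1, x0:x1]
    h, w = img_a.shape[:2]
    lam = 1.0 - ((x1 - x0) * (y1 - y0)) / (h * w)
    return mixed, lam


a = np.zeros((4, 4), dtype=np.uint8)
b = np.full((4, 4), 255, dtype=np.uint8)
mixed, lam = cutmix(a, b, box=(0, 0, 2, 2))   # a 2x2 patch out of 16 pixels
```

Unlike the transparency overlay, each pixel of the result comes from exactly one of the two images.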
Example three
Fig. 3 is a schematic structural diagram of a sample source amplification device according to a third embodiment of the present invention. As shown in fig. 3, the apparatus includes:
a to-be-augmented sample source determining unit 310, configured to determine a to-be-augmented sample source, where the to-be-augmented sample source is applied to training of a deep learning model;
a sample image determining unit 320, configured to obtain a corresponding sample image according to the image in the sample source to be augmented;
a custom augmentation scheme determining unit 330, configured to determine a custom augmentation scheme corresponding to the sample source to be augmented;
a custom augmentation scheme applying unit 340, configured to apply the custom augmentation scheme to the sample image to obtain an augmented sample source.
In the embodiment of the present invention, the sample image determining unit 320 is configured to determine a region to be identified of each of the images; mark the region to be identified of each image with an annotation frame; and label the region to be identified of each image with a label to obtain the sample image.
In the embodiment of the present invention, the customized augmentation scheme determining unit 330 is configured to determine an augmentation mode of the customized augmentation scheme, where the augmentation mode is online augmentation or offline augmentation; and selecting the target augmentation rule of the customized augmentation scheme from preset augmentation rules, and configuring the target augmentation rule.
In the embodiment of the present invention, when the augmentation mode of the customized augmentation scheme is online augmentation, the customized augmentation scheme determining unit 330 is configured to input the sample source to be augmented into a preset deep learning model for training; in the deep learning model training process, according to a target augmentation rule included in the user-defined augmentation scheme, augmenting the sample image in the sample source to be augmented to obtain the augmented sample source.
In the embodiment of the present invention, when the augmentation mode of the custom augmentation scheme is offline augmentation, the custom augmentation scheme determining unit 330 is configured to augment the sample image in the sample source to be augmented according to a target augmentation rule included in the custom augmentation scheme to obtain the augmented sample source; inputting the augmented sample source into a preset deep learning model for training.
In the embodiment of the present invention, the augmentation rules include a dimension augmentation rule, where the dimension augmentation rule includes at least one augmentation dimension and each augmentation dimension corresponds to one attribute of the image; correspondingly, when the target augmentation rule is the dimension augmentation rule, the customized augmentation scheme determining unit 330 is configured to select at least one target augmentation dimension from the at least one augmentation dimension, and to configure the attribute value of the attribute corresponding to each target augmentation dimension.
In the embodiment of the present invention, the augmentation rules include an editing augmentation rule, where the editing augmentation rule includes image cropping and image aliasing; correspondingly, when the target augmentation rule is the editing augmentation rule, the customized augmentation scheme determining unit 330 is configured to configure the region for image cropping and/or configure the effect of image aliasing.
In an embodiment of the present invention, the augmentation rule includes: a random augmentation rule; correspondingly, when the target augmentation rule is the random augmentation rule, the customized augmentation scheme determining unit 330 is configured to randomly select one of the dimension augmentation rule and the editing augmentation rule; when the dimension augmentation rule is selected randomly, at least one augmentation dimension is selected randomly from at least one augmentation dimension included in the dimension augmentation rule, and attribute values corresponding to the augmentation dimensions are configured randomly; when the editing augmentation rule is randomly selected, randomly configuring the area for image cropping and/or randomly configuring the effect of image aliasing.
The sample source amplification device provided by the embodiment of the invention can execute the sample source amplification method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
Example four
FIG. 4 shows a schematic block diagram of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 4, the electronic device 10 includes at least one processor 11, and a memory communicatively connected to the at least one processor 11, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, and the like, wherein the memory stores a computer program executable by the at least one processor, and the processor 11 can perform various suitable actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from a storage unit 18 into the Random Access Memory (RAM) 13. In the RAM13, various programs and data necessary for the operation of the electronic apparatus 10 can also be stored. The processor 11, the ROM12, and the RAM13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
A number of components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, or the like; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The processor 11 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, or the like. The processor 11 performs the various methods and processes described above, such as the sample source augmentation method.
In some embodiments, the sample source augmentation method may be implemented as a computer program tangibly embodied in a computer-readable storage medium, such as storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM12 and/or the communication unit 19. When the computer program is loaded into RAM13 and executed by processor 11, one or more steps of the sample source augmentation method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the sample source augmentation method by any other suitable means (e.g., by way of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SoCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for implementing the methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be performed. A computer program can execute entirely on a machine, partly on a machine, as a stand-alone software package partly on a machine and partly on a remote machine or entirely on a remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the Internet.
The computing system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical host and VPS service are overcome.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present invention may be executed in parallel, sequentially, or in different orders, and are not limited herein as long as the desired results of the technical solution of the present invention can be achieved.
The above-described embodiments should not be construed as limiting the scope of the invention. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.