TWI430204B - Codebook generating method - Google Patents
- Publication number
- TWI430204B (application TW98132154A)
- Authority
- TW
- Taiwan
- Prior art keywords
- neurons
- neuron
- original
- cluster
- codebook
- Prior art date
Landscapes
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Image Processing (AREA)
- Compression Of Band Width Or Redundancy In Fax (AREA)
- Image Analysis (AREA)
Description
The present invention relates to a codebook generation method, and more particularly to a codebook generation method for image compression.
Generally, image compression produces a codebook through a codebook generation method; when the image is subsequently stored or transmitted, this smaller codebook stands in for the original input image, achieving compression. Conversely, during decompression, a decoding algorithm restores the codebook into a number of restored blocks, which are finally assembled into a reconstructed image, completing the decompression step.
A conventional codebook generation method usually first "slices" an original input image, cutting it into a large number of small original blocks, and converts these original blocks into original vectors through vector quantization (VQ). A coding algorithm then processes the original vectors to obtain a smaller number of codewords that can represent the original input image; together these representative blocks form the codebook.
One conventional coding algorithm is the LBG algorithm, proposed by Linde, Buzo, and Gray in 1980; its concept is similar to the K-means method of data clustering. A first step presets a distortion parameter ε and the number of clusters k. A second step randomly selects k of the original vectors as centroids. A third step computes the Euclidean distance between each original vector and every centroid, assigning each original vector to its nearest centroid. A fourth step computes the center of mass of the original vectors within each cluster as the new centroid. A fifth step computes the difference between the new and old centroids, i.e., the distortion; if the distortion is not smaller than the distortion parameter ε, the third step is repeated, and once the difference falls below ε, training is complete and the resulting centroids serve as the representative blocks that together form the codebook.
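As a concrete illustration, the LBG iteration described above can be sketched as follows (a minimal NumPy sketch; the stopping rule compares successive mean distortions against ε rather than the exact form in the patent):

```python
import numpy as np

def lbg(vectors, k, eps=1e-4, rng=None):
    """Minimal LBG clustering sketch: returns k centroids for `vectors` (N x D)."""
    rng = np.random.default_rng(rng)
    # Step 2: pick k random vectors as initial centroids.
    centroids = vectors[rng.choice(len(vectors), k, replace=False)].astype(float)
    prev = np.inf
    while True:
        # Step 3: assign each vector to its nearest centroid (Euclidean distance).
        d = np.linalg.norm(vectors[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Step 4: recompute each centroid as the mean of its cluster.
        for g in range(k):
            members = vectors[labels == g]
            if len(members):
                centroids[g] = members.mean(axis=0)
        # Step 5: stop once the distortion improvement falls below eps.
        distortion = d[np.arange(len(vectors)), labels].mean()
        if prev - distortion < eps:
            return centroids, labels
        prev = distortion
```

Because the initialization is random, repeated runs on the same vectors can yield different centroids, which is the instability noted below.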
In general, the above LBG algorithm starts from a random initialization, and the original vectors are numerous and complex; the results of encoding with the LBG method are therefore rather unstable, giving it the drawback of poor quality.
Another conventional coding algorithm is the Self-Organizing Map (SOM) method, shown in FIG. 1. A first step converts the pixels of an image into a number of input samples 8. A second step presets N neurons 9 and randomly selects N input samples as the initial positions of the N neurons 9. A third step randomly selects an input sample 8 and computes the Euclidean distance between that sample and every neuron 9. A fourth step selects the neuron 9 with the shortest distance as the winning neuron 91 and adjusts the winning neuron 91 together with the neighboring neurons 92 within a neighborhood radius R of the winning neuron 91, displacing the winning neuron 91 and the neighboring neurons 92 toward that input sample 8. A fifth step determines whether every input sample 8 has undergone the Euclidean distance computation; if not, the third step is repeated, and if so, the current training pass is complete, the neighborhood radius R is shrunk by a fixed ratio, and the next pass begins. Once a predetermined number of iterations has been reached, a neural network map composed of the neurons 9 is obtained. The resulting neurons 9 can then serve as a codebook in image compression.
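The conventional SOM training loop described above can be sketched as follows (an illustrative NumPy sketch; the one-dimensional neuron topology and Gaussian neighborhood weighting are assumptions, since the text does not fix those details):

```python
import numpy as np

def som_train(samples, n_neurons, n_iters=20, eta=0.5, radius=2.0, rng=None):
    """Conventional SOM sketch: fixed iteration count, shrinking neighborhood."""
    rng = np.random.default_rng(rng)
    # Initialize neurons at randomly chosen input samples.
    w = samples[rng.choice(len(samples), n_neurons, replace=False)].astype(float)
    idx = np.arange(n_neurons)
    for t in range(n_iters):
        for x in samples[rng.permutation(len(samples))]:
            winner = np.linalg.norm(w - x, axis=1).argmin()          # nearest neuron wins
            h = np.exp(-((idx - winner) ** 2) / (2 * radius ** 2))   # neighborhood weight
            w += eta * h[:, None] * (x - w)   # pull winner and neighbors toward x
        radius *= 0.95                        # shrink the neighborhood each pass
        eta *= 0.975
    return w
```

Every sample is compared against every neuron on every pass, which is why the cost grows with both the neuron count and the sample count, as noted below.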
The above conventional coding algorithm requires a large amount of computation before a satisfactory neural network map is obtained, and the more neurons 9 or input samples 8 there are, the longer the computation takes; it therefore consumes considerable time, giving it the drawback of low efficiency.
Another conventional coding algorithm, the Fast SOMs method, improves on the self-organizing map method above: it first divides the input samples into N groups with the K-means algorithm, where N equals the number of neurons, and then uses the centroids of the N groups as the initial positions of the N neurons, yielding a preliminary neural network map; the conventional self-organizing map computation then proceeds, shortening the computation time of the self-organizing map method.
However, because the initial neuron positions in the above Fast SOMs method rely on a two-dimensional-space concept, the method is not suitable for processing high-dimensional data.
Another conventional coding algorithm is the Hierarchical Self-Organizing Map (HSOM) method, whose main idea is to split the computation of the self-organizing map method into two layers. For example, if the original SOM presets 16×16 = 256 neurons, its time complexity is high. HSOM instead first trains a first layer of 16 neurons with the self-organizing map method, then groups the input samples into those 16 first-layer neurons, dividing them into 16 groups of input samples; each group is then trained with 16 second-layer neurons using the self-organizing map method, so that 16 second-layer neurons are obtained per group, for 16×16 = 256 second-layer neurons in total. In this way the time complexity of the self-organizing map method is reduced.
However, in the hierarchical self-organizing map method both the first-layer and second-layer neurons are trained with the SOM algorithm, so the training time remains long. Moreover, although the groups contain different numbers of input samples, every group trains its second-layer neurons with the same neuron count, so the true distribution of the input samples cannot be described faithfully, giving the method the drawback of poor training accuracy. Furthermore, during SOM training some neurons may have poor initial positions and never become winning neurons, so these neurons likewise fail to describe the true distribution of the input samples.
For the above reasons, it is necessary to further improve the conventional codebook generation methods described above.
The present invention remedies the above drawbacks by providing a codebook generation method whose object is to reduce the time cost of generating a codebook.
A second object of the present invention is to provide a codebook generation method that improves the compression quality of image data.
A further object of the present invention is to provide a codebook generation method that improves the accuracy of the neuron distribution.
The codebook generation method according to the present invention comprises: a slicing-and-conversion step, which partitions an original input image into a number of original blocks, converts each original block into an original vector, and sets the size of a codebook; a preliminary clustering step, which clusters the original vectors with a partitional clustering algorithm to obtain a number of first centroids and assigns each original vector to its nearest first centroid, yielding a number of clusters; a neuron allocation step, which distributes the number of neurons in each cluster in proportion to distortion and then, according to that allocation, selects original vectors within each cluster as the neurons, the total number of neurons equaling the codebook size; a neuron training step, which takes the original vectors within each cluster as samples and trains the neurons of each cluster separately with a self-organizing map (SOM) algorithm to obtain a number of final neurons; an idle-neuron determination step, which selects one of the clusters and defines the final neurons in that cluster that never became winners as idle neurons; a substitution step, which clusters the original vectors within that cluster with a partitional clustering algorithm to obtain second centroids equal in number to the idle neurons and substitutes those second centroids for the idle neurons as final neurons; and a termination determination step, which determines whether the idle neurons of all clusters have been replaced — if the determination is negative, the idle-neuron determination step is performed again, and if affirmative, the vectors corresponding to the final neurons are stored together in the codebook.
To make the above and other objects, features, and advantages of the present invention more apparent, preferred embodiments of the invention are described in detail below with reference to the accompanying drawings. Referring to FIG. 2, the codebook generation method of a preferred embodiment of the present invention comprises a slicing-and-conversion step S1, a preliminary clustering step S2, a neuron allocation step S3, a neuron training step S4, an idle-neuron determination step S5, a substitution step S6, and a termination determination step S7.
Referring to FIG. 2, the codebook generation method of the present invention is executed on a computer system connected to at least one database; the database stores at least one original input image, and the original input image is composed of a number of pixels.
Referring to FIGS. 2 and 3, the slicing-and-conversion step S1 of the present invention partitions an original input image into a number of original blocks, converts each original block into an original vector 11, and sets the size of a codebook. For example, if the image is a 512×512 picture file, the original input image can be sliced into 16384 original 4×4 blocks, each containing 16 pixels; the pixels of these 16384 original blocks are then converted into original vectors 11 by vector quantization (VQ) and stored in the database. The codebook size is the number of vectors stored in the codebook; for example, if the user ultimately wants to represent the original input image with 1024 representative blocks, the codebook size is 1024, accommodating the 1024 vectors to be converted into the representative blocks. In this embodiment the codebook size is set to 1024.
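The slicing of a 512×512 image into 16384 block vectors of 16 pixels each can be sketched as follows (a NumPy sketch; the random image is a stand-in for an actual input image):

```python
import numpy as np

def image_to_vectors(img, block=4):
    """Slice an H x W grayscale image into flattened block-by-block vectors."""
    h, w = img.shape
    tiles = img.reshape(h // block, block, w // block, block).swapaxes(1, 2)
    return tiles.reshape(-1, block * block)

img = np.random.default_rng(0).integers(0, 256, (512, 512))  # stand-in image
vectors = image_to_vectors(img)
print(vectors.shape)  # (16384, 16): 16384 original blocks of 16 pixels each
```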
Referring to FIGS. 2 and 4, the preliminary clustering step S2 of the present invention clusters the original vectors 11 with a partitional clustering algorithm to obtain a number of first centroids 12, and assigns each original vector 11 to its nearest first centroid 12, yielding a number of clusters 2. More specifically, in this embodiment the partitional algorithm chosen is the LBG clustering algorithm, which proceeds as follows. The first step of the LBG method sets the number of clusters k to be formed, where the square of k is preferably equal to the codebook size. Continuing the above example, with a codebook size of 1024 = k², the number of clusters is k = √1024 = 32. The second step of the LBG method randomly selects 32 of the original vectors 11 as the initial positions of the initial centroids. The third step of the LBG method computes the Euclidean distance between each original vector 11 and the initial centroids, assigning each original vector 11 to its corresponding initial centroid. The fourth step of the LBG method computes the center of mass of the original vectors 11 in each group as the new centroid. The fifth step of the LBG method computes the distortion, i.e., the difference between the new and old initial centroids; if the distortion is not smaller than the distortion parameter, the third step of the LBG method is repeated, and once the difference falls below the distortion parameter, training is complete and 32 first centroids 12 are obtained. Each original vector 11 is then compared against all first centroids 12 by Euclidean distance and assigned to the nearest first centroid 12. Continuing the above example, since there are 32 first centroids 12, the original vectors 11 are divided into 32 clusters 2.
The distortion is estimated with the mean square error (MSE), as in equation (a):

MSE = (1 / (W × H)) · Σ_{x=1}^{W} Σ_{y=1}^{H} ( f(x, y) − f̂(x, y) )²  (a)

In equation (a), W is the width of the image, H is the height of the image, and f(x, y) − f̂(x, y) is the difference in bit depth between the (x, y)-th pixel of the original image and the (x, y)-th pixel of the compressed image. The MSE of two consecutive iterations, MSE(t) and MSE(t+1), is computed, and from these the variation of the average distortion, ΔMSE, is obtained; if the resulting ΔMSE is greater than a preset convergence threshold, the next iteration begins, and otherwise the method stops. In this embodiment the convergence threshold of the LBG method is set to 0.0001. In this way the large number of original vectors 11 can be preliminarily clustered by the partitional algorithm, reducing the complexity of the subsequent training and thus greatly reducing the computation time.
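Equation (a) and the convergence test can be sketched as follows (a minimal sketch; the relative-change form of ΔMSE is an assumption based on the description):

```python
import numpy as np

def mse(original, compressed):
    """Equation (a): mean squared pixel difference over a W x H image."""
    diff = original.astype(float) - compressed.astype(float)
    return (diff ** 2).mean()

def converged(mse_t, mse_t1, threshold=1e-4):
    """Stop when the relative change in average distortion drops below the threshold."""
    return abs(mse_t - mse_t1) / mse_t < threshold
```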
Referring to FIGS. 2 and 5, the neuron allocation step S3 of the preferred embodiment of the present invention distributes the number of neurons 13 in each cluster 2 in proportion to distortion and then, according to the allocation, selects original vectors 11 within each cluster 2 as the neurons 13, the total number of neurons 13 equaling the codebook size. More specifically, continuing the above example, if the codebook size of the present invention is set to 1024 and the number of first centroids 12 is 32, then on average each cluster 2 would be allocated 32 neurons 13, reaching the codebook size of 32×32 = 1024. However, the original vectors 11 in the clusters 2 are not the same: a cluster 2 with more original vectors 11 should be allocated more neurons 13 for training, so that the distribution of the original vectors 11 in that region can be expressed more accurately. Therefore, in this embodiment the neurons 13 are allocated according to the distortion ratio of each cluster 2: equation (b) first sums, within each cluster 2, the distances between every sample (original vector 11) and its corresponding first centroid 12, as follows:
d = Σ_i √( Σ_r (x_ir − c_r)² )  (b)

In equation (b), d is the distance sum, x_ir is the r-th dimension of the i-th sample, and c_r is the position of the first centroid 12 corresponding to that sample. Equation (c) then gives the number of neurons 13 to allocate to each cluster 2; if the resulting N_g is 0, it is set to 1. Equation (c) is as follows:
N_g = [ (d_g / Σ_g d_g) × N ]  (c)

In equation (c), N_g is the number of neurons 13 to allocate to the g-th cluster, d_g is the distance sum of the g-th cluster, N is the codebook size, and the brackets [ ] denote rounding. In this way, more neurons 13 are placed in a cluster 2 with a larger distance sum, matching the true distribution of the samples.
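The allocation of equations (b) and (c) can be sketched as follows (a NumPy sketch; the proportional form of equation (c) is reconstructed from the description above):

```python
import numpy as np

def allocate_neurons(clusters, centroids, codebook_size):
    """Distribute `codebook_size` neurons across clusters proportionally to each
    cluster's distortion (equation (b): sum of sample-to-centroid distances);
    every cluster receives at least one neuron."""
    d = np.array([np.linalg.norm(c - cent, axis=1).sum()
                  for c, cent in zip(clusters, centroids)])
    n = np.rint(d / d.sum() * codebook_size).astype(int)  # equation (c), rounded
    n[n == 0] = 1   # a cluster whose rounded count is 0 still gets one neuron
    return n
```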
Referring to FIGS. 2 and 5 to 8, the neuron training step S4 of the present invention takes the original vectors 11 within each cluster 2 as samples and trains the neurons 13 of each cluster 2 separately with a self-organizing map (SOM) algorithm to obtain a number of final neurons 14. More specifically, continuing the above example, as shown in FIG. 5, a number of original vectors 11 are randomly selected from the cluster 2 as the initial positions of the neurons 13, and the neurons 13 are then trained by the self-organizing map algorithm with the original vectors 11 of that cluster 2 as samples. The self-organizing map algorithm of this embodiment is detailed as follows. First, as shown in FIGS. 6 to 8, the first step of the SOM method computes, by equation (d), the Euclidean distance between one original vector 11' in the cluster 2 and each of the allocated neurons 13, selecting the neuron 13 nearest to the original vector 11' as the winning neuron 15. Equation (d) is as follows:
w*_n = arg min_n ‖ y_m − w_n(t) ‖  (d)

In equation (d), w*_n is the winning neuron 15, y_m is the position of the m-th original vector 11, m = 1, 2, …, k, and w_n(t) is the position of the n-th neuron 13 at the t-th iteration.
Next, as shown in FIG. 7, the second step of the SOM method adjusts, by equation (e), the positions of the winning neuron 15 and of the neighboring neurons 16 within the neighboring region S of the winning neuron 15. Equation (e) is as follows:
w_n(t+1) = w_n(t) + η(t) · h(r, R) · ( y_m − w_n(t) )  (e)

In equation (e), η(t) is the learning rate at the t-th iteration; r is the relative position of the neuron 13, expressed as r: ‖n − n*‖; R is the radius of the neighboring region; and h(r, R) is a neighborhood weighting that decreases as r grows relative to R. In this embodiment, η(0) = 1.
The first and second steps of the SOM method are then repeated until the distance computation has been completed for all original vectors 11.
The third step of the SOM method then updates the neighboring-region radius R(t) and the learning rate η(t), where R(t+1) = R(t) × 0.95, and if R(t+1) < 0.1 then R(t+1) = 0.1; likewise η(t+1) = η(t) × 0.975, and if η(t+1) < 0.01 then η(t+1) = 0.01.
Finally, the fourth step of the SOM method determines whether the average distortion rate is greater than a threshold; if so, the first through third steps of the SOM method are repeated, and if not, the SOM method terminates. The average distortion rate is calculated as follows. First, equation (f) computes the sum of the distances between all samples (original vectors 11) and their corresponding neurons:

D = Σ_i √( Σ_r (x_ir − c_r)² )  (f)
In equation (f), x_ir is the r-th dimension of the i-th sample, and c_r is the position of the neuron corresponding to that sample. Equation (g) then divides this distance sum by the number of samples (the denominator in equation (g)) to obtain the average distortion, as shown in equation (g).
The average distortion rate — the relative change of the average distortion between two consecutive iterations — is then calculated by equation (h) and compared against the threshold; in this embodiment the threshold is set to 0.0001.
As described above, the SOM method adopted in the present invention differs from the conventional SOM method in its termination condition: the conventional SOM method terminates after a preset number of iterations, so even when training has already closely approached the correct distribution, it must still run to the preset iteration count, cannot adapt to samples whose distributions differ widely, and wastes time on redundant computation. The present invention instead uses the average distortion rate as the basis of the termination decision, so training can be adjusted differentially for groups with different sample distributions, avoiding unnecessary time cost. At this point the neuron training step S4 is complete and a number of final neurons 14 are obtained, as shown in FIG. 8.
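The per-cluster SOM training with the average-distortion-rate stopping rule can be sketched as follows (a NumPy sketch; the one-dimensional topology, Gaussian neighborhood, and 500-pass safety cap are assumptions not fixed by the text, while the decay schedules and floors follow the values above):

```python
import numpy as np

def som_until_stable(samples, n_neurons, threshold=1e-4, rng=None):
    """SOM sketch that stops on the average-distortion rate rather than a
    fixed iteration count; also records which neurons ever win, for the
    idle-neuron step that follows."""
    rng = np.random.default_rng(rng)
    w = samples[rng.choice(len(samples), n_neurons, replace=False)].astype(float)
    won = np.zeros(n_neurons, dtype=bool)      # neurons that ever became winners
    eta, radius, prev = 1.0, 1.0, None
    idx = np.arange(n_neurons)
    for _ in range(500):                       # safety cap on training passes
        for x in samples[rng.permutation(len(samples))]:
            win = np.linalg.norm(w - x, axis=1).argmin()   # equation (d)
            won[win] = True
            h = np.exp(-((idx - win) ** 2) / radius ** 2)  # neighborhood weight
            w += eta * h[:, None] * (x - w)                # equation (e)
        radius = max(radius * 0.95, 0.1)       # R(t+1) = 0.95 R(t), floored at 0.1
        eta = max(eta * 0.975, 0.01)           # eta(t+1) = 0.975 eta(t), floored at 0.01
        # average distortion: mean distance from each sample to its nearest neuron
        avg = np.linalg.norm(samples[:, None] - w[None], axis=2).min(axis=1).mean()
        if prev is not None and abs(prev - avg) / prev <= threshold:
            break                              # distortion rate below threshold: stop
        prev = avg
    return w, won
```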
Referring to FIGS. 2, 6, and 9, the idle-neuron determination step S5 of the preferred embodiment of the present invention selects one cluster 2' and defines the final neurons 14 in that cluster 2' that never became winners as idle neurons 17. More specifically, in the neuron training step S4 above, some neurons 13 may never become the winning neuron 15 during SOM training and yet directly become final neurons 14 once SOM training completes, wasting those neurons 13. Therefore, in this idle-neuron determination step S5, the final neurons 14 in the cluster 2' that never became the winning neuron 15 are defined as idle neurons 17, and the substitution step S6 is then performed.
Referring to FIGS. 2, 9, and 10, the substitution step S6 of the preferred embodiment of the present invention clusters the original vectors 11 within the cluster 2' with a partitional clustering algorithm to obtain second centroids 18 equal in number to the idle neurons 17, and substitutes the second centroids 18 for the idle neurons 17 as final neurons. More specifically, as shown in FIG. 9, if the cluster 2' contains 3 idle neurons 17, the original vectors 11 within the cluster 2' are clustered by a partitional algorithm — in this embodiment the LBG clustering algorithm — to obtain 3 second centroids 18 corresponding to the number of idle neurons 17, and the 3 second centroids 18 replace the 3 idle neurons 17, as shown in FIG. 10, so that the idle neurons 17 do not misrepresent the distribution of the original vectors 11.
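The substitution step can be sketched as follows (a minimal sketch; `lbg_fn` stands for any partitional clustering routine that returns k centroids, such as the LBG method described earlier):

```python
import numpy as np

def replace_idle(w, won, cluster_vectors, lbg_fn):
    """Replace neurons that never won (idle neurons) with centroids obtained by
    clustering the cluster's own vectors, per the substitution step S6."""
    idle = np.flatnonzero(~won)               # indices of idle neurons
    if len(idle):
        w = w.copy()
        w[idle] = lbg_fn(cluster_vectors, len(idle))
    return w
```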
Referring to FIGS. 2, 10, and 11, the termination determination step S7 of the preferred embodiment of the present invention determines whether the idle neurons 17 of all clusters 2 have been replaced. If the determination is negative, the idle-neuron determination step S5 is performed again; if affirmative, the vectors corresponding to the final neurons 14 are stored together in the codebook. More specifically, it is checked whether every idle neuron 17 of every cluster 2 has been replaced by a second centroid 18. If some cluster 2 still contains an unreplaced idle neuron 17, the idle-neuron determination step S5 is repeated; once all idle neurons 17 in all clusters 2 have been replaced, the vectors corresponding to the resulting final neurons 14 are stored in the codebook, completing the codebook generation method of the present invention.
The vectors in the codebook obtained through the above steps of the present invention can be converted into a number of representative blocks, each with a corresponding code. The original blocks of the original input image can then be compared against the representative blocks, and the code of the nearest representative block serves as the index representing each original block, so that the smaller codebook represents the original input image, achieving compression.
Conversely, during decompression, a decoding algorithm uses the codebook and the indices to restore a number of restored blocks, which are finally reassembled into a reconstructed image, completing the decompression step.
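Encoding against the codebook and rebuilding the reconstructed image can be sketched as follows (a NumPy sketch assuming the 4×4 block layout of step S1):

```python
import numpy as np

def encode(vectors, codebook):
    """Index of the nearest codeword (Euclidean distance) for each block vector."""
    d = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
    return d.argmin(axis=1)

def decode(indices, codebook, image_shape, block=4):
    """Look up each index in the codebook and reassemble the restored blocks
    into the reconstructed image."""
    h, w = image_shape
    blocks = codebook[indices].reshape(h // block, w // block, block, block)
    return blocks.swapaxes(1, 2).reshape(h, w)

# Round-trip demo: with the codebook equal to the block vectors themselves,
# the reconstruction is exact.
img = np.arange(64, dtype=float).reshape(8, 8)
vecs = img.reshape(2, 4, 2, 4).swapaxes(1, 2).reshape(-1, 16)
print(np.array_equal(decode(encode(vecs, vecs), vecs, (8, 8)), img))  # True
```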
Referring to Table 1, to verify the compression quality and compression time of the present invention, six sets of test images — including Lena, Airplane, Peppers, and Boat — were tested, and ? independent runs were executed to obtain average test results. The six sets of images were pre-sliced with a block size of 4×4. Because the conventional HSOM method cannot handle codebook sizes whose square root is not an integer, some fields of the test tables contain no test data (N/A, Not Available). PSNR (Peak Signal-to-Noise Ratio) is a formula commonly used to evaluate compression quality — the higher the PSNR, the better the compression quality — as shown in equation (i):
式(i)中之MSE如式(a)所示。The MSE in equation (i) is as shown in equation (a).
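Equations (i) and (a) are not reproduced in this text (they appear as images in the original), but PSNR and MSE have standard definitions for 8-bit images, namely PSNR = 10·log₁₀(255²/MSE) with MSE the mean squared pixel difference. The sketch below computes both under that standard-definition assumption:

```python
import numpy as np

def mse(original, reconstructed):
    """Mean squared error between two images of the same shape (eq. (a))."""
    diff = original.astype(float) - reconstructed.astype(float)
    return np.mean(diff ** 2)

def psnr(original, reconstructed, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB (eq. (i)), assuming 8-bit images.
    Higher PSNR means better reconstruction quality."""
    e = mse(original, reconstructed)
    return float('inf') if e == 0 else 10.0 * np.log10(peak ** 2 / e)
```

A PSNR of infinity corresponds to a perfect (lossless) reconstruction; typical lossy VQ results fall in the 25–35 dB range.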
其中,表一中之INTSOM係指以本發明所產生之編碼簿進行壓縮之方法。由表一之結果可得知,本發明(INTSOM)之編碼簿產生方法不論於壓縮品質(PSNR)及運算時間上皆較習用LBG、SOM及HSOM法佳。In Table 1, INTSOM refers to the method of compressing with the codebook generated by the present invention. From the results in Table 1, it can be seen that the codebook generation method of the present invention (INTSOM) outperforms the conventional LBG, SOM and HSOM methods in both compression quality (PSNR) and computation time.
本發明係提供一種編碼簿產生方法,其係透過該初步分群步驟S2對該些原始向量11進行初步分群,以降低訓練複雜度。The present invention provides a codebook generation method which performs preliminary grouping of the original vectors 11 through the preliminary grouping step S2 to reduce training complexity.
本發明係提供一種編碼簿產生方法,其係根據失真度比例決定各群集2所配置之神經元13數目,以使該些神經元13可更真實的表現該些原始向量11之分佈。The present invention provides a codebook generation method which determines, according to the distortion ratio, the number of neurons 13 allocated to each cluster 2, so that the neurons 13 can more faithfully represent the distribution of the original vectors 11.
本發明係提供一種編碼簿產生方法,其係以平均失真率作為自組織映射圖(SOM)演算法之終止條件判斷依據,可依據不同樣本分佈之群組而差異化地進行訓練,可避免花費不必要的時間成本進行訓練。The present invention provides a codebook generation method which uses the average distortion rate as the termination condition of the self-organizing map (SOM) algorithm, so that training can be carried out differentially according to groups with different sample distributions, avoiding unnecessary time spent on training.
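The idea of stopping SOM training per cluster once the average distortion stops improving can be sketched as follows. This is an illustrative simplification: the learning-rate schedule, the convergence threshold, and the omission of neighborhood updates are all assumptions, not the patent's actual parameters.

```python
import numpy as np

def train_cluster(vectors, neurons, lr=0.5, tol=1e-3, max_epochs=100):
    """SOM-style training of one cluster's neurons that terminates when the
    average distortion (mean squared distance from each vector to its winning
    neuron) improves by less than `tol` between epochs. Hyperparameters here
    are illustrative assumptions."""
    prev = np.inf
    for _ in range(max_epochs):
        dist_sum = 0.0
        for v in vectors:
            d = ((neurons - v) ** 2).sum(axis=1)
            w = d.argmin()                       # winning neuron (ref. 15)
            neurons[w] += lr * (v - neurons[w])  # pull winner toward the sample
            dist_sum += d[w]
        avg = dist_sum / len(vectors)
        if prev - avg < tol:                     # average distortion converged
            break
        prev = avg
        lr *= 0.9                                # decay the learning rate
    return neurons
```

Because each cluster checks its own average distortion, clusters whose samples are tightly distributed stop early, while widely-spread clusters keep training — which is the differentiated-training behavior the paragraph above describes.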
本發明係提供一種編碼簿產生方法,其係以該些第二形心點18取代該些閒置神經元17,以避免神經元17之浪費。The present invention provides a codebook generation method which replaces the idle neurons 17 with the second centroids 18, so as to avoid wasting neurons 17.
雖然本發明已利用上述較佳實施例揭示,然其並非用以限定本發明,任何熟習此技藝者在不脫離本發明之精神和範圍之內,相對上述實施例進行各種更動與修改仍屬本發明所保護之技術範疇,因此本發明之保護範圍當視後附之申請專利範圍所界定者為準。While the invention has been disclosed by the above preferred embodiments, they are not intended to limit the invention. Various changes and modifications made to the above embodiments by those skilled in the art, without departing from the spirit and scope of the invention, still fall within the technical scope protected by the invention; therefore, the scope of protection of the invention shall be defined by the appended claims.
11...原始向量 Original vector
11'...原始向量 Original vector
12...第一形心點 First centroid
13...神經元 Neuron
14...最終神經元 Final neuron
15...優勝神經元 Winning neuron
16...鄰近神經元 Neighboring neuron
17...閒置神經元 Idle neuron
18...第二形心點 Second centroid
2...群集 Cluster
2'...群集 Cluster
R...鄰近區域半徑 Neighborhood radius
8...輸入樣本 Input sample
9...神經元 Neuron
91...優勝神經元 Winning neuron
92...鄰近神經元 Neighboring neuron
第1圖:習用編碼演算法之示意圖。Figure 1: Schematic diagram of a conventional coding algorithm.
第2圖:本發明之編碼簿產生方法的流程圖。Fig. 2 is a flow chart showing a method of generating a codebook of the present invention.
第3圖:本發明之編碼簿產生方法的原始向量分佈示意圖。Figure 3 is a diagram showing the original vector distribution of the codebook generation method of the present invention.
第4圖:本發明之編碼簿產生方法的初步分群步驟之示意圖。Figure 4 is a schematic illustration of the preliminary grouping steps of the codebook generation method of the present invention.
第5圖:本發明之神經元訓練步驟中進行SOM之神經元初始示意圖。Figure 5: Schematic diagram of the initial neurons for SOM in the neuron training step of the present invention.
第6圖:本發明之神經元訓練步驟中進行SOM之優勝神經元的示意圖。Figure 6: Schematic diagram of the winning neuron for SOM in the neuron training step of the present invention.
第7圖:本發明之神經元訓練步驟中進行SOM之調整神經元位置之示意圖。Figure 7: Schematic diagram of adjusting neuron positions for SOM in the neuron training step of the present invention.
第8圖:本發明完成神經元訓練步驟之局部示意圖。Figure 8: A partial schematic view of the completion of the neuron training procedure of the present invention.
第9圖:本發明完成神經元訓練步驟之示意圖。Figure 9: Schematic diagram of the completion of the neuron training steps of the present invention.
第10圖:本發明之閒置神經元判斷步驟的示意圖。Figure 10: Schematic diagram of the idle neuron determination step of the present invention.
第11圖:本發明之取代步驟的示意圖。Figure 11: Schematic diagram of the substitution step of the present invention.
Claims (5)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW98132154A TWI430204B (en) | 2009-09-23 | 2009-09-23 | Codebook generating method |
Publications (2)
Publication Number | Publication Date |
---|---|
TW201112171A TW201112171A (en) | 2011-04-01 |
TWI430204B true TWI430204B (en) | 2014-03-11 |
Family
ID=44909177
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW98132154A TWI430204B (en) | 2009-09-23 | 2009-09-23 | Codebook generating method |
Country Status (1)
Country | Link |
---|---|
TW (1) | TWI430204B (en) |
2009-09-23: TW TW98132154A patent/TWI430204B/en, not_active IP Right Cessation
Also Published As
Publication number | Publication date |
---|---|
TW201112171A (en) | 2011-04-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
TWI385592B (en) | Codebook generating method | |
CN106157339B (en) | Animation mesh sequence compression method based on low-rank vertex trajectory subspace extraction | |
Pareek et al. | IntOPMICM: intelligent medical image size reduction model | |
JP5662429B2 (en) | Method for encoding / decoding a 3D mesh model composed of one or more components | |
JP7614288B2 (en) | Representation of neural networks | |
CN112133373B (en) | Prediction method of titanium alloy constitutive relationship based on machine learning | |
CN114782265B (en) | Image inpainting method based on adversarial multi-scale and residual multi-channel spatial attention | |
CN104869425A (en) | Compression and decompression method based on texture image similarity | |
CN110162290B (en) | A compression method for DeMURA data of OLED screen | |
Cui et al. | Multi-stage residual hiding for image-into-audio steganography | |
Pal et al. | An efficient codebook initialization approach for LBG algorithm | |
US20220360788A1 (en) | Image encoding method and image decoding method | |
CN116939226A (en) | A generative residual repair method and device for low bit rate image compression | |
CN112702600A (en) | Image coding and decoding neural network layered fixed-point method | |
WO2008067766A1 (en) | Method and device for quantizing vector | |
TWI376960B (en) | Codebook generating method for image compression | |
TWI430204B (en) | Codebook generating method | |
Baviskar et al. | Performance evaluation of high quality image compression techniques | |
Begum | An efficient algorithm for codebook design in transform vector quantization | |
CN101267557B (en) | A Method of Image Compression and Decoding Using Composite Vector Quantization | |
CN103794219B (en) | A kind of Codebook of Vector Quantization based on the division of M code word generates method | |
CN115834914A (en) | Tensor network-based entropy coding and decoding method and image compression method | |
CN114554210A (en) | Lossless image compression method and system | |
TWI478540B (en) | Image compression method | |
CN106028043A (en) | 3D Self-Organizing Map Image Coding Method Based on New Neighborhood Function |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
MM4A | Annulment or lapse of patent due to non-payment of fees |