CN109802964B - DQN-based HTTP adaptive flow control energy consumption optimization method - Google Patents
- Publication number
- CN109802964B (Application CN201910060941.7A)
- Authority
- CN
- China
- Prior art keywords
- state
- action
- value
- energy consumption
- neural network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
Abstract
A DQN-based HTTP adaptive flow control energy consumption optimization method considers different network conditions, the loading state of the client buffer, and the remaining battery of the client device, and simulates usage under this environment. During the interaction between client and server, the streaming client switches the multimedia files among different quality levels through the DQN learning system and switches between high-frequency and low-frequency CPU cores, thereby achieving the goal of energy consumption optimization.
Description
Technical Field
The invention belongs to the technical field of computer network communication, and particularly relates to a DQN-based HTTP adaptive flow control energy consumption optimization method.
Background
In recent years, the multimedia field has developed rapidly, and the transmission of multimedia content has received increasing attention. Watching video over HTTP became the mainstream mode of online viewing after the popularization of the Internet. HTTP video delivery has gone through two main stages. The first stage is progressive download: colloquially, the user can download and play a multimedia file at the same time, without waiting for the whole file to finish downloading. However, this is not true streaming and is essentially no different from an ordinary file download. In the second stage, HTTP streaming, the server divides a media file into small slices and, upon receiving a request, sends the slices in HTTP responses. During the interaction between server and client, the client adjusts the slice bitrate in real time according to the network state: a high bitrate is used when the network is good, a low bitrate when the network is busy, and the switching is automatic. The main implementation is that the server marks the bitrate of each slice in the manifest files it provides, and the client player adjusts automatically according to the playback progress and download speed, improving the user experience as much as possible while guaranteeing continuous and smooth playback. What remains to be done is a deeper optimization of the energy consumption of the client device, on the premise of continuous and smooth playback. When the client plays an online video, the network state, the cache state, and the phone's remaining battery are often ignored; the bitrate selection of HTTP adaptive streaming is inflexible and cannot cope well with complex network conditions. Frequently switching the bitrate of the video stream not only gives viewers an uncomfortable experience but also incurs an energy overhead that is usually ignored. Therefore, an energy consumption optimization model based on deep Q-learning, combining reinforcement learning with a neural network, is proposed.
Q-learning is a classic reinforcement learning method. The core idea of reinforcement learning is that an agent continuously interacts with the environment, takes a suitable action to obtain a reward, and transitions to the next state. The core of Q-learning is the Q-table, whose rows and columns represent states and actions respectively; the Q value in the table measures how good it is to take action a in state s. The neural network used here can be regarded as a black box: the input is a state and the output is the value of each action in that state. The training data come from data generated while the whole system runs; these data are corrected during the return calculation, the corrected values are fed back into the neural network for secondary training, and the network eventually converges, after which the optimal policy is selected.
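As a rough illustration (not the patent's actual environment or reward), the tabular Q-learning update and ε-greedy selection described above can be sketched as follows; the states, actions, and constants are placeholders:

```python
import random
from collections import defaultdict

# Illustrative hyperparameters (not taken from the patent)
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2

q_table = defaultdict(float)  # maps (state, action) -> Q value; the "Q-table"

def choose_action(state, actions):
    """epsilon-greedy: explore with probability EPSILON, else exploit the table."""
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: q_table[(state, a)])

def update(state, action, reward, next_state, actions):
    """Standard Q-learning backup:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q_table[(next_state, a)] for a in actions)
    q_table[(state, action)] += ALPHA * (
        reward + GAMMA * best_next - q_table[(state, action)]
    )
```

In DQN, the table lookup `q_table[(state, action)]` is replaced by a neural network that maps a state to the Q values of all actions, which is what makes high-dimensional states tractable.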
Disclosure of Invention
To overcome the defects of the prior art, the present invention provides a DQN-based HTTP adaptive flow control energy consumption optimization method, which combines Q-learning with a BP (back-propagation) neural network to interact with a continuously changing environment: the network conditions and the power consumed while the user watches online video. Under this variable environment, the system dynamically matches and switches the video quality in the video player and dynamically schedules different CPU cores, so as to obtain the most suitable media quality level and the most suitable CPU core, thereby reducing energy consumption.
In order to achieve the purpose, the invention adopts the technical scheme that:
a DQN-based HTTP adaptive flow control energy consumption optimization method comprises the following steps:
1) Environment acquisition and modeling: Dummynet is used to simulate the networks used in daily life, and the client is operated under 3G, 4G and WiFi network environments. The current environment information is collected as a state set S = (B, N, E) consisting of three components: the client data cache state B, i.e. the segment length in the current buffer; the network state N; and the battery level E. Time is divided into discrete time points, each state corresponds to one time point, and the data are stored;
2) Definition of the client action set and reward function: a Q-learning state space and the model's action set are established, using the environment data collected in step 1) as the state set; the system selects a suitable action to enter the next state according to the network state, cache state and battery level. The action set mainly contains two kinds of actions: switching the video slice quality, and switching between the high-frequency and low-frequency cores. The reward function is defined as the weighted sum of an energy consumption grade and a switching overhead, and has two parts. The first is the energy consumption grade value, selected from a mapping table that maps network level, video quality and CPU core to an energy consumption grade. The second is the overhead caused by switching the video quality or switching between big and little cores, which is negative feedback. The reward function is therefore: R = C1·R_energy + C2·R_switch, where C1 and C2 are the weights of the two reward terms; their specific values are set according to the user's preference, and both may be taken as 1;
3) Algorithm implementation: the optimal action is selected by continuously interacting with the environment. The main role of the neural network is to map a high-dimensional state to a low-dimensional output: the network takes the environment state s as input and outputs the Q value corresponding to each action. An ε-greedy algorithm is used: in each state, an action is selected at random with a small probability ε, and with probability 1−ε the optimal action is selected according to the BP neural network. Both the randomly selected actions and the actions selected by the network are added to a replay_buffer experience pool for secondary training. The system takes an action, reaches the next state, and trains the neural network to optimize the mapping from input state to output value, finally outputting the optimal policy;
4) In practice, the device obtains the environment state from the system and, through the DQN, selects the best-matching video quality and the CPU core that saves the most power without affecting the user experience.
In the system environment information, the defined state set S contains the network level, which is divided into six grades from high to low; measurement shows, however, that even the lowest quality in the test video cannot be loaded normally at grade 1, grade 2, or under 3G. S also contains the phone's remaining battery level and the cached segment length; the cache state at each unit time point, i.e. the segment length, is obtained by writing a script that queries the cache information.
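The reward of step 2), R = C1·R_energy + C2·R_switch, can be sketched as below. The energy-grade table entries, weights, and field names are illustrative assumptions, not values disclosed in the patent; only the structure (a grade looked up from (network level, quality, core), plus a negative switching penalty) follows the description:

```python
# User-preference weights for the two reward terms; the patent notes both may be 1
C1, C2 = 1.0, 1.0

# Hypothetical mapping table: (network_level, quality, core) -> energy grade
ENERGY_GRADE = {
    (4, "low", "A7"): 5.0,
    (4, "hd", "A15"): 2.0,
    (6, "lossless", "A15"): 3.0,
}

def reward(state, action, prev_action):
    """R = C1 * R_energy + C2 * R_switch for one slice."""
    r_energy = ENERGY_GRADE.get(
        (state["net"], action["quality"], action["core"]), 0.0
    )
    # Switching overhead is negative feedback: penalize quality or core changes
    r_switch = 0.0
    if prev_action is not None:
        if action["quality"] != prev_action["quality"]:
            r_switch -= 1.0
        if action["core"] != prev_action["core"]:
            r_switch -= 1.0
    return C1 * r_energy + C2 * r_switch
```

With both weights at 1, a slice that downgrades quality (one switch penalty) while reaching the best energy grade would score its grade minus one, which is how the model trades switching discomfort against energy savings.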
The system of the invention interacts with the constantly changing states in the environment and allocates a reasonable streaming media quality and a reasonable CPU core to each segment. Experimental results show that the optimization method can effectively reduce the energy consumption of mobile streaming media on the device without affecting the user experience; the energy consumption of the loading part is reduced by twenty-one percent.
Drawings
FIG. 1 is a flow chart of the system of the present invention.
Fig. 2 is a diagram of the DQN learning process of the present invention.
Fig. 3 is a diagram of an application scenario of the present invention.
Detailed Description
The present invention will be further described with reference to the following examples, but the present invention is not limited to the following examples.
An HTTP adaptive flow control energy consumption optimization method based on DQN is shown in Figs. 1 and 3. HTTP adaptive streaming works by dividing a streaming media file into smaller segments for HTTP request, transmission, and so on; therefore a slice of the streaming media file is received first, and the system collects the network environment and the current battery level and processes the data. The specific process is as follows:
1) A state set S is defined. The network level is divided into six grades from high to low, but measurement shows that even the lowest quality in the test video cannot be loaded normally at grades 1 and 2 or under 3G, so those cases are given a return value of 0. The state also records the phone's remaining battery level and the cached segment length; the cache state at each unit time point, i.e. the segment length, is obtained by writing a script that queries the cache information.
2) An action set is defined. The Android XU3 development board used here has Cortex-A15 high-frequency cores and Cortex-A7 low-frequency cores; the actions mainly decide, according to environmental changes, which core works and which core sleeps, i.e. a task selects A15 or A7. The streaming media quality is divided into lossless, high-definition and low-definition; these actions apply only to the video set used in the experimental tests.
3) The reward function and the model are constructed.
First the neural network is initialized. The main purpose of the BP neural network is to estimate the value of each action in each state and to reduce the dimension of the state vector. The learning rate α and discount factor γ in the Q-value iteration formula, and the exploration probability ε used in action selection, are assigned. Each iteration cycle then proceeds as shown in Fig. 2: after initialization, the system state is input and the output is the value produced by the current action; the estimated output replaces the previous one, and the optimal solution is approached step by step. After the value of each action is obtained, an ε-greedy strategy is used to select the best action. A threshold is initialized to 0.8, i.e. initially eighty percent of actions are selected at random, while for the remaining twenty percent the values are computed by the neural network and the most suitable action is chosen; as learning continues, this threshold decreases until actions are no longer selected at random.
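The loop above — ε-greedy selection starting at ε = 0.8, annealing as learning proceeds, with transitions stored in a replay buffer for secondary training — can be sketched as follows. The network forward pass is stubbed out, and the decay rate, buffer size, and function names are illustrative assumptions rather than the patent's implementation:

```python
import random
from collections import deque

# epsilon starts at 0.8 per the description; floor and decay rate are assumptions
EPSILON_START, EPSILON_MIN, DECAY = 0.8, 0.05, 0.995

replay_buffer = deque(maxlen=10_000)  # experience pool for secondary training

def q_values(state, n_actions):
    # Placeholder for the BP neural network forward pass: state -> Q per action
    return [0.0] * n_actions

def select_action(state, n_actions, epsilon):
    """epsilon-greedy: random action with probability epsilon, else argmax Q."""
    if random.random() < epsilon:
        return random.randrange(n_actions)
    qs = q_values(state, n_actions)
    return max(range(n_actions), key=qs.__getitem__)

def run_episode(env_step, state, n_actions, steps, epsilon=EPSILON_START):
    """Interact for `steps` transitions, filling the replay buffer and
    annealing epsilon; env_step(state, action) -> (next_state, reward)."""
    for _ in range(steps):
        action = select_action(state, n_actions, epsilon)
        next_state, reward = env_step(state, action)
        replay_buffer.append((state, action, reward, next_state))
        state = next_state
        epsilon = max(EPSILON_MIN, epsilon * DECAY)  # explore less over time
    return epsilon
```

A real implementation would periodically sample minibatches from `replay_buffer` and train the BP network on the corrected target values, which is the "secondary training" the description refers to.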
Claims (1)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201910060941.7A CN109802964B (en) | 2019-01-23 | 2019-01-23 | DQN-based HTTP adaptive flow control energy consumption optimization method |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN109802964A CN109802964A (en) | 2019-05-24 |
| CN109802964B (en) | 2021-09-28 |
Family
ID=66560085
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201910060941.7A Active CN109802964B (en) | 2019-01-23 | 2019-01-23 | DQN-based HTTP adaptive flow control energy consumption optimization method |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN109802964B (en) |
Families Citing this family (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110414725B (en) * | 2019-07-11 | 2021-02-19 | 山东大学 | Wind farm energy storage system dispatching method and device integrating forecasting and decision making |
| CN114885208B (en) * | 2022-03-21 | 2023-08-08 | 中南大学 | Dynamic adaptive method, device and medium for scalable streaming media transmission under NDN network |
| CN117979054B (en) * | 2024-02-04 | 2025-08-22 | 山东大学 | Energy saving method for playing short video streaming media in mobile terminal |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN108063961A (en) * | 2017-12-22 | 2018-05-22 | 北京联合网视文化传播有限公司 | A kind of self-adaption code rate video transmission method and system based on intensified learning |
| CN108737382A (en) * | 2018-04-23 | 2018-11-02 | 浙江工业大学 | SVC coding HTTP streaming media self-adaption method based on Q-Learning |
| AU2017268276A1 (en) * | 2016-05-16 | 2018-12-06 | Wi-Tronix, Llc | Video content analysis system and method for transportation system |
| CN108966330A (en) * | 2018-09-21 | 2018-12-07 | 西北大学 | A kind of mobile terminal music player dynamic regulation energy consumption optimization method based on Q-learning |
Family Cites Families (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11062207B2 (en) * | 2016-11-04 | 2021-07-13 | Raytheon Technologies Corporation | Control systems using deep reinforcement learning |
- 2019-01-23: application CN201910060941.7A filed (CN); granted as CN109802964B, status Active
Non-Patent Citations (3)
| Title |
|---|
| Evaluation of Q-Learning approach for HTTP Adaptive Streaming; Virginia Martín; 2016 IEEE International Conference on Consumer Electronics; 2016-03-14; pp. 293-294 * |
| Live Streaming with Content Centric Networking; Hongfeng Xu; 2012 Third International Conference on Networking and Distributed Computing; 2012-12-31; pp. 1-5 * |
| Research on bitrate control method for HTTP adaptive streaming based on Q-learning; Xiong Lirong; Journal on Communications; 2017-09-25; vol. 38, no. 9; pp. 18-24 * |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |
