On the Fairness of Internet Congestion Control over WiFi with Deep Reinforcement Learning
Figure 1
<p>Starvation in multiple TCP protocols (PCC Vivace, BBR, Copa, and CUBIC) with (<b>a</b>) different loss ratios of two flows (<math display="inline"><semantics> <mrow> <msub> <mi>p</mi> <mn>1</mn> </msub> <mo>=</mo> <mi>n</mi> <msub> <mi>p</mi> <mn>2</mn> </msub> <mo>,</mo> <mi>n</mi> <mo>=</mo> </mrow> </semantics></math> loss ratio <math display="inline"><semantics> <mrow> <mo>=</mo> <mn>1</mn> <mo>,</mo> <mo>…</mo> <mo>,</mo> <mn>20</mn> </mrow> </semantics></math>), where <math display="inline"><semantics> <msub> <mi>p</mi> <mn>1</mn> </msub> </semantics></math> and <math display="inline"><semantics> <msub> <mi>p</mi> <mn>2</mn> </msub> </semantics></math> represent the loss ratios of flow 1 and flow 2, <span class="html-small-caps">Case I:</span> Starvation from <a href="#futureinternet-16-00330-t001" class="html-table">Table 1</a> with different loss ratios, <math display="inline"><semantics> <mrow> <mi>n</mi> <mo>=</mo> <mn>1</mn> <mo>,</mo> <mn>2</mn> <mo>,</mo> <mo>…</mo> </mrow> </semantics></math>: (<b>b</b>) (<math display="inline"><semantics> <mrow> <msub> <mi>τ</mi> <mn>1</mn> </msub> <mo>=</mo> <mi>n</mi> <msub> <mi>τ</mi> <mn>2</mn> </msub> <mo>,</mo> <mi>n</mi> <mspace width="3.33333pt"/> <mo>=</mo> </mrow> </semantics></math> RTT ratio <math display="inline"><semantics> <mrow> <mo>=</mo> <mn>1</mn> <mo>,</mo> <mo>…</mo> <mo>,</mo> <mn>20</mn> </mrow> </semantics></math>), where <math display="inline"><semantics> <msub> <mi>τ</mi> <mn>1</mn> </msub> </semantics></math> and <math display="inline"><semantics> <msub> <mi>τ</mi> <mn>2</mn> </msub> </semantics></math> represent the RTTs of flow 1 and flow 2, <span class="html-small-caps">Case II:</span> Starvation from <a href="#futureinternet-16-00330-t001" class="html-table">Table 1</a> with different RTT ratios, <math display="inline"><semantics> <mrow> <mi>n</mi> <mo>=</mo> <mn>1</mn> <mo>,</mo> <mn>2</mn> <mo>,</mo> <mo>…</mo> </mrow> </semantics></math>.</p>
Figure 2
<p>A high-level view of our model design with different components and their interactions.</p>
Figure 3
<p>Our experimental testbed with two clients, one WiFi router, and one server. The clients contend with each other at the wireless bottleneck, and the server is connected via Ethernet.</p>
Figure 4
<p>Understanding throughput fairness (<math display="inline"><semantics> <mrow> <mi>T</mi> <msub> <mi>h</mi> <mrow> <mi>r</mi> <mi>a</mi> <mi>t</mi> <mi>i</mi> <mi>o</mi> </mrow> </msub> </mrow> </semantics></math>) by a comparison of throughput between two flows using the same CCA: (<b>a</b>–<b>c</b>). CUBIC flows showed higher fluctuations (<b>a</b>) compared to PCC flows (<b>b</b>), while BBR flows exhibited a significant throughput gap (<b>c</b>).</p>
Figure 5
<p>Comparison of throughput between CUBIC and BBR, focusing on Client 2’s switch to CUBIC at 25 s, for understanding throughput fairness <math display="inline"><semantics> <mrow> <mi>T</mi> <msub> <mi>h</mi> <mrow> <mi>r</mi> <mi>a</mi> <mi>t</mi> <mi>i</mi> <mi>o</mi> </mrow> </msub> </mrow> </semantics></math>. Before the switch, Client 2 (BBR) had 4.08 Mbps compared to Client 1’s 79.44 Mbps CUBIC flow. After switching, Client 2’s average throughput increased to 38.72 Mbps, stabilizing close to Client 1’s 40.64 Mbps CUBIC flow with minimal fluctuations. (<b>a</b>) CUBIC vs. BBR->CUBIC. (<b>b</b>) CUBIC vs. BBR->CUBIC.</p>
Figure 6
<p>RTT comparison of CUBIC vs. BBR->CUBIC, Client 2, before and after switching. RTT variation is significant: Client 1 (CUBIC) peaks at 170 ms at 5 s and drops to 20 ms, while Client 2 (BBR->CUBIC) ranges from 98 ms to 4 ms. Both reach 13 ms at 80 s (<b>a</b>). After switching, RTT variation and median values for both clients have decreased (<b>b</b>).</p>
Figure 7
<p>Jitter comparison of Client 1 (CUBIC) and Client 2 (BBR->CUBIC) before and after switching at 25 s. Client 2’s jitter varied widely, ranging from 90 ms to almost 0 ms. After switching (<b>a</b>), Client 2’s average jitter dropped from 23 ms to 0.84 ms. Client 1’s median jitter remained near 0 ms (<b>b</b>).</p>
Figure 8
<p>Packet loss ratio comparison of Client 1 (CUBIC) vs. Client 2 (BBR->CUBIC) before and after switching at 25 s. Client 2 (BBR) reduced its packet loss ratio from 19.35% to 11.45% after switching to CUBIC, while Client 1 (CUBIC) fluctuated from 70% to 11% at 5 and 8 s (<b>a</b>). The packet loss distribution for Client 2 (BBR), later CUBIC, had a median value of 20% before dropping to around 13%, with Client 1 experiencing a slight increase after switching (<b>b</b>).</p>
Figure 9
<p>Comparison of the throughput of BBR and PCC, focusing on Client 2’s switch to PCC at 25 s, for understanding throughput fairness <math display="inline"><semantics> <mrow> <mi>T</mi> <msub> <mi>h</mi> <mrow> <mi>r</mi> <mi>a</mi> <mi>t</mi> <mi>i</mi> <mi>o</mi> </mrow> </msub> </mrow> </semantics></math>. Before switching, Client 2 (BBR) averaged 12.32 Mbps, which surged to 40.40 Mbps afterward (<b>a</b>). The distribution plot (<b>b</b>) shows Client 2’s median throughput increasing from below 15 Mbps to above 35 Mbps.</p>
Figure 10
<p>RTT variation over time: Client 1 (PCC) peaks at 102 ms at 3 s, dropping to 6 ms; Client 2 (BBR->PCC) ranges from 155 ms to 4 ms. Both clients maintained RTTs below 20 ms (<b>a</b>). After Client 2 switched to PCC, RTT variation and median values decreased (<b>b</b>).</p>
Figure 11
<p>Jitter comparison of Client 1 (PCC) and Client 2 (BBR->PCC) in milliseconds over time before and after switching at 25 s. Few fluctuations were observed for Client 1 (PCC) before switching (<b>a</b>). Median jitter remained near 1 ms for Client 1 after Client 2 switched from BBR to PCC (<b>b</b>).</p>
Figure 12
<p>Packet loss ratio comparison of Client 1 (PCC) vs. Client 2 (BBR->PCC) before and after switching at 25 s. The packet loss ratio for Client 1 (PCC) decreased from 48.97% to 9.22% after Client 2 switched to PCC (<b>a</b>). The packet loss distribution for Client 2 (BBR), later PCC, had a median value of around 50% before dropping to nearly 10% (<b>b</b>).</p>
Figure 13
<p>Comparison of throughput and sending rate in Mbps over time before and after switching Client 2 (PCC->BBR), for understanding throughput fairness <math display="inline"><semantics> <mrow> <mi>T</mi> <msub> <mi>h</mi> <mrow> <mi>r</mi> <mi>a</mi> <mi>t</mi> <mi>i</mi> <mi>o</mi> </mrow> </msub> </mrow> </semantics></math>. Before switching to BBR at 25 s, Client 2 (PCC) had an average throughput of 44 Mbps, double that of the competing Client 1 (BBR) flow. After switching, Client 2 (BBR) received over 40 Mbps (<b>a</b>). The median value for Client 1 increased from 15 Mbps to 40 Mbps (<b>b</b>).</p>
Figure 14
<p>RTT variation over time: Client 2 (PCC) peaks at 104 ms at 3 s, dropping to 4 ms; Client 2 (PCC->BBR) ranges from 155 ms to 4 ms. Both clients maintained RTTs below 15 ms (<b>a</b>). After Client 2 switched to BBR, RTT variation and median values decreased (<b>b</b>).</p>
Figure 15
<p>Jitter comparison of Client 1 (BBR) and Client 2 (PCC->BBR) before and after switching at 25 s. Average jitter for Client 2 dropped from around 3 ms to nearly 1 ms after switching (<b>a</b>). Median jitter for Client 1 remained near 0 ms (<b>b</b>).</p>
Figure 16
<p>Packet loss ratio comparison of Client 1 (BBR) vs. Client 2 (PCC->BBR) before and after switching at 25 s. The packet loss ratio fell from 32% to 10% after the switch (<b>a</b>). The median packet loss for Client 1 (BBR) dropped from around 29% to near 10% (<b>b</b>).</p>
Figure 17
<p>Visualization of training a DRL model: The total rewards over 1000 epochs demonstrate consistent improvement, with performance approaching a near-optimal policy after 700 epochs, where the rewards stabilize around 75, as shown in line graph (<b>a</b>). The box plot (<b>b</b>) reveals that the majority of training epochs achieve an average reward of 70, which represents the median value.</p>
Figure 18
<p>Visualization of training a DRL model: The loss over 500 epochs, shown in line graph (<b>a</b>), approaches near zero after 58 epochs, with minimal fluctuations between epochs 176 and 179. The box plot (<b>b</b>) indicates that 88% of the loss distribution is negligible (close to zero).</p>
Figure 19
<p>Visualization of training a DRL model: The accuracy over 500 epochs, as shown in line graph (<b>a</b>), attains 100% after the 60-epoch mark. The box plot (<b>b</b>) shows the distribution of training accuracy, with 96.8% of epochs achieving 100% accuracy.</p>
Abstract
1. Introduction
1.1. Gaps in Existing Studies
1.2. Why These Problems Need to Be Addressed
1.3. Proposed Solution and Contributions
- CCA Switching Mechanism: We propose a DRL-based solution that adjusts congestion control strategies based on real-time network conditions, compatible with existing TCP and WiFi standards.
- Delay Variation Analysis: Our method uses DRL models trained on non-congestive delay variations to help avoid extreme unfairness and starvation.
- Testing with Real Data: We tested our approach offline in a simulated environment using real data to ensure it aligns with the goals of fairness and efficiency in TCP. Online training and evaluation are beyond the scope of this paper.
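To make the idea of switching CCAs on a live flow concrete, the following minimal sketch shows one way a Linux client could read and change the congestion control algorithm of a TCP socket. It is an illustration under assumptions, not the authors' implementation: it relies on the Linux-only `TCP_CONGESTION` socket option exposed by Python's `socket` module, and the helper names `get_cca` and `set_cca` are hypothetical. A CCA can only be selected if the corresponding kernel module is loaded and permitted for the calling user.

```python
import socket


def get_cca(sock: socket.socket) -> str:
    """Return the name of the CCA currently attached to a TCP socket (Linux only)."""
    raw = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, 16)
    # The kernel returns a NUL-padded byte string such as b"cubic\x00\x00...".
    return raw.split(b"\x00", 1)[0].decode()


def set_cca(sock: socket.socket, name: str) -> None:
    """Ask the kernel to use the named CCA for this socket (Linux only).

    Raises OSError if the algorithm is unknown or not permitted.
    """
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, name.encode())


if __name__ == "__main__" and hasattr(socket, "TCP_CONGESTION"):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print("current CCA:", get_cca(s))
    try:
        set_cca(s, "bbr")  # assumption: the tcp_bbr module is available
        print("switched to:", get_cca(s))
    except OSError as exc:
        print("could not switch:", exc)
    finally:
        s.close()
```

Because the option is set per socket, a controller could apply a new choice to one flow without touching the system-wide default.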
2. Background and Related Works
2.1. TCP Fairness and ML Approaches
2.2. Closest Works in Literature
2.3. Deep Reinforcement Learning in TCP
3. Mathematical Interpretation of Unfairness and Starvation in Representative CCAs
- Equal Loss with Varying RTT: This hypothesis aims to investigate how a CCA manages flows with different RTTs when they experience the same level of packet loss. Specifically, it evaluates whether disparities in RTTs result in unequal congestion window allocation under identical loss conditions. The primary purpose is to ascertain whether variations in RTT alone can lead to unfairness in congestion window sizes across different flows.
- Equal RTT with Varying Loss: In this hypothesis, the focus is on how varying levels of packet loss impact the congestion window of flows with the same RTT. It examines whether changes in packet loss lead to fairness issues or potential starvation in flows that experience identical network delays. The objective is to investigate whether packet loss alone can disrupt fairness and lead to differential treatment of flows.
- Starvation Hypothesis: This hypothesis examines extreme cases of unfairness, where one flow significantly underperforms compared to another due to variations in RTT or packet loss. The aim is to assess the worst-case scenarios for fairness and evaluate how effectively the CCA prevents starvation.
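The first two hypotheses can be explored numerically with a deliberately simple stand-in model. The sketch below uses the classical square-root throughput approximation for loss-based TCP by Mathis et al., rate ≈ (MSS/RTT)·sqrt(3/2)/sqrt(p). This is an assumption made purely for illustration: it is not the per-CCA analysis of this paper, and it does not describe BBR, PCC Vivace, or Copa. The function names and the default MSS of 1460 bytes are hypothetical.

```python
import math


def mathis_throughput(mss_bytes: float, rtt_s: float, loss: float) -> float:
    """Approximate steady-state rate in bytes/s for a loss-based flow.

    Classical square-root model: (MSS / RTT) * sqrt(3 / (2 * p)).
    """
    if rtt_s <= 0 or not 0 < loss <= 1:
        raise ValueError("rtt_s must be positive and loss must be in (0, 1]")
    return (mss_bytes / rtt_s) * math.sqrt(1.5 / loss)


def model_throughput_ratio(rtt1, loss1, rtt2, loss2, mss_bytes=1460):
    """Throughput of flow 1 divided by throughput of flow 2 under the model."""
    return (mathis_throughput(mss_bytes, rtt1, loss1)
            / mathis_throughput(mss_bytes, rtt2, loss2))


if __name__ == "__main__":
    base_rtt, base_loss = 0.020, 0.01  # illustrative values, not measurements
    for n in (1, 2, 5, 10, 20):
        equal_loss = model_throughput_ratio(n * base_rtt, base_loss,
                                            base_rtt, base_loss)
        equal_rtt = model_throughput_ratio(base_rtt, n * base_loss,
                                           base_rtt, base_loss)
        print(f"n={n:2d}  RTT ratio n -> {equal_loss:.3f}   "
              f"loss ratio n -> {equal_rtt:.3f}")
```

Under this model, a flow whose RTT is n times larger receives 1/n of the competing flow's throughput, while a flow whose loss ratio is n times larger receives 1/sqrt(n). The starvation hypothesis asks how far real CCAs depart from such ratios in the worst case.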
4. System Model of Proposed TCP-Switching Mechanism
4.1. Model Interpretation
4.1.1. Proposed DRL-Based CCA Switching
State Space (S)
Action Space (A)
Reward Function (R)
Q-Function (Q)
Algorithm 1 DRL-Based TCP Switching Algorithm
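As a rough illustration of how a state space, an action space, a reward function, and a Q-function fit together for CCA switching, here is a small tabular Q-learning sketch. It is a simplification under stated assumptions rather than the algorithm of this paper: a lookup table replaces the deep network, the state is a coarse bucketing of throughput, RTT, and loss, the action set is three CCA names, and the reward is a hypothetical fairness score (the smaller throughput divided by the larger one). All names, bucket sizes, and hyperparameters are illustrative.

```python
import random
from collections import defaultdict

ACTIONS = ("cubic", "bbr", "pcc")  # hypothetical action set: which CCA to run next


def discretize(throughput_mbps, rtt_ms, loss_ratio):
    """Map raw measurements to a coarse, hashable state (bucket sizes are assumptions)."""
    return (int(throughput_mbps // 10), int(rtt_ms // 10), int(loss_ratio * 10))


def fairness_reward(own_mbps, peer_mbps):
    """Hypothetical reward in [0, 1]: 1.0 when both flows get equal throughput."""
    high = max(own_mbps, peer_mbps)
    return min(own_mbps, peer_mbps) / high if high > 0 else 0.0


class SwitchingAgent:
    """Epsilon-greedy tabular Q-learning over (state, CCA) pairs."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.q = defaultdict(float)  # Q(s, a), initialised to 0.0
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def act(self, state):
        """Explore with probability epsilon, otherwise pick the best-known CCA."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        """One-step Q-learning update: Q += alpha * (r + gamma * max Q' - Q)."""
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])


if __name__ == "__main__":
    agent = SwitchingAgent(epsilon=0.0)
    before = discretize(4.08, 2.32, 0.1935)   # illustrative measurement
    after = discretize(38.72, 13.01, 0.1145)  # illustrative measurement
    agent.update(before, "cubic", fairness_reward(38.72, 40.64), after)
    print("preferred CCA in the starved state:", agent.act(before))
```

A deep variant would replace the dictionary with a neural network that maps the state vector to one Q-value per CCA, with the update rule keeping the same form.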
5. Experimental Methodology, Results, and Discussion
5.1. Experimental Testbed Setup
Motivation to Choose the Scenarios
5.2. Scenario 1: CUBIC vs. BBR-CUBIC
5.3. Scenario 2: PCC vs. BBR-PCC
5.4. Scenario 3: BBR vs. PCC-BBR
RTT and Jitter
5.5. DRL Model
5.5.1. Training Reward
5.5.2. Training Loss
5.5.3. Training Accuracy
5.6. Our Key Findings
- Mitigating Non-Congestive Delay Variations: Delay-bounding CCAs can cope with non-congestive delay variations only if their own delay oscillations are at least half the expected non-congestive jitter along the network path [18]. If the delay oscillations fall below this threshold, the CCA may struggle to maintain high throughput, bounded delays, and fairness, potentially leading to inefficient network performance.
- Characteristics and Thresholds for Network Design: CCAs should adjust delays by at least half the expected non-congestive delays to differentiate between congestion-related and other delays. Failing to meet this threshold can cause the CCA to struggle with throughput, delay management, and fairness [18].
- Dynamic Switching of CCAs and Insights into Fairness and Stability: This study reveals that throughput unfairness persists within the same CCA, influenced by network path characteristics. Dynamically switching between CCAs based on network conditions can improve fairness and stability. These findings highlight the potential of using Deep Reinforcement Learning to adapt CCAs dynamically for better network performance and fairness.
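The half-jitter threshold mentioned in the findings above can be written as a one-line check. The sketch below is only an illustration of that inequality; the function name and the millisecond units are assumptions, and how the two quantities are measured on a real path is left open.

```python
def oscillation_is_sufficient(delay_oscillation_ms: float,
                              expected_jitter_ms: float) -> bool:
    """Return True if the CCA's delay oscillation is at least half the
    expected non-congestive jitter on the path."""
    if delay_oscillation_ms < 0 or expected_jitter_ms < 0:
        raise ValueError("inputs must be non-negative")
    return delay_oscillation_ms >= 0.5 * expected_jitter_ms


if __name__ == "__main__":
    # Illustrative numbers only: 10 ms of non-congestive jitter on the path.
    for oscillation in (2.0, 5.0, 8.0):
        verdict = oscillation_is_sufficient(oscillation, 10.0)
        print(f"oscillation {oscillation} ms -> meets threshold: {verdict}")
```

A switching controller could use a check of this kind as one input when deciding whether the running CCA is a poor fit for the current path.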
6. Conclusions and Future Work
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A. Detailed Calculations
Appendix A.1. TCP CUBIC
Appendix A.2. PCC Vivace
Appendix A.3. BBR
Appendix A.4. COPA
References
1. Al-Saadi, R.; Armitage, G.; But, J.; Branch, P. A survey of delay-based and hybrid TCP congestion control algorithms. IEEE Commun. Surv. Tutor. 2019, 21, 3609–3638.
2. Kua, J.; Armitage, G.; Branch, P. A survey of rate adaptation techniques for dynamic adaptive streaming over HTTP. IEEE Commun. Surv. Tutor. 2017, 19, 1842–1866.
3. Zhang, J.; Zhang, Y.; Dong, E.; Zhang, Y.; Ren, S.; Meng, Z.; Xu, M.; Li, X.; Hou, Z.; Yang, Z.; et al. Bridging the Gap between QoE and QoS in Congestion Control: A Large-scale Mobile Web Service Perspective. In Proceedings of the 2023 USENIX Annual Technical Conference (USENIX ATC 23), Boston, MA, USA, 10–12 July 2023; pp. 553–569.
4. Hoe, J.C. Improving the start-up behavior of a congestion control scheme for TCP. ACM SIGCOMM Comput. Commun. Rev. 1996, 26, 270–280.
5. Ha, S.; Rhee, I.; Xu, L. CUBIC: A new TCP-friendly high-speed TCP variant. ACM SIGOPS Oper. Syst. Rev. 2008, 42, 64–74.
6. Tan, K.; Song, J.; Zhang, Q.; Sridharan, M. A compound TCP approach for high-speed and long distance networks. In Proceedings of the IEEE INFOCOM, Barcelona, Spain, 23–29 April 2006.
7. Pokhrel, S.R.; Williamson, C. Modeling compound TCP over WiFi for IoT. IEEE/ACM Trans. Netw. 2018, 26, 864–878.
8. Floyd, S. TCP and explicit congestion notification. ACM SIGCOMM Comput. Commun. Rev. 1994, 24, 8–23.
9. Winstein, K.; Sivaraman, A.; Balakrishnan, H. Stochastic forecasts achieve high throughput and low delay over cellular networks. In Proceedings of the 10th USENIX Symposium on Networked Systems Design and Implementation (NSDI 13), Berkeley, CA, USA, 2–5 April 2013; pp. 459–471.
10. Winstein, K.; Balakrishnan, H. TCP ex Machina: Computer-generated congestion control. ACM SIGCOMM Comput. Commun. Rev. 2013, 43, 123–134.
11. Brakmo, L.S.; O’Malley, S.W.; Peterson, L.L. TCP Vegas: New techniques for congestion detection and avoidance. In Proceedings of the Conference on Communications Architectures, Protocols and Applications, London, UK, 31 August–2 September 1994; pp. 24–35.
12. Wei, D.X.; Jin, C.; Low, S.H.; Hegde, S. FAST TCP: Motivation, architecture, algorithms, performance. IEEE/ACM Trans. Netw. 2006, 14, 1246–1259.
13. Cardwell, N.; Cheng, Y.; Gunn, C.S.; Yeganeh, S.H.; Jacobson, V. BBR: Congestion-based congestion control. Commun. ACM 2017, 60, 58–66.
14. Dong, M.; Meng, T.; Zarchy, D.; Arslan, E.; Gilad, Y.; Godfrey, B.; Schapira, M. PCC Vivace: Online-learning congestion control. In Proceedings of the 15th USENIX Symposium on Networked Systems Design and Implementation (NSDI 18), San Francisco, CA, USA, 16–18 April 2018; pp. 343–356.
15. Jay, N.; Gilad, T.; Frankel, N.; Meng, T.; Godfrey, B.; Schapira, M.; Chung, J.W.; Siwach, V.; Salim, J.H. A PCC-Vivace Kernel Module for Congestion Control. University of Illinois Urbana-Champaign, Hebrew University of Jerusalem in Israel, Verizon. 2018. Available online: https://pbg.web.engr.illinois.edu/papers/jay18pcc-kernel.pdf (accessed on 2 September 2024).
16. Arun, V.; Balakrishnan, H. Copa: Practical Delay-Based Congestion Control for the Internet. In Proceedings of the 15th USENIX Symposium on Networked Systems Design and Implementation (NSDI 18), Renton, WA, USA, 9–11 April 2018; pp. 329–342.
17. Zaki, Y.; Pötsch, T.; Chen, J.; Subramanian, L.; Görg, C. Adaptive congestion control for unpredictable cellular networks. In Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication, London, UK, 17–21 August 2015; pp. 509–522.
18. Arun, V.; Alizadeh, M.; Balakrishnan, H. Starvation in end-to-end congestion control. In Proceedings of the ACM SIGCOMM 2022 Conference, Amsterdam, The Netherlands, 22–26 August 2022; pp. 177–192.
19. Seo, S.J.; Cho, Y.Z. Fairness enhancement of TCP congestion control using reinforcement learning. In Proceedings of the 2022 International Conference on Artificial Intelligence in Information and Communication (ICAIIC), Jeju Island, Republic of Korea, 21–24 February 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 288–291.
20. Liao, X.; Tian, H.; Zeng, C.; Wan, X.; Chen, K. Towards fair and efficient learning-based congestion control. arXiv 2024, arXiv:2403.01798.
21. Pokhrel, S.R.; Panda, M.; Vu, H.L. Fair Coexistence of Regular and Multipath TCP over Wireless Last-Miles. IEEE Trans. Mob. Comput. 2019, 18, 574–587.
22. Hamzah, M.F.; Athab, O.A. A Review of TCP Congestion Control Using Artificial Intelligence in 4G and 5G Networks. Am. Acad. Sci. Res. J. Eng. Technol. Sci. 2022, 88, 172–186.
23. Pokhrel, S.R.; Panda, M.; Vu, H.L.; Mandjes, M. TCP Performance over Wi-Fi: Joint Impact of Buffer and Channel Losses. IEEE Trans. Mob. Comput. 2016, 15, 1279–1291.
24. Wang, L. Low-Latency, High-Throughput Load Balancing Algorithms. J. Comput. Technol. Appl. Math. 2024, 1, 1–9.
25. Haile, H.; Grinnemo, K.J.; Ferlin, S.; Hurtig, P.; Brunstrom, A. End-to-end congestion control approaches for high throughput and low delay in 4G/5G cellular networks. Comput. Netw. 2021, 186, 107692.
26. Kua, J.; Nguyen, S.H.; Armitage, G.; Branch, P. Using active queue management to assist IoT application flows in home broadband networks. IEEE Internet Things J. 2017, 4, 1399–1407.
27. Pokhrel, S.R.; Kua, J.; Satish, D.; Ozer, S.; Howe, J.; Walid, A. DDPG-MPCC: An Experience Driven Multipath Performance Oriented Congestion Control. Future Internet 2024, 16, 37.
28. Satish, D.; Kua, J.; Pokhrel, S.R. Active Queue Management in L4S with Asynchronous Advantage Actor-Critic: A FreeBSD Networking Stack Perspective. Future Internet 2024, 16, 265.
29. Liu, Q.; Yang, P.; Yang, M.; Yu, L. CKCD: A fair and low latency queue control algorithm for heterogeneous TCP flows. In Proceedings of the 2020 International Conference on Computing, Networking and Communications (ICNC), Big Island, HI, USA, 17–20 February 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 725–730.
30. Bazi, K.; Nassereddine, B. Comparative analysis of TCP congestion control mechanisms. In Proceedings of the 3rd International Conference on Networking, Information Systems & Security, Marrakech, Morocco, 31 March–2 April 2020; pp. 1–4.
31. Gettys, J. Bufferbloat: Dark buffers in the Internet. IEEE Internet Comput. 2011, 15, 96.
32. Ye, J.; Leung, K.C. Adaptive and stable delay control for combating bufferbloat: Theory and algorithms. IEEE Syst. J. 2019, 14, 1285–1296.
33. McNair, D.S. Preventing disparities: Bayesian and frequentist methods for assessing fairness in machine learning decision-support models. In New Insights Into Bayesian Inference; IntechOpen: London, UK, 2018; Volume 71.
34. Kang, M.; Li, L.; Weber, M.; Liu, Y.; Zhang, C.; Li, B. Certifying some distributional fairness with subpopulation decomposition. Adv. Neural Inf. Process. Syst. 2022, 35, 31045–31058.
35. Valli, S.; Sankar, S.; Mehata, K. A Heuristic Method for Improving TCP Performance by a Greedy Routing Algorithm. J. Theor. Appl. Inf. Technol. 2017, 95, 5215–5223.
36. Yamazaki, M.; Yamamoto, M. Fairness improvement of congestion control with reinforcement learning. J. Inf. Process. 2021, 29, 592–595.
37. Zhang, S.; Lei, W.; Zhang, W.; Li, H. An evaluation of bottleneck bandwidth and round trip time and its variants. Int. J. Commun. Syst. 2021, 34, e4772.
38. Xiao, K.; Mao, S.; Tugnait, J.K. TCP-Drinc: Smart congestion control based on Deep Reinforcement Learning. IEEE Access 2019, 7, 11892–11904.
39. Ke, C.H.; Astuti, L. Applying Deep Reinforcement Learning to improve throughput and reduce collision rate in IEEE 802.11 networks. KSII Trans. Internet Inf. Syst. (TIIS) 2022, 16, 334–349.
40. Kim, M.; Hwang, S.; Lee, I. Deep Reinforcement Learning approach for fairness-aware scheduling in wireless networks. In Proceedings of the 2022 13th International Conference on Information and Communication Technology Convergence (ICTC), Jeju Island, Republic of Korea, 19–21 October 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1229–1232.
41. Arianpoo, N.; Leung, V.C. How network monitoring and reinforcement learning can improve TCP fairness in wireless multi-hop networks. EURASIP J. Wirel. Commun. Netw. 2016, 2016, 278.
42. Yu, Y.; Wang, T.; Liew, S.C. Deep-reinforcement learning multiple access for heterogeneous wireless networks. IEEE J. Sel. Areas Commun. 2019, 37, 1277–1290.
43. Maeta, K.; Kitagata, G.; Hasegawa, G. Improving per-flow fairness by ML-based estimation of competing flows’ congestion control algorithm. In Proceedings of the 2022 13th International Conference on Ubiquitous and Future Networks (ICUFN), Barcelona, Spain, 5–8 July 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 376–381.
44. Jay, N.; Rotman, N.; Godfrey, B.; Schapira, M.; Tamar, A. A Deep Reinforcement Learning perspective on Internet congestion control. In Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; PMLR: Birmingham, UK, 2019; pp. 3050–3059.
45. Naqvi, H.A.; Hilman, M.H.; Anggorojati, B. Implementability improvement of Deep Reinforcement Learning based congestion control in cellular network. Comput. Netw. 2023, 233, 109874.
46. Giacomoni, L. Enhancing Design and Evaluation Methods for Reinforcement Learning-based Congestion Control: A Large Scale Experimental Study of Fairness, Efficiency, Responsiveness and a Novel Simulation Framework as a Training and Evaluation Playground. Ph.D. Thesis, University of Sussex, Sussex, UK, 2024. Available online: https://hdl.handle.net/10779/uos.26135407.v1 (accessed on 2 September 2024).
47. Pan, W.; Li, X.; Tan, H.; Xu, J.; Li, X. Improvement of RTT fairness problem in BBR congestion control algorithm by gamma correction. Sensors 2021, 21, 4128.
48. Njogu, C.K.; Yang, W.; Njogu, H.W.; Bosire, A. BBR-With Enhanced Fairness (BBR-EFRA): A new enhanced RTT fairness for BBR congestion control algorithm. Comput. Commun. 2023, 200, 95–103. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4123862 (accessed on 28 July 2024).
49. Raiciu, C. Coupled Congestion Control for Multipath Transport Protocols. IETF RFC 6182. 2011. Available online: https://www.rfc-editor.org/info/rfc6182 (accessed on 2 September 2024).
50. Khalili, R.; Gast, N.; Popovic, M.; Boudec, J.-Y.L. MPTCP Is Not Pareto-Optimal: Performance Issues and a Possible Solution. IEEE/ACM Trans. Netw. 2013, 21, 1651–1665.
51. Chen, K.; Shan, D.; Luo, X.; Zhang, T.; Yang, Y.; Ren, F. One rein to rule them all: A framework for datacenter-to-user congestion control. In Proceedings of the 4th Asia-Pacific Workshop on Networking, Seoul, Republic of Korea, 3–4 August 2020; pp. 44–51.
52. Cao, Y.; Jain, A.; Sharma, K.; Balasubramanian, A.; Gandhi, A. When to use and when not to use BBR: An empirical analysis and evaluation study. In Proceedings of the Internet Measurement Conference, Amsterdam, The Netherlands, 21–23 October 2019; pp. 130–136.
53. Quevedo Caballero, E.; Donahoo, M.; Cerny, T. Fairness Analysis of Deep Reinforcement Learning based Multi-Path QUIC Scheduling. In Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing, Tallinn, Estonia, 27–31 March 2023; pp. 1772–1781.
54. Ming, F.; Gao, F.; Liu, K.; Zhao, C. Cooperative modular reinforcement learning for large discrete action space problem. Neural Netw. 2023, 161, 281–296.
55. Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A.A.; Veness, J.; Bellemare, M.G.; Graves, A.; Riedmiller, M.; Fidjeland, A.K.; Ostrovski, G.; et al. Human-level control through Deep Reinforcement Learning. Nature 2015, 518, 529–533.
56. Van Hasselt, H.; Guez, A.; Silver, D. Deep Reinforcement Learning with double Q-learning. In Proceedings of the AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016; Volume 30.
57. Bellemare, M.G.; Dabney, W.; Munos, R. A distributional perspective on reinforcement learning. In Proceedings of the International Conference on Machine Learning, Sydney, Australia, 6–11 July 2017; PMLR: Birmingham, UK, 2017; pp. 449–458.
58. Vardoyan, G.; Hollot, C.V.; Towsley, D. Towards stability analysis of data transport mechanisms: A fluid model and its applications. IEEE/ACM Trans. Netw. 2021, 29, 1730–1744.
59. Wang, Z.; Ni, H.; Han, R. Copa-ICN: Improving Copa as a Congestion Control Algorithm in Information-Centric Networking. Electronics 2022, 11, 1710.
Metric | Interpretation |
---|---|
Throughput ratio | Throughput of flow A/Throughput of flow B |
RTT | Travel time from source to destination and back to source |
Delay | One-way delay from source to destination |
Packet loss ratio | Number of lost packets/Total number of sent packets |
Jitter | Variability in packet arrival times |
Goodput | (Total transferred data − overhead data)/time taken |
| Total amount of buffered data waiting to be transmitted |
| Bytes the sender processes in the last selection period |
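The ratio-style metrics in the table above translate directly into small helper functions. The sketch below is illustrative only: the function names are hypothetical, goodput is returned in bytes per second, and jitter is computed as the mean absolute difference between consecutive delay samples, which is one common definition and an assumption here, since the table only describes jitter as variability in packet arrival times.

```python
from statistics import mean


def packet_loss_ratio(lost_packets: int, sent_packets: int) -> float:
    """Number of lost packets divided by total number of sent packets."""
    return lost_packets / sent_packets if sent_packets else 0.0


def goodput(total_bytes: float, overhead_bytes: float, seconds: float) -> float:
    """(Total transferred data - overhead data) / time taken, in bytes/s."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (total_bytes - overhead_bytes) / seconds


def jitter_ms(delays_ms: list) -> float:
    """Mean absolute difference between consecutive delay samples (assumed definition)."""
    if len(delays_ms) < 2:
        return 0.0
    return mean(abs(b - a) for a, b in zip(delays_ms, delays_ms[1:]))


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from the testbed.
    print("loss ratio:", packet_loss_ratio(12, 100))
    print("goodput (B/s):", goodput(1_000_000, 40_000, 8.0))
    print("jitter (ms):", jitter_ms([10.0, 12.0, 11.0, 15.0]))
```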
Metric | Interpretation |
---|---|
| The optimum delivery rate |
| The minimum Round Trip Time |
| Average |
Packet loss ratio | Number of lost packets/Total number of sent packets |
| Running CCA |
Goodput | (Total transferred data − overhead data)/time taken |
| Total amount of buffered data waiting to be transmitted |
| Bytes the sender processes in the last selection period |
Testbed Setup | Network Setup |
---|---|
Operating System | Ubuntu 22.04.4 |
Traffic Generator | iperf3 |
Data Collection Tools | Wireshark, tshark, Scapy |
RouterOS | MikroTik RouterOS 6.49.4; WiFi hAP ax (https://mikrotik.com/product/hap_ax3, accessed on 27 June 2024) |
Network Configuration | |
Bottleneck | 100 Mbps |
Queue Buffer | 50 packets |
Queue Discipline | PFIFO |
TCP CCAs | CUBIC, BBR, PCC, BBR2, BBR3 |
Experiment Duration | 100 s |
Metrics | Throughput, Sending Rate, Packet Loss Ratio, RTT, Jitter |
Our Setup | CCAs over WiFi Network Settings | 2 Clients (C1, C2) | Aggregate Performance Metrics | | | |
---|---|---|---|---|---|---|
| | | Throughput | RTT | Jitter | Loss |
Scen. 1 | CUBIC vs. BBR (before) | CUBIC (C1) | 79.44 Mbps | 21.44 ms | 0.75 ms | 12.8% |
Scen. 1 | CUBIC vs. BBR (before) | BBR (C2) | 4.08 Mbps | 2.32 ms | 23.99 ms | 19.35% |
Scen. 1 | CUBIC vs. BBR->CUBIC (after) | CUBIC (C1) | 40.64 Mbps | 16.30 ms | 1.09 ms | 21.10% |
Scen. 1 | CUBIC vs. BBR->CUBIC (after) | CUBIC (C2) | 38.72 Mbps | 13.01 ms | 0.82 ms | 11.45% |
Scen. 2 | PCC vs. BBR (before) | PCC (C1) | 77.36 Mbps | 6.27 ms | 0.84 ms | 48.97% |
Scen. 2 | PCC vs. BBR (before) | BBR (C2) | 12.32 Mbps | 22.02 ms | 1.163 ms | 11.83% |
Scen. 2 | PCC vs. BBR->PCC (after) | PCC (C1) | 44 Mbps | 15.08 ms | 1.722 ms | 9.22% |
Scen. 2 | PCC vs. BBR->PCC (after) | PCC (C2) | 40.4 Mbps | 15.98 ms | 1.53 ms | 7.21% |
Scen. 3 | BBR vs. PCC (before) | BBR (C1) | 20.48 Mbps | 13.40 ms | 2.85 ms | 29.44% |
Scen. 3 | BBR vs. PCC (before) | PCC (C2) | 42.08 Mbps | 15.76 ms | 2.14 ms | 4.17% |
Scen. 3 | BBR vs. PCC->BBR (after) | BBR (C1) | 42.56 Mbps | 14.63 ms | 1.16 ms | 9.35% |
Scen. 3 | BBR vs. PCC->BBR (after) | BBR (C2) | 45.6 Mbps | 15.78 ms | 1.53 ms | 7.20% |
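As a worked example, the throughput ratio (throughput of one flow divided by throughput of the other) can be computed from the average throughputs reported in the table above. The sketch also reports Jain's fairness index, a standard metric that is not used in the source and is added here only as a second, bounded view of the same numbers. The dictionary layout and function names are hypothetical; the throughput values are copied from the table.

```python
# Average throughput in Mbps for (Client 1, Client 2), copied from the results table.
THROUGHPUT_MBPS = {
    ("Scenario 1", "before"): (79.44, 4.08),
    ("Scenario 1", "after"): (40.64, 38.72),
    ("Scenario 2", "before"): (77.36, 12.32),
    ("Scenario 2", "after"): (44.0, 40.4),
    ("Scenario 3", "before"): (20.48, 42.08),
    ("Scenario 3", "after"): (42.56, 45.6),
}


def th_ratio(flow_a_mbps: float, flow_b_mbps: float) -> float:
    """Throughput of flow A divided by throughput of flow B."""
    return flow_a_mbps / flow_b_mbps


def jain_index(rates) -> float:
    """Jain's fairness index: (sum x)^2 / (n * sum x^2); 1.0 means equal shares."""
    rates = list(rates)
    total = sum(rates)
    squares = sum(r * r for r in rates)
    return (total * total) / (len(rates) * squares) if squares else 0.0


if __name__ == "__main__":
    for (scenario, phase), (c1, c2) in THROUGHPUT_MBPS.items():
        print(f"{scenario} {phase:6s}  Th_ratio(C1/C2) = {th_ratio(c1, c2):6.2f}  "
              f"Jain = {jain_index((c1, c2)):.3f}")
```

A throughput ratio close to 1 (equivalently, a Jain index close to 1) indicates that the two clients share the bottleneck evenly, and in each scenario the post-switch values are much closer to 1 than the pre-switch values.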
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Shrestha, S.K.; Pokhrel, S.R.; Kua, J. On the Fairness of Internet Congestion Control over WiFi with Deep Reinforcement Learning. Future Internet 2024, 16, 330. https://doi.org/10.3390/fi16090330