Zhang, 2018 - Google Patents
Failure Diagnosis for Datacenter ApplicationsZhang, 2018
View PDF- Document ID
- 3452178566735176806
- Author
- Zhang Q
- Publication year
External Links
Snippet
Fast and accurate failure diagnosis remains a major challenge for datacenter operators. Current datacenter applications are increasingly architected around loosely-coupled modular components: each component can scale and evolve independently. However …
- 238000003745 diagnosis 0 title abstract description 42
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance or administration or management of packet switching networks
- H04L41/06—Arrangements for maintenance or administration or management of packet switching networks involving management of faults or events or alarms
- H04L41/0654—Network fault recovery
- H04L41/0659—Network fault recovery by isolating the faulty entity
- H04L41/0663—Network fault recovery by isolating the faulty entity involving offline failover planning
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance or administration or management of packet switching networks
- H04L41/08—Configuration management of network or network elements
- H04L41/0803—Configuration setting of network or network elements
- H04L41/0813—Changing of configuration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Error detection; Error correction; Monitoring responding to the occurence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0706—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
- G06F11/0721—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment within a central processing unit [CPU]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Error detection; Error correction; Monitoring responding to the occurence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing packet switching networks
- H04L43/08—Monitoring based on specific metrics
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance or administration or management of packet switching networks
- H04L41/02—Arrangements for maintenance or administration or management of packet switching networks involving integration or standardization
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing packet switching networks
- H04L43/10—Arrangements for monitoring or testing packet switching networks using active monitoring, e.g. heartbeat protocols, polling, ping, trace-route
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance or administration or management of packet switching networks
- H04L41/14—Arrangements for maintenance or administration or management of packet switching networks involving network analysis or design, e.g. simulation, network model or planning
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network-specific arrangements or communication protocols supporting networked applications
- H04L67/10—Network-specific arrangements or communication protocols supporting networked applications in which an application is distributed across nodes in the network
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L45/00—Routing or path finding of packets in data switching networks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L49/00—Packet switching elements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L69/00—Application independent communication protocol aspects or techniques in packet data networks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic regulation in packet switching networks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Tan et al. | {NetBouncer}: Active device and link failure localization in data center networks | |
US12335275B2 (en) | System for monitoring and managing datacenters | |
US11700190B2 (en) | Technologies for annotating process and user information for network flows | |
Yap et al. | Taking the edge off with espresso: Scale, reliability and programmability for global internet peering | |
US20200366573A1 (en) | Systems and methods for visualizing dependency experiments | |
EP3944081B1 (en) | Data center resource monitoring with managed message load balancing with reordering consideration | |
US20210216908A1 (en) | Self-learning packet flow monitoring in software-defined networking environments | |
EP3283954B1 (en) | Restoring service acceleration | |
Wu et al. | Virtual network diagnosis as a service | |
US11012523B2 (en) | Dynamic circuit breaker applications using a proxying agent | |
US11121941B1 (en) | Monitoring communications to identify performance degradation | |
Malik et al. | A measurement study of open source SDN layers in OpenStack under network perturbation | |
EP3624401A1 (en) | Systems and methods for non-intrusive network performance monitoring | |
US11539728B1 (en) | Detecting connectivity disruptions by observing traffic flow patterns | |
Turchetti et al. | NFV‐FD: Implementation of a failure detector using network virtualization technology | |
Hamid et al. | ReCSDN: Resilient controller for software defined networks | |
Zhang | Failure Diagnosis for Datacenter Applications | |
Lin | Monarch: Scalable monitoring and analytics for visibility and insights in virtualized heterogeneous cloud infrastructure | |
Alfatafta | An analysis of partial network partitioning failures in modern distributed systems | |
Casimiro et al. | Trone: Trustworthy and resilient operations in a network environment | |
US20240214279A1 (en) | Multi-node service resiliency | |
Zhang et al. | Volur: Concurrent edge/core route control in data center networks | |
Gogunska | Study of the cost of measuring virtualized networks | |
Sousa et al. | Expedient reconfiguration in the cloud | |
Liu et al. | HifiCNet: High-Fidelity Cloud Network Validation Platform at Scale by Hybrid Architecture |