[go: up one dir, main page]

Zhang, 2018 - Google Patents

Failure Diagnosis for Datacenter Applications

Zhang, 2018

View PDF
Document ID
3452178566735176806
Author
Zhang Q
Publication year

External Links

Snippet

Fast and accurate failure diagnosis remains a major challenge for datacenter operators. Current datacenter applications are increasingly architected around loosely-coupled modular components: each component can scale and evolve independently. However …
Continue reading at digital.lib.washington.edu (PDF) (other versions)

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance or administration or management of packet switching networks
    • H04L41/06Arrangements for maintenance or administration or management of packet switching networks involving management of faults or events or alarms
    • H04L41/0654Network fault recovery
    • H04L41/0659Network fault recovery by isolating the faulty entity
    • H04L41/0663Network fault recovery by isolating the faulty entity involving offline failover planning
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance or administration or management of packet switching networks
    • H04L41/08Configuration management of network or network elements
    • H04L41/0803Configuration setting of network or network elements
    • H04L41/0813Changing of configuration
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Error detection; Error correction; Monitoring responding to the occurence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0721Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment within a central processing unit [CPU]
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Error detection; Error correction; Monitoring responding to the occurence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46Multiprogramming arrangements
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00Arrangements for monitoring or testing packet switching networks
    • H04L43/08Monitoring based on specific metrics
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance or administration or management of packet switching networks
    • H04L41/02Arrangements for maintenance or administration or management of packet switching networks involving integration or standardization
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00Arrangements for monitoring or testing packet switching networks
    • H04L43/10Arrangements for monitoring or testing packet switching networks using active monitoring, e.g. heartbeat protocols, polling, ping, trace-route
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance or administration or management of packet switching networks
    • H04L41/14Arrangements for maintenance or administration or management of packet switching networks involving network analysis or design, e.g. simulation, network model or planning
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network-specific arrangements or communication protocols supporting networked applications
    • H04L67/10Network-specific arrangements or communication protocols supporting networked applications in which an application is distributed across nodes in the network
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00Routing or path finding of packets in data switching networks
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F2201/00Indexing scheme relating to error detection, to error correction, and to monitoring
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00Application independent communication protocol aspects or techniques in packet data networks
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00Traffic regulation in packet switching networks
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/16Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes

Similar Documents

Publication Publication Date Title
Tan et al. {NetBouncer}: Active device and link failure localization in data center networks
US12335275B2 (en) System for monitoring and managing datacenters
US11700190B2 (en) Technologies for annotating process and user information for network flows
Yap et al. Taking the edge off with espresso: Scale, reliability and programmability for global internet peering
US20200366573A1 (en) Systems and methods for visualizing dependency experiments
EP3944081B1 (en) Data center resource monitoring with managed message load balancing with reordering consideration
US20210216908A1 (en) Self-learning packet flow monitoring in software-defined networking environments
EP3283954B1 (en) Restoring service acceleration
Wu et al. Virtual network diagnosis as a service
US11012523B2 (en) Dynamic circuit breaker applications using a proxying agent
US11121941B1 (en) Monitoring communications to identify performance degradation
Malik et al. A measurement study of open source SDN layers in OpenStack under network perturbation
EP3624401A1 (en) Systems and methods for non-intrusive network performance monitoring
US11539728B1 (en) Detecting connectivity disruptions by observing traffic flow patterns
Turchetti et al. NFV‐FD: Implementation of a failure detector using network virtualization technology
Hamid et al. ReCSDN: Resilient controller for software defined networks
Zhang Failure Diagnosis for Datacenter Applications
Lin Monarch: Scalable monitoring and analytics for visibility and insights in virtualized heterogeneous cloud infrastructure
Alfatafta An analysis of partial network partitioning failures in modern distributed systems
Casimiro et al. Trone: Trustworthy and resilient operations in a network environment
US20240214279A1 (en) Multi-node service resiliency
Zhang et al. Volur: Concurrent edge/core route control in data center networks
Gogunska Study of the cost of measuring virtualized networks
Sousa et al. Expedient reconfiguration in the cloud
Liu et al. HifiCNet: High-Fidelity Cloud Network Validation Platform at Scale by Hybrid Architecture