VPC Flow Logs

VPC Flow Logs records a sample of packets sent from and received by virtual machine (VM) instances, including instances used as Google Kubernetes Engine (GKE) nodes. It also records a sample of packets sent through VLAN attachments for Cloud Interconnect (Preview) and through Cloud VPN tunnels (Preview).

Flow logs are aggregated by IP connection (5-tuple). These logs can be used for network monitoring, forensics, security analysis, and expense optimization.

You can view flow logs in Cloud Logging, and you can export logs to any destination that Cloud Logging export supports.
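
As an illustration of viewing flow logs in Cloud Logging, the following sketch builds a Logging query filter for flow logs collected on subnets. The helper name `vpc_flow_logs_filter` and its parameters are hypothetical. The filter assumes that subnet flow logs use the `gce_subnetwork` resource type and the `compute.googleapis.com%2Fvpc_flows` log name; flow logs for VLAN attachments and Cloud VPN tunnels might use different resource types and are not covered here.

```python
from typing import Optional


def vpc_flow_logs_filter(project_id: str, subnet_name: Optional[str] = None) -> str:
    """Build a Cloud Logging filter string that selects subnet flow logs.

    Assumption: subnet flow logs are written under the gce_subnetwork
    resource type with the log name compute.googleapis.com/vpc_flows.
    """
    clauses = [
        'resource.type="gce_subnetwork"',
        f'logName="projects/{project_id}/logs/compute.googleapis.com%2Fvpc_flows"',
    ]
    if subnet_name is not None:
        # Narrow the query to a single subnet.
        clauses.append(f'resource.labels.subnetwork_name="{subnet_name}"')
    return " AND ".join(clauses)


# Placeholder project and subnet names, for illustration only.
print(vpc_flow_logs_filter("my-project", "my-subnet"))
```

You can paste the resulting string into the Logs Explorer query field or pass it to a Logging client as the filter.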

Use cases

Network monitoring

VPC Flow Logs provides you with visibility into network throughput and performance. You can:

  • Monitor the VPC network
  • Perform network diagnosis
  • Filter the flow logs by VMs, VLAN attachments, and Cloud VPN tunnels to understand traffic changes
  • Understand traffic growth for capacity forecasting

Understanding network usage and optimizing network traffic expenses

You can analyze network usage with VPC Flow Logs to optimize network traffic expenses. For example, you can analyze the network flows for the following:

  • Traffic between regions and zones
  • Traffic to specific countries on the internet
  • Traffic to on-premises and other cloud networks
  • Top talkers in the network, including VMs, VLAN attachments, and Cloud VPN tunnels
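
For example, a top-talkers analysis over exported flow logs could look like the following sketch. It assumes each record is a dictionary with a `connection.src_ip` field and a `bytes_sent` field stored as a string. This is a simplified shape chosen for illustration; see the Record format section for the actual fields. The function name `top_talkers` and the sample data are hypothetical.

```python
from collections import Counter


def top_talkers(records, n=3):
    """Return the n source IPs that sent the most bytes, largest first."""
    totals = Counter()
    for rec in records:
        # bytes_sent is assumed to arrive as a string, so convert it.
        totals[rec["connection"]["src_ip"]] += int(rec["bytes_sent"])
    return totals.most_common(n)


# Made-up sample records using private and documentation IP ranges.
sample_records = [
    {"connection": {"src_ip": "10.0.0.2", "dest_ip": "10.0.1.5"}, "bytes_sent": "1200"},
    {"connection": {"src_ip": "10.0.0.3", "dest_ip": "203.0.113.9"}, "bytes_sent": "800"},
    {"connection": {"src_ip": "10.0.0.2", "dest_ip": "203.0.113.9"}, "bytes_sent": "300"},
]

print(top_talkers(sample_records, n=2))
# [('10.0.0.2', 1500), ('10.0.0.3', 800)]
```

Because flow logs are sampled, totals computed this way are estimates of relative volume, not exact byte counts.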

Network forensics

You can use VPC Flow Logs for network forensics. For example, if an incident occurs, you can examine the following:

  • Which IPs talked with whom and when
  • Which IPs might be compromised, determined by analyzing all incoming and outgoing network flows

Specifications

  • VPC Flow Logs is part of Andromeda, the software that powers VPC networks. VPC Flow Logs introduces no delay or performance penalty when enabled.
  • VPC Flow Logs works with VPC networks, not legacy networks. You enable or disable VPC Flow Logs per subnet, VLAN attachment for Cloud Interconnect (Preview), or Cloud VPN tunnel (Preview). If enabled for a subnet, VPC Flow Logs collects data from all VM instances, including GKE nodes, in that subnet.
  • VPC Flow Logs samples TCP, UDP, ICMP, ESP, and GRE flows. Both inbound and outbound flows are sampled. These flows can be within Google Cloud or between Google Cloud and other networks. If a flow is captured by sampling, VPC Flow Logs generates a log for the flow. Each flow record includes the information described in the Record format section.
  • VPC Flow Logs interacts with firewall rules in the following ways:
    • Egress packets are sampled before egress firewall rules. Even if an egress firewall rule denies outbound packets, those packets can be sampled by VPC Flow Logs.
    • Ingress packets are sampled after ingress firewall rules. If an ingress firewall rule denies inbound packets, those packets are not sampled by VPC Flow Logs.
  • You can use filters in VPC Flow Logs to generate only certain logs.
  • VPC Flow Logs supports VMs that have multiple network interfaces. You need to enable VPC Flow Logs for each subnet, in each VPC, that contains a network interface.
  • To log flows between Pods on the same Google Kubernetes Engine (GKE) node, you must enable Intranode visibility for the cluster.
  • VPC Flow Logs does not report flows from Cloud Run resources.
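
The firewall interaction described in the specifications above can be summarized with a small decision function. This is only a restatement of the two rules as code, not a description of the implementation; the function name `can_be_sampled` and its parameters are hypothetical.

```python
def can_be_sampled(direction: str, denied_by_firewall: bool) -> bool:
    """Return True if a packet is eligible for flow log sampling.

    Egress packets are sampled before egress firewall rules apply, so a
    deny rule does not exclude them. Ingress packets are sampled after
    ingress firewall rules, so denied packets are never seen.
    """
    if direction == "egress":
        return True
    if direction == "ingress":
        return not denied_by_firewall
    raise ValueError(f"unknown direction: {direction!r}")


print(can_be_sampled("egress", denied_by_firewall=True))   # True
print(can_be_sampled("ingress", denied_by_firewall=True))  # False
```

A practical consequence is that a denied outbound connection attempt can still appear in flow logs, whereas a denied inbound attempt cannot.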

Logs collection

Packets are sampled within an aggregation interval. All packets collected for a given IP connection within the aggregation interval are aggregated into a single flow log entry. This data is then sent to Logging.
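
The aggregation step can be pictured with the following minimal sketch, which groups sampled packets by aggregation window and 5-tuple and sums their packet and byte counts. The `Packet` type, the `aggregate` function, and the way windows are computed from timestamps are assumptions made for illustration, not the actual implementation.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    timestamp: float  # seconds since an arbitrary start time
    src_ip: str
    src_port: int
    dest_ip: str
    dest_port: int
    protocol: int     # IANA protocol number, for example 6 for TCP
    size_bytes: int


def aggregate(packets, interval_seconds=5):
    """Group sampled packets by (window, 5-tuple) and sum their counters."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for p in packets:
        # Packets whose timestamps fall in the same interval share a window.
        window = int(p.timestamp // interval_seconds)
        key = (window, p.src_ip, p.src_port, p.dest_ip, p.dest_port, p.protocol)
        flows[key]["packets"] += 1
        flows[key]["bytes"] += p.size_bytes
    return dict(flows)


packets = [
    Packet(0.5, "10.0.0.2", 40000, "10.0.1.5", 443, 6, 600),
    Packet(3.0, "10.0.0.2", 40000, "10.0.1.5", 443, 6, 400),
    Packet(6.0, "10.0.0.2", 40000, "10.0.1.5", 443, 6, 100),
]
for key, counters in aggregate(packets).items():
    print(key, counters)
```

In this example the first two packets fall in the same 5-second window and produce one entry, and the third packet falls in the next window and produces a second entry for the same connection.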

Logs are stored in Logging for 30 days by default. If you want to keep logs longer than that, you can either set a custom retention period or export them to a supported destination.

Log sampling and processing

To generate flow logs, VPC Flow Logs samples packets that leave and enter a VM or that pass through a gateway such as a VLAN attachment or Cloud VPN tunnel.

VPC Flow Logs samples packets using a primary sampling rate. The primary sampling rate is dynamic and varies depending on the load of the physical host running the VM or gateway at the time of sampling. The probability of sampling any single IP connection increases with the volume of packets. You can't control the primary flow log sampling process or adjust the primary sampling rate.
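
To see why higher-volume connections are more likely to be captured, consider a simplified model in which every packet is sampled independently with the same probability p. Under that assumption, a connection with n packets is captured at least once with probability 1 - (1 - p)^n. The actual primary sampling rate is dynamic and not published, so the rate used below is a made-up value and the function name is hypothetical.

```python
def connection_sample_probability(per_packet_rate: float, packet_count: int) -> float:
    """Probability that at least one packet of a connection is sampled.

    Assumes independent sampling of each packet at a fixed rate, which is
    a simplification of the real, load-dependent primary sampling.
    """
    if not 0.0 <= per_packet_rate <= 1.0:
        raise ValueError("per_packet_rate must be between 0.0 and 1.0")
    if packet_count < 0:
        raise ValueError("packet_count must not be negative")
    return 1.0 - (1.0 - per_packet_rate) ** packet_count


hypothetical_rate = 0.01  # illustrative value only
for n in (1, 10, 100, 1000):
    print(n, round(connection_sample_probability(hypothetical_rate, n), 4))
```

In this model, short-lived connections with few packets are the ones most likely to be missing from flow logs.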

After the flow logs are generated, VPC Flow Logs processes them according to the following procedure:

  1. Filtering: You can specify that only logs that match specified criteria are generated. For example, you can filter so that only logs for a particular VM or only logs with a particular metadata value are generated and the rest are discarded. For more information, see Log filtering.
  2. Aggregation: Information for sampled packets is aggregated over a configurable aggregation interval to produce a flow log entry.
  3. Secondary flow log sampling: This is a second sampling process. Flow log entries are further sampled according to a configurable secondary sampling rate parameter. The secondary sampling is performed on the flow logs generated by the primary flow log sampling process. For example, if the secondary sampling rate is set to 1.0, or 100%, VPC Flow Logs samples 100% of the flow logs generated by the primary flow log sampling.
  4. Metadata: If disabled, all metadata annotations are discarded. If you want to keep metadata, you can specify that all fields or a specified set of fields are retained. For more information, see Metadata annotations.
  5. Write to Logging: The final log entries are written to Cloud Logging.
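
The five steps above can be sketched as a small pipeline. This is an illustrative model, not the service's implementation: the function name, the record shape (a `connection` tuple, a `bytes_sent` integer, and a `metadata` dictionary), and the metadata field names are hypothetical, and the "write" step returns the entries instead of calling Cloud Logging.

```python
import random


def process_flow_logs(sampled, keep, secondary_rate, metadata_fields, rng=None):
    """Apply filtering, aggregation, secondary sampling, and metadata handling.

    metadata_fields: None keeps all annotations, an empty set discards all,
    and any other set keeps only the named fields.
    """
    rng = rng or random.Random()

    # 1. Filtering: keep only records that match the caller's criteria.
    kept = [r for r in sampled if keep(r)]

    # 2. Aggregation: combine records for the same connection into one entry.
    entries = {}
    for r in kept:
        entry = entries.setdefault(
            r["connection"],
            {"connection": r["connection"], "bytes_sent": 0,
             "metadata": dict(r.get("metadata", {}))},
        )
        entry["bytes_sent"] += r["bytes_sent"]

    # 3. Secondary sampling: keep each entry with probability secondary_rate.
    result = [e for e in entries.values() if rng.random() < secondary_rate]

    # 4. Metadata: retain all, some, or none of the annotations.
    if metadata_fields is not None:
        for e in result:
            e["metadata"] = {k: v for k, v in e["metadata"].items()
                             if k in metadata_fields}

    # 5. Write to Logging: represented here by returning the final entries.
    return result


conn = ("10.0.0.2", 40000, "10.0.1.5", 443, 6)
records = [
    {"connection": conn, "bytes_sent": 100, "metadata": {"src_vm": "a", "dest_vm": "b"}},
    {"connection": conn, "bytes_sent": 50, "metadata": {"src_vm": "a", "dest_vm": "b"}},
    {"connection": ("10.0.0.9", 5000, "10.0.1.5", 53, 17), "bytes_sent": 10, "metadata": {}},
]
print(process_flow_logs(records, lambda r: r["connection"][0] == "10.0.0.2",
                        secondary_rate=1.0, metadata_fields={"src_vm"}))
```

With a secondary sampling rate of 1.0 every aggregated entry is kept, so the example yields a single entry of 150 bytes that retains only the `src_vm` annotation.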

Because VPC Flow Logs does not capture every packet, it compensates for missed packets by interpolating from the captured packets. This compensation covers packets missed because of both the primary sampling and the user-configurable secondary sampling.

Even though Google Cloud doesn't capture every packet, log record captures can be quite large. You can balance your traffic visibility and storage cost needs by adjusting the following aspects of logs collection:

  • Aggregation interval: Sampled packets for a time interval are aggregated into a single log entry. This time interval can be 5 seconds (default), 30 seconds, 1 minute, 5 minutes, 10 minutes, or 15 minutes.
  • Secondary sampling rate:
    • For VMs, 50% of log entries are kept by default. You can set this parameter from 1.0 (100%, all log entries are kept) to 0.0 (0%, no logs are kept).
    • For VLAN attachments and Cloud VPN tunnels, 100% of log entries are kept by default. You can set this parameter to any value greater than 0.0, up to 1.0 (100%).
  • Metadata annotations: By default, flow log entries are annotated with metadata information, such as the names of the source and destination within Google Cloud or the geographic region of external sources and destinations. Metadata annotations can be turned off, or you can specify only certain annotations, to save storage space.
  • Filtering: By default, logs are generated for every sampled flow. You can set filters so that only logs that match certain criteria are generated.
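
The allowed values and defaults listed above can be captured in a small validation helper, which is useful for checking settings before you apply them. The class name `FlowLogSettings` and its fields are hypothetical and do not correspond to an actual API; only the intervals, ranges, and defaults come from this page.

```python
from dataclasses import dataclass
from typing import Optional

# 5 s, 30 s, 1 min, 5 min, 10 min, and 15 min, expressed in seconds.
ALLOWED_INTERVALS_SECONDS = (5, 30, 60, 300, 600, 900)


@dataclass
class FlowLogSettings:
    target: str = "vm"  # "vm", or "gateway" for VLAN attachments and VPN tunnels
    aggregation_interval_seconds: int = 5
    secondary_sampling_rate: Optional[float] = None  # None selects the default

    def __post_init__(self):
        if self.target not in ("vm", "gateway"):
            raise ValueError("target must be 'vm' or 'gateway'")
        if self.secondary_sampling_rate is None:
            # Defaults: 50% for VMs, 100% for gateways.
            self.secondary_sampling_rate = 0.5 if self.target == "vm" else 1.0
        if self.aggregation_interval_seconds not in ALLOWED_INTERVALS_SECONDS:
            raise ValueError("unsupported aggregation interval")
        rate = self.secondary_sampling_rate
        # VMs allow 0.0 (no logs kept); gateways require a rate above 0.0.
        lower_ok = rate >= 0.0 if self.target == "vm" else rate > 0.0
        if not (lower_ok and rate <= 1.0):
            raise ValueError("secondary sampling rate out of range")


print(FlowLogSettings())
print(FlowLogSettings(target="gateway", aggregation_interval_seconds=60))
```

Longer aggregation intervals and lower secondary sampling rates both reduce the number of log entries written, at the cost of coarser visibility.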

Pricing

Standard pricing for Logging, BigQuery, or Pub/Sub applies. VPC Flow Logs pricing is described in Network Telemetry pricing.

What's next