BGPy The BGP Python Security Simulator
BGPy The BGP Python Security Simulator
41
Compared to BGPy, BGPEval simulates routers directly via containers Furthermore BGPEval prioiratizes realistic simulation over performance, they use actual routing deamons
however in BGPy, anouncements and routes propoagete and simulation such as Quagga and GoBGP on the other hand BGPy uses an algorithm-based propagation approach to
converges in a single iteration simulate BGP announcements across the AS topology.
CSET 2023, August 07–08, 2023, Marina del Rey, CA, USA Justin Furuness, Cameron Morris, Reynaldo Morillo, Amir Herzberg, and Bing Wang
Figure 1: Main components of BGPy. Each scenario selects the attacker strategy in the form of announcements, as well as
which ASes will adopt. The simulation engine receives this information and propagates these announcements throughout the
AS graph, returning the LocalRIB at each AS. The simulation Framework then generates metrics and graphs.
a testing framework that provides user-friendly tools that facilitate propagation of BGP announcement inside an AS topology, where
easy modeling and debugging. BGPy is written in Python, a fast each AS can plug in a specific security policy, which allows much I guess we can
prototyping language that allows easy further extension. more flexibility, including easy support of mixed deployment of only allow for
our threat
BGPy has been used in several projects on BGP security (§6). security policies. In addition, the run time complexity of our Simu- model to vary?
These projects have demonstrated that BGPy is efficient, and can lation Engine is similar to that of the routing tree algorithms (see
run on standard laptops/desktops without the need of computing §4). The simulator in [64] is also based on algorithms, instead of
clusters (Appendix C). In addition, it is easily extensible to support actual propagation of BGP announcements, and their complexity is
new security policies, sometimes with just a few lines of additional worse than that of BGPy.
code. We have open sourced BGPy for facilitating BGP security
research: https://github.com/jfuruness/bgpy 2 BGPY DESIGN OVERVIEW
Contributions. The contributions of this work include: In this section we present a high level overview of BGPy. BGPy
• Design and implementation of BGPy. We design and implement simulates BGP and allows comparison of security policies against
BGPy as a Python-based simulation tool that aims to be efficient, various attacks. It consists of three main components: Simulation
flexible and easily extensible to support a wide range of attacks and Engine, Simulation Framework, and System Test Suite.
Well in the
security policies. Simulation Engine (§4). As shown in Figure 1, the simulation
BGPEval it • Evaluation of BGPy. We present several use cases that use BGPy engine abstracts away packet-level and intra-domain details to per-
does claim
otherwise, to for simulating defenses against prefix hijacks, subprefix hijacks, form BGP simulations by propagating announcements across the
simulate entire AS topology. The simulation engine receives an empirically
internet scale and path manipulation, in partial and mixed deployment scenarios.
BGP security Our results demonstrate that BGPy realizes its design goals, and inferred ASes topology and relationships from the CAIDA Serial-2
implementation
s, they require can be a valuable tool for BGP security research. [10] dataset, announcements to populate the AS graph, and ROV
large amount of
Related work. Several studies have developed BGP simulators. adoption estimates from [14, 50, 51]. From there, the AS topology
computational
power and
Specifically, SSFNet [16, 46], Genesis [55], and BGP++ [19] are BGP is populated, along with the routing policies defined in the BG-
even then So instead of
simulation simulators at the packet level. They provide fine-grained simulation, PAS class. Among other attributes, this class contains relationship containers (as
takes
but are difficult to scale, and hence not suitable for simulating information, RIB information, and routing policies. This class is opposed to
substantial BGPEval) this
amount of time Internet-size topologies. BGPSIM [63] and Cisco WAE [54] abstracts easily extendable and ROVAS (performing ROV) and PeerROVAS is done as
such as (performing an ROV variant) are included by default as well as sub- imitating the
creation of away some details in BGP, but still has high computational overhead. packet-level
testbed takes
The studies in [20, 21, 47] focus on the simulation at a single AS. classes. The simulation engine supports dynamic routing policies, propagation,
up to 2-3 hours routing?
etc. The above simulators are mainly designed for networking research and can use any routing policy at any AS. After these announce-
(e.g., studying convergence time and routing dynamics), not for ments are propagated throughout the AS topology, the Local RIB
security. at each AS is produced as output (and used by the framework).
For BGP security research, it is important to consider the full Simulation Framework (§3). As shown in Figure 1, the simu-
Internet AS topology [26]. Existing studies used simulations that lation framework is a wrapper around the Simulation Engine that
are designed for specific security policies (e.g., [15, 25–28, 32, 64]), facilities the comparison of multiple security policies against attack
rather than presenting general extensible simulation tools. Specifi- scenarios. It contains two major components - the main simulator
cally, the studies in [26–28] use routing tree algorithms to compute and the scenarios. The simulator controls all scenarios and aspects
the best available paths from each AS to the destination (origin of of the simulation such as the number of trials, partial adoption
the prefix). While the algorithms are efficient, they cannot be easily percentages, etc. The simulator has a list of several scenarios to
extended to simulate custom security protocols. We simulate the compare. Each scenario controls the attacking strategy, i.e. which
42
BGPy: The BGP Python Security Simulator CSET 2023, August 07–08, 2023, Marina del Rey, CA, USA
43
CSET 2023, August 07–08, 2023, Marina del Rey, CA, USA Justin Furuness, Cameron Morris, Reynaldo Morillo, Amir Herzberg, and Bing Wang
1 Simulation( subcategories: ASes that are stubs or multihomed, ASes that are a
2 percent_adoptions = ( part of the input clique defined by CAIDA [10], and other ASes (to
3 SpecialPercentAdoptions.ONLY_ONE, which we refer as etc ASes). For example, if 50% of all ASes should
4 .1, # This means 10 percent of ASes adopt
adopt the defensive policy, 50% of the input clique ASes will be
5 .2,
chosen, 50% of the etc ASes will be chosen, and 50% of the stubs or
6 .4,
7 .8, multihomed ASes will be chosen.
8 SpecialPercentAdoptions.ALL_BUT_ONE Attacker and Victim ASes are by default excluded from these se-
9 ), lections. By default, attacker ASes are not adopting, and victim ASes
10 scenarios=( are adopting, although this can easily be extended and modified.
11 SubprefixHijack(AdoptASCls=ROVSimpleAS), Note that these subgroupings are configurable. A user can specify
12 SubprefixHijack(AdoptASCls=PeerROVSimpleAS), any subgroupings according to one’s preferences. Similar to the
13 ), attacker and the victim, the ASes chosen to be adopting ASes will
14 output_path=Path("~/Desktop/cset23_graphs").expanduser(), remain the same across all defensive policies for each trial to enforce
comparability.
15 num_trials=1000,
4. Engine Setup: After the adopting ASes are using the routing
16 parse_cpus=12,
17 ).run() policies chosen as the defense for the scenario, the attacking strat-
egy is utilized to create and insert the originating announcements
at the local RIBs of both the victims and the attackers. This func-
Figure 3: Simulation Example code used to generate Figure 2 tionality is contained within the scenario class, and creating the
and other graphs not depicted here. This simulation com- attacker strategy is discussed in §3.2.
pares both ROV and an ROV variant, Peer ROV (an ROV The simulator includes by default six different attacking strate-
variant that only filters peers, see Appendix E for details), gies, that cover the typical ‘family’ of ‘generalized prefix attacks’
against a subprefix hijack. It does so for multiple partial that are commonly used when studying ROV and ROV++. They
adoption percentages, for 1000 trials, using 12 cores for mul- are defined in Table 5.
tiprocessing. 5. Engine runs: Then the simulation engine, described in §4,
is run. The attackers and victims announcements are propagated
All of these options are configurable. For a full list of simulator throughout the AS topology, and the defensive routing policies for
parameters, see Table 3. We could also compare any number of the adopting ASes are used. At the conclusion of this stage, the
scenarios, and each scenario is also highly configurable (see Table 4 simulation engine contains an AS topology where each AS will have
for a full list of parameters). a local RIB containing some subset of the attacker’s and victim’s
2. Attacker and Victim Selection: Before each scenario runs, announcements.
the simulator randomly selects attacker and victim ASes. A victim is 6. Data Analysis: At this point, the framework begins to per-
an AS that is the origin of a legitimate announcement. An attacker form data analysis. To start with, we perform traceback, a process
is an AS that announces an announcement that is illegitimate for in which, at each AS, we trace back each prefix to it’s origin on the
any reason. Perhaps the announcement is invalid by ROA, or the data plane. Figure 7, which describes a subprefix hijack originating
announcement is a route leak, etc. The goal of the attacker(s) could at AS 666, shows an example for the importance of traceback. .
be to attract, to itself, traffic sent to the victim, to cause a Denial of Without traceback, i.e., looking only at the control-plane data, it
Service, e.g., preventing traffic from reaching the victim, or another, would seem that AS 3 is not hijacked by AS 666, since AS 3 would
user-defined goal. A scenario can select any number of attackers be routing according to the legitimate announcement (from AS
or victims as a parameter (see Table 4); this capability was used in 777). However, in reality, and as the framework finds by traceback,
some use-cases (see §6). The default is one attacker and one victim. the traffic is hijacked once it reaches AS 1 (the provider of AS 3),
The same attacker(s) and victim(s) are used across all scenarios in since AS 1 routes according to the subprefix hijack from AS 666.
a trial to maintain comparability. By default, attackers are selected This is since Internet Protocol (IP) routing prefers the most-specific
from either stubs or multihomed ASes, since edge ASes are more route. For this reason, it is not enough to simply look at the local
likely to be malicious [12, 37, 66]. RIB at each AS. We must perform traceback, tracing the prefixes
3. Adopting ASes: Then the adopting ASes will be chosen. The back to their origins, in order to determine the true outcome of the
simulator will take each scenario and run the scenario across a con- traffic. During traceback, we start at each AS, and trace back the
figurable set of different partial adoption percentages. For example, prefixes on the data plane to the origin AS. The outcome of the
in our code used in figure Figure 3, we are testing six scenarios: traceback is then used to analyze a multitude of metrics. The default
one AS adopting, 10%, 20%, 40% and 80% adopting ASes, and fi- metrics included are percent of ASes hijacked (indicating that the
nally, when all ASes, except one, adopt. For each of these adoption announcement at an individual AS was traced back to the attacker),
percentages, adopting ASes must be chosen. One challenge we ex- percent of ASes disconnected (indicating that the announcement
perienced when adopting ASes is that there is a wide variety in was traced back to neither the attacker nor the victim), and percent
connectivity between the ASes. A stub AS adopting will not have of ASes successfully connected (indicating that the announcement
nearly as much impact as a highly connected AS adopting. This was traced back to the victim AS). We further divide these met-
can result in large variance when using uniform random selection rics into various subgroups, such as in Figure 2, the percent of
of ASes. To decrease the variance, we separate the ASes into three stub or multihomed ASes that were adopting and traced back to
44
BGPy: The BGP Python Security Simulator CSET 2023, August 07–08, 2023, Marina del Rey, CA, USA
the attacker, indicating attacker success. These metrics are easily The announcement validity function can be overridden, e.g.,
extendable to keep track of custom metrics the user wants to track. to implement policies such as ROV. The default announcement
7. Graphs and Example Simulation From here, the resulting validity function only checks for loop-prevention, i.e., returns a
data is output into a CSV, and 18 default graphs are output as well. Boolean indicating if the ASN (of the AS class) is contained within
The user can also use the resulting CSV to create their own graphs the AS Path of the announcement (if so, the announcement is in-
as they wish. The CSV contains, for each scenario, for each percent valid). Notice that default function already allows simulation of path
adoption, the average tracked metrics (percent of ASes that were poisoning[35], as used either by attacks or for traffic engineering.
hijacked, disconnected, or successfully connnected). An example The code in Figure 4 is an example, showing the simple derivation
of these graphs can be seen in Figure 2. of the ROV-AS validity function from BGP-AS. As shown, this only
requires eight lines of code, where we check if the announcement
3.2 Refining the Framework: Empowering is invalid due to a ROA, and otherwise, simply invoke the default
Customization BGP validity function. Other examples can be seen in Figure 9 and
The simulation framework offers a wide range of optional parame- Figure 10.
ters that can be easily configured during setup, providing flexibility
for customization. Moreover, its design enables straightforward sub- 1 class ROVAS(BGPAS):
classing and the creation of simple, personalized functions, which 2 def _valid_ann(self, ann: Ann) -> bool:
3 # If ROA is invalid, ROV says announcement is invalid
we will explore through various examples in the following sections.
4 if ann.invalid_by_roa:
Configurable Parameters. There are many parameters that
5 return False
can be set when running a simulation that can affect various aspects 6 # If ROA is valid, determine validity with BGP
of the analysis. The parameters for the simulator can be seen in 7 else:
Table 3. While the simulator parameters affect all scenarios that a 8 return super(ROVAS, self)._valid_ann(ann)
user wants to compare (for example, setting the python_hash_seed
will make all scenarios deterministic), there are also configuration
options for each individual scenario. For example, setting the Adop- Figure 4: A subclass of BGP AS that implements ROV.
tASCls to ROV will set that specific scenario to defend using ROV,
but other scenarios may have different values for AdoptASCls and
•Scenario Class The Scenario class, used to control attack and
different defensive strategies. The parameters for the various scenar-
defense strategy, can also be easily extended. Common examples
ios can be seen in Table 4. We offer a range of preconfigured attack
include modifying how attackers and victims are selected, how
strategies accompanied by various scenarios for a users selection
adopting ASes are selected, how various metrics get recorded, etc.
(refer to Table 5). It is crucial to distinguish between the defensive
Below we show an example for how easy it is to create a non routed
and attacking strategies implemented in our system. The defensive
prefix hijack scenario (defined in Table 5). With just 15 lines of
strategy can be customized through the parameter AdoptASCls,
code, we can create a completely new attack strategy. Multiple use
whereas the attacking strategy necessitates the use of its dedicated
cases listed in §6 implement a wide variety of custom attacking
subclass. As a result, the scenarios listed in the table implement
strategies.
their own attacking strategy in a subclassed function, while the pa-
rameters still accommodate different defensive strategies, attacker
counts, and more. 1 class NonRoutedPrefixHijack(Scenario):
Extendable Classes. There are also many functions and classes 2 def _get_announcements(self) -> List["Announcement"]:
3 """Returns announcements used to seed the engine"""
that have been written in such a way that users can easily subclass
4 anns = list()
and override them to control almost any aspect of the simulations. 5 for attacker_asn in self.attacker_asns:
For example, §6 lists drastically different simulations that did not 6 anns.append(self.AnnCls(
need to modify the source code of the simulator and were facil- 7 prefix='1.2.0.0/16',
itated by subclassing, made possible by BGPy’s modular design. 8 as_path=(attacker_asn,),
Subclassing allows basically unlimited customizations, and was ex- 9 timestamp=1,
tensively used in different simulations. Here we describe the three 10 seed_asn=attacker_asn,
most common classes that get extended: 11 roa_valid_length=True,
12 roa_origin=0,
• AS Class The AS class controls the routing policy for a given AS.
13 recv_relationship=Relationships.ORIGIN
Within this AS class, a user can easily control decisions such as path 14 ))
selection and export policy. Two common functions that are often 15 return anns
overridden in this class is the function used to rank announcements
(called ‘_new_ann_better’, detailed in §4.2) and the function used
to determine the announcement validity (called ‘_valid_ann’, an Figure 5: A scenario subclass that demonstrates how to im-
example shown in Figure 4). plement a non routed prefix hijack
The ranking of announcements follows the Gao Rexford model
Maybe they can by default, but, when needed, can be tailored, for example, to imple-
through this?
(implement BRP- ment policies such as ‘security first’ or ‘security second’, see §6.1.
RPKI) The user can also easily modify the tie-breaking mechanisms.
45
CSET 2023, August 07–08, 2023, Marina del Rey, CA, USA Justin Furuness, Cameron Morris, Reynaldo Morillo, Amir Herzberg, and Bing Wang
•Announcement Class The Announcement class (containing customers to provide traffic. IXPs and sibling relationships are ex-
the information used for the BGP announcements that are prop- cluded, as these are also excluded in the provided CAIDA topology.
agated) is often extended to contain attributes specific to each
simulation. This can be accomplished easily by simply overriding 4.2 Vally Free Assumptions and Other Routing
the init function of the Announcement class in a subclass with just Policies
a few lines of code, as shown in Figure 6. This is an example of the By default the base AS class (BGPAS) uses BGP and adheres to Gao-
announcement class used to simulate the announcements described Rexford valley-free routing rules, which is in line with other works
in ROV++ [43], which contains a few additional attributes, such as that evaluate the security of inter-domain routing [24]. By default,
the holes in the announcement, whether or not the announcement the BGPAS class also utilizes an export-to-all policy, meaning a
is a blackhole or a preventive announcement, etc. route exported to one provider is exported to all providers. The
same applies to peers and customers.
1 @dataclass(frozen=True, slots=True) Gao-Rexford defines rules for ranking received announcements
2 class ROVPPAnn(Announcement): based on relationship (peer-to-peer or provider-to-customer) and
3 holes: tuple[str] = () export policies based on those relationships. These rules are usually
4 blackhole: bool = False implemented in routers using the Local Preference mechanism in
5 # V3 attributes
the BGP Best-Path Selection process, which follows several steps
6 preventive: bool = False
to determine the best path for a particular prefix. The steps are:
7 attacker_on_route: bool = False
• Local Preference. The local preference of the AS follows the
business incentives of that AS. First, announcements from cus-
Figure 6: A subclass of Announcement that implements ad- tomers are given the highest priority, because customers are paying
ditional attributes needed to simulate ROV++ [43] for the traffic. Then, announcements from peers are considered,
since traffic to and from peers is free. Lastly, announcements from
providers are considered. Provider announcements are the lowest
4 THE BGPY SIMULATION ENGINE for the local preference because this traffic must be paid for.
In this section we describe various aspects of the simulation en- • Shortest AS Path. If multiple announcements have the same
gine, used to simulate BGP with given security policies and attacks. relationship, the announcements are ranked by shortest AS Path.
The simulator engine abstracts away packet level interactions, and
• Tiebreakers. If all other aspects of announcements are equal,
simulates BGP from a high level, propagating BGP announcements
then the AS defaults to tiebreakers. In our simulations, we default
across the AS topology to produce as local RIB at each AS for data
to the lowest ASN to win ties to be consistent with [45] but this
analysis within the Simulation Framework (see §3.1).
can easily be changed and is extendable.
4.1 AS Graph Export Policies. The export policies also follow the business
incentives for an AS by default in the BGPAS class. Announce-
To start our simulations, we build an AS Graph. The simulator ments from customers are sent to peers, providers, and customers.
receives the topology information in the CAIDA serial-2 format Announcements from both peers and providers are sent only to
[10], and, by default, uses the latest CAIDA topology. This topology customers. By default the BGPAS uses an export to all methodology.
is a directed acyclic graph consisting of ASes for nodes and peer to Accuracy of the Valley Free Routing model This method-
peer connections as well as provider to customer connections for ology is one that is widely used across various routing security
edges. studies [11, 25, 28], although we do acknowledge that in the real
Nodes. The nodes in this graph are ASes, by default performing world ASes do not always adhere to this [1, 36, 40, 44]
BGP (for further detail, see §4.2). We differentiate between four Prefix Aggregation. The BGPAS class by default does not do
types of ASes: any prefix aggregation. That is to say that the BGPAS class will
• Stubs. These are ASes that have only one provider and no select an announcement for every single prefix, regardless of the
peers or customers. existence of subprefixes or superprefixes.
• Multihomed. These are ASes which do not have any cus-
tomers, but do have more than one provider, or have one or 4.3 Real World Data
more peers. Often this is done for backup purposes or load There are currently several sources of data that provide information
balancing [59]. on routing policies, particularly for ROV (Route Origin Validation).
• Input Clique. The CAIDA AS topology [10] provides a In Table 4, it is explained that any Scenario class can receive a dic-
strongly connected clique of ASes at the top of the graph tionary of ASN-AS Class key-value pairs as a parameter using the
with less than 20 ASes. hardcoded_asn_cls_dict. This section provides an explanation of the
• Etc. These are ASes that don’t fit into the other categories data sources utilized in a utility function that is integrated within
Edges (relationships). We use the standard edges associated the system. This function generates a hardcoded_asn_cls_dict con-
with ASes and as defined by the CAIDA topology [10, 24]. Peer- taining actual ROV data from the real world. This enhancement
to-peer connections, where traffic flows freely from AS to AS, and ensures that real-world ROV can be employed in any simulation,
customer-provider connections, where providers are paid by their resulting in improved accuracy compared to previous simulators.
It is important to highlight that users have the flexibility to input
46
BGPy: The BGP Python Security Simulator CSET 2023, August 07–08, 2023, Marina del Rey, CA, USA
any dataset or routing policy for a given ASN. As additional rout- ensures that running simulations multiple times will yield identical
ing policies become available, they will be incorporated into our outcomes, despite the initial random generation. This functionality
datasets and simulations. greatly facilitates the reproduction of specific issues and repro-
• ROV RPKI. https://rov.rpki.net/ [50] is a website detailing ASN’s ducible results that might otherwise be challenging to recreate.
and their corresponding ROV confidence.
• Cloudflare’s https://isbgpsafeyet.com. [14] This website accu- • YAMLable Every object in BGPy can be seamlessly converted
rately identifies a user’s ISP’s properties, including its ROV adop- to and from YAML, providing us with a convenient way to save So we actually
different parts of the state. For instance, we can easily dump the do not
tion and whether it follows the default policy or a variant that only simulate
filters announcements from peers. entire AS graph along with all the announcements into YAML. Ad- internet scale
or closer to
• Revisiting RPKI Route Origin Validation on the Data Plane. ditionally, we can save traceback results and other metrics in YAML that, we just
[51] Various metrics were utilized in this study to acquire ROV format. This greatly simplifies the comparison between the YAML simulating few
AS relations
ASNs along with their corresponding confidence levels. generated during testing and the ground truth YAML files we gen-
Is this the case
erate ourselves. This functionality is made possible by YAMLable with BGPEval,
4.4 Running a Scenario with the Simulation [61], which necessitates custom functions for YAML conversion if so why it take
too much time?
in each class. We prefer YAML over JSON because it allows direct
Engine
conversion between Python objects, unlike JSON. While pickling
When we are running a Scenario class , we must first insert the in Python also supports this feature, pickled objects are not easily
attacker’s announcements in the local RIB of the attacker AS, and readable by humans, and pickling highly recursive objects such as
the victim’s announcements in the local RIB of the victim AS. These the AS graph proved quite challenging. YAML, on the other hand,
initial announcements are defined in the Scenario class, for an is both human-readable and supports conversion to and from even
example see Figure 5. After inserting these announcements, we the most complex and recursive Python objects.
then run the simulation engine. This performs propagation, to
• Diagrams Our test suite includes a feature that allows users to
propagate the announcements throughout the internet and the AS
visually represent smaller AS graphs, the local RIB, and various
topology.
metrics as diagrams prior to running newly implemented routing
• To start with, we propagate the announcements from customers policies on the full topology. These diagrams are generated using
to providers, all the way up the graph. graphviz [2] and serve as valuable tools for debugging. Without a
• Then, we propagate announcements across peer connections. visual representation of the AS topology in small graphs, identifying
• Finally, we propagate announcements from providers to cus- logic problems becomes extremely challenging.
tomers. To generate these diagrams, input requirements include an AS
We perform the propagation of announcements in this way to topology and a scenario. The simulation is then executed, resulting
simulate a moment in time for the AS topology. With this method- in the display of the AS graph and associated metrics.
ology, our simulator converges after a single round of propagation, For instance, consider Figure 7. Each AS in the diagram contains
which significantly reduces the run time of simulations. With this the ASN, policy information (in other words the label of the AS
method we avoid a lot of the run time constraints detailed in many class), and its local RIB. The local RIB table includes the prefix, AS
other packet-level or discrete event simulators that must converge Path, and origin for each announcement. Circular shapes repre-
where simulations required convergence [13, 18, 19, 22, 55], result- sent BGP ASes (as also indicated in their policies listed), while the
ing in many rounds of propagation for a single trial, drastically octagon shapes denote adoption of a different policy (as seen in Fig-
increasing the runtime. ure 7 with ASes 3 and 4 adopting ROV). ASes 666 (the attacker) and
Our method of propagation has time complexity of 𝑂 (𝐸) per 777 (the victim) have an additional ring around them, signifying that
propagation round, where 𝐸 is the number of edges in the AS they are the origins of announcements. The arrows between ASes
topology. By default the AS graph converges after a single round of represent provider-to-customer connections, while peer-to-peer
propagation, however, if there are user-defined routing policies that connections (shown in Figure 11) are denoted by dashed lines.
delay the convergence of the AS graph, the propagation_rounds The color automatically determined for each AS indicates the
is a parameter that is easily modifiable. This runtime allows us to outcome of the AS on the data plane for the most specific prefix
be able to run our simulations on a standard laptop. For more per- (in Figure 7, it is /24). Differentiating between the control plane
formance enhancements, future improvements, and various bench- and the data plane is crucial for understanding these outcomes, as
marks, refer to Appendix C. they often differ. For example, in the control plane of Figure 7, AS
3 appears unaffected by hijacking, as it does not contain any of
5 VERIFICATION AND TESTING the attacker’s announcements within it’s local RIB. However, upon
performing traceback, we discover that on the data plane, traffic
Routing attacks and defenses are subtle; it is all too easy to make
for AS 3 is routed to AS 1, which then forwards all traffic within
hard-to-detect errors. Therefore, verification and testing are critical.
the /24 prefix to AS 666, the attacker. Green represents successfully
BGPy includes the following mechanisms to assist in testing and
routed announcements to the victim for the most specific prefix
verification.
(/24 in Figure 7). Gray (shown in Figure 11) indicates that an AS
• Deterministic randomness Setting the PYTHON_HASH_SEED is disconnected. An aggregated table presents the metrics for at-
in both the Python interpreter’s environment and the simulator tacker success, victim success, and disconnections. Additionally,
parameters (which seeds the random module, as shown in Table 3)
47
CSET 2023, August 07–08, 2023, Marina del Rey, CA, USA Justin Furuness, Cameron Morris, Reynaldo Morillo, Amir Herzberg, and Bing Wang
users have the option to label the graph and provide a description. extensions to the core BGPy mechanisms and did not require any
The emoji on the rightmost column denotes the origin, with the modification to the simulator source code. Not only do these use
angel being the victim, the ‘devil’ being the attacker, and a shield cases use the wide variety of parameters available, they also have
being a preventive announcement (as shown in Figure 11) implemented many of their own AS subclasses for their defensive
policies, many different attacker scenarios, subclassed many differ-
ent aspects of the simulations themselves, etc.
System Test Suite All of the features listed in this section are 6.2 Attacker Collusion
utilized in our system test suite. A user has the ability to define a The simulator’s versatility is supported by its use in developing
custom AS topology, along with a custom scenario, and the system and simulating policies that can cope with multi-attacker hijack
test suite will automatically perform the following: scenarios; ones of which the attackers may or may not be colluding.
(1) Creates an AS topology with custom routing policies. This means orchestrating a number of attackers as a group/team to
(2) Runs a scenario on said AS topology, such as a subprefix launch an attack on a target to achieve a particular goal, such as
hijack. decreasing successful connections to the origin.
(3) Records all the metrics associated with that topology, such A policy that deals with such attacks was created by subclassing
as number of ASes hijacked, etc. the BGP_AS and override method(s) related to how it processes
(4) Displays this topology and resulting output in a diagram for the announcements. In order to vary the number of attackers, one
easy visualization (described in the bullet for diagrams, see simply needs to provide an argument of how many attackers are
an example in Figure 7). in the system via the Simulation class (see Table 3 for a full list
(5) Compares it to ground truth YAML that is saved and vetted of parameters), and the attacker announcements would need to be
in advance. constructed and coordinated in a subclass of the Scenario class.
Additionally, the ground truth would be difficult and time con- Auxiliary entities, such as a centralized server, were integral
suming for a user to generate due the vast amount of text that to this policy. Although BGPy doesn’t have a centralized server
would be required to write these YAML AS graphs by hand. Instead, as a component of its architecture, it provides the flexibility to
the ground truth can be auto generated, and the user can simply incorporate it. As in this case, the server was created as a class
verify that it is accurate, saving countless hours of manual work. which became a member attribute of the new Scenario class.
Across all of our various use cases we have hundreds of sys-
tem tests. The system tests suite can even be used to verify other
6.3 Path Security Policies
simulation engines. BGPy has been extended to evaluate defenses against path manip-
ulation attacks, where an attacker modifies the AS Path or other
6 USE CASES attributes of an announcement. Notably, this facilitates evaluation
This section describes some ways in which BGPy has been used
for BGP security research so far. All of these use cases involved
48
BGPy: The BGP Python Security Simulator CSET 2023, August 07–08, 2023, Marina del Rey, CA, USA
of BGPsec [34] and alternative mechanisms that provide Path Secu- [5] Tony Bates, Geoff Huston, and Philip Smith. [n. d.]. CIDR REPORT for 15 May
rity. The BGPsecAS class extends the default path selection mecha- 23. https://www.cidr-report.org/as2.0/
[6] BGPStream. 2018. BGPMON’s BGP Stream incident alert service. https:
nism to validate and prefer signed paths over unsigned paths and //bgpstream.com.
extends the default export function to add signatures in outgoing [7] Jay Borkenhagen. 2019. AT&T/as7018 now drops invalid prefixes from peers.
Email. https://mailman.nanog.org/pipermail/nanog/2019-February/099501.html
announcements. The Announcement class is extended to support NANOG Mailing List Archive.
propagating signatures. A new scenario subclass evaluates defenses [8] Kevin Butler, Toni R. Farley, Patrick McDaniel, and Jennifer Rexford. 2010. A
against origin hijacks, where the attacker announces an AS path Survey of BGP Security Issues and Solutions. Proc. IEEE 98, 1 (2010), 100–122.
[9] CAIDA. [n. d.]. CAIDA BGPStream twitter. https://twitter.com/bgpstream/.
indicating it is a neighbor of the legitimate origin. These attacks [10] CAIDA. 2016. The CAIDA AS Relationships Dataset.
evade detection by ROV since the origin is valid, and may become http://www.caida.org/data/as-relationships/.
more common as RPKI/ROV adoption continues to increase. A [11] Haowen Chan, Debabrata Dash, Adrian Perrig, and Hui Zhang. 2006. Modeling
Adoptability of Secure BGP Protocols. In Proc. of SIGCOMM. ACM.
mechanism like BGPsec is necessary to detect and prevent such [12] Zhiguo Chen, Xin Wang, Rui Zhang, Vern Paxson, and Stefan Savage. 2014.
attacks. Characterizing and detecting malicious behavior in bgp routing. In Proceedings
of the 2014 ACM SIGCOMM conference on computer and communications security.
The Path Security extensions also include support for evaluating ACM, 103–114.
defenses against route leaks [53], where an AS announces routes in [13] Zhe Chen, Daqiang Zhang, and Yinxue Ma. 2015. Modeling and analyzing the
violation of common export policy. Importantly, a route leak may convergence property of the BGP routing protocol in SPIN. Telecommunication
Systems 58 (03 2015). https://doi.org/10.1007/s11235-014-9870-y
redirect a substantial amount of traffic even without manipulating [14] Cloudflare. 2023. Is BGP Safe Yet? https://isbgpsafeyet.com/
the AS path. In other words, even if BGPsec was fully deployed, a [15] Avichai Cohen, Yossi Gilad, Amir Herzberg, and Michael Schapira. 2016. Jump-
route leak could still have the potential to cause harm and redirect starting BGP security with path-end validation. In Proc. of ACM SIGCOMM. ACM,
342–355.
internet traffic. Separate solutions, such as ASPA [3] are needed to [16] J.H. Cowie, D.M. Nicol, and A.T. Ogielski. 1999. Modeling the global Internet.
protect against route leaks. Computing in Science & Engineering 1, 1 (1999).
[17] Remy de Boer and Javy de Koning. 2013. BGP Origin Validation (RPKI). Technical
Report. Univeristy of Amsterdam, Systems and Network Engineering Group.
7 CONCLUSION [18] Christos Dimitropoulos and Greg Riley. 2008. SSFNet: A Scalable Simulation
Framework for BGP. IEEE Journal on Selected Areas in Communications 26, 1
We presented BGPy, an open-source BGP Python simulator de- (2008), 158–167. https://doi.org/10.1109/JSAC.2007.357111
signed to analyze various attack and defense security scenarios [19] Xenofontas A. Dimitropoulos and George F. Riley. 2006. Efficient Large-Scale
https://github.com/jfuruness/bgpy. We showcased the simulator BGP Simulations. Comput. Netw. 50, 12 (aug 2006), 2013–2027. https://doi.org/
10.1016/j.comnet.2005.09.033
framework and functionality, and demonstrated it’s ease of ex- [20] Nick Feamster, Jonathan Winick, and John Rexford. 2004. A model of BGP
tensibility and use. We detailed the simulation engine and design, routing for network engineering. In Proceedings of the 2004 ACM SIGMETRICS
documenting how we model the AS topology and it’s correspond- international conference on Measurement and modeling of computer systems. ACM,
191–202. https://doi.org/10.1145/1012888.1005726
ing routing policies. We also demonstrated the BGPy test suite, [21] Anja Feldmann, Albert Greenberg, Carsten Lund, Nick Reingold, and John Rex-
and showed how it can make simulations both easy to debug and ford. 2000. Netscope: Traffic engineering for ip networks. IEEE Network 14, 2
(2000), 11–19.
robust. We cataloged numerous performance metrics and enhance- [22] Tony Feng and Rob Ballantyne. 2004. Implementation of bgp in a network
ments. We described several use cases where BGPy is already in use, simulator. Proc. Applied Telecommunications Symposium, ATS’04 (01 2004).
and highlighted it’s extendability and usefulness. We will continue [23] Python Software Foundation. 2023. Python 3.11. https://www.python.org/
downloads/release/python-311/
to improve upon it’s limitations as future work, as described in [24] Lixin Gao and Jennifer Rexford. 2001. Stable Internet Routing without Global
Appendix A. Coordination. IEEE/ACM Trans. Netw. 9, 6 (dec 2001), 681–692. https://doi.org/
10.1109/90.974523
[25] Yossi Gilad, Avichai Cohen, Amir Herzberg, Michael Schapira, and Haya Shulman.
ACKNOWLEDGMENTS 2017. Are We There Yet? On RPKI’s Deployment and Security. In NDSS. The
We thank the anonymous reviewers for their suggestions. This Internet Society.
[26] Phillipa Gill and Nick Feamster. 2012. Modeling on Quicksand: Dealing with
manuscript is based upon work supported by the National Science the Scarcity of Ground Truth in Interdomain Routing Data. ACM SIGCOMM
Foundation under Grant No. 2247810. Prof. Herzberg was partially Computer Communication Review 42, 4 (2012), 107–118. https://people.cs.umass.
edu/~phillipa/papers/QuickSand.pdf
supported by an endowment from Comcast. Any opinions, findings, [27] Phillipa Gill, Michael Schapira, and Sharon Goldberg. 2011. Let the market
and conclusions or recommendations expressed in this material are drive deployment: A strategy for transitioning to BGP security. ACM SIGCOMM
those of the author(s) and do not necessarily reflect the views of Computer Communication Review 41, 4 (2011), 14–25.
[28] Sharon Goldberg, Michael Schapira, Pete Hummon, and Jennifer Rexford. 2014.
the National Science Foundation or of Comcast. How secure are secure interdomain routing protocols? Computer Networks 70
(2014), 260–287.
REFERENCES [29] Amir Herzberg, Matthias Hollick, and Adrian Perrig. 2015. Secure Routing for
Future Communication Networks (Dagstuhl Seminar 15102). Dagstuhl Reports 5,
[1] Ruwaifa Anwar, Haseeb Niaz, David Choffnes, Italo Cunha, Phillipa Gill, and 3 (2015), 28–40. https://doi.org/10.4230/DagRep.5.3.28
Ethan Katz-Bassett. 2015. Investigating Interdomain Routing Policies in the Wild. [30] G. Huston, M. Rossi, and G. Armitage. 2011. Securing BGP: A literature survey.
In Proc. of ACM IMC. IEEE Communications Surveys & Tutorials 13, 2 (2011), 199–222.
[2] The NetworkX Developers Aric Hagberg. 2022. Graphviz. https://graphviz.org/ [31] Cheng Jin, Qian Chen, and Sugih Jamin. 2000. Inet: Internet Topology Generator.
[3] Alexander Azimov, Eugene Bogomazov, Randy Bush, Keyur Patel, and Job Sni- IEEE/ACM Transactions on Networking 8 (11 2000), 753–765. Issue 6. https:
jders. 2022. BGP AS_PATH Verification Based on Resource Public Key Infrastruc- //doi.org/10.1109/TNET.2000.880966
ture (RPKI) Autonomous System Provider Authorization (ASPA) Objects. Internet- [32] Josh Karlin, Stephanie Forrest, and Jennifer Rexford. 2008. Autonomous security
Draft draft-ietf-sidrops-aspa-verification-09. Internet Engineering Task Force. for autonomous systems. Computer Networks 52 (10 2008), 2908–2923. https:
https://datatracker.ietf.org/doc/draft-ietf-sidrops-aspa-verification/09/ Work in //doi.org/10.1016/j.comnet.2008.06.012
Progress. [33] S. Kent and K.seo. 2012. An Infrastructure to Support Secure Internet Routing. RFC
[4] Hitesh Ballani, Paul Francis, and Xinyang Zhang. 2007. A Study of Prefix Hi- 6480. The Internet society. http://tools.ietf.org/html/rfc6480
jacking and Interception in the Internet. In Proc. of ACM SIGCOMM. 265–276. [34] M. Lepinski (Ed.) and K. Sriram (Ed.). 2017. BGPsec Protocol Specification. RFC
https://doi.org/10.1145/1282380.1282411 8205 (Proposed Standard). https://doi.org/10.17487/RFC8205 Updated by RFC
8206.
49
CSET 2023, August 07–08, 2023, Marina del Rey, CA, USA Justin Furuness, Cameron Morris, Reynaldo Morillo, Amir Herzberg, and Bing Wang
[35] Thomas B London, Stephen R Hanna, and Kevin L Carter. 2000. Path poisoning in [61] Jacob Vlijm. 2022. yamlable. https://github.com/jacobvlijm/yamlable
bgp. In Proceedings of the 2000 IEEE international conference on network protocols. [62] Matthias Wählisch, Olaf Maennel, and Thomas C. Schmidt. 2012. Towards
IEEE, 191–198. detecting BGP route hijacking using the RPKI. In Proc. of ACM SIGCOMM. 103–
[36] H. Madhyastha, E. Katz-Bassett, T. Anderson, A. Krishnamurthy, and A. 104. https://doi.org/10.1145/2342356.2342381
Venkataramani. 2009. iPlane Nano: Path Prediction for Peer-to-Peer Applications. [63] M. Wojciechowski. 2008. Border gateway protocol modeling and simulation. Mas-
In Proce of NSDI. ter’s thesis. University of Warsaw.
[37] Mohammadreza Maleki, Mohammad Mahdi Hajian, and Mohammad R Sadeghi. [64] J. Wu, Y. Zhang, Z. M. Mao, and K. Shin. 2007. Internet routing resilience to
2018. BGP anomalies: A survey of detection methods and their limitations. IEEE failures: Analysis and implications. In Proc. of CoNEXT.
Communications Surveys & Tutorials 20, 3 (2018), 2028–2055. [65] Ellen W Zegura, Kenneth L Calvert, and Samrat Bhattacharjee. 1996. How
[38] Niko Matsakis. 2023. PyO3: Bringing Python to Rust. https://github.com/PyO3/ to model an internetwork. In Proceedings of the IEEE conference on computer
pyo3. communications (INFOCOM), Vol. 2. IEEE, 594–602.
[39] Jared Mauch. [n. d.]. BGP Routing Leak Detection System. https://puck.nether. [66] Rui Zhang, Zhiguo Chen, Xin Wang, Vern Paxson, and Stefan Savage. 2015.
net/bgp/leakinfo.cgi/leaks-majornet.txt. BGPMon: A system for monitoring and detecting malicious behavior in bgp. In
[40] R. Mazloum, M. Buob, J. Auge, B. Baynat, D. Rossi, and T. Friedman. 2014. Viola- Proceedings of the 2015 ACM SIGSAC Conference on Computer and Communications
tion of Interdomain Routing Assumptions. In Proc. of Passive and Active Measure- Security. ACM, 1591–1602.
ment Conference (PAM).
[41] Alberto Medina, Anukool Lakhina, Ibrahim Matta, and John Byers. 2001. BRITE:
an approach to universal topology generation. In MASCOTS 2001, Proceedings A FUTURE WORK AND LIMITATIONS
Ninth International Symposium on Modeling, Analysis and Simulation of Computer
and Telecommunication Systems. 346–353. https://doi.org/10.1109/MASCOT.2001.
BGPy is also currently being iterated on, and we leave a few im-
948886 provements as future work.
[42] Asya Mitseva, Andriy Panchenko, and Thomas Engel. 2018. The state of affairs
in BGP security: A survey of attacks and defenses. Computer Communications
• Python BGPy is implemented in Python [23], which unlike lower
124 (June 2018), 45–60. level languages has higher runtime and memory constraints. We
[43] Reynaldo Morillo, Justin Furuness, Amir Herzberg, Cameron Morris, James Bres- recommend for BGPy to set aside 1-2GB per core used in simulations
lin, and Bing Wang. 2020. ROV++: Improved Deployable Defense against BGP
Hijacking. In Proceedings of the Network and Distributed System Security Sympo- and for 1000 trials it takes about 16 minutes per policy per core. In
sium. https://doi.org/10.14722/ndss.2021.24438 our experience working on these types of simulations, the pros with
[44] W. Mühlbauer, A. Feldmann, O. Maennel, M. Roughan, and S. Uhlig. 2006. Building the faster development time and flexible extensions far outweighs
an AS-topology model that captures route diversity. In Proc. of SIGCOMM.
[45] C. Perkins, P. Calhoun, and J. Bharatia. 2007. Mobile IPv4 Challenge/Response the negatives of a higher level language. We found that developer
Extensions (Revised). RFC 4721 (Proposed Standard). https://doi.org/10.17487/ time was the main constraint as we wanted to evaluate security
RFC4721
[46] Brian J. Premore. 2003. An Analysis of Convergence Properties of the Border
ideas quickly. However, we plan on improving on this, and we
Gateway Protocol Using Discrete Event Simulation. Ph. D. Dissertation. Dartmouth believe that we can use Rust bindings using the PyO3 crate [38]
College. for the speed of a lower level language while still maintaining the
[47] B. Quoitin and S. Uhlig. 2005. Modeling the routing of an autonomous system
with C-BGP. IEEE Network 19, 6 (2005), 12–19. https://doi.org/10.1109/MNET. flexibility and ease of development of a higher level language. See
2005.1541716 a more in depth discussion in C.
[48] Y. Rekhter (Ed.), T. Li (Ed.), and S. Hares (Ed.). 2006. A Border Gateway Protocol 4
(BGP-4). RFC 4271 (Draft Standard). https://doi.org/10.17487/RFC4271 Updated • Discrete Event Simulations BGPy is not designed to be a dis-
by RFCs 6286, 6608, 6793, 7606, 7607, 7705, 8212, 8654, 9072. crete event simulator. BGPy assumes a completely synchronous
[49] Renesys. [n. d.]. The New Threat: Targeted Internet Traffic Misdirection. http: system for each propagation round for efficiency purposes. By
//www.renesys.com/2013/11/mitm-internet-hijacking/.
[50] Andreas Reuter, Randy Bush, Italo Cunha, Ethan Katz-Bassett, Thomas C Schmidt, avoiding discrete event simulations, we can avoid the need to allow
and Matthias Wählisch. 2018. Towards a rigorous methodology for measuring the AS graph to converge, which significantly reduces the runtime
adoption of RPKI route validation and filtering. ACM SIGCOMM Computer
Communication Review 48, 1 (2018), 19–27. Online service: https://rov.rpki.net/.
for these simulations. However, as a parameter in the simulator, a
[51] Nils Rodday, Ítalo S. Cunha, Randy Bush, Ethan Katz-Bassett, Gabi Dreo Rodosek, user of the simulator can set the amount of rounds of propagation,
Thomas C. Schmidt, and Matthias Wählisch. 2021. Revisiting RPKI Route Origin so in theory they can do as many as they need. BGPy is also extend-
Validation on the Data Plane. In 5th Network Traffic Measurement and Analy-
sis Conference, TMA 2021, Virtual Event, September 14-15, 2021, Vaibhav Bajpai, able, and can be extended to propagate until convergence, and we
Hamed Haddadi, and Oliver Hohlfeld (Eds.). IFIP. http://dl.ifip.org/db/conf/tma/ can add this as a core feature if other developers wish. That being
tma2021/tma2021-paper11.pdf said, propagating more rounds quickly becomes infeasible due to
[52] Muhammad S. Siddiqui, Diego Montero, Rene Serral-Gracia, Xavi Masip-Bruin,
and Marcelo Yannuzzi. 2015. A survey on the recent efforts of the Internet runtime constraints. We hope to revisit this as future work once
Standardization Body for securing inter-domain routing. Computer Networks 80 the simulator contains Rust bindings using the PyO3 crate [38]. See
(April 2015), 1–26.
[53] K. Sriram, D. Montgomery, D. McPherson, E. Osterweil, and B. Dickson. 2016.
C for a more in depth discussion on performance enhancements.
Problem Definition and Classification of BGP Route Leaks. RFC 7908 (Informa- • High Level BGP BGPy focuses on high level aspects of BGP,
tional). https://doi.org/10.17487/RFC7908
[54] Cisco Systems. 2023. Cisco WAN Automation Engine (WAE). https://www.cisco.
such as best announcement selection, import and export policy,
com/c/en/us/products/routers/wan-automation-engine/index.html etc. Due to runtime constraints we do not simulate many of the
[55] Boleslaw Szymanski, Yu Liu, and Rashim Gupta. 2003. Parallel network simu- lower level BGP aspects. We also have limited knowledge of the
lation under distributed Genesis. In Proceedings of the 2003 ACM SIGMETRICS
international conference on Measurement and modeling of computer systems. 61– AS graph. For example, the CAIDA topology [10] does not include
68. https://doi.org/10.1145/824475.825869 sibling relationships or IXPs. We do not include information about
[56] PyPy Team. 2023. PyPy. https://pypy.org/ prefix aggregation, AS blacklists, etc. However, all of these can be
[57] Andree Toonk. 2011. Indosat a Quick Report. http://www.bgpmon.net/hijack-
by-as4761-indosat-a-quick-report/. addressed and added by a user to any level of granular detail. The
[58] Andree Toonk. 2014. Turkey Hijacking IP Addresses for Popular Global DNS graph and routing policies are extendable, and just like we have
Providers. BGPMon.
[59] Maarten C van Steen, Arno P Akkermans, and Henk J Sips. 2008. Multihoming
added support for ROV, with the appropriate data sources a user
in the internet: A survey. IEEE Communications Surveys & Tutorials 10, 2 (2008), can add support for any number of these enhancements. We leave
372–392. this as future work.
[60] Pierre-Antoine Vervier, Olivier Thonnard, and Marc Dacier. 2015. Mind Your
Blocks: On the Stealthiness of Malicious BGP Hijacks. In NDSS.
50
BGPy: The BGP Python Security Simulator CSET 2023, August 07–08, 2023, Marina del Rey, CA, USA
B TEST SUITE EXAMPLE announcements across the graph. However, due to the inherent
In this section of the appendix, we showcase an example of one of slowness of constructing these objects in Python, we have devised
the diagrams auto generated from a single system test in figure 8, a plan for performance enhancements for future work. Our strat-
a larger version of Figure 7. There are hundreds like it across the egy involves transitioning to Rust bindings within Python using
many use cases the simulator is used for. For a detailed description powerful tools like PyO3 [38]. By leveraging this approach, we will
of this diagram, please refer to 5 under the bullet ‘diagrams’ be able to facilitate a Python package that incorporates Rust’s capa-
bilities. This transition is expected to yield significant performance
C PERFORMANCE AND BENCHMARKS enhancements without sacrificing extensibility whilst maintaining
all of the functionality of the current implementation.
The Internet is large, with over 80,000 ASes [5], hence, simulations
Tradeoffs between speed and ease of use. The BGP Rout-
of BGP attacks and defenses can require significant computation
ing Information Base (RIB) contains the routes known by a BGP
time. This can be a concern, especially considering that BGPy uses
router, and it is divided into three parts. The Adj-RIBs-In holds all
Python, an interpreted language which is not as efficient as com-
routes received from neighbors, the Local RIB holds the currently
piled language such as C or C++. However, we found that BGPy
selected best routes, and the Adj-RIBs-Out holds routes selected
is sufficiently efficient, to allow simulations on standard laptops
for advertisement to neighbors. The BGP RFC [48] emphasizes that
to complete in reasonable time frames. For our benchmarks (in
the distinction between these parts is purely conceptual and that
Table 1), we used the following parameters defined in Table 2:
implementations need not actually maintain separate copies of the
Python 3.11 [23] was used as the default interpreter, for a total
data in memory. Omitting copies of data can substantially decrease
runtime of about 30 minutes per adopted policy on a laptop. We
run time and memory use, however, in some cases it is convenient
highly recommend the use of PyPy [56], a just-in-time Python
to have these data structures easily accessible. Searching for alter-
compiler that requires no changes to the Python code. With PyPy,
nate routes, for example, is straightforward when the Adj-RIBs-In
on the same machine, the total runtime was about 16 minutes per
is available. Similarly, withdrawn routes can be easily computed
adopted policy.
from the Adj-RIBs-Out.
We also showcase how the simulations scale linearly with the
For this reason, we offer users a choice between two base BGP
number of CPU cores, which is made possible because trials are
ASes. The BGPAS class maintains copies of the RIB data structures
treated as independent so each CPU core can run any subset of
as they are defined in the RFC. This class is best suited for poli-
trials. We present several instances where utilizing cloud compute So there are
cies that require multiple rounds of propagation and may cause research on
services can drastically speed up runtimes. We recommend allowing pruning the AS
withdrawn routes. We also offer a light-weight alternative class,
for 1-2GB per core for RAM. graph? We can
They present BGPSimple. The BGPSimple class only has a Local RIB and does not utilize these?
some
Optimizations. The performance breakdown reveals that 70% Also are these
simulate withdrawn routes, the Adj-RIBs-In, or the Adj-RIBs-Out. used in
complexity of the total run time is dedicated to running the simulation engine BGPEval?
calculations of For many defenses, including ROV and the policies defined in [43],
possible ROV and propagating the announcements. Another 25% is allocated to
validation for the simplified AS model is sufficient. This drastically reduces the
performing the traceback and data analysis (refer to §3.1). Interest-
real world complexity of simulating scenarios with these policies, saving time
scenarios we ingly, BGPy employs a naive recursive traceback approach (without
might need to and memory by avoiding the operations to populate those data
reason on this memoization) that may trace back the same sub-path multiple times.
structures.
before fully Surprisingly, we discovered that this method is more efficient than
committing to Other simulators in the past have employed various techniques
development storing the results for already traced paths. The reason behind this
process for speedups, such as modifying the AS graph to reduce it in size
lies in the shallow nature of the AS graph, where the average path
[31, 41, 65]. This can be done by doing things such as removing stub
length is just 4. Thus, the savings achieved from reduced tracebacks
ASes (since they should have identical Local RIBs to their providers),
outweigh the overhead of storage and lookup operations, including
amongst other techniques. For our simulations we do not employ
hashing.
these techniques, since certain policies and scenarios may utilize
In the real world, it is common for each autonomous system (AS)
the ASes that were removed for attack or defense, based on their
to calculate the ROV validity for every announcement, resulting in
respective policy.
a run time of O(V*A), where V represents the number of ASes (ver-
tices) and A denotes the number of announcements. To avoid this
lengthy computation, for our simulations we simplify the process
D SIMULATOR AND SCENARIO PARAMETERS
by determining the ROV validity at the originating AS and adding it In this section of the appendix, we include parameters to the simu-
as an attribute to the announcement. As the announcement spreads lator (Table 3) and to the scenario class (Table 4). We also showcase
across the internet, the ROV validity remains constant since it is a table of provided scenario’s in Table 5.
based on the prefix-origin pair. Consequently, there is no need to In the simulator parameters, we detail various options that af-
recalculate the ROV validity, reducing the run time for this calcu- fect the entire simulation and all scenarios contained within that
lation to O(A). Typically, our simulations involve only one or two simulation. For instance, by setting the number of trials to 1000, all
announcements at the victim and attacker, making this operation scenarios will be run 1000 times.
exceptionally fast. In contrast, each scenario that is being compared is also highly
Anticipated Performance Enhancements .Our performance configurable. As in the example showcased in Figure 3, one scenario
profiling has revealed that more than 70% of the overall run time has an adopting AS of ROV, and a different scenario has an adopting
is dedicated to executing the simulation engine and duplicating AS of a different ROV variant that filters only by peers.
51
CSET 2023, August 07–08, 2023, Marina del Rey, CA, USA Justin Furuness, Cameron Morris, Reynaldo Morillo, Amir Herzberg, and Bing Wang
Parameter Description
percent_adoptions A list of percentages representing the adoption rate of the defensive policy among ASes (Autonomous
Systems). Special options are available to set the adoption rate to 0% (with only one AS adopting) or
100% (with all but one AS adopting).
scenario A list of configuration to be analyzed in the simulation. For example, analyzing the effects of subprefix
hijack by an attacker and the adoption of ROV (Route Origin Validation) by the victim. Refer to table 5
for the default available scenario options, and see 3 for usage examples
num_trials The total number of trials to be executed. Even a small number of trials (e.g., 100) can provide a clear
understanding of the graph, but we typically recommend running 1000 or more trials in our simulations.
propagation_rounds The number of rounds of propagation. In most cases, the graph converges after a single round of
propagation. However, in certain scenarios, multiple rounds are necessary. An attacker that behaves
differently based on how other ASes respond to its attack, for example, would require multiple rounds
to simulate.
output_path Specifies the output path for the generated graphs.
parse_cpus The number of CPU cores to be utilized for multiprocessing. For the scenario’s included by default,
approximately 1-2GB of memory is required per core.
python_hash_seed When set to an integer, enables deterministic runs. The AS graph is complex, and certain edge cases
may only arise when running thousands of trials. This option facilitates debugging such problems and
enables reproducibility.
Table 3: Parameters for the Simulator. To see an example of usage, see figure 3. Simulator parameters affect the entire simulation
and all scenarios.
The wide variety of parameters offer ease of customization to customers and providers, so this routing security policy offered
each unique simulation without requiring any code changes. little benefit, as shown in Figure 2.
52
BGPy: The BGP Python Security Simulator CSET 2023, August 07–08, 2023, Marina del Rey, CA, USA
Parameter Description
AnnCls Announcement class to be used in the simulation. This allows users to easily create their own an-
nouncements with additional path attributes. This ability have been used in several use cases including
the one described in section § 6.1).
BaseASCls Base AS class to be used in the simulation. This is the default class that all ASes will adopt unless
otherwise specified in the hardcoded_asn_cls_dict. The default base AS class is BGP.
AdoptASCls This is the adopting AS class that will be adopted at each data point for the specified
percent_adoptions in the simulation parameters. For example, at 5% adoption, 5% of the AS graph will
adopt this class for its routing policies. One example of this would be ROV (Route Origin Validation).
num_attackers The number of attackers that will be randomly selected. The default value is one attacker.
num_victims The number of victims that will be randomly selected to announce the legitimate announcement. The
default value is one victim.
hardcoded_asn_cls_dict A dictionary of key-value pairs, where the key is the ASN (Autonomous System Number), and the
value is the class that the ASN should adopt. By default, this dictionary is empty. For example, there is
a utility function included that can pass in the real-world ROV ASes to be set into this dictionary.
adopting_subcategories This is a collection of subcategories in which the adoption will be evenly distributed. Presently, it
comprises stubs and multihomed ASes, input clique ASes, and the remaining ASes referred to as "etc
ASes." To clarify, if we specify a 10% adoption rate for ASes, it means that 10% of stubs and multihomed
ASes, 10% of input clique ASes, and 10% of etc ASes will adopt.
Table 4: Additional Parameters for the scenarios contained within the simulation. To see an example of usage, see Figure 3.
Scenario parameters affect only a single scenario that is being compared (for example, one scenario may adopt ROV, while the
other scenario may adopt a different ROV variant), not the entire simulation.
Scenario Description
PrefixHijack Both the attacker and the victim announce the same prefix. The victim’s announcement is
covered by a ROA.
SubprefixHijack The victim announces a prefix that is covered by a ROA. The attacker announces a subprefix of
this.
NonRoutedPrefixHijack The attacker announces a non-routed prefix that is not covered by a ROA (Route Origin
Authorization), while the victim announces nothing.
NonRoutedSuperprefixHijack The attacker announces the superprefix of a non-routed prefix. The non-routed prefix is covered
by a ROA; however, the superprefix is not. The victim announces nothing.
NonRoutedSuperprefixPrefixHijack The attacker announces the superprefix of a non-routed prefix, as well as the non-routed
prefix. The non-routed prefix is covered by a ROA; however, the superprefix is not. The victim
announces nothing.
SuperprefixPrefixHijack The victim announces a prefix that is covered by a ROA. The attacker announces the same
prefix as the victim, as well as a superprefix that is not covered by a ROA.
Table 5: Default scenarios included in the simulator. Hijacks come from [43, 60].
53
CSET 2023, August 07–08, 2023, Marina del Rey, CA, USA Justin Furuness, Cameron Morris, Reynaldo Morillo, Amir Herzberg, and Bing Wang
Figure 8: Test Suite Diagram Example. This diagram was dynamically generated from an input of an AS topology and the two
originating announcements (at AS 666 and AS 777). For a detailed description of this diagram, please refer to §5 under the bullet
‘diagrams’. For a more complex example containing disconnections, peers, and different types of announcements, see Figure 11.
1 class PeerROVAS(BGPAS):
2 def _valid_ann(self, ann: Ann) -> bool:
3 # If ROA is invalid and announcement is from a peer this ROV variant says the announcement is invalid
4 if (ann.invalid_by_roa and ann.recv_relationship == Relationships.PEERS):
5 return False
6 # If ROA is valid or announcement is not from a peer, determine validity with BGP
7 else:
8 return super(PeerROVAS, self)._valid_ann(ann)
Figure 9: A subclass of BGPAS that implements PeerROV, an ROV variant that only filters announcements received from peers.
54
BGPy: The BGP Python Security Simulator CSET 2023, August 07–08, 2023, Marina del Rey, CA, USA
1 class ROVPPV2LiteSimpleAS(ROVPPV1LiteSimpleAS):
2
6 if ann.blackhole:
7 if self._send_competing_hijack_allowed(ann, propagate_to):
8 self._process_outgoing_ann(neighbor, ann, propagate_to, *args)
9 return True
10 else:
11 return False
12
13 def _send_competing_hijack_allowed(self, ann, propagate_to):
14 return (ann.recv_relationship in [Relationships.PEERS,
15 Relationships.PROVIDERS,
16 Relationships.ORIGIN]
17 and propagate_to == Relationships.CUSTOMERS
18 and (not ann.roa_valid_length or not ann.roa_routed))
55
CSET 2023, August 07–08, 2023, Marina del Rey, CA, USA Justin Furuness, Cameron Morris, Reynaldo Morillo, Amir Herzberg, and Bing Wang
Figure 11: ROV++ Test Suite Diagram Example. This diagram was dynamically generated as a test case to ensure the correct
functionality of ROV++ V3 described in [43]. The gray indicates disconnected. The shield indicates preventive announcements
from ROV++ V3. The blackhole indicates a blackhole announcement.
56