Programming Languages For Synthetic Biology PDF

This document discusses programming languages for synthetic biology. It introduces Kera, a new programming language for synthetic biology with an associated rule library called Samhita. Kera is described as a full-fledged object-oriented language that can handle a wide variety of biological interactions through its rules. The paper demonstrates Kera through a toy example and outlines its future development. Existing programming languages for synthetic biology like Antimony, GenoCAD, and GEC are also briefly discussed.


Programming languages for synthetic biology

Article  in  Systems and Synthetic Biology · December 2010


DOI: 10.1007/s11693-011-9070-y · Source: PubMed

Syst Synth Biol (2010) 4:265–269
DOI 10.1007/s11693-011-9070-y

RESEARCH PAPER

Programming languages for synthetic biology


P. Umesh • F. Naveen • Chanchala Uma Maheswara Rao • Achuthsankar S. Nair

Received: 6 December 2010 / Revised: 24 December 2010 / Accepted: 3 February 2011 / Published online: 20 February 2011
© Springer Science+Business Media B.V. 2011

Abstract In the backdrop of accelerated efforts to create synthetic organisms, the nature and scope of an ideal programming language for scripting synthetic organisms in silico has been receiving increasing attention. A few programming languages for synthetic biology capable of defining, constructing, networking, editing and delivering genome-scale models of cellular processes have recently been attempted. All of these represent important points in a spectrum of possibilities. This paper introduces Kera, a state-of-the-art programming language for synthetic biology which is arguably ahead of similar languages or tools such as GEC, Antimony and GenoCAD. Kera is a full-fledged object-oriented programming language tempered by a biopart rule library named Samhita, which captures knowledge regarding the interaction of genome components and catalytic molecules. Prominent features of the language are demonstrated through a toy example, and the road map for the future development of Kera is also presented.

Keywords Synthetic biology · Programming language · Design and construction of organisms · Computational biology

P. Umesh · A. S. Nair (✉)
State Inter University Centre of Excellence in Bioinformatics, University of Kerala, Thiruvananthapuram, India
e-mail: sankar.achuth@gmail.com

F. Naveen
Travancore Analytics, Technopark, Trivandrum, Kerala, India

C. U. M. Rao
Innovation Labs, Tata Consultancy Services, Madhapur, Hyderabad, Andhra Pradesh, India

Introduction

The paradigm shift from traditional biology to the new biology can be most succinctly described as an evolution from low-throughput analysis to high-throughput analysis and synthesis. Computational means of putting forward strong hypotheses for physically feasible genome modifications are increasingly in demand in synthetic biology (Endler et al. 2009). This process is partially enabled by GUI-based web servers such as GenoCAD (Pedersen et al. 2009), Tinkercell (Chandran et al. 2009) and Biojade (Goler 2004). However, for greater control over modeling leading to synthesis, programming languages are currently being developed. Antimony (Smith et al. 2009) and GEC (Pedersen et al. 2009), the latter developed by Microsoft, are prominent examples.

In our opinion, the language that will survive the rigors of biological engineering will be the one that can handle the widest variety of contextual interactions. Versatile programming constructs can make programming languages a powerful tool in the hands of synthetic biologists. Considering that the rules of biological composition are unclear, we have attempted to develop a new language that covers the functionalities of existing languages and adds a rule library named Samhita.

In this paper we present a brief overview of the field of computational synthetic biology, including programming languages. We point out some lacunae in the present languages, describe the basic structure of our new language Kera and the associated evolving rule library named Samhita, and demonstrate its flavor through a few toy examples.

Kera in Sanskrit means coconut. The most innocent reason for choosing this name is that our state is known for the abundance of coconut trees. From a scientific standpoint, the coconut represents one of the finest feats of biological engineering
of nature due to its multi-level and modular natural construction.

A brief introduction to computational synthetic biology

Computational support for synthetic biology first emerged in the form of repositories of biological parts. These repositories, which have been evolving for more than a decade, consist of sequences of promoters, ribosome binding sites, coding sequences and terminators. The Biobrick registry of standard biological parts (http://bbf.openwetware.org/) is one of the most widely used repositories in the field. Scientists have attempted to build composite devices using these building blocks (Young and Alper 2010). These building blocks, along with associated information, enable the stitching of parts into composite devices. Some of the earliest successful demonstrations of synthesizing genetic devices were provided by Elowitz and Leibler, who constructed a repressilator with a stand-alone capability in the host that housed the constructs, and by Gardner et al., who constructed a genetic toggle switch (Elowitz and Leibler 2000; Gardner et al. 2000). This involved building an in silico model of the circuit, determining the right parameters, modifying the binding coefficients of molecules experimentally and generating oscillations in vivo with the accuracy of a computational model (Bashor et al. 2010). Following the successful implementation of these genetic devices, a new hope of designing genetic circuits from the ground up emerged in the community. The bacterial toggle switch and oscillator, programmed pattern formation, bacteria designed to detect cancer, artemisinin produced in yeast and RNA-based devices (Purnick and Weiss 2009) are some of the significant achievements of the evolving synthetic biology community. All these efforts involved some amount of computational work in the form of databases, sequence analysis and molecular interaction modeling (Khalil and Collins 2010).

A review of the existing synthetic biology programming languages

With the popularity of the Biobrick repository and the emerging non-Biobrick repositories, there is a concomitant increase in complementary computational tools to support mathematical and computational modeling of parts, devices and circuits. The Systems Biology Markup Language (SBML, Hucka et al. 2003) is a community standard for exchanging information on parts and interactions. Biojade was the first computational tool designed to meet the emerging requirements of the synthetic biology community and was launched at the Synthetic Biology 1.0 meeting. Subsequently, more tools were developed that offer more functionality and user friendliness (Goler 2004). While such tools continue to emerge, new approaches have also appeared in the form of full-fledged or partial programming languages, aimed at giving finer control over representation and modeling. GenoCAD is a web-based tool with some features of a programming language (Czar et al. 2009). GenoCAD has a GUI and a built-in database containing Biobricks. GenoCAD provides some rules for assembling systems in S. cerevisiae and E. coli. It uses mass action rates for simulation (Pedersen et al. 2009).

Antimony (Smith et al. 2009) is closer to a programming language than GenoCAD and is built on the concept of modularity (Cai et al. 2007). Users can define, store and simulate models written in Antimony code. Antimony uses polymerase per second (PoPS) and ribosomes per second (RiPS) rates for simulation, rather than the mass action rates used in GenoCAD.

The latest in the series of programming languages is GEC (Genetic Engineering of living Cells), released by Microsoft. GEC is a rule-based language with an associated part database and reaction database. GEC uses a Prolog-based engine to choose compatible parts from the database. In its current form, GEC comes with some performance issues; for example, retrieving data from the database is time consuming. GEC also does not give conditions for optimal expression, and its data model is not yet robust. Some researchers in the field feel that storing models and distributing genetic material is losing its relevance, and that new computational strategies need to be developed to correlate sequence and function through programs and models (Clancy and Voigt 2010).

In the next section, we describe our effort to develop another programming language for synthetic biology, which addresses many of the lacunae pointed out above and also features a powerful rule library named Samhita integrated into the language.

The Kera programming language

Kera is an object-oriented, knowledge-based programming language for synthetic biology which enables users to create, edit, combine and display in silico simulation runs of experimental synthetic genomes. It provides C-like data types and control structures, and is more object oriented than its counterparts, thus delivering more of the traditional advantages of object-oriented thinking, namely clarity in managing complex thoughts. We feel that this aspect is crucial as the language evolves to cover multiple levels within organisms and absorbs the technological developments in biology through synthesis.

In v1.0 of Kera we have focused on structure and functionality rather than language efficiency. Kera 1.0 is based on the C# platform. We plan to make Kera stand-alone only after it matures, attracts a critical mass of users and sufficiently proves its relevance. For now, the intent is to capture the biological knowledge base in programming language formats.
Given that biologists, not computer scientists or technologists, will be the end users of this language, we have sacrificed brevity for readability. However, we have provided 'cryptic' and 'verbose' options of the language for those who might think otherwise. To aid rapid prototype development and experimentation, Kera works in interpreter mode by default. An efficient compilation facility is also available.

Kera has three basic classes to capture genomic knowledge: the cell, the genome and the proteins. The cell class carries with it a large user-edited library named Samhita. Samhita checks the compatibility of bioparts and suggests optimal bioparts for the required design. Samhita has a rule library and a part library, both of which are dynamic: each user is permitted to add non-persistent parts or rules to Samhita.

The syntax of Kera Language

A Kera program starts with a definition of the cell and the compartment of interest. In the following toy Kera code related to an E. coli cell, a cell object is created and stored in X. The compartment of interest can be nucleus, mitochondria, cytoplasm and so on.

    Cell X = new Cell("E. Coli");
    X.compartment("nucleus");

Kera code works with one gene at a time, selected by the 'selectgene' method of the cell object. Once the gene is selected, various actions can be performed on the gene upstream and downstream. For instance, the promoter can be replaced with another one of our choice, as in the following example.

    X.selectgene("g1");
    X.promoter("replace", "ACTGCTA");

The replacement invokes the Samhita library and returns a compatibility indicator, which can be checked using, for instance, the conditional statements of Kera. In addition to program-level checking, Kera also supports a type-ahead dropdown menu of rule-conforming options. The following Kera code attempts a replacement with "ACTGCTA" and, if it fails, attempts another replacement with "TTGATGCTA".

    boolean compatibility;
    compatibility = X.promoter("replace", "ACTGCTA");
    if (compatibility == 0)
    {
        X.promoter("replace", "TTGATGCTA");
    }

Let us consider a simple synthetic genetic regulatory network, the repressilator (Elowitz and Leibler 2000), which was implemented in Escherichia coli. It behaves like an electrical oscillator with a fixed period. The repressilator has three genes connected in a feedback loop, in which each gene represses the next gene in the loop. The network diagram is given below (Fig. 1).

Fig. 1 The repressilator

To simulate a model of the repressilator in Kera, we begin by creating an E. coli cell variable X and setting the desired compartment to nucleus, as in the following Kera code:

    cell X = new cell("E. Coli");
    X.compartment("nucleus");

Each segment in the network, including the promoter, RBS and terminator, is defined by the following code, which relates to the first part of the repressilator, consisting of X1, X2, X3 and X4. The remaining parts of the network are defined in a similar manner.

    X.seg(1) = new prom();
    X.seg(2) = new rbs();
    X.seg(3) = new pcr();
    X.seg(4) = new ter();

The proteins produced by each of the three coding sequences are programmed as follows:

    Protein A = new protein();
    X.seg(3).codes(A);
    Protein B = new protein();
    X.seg(7).codes(B);
    Protein C = new protein();
    X.seg(11).codes(C);

Now the regulatory information can be coded as follows:

    X.seg(1).regulates(C,-1);
    X.seg(5).regulates(A,-1);
    X.seg(9).regulates(B,-1);
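As an illustration only, the segment, protein and regulation statements in the Kera listing above can be mirrored in a short Python sketch. It is not Kera and does not reproduce Kera's internals: the `Segment` class and the helper functions `build_repressilator`, `repression_edges` and `is_single_cycle` are hypothetical names introduced here, and the sketch assumes that the regulators attached to a promoter act on the next coding segment downstream.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Segment:
    """One genome segment; all names here are hypothetical, not Kera's API."""
    kind: str                                   # "prom", "rbs", "pcr" or "ter"
    codes: Optional[str] = None                 # protein made by a coding segment
    regulators: Dict[str, int] = field(default_factory=dict)  # protein -> +1 / -1


def build_repressilator() -> Dict[int, Segment]:
    """Mirror the Kera listing: 12 segments, three proteins, three repressions."""
    kinds = ["prom", "rbs", "pcr", "ter"] * 3
    segs = {i: Segment(kind) for i, kind in enumerate(kinds, start=1)}
    for index, protein in ((3, "A"), (7, "B"), (11, "C")):
        segs[index].codes = protein              # X.seg(3).codes(A), ...
    for index, protein in ((1, "C"), (5, "A"), (9, "B")):
        segs[index].regulators[protein] = -1     # X.seg(1).regulates(C,-1), ...
    return segs


def repression_edges(segs: Dict[int, Segment]) -> List[Tuple[str, str]]:
    """List (repressor, repressed protein) pairs, reading segments in order."""
    edges: List[Tuple[str, str]] = []
    active: Dict[str, int] = {}
    for index in sorted(segs):
        seg = segs[index]
        if seg.kind == "prom":
            active = seg.regulators              # assumed to control what follows
        elif seg.kind == "pcr" and seg.codes is not None:
            edges.extend((p, seg.codes) for p, sign in active.items() if sign < 0)
    return edges


def is_single_cycle(edges: List[Tuple[str, str]]) -> bool:
    """True if the edges form one closed loop that visits every protein once."""
    nxt = dict(edges)
    if not nxt or len(nxt) != len(edges):
        return False
    start = node = next(iter(nxt))
    seen = set()
    while node in nxt and node not in seen:
        seen.add(node)
        node = nxt[node]
    return len(seen) == len(nxt) and node == start


segments = build_repressilator()
edges = repression_edges(segments)
print(edges)                    # [('C', 'A'), ('A', 'B'), ('B', 'C')]
print(is_single_cycle(edges))   # True
```

The check at the end confirms that the three repressions close into a single loop, which is the defining property of the repressilator. Note that it would fail if the third coding sequence were assigned protein B instead of C.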
This completes the formal representation of a repressilator. To display the synthetic sequence, the following code can be added:

    X.display();

Kera converts the defined model into the corresponding sequence, and the user can save it to a file using the code:

    X.writedna("filename.txt");

Samhita–Biorule library

The Samhita bio-rule library is the most valuable feature of the Kera language. It attempts to distill coarse biological rules that apply to the bioparts that Kera manipulates. Consequently, the rule library is based on the part library. The part library is a relational database consisting of the part id, part type and sequence. The part ids are adopted from Biobrick ids. However, the user can define additions to the part library, which will be non-persistent. The types included in Kera are promoter, ribosome binding site, coding sequence and terminator. A sample extract of the part library is shown in Table 1.

The rule library is a relational database with an onto relation to the part library. This means that every rule in the rule library refers to one or more parts in the part library, but not necessarily vice versa. The rule library database consists of the rule id, the part id and the regulatory effect. One part may have more than one known regulatory effect. For the purpose of database normalization, these are encoded as separate rules. An example of the Samhita rule database is shown in Table 2.

The rule library also permits non-persistent and non-contradictory addition of rules by users, as mentioned in the description of the programming language. The methods of the various classes of the Kera language call the rule library and return the compatibility status as Boolean variables. The library can be used to choose parts based on requirements. As the programming language is capable of invoking the rules at runtime, part addition can be checked and designed iteratively at runtime. The power of the rule library depends on the experimental results available in the public domain, and the library would need to be upgraded much more frequently than the language itself.

Kera 1.0 is confined to qualitative rather than quantitative data management, for obvious reasons: quantitative characterization of even widely employed parts is lacking (Machisio and Stelling 2009). However, future versions of Kera plan to include a framework in which reaction rates and predictions based on dynamic models of biological circuits are incorporated.

Conclusion

Kera 1.0 provides subtle design power to synthetic biologists through its powerful programming constructs and biorule library. The language is currently in its infancy

Table 1 An extract from part database

Part id        Part type               Sequence
BBa_I760005    Promoter                atgacaaaattgtcat
BBa_J63003     Ribosome binding site   cccgccgccaccatggag
BBa_B0015      Terminator              ccaggcatcaaataaaacgaaaggctcagtcgaaagactgggcctttcgttttatctgttgtttgtcggtgaacgctctctactagagtcacactggctcaccttcgggtgggcctttctgcgtttata
BBa_G00100     Primer                  tgccacctgacgtctaagaa
BBa_J18922     Linker                  ggtagcggcagcggtagcggtagcggcagc
ORF83P1        Promoter                cgagccgctttccatatctattaacgcataaaaaactctgctggcattcacaaatgcgcaggggtaaaacgtttcctgtagcaccgtgagttatactttgt
aceB           Promoter                caacaagttatcaagtatttttaattaaaatggaaattgtttttgattttgcattttaaatgagtagtcttagttgtgctgaacgaaaagagcacaacgat

Table 2 The Samhita database

Part           Regulation                        Kera code
BBa_I1051      LuxR (positive regulation)        BBa_I1051.regulates(LuxR,1)
BBa_I1051      cI dimer (negative regulation)    BBa_I1051.regulates(cI dimer,-1)
BBa_I751501    AHL (positive regulation)         BBa_I751501.regulates(AHL,1)
BBa_I751501    Lambda cI (negative regulation)   BBa_I751501.regulates(Lambda cI,-1)
BBa_I739104    P22 cII (negative regulation)     BBa_I739104.regulates(P22 cII,-1)
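The rule table above lends itself to a small in-memory sketch. The Python below is only an illustration under stated assumptions, not Samhita's actual schema or API: the `Rule` and `RuleLibrary` classes, the `add_session_rule` method and the rule id numbering are hypothetical, and "non-contradictory" is interpreted here as "no existing rule assigns the opposite effect to the same part and regulator". It shows one rule per regulatory effect, a check that every rule refers to a known part, non-persistent user additions, and a Boolean compatibility lookup.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Rule:
    rule_id: int      # hypothetical numbering; Table 2 does not list rule ids
    part_id: str
    regulator: str
    effect: int       # +1 positive regulation, -1 negative regulation


class RuleLibrary:
    """Toy stand-in for a rule library; not Samhita's actual schema or API."""

    def __init__(self, part_ids: Iterable[str], rules: Iterable[Rule]) -> None:
        self._part_ids = set(part_ids)
        self._persistent: List[Rule] = []
        self._session: List[Rule] = []          # non-persistent user additions
        for rule in rules:
            self._check(rule)
            self._persistent.append(rule)

    def rules(self) -> List[Rule]:
        return self._persistent + self._session

    def _check(self, rule: Rule) -> None:
        # Every rule must refer to a part known to the part library.
        if rule.part_id not in self._part_ids:
            raise KeyError(f"unknown part: {rule.part_id}")
        # Reject a rule that contradicts an existing one for the same pair.
        for old in self.rules():
            same_pair = (old.part_id, old.regulator) == (rule.part_id, rule.regulator)
            if same_pair and old.effect != rule.effect:
                raise ValueError(f"contradicts rule {old.rule_id}")

    def add_session_rule(self, rule: Rule) -> None:
        """Add a rule that lasts only for the lifetime of this object."""
        self._check(rule)
        self._session.append(rule)

    def regulates(self, part_id: str, regulator: str, effect: int) -> bool:
        """Boolean compatibility check, loosely modelled on regulates(...)."""
        return any(
            (r.part_id, r.regulator, r.effect) == (part_id, regulator, effect)
            for r in self.rules()
        )


# The five rows of Table 2, one rule per regulatory effect.
TABLE_2 = [
    Rule(1, "BBa_I1051", "LuxR", 1),
    Rule(2, "BBa_I1051", "cI dimer", -1),
    Rule(3, "BBa_I751501", "AHL", 1),
    Rule(4, "BBa_I751501", "Lambda cI", -1),
    Rule(5, "BBa_I739104", "P22 cII", -1),
]
library = RuleLibrary({r.part_id for r in TABLE_2}, TABLE_2)

print(library.regulates("BBa_I1051", "LuxR", 1))    # True
print(library.regulates("BBa_I1051", "LuxR", -1))   # False
```

Keeping session rules in a separate list is one simple way to model non-persistent additions: they take part in every lookup but are never written back to the stored rule set.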


Table 3 A comparative study of existing programming languages

Feature                                  GenoCAD  Antimony  Little b  GEC  Kera
GUI                                      ✓        ✗         ✗         ✗    ✓
BioBrick support                         ✓        ✓         ✗         ✓    ✓
Part repository                          ✓        ✗         ✗         ✓    ✓
Rule based                               ✓        ✗         ✗         ✓    ✓
Modularity                               ✗        ✓         ✓         ✓    ✓
User can create models                   ✗        ✓         ✓         ✓    ✓
Text based                               ✗        ✓         ✓         ✓    ✓
Import other file formats (like SBML)    ✗        ✓         ✓         ✗    ✓
Open source                              ✓        ✓         ✓         ✗    ✓
Object oriented                          ✗        ✗         ✗         ✗    ✓
Full scale programming construction      ✗        ✓         ✓         ✓    ✓

but designed for a long-term evolution into a leading tool for synthetic biologists. The authors have attempted to compare various features of the closest competitors to Kera. (One must admit that this comparison is unfair, as some of these tools bear no resemblance to a formal programming language.)

Table 3 summarizes the comparison of Kera with GenoCAD, Antimony, Little b and GEC. It must be pointed out that the associated rule bases and part bases vary widely among the compared tools. What ultimately matters is whether the language serves the perceived and creative needs of synthetic biologists in a futuristic way. From this standpoint the authors believe that Kera is well positioned. The real power of Kera or similar tools will emerge once quantitative modeling is seamlessly integrated into these tools.

References

Bashor CJ, Horwitz AA, Peisajovich SG, Lim WA (2010) Rewiring cells: synthetic biology as a tool to interrogate the organizational principles of living systems. Annu Rev Biophys. doi:10.1146/annurev.biophys.050708.133652
Cai Y, Hartnett B, Gustafsson C, Peccoud J (2007) A syntactic model to design and verify synthetic genetic constructs derived from standard biological parts. Bioinformatics 23(20):2760
Chandran D, Bergmann FT, Sauro HM (2009) TinkerCell: modular CAD tool for synthetic biology. J Biol Eng 3:19. doi:10.1186/1754-1611-3-19
Clancy K, Voigt CA (2010) Programming cells: towards an automated 'genetic compiler'. Curr Opin Biotechnol 21:572–581. doi:10.1016/j.copbio.2010.07.005
Czar MJ, Cai Y, Peccoud J (2009) Writing DNA with GenoCAD. Nucleic Acids Res 37:W40–W47. doi:10.1093/nar/gkp361
Elowitz MB, Leibler S (2000) A synthetic oscillatory network of transcriptional regulators. Nature 403(6767):335–338
Endler L, Rodriguez N, Juty N, Chelliah V, Laibe C, Li C et al (2009) Designing and encoding models for synthetic biology. J R Soc Interface. doi:10.1098/rsif.2009.0035.focus
Gardner TS, Cantor CR, Collins JJ (2000) Construction of a genetic toggle switch in Escherichia coli. Nature 403(6767):339–342
Goler JA (2004) BioJADE: a design and simulation tool for synthetic biological systems. MIT Computer Science and Artificial Intelligence Laboratory AT Technical Report 2004-003
Hucka M, Finney A, Sauro HM, Bolouri H, Doyle JC, Kitano H, Arkin AP, Bornstein BJ, Bray D, Cornish-Bowden A et al (2003) The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics 19:524–531
Khalil AS, Collins JJ (2010) Synthetic biology: applications come of age. Nat Rev Genet 11. doi:10.1038/nrg2775
Machisio MA, Stelling J (2009) Computational design tools for synthetic biology. Curr Opin Biotechnol 20(4):479–485
Pedersen M, Phillips A (2009) Towards programming languages for genetic engineering of living cells. J R Soc Interface. doi:10.1098/rsif.2008.0516.focus
Purnick PE, Weiss R (2009) The second wave of synthetic biology: from modules to systems. Nat Rev Mol Cell Biol 10(6):410–422. doi:10.1038/nrm2698
Smith LP, Bergmann FT, Chandran D, Sauro HM (2009) Antimony: a modular model definition language. Bioinformatics 25(18):2452–2454. doi:10.1093/bioinformatics/btp401
Young E, Alper H (2010) Synthetic biology: tools to design, build, and optimize cellular processes. J Biomed Biotechnol. doi:10.1155/2010/130781
