Synthetic biology takes a ground-up approach to genetically engineering cellular systems capable of the sophisticated sensing, information processing, and actuation exhibited by natural systems. While it is important to build increasingly complex systems when necessary, the goal is to do so using tools and methodologies that streamline biology and make it easier to engineer. At the center of this approach lies the need to impart novel biological functions by systematically introducing newly designed DNA sequences into living cells. The two main challenges in this endeavor are, first, knowing how to design sequences that impart a particular function and, second, knowing how to construct the DNA encoding that function in a form that can be readily introduced into the cell. Standard assembly parts, such as BioBricks™, provide a paradigm that addresses both of these problems by recognizing that functional units of DNA sequence are frequently reused in a variety of projects. These units, which include promoters, ribosome binding sites, and protein coding sequences, among others, represent non-reducible elements of genetic composition and, as such, are considered "basic parts." The sequence of each part is stored in a database, such as the Registry of Standard Biological Parts, while the physical DNAs are housed in part collections. Each basic part is a genetic element that has been refined to comply with any of several publicly available standards and is not associated with one particular assembly strategy. Nevertheless, because each part is standardized to conform to a defined set of rules, a single standard assembly reaction can be used to concatenate basic parts. Furthermore, because only one standard assembly reaction is required to iteratively combine any two parts, we can assemble multi-part devices and characterize the rules of functional composition for each part in the context of other parts [3, 4].
Thus, standardizing the basic part junction sequence significantly narrows the task of defining contextual rules for part function. We envision that a robust standardized assembly process will enable the development of low-cost, high-throughput, automated assembly facilities and, ultimately, the outsourcing of entire DNA fabrication processes at a reasonable price.
The BioBricks™ standard described by Knight and coworkers was the first implementation of a strategy for defining composition rules that allow the assembly of standard biological parts using a single assembly chemistry. The assembly method employs iterative restriction enzyme digestion and ligation reactions to assemble small basic parts into larger composite parts. Basic parts are flanked by XbaI and SpeI restriction sites on their 5' and 3' ends, respectively. Digestion with these enzymes generates compatible cohesive ends that can be ligated back together head-to-tail. The ligation of two parts generates a scar sequence between the parts that contains neither of the original sites, and thus, it is unaffected by subsequent digestion with either XbaI or SpeI. The resulting product is a new composite part with the same assembly characteristics as the two parent parts. It is still flanked by unique XbaI and SpeI restriction sites on its 5' and 3' ends, respectively, and hence, the iterative assembly of larger and larger composite parts becomes possible. Over 2,000 basic parts that conform to this standard have been described, and they have been used in the construction of a wide range of genetic circuits and biosynthetic devices [5–8].
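The digestion and ligation logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' protocol: it models only the top strand, and the single spacer bases placed next to each site (G after XbaI, T before SpeI), the helper names `make_part` and `join_parts`, and the two short insert sequences are assumptions chosen so that the junction reproduces the 8-nucleotide TACTAGAG scar of the original standard.

```python
# Top-strand sketch of XbaI/SpeI assembly (illustrative only).
XBAI = "TCTAGA"  # XbaI cuts T^CTAGA, leaving a CTAG overhang
SPEI = "ACTAGT"  # SpeI cuts A^CTAGT, leaving a compatible CTAG overhang


def make_part(insert: str) -> str:
    """Flank an insert with a 5' XbaI site and a 3' SpeI site.

    The spacer bases (G and T) are an assumption of this sketch.
    """
    return XBAI + "G" + insert + "T" + SPEI


def join_parts(upstream: str, downstream: str) -> str:
    """Join two parts head-to-tail via the SpeI and XbaI cohesive ends."""
    # Cut the upstream part at its 3' SpeI site and keep the left fragment.
    left = upstream[: upstream.rindex(SPEI) + 1]
    # Cut the downstream part at its 5' XbaI site and keep the right fragment.
    right = downstream[downstream.index(XBAI) + 1:]
    return left + right


# Hypothetical insert sequences, free of XbaI and SpeI sites.
part_a = make_part("ATGAAA")
part_b = make_part("CCCGGG")
composite = join_parts(part_a, part_b)

print(composite)
print("XbaI sites:", composite.count(XBAI), "SpeI sites:", composite.count(SPEI))
```

In this sketch the junction reads ACTAGA, which is neither an XbaI nor a SpeI site, so the composite keeps exactly one of each site at its ends and can be passed to `join_parts` again for the next round of assembly.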
Since the inception of the first assembly standard, several others have been proposed and/or developed to describe functional composition and/or physical assembly. In fact, this field is currently very active, and the number of assembly standards is expanding rapidly. At present, the BioBricks Foundation (BBF) has implemented an organizational framework, known as the BBF RFC (request for comments) process, to help define, evaluate, and propose new standards in the field. As an example, we refer the reader to BBF RFC 29, which describes the major assembly standards proposed to date and suggests an organized naming process for future standards. One limitation of the original BioBricks™ standard, addressed herein, is that it cannot be used to compose protein-fusion parts encoding elements such as peptide tags and single domains of polypeptides for protein engineering applications.
Modular protein engineering is an emerging area of synthetic biology. Several studies have shown the power of building sophisticated protein machines by assembling multiple modular domains into a variety of larger polypeptide sequences [6, 11–14]. Unfortunately, the original BioBricks™ assembly scheme (BBF RFC 10) is not suitable for building chimeric proteins because the 8-nucleotide scar sequence that remains between parts after they have been joined together is incompatible with protein fusions for two reasons: first, the scar sequence (TACTAGAG) encodes tyrosine followed by a stop codon; and second, an 8-nucleotide scar introduces a frame shift between the two coding sequences. Theoretically, this problem could be addressed by elongating the scar sequence to contain a number of nucleotides that is a multiple of 3, such that the stop codon is no longer in frame and the frame shift is eliminated. In the meantime, a number of improvements to the original BioBricks™ standard, along with completely alternative assembly strategies and standards, have been proposed and/or developed. The Biofusion standard (BBF RFC 23), for example, modifies the original BioBricks™ standard to create a smaller, 6-nucleotide scar sequence (ACTAGA) that encodes threonine-arginine and thus eliminates both reading frame shifts and encoded stop codons. Unfortunately, the AGA codon encoding arginine is a rare codon in E. coli, and furthermore, the XbaI site can be blocked by dam methylation when flanked by certain sequences. The Fusion Parts standard (BBF RFC 25), developed by the Freiburg 2007 iGEM team, is another extension of the BioBricks™ standard that seeks to alleviate some of the disadvantages of the Biofusion format. Here, AgeI and NgoMIV restriction sites are used to generate a 6-nucleotide scar sequence (ACCGGC) encoding threonine-glycine. Finally, the BioBricks++ standard is a scarless assembly standard that uses two steps for assembly.
It uses type IIS restriction enzymes to recognize sites flanking the part and digest at the boundary of the part. The cohesive ends are then blunted prior to ligation, yielding a scarless junction, which is the ultimate goal of standard assembly strategies. Unfortunately, robust reactions necessary to implement such a method have yet to be identified. All of these standards, along with a complete list of their advantages and disadvantages, are described further at the BBF's Standards and Formats page.
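The reading-frame arguments made for the scars above can be checked with a short Python sketch. It assumes that each scar is read in frame with the upstream coding sequence starting at its first nucleotide, which is consistent with the codon assignments quoted in the text; the helper name `scar_report` and the dictionary labels are illustrative choices, not part of any standard.

```python
# Standard genetic code, built from a compact 64-character string
# ordered by first, second, and third base over T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def scar_report(scar: str) -> tuple[int, str]:
    """Return (frame shift in nucleotides, peptide encoded by whole codons)."""
    usable = len(scar) - len(scar) % 3  # ignore a trailing partial codon
    peptide = "".join(CODON_TABLE[scar[i:i + 3]] for i in range(0, usable, 3))
    return len(scar) % 3, peptide


# Scar sequences as quoted in the text; '*' marks a stop codon.
SCARS = {
    "BioBricks (BBF RFC 10)": "TACTAGAG",
    "Biofusion (BBF RFC 23)": "ACTAGA",
    "Fusion Parts (BBF RFC 25)": "ACCGGC",
}

for name, scar in SCARS.items():
    shift, peptide = scar_report(scar)
    print(f"{name}: {scar} shift={shift} peptide={peptide}")
```

Under these assumptions the 8-nucleotide scar reports a shift of 2 and the peptide `Y*` (tyrosine, then stop), whereas the two 6-nucleotide scars report no shift and the dipeptides `TR` and `TG`.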
Here we describe a new robust, yet flexible, standard for composing biological parts called BglBricks. The new standard addresses several of the key problems associated with the original BioBricks™ standard, and furthermore, it provides a foundation for developing automated assembly platforms. The BglBrick standard relies on BglII and BamHI restriction sites flanking the 5' and 3' ends of basic parts, respectively. These enzymes possess several advantages over the ones used in previous standards: first, they have an extensive history of use, which ensures their reliability; second, they cut with high efficiency; third, they are unaffected by overlapping dam or dcm methylation; and finally, ligation of their compatible ends leaves a 6-nucleotide scar sequence (GGATCT) encoding glycine-serine, a sequence demonstrated to be innocuous in most protein fusion applications in a variety of host systems, including E. coli, yeast, and humans [19–21]. In the following sections, we showcase uses of the BglBrick standard in three diverse applications. These include the construction of constitutively active gene expression devices that elicit a wide range of expression profiles; the construction of chimeric, multi-domain fusion protein expression devices; and finally, the targeted integration of parts and devices into specific loci of the E. coli genome.
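As a rough check on the glycine-serine claim, the following Python sketch joins two coding fragments through the BamHI and BglII cohesive ends and translates the result. It is a simplified top-strand model under stated assumptions: parts carry only the bare BglII and BamHI sites with no additional flanking sequence, and the helper names (`make_bglbrick`, `fuse`, `translate`) and the two short domain sequences are hypothetical.

```python
# Top-strand sketch of a BglBrick protein fusion (illustrative only).
BGLII = "AGATCT"  # BglII cuts A^GATCT, leaving a GATC overhang
BAMHI = "GGATCC"  # BamHI cuts G^GATCC, leaving a compatible GATC overhang

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq: str) -> str:
    """Translate whole codons from the first nucleotide ('*' is a stop)."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def make_bglbrick(insert: str) -> str:
    """Flank an insert with a 5' BglII site and a 3' BamHI site."""
    return BGLII + insert + BAMHI


def fuse(upstream: str, downstream: str) -> str:
    """Ligate the upstream BamHI end to the downstream BglII end."""
    left = upstream[: upstream.rindex(BAMHI) + 1]    # keeps ...G
    right = downstream[downstream.index(BGLII) + 1:]  # keeps GATCT...
    return left + right


# Hypothetical domains without stop codons or BglII/BamHI sites.
domain_a = "ATGAAACTG"  # Met-Lys-Leu
domain_b = "GAACACTGG"  # Glu-His-Trp

composite = fuse(make_bglbrick(domain_a), make_bglbrick(domain_b))
fusion_orf = composite[len(BGLII):-len(BAMHI)]  # strip the flanking sites

print(fusion_orf)
print(translate(fusion_orf))
```

In this model the junction reads GGATCT, the fusion stays in frame, and the two domains are separated only by a Gly-Ser dipeptide; the composite again carries a single BglII site and a single BamHI site at its ends.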