Fungal cellulases


Review pubs.acs.org/CR

Christina M. Payne,†,# Brandon C. Knott,‡,# Heather B. Mayes,§,# Henrik Hansson,⊥ Michael E. Himmel,∥ Mats Sandgren,⊥ Jerry Ståhlberg,⊥ and Gregg T. Beckham*,‡

† Department of Chemical and Materials Engineering and Center for Computational Sciences, University of Kentucky, 177 F. Paul Anderson Tower, Lexington, Kentucky 40506, United States
‡ National Bioenergy Center, National Renewable Energy Laboratory, 15013 Denver West Parkway, Golden, Colorado 80401, United States
§ Department of Chemical and Biological Engineering, Northwestern University, 2145 Sheridan Road, Evanston, Illinois 60208, United States
∥ Biosciences Center, National Renewable Energy Laboratory, 15013 Denver West Parkway, Golden, Colorado 80401, United States
⊥ Department of Chemistry and Biotechnology, Swedish University of Agricultural Sciences, Uppsala BioCenter, Almas allé 5, SE-75651 Uppsala, Sweden

Received: July 2, 2014 Published: January 28, 2015

DOI: 10.1021/cr500351c Chem. Rev. 2015, 115, 1308−1448

© 2015 American Chemical Society

CONTENTS
1. Introduction
2. Cellulose
2.1. Cellulose Structures
2.2. Cellulose Microfibrils
2.3. Cellulose Substrates
2.3.1. Microcrystalline Cellulose (MCC) from Plants
2.3.2. MCC from Microbes
2.3.3. Phosphoric Acid Swollen Cellulose (PASC)
2.3.4. Cellulose Crystallinity
3. Glycoside Hydrolase Catalytic Mechanisms
3.1. Retaining and Inverting Mechanisms
3.2. Carbohydrate Ring Puckering
4. Early Developments in Fungal Cellulases
4.1. History of the Discovery and Improvement of T. reesei Strains: Premolecular Era
4.1.1. Early Work at U.S. Army Natick Laboratories
4.1.2. Early International Effort for Strain Improvement
4.2. New Understanding from the Molecular Era: Cloning in T. reesei and S. cerevisiae
4.2.1. Cloning and Protein Production in T. reesei
4.2.2. RUT C-30 Revealed
4.2.3. Cloning T. reesei Genes in S. cerevisiae: The Road to Consolidated Bioprocessing
4.3. T. reesei Cellulases: Understanding the Mechanisms of Action
4.3.1. T. reesei as an Early Model for Cellulase Action
4.3.2. GHs and Related Enzymes
5. Carbohydrate-Binding Modules and Linkers
5.1. Family 1 Carbohydrate-Binding Modules
5.2. Linkers
6. Family 7 Glycoside Hydrolases
6.1. Structural Studies and Catalytic Function
6.1.1. TrCel7A: Wild-Type
6.1.2. TrCel7A Catalytic Mutants
6.1.3. F. oxysporum Cel7B with Active Site-Spanning Nonhydrolyzable Inhibitor
6.1.4. TrCel7B
6.1.5. H. insolens Cel7B S37W/P39W Mutant
6.1.6. TrCel7A: Cello-Oligomer Complexes
6.1.7. P. chrysosporium Cel7D
6.1.8. TrCel7A: Exo Loop Engineering
6.1.9. TeCel7A
6.1.10. PcCel7D Bound with Disaccharides
6.1.11. M. albomyces Cel7B
6.1.12. Heterobasidion irregulare Cel7A
6.1.13. T. harzianum Cel7A
6.1.14. L. quadripunctata Cel7B
6.1.15. TrCel7A Michaelis Complex and Glycosyl-Enzyme Intermediate
6.1.16. GH7 Catalytic Insights from Molecular Simulation
6.2. Processivity, Kinetic Modeling, and Visualization
6.3. Product Inhibition
6.4. Pyroglutamate
6.5. Glycosylation
6.6. Protein Engineering
6.7. Conclusions
7. Family 6 Glycoside Hydrolases
7.1. Structural Studies
7.1.1. TrCel6A: Wild-Type
7.1.2. T. fusca Cel6A
7.1.3. TrCel6A: Y169F Variant
7.1.4. H. insolens Cel6A: Apo
7.1.5. H. insolens Cel6A: Cello-Oligomer Complexes
7.1.6. TrCel6A: Non-Hydrolyzable Ligands
7.1.7. H. insolens Cel6B
7.1.8. H. insolens Cel6A: D416A/Thio-Oligosaccharide Complex
7.1.9. TrCel6A: D175A and D221A Variants
7.1.10. H. insolens Cel6A: D405N Variant
7.1.11. H. insolens Cel6A: D416A/Isofagomine Complex
7.1.12. C. cinerea Cel6C
7.1.13. C. cinerea Cel6A
7.1.14. C. thermophilum Cel6A
7.1.15. TrCel6A Variants HJPlus and 3C6P
7.2. Catalytic Function
7.2.1. Catalytic Acid
7.2.2. pKa Modifying Residues
7.2.3. Catalytic Base
7.2.4. Catalytic Priming of Ring Distortion
7.2.5. Substrate Binding
7.2.6. Product Inhibition
7.2.7. Processive Catalytic Cycle
7.2.8. Synergistic and Processive Function
7.3. Glycosylation
7.4. Protein Engineering
7.5. Conclusions
8. Family 5 Glycoside Hydrolases
8.1. Structural Studies
8.1.1. Catalytic Function
8.1.2. T. aurantiacus Cel5A
8.1.3. P. rhizinflata EglA/CelA
8.1.4. T. reesei Cel5A (Formerly EG II/EG III)
8.2. Characterization of Activity and Specificity
8.2.1. T. viride EG III
8.2.2. TrCel5A
8.2.3. H. insolens Cel5A
8.2.4. Other Fungal GH5s
8.3. Protein Engineering
8.4. Conclusions
9. Family 12 Glycoside Hydrolases
9.1. Structural Studies
9.1.1. Overall Structure
9.1.2. GH12 Ligand Complex Structures
9.2. Plant Cell Wall Loosening/Extension Activity by GH12 Enzymes
9.3. GH Clan-C: Structure and Sequence Comparison
9.4. Enzyme Discovery and Engineering
9.5. Conclusions
10. Family 45 Glycoside Hydrolases
10.1. Structural Studies
10.1.1. Subfamily A
10.1.2. Subfamily B
10.1.3. Subfamily C
10.2. Catalytic Function
10.3. Similarities of GH45s to Expansins and Swollenins
10.4. Conclusions
11. Lytic Polysaccharide Monooxygenases
11.1. Initial Discoveries of Oxidative Function
11.2. Mechanistic and Structural Studies
11.3. Conclusions
12. Modeling Enzymatic Hydrolysis
12.1. Ordinary Differential Equation-Based Models
12.2. Agent-Based Models
13. Concluding Remarks
Author Information
Corresponding Author
Author Contributions
Notes
Biographies
Acknowledgments
Abbreviations
References

1. INTRODUCTION

Lignocellulosic biomass has enormous potential to contribute to worldwide energy, chemical, and material demands in a renewable, sustainable manner. In the United States alone, it has been estimated that 30% of the current petroleum usage could be offset via biomass conversion to transportation fuels.1 Indeed, in both the United States and European Union, biomass is currently the most abundant source of energy from renewable sources. Given escalating energy demands worldwide, especially in liquid transportation fuels,2,3 coupled with concerns of global climate change through continued use of fossil fuel resources, it is likely that biomass utilization will be a primary contributor in the near- to mid-term to the global sustainable energy portfolio.4−7

All plant cells exhibit thick cell walls that primarily consist of polysaccharides and the aromatic polymer lignin. These polymers have evolved to form complex composite materials that render plant cells highly resistant to attack from pathogens.8 Plant polysaccharides primarily consist of cellulose, the β-1,4-linked homopolymer of glucose; hemicellulose, a heterogeneous, branched polysaccharide primarily made up of β-1,4-linked polymers including xylan, glucuronoxylan, xyloglucan, glucomannan, and arabinoxylan backbones with heterogeneous side chains;9 and pectin, a typically minor component in cell walls consisting of a complex set of polysaccharide polymers enriched in α-linked galacturonic acid or galacturonic acid and rhamnose monomers.10,11 Lignin is a heterogeneous, branched, alkyl-aromatic polymer comprising three phenyl-propanoid monomers linked by myriad C−O and C−C bonds that are likely formed through radical coupling reactions during cell wall synthesis.12 The enzymatic machinery for the synthesis of plant cell walls is a matter of intense research with many outstanding questions.8−15 Cellulose, hemicellulose, and lignin represent approximately 20−50%, 15−35%, and 10−30% of plant cell walls, respectively, on a dry weight basis.16 Due to its prevalence as the most abundant polymer in terrestrial plants, cellulose is the most abundant biological material on Earth. In addition, cellulose is the most recalcitrant carbohydrate polymer to catalytic degradation when compared to other plant cell wall polysaccharides.6 Cellulose serves a key structural function in plants and is synthesized by complex enzymatic machinery during cell wall synthesis.14,15 The sugars covalently locked in cellulose and hemicellulose represent a vast, renewable feedstock for the production of fuels and chemicals. Lignin, in addition, represents a potential feedstock for valorization to fuels and chemicals, although selective conversion of lignin to value-added products remains a significant challenge.17−19

For production of fuels and chemicals from lignocellulosic biomass, overcoming the heterogeneity and recalcitrance of plant cell walls in a cost-effective manner at scales sufficient to offset fossil-fuel-derived resources is a major technical challenge. Second-generation biofuels production facilities are currently under development, and several have been constructed to date around the world, with a primary aim to convert lignocellulosic biomass to ethanol. These facilities generally utilize a “biochemical conversion” process wherein biomass is first size-reduced through milling or chipping, followed by a mild thermochemical pretreatment step to render plant cell wall materials more amenable to attack by biocatalysts. The enzymatic hydrolysis step then depolymerizes cellulose and residual hemicellulose to sugars. The last step upgrades and converts sugars to fuel or chemicals (Figure 1).20 In the large suite of process options, the biomass depolymerization steps, namely pretreatment and enzymatic hydrolysis, have long been identified as the most costly portion of the conversion process.21−23 Many different pretreatment options have been examined including dilute acid, hot water, steam explosion, ammonia fiber expansion, alkaline, lime, maleic acid, and others, as extensively discussed in recent reviews and comprehensive technology comparisons.24−30 The enzymatic hydrolysis step represents the second portion of biomass depolymerization and is a major cost driver in bioethanol production due to the high cost of enzyme production.21−23,31 However, cellulolytic enzymes are highly selective for glucose production and produce fewer downstream catalyst inhibitors relative to high-temperature deconstruction. Thus, significant efforts have been expended to understand and improve natural paradigms for enzymatic depolymerization of biomass with the aim to decrease the cost of sugar production for fuels and chemicals production.6,32−34

Figure 1. Overall view of a conventional biochemical conversion process to produce fuels and chemicals from lignocellulosic biomass. Cellulase enzymes can be used to convert the cellulose portion of nonfood biomass, such as agricultural waste and energy crops, into fermentable sugars for subsequent conversion to renewable fuels and chemicals.

Recently, cellulase enzyme research has been accelerated due to renewed interest in the production of ethanol from lignocellulosic biomass. Ethanol is a useful blend stock for light-duty vehicles in the transportation sector, but there are significant issues with its large-scale use including problems associated with hygroscopy, blending limits with gasoline, and the need for a new distribution system beyond petroleum-derived fuels. Thus, third-generation biofuels are now undergoing focused research and development with the goal of cost-effective production of infrastructure-compatible fuels from lignocellulosic biomass including fuels to fulfill demands in the gasoline, diesel, jet fuel, and maritime sectors. From the perspective of biochemical conversion processes that utilize carbohydrate intermediates such as clean sugars and carbohydrate derivatives as feedstocks, this body of work can be very broadly categorized into biological and catalytic upgrading of sugars to hydrocarbon fuels.5,35−40 Multiple strategies have emerged for each class of fuel and chemical production, which have been widely reviewed in the past several years.5,35−40 Many of these strategies still rely on the production of sugars or sugar derivatives, and thus, there remains significant incentive to reduce the cost of selective biomass depolymerization to carbohydrates through some combination of thermochemical pretreatment and enzymatic hydrolysis.

Many organisms across the kingdoms of life have evolved the necessary enzymatic machinery for converting cellulose to soluble species as a food and energy source. Given the complexity of plant cell walls, most biomass-degrading organisms employ a battery of enzymes with synergistic function to break down polysaccharides41,42 and, in some cases, lignin.43−47 To date, two primary paradigms have been discovered in cellulose depolymerization: the “free” enzyme


paradigm and the cellulosomal paradigm. The free enzyme paradigm represents the case wherein enzymes diffuse as single catalytic units, often accompanied by binding modules covalently attached together via linker domains, exemplified by the enzyme suite from the filamentous fungus Trichoderma reesei (or Hypocrea jecorina).41 The primary mode of action of free cellulases is one wherein endoglucanases (EGs), or nonprocessive cellulases, act by cleaving cellulose chains in amorphous regions, and cellobiohydrolases (CBHs), or processive cellulases, attach to cellulose chain ends and depolymerize and hydrolyze cellulose chains, typically into disaccharide units, down the length of a chain. In solution, cellobiose is then cleaved by β-glucosidases into glucose for cellular uptake by organisms. The recent discovery of selective, oxidative enzymes adds another element of endo-acting action to this paradigm wherein chains are likely cleaved in crystalline regions.48−52 The other well-characterized paradigm for enzymatic biomass degradation is the use of cellulosomes wherein noncovalent cohesin−dockerin interactions enable large complexes up to hundreds of enzymes to operate in close proximity on large protein scaffolds. Cellulosomes were first discovered in the anaerobic rumen bacterium, Clostridium thermocellum.53−56 More recently, additional paradigms have perhaps begun to emerge57,58 including one with multimodular enzymes that contain multiple catalytic domains (CDs) per protein, which appears to be a natural “interpolation” between free cellulases and cellulosomes in function,59 and another with polysaccharide utilization loci, but additional characterization of these potential new paradigms will be required.

Of all biomass-degrading organisms, fungi play a pivotal role, as they are responsible for the vast majority of the biomass degradation in nature. Fungi have colonized a vast range of terrestrial and marine environments, and are thus of vital importance for the recycling of carbon on Earth, a capacity with broad implications in global ecology, biogeochemistry, agriculture, and, more recently, for industrial applications. Broadly, two primary modes of fungal biomass degradation exist employing either enzymatic or chemical means to break down biomass. Brown-rot fungi initially utilize Fenton chemistry to generate hydroxyl radicals, which attack plant cell walls via powerful oxidation reactions.60,61 Conversely, filamentous fungi characterized as soft-rots and white-rots have been long known to primarily employ enzymatic means to break down biomass. Many filamentous fungi produce high titers of effective cellulolytic enzymes employing the “free” enzyme paradigm mentioned above, and thus, filamentous fungal enzyme production has become a cornerstone of industrial biofuels research and development.6,34 In particular, the isolation of the soft-rot ascomycete fungus T. reesei in the South Pacific in the 1940s followed by its characterization at the Natick Research Laboratories marked the beginning of the development of filamentous fungi for biomass conversion purposes.62 Subsequently, a wealth of mechanistic information regarding fungal cellulase structure and function has been reported from the 1980s onward. This research area has been accelerated in the past decade by significant governmental and commercial investments in research on biofuels production worldwide.

Here, we review enzymatic mechanisms utilized by fungi to depolymerize cellulose, with the primary focus on discoveries made since the first structural reports for each enzyme family. We highlight open questions related to furthering our understanding of these biologically and industrially important enzymes. Cellulases, unlike many commonly studied enzymes, act on an insoluble substrate, and cellulose represents a complex, heterogeneous macromolecule; thus, we first briefly review physical and chemical aspects of cellulose relevant to enzyme−substrate interactions. The initial structural reports of cellulases, primarily beginning in the early 1990s, enabled identification of substrate interactions and catalytic mechanisms, and given that most fungal enzymes that depolymerize cellulose employ a hydrolytic mechanism, we discuss general aspects of how cellulases prime cellulose for hydrolysis. We then briefly review research efforts into cellulase mechanisms up until the initial structural reports. Many well-characterized cellulases are multimodular with binding modules and linkers that enable multiple functions within the same enzyme; the efforts dedicated to the study of binding function and multimodularity are reviewed here as well. As fungi and most other biomass-degrading organisms secrete enzyme cocktails, we then discuss the primary components of fungal cellulolytic cocktails including glycoside hydrolase (GH) family 7, 6, 5, 12, and 45 cellulases, in this order. We note that we do not discuss β-glucosidases (GH1 and GH3 enzymes), which cleave cellobiose to glucose in solution, as these enzymes have been recently reviewed and primarily act on soluble substrates.63 Additionally, fungi and other biomass-degrading organisms employ a vast diversity of enzymes aimed at other substrates in the plant cell wall such as hemicelluloses and pectins; these polysaccharides are also not reviewed here, primarily because cellulose is the most recalcitrant carbohydrate substrate in the cell wall compared to the other polysaccharides, but the importance of other cell wall polysaccharides is of significant relevance for biomass conversion.

For each GH family, we focus on the molecular-level aspects of cellulase action by reviewing studies from structural biology, biophysical and biochemical measurements including microscopy, spectroscopy, scattering, and modeling. All of these tools are needed to develop a comprehensive, mechanistic view of cellulase structure and function. Additionally, we review exciting new discoveries in mechanistic paradigms for cellulose depolymerization, namely the recent discovery of lytic polysaccharide monooxygenases (LPMOs), which employ copper, oxygen, and a reducing agent to oxidatively cleave cellulose. In summary, work in recent years has demonstrated that substantial gains are still possible in reducing enzyme loadings for biomass conversion, and for further gains, fundamental science, enzyme discovery and screening, and improved methods for enzyme engineering will be required.

2. CELLULOSE

Cellulose is the β-1,4-linked homopolymer of β-D-glucose and exhibits a reducing and nonreducing end, the former of which can ring open to produce an aldehyde form. Given its natural prevalence and utility as a fuel and chemical precursor, and as a material with myriad applications, the study of cellulose is vast and diverse. As such, we briefly review salient physical and chemical properties relevant to cellulose deconstruction by cellulolytic enzymes in nature. Even this narrowing of scope leaves a great deal of informative and influential work to be addressed. We note this section is by no means meant to be an exhaustive review of cellulose structure, but, rather, a brief introduction setting the stage for discussion of cellulose deconstruction and the methods used to study cellulases. Lastly, we note that the mechanisms of cellulose synthesis, which are key for elucidating its structure in plants, have long been studied and identified as one of the major challenges in plant biology.14,64−66 Exciting recent work has begun to illustrate the molecular level details of cellulose biosynthesis.15,67−70

Figure 2. Natural and synthetic cellulose polymorphs. For each of the four polymorphs, the “end-on” view is shown at top and the “top-down” view at bottom. Celluloses Iα and Iβ are naturally occurring polymorphs exhibiting only intralayer hydrogen bonding.88,89 The differences between the two polymorphs are most easily observed from the “top down” view, which illustrates the subtle differences in interlayer chain stacking. Celluloses II and III_I, the result of chemically pretreating cellulose I, are significantly different in their chain stacking arrangement.90,94 Hydrogen bonding, shown in yellow, occurs between sheets as across the layers.

2.1. Cellulose Structures

Enzymes that break down recalcitrant polysaccharides must overcome multiple challenges in their catalytic action. At the atomic level, β-1,4-linked polysaccharides exhibit incredibly

strong covalent bonds. In an influential study, Wolfenden and colleagues estimated that O-glycosidic linkages such as those found in cellulose, chitin, and other polysaccharides are 2 and 4 orders of magnitude more stable than DNA or peptide bonds, respectively, to uncatalyzed hydrolysis at neutral pH, with a half-life of an astounding 5 million years.71,72 Indeed, chemically intact cellulose and chitin (a similar substrate based on N-acetylglucosamine monomers) have been found in fossilized plants that are significantly older.73−77 Given these bond strengths, GHs are incredibly proficient enzymes in that they can provide rate enhancements (kcat/knon) up to 10^17-fold. This in turn makes GHs the most powerful hydrolytic enzymes known that do not employ metals or other cofactors.78,79

The covalent bond strength of the β-1,4 glycosidic linkage in cellulose is merely one of the challenges that enzymes face when depolymerizing this recalcitrant substrate. Cellulose microfibrils in plants pack into tightly bound, crystalline lattices wherein only a fraction of the chains are accessible to enzymatic attack on the microfibril surface, which forms yet another barrier that fungi and other biomass-degrading organisms must overcome. Individual cellulose chains are cosynthesized by large, membrane-bound terminal complexes, which simultaneously extrude and assemble multiple chains of cellulose into elementary microfibrils.14,64−66 Understanding how enzymes depolymerize cellulose from a mechanistic perspective is predicated on knowing the localized crystalline structure of cellulose chains and the shapes and properties of the cellulose microfibrils. Each question is briefly described below.

Cellulose can pack into multiple crystalline forms, or polymorphs. Natural systems, including plants, produce cellulose I, the study of which dates back to the 19th century.80−84 The structure of cellulose I as a set of parallel-oriented chains was originally reported by Gardner and Blackwell in 1974 using cellulose from the alga Valonia ventricosa.85 In 1984, plant cellulose was further shown to be a mixture of the cellulose Iβ and Iα polymorphs in two papers from Vanderhart and Atalla.86,87 In 2002 and 2003, Nishiyama and co-workers reported refined structures of both cellulose Iβ and Iα obtained from a tunicate (Halocynthia roretzi) and from the freshwater alga Glaucocystis nostochinearum, respectively.88,89 These structures were obtained from synchrotron X-ray and neutron diffraction studies of oriented cellulose films and were obtained at nearly atomic resolution in both cases such that hydrogen bond patterns could be established. The primary differences between the cellulose Iβ and Iα polymorphs reside in the hydrogen bonding patterns and the interlayer chain stacking arrangement (Figure 2). Cellulose Iβ forms two different layers, dubbed the “center” and “origin” layers, whereas cellulose Iα forms a unit cell with a single chain. In both cellulose I polymorphs, the hydrogen bonds only exist within single layers with no intersheet hydrogen bonds, which was noted at the time to be surprising, further suggesting that the van der Waals contacts in cellulose contribute a great deal to its overall thermodynamic stability. It should also be noted that both cellulose structures originated from cellulose microfibrils that are known to be much bigger in diameter than elementary plant cellulose microfibrils, which may have an impact on the structures, as discussed in more detail below.

Certain chemical treatments can convert cellulose I to other crystalline forms. Treatment with solvents such as sodium hydroxide (known as mercerization)90 or dissolution in ionic liquids91,92 can convert the parallel chains in native cellulose into an antiparallel arrangement with both inter- and intralayer hydrogen bonding interactions, producing cellulose II (Figure 2). The structure of cellulose II was also recently presented.90,93

Figure 3. Four of the putative Iβ microfibril models proposed in recent years. For many years, the 36-chain diamond shape has been the prevailing model.64,104−108 NMR and X-ray scattering techniques have suggested alternative models may better fit the data. Fernandes et al. suggested the 24-chain rectangle and diamond models, the former of which was also proposed by Thomas et al.102,103 Newman et al. recently suggested that the 18-chain diamond model fits wide-angle X-ray scattering data well.109


Figure 4. Computer simulation suggests that the cellulose Iβ microfibril twists by approximately 1.5° per cellobiose unit when it is solvated in water and the ends are not fixed. The existence of cellulose twist remains an open experimental question. A representative microfibril begins to twist after only 1 ns, shown at left. Reprinted with permission from ref 114. Copyright 2005 Elsevier Ltd.
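As a rough back-of-the-envelope reading of the twist rate quoted in the Figure 4 caption, the sketch below converts approximately 1.5° per cellobiose unit into the number of cellobiose units, and the approximate length along the fibril axis, needed for one full turn. The function names and the 1.038 nm cellobiose repeat length (the fiber-axis repeat commonly reported for cellulose Iβ) are assumptions introduced here for illustration; they are not taken from the simulation study.

```python
# Assumed fiber-axis length of one cellobiose repeat in cellulose I-beta (nm).
# This value is an illustrative assumption, not a number from the source text.
CELLOBIOSE_REPEAT_NM = 1.038


def units_per_turn(twist_deg_per_unit: float, turn_deg: float = 360.0) -> float:
    """Number of cellobiose units needed to accumulate `turn_deg` of twist."""
    if twist_deg_per_unit <= 0:
        raise ValueError("twist rate must be positive")
    return turn_deg / twist_deg_per_unit


def pitch_nm(twist_deg_per_unit: float,
             repeat_nm: float = CELLOBIOSE_REPEAT_NM) -> float:
    """Approximate fibril length (nm) over which the microfibril twists once."""
    return units_per_turn(twist_deg_per_unit) * repeat_nm


if __name__ == "__main__":
    rate = 1.5  # degrees per cellobiose unit, as quoted in the caption
    print(units_per_turn(rate))      # 240.0 cellobiose units per full turn
    print(round(pitch_nm(rate), 1))  # roughly 249 nm under the assumed repeat
```

Under these assumptions, a full 360° turn would span on the order of a few hundred nanometers of fibril length, which gives a sense of how gradual the simulated twist is relative to the footprint of a single enzyme.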

Less severe chemical treatments, for example in ammonia, can also convert cellulose I or cellulose II into cellulose III (sometimes called cellulose III_I and III_II, indicating that it originates from cellulose I or II, respectively).94 Cellulose III forms staggered layers with intra- and interlayer hydrogen bonding interactions, unlike in cellulose I where the chains pack into flat layers (Figure 2). Both cellulose II and III typically exhibit greater digestibility by cellulase enzymes.20,95,96 Cellulose IV has been suggested as another polymorph,97 but less characterization has been conducted to date, and it has been recently suggested that it is likely quite similar to cellulose Iβ.98 Several thermodynamic measurements were conducted in the 1980s to understand the differences between cellulose I and II, which revealed relative enthalpic stabilities;99 additional rigorous thermodynamic and kinetic experiments will be required to more fully understand the interconversion between cellulose polymorphs.

2.2. Cellulose Microfibrils

In the cellulose Iβ and Iα structures determined by Nishiyama et al.,88,89 the cellulose microfibrils were very large in diameter (10−20 nm). Conversely, plant cellulose chains pack into microfibrils with much smaller diameters,100−103 and, thus, have much higher surface-to-volume ratios than cellulose from algae or tunicates. Regardless of the source, the cross-sectional shape of the microfibril is quite likely dictated by the shape and arrangement of the terminal synthase complexes. It has long been hypothesized that the elementary microfibril size in plants is 36 chains, primarily on the basis of imaging of terminal complexes combined with other analyses (Figure 3).64,104−108 More recently, new reports have begun to challenge the notion of a 36-chain model by applying a variety of scattering and nuclear magnetic resonance (NMR) techniques.100,102,103,109 Kennedy and co-workers examined cellulose from celery collenchyma, which is similar to cellulose from higher plants, using NMR and scattering methods. In a careful study with their assumptions discussed thoroughly, they conclude that the microfibril diameters are between 2.4 and 3.2 nm, which corresponds to a cellulose microfibril consisting of 15−25 chains each. In 2011, the same group used a wide variety of scattering and spectroscopic tools to examine microfibril cross sections in spruce wood.102 With the assumption that the number of chains in the microfibril must be divisible by 6, their findings suggest that spruce wood microfibrils comprise about 24 chains. In terms of microfibril shape, they suggest either a diamond or a rectangular shaped microfibril, with the latter model fitting their experimental observations better (Figure 3). Moreover, their results suggest that the microfibrils are likely twisted and that the surfaces are disordered.102 In 2013, the same group again examined celery collenchyma cellulose microfibrils and demonstrated that the best-fit model was a 24-chain microfibril with 8 layers of 3 chains each.103 Subsequently, a study from Newman et al. applied similar methods to mung bean cellulose.109 They fit their scattering and NMR data to a 36-chain microfibril model but were not able to match the experimental data to the calculated diffractograms. Conversely, Newman et al. were able to achieve good agreement between the computed and measured diffractograms and spectra using a 24-chain and 18-chain model, with the 18-chain model providing the best fit (Figure 3). Overall, this recent body of work suggests that cellulose microfibrils in higher plants may be smaller than the commonly assumed 36-chain models.

Computer simulations have also been applied alongside structural studies to gain additional insights into molecular level cellulose properties.110−113 Simulations in particular have provided supporting evidence for the twisting of microfibrils, and as such, several recent highlights that are complementary to structural, NMR, and scattering studies are discussed here. In an early study, Matthews et al. simulated a 36-chain model of cellulose Iβ and demonstrated that the microfibril is prone to twisting by approximately 1.5° per cellobiose unit (Figure 4).114 Additional computational studies have reported twisting in microfibrils of various sizes and shapes with multiple empirically fit energetic representations (force fields) for the carbohydrates,115−118 and recent simulations have suggested the


physical basis for this twisting phenomenon.119 As mentioned above, experimental studies have also suggested that plant microfibrils indeed exhibit twist.102 The effects of cellulose microfibril twisting on cellulase action remain largely an open question.

Beyond atomistic simulations, multiple groups have developed coarse-grained models for studying cellulose at increasingly larger scales and at a variety of resolutions.120−127 Coarse-grained models are quite important for the study of cellulose phenomena such as enzyme action on the substrate, interactions of multiple cellulose microfibrils, or interactions with other biopolymers, all of which occur at long length and time scales. Going forward, improved simulation models for cellulose coupled to the development of high-resolution imaging will likely converge such that simulation and experiment can be directly connected. To this end, Ciesielski et al. recently reported the development of a transmission electron microscopy (TEM) method wherein atomic coordinates for cellulose microfibrils can be mapped onto the experimentally measured nanoscale architecture of the plant cell wall after mild thermochemical pretreatment.128 This study highlights the power of combined experimental and computational tools to understand the behavior of cellulose microfibrils in the cell wall context, which is key to understanding the physical and chemical environments that cellulases encounter during their enzymatic action.

It is commonly thought that cellulose comprises crystalline and amorphous regions. Although significant research has been conducted toward this question, there remains much controversy around this topic, and a clear definition of amorphous cellulose does not exist. Habibi et al. note that amorphous cellulose likely arises from chain dislocations wherein microfibrils distort due to internal strain.129 Taking the definition of amorphous cellulose as distorted regions along the length of crystalline fibers aligns well with the conventional model of cellulose hydrolysis by GH enzyme cocktails32 mentioned previously. Namely, EGs cleave cellulose chains in amorphous regions along the cellulose crystals, and CBHs attach to chains and processively hydrolyze glycosidic linkages down the chain. During processive hydrolysis in crystalline regions, CBHs must decrystallize individual chains of cellulose, which requires thermodynamic work. Several studies have examined the amount of work that is required to decrystallize chains from the surface of cellulose microfibrils with computer simulation.130−132 Beckham et al. examined chain decrystallization of cellulose Iβ, Iα, II, and III, demonstrating that the work to decrystallize chains of celluloses Iβ and Iα is greater than that of celluloses II and III.130 Moreover, it was shown that an increasing number of inter- and intralayer interactions increase the decrystallization work in essentially an equivalent manner, further suggesting that interlayer and intralayer interactions contribute similarly to the thermodynamic stabilization of cellulose.130

Amorphous cellulose is often represented by phosphoric acid swollen cellulose (PASC), regenerated cellulose, or soluble, oligomeric substrates. Each of the “clean” cellulose substrates (i.e., free of hemicellulose or lignin) exhibits significantly different properties including degree of polymerization, degree of crystallinity, microfibril and fiber shapes, available reactive surface area for cellulase reactions, and possibly more variables that substantially impact the effectiveness of enzymatic deconstruction. The use of many model substrates, often with significant variation even within substrates, can make direct comparison of cellulase performance data difficult across laboratories and studies. For pretreated substrates, which typically contain other residual plant cell wall polymers, direct comparisons are even more challenging. Thus, we note that caution should be taken when comparing activity data across studies as substrate variations can greatly influence the outcomes of activity measurements. As the study of cellulase action is invariably linked to the substrate properties, these properties should not be overlooked. Detailed substrate characterization is essential in cellulase studies. Going forward, the combination of structural, nanoscale imaging, and modeling will continue to be important for elucidating the features of cellulose relevant to enzymatic deconstruction. Some of the more prevalent substrates are described below. 2.3.1. Microcrystalline Cellulose (MCC) from Plants. Purified, microcrystalline celluloses have been used extensively for comparing cellulase performance.96 These celluloses are made from wood pulp and are available commercially, including the following: Sigmacell, Solka-floc, and Avicel. Cotton linters, the dust-like byproduct from cotton processing, are also used for cellulase assays. Cotton linters are essentially fragmented, but chemically unmodified cotton fibers reduced to small size (ca. 100 mesh). Whatman No. 
1 filter paper, commonly used to determine the international filter paper unit, is made from cotton cellulose. The amorphous cellulose content of MCC (Avicel) can be increased by a mechanical treatment process known as ball milling. Ball-milled cellulose, especially when prepared under conditions of low water content, has been shown to be considerably more digestible than the starting material.127,133 2.3.2. MCC from Microbes. Cellulose is produced by green algae and some bacteria, primarily of the genera Acetobacter, Sarcina, and Agrobacterium. Acetobacter xylinum cellulose is most commonly used for cellulose digestion experiments as it can be produced in fermenters in yields as high as 15 g/L.134 The bacterial cell produces protofibrils of approximately 2−4 nm in diameter, which are eventually bundled into ribbon-shaped microfibrils of about 80 × 4 nm2.135 Natural bacterial cellulose (BC) must be purified before use in enzyme assays. For A. xylinum cellulose, cellulose from culture medium is treated with sodium or potassium hydroxide, acetic acid, followed by repeated washing with ultrapure water.136 The resulting BMCC has microfibrils of approximately 0.1−10 μm in width, which is 100 times thinner than the microfibrils found in plant cell walls. It has also been demonstrated that purified A. xylinum BMCC has a degree of polymerization of about 800.137 Green algae in which crystalline cellulose is the major component of the cell walls include the Cladophorales (Cladophora, Chaetomorpha, Rhizoclonium, and Microdictyon) and a few members of Siphonocladales (Valonia, Dictyosphaeria, Siphonocladus, and Boergesenia). V. ventricosa produces large cellulose microfibrils (compared to plant cell wall microfibrils),138 which have been proposed to be in a 33 × 38 chain configuration with lengths of

2.3. Cellulose Substrates

Accurate experimental representation of the nano- and microstructures of cellulose is a notoriously difficult prospect. Throughout the cellulase literature, multiple model celluloses are used as representative crystalline and amorphous substrates. Commonly, Avicel, bacterial microcrystalline cellulose (BMCC), tunicate or algal cellulose, various forms of pretreated biomass, or non-natural polymorphs derived from the many types of cellulose I are used to represent crystalline cellulose.


hundreds of nanometers to a few micrometers.139 Valonia and Cladophora produce celluloses with an exceptionally high degree of crystallinity, approximately 95% from XRD,140 and for this reason make ideal substrates for cellulase studies. Algal cellulose is indeed complex, perhaps more than plant cellulose, considering that, in each Cladophora microfibril, the two cellulose I polymorphs were suggested to coexist, alternating either longitudinally or laterally.141 Koyama et al. further suggested that there are three types of cellulose I polymorphs found in green algae: Iα-broad microfibrils, Iβ-flat ribbons, and Iβ-small microfibrils with random orientation.142 Although highly crystalline, the specific surface area of Cladophora cellulose powder has been reported to be as high as 95 m2/g from N2 gas adsorption studies,143 which is much higher than this value for BMCC, ∼1 m2/g. Today, the molecular and polymeric basis for the action of cellulases on algal celluloses is not clear. The advantages of using BMCC and algal celluloses are thus considerable, yet inconsistencies in preparation practices between laboratories can introduce difficulties in comparing assay results. 2.3.3. Phosphoric Acid Swollen Cellulose (PASC). Walseth first developed a procedure for producing high-reactivity cellulose suitable for cellulase activity studies by swelling air-dried cellulose in 85% phosphoric acid.144 After dissolving crystalline cellulose, the solubilized precursors can be formed, which can subsequently be hydrolyzed. 
Cellulose dissolution in phosphoric acid involves two processes: an esterification reaction between hydroxyl groups of cellulose and phosphoric acid to form cellulose phosphate and a competition of hydrogen bond formation between the hydroxyl groups of cellulose and hydrogen bond formation between hydroxyl groups of cellulose and water molecules or hydrogen ions.145 During this acid treatment, some glycosidic bond hydrolysis occurs which reduces the degree of polymerization, although the effects can be controlled. Following regeneration with water, free phosphoric acid is recovered, and the resulting cellulose is amorphous without significant recrystallization. PASC has been used widely as a test substrate for CBHs. Note that, in contrast to PASC, which has no chemical modification, carboxymethyl cellulose (CMC) retains esterified carboxymethyl side chains and is thus suitable only for testing EG action. 2.3.4. Cellulose Crystallinity. The crystallinity index (CI) of celluloses is a key parameter when selecting substrates for enzyme assays. Cellulose CI has been measured using several different techniques including XRD, solid-state 13C NMR, infrared (IR) spectroscopy, and Raman spectroscopy.146 There have also been several methods used for calculating CI from the raw spectrographic data, particularly for XRD. Methods using Fourier transform (FT)-IR spectroscopy determine CI by measuring relative peak heights or areas.147,148 Thygesen et al. compared four different analysis techniques involving XRD and reported that the CI of Avicel cellulose varied significantly depending on the technique used.149 In 2012, Park et al. made critical comparisons between the different techniques using XRD and solid-state 13C NMR.150 Comparisons were made with literature data for the CI of one type of cellulose (Avicel PH101) using these methods. Park et al. 
also reported the CI values for seven crystalline cellulose preparations and BMCC, and further recommended that the simple, peak height XRD measurement be excluded from comparisons and the remaining values from other XRD and NMR methods be averaged to obtain the “best value” for CI. These results are shown in Table 1.

Table 1. Values for CI from Combined XRD and NMR Analysis150

cellulose tested    XRD peak deconvolution    XRD amorphous subtraction    NMR C4 peak separation    average
Cladophora          ND                        80                           ND                        80
BMCC                73                        82                           74                        76
Avicel PH101        61                        78                           57                        65
SigmaCell 50        61                        79                           56                        65
SigmaCell 20        64                        67                           53                        61
SolkaFloc           57                        57                           44                        53
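The averaging recommended by Park et al. can be illustrated with a short calculation on the Table 1 values. The sketch below is only a minimal illustration: it assumes that "ND" means no value was reported for that method (represented here as None), and the names TABLE_1 and average_ci are hypothetical rather than taken from the cited work.

```python
# Percent crystallinity values transcribed from Table 1.
# Column order: XRD peak deconvolution, XRD amorphous subtraction,
# NMR C4 peak separation, reported average. None stands in for "ND" (assumption).
TABLE_1 = {
    "Cladophora":   (None, 80, None, 80),
    "BMCC":         (73, 82, 74, 76),
    "Avicel PH101": (61, 78, 57, 65),
    "SigmaCell 50": (61, 79, 56, 65),
    "SigmaCell 20": (64, 67, 53, 61),
    "SolkaFloc":    (57, 57, 44, 53),
}


def average_ci(estimates):
    """Average the available CI estimates, skipping entries recorded as None."""
    available = [value for value in estimates if value is not None]
    if not available:
        raise ValueError("no CI estimate available")
    return sum(available) / len(available)


# Compare the recomputed mean with the average column of the table.
for name, (*estimates, reported) in TABLE_1.items():
    print(f"{name:13s} computed {average_ci(estimates):5.1f}   reported {reported}")
```

Rounding each computed mean to the nearest integer gives the value in the average column, which suggests that the tabulated averages are plain arithmetic means over whichever of the three methods reported a value.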

3. GLYCOSIDE HYDROLASE CATALYTIC MECHANISMS

Given the diversity of monosaccharides and the multiple types of glycosidic linkages possible, carbohydrates form the most diverse set of biomolecules in nature. As such, the enzymatic machinery to synthesize, modify, and deconstruct carbohydrates is vast.151−153 The Carbohydrate-Active Enzymes Database (www.CAZy.org) is a manually curated list of the primary enzyme classes known to act on carbohydrates.151−153 Since its inception in 1998, CAZy has become an invaluable resource in carbohydrate enzymology. More recently, a sister site, CAZypedia (www.cazypedia.org), has begun to develop descriptions of the CAZy Database entries for each enzyme class and family within each class. The protein classes covered in CAZy as of the time of this review include glycosyltransferases, carbohydrate esterases, polysaccharide lyases, auxiliary activities, carbohydrate-binding modules (CBMs), and GHs. Glycosyltransferases (EC 2.4-) catalyze the formation of glycosidic bonds with nucleotide phosphate or lipid phosphate leaving groups and to date have been classified into 95 families.154 Carbohydrate esterases (EC 3.1.1- or 3.1.5-) are responsible for de-O- or de-N-acylation of polysaccharides, such as acetyl xylan esterases, and comprise 16 known families with multiple nonclassified sequences, perhaps suggesting that other families exist. Polysaccharide lyases (EC 4.2.2-)155 employ β-elimination reaction mechanisms to cleave uronic acid-containing polysaccharides, such as those commonly found in pectins.10,11 To date, polysaccharide lyases form 23 characterized families.151,155 “Auxiliary activities” or AAs are a more recent addition to CAZy,152 and currently include 12 families of enzymes in total, with 8 known to be active during lignin degradation and 4 known to be directly active on polysaccharides, namely LPMOs. 
Oftentimes, catalytic function for carbohydrates is associated with binding function, and thus, CAZy contains a classification scheme for CBMs as well, which to date represent 69 families, as will be discussed in section 5.156 Lastly, GHs are included in the CAZy Database. GHs are the primary drivers of enzymatic polysaccharide degradation in nature and represent a vast set of enzymes. To date, 132 GH families have been characterized. As most cellulases are GHs, we focus on their description in this section from a general, mechanistic viewpoint. Individual GH families that fungi employ, namely GHs from Families 5, 6, 7, 12, and 45, are described in separate sections below. We note that newly discovered LPMOs do not employ a hydrolytic mechanism; their mechanism of action is described in section 11.

3.1. Retaining and Inverting Mechanisms

As proposed by Koshland in 1953, nearly all known GHs employ one of two mechanisms: either retaining or inverting hydrolysis.157 Inverting mechanisms proceed via a single


Scheme 1. Two Primary Catalytic Mechanisms of GHs

(A) Inverting GHs employ a single displacement catalytic mechanism wherein a water molecule conducts nucleophilic attack at the anomeric carbon of the −1 sugar, a catalytic base abstracts a proton from the attacking water molecule, and a catalytic acid transfers a proton to the glycosidic oxygen to cleave the glycosidic linkage, resulting in an inversion of stereochemistry at the anomeric carbon. (B) Retaining GHs employ a two-step, double displacement catalytic mechanism. In the first step, the nucleophilic residue attacks the anomeric carbon simultaneously with the proton transfer from the acid residue to the glycosidic oxygen, resulting in the formation of the glycosyl-enzyme intermediate and cleavage of the glycosidic bond. In the second step, a water molecule enters the active site and attacks the anomeric carbon while simultaneously transferring a proton to the catalytic base, thus restoring the enzyme active site for subsequent catalysis.

While the mechanisms put forth by Koshland are now widely accepted,159−161 there has historically been some debate about whether bond cleavage and formation occur in concerted steps in an SN2 reaction, as shown in Scheme 1, or via a carbocation intermediate as in an SN1-type reaction,162−166 or even via an acyclic oxocarbenium ion,167 although subsequent studies have not supported ring-opening as part of the mechanism.168 Researchers now discuss oxocarbenium-like transition states (TSs), rather than a distinct ionic intermediate, as the lifetime of an ion in an enzyme active site would be less than a molecular vibration.169−171 The oxocarbenium-like TSs show extended distances between atoms involved in bond cleavage and formation, sometimes referred to as “exploded” TSs.161,170 GH enzymes often have multiple carbohydrate binding sites in their CDs. For example, family 7 glycoside hydrolase (GH7) cellulases exhibit at least 9 subsites for binding cello-oligomers, and catalysis generally takes place 2 glucose units from the reducing end of the chain bound in the enzyme tunnel.172−174 As GHs ubiquitously feature binding subsites for carbohydrate residues in polysaccharides, Davies, Wilson, and Henrissat proposed a scheme for the naming of carbohydrate binding sites similar to the scheme previously proposed by Biely, Krátký, and Vršanská.175,176 Specifically, they propose the use of the −n to +n system used by molecular enzymologists, where −n represents the nonreducing end and the +n represents the reducing end subsites. Thus, glycosidic bond cleavage in GHs occurs between the −1 and +1 subsites. This nomenclature has become nearly universally accepted in the carbohydrate enzymology and structural biology communities and will be used throughout this review.175,176
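The −n to +n subsite convention can be made concrete with a few lines of code. The following is a minimal sketch with hypothetical helper names (subsite_labels, cleavage_position); the GH7 example assumes exactly nine subsites with two on the reducing-end side of the scissile bond, which is one reading of the "at least 9 subsites" and "2 glucose units from the reducing end" description above.

```python
def subsite_labels(n_minus, n_plus):
    """List subsite labels from the nonreducing end (-n_minus) to the reducing end (+n_plus).

    There is no subsite 0: the scissile bond lies between -1 and +1.
    """
    if n_minus < 1 or n_plus < 1:
        raise ValueError("need at least one subsite on each side of the scissile bond")
    minus_side = [f"-{i}" for i in range(n_minus, 0, -1)]
    plus_side = [f"+{i}" for i in range(1, n_plus + 1)]
    return minus_side + plus_side


def cleavage_position(labels):
    """Return index i such that the scissile bond lies between labels[i] and labels[i + 1]."""
    return labels.index("-1")


# Assumed GH7-like layout: seven minus subsites and two plus subsites.
gh7 = subsite_labels(7, 2)
i = cleavage_position(gh7)
print(gh7)
print("bond cleaved between", gh7[i], "and", gh7[i + 1])
```

With this layout, the two residues occupying +1 and +2 are released together, consistent with catalysis taking place two glucose units from the reducing end of the bound chain.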

catalytic step (Scheme 1A), wherein a water molecule conducts nucleophilic attack at the anomeric carbon of an oligosaccharide or polysaccharide, a proton from water is transferred to the catalytic base, and a proton is transferred from the catalytic acid to cleave the glycosidic linkage. Generally, both the catalytic acid and base residues exhibit carboxylate groups (i.e., Asp or Glu). This reaction results in an inversion of stereochemistry at the anomeric center. Before the next catalytic cycle, the catalytic acid and base must be reset to their configuration for catalysis. Scheme 1A illustrates the inversion from a β-linkage to an α-linkage. Conversely, retaining mechanisms are two-step reactions (Scheme 1B). The first step in retaining hydrolysis (typically termed “glycosylation”) involves proton transfer from the catalytic acid and attack at the anomeric carbon by the nucleophile residue to form a covalent glycosyl-enzyme intermediate and invert the stereochemistry of the sugar covalently bound to the enzyme. In the second step, termed “deglycosylation”, a water molecule enters the enzyme active site and conducts nucleophilic attack at the anomeric carbon. The glycosyl-enzyme intermediate bond is broken, and a proton is transferred to the catalytic base (which was the acid in the first step), resetting the enzyme and again inverting the stereocenter at the anomeric carbon for an overall net retention of stereochemistry. In retaining mechanisms, the carbohydrate product of glycosylation is typically not assumed or illustrated to participate in deglycosylation, and generally, both the catalytic acid and nucleophile are carboxylate residues. Several GH families have been discovered that do not follow this typical paradigm, which have been recently reviewed.158 All cellulases reviewed here follow the typical retaining or inverting hydrolysis mechanisms shown in Scheme 1 with the exception of LPMOs reviewed in section 11.
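The stereochemical bookkeeping behind the two mechanisms reduces to counting inversions at the anomeric carbon: one displacement for inverting enzymes and two for retaining enzymes. The sketch below is only an illustration of that counting, with hypothetical function names, and it models nothing about the chemistry beyond the number of displacement steps.

```python
def invert(anomer):
    """Flip the anomeric configuration."""
    return {"alpha": "beta", "beta": "alpha"}[anomer]


def product_anomer(substrate_anomer, mechanism):
    """Track the anomeric configuration through each displacement step.

    Returns the final configuration and the full history, including the
    covalent glycosyl-enzyme intermediate for retaining enzymes.
    """
    n_displacements = {"inverting": 1, "retaining": 2}[mechanism]
    history = [substrate_anomer]
    for _ in range(n_displacements):
        history.append(invert(history[-1]))
    return history[-1], history


for mechanism in ("inverting", "retaining"):
    final, history = product_anomer("beta", mechanism)
    print(mechanism, " -> ".join(history))
```

For a β-linked substrate such as cellulose, the single displacement ends at the α configuration, whereas the double displacement passes through an inverted intermediate and returns to β.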


Figure 5. Representations of the IUPAC conformations of six-membered rings: chair (C) has four atoms in the same plane, with one atom above that plane and an atom on the opposite side of the ring below the plane; envelope (E) has five atoms in one plane, with the sixth either above or below the plane; half-chair (H) has four atoms in one plane, with one atom above the plane and an adjacent atom below the plane; skew (S) has four atoms in the same plane, with one atom above the plane and an atom two positions away below the plane; and boat (B) has four atoms in the same plane, with two atoms on opposite sides of the ring either both above or both below the plane.

lysozyme,186 which was the first structure of an enzyme ever solved.187 On the basis of analogy with SN1 transition states, it was suggested that the ring distortion is a key component of the hydrolysis mechanism, contributing to the formation of a carbonium ion and weakening the C1-glycosidic bond.186 Other research groups have proposed that the role of ring distortion is overestimated on the basis of lack of difference in observed binding constants188 or calculated low-energy conformations.189 However, structural studies of the hen egg-white lysozyme confirmed that a key substrate ring is distorted from the chair conformation, which investigators attributed to forming a geometry that supported formation of an oxocarbenium ion at C1190 or to weakening the scissile glycosidic bond by creating a higher-energy ground state.162,191 In one of the first published structures of a cellulase, specifically a family 6 glycoside hydrolase (GH6), Rouvinen et al. noted that the sugar ring in the −1 subsite (at the time referred to as the “B subsite”) was distorted from the solution-stable 4C1 conformation. The authors were originally unsure if this distortion was functionally significant.192 Barr et al. later reported that GH mutants that could allow the −1 sugar to relax to a chair conformation exhibit increased binding affinity and decreased hydrolytic activity, supporting the proposal that ring distortion is important for catalysis.193 Zou et al. observed distorted sugars in the −1 subsite in studies of T. 
reesei Cel6A (TrCel6A) with a nonhydrolyzable ligand, identifying the residue whose steric clash with the hydroxymethyl group forces the sugar ring into a distorted conformation.194 They proposed that the distortion is integral to the catalytic mechanism, providing for nonperiplanar orientation between the scissile bond and a doubly occupied, nonbonding orbital from the ring oxygen.194 As reviewed in detail below, many more cellulase structures have since been solved that reveal puckered carbohydrate rings in the −1 subsites; certainly, the same is also true for GHs beyond cellulolytic GHs, but these additional structures are outside of the scope of this review. For capturing such catalytic reaction coordinates, GH structural biology efforts have relied on the use of synthetic ligands, such as thio-linked sugars, which cannot be hydrolyzed, or other transition state analogue ligands in native enzymes.161,178,195−197 Withers et al. pioneered the use of fluoro-sugars in mechanistic GH studies wherein fluorine atoms are substituted for hydroxyl groups in pyranose sugars. For example, substitution at the C2 position can result in a decrease in deglycosylation rates in retaining GHs, enabling capture of glycosyl-enzyme intermediates.198−202 Other substitutions can lead to the ability to probe mechanistic steps in GHs when coupled to NMR, mass spectrometry, and enzyme kinetics analysis.203 Moreover, the use of native substrates in catalytically inactive mutants, such as those in which glutamate, a common acid/base, is mutated to glutamine, is a common approach. Through approximately a decade of work

3.2. Carbohydrate Ring Puckering

A significant number of crystallographic studies have been conducted to date to study GH catalysis, as detailed in subsequent sections of this review for cellulolytic enzymes, and many studies have employed transition state analogues to capture the Michaelis complex of carbohydrates in GH active sites. A universal observation to date is that GHs distort carbohydrate ring geometries in Michaelis complexes for catalysis away from the chair conformations that are thermodynamically stable in aqueous solution.159,161,169,177,178 To systematically classify sugar puckering geometries, Schwartz developed the original nomenclature for describing the 38 canonical puckering conformations of pyranose rings,179 which was subsequently adopted by the International Union of Pure and Applied Chemistry (IUPAC).180 This system describes pyranose rings as chair (C), envelope (E), half-chair (H), skew (S), and boat (B) conformations, as shown in Figure 5. The naming system further uses superscript and subscript numbers for each ring conformation to denote which atoms are outside of the reference plane formed by four atoms (e.g., B1,4 denotes a boat-shaped pyranose ring with the C1 and C4 carbons below the reference plane). To quantitatively describe puckering, Cremer and Pople proposed a spherical coordinate system that uniquely describes the pyranose ring conformations as a function of three parameters that describe a sphere (Figure 6).181 This has become the standard method to quantify the

Figure 6. The 38 IUPAC-designated puckering conformations for six-membered rings are projected here on a two-dimensional representation of the Cremer−Pople sphere.
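As a concrete illustration of the Cremer−Pople description, the sketch below computes the three spherical puckering coordinates (total amplitude Q, polar angle θ, and azimuthal angle φ) for a six-membered ring from Cartesian coordinates, following the standard Cremer−Pople definitions. It is a minimal pure-Python sketch, not the implementation used in any of the cited studies. The atom ordering (e.g., O5, C1, C2, C3, C4, C5) is an assumption, and which pole corresponds to the 4C1 chair depends on that ordering and on the direction in which the ring is numbered. The idealized_ring helper is hypothetical and only generates test geometries.

```python
import math


def cremer_pople(ring):
    """Return (Q, theta_deg, phi_deg) for six ring atoms given in ring order.

    ring: sequence of six (x, y, z) tuples, e.g. O5, C1, C2, C3, C4, C5 (assumed order).
    """
    n = len(ring)
    if n != 6:
        raise ValueError("expected exactly six ring atoms")

    # Shift the ring so that its geometric center sits at the origin.
    center = [sum(atom[k] for atom in ring) / n for k in range(3)]
    pts = [tuple(atom[k] - center[k] for k in range(3)) for atom in ring]

    # Mean-plane normal: cross product of sine- and cosine-weighted position sums.
    r1 = [sum(p[k] * math.sin(2 * math.pi * j / n) for j, p in enumerate(pts)) for k in range(3)]
    r2 = [sum(p[k] * math.cos(2 * math.pi * j / n) for j, p in enumerate(pts)) for k in range(3)]
    normal = (
        r1[1] * r2[2] - r1[2] * r2[1],
        r1[2] * r2[0] - r1[0] * r2[2],
        r1[0] * r2[1] - r1[1] * r2[0],
    )
    length = math.sqrt(sum(c * c for c in normal))
    if length == 0.0:
        raise ValueError("degenerate ring: mean plane is undefined")
    normal = tuple(c / length for c in normal)

    # Out-of-plane displacement of each atom along the normal.
    z = [sum(p[k] * normal[k] for k in range(3)) for p in pts]

    # m = 2 component (boat/skew-like) and m = 3 component (chair-like).
    a = math.sqrt(2 / n) * sum(zj * math.cos(4 * math.pi * j / n) for j, zj in enumerate(z))
    b = -math.sqrt(2 / n) * sum(zj * math.sin(4 * math.pi * j / n) for j, zj in enumerate(z))
    q2 = math.hypot(a, b)
    q3 = math.sqrt(1 / n) * sum(zj * (-1) ** j for j, zj in enumerate(z))

    total_q = math.hypot(q2, q3)
    theta = math.degrees(math.atan2(q2, q3))      # polar angle, 0 to 180 degrees
    phi = math.degrees(math.atan2(b, a)) % 360.0  # azimuthal angle
    return total_q, theta, phi


def idealized_ring(heights, radius=1.0):
    """Hypothetical test geometry: atoms on a regular hexagon with the given z offsets."""
    return [
        (radius * math.cos(2 * math.pi * j / 6), radius * math.sin(2 * math.pi * j / 6), h)
        for j, h in enumerate(heights)
    ]


chair = idealized_ring([0.25, -0.25, 0.25, -0.25, 0.25, -0.25])
boat = idealized_ring([0.5, 0.0, 0.0, 0.5, 0.0, 0.0])
print("chair:", cremer_pople(chair))
print("boat: ", cremer_pople(boat))
```

In this representation, the two chairs sit at the poles of the sphere (θ near 0° or 180°), boats and skew-boats lie on the equator (θ near 90°), and envelopes and half-chairs fall at intermediate latitudes, which is why geometries between the canonical conformations can still be located unambiguously.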

degree of ring puckering and is quite useful when stable puckering geometries in enzymes are intermediate between the 38 IUPAC puckering geometries.182−184 Hill and Reilly introduced a system of triangular decomposition that is particularly well-suited for molecular simulations and also uniquely identifies the exact puckering conformation.185 Carbohydrate ring puckering in GH active sites has been identified since the seminal structural study of the hen egg-white


oxocarbenium ion (instead forming a lower-energy oxonium ion−water complex) as well as lowering the transition state energy by stretching the scissile glycosidic bond.221 To investigate whether preferential properties for catalysis could be observed in isolated puckered monosaccharides, several studies have examined differences between puckered conformations of β-D-glucose. The metadynamics studies by Biarnés et al. investigated the relative stability of different puckered conformations by employing Cartesian-collective variables to explore the puckering potential energy surface.222 While the chosen collective variables resulted in distortions in puckering amplitude and energy barriers,223 Biarnés et al. found that bond lengths and partial charges of key atoms change as a function of puckering geometry, with puckered geometries observed in enzyme complexes displaying catalytically favorable properties such as elongated C1−O1 bond distance, shortened C1−O5 bond distance, and a higher partial charge on C1.222 Barnett and Naidoo performed a study of β-D-glucose puckering ensuring more complete sampling of puckering geometries, revealing the free-energy landscape calculated with the semiempirical method PM3CARB-1.182 The initial investigation was followed by a study in which they compared free energy differences calculated with different semiempirical methods against the density functional theory method B3LYP/6-311++G(d,p).183 The trends revealed by the semiempirical methods were qualitatively similar, with notable quantitative differences: PM3CARB-1 revealed that the second-lowest energy conformation is BO,3, just 1.6 kcal/mol higher in free energy than the lowest-energy 4C1 conformation obtained, while B3LYP/6-311++G(d,p) identified BO,3 as the third-lowest conformation, at 5.5 kcal/mol higher than the lowest-energy 4C1 conformation. 
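To give a sense of scale for the two BO,3 free-energy differences quoted above, a Boltzmann factor converts each into a population relative to the 4C1 chair. The sketch below is a back-of-the-envelope illustration only: the temperature of 298.15 K is an assumption (the temperatures used in the cited studies are not stated here), and the function name relative_population is hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 0.0019872041  # gas constant in kcal/(mol K)


def relative_population(delta_g_kcal, temperature_k=298.15):
    """Population of a conformer relative to the reference, exp(-dG / RT)."""
    return math.exp(-delta_g_kcal / (R_KCAL_PER_MOL_K * temperature_k))


# Free-energy differences of BO,3 relative to 4C1 quoted in the text (kcal/mol).
for method, delta_g in (("PM3CARB-1", 1.6), ("B3LYP/6-311++G(d,p)", 5.5)):
    print(f"{method:22s} dG = {delta_g:3.1f}  relative population = {relative_population(delta_g):.2e}")
```

Under this assumption, the two methods differ by more than two orders of magnitude in the predicted relative population of the BO,3 conformation, which illustrates how strongly such conclusions depend on the level of theory.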
Together, these papers confirm that different puckered geometries afford measurably different properties that would make them more or less amenable to catalysis, and they show the results are sensitive to the method employed. Recently, Mayes et al. employed a highly accurate electronic structure method (CCSD(T)/6-311+G(d,p)//B3LYP/6-311+G(2df,p)) in a study that ensured thorough sampling of monosaccharide ring geometries.184 In addition to the 38 IUPAC puckered conformations, monosaccharides have exocyclic groups that are free to rotate at ambient temperatures, resulting in the notorious carbohydrate flexibility that challenges the determination of structural and dynamic properties of carbohydrates.224 The study by Mayes et al. compared puckering behavior of five biologically important sugars, and the differences among them testify to the importance of exocyclic groups in defining puckering landscapes. For β-D-glucose, they confirmed that the puckered conformations employed by GH active sites offer a combination of catalytically advantageous properties, such as a higher partial charge at C1 to make it a better target for nucleophilic attack. Furthermore, these conformations have lower barriers for ring interconversion. In the following sections, we review specific details of each cellulase family represented prevalently in fungi, which comprise both inverting and retaining enzymes. Attention is given primarily to specific elements of the catalytic steps in each cellulase family. For more comprehensive reviews of general GH catalytic mechanisms, we refer readers to recent reviews.158,159,161,196

starting in the 1990s, it became clear that cellulases from a variety of GH families commonly distort pyranose rings from the 4C1 conformation to a puckered conformation, observed in the 1S3,204−208 2SO,192,194,209 BO,3,192,209 or 2,5B210 geometries. The preponderance of data indicating conservation of puckering in enzymes from a wide range of GH families and different organisms indicates that ring distortion indeed plays a central role in catalysis of glycosidic bond cleavage, and has spurred efforts to further understand the role of puckering in catalysis. Proposals for the role of puckering include providing for antiperiplanar alignment of the ring oxygen lone pair with the leaving group (the scissile glycosidic bond), a requirement of Deslongchamps’s theory of stereoelectronic control,211 later referred to by the more specific name “antiperiplanar lone pair hypothesis (ALPH)”.212 This requirement in enzymatic reactions has been refuted on the basis that predictions from this theory have been proven false,169,170 such as the prediction that α-linked substrates would retain their ground state, chair conformation during reaction.166,213 In counterpoint to such objections, Nerinckx et al. 
noted that observed pucker conformations seemingly counter to the ALPH could in fact substantiate the ALPH if they were points on a catalytic itinerary with the 1C4 inverted chair, rather than the solution-stable 4C1 chair conformation, and the 1C4 geometry could convert to 4C1 in a subsequent step.214 This suggestion opposes the “principle of least nuclear motion” which posits that enzymatic itineraries adopt conformations which minimize such large conformational changes.168 Deslongchamps further theorized that ring “distortion raises the energy of the ground state and thus lowers the energy of activation for bond cleavage”,166 in agreement with the earlier proposal by Jencks.162 Warshel vigorously repudiated this “strain theory” based on theoretical models showing that it cannot provide a significant catalytic effect.215,216 Additionally, researchers have noted that observed puckering conformations aligned β-linkages in an axial position suitable for nucleophilic attack.159,186,217,218 While it may be a factor in catalysis, it is not a universal feature of GH substrate puckered geometries.184 Warshel advocated that the most important enzymatic contribution to catalysis is stereoelectronic stabilization of the transition state,215,216 harkening back to Pauling’s theories on enzyme mechanisms.219 Blake et al. originally proposed a stereoelectronic argument for ring distortion, suggesting that the puckered conformation aids catalysis by allowing the ring oxygen to share charge with the anomeric carbon, stabilizing a carbonium ion formed during the proposed reaction mechanism.163,186 As previously discussed, it is now widely believed that GHs employ oxocarbenium-like TSs, rather than ionic intermediates, but the basic concept of the puckered geometry stabilizing a positive charge at the anomeric carbon still applies, and stabilizing the TS is exactly what Pauling and Warshel champion. 
The proposal that puckered ring geometries stabilize positive charge at the anomeric carbon continues to hold wide support.159,161,196,220 As modification of the electronic structure by carbohydrate ring puckering is now an obvious feature of GH mechanisms, quantum mechanical approaches coupled to structural biology studies are a clear means to probe the mechanistic underpinnings of carbohydrate catalysis.159 A series of theoretical studies have been conducted to determine the effects of ring distortion on catalysis. Smith performed one of the earlier studies using 2-oxanol.221 This work suggested that ring distortion may obviate the need to form a high-energy


Figure 7. Genealogy of T. reesei mutants developed internationally from World War II to about 2000. Mutagenesis was effected by radiation or chemical treatments. At the time, exposure to a linear accelerator was an effective means of irradiation. Most commercial preparations used today are based on proprietary improvements of RUT-C30.

cocktails.230 An important aside is that, later in the 1970s, Gauss and co-workers at the Bio Research Center Company (Nagoya, Japan) patented the concept of combining, in one tank, T. reesei fungal enzymes, milled biomass, and fermentative organisms.231 This revolutionary concept, later improved in a U.S. patent owned by the Gulf Research & Development Company,232 dramatically increased thermodynamic “pull” to products, and was called simultaneous saccharification and fermentation, or SSF. The wild-type strain of T. reesei, QM6a, has been thoroughly studied and, importantly, used to generate the modern strains of enhanced industrial microorganisms by radiation mutagenesis. T. reesei QM9414, generated from QM9123, was the first production grade strain (Figure 7).233 QM9414 was found to produce approximately 2 times more cellulase protein than QM9123,234 which was in turn superior to the QM6a parent strain by about the same extent.235 This era of very early strain improvement is today difficult to follow for a number of reasons, including the limited methods of protein and activity determination available at the time. For example, the Biuret copper oxidation assay, known today to be highly influenced by sugars and amino acids, was the dominant protein assay. Today, the bicinchoninic acid or dinitrosalicylic acid reagent assays are preferred.236 The other problem was that workers worldwide used vastly different cellulase performance assays, normally based on lab-specific digestion curves, given simply as “units”. A true universal cellulase assay, first proposed by Ghose in 1987, helped alleviate these problems by defining a relevant measure for cellulases, later known as the international filter paper unit.237

4.1.2. Early International Effort for Strain Improvement. Starting from QM9414, researchers at the Natick Lab used ultraviolet light to generate the MCB-77 series of mutants.238 MCB-77 strains demonstrated volumetric productivity of 90 IU/L/h, compared to 30 IU/L/h for QM9414.239 The M series of mutants, produced at Rutgers University from QM9414, yielded strain M-7.240 M-7 was then treated with

4. EARLY DEVELOPMENTS IN FUNGAL CELLULASES

Prior to the landmark publications of the first structures of fungal cellulases, many biochemical experiments shed light on the nature of cellulase enzymes, especially from the model fungus T. reesei. As a preface to detailed structural and mechanistic descriptions, we first briefly describe some of the initial developments in cellulase enzymology related to the initial isolation of T. reesei, the early development work to produce mutant strains for industrial production of cellulases, and the first studies related to fingerprinting the cellulolytic cocktail of T. reesei. We note that the naming scheme of cellulases changed from their initial characterization to adopt the standard CAZy GH classification. Throughout section 4, we typically employ the “classical” name [e.g., the previously used T. reesei “CBH I” versus the new, commonly used T. reesei “Cel7A” (TrCel7A)], and provide the new, widely used enzyme names at the end of this section in Table 2. Beyond this section, we only utilize the new naming schemes for enzymes.

4.1. History of the Discovery and Improvement of T. reesei Strains: Premolecular Era

4.1.1. Early Work at U.S. Army Natick Laboratories. The history of the discovery and improvement of the Trichoderma strains has been thoroughly studied and reviewed.225−229 Many Trichoderma species are known today to be cellulolytic, a characteristic of their saprophytic lifestyle. These strains include T. reesei, T. lignorum, T. koningii, T. harzianum, T. longibrachiatum, T. virens, and T. pseudokoningii. Studies of decomposing cotton militaria sent from Bougainville Island (Solomon Islands) in the South Pacific to the U.S. Army Quartermaster Research and Development Center at Natick, Massachusetts, forged our understanding of an important cellulolytic fungus, eventually named T. viride QM6a.62 Note that T. viride was eventually reclassified as T. reesei in honor of Elwyn Reese.6 Postwar work at Natick by Reese and Mandels led to the modern concept of saccharification of biomass to fermentable sugars effected by the powerful T. reesei enzyme

DOI: 10.1021/cr500351c Chem. Rev. 2015, 115, 1308−1448


reesei CBH I [CD−linker−cellulose binding domain (CBD)] could be shortened or otherwise modified genetically to explore its possible role in cellulose saccharification, as discussed in section 5.251 At this time, new tools were proposed for improving the transformation efficiency of T. reesei, notably hygromycin resistance.252 In 1995, Nakari-Setälä et al. reported the development of a modified strain of T. reesei able to grow on glucose, where catabolite repression shuts down the normal cascade of cellulase expression, and produce targeted proteins using novel promoters.253 Although expression levels of the cellulase CDs reported in this study were low, about 100 mg/L, this report marked a very important early step in engineering promoter performance as well as movement toward engineered T. reesei strains that were able to secrete heterologous proteins into the culture medium without the high background interference from other cellulases. A critical review in 1995 by Keränen and Penttilä outlined the state of the art for expression of heterologous proteins in filamentous fungi.254 The next year, a report from Ilmén et al. described the isolation and characterization of the cre1 genes of the filamentous fungi T. reesei and T. harzianum.255 We know today that, in multicellular ascomycetes, the C2H2 type transcription factor CreA/CRE1 acts as a repressor mediating carbon catabolite repression.256 In T. reesei, CreA/CRE1 binds to the promoters of the respective target genes via the consensus motif, 5′-SYGGRG-3′, and inhibits transcription.257 Ilmén et al.258 also reported the first detailed analyses of the CBH I promoter. This work set the stage for follow-up work from the same group the next year which began to describe regulation of cellulase expression in T. reesei at the molecular level.259 In 2000, Pakula et al. reported the development of a novel isotope labeling approach for measuring protein synthesis and secretion in T. reesei in an attempt to understand the rate-limiting events occurring in cellulase production.260 Over the next few years, work at VTT continued to investigate the protein transcription/expression factors responsible for efficient production of cellulases, including ACEI, ACEII, and RHOIII.261−263 A few years later, the VTT group reported the characterization of two genes, ire1 and ptc2, implicated in the T. reesei unfolded protein response.264 Following this early work, reports began to emerge regarding the development of new strains of T. reesei that had been genetically modified to relieve catabolite repression and thus improve target protein production when cultures are grown on glucose. A leading example was the report from VTT summarizing development of strains of T. reesei in which the cre1 genes were deleted or modified by truncation (cre1-I) to improve expression of plant cell wall degrading enzymes under noninducing growth conditions.265 It is noteworthy that the cre1-I gene was found earlier in the RUT-C30 strain and thus explains some of the ability of this strain, produced by chemical mutagenesis in the 1980s, to produce cellulases when grown on glucose. Kubicek et al.266 summarized promising genetic strategies for improving protein production in T. reesei. It is known that plant cell wall degrading enzymes are inducible and under both positive and negative transcriptional control. In this review, the authors summarize what is known about these control elements: the known positive elements include XYR1, ACE2, and HAP2/3/5, and the negative elements are ACE1 and CRE1. The authors further propose increased research focus on signal transduction pathways and other potential factors known to influence gene regulation, such as light cycling and intensity. More recently, Steiger et al.267 reported the next generation of T. reesei strains
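The CreA/CRE1 consensus motif quoted above, 5′-SYGGRG-3′, is written in IUPAC degenerate nucleotide codes (S = C or G, Y = C or T, R = A or G). As a minimal sketch of how candidate sites of this kind could be located in a promoter sequence, the Python snippet below expands the motif into a regular expression and scans both strands. The function names and the toy sequences are hypothetical, chosen for illustration only, and are not taken from the cited studies; a match indicates only that the sequence fits the consensus, not that the repressor binds there.

```python
import re

# IUPAC codes needed for 5'-SYGGRG-3' (S = C/G, Y = C/T, R = A/G).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "S": "[CG]", "Y": "[CT]", "R": "[AG]"}


def motif_to_regex(motif):
    """Expand a degenerate motif into a regular-expression string."""
    return "".join(IUPAC[base] for base in motif)


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq, motif="SYGGRG"):
    """List (position, strand, site) for every motif match on either strand.

    Positions are 0-based and refer to the forward strand; the reported
    site is written 5'->3' on the strand where it was found.
    """
    seq = seq.upper()
    # A lookahead keeps overlapping matches from being skipped.
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(seq)]
    n, k = len(seq), len(motif)
    for m in pattern.finditer(reverse_complement(seq)):
        # Convert the reverse-strand offset to forward-strand coordinates.
        hits.append((n - m.start() - k, "-", m.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real T. reesei promoter.
    for hit in find_sites("AACTGGGGTT"):
        print(hit)
```

For the toy input `AACTGGGGTT`, the scan reports a single forward-strand site, CTGGGG, starting at position 2.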

nitrosoguanidine, which, after suitable performance selection on cellulose, resulted in the NG T. reesei mutant series. Strain NG14 showed approximately 5 times the filter paper activity, and twice the activities of cellobiase (or β-glucosidase) and EG of QM9414.241 NG14 also produced about twice as much cellulase protein as QM9414, or about 1.4 mg/mL. Montenecourt and Eveleigh used T. reesei NG14 to generate the next series of mutants by chemical mutagenesis, including the C and E series. From this series, the Rutgers C30 and E58 are of interest as they are carbon catabolite derepressed strains. Both of these strains showed about 4−6 times the cellobiase (or β-glucosidase) activity of QM6a, although the filter paper activity was comparable. The RUT-C30 strain was superior to all other strains reported at the time, producing about 2.2 g/L cellulase protein,240 a trait later attributed to an increased endoplasmic reticulum content.242 Application of RUT-C30 to SSF further enhanced its performance on cellulose and biomass.243 Another hypercellulolytic strain produced during the late 1970s at Rutgers was RL-P37.240 RL-P37 is thought to be the parent strain further developed by the U.S. cellulase industry and used for large-scale production today. During this time, reports of a Cetus Corporation proprietary strain, known as L27, a regulatory mutant of QM9414, revealed that considerable effort was being applied to improving T. reesei by the industrial sector.244 Besides the cellulase improvement programs discussed above at Natick and Rutgers, industrial sponsorship of this work was also ongoing in the 1980s−1990s in Finland (VTT), the United States (Cetus), Japan (Kyowa), and France (CAYLA).233 Smaller programs were also underway in Portugal, Czechoslovakia, and India. A report from Portnoy (2011) suggested that the only other T. reesei strain roughly comparable to RUT-C30, CAYLA’s CL-847, was in fact derived from RUT-C30.245

4.2. New Understanding from the Molecular Era: Cloning in T. reesei and S. cerevisiae

4.2.1. Cloning and Protein Production in T. reesei. In 1987, Penttilä et al. reported the successful transformation of T. reesei using a plasmid carrying the dominant selectable marker amdS.246 Heterologous DNA was found to be integrated at several different locations in the T. reesei genome, often in multiple tandem copies. The successful expression of an Escherichia coli β-galactosidase in T. reesei was also reported by these workers.246 This same year, the same group further reported the characterization of the major genes coding cellulases from T. reesei.247 The group from VTT reported the first review of homologous and heterologous protein expression in T. reesei, citing both promoter changes and gene inactivation as critical for successful outcomes.248 The strong, inducible promoter for the cbh1 gene is highly recommended in this review. In the same year, Harkki et al.249 reported the first example of genetic engineering of T. reesei for the purpose of varying the production of key cellulases. These workers produced a strain of T. reesei in which the cbh1 gene was inactivated and the gene coding the major EG, egl1, was cloned into a vector carrying the CBH I promoter and terminator, which resulted in the overexpression of EGI. The following year, 1992, progress to date and outlook for cloning heterologous genes in T. reesei were discussed in a landmark review from VTT.250 The ability to utilize a CBH I deletion strain of T. reesei was demonstrated the following year by the publication of perhaps the first paper reporting the engineering of this enzyme. Srisodsuk et al. showed that the linker of the three domain T.


of direct microbial conversion, also known as consolidated bioprocessing, inspired the early work in T. reesei molecular biology. The molecular cloning era of T. reesei cellulases began with simultaneous reports from the U.S. and Europe in the October 1983 issue of Bio/Technology that the T. reesei gene, cbh1, had been successfully cloned in the heterologous host, E. coli, using lambda phage technology.281 The work from North America was reported by the research team from Cetus Corporation,281 and the work from Europe was reported by VTT.282 At this time, it was the stated goal of Cetus Corporation to transform S. cerevisiae with the four dominant cellulase genes from T. reesei in order to engineer a cellulose digesting, ethanol producing organism. To demonstrate the level of international competition ongoing at this time, we note that four years later workers at VTT reported several landmark accomplishments relevant to this same objective. First, Penttilä et al. reported the successful cloning of active EG I and EG III in S. cerevisiae using the yeast phosphoglycerate kinase promoter.283 At this time, these workers noted that the molecular weight of EG III was higher than expected and proposed potential problems with hyperglycosylation in yeast. Importantly, in 1988, Penttilä et al. reported the expression of CBH I and CBH II in S. cerevisiae.284 However, these heterologous enzymes were found to be highly polydisperse in molecular weight, to bind poorly to crystalline cellulose, and to be active only on amorphous cellulose. Pilot scale (200 L) production of T. reesei CBH II in S. cerevisiae was reported by VTT in 1990.285 Cetus Corporation discontinued its work on cellulases in the mid-1980s and was sold to Chiron Corporation in 1991. In 1993, this group reported the coexpression of CBH II and EG I in S. cerevisiae and showed that both enzymes were active on crystalline cellulose and acted synergistically.286 The group at VTT continued to express T. reesei genes in S. cerevisiae well into the next decade, for example, including expression of the family 5 glycoside hydrolase (GH5) mannanase gene, man1.287 The next leap in understanding of heterologous gene processing in S. cerevisiae resulted from studies of the unfolded protein response.288 Many years later, work to express T. reesei plant cell wall degrading enzymes continues worldwide, and VTT has remained involved in this field. In 2011, a notable report described a systematic study to express, at higher titers than previously reported, active CBH I and II in S. cerevisiae using an engineered chimera approach.289 These authors were able to report the fermentation of MCC to ethanol by S. cerevisiae strains expressing CBHs with the addition of β-glucosidase. This concept is, of course, the direct microbial conversion or consolidated bioprocessing process. The team authoring this work included authors from VTT, Mascoma Corporation, and the University of Stellenbosch, South Africa. In the year of this review, Voutilainen et al. have reported the expression of active, thermally stable chimeric CBHs in S. cerevisiae.290

developed for engineering, namely strains that are susceptible to homologous gene integration and employ reusable bidirectionally selectable markers.

4.2.2. RUT-C30 Revealed. In the modern era of systems biology, the critical mutations conferring superior performance to RUT-C30 and its descendants have been elucidated.268 In 2009, a massively parallel sequencing effort was used to compare the genomes of T. reesei RUT-C30 and its direct ancestor NG14 with the published genome of the wild-type organism, QM6a.269 The genomes of RUT-C30 and NG14 were found to be missing over 100 kb of genomic DNA present in QM6a, encompassing 18 large deletions in RUT-C30. These deletions included the cre1 gene truncation identified earlier by Ilmén et al.258 and 15 small deletions or insertions in RUT-C30. As stated above, cre1 regulates catabolite repression in most cellulolytic fungi and is responsible for the strong inhibitory effect glucose has on cellulase production.258,270 For wild-type Trichoderma, the “cellulase signaling cascade” is initiated by natural environmental chemical inducers, such as cellobiose. However, large-scale enzyme production is usually conducted with more powerful inducers, such as sophorose and lactose. We note that a process has recently been reported which converts glucose rich, biomass derived monosaccharide mixtures to sophorose and other products using enzymatic transglycosylation,271 providing a more cost-effective source of this inducer. Beyond the mutations to cre1, 211 single nucleotide variants in RUT-C30 were found by Ilmén et al.258 and Seidl et al.270 These mutations and deletions affected 43 genes in NG14 and in RUT-C30.269 One of these genes in RUT-C30, gls2α, encodes the glucosidase II α-subunit,272 an enzyme important for trimming N-linked oligosaccharides in glycoproteins, such as cellulases. Indeed, detailed biochemical analyses of Cel7A purified from RUT-C30 broth show evidence that this enzyme is not glycosylated normally.273 As noted by Peterson and Nevalainen,274 several of the industrial strains derived from QM6a were selected for resistance to the chemical mutagen, nitrosoguanidine, which is now known to be a glycosylation inhibitor.275 In contrast to QM6a, RUT-C30 can be grown on glucose for cellulase production, and improvements in growing and inducing RUT-C30 have led to reports of its cellulase productivity as high as 30 g/L. Proprietary strains of T. reesei used in industry, based on RUT-C30 and its descendants, have led to strains delivering more than 100 g/L cellulase protein.229 In spite of this outstanding industrial record, RUT-C30 remains a marginal producer of heterologous proteins. Examples of such expression results include the following, in order of highest production: Hormoconis resinae glucoamylase P (0.5 g/L),276 Melanocarpus albomyces laccase (0.23 g/L),277 Acrophialophora nainiana XynVI (0.17 g/L),278 and CBH I-Fab fusion antibody (0.15 g/L).279 All other examples of protein expression from RUT-C30 reported are below these levels. However, T. reesei strain ALKO3620 (produced from QM9414 by multiple rounds of mutation) was reported to produce 1.9 g/L of Nonomuraea flexuosa Xyn11A,280 which may make this case the highest publicly available heterologous protein expression example in a mutant T. reesei strain. Peterson and Nevalainen274 suggest that the causes of poor heterologous protein expression lie in the incompatibilities of foreign peptides with natural mechanisms of protein recognition and disposal within the cell, known collectively as the unfolded protein response.

4.2.3. Cloning T. reesei Genes in S. cerevisiae: The Road to Consolidated Bioprocessing. In many ways, the concept

4.3. T. reesei Cellulases: Understanding the Mechanisms of Action

4.3.1. T. reesei as an Early Model for Cellulase Action. Likely because of the significant resource availability for T. reesei at the time, this fungus soon became the archetype microorganism used to study cellulase digestion at the level of the enzymes it produces, instead of the World War II era focus on microbial action and effects. In 1950, Reese and co-workers reported that although many fungi could hydrolyze derivatized celluloses, only a few could grow on crystalline cellulose, such as cotton.291 At this time, the molecular nature of the cellulase


concept was not known. We note that, during the war years, only crude salt and metal ion mediated solvent precipitation of proteins could be applied to protein separations. These methods, developed for fractionating blood plasma proteins for the war effort, were entirely unsuitable for purification of microbial enzymes. Postwar developments in protein biochemistry, which came into play immediately after the war, helped the nascent cellulase biochemistry programs. Perhaps the most significant of these developments was the work in the 1950s on enhanced protein separation methodology in Uppsala, Sweden. Work by Pederson in Uppsala during this era led to the development of modern size exclusion chromatography with early columns being packed with simple agarose gel spheres.292 With the ability to separate active, secreted enzymes, modern cellulase biochemistry was born. Simple, atmospheric pressure column chromatographic techniques were used in the early and mid-1960s to purify cellulases for study.293−295 As this was before the time of widespread use of electrophoretic gels, analytical ultracentrifugation was the primary tool for characterizing the molecular weights and degree of polydispersity of proteins. Using these new tools for biochemistry, Mandels and Reese293 suggested a protein level concept for cellulase action, called “C1−Cx” (Figure 8). Enabled by this newly acquired ability to study the

Figure 9. This cartoon represents the classical endo/exo model of cellulase enzyme action in T. reesei and many other cellulolytic fungi. Here, the dominant EG, EG I, or Cel7B acts only on amorphous cellulose and functions as the Cx activity described by Reese decades before.

the available free cellulose ends, with β-glucosidases converting cellobiose to glucose. Although a second exoglucanase from T. reesei had been reported for some time, CBH I (Cel7A) and CBH II (Cel6A) were shown to be functionally distinct by van Tilbeurgh and co-workers301 and to correspond to distinct cellulose digestion morphologies determined microscopically.302 The same year, Wood proposed a model depicting the cellulose stereochemical specificity of these enzymes, called CBH (B′) and CBH (B), a concept not discussed much today. In 1984, van Tilbeurgh and co-workers were first to describe the systematic multistep chromatographic purification of all key cellulases from T. reesei.303 A curious observation from this time was that the optimum synergistic ratio of cellulases purified from T. reesei was CBH I:EG I (1:1) and CBH II:EG II (95:1).305 For the latter case, these results are consistent with the view that the EG creates new chain ends for the exoglucanase in an almost catalytic role. However, for the GH7 enzymes (CBH I and EG I, later known as Cel7A and Cel7B, respectively), the 1:1 ratio suggests another mechanism. Wood306 proposed that the CBH I/EG I pair functions with close physical coupling so that, immediately following an internal bond cleavage event by EG I, CBH I is able to occupy the newly revealed reducing chain end. This not only permits rapid hydrolysis, but also reduces the possibility that the broken chain can “reanneal” into the crystal surface. We are not aware that more recent work has confirmed or denied this theory; however, given the challenges now known for removing all traces of EG contamination from CBH I preparations, caution is recommended. Initially reported by Ståhlberg et al.304 and more recently by Kurašin and Väljamäe307 is a view of the CBH I mechanism that was not envisioned by the simple exo/endo model. Using cellulose reducing end group analysis following digestion, Ståhlberg et al. suggested that T. reesei produces no true exo-cellulases (Figure 10). The work by Kurašin and Väljamäe used a similar reducing end analysis strategy to show that both TrCel7A and Phanerochaete chrysosporium Cel7D are also able to conduct “endo-initiation” as well as the well-known exo-initiation leading to hydrolysis. It is worth noting that, even with this “extra”

Figure 8. Diagram of first concept for the roles and specificities of enzymes hydrolyzing cellulose, known as the C1−Cx model. Adapted with permission from ref 293. Copyright 1964 Society for Industrial Microbiology and 1999 Springer Science and Business Media.

action of relatively homogeneous enzyme preparations and measure their respective concentrations accurately, they described in this report the possible role of C1 enzymes, required for attack on crystalline cellulose, and the more prevalent Cx enzymes, needed for hydrolysis of soluble, derivatized cellulose such as PASC. Characterization of the hypothetical C1 enzyme lagged behind progress on the Cx enzymes for many years. Putative Cx enzymes were identified as endo-(1−4)-β-glucanases, exo-(1−4)-β-glucanases, and β-glucosidases.296 During the 1960s, it was commonly thought that the C1 and Cx enzymes worked in a synergistic way to hydrolyze crystalline cellulose, although no molecular detail was available. One proposal during this time was that C1 was a protein that decrystallized cellulose by displacing native hydrogen bonds in the microfibril, leading to a more available structure for Cx.297 In 1969, Eriksson298 is credited with first proposing that an exoglucanase could be playing the role of C1, a view supported three years later in a report by Halliwell and co-workers.299 The landmark work reported by Berghem and Pettersson in 1973300 clearly showed that an exo-cellulase, purified from T. viride, was highly active on crystalline cellulose and the best candidate for C1. This enzyme, later classified as “cellulose 1,4-β-cellobiosidase (nonreducing end)” or CBH I, EC 3.2.1.91, is today reclassified as EC 3.2.1.176, “cellulose 1,4-β-cellobiosidase (reducing end)”.151 The picture emerging from work done by the late-1990s is depicted in Figure 9. In this view, EGs attack amorphous cellulose surface regions of the microfibril, revealing new cleavage sites for exoglucanases. Exoglucanases also attack


Figure 10. Concept of endo-initiating CBHs was introduced by Ståhlberg et al. in 1993.304 This new role for both CBH I or Cel7A and CBH II or Cel6A supports somewhat the concept of C1; however, the endo-initiation function may not be as dramatic as first envisioned by Reese.

reactive feature, CBH I by itself cannot hydrolyze more than about 60% of a model cellulose substrate, such as Avicel. Today, cellulase digestion mechanisms are once again viewed in a sense related to the original concepts from Reese, i.e., C1 and Cx. However, some inconsistencies are apparent given the classical view shown in Figure 9. Reese suggested that C1 was a decrystallizing protein factor, whose function may not be bond-breaking (at least, not hydrolytic), but which instead possesses an ability to swell or disrupt cellulose, perhaps by intercalating between cellulose chains in the elementary microfibril, the consequence of which is action by other enzymes causing complete digestion to cellobiose. The view shown in Figure 9 actually suggests that Cx is not a single class of enzymes, but the combined action of EGs and exoglucanases that work together to expose the necessary ends of cellulose chains needed to yield simple sugars. Neither the EGs nor the exoglucanases from fungi cause global decrystallization of cellulose. This hydrolytic enzyme system functions in an ablative manner, peeling one layer at a time. The essential dilemma here is that no single protein is known to dramatically reduce cellulose crystallinity. Some proteins may cause some local cellulose disruption and morphological changes, for example, the expansins308 and expansin-like proteins (swollenins),309 but their reported effects on subsequent hydrolytic action are modest.310 So, are there protein factors that truly reflect the notions of Reese four decades ago? The relatively recent work to define the action of LPMOs has again posed some new potential answers to this question. LPMOs were originally classified as fungal GH61 enzymes and nonfungal members of family 33 CBM, but have been reclassified,152 and are now implicated in the oxidative cleavage of cellulose and other plant polymers. As reviewed extensively in section 11, LPMOs are known to act in two reaction schemes on crystalline cellulose, generating oxidized and nonoxidized chain ends (Figure 11). Some LPMOs have been shown to oxidize glucose at position C1, releasing lactones that are hydrolyzed to aldonic acids,311 whereas other enzymes act on the nonreducing end, producing ketoaldoses, or a combination thereof.312 The copper oxidases seem to attack the highly crystalline regions of cellulose in contrast to hydrolases. In this regard, LPMOs may act like Reese’s C1 enzymes.313

4.3.2. GHs and Related Enzymes. Today, the secretome of T. reesei is fairly well-understood, thanks to decades of traditional biochemistry and the recent assembling of much of the genome, i.e., 34 Mbp of nearly contiguous sequence comprising 9129 predicted genes.41 T. reesei QM6a is known to produce at least 193 GHs, 93 glycosyl transferases, 5 polysaccharide lyases, 17 carbohydrate esterases, and 41 CBMs.314 It has been pointed out that the outstanding performance of T. reesei on biomass is somewhat surprising,

Figure 11. Current view of cellulose degradation by many filamentous fungi, combining both hydrolytic and oxidative fragmentation reactions. Note that the action of LPMO may provide a new chain end for CBH I (or Cel7A), even though it is oxidized.

considering that other fungi have a greater diversity of critical cellulase components. For example, T. virens, Aspergillus nidulans, and Postia placenta have 256, 251, and 248 GHs, respectively.268 The CAZy database lists two CBHs, six EGs, two LPMOs, and seven β-D-glucosidases for QM6a. Furthermore, as illustrated by Martinez and co-workers,41 T. reesei has an extremely limited set of the enzymes needed to degrade plant cell walls (accessory enzymes, hemicellulases, acetyl esterases) and especially living walls (pectinases). Seiboth and co-workers268 concluded that T. reesei’s success in the biosphere must stem from its efficient cellulase induction system and extremely high cellulase production and secretion capability. Some caveats to this cocktail paradigm deserve attention. First, it has been recently suggested193 that some cellulases should be categorized as “processive EGs” for structural and functional reasons. Structurally, some cellulases have binding sites that are intermediate between the closed tunnels of CBHs and the open clefts of EGs. Functionally, some enzymes have intermediate levels of processivity, depending on the definition of processivity used.315 Also, as discussed above, for many years, enzymes now referred to as “cellobiohydrolases” were called “exoglucanases” because of their assumed tendency to begin their processive runs at a cellulose chain end (thus, “exo”). However, the term exoglucanase is now essentially obsolete as a class of GH given the early and now more recently confirmed revelation that GH7 CBHs can perform hydrolysis in an endo-initiation fashion.304,307 In addition, the more recently discovered class of enzymes known as LPMOs (formerly GH61 or family 33 CBM (CBM33)) have been recognized as a vital part of an efficient cellulase cocktail. These enzymes are not GHs, even though they were once listed in GH or CBM families. Given these nuanced (and at times misleading) enzymatic classifications, it may be more helpful and appropriate to think in terms of “modes of action” rather than types of cellulases. As suggested above, T. reesei secretes a multiplicity of the key cellulose degrading enzymes, especially the β-glucosidases, EGs, and CBHs. The enzymes shown in Table 2 are those cited


Table 2. T. reesei QM9414 and QM6a Cellulases Reported in CAZy151−153

| classical name | CAZy name | common name | product configuration | reported by | substrate specificity | ref |
| --- | --- | --- | --- | --- | --- | --- |
| Bgl2, EC 3.2.1.21 | GH1 Cel1A | β-glucosidase 2 | retaining | Takashima et al., 1999 (ref 317) | p-nitrophenyl-β-D-glucoside, p-nitrophenyl-β-D-cellobioside (pNPC), methylumbelliferyl-β-D-glucoside, 5-bromo-4-chloro-3-indolyl-β-D-glucoside | Takashima et al., 1999 (ref 317); Saloheimo et al., 2002 (ref 318) |
| EC 3.2.1.21 | GH1 Cel1B | β-glucosidase | retaining | Foreman et al., 2003 (ref 319) | predicted β-glucosidase activity | Foreman et al., 2003 (ref 319) |
| Bgl1, EC 3.2.1.21 | GH3 Cel3A | β-glucosidase 1 | retaining | Mach, 1993 (ref 320) | Glc2/Glc3/Glc4/Glc5/Glc6, gentiobiose, laminaribiose, laminaritriose, sophorose, 2-chloro-4-nitrophenyl-β-D-glucopyranoside, p-nitrophenyl-β-D-glucopyranoside, CMC, laminarin, β-glucan | Korotkova et al., 2009 (ref 321); Karkehabadi et al., 2014 (ref 322) |
| EC 3.2.1.21 | GH3 Cel3B | β-glucosidase | retaining | Foreman et al., 2003 (ref 319) | predicted β-glucosidase activity | Foreman et al., 2003 (ref 319) |
| EC 3.2.1.21 | GH3 Cel3C | β-glucosidase | retaining | Foreman et al., 2003 (ref 319) | predicted β-glucosidase activity | Foreman et al., 2003 (ref 319) |
| EC 3.2.1.21 | GH3 Cel3D | β-glucosidase | retaining | Foreman et al., 2003 (ref 319) | predicted β-glucosidase activity | Foreman et al., 2003 (ref 319) |
| EC 3.2.1.21 | GH3 Cel3E | β-glucosidase | retaining | Foreman et al., 2003 (ref 319) | predicted β-glucosidase activity | Foreman et al., 2003 (ref 319) |
| EGII (formerly EGIII), EC 3.2.1.4 | GH5 Cel5A | endoglucanase II | retaining | Saloheimo et al., 1988 (ref 323) | CMC-Na, Avicel, ball-milled cellulose, PASC | Qin et al., 2008 (ref 324) |
| EC 3.2.1.4 | GH5 Cel5B | endoglucanase | retaining | Foreman et al., 2003 (ref 319) | predicted endoglucanase activity | Foreman et al., 2003 (ref 319) |
| CBHII, EC 3.2.1.91 | GH6 Cel6A | cellobiohydrolase II | inverting | Teeri et al., 1987 (ref 325) | Avicel, CMC, Glc3/Glc4/Glc5/Glc6, PASC | Poidevin et al., 2013 (ref 326); Ståhlberg et al., 1993 (ref 304) |
| CBHI, EC 3.2.1.176, EC 3.2.1.91 | GH7 Cel7A | cellobiohydrolase I | retaining | Shoemaker et al., 1983 (ref 281) | 4-methylumbelliferyl-β-D-lactoside, 2-chloro-4-nitrophenol-β-D-lactoside, 3,4-dinitrophenyl-β-D-cellobioside, 3,4-dinitrophenyl-β-D-lactoside, BMCC | Boer and Koivula, 2003 (ref 327); Becker et al., 2001 (ref 328) |
| EGI, EC 3.2.1.4 | GH7 Cel7B | endoglucanase I | retaining | Penttilä et al., 1986 (ref 329) | Glc3/Glc4/Glc5/Glc6, pNPC, pNPL, PASC, Avicel, BC, pretreated corn stover, CMC, xyloglucan, xylan, arabinoxylan, mannan, galactomannan, barley β-glucan, hydroxyethylcellulose | Van Arsdell et al., 1987 (ref 330); Biely et al., 1991 (ref 331); Bailey et al., 1999 (ref 286); Vlasenko et al., 2010 (ref 332) |
| EGIII, EC 3.2.1.151, EC 3.2.1.4 | GH12 Cel12A | endoglucanase III | retaining | Fowler et al., 2001 (ref 333) | CMC, PASC, Avicel, Glc4/Glc5, barley β-glucan, glucomannan, filter paper | Karlsson et al., 2002 (ref 334) |
| EGV, EC 3.2.1.4 | GH45 Cel45A | endoglucanase V | inverting | Saloheimo et al., 1994 (ref 335) | CMC, PASC, Avicel, Glc3/Glc4/Glc5, barley β-glucan, glucomannan, filter paper | Karlsson et al., 2002 (ref 334) |
| Egl6, EC 3.2.1.151 | GH74 Cel74A | endoglucanase and xyloglucanase | inverting | Foreman et al., 2003 (ref 319) | xyloglucan, hydroxyethylcellulose | Benkő et al., 2008 (ref 336) |
| EG7 | AA9 (formerly GH61) Cel61B | LPMO | oxidative | Foreman et al., 2003 (ref 319) | cellulose | Karkehabadi et al., 2008 (ref 337) |

Chemical Reviews Review

DOI: 10.1021/cr500351c Chem. Rev. 2015, 115, 1308−1448


cleaved into small, glycosylated domains at the C- and N-termini of Cel7A and Cel6A, respectively, both of which are bound with high affinity to crystalline cellulose. Both “core” domains isolated from the two cellulases retained activity against small molecule substrates. This study,344 along with similar work the same year on two bacterial cellulases from Cellulomonas fimi,345 solidified the concept that cellulases can exhibit multimodular structures with CBDs. Upon the discovery of additional carbohydrate-binding ligands beyond cellulose, the term CBD was replaced with the broader concept of CBMs.156 Kraulis et al. solved the first structure of a family 1 CBM using NMR spectroscopy in 1989, which is shown in Figure 12.346

currently in CAZy. Some families, for example GHs 6 and 7, contain enzymes with vastly different and mechanistically synergistic activities (as described in sections 6 and 7). Other families, GHs 1 and 3, contain many enzymes with essentially the same activity (cleavage of the glycosidic bond in cellobiose); upon closer inspection, however, some of these enzymes have subtle differences in substrate specificity. Family GH7 is somewhat enigmatic in that it contains both a CBH (Cel7A) and an EG (Cel7B) based on the same protein fold (folded β sheet sandwich). In this case, the EG homologue has substrate tunnel associated peptide loops of shorter length than its CBH counterpart (see section 6). We also note that the EGs found in GH families 5, 7, and 12 act on cellulose to leave the terminal hydroxyl in the retaining configuration. Family GH45 EGs leave the terminal hydroxyl in the inverted configuration. For more comprehensive reviews of the early literature on T. reesei enzymology and strain development, we refer the readers to several major reviews.32,316 The following sections primarily focus on the developments in fungal cellulases following the initial structural determinations of each enzyme family (and CBMs) toward understanding structure−function relationships.

5. CARBOHYDRATE-BINDING MODULES AND LINKERS

Biomass-degrading enzymes work at solid−liquid interfaces, and the concentration of catalytic units at the surface is directly related to the extent of substrate turnover. Thus, many enzymes that work on polysaccharides are multimodular with catalytic function accomplished by a single or multiple CDs coupled to a binding function via one or more CBMs; these two domains are connected together by linker peptides of varying length and structure. To date, 69 distinct families of CBMs have been discovered and characterized according to the CAZy database (family 33 CBMs have been reclassified as oxidative enzymes, as discussed below).151−153,156 As many biomass-degrading fungi commonly employ family 1 CBMs for plant cell wall degradation, we review the history and developments in structure−function studies primarily of this particular family of CBMs in this section. Moreover, CBMs are connected to CDs via linkers, and thus, the roles of these domains are also reviewed. We note that the study of CBMs is vast, and here we primarily focus on family 1 CBMs. We discuss other CBMs mainly when findings are relevant to our collective understanding of general CBM behavior. For more general perspectives on CBMs, readers are referred to several reviews from the past decade.156,338−342

Figure 12. (A) TrCel7A family 1 CBM structure solved by Kraulis et al.346 Several residues of interest are highlighted, including Tyr5, Asn29, Tyr31, Tyr32, and Gln34 on the hydrophilic, flat face of the CBM, and the two disulfide bonds in the protein. The structure forms an irregular, triple-stranded β-sheet core. (B) Sequence alignment of the TrCel7A, TrCel7B, and TrCel6A CBMs. The figure was generated with ESPript (http://espript.ibcp.fr).347

The 36-residue TrCel7A CBM was prepared via solid-state peptide synthesis. The structure revealed an irregular, triple-stranded, antiparallel β-sheet arrangement with amphiphilic character: a large, hydrophilic flat face exhibiting three tyrosine residues and a large number of polar amino acids, and a hydrophobic face on the “wedge” portion of the CBM. The CBM sequence contains four cysteine residues comprising two disulfide bonds, and all three possible combinations were investigated to determine the most likely pairing. The disulfide bonds are also shown in Figure 12. Perhaps the most cited observation from this original work is the presence of three conserved aromatic residues on the flat, hydrophilic face of the CBM (Tyr5, Tyr31, and Tyr32 in the TrCel7A CBM). As discussed below in detail, this flat face is implicated as the binding face to crystalline cellulose. The family 1 CBM structure was followed by the structures of many more CBMs from various families. A paradigm subsequently emerged for three types of CBMs, which were categorized as type A, B, and C CBMs in a seminal review paper from Boraston and co-workers in 2004.156 Type A CBMs, which include family 1, are characterized by flat faces that bind to crystalline cellulose or chitin. Type B CBMs exhibit extended grooves or clefts for binding single sugar chains, such as those found in hemicellulose, pectin, or amorphous regions of polymers such as cellulose or chitin. Lastly, type C CBMs are

5.1. Family 1 Carbohydrate-Binding Modules

That many GHs contain both binding and catalytic function was first reported in a study by Van Tilbeurgh et al.343 Therein, papain was used to proteolytically cleave the TrCel7A enzyme into two primary fragments: a 56 kDa domain that lost catalytic activity on cellulose and retained full activity on small molecule substrates (the CD), and a 10 kDa domain identified as the glycosylated C-terminal domain (the CBM-linker).343 The authors went on to show that the binding affinity of the 56 kDa domain to crystalline cellulose is significantly reduced without the C-terminal domain. On the basis of these results, the authors proposed that TrCel7A is a multimodular protein with a binding and catalytic function.343 Following this important discovery, Tomme et al. further characterized the proteolysis products of TrCel7A and TrCel6A from papain cleavage.344 They similarly demonstrated that both enzymes are


aromatic-containing face of the CBM is likely the binding face to crystalline cellulose.350 Around the same time, Linder and co-workers published a series of comprehensive studies wherein the TrCel7A CBM was investigated in great detail with 6 alanine mutant CBMs produced via solid-state peptide synthesis and examined with binding affinity measurements and 2-D NMR spectroscopy, including Y5A, P16R, N29A, Y31A, Y32A, and Q34A (Table 3).352 The individual point mutants of Tyr5 and Tyr32 both

characterized as those that bind mono-, di-, and trisaccharides. In 2004, three functions were attributed to CBMs: substrate targeting, enhancements to enzyme−substrate proximity, and disruption of the substrate.156 Boraston and co-workers published a recent revision of the A−B−C CBM paradigm wherein type B CBMs are classified as those that bind in endo-mode and type C CBMs as those that bind in exo-mode.338 Type A CBMs remain classified as previously. The publication of the landmark structural study from Kraulis et al.346 enabled a very large body of work investigating the function of family 1 CBMs.348−364 With the knowledge of a multidomain CBH structure, Ståhlberg and co-workers proposed an initial model for CBH action in 1991 using TrCel7A as a model enzyme, which formed much of the basis for our collective model of CBH action.348 By examining the binding capacity of the intact TrCel7A, the CD alone, and the CBM-linker alone, the authors proposed that the CBM is responsible for targeting the intact enzyme to crystalline regions, thereby increasing the concentration of the CD at the cellulose surface. The CBM was also proposed to enable two-dimensional diffusion of the CBH enzyme on the cellulose surface to enable efficient catalytic performance.348 Two-dimensional diffusion of a type A family 2 CBM from C. fimi was later measured using fluorescence recovery after photobleaching experiments, with an estimated diffusion rate of 2 × 10⁻¹¹ to 1.2 × 10⁻¹⁰ cm²/s. To our knowledge, similar experiments have not yet been conducted for family 1 CBMs. Reinikainen et al.
expressed several variants of the full-length TrCel7A in yeast, focusing on mutations on the flat face (Y492A, Y492D, and Y492H; this is Tyr31 in the CBM-only sequence numbering) and on the hydrophobic wedge face (P477R).349 As will be discussed in more detail in the GH7 section below (section 6), this was one of the first studies to note that heterologous expression of fungal cellulases leads to lower activity than enzymes expressed natively, in this case by more than a factor of 2 in activity against crystalline cellulose. This observation was attributed to a higher extent of glycosylation from yeast expression as observed by a large heterogeneous population of enzymes on a gel. Within the heterologously expressed enzymes, the authors observed that wild-type Cel7A and the Y492H mutant exhibit roughly equivalent activity, whereas Y492A and Y492D result in a significant decrease in activity against crystalline cellulose accompanied by a concomitant decrease in binding affinity, as measured at low temperature. The P477R mutation also resulted in similarly low activity and binding affinity as Y492A; however, as the authors note, this could be due to structural changes to the CBM given the removal of a proline residue. This was not clearly resolved until a later study from Reinikainen et al., wherein Y492A, Y492H, and P477R were all produced in the native organism with the native cbh1 gene knocked out.350 Interestingly, the P477R mutant bound similarly to the wild-type Cel7A enzyme, contrary to the same enzyme produced heterologously. Both the Tyr492 mutants bound to a lower extent than the wild-type enzyme, by approximately 60%. At a concentration of 1 M MgSO4, all mutants performed similarly to the wild-type enzyme on crystalline cellulose, which the authors state suggests that the hydrophobic effect plays a significant role in CBM binding to the substrate. 
Given the similarity of the P477R results to the wild-type enzyme and the reduction in affinity and activity with mutations to Tyr492, this led the authors to suggest that the flat,

Table 3. Free Energy of Binding for TrCel7A CBM Mutants Relative to Wild-Type352

CBMs compared(a)                     ΔΔG (kJ/mol)
Cel7A wild-type → Cel7A P16R         1.2
Cel7A wild-type → Cel7A N29A         2.4
Cel7A wild-type → Cel7A Q34A         4.9
Cel7A wild-type → Cel7A Y31A         7.3

(a) Values for Y5A and Y32A, which lost their affinity to cellulose, were not determined.
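To give a sense of scale for the ΔΔG values in Table 3, a relative binding free energy can be converted into a ratio of binding constants through ΔΔG = −RT ln(K_mutant/K_wild-type). The short Python sketch below performs this conversion. The sign convention (positive ΔΔG meaning weaker binding than wild-type) and the temperature of 298.15 K are assumptions made for illustration, since neither is stated alongside the table, and the function name is hypothetical.

```python
import math

R = 8.314e-3  # gas constant in kJ/(mol*K)


def affinity_ratio(ddg_kj_per_mol, temperature_k):
    """Return K_mutant / K_wild-type for a relative binding free energy.

    Assumes ddG = -R*T*ln(K_mutant / K_wild-type), so that a positive
    ddG corresponds to weaker binding than wild-type.
    """
    return math.exp(-ddg_kj_per_mol / (R * temperature_k))


# ddG values from Table 3 (kJ/mol, mutant relative to wild-type).
table3 = {"P16R": 1.2, "N29A": 2.4, "Q34A": 4.9, "Y31A": 7.3}

# Assumed temperature; not specified with the table.
T_ASSUMED = 298.15  # K

for mutant, ddg in table3.items():
    ratio = affinity_ratio(ddg, T_ASSUMED)
    print(f"{mutant}: K_mut/K_wt = {ratio:.3f} (about {1 / ratio:.1f}-fold weaker)")
```

Under these assumptions, the 7.3 kJ/mol value for Y31A corresponds to a binding constant roughly 19-fold lower than wild-type, whereas the 1.2 kJ/mol value for P16R is less than a 2-fold change.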

completely lost affinity to cellulose. P16R was the least-affected mutant, similar to results from Reinikainen et al.,349,350 wherein this amino acid position had the least effect on Cel7A activity. The 2-D NMR spectroscopy suggests that the Y5A mutation affects the overall compactness of the structure, which was later verified by Mattinen et al. by solving NMR structures of this CBM.356 In all other cases, the 2-D NMR spectra suggest only minimal changes to the CBM structures upon single-point mutations, which was later verified structurally for the Y31A and Y32A CBMs.356 Additional NMR spectroscopy of the same CBMs was conducted with cellohexaose present in solution by Mattinen et al.357 From this study, the authors suggested that the flat putative binding face of the family 1 CBM aligns in parallel with cellohexaose, thus potentially serving as a model for how the CBM binds to crystalline cellulose.357 An examination of the differences between the CBMs from TrCel7A and TrCel7B closely followed these initial family 1 CBM structure−function studies.351,358,359 When comparing the CBM binding affinity alone, it was observed that the CBMs from these two enzymes exhibit significantly different binding affinities, with the Cel7B CBM possessing a greater affinity to crystalline cellulose. Starting with the Cel7A CBM sequence, the authors made multiple mutations, and found that the Y5W mutation alone can explain much of the differences in binding affinity between the two CBMs351 (Table 4).

Table 4. Relative Free Energies of Binding for the CBMs of TrCel7A, TrCel7A Y5W, and TrCel7B351

CBMs compared                             ΔΔG (kJ/mol)
Cel7A wild-type → Cel7A Y5W               −1.1
Cel7A Y5W → Cel7B wild-type               −1.4
Cel7A wild-type → Cel7B wild-type         −2.4

A subsequent study from Srisodsuk et al. compared the binding affinity of a hybrid Cel7A CD and linker with the Cel7B CBM, which demonstrated slightly higher activity on BMCC.358 This study was closely followed by the solution structure of the Cel7B CBM.359 The structure is quite similar to the TrCel7A structure with the primary exceptions of an additional disulfide bond between Cys2 and Cys18, and a tryptophan residue at the 7-position in place of Tyr5. The alignment of aromatic residues on


the flat, hydrophilic surface of the CBM is quite similar. Srisodsuk et al. also noted that the Cel7B-CBM-containing chimeric enzyme with the Cel7A CD exhibited higher activity.358 In a similar vein, Takashima et al. later conducted an exhaustive study of CBM binding affinity correlated to GH7 CBH activity using the H. grisea GH7 CBH (Figure 13).365 Therein, they demonstrated that

Figure 13. Correlation between the relative affinity constant (x-axis) and activity on Avicel (y-axis) of each CBM mutant in the study. Reprinted with permission from ref 365. Copyright 2007 Elsevier.

Figure 14. T. reesei CBM binding temperature dependence: 3 h at 4 °C (●), 18 h at 4 °C (○), 3 h at 22 °C (×), 18 h at 22 °C (■), 43 h at 22 °C (▲), 3 h at 50 °C (□), and 18 h at 50 °C (Δ). ▼ corresponds to CBHI CBD, and ▽ to CBHII CBD, both at 18 h at 22 °C. Reprinted with permission from ref 354. Copyright 1996 American Society for Biochemistry and Molecular Biology.

tryptophan residues in the “outer” positions on the CBM (corresponding to Tyr5 and Tyr31 in the TrCel7A CBM) yielded the highest binding affinity measured for the H. grisea Cel7A CBM, and this correlated to the highest measured activity of the full-length enzyme. The authors also found a nearly linear correlation between activity on Avicel and CBM binding affinity, highlighting the key relationship between CBMs and cellulolytic performance.365 Linder and Teeri also examined both the binding and reversibility of the TrCel7A CBM on crystalline cellulose using a sensitive tritium-labeling approach combined with dilution experiments.353 Therein, they produced the CBM in E. coli as a chimera with the TrCel6A CBM-TrCel7A linker-TrCel7A CBM construct, described in detail in another study.354 The resulting protein was cleaved to isolate the Cel7A CBM and included 11 residues from the linker domain. The authors demonstrated that the binding of the CBM was temperature dependent, with higher binding at lower temperatures (the authors tested 4, 22, and 30 °C) (Figure 14). More importantly, they demonstrated that the CBM binding was completely reversible, addressing a question in the literature regarding if CBMs could desorb from cellulose. The authors conclude that the rate of adsorption and desorption of CBMs and the binding affinity to cellulose should be optimally balanced to maximize cellulase activity and minimize nonproductive binding.353 Interestingly, however, a subsequent study from Carrard and Linder demonstrated that the TrCel6A CBM could not be desorbed from cellulose over a temperature range from 4 to 50 °C, even 8 days after dilution, in stark contrast to the similar TrCel7A CBM.360 To ascertain the differences, the authors examined two key differences between the two CBMs: namely, Tyr5 in the Cel7A CBM is a tryptophan in the Cel6A CBM, and the latter has an extra disulfide bond. The W7Y mutation in the

Cel6A CBM led to a dramatic loss in binding affinity relative to the wild-type Cel7A CBM. The removal of the third disulfide bond in the Cel6A CBM resulted in a decrease in binding affinity and off-rate, although not as drastic as the W7Y mutation, suggesting that the Cel6A CBM rigidity contributes to the CBM-cellulose interaction and that the presence of the tryptophan at the 7-position significantly affects the binding affinity.360 In a separate study, the same authors demonstrated that the native TrCel7A CBM binding affinity was not significantly affected by pH.361 More recently, Guo and Catchmark conducted a thorough study using isothermal titration calorimetry and adsorption isotherms to compare the binding characteristics of TrCel7A and TrCel6A CBMs expressed in E. coli on crystalline cellulose, Avicel, PASC, and small cellodextrin chains.364 To enable quantitative comparison of binding affinity across various cellulose substrates, the authors first developed a methodology to quantify the available surface area for CBM binding based on nitrogen adsorption and static light scattering. Similar to previous conclusions from Carrard and Linder,360 the authors demonstrated that the binding affinity of the TrCel6A CBM is significantly higher, specifically by an order of magnitude relative to the TrCel7A CBM (10⁶ M⁻¹ versus 10⁵ M⁻¹, respectively).364 The application of isothermal titration calorimetry enabled the delineation between enthalpic and entropic contributions to binding affinity for the first time for family 1 CBMs, which demonstrated that the binding affinity to the cellulose surface was enthalpically favorable and entropically unfavorable. This finding is in direct contrast to an influential, earlier study


imaging work with intact TrCel7A also suggested that the whole enzyme binds to and attacks cellulose on the hydrophobic surface, likely mediated by the CBM.367 More recently, Sugimoto et al. examined the family 1 CBM binding behavior to cellulose in more quantitative detail.363 The authors fused a fluorescent protein with the TrCel7A CBM and measured binding isotherms to various crystalline and amorphous cellulose substrates. The authors fit their binding isotherms to four binding models, and found that the Hill binding model provided the best fit for CBM-binding to cellulose, with the Langmuir isotherm model also providing a reasonable fit (correlation coefficients of 0.9995 and 0.9915, respectively) at 5 °C on Cladophora cellulose. The authors attribute the goodness-of-fit of the Hill model to the explicit treatment of a steric exclusion effect induced by surface crowding of the CBM-fluorescent protein complex and infer that the length and flexibility of the linker domain will dictate the size of the exclusion area.363 To date, nearly all studies of family 1 CBMs have either used solid-state peptide synthesis or E. coli expression, neither of which impart glycosylation to the protein. However, detailed mass spectrometry studies from Harrison et al. in 1998 showed that the TrCel7A family 1 CBM exhibits glycosylation at the Thr1 and Ser3 positions, as illustrated in Figure 16.368 Mass

wherein isothermal titration calorimetry experiments on a family 2 CBM (also a type A CBM) suggested that type A CBM binding is entropically driven.366 Interestingly, Guo and Catchmark also examine the binding specificity of both CBMs to NaBH4-treated cellulose microfibrils, which modify cellulose chain reducing ends, finding that the Cel6A CBM exhibits much lower binding affinity.364 They attribute this difference to the fact that the Cel6A CBM may recognize the reducing ends of cellulose chains, whereas significantly less difference in binding was observed for the Cel7A CBM. The authors speculate that the potential preference in the Cel6A CBM binding may lead the intact enzyme to bind near reducing ends of cellulose chains.364 In 2003, the binding-face specificity of two families of type A CBMs, including a large library of family 1 CBMs and a family 3 CBM from C. thermocellum, to cellulose was reported.362 Lehtiö et al. fused CBMs of interest to a modified staphylococcal protein A, which was then coupled to an immuno-gold label. By binding the CBMs to very large cellulose Iα microfibrils from Valonia, the authors could then visualize the specific cellulose binding face for the CBMs directly with TEM and diffraction measurements. It was shown that in all cases the CBMs bound to the 110 face of cellulose Iα (Figure 15), which is the

Figure 15. Crystal structure faces of the cellulose Iα polymorph. The circle indicates the 110 face, exposed in worn crystals, which is the CBM binding site. Reprinted with permission from ref 362. Copyright 2003 National Academy of Sciences.
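The association constants quoted for the TrCel6A and TrCel7A CBMs (on the order of 10⁶ and 10⁵ M⁻¹, respectively) can be placed on a free energy scale with ΔG° = −RT ln K_a. The following Python sketch is illustrative only: it assumes a 1 M standard state and a temperature of 298.15 K, which is not taken from the cited study, and the function name is hypothetical.

```python
import math

R = 8.314e-3  # gas constant in kJ/(mol*K)


def binding_free_energy(k_a, temperature_k):
    """Standard binding free energy (kJ/mol) from an association constant.

    k_a is in M^-1 and is treated as dimensionless relative to an
    assumed 1 M standard state.
    """
    return -R * temperature_k * math.log(k_a)


# Assumed temperature; not specified in the surrounding text.
T_ASSUMED = 298.15  # K

dg_cel6a_cbm = binding_free_energy(1e6, T_ASSUMED)  # order of magnitude only
dg_cel7a_cbm = binding_free_energy(1e5, T_ASSUMED)  # order of magnitude only

print(f"Cel6A CBM: {dg_cel6a_cbm:.1f} kJ/mol")
print(f"Cel7A CBM: {dg_cel7a_cbm:.1f} kJ/mol")
print(f"Difference: {dg_cel6a_cbm - dg_cel7a_cbm:.1f} kJ/mol")
```

An order-of-magnitude difference in K_a corresponds to RT ln 10, or about 5.7 kJ/mol at the assumed temperature.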

Figure 16. (A) Glycosylated TrCel7A CBM on the hydrophobic surface of cellulose, as studied by Taylor et al.359 (B) LOGO representation372 of a multiple sequence alignment of family 1 CBMs suggests that multiple putative O-glycosylation sites may exist on family 1 CBMs.
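The Langmuir and Hill isotherms that Sugimoto et al. compared differ only by an exponent on the free protein concentration. The Python sketch below writes both in a common textbook form so that the relationship is explicit. The parameter names and numerical values are hypothetical and are not fitted values from that study.

```python
def langmuir(free_conc, b_max, k_d):
    """Langmuir isotherm: bound amount for independent, equivalent sites."""
    return b_max * free_conc / (k_d + free_conc)


def hill(free_conc, b_max, k_d, n):
    """Hill isotherm: Langmuir form with an exponent n on the concentration.

    n = 1 recovers the Langmuir isotherm; n < 1 gives a shallower
    approach to saturation, n > 1 a steeper one.
    """
    return b_max * free_conc**n / (k_d**n + free_conc**n)


# Hypothetical parameters chosen only to illustrate the curve shapes.
B_MAX = 10.0   # maximum bound amount (arbitrary units)
K_D = 0.5      # half-saturation concentration (arbitrary units)
N_HILL = 0.7   # hypothetical Hill coefficient below 1

for conc in (0.1, 0.5, 2.0, 5.0):
    print(
        f"c = {conc:>4}: Langmuir = {langmuir(conc, B_MAX, K_D):.2f}, "
        f"Hill (n = {N_HILL}) = {hill(conc, B_MAX, K_D, N_HILL):.2f}"
    )
```

With n = 1 the Hill expression reduces to the Langmuir isotherm. A coefficient below 1 makes the curve approach saturation more gradually, which is one way that crowding of bound molecules on a surface could appear in a fit.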

hydrophobic face (the hydrophobic face in cellulose Iβ is 100). This finding shed light directly on the type A CBM-cellulose interaction by ascertaining that the protein−ligand binding surface is the flat surface with the glucopyranose rings directly exposed. It should be noted that this type of face only exists in the natural cellulose I polymorphs with multiple chains exhibiting exposed faces; cellulose II and III do not exhibit “flat” faces with multiple parallel chain faces exposed.88−90,93,94 The authors also measured the binding affinity of multiple CBMs and noted that the addition of two tryptophan residues to the TrCel7A CBM does not increase the binding affinity beyond the effect of a single Y-to-W mutation.362 Unsurprisingly, later

spectrometry analysis was not conducted beyond Tyr5. Given the importance of glycosylation in cellulase activity,369 as discussed in more detail below in several GH sections, and the sequence conservation of putative O-glycan sites on family 1 CBMs (Figure 16),370 fungi may employ glycans on CBMs for multiple functions. In terms of cellulose binding affinity, Taylor et al. employed thermodynamic cycle calculations with molecular dynamics (MD) simulations to examine how O-glycans, especially at Ser3 and a conserved site at Ser14, are able to modify the binding affinity over the nonglycosylated


variant.371 The predictive capability of these simulations was first tested by comparing to previous experimental data for amino acid substitutions. The thermodynamic cycle calculations were in quantitative agreement, for example, with the Y5W mutation. From there, the authors predicted that a single mannose at Ser3 could modify the binding affinity by the same order as a tryptophan mutation. Moreover, the simulations predicted that the addition of mannosylation at different sites would modulate the binding affinity by a significantly different magnitude, suggesting that the location and extent of glycosylation have a major impact on the change in properties.371 The predictions that glycosylation affects CBM binding affinities were recently tested experimentally.373 Chen et al. developed a solid-state glycopeptide synthesis approach to rapidly produce single glycoforms of the TrCel7A CBM. Two sets of glycoforms (for a total of 20 CBMs) were synthesized: the first set focused on addition of mono-, di-, and trimannose glycan motifs to single amino acids, namely Thr1, Ser3, and Ser14, whereas the second set was designed to study the effects of adding glycans to multiple amino acid residues simultaneously, as would be found naturally.368 For each glycoform, the thermal stability, resistance to thermolysin cleavage, and the binding affinity to BMCC were tested. For the addition of glycans to single sites, Chen et al. discovered that glycosylation at Ser3 has the largest proteolysis-stabilizing effect (by up to a factor of 10), increase in thermal stability (up to 12 °C), and binding affinity increase to BMCC (up to a factor of 4).373 Addition of glycans at Thr1 does not demonstrate significant differences in proteolytic and thermal stability, but a disaccharide at Thr1 is able to increase the binding affinity significantly to BMCC. Addition of glycans at Ser14 essentially improved all properties, but not to the extent of Ser3.
For the addition of glycans to multiple sites, single mannosylation at each site was able to increase the binding affinity to BMCC by 7.4-fold, well over the increases demonstrated via amino acid substitutions.351,352 An increase of up to 50-fold in thermolysin resistance was demonstrated with a concomitant increase in thermal stability of up to 16 °C for several variants. From a biological perspective, glycan-bearing residues are highly conserved in the 1, 3, and 14 positions on family 1 CBMs, and given the ability to affect multiple, beneficial properties, it is likely that CBM glycosylation is employed commonly by fungi. Given the prevalence of glycosylation in fungal cellulases, this work demonstrates that the study of CBMs should explicitly consider the effects of glycosylation to measure physiologically relevant properties.373 Another interesting question related to CBMs is the ability for substrate disruption, as discussed by Boraston et al.156 Din et al. published an early study on CBM substrate disruption with the family 2 CBM from the C. fimi EG, CenA.374 Therein, they observed that application of the CBM-linker domain on ramie cotton fibers resulted in surface roughening. The authors suggested that this CBM-cellulose interaction was the result of nonhydrolytic disruption of the substrate. However, it was not demonstrated that the CBM-treated substrates were subsequently more digestible by reducing-sugar assays nor were cellulose crystallinity measurements conducted. Ståhlberg et al. incubated the TrCel7A CBM-linker domain with Avicel, but did not observe additional susceptibility of the resulting substrate to digestion by intact Cel7A.348 Several subsequent studies have reported observing substrate disruption. Gao et al. claimed that the Penicillium janthinellum CBM-linker was able to synergize with an EG from T. pseudokoningii on Avicel and cotton fibers,

but no data were included in their study to quantitatively support this claim.375 Instead, the authors show scanning electron microscopy (SEM) images of before and after treatment with the P. janthinellum CBM-linker domain. Xiao et al.376 and Wang et al.377 report similar findings, also without demonstrating synergistic cellulose depolymerization for the CBM from a T. reesei EG and from a family 7 CBH from T. pseudokoningii, respectively. These studies overall lack sufficient detail to substantiate the interpretations of SEM and FTIR data. More recently, Hall and co-workers reported an in-depth study of “CBM pretreatment” of Avicel and fibrous cellulose (cotton linters).378 Therein, they noted that a 15 h pretreatment at 42 °C followed by enzymatic hydrolysis at 50 °C was able to slightly improve the digestibility of Avicel and was able to slightly decrease the crystallinity as measured by the height of the 200 peak in X-ray diffraction measurements. It is unknown how pretreatment with CBMs affected the conversion at long times or the final yield of reducing sugars from these experiments. Despite continued claims of substrate disruption,379 the data for a disruption effect remain somewhat sparse, and a full study of CBM disruption in terms of synergy with cellulases at high substrate conversions coupled to detailed substrate characterization does not yet exist, thus severely limiting the ability to ascertain true “disruption” effects by CBMs. Many other CBM effects on fungal cellulase activity and behavior beyond substrate disruption have been investigated. Hall and co-workers compared the thermal stability of intact TrCel7A, the CD alone, and the CBM-linker, the latter two isolated from papain cleavage of the intact enzyme.380 They found melting temperatures of 59, 51, and 66 °C, respectively, suggesting that the CBM-linker is a significant stabilizer of the intact enzyme. Voutilainen et al. 
produced chimeric enzymes by fusing the Thermoascus aurantiacus Cel7A (TaCel7A) CBH, which natively lacks a CBM and linker, to either the TrCel7A or Chaetomium thermophilum Cel7A CBM-linker.381 On Avicel at high temperature, this resulted in a significant increase in activity, suggesting that CBM-linker pairs from thermostable cellulases confer disparate benefits in terms of thermal stability to chimeric CBHs. Voutilainen et al. also published a recent study wherein they built chimeras of the thermostable Talaromyces emersonii Cel7A (TeCel7A) enzyme and an engineered TeCel7A, both of which natively lack a CBM, with CBMs from families 1, 2, and 3.290 This study revealed that the CBM3-containing enzymes bind approximately 20% more to Avicel over the family 1 and family 2 CBM-containing enzymes at both 45 and 60 °C. Correspondingly, the family 3 CBM-containing cellulases are able to solubilize a greater amount of Avicel in 24 h across a range of temperatures from 45 to 65 °C. A similar study was reported from Kim et al. wherein a large library of EGs was expressed with varying CBMs.382 The authors found that the activity of the EG increases generally with the addition of CBMs, and that the thermal stability in some cases, even with the addition of CBMs and linkers from mesophilic organisms, was improved. Although many cellulases have CBMs, there are many examples of fungal and nonfungal cellulases that do not employ them in natural biomass degradation. There are likely multiple physiological and evolutionary reasons for the lack of CBMs in some cases, such as biomass degradation in organisms that densely pack solids into their digestive organs.383 Recently, an elegant study from Várnai et al. shed light on one potential


reason by examining the performance of a T. reesei enzyme cocktail at high solids loadings.384 Therein, they demonstrated that T. reesei cellulases with their CBMs and linkers removed are able to achieve the same extents of conversion on both Avicel and pretreated wheat straw at 20% solids loading. The most likely reason for this observation is that, at high solids loadings, the diffusion length for the enzymes to productively bind to substrate is much shorter than in low solids loading digestions, thus precluding the need for CBMs to ensure that the concentration of active enzymes at the substrate interface is high. For natural systems, high dry-matter content may also preclude the need for CBMs, but a systematic correlation coupled to other environmental and evolutionary considerations has not been conducted to our knowledge. For commercial biomass conversion, the work from Várnai et al. suggests that additional research should be conducted to understand the need or lack thereof for CBMs in cellulolytic enzymes in industrial contexts, as this could lower the overall mass loading of enzymes for biomass degradation to sugars.384 Given the relatively small size of family 1 CBMs coupled to the inherent difficulty in directly studying molecular-level interactions at the heterogeneous cellulose interface, family 1 CBMs have been the subject of computational studies since 1995 aimed at further elucidating cellulose−CBM interactions at the molecular level.121,123,370,371,385−395 Nimlos et al. conducted MD simulations of the TrCel7A CBM on the hydrophobic face of cellulose and observed that Tyr13 undergoes a conformational change from being tucked into the CBM structure to bind directly to the substrate. It is noteworthy that this simulation was conducted with a previous version of the CHARMM force field,396 and this behavior has not been observed in subsequent simulations with more updated potentials. 
The conservation of an aromatic residue in this position (Figure 16), however, suggests that this aromatic residue is important for a structural or functional reason, and future experimental work will likely shed additional light on this question. Multiple studies were later reported that suggest how CBMs translate on the cellulose surface.121,123,370,392,394,395 Potential energy surfaces of a family 1 CBM on both a coarse-grained and an atomistic surface of cellulose Iβ revealed that the Cel7A CBM displays potential energy wells every cellobiose unit (roughly 1 nm) over the surface of cellulose (Figure 17). A much more detailed study was later published by Nimlos et al. describing multiple aspects of CBM behavior on cellulose. This study demonstrated that there is a thermodynamic driving force for the TrCel7A CBM to translate from hydrophilic surfaces to

the preferred hydrophobic face, that the CBM will translate along the hydrophobic surface in both a forward and backward direction with equal probability, and that the flat, hydrophilic face of the CBM is the thermodynamically preferred face for cellulose binding. Taken together, significant insights have been gained regarding the function of family 1 CBMs, especially accelerated and informed by the initial structural work.346 Yet, these studies also illustrate that our collective understanding of CBM function for activity and stability in the context of full-length, multimodular enzymes and the corresponding potential for protein engineering via CBMs remain limited. Moreover, as is also clear from this section, our collective understanding of family 1 CBMs stems almost solely from the model T. reesei system, despite the prevalence of family 1 CBMs in many other fungi. Undoubtedly, there is significant potential for improving cellulase properties via deeper understanding of CBM function in the context of both cellulase performance and stabilization.

5.2. Linkers

In multimodular cellulases, linkers of various lengths and sequence diversity connect CBMs to CDs. In some cases, these linkers are very short, and the CBMs are thus in intimate contact with the CDs.397,398 In fungi, however, linkers characterized to date tend to be longer and exhibit glycosylation, thus allowing greater separation between CBMs and CDs. Here, we review the characterization of linkers and their putative functions beyond domain connection. We include discussion of nonfungal linkers where appropriate to understanding their general function.

Some of the original work related to linker domains and cellulase modularity was conducted with small-angle X-ray scattering (SAXS) using T. reesei and C. fimi cellulases as model systems.399−403 Abuja and co-workers examined both Cel7A and Cel6A, and in both cases, determined that the enzymes exhibited a “tadpole”-like structure with a large core domain (the CD) and a long, flexible “tail” (the CBM-linker; Figure 18).399 On the basis of the work from van Tilbeurgh et al.,343 there was already strong experimental evidence indicating that the “tail” domain was responsible for carbohydrate binding. A “collar”-like section in the Cel7A linker region was also

Figure 17. Simulation of a family 1 CBM on a cellulose Iβ surface was used to calculate potential energy surfaces as shown. The minimum energy wells, shown in dark blue and violet, represent preferential locations for the CBM at the surface. These wells are spaced approximately 1 nm apart, which is the length of a cellobiose unit. Reprinted with permission from ref 370. Copyright 2010 American Chemical Society.

Figure 18. Overall “tadpole” topologies of the TrCel7A and TrCel6A enzymes. The C- and N-termini are noted, highlighting the reverse locations of their cores (CDs). The Cel7A linker displays a characteristic “girdle”, while the Cel6A linker comprises two repeating units. Adapted with permission from ref 399. Copyright 1998 Elsevier.


attributed to the presence of O-glycosylation.400 It was noted that the TrCel6A tail region was significantly longer than its GH7 counterpart from the same organism. A second study on TrCel7A demonstrated that the enzyme elongates from 18 to 22 nm in the presence of xylan, which was attributed to lengthening in the CBM-linker region.401

Langsford et al. published the seminal study demonstrating that glycosylation of cellulase linkers in C. fimi serves to protect against proteolysis.404 Specifically, the authors isolated two cellulases from C. fimi and expressed their counterparts in E. coli, the latter of which does not natively glycosylate proteins. The properties of the enzymes in terms of kinetics on small molecule substrates (such as CMC or pNPC) and thermal and pH stabilities were not affected by glycosylation.404 Binding was measured at 0 °C on Avicel and also suggested that, at those conditions, glycosylation had no effect. In the presence of crude protease from C. fimi, however, the recombinant cellulases showed fragmentation patterns that suggested the proline-threonine rich linker domains are proteolytically cleaved. Cellulase performance in Avicel digestion was not directly compared in the absence of protease. Overall, this study led Langsford et al. to suggest that the C. fimi cellulases were multimodular with CBMs and CDs, similar to the proposal of van Tilbeurgh et al.343 Importantly, they also suggested that the proline-threonine rich domain between these two functional domains is a “hinge region” (now commonly referred to as the “linker”) that contains O-glycosylation that protects against protease action.404 Although an equivalent study has not been conducted to our knowledge in fungi, perhaps due to the inherent difficulties in expressing fungal cellulases in E. 
coli or fully deglycosylating fungal cellulases, it is likely that the finding from Langsford and co-workers holds for multimodular cellulases regardless of origin, in that O-glycosylation on linker domains tends to shield the linkers from proteolysis during cellulose depolymerization in an extracellular, competitive environment.

Srisodsuk et al. examined the activity of TrCel7A as a function of linker length.251 Therein, the authors produced two mutant enzymes: one in which approximately the first one-third of the linker (Gly434-Gly444, referred to by the authors as the “hinge” region) was removed, and another in which Gly434 to Gly460 was removed, which essentially deletes the entire linker. The hinge mutation resulted in similar overall performance of the enzyme relative to the wild-type, but with a reduced binding capacity at high loading. Removal of the entire linker drastically reduced the enzymatic activity toward crystalline cellulose. The overall interpretation of the linker’s role by the authors was that it is important for maintaining sufficient distance between the CBM and CD and that it facilitates “dynamic adsorption” to the surface of cellulose.251 An earlier, similar study on a bacterial cellulase also demonstrated that removal of the linker domain significantly reduces its activity toward both crystalline and amorphous cellulose.405

Boisset et al. used static light scattering to examine the H. insolens EG V (a GH45 cellulase), which contains a family 1 CBM and a 33-residue linker.406 They found that the average length of each residue in the linker domain is approximately 2 Å, which is not in agreement with α-helices, polyproline helices, or β-sheets. From these initial SAXS and light scattering studies, it was not clear if linkers are stiff or flexible. Receveur and co-workers later reported a comprehensive SAXS study wherein they examined the full-length H. 
insolens GH45 EG (the same enzyme as from Boisset et al.406), the CD alone, the enzyme with the CBM

removed, the CBM alone, a variant where a substantial portion of the linker was removed, and a polyproline insertion in the linker.407 They found that the CBM presence does not alter the overall conformation of the enzyme and that the CBM was not discernible from the linker region. The results for the wild-type enzyme and the shortened-linker variant both demonstrate that the linker volume is quite substantial and extended, suggesting either (or both) that the glycosylation on the linker provides an excluded volume effect or that the linker is quite flexible, but extended. The polyproline mutant exhibited a considerable narrowing in the region where the proline insertion was made, suggesting that this region imparted significant local rigidity. The authors interpret their data to suggest that the linker provides the means to separate the CBM and CD and optimize their relative geometry, similar to the interpretation of Srisodsuk et al.251 The authors go on to state that the conformational landscape of the linker may impart a “caterpillar”-like motion between the CBM and CD during catalytic action on insoluble cellulose. This model is one wherein free energy is gained by compression of the linker during catalysis, which is dissipated by sliding of the CBM.407 In the aforementioned study from Receveur et al.,407 the authors could not resolve the CBM in the full-length H. insolens GH45 EG, thus limiting their ability to decipher linker behavior relative to CBM behavior in their SAXS experiments. To overcome this problem and focus only on the linker domain, von Ossowski et al. built a chimeric cellulase with 2 GH6 cellulases from H. 
insolens (Cel6A and Cel6B) with an 88-residue linker between them, wherein both the CDs were completely discernible from the linker region.408 Using SAXS combined with molecular modeling, the authors demonstrated that the chimeric cellulase adopts a wide range of conformations in solution, from compressed to extended as measured by the end-to-end distance, that are accessible at low energetic cost with equal probability (and are thus similar in free energy). To our knowledge, this study marked the first explicit connection of linkers with intrinsically disordered proteins, which has become a large field in the last 10−15 years.409−413 The authors also propose that the O-glycans may provide greater extension between the subdomains, and that their results provide additional evidence toward an inchworm- or caterpillar-like mechanism. Additional studies have been published that include SAXS analysis of intact cellulases with broadly similar conclusions.414

Besides SAXS, NMR and MD simulation have also been used to characterize linker behavior. Poon et al. used NMR spectroscopy to examine the proline-threonine rich linker of a family 10 xylanase from C. fimi.415 Therein, they showed that the linker samples conformational space on very fast time scales, from picoseconds to nanoseconds, and that glycans serve to dampen mobility. In two subsequent papers, we examined linker behavior in 3 GH7 cellulases and the TrCel6A linker.416,417 Beckham et al. first used replica-exchange MD to examine the TrCel7A linker with and without O-glycosylation.416 It was shown that the Cel7A linker was an intrinsically disordered protein in solution and that the primary role of the glycans in the isolated domain was to provide an excluded volume effect, as measured by the conformational free energy of the linkers as a function of the end-to-end distance. A later study from Sammond et al. 
demonstrated similar behavior for three additional fungal cellulase linkers.417 Therein, the lack of structure predicted by replica-exchange MD simulation was confirmed for the nonglycosylated variants of all four linkers with circular dichroism spectroscopy.
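A conformational free energy as a function of end-to-end distance, as referred to above, can in principle be estimated from sampled distances by Boltzmann inversion, F(r) = −kT ln P(r). The following is a minimal sketch under stated assumptions: the samples are taken to be unbiased and drawn at a single temperature (analysis of replica-exchange MD data, as used in the cited studies, is more involved), and the function name, bin width, and example distances are hypothetical rather than taken from refs 416 or 417.

```python
import math
from collections import Counter

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def free_energy_profile(distances_nm, bin_width_nm=0.5, temperature_k=300.0):
    """Return {bin center (nm): free energy (kcal/mol)} from sampled distances.

    Uses F(r) = -kT ln P(r) on a histogram of the samples and shifts the
    profile so that the most populated bin sits at zero.
    """
    if not distances_nm:
        raise ValueError("need at least one sampled distance")
    # Histogram the samples into bins of fixed width.
    counts = Counter(math.floor(d / bin_width_nm) for d in distances_nm)
    total = sum(counts.values())
    kt = K_B * temperature_k
    raw = {
        (index + 0.5) * bin_width_nm: -kt * math.log(count / total)
        for index, count in counts.items()
    }
    lowest = min(raw.values())
    return {center: energy - lowest for center, energy in sorted(raw.items())}


# Hypothetical end-to-end distances (nm) for an isolated linker peptide.
example_distances = [1.1, 1.2, 1.3, 1.4, 3.1, 3.2]
for center, energy in free_energy_profile(example_distances, bin_width_nm=1.0).items():
    print(f"{center:.1f} nm: {energy:.2f} kcal/mol")
```

A flat profile over a broad range of distances would be consistent with the disordered behavior described above, whereas a narrow well would indicate a preferred extension; bins that were never sampled are simply absent from the result.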


Figure 19. Characteristics of cellulase linkers; data adapted from Sammond et al.417 (A) Linker length histograms for fungal GH7 and GH6 cellulases. (B) O-glycosylation distribution across linker sequences measured by the prevalence of serine and threonine residues for fungal GH7 and GH6 cellulases.
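The descriptors compared in this type of analysis (linker length, Ser/Thr and Pro content, the position of Ser/Thr residues along the linker, and glycine content near the termini) are simple to compute from a linker sequence. The sketch below is illustrative only and is not the procedure of Sammond et al.: the function name, the terminal window size, and the example sequence are all hypothetical, and Ser/Thr prevalence is used as a proxy for potential O-glycosylation sites, as in Figure 19B.

```python
def linker_stats(linker_seq, terminal_window=5):
    """Summarize simple composition descriptors for one linker sequence."""
    seq = linker_seq.strip().upper()
    if not seq:
        raise ValueError("linker sequence is empty")
    length = len(seq)
    # Residues within `terminal_window` of either end count as termini.
    window = max(1, min(terminal_window, length // 2))
    termini = seq[:window] + seq[-window:]
    interior = seq[window:-window]

    def fraction(segment, residues):
        if not segment:
            return 0.0
        return sum(aa in residues for aa in segment) / len(segment)

    # Split the linker into three consecutive segments to see whether
    # Ser/Thr residues are spread evenly along its length.
    thirds = [seq[k * length // 3:(k + 1) * length // 3] for k in range(3)]
    return {
        "length": length,
        "ser_thr_fraction": fraction(seq, "ST"),
        "pro_fraction": fraction(seq, "P"),
        "gly_fraction_termini": fraction(termini, "G"),
        "gly_fraction_interior": fraction(interior, "G"),
        "ser_thr_by_third": [fraction(part, "ST") for part in thirds],
    }


# Made-up 15-residue sequence, used only to exercise the function.
print(linker_stats("GGSPTTSATPSTTGG"))
```

Applied to many linkers grouped by GH family or by origin, such descriptors could be collected into histograms of the kind shown in Figure 19.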

It has often been mentioned that linkers in cellulase enzymes do not exhibit sequence conservation or high homology to one another. In the aforementioned study from Sammond et al., an alternative set of methods was applied to study the similarities and differences between linkers besides sequence alignments.417 Instead of the conventional sequence identity, linker lengths, amino acid content, glycosylation distributions, and these variables as a function of one another, of the GH family, and of the origin (i.e., either bacterial or fungal) were compared. In doing so, quantitative patterns in linker characteristics begin to emerge, some of which are illustrated in Figure 19.417 For example, multimodular fungal GH6 and GH7 cellulases exhibit a significant difference in linker length (with an average linker length of 42 residues for GH6 cellulases and 30 residues for GH7 cellulases), the functional reason for which remains unknown. In all cases examined, O-glycosylation was found to be approximately uniformly distributed across the length of linkers, suggesting that it is needed in a distributed manner to protect against proteolysis or to serve additional, unknown functions. Glycine residues were found to be clustered at the termini of all linkers studied, suggesting the need for flexibility at the junction between ordered domains and the linker, perhaps for orientation during catalytic action. The amino acid content of bacterial and fungal linkers differed significantly, with higher proline content in

bacterial cellulase linkers and higher Ser/Thr content in fungal cellulase linkers. Taken together, these results begin to suggest that cellulase linkers exhibit properties that are optimized for specific catalytic function, the mechanistic underpinnings for which are mostly unknown.417 Given the wealth of sequence data for cellulases and other carbohydrate-active enzymes in the CAZy database,151−153 this bioinformatics approach offers a straightforward, quantitative means to develop new hypotheses regarding linker function and should provide a framework for the comparison of linkers beyond simple sequence alignments. Detailed characterization of the glycosylation pattern of linkers is essential to understand their behavior. To that end, several in-depth mass spectrometry studies have been conducted to characterize the glycosylation patterns on cellulase linkers. For TrCel7A, Harrison et al. published the first detailed study of the linker domain glycosylation in 1998; the primary results from which are shown in Figure 20.368 Therein, the authors examined Cel7A from a hyper-cellulase-producing strain of T. reesei (ALKO2877) and determined that all serine and threonine residues in the linker exhibited at least a single O-mannose residue. They also found that a mannose residue on the linker exhibited a sulfate group, but the role of this sulfation remains unknown. In 2001, Hui et al. characterized the linker of the same enzyme produced in RUT-C30, Iogen-M4, and Iogen-B13 T.


store free energy that is dissipated during catalysis or CBM movement, thus driving the enzyme forward. This hypothesis was tested computationally with a free energy method (umbrella sampling) in which an isolated linker domain was compressed and extended over a cellulose slab, which suggested that there was indeed a barrier.424 However, the computed barrier for compression was found to be extremely high, likely because convergence of simulations of this type is quite difficult with such a low-resolution reaction coordinate as the end-to-end distance of a large, glycosylated linker peptide.416,424

Ting et al. proposed a theoretical mechanochemical model of how a multimodular CBH such as TrCel7A can work at the solid−liquid interface.425 Therein, they assumed that the CBM and CD are random walkers connected by a linker modeled as a spring. The possible steps in the model, governed by a master equation, are CBM motion in a “forward” or “backward” direction, and CD motion (once productively bound to a cellulose chain) in a “forward” direction, i.e., a hydrolysis and processivity event. Note that the events of processive cellulolytic action are described in detail in sections 6.2 and 7.2.7. The rate constant governing hydrolysis and processivity for the CD accounts for both the work required to decrystallize a single polymer chain and the compression of the linker as the CD moves forward. The model demonstrates that the maximum enzyme velocity on the surface is reached at intermediate linker stiffness, for a given length. The overall recommendations from this theoretical model are that optimization needs to be conducted not only of the CD hydrolysis rates, as was already known, but also of the linker length and stiffness.
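The structure of such a model (two walkers on a lattice, coupled by a harmonic linker, with step rates that depend on the change in spring energy) can be illustrated with a small kinetic Monte Carlo simulation. The sketch below is a toy version written for illustration only; it is not the master equation of Ting et al. and is not expected to reproduce their quantitative conclusions. The rate expression, the parameter names and values, and the use of reduced units (energies in kT, positions in lattice sites of roughly one cellobiose unit) are all assumptions.

```python
import math
import random


def simulate_cbh(stiffness, n_events=5000, rest_length=3, k_cbm=1.0,
                 k_cd=1.0, decryst_work=1.0, seed=0):
    """Gillespie-type simulation of a CD and CBM joined by a harmonic linker.

    Returns the mean CD velocity in lattice sites per unit time.
    """
    rng = random.Random(seed)
    x_cd, x_cbm, elapsed = 0, rest_length, 0.0

    def spring_energy(cd, cbm):
        return 0.5 * stiffness * (cbm - cd - rest_length) ** 2

    for _ in range(n_events):
        e_now = spring_energy(x_cd, x_cbm)
        # Each move: (new CD site, new CBM site, intrinsic rate, extra work in kT).
        moves = [
            (x_cd, x_cbm + 1, k_cbm, 0.0),          # CBM steps forward
            (x_cd, x_cbm - 1, k_cbm, 0.0),          # CBM steps backward
            (x_cd + 1, x_cbm, k_cd, decryst_work),  # CD hydrolyzes and advances
        ]
        # Assumed rate law: half of the spring-energy change enters the
        # exponent, so forward/backward CBM steps obey detailed balance.
        rates = [
            k0 * math.exp(-work - 0.5 * (spring_energy(cd, cbm) - e_now))
            for cd, cbm, k0, work in moves
        ]
        elapsed += rng.expovariate(sum(rates))  # waiting time to next event
        x_cd, x_cbm, _, _ = rng.choices(moves, weights=rates)[0]
    return x_cd / elapsed


# Scan a few hypothetical stiffness values (units of kT per site squared).
for k in (0.0, 0.1, 1.0, 10.0):
    print(k, simulate_cbh(stiffness=k))
```

In this toy version the CD only ever steps forward, so the reported velocity is always positive; any dependence on stiffness reflects the assumed rate law rather than the published model.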
Toward further understanding linker behavior during cellulase action, we recently reported a combined computational and experimental study wherein the interactions of the TrCel7A and TrCel6A enzymes with the surface of cellulose were examined.393 From long MD simulations, it was predicted that the glycosylated linkers in both enzymes were able to bind to cellulose, as illustrated in Figure 21. Subsequently, the binding affinities to cellulose of both the glycosylated TrCel7A CBM-linker (isolated via papain cleavage) and the CBM alone (produced with solid-state peptide synthesis) were measured experimentally. These measurements, fit with a Langmuir isotherm model, demonstrate that the linker indeed increases the binding affinity to cellulose by a factor of 10 over the CBM alone, and that it likely does not function as a spring between the two structured domains. The MD simulations further revealed that the linker binding is dynamic, and the linker lacks secondary structure upon binding, thus suggesting that its binding mechanism is nonspecific compared to that of the CBM. These results hearken back to the

Figure 20. (A) TrCel7A linker glycosylation pattern from Harrison et al.368 (B) Molecular snapshot of the linker from Beckham et al.416

reesei strains.273 In all cases, the serine and threonine residues exhibited heterogeneous mannose distributions, with a single phosphorylation on an undetermined site in the linker. Shortly after, Hui et al. published a second study characterizing the glycosylation of TrCel6A, TrCel7B, and of the GH5 EG, TrCel5A.418 The following numbers of glycans were detected on each linker: 39−46 O-glycans for Cel6A, 24−34 for Cel7B, and 32−42 for Cel5A. Stals et al. conducted a thorough examination of glycosylation on the TrCel7A linker as a function of growth conditions and found that, in minimal media at low pH, 19−29 mannose residues were present, whereas, in rich media, the O-glycans on the linker were trimmed back to 16−23 residues.419 On the basis of analysis of fungal glycosylation pathways and the aforementioned mass spectrometry data, the T. reesei O-glycans are likely primarily mannose units connected by α-O linkages to serine or threonine.273,368,418−422 However, other carbohydrate monomers, such as glucose, as well as branching and straight chain O-glycans are known to be found in yeast and fungi, including T. reesei.423 Mannose and other carbohydrate monomers are known also to be connected by both α2 and α6 linkages. These data, when taken together, further complicate the ability to examine the impact of linker glycosylation in a systematic manner.

Many studies have been conducted to understand how linkers behave in isolation or how intact multimodular enzymes behave in solution. Conversely, our collective understanding of linker function during enzymatic action is limited given that cellulases act at a solid−liquid interface, which is inherently difficult to study at the molecular level. Related to cellulase action vis-à-vis linker function, Receveur et al. proposed the inchworm hypothesis wherein, during catalytic action, the linker will

Figure 21. Molecular snapshots of TrCel7A and TrCel6A wherein the linker binds to the cellulose surface from microsecond-long MD simulations. These computational predictions of cellulase linkers enhancing binding of CBMs to the cellulose surface were corroborated experimentally via binding isotherm measurements.393
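To give a sense of what a 10-fold difference in binding affinity implies under a Langmuir isotherm, the following minimal sketch compares the fractional surface coverage for two dissociation constants that differ by a factor of 10. It assumes a single-site Langmuir model in which a 10-fold higher affinity corresponds to a 10-fold lower dissociation constant; the function name, the dissociation constants, and the concentrations are hypothetical and are not the values measured in ref 393.

```python
def langmuir_bound(free_conc, k_d, b_max=1.0):
    """Bound amount for a single-site Langmuir isotherm.

    free_conc and k_d share the same concentration units; b_max is the
    saturation binding capacity (1.0 gives fractional coverage).
    """
    if free_conc < 0 or k_d <= 0:
        raise ValueError("free_conc must be >= 0 and k_d must be > 0")
    return b_max * free_conc / (k_d + free_conc)


# Hypothetical dissociation constants (arbitrary concentration units).
K_D_CBM_ALONE = 1.0
K_D_CBM_LINKER = K_D_CBM_ALONE / 10.0  # 10-fold higher affinity

for conc in (0.01, 0.1, 1.0, 10.0):
    alone = langmuir_bound(conc, K_D_CBM_ALONE)
    with_linker = langmuir_bound(conc, K_D_CBM_LINKER)
    print(f"free conc {conc}: CBM alone {alone:.3f}, CBM-linker {with_linker:.3f}")
```

Under these assumptions the difference in coverage is largest at free concentrations well below the weaker dissociation constant and vanishes as both constructs approach saturation.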


organism

expression host


Pichia pastoris

Chaetomium thermophilum CT2 Chrysosporium lucknowense Cladorrhinum foecundissimum

Fusicoccum sp. BCC4124 Heterobasidion irregulare TC 32-1 Humicola grisea var. thermoidea IFO9854 Humicola grisea var. thermoidea IFO9854

pH opt

Bauer et al., 2006487

CMC, Glc4/Glc5/Glc6, barley β-glucan, lichenan, xyloglucan

5.0−5.5

Cel7A

CBHI

Aspergillus oryzae 5.0

5.0

4.0

Cel7A

Exo1

5.0

CBHI

EG I, Cel7B

Cel1

Cel7

5.0

4.0

5.0

Cbh3

Cel7A

Bgl7A

60

65

45

40

60

65

60

Gusakov et al., 2005494 Vlasenko et al., 2010332

pNPC, pNPL, PASC, Avicel, BC, pretreated corn stover, CMC, xyloglucan, xylan, arabinoxylan, mannan, galactomannan CMC, Avicel

pNPC

pNPC

4-methylumbelliferylβ-D-cellobioside pNPL

stable at pH 3.0−10.0 at 4 °C for 20 h stable at pH 2.0−10.0 at 4 °C for 20 h, and at 55 °C for 10 min

Kanokratana et al., 2008496 Momeni et al., 2013449 Takashima et al., 1998497 Takashima et al., 1996,498 Takashima et al., 1998497

pNPL Avicel, CMC, Glc2/Glc3/Glc4/Glc5/Glc6, p-nitrophenyl-β-Dglucoside, pNPC Avicel, CMC, xylan, Glc2/Glc3/Glc4/Glc5/Glc6, p-nitrophenyl-β-D-glucoside, pNPC

assay conditions 50 °C and pH 5.0; no detectable residual activity on CMC after 3 h at 60 °C and pH 5.0 stable at pH 3−11 and maintains ∼50% at 70−90 °C for 30 min

assay conditions 40 °C; maintained >90% activity after 7 h at 60 °C assay conditions 50 °C and pH 5.0; 30% residual activity on CMC after 3 h at 60 °C and pH 5.0

assay conditions 30 °C, using partially purified enzyme assay conditions 30 °C, using partially purified enzyme stable below 50 °C and at pH 3.0−7.0; A. oryzae KBN616 used for CMC activity; A. oryzae RIB40 used for barley β-glucan activity stable at 60 °C for at least 1 h and at pH 1.0−8.0

assay conditions 50 °C and pH 5.0; 72% residual activity on CMC after 3 h at 40 °C and pH 5.0 assay conditions 50 °C and pH 5.0; 92% residual activity on CMC after 3 h at 40 °C and pH 5.0 identical optima, but lower activity, when expressed in S. cerevisiae assay conditions 37 °C and pH 4.5

comments

pNPC, pNPL, PASC, Avicel, BC, pretreated corn stover, CMC, xyloglucan, xylan, arabinoxylan, mannan, galactomannan Avicel, filter paper, 4-methylumbelliferyl-β-D-cellobioside

Vlasenko et al., 2010332

Müller et al., 2007495

Li et al., 2009493

pNPL, pNPC, Avicel, CMC, cotton

pNPC pNPL

Luo et al., 2010,491 Zhang et al. 2013492 Voutilainen et al., 2008,381 Szijártó et al., 2011483

4-methylumbelliferyl-β-D-lactoside, 2-chloro-4-nitrophenolβ-D-lactoside, Avicel, PASC, filter paper, hydroxyethylcellulose, xylan, p-nitrophenyl-β-D-glucoside pNPC, MCC, filter paper

barley β-glucan, lichenan, CMC, laminarin, xylan

4-methylumbelliferylβ-D-lactoside

lichenan

488

Kitamoto et al., 1996,489 Kotaka et al., 2008490

Gielkens et al., 1999488

CMC, barley β-glucan

CMC

Takada et al., 1998,484,485 Kanamasa et al., 2003486 Bauer et al., 2006487

Vlasenko et al., 2010332

Vlasenko et al., 2010332

CelB

45

ref Voutilainen et al., 2008,381 Szijártó et al., 2011483

Gielkens et al., 1999

CMC

Avicel

substrate specificity 4-methylumbelliferyl-β-D-lactoside, 2-chloro-4-nitrophenolβ-D-lactoside, Avicel, PASC, filter paper, hydroxyethylcellulose, xylan, p-nitrophenyl-β-D-glucoside pNPC, pNPL, PASC, Avicel, BC, pretreated corn stover, CMC, xyloglucan, xylan, arabinoxylan, mannan, galactomannan pNPC, pNPL, PASC, Avicel, BC, pretreated corn stover, CMC, xyloglucan, xylan, arabinoxylan, mannan, galactomannan Avicel, insoluble oligosaccharides (DP20), CMC, alkaliswollen cellulose, pNPL CMC, Glc3/Glc4/Glc5/Glc6

CMC 4.0

substrate for opt 4-methylumbelliferylβ-D-lactoside

CbhB

42

60

60

CMC

5.5

3.0

5.0

CbhA

Aspergillus oryzae

Pichia pastoris

Trichoderma reesei

Chaetomium thermophilum

Claviceps purpurea T5 Fusarium oxysporum

Pichia pastoris

Bispora sp. MEY-1

Pichia pastoris

CBHI, CbhB EglB

CBHI

Cel7A

Aspergillus aculeatus F-50 Aspergillus nidulans FGSC A4 Aspergillus nidulans FGSC A4 Aspergillus niger CBS 513.88 Aspergillus niger CBS 513.88 Aspergillus oryzae

Pichia pastoris

enzyme

Cel7A

Cel7B

Trichoderma reesei

Acremonium sp. CBS265.95

Acremonium thermophilum ALKO4245 Acremonium sp. CBS265.95

temp opt (° C)

Table 5. Summary of Biochemical Characterizations of Fungal GH7 Cellulases


Pichia pastoris

Myceliophthora thermophila Penicillium chrysogenum FS010 Penicillium decumbens Penicillium decumbens


Talaromyces emersonii IMI 392299 (Rasamsonia emersonii) Talaromyces funiculosus IMI 378536 [Penicillium funiculosum] Thermoascus aurantiacus ALKO4242 Thermoascus aurantiacus IFO 9748

Phanerochaete chrysosporium

Trichoderma reesei Saccharomyces cerevisiae

CBH1

Penicillium pulvillorum Phanerochaete chrysosporium

pH opt

5.0 6.0

Cel7A

3.5

4.1

4.2

4−5

4.0

5.0

5.0 5.0 6.0

5−6

5.5

5.0

Cel7A

XynA

CBH62, CBH1.1, Cel7C Cbh58, CBH1.2, Cel7D Cbh1A, CBH IB, Cel7A

CBHI

CBHI, Cel7A Cel7B

CBH1

Penicillium occitanis Pol6

Saccharomyces cerevisiae

Cel7B (cbh)

Saccharomyces cerevisiae

Melanocarpus albomyces

EG7A

Cel1, Ex-1 Cel2, Ex-2 Cel7A

Irpex lacteus MC-2 Irpex lacteus MC-2 Melanocarpus albomyces

Humicola insolens

CBH1, Cel7A EG1, Cel7B

Aspergillus oryzae Aspergillus oryzae

enzyme

EGL1

expression host

Aspergillus oryzae

organism

Humicola grisea var. thermoidea IFO9854 Humicola insolens

Table 5. continued

65

65

55

66−69

60

60

60

65

50 50 65−70

55−60

temp opt (° C) substrate for opt

4-methylumbelliferylβ-D-lactoside pNPL

barley β-glucan

2-chloro-4-nitrophenylβ-D-cellobioside, pNPL

Avicel

pNPC/PASC

CMC

CMC

Avicel

Avicel Avicel hydroxyethylcellulose

CMC

Glc3

pNPC

substrate specificity

Wei et al., 2010509

CMC, barley β-glucan, PASC, pNPC, Avicel, xylan

4-methylumbelliferyl-β-D-lactoside, 2-chloro-4-nitrophenolβ-D-lactoside, Avicel, PASC Avicel, PASC, pNPL, pNPC

Tuohy et al., 2002513 Avicel; pNPC, pNPL, 2-chloro-4-nitrophenyl-β-D-cellobioside, 2-chloro-4-nitrophenol-β-D-cellotrioside, 2-chloro-4nitrophenol-β-D-lactoside, 4-methyl-umbelliferyl-β-D-cellotrioside barley β-glucan, CMC, pNPL, pNPC, Glc3/Glc4/Glc5, xylan, arabinoxylan

Hong et al., 2003516

Voutilainen et al., 2008381

Texier et al., 2012,514 Furniss et al., 2005515

Uzcategui et al., 1991,512 von Ossowski et al. 2003461

Uzcategui et al., 1991512

Avicel, pNPL, pNPC, CMC

Avicel, pNPL, pNPC, BMCC, hydroxyethylcellulose amorphous cellulose

Marjamaa et al. 2013511

Avicel, CMC, p-nitrophenyl-β-D-glucoside, glucuronoxylan

Limam et al., 1995510

Gao et al., 2012508

pNPC

pNPC, pNPL, Avicel, filter paper, PASC, Glc3/Glc5

Hou et al., 2007507

stable at pH 3.0−9.0 at 40 °C for 24 h; maintained >80% initial activity after 1 h at 65 °C

pH optimum measured at 50 °C; temperature optimum measured at pH 5.0; T1/2 of 68 min at 80 °C and pH 5.0 maintains >80% activity at pH 3−4.5

stable at pH 3−8 at 4 °C for 16 h; maintained >90% initial activity after 1 h at 60 °C stable at pH 2−9; maintains activity below 60 °C, but loses ∼50% activity after 30 min at 60 °C, and inactivated at 70 °C assay conditions 45 °C

stable at pH 3−11, retaining initial activity after 24 h

assay conditions pH 6.0

assay conditions 40 °C; 35% residual activity on CMC after 3 h at 60 °C and pH 5.0

assay conditions 37 °C

Schou et al., 1993,442 Schülein, 1997,499 Xu et al., 2009500 Schou et al., 1993,442 Schülein, 1997,499 Vlasenko et al., 2010332 Hamada et al., 1999501 Hamada et al., 1999501 Miettinen-Oinonen et al., 2004,502 Szijártó et al., 2008503 Voutilainen et al., 2007,504 Szijártó et al., 2008,503 Miettinen-Oinonen et al., 2004,502 Voutilainen et al., 2009505 Karnaouri et al., 2014506

comments stable at pH 5.0−11.0 at 4 °C for 20 h, and at 60 °C for 10 min

ref Takashima et al., 1996,498 Takashima et al., 1998497

barley β-glucan, CMC, lichenan, arabinoxylan, xylan, filter paper, hydroxyethylcellulose, Avicel, Glc3/Glc4/Glc5 pNPC

4-methylumbelliferyl-β-D-lactoside, Avicel, CMC, hydroxyethylcellulose, PASC, filter paper, 2-chloro-4-nitrophenol-βD-lactoside

CMC, PASC, Glc3/Glc4/ Glc5/Glc6 pNPC, pNPL, Avicel, BC, pretreated corn stover, xyloglucan, xylan, arabinoxylan, mannan, galactomannan Avicel, CMC, BC, pNPL, pNPC Avicel, CMC, BC, pNPL, pNPC hydroxyethylcellulose, 4-methylumbelliferyl-β-D-lactoside, CMC, Avicel, PASC

PASC, Glc3/Glc4/ Glc5/Glc6, Avicel

Avicel, CMC, xylan, Glc2/Glc3/Glc4/Glc5/Glc6, p-nitrophenyl-β-D-glucoside, pNPC


CBHI

EGI

Trichoderma viride HK-75 Trichoderma viride AS 3.3711

EGI, Cel7B

Saccharomyces cerevisiae

CBHI, Cel7A

Cel7B

Pichia pastoris

CBHI, Cel7A

Egl1, Cel7A

Saccharomyces cerevisiae

pNPC

substrate specificity

5.8

5.0

4.5

4.5

60

50

65

60

45

50

CMC

CMC

4-methylumbelliferylβ-D-lactoside

4-methylumbelliferylβ-D-lactoside PASC

CMC

CMC

CMC, Glc3/Glc4/Glc5/Glc6, pNPC, pNPL, PASC, Avicel, BC, pretreated corn stover, CMC, xyloglucan, xylan, arabinoxylan, mannan, galactomannan, barley β-glucan, hydroxyethylcellulose 4-methylumbelliferyl-β-D-lactoside, 2-chloro-4-nitrophenolβ-D-lactoside, 3,4- dinitrophenyl-β-D-cellobioside, 3,4dinitrophenyl-β-D-lactoside, BMCC CMC, Glc3

4-methylumbelliferyl-β-D-lactoside, CMC

CMC, barley β-glucan, laminarin, xylan

Avicel, PASC PASC, Avicel, BC, pretreated corn stover, CMC, xyloglucan, xylan, arabinoxylan, mannan, galactomannan BMCC, Avicel, Sigmacell 20, CMC, pNPC, cNPL

50

substrate for opt

Cel7A Cel7C 5.0

pH opt Avicel, PASC

enzyme

temp opt (° C)

Cel7A

Aspergillus nidulans

expression host

Trichoderma reesei L27

Trichoderma harzianum FP 108/ IOC-3844 Trichoderma longibrachiatum CECT 2606 Trichoderma pseudokoningii Trichoderma reesei

Thielavia australiensis Thielavia terrestris Thielavia terrestris

organism

Table 5. continued ref

Song et al., 2010520

Kwon et al., 1999

Van Arsdell et al., 1987,330 Biely et al., 1991,331 Bailey et al., 1999,286 Vlasenko et al., 2010332 Boer and Koivula, 2003,327 Becker et al., 2001328

Mitrovic et al., 2014519

Ganga et al., 1997518

Colussi et al., 2011,517 Textor et al., 2013466

Xu et al., 2009500 Vlasenko et al., 2010332

Xu et al., 2009500

retains >70% of maximal activity at pH 3.8−7.0 and 60% of maximal activity at 40−90 °C

retains >80% activity at pH 3.5−6.0

assay conditions pH 5.0; 50% residual activity on CMC after 3 h at 60 °C and pH 5.0

narrow pH activity (35% activity on CMC at pH 4 and 5.5); opt on xylan at pH 4.5 and 60 °C assay conditions pH 4.8

assay conditions 50 °C and pH 5.0

comments


Table 6. GH7 Rate Constants (a)

Isolated CBH

property          value            substrate   ref
kcat (s−1)        7.1 ± 3.9        Iα          476
kcat (s−1)        6.8 ± 3.5        III         564
kcat (s−1)        2.8 ± 0.4        BC          307
kcat (s−1)        2.2 ± 0.5        BC          557
kcat (s−1)        2.4              BMCC        561
kcat,gly (s−1)    0.415            Glc8        470
kcat,gly (s−1)    10.8             Glc9        441
koff (s−1)        0.20 ± 0.01      III         564
koff (s−1)        0.0007 ± 0.1     BC          307
koff (s−1)        0.01             BMCC        561
Papp              61 ± 14          BC          307
Papp              66 ± 7           BC          557
Papp              22.6             BMCC        561
kobs (s−1)        0.1 ± 0.05       BC          557

CBH plus EG

property          value            substrate   ref
kcat (s−1)        1.5 ± 0.2        BC          557
Papp              50 ± 3           BC          557
kobs (s−1)        1.45 ± 0.5       BC          557

(a) Advanced experimental techniques have allowed for the determination of various rate constants in the processive cycle of GH7 CBHs. In each case, the CBH is intact TrCel7A, and the EG is TrCel5A. Substrate abbreviations not previously defined are III = cellulose III, Iα = cellulose Iα, Glc8 = cellooctaose, and Glc9 = cellononaose. The rate constant kcat represents the rate constant for the complete inner processive cycle that includes processivity, hydrolysis, and product expulsion, whereas kcat,gly specifically describes the barrier for the glycosylation reaction and was computed via advanced molecular simulation techniques. Papp was called n in the original publication by Cruys-Bagger et al.561 Rate constant kobs is calculated from the rate of cellobiose produced normalized by the concentration of CBH with occupied active site.557
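The relations between these quantities can be made concrete with a short calculation. The sketch below computes kobs as defined in the footnote and, as an assumption not stated in the footnote, estimates an upper bound on processivity as the ratio kcat/koff, which is a commonly used approximation when kcat is much larger than koff. The function names and the input values are illustrative; the inputs are chosen to be of similar magnitude to the BC entries in Table 6 and are not a reanalysis of the cited data.

```python
def k_obs(cellobiose_rate, bound_cbh_conc):
    """Observed rate constant (s^-1): cellobiose production rate divided by
    the concentration of CBH with an occupied active site (same units)."""
    if bound_cbh_conc <= 0:
        raise ValueError("bound_cbh_conc must be positive")
    return cellobiose_rate / bound_cbh_conc


def processivity_upper_bound(k_cat, k_off):
    """Approximate number of catalytic cycles per productive binding event,
    assuming each cycle competes only with dissociation and k_cat >> k_off."""
    if k_off <= 0:
        raise ValueError("k_off must be positive")
    return k_cat / k_off


# Illustrative inputs only (rates in s^-1, concentrations in M).
print(k_obs(cellobiose_rate=1.45e-6, bound_cbh_conc=1.0e-6))
print(processivity_upper_bound(k_cat=2.8, k_off=0.0007))
```

Comparing such an estimate with a measured Papp gives a rough indication of how far the apparent processivity on a real substrate falls below the limit set by the enzyme's own rate constants.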

Abuja et al. SAXS study from 1989 wherein the addition of xylan to TrCel7A was demonstrated to stiffen the linker.401 It has been known for some time that modifications to cellulase linkers modify the enzyme activity, typically in detrimental ways when major changes are made.251,405 On the basis of pioneering work using SAXS and other biophysical methods, we now know that cellulase linkers such as those found in fungi are intrinsically disordered proteins.407,408,414,416,417 For cellulases from T. reesei, linker glycan patterns are either known or at least the range of mannose residues has been characterized on linkers.273,368,418−420 New bioinformatics analyses have emerged recently that suggest a more general means to analyze linkers beyond sequence alignments, yielding more detailed quantitative information about linker function and optimization.417 Lastly, new theoretical analysis coupled with biophysical measurements has suggested that glycosylated linkers play a direct role in cellulose binding, similar to the CBM but with a nonspecific, dynamic role.393

However, despite these strides, key questions remain regarding the detailed mechanistic roles of cellulase linkers. For example, it is unknown why cellulases from different families employ linkers of different average lengths, or why and how linkers from different organisms affect overall enzyme stability and activity.290 Certainly glycosylation is known to be important for linker protection404 and more recently for binding to cellulose,393 but glycosylation is known to be quite heterogeneous and it can differ significantly between fungi.421 How linker sequences and glycan patterns coevolved is an open question. Moreover, most fungal cellulase linkers examined to date come from cellulases of T. reesei or similar fungi. However, some fungal cellulases, such as those found in rumen fungi, employ linkers with dramatically different sequence characteristics, such as extremely high asparagine content.417 Clearly, many structure−function studies on both CBMs and linkers remain to be done to fully understand their detailed roles in cellulose deconstruction.

6. FAMILY 7 GLYCOSIDE HYDROLASES

GH7 enzymes are commonly among the most prevalent cellulolytic enzymes in secretomes of biomass-degrading fungi, almost undoubtedly because the processive GH7 cellulases provide the majority of hydrolytic turnover during fungal cellulose depolymerization. For example, T. reesei secretes a single GH7 CBH and a single GH7 EG.41 White-rot basidiomycete fungi, such as the model fungus P. chrysosporium, degrade lignin in plant cell walls most likely for enhanced access to biomass polysaccharides, and also employ GH7 CBHs and EGs for cellulolytic action.426,427 Similar to some organisms that employ GH7 CBHs to degrade biomass, P. chrysosporium has multiple GH7 CBHs, and in virtually all organisms that exhibit multiple GH7 CBH genes, the need for multiple, similar CBHs from the same family is as yet unknown. Moreover, unlike other GH enzymes in this review (i.e., GH6, GH5, GH12, and GH45), GH7 enzymes have not been found in bacteria or archaea to date, but mainly have been found in fungi. The CAZy database currently lists nearly 5000 GH7 sequences, although some are noted to be gene fragments only, not full-length enzymes.151−153

Quite recently, GH7 enzymes have been characterized in crustaceans,428 protist symbionts,429 stramenopiles,428 and slime molds (or social amoeba),430−432 demonstrating that they are not only found in fungi. Interestingly, as highlighted by King et al., these nonfungal GH7 cellulases offer distinct evolutionary branches from fungal cellulases, inspiring the study of the similarities and differences with their fungal counterparts.383,428,432,433

The first GH7 cellulase to be discovered and characterized came from T. reesei in the late 1970s and early 1980s and was originally denoted CBH I.434−438 As mentioned in section 4, the gene was subsequently sequenced simultaneously by two groups in the early 1980s.281,282 Initial work from Pettersson et al. also revealed that TrCel7A is a multimodular protein; these studies were among the first to discover the coupling of binding and catalytic function in cellulases, as discussed above.343,344 As a result of the significant body of work on GH7 cellulases, especially CBH I from T. reesei


Table 7. Reported Fungal GH7 Crystal Structures source and original name in primary citation Trichoderma reesei CBH1/Cel7A

Heterobasidion irregulare Cel7A Limnoria quadripunctata Cel7B

Melanocarpus albomyces Cel7B

Phanerochaete chrysosporium Cel7D

Talaromyces emersonii Cel7A

Trichoderma harzianum Cel7A

PDB code

resolution (Å)

1CEL 1DY4 1EGN 1Q2B 1Q2E 2V3I

1.80 1.90 1.60 1.60 1.75 1.05

2CEL 3CEL 4CEL 5CEL 6CEL 7CEL 4C4C 4C4D

2.00 2.00 2.20 1.90 1.70 1.90 1.45 1.32

2YG1 2XSP 4GWA 4HAP 4HAQ 4IPM 2RFW 2RFY 2RFZ 2RG0 1GPI

1.90 1.70 1.60 1.60 1.90 1.14 1.60 1.70 1.80 2.10 1.32

1H46 1Z3T 1Z3V 1Z3W 1Q9H 3PFJ 3PFX 3PFZ 3PL3 2Y9N 2YOK

1.52 1.70 1.61 1.70 2.35 1.36 1.26 1.10 1.18 2.89 1.67

1EG1 1OVW 2OVW 3OVW 4OVW 1A39 2A39 1DYM 1OJI 1OJJ 1OJK

3.60 2.70 2.30 2.30 2.30 2.20 2.20 1.75 2.15 1.40 1.50

brief highlights

ref

CBH Structures
first GH7 structure reported; complex with o-iodobenzyl-1-thio-β-D-cellobioside
complex with (S)-propranolol
engineered variant E223S/A224H/L225V/T226A/D262G
engineered variant D241C/D249C
engineered variant, deletion of residues 245−252; complex with Glc2-S-Glc2
highest resolution of a CBH1 from T. reesei; complex with (R)-dihydroxy phenanthrenolol
active-site mutant E212Q
active-site mutant E212Q; complex with cellobiose
active-site mutant D214N
active-site mutant E212Q; complex with two cellotetraose molecules
active-site mutant E212Q; complex with cellopentaose and cellotetraose
active-site mutant E217Q; complex with cellohexaose and cellobiose
active-site mutant E217Q; Michaelis complex
active-site mutant E217Q; covalent glycosyl-enzyme intermediate trapped using DNP-2-deoxy-2-fluoro-cellotrioside
complex with xylose
first structure of a nonfungal GH7 CBH from a salt tolerant marine animal
complex with cellobiose
complex with cellobiose and cellotriose
complex with thiocellobiose
thermotolerant GH7 CBH1
complex with cellobiose
complex with cellotriose
complex with cellotetraose
first structure of a GH7 CBH1 from a basidiomycete
complex with (R)-propranolol
complex with cellobiose
complex with lactose
complex with cellobioimidazole
first structure of a thermostable GH7 CBH
complex with cellobiose
complex with cellotetraose
complex with cellopentaose

172 463 328 461 461 unpublished 446 446 446 173 173 173 441 441 449 449 383 383 383 383 465 465 465 465 460 462 458 458 458 464 unpublished unpublished unpublished unpublished unpublished 466

EG Structures Trichoderma reesei Cel7B Fusarium oxysporum Cel7B

Humicola insolens Cel7B

first GH7 EG; complex with thiocellotriose complex with cellobiose complex with epoxybutyl cellobiose engineered variant S37W/P39W wild-type engineered variant E197A engineered variant E197S engineered variant E197S; complex with lactose engineered variant E197S; complex with cellobiose

450 174 448 448 448 457 452 452 653 653 653

in the late 1970s and 1980s, these enzymes were included in the first classification of cellulases from Henrissat et al., denoted as family C enzymes based on hydrophobic cluster analysis.439 In this section, we review the history of GH7 studies primarily starting from the first structural report in 1994.172 Given the abundance of these enzymes in natural biomass-degrading fungi and their importance in industrial biomass conversion, a significant number of studies have sought to elucidate the GH7 catalytic mechanism, understand the basis of GH7 CBH processivity, improve their thermostability and activity, and ascertain other important features of their structure and function. Here, we focus on structure and activity studies of these key enzymes. We note that we do not include a significant review of studies before the first


Figure 22. Crystal structure of the first GH7 CBH and EG. The ligand from the TrCel7A Michaelis complex (PDB code 4C4C441) is shown in all panels. (A) CBH TrCel7A CD (PDB code 1CEL172) view from side, exhibiting the β sandwich structure that is characteristic of GH7 enzymes. TrCel7A was the first GH7 structure solved and is the best-characterized member of GH7. (B) TrCel7A view from bottom showing the more closed substrate binding “tunnel”. (C) EG F. oxysporum Cel7B (PDB code 1OVW174) view from side. (D) FoCel7B view from the bottom showing the more open binding “groove”. (E) TrCel7A Michaelis complex (PDB code 4C4C441) exemplifies the standard numbering of the substrate binding sites (catalytic residues shown in green for reference). A cellulose chain enters from the −7 site. Hydrolysis occurs between the −1 and +1 sites; thus, the +1/+2 sites are termed the “product sites”.

structural reports, except where needed, as these studies have been reviewed extensively.32,316,440 We note that, despite the huge body of work conducted to date on these enzymes, there are still major elements of their function that remain to be elucidated, especially related to improving their stability and activity. Moreover, given the wealth of GH7 sequences available from genomics and metagenomics efforts, expression and testing of GH7 gene products are now of acute importance for the continued development of structure−activity relationships. A summary of the GH7 structures discussed is provided in Table 7, and biochemical data for characterized GH7s are provided in Table 5.
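The subsite numbering convention summarized in the Figure 22 caption (negative subsites running up to −1, positive subsites starting at +1, no subsite 0, and hydrolysis between −1 and +1) can be sketched in a few lines of Python. The function names below are hypothetical and introduced only for illustration; they do not come from any published tool.

```python
def occupied_subsites(first_subsite, n_glucosyl):
    """List the subsites filled by a chain of n_glucosyl units.

    The chain starts at first_subsite and extends toward the product
    sites. Subsite numbering skips zero, so -1 is followed by +1.
    """
    if first_subsite == 0:
        raise ValueError("there is no subsite 0")
    if n_glucosyl < 1:
        raise ValueError("a ligand needs at least one glucosyl unit")
    sites = []
    site = first_subsite
    for _ in range(n_glucosyl):
        sites.append(site)
        site += 1
        if site == 0:  # skip the nonexistent subsite 0
            site = 1
    return sites


def spans_cleavage_site(sites):
    """True if the ligand occupies both -1 and +1, i.e. the scissile
    glycosidic bond sits between two occupied subsites."""
    return -1 in sites and 1 in sites


def format_subsite(site):
    """Render a subsite with an explicit sign, e.g. -7 or +2."""
    return f"{site:+d}"
```

For example, `occupied_subsites(-7, 9)` enumerates the nine subsites from −7 to +2 occupied by the cellononaose ligand of the TrCel7A Michaelis complex, and `spans_cleavage_site` reports that this ligand crosses the cleavage site, with the +1/+2 product sites filled.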


6.1. Structural Studies and Catalytic Function

6.1.1. TrCel7A: Wild-Type. The first crystal structure of a GH7 member was presented in 1994 (PDB code 1CEL), revealing the structure of the catalytic core of the CBH TrCel7A,172 whose primary structural characteristic is a large β-sandwich formed by two large antiparallel β-sheets. This original structure was a complex with the inhibitor o-iodobenzyl-1-thio-β-D-cellobioside; Figure 22A,B shows the 1CEL crystal structure with the cellononaose ligand from the solved structure of the Michaelis complex (PDB code 4C4C441), published 20 years later. The structure revealed a binding tunnel that was estimated to have 7 binding sites, roughly twice as long as that of the TrCel6A CBH (the only other CBH with a solved structure at that time192). As in Cel6A, the binding tunnel is lined with tryptophan residues (three in Cel6A and four in Cel7A). Though it was previously determined via 1H NMR that TrCel7A (later generalized to the entire GH7 family442) employs a two-step, retaining catalytic mechanism (contrasted with the one-step, inverting mechanism of TrCel6A),443 the structural machinery had not yet been revealed. This mechanism utilizes two glutamate residues that generally reside approximately 5.5 Å apart in the catalytically active conformation (Figure 23).444 One of these residues is the nucleophile for the first step (glycosylation), and the other is the acid/base. The acid/base donates a proton to the glycosidic oxygen in the first step and removes a proton from a water molecule that serves as the second step nucleophile.

On the basis of the 1CEL crystal structure, two glutamate residues were proposed as the catalytic residues: Glu217 as the catalytic acid/base and Glu212 as the nucleophile. On the basis of its proximity to the O4 atom of the o-iodobenzyl-1-thio-β-D-cellobioside glucosyl moiety (occupying the putative position of the cleavable glycosidic bond), Glu217 was suggested as the acid/base (Figure 23).172 A third acidic residue, Asp214 in TrCel7A, was noted to potentially be involved in the chemical steps172 due to its proximity to the active site and close contact with the putative nucleophile, Glu212. In addition, the location of the proposed active site suggested that the cellulose chain is cleaved from its reducing end. This was consistent with previous findings that established the preference of TrCel7A to hydrolyze the reducing ends of cellulose chains using cello-oligosaccharides with tritium-labeled reducing ends.445 It was also proposed that the longer binding tunnel, along with the location of cleavage that is skewed toward the tunnel exit, would make Cel7A more processive than Cel6A. Given its importance as the first structural representative of a GH7 CBH and the fact that many studies have used this enzyme as the model GH7 CBH, in the discussion that follows, the numbering of individual enzyme residues corresponds to that of TrCel7A, unless otherwise noted.

Figure 23. First structural picture of a GH7 active site. TrCel7A (PDB code 1CEL) was the first GH7 structure solved. The o-iodobenzyl-1-thio-β-D-cellobioside inhibitor captured in the product sites helped to identify Glu217 as the potential acid/base in the retaining mechanism. Glu212 was proposed as the nucleophile with the role of Asp214 not yet elucidated.

6.1.2. TrCel7A Catalytic Mutants. Site-directed mutagenesis of TrCel7A was subsequently employed in 1996 to further probe the roles of the triad of acidic residues in glycosidic bond cleavage by mutation to their isosteric amide counterparts.446 The catalytic activity of the individual point mutants E212Q, D214N, and E217Q was impaired on 2-chloro-4-nitrophenyl-β-D-lactoside, with kcat reduced to 1/2000, 1/85, and 1/370, respectively, of the wild-type value. In addition, E212Q and E217Q mutants lost all catalytic activity on crystalline cellulose.446 Crystal structures of each mutant showed that the active site architecture and overall fold of the protein are identical in all the mutants and the wild-type, thus confirming that the activity loss was due to the catalytic roles of these three residues (PDB codes 2CEL, 3CEL, and 4CEL). Importantly, the D214N structure (PDB code 4CEL) revealed a calcium ion bound to Glu212, supporting the hypothesis that this residue is the charged species in the precatalytic state.

6.1.3. F. oxysporum Cel7B with Active Site-Spanning Nonhydrolyzable Inhibitor. Also in 1996, important structural details of the substrate at the active site during catalysis were revealed by the first crystal structure of a GH7 EG from F. oxysporum (FoCel7B) complexed with a nonhydrolyzable thiooligosaccharide inhibitor (PDB code 1OVW, Figure 22C,D).174 EGI was previously characterized as having four binding sites via analytical HPLC measurements and kinetic assays of this enzyme on reduced cello-oligomers of varying length (DP 3−6).442 The active site-spanning substrate analogue captured in the FoCel7B crystal structure occupies the −2, −1, and +1 binding sites (Figure 24). Along with the structure of a chitin-degrading enzyme published earlier the same year,447 this crystal structure was the first to reveal an

Figure 24. Enzymatic substrate distortion in the GH7 active site. FoCel7B (PDB code 1OVW174) was the first GH7 EG structure solved. Note the ring distortion at the −1 subsite that is midway between a 4E envelope and a 1,4B boat conformation. Sulfur atoms (shown in yellow) replace the glycosidic oxygen atoms. Glu197 is the nucleophile, and Glu202 is the catalytic acid/base.
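The kcat reductions quoted in section 6.1.2 for the E212Q, D214N, and E217Q mutants can be put on an energy scale. The Python sketch below assumes, purely for illustration, that kcat reports on a single rate-limiting chemical step that follows transition-state theory with an unchanged prefactor, so that a fold reduction f corresponds to an increase of RT ln f in the apparent activation free energy. The temperature of 298.15 K is likewise an assumption, as the assay temperature is not stated here, and the names are hypothetical.

```python
import math

GAS_CONSTANT_KCAL_PER_MOL_K = 1.987204259e-3

# Fold reductions in kcat relative to wild-type TrCel7A, as quoted in the
# text for 2-chloro-4-nitrophenyl-beta-D-lactoside hydrolysis.
FOLD_REDUCTION_IN_KCAT = {"E212Q": 2000.0, "D214N": 85.0, "E217Q": 370.0}


def barrier_increase_kcal_per_mol(fold_reduction, temperature_k=298.15):
    """Apparent increase in activation free energy, RT * ln(fold reduction).

    Assumes one rate-limiting step and an unchanged pre-exponential factor.
    """
    if fold_reduction <= 0 or temperature_k <= 0:
        raise ValueError("fold reduction and temperature must be positive")
    return (GAS_CONSTANT_KCAL_PER_MOL_K * temperature_k
            * math.log(fold_reduction))


def barrier_increases(temperature_k=298.15):
    """Apparent barrier increase for each catalytic mutant."""
    return {
        mutant: barrier_increase_kcal_per_mol(fold, temperature_k)
        for mutant, fold in FOLD_REDUCTION_IN_KCAT.items()
    }
```

Under these assumptions the three mutations correspond to apparent barrier increases of roughly 4.5, 2.6, and 3.5 kcal/mol, respectively.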


Figure 25. Loop structures in GH7 enzymes. The major loops of TrCel7A (PDB code 4C4C441) and TrCel7B (PDB code 1EG1450). The loop nomenclature is taken from Momeni et al.449 Note the deletions of most major loops in TrCel7B. The ligand from 4C4C is shown in both panels.

Figure 26. Key structural differences among GH7 EGs. (A) At the tunnel entrance, TrCel7B (PDB code 1EG1, shown in orange) is the only GH7 EG with solved structure to naturally maintain the tryptophan stacking seen in TrCel7A. Also shown is the HiCel7B S37W/P39W mutant (PDB code 1A39, magenta). (B) Similar to the tunnel entrance, TrCel7B is the only one of the three GH7 EGs with solved structure to maintain the aromatic stacking (Tyr38) at the −4 site seen in TrCel7A (Trp38). The HiCel7B S37W/P39W mutant is also shown, as well as the residues occupying this site in FoCel7B (PDB code 1OVW, gray) and wild-type HiCel7B (PDB code 2A39, slate): Ile37 and Ser37, respectively. (C) The product binding sites are quite different in the three EGs as compared with TrCel7A. Two of the three arginine residues that contact the ligand in TrCel7A are absent in all three EGs (Arg251 and Arg394). Arg267, however, is found in HiCel7B and FoCel7B. These two EGs also have an additional residue that hydrogen bonds to the substrate (His209) that is not found in either of the T. reesei enzymes. In all panels, the background protein from TrCel7B (transparent orange “cartoon”) and the ligand from the TrCel7A Michaelis complex (PDB code 4C4C, in aquamarine “sticks”) are shown.

Figure 27. Comparison of the loops of GH7 EGs. Compared with GH7 CBHs, GH7 EGs display significantly shorter (or altogether deleted) loops that connect the two faces of the β sandwich. (A) TrCel7B (PDB code 1EG1) displays the most truncated loop structures of any GH7 cellulase with a solved structure. (B) FoCel7B (PDB code 1OVW) and (C) HiCel7B (PDB code 2A39) have slightly more prominent B3 and B4 loops than TrCel7B. In all panels, the ligand from the TrCel7A Michaelis complex (PDB code 4C4C) is shown in aquamarine “sticks”.

Figure 28. Sequence alignment of major GH7 enzymes. Sequence alignment of three GH7 CBHs (TrCel7A, PcCel7D, and HirCel7A) and three EGs (TrCel7B, HiCel7B, and FoCel7B). Strictly conserved residues are shown in red block, and chemically similar residues in red text. The blue boxes indicate chemical similarity across a grouping of residues. The secondary structural elements and residue numbering of TrCel7A are shown above the sequences. Loop structures (A1, B1, etc.) are shown in black boxes. The catalytic triad is denoted by yellow stars. The sequence alignment was generated with ESPript (http://espript.ibcp.fr).347

intact glycosidic bond across the cleavage sites (Figure 24).190 The glucosyl ring at the −1 subsite was distorted into a nonchair conformation (later termed a “skew boat”448). As discussed in section 3.2, GHs have been proposed to distort substrate ring conformations to aid catalytic bond cleavage. This structure was further evidence that this distortion is indeed critical for providing the nucleophile access to the −1 anomeric carbon for nucleophilic attack.174 The structure of FoCel7B also provided additional structural evidence confirming the importance of the three acidic residues that had been identified previously for TrCel7A. Glu202 (corresponding to TrCel7A Glu217) hydrogen bonds (at an O-S distance of 2.7 Å) to the glycosidic sulfur of the inhibitor between the −1 and +1 glucosyl rings (Figure 24). In addition, Glu197 (corresponding to TrCel7A Glu212) was poised for nucleophilic attack, residing 3.2 Å away from the −1 anomeric carbon. Finally, it was noted that Asp199 (corresponding to TrCel7A Asp214) formed a hydrogen bond (2.5 Å) to the catalytic nucleophile. Though its catalytic role was still unclear, it was speculated to be involved in “proton shuffling” around the active site, maintaining the catalytically active protonation states for the nucleophile and acid/base.

Though not discussed at length in the original publication, this structure was also the first to reveal the dramatic loop shortenings that were later identified as being characteristic of GH7 EGs. In fact, comparison with the structure of TrCel7A shows that loops B1, B2, B3, B4, A1, and A4 are all considerably shortened or absent altogether in FoCel7B. Figure 25 provides the loop nomenclature utilized herein for GH7 cellulases, following Momeni et al.449

6.1.4. TrCel7B. In 1997, the crystal structure of T. reesei EG (PDB code 1EG1), at the time defined as EGI but later renamed to TrCel7B, revealed the same overall fold as TrCel7A but with a completely open binding site “cleft” compared to the tunnel observed for TrCel7A.450 In TrCel7B, loops B2, B3, B4, and A4 are absent (Figure 25). However, loop A1 is of similar length in TrCel7B as it is in TrCel7A, though slightly more open in the crystal structure. Sequence alignments had previously indicated that major deletions seen in TrCel7B mapped to the tunnel-forming loop regions of TrCel7A. The TrCel7B crystal structure confirmed the absence of these loops, yielding a more open binding cleft. These differences in loop structure potentially provided a strong rationale for the difference in functionality of CBHs versus EGs. The additional surface loops in CBHs were hypothesized to prevent an extracted cellulose chain from readhering to the crystalline surface postcatalysis as well as keeping the chain threaded in the tunnel for multiple hydrolytic events before dissociation, enabling CBHs to processively cleave cellobiose much more effectively than EGs.

The same year, the identity of the GH7 catalytic nucleophile was confirmed by isolating the glycosyl-enzyme intermediate of FoCel7B.451 The glycosyl-enzyme intermediate was captured by incubation of the enzyme with 2′,4′-dinitrophenyl 2-deoxy-2-fluoro-β-cellobioside, a method that has found great utility in identifying active site residues, as mentioned in section 3.2.170,206,452−454 This class of inhibitors slows both steps of the retaining mechanism due to the presence of the C2 fluorine. Coupling this with a good leaving group that accelerates only the first step (glycosylation) allows for “trapping” of the glycosyl-enzyme intermediate.202,451,455 Subsequent to trapping, characterization via mass spectrometry allows for identification of the nucleophilic residue. In this case, the nucleophile was identified as Glu197, which is completely conserved in GH7s and corresponds to Glu212 in TrCel7A. Around the same time, a combination of comparative liquid chromatography coupled online to electrospray ionization mass spectrometry, tandem mass spectrometry, and microsequencing applied to TrCel7A bound with an epoxide-based inhibitor provided direct experimental proof for Glu212 as the catalytic nucleophile for that enzyme and also revealed the glycosylation pattern of the core protein.456

6.1.5. H. insolens Cel7B S37W/P39W Mutant. Also in 1997, the structure of a double mutant of H. insolens Cel7B (HiCel7B; PDB code 1A39) was published wherein additional sugar-binding sites were engineered on the basis of comparison with TrCel7A.457 Two tryptophan residues were inserted (S37W and P39W) to introduce “+3” and “+4” binding sites (Figure 26A,B). The resulting mutant maintained a wild-type level of activity on soluble substrates, but had a slightly decreased Michaelis constant KM (by 30%) on PASC, indicating slightly higher binding on longer substrates. Subsequent publication of the wild-type structure of this enzyme (PDB code 2A39)452 further described the more open binding cleft of EGs first observed structurally for FoCel7B and first discussed in detail for TrCel7B (Figure 27).450 These findings also solidified the structural basis for the observed differences in EGs and CBHs. The same study that presented wild-type HiCel7B also utilized site-directed mutagenesis to confirm the catalytic roles of Glu202 (acid/base, corresponding to Glu217 in TrCel7A) and Glu197 (nucleophile, corresponding to Glu212 in TrCel7A).452 E197A and E202A mutants had no detectable catalytic activity, either on reduced oligosaccharides or p-nitrophenyl cellobioside substrates.452 This study also confirmed the identity of the catalytic nucleophile for HiCel7B utilizing the technique previously applied to FoCel7B451 in which the glycosyl-enzyme intermediate is trapped and subsequently characterized by mass spectrometry.

The three EG structures described directly above constitute the only GH7 EG structures solved to date. The most dramatic structural difference between these EGs and TrCel7A is in the loops that protrude from the two faces of the β sandwich. All of the major loops (with the exception of A1 at the tunnel entrance) are shortest in TrCel7B compared with the other GH7 EGs; thus, its binding cleft is the most open of any GH7 cellulase characterized to date (Figure 25B and Figure 27A). The shortenings/deletions of both the so-called “exo” loop (B3) as well as another large loop (B2) are particularly dramatic in TrCel7B. However, compared with HiCel7B and FoCel7B, TrCel7B has the longest entrance site loop (A1), which is on par with that of TrCel7A. Further comparing HiCel7B with FoCel7B shows that the loop structures for these two enzymes are nearly identical, with the minor exception of a slightly lengthened loop in HiCel7B on the back of the binding tunnel entrance.

All three GH7 EGs align remarkably well to TrCel7A with regard to active site protein residues (Figure 28). The aromatic−carbohydrate interactions (discussed in more detail below) are conserved in all three EGs at Trp367 (over the −2 subsite) and Trp376 (over the +1 subsite). However, the important interaction between Trp40 and the −7 sugar residue is naturally present in only one known EG structure, TrCel7B (Figure 26A). TrCel7B is also the only EG to naturally maintain an aromatic stacking interaction at the −4 subsite, albeit with a tyrosine rather than the tryptophan found in TrCel7A (both at residue number 38, Figure 26B). The other two EGs have no aromatic interaction here; this site is occupied by isoleucine in FoCel7B and by serine in HiCel7B.

Despite the exceptional similarity in catalytic machinery and some similarity in aromatic−carbohydrate interactions, significant differences exist not only in the aforementioned loop morphologies, but also in relevant enzyme residues at the tunnel entrance and the product binding sites. As noted above, amongst EGs only TrCel7B maintains the protein−carbohydrate stacking between Trp40 and the −7 subsite sugar at the tunnel entrance. Though both enzymes lack this interaction,


Figure 29. Comparison of the loops of GH7 CBHs. The loops that extend from the faces of the β sandwich in GH7 CBHs enclose the substrate binding tunnel to varying degrees. Among GH7 CBHs with known structures, the A2, A3, A4, B1, B2, and B4 loops are fairly similar. (A) TrCel7A encloses substrate most fully of any GH7 CBH. (B) PcCel7D features a shortening of the A1 loop and a natural deletion of six residues on loop B3 (“exo” loop) that give it a more open active site than TrCel7A (Figure 30A). (C) HirCel7A features a lengthened A1 loop (Figure 30A) and a slightly shortened B3 loop (compared with TrCel7A) due to the natural deletion of two residues. In all panels, the ligand from the TrCel7A Michaelis complex (PDB code 4C4C) is shown in aquamarine “sticks”.

HiCel7B and FoCel7B are not identical at the entrance, as HiCel7B has essentially no interactions with the ligand here, but FoCel7B has an arginine residue (Arg41, FoCel7B numbering) within hydrogen bonding distance to both the −7 and −6 subsites that TrCel7A lacks. The Asn49 residue in TrCel7A forms hydrogen bonds to both the −7 and −6 subsite glucosyl moieties, an interaction that is not found in any of the GH7 EGs.

Moving toward the tunnel exit, an increasing number of discrepancies are present between TrCel7A and the three GH7 EGs. This is partly due to missing loops that enclose the tunnel exit in TrCel7A (A4 and B4, Figure 25). Another key difference in the product sites is three arginine residues present in TrCel7A (Arg251, Arg267, and Arg394) that are at or close to hydrogen bonding distance to the +1/+2 glucosyl residues (Figure 26C). Arg251 is located at the base of loop B3 and has been identified as being particularly important to product coordination.458 The guanido group of this residue has been observed to make direct hydrogen bonds with the O5 and O6 atoms of the sugar in the +1 binding site.458 All three EGs lack this residue. All three also lack Arg394, which has been identified in a computational study of processivity in TrCel7A as being one of the key drivers of processive motion.459 The conservation of these arginine residues in processive cellobiohydrolases (Figure 28) and absence in relatively nonprocessive EGs perhaps affirm their importance in driving processive motion. In contrast to the other two arginine residues present at the TrCel7A product sites, HiCel7B and FoCel7B maintain the Arg267 interaction. These latter two EGs also have an additional interaction at the product sites that both T. reesei enzymes lack: His209, which is within hydrogen bonding distance to the C2 hydroxyl of the +1 glucosyl residue.

6.1.6. TrCel7A: Cello-Oligomer Complexes. The extensive insights into GH7 architecture provided by the three EG structures were greatly enhanced by subsequent publication of new CBH structures. For example, crystal structures of TrCel7A E212Q and E217Q mutants published in 1998 gave much greater insight into the binding of cello-oligomers in GH7 CBHs.173 These structures included the binding of (1) two cellotetraose molecules on either side of the vacant −3 site (PDB code 5CEL), (2) cellopentaose in −6 to −2 and cellotetraose in +1 to +4 (PDB code 6CEL), and (3) cellohexaose in −7 to −2 and cellobiose in +1/+2 (PDB code 7CEL). This more complete view of cellulose chain binding revealed 9−10 binding sites (−7 to either +2 or +3) in a 50 Å long tunnel (the +3 site has almost no interaction with enzymatic residues, and the +4 site is outside of the binding tunnel and has no carbohydrate−protein interactions). From −7 to −4 the chain binding for all ligands overlaps almost perfectly. Two twists occur between −4 and −2 that essentially turn the cellulose chain upside down. By linking the cellohexaose and cellobiose contained in 7CEL and modeling a “skew boat” configuration in the −1 site (based on previously published chitobiase447 and FoCel7B174 structures), a theoretical model of the Michaelis complex (PDB code 8CEL) was also presented that became the basis for modeling studies of this enzyme for the next 15 years.

6.1.7. P. chrysosporium Cel7D. The white-rot basidiomycete P. chrysosporium secretes six unique GH7 CBHs.426 The year 2001 marked the publication of the crystal structure of PcCel7D, the first GH7 CBH crystal structure from a basidiomycete,460 and revealed a substrate binding “groove” that is more open than TrCel7A, but not as open as the GH7 EGs. This more open architecture is the result of several loop deletions/shortenings compared with TrCel7A. At the binding tunnel entrance, PcCel7D has a shortened A1 loop but also an extra tyrosine (Tyr47) covering the entrance on the opposite side of the tunnel entrance that TrCel7A lacks (Figures 29 and 30A). In addition, PcCel7D has a much shorter B3 loop and slightly shorter B2 loop compared with TrCel7A. These loop variations give a more accessible active site and may be the structural explanation for PcCel7D’s enhanced kcat and KM on small soluble substrates. The more open structure may also explain the reduction in the binding of the cellobiose product, easing product inhibition for this enzyme relative to TrCel7A.461 The specific residues responsible for this are discussed further in section 6.3.

The difference in substrate binding architecture between TrCel7A and PcCel7D may also be relevant to the binding of different enantiomers of the β-blocker propranolol. Both enzymes prefer the S enantiomer over the R,462 yet crystal structures of TrCel7A (PDB code 1DY4) could only be obtained with S,463 while PcCel7D (PDB code 1H46) only gave crystals with R.462 The enantioselectivity of TrCel7A may be a largely entropy-driven process, as it has been shown to increase with temperature.462 Thus, the more open active site, and the resulting increase in solvation, could explain this difference.

6.1.8. TrCel7A: Exo Loop Engineering. The comparison of TrCel7A and PcCel7D was extended in 2003 by extensive

DOI: 10.1021/cr500351c Chem. Rev. 2015, 115, 1308−1448

Chemical Reviews

Review

Figure 30. Key structural differences among GH7 CBHs. (A) GH7 enzymes exhibit A1 loops that are shortened (exemplified by PcCel7D in light blue, PDB code 1GPI) and lengthened (exemplified by HirCel7A in light pink, PDB code 2YG1) compared with TrCel7A (green, PDB code 4C4C). On the opposite side of the glucan chain, Trp40 is conserved in all cases. HirCel7A and PcCel7D both have an extra tyrosine that protrudes over the tunnel entrance: HirCel7A Tyr101 over the top and PcCel7D Tyr47 on the opposite side. (B) Variation in tyrosines on the A3 and B3 loops in GH7 CBHs: the lack of Tyr371 (numbering for TrCel7A) renders the B3 loop of ThCel7A (magenta, PDB code 2Y9L) more flexible, and LqCel7A (yellow, PDB code 4GWA) lacks the Tyr247 equivalent and has a slightly shortened B3 loop; TrCel7A possesses a tyrosine at the tip of both the A3 and B3 loops, and these have multiple conformation (PDB code 1CEL shown in teal, exemplifies this alternate conformation). (C) Key variable residues that coordinate the glucosyl residues in the product sites. TrCel7A lacks the aspartate interaction at the +2 site that is seen in PcCel7D (Asp336) and HirCel7A (Asp347). All three CBHs maintain the three arginine residues shown. In addition, due to the shortening of the B3 loop in PcCel7D (six residues) and HirCel7A (two residues), both of these lack Tyr247 and Thr246, which are possessed by TrCel7A. In all panels, the ligand from the TrCel7A Michaelis complex is shown in aquamarine “sticks” (PDB code 4C4C).

engineering of the active-site loop B3 (referred to as the “exo” loop in the original publication) of TrCel7A; this loop is the most prominent active-site structural difference between these two enzymes.461 This included (separately) the introduction of a Y247F mutation at the tip of the loop (no crystal structure presented), deletion of the middle eight residues of the loop (PDB code 1Q2E), and stabilization by introduction of a disulfide bridge (PDB code 1Q2B). The three mutations showed little effect on the hydrolysis of small, soluble substrates. The deletion mutant gave enhanced activity on amorphous cellulose, but a 50% activity reduction on crystalline cellulose; this was associated with a reduction in processive character. The disulfide bridge resulted in enhanced activity on both amorphous and crystalline cellulose. Taken together, these data confirmed the previous hypothesis that the B3 loop is integral to the high processivity of TrCel7A. 6.1.9. TeCel7A. In 2004, the crystal structure of T. emersonii (or Rasamsonia emersonii) CBH IB (TeCel7A) represented the first structure of a thermostable GH7 CBH and also the first of a GH7 CBH naturally lacking a CBM (PDB code 1Q9H).464 In general, the structure of TeCel7A is quite similar to TrCel7A, with two exceptions. First, the tip of the B2 loop was not resolved in the TeCel7A crystal structure, so comparison is not possible. Also, the A1 loop at the binding tunnel entrance is extremely similar to that of PcCel7D (Figure 30A), which is somewhat shortened from that of TrCel7A.464 6.1.10. PcCel7D Bound with Disaccharides. Further structural studies of PcCel7D in 2005 focused on the interaction with inhibitors: natural product cellobiose (PDB code 1Z3T), lactose (PDB code 1Z3V), and cellobioimidazole (PDB code 1Z3W).458 The structural information revealed here was used to explain the differences in affinities for cellobiose and lactose between PcCel7D and TrCel7A. 
Residues on the B3 loop may provide more possibilities for interaction with both the substrate and the product in TrCel7A. TrCel7A, however, lacks a conserved acidic residue (Asp336) that was shown to interact

with a glucopyranose in the +2 subsite of PcCel7D (Figure 30C). The studies also revealed that Tris (buffer molecule) can bind in the active site, and that both Tris and calcium inhibit PcCel7D CBH activity. 6.1.11. M. albomyces Cel7B. The structure of another thermostable GH7 CBH natively lacking a CBM was published in 2008 from M. albomyces.465 MaCel7B (PDB code 2RFW) was crystallized with four molecules within one asymmetric unit and was determined both in apo form and with bound cellobiose, cellotriose, and cellotetraose. This gave a structural analysis of altogether 16 different “structure snapshots” of the same enzyme and revealed a surprisingly large conformational variability of the substrate interaction with the enzyme. The structure of MaCel7B is also quite similar to TrCel7A. The most significant difference from TrCel7A is that MaCel7B possesses a slightly elongated entrance loop A1, which features a unique tyrosine residue (Tyr100, which neither PcCel7D nor TrCel7A possesses) that sits “above” the −7 binding site, opposite the conserved tryptophan at this site (Trp40 in TrCel7A). 6.1.12. Heterobasidion irregulare Cel7A. The comparison of PcCel7D and TrCel7A was taken one step further by Momeni and Payne et al. in 2013.449 This publication presented the structure of Cel7A from the root-rot fungus and basidiomycete H. irregulare (PDB code 2YG1). HirCel7A also possesses the elongated A1 loop seen in MaCel7B, including the extra tyrosine residue (Tyr101 in HirCel7A; Figure 30A).465 HirCel7A has a slightly shorter B3 loop (deletion of two residues) compared with TrCel7A (and PcCel7D lacks this loop entirely), resulting in the loss of stable contacts across the binding tunnel with loop A3 that TrCel7A possesses (Figures 29 and 30B). 
Structural comparisons of the three enzymes were complemented by MD simulations to examine the flexibility of tunnel closing loops and the potential role for these in the recognition of substrate, endoinitiation, and the processive action of the enzyme. On this basis, it was suggested that HirCel7A exhibits intermediate properties


between TrCel7A and PcCel7D in terms of processivity and possible endo-initiation capability. 6.1.13. T. harzianum Cel7A. In 2013, the structure of T. harzianum Cel7A (ThCel7A; PDB code 2Y9L), an enzyme with 81% sequence identity with TrCel7A, revealed a few significant structural differences.466 The entrance to the substrate-binding tunnel is more open in ThCel7A than TrCel7A, due to the shortening of the A1 loop, very similar to that seen in PcCel7D (Figure 30A) and TeCel7A. The second highlighted difference was the potentially greater flexibility of the B3 loop (denoted loop 4 in the original publication) as a result of a single residue substitution between ThCel7A (Ala384) and TrCel7A (Tyr371). This substitution is not on the B3 loop itself, which is highly conserved between the enzymes, but on the opposing side of the tunnel (Figure 30B). MD simulations of the two enzymes confirmed that this substitution does increase the flexibility of the B3 loop and results in a more open binding tunnel.466 6.1.14. L. quadripunctata Cel7B. The year 2013 also featured publication of the first structure of a nonfungal GH7 CBH from the marine woodborer L. quadripunctata.383 Four high-resolution LqCel7B structures (PDB codes 4GWA, 4HAP, 4HAQ, and 4IPM) constituted the basis for a structural analysis and comparisons with TrCel7A using MD simulations. LqCel7B was shown to have a highly acidic surface charge, which may be the source of its high activity in saline environments (Figure 31).

Figure 31. Electrostatic potential distribution on the solvent accessible surface of LqCel7B. Electrostatic potential between −7 kT/e and 7 kT/e is shown as a colored gradient from red (acidic) to blue (basic). (A) LqCel7B possesses an anomalously high frequency of acidic residues on its surface. These are likely required for activity in its native marine environment. (B) The surface charge of TrCel7A is much less acidic. Adapted from ref 383.

Figure 32. Michaelis complex and glycosyl-enzyme intermediate of TrCel7A. (A) TrCel7A Michaelis complex (PDB code 4C4C, ref 441). Note the 4E conformation of the substrate at the −1 site. (B) TrCel7A glycosyl-enzyme intermediate (PDB code 4C4D, ref 441) with covalent bond between the nucleophile and the broken cellooligomer chain. The cellobiose product is in unprimed glycosyl-enzyme intermediate mode. Note the approximately 30° rotation of the nucleophile during glycosylation.

At the substrate tunnel entrance LqCel7B exhibits the same A1 loop elongation, with an extra tyrosine (Tyr121 in this case), also seen in MaCel7B465 and HirCel7A (Figure 30A).449 The B3 loop conformation for LqCel7B is quite similar to that of HirCel7A (PDB code 2YG1). HirCel7A and LqCel7B both lack the tyrosine (Tyr247) that TrCel7A possesses that interacts with loop A3 across the binding tunnel (Figure 30B). Where LqCel7B and HirCel7A differ, though, is in the tyrosine (Tyr371 in TrCel7A) on the opposing A3 loop: LqCel7B possesses this tyrosine, whereas HirCel7A does not (Figure 30B). 6.1.15. TrCel7A Michaelis Complex and Glycosyl-Enzyme Intermediate. In 2014, two structural snapshots of key steps in the GH7 retaining mechanism were published (Figure 32), namely the Michaelis complex (PDB code 4C4C) and glycosyl-enzyme intermediate (PDB code 4C4D) of TrCel7A, both as acid/base disabled mutants (E217Q).441 The Michaelis complex featured a cellononaose chain that spans the entire binding tunnel. The glycosyl-enzyme intermediate contains a cellohexaose molecule covalently bound to the nucleophile and a cellobiose product filling the +1/+2 sites. The glycosyl-enzyme intermediate was captured by the highly successful method206,451−453 of incubation with 2,4-dinitrophenyl 2-deoxy-2-fluoro-β-cellotrioside, a mechanism-based suicide inhibitor.170,206,452−454 These structures constitute the first experimentally determined structural picture of the Michaelis complex for a GH7 CBH (confirming much of the geometry in the theoretical model of the Michaelis complex presented in 1998 with PDB code 8CEL173) and the first glycosyl-enzyme intermediate for any GH7 enzyme.441 6.1.16. GH7 Catalytic Insights from Molecular Simulation. Connecting the many geometrical changes between the static configurations captured in crystal structures with the dynamical variables that drive chemical reactions necessitates the use of computational modeling and simulation. Moreover, modeling is essential for the computation of individual free energy barriers and rates of fundamental process steps.111,159,467,468 This makes computational studies crucial to the development of enzymatic structure−function relationships.111 To that end, many computational studies have


examined the chemical steps of GH7 enzymes, both of EG TrCel7B469 and CBH TrCel7A, as reviewed below.441,470−472 Zhang et al. used hybrid QM/MM (quantum mechanics/molecular mechanics) calculations to compute the two-dimensional potential energy surface as a function of the key breaking and forming bonds for both hydrolytic steps of p-nitrophenyl-β-D-lactoside (pNPL) hydrolysis by EG TrCel7B.469 They found that the barrier to glycosylation (18.9 kcal/mol) is higher than the barrier to deglycosylation (10.5 kcal/mol). Site-directed mutagenesis experiments were also performed, with the goal of attributing functional roles for several near-active site residues that are well-conserved in GH7 enzymes. R108K, Y146F, Y170F, and D172N mutants revealed catalytic activities that were decreased between 130- and 7700-fold. Subsequent MD simulations of these mutants revealed a disrupted hydrogen bond network that resulted in more distant interactions with the catalytic residues, thus hindering either the nucleophilic attack or the proton transfer. This likely increases the catalytic free energy barriers, explaining the observed reduction in activity and underscoring the catalytic importance of the environment near the active site beyond simply the “catalytic residues”. Li et al. utilized QM and QM/MM to perform single-point calculations for both steps of the hydrolytic mechanism of TrCel7A.471 Their QM calculations indicate that the free energy barrier for step 2 is more than twice as high as that for step 1 at more than 30 kcal/mol (compared to a glycosylation barrier of around 14 kcal/mol). Yan et al. performed QM/MM umbrella sampling on wild-type TrCel7A as well as E212Q and D214N mutants.472 They studied only step 1 of the hydrolytic mechanism, finding a free energy barrier of more than 30 kcal/mol in all three cases. In addition to being difficult to reconcile with one another, these free energy barriers are difficult to reconcile with experimental hydrolysis rates, which would suggest a significantly lower hydrolytic barrier. Barnett et al. also utilized QM/MM umbrella sampling simulations for step 1 with wild-type TrCel7A,470 utilizing two coordinates (one breaking and one forming bond) along which to sample free energy. Their prediction of the step 1 free energy barrier was 17.5 kcal/mol, from which transition state theory predicts a reaction rate of 0.4 s−1. This study also presented details on the ring puckering itinerary for step 1 and the importance of the oxocarbenium character of the transition state, as described in section 3 for GH catalysis.

Knott et al. calculated free energy barriers for both steps 1 and 2 for TrCel7A.441 Path sampling simulations were utilized to elucidate the reaction mechanism prior to computing free energies.473−475 This is in contrast to previous computational investigations in which free energies were computed along assumed and unverified coordinates (e.g., bond lengths). Path sampling revealed that the glycosylation reaction coordinate (Figure 33, “RC1”) contains components of the forming and breaking bonds as well as a rotation in the nucleophile. This finding was corroborated by the crystallographic snapshots presented in the same study of the Michaelis complex (PDB code 4C4C) and glycosyl-enzyme intermediate (PDB code 4C4D). Comparing the conformation of nucleophile Glu212 in the two structures (Figure 32) reveals an approximately 30° twist, consistent with the rotation seen in the simulations. In between the two hydrolytic steps, the cellobiose product cleaved during glycosylation shifts slightly toward the tunnel exit in order to allow the nucleophilic water access to the anomeric carbon reaction center. The free energy barrier for this transition was computed along a distance coordinate describing the proximity of this water to the active site (Figure 33). An additional noteworthy aspect of this work was the finding that deglycosylation proceeds via a product-assisted mechanism: the reaction coordinate (Figure 33, “RC2”) involves forming and breaking bonds as well as the orientation of a hydroxyl from the cellobiose product, which positions the catalytic water for nucleophilic attack on the anomeric carbon. Subsequent free energy and dynamic calculations facilitated TST rate calculations of 10.8 and 5300 s−1, for steps 1 and 2, respectively, thus confirming step 1 as rate-limiting in the hydrolytic cycle. This rate agrees well with the rate of processive cellulose hydrolysis by TrCel7A on crystalline cellulose measured by Igarashi et al. via high-speed AFM of 7.1 ± 3.9 s−1.476 In addition to these studies that explicitly examined bond-breaking and bond-forming in GH7 enzymes, several other computational studies have helped to shed light on other factors influencing enzymatic catalysis. These factors include substrate ring distortion,477 protonation of catalytic residues,478,479 mutations,472,480 and protein interfacial allostery.481,482 In addition, the glycosynthetic ability of the nucleophilic mutant (HiCel7B E197S mutant) was examined via QM/MM metadynamics simulations.480

Figure 33. Hydrolytic free energy barriers for TrCel7A. Free energy barriers for the hydrolytic steps of glycosylation (left) and deglycosylation (right) for TrCel7A acting on a cellulose chain are shown in addition to that for the nucleophilic water movement that is coupled to cellobiose product movement (middle).441 Rate calculations based on these free energy barriers reveal that glycosylation is rate-limiting within the hydrolytic steps. M denotes the Michaelis complex.
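The barrier-to-rate comparisons in this section rest on transition state theory. A minimal sketch of that conversion is given below, assuming the simple Eyring form with a transmission coefficient of 1 and a temperature of 300 K; both are assumptions made here, not values taken from the cited studies, and the function names are illustrative only. Because the cited works may use other temperatures or dynamical corrections, the sketch reproduces orders of magnitude rather than the published rates.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J/K
H = 6.62607015e-34     # Planck constant, J*s
R_KCAL = 1.987204e-3   # gas constant, kcal/(mol*K)


def eyring_rate(barrier_kcal_mol, temperature_k=300.0, kappa=1.0):
    """Rate constant (1/s) from a free energy barrier via the Eyring equation.

    temperature_k and kappa (transmission coefficient) are assumed defaults.
    """
    prefactor = kappa * K_B * temperature_k / H
    return prefactor * math.exp(-barrier_kcal_mol / (R_KCAL * temperature_k))


def eyring_barrier(rate_per_s, temperature_k=300.0, kappa=1.0):
    """Free energy barrier (kcal/mol) implied by a rate constant (1/s)."""
    prefactor = kappa * K_B * temperature_k / H
    return -R_KCAL * temperature_k * math.log(rate_per_s / prefactor)


def barrier_shift_from_fold_change(fold_decrease, temperature_k=300.0):
    """Barrier increase (kcal/mol) equivalent to an n-fold drop in rate."""
    return R_KCAL * temperature_k * math.log(fold_decrease)


# Barriers quoted in the text, converted to rates under the stated assumptions.
for barrier in (14.0, 17.5, 18.9, 30.0):
    print(f"{barrier:5.1f} kcal/mol -> {eyring_rate(barrier):.2e} 1/s")

# Rates quoted in the text, converted back to effective barriers.
for rate in (10.8, 5300.0):
    print(f"{rate:7.1f} 1/s -> {eyring_barrier(rate):.1f} kcal/mol")

# Activity losses quoted for the mutants, expressed as barrier increases.
for fold in (130, 7700):
    shift = barrier_shift_from_fold_change(fold)
    print(f"{fold}-fold slower -> +{shift:.1f} kcal/mol")
```

Under these assumptions, a 17.5 kcal/mol barrier maps to a rate on the order of 1 s−1, whereas a barrier above 30 kcal/mol maps to a rate many orders of magnitude below the measured processive rates, which illustrates why such barriers are difficult to reconcile with experiment.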


Figure 34. Complete processive cycle of a GH7 CBH. TrCel7A is shown with its CD, linker, and CBM in gray “cartoon” representation. N-glycosylation and O-glycosylation are shown in blue and yellow, respectively. The cellulose surface is shown in green and the cellobiose product in magenta. Following the adsorption of the CBM and CD to the substrate and initial chain threading, TrCel7A processively cleaves cellobiose from a cellulose chain end. The “Processive Cycle” includes chain processivity, hydrolysis, and product expulsion (Figure 35). This processive cycle occurs repeatedly until the enzyme desorbs from the cellulose surface.

Figure 35. Hypothesized hydrolytic processive cycle of a GH7 CBH inferred from structural data. The inner processive cycle of a GH7 CBH consists of the following seven steps: after adsorption, decrystallization, and initial chain threading, the substrate fills the −7 to −1 sites in pre-slide mode (upper left); (1) processive motion of CBH by one cellobiose unit fills the product sites, with all glucosyl residues in the stable chair conformation (slide mode, upper middle); (2) catalytic activation rotates the chain ∼90° and distorts the −1 glucosyl residue into half-chair or envelope conformation, forming the Michaelis complex (upper right) and allowing the nucleophile access to the anomeric carbon reaction center; (3) the first chemical step, glycosylation, cleaves cellobiose from the reducing end of the cellulose chain and forms a covalent bond between the nucleophile and the broken chain (unprimed glycosyl-enzyme intermediate, lower right); (4) a shift in the product produces the primed glycosyl-enzyme intermediate (lower middle) such that the deglycosylation nucleophilic water has access to the anomeric carbon reaction center; (5) the second chemical step, deglycosylation, produces the product complex, wherein the glycosyl-enzyme intermediate is broken and the catalytic residues are regenerated (lower left); (6) product expulsion vacates the +1/+2 sites (upper left). The processive cycle ends when the enzyme dissociates from the cellulose chain. It should also be noted that it is possible for a CBH to perform endo-type initiation,304,307 in which case the enzyme would initiate this cycle in slide mode (or similar, possibly with a longer chain on the product side of the active site) and the remainder of the processive cycle would proceed as depicted. In all panels, the nucleophile (Glu212 in TrCel7A) is on top, and the catalytic acid/base (Glu217 in TrCel7A) is on bottom in green “sticks”. 
For clarity, only selected hydrogen atoms are shown: the hydrogen bonded to the −1 anomeric carbon shows the stereochemistry and that of Glu217 shows its protonation state throughout the processive cycle.
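The sequence of states in Figure 35 can also be written down as a small state machine. The sketch below only illustrates the ordering described in the caption; the state and transition names are labels chosen here for illustration, not an established nomenclature, and no rates are attached to the transitions.

```python
from enum import Enum


class State(Enum):
    PRE_SLIDE = "substrate fills -7 to -1 sites"
    SLIDE = "product sites filled, all rings in chair conformation"
    MICHAELIS_COMPLEX = "-1 ring distorted, chain rotated"
    GEI_UNPRIMED = "covalent glycosyl-enzyme intermediate, product unshifted"
    GEI_PRIMED = "covalent glycosyl-enzyme intermediate, product shifted"
    PRODUCT_COMPLEX = "intermediate hydrolyzed, catalytic residues regenerated"


# (source state, transition label, destination state), following steps 1-6.
CYCLE = [
    (State.PRE_SLIDE, "processive motion", State.SLIDE),
    (State.SLIDE, "catalytic activation", State.MICHAELIS_COMPLEX),
    (State.MICHAELIS_COMPLEX, "glycosylation", State.GEI_UNPRIMED),
    (State.GEI_UNPRIMED, "product shift", State.GEI_PRIMED),
    (State.GEI_PRIMED, "deglycosylation", State.PRODUCT_COMPLEX),
    (State.PRODUCT_COMPLEX, "product expulsion", State.PRE_SLIDE),
]


def run_cycle(start=State.PRE_SLIDE, n_steps=6):
    """Walk n_steps transitions from start; return (final state, labels)."""
    transitions = {src: (label, dst) for src, label, dst in CYCLE}
    state, path = start, []
    for _ in range(n_steps):
        label, state = transitions[state]
        path.append(label)
    return state, path


final_state, labels = run_cycle()
print(final_state.name, labels)
```

Starting the walk from `State.SLIDE` instead of `State.PRE_SLIDE` corresponds to the endo-initiated case mentioned in the caption, in which the enzyme enters the cycle in slide mode.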


6.2. Processivity, Kinetic Modeling, and Visualization

The wealth of structural and biochemical data on GH7 enzymes has revealed a common overall fold as well as a common catalytic machinery. Structures have also provided clues to differences in functionality (e.g., processivity) between CBHs and EGs as well as within these classes. This section discusses the discrete steps in the processive cycle of GH7 enzymes, with a particular focus on CBHs. Visualization studies of GH7 enzymes on lignocellulosic biomass are then reviewed, as they predominantly provide qualitative information about cellulase action. Kinetic modeling and novel high-speed AFM measurements make this knowledge quantitative and have the capability to provide rates for the individual steps of the processive cycle. We then discuss structural and molecular modeling studies, which allow for identification of the molecular underpinnings of cellulase action including the individual steps of the processive cycle. Finally, the reversible blockage of this processive cycle via product inhibition is discussed and reviewed. GH7 CBHs act from the reducing end of cellulose chains445 and perform many hydrolytic events before dissociating from a cellulose chain. This overall processive cycle includes at least the following steps (Figure 34): adsorption to the crystalline cellulose surface (possibly preceded by the adsorption of an attached CBM), cellulose chain decrystallization, chain threading through the binding tunnel or direct binding (i.e., endoinitiation304), hydrolysis, product expulsion, and desorption. The “repeating processive” cycle, which generally repeats many times before chain dissociation, includes processive motion, catalytic activation, hydrolysis, and product expulsion (Figure 35). The rate-limiting step in the processive cycle of a cellulase is an oft-discussed topic in the literature, due to its importance as the primary target for protein engineering efforts. For an EG, the processive cycle shown in Figure 35 is modified in that the chain threading and product expulsion are omitted. Chain acquisition without threading is thought to be readily accomplished by EGs (as opposed to CBHs) due to their open binding site clefts that lack the enclosing loops found in CBHs. GH7 CBHs are also capable of performing hydrolysis in an endo-fashion.304,307 It has been speculated that, with the ability for CBHs to perform endo-initiation, their enclosing loops may open, allowing entry of the cellulose chain into the active site without chain threading.304,307 Whether or not a CBH performing endo-initiation subsequently dissociates immediately or embarks on a processive run is still an open question. Quantifying the processive ability of an enzyme is useful both in generally describing the mechanistic behavior and in identifying opportunities for activity enhancements. Unfortunately, measuring processivity is not straightforward, and measurements performed via different techniques are not readily comparable.521 Formally, processive ability is defined as the number of hydrolytic events performed per number of initiated processive runs. This definition of processive ability is frequently referred to in the literature as “apparent processivity”. Kurašin and Väljamäe describe a complementary measurement, “intrinsic processivity”, that describes the theoretical processive potential of a GH, in the limit of ideal polymeric substrate turnover.307 A handful of techniques have been developed to assess apparent and intrinsic processivity, some of which are described below. These methods generally capitalize on the relatively consistent nature of the processive GH product profile.307,522−531 However, characterization of processive ability by measuring produced soluble products requires some assumptions be made regarding initial binding mode (i.e., endo vs exo, where some processive enzymes are known to exhibit both initiation modes) and can frequently lead to misinterpretation or overestimation of processivity.461,532 Furthermore, these methods are also extremely sensitive to the substrates selected, where the number of available free chain ends can drastically affect measured product profiles.533 A recent review provides an excellent assessment of the pros and cons of each available technique.521 Finally, we caution the reader to approach experimental determinations of processivity with an eye toward the difficulty in quantification and exercise good judgment with respect to interpretations of processive ability and claims of endo- or exo-initiated binding. The direct in situ visualization of cellulase action on cellulosic substrates has provided valuable clues to their mechanistic action and has recently been reviewed.534,535 These visualization methods offer the ability to resolve single molecules on the cellulose surface, allow for observing single cellulases interacting with the cellulose surface, and provide temporal tracking of the effects of cellulases on the cellulose surface (e.g., the formation of fissures). TEM was the first method applied to visualizing the structural dynamics of enzymatic cellulose degradation directly on the cellulose surface both for the complete T. reesei system536 and for TrCel7A in isolation.537 White and Brown visually presented the EG/CBH synergistic action of T. reesei enzymes.536 They found that CBH or EG in isolation could not produce cellulose microfibril degradation, though EG could produce some splaying of chains (notably, results were only reported for 1 h of incubation). When acting in concert, however, they could completely dissolve microfibrils within 30 min of incubation. 
Particularly notable among these early investigations were those of Chanzy and co-workers on GH7 CBHs.537,538 These studies provided direct visual information regarding processive CBH action and have had a long-lasting impact on the field. Early work showed the binding of TrCel7A to crystalline cellulose and its subsequent degradation. Significantly, complete degradation of crystalline cellulose was shown by TrCel7A in isolation (within 48 h), without the need for any EG present, producing cellobiose as the major product (determined by HPLC).537 A follow-up study visualized gold nanoparticle-labeled TrCel7A molecules via TEM. TrCel7A retains 60% of the hydrolytic activity even with the bulky gold label (5 nm in diameter, the same order of magnitude as TrCel7A). It was found that TrCel7A binds preferentially to the hydrophobic (100) face of cellulose microfibrils.538 Several other TEM studies focused on GH7 enzymes,536−544 including those utilizing gold labeling538,539 and labeling with gold-coupled monoclonal antibodies (immuno-EM).541,544 Immuno-EM was utilized to determine that TrCel7A and TrCel7B preferentially bind to the crystalline and amorphous regions of substrate, respectively.544 One limitation of TEM is that it cannot visualize hydrated cellulases; thus, it cannot be used in the native cellulose/cellulase environment. Conventional atomic force microscopy (AFM), however, can be applied in liquid environments at atmospheric temperature and pressure, without sample modification (labeling, coating, etc.). Thus, AFM enabled the first visualization of cellulase action in a biologically relevant environment.545 Subsequent AFM studies revealed further details of enzymatic action.367,545−550 For example, AFM visualization of TrCel7A action on crystalline cellulose suggested, under environmental conditions, that degradation was


primarily on a single face.367 When combined with the previous finding from TEM that the family 1 CBM of TrCel7A preferentially binds to the hydrophobic face of crystalline cellulose (as described in section 5),362 this may indicate TrCel7A localization primarily to the hydrophobic face of crystalline cellulose. This confirmed in an aqueous environment and with the native enzyme what Chanzy et al. had previously found via TEM of gold-labeled TrCel7A.538 AFM also showed the formation of path-like indentations on the substrate when incubated with TrCel7A, considerably different from the effect of EG,547 and that the processive motion of TrCel7A is impeded by amorphous regions of the substrate.546 In addition, AFM has been utilized to study synergistic effects between TrCel7A with TrCel6A and EGs546,551,552 and in HiCel7A with Cel6A540 as well as with EG present.553 Real-time AFM has also examined EG action by itself (TrCel7B).549 Fluorescently labeling TrCel7A coupled with confocal microscopy554 or total internal reflection fluorescence microscopy555 allows for tracking the CBH’s motion along cellulose fibrils. The latter of these studies found that upward of 90% of TrCel7A molecules were stationary on the cellulose surface during the observation interval.555 These studies highlight the utility of visualization studies in enhancing the understanding of cellulase−cellulose dynamical interactions as well as cellulase−cellulase interactions in some cases. The general insights provided by cellulase visualization have been solidified, refined, and quantified by kinetic modeling studies and novel HS-AFM measurements. These studies provide the capability of determining rate constants for the individual steps in the processive cycle (Figures 34 and 35) and how these constants depend on the nature and concentration of both substrate and enzyme. 
These models are often directly linked to hydrolysis experiments providing feedback between kinetic experiments and theories that seek to rationalize the results. The goals of these kinetic models are often focused on identifying the rate-limiting step(s) in the processive cycle of a CBH or an enzyme cocktail. Cellulase synergy and the origin of the “burst phase” often seen in the first few minutes of hydrolysis experiments are two other primary areas of focus of these studies. With some notable exceptions, we largely focus our attention here on those studies that have appeared from 2009 to the present, as both Zhang and Lynd33 and subsequently Bansal et al.556 have nicely reviewed the literature on this topic for publications appearing before 2009. Ståhlberg et al.348 examined the adsorption and hydrolysis of intact TrCel7A, isolated CBM, and isolated catalytic core on MCC. They found that the CBM increased both the adsorption and hydrolytic activity. To explain these results, they proposed a model for the action of two-domain cellulases: the core is biased toward amorphous regions or chain ends whereas the CBM is biased toward the crystalline regions. Attachment of the CBM to the core ensures that the core will have an elevated concentration on the crystalline regions, thus increasing the probability of cleavage. The importance of considering at least two different types of surface morphologies was thus emphasized. Beginning around 2009, multiple studies were presented that revolutionized our quantitative understanding of CBH action.307,315,476,531,557−562 Taken together, they constitute a significant step forward in our collective understanding of CBH function, beginning with the seminal study from Igarashi et al. in which high-speed AFM was utilized to spatially track individual TrCel7A molecules on crystalline cellulose with a temporal

resolution of 1−4 frames/s.558 Isolated CDs (sans CBM) moved with comparable velocity to intact enzymes with CBM at around 3.5 nm/s. However, E212Q (catalytic nucleophile) and W40A (binding tunnel entrance) mutants do not move at all. Thus, it was concluded that hydrolysis and chain loading are both critical for movement. The E212Q mutant was immobilized on the surface longer than the W40A mutant, suggesting W40 is critical for initial chain threading. On the basis of the observations regarding the immobile E212Q mutant and the comparable velocities of intact enzyme and isolated CD, they concluded that movement is inherently coupled to catalytic activity. While possible that the cleavage reaction actually induces the forward processive motion, these observations at least imply that the catalytic event is a prerequisite for motion to continue. Additionally, the role of the CBM was determined to be solely to enhance the concentration of enzyme molecules on the substrate, without a further substrate-modifying role. In 2010, Jalak and Väljamäe sought to identify the cause of the rapid rate reduction characteristic of enzymatic hydrolysis531 by quantifying the concentration of CBHs with a cellulose chain productively bound to cellulose (meaning that the cellulose chain was bound in the active site). Experimentally, this was accomplished by measuring the degree of inhibition of TrCel7A and PcCel7D for a small MW reporter molecule, in this case pNPL. The rate of pNP production (product of pNPL hydrolysis) is readily related to the fraction of CBH with free active site (and thus the productively bound fraction). The observed catalytic constant can be calculated as the ratio of cellobiose formation rate (cellulose hydrolysis product) to the concentration of CBHs with occupied active site. 
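The bookkeeping behind this reporter-based measurement can be sketched as follows. The sketch assumes that the pNPL hydrolysis rate scales linearly with the fraction of CBH active sites that are free; the function names, units, and numerical inputs are hypothetical and are not data from ref 531.

```python
def bound_fraction(v_reporter_with_cellulose, v_reporter_free):
    """Fraction of CBH with an occupied active site.

    Assumes the reporter (pNPL) rate is proportional to the fraction of free
    active sites, so the relative loss of reporter activity in the presence
    of cellulose equals the productively bound fraction.
    """
    if v_reporter_free <= 0:
        raise ValueError("reference reporter rate must be positive")
    fraction = 1.0 - v_reporter_with_cellulose / v_reporter_free
    return min(max(fraction, 0.0), 1.0)  # clamp against measurement noise


def observed_catalytic_constant(v_cellobiose, total_enzyme, fraction_bound):
    """k_obs = cellobiose formation rate / concentration of bound CBH."""
    bound_enzyme = total_enzyme * fraction_bound
    if bound_enzyme <= 0:
        raise ValueError("no productively bound enzyme")
    return v_cellobiose / bound_enzyme


# Hypothetical inputs: reporter rates in arbitrary units, enzyme in uM,
# cellobiose formation rate in uM/s.
fraction = bound_fraction(v_reporter_with_cellulose=0.5, v_reporter_free=2.0)
k_obs = observed_catalytic_constant(
    v_cellobiose=0.15, total_enzyme=1.0, fraction_bound=fraction
)
print(f"bound fraction = {fraction:.2f}, k_obs = {k_obs:.2f} 1/s")
```

Normalizing the cellobiose formation rate by the bound enzyme rather than by the total enzyme is what separates a drop in turnover of engaged enzymes from a drop in the number of engaged enzymes.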
Previously proposed explanations for the rapid rate retardation, such as cellobiose product inhibition, depletion of more readily hydrolyzable substrate, depletion of available chain ends, inactivation through irreversible surface binding, and overcrowding of bound CBHs, could not account for their data. However, if “steric obstacles” that limit the processive action of CBHs were considered, the results could be rationalized. In this way, the enzyme dissociation rate from the surface (koff) is implicated as the slow step of the overall CBH processive cycle with only CBHs present. Following the initial “burst” phase, the rate of hydrolysis is governed by koff and the average obstacle-free path of a CBH on a cellulose chain. This explanation also accounts for the differences in hydrolysis rates wherein CBHs with more open binding tunnels (e.g., PcCel7D vs TrCel7A) more readily dissociate from cellulose chains, resulting in lower processivity, but higher cellobiose production. Thus, there are “costs and benefits” to high processivity.563,564 The concept of steric obstacles that limit CBH catalytic production has been quite influential in subsequent studies performed by a number of groups. The concept of a CBH processing along a cellulose chain according to its “true” catalytic constant before becoming “stuck” at an obstacle necessitates (at least) two major CBH populations. In 2011, Igarashi et al.476 presented high-speed AFM data of TrCel7A as well as TrCel6A that provided a powerful visual verification of this “dual population” model. In this study, TrCel7A was observed to alternate between stopping and sliding. The velocity distribution of CBHs could be very well accounted for by assuming two populations, one with near zero velocity and the other with nonzero velocity (7.1 ± 3.9 nm/s). CBHs were shown to stop, and then accumulate behind a surface obstacle (described as a “traffic jam”). 
A new dimension was added to this dynamic picture when a group of CBHs was

DOI: 10.1021/cr500351c Chem. Rev. 2015, 115, 1308−1448

Chemical Reviews

Review

Figure 36. Cellulase architecture correlates with function. Shortened or absent substrate-enclosing loops result in a more open architecture which gives rise to functional differences. The rate constants and processivity estimates are from Kurašin and Väljamäe.307 (A) CBH TrCel7A has the most closed substrate-binding tunnel (PDB code 1CEL172), which gives rise to the highest Pintr and lowest koff of the enzymes considered in the study. The ligand shown is from the TrCel7A Michaelis complex (PDB code 4C4C441). (B) CBH PcCel7D (PDB code 1GPI460) has a more open substrate-binding tunnel than TrCel7A due mostly to the shortening of the A1 and B3 loops (Figure 29). The ligand shown is from the TrCel7A Michaelis complex. (C) EG TrCel5A (PDB code 3QR3566), shown in complex with the ligand from Bacillus agaradhaerens Cel5A (PDB code 4A3H206). (D) EG TrCel12A (PDB code 1H8V567) shown with the ligand from Humicola grisea Cel12A (PDB code 1UU6208). In all panels, the enzymes are shown to scale and oriented with the acid/base on top and the nucleophile on the bottom (all four utilize a retaining mechanism). The substrate for each of the listed measurements307 is amorphous cellulose, with the exception of Pendo(BC), which was performed on BMCC.

TrCel7A had previously been reported,304 but there was some uncertainty about these findings.565 TrCel7A was shown to perform endo-initiation with probability Pendo of 0.41−0.55 on BC (PcCel7D, Pendo = 0.73−0.82) and 0.83−0.93 on amorphous cellulose (PcCel7D, Pendo = 0.92). Also, the Papp of TrCel7A was similar to that of PcCel7D on amorphous cellulose and on BC; however, Papp for each CBH was nearly 3 times higher on BC, leading to the conclusion that this parameter is determined by the substrate properties. The dissociation rate koff was also found to be dependent on the nature of the substrate. Moreover, the koff values for the EGs were 2 orders of magnitude above those of the CBHs. This is in contrast to kcat, which was of the same order of magnitude for the CBHs as for two EGs (TrCel5A and TrCel12A) and for the different types of cellulose. Pintr was also determined by separately estimating kcat and koff. These rate constants were determined by measuring product formation rate in the regimes where the processes described by these constants are rate-limiting (the initial “burst” phase for kcat and the subsequent linear regime for koff). Pintr was 1−2 orders of magnitude greater than Papp, confirming that CBHs do not reach their full processive potential on real substrates, and thus processivity is substrate limited. The authors thus conclude that the dissociation of a stalled, nonproductively bound CBH is the rate-limiting step in the overall processive cycle and the primary target for the selection of cellulases, though it must be noted that this conclusion is based on measurements of individual enzymes in isolation (e.g., a CBH in the absence of any synergism with an EG, other CBH, or LPMO). The trends in various measured parameters (koff, Pendo, Pintr) are indicative of a deep connection between enzyme structure and intrinsic kinetic properties (Figure 36). 
The key structural difference between EGs and CBHs is the shortening or deletion of several loops in EGs that cover the substrate binding tunnel (Figures 25, 27, and 28). Also, PcCel7D has a shortening of the important B3 loop (aka the “exo” loop) as compared to TrCel7A (Figure 29). These loop differences are the likely structural basis for the increased Pendo of PcCel7D versus TrCel7A (and even higher Pendo for

seen to stop, accumulate, and then resume motion without dissociating. This may correspond to the collective action of several CBHs removing an obstacle in their path. TrCel6A by itself was observed to bind, but it did not slide nor did it appreciably degrade cellulose. The combination of TrCel6A with TrCel7A degraded cellulose much more efficiently than the sum of the action of the two, giving a powerful visual representation of CBH synergy, and it was suggested that this may be due to TrCel6A making endo-like cuts in the crystalline cellulose surface. Kurašin and Väljamäe built upon this quantification of the overall processive cycle by isolating its individual steps and connecting cellulase functional properties to structure.307 The resulting seminal study provided an unprecedented holistic understanding of CBH action in isolation. Recognizing both the importance of cellulase processivity and also the difficulty in its measurement, Kurašin and Väljamäe sought to develop a robust method for its quantification. They note that the intrinsic processivity Pintr (determined by the ratio of the hydrolytic rate kcat to the dissociation rate koff) can only be realized on a “perfect” polymer. On real substrates, the inherent heterogeneity introduces steric obstacles that halt forward processive motion. Thus, they sought to measure the ratio of the number of actual catalytic events before dissociation on a real polymer (i.e., the apparent processivity Papp). The primary experimental challenge for the quantification of apparent processivity is determination of the number of processive run initiations. This was elegantly addressed by selectively “tagging” only the reducing ends of cellulose with diaminopyridine (DAP) and detecting the release of DAP-labeled sugars (after cleavage by reducing end specific CBHs TrCel7A and PcCel7D) with sensitive fluorescence detection. 
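The distinction between intrinsic and apparent processivity drawn above can be made concrete with a small numerical sketch. It assumes that released cellobiose serves as a proxy for the number of catalytic events and that labeled-sugar release counts the run initiations; all numbers and names below are hypothetical and are not values from ref 307.

```python
def apparent_processivity(catalytic_events, initiations):
    """P_app: catalytic events per processive run initiation."""
    if initiations <= 0:
        raise ValueError("need at least one initiation")
    return catalytic_events / initiations


def intrinsic_processivity(k_cat, k_off):
    """P_intr: ratio of the hydrolytic rate constant to the dissociation
    rate constant, realized only on an obstacle-free polymer."""
    if k_off <= 0:
        raise ValueError("k_off must be positive")
    return k_cat / k_off


# Hypothetical measurements (arbitrary but consistent molar units).
cellobiose_released = 1200.0   # proxy for the number of catalytic events
labeled_sugar_released = 30.0  # proxy for the number of run initiations

p_app = apparent_processivity(cellobiose_released, labeled_sugar_released)
p_intr = intrinsic_processivity(k_cat=2.0, k_off=0.001)  # hypothetical, 1/s
print(f"P_app = {p_app:.0f}, P_intr = {p_intr:.0f}, "
      f"fraction of potential realized = {p_app / p_intr:.3f}")
```

In this illustration the enzyme realizes only a small fraction of its intrinsic processive potential, which is the qualitative signature of substrate-limited processivity.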
This analysis revealed that Papp for the CBHs increased with increased enzyme loading, indicating that there was another mode of initiation besides exo-action, namely endo-type initiation. This led them to perform similar experiments with reduced BC to measure the number of endo-initiation events. Endo-initiation capability by


Figure 37. CBHs acting in isolation and the two modes of endo/exo synergism. (A) The rate of cellobiose production by CBHs acting in isolation (i.e., no EGs present) is limited by dissociation from cellulose when hindered by obstacles (most notably the amorphous regions of cellulose).307 (B) The traditional picture of endo/exo synergism (right side of panel B) is that EGs create random cuts in the cellulose surface that provide starting points for CBH processive action. Jalak et al.557 showed that another important role for EGs was to help CBHs “escape” blockage (left side of panel B) by amorphous regions of cellulose, as shown in panel A. Reprinted with permission from ref 557. Copyright 2012 the American Society for Biochemistry and Molecular Biology, Inc.

had indicated a role for EGs in “surface cleaning”,568,569 the prevailing model of endo/exo synergism posited that EGs make internal cellulose chain cuts that produce starting points for CBH action.570 However, the recent revelation that the rate-limiting step for processive CBH action in isolation was dissociation (Figure 37A) and not association307 suggested that the synergistic power of EGs might somehow be related to CBH dissociation. Jalak et al.557 demonstrated that the hydrolytic rate constant of TrCel7A on BC increased with the addition of EG TrCel5A under steady-state conditions, a result that the traditional paradigm cannot account for. This led to the new hypothesis: the role of EGs is not only to help CBHs attach to the cellulose surface, but also to detach from the cellulose surface (Figure 37B), the step which was now gaining traction as rate-limiting in the CBH processive cycle (though EGs themselves were prone to reversible deactivation by surface heterogeneity315). Both modes of synergy were found to act in concert, but the traditional paradigm accounts for a significant portion of the synergistic effect only at high enzyme/substrate ratios.557 At optimal enzyme/substrate ratios, the new synergistic mechanism dominates. Under these conditions (optimal enzyme/substrate loadings and with EG present), the result is that the rate-limiting step for the conversion of cellulose to glucose is the CBH processive cycle.557 In other words, the combined steps of processive motion, hydrolysis, and product expulsion (Figure 35) are rate-limiting in the overall processive cycle which includes association and dissociation (Figure 34). 
This proposal was corroborated by the fact that, at these optimal enzyme/substrate loadings, kobs (calculated from the rate of cellobiose production normalized by the number of TrCel7A molecules with occupied active site) approached kcat (the rate constant for the processive cycle involving hydrolysis, product expulsion, and processive motion, Table 6). These findings were also demonstrated to be true on lignocellulose.557 In addition, it was noted that the apparent processivity calculated for TrCel7A

EGs). In addition, PcCel7D more easily dissociates from a cellulose chain than TrCel7A (reflected in a higher koff, and thus a lower Pintr). The fact that the koff for the EGs studied is 2 orders of magnitude higher than for the CBHs also seems to implicate loop structures as the source of the differences in activity of these cellulases (Figure 36). Similarly, Praestgaard et al.562 developed an explicit kinetic model to describe the “burst” kinetics seen in cellobiose production by CBHs, such as Cel7A wherein the steady-state is preceded by a transient burst in activity. They compare their results to calorimetric measurements performed with TrCel7A. The key components of the theory are the relative reaction rate constants for adsorption, processive hydrolysis, and desorption, as well as blockage by obstacles on the cellulose surface. They find that the burst in activity persists until the CBHs start encountering obstacles in significant proportion. Their model is capable of capturing this burst phase; however, to capture the double exponential decay in cellobiose production rate seen experimentally, it was necessary to add random enzyme inactivation (with no correlation to the stage of the processive cycle) to their model. This model was further developed and applied to further mine the experimental wealth of mechanistic information on fast (i.e., non-rate-limiting) steps of the processive cycle afforded by the pre-steady-state regime.560 This constituted the first such pre-steady-state investigation of a cellulase on its natural substrate, insoluble cellulose. For TrCel7A, it was found that the rate of cellobiose production reaches a maximum after just 5−8 s, and then declines rapidly. They calculate 4 glycosidic bonds cleaved per second (kcat), concluding that dissociation (koff) is rate-limiting (0.022/s). 
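As a rough illustration of what these two rate constants imply, they can be combined using the definition of intrinsic processivity as the ratio of kcat to koff. This is a sketch under stated assumptions: pairing the pre-steady-state values quoted above with that definition is done here for illustration only, and treating dissociation as a single first-order step (so that the mean bound lifetime is 1/koff) is a simplification.

```python
def intrinsic_processivity(k_cat, k_off):
    """Expected number of catalytic events per binding event on an
    obstacle-free chain, P_intr = k_cat / k_off."""
    if k_off <= 0:
        raise ValueError("k_off must be positive")
    return k_cat / k_off


K_CAT = 4.0    # 1/s, glycosidic bonds cleaved per second (value quoted in the text)
K_OFF = 0.022  # 1/s, dissociation rate constant (value quoted in the text)

p_intr = intrinsic_processivity(K_CAT, K_OFF)
mean_bound_time = 1.0 / K_OFF  # s, assumes single-step first-order dissociation

print(f"P_intr ~ {p_intr:.0f} cuts per run, mean bound time ~ {mean_bound_time:.0f} s")
```

The roughly two-orders-of-magnitude gap between the two constants is what makes dissociation, rather than the hydrolytic cycle, the slow step for a CBH acting alone.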
Complexation (kon) is 3× faster than dissociation (koff), and kcat (which includes hydrolysis, processive motion, and product expulsion) is 2 orders of magnitude faster than either kon or koff. The stage was now set for a paradigm shift in the understanding of cellulase synergism. Although some studies


Figure 38. Free energy landscape of the hydrolytic and processive steps for TrCel7A. The barriers for glycosylation,441 product movement,441 deglycosylation,441 processive motion,459 and catalytic activation459 were computed via advanced molecular simulation techniques. These barriers represent all of the key steps in the CBH processive cycle (in between adsorption and desorption) with the exception of product expulsion, which has previously been experimentally ruled out as a rate-limiting factor in these enzymes.557 Because product expulsion is not explicitly considered here, the overall free energy change for the processive cycle will not be as dramatic as shown. Taken with previous results, these findings led to the conclusion that the glycosylation reaction is the rate-limiting step in the enzymatic deconstruction of cellulose by TrCel7A.459

is similar to the DP of the BC substrate (DP ∼ the length of obstacle-free path = nfree); thus, the “obstacles” were identified with the amorphous regions of the BC (Papp ∼ DP ∼ nfree), suggesting that CBH is not able to digest in the amorphous regions.557 When a CBH encounters an amorphous region of substrate, it stalls and must dissociate before it can continue hydrolyzing the substrate. Endolytic cuts made in the substrate provide “escape routes” for the CBH before it reaches the amorphous regions and facilitate avoidance of the amorphous regions. An extremely relevant extension of this work would be to perform similar experiments with other cellulolytic enzymes including nonreducing end specific CBHs, EGs from other GH families, and/or LPMOs to understand their molecular-level synergistic effects. The rate-limiting step of the processive cycle of a CBH had now been narrowed to those steps that comprise the “inner processive cycle” (Figure 35), namely, processive motion, catalytic activation, glycosylation, product movement, deglycosylation, and product expulsion.557 These are also the steps that take place in the observed CBH processive motion in high-speed AFM studies.476,558,564 Further narrowing which of these steps limits the overall production of cellobiose would be extremely difficult to accomplish experimentally, but molecular simulation has provided insight in this regard. Product expulsion has been studied via molecular simulation571,572 and has been ruled out as the rate-limiting step in CBH action.557 The free energy barriers for glycosylation, product movement, and deglycosylation were calculated as 15.5, 2.0, and 11.6 kcal/mol, respectively;441 the reaction rate for glycosylation was found to be much lower than that for deglycosylation making it the rate-limiting hydrolytic step (discussed in more detail in section 6.5). Free energy profiles for the remaining steps of the TrCel7A processive cycle

were elucidated in 2014.459 The barrier for processive motion (advancing along the cellulose chain by one cellobiose unit) was calculated via umbrella sampling and found to be higher than that for catalytic activation (the formation of the Michaelis complex) at 4.2 and 2.9 kcal/mol, respectively. Both of these barriers are significantly lower than the 15.5 kcal/mol barrier for glycosylation (Figure 38).441 The barrier to processive motion may seem surprisingly small, given the many enzyme−substrate interactions that must be broken for the cellulose chain to advance. However, many new and strong interactions are formed upon chain advancement, and given the flexibility of the residues lining the binding tunnel, these can start to be formed before prior interactions are fully broken. Knott et al. pointed in particular to the strong binding of polar residues at the product sites that are empty following product expulsion (and immediately prior to processive motion).459 Taken with previous computational and experimental results, this computational evidence suggests that the rate-limiting step in the overall conversion of cellulose is the glycosylation reaction. Accelerating this step is thus predicted to constitute a primary target for the engineering of improved GH7 CBHs. In addition, a significant stabilization of −8.1 kcal/mol results from the processive motion. This “driving force” for processive motion was explained in terms of the particularly strong binding of the leading glucosyl residue of the cellulose chain, which advances to the +2 binding site. The primary residues responsible for this coordination (Asp259 and Arg394) are conserved in GH7 CBHs, but not in GH7 EGs (Figure 28). Further analysis of these simulations also suggested that the aromatic residues that line the binding tunnel are primarily for providing the tunnel shape, guiding the cellulose chain to the active site in the correct orientation and conformation.
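To give a feel for why the glycosylation barrier dominates, the computed barriers quoted above can be converted to order-of-magnitude rate constants. The sketch below uses the Eyring equation from transition state theory with a transmission coefficient of 1 and an assumed temperature of 300 K; both choices are assumptions made here for illustration, and this is not the procedure by which the cited studies obtained their rates.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
H = 6.62607015e-34      # Planck constant, J*s
R_KCAL = 1.987204e-3    # gas constant, kcal/(mol*K)


def eyring_rate(barrier_kcal_per_mol, temperature_k=300.0):
    """Eyring estimate k = (k_B*T/h) * exp(-dG/(R*T)), transmission coefficient 1."""
    prefactor = K_B * temperature_k / H
    return prefactor * math.exp(-barrier_kcal_per_mol / (R_KCAL * temperature_k))


# Free energy barriers (kcal/mol) as quoted in the text.
barriers = {
    "glycosylation": 15.5,
    "product movement": 2.0,
    "deglycosylation": 11.6,
    "processive motion": 4.2,
    "catalytic activation": 2.9,
}

rates = {step: eyring_rate(dg) for step, dg in barriers.items()}
slowest = min(rates, key=rates.get)

for step, k in sorted(rates.items(), key=lambda item: item[1]):
    print(f"{step:22s} k ~ {k:.1e} 1/s")
print("slowest step:", slowest)
```

Because the rate depends exponentially on the barrier, the few kcal/mol separating glycosylation from the other steps translate into orders of magnitude in rate.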


Figure 39. Product binding modes in GH7 enzymes. The various binding poses of the glucosyl residues filling the product sites represented by TrCel7A crystal structures. These include slide mode (exhibited by PDB code 5CEL),446 cut mode also known as the Michaelis complex (PDB code 4C4C),441 unprimed glycosyl-enzyme intermediate (PDB code 4C4D),441 and primed glycosyl-enzyme intermediate mode (PDB code 3CEL).446 For reference, the catalytic residues of the unprimed glycosyl-enzyme intermediate (PDB code 4C4D) are shown.

steps in the hydrolytic processive cycle shown in Figure 39.573 Ubhayasekera et al. studied binding in the product subsites for PcCel7D and TrCel7A, which showed two distinct binding modes for glucosyl residues in the product sites, referred to as “cut” and “slide” modes.458 Slide mode was named on the basis of the proposal that this represents how the product sites are filled when the cellulose chain first slides across the active site. Structurally, the TrCel7A structure with two cellotetraose molecules bound (PDB code 5CEL173) best represents this stage of the processive cycle. The formation of the Michaelis complex would then involve a transition (“catalytic activation”) to cut mode, so-called because it was suggested to represent the positioning of a glucopyranose immediately prior to enzymatic cleavage. The recently solved structure of the TrCel7A Michaelis complex (PDB code 4C4C441) represents this stage. More recently, these product modes have been refined and expanded to include the “unprimed glycosyl-enzyme intermediate” and “primed glycosyl-enzyme intermediate” immediately after glycosylation and before deglycosylation, respectively.573 The unprimed glycosyl-enzyme intermediate mode essentially overlays cut mode in the product sites and represents the product position immediately following the first chemical step (glycosylation). The TrCel7A glycosyl-enzyme intermediate structure (PDB code 4C4D441) features a covalently bound cellohexaose molecule in the −6 to −1 subsites and a cellobiose product in the +1/+2 sites in unprimed glycosyl-enzyme intermediate mode. While the product remains in unprimed glycosyl-enzyme intermediate mode, there is insufficient space between cellobiose and the anomeric carbon reaction center for the nucleophilic water molecule to approach. 
Thus, before the second step of the catalytic cycle (deglycosylation) can proceed, the product shifts slightly toward the tunnel exit, into the “primed glycosyl-enzyme intermediate” mode (Figure 35); this is the position the cellobiose product occupies during the deglycosylation (exhibited by product complex, PDB code 3CEL446). The “priming” refers to being primed for the second catalytic step.

An important theoretical advance in the treatment of enzymatic adsorption was the utilization of the quasi-steady-state assumption to give physical significance to the parameters of the oft-used (and inherently nonprocessive) Michaelis−Menten (MM) framework.559 The MM framework is often applied to processive enzymes (e.g., cellulases), because it is able to fit the experimental data reasonably well. However, the justification for applying the MM framework to processive systems had never been firmly established; thus, the fitting parameters obtained lacked physical meaning. Cruys-Bagger et al. applied the quasi-steady-state assumption to a set of processive enzyme reactions, namely, association (kon), processive hydrolysis (kcat, which includes processivity, hydrolysis, and product expulsion), and dissociation (koff). Thus, they connected MM parameters with rate constants for the discrete processive steps of a CBH. Relatively straightforward data obtained from standard assay techniques could now be converted to rate constants for the elementary steps via simple mathematical expressions. This approach was subsequently applied to the hydrolysis of three different types of cellulose (Regenerated AC, Avicel, and BMCC) by two variants of TrCel7A (intact with linker/CBM and cleaved catalytic core), finding that the on-rate varies with substrate load but not with enzyme load.561 Dissociation was found to be rate-limiting except at very low substrate loads, where association became slower. The CBM was shown to increase the rate on crystalline but not amorphous cellulose; in fact, cleaving the CBM actually increased the rate of cellobiose production on RAC. This may indicate a role for the CBM in substrate disruption (see section 5) or simply indicate a low affinity of the CBM for more amorphous regions of substrate. 
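The idea of mapping MM-type parameters onto elementary rate constants through the quasi-steady-state assumption can be illustrated with a deliberately simplified scheme. The sketch below is not the model of ref 559: it ignores obstacles and finite run lengths and assumes substrate in excess, with E + S <-> ES governed by kon and koff and a processive step ES -> ES + P governed by kcat. Setting d[ES]/dt = 0 then gives a hyperbolic rate law. The value of kon, its units, and the enzyme concentration are hypothetical.

```python
def mm_parameters(k_on, k_cat, k_off, enzyme_total):
    """Map elementary rate constants to MM-like parameters for the simplified
    scheme E + S <-> ES (k_on, k_off), ES -> ES + P (k_cat)."""
    v_max = k_cat * enzyme_total  # rate when all enzyme is complexed
    k_m = k_off / k_on            # the processive step regenerates ES, so k_cat drops out
    return v_max, k_m


def steady_state_rate(substrate, k_on, k_cat, k_off, enzyme_total):
    """Quasi-steady-state product formation rate, v = v_max * S / (k_m + S)."""
    v_max, k_m = mm_parameters(k_on, k_cat, k_off, enzyme_total)
    return v_max * substrate / (k_m + substrate)


K_CAT = 4.0    # 1/s
K_OFF = 0.022  # 1/s
K_ON = 0.066   # L/(g*s), hypothetical value and units
E0 = 0.1       # uM, hypothetical enzyme concentration

v_max, k_m = mm_parameters(K_ON, K_CAT, K_OFF, E0)
v_half = steady_state_rate(k_m, K_ON, K_CAT, K_OFF, E0)
print(f"v_max = {v_max:.2f} uM/s, K_m = {k_m:.3f} g/L, v(K_m) = {v_half:.2f} uM/s")
```

Even in this stripped-down form, the apparent MM constant is set by the ratio of dissociation to association rather than by the catalytic step, which is the kind of physical interpretation the quasi-steady-state treatment makes available.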
This study also found that dissociation was unaffected by the presence of the CBM, indicating that the dissociation of a cellulose strand from the CD is slower than CBM dissociation. Structural and molecular modeling studies have provided valuable insight into the molecular basis of the processive cycle of GH7 CBHs. For example, the binding of various substrates and products in GH7 CBHs provides snapshots of the discrete


From the first GH7 CBH and EG structures, aromatic residues that line the substrate-binding tunnel have been identified as being ubiquitously conserved in GH7 enzymes.172,450 The first structure of TrCel7A revealed the presence of four tryptophan residues lining the binding tunnel,172 Trp40, Trp38, Trp367, and Trp376, which stack with the −7, −4, −2, and +1 glucosyl moieties, respectively. TrCel7B maintains these interactions with one variation: Trp38 in Cel7A is Tyr38 in Cel7B.450 Later crystal structures of TrCel7A with long oligosaccharides confirmed that these residues indeed stack face-to-face with the glycosyl moieties of the substrate.173 Mutation of the aromatic residues in CBH tunnels is typically significantly detrimental to enzyme activity. In particular, aromatic residues at the entrance to the binding tunnel have been shown to have a dramatic impact on catalytic function. Koivula published the seminal study on this topic in 1996.574 Therein, mutation of the leading tryptophan residue in the +4 subsite of TrCel6A was demonstrated to abolish activity on crystalline cellulose, but not on amorphous substrates. This study suggested that the leading tryptophan residue might act as a recognition site for acquiring cellulose chains in a crystalline context. von Ossowski et al. first made brief reference to similar results being discovered for TrCel7A, but no data were reported.461 In their initial report on HS-AFM, Igarashi et al. demonstrated that the TrCel7A W40A mutant, wherein the leading tryptophan residue in the −7 subsite is converted to alanine, did not slide on crystalline cellulose.558 This result was later corroborated in a more in-depth HS-AFM study from Nakamura et al.575 In 2013, two computational studies, one included with HS-AFM results from Nakamura and coworkers575 and one from GhattyVenkataKrishna et al.,576 were reported that shed further light on the role of Trp40 in chain acquisition. 
Namely, both studies essentially simultaneously reported that, when a cellononaose chain was placed in silico with the leading glucose residue bound to Trp40, the chain advanced by a cellobiose unit into the CD to bind to the −5 to −7 subsites.575,576 In the aforementioned computational and experimental study of TrCel7A and TrCel6A on cellulose, MD simulation of the full-length enzymes on the surface of cellulose demonstrated that, in both enzymes, the entrance tryptophan residue interacted directly with the first glucose being decrystallized from the substrate.393 It is also noted that, in some GH7 CBHs, there is an aromatic residue in the A1 loop that structurally resides on the opposite side of the conserved tryptophan residue in the −7 subsite. For example, in a ligand-bound structure of LqCel7B, the tyrosine residue at this position was shown to bind to the planar face of the glucosyl moiety bound in the −7 position.383 A computational examination of the HirCel7A CBH structure, which also exhibits a tyrosine residue on the A1 loop, suggested similar binding behavior.449 Beyond the entrance tryptophan residue, no experimental mutation work has been reported, to our knowledge, for GH7 cellulases on the equivalents of Trp38, Trp376, and Trp367 from TrCel7A. Computational investigation of both Cel7A and Cel7B has provided some clues as to how these residues might relate to enzymatic processivity,577 as likely does analogy to work in other GH families. In a computational investigation, Taylor et al. found that mutating these aromatic residues to alanine resulted in reduced binding affinities in all cases, but they did so to different extents between TrCel7A, a CBH, and Cel7B, an EG. The mutational effects were fairly localized in Cel7A,

similar to results found in TrCel6A,578 but they seemed to be completely deleterious to ligand binding in the EG Cel7B.577 In related GHs, Horn et al. and Zakariassen et al. published two seminal studies on GH18 chitinases wherein aromatic mutations closer to the catalytic center of the enzymes resulted in dramatic impacts to processivity, but yielded significantly higher activities on chitosan, a soluble form of chitin.563,579 The authors of those studies suggested that aromatic residues in GH tunnels likely are vital for processive action, but that processivity comes at a cost in enzyme performance. For GH7 cellulases, clearly, further experimental work will be required to fully elucidate the role of the aromatic residues in enzyme tunnels and clefts. Advanced molecular simulation methods have also contributed to the formation of a molecular basis for CBH processivity. Payne et al.580 utilized free energy perturbation with replica-exchange MD to develop a connection between structure and processivity. The binding free energy, ΔG°b, of several family 7 CBHs was calculated, and a novel theoretical connection was made to experimentally observable quantities:

$$\frac{\Delta G_b^{\circ}}{RT} = -\ln\left(\frac{P^{\mathrm{intr}}\,k_{\mathrm{on}}}{k_{\mathrm{cat}}}\right)$$
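The step that links this expression to a conventional binding constant can be written out explicitly. The short derivation below is a sketch: it assumes the definition of intrinsic processivity as the ratio of kcat to koff and the standard thermodynamic sign convention for a binding free energy. The symbol K_b for the association constant (taken relative to the standard-state concentration) is introduced here for illustration and is not notation from ref 580.

```latex
\begin{align}
P^{\mathrm{intr}} = \frac{k_{\mathrm{cat}}}{k_{\mathrm{off}}}
\;&\Longrightarrow\;
\frac{P^{\mathrm{intr}}\,k_{\mathrm{on}}}{k_{\mathrm{cat}}}
 = \frac{k_{\mathrm{on}}}{k_{\mathrm{off}}} \equiv K_b , \\
\Delta G_b^{\circ} = -RT \ln K_b
\;&\Longrightarrow\;
\frac{\Delta G_b^{\circ}}{RT}
 = -\ln\!\left(\frac{P^{\mathrm{intr}}\,k_{\mathrm{on}}}{k_{\mathrm{cat}}}\right).
\end{align}
```

Under these assumptions, a more negative binding free energy corresponds to a larger intrinsic processivity at fixed kon and kcat, which is the sense in which tighter ligand binding and higher processivity go together.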

In a comparison of the binding free energy of TrCel7A with a tunnel-spanning cellononaose ligand to that of a celloheptaose filling only the reactant side of the tunnel, it was calculated that filling the product sites results in stronger binding by 11.1 kcal/mol.580 Normalized by the number of binding sites, the product sites bind the ligand more tightly than the substrate sites. This exceptionally strong binding in the product sites of CBHs is likely a key factor in driving processive motion. Following product expulsion, the product sites are empty, and the cellulose chain must advance forward by one cellobiose unit (Figure 35). This study revealed the significant stabilization experienced by the enzyme-cellulose complex by filling the product sites. Furthermore, tight binding of cellobiose in the product binding sites is reminiscent of the well-known phenomenon of cellobiose inhibition in TrCel7A.581,582 The value reported by Payne et al., −11.1 kcal/mol,580 is in excellent agreement with a prior study by Bu et al., −11.2 kcal/mol,571 obtained using an entirely different computational approach. This prior study highlighted the likely relationship of favorable cellobiose product binding to product inhibition. The effects of cellobiose inhibition on processive cellulolytic action are described further in section 6.3. Pingali et al.583 employed small angle neutron scattering (SANS) in order to probe the effect of pH on the structure of TrCel7A in solution. Their findings indicated that at higher pH (7.0, 6.0, and 5.3) the enzyme shape is well-defined and compact, consistent with the crystal structure.172 However, at pH 4.2 (near the pH for optimal catalytic activity), the CD adopts a conformation that is intermediate between this compact form and a fully denatured state, while maintaining its secondary structure. 
It was speculated that this may indicate enhanced conformational flexibility between the secondary structure elements that would allow increased access for binding to the cellulose chain, thus explaining the optimum in catalytic activity. Molecular simulation has provided insight into the molecular level roots of the pH-dependent, higher-level conformational changes observed via SANS by Pingali et al. The typical protocol for MD simulation involves fixing the protonation state of the protein’s titratable residues based on the solution pH and the residue’s pKa and local environment. However, changes in local


the hydrolytic activity on BC. At 79 μM cellobiose, only 40% of the activity was retained; at 158 μM, all activity was lost. The substrate concentration did not significantly affect measured inhibition, and addition of β-glucosidase greatly increased the solubilization of cotton fibers. Bacterial cellulases from C. thermocellum were later shown to be inhibited by cellobiose and, to a lesser extent, glucose.599 Addition of β-glucosidase from Aspergillus niger was later shown to increase the glucose yield by mitigating cellobiose inhibition.594 The authors suggest that a combination of product and substrate inhibition causes the detected retardation of hydrolysis at high substrate loading (greater than 10%). Lee and Fan studied the kinetics of the T. reesei cellulase system on insoluble cellulose and the effect of extended hydrolysis times.588 They suggested that the reduction in hydrolysis rate may be caused by the inhibitory effect of the formed products but also the transformation of cellulose into a less digestible form of increased crystallinity. They also showed that the cellulase cocktail is more strongly inhibited by cellobiose than glucose. The product inhibition mechanism was suggested to be deactivation of the substrate-adsorbed enzyme and thus uncompetitive inhibition. However, Holtzapple et al. estimated the binding of cellobiose to a cocktail of cellulases from T. reesei and concluded that cellobiose inhibition was noncompetitive.581 The authors discuss the possibility of a regulatory site other than the active site. However, they conclude that the binding of sugar inhibitors is likely at the active site given that the active site is structured to strongly bind polymeric forms of these sugars, the low diffusivity of the insoluble substrate, and the stronger binding of cellobiose versus glucose (because it has more sites of attachment). 
They also compared the glucose and ethanol inhibition of the cellulases, concluding that glucose inhibition was 1.4× greater than ethanol inhibition, indicating that conversion of glucose to ethanol effectively reduces the inhibition. Väljamäe et al. detected cellobiose production by monitoring the solution absorbance of a coupled reaction with CDH.600 They measured initial cellulose hydrolysis rates by TrCel7A both in the presence and in the absence of initial cellobiose, found that the rates follow an identical time course, and concluded that the early rate retardation is not caused by product inhibition.600 Furthermore, adding fresh substrate to an already slowed-down experiment produced a new “burst” phase, indicating that the retardation could not be due to deactivation/inactivation of the enzymes themselves. These conclusions were later affirmed for the bacterial EGs E2 (GH6) and E5 (GH5), where addition of β-glucosidase did not stimulate cellulose hydrolysis when measured on filter paper or PASC.601 Though some early studies measured product inhibition with crystalline substrates,602−605 many focused on small, soluble substrates.461,606−610 Vonhoff et al. measured cellobiose inhibition of TrCel7A on 2-chloro-4-nitrophenol-β-D-lactoside to be Ki = 20 μM,608 a figure which has often been used for comparison in subsequent literature. These findings contributed to strong product inhibition being considered as a source of the cellulose hydrolysis rate retardation seen at low conversion.581,582 Gruno et al. studied product inhibition on one CBH (TrCel7A) and three EGs (TrCel7B, TrCel5A, and TrCel12A) from the T. reesei cellulolytic system via 3H reducing end labeled BC and amorphous cellulose.611 They tracked short time (5−10 s) radioactive product release in the presence of varying concentrations of background cellobiose. With such a

environment and protein conformational changes could result in protonation/deprotonation processes that will be missed in this scheme. Bu et al.479 utilized constant pH MD (CPHMD) in order to directly couple the protonation state of the titratable residues of TrCel7A and TrCel6A to the solution pH conditions. At pH of 5.0, near the pH of optimal activity, the boat (TrCel7A) or skew boat (TrCel6A) conformations for the −1 sugar are favored over the stable chair. Increased loop flexibility, particularly in loops B2 and B3 (and to a lesser extent A2 and A4), was observed in TrCel7A at pH 5 when compared to pH 7 due to altered hydrogen bonding interactions. Similar increased loop flexibility was seen in TrCel6A. These findings have important implications for catalysis and chain processivity, respectively. In addition, comparing the apo to the substrate-bound enzyme revealed significant differences in the pKa values of the active-site titratable residues. Maupin and co-workers applied these techniques to the CBH MaCel7B and determined that the active site was highly charge-coupled.478 In particular, Asp214 and His228 are critical for shuffling protons around the active site and maintaining the catalytically active states of Glu212 (deprotonated) and Glu217 (protonated). Incorporation of the CPHMD and replica exchange MD results into a kinetic model demonstrated that charge coupling between Asp214, Glu217, and His228 was essential to reproducing the experimental kinetic−pH profile. Follow-up work using these tools examined the flexibility of tunnel-enclosing loops as a function of solution pH.478 The findings of Bu et al. were affirmed in that increased loop flexibility with varying pH is the likely source for the pH-dependent morphology seen in the neutron scattering studies.

6.3. Product Inhibition

While strong binding at the product sites of CBHs may be a key factor in driving processive motion, it also has an undesired consequence for cellulose hydrolysis: product inhibition. There is great potential for inhibition of enzyme cocktails due to the exceptionally heterogeneous nature of lignocellulosic hydrolysis, which involves a complex substrate as well as multiple components of the enzyme cocktail. However, the most dramatic inhibition of cellulases is due to the reaction products, namely cellobiose and glucose. Product inhibition retards the overall conversion rate of lignocellulose to the end product glucose and is particularly nefarious at the high substrate loadings utilized industrially.584,585 Additionally, product inhibition has been considered as a factor in the rapid rate retardation that occurs at short time and low conversion.586−588 Various reviews have included discussion of product inhibition,32,33,556 with a recent excellent two-part review focused exclusively on the topic.584,589 Both TrCel7A (CBH) and TrCel7B (EG) are inhibited by cellobiose, and this likely originates from their relatively high binding affinities for cellobiose. As GH7 CBHs constitute the majority of industrial cellulase mixtures, product inhibition constitutes an important consideration for achieving high product yields in the enzymatic hydrolysis of cellulose,556,584 which can have significant impact on the efficiency of biomass conversion.590,591 Though other strategies exist for relieving product inhibition in cellulases, including product removal via membrane filtration585,592 and conversion via cellobiose dehydrogenase (CDH),593 the most commonly employed strategy is conversion of cellobiose to glucose via β-glucosidases.594−597 Early work by Halliwell and Griffin598 on the activity of T. koningii component C1 (CBH) showed that cellobiose inhibited


similar methods in thermophilic/thermotolerant fungi (A. thermophilum, T. aurantiacus, and C. thermophilum) to demonstrate that GH7 CBHs were more sensitive to product inhibition than were GH6 CBHs and EGs.614 Also, cellobiose inhibition decreased as temperature increased. The inhibition measured using 14C-labeled BC did not correlate with that measured using low molecular-weight model substrates suggesting that the latter should not be used to measure product inhibition as a parameter for selecting enzymes for lignocellulose conversion557 and affirming what was concluded by Gruno et al.611 and later by Murphy et al.615 The latter study compared inhibition by cellobiose and glucose for the five major cellulases from T. reesei and confirmed previous findings that TrCel7A was most sensitive to cellobiose. Structural studies of cellulases (section 6.2) have provided some links to the molecular-level origins of the phenomenon of product inhibition. Becker et al. attempted to engineer TrCel7A with regards to its pH optimum by analogy with HiCel7B, which has a higher and broader pH optimum than either TrCel7A or TrCel7B.328 HiCel7B has an extra histidine residue near the catalytic acid/base that the T. reesei enzymes do not have. Five point mutations were made to TrCel7A, adding this histidine as well as changing nearby residues to accommodate the bulkier histidine residue (formerly an alanine). The primary goal of shifting to a more alkaline pH optimum was achieved as well as an unexpected additional consequence: cellobiose inhibition was greatly relieved in the mutant CBH (Ki increased from 20 μM to 755 μM, both on 2-chloro-4-nitrophenol-β-D-lactoside substrate). In addition, inhibition in the wild-type was competitive, whereas, in the mutant, it was mixed. 
This is all the more interesting when one considers that the “binding pose” of the cellobiose in the product sites as well as the directly contacting ligands of the enzyme are nearly identical in the wild-type and the mutant. However, the weaker binding was attributed to the loss of two water-mediated interactions between cellobiose and Thr226 and Asp262. von Ossowski et al.461 presented another protein engineering study wherein the exo loop of TrCel7A was mutated in various ways to connect its structure with catalytic function. One mutant in particular had eight residues deleted from the exo loop, which significantly relieved inhibition by cellobiose (as well as changing its type from competitive to mixed). Ki was measured to be 24 μM for the wild-type TrCel7A on pNPL and 300 μM for the exo loop deletion mutant on 2-chloro-4-nitrophenol-β-D-lactoside. This study also considered PcCel7D (Ki = 180 μM on 2-chloro-4-nitrophenol-β-D-lactoside), which has an intermediate length exo loop owing to the natural deletion of six residues. The authors pinpoint structural roots for the relative inhibitory strengths of these three enzymes. The deletion mutant loses carbohydrate interactions with Tyr247, Thr246, Tyr252, and Arg251 (residue numbering for TrCel7A). However, PcCel7D maintains the Arg251 residue (Arg240 in PcCel7D), which has hydrogen bonding interactions with both of the product site glucosyl residues. Further comparative studies of TrCel7A and PcCel7D examined the binding of several inhibitors in addition to cellobiose.458 It was observed that the loss of hydrogen bonding interactions along with the more open binding tunnel represents the structural basis for weaker binding of cellobiose to PcCel7D, and thus reduced product inhibition.458,461 The more open tunnel provides more access to solvent and more potential disruption of protein−carbohydrate electrostatic interactions. Textor et al. 
presented the crystal structure and inhibition experiments on ThCel7A noting that a key difference between

short time interval, one avoids the effects of continuous alterations in cellulose structure due to hydrolysis as well as the complex kinetics that may be relevant at later times, and thus approximates a steady-state situation. Although the precise nature of the inhibition (competitive or mixed) was not determined, the relative strength of inhibition was calculated (assuming a competitive paradigm, though assuming mixed inhibition gives Ki values within the standard error of one another). The value for the apparent competitive inhibition constant found for TrCel7A on 3H-labeled BC was approximately 1.5 mM,611 or 2 orders of magnitude higher than what had been found for the same enzyme on small, soluble substrates.461,607,608 Thus, product inhibition is actually decreased 100-fold on the more industrially relevant solid, insoluble substrates. Product inhibition for two of the EGs studied (TrCel7B and TrCel5A) was about an order of magnitude weaker than for TrCel7A611 (consistent with what had been found previously on small molecule substrates607,612 and later confirmed on PASC613), and may be due to the more open binding site cleft of EGs. An anomalous trend was found for EG TrCel12A wherein activity actually increased with cellobiose concentration suggesting this EG is activated by the product; however, this is more likely an experimental artifact due to enhancement of transglycosylation.611 A simple kinetic model was also presented wherein the type of inhibition experienced by the enzyme was found to be dependent upon the relative binding strengths of the nonproductive (i.e., product sites empty) and productive (i.e., binding tunnel-spanning) enzyme−substrate complexes. Competitive inhibition results when the productive complex is bound much more tightly than the nonproductive; conventional mixed-type inhibition results when the opposite is true. 
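The contrast between competitive and mixed inhibition, and the practical weight of a Ki of roughly 20 μM versus roughly 1.5 mM, can be illustrated with the textbook Michaelis-Menten rate laws. The Python sketch below is only an illustration: it is not the kinetic model presented by Gruno et al., the function name `relative_rate` and the parameter `alpha` are hypothetical, and applying soluble-substrate rate laws to an insoluble substrate is itself a simplifying assumption.

```python
def relative_rate(s, i, km, ki, alpha=None):
    """Return v_inhibited / v_uninhibited for a Michaelis-Menten enzyme.

    s and km must share one concentration unit; i and ki must share one unit.
    alpha=None  -> purely competitive inhibition.
    alpha=value -> mixed inhibition, where alpha * ki is the (assumed)
                   inhibitor dissociation constant of the enzyme-substrate complex.
    """
    if min(s, km, ki) <= 0 or i < 0:
        raise ValueError("s, km, ki must be positive and i non-negative")
    uninhibited = s / (km + s)
    slope_factor = 1.0 + i / ki  # inflates the apparent Km
    if alpha is None:
        inhibited = s / (km * slope_factor + s)
    else:
        intercept_factor = 1.0 + i / (alpha * ki)  # lowers the apparent Vmax
        inhibited = s / (km * slope_factor + s * intercept_factor)
    return inhibited / uninhibited


if __name__ == "__main__":
    # Assumed illustration values: substrate at its Km, 100 uM cellobiose.
    for label, ki_um in (("soluble substrate, Ki ~ 20 uM", 20.0),
                         ("3H-labeled BC, Ki ~ 1500 uM", 1500.0)):
        frac = relative_rate(s=1.0, i=100.0, km=1.0, ki=ki_um)
        print(f"{label}: {frac:.2f} of the uninhibited rate")
```

Under these assumed conditions the same cellobiose concentration that removes most of the activity at Ki ≈ 20 μM leaves the rate nearly unchanged at Ki ≈ 1.5 mM, which is the sense in which inhibition is much weaker on the insoluble substrate.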
This was a landmark study toward determining the relevance of product inhibition for the industrial production of fuels and chemicals via enzymatic hydrolysis of lignocellulosic biomass. Bommarius et al. presented a study of the hydrolysis of MCC (Avicel) prepared via three different methods of pretreatment.596 Product inhibition mainly affected the hydrolysis rate during “phase I” of hydrolysis, up to about 30% substrate conversion and when the hydrolysis rate is highest. In the later phases when the hydrolysis rate has slowed down, other factors such as “jamming” of enzyme molecules on the substrate were suggested to be rate-determining. In a study of the synergism between endo- and exo-acting cellulases on 14C-labeled BC, Jalak et al. studied cellobiose inhibition557 under both single-turnover and steady-state conditions and concluded that the inhibition of hydrolysis rate by cellobiose was stronger under steady-state than under single-turnover conditions. The product was mainly competing with the cellulose chains for binding to TrCel7A, thereby reducing the number of initiations. Cellobiose also seemed to slow down the processive movement of the CBH on the cellulose chain. This was expected since cellobiose bound in the product site should restrict the possibility of a processive movement and should contribute to a noncompetitive component of the inhibition. However, the calculated values of kinetic constants suggest that the expulsion of cellobiose should not be rate-limiting for the turnover. Additionally, the kinetic analysis led to the suggestion that cellobiose also can bind in the substrate sites in direct competition with the cellulosic substrate. Thus, both competitive and noncompetitive components of inhibition should be involved in cellobiose inhibition of TrCel7A resulting in an overall mixed type of inhibition. The same group later utilized


(addition of β-glucosidases) is complicated by the fact that β-glucosidases themselves are inhibited by glucose585,591,624 and by gluconic acid.625 β-Glucosidase inhibition by gluconic acid is particularly significant because the LPMOs that have become an indispensable part of industrial cellulase cocktails produce gluconic acid in significant amounts.626,627 In addition, β-glucosidases have transglycosylation capabilities628−631 wherein they assemble di- and oligosaccharides rather than deconstruct them, a phenomenon that is exacerbated as glucose concentrations increase.
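To put the binding free energies and inhibition constants quoted in this section on a common scale, the standard relation ΔG° = RT ln(Kd/c°), with c° = 1 M, can be used. The Python sketch below assumes 298.15 K and treats a measured Ki as if it were a dissociation constant, which is an assumption rather than something established in the studies cited; the function names are hypothetical. Calculated absolute binding free energies and measured Ki values come from different systems and conditions, so the conversion only indicates orders of magnitude.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def dg_to_kd(dg_kcal_per_mol, temperature_k=298.15):
    """Dissociation constant (M) implied by a standard binding free energy (1 M standard state)."""
    return math.exp(dg_kcal_per_mol / (R_KCAL * temperature_k))


def kd_to_dg(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) implied by a dissociation constant in M."""
    if kd_molar <= 0:
        raise ValueError("kd_molar must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


if __name__ == "__main__":
    # Free energies reported for cellobiose in the TrCel7A product sites.
    for dg in (-11.2, -14.4):
        print(f"dG = {dg:6.1f} kcal/mol -> Kd ~ {dg_to_kd(dg):.1e} M")
    # Inhibition constants quoted for soluble substrate and for labeled BC,
    # treated here (assumption) as dissociation constants.
    for ki in (20e-6, 1.5e-3):
        print(f"Ki = {ki:.1e} M -> dG ~ {kd_to_dg(ki):.2f} kcal/mol")
```

On this scale a 100-fold change in Ki corresponds to roughly 2.7 kcal/mol at room temperature, since RT ln(100) ≈ 2.7 kcal/mol.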

this enzyme and TrCel7A is in the exo loop mobility, as discussed in section 6.2.466 The exo loop itself is quite similar to that of TrCel7A, but its mobility is enhanced due to a missing interaction with Tyr371 (TrCel7A numbering) from the loop on the opposite side of the binding tunnel (loop A3). This increased loop mobility was intimated by increased temperature factors in the crystal structure for this region and confirmed with MD simulations. This difference may be the key to explaining why this enzyme’s inhibition by cellobiose is significantly reduced, with a Ki = 7.2 mM on pNPC that is more than 2 orders of magnitude greater than that of TrCel7A on 2-chloro-4-nitrophenol-β-D-lactoside. Molecular simulation has served to connect structural roots of product inhibition with thermodynamics. Bu et al. calculated the absolute binding free energy of both glucose and cellobiose in the product sites of TrCel7A. Using two different computational methods, they found the absolute binding free energy of cellobiose to be −14.4 kcal/mol (via steered MD) and −11.2 kcal/mol (via free energy perturbation MD) signifying that cellobiose is 11.2−14.4 kcal/mol more stable in the TrCel7A product sites than in solution.571 In order to pinpoint specific residues that contribute to this strong binding, the product site residues that interacted most strongly with cellobiose in the simulations (Arg251, Asp259, Asp262, Trp376, and Tyr381) were mutated to alanine, each resulting in significantly weakened cellobiose binding. The same group followed up on this work to elucidate how product binding differs between processive and nonprocessive cellulases as well as how the presence of a ligand in the substrate binding sites (−7 to −1) affects the binding strength.572 Absolute binding free energy calculations with GH7 (CBH TrCel7A and EG TrCel7B) and GH6 (CBH TrCel6A and EG H. insolens Cel6B) revealed that cellobiose binding in the product sites is dramatically stronger in CBHs than in EGs lending a thermodynamic basis to what had been observed experimentally.607,611−613 This result was rationalized in terms of the structural differences between EGs and CBHs: the more open architecture of the binding cleft in EGs leads to increased solvation, thus weakening the protein−carbohydrate interactions in the product sites. Payne et al.580 found via an independent computational method that filling the product sites of TrCel7A results in an 11.1 kcal/mol stabilization (in agreement with the results from Bu et al.571). This serves as another confirmation of the exceptionally strong binding that is present in the product sites in CBHs, which is likely a key factor in product inhibition. Theoretical and kinetic modeling studies have also helped to shed light on the reaction kinetics, mechanisms, and enzymatic synergism of product inhibition on enzymatic hydrolysis (refs 581, 582, 588, 591, 600, 602, 605, 609, and 616−619), and this body of literature has recently been reviewed.584,618 Beyond cellobiose product inhibition, cellulase cocktails face other complicating inhibitory factors. For example, many other species besides the cellobiose product exist in the hydrolytic system which also inhibit the GH enzymes, including lignin,620 ethanol,616,617 glucose,581,614 lactose,458,606 various ions,458,598 solvents (including ethanol, butanol, and acetone),581 hemicellulose-derived sugars (including mannose, galactose, and xylose),587,621 xylan and xylooligomers,622 and long (DP 7−16) xylo- and gluco-oligosaccharides produced during pretreatment (which were shown to inhibit T. reesei CBHs 100× more strongly than cellobiose).623 Cellobiose is a stronger inhibitor than each of these with the exception of the last.623 In addition, the common industrial technique for product inhibition relief

6.4. Pyroglutamate

Another common feature of GH7 enzymes, and other secreted proteins from, e.g., T. reesei, is an N-terminal modification of a glutamic acid residue to pyroglutamate, also referred to as pyrrolidone carboxylic acid or PCA (Scheme 2 and Figure 40) (refs 172, 309, 323, 329, 436, 438, 567, and 632−634). This modification was known before the first GH7 crystal structures were solved due to the nonstandard signatures on mass spectrometry observed during N-terminal sequencing.436,438,632 The enzyme responsible for this chemical modification is glutaminyl cyclase.635,636 This post-translational modification is thought to be responsible for protection against degradation by exo-acting peptidase enzymes.637,638 Similarly, the removal of this post-translational modification is well-known via a calf liver pyroglutamate amino peptidase, a method that has been commonly used for preparing eukaryotic proteins for N-terminal sequencing since the 1980s.639 GH7 structures to date have been derived from fungal expression hosts, and as mentioned above, pyroglutamate is ubiquitously observed in these structures. However, the role of this post-translational modification was only recently reported in the academic literature.640 Motivated to understand the significant differences in GH7 activity and stability imparted by heterologous expression in S. cerevisiae, Dana et al. expressed T. emersonii (R. emersonii) Cel7A in S. cerevisiae and Neurospora crassa with a linker and CBM taken from A. thermophilum and Agaricus bisporus, respectively.640,641 Improper disulfide bond formation in S. cerevisiae and the extent of hyperglycosylation were both examined, and reported not to be a major factor in the activity differences. Specifically, free thiols were not detected, suggesting that all cysteine residues were paired, and the yeast expression system used exhibits a deletion in the glycosylation pathway resulting in minimization of hyperglycosylation. Further deglycosylation by EndoH and a mannosidase enzyme resulted in an increase in activity by 25%, but did not fully explain the disparity in activity between the fungal and yeast expressed GH7 enzyme, which led the authors to examine pyroglutamate (Scheme 2). Treatment of the S. cerevisiae-expressed Cel7A with glutaminyl cyclase in vitro resulted in essentially identical activity and stability to the enzyme expressed in N. crassa.640

Scheme 2. Pyroglutamate Chemistry: The Chemistry of N-Terminal Glutamine Cyclization


Figure 40. N-Terminal pyroglutamate exemplified by TeCel7A (PDB code 3PFJ). The ligand from the TrCel7A Michaelis complex is shown in aquamarine “sticks” (PDB code 4C4C).

6.5. Glycosylation

GH7 glycosylation significantly affects activity and represents a promising target for protein engineering efforts. Glycosylation is a post-translational modification that serves a myriad of biological functions in recognition and signaling.642,643 Secreted carbohydrate-active enzymes are often “decorated” by N-linked and O-linked glycosylation, wherein small molecule carbohydrates are covalently attached to specific sites on the protein. N-glycosylation attaches the carbohydrate (generally highly branched mannose or single N-acetylglucosamine residues (GlcNAc)) to the β-amide group of an asparagine in a N−X−S/T motif (where X ≠ P), and O-glycosylation attaches the carbohydrate (generally one to three mannose residues) to the β-hydroxyl group of a serine or threonine.369 Glycosylation of CBMs and linkers is discussed in sections 5.1 and 5.2, respectively. In what follows, we focus our attention on the glycosylation of the catalytic domain (CD) of GH7 cellulases. Early work by Maras et al. characterized six dominant forms of N-glycans on TrCel7A produced from the RUT-C30 strain:644 GlcMan8GlcNAc2, GlcMan7GlcNAc2, Man7GlcNAc2, ManPGlcMan7GlcNAc2, GlcMan5GlcNAc2, and Man5GlcNAc2. Klarskov et al. successfully identified three sites for glycosylation in TrCel7A (out of four putative N-glycan attachment sites).456 Liquid chromatography coupled electrospray ionization mass spectrometry (LC-ESMS) results indicated that a single GlcNAc residue was attached at Asn45, Asn270, and Asn384. The relatively small glycan prompted speculation that the glycan had been trimmed by an endoglycosidase. The three sites identified in this early study have been subsequently confirmed as the primary attachment sites for N-glycosylation on the TrCel7A CD (Figure 41).369 For example, Harrison et al. analyzed a particular strain of T. reesei, ALKO2877, and confirmed via ESMS that only three of the four postulated TrCel7A N-glycosylation sites actually possessed attached sugar residues.368 They also demonstrated via monosaccharide analysis that only single GlcNAc residues were attached at these sites. O-glycosylation of GH7 CDs has not been reported to date, in contrast to the linker and CBM (see section 5).368,645

Figure 41. TrCel7A CD glycosylation. The CD of TrCel7A is shown in green transparent “surface”, N-glycans are shown in slate and red “sticks”, and the cellulose surface is shown in aquamarine and red “sticks”. N-glycosylation attaches to the TrCel7A CD at 3 locations: Asn45, Asn270, and Asn384. Blue squares denote GlcNAc, and green circles denote mannose. Both images show a representative glycan at each binding site, though variability has been observed experimentally, as discussed in the text.
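The N−X−S/T sequon rule described above (X ≠ P) lends itself to a simple sequence scan. The Python sketch below is a minimal illustration: the function name `find_sequons` and the toy sequences are hypothetical and are not the TrCel7A sequence, and a matching motif marks only a putative site, since not every putative site is found to carry a glycan.

```python
import re

# Lookahead so that overlapping motifs (e.g. "NNSS") are all reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(sequence):
    """Return (1-based Asn position, motif) for every N-X-S/T match with X != P."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(seq)]


if __name__ == "__main__":
    toy = "MKNGSANPTNATA"  # hypothetical peptide, not a real cellulase
    for position, motif in find_sequons(toy):
        print(f"putative N-glycosylation site at Asn{position}: {motif}")

    # A point mutation two residues after an Asn can create a sequon
    # (toy example of the N-X-S motif being introduced).
    before, after = "AANGAAA", "AANGSAA"
    print(find_sequons(before), "->", find_sequons(after))
```

Such a scan only enumerates candidate sites; which sites are occupied, and by which glycan, has to be determined experimentally, for example by mass spectrometry.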


Hui et al. examined TrCel7A from RUT-C30 and two derivative strains via MS and found that single GlcNAc residues were predominantly found at Asn45 and Asn384 for all three strains, but high mannose chains tended to attach at Asn270 with a strain-dependent degree of variability.273 The Asn270-linked glycan was identified as Man8GlcNAc2 in RUT-C30, single GlcNAc in one derivative strain, and a mixture of the two in the other derivative strain. These differences may be due to differences in the presence of glycan-trimming glycosidases, which vary with strain and growth conditions. This may also be the source of the shorter (and more consistent) length of glycans attached to Asn45 and Asn384: perhaps steric hindrance prevents glycosidases from accessing Asn270 glycans to the extent that they do at Asn45 and Asn384. Follow-up work by this group on TrCel7B revealed Asn56 (single GlcNAc residue) and Asn182 (Man8GlcNAc2) as glycan-attachment sites for this EG.418 The dependence of the glycosylation pattern of TrCel7A on growth conditions was probed systematically by Stals et al.419 Four different growth conditions were examined: minimal medium (resulting in a solution pH of 2.5), corn steep liquor-enriched medium (pH 5), CaCO3-supplemented minimal medium (pH 7), and fed-batch cultivation (pH 4). Differences in growth conditions were shown to affect the number of glycosylation sites that were occupied (ranging from zero to three), attached glycan (Man5GlcNAc2 to Man8GlcNAc2), glucosylation (i.e., a glucosyl moiety attached to the terminus of the glycan, e.g., GlcMan8GlcNAc2), phosphorylation (e.g., ManPGlcMan8GlcNAc2), and frequency of single GlcNAc attachment. Minimal medium tended to give a higher degree of glycan occupancy than rich medium. Phosphorylation and terminal glucosylation did not trend discernibly with the growth medium conditions. The observation from Hui et al.273 that Asn270-linked glycans tend to be more resistant to hydrolytic trimming was affirmed. 
Stals et al. also studied the effect on glycosylation produced by different strains, all under minimal medium growth conditions.646 Wild-type strain QM6A, RUT-C30 mutant, and four other high-cellulase producing mutants (RL-P37, QM9414, VTT-D-80133, and VTT-D-78085) were compared. RUT-C30 and RL-P37 (both selected using ultraviolet light and 2-deoxyglucose-supplemented media) produced longer glycans (GlcMan7−8GlcNAc2) whereas the wild-type and other mutants produced trimmed glycans (Man5−6GlcNAc2), which may indicate an inefficient glucosidase in the endoplasmic reticulum in the former two strains. Interestingly, these two strains secrete 3−5 times more total cellulase than wild-type QM6A or QM9414,647 though the connection between extended glycans and cellulase production was not directly elucidated. Jeoh et al. expressed TrCel7A in A. niger, resulting in 6-fold increased N-glycan attachment to the CD.645 The increased glycosylation resulted in a concomitant decrease in activity (on both bacterial MCC and PASC) and increase in nonproductive binding (as compared to homologously expressed TrCel7A). The original activity could be restored by supplementing with N-glycosidase that trimmed back the glycans. The decrease in activity and binding relative to homologous TrCel7A were less dramatic (though not eradicated) when the CBM and linker were proteolytically cleaved. The recombinant CD displayed a reduced binding affinity compared to homologous TrCel7A, indicating that the glycans may sterically hinder its attachment to the substrate surface (and thus account for the reduced activity).

A complementary study by Adney et al. utilized site-directed mutagenesis to individually manipulate the glycan attachment sites on the Cel7A CD from T. reesei and Penicillium funiculosum.648 Subsequent expression in A. niger and characterization revealed all mutants with glycosylation removed (two for TrCel7A and three for PfCel7A) exhibited improved cellulose conversion. The N384A mutant for TrCel7A, however, had a much more dramatic effect than did the N270A mutant, increasing the activity by 70%. This residue resides on a substrate-binding tunnel loop and is near to the cellulose surface when the enzyme is bound. This was significant in that it linked glycosylation function with the enzyme structure and attached glycans. In addition to three mutants that removed glycosylation, an additional mutant for PfCel7A (A196S, creating an N−X−S motif) introduced a new glycosylation attachment site at Asn194. This increase in glycosylation actually gave the most dramatic improvement (85% increase in activity) for any of the PfCel7A mutants. Thermal stability also tended to be reduced for the mutants versus the wild-type, both for addition and removal of glycosylation. Gao et al. utilized homologous expression of four glycoforms of P. decumbens Cel7A to probe N-glycosylation at two attachment sites, Asn137 and Asn470.508 They found ManxGlcNAcy residues (where x = 2−5 and y = 1−2) attached at Asn137. The glycoform with the lowest amount of mannose in its glycosylation was shown to have zero detectable activity on pNPC, but it was only this glycoform that synergistically increased the glucose yield of an industrial enzyme cocktail (by a factor of 2). Site-directed mutagenesis of Asn137 and Asn470 to aspartate (thus removing the glycan attachment sites) resulted in higher activities (the double mutant had 65% higher activity than the most active of the four glycoforms). 
These studies have demonstrated that it is possible to characterize and manipulate glycosylation to alter cellulase activity. Looking to the future, studies that can elucidate the underlying structural features and interactions that give rise to the observed differences in enzymatic function (e.g., hydrolytic activity and thermal stability) will be particularly valuable for harnessing glycosylation for improved cellulases.

6.6. Protein Engineering

As mentioned multiple times, GH7 enzymes are often the most prevalent members of fungal cellulolytic cocktails and provide a significant amount of hydrolytic potential. Thus, unsurprisingly, they have received significant attention in academic, government, and industrial research laboratories for engineering higher activity, higher thermal stability, and advantageous modification of other properties. However, a major problem in the engineering of GH7 cellulases, especially GH7 CBHs, is the use of a reliable expression host that is able to properly fold and glycosylate the enzymes such that the activity and stability are comparable to those expressed in filamentous fungi. To date, it seems that yeast such as S. cerevisiae is able to express certain GH7 CBHs such as TeCel7A (albeit still at low levels compared to filamentous fungi) but not others such as TrCel7A.289 The sequence relationships that give rise to this nonuniform expression level are unknown but are essential to understand in order to develop high-throughput expression systems for GH7 CBH expression and screening. Nevertheless, significant efforts in the open literature either using rational design or semirandom mutations identified through computational methods have demonstrated that GH7 engineering is feasible


at 75 °C, thus demonstrating that the addition of disulfide bonds can improve the thermal stability of GH7 CBHs. Arnold and co-workers from the California Institute of Technology have also made substantial contributions to GH7 CBH engineering primarily on the basis of computational prediction tools such as structure-guided recombination.650−652 In 2010, Heinzelman et al. used five GH7 CBH parent enzymes to predict and express (in S. cerevisiae) 28 chimeric GH7 CBHs, on the basis of a “background” structure of the TeCel7A enzyme, chosen due to its high expression levels in yeast.650 The authors found multiple combinations of mutations (with an average of 37 mutations relative to the closest parent enzyme) that resulted in substantial thermal stability improvements. Moreover, activity on MCC for 6 chimeras was slightly retained at higher temperatures than the parent enzymes (68, 70 °C). Interestingly, activity at 37 °C in 6 chimeras was improved, also on MCC, but the overall conversion time was quite short (90 min) when the relative activity measurements were made. The same group reported a follow-up study in 2012 from Komor et al.651 starting with 5 chimeras from the previous work. Therein, they used a method based on GH7 CBH sequence alignments and the FoldX force field to predict the effect on ΔGfolding for each amino acid residue. Using this approach, the authors were able to demonstrate an improved variant with 8 additional stabilizing mutations that exhibits a T50 of 72.1 °C, which is a substantial increase in thermal stability. This mutant also retained a significant extent of activity on MCC up to 70 °C. Via yet another method from the same group, namely noncontiguous recombination, Smith et al. located 6 single amino acid mutations that each improve the TrCel7A stability by 1−3 °C.652 Overall, these studies from Arnold et al. 
demonstrate that computational protein design tools combined with large screening studies can identify both single-point mutations and blocks of enzymes that can directly contribute to improved stability. High-throughput, random mutagenesis has also proven to be effective, given a reliable secretion system. Dana et al. reported the development of an S. cerevisiae strain capable of producing GH7 CBHs with limited glycosylation at consistent titers.641 Using this system, they developed a library of GH7 CBHs with biased clique shuffling starting with 11 parent Cel7A genes, 86% of which were active. Overall, 51 chimeras were identified with improved thermal stability, and several were shown to have activity on Avicel hydrolysis significantly higher than TeCel7A at 60 and 65 °C.
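The thermal stability comparisons above rely on metrics such as T50. As a minimal sketch, assuming T50 is taken as the temperature at which half of the initial activity remains after a fixed incubation, the value can be estimated by linear interpolation between the two measurements that bracket 50% residual activity. The function name and all data points below are hypothetical and are not taken from the cited studies.

```python
def t50_by_interpolation(temps_c, residual_activity):
    """Estimate the temperature at which residual activity crosses 0.5.

    temps_c: incubation temperatures in ascending order (deg C).
    residual_activity: fraction of initial activity remaining at each temperature.
    """
    points = list(zip(temps_c, residual_activity))
    for (t_lo, a_lo), (t_hi, a_hi) in zip(points, points[1:]):
        if a_lo >= 0.5 >= a_hi and a_lo != a_hi:
            # Linear interpolation between the two bracketing measurements.
            return t_lo + (a_lo - 0.5) * (t_hi - t_lo) / (a_lo - a_hi)
    raise ValueError("residual activity never crosses 0.5 in the measured range")


# Hypothetical residual-activity curves for a parent enzyme and a stabilized variant.
temps = [55.0, 60.0, 65.0, 70.0, 75.0]
parent = [1.00, 0.90, 0.50, 0.10, 0.00]
variant = [1.00, 1.00, 0.90, 0.60, 0.20]

t50_parent = t50_by_interpolation(temps, parent)
t50_variant = t50_by_interpolation(temps, variant)
delta_t50 = t50_variant - t50_parent  # stabilization attributed to the mutations
```

In practice a sigmoidal fit over all points is likely to be more robust than two-point interpolation; the interpolation is used here only to keep the sketch short.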

to date, as briefly reviewed here. We note that we do not review the patent literature in detail here. Early work in GH7 CBH engineering from Becker et al.328 and Boer and Koivula327 demonstrated that it is possible to rationally shift the pH optimum of TrCel7A toward a more alkaline optimum through site-directed mutagenesis. This pH optimum shift was based on comparison of the TrCel7A (CBH) active site to the HiCel7B (EG) active site, the latter of which contains a histidine residue near the acid/base residue (Glu217 in TrCel7A), which corresponds to an alanine residue in TrCel7A. To accommodate the bulky histidine residue in place of alanine in the TrCel7A active site, four additional mutations were conducted, resulting in an enzyme with a basic pKa shift of approximately +0.7 pKa units, but a lower kcat/KM overall compared to the wild-type on 3,4-dinitrophenyl-β-D-lactoside. Boer and Koivula subsequently examined the wild-type and the same pentamutant TrCel7A to understand the enzyme stability as a function of pH. They demonstrated that the pentamutant was less stable overall than the wild-type at both acidic and alkaline pH, suggesting that mutations to shift pH optima should go hand in hand with mutations aimed at improving thermal stability.327 Voutilainen and co-workers from VTT have also published an extensive series of studies to improve the thermal stability of GH7 CBHs via several different methods.381,504,505,649 In a study from 2007, they used a high-throughput robotics system to generate a library of random mutants of MaCel7B expressed in S. cerevisiae, and screened activity as a function of temperature on 4-methylumbelliferyl-β-D-lactoside.504 The authors found that a single mutation in the hydrophobic core of the enzyme (S290T) was able to improve the Tm of the enzyme by 1.5 °C and was able to double the activity of the enzyme on Avicel at 70 °C compared to the wild-type. In 2009, Voutilainen et al. 
described a subsequent engineering study using MaCel7B wherein the S290T mutation was combined with the addition of a tenth disulfide bridge near the tunnel entrance (positionally analogous to the tenth disulfide bridge in the TrCel7A CD), to produce a 4.5 °C increase in Tm.505 The authors also employed the strategy of expressing MaCel7B with the TrCel7A CBM-linker domain attached, which resulted in a 2.5 °C increase in Tm. Comparison of the MaCel7B activity to TrCel7A (both enzymes were studied as both full-length enzymes with CBM-linker and as solitary CDs) demonstrated that, at 45 °C, TrCel7A in both cases was more active, but MaCel7B was significantly more active at 70 °C, also in both cases. The lower overall activity of MaCel7B at lower temperature was attributed to a higher degree of product inhibition. Nevertheless, these two studies demonstrated the ability to engineer a GH7 CBH for higher stability and activity through a combination of both random and rational mutagenesis.504,505 The same group from VTT also examined means to improve the thermal stability of another thermophilic GH7 CBH, namely TeCel7A, expressed in S. cerevisiae.649 Therein, they used a computational prediction tool to predict 5 new sites for engineering in disulfide bonds, all near the active site tunnel. Like MaCel7B, TeCel7A natively contains 9 disulfide bonds, and the addition of 3 independent disulfides increased the Tm in the range 3.5−5 °C each. The combination of all three successful engineered disulfide bonds increased the Tm by 9 °C. Two single disulfide bond mutants and a triple mutant with all three disulfide bonds added improved the activity on Avicel at 75 °C, and the optimal triple mutant was able to hydrolyze Avicel at 80 °C with only slightly reduced performance relative to its activity

6.7. Conclusions

Our collective understanding of GH7 cellulases to date is quite considerable, especially given recent developments toward more quantitative underpinnings of their action on cellulose. In summary, the following important features of GH7 cellulases have been elucidated: (1) GH7 cellulases employ a two-step, retaining mechanism, and CBHs and EGs form two distinct populations that differ primarily in their loop structures near the ligand. (2) GH7 CBHs can engage cellulose chains via exo-initiation or endo-initiation, i.e., binding to chains by the end or via internal binding to chains. (3) The rate-limiting step in GH7 CBH action in the absence of synergistic enzymes is likely to be substrate dissociation, caused either by obstacles or by amorphous regions of cellulose.


Figure 42. TrCel6A structure and active site details. (A) The overall structure of the TrCel6A CD from PDB code 1QK2, showing the distorted β/α barrel, with labeled structural elements.194 (B) The TrCel6A structure rotated to show the active site tunnel enclosed by an N-terminal and C-terminal loop (pink and blue cartoon, respectively). (C) Four of the TrCel6A binding sites (−2 to +2) revealed by the cocrystallized nonhydrolyzable o-iodobenzyl-1-thio-β-D-cellobioside ligand (cyan stick). Aromatic residues are shown in magenta stick, and residues that have been hypothesized to participate in catalysis are shown in green stick. The water molecules likely involved in catalysis are shown as red spheres.

amounts of enzyme for biochemical, structural, and enzyme performance studies. Slowly emerging tools in specialized yeast strains, engineering of filamentous fungi, and cell-free expression systems for these enzymes, combined with a deeper understanding of the effects of post-translational modifications on activity and stability, will eventually enable an era wherein rapid screening of GH7 cellulases from natural diversity and engineered enzymes will be possible. The recent publication of the effect of pyroglutamate clearly established its importance, if not yet the mechanism by which it functions, in enzyme activity and stability.640 Clearly, a more comprehensive understanding of GH7 glycosylation is now of paramount importance, an area which has been only slightly reported on to date in the academic literature given the complexity of detailed glycan analysis.369 On the basis of quite recent results, it seems that the glycosylation step in the retaining mechanism is the primary rate limitation of GH7 CBHs. If this conclusion is correct, which seems logical given the strength of glycosidic bonds, then the primary target for GH7 improvement is kcat. Certainly, future kinetics studies similar to that of Kurašin et al. in the presence of additional cellulolytic enzymes (e.g., GH6 cellulases, additional EGs, β-glucosidases, and LPMOs) on a broader range of substrates will continue to inform the direct target for GH7 improvement. Furthermore, the development of additional, more utilitarian methods for studying processivity of GH7 cellulases in the presence of additional enzymes will be of paramount importance for the systematic comparison of GH7

(4) The rate-limiting step in GH7 CBH action in the presence of synergistic enzymes is likely the processive velocity (i.e., the combined steps of hydrolysis, product expulsion, and processive motion). Synergistic enzymes, especially those that provide endolytic action, likely provide points of detachment for GH7 CBHs, thus removing the rate limitation of substrate dissociation. (5) Computational studies of the entire GH7 CBH processive cycle suggest that the glycosylation step in hydrolysis is the rate-limiting step among hydrolysis, product expulsion, and processive motion. (6) The CBM-linker domains on GH7 cellulases seemingly do not significantly contribute to the enzyme performance once the enzyme is productively bound to the substrate, but likely play a major role in enzyme targeting, and thus impact the overall enzyme performance. Although GH7s represent one of the most well studied (if not the most well studied) classes of cellulolytic enzymes to date, there are many open questions based on our current understanding of their action. Given their significant presence in industrial cocktails combined with their powerful hydrolytic action, engineering these enzymes is of keen importance for industrial biomass conversion. The development of accurate structure−activity relationships for the GH7 cellulases still remains nascent. One of the primary challenges in studying these enzymes is expression in convenient heterologous expression systems, which severely limits high-throughput production of adequate


Table 8. Reported GH6 Crystal Structures

source and original name in primary citation | PDB code | resolution (Å) | brief highlights | ref

Fungal CBH Structures
Trichoderma reesei CBHII/Cel6A | 3CBH | 2.00 | first GH6 structure reported; cocrystallized with a nonhydrolyzable ligand (o-iodobenzyl-1-thio-β-D-cellobioside) | 192
Trichoderma reesei CBHII/Cel6A | 1CB2 | 2.00 | Y169F mutant | 574
Trichoderma reesei CBHII/Cel6A | 1QK2 | 2.00 | wild-type with (Glc)2-S-(Glc)2 | 194
Trichoderma reesei CBHII/Cel6A | 1QJW | 1.90 | Y169F mutant with (Glc)2-S-(Glc)2 | 194
Trichoderma reesei CBHII/Cel6A | 1QK0 | 2.10 | wild-type with m-iodobenzyl-β-D-glucopyranosyl-β(1,4)-D-xylopyranoside | 194
Trichoderma reesei CBHII/Cel6A | 1HGW | 2.10 | D175A mutant | 656
Trichoderma reesei CBHII/Cel6A | 1HGY | 2.20 | D221A mutant with β-D-glucose | 656
Trichoderma reesei CBHII/Cel6A | 4AU0 | 1.70 | D221A mutant with 6-chloro-4-methylumbelliferyl-β-cellobioside | 704
Trichoderma reesei CBHII/Cel6A | 4AX6 | 2.30 | D221A mutant with 6-chloro-4-phenylumbelliferyl-β-cellobioside | 704
Trichoderma reesei CBHII/Cel6A | 4AX7 | 1.70 | D221A mutant with 4-methylumbelliferyl-β-cellobioside | 704
Humicola insolens Cel6A | 1BVW | 1.92 | | 659
Humicola insolens Cel6A | 2BVW | 1.70 | binding of glucose in the −2 subsite and cellotetraose in +1 through +4 subsite confirms the existence of 6 subsites | 662
Humicola insolens Cel6A | 1GZ1 | 1.90 | D416A mutant with (Glc)2-S-(Glc)2 | 209
Humicola insolens Cel6A | 1OCB | 1.75 | wild-type complexed with fluoresceinylthioureido-derivatized tetrasaccharide | 665
Humicola insolens Cel6A | 1OC6 | 1.50 | D405N mutant | 665
Humicola insolens Cel6A | 1OC5 | 1.70 | D405N complexed with 4II-thio-β-cellotetraoside substrate | 665
Humicola insolens Cel6A | 1OC7 | 1.11 | D405N complexed with methyl-4,4II,4III,4IV-tetrathio-α-cellopentoside substrate | 665
Humicola insolens Cel6A | 1OCJ | 1.30 | D416A complexed with methyl-4,4II,4III,4IV-tetrathio-α-cellopentoside substrate | 665
Humicola insolens Cel6A | 1OCN | 1.31 | D416A complexed with cellobio-derived isofagomine | 666
Coprinopsis cinerea Cel6C | 3A64 | 1.60 | | 670
Coprinopsis cinerea Cel6C | 3ABX | 1.40 | wild-type complexed with p-nitrophenyl-β-D-cellotrioside | 670
Coprinopsis cinerea Cel6C | 3A9B | 1.20 | wild-type complexed with cellobiose | 670
Coprinopsis cinerea Cel6A | 3VOF | 1.60 | D102A complexed with glucose | 671
Coprinopsis cinerea Cel6A | 3VOG | 1.45 | wild-type complexed with HEPES | 671
Coprinopsis cinerea Cel6A | 3VOH | 2.40 | wild-type complexed with cellobiose | 671
Coprinopsis cinerea Cel6A | 3VOI | 2.00 | wild-type complexed with p-nitrophenyl-β-D-cellotrioside and with a Mg2+ ion in the active site | 671
Coprinopsis cinerea Cel6A | 3VOJ | 2.29 | D164A | 671
Chaetomium thermophilum Cel6A | 4A05 | 1.90 | wild-type complexed with cellobiose and cellotetraose with a Li+ ion in the active site | 672
HJPlus chimera Cel6A | 4I5R | 1.50 | chimera of H. insolens, T. reesei, and C. thermophilum | 674
3C6P chimera Cel6A | 4I5U | 1.22 | chimera of H. insolens, T. reesei, and C. thermophilum | 674

Fungal EG Structure
Humicola insolens Cel6B | 1DYS | 1.60 | | 663

Bacterial CBH Structures
Thermobifida fusca E3/Cel6B | 4B4H | 1.50 | wild-type | 683
Thermobifida fusca E3/Cel6B | 4B4F | 2.20 | wild-type complexed with cellohexaose (chain A) or cellotetraose (chain B) | 683
Thermobifida fusca E3/Cel6B | 4AVO | 1.80 | D274A complexed with cellohexaose | 686
Thermobifida fusca E3/Cel6B | 4AVN | 2.00 | D226A/S232A complexed with glucose and cellotetraose | 686

Bacterial EG Structures
Thermobifida fusca E2 (Cel6A) | 1TML | 1.80 | this apo structure was virtually identical to the previously reported fungal GH6 CBH, but lacked one of the tunnel-enclosing loops | 658
Thermobifida fusca E2 (Cel6A) | 2BOD | 1.50 | wild-type complexed with (Glc)2-S-(Glc)2 | 688
Thermobifida fusca E2 (Cel6A) | 2BOE | 1.15 | Y73S mutant | 688
Thermobifida fusca E2 (Cel6A) | 2BOF | 1.64 | Y73S complexed with cellotetraose | 688
Thermobifida fusca E2 (Cel6A) | 2BOG | 1.04 | Y73S complexed with (Glc)2-S-(Glc)2 | 688
Mycobacterium tuberculosis H37Rv Cel6 | 1UP0 | 1.75 | wild-type complexed with cellobiose | 667
Mycobacterium tuberculosis H37Rv Cel6 | 1UP3 | 1.6 | wild-type complexed with (Glc)2-S-(Glc)2 | 667
Mycobacterium tuberculosis H37Rv Cel6 | 1UOZ | 1.10 | wild-type complexed with thio-cellopentaoside | 667
Mycobacterium tuberculosis H37Rv Cel6 | 1UP2 | 1.9 | wild-type complexed with cellobio-derived isofagomine | 667

Streptomyces were identified as sharing over 20% sequence identity. The tertiary structure of these enzymes was unknown at the time, but only a year later, TrCel6A would be the first reported cellulase structure providing critical insight into both the general structure of this family and key modes of CBH action.

enzyme performance across substrates and in cocktails of varying composition.
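Points (3) and (4) of the summary can be illustrated with a deliberately simple run-and-stall picture. The following is a toy sketch, not a model from the cited kinetics studies: it assumes a bound CBH completes a fixed number of catalytic events at rate k_cat, then stalls and must dissociate with rate constant k_off before starting a new run. The function names and every parameter value are illustrative placeholders.

```python
def apparent_rate(k_cat, k_off, run_length):
    """Average catalytic events per second over repeated run-and-stall cycles.

    k_cat: catalytic events per second while the enzyme is moving (1/s).
    k_off: dissociation rate constant of a stalled enzyme (1/s).
    run_length: catalytic events completed before the enzyme stalls.
    """
    time_moving = run_length / k_cat
    time_stalled = 1.0 / k_off  # mean waiting time before dissociation
    return run_length / (time_moving + time_stalled)


def stalled_fraction(k_cat, k_off, run_length):
    """Fraction of each cycle that the enzyme spends stalled on the substrate."""
    time_moving = run_length / k_cat
    time_stalled = 1.0 / k_off
    return time_stalled / (time_moving + time_stalled)


# Illustrative placeholder values only (not measured constants).
k_cat = 2.0
run_length = 50

# Slow dissociation: the cycle is dominated by the stalled state.
slow_release = apparent_rate(k_cat, k_off=0.001, run_length=run_length)

# Fast detachment (e.g., if endo-acting partners supply detachment points):
# the apparent rate approaches the processive velocity k_cat.
fast_release = apparent_rate(k_cat, k_off=1.0, run_length=run_length)
```

With slow dissociation the apparent rate falls far below k_cat because the enzyme spends most of each cycle stalled; with fast detachment the apparent rate approaches k_cat, which is the regime in which improving the catalytic step itself becomes the relevant engineering target.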

7. FAMILY 6 GLYCOSIDE HYDROLASES

GH6 was one of the original six families identified through hydrophobic cluster analysis.439 Originally denoted family B GHs, T. reesei CBH II and two EGs from C. fimi and


Figure 43. Sequence alignment of the six fungal GH6 structures reviewed here. Strictly conserved residues are shown in red block, and chemically similar residues in red text. The blue boxes indicate chemical similarity across a grouping of residues. The secondary structural elements of TrCel6A are shown above the sequences. The catalytic acid is marked by a yellow star. Residues with hypothesized or confirmed roles in catalysis are marked by magenta stars. Active site loops A and B are marked by black boxes. The figure was generated with ESPript (http://espript.ibcp.fr).347
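Percent sequence identity and strictly conserved positions, as quoted throughout this section, can both be read directly from an alignment. The following is a minimal sketch under stated assumptions: identity is computed over columns where both sequences have a residue (other conventions divide by the alignment length or the shorter sequence length), and the three-sequence alignment is a toy example, not GH6 data. Function names are hypothetical.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over columns where both aligned sequences have a residue."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not compared:
        return 0.0
    matches = sum(1 for a, b in compared if a == b)
    return 100.0 * matches / len(compared)


def strictly_conserved_columns(alignment, gap="-"):
    """Return 0-based column indices where every sequence has the same residue."""
    return [
        index
        for index, column in enumerate(zip(*alignment))
        if column[0] != gap and len(set(column)) == 1
    ]


# Toy alignment (hypothetical sequences, one gap column).
toy_alignment = ["ACDW-GK", "ACEWYGK", "TCDW-GR"]

identity_first_pair = percent_identity(toy_alignment[0], toy_alignment[1])
conserved = strictly_conserved_columns(toy_alignment)
```

Because the choice of denominator changes the reported number, identity values from different studies are only comparable when the same convention is used.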

categorized into eight subfamilies with accessory loop lengths defining the features of the different groups.655 Today, GH6 encompasses nearly 500 protein sequences from both bacteria and eukaryota remaining one of the smaller designated

By 1991, membership of this family had grown by one to include Microbispora bispora EG A.654 Family B was renamed family 6, reflecting the growing availability of sequence data.654 Recent sequence analysis suggests GH6 enzymes may be further


families;151 we stress that this by no means diminishes the value of this family in industrial biomass conversion. Members of the GH6 family have been experimentally characterized as exhibiting two primary modes of action, endohydrolytic cellulose action (EC 3.2.1.4) and exo-hydrolytic cellulose action (EC 3.2.1.91), where many are thought to display both to some extent. Ståhlberg et al. proposed that all GH6s can create new chain ends in cellulose as an EG would, and thus, none are purely exoglucanases.304 While cellulolytic action in general is critical to industrial biomass conversion, GH6s play a particularly unique role by virtue of their complementary mode of action. The GH6 family is currently the only known family having cellulases that act from the nonreducing cellulose chain end.151 The most effective enzyme cocktails contain cellulases that recognize both the reducing and nonreducing ends allowing rapid, synergistic degradation of crystalline content.305,436,476,540 As such, GH6 cellulases are primary components in biomass degradation cocktails. In addition to their industrial relevance, GH6s have been critical instruments in the general study of cellulase action. TrCel6A was first uncovered in the characterization of the culture filtrate components from the enhanced QM 9414 Trichoderma strain,435 though the enzyme would not explicitly be named until 1980 as CBHII.436 Several years later, Teeri et al. isolated and sequenced the CBHII gene adding a second sequence by which to compare CBHI (TrCel7A).325 The solution of the T. 
reesei CBHII CD 3-D structure marked a major turning point in cellulase research.192 Biochemical characterization prior to this had successfully identified CBHII as a modular enzyme consisting of a CD and a CBM.344,399 Additionally, key details regarding the CBHII mode of action had been uncovered including that the enzyme acts processively from the nonreducing end of cellulose and that it yields primarily cellobiose through an anomeric inversion stereochemical pathway.302,443 With the solution of the first GH6 crystal structure, details regarding processive action began to emerge.192 In addition to revealing the overall enzyme typology and active site location (Figure 42), the CBHII structure solved by Rouvinen et al. uncovered a central question motivating many subsequent structural and biochemical studies, namely the lack of a readily identifiable catalytic base, which is reviewed at length here. As we note below, although there was significant debate for many years, there is now a general consensus that the GH6 catalytic mechanism proceeds via a “water wire” or Grotthuss mechanism potentially with or without an explicit catalytic base on the enzyme, although this still lacks definitive confirmation.573,656 Over time, CBHII would receive its current designation of Cel6A (TrCel6A), which we use throughout the remainder of our discussion.657 In this section, we describe our current understanding of fungal GH6 cellulase structure, function, and efforts to engineer thermal stability and activity in these important enzymes. We describe the hypothesized catalytic mechanism and the structural and biochemical studies that support the proposed mechanism, the many studies describing processive action and, to a lesser extent, the synergism studies. As bacterial GH6 cellulases often provide interesting and insightful comparison to fungal GH6s, we occasionally deviate from the primary focus of the review on what we hope is an informative detour. 
A summary of discussed GH6 structures is provided in Table 8.

7.1. Structural Studies

7.1.1. TrCel6A: Wild-Type. The first cellulase structure revealed the catalytic core of TrCel6A and the general GH6 fold as a distorted β/α barrel (PDB code 3CBH192). Much like the classic (β/α)8 barrel, this structure consists of a core region of β-strands surrounded by a constellation of α-helices. The noted distortion arises from the exclusion of one of the eight β-strands from within the core barrel region. The C-terminal loop connecting the β7 strand to the α8 helix (shown in blue in Figure 42B) and the loop connecting the β2 strand to the α4 helix (shown in pink in Figure 42B) form the enzyme active site. For convenience, we have labeled these loops as loop A and loop B, respectively, shown on the multiple sequence alignment in Figure 43. The active site region of TrCel6A is a 20 Å-long tunnel formed by two loops, loops A and B.192 The loops partially encompass the cellulose substrate and several water molecules. In subsequent structures, the flexibility of these loops would be elucidated as well as the potential relevance in nucleophilic attack (Figure 42B).194 The enclosure somewhat restricts the tunnel volume and thus the substrate flexibility. This latter observation is consistent with previously observed CBH product profiles, as substrate rearrangement to a productive binding conformation is prevented when the chain is advanced by a single glucose moiety.302 Rouvinen et al. proposed that the tunnel shape is typical for processive attack of cellulose chains and suggested the loops surrounding the active site appear to be ideally positioned to assist in this processive action. In addition to uncovering the fold of GH6, the TrCel6A structure illustrated how the enzyme binds a cello-oligomer within the tunnel-shaped active site. Cocrystallization with o-iodobenzyl-1-thio-β-D-cellobioside inhibitor identified four substrate binding subsites, which Rouvinen et al. labeled A−D; the structure has not been deposited in the PDB.
Current nomenclature refers to these sites as −2 through +2 (Figure 42C).175 The binding subsites are largely defined by the spatial arrangement of aromatic residues within the active site. The thio-oligosaccharide complex illustrated carbohydrate-π stacking interactions with three tryptophan residues (Figure 42C). The TrCel6A structure also enabled an initial proposal for the GH6 catalytic mechanism. At the time of the Rouvinen et al. study, GH6 enzymes were known to invert substrate stereochemistry,443 yet candidate residues corresponding to this mechanism had not been definitively identified. The new structure showed two carboxylate oxygens belonging to Asp175 and Asp221 located less than 5 Å from the glycosidic bond between the −1 and +1 subsites.192 Rouvinen et al. proposed that this location is the cleavage site and that these two residues likely participate in catalysis. Specifically, the authors put forth the hypothesis that Asp221 was protonated, and thus the likely catalytic acid, and that Asp175 was charged, aiding in protonation of Asp221 as necessary. A water molecule in position to attack the anomeric carbon at the cleavage site was also captured (Figure 42C). Identification of a residue acting as the catalytic base proved less straightforward, with the authors offering two potential candidate residues, Asp263 or Asp401. This latter observation spawned many subsequent studies and controversy, and definitive identification of the catalytic base, if one indeed is required, still remains elusive. As we continue to discuss GH6 structure and catalytic function, it is helpful to compare subsequent structures providing insight by analogy. Figure 43 provides a sequence alignment of TrCel6A alongside several of the more relevant


catalytic base. The Asp156 carbonyl oxygen was buried in the active site, though still within hydrogen bonding distance of the catalytic acid. Spezio et al. suggested Asp156 more likely functions to modulate pKa of the catalytic acid. The other proposed catalytic base, Asp401 in TrCel6A (Asp265), was observed forming a salt bridge with a neighboring arginine. Spezio et al. suggested Asp265 was readily ionizable and adequately positioned to accommodate both the ligand and attacking water molecule in the inverting stereochemical mechanism; thus, the hypothesis from this study was that Asp265 serves as the catalytic base in GH6s.658 7.1.3. TrCel6A: Y169F Variant. With an unclear picture of the residues involved in catalytic function, Koivula et al. set out to examine the role of a centrally located tyrosine residue (Tyr169) in catalysis and substrate binding.574 The authors suggest Tyr169 sterically hinders the glucan substrate in the −1 binding site, resulting in formation of the catalytically active conformation. The Y169F variant was generated and solved in its apo form (PDB code 1CB2). This new structure displayed negligible structural differences relative to the original wild-type structure. Weighing in on the ongoing debate as to catalytic residues, Koivula et al. contradicted proposals that Asp401 or the TfCel6A homologue could act as the catalytic base. The authors argued that Asp401’s participation in two salt bridges would likely prevent it from extracting a proton from the attacking water. Turning attention to Asp175, Koivula et al. confirmed the previous structural interpretation indicating that the residue is important for pKa modulation and ensures protonation of the catalytic acid.192,574 This observation was arrived at on the basis of unpublished mutagenesis studies. Additionally, the authors noted that Asp175’s position could potentially serve to stabilize an oxocarbenium-like transition state.
Ultimately, through structural evidence and specificity and kinetic characterization, Koivula et al. arrived at the hypothesis that Tyr169 serves to distort the −1 subsite glucose moiety, positioning the glycosidic linkage for catalysis. Catalytic constants toward short cello-oligosaccharides were compared for the wild-type and the Y169F variant, and kcat for cellotriose and cellotetraose was 4 times lower for the Y169F variant. Specificity constants were also decreased. This indicates Tyr169 plays a key role in catalysis. The Y169F mutation was also accompanied by an altered pH activity profile, leading one to speculate that Tyr169 may also play an indirect role in ensuring protonation of the catalytic acid. 7.1.4. H. insolens Cel6A: Apo. The solution of the catalytic core of H. insolens CBH Cel6A (HiCel6A) and subsequent characterization cast doubt on its categorization as a strictly exo-acting enzyme (PDB code 1BVW).659 HiCel6A is strikingly similar to TrCel6A, sharing 64% sequence identity. Thus, one may reasonably presume the two enzymes share a great deal of characteristics related to specificity, kinetics, and processive action. Up to this point, TrCel6A had been largely described as an exo-active CBH. However, several groups were approaching the conclusion that such enzymes, and this one in particular, may not be strictly delineated into either the endo- or exo-category.304,660 Varrot et al. observed hydrolysis of a fluoresceinyl-derivatized oligosaccharide substrate by HiCel6A, the functional group of which appeared far too large to reside in the closed, tunnel-shaped active site of the HiCel6A structure. The authors argued that, to accommodate the fluorescein group, the active site of HiCel6A must necessarily exhibit flexibility in the active site loop regions. Flexibility of these loops would in

GH6 sequences. The active site loops as well as residues of particular interest to understanding the catalytic mechanism have been highlighted. 7.1.2. T. fusca Cel6A. The first EG representative from family 6 was solved just a few years after TrCel6A and proved useful in identifying topological differences between EGs and CBHs. Spezio et al. solved the structure of T. fusca Cel6A (designated E2 at the time; PDB code 1TML).658 Although a bacterial EG sharing only 26% sequence identity with TrCel6A, the topologies of the two enzymes are quite similar. As is characteristic of the GH6 fold, TfCel6A also displays the distorted β/α barrel, though with different α-helix lengths. A notable difference between the two structures is the topology of the active site. TfCel6A is missing a homologous N-terminal loop, which yields a more open cleft-shaped active site potentially to allow flexibility in endo-initiated attack of crystalline cellulose (Figure 44). On the surface, the active site

Figure 44. Comparison of the fungal CBH TrCel6A (1QK2)194 and bacterial EG TfCel6A structures (1TML).658 The TrCel6A structure is shown on the left in gray surface. The thio-oligosaccharide ligand is shown in cyan stick. Active site loops A and B are shown in pink and blue, respectively. The TfCel6A structure is shown on the right in green surface. For illustration, TfCel6A is shown bound to the thio-oligosaccharide from the TrCel6A 1QK2 structure. The TfCel6A active site is notably more open than the enclosed tunnel-shaped active site of TrCel6A as a result of shorter and missing loop regions. The cleft- or groove-shaped active site is a common attribute of many EGs.

structural differences between TfCel6A and TrCel6A are consistent with the notion of a strict delineation of endo- versus exo-initiated attack; though as noted before, it is known that CBHs are also capable of endo-initiation despite having a tunnel-shaped active site,304 making such a delineation based on topology alone not definitive. The TfCel6A structure confirmed several suppositions from Rouvinen et al. regarding catalytic residues while casting doubt on others.192 Spezio et al. noted that a homologous pair of aspartates was in a similar location to Asp175 and Asp221 in TrCel6A.658 Asp117 (Asp221 in TrCel6A) was likely protonated given its relative distance to neighboring residues and was reported as the catalytic acid on this structural basis. However, Spezio et al. suggested that Asp79 (Asp175 in TrCel6A) was unlikely to participate in catalysis, as it shifted away from the active site by 4 Å. Notably, the TfCel6A structure did not contain a ligand, which may have allowed a greater degree of active site flexibility. Spezio et al. did make the concession that the loop containing Asp79 may shift upon binding. Similarly, Spezio et al. examined the identity of the catalytic base on the basis of structural data. Focusing on the aspartate homologous to TrCel6A Asp263 (Asp156/TfCel6A), they found it was unlikely this residue participated in catalysis as the


further suggest that if one models the missing −1 glucose moiety, Asp405 would be a suitable distance from the −1 anomeric carbon so as to activate a water molecule for nucleophilic attack. The authors maintained their opinion that Asp180 significantly impacts the pKa of the catalytic acid given the large conformational change about the Cα−Cβ bond allowing Asp180 to hydrogen bond with the glucose. 7.1.6. TrCel6A: Non-Hydrolyzable Ligands. A set of three new ligand-bound complexes of TrCel6A was instrumental in addressing the issue of loop flexibility upon substrate binding.194 These structures included the following: wild-type complexed with (Glc)2-S-(Glc)2 (PDB code 1QK2), Y169F complexed with (Glc)2-S-(Glc)2 (PDB code 1QJW), and wild-type complexed with m-iodobenzyl-β-D-glucopyranosyl-β(1,4)-D-xylopyranoside (PDB code 1QK0). Zou et al. analyzed the observed active site loop conformations of these three new structures in the context of the available structures at the time. The authors found that the conformational states of the active site loops generally fall into four categories including “most closed”, “more open”, “even more open”, and “most open”. The tunnel-forming loop was responsive to modifications in the active site including ligand binding as well as site-directed mutagenesis (Y169F, D175A, and D221A). The former three conformations are illustrated in Figure 46. Clearly, the TrCel6A active site is quite flexible with its “pincer-like” narrowing of the tunnel likely playing a significant role in the enzyme’s mode of action. Additionally, the crystallization of TrCel6A in the presence of nonhydrolyzable ligands was effective in capturing TrCel6A with the −1 glucose moiety in the 2SO distorted conformation

theory allow for endo-initiation on cellulose, explaining more recent observations indicative of such.209 The catalytic acid, Asp226, was also observed in two orientations, perhaps representing two different protonation states (Figure 45).659 In one conformation, the distance

Figure 45. Two observed conformations of the HiCel6A catalytic acid likely correspond to two different protonation states. The 1BVW structure is shown in green cartoon. The catalytic acid, Asp226, and the proposed catalytic base, Asp405, are shown in stick. The distance between the two residues in one conformation is consistent with a single-displacement catalytic mechanism, and thus, Asp405 (Asp401/ TrCel6A) was not excluded from consideration on the basis of distance.

between the carbonyl oxygen of the proposed catalytic acid and the proposed catalytic base is 9.5 Å (Asp401/TrCel6A and Asp405/HiCel6A), which is consistent with a typical single-displacement mechanism. Varrot et al. further agreed with the hypothesis that an aspartate serves to modulate the pKa of the catalytic acid, as Asp268 was appropriately positioned to do so. However, while the HiCel6A Asp180 is in an equivalent position to Asp175 of TrCel6A, the authors reported their consideration of its relevance to catalysis as inconsequential on the basis of its distance from the cleavage site. They cite consistency with mutagenesis studies performed on similar enzymes as further support of this conclusion.192,661 7.1.5. H. insolens Cel6A: Cello-Oligomer Complexes. Shortly after solution of the original HiCel6A apo structure,659 Varrot et al. reported a glucose/cellotetraose-bound structure of HiCel6A confirming their previous hypothesis that the active site loops are flexible, changing conformation upon substrate binding, and providing significant molecular-level insight into ligand binding (PDB code 2BVW).662 The noted conformational change in the active site loops leads to a more enclosed active site tunnel that increased the number of contacts with the ligand. This likely also applies to the TrCel6A active site loops, indirectly explaining previous observations that TrCel6A is able to access and hydrolyze internal bonds.304 Direct observation of this phenomenon in TrCel6A would soon follow,194 and computational studies would later define molecular mechanisms associated with the conformational change.571 Six ligand-binding subsites, −2 through +4, were directly observed as a result of these new oligosaccharide complexes. The −1 subsite was unoccupied, and the glycosyl units in the occupied subsites adopted a relaxed 4C1 conformation.
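The distance argument used above for the catalytic acid and proposed base can be made concrete with a small calculation. The following is a sketch under stated assumptions: the reference separations (roughly 5.5 Å for retaining and roughly 10 Å for inverting glycoside hydrolases) are the commonly cited rule-of-thumb values rather than numbers given in this section, the coordinates are hypothetical placeholders chosen to reproduce a 9.5 Å separation, and the function names are illustrative.

```python
import math

# Rule-of-thumb separations between the two catalytic carboxylates (angstroms).
# These are general literature heuristics, assumed here for illustration.
RETAINING_TYPICAL_A = 5.5
INVERTING_TYPICAL_A = 10.0


def distance(atom_a, atom_b):
    """Euclidean distance between two (x, y, z) coordinates in angstroms."""
    return math.dist(atom_a, atom_b)


def closer_mechanism(separation):
    """Return which typical separation the measured distance lies closer to."""
    to_inverting = abs(separation - INVERTING_TYPICAL_A)
    to_retaining = abs(separation - RETAINING_TYPICAL_A)
    return "inverting" if to_inverting < to_retaining else "retaining"


# Hypothetical coordinates for one oxygen on each residue (not from a PDB file).
acid_oxygen = (0.0, 0.0, 0.0)
base_oxygen = (9.5, 0.0, 0.0)

separation = distance(acid_oxygen, base_oxygen)
label = closer_mechanism(separation)
```

A distance check of this kind can only fail to exclude a candidate residue; as the surrounding discussion shows, loop motion and side-chain rotamers can change such separations by several angstroms between structures.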
Additionally, the catalytic acid, Asp226, was observed in close proximity to the +1 glucose and within a realistic distance of Asp405, again not ruling it out as the catalytic base. Varrot et al.

Figure 46. Flexibility of the TrCel6A active site loop A as observed in four separate structures. The “most open” active site observed via structural studies was captured in the 1HGW structure, shown in yellow cartoon.656 The 1QJW structure represents the “most closed” active site, dark teal cartoon.194 Two intermediate active site loop conformations, “more open” and “even more open”, were captured in 1QK2 and 1QK0, gray cartoon and magenta cartoon, respectively.194 The ligand, cyan stick, is the thio-oligosaccharide linked ligand from the 1QK2 structure. For illustration, Ser181 is shown in stick on each of the loop A conformations, demonstrating the impressive range of motion of this residue.

DOI: 10.1021/cr500351c Chem. Rev. 2015, 115, 1308−1448

Chemical Reviews

Review

perspective on the relationship of EG and CBH active site topology with activity (PDB code 1DYS).663 Up to this point, the tunnel-shaped active sites of CBHs defined by flexible active site loops were largely thought to be a construct allowing processive hydrolysis of crystalline cellulose.316,440 EGs were identified through the lack of these active site loops, resulting in a groove- or cleft-shaped binding site. Deletion of one of the key CBH active site loops in a C. fimi CBH indeed increased overall activity toward CMC, but the activity toward smaller cellooligosaccharides simultaneously decreased, clouding the notion that the absence of active site loops strictly delineates activity toward soluble substrates.660 Adding to the suggestion that EGs may not always resemble the canonical open active site architecture, Davies et al. reported the structure of HiCel6B, which displayed a striking homology with family 6 CBHs.663 As with familial representatives TrCel6A, TfCel6A, and HiCel6A, HiCel6B exhibits the characteristic GH6 distorted β/α-barrel fold. However, HiCel6B is missing one C-terminal loop (loop B) present in the CBHs (Figure 43), creating a more open binding site, yet not equivalent to other GH6 EGs. The authors noted that the remaining N-terminal loop (loop A) may close upon substrate binding, allowing Ser98 (homologous to TrCel6A Ser181) to interact with the −1 and/or +1 subsite substrate. Such a conformational change could also allow Glu97 to interact with the active site. Other EGs have a conserved histidine at the Glu97 position, while CBHs exhibit an alanine. In the ongoing search for the catalytic base, Davies et al. noted that Asp316 (homologous to TrCel6A Asp401) forms a salt bridge with the conserved Arg269.663 Corresponding to findings from Zou et al.,194 this configuration suggests Asp316 (and TrCel6A’s Asp401) is an unlikely catalytic base in HiCel6B, but the authors noted that a small rearrangement of the −1 subsite sugar could better position the residue. Beyond the catalytic base, Ala182 hydrogen bonds with the pKa modulator Asp180 that in turn orients the catalytic acid, Asp139, for catalysis.

7.1.8. H. insolens Cel6A: D416A/Thio-Oligosaccharide Complex. With the identity of the catalytic base still unknown, Varrot et al. set out to investigate a set of particularly perplexing findings related to the proposed catalytic base in HiCel6A, Asp405.209 The authors reported unpublished observations that the EG HiCel6B is rendered inactive upon mutation of the homologous residue (Asp316). However, mutation of Asp405 in the CBH HiCel6A resulted in only a 100−300-fold reduction in activity rather than complete inactivation. These results led the authors to propose that the CBHs of the GH6 family may benefit from catalytic “rescue” as a result of the phenotypic differences in architecture. One of the residues proposed as a possible “rescue” base in HiCel6A was Asp416. Varrot et al. suggested that the residue is positioned such that it could potentially activate a water molecule for inverting attack on the −1 glucose anomeric carbon. Moreover, this residue is located on a loop that is missing in homologous EGs and, thus, would be a likely candidate for maintaining minimal levels of activity should the potential base (Asp405) be modified.
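As a rough guide to what a 100−300-fold loss of activity means energetically, the following sketch converts a fold reduction in rate into an apparent increase in activation free energy through ΔΔG‡ = RT ln(fold). It is only an illustration: it assumes that the measured activity ratio reflects the ratio of rate constants, that transition state theory applies with an unchanged prefactor, and that the temperature is 298.15 K. None of these assumptions, nor the function name, comes from the studies discussed here.

```python
import math

# Gas constant in kcal/(mol*K)
R_KCAL_PER_MOL_K = 1.987204e-3


def fold_reduction_to_ddg(fold_reduction, temperature_k=298.15):
    """Apparent increase in activation free energy (kcal/mol) for a given
    fold reduction in rate, using ddG = R * T * ln(fold).

    Assumes the activity ratio equals the rate-constant ratio (hypothetical
    simplification; not taken from the cited studies).
    """
    if fold_reduction <= 0:
        raise ValueError("fold_reduction must be positive")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(fold_reduction)


# The 100- and 300-fold reductions quoted for mutation of Asp405
ddg_low = fold_reduction_to_ddg(100)
ddg_high = fold_reduction_to_ddg(300)
print(f"{ddg_low:.2f} to {ddg_high:.2f} kcal/mol")
```

Under these assumptions, the reported reduction corresponds to an apparent barrier increase of roughly 2.7 to 3.4 kcal/mol, which is modest compared with what would be expected for loss of an essential catalytic residue.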
Solution of a thio-oligosaccharide-bound D416A variant of HiCel6A was unable to confirm the hypothesis that Asp416 serves as a “rescue” residue because no water molecule was observed in a position for nucleophilic attack (PDB code 1GZ1).209 Nonetheless, the D416A structure was successful in capturing a thio-linked cellotetraose molecule spanning the +2 to −2 binding subsites with the −1 glucose moiety in the 2SO conformation. This structure was also very briefly one of the

providing additional insight in the ongoing efforts to define the catalytic residues. In both the wild-type/(Glc)2-S-(Glc)2 complex and the Y169F complex, ligand binding in the active site is essentially the same despite each having very different active site loop conformations. The wild-type/m-iodobenzyl-β-D-glucopyranosyl-β(1,4)-D-xylopyranoside complex captured the −1 sugar in the 4C1 conformation; however, the xylosyl unit of the ligand was in the −1 site with the glucosyl moiety in the −2 site. Zou et al. noted that if the −1 subsite sugar had a hydroxymethyl group, as in the native substrate, it would severely clash with the Tyr169 side chain. The Y169F mutant also captured the −1 glucose in a distorted conformation, suggesting Tyr169 plays an indirect role in catalysis.194 Overall, these findings suggest the −1 ring distortion arises from the need to avoid steric clashes with the surrounding protein, yet the distortion is likely integral to the catalytic mechanism.

Zou et al. also explicitly examined the three structures for clues regarding the identity of the catalytic residues with respect to existing characterizations and structural knowledge. Two primary conclusions were made. First, the pair of aspartyl residues previously proposed as catalytic acid and pKa modifying residue, Asp221 and Asp175, respectively, serve as the primary catalytic machinery. Second, neither of the putative catalytic bases, Asp263 and Asp401, is a convincingly suitable candidate. With respect to the first conclusion, Zou et al. presented structural evidence pointing toward a catalytic mechanism that is dependent upon the conformation of the active site loops: (1) In the most commonly captured orientation, corresponding to most apo structures, the tunnel is generally closed, opening as necessary to facilitate substrate entry. (2) When the substrate binds within the tunnel, the −1 glucosyl moiety distorts, which is likely crucial for catalysis. The aspartyl pair is locked in an interaction with Tyr169 and Arg174 that facilitates the protonation of Asp221. (3) A loop conformational change tightens the tunnel, breaking the Asp175/Asp221 interaction with Tyr169/Arg174, and positions two water molecules close to the anomeric carbon of the −1 sugar. (4) The Asp175/Asp221 hydrogen bond is broken to allow Asp221 to donate its proton to the substrate leaving group, and Asp175 activates a neighboring water molecule above the anomeric carbon. (5) Catalysis takes place, and the α-anomer of cellobiose is expelled as product alongside a final conformational change of the active site loops. Processive action on cellulose would then follow.

Examining the putative catalytic bases, Zou et al. reported that Asp263 is too distant to be directly involved in catalysis but may modify pH behavior of the enzyme, as previously proposed.658,659 Asp401 was not ruled out as the catalytic base by distance and was likely to be in the correct charge state to activate a water molecule. However, its side chain was not close enough to a negatively charged residue and, thus, may not be suitable to deprotonate the water molecule. Furthermore, the Asp401 side chain does not interact with a suitable water molecule in either of the thio-oligomer complexes but, instead, hydrogen bonds with the −1 sugar O3. The placement of Asp401 relative to the face of the sugar ring rules it out as the catalytic base. As discussed below, subsequent reports suggest that Zou et al.’s reported structures do actually indicate a catalytic role for Asp401, though not as expected. Instead, it was later suggested that the backbone of Asp401 interacts with the attacking water, and thus, this residue may be important for proper alignment of the catalytic center.573,656

7.1.7. H. insolens Cel6B. The solution of the H. insolens EG Cel6B (formerly EG VI) structure provided a unique


“most open” GH6 CBH active sites, in terms of position of the two active site loops A and B. At this point in time, evidence convincingly pointed toward distortion of the Michaelis complex in the catalytic itinerary of retaining GHs; however, it was unknown whether inverting GHs exhibited a similar transition state distortion. Previous GH6 structures had captured a similar 2SO conformation in the −1 binding subsite,194 but the nearest potential transition state to the 2SO conformation on the Stoddart diagram is the 2,5B boat conformation, which initially seemed an unlikely transition state. The D416A variant provided mounting structural evidence of what appears to be a valid intermediate catalytic state (2,5B) of inverting GH6s. Overall, no significant structural changes were found for the D416A mutant except for the open conformation of the active site loops. The authors reported only a single direct hydrogen bond between the −1 glucose O3 hydroxyl and the proposed catalytic base Asp405; other protein−substrate interactions appeared to be mediated entirely by water. At the proof stage of the manuscript, Varrot et al. included a postscript209 acknowledging the findings of another paper published at nearly the same time.656 Koivula et al. had solved an even more open TrCel6A structure, but more importantly, they had proposed a solvent-mediated Grotthuss mechanism for deprotonation that was consistent with the solvent-mediated interactions observed in the D416A HiCel6A structure. Furthermore, Koivula et al. reported molecular simulations that support the 2,5B transition state conformation.

7.1.9. TrCel6A: D175A and D221A Variants. The Koivula et al. study in 2002 remains one of the most influential structural investigations of catalytic residue function in GH6s.656 The structures of the TrCel6A D175A and D221A mutants (PDB codes 1HGW and 1HGY, respectively) were solved, and activity measurements, kinetic isotope experiments, and molecular simulations were used to ascribe functions to Asp221 and Asp175.656 A key finding of this study was that the catalytic mechanism appears to proceed via a water-mediated Grotthuss mechanism (Figure 47), and that Asp175 may accept the proton as the catalytic base.656 Moreover, the distortion of the substrate in the −1 binding subsite was shown to be stable over a relatively short MD simulation, suggesting the conformation is stabilized by the surrounding protein environment. An additional feature of note is that the D175A structure remains the most open GH6 active site conformation yet observed.

The idea that Asp221 was the catalytic acid was not a new one going into this study, nor was that of Asp175 as a pKa modulator. Up to this point, most available data pointed toward Asp221 as the catalytic acid.192,658,661 The D175A structure captured Asp221 positioned close to where the −1/+1 glycosidic linkage would be located (PDB code 1HGW). This position matched an earlier structure of the Y169F TrCel6A variant, which used a thio-oligosaccharide to capture a substrate spanning the active site (PDB code 1QJW).194 Using MD simulation to model the oxygen-linked oligosaccharide, Koivula et al. observed that the hydrogen bond between Asp221 and Asp175 breaks, revealing a new, catalytically competent side chain conformation of Asp221. This conformation had been observed previously in TfCel6A and HiCel6A structures, but this was the first observation of a protonated carboxyl oxygen oriented toward the glycosidic linkage in TrCel6A. Activity measurements of the D221A variant on cellotriose, cellotetraose, cellopentaose, and cellohexaose indicated a complete loss of catalytic competency; however, binding specificity was maintained. The findings overwhelmingly support Asp221 as the catalytic acid.

Data supporting Asp175’s role in catalysis had been less convincing up to this point. Koivula et al. began investigating the function of Asp175 in catalysis by testing the natively expressed D175A variant on oligosaccharides. They showed that the variant was effectively inactivated but, like D221A, maintained binding specificity. The recombinantly expressed D175A variant maintained 2−3% residual activity on barley β-glucan, suggesting, as the authors state, that it is not the catalytic base “in a normal sense”.656 The D175A variant also markedly lowered the pH-rate profile, suggesting a role in pKa modulation.
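The notion of a pKa-modulating residue can be made concrete with the Henderson−Hasselbalch relation. The sketch below treats the catalytic acid as a single, independent titratable group, which is a deliberate simplification, and uses hypothetical pKa and pH values that were not reported in the studies above. It shows only how the protonated fraction of the acid at a fixed pH would change if a neighboring residue raised its pKa.

```python
def fraction_protonated(ph, pka):
    """Fraction of a single titratable group in its protonated form,
    from the Henderson-Hasselbalch relation."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Hypothetical values chosen only for illustration (not from the source)
PKA_WITH_MODULATOR = 6.0     # acid pKa raised by a neighboring aspartate
PKA_WITHOUT_MODULATOR = 4.0  # acid pKa when the neighbor is removed
ASSAY_PH = 5.0

with_modulator = fraction_protonated(ASSAY_PH, PKA_WITH_MODULATOR)
without_modulator = fraction_protonated(ASSAY_PH, PKA_WITHOUT_MODULATOR)

print(f"protonated with modulator:    {with_modulator:.2f}")
print(f"protonated without modulator: {without_modulator:.2f}")
```

With these illustrative numbers, about 91% of the acid is protonated when the pKa is raised, against about 9% when it is not, which is the kind of shift that would be expected to alter a pH-rate profile.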
The authors also put forth a convincing argument surrounding the role of Asp175 in transition state stabilization. Examining cellobiosyl fluoride substrate reaction kinetics, Koivula et al. calculated an exceptionally high charge buildup in the TrCel6A transition state. Electrostatic stabilization would be required to maintain such an unfavorable ΔΔG⧧, and Asp175 is uniquely positioned to perform this role. The structure captures the −1 glucose ring oxygen and the Asp175 carboxylate oxygen within 4.5 Å of each other with no other shielding atoms between them.

Many previous studies had proposed Asp401 as the most likely catalytic base candidate, but contradictory results prohibited conclusive identification.661 Using structural insights and molecular simulation, Koivula et al. laid out a rational discussion describing the evidence against Asp401 acting as the catalytic base in TrCel6A. They first described the role of Asp401 in catalysis and why prior studies have been inconclusive at best. Structural data indicate that Asp175 and Ser181 coordinate two water molecules positioned so as to attack the −1 glucose anomeric carbon. The water coordinated by Ser181 is also coordinated by the backbone carbonyl of Asp401. Simulation of a carbocationic intermediate confirmed this water molecule as the likely catalytic water. The authors further added that prior studies mutating Asp401 do not in fact remove the catalytic base but rather induce a charge imbalance that destabilizes the transition state. The removal of Asp401 would leave two nearby positively charged residues without acidic compensation near the active site. Instead of a catalytic base-mediated mechanism, Koivula et al. suggested proton transfer occurs through a water wire in a Grotthuss-type mechanism, illustrated in Figure 47, wherein proton transfer occurs to a neighboring water upon nucleophilic attack, which may then transfer a proton to Asp175 serving as indirect catalytic base.664 The authors noted that because there is little evidence of bond formation with the nucleophilic water in the transition state, there is likely no need for a traditional catalytic base-mediated mechanism. Examination of solvent isotope effects on kcat for cellotriose substrates supported the active site architecture observed in the structural study, and thus the proposed catalytic roles. Interestingly, a Grotthuss mechanism for GH6 catalysis had previously been considered in the 1999 review by Davies et al.170 Ultimately, the plausibility of this mechanism had been dismissed as inconsistent with a prior study claiming to have identified a GH6 catalytic base residue.661

Figure 47. Proposed GH6 Grotthuss mechanism. In this mechanism, the role of catalytic base is fulfilled by a water wire that shuttles the proton from the attacking water.573 The residues are labeled according to the TrCel6A sequence. The Ser181 and Asp401 backbone carbonyl oxygens stabilize the attacking water, and the Asp175 side chain stabilizes a second water molecule that will accept the proton from the attacking water in an inverting mechanism.

7.1.10. H. insolens Cel6A: D405N Variant. In 2003, Varrot et al. revisited their structural investigation into the possibility of a catalytic “rescue” residue upon mutation of the proposed catalytic base, Asp405.665 The authors were searching for a structural explanation as to why mutagenesis of Asp405 results in only a 100−300-fold decrease in activity rather than complete abolishment, as in the H. insolens EG. Five structures were reported toward this goal: wild-type HiCel6A complexed with methyl 4-S-(4III-FTU-β-cellotriosyl)-4-thio-β-D-glucopyranoside (PDB code 1OCB), an apo D405N mutant (PDB code 1OC6), a D405N mutant complexed with methyl 4II-thio-β-cellotetraoside and methyl 4,4II,4III,4IV-tetrathio-α-cellopentaoside (PDB codes 1OC5 and 1OC7, respectively), and a D416A mutant complexed with methyl 4,4II,4III,4IV-tetrathio-α-cellopentaoside (PDB code 1OCJ).665 The authors noted that there appear to be relatively few structural consequences of mutating Asp405. Neighboring residues rotate slightly but maintain a similar interaction profile with the asparagine. Asp405 maintains a salt bridge with the neighboring Arg357 as well, the destruction of which was speculated to result in a significant local instability.

Figure 48. Proposed mechanism of catalysis and “virtual processivity” as captured through structural studies of HiCel6A bound to thio-oligosaccharide inhibitors. The figure numbers referenced in the captions at right refer to the figures from the original publication. Reprinted with permission from ref 665. Copyright 2003 Elsevier.


Varrot et al. concluded that Asp405 must have a participatory role in catalysis, since its mutation does not result in local structural unfolding. Prior studies, as well as this one, noted that Asp405 is directly involved in substrate binding through a hydrogen bond with the C3 hydroxyl group of the −1 glucose moiety. Additionally, it was proposed that Asp405 may serve to stabilize the proposed 2,5B transition state intermediate, though this was not directly observed as part of this study. The authors again left open the possibility that Asp405 is the catalytic base.

The study of the D405N mutants bound to the unique set of fluorogenic and thio-oligosaccharides also uncovered a series of structures trapped in the act of “virtual processivity”.665 Varrot et al. illustrated a proposed schematic of catalysis and processivity in GH6 enzymes (Figure 48). The initial step, productive binding mode, was captured in the wild-type structure bound to the fluorescing thio-oligomer (1OCB). Though the ligand does not span the +1/−1 binding sites where cleavage of the glycosidic linkage would occur, the catalytic acid reflects a catalytically competent conformation. This structure also captured a ligand binding within the −3 and −4 subsites, which had not been previously defined for any GH6 enzyme. The authors suggested this finding illustrates that there is no absolute requirement for cellulose chains to enter the enzyme tunnel from one end only, supporting the ability of this particular enzyme to act in an endo-active fashion as well as exo-active. In the productively bound conformation, the protein makes approximately 15 different hydrogen bonds directly with the ligand and a similar number of solvent-mediated hydrogen bonds.

The start of the processive event, “processivity commences” (Figure 48), was captured in the 1OC7 structure, the D405N mutant bound to the thio-linked cellopentamer.665 The first sugar lies intermediate to the −1/+1 binding site with the trailing sugars also occupying intermediate subsite binding positions as the chain threads in the tunnel. The interactions of the protein with the ligand are similar to the productive binding mode. The chain is hypothesized to progress through the tunnel maintaining a nonproductive binding mode conformation. The 1OC5 structure of the D405N mutant bound to the thio-linked cellotetramer illustrates the nonproductive binding mode in which the ligand sits in the +1 through +4 binding subsites in a relaxed chair conformation. Each of the glucose moieties’ faces is rotated 180° from that of the productively bound ligand conformation, suggesting the ligand has partially completed the processive motion through the tunnel. Contrasting the productive binding mode, the protein makes very few hydrogen bonds directly with the ligand in the nonproductive conformation. Nearly all of the interactions are through solvent-mediated hydrogen bonds as the ligand slides through the tunnel. The catalytic acid is tilted away from the substrate. This frequently captured dual orientation of the GH6 catalytic acid, both conformations of which were captured in this study, has long been attributed to the change in local pH affecting protonation state. Varrot et al. suggested that the conformational change may actually serve a role in processivity, putatively to accommodate the unusual orientation of the sugars as they find their way toward a catalytically competent binding position. Overall, this study documents a fascinating series of structures illustrating a proposed processive mechanism and sheds light on the ability of HiCel6A to act in endo-initiation mode.

7.1.11. H. insolens Cel6A: D416A/Isofagomine Complex. Revisiting the H. insolens D416A variant, Varrot et al. used a cellobiose-derived isofagomine inhibitor to successfully capture the 2,5B transition state conformation in the −1 binding subsite (PDB code 1OCN).666 This was the first structural report of this unusual catalytic itinerary, traversing from the 2SO Michaelis complex conformation through the 2,5B transition state intermediate.194,209 The structure reveals the distortion is driven primarily through steric interactions with neighboring Tyr174, coinciding with the hypothesized role of the homologous residue in TrCel6A.574 The structure also effectively supported the Grotthuss mechanism for GH6 cellulases. A water molecule was captured in position for attack on the anomeric center of the −1 isofagomine moiety (Figure 49). The formerly proposed catalytic base, Asp405, was captured mediating a water−ligand interaction by stabilizing the interaction with the main-chain carbonyl. Ser186 on loop A, the pKa modifying/indirect putative base Asp180, and a second water molecule are also implicated in catalysis. This structure strongly suggests GH6 catalysis occurs via a Grotthuss-type water wire mechanism with Asp180 potentially being able to accept a proton during catalysis (Figure 47). Interestingly, the Grotthuss-type water wire mechanism may be feasible even in the absence of a homologous serine residue in loop A. Mycobacterium tuberculosis H37Rv Cel6, a bacterial EG, exhibits an alanine substitution in this position. Structures of the bacterial EG complexed with nonhydrolyzable substrates indicate that, despite the alanine substitution, a water molecule is still able to correctly position for inverting attack of the −1 moiety (PDB codes 1UP0, 1UP3, 1UOZ, and 1UP2).667 The stabilization of the water molecule is alternatively maintained by a conserved asparagine homologous to Asp401 in TrCel6A. This latter study suggests that GH6 enzymes have evolved to maintain the water wire catalytic mechanism.

Figure 49. Transition state intermediate of HiCel6A as captured in the isofagomine bound complex with the D416A variant (PDB code 1OCN).666 Asp405, once proposed as the catalytic base, is shown mediating the Grotthuss-type water wire mechanism through hydrogen bonding of its main chain carbonyl with a water molecule. Ser186, of loop A, also stabilizes the water wire. A second water molecule hydrogen bonds with the first, and Asp180 completes the wire interaction. Asp226, the catalytic acid, is positioned away from the pKa modifying Asp180 during this intermediate state. Tyr174 forces the energetically unfavorable 2,5B conformation of the −1 moiety through steric interactions. This transition state intermediate appears to be general for GH6 inverting enzymes.
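Because the discussion alternates between TrCel6A and HiCel6A numbering (Figure 47 uses the former, Figure 49 the latter), a small lookup table can help keep the residues straight. The sketch below collects the pairings as they can be read from this section. The Asp175/Asp180 and Asp401/Asp405 equivalences are stated explicitly in the text, whereas the Asp221/Asp226, Ser181/Ser186, and Tyr169/Tyr174 pairings are inferred here from the matching roles described for each enzyme and should be checked against a sequence alignment. The dictionary and function names are illustrative only.

```python
# TrCel6A residue -> (HiCel6A counterpart, role as described in this section).
# Pairings marked "inferred" are assumptions based on matching roles in the text.
TR_TO_HI = {
    "Asp221": ("Asp226", "catalytic acid (inferred pairing)"),
    "Asp175": ("Asp180", "pKa modulator / possible indirect base"),
    "Asp401": ("Asp405", "formerly proposed base; backbone carbonyl "
                         "positions the attacking water"),
    "Ser181": ("Ser186", "loop A serine stabilizing the water wire "
                         "(inferred pairing)"),
    "Tyr169": ("Tyr174", "sterically drives distortion of the -1 sugar "
                         "(inferred pairing)"),
}

# Reverse lookup built from the same table
HI_TO_TR = {hi: tr for tr, (hi, _role) in TR_TO_HI.items()}


def to_hicel6a(tr_residue):
    """Return the HiCel6A counterpart of a TrCel6A residue label."""
    return TR_TO_HI[tr_residue][0]


def to_trcel6a(hi_residue):
    """Return the TrCel6A counterpart of a HiCel6A residue label."""
    return HI_TO_TR[hi_residue]


for tr, (hi, role) in TR_TO_HI.items():
    print(f"TrCel6A {tr} <-> HiCel6A {hi}: {role}")
```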


7.1.12. C. cinerea Cel6C. The genes encoding five new GH6 enzymes (CelA through CelE) from C. cinerea were discovered in 2009.668 Of the five, only CcCel6A exhibits a CBM and is most closely related to TrCel6A.669 The remaining GH6 cellulases were more similar to HiCel6B. Liu et al. confirmed that C. cinerea Cel6A (CcCel6A) exhibits a CBM, while CcCel6B and CcCel6C do not. Yoshida et al. claimed that all five GH6s exhibit CBH activity on the basis of cellobiose production from PASC. This latter observation became increasingly pertinent upon the solution of the structure of CcCel6C in the year that followed. Liu et al. reported the solution of the CcCel6C structure in complex with cellobiose and p-nitrophenyl-β-D-cellotrioside, which appears to be the first reported structure of a GH6 from a basidiomycete (PDB codes 3A64, 3ABX, and 3A9B).670 The authors suggested the structure was an additional example of a GH6 CBH based on prior characterization on PASC despite the high sequence homology with the H. insolens EG Cel6B. However, in that study, CcCel6C was tested for activity on CMC and demonstrated specificity for the substrate unlike any other GH6 CBH. The new structure suggests the CcCel6C active site is significantly more open than any of the known GH6 CBHs. Active site loops A and B, both of which are present (Figure 43), were captured in an open conformation in each instance and appear to be insensitive to the substrate binding-based conformational change noted in both TrCel6A and HiCel6A. Accordingly, the serine residue of loop A (Ser181 of TrCel6A) was not noted to interact with the substrate. A lysine substitution in loop A is suspected to be responsible for the overall reduction in loop flexibility. Additionally, a tyrosine present in the −3 binding site of other GH6 CBHs is missing in CcCel6C, making for a much more open product-binding site by comparison. Overall, the evidence that CcCel6C is indeed a CBH is not compelling. In fact, specificity toward CMC and the appearance of what was described as a cleft-shaped active site surrounded by fixed open loops points more toward EG behavior, or at the very least, a high degree of endo-initiation. High sequence homology with the EG HiCel6B is also concerning. In the absence of assays on more crystalline substrates such as Avicel or BMCC, the categorization of CcCel6C activity as a CBH remains unclear.

7.1.13. C. cinerea Cel6A. In a subsequent study, Tamura et al. investigated conformational changes within the active sites of both CcCel6A and CcCel6C resulting from substrate binding or mutation of the pKa modifying catalytic aspartate.671 Four CcCel6A structures were reported: CcCel6A bound to 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES; PDB code 3VOG), CcCel6A bound to cellobiose (PDB code 3VOH), CcCel6A bound to p-nitrophenyl-β-D-cellotrioside (PDB code 3VOI), and an apo CcCel6A D164A variant (PDB code 3VOJ). A fifth structure of a CcCel6C D102A variant with glucose in the −2 binding site was also reported for comparison to the group’s prior wild-type structure of CcCel6C.670 The solution of the CcCel6A structure was the second basidiomycete GH6 structure reported. As noted before, CcCel6A exhibits a CBM, whereas all other C. cinerea GH6s do not. CcCel6A demonstrates 52% sequence identity with HiCel6A and 48% identity with TrCel6A. Additionally, CcCel6A was reported to produce cellobiose from PASC but not hydrolyze either p-nitrophenyl-β-D-cellotrioside or CMC, indicative of CBH activity. However, relative or specific activity data have never been directly reported, making direct comparisons between CcCel6A and CcCel6C and to other GH6s difficult.
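Sequence identity figures such as the 52% and 48% quoted above depend on the alignment and on how gaps are counted. As a minimal sketch of one common convention (identical residues divided by the number of aligned columns in which neither sequence has a gap), the following function computes percent identity from two pre-aligned sequences. The toy sequences are hypothetical and are not fragments of any GH6 enzyme; the convention used in the cited studies is not stated here and may differ.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap.

    This is one of several conventions in use; other definitions divide by
    the alignment length or by the length of the shorter sequence.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Hypothetical toy alignment, for illustration only
toy_a = "ACDE-FG"
toy_b = "ACDQWFG"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identity")
```

In the toy alignment, six columns are ungapped and five of them match, so the reported value is five sixths, or about 83%.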

Tamura et al. reported that the CcCel6A-HEPES structure maintains an “open” tunnel conformation, while the CcCel6A-cellobiose and CcCel6A-p-nitrophenyl-β-D-cellotrioside structures demonstrate the “closed” form of the active site tunnel that now appears to be characteristic of GH6 CBH loops upon substrate binding.194,671 The conformational changes in moving from open to closed forms were essentially identical to changes observed in HiCel6A. Interactions of CcCel6A with the bound ligands were reported to be consistent with the Grotthuss-type catalytic mechanism previously proposed for GH6s.656

Rather than the new CcCel6A structure, the primary focus of the Tamura et al. study was actually the conformational changes associated with mutation of Asp102 in CcCel6C.671 Reversing their original conclusion that the active site of CcCel6C is rigid by comparison to other GH6 CBHs,670 the authors noted that the D102A mutation induces a significant degree of flexibility in the active site loops. A “motion angle” measurement devised to account for conformational changes indicates that the CcCel6C D102A variant active site loops tighten significantly in comparison to both the wild-type active site and the CcCel6A structures. Tamura et al. went on to suggest that the conformational change in the CcCel6C D102A variant is the most drastic yet observed in a GH6 CBH. However, superimposition of the two CcCel6C structures with the “most closed” and “most open” extreme conformations from TrCel6A (1QJW and 1HGW, respectively) indicates that the purported drastic conformational change is only moderately greater than that of TrCel6A (Figure 50). The CcCel6C D102A variant exhibits loop structuring nearly identical to that of the most closed TrCel6A structure, 1QJW. Overall, the more open active site of CcCel6C wild-type again supports the enhanced ability of this particular GH6 to perform endo-initiated attack relative to other GH6 CBHs.

Figure 50. Superimposition of CcCel6C structures with the TrCel6A “most open” and “most closed” structures. The wild-type CcCel6C structure, orange cartoon, represents the open conformation, which Tamura et al. reported as the most drastic open conformation of a GH6 yet.671 The TrCel6A D175A variant (1HGW) in the “most open” conformation is shown in yellow cartoon for direct comparison to CcCel6C.656 Upon binding a glucose molecule, the CcCel6C D102A variant active site loops close, shown in pink cartoon. This latter conformation is nearly identical to the “most closed” conformation from the TrCel6A Y169F variant, shown in teal cartoon.194 The thio-oligosaccharide ligand from 1QK2 is shown in cyan stick.194


7.1.14. C. thermophilum Cel6A. Solution of the C. thermophilum Cel6A structure provided additional structural evidence in support of a general Grotthuss-type mechanism for GH6 enzymes and again suggested GH6 enzymes have an active site particularly suited for endo-initiated attack (PDB code 4A05).672 CtCel6A displays the typical GH6 distorted β/α-barrel including both full-length active site loops A and B.672 The enzyme shares 77% sequence identity with HiCel6A and 63% with TrCel6A, suggesting it is a CBH. Specificity for BMCC and filter paper supports this assessment.673 The wild-type CtCel6A structure captured a cellobiose molecule in the −3 to −2 binding subsites and a cellotetraose molecule in the +1 to +4 subsites. This is the second such account of a −3 binding site for a GH6 enzyme and is suggestive of the ability of the active site to bind substrate in a manner that allows glycosidic cleavage yielding products other than just cellobiose.665 One of the more interesting features of this structure was the tetrahedrally coordinated Li+ ion located in the same position as the anomeric carbon of the Michaelis complex. The Li+ ion likely mimics the oxocarbenium-ion-like transition state complex, coordinating with three water molecules and the O4 hydroxyl of the +1 sugar, consistent with a water-mediated inverting catalytic mechanism in GH6 enzymes.

7.1.15. TrCel6A Variants HJPlus and 3C6P. Increasing the ability of cellulases to withstand heat while maintaining a reasonable degree of activity is an important area of industrial enzymatic biomass conversion research. Wu et al. have been particularly successful in identifying a set of mutations through directed evolution that is capable of increasing T50 by nearly 20 °C over wild-type.674 To understand the structural implications of these mutations and the molecular-level origins of added thermal stability, Wu et al. solved the structures of the TrCel6A thermally stable variants HJPlus and 3C6P (PDB codes 4I5R and 4I5U, respectively).674 The thermal stability characterization efforts are discussed in greater detail in section 7.4, but briefly, we mention that the HJPlus variant consisted of 48 mutations with respect to the wild-type TrCel6A enzyme. The 3C6P variant consisted of an additional 7 mutations with respect to HJPlus, a total of 55 with respect to wild-type. Both variants maintain a high degree of structural similarity with wild-type TrCel6A. With so many mutations, the authors did not discuss the contribution of each mutation toward thermal stability and focused primarily on residues providing stability gains in 3C6P over HJPlus. Improvements in thermal stability were attributed to five mutations, all located near the surface of the enzyme: M135L, Q277L, S317P, S406P, and S413P. Mutations at the surface of globular proteins are frequently capable of enhancing thermal stability.675,676 Only the serine to proline mutations were solvent exposed, however. The former two mutations appear to improve thermal stability through improved hydrophobic interactions. In addition to proximity to the surface, serine to proline mutations, as in the 3C6P variant, have long been reported to contribute stability in loop regions by restricting conformational freedom.677−679 In HJPlus, proline substitutions appear to be well-tolerated, though they do not always enhance stability. In fact, the three identified in the 3C6P variant were the only serine to proline mutations to significantly impact thermal stability in a positive fashion. While this particular approach to engineering enhanced stability while maintaining activity was effective, it is difficult to develop a general principle by which other proteins may be modified to the same effect.

7.2. Catalytic Function

Structural and biochemical studies have been essential to elucidate the catalytic mechanism of GH6 enzymes, although many questions still remain. To date, these studies enabled a general hypothesis for the unusual single-displacement catalytic mechanism adopted by GH6s. In the section to follow, we summarize our current understanding of the GH6 catalytic mechanism including the roles of key residues in the active site, the role of water in catalysis, aromatic residue function in substrate binding, and conformational changes along the processive cycle. Four residues are well-conserved across GH6 enzymes: the amino acids homologous to TrCel6A Asp221, Asp175, Asp263, and Asp401 (Figure 43).655 Residues homologous to Asp221 putatively operate as the catalytic acid in conjunction with the pKa modifying and potential “indirect” catalytic aspartate Asp175. Several residues, including Asp263 and Asp401, have been proposed as the catalytic base, though definitive evidence remains elusive. Additionally, Tyr169 has been reported to play a critical role in transition state stabilization. The function of each of these residues is discussed below. 7.2.1. Catalytic Acid. Structural and biochemical characterizations overwhelmingly suggest the GH6 catalytic acid is an aspartic acid residue located to donate the proton to the substrate leaving group as necessary. Rouvinen et al. 
observed that TrCel6A is active over a pH range 4.0−7.0, a range that allows both charged and uncharged aspartyl side chains.192 Analyzing the structural geometry, they suggested that Asp221 is likely to be protonated and is in position to act as the catalytic acid.192 Mutation of this TrCel6A residue to alanine results in complete loss of activity, where the differences in activity are not attributable to substrate binding.192,656 This finding is consistent across the GH6 family including bacterial and fungal representatives, as well as in EG and CBH functions (refs 194, 479, 574, 656, 658, 670, and 680−684). Structural studies have also captured the catalytic acid in two distinct orientations.194,656,659,670,683,685 While the two conformations have been attributed to different protonation states, Varrot et al. suggest the conformations may represent accommodations made within the active site to enable procession of the cello-oligomer.665

7.2.2. pKa Modifying Residues. In addition to the catalytic acid, it is generally accepted that the GH6 catalytic mechanism relies on an adjacent secondary aspartic acid that serves, at minimum, to prime the catalytic acid pKa. Upon solution of the first GH6 structure, Rouvinen et al. posited that Asp175 could force protonation of Asp221 and stabilize a transient positive charge at the substrate ring oxygen in the −1 binding subsite.192 Site-directed mutagenesis and characterization of activity confirmed the D175A mutant displayed approximately 20% of the activity of wild-type, which suggests the residue plays a supporting role in catalysis.192 The putative role of Asp175 as a pKa modifying residue was again suggested by Koivula et al., who observed significantly reduced, but measurable, activity of the D175A mutant.656,684 Similar behavior has been observed in bacterial GH6 representatives as well. Studies characterizing the role of the homologous C. fimi Cel6A residue, Asp216, again suggested an indirect role of the residue in catalysis. As a result of the D216A mutation, Damude et al. observed reductions in activity of 18- to 1380-fold depending on the substrate, but they noted the mutants maintained a considerable amount of activity.661 The homologous residue in Tf Cel6A (Asp79) is also reported to


catalytic center led some researchers to briefly consider the possibility that catalysis proceeds via Grotthuss mechanism in GH6s.170,659 However, the prospect was ultimately dismissed (and then later re-examined) on the basis of the Damude et al. study identifying the Asp401 equivalent as a catalytic base. In a comprehensive examination of the aspartic acids of Tf Cel6A, Wolfgang et al. determined that various Asp265 mutants (homologous to TrCel6A Asp401) retained anywhere from 2% to 10% of wild-type activity on CMC and PASC.680 This suggested that the residue was not playing a direct role in catalysis, such as a catalytic base would. To test whether Asp265 and Asp79 (the pKa modifying residue) functioned cooperatively as a base providing residual activity when only one was mutated, the authors tested the activity of the double mutant D79N/D265N. The observed results were similar to what would be expected from a cumulative, independent decrease rather than a marked decrease resulting from removal of the catalytic base. Molecular modeling studies were particularly insightful in developing what is our current understanding of how GH6 enzymes proceed without a traditional catalytic base. Modeling the TrCel6A transition state as an oxocarbenium cation, Koivula et al. scanned several initial configurations of the active site to identify potential catalytic base residues. However, the authors concluded there were no suitable candidates to act as a direct base. Some of the models revealed conformations wherein water molecules were positioned so as to attack the anomeric carbon of the oxocarbenium cation. The Ser181 side chain and the Asp401 backbone stabilized one water molecule, which also connected to the Asp175 side chain via a second water molecule. After bond cleavage in the simulation, the distorted −1 sugar ring relaxed to the chair conformation and moved slightly out of the active site. Koivula et al. 
thus suggested there is little need for a protein-based catalytic base and instead proposed that a proton could be transferred from the nucleophilic water to Asp175 by means of a second water in a Grotthuss-type mechanism; thus, Asp175 could act as an indirect catalytic base via a second water molecule. In addition to reviving the notion of a water-mediated Grotthuss mechanism, the authors also predicted the 2,5B conformation of the oxocarbenium cation intermediate. Subsequent structural studies capturing transition state intermediates would later confirm this for other GH6 enzymes.665,666 To date, the assignment of Asp175 as the catalytic base is the most plausible hypothesis that has been put forth. Because definitive evidence for the role of Asp175 is still lacking, the search for a suitable GH6 catalytic base continues. In Tf Cel6B, Vuong et al. mutated all potential catalytic base residues within 6 Å of the −1/+1 subsites to an alanine.682 The authors suggested that an exogenous nucleophile such as sodium azide could partially “rescue” enzyme activity in the absence of a catalytic base and thus tested this hypothesis. Sodium azide did not “rescue” activity for any of the single mutants, nor was a catalytic base identified. However, the catalytic relevance of each of the mutated residues was confirmed. In contrast, activity of the D226A/S232A (equivalent to TrCel6A Asp175 and Ser181) double mutant was partially “rescued” by low concentrations of sodium azide. The authors suggest that the azide ion activates a water molecule that performs hydrolysis. This “rescue” suggests Asp226 and Ser232 activate a catalytic water via a proton-transferring network as a means of hydrolysis. The Tf Cel6B structure of the D226A/S232A double mutant later supported this hypothesis.686
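The double-mutant comparison described above for Tf Cel6A (D79N/D265N) rests on simple arithmetic: if two residues contribute independently, their fractional activities multiply (equivalently, their apparent free-energy penalties add), whereas removal of a true catalytic base would be expected to lower activity well beyond that product. The Python sketch below illustrates this reasoning only. The function names are hypothetical, the fractional activities in the usage note are placeholders rather than data from Wolfgang et al., and treating relative activity as a ratio of rate constants is an assumption.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def expected_independent_activity(frac_a, frac_b):
    """Fractional activity expected for a double mutant when the two
    single mutations act independently (their effects multiply)."""
    return frac_a * frac_b


def apparent_penalty(fraction, temperature=298.15):
    """Apparent free-energy penalty (kcal/mol) for a mutant retaining
    `fraction` of wild-type activity: ddG = -RT ln(fraction)."""
    return -R_KCAL * temperature * math.log(fraction)


def coupling_energy(frac_a, frac_b, frac_ab, temperature=298.15):
    """Double-mutant-cycle coupling energy (kcal/mol).

    A value near zero means the double mutant behaves as the sum of
    two independent effects; a large positive value means the double
    mutant is far less active than independence predicts."""
    return (apparent_penalty(frac_ab, temperature)
            - apparent_penalty(frac_a, temperature)
            - apparent_penalty(frac_b, temperature))
```

As a usage example with placeholder values, single mutants retaining 5% and 10% of wild-type activity would predict 0.5% for an independent double mutant (`expected_independent_activity(0.05, 0.10)`), and `coupling_energy(0.05, 0.10, 0.005)` is then essentially zero. An observed double-mutant activity a further 100-fold lower would give a coupling energy of roughly 2.7 kcal/mol at 25 °C, the kind of marked decrease that was not observed.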

function as a pKa modifying residue despite its structurally observed location 11 Å away from the catalytic acid.680 Wolfgang et al. obtained pH activity profiles of the D79A, D79E, and D79N mutants demonstrating that Asp79 raises the pKa of the corresponding catalytic acid. Besides modulating pKa and potentially acting as the catalytic base, as discussed in the next subsection, additional roles for Asp175 and its homologous residues have been suggested. Koivula et al. noted that Asp175 is ideally positioned to function in stabilization of the oxocarbenium-like transition state.574 In a later study, the authors used molecular modeling and structural studies with fluoroglycosyl ligands to investigate this hypothesis.656 They confirmed Asp175 stabilizes the exceptionally electron-deficient fluorine transition state and suggest this interaction also takes place with oligosaccharides. Interestingly, Vuong et al. found that the D226A Tf Cel6B mutant, homologous to Asp175, could hydrolyze a soluble substrate at the same level as wild-type producing large oligosaccharides, but the same mutant could not hydrolyze crystalline cellulose to produce cellobiose. This latter finding suggests this residue may play a role in processivity.682 Tamura et al. also suggest this residue plays a crucial role in active site loop functionality in CcCel6A and CcCel6C, as deletion enabled a significant conformational change (Figure 50).671 Asp263 in TrCel6A and its homologous residues have also been suggested to serve as pKa modifying residues. Structurally, this aspartic acid sandwiches the catalytic acid with an adjacent aspartic acid. Originally, Rouvinen et al. 
suggested that Asp263, or alternatively Asp401, may act as the catalytic base.192 Later, several studies successfully ruled out Asp263 as the catalytic base and suggested Asp263 likely increases the pKa of the catalytic acid.194,659,661 Site-directed mutagenesis and activity measurements of the homologous residue (Asp156) in Tf Cel6A support this proposal, with decreased activity found for D156A and D156N but not for D156E.680

7.2.3. Catalytic Base. The definitive assignment of a catalytic base in the inverting GH6 mechanism remains an open question. Recent evidence even suggests that GH6 enzymes do not require a direct catalytic base as part of their single-displacement hydrolytic mechanism (i.e., a residue that could accept a proton directly from the attacking nucleophilic water molecule). The structural study of TrCel6A by Rouvinen et al. was one of the first instances in which Asp401 was proposed as the catalytic base.192 Spezio et al. later proposed that the homologous residue, Asp265, in Tf Cel6A would aid in activity by ensuring the enzyme’s charged state, effectively as the catalytic base.658 Damude et al. later used kinetic measurements to investigate the role of this residue, Asp392, in Cf Cel6A.661 After the catalytic acid mutant, the Cf Cel6A D392A mutant displayed the second-largest reduction in activity, a decrease of approximately 3 × 10⁴-fold on both CMC and PASC. Combined with geometric considerations from the crystal structures of TrCel6A and Tf Cel6A, the authors posited that Asp392 (Asp401 in TrCel6A) serves as the catalytic base. The Damude et al. study has been the source of much confusion over the years, as only a handful of other studies suggest that mutation of this residue causes such a drastic decrease in catalytic activity.
One of the first suggestions that TrCel6A’s Asp401 may not actually be the catalytic base was put forth by Koivula et al., who noted that the salt bridge in which Asp401 participates would likely interfere with proton abstraction from the attacking water molecule.574 The presence of water molecules near the


product with the −1 and −2 subsites may in fact contribute to the moderate product inhibition observed in TrCel6A.344,581,599,606,693 Studies of bacterial GH6s Tf Cel6A and Tf Cel6B have been influential in defining protein−substrate interactions in the −2 binding site.687,696 In Tf Cel6A, site-directed mutagenesis of a lysine residue to a histidine in the −2 binding site was shown to improve performance on PASC over filter paper, suggesting this residue plays a rate-limiting role in hydrolysis of crystalline cellulose.696 In examining Tf Cel6B, Vuong and Wilson identified an asparagine, Asn282, which significantly impacts the ability of the enzyme to hydrolyze crystalline substrates. Mutations of this asparagine to alanine and aspartate to alanine increased overall activity on BMCC and filter paper and increased processivity.528 Overall, characterization studies suggest the −2 binding site is important for catalysis, processivity, and product binding.

Distortion of the −1 glucopyranose ring in the active sites of GHs is thought to be necessary for catalysis, as discussed previously. In GH6s, this distortion occurs on the product side of the cleavage site. A number of studies focus on understanding both how enzyme structure contributes to distortion and energetic contributions from distortion. Study of the first cellulase structure, TrCel6A,192 as well as several subsequent structural studies, observed tilting of the −1 glucopyranose ring out of the plane of the other substrate units and noted puckering of the ring from the relaxed chair conformation.194,209,574,665,666,683 Early molecular modeling simulations of Tf Cel6A in complex with cellotetraose focused on investigating the conformation of the glucosyl unit bound in the −1 subsite, confirming the stability of the −1 puckering.692 Koivula et al.
noted that while TrCel6A subsites −2, +1, and +2 contain tryptophan residues, the −1 subsite is instead composed of a lysine, aspartate, and a tyrosine (described above), all of which likely contribute to the distortion of the sugar.574 The authors also proposed that binding in the +2 subsite causes strain that manifests as ring distortion in the −1 subunit based on the 700-fold higher specificity constant for cellotetraose over cellotriose. Consistent with the proposal that interactions beyond the −1 subsite are involved in ring distortion and productive substrate binding,574,690 Payne et al. used free energy calculations to find that mutations to tryptophan residues in multiple binding sites affected the conformation of the −1 sugar.578 Specifically, mutating the +2, +1, and −2 tryptophans (Trp269, Trp397, and Trp135, respectively) to alanine resulted in the −1 sugar ring relaxing to a 4C1 conformation. Mutation of the +4 site tryptophan, Trp272, did not cause ring relaxation, however. Using molecular simulation, Bu et al. further revealed that the relative stability of the distorted conformation is sensitive to pH, with the 2SO conformation favorable at the TrCel6A optimum pH of 5 (ref 479).

Exo-mode initiation in processive enzymes is thought to occur via the acquisition of free chain ends, which are threaded into the active site. Accordingly, the enzymes appear to exhibit machinery enabling acquisition through carbohydrate-π stacking with an aromatic residue at the entrance of the enzyme active site. Koivula et al. hypothesized that TrCel6A exhibited an additional two binding sites, +3 and +4, wherein Trp272 in the +4 binding site plays a key role in chain acquisition.525 Trp272 mutants retained near wild-type hydrolytic capability on amorphous cellulose but were significantly impaired on BMCC. At the same time, the binding affinity of the mutants was similar to wild-type, confirming the hypothesis that the residue was critical to chain acquisition. Notably, this

7.2.4. Catalytic Priming of Ring Distortion. While many residues are likely involved in the catalytic priming of the substrate for catalysis, the centrally located tyrosine has been shown to function in steric distortion of the −1 glucose moiety.574 An early investigation by Koivula et al. effectively established the role of Tyr169 and homologous tyrosines through crystallography and biochemical characterization of Tyr169 mutant activity.574 Through structural evidence and characterization of pH dependence of cellotetraose hydrolysis in the wild-type and Y169F variant, the authors proposed that Tyr169 contributes to ring distortion in the −1 subsite as well as helps to ensure protonation of the catalytic acid.574 Zou et al. later confirmed Tyr169 aids in distortion of the −1 glucose but noted that the Y169F mutant exhibits the same distortion.194 Later, molecular modeling studies confirmed the strongly stabilizing nature of Tyr169 in maintaining the distorted 2SO intermediate conformation.656 Similar results in homologous GH6s suggest the tyrosine at the −1 binding subsite has conserved function. In the bacterial Tf Cel6A, Tyr73 mutants display tighter binding and lower hydrolytic activity in comparison to wild-type.687,688 Barr et al. proposed that the additional volume created by these mutations allows the −1 subsite sugar ring to bind in the relaxed 4C1 conformation, indicating the role of Tyr73 could be to align the −1 sugar into an optimal position relative to the catalytic residues.687 Alternately, they noted that Tyr73 could stabilize an intermediate oxocarbenium ion through cation−π interaction, consistent with the observation that Y73S had lower activity than Y73F.687,689 In Tf Cel6A, Tyr73 is further from the catalytic acid than homologous residues in TrCel6A, and thus would only be able to aid in pKa modulation through conformational changes of the active site. 
The same roles have been reported for the −1 binding site tyrosines of HiCel6A and Tf Cel6B as well.666,682,686

7.2.5. Substrate Binding. As with most GHs, family 6 cellulases exhibit aromatic-lined substrate binding sites in both cleft and tunnel architectures. The binding sites have been historically defined by the number of glucose moieties that may bind along the length of the tunnel or cleft. Since GH6 enzymes are nonreducing end specific, positive numbers denote the “substrate” side of the active site, and negative numbers denote the “product” side. The only known exception to this generalization appears to be the GH6 CBH from T. emersonii that has been reported to act from both the reducing and nonreducing end of cellulose.513 TrCel6A exhibits at least six binding subsites from +4 to −2.525,606 Four of these sites, +2 to −2, were observed in the seminal TrCel6A structure and have been extensively characterized over the years.192,574,690−693 The overall number of binding sites across GH6s varies among members from at least 4 and up to 8 depending on the enzyme.574,665,667,694,695 However, generalizations as to the roles of residues in the core binding sites +2 to −2 are insightful and will be discussed here. Product side binding sites, −2 and −1, have been shown to tightly bind the ligand aiding in both positioning the glycosidic linkage for catalytic cleavage and stabilization of the cleaved dimeric product. In TrCel6A, the −2 subsite exhibits a carbohydrate−aromatic stacking interaction mediated by a tryptophan residue, Trp135. Little experimental evidence characterizing this binding site in TrCel6A is available; however, molecular modeling studies suggest the tryptophan residue tightly binds the ligand to maintain both the −1 ring distortion and for product stability.578 This strong association of the


Figure 51. Hypothesized processive catalytic cycle for exo-initiated attack. Pre-slide mode: The “more open” TrCel6A structure, 1QK2, represents the “pre-slide” mode conformation of a processive GH6.194 Loop A (pink cartoon) is in the open conformation with Ser181 far from the catalytic center. The catalytic acid Asp221 is hydrogen bonding with Asp175, the pKa modifying residue and putative catalytic base. Residues thought to participate in catalysis, directly or indirectly, are shown in green stick. Tf Cel6B was captured in a similar “more open” conformation with a full-length cello-oligomer in the 4AVO structure.686 Thus, we use the 4AVO cello-oligomer (cyan stick) in the pre-slide mode, slide mode, and Michaelis complex panels. Slide mode: The cello-oligomer processes through the active site, filling the −1 and −2 binding sites. The protein (1QK2) has not yet changed conformation in slide mode allowing the oligomer to pass unobstructed. Michaelis complex: The protein undergoes a conformational change positioning loop A near the catalytic center, represented here by the TrCel6A 1QJW structure.194 In the Michaelis complex, the backbone of Asp401 and the side chains of Asp175 and Ser181 form a network mediated by two water molecules, red spheres. The catalytic residues illustrating the Michaelis complex have been selected from various structures to represent this intermediate state (1QJW, Asp175, Asp401, and Ser181, water molecules; 1QK2, Tyr169; 1HGW, Asp221 from chain B).656 Substrate−product complex: Hydrolysis occurs, putatively via the Grotthuss mechanism, breaking the glycosidic linkage. The protein (1QJW) maintains the tightened active site conformation throughout hydrolysis producing an α-cellobiose product molecule. The product and substrate ligand shown is that of the Tf Cel6B 4AVN structure with a modeled −1 glucose based on the 4AVO ligand.686 The product molecule is expelled, and the processive catalytic cycle is reset.
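The four panels described in the Figure 51 caption can be read as a small cyclic state machine. The Python sketch below is only an illustrative bookkeeping aid under that reading: the names `Step`, `FEATURES`, `next_step`, and `run_cycles` are hypothetical, the per-step features are qualitative paraphrases of the caption and of section 7.2.7, and nothing here models kinetics or energetics.

```python
from enum import Enum


class Step(Enum):
    PRE_SLIDE = "pre-slide mode"
    SLIDE = "slide mode"
    MICHAELIS = "Michaelis complex"
    SUBSTRATE_PRODUCT = "substrate-product complex"


# Order of the hypothesized cycle; the last step wraps back to the first.
CYCLE = [Step.PRE_SLIDE, Step.SLIDE, Step.MICHAELIS, Step.SUBSTRATE_PRODUCT]

# Qualitative features of each step (loop A state, occupancy of the
# -1/-2 product subsites), paraphrased from the text; not measured data.
FEATURES = {
    Step.PRE_SLIDE: {"loop_a": "open", "product_sites_filled": False},
    Step.SLIDE: {"loop_a": "open", "product_sites_filled": True},
    Step.MICHAELIS: {"loop_a": "closed", "product_sites_filled": True},
    Step.SUBSTRATE_PRODUCT: {"loop_a": "closed", "product_sites_filled": True},
}


def next_step(step):
    """Return the step that follows `step` in the cycle."""
    index = CYCLE.index(step)
    return CYCLE[(index + 1) % len(CYCLE)]


def run_cycles(n_cycles):
    """Walk the cycle n_cycles times, starting from pre-slide mode.

    Returns the list of steps visited and the number of cellobiose
    units released (one per completed cycle)."""
    step = Step.PRE_SLIDE
    visited = []
    released = 0
    for _ in range(n_cycles * len(CYCLE)):
        visited.append(step)
        if step is Step.SUBSTRATE_PRODUCT:
            released += 1  # product is expelled as the cycle resets
        step = next_step(step)
    return visited, released
```

For example, `run_cycles(3)` visits twelve steps and counts three released cellobiose units, reflecting the one-product-per-cycle picture of processive exo-initiated action. As noted in section 7.2.7, the pre-slide and slide steps are not expected to describe endo-initiated attack.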

M⁻¹ (ref 691). Nevertheless, cellobiose inhibition in TrCel6A pales in comparison to values observed for TrCel7A. Moderate levels of glucose inhibition of TrCel6A have also been observed with an inhibition constant of 150 M⁻¹ (refs 301 and 691). The inhibition of TrCel6A has been described as noncompetitive inhibition and can reduce cellotriose hydrolysis by a factor of 30 (ref 691). However, glucose inhibition does not appear to be universal across GH6s, as glucose did not significantly inhibit Tf Cel6A hydrolysis of CMC (ref 697) or Avicel (ref 698). Attempts to understand molecular contributions to cellobiose product inhibition in GH6s are accordingly limited. Molecular simulation suggested that the strong substrate binding at the −2 subsite likely contributes to TrCel6A product inhibition.578 Site-directed mutagenesis of Tf Cel6B active site residues has been shown to alleviate cellobiose inhibition.526 However, this improvement in hydrolysis comes at the price of ability to degrade crystalline cellulose. In the same study, a glycine to proline loop mutation inexplicably increased activity on

tryptophan is conserved in GH6 CBHs but not in EGs. Zhang et al. later confirmed the Tf Cel6B tyrosine functions in a similar manner.526

7.2.6. Product Inhibition. GH6 product inhibition has not been as extensively examined as in complementary GH7 cellulases (see section 6.3). Several early studies examined T. reesei cellulase product inhibition, using cocktails rather than purified enzymes.581,599 These studies reached the conclusion that cellulases in the cocktail were indeed inhibited by the cellobiose product, but it became clear this effect was likely dominated by the interaction of TrCel7A with cellobiose rather than TrCel6A. At 25 °C, TrCel6A has a reported cellobiose inhibition constant (Ki) of approximately 500 M⁻¹ (ref 301), while the inhibition constant for TrCel7A with cellobiose is roughly 50 000 M⁻¹ (ref 607). Following on from this work, Teleman et al. developed a hydrolysis progress curve model including product inhibition parameters and went on to suggest the TrCel6A cellobiose inhibition may be closer to 5000 M⁻¹ than 500


its classification as a CBH, TrCel6A may acquire a free chain end and processively cleave hydrolytic linkages. However, on the basis of production of reducing ends on both PASC and filter paper as well as reversible binding, TrCel6A was reported to exhibit a high degree of endo-initiation activity allowing the enzyme to randomly hydrolyze internal glycosidic bonds. In TrCel6A, this ability to cleave internal linkages similar to that of an EG has been attributed in part to the flexibility of the tunnelforming active site loops.304 Later, Harjunpää would report that TrCel6A exhibits strictly processive action, finding no cellotetraose yield from cellohexaose, as would be expected if the enzyme randomly cleaved glycosidic linkages.693 These opposing findings hint at the difficulty in describing exo-activity and processive function using accessible characterization techniques. Currently, no consensus approach has been widely adopted in examination of GH6 exo-activity, and thus, comparison of findings across laboratories is difficult at best. Difficulties associated with cellulase processivity measurements have been described above in section 6.2, and a recent review outlines limitations of available techniques, which are currently substantial.521 With this in mind, we describe our evolving understanding of the mechanism by which GH6s deconstruct cellulose, controversy in description of processive function, and technologies being developed to move forward in our understanding of GH6 processive function. Description of the endo- or exo-initiation character of GH6 cellulases has long been qualitatively determined through soluble reducing sugar assays, where enzymes exhibiting a greater degree of exo-initiation activity produce fewer insoluble sugars relative to more endo-active enzymes.523 According to this approach, Tf Cel6A exhibits a high degree of endo-activity relative to the high exo-active character of Tf Cel6B. 
In this same study, TrCel6A and Tf Cel6B were reported to be functionally equivalent in synergistic mixtures of cellulases, indicating a similar mechanism in degradation of cellulose. The conclusion that TrCel6A is an exo-active cellulase was long unquestioned. However, recent characterization and visualization studies now suggest GH6 CBHs may also exhibit strongly endo-characteristic activities (refs 304, 476, 546, 600, 699, and 700). Several subsequent visualization studies support the hypothesis that CBHs TrCel6A and HiCel6A exhibit a high degree of endo-initiation. TEM captured clear images of the effects of incubating HiCel6A, HiCel7A, and a mixture of the two on digestion of BC ribbons.540 Upon incubation with HiCel7A, Boisset et al. observed thinning of the cellulose ribbons indicative of processive degradation of the polysaccharide. When the BC was incubated with HiCel6A, the fibrils were also thinned in several locations, but more importantly, the fibrils were cut into shorter fragments. The latter behavior is suggestive of endo-initiated degradation of the cellulose fibrils. Together, HiCel6A and HiCel7A exhibit exo/exo synergy, working from opposite ends of the polysaccharide and using complementary modes of degradation. Boisset et al. illustrated this phenomenon in a schematic reproduced here in Figure 52.540 The authors made the important observation that choice of substrate greatly affects the outcome of the experiment. They argued that the BC ribbons are the optimal substrate for observing the endo-processive action of HiCel6A; this substrate does not exhibit the abnormally high degree of structural defects of pretreated substrates such as BMCC, Avicel, or even Valonia microfibrils that may preclude endo-activity. Recently, real-time visualization of cellulase action on cellulose using high-speed AFM has been reported.476 Igarashi

amorphous and crystalline substrates while increasing cellobiose inhibition. A molecular-level explanation of this observation remains elusive.

7.2.7. Processive Catalytic Cycle. The inverting catalytic mechanism described here is common to all GH6 enzymes. CBHs in the GH6 family are also thought to exhibit an encompassing “processive catalytic cycle” wherein the hydrolysis step is one part of the entire process of successively cleaving crystalline polysaccharides (see section 6.2). In this section, we outline the steps of the hypothesized processive catalytic cycle. We note that this is a hypothesized mechanism and that some of the steps, particularly “pre-slide mode” and “slide mode”, are not likely to be accurate indications of endo-initiated attack by GH6s. The cycle presented here includes the original proposal by Zou et al., described above, and combines additional structural-based evidence from processive CBHs that followed the Zou et al. study.194,573 Figure 51 illustrates the hypothesized processive catalytic cycle using structures available in the literature. The putative steps include the following:

(1) In the pre-slide mode, the processive catalytic cycle begins after the GH6 has acquired a free chain end from the nonreducing end of the polysaccharide as part of the initial processivity event.20 The ligand occupies the substrate binding sites (+1 through +4 or beyond, depending on the enzyme), while the product binding sites (−1/−2) remain unoccupied. At this point, the active site loops are in the “open” conformation such that the side chain of the serine implicated in catalysis is more than 10 Å away from the catalytic center. The catalytic acid (Asp221) and pKa modifying/putative base aspartate (Asp175) hydrogen bond with each other.

(2) In the slide mode, the cello-oligosaccharide proceeds through the active site tunnel by a cellobiose unit, filling the −1 and −2 binding sites. The −1 glucose moiety becomes distorted through steric interactions with a tyrosine. Asp221 is protonated during this step but maintains its interaction with Asp175. Interaction of this latter aspartate with a neighboring arginine allows it to maintain the charged state.

(3) In the Michaelis complex, a conformational change in loop A breaks the interaction of the catalytic acid with Asp175. As a result, the protonated catalytic acid rotates toward the glycosidic linkage. The active site tunnel closes, and the serine residue of loop A (Ser181) moves toward the catalytic center. Two water molecules are stabilized by newly formed interactions with serine and two aspartates (Asp175, Asp401).

(4) In the substrate−product complex, hydrolysis occurs via the Grotthuss mechanism, leaving an inverted α-cellobiose product in the −1 and −2 subsites. The product is expelled from the active site, potentially with a conformational change of loop A, and the processive catalytic cycle begins anew.

7.2.8. Synergistic and Processive Function. Efficient polysaccharide deconstruction has long been known to require a complex mixture of CBHs, a host of EGs, and accessory enzymes all working together through complementary function.32 Synergistic activity between TrCel6A and TrCel7A was reported as early as 1980, prompting an entire body of research devoted to understanding the mechanisms behind this phenomenon.436 Chanzy and Henrissat hypothesized that orthogonal action of TrCel6A and TrCel7A could be responsible for their synergy.302 As discussed previously, it eventually became apparent that GHs do not exist as strictly delineated “endo” and “exo” active enzymes, suggesting additional possibilities for synergistic function of the two enzymes.304 Ståhlberg et al. suggested T. reesei CBHs are capable of two essential modes of cellulolytic degradation.304 In line with

DOI: 10.1021/cr500351c Chem. Rev. 2015, 115, 1308−1448

Chemical Reviews

Review

chemistry.521 Use of reducing-end attached chromogenic or fluorogenic molecules in examining GH6 activity is generally uninformative, as the enzymes exhibit relatively poor hydrolysis of the heteroglycosidic linkages or preferentially cleave the homoglycosidic linkage of longer model substrates. A recent advance in fluorogenic substrates for characterization of GH6 activity made use of molecular docking studies to understand how GH6s bind 4-methylumbelliferyl-β-D-cellobioside substrates and suggested rational improvements based on the findings.704 Wu et al. discovered that GH6 enzymes may nonproductively bind with the traditional umbelliferyl substrate and that substitutions at the C4 or C6 position of the umbelliferyl motif greatly improved hydrolytic turnover of the heteroglycosidic linkage.704 The 6-chloro-4-methyl-umbelliferyl-β-D-cellobiose reporter molecule was deemed to be the most successful improvement, as the molecule demonstrated an appropriate balance of activity, solubility, and fluorescence in comparison with other C4/C6 modifications. Nevertheless, GH6 turnover of these molecules remains low when compared to GH7 activity. The authors optimistically interpreted this to mean there are additional gains to be had in design of GH6 reporter molecules.

7.3. Glycosylation

Figure 52. Schematic of the action of HiCel7A and HiCel6A on BC ribbons. The two enzymes exhibit complementary function, working from the reducing end, R, and the nonreducing end, NR, of the microfibril, respectively. (A) The primarily exo-acting HiCel7A sharpens the microfibril into long, thinner ribbons. (B) HiCel6A exhibits a dual mode of action, thinning the ribbons in certain regions but also cutting the fiber using endo-initiated hydrolysis. Reprinted with permission from ref 540. Copyright 2000 American Society for Microbiology.

To our knowledge, TrCel6A is the only GH6 for which glycosylation has been explicitly characterized. Hui et al. used a combination of capillary liquid chromatography-electrospray and matrix-assisted laser desorption and ionization time-of-flight mass spectrometry to locate N-linked glycans on the CD and quantify the number of O-linked glycans attached to the CBM and linker domain.418 The TrCel6A enzyme was purified from the T. reesei RUT-C30 strain fermentation broth and thus closely reflects the native glycan arrangement. The cleaved CBM-linker domain, from residues 1 to 82, exhibits between 39 and 46 O-linked mannose residues appended to the threonines and serines in this region. This is significantly higher than the number of O-linked glycans observed on the linker of TrCel7A, related to the shorter GH7 linker. As the linker length across GH families appears to be conserved,417 this finding may be true of other fungal GH6 enzymes exhibiting linker regions. On the CD of TrCel6A, Hui et al. report there are three N-linked glycan sites at Asn14, Asn289, and Asn310. Of these three, the Asn310 glycan was reported to be a high-mannose glycan with anywhere from 7 to 9 mannose residues attached to the base GlcNAc2.418 This finding was in line with observations made in crystallization of the first TrCel6A structure. Rouvinen et al. found that the TrCel6A CD likely exhibited N-linked glycans at the Asn289 and Asn310 residues, though the only assignment made was a single GlcNAc at Asn310.192 The crystal structure also reported the CD as having O-linked glycans at residues Thr87, Thr97, Ser106, Ser109, Ser110, and Ser115. Additional crystal structures of the natively expressed TrCel6A CD also exhibit various combinations of the attached glycans.194,656,704 Insights into glycosylation of GH6 cellulases other than TrCel6A are limited to findings from structural reports. Unfortunately, the C. cinerea and C. 
thermophilum GH6 structures were obtained through recombinant expression in E. coli; thus, the structures do not exhibit glycosylation.670−672 The HiCel6A structures were expressed in an alternative fungal host, A. oryzae, and do exhibit some glycosylation captured in the structures. Varrot et al. reported the CD exhibits an N-linked GlcNAc at Asn141 and at least two O-linked mannose residues at Ser127 and Thr118.659 It is likely the CD may exhibit more or

et al. investigated the synergistic effects of TrCel7A and TrCel6A finding, as has often been reported, that the two act in a synergistic exo/exo fashion. Interestingly, the authors noted that when an ammonia pretreated cellulose polymorph, cellulose III_I, was incubated with only TrCel6A for 3 and 8 min, little change in morphology of the microfibril was observed.476 However, subsequent addition of TrCel7A resulted in remarkably faster digestion than with either enzyme alone. It seems reasonable to conclude that TrCel6A may function in a highly endo-active mode on cellulose III_I, creating new chain ends for TrCel7A that would otherwise be unavailable. It has also been speculated that TrCel6A may even synergistically remove obstacles from the path of TrCel7A, resulting in enhanced hydrolysis. One of the more intriguing results of this work relative to TrCel6A can be seen in the movies supporting the manuscript; in isolation, TrCel6A appears to be nonprocessive. Over the time span of the high-speed AFM measurements, the enzyme molecules do not deviate significantly from their initial positions. This latter finding is counter to the community’s working hypothesis that TrCel6A is a processive cellulase and highlights the need for accurate GH6 processivity assays. Much of the difficulty in uniformly characterizing GH6 activity, degree of processivity, and initial mode of attack arises from the fact that there are relatively few model substrates available for nonreducing end specific enzymes. For decades, reducing end specific assays have taken advantage of chromogenic and fluorogenic cellobiosides and lactosides.701−703 The specific chemistry of the reducing end of cellulose and cello-oligomers allows for attachment of reporter molecules, whereas the nonreducing end lacks this unique


activity during a 10 min incubation) by approximately 10 °C. They tested Cys-Ser mutations on parent enzymes and found an approximately 8 °C increase in T50 for TrCel6A C400S (where the numbering is for the full-length enzyme with a CBM-linker). These corresponding Cys to Ser mutations further improved secretion of the functional enzyme. They proposed that the increased stability of the Cys-Ser mutation could be due either to stronger hydrogen bonding interactions or to steric reasons, as the Cys residue resides closer to the carbonyl of Pro339 than Ser. Wu and Arnold further increased the HJPlus chimera thermostability via directed evolution, creating the new chimera 3C6P, which contained seven mutations over HJPlus: S30F, V128A, M135L, Q277L, S317P, S406P, and S413P.674 The mutations creating the 3C6P variant were tested one-by-one in TrCel6A. All mutations except Q276L (equivalent to Q277L in 3C6P) either stabilized or did not affect the T50 of the parent TrCel6A enzyme. The five thermostabilizing mutations in 3C6P are all near the surface, with the Ser to Pro mutations being the only solvent exposed mutations in loop regions. Wu and Arnold proposed that they limit conformational freedom without straining the backbone (the Cα and Cβ atoms in TrCel6A and HJPlus align well with the Cα and Cβ of the corresponding prolines in 3C6P). 3C6P displayed a half-life of 280 min at 75 °C and a T50 of 80.1 °C, a 15 °C increase over HiCel6A and 20 °C increase over TrCel6A. The 3C6P activity at its optimal temperature of 75 °C offers a 10-fold reduction in hydrolysis time compared to HiCel6A activity at its optimal temperature of 60 °C on Avicel (at 60 h). Crucially, it continued to demonstrate synergy, and they found that a mixture of thermostable Cel6A (3C6P) and a thermostable variant of Cel7A from the same group650 releases 1.8 times more cellobiose than a wild-type mixture at their respective optimum temperatures of 70 and 60 °C from Avicel. 
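As a rough illustration of how a reported half-life of inactivation translates into residual activity, the following sketch assumes simple first-order (single-exponential) thermal inactivation. That kinetic model is an assumption made here for illustration and is not something the cited studies are stated to have used; the function names are likewise hypothetical. Only the 280 min half-life of 3C6P at 75 °C is taken from the text above.

```python
import math


def inactivation_rate_constant(half_life_min: float) -> float:
    """Rate constant k (1/min), assuming first-order decay A(t) = A0 * exp(-k * t)."""
    if half_life_min <= 0:
        raise ValueError("half-life must be positive")
    return math.log(2) / half_life_min


def residual_activity(half_life_min: float, t_min: float) -> float:
    """Fraction of initial activity remaining after t_min minutes (same assumption)."""
    return math.exp(-inactivation_rate_constant(half_life_min) * t_min)


# Half-life reported for the 3C6P chimera at 75 °C (from the text above).
HALF_LIFE_3C6P_MIN = 280.0

if __name__ == "__main__":
    for t in (10, 60, 280, 560):
        frac = residual_activity(HALF_LIFE_3C6P_MIN, t)
        print(f"{t:>4} min at 75 °C: {frac:.1%} of initial activity remaining")
```

Under this assumption, one half-life leaves half of the initial activity and two half-lives leave a quarter; real inactivation curves for multi-domain glycosylated enzymes may deviate from a single exponential.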
On the basis of results from their previous studies wherein free cysteine residues were found to be detrimental to stability in GH6 cellulases,708,709 Wu et al. investigated the role of paired and free cysteine residues in GH6 thermal degradation.710 They noted that some GH6 enzymes contain unpaired, free cysteine residues, including the thermostable chimera 3C6P Cys246. Mutants (C246A, C246G, C246L, and C246S) lacking the free cysteine showed increased extreme (90 °C) temperature tolerance. They noted that HiCel6A and TrCel6A have a free cysteine residue and lower temperature tolerance than wild-type CtCel6A, which does not have a free cysteine. As with 3C6P, mutants of HiCel6A and TrCel6A in which their free cysteine residues were removed showed improved tolerance of high temperature incubation. Further study of the C246G mutant indicated that it was able to retain activity after high temperature incubation due to disulfide-bond-assisted refolding to a productive conformation. They proposed that GH6 thermal inactivation is due to disulfide-bond degradation and thiol-disulfide exchange that results in misfolding, and that removing free cysteine residues limits degradation by the latter mechanism. GH6 enzymes display a diverse range of pH optima and activity ranges (Table 9).192,499,680,711,712 Nevertheless, it is frequently desirable from an industrial perspective to understand both the molecular contributions to activity at various pHs and methods for engineering proteins more tolerant of pH conditions beyond optimal. Several groups have investigated the effect of substrate binding on the pKa of the enzyme as well as the effect of pH on active site loops and activity. Damude et al. noted that Cf Cel6A displays a basic shift in pKa with substrate

rather different glycans on the surface of the CD in the native host, but the extent to which this is true has not been publicly reported.

7.4. Protein Engineering

As with their GH7 counterparts, engineering GH6 CBHs toward higher thermal stability has received significant attention from industrial and academic laboratories. Early efforts focused on improving GH6 stability were targeted toward bacterial representatives. Zhang et al. attempted to improve TfCel6B thermostability by introducing disulfide bonds.526 They created a TfCel6B G234S-G284P double mutant that showed a 2-fold increase in activity on filter paper, but the benefit did not carry over to synergistic mixtures.526 Ai and Wilson attempted to engineer a more thermostable TfCel6B by introducing a disulfide bond by means of the N233C-D506C double mutant (residues Asn233 and Asp506 in Tf Cel6A correspond to Asn182 and Arg410 in TrCel6A).705 The circular dichroism spectra of the wild-type and double mutant were identical, indicating no change in the final structure. The mutated residues joined the two loops covering the active-site cleft but did not inhibit activity on CMC or PASC. Moderate gains in thermal stability were attained through the double mutation, with 100% of activity maintained at 50 °C for 20 h compared to 85% activity retention in the wild-type.705 Unfortunately, the mutations caused a decreased protein yield, attributed to lower in vivo expression, prohibiting extension to commercial applications.706 In efforts to increase thermostability in fungal GH6 cellulases, Lantz et al. targeted more than 100 nonconserved TrCel6A residues for mutagenesis. Therein, they concentrated primarily on the CD surface residues, hypothesizing that these residues benefit the least from hydrophobic packing stability and may exhibit the largest gains; some linker and CBM sites were also examined.707 Through this comprehensive approach, gains in thermal stability of nearly 7 °C for TrCel6A (and 15 °C for TrCel7A) were achieved. 
These gains in thermal stability were significant enough to raise the stability of the primary hydrolytic components to that of the more thermally stable EGs TrCel5A and TrCel3A in a commercial enzyme cocktail. An added side benefit of the thermal stability mutations was a moderate gain in TrCel6A activity. Additional efforts to improve GH6 thermal stability include a significant body of work from the Arnold group, as previously described in the GH7 section.674,708−710 Heinzelman et al. first employed structure-guided enzyme recombination of HiCel6A, TrCel6A, and the bacterial GH6 CtCel6A.709 From a sample set of 48 chimeras secreted by a glycosylation-deficient strain of S. cerevisiae, five chimeras exhibited half-lives of inactivation (63 °C) higher than the most thermostable wild-type parent (HiCel6A). None of the three chimeras were more active on PASC at 50 °C than TrCel6A, the most active parent, but each was active at higher temperatures than the parents, retaining activity up to 70 °C. The most thermostable parent, HiCel6A, is inactive above 57 °C. They named the most active chimera HJPlus, which was created by substituting the three “blocks” predicted to be most stabilizing into TrCel6A. In a follow-up study, Heinzelman et al. produced additional GH6 chimeras that their regression model predicted would encode thermostable chimeras.708 The block that most strongly contributed to thermostability contained 10 differences between two parents with different thermostabilities. Mutating each residue in the block identified one mutation, S313C in HiCel6A, which reduced the T50 (temperature at which an enzyme loses 50% of


Orpinomyces sp. PC-2 Penicillium decumbens JUA10 Phialophora sp. G5 Piromyces rhizinflatus 2301 Piromyces sp. E2 Podospora anserina S mat+ Podospora anserina S mat+ Podospora anserina S mat+

Irpex lacteus MC-2 Malbranchea cinnamomea Neocallimastix patriciarum Neocallimastix patriciarum J11 Orpinomyces sp. PC-2 Orpinomyces sp. PC-2

Cel6A Cel6A

Cel6B

Cel6C

Escherichia coli Pichia pastoris

Pichia pastoris

Pichia pastoris

Cel6A

Escherichia coli

6

7

7

6.0

7.0

5.0

Cel6A

EgGH6A

5.8−6.2

CelF

Pichia pastoris

Escherichia coli

4.6−7.0

25

35

45

37−45

65

50

40−50

40

Gao et al., 2011722 Zhao et al., 2012723 Tsai et al., 2003724

pNPC, barley β-glucan CMC-Na, barley β-glucan, Avicel, filter paper CMC, barley β-glucan, lichenan, xylan

pNPC

Avicel

Avicel

Avicel

CMC

CMC-Na

Avicel, CMC, Glc3/Glc4/Glc5/Glc6

Avicel, CMC, Glc3/Glc4/Glc5/Glc6

Poidevin et al., 2013326

Poidevin et al., 2013326

Harhangi et al., 2003725 Poidevin et al., 2013326

Chen et al., 2003721

CMC

Avicel Avicel, CMC, Glc3/Glc4/Glc5/Glc6

Li et al., 1997712

CMC, Avicel, PASC, lichenan, barley β-glucan, arabinogalactan, araban, galactan, pullulan, gum arabic, pachyman, pustulan, cellotetraose CMC, PASC, barley β-glucan, lichenan, Avicel

CMC

CMC

barley β-glucan

assay conditions 39 °C and pH 6

retained >40% activity at pH 4.0−10

Li et al., 1997712

CMC, Avicel, PASC, lichenan, barley β-glucan, pullulan, Glc4

CelC

50

Wang et al., 2013720

barley β-glucan, lichenan, CMC, Avicel, PASC

Escherichia coli

4.3−6.8

formerly CelA

Denman et al., 1996711

Avicel

Avicel, CMC, PASC, lichenan

CelA

50

40

assay conditions 30 °C and pH 7

assay conditions pH 5.0

assay conditions 30 °C

assay conditions 30 °C

assay conditions 30 °C

formerly EG6

assay conditions 50 °C and pH 5.0

Escherichia coli

6.0

5.0

540

Tamura et al.,

Boisset et al., 2000, Moriya et al., 2003,716 Wu and Arnold, 2013674 Schou et al., 1993,442 Dalbøge and Heldt-Hansen, 1994,717 Davies et al., 2000663 Toda et al., 2008718

Liu et al., 2009

Liu et al., 2009, 2012671 Liu et al., 2009

669

comments

Wu et al., 2011,719 Xu et al., 2009500

CelA

Avicel, PASC

Escherichia coli

PASC

CMC, PASC, Glc5, reduced Glc6

Avicel; BC ribbons

Cel6A

50

Avicel

PASC, CMC, Avicel

PASC, CMC, Avicel

PASC, Avicel, cellotriose, cellobiose

PASC, pNPC, Avicel

5

60

PASC

PASC

PASC

CMC

Aspergillus oryzae Escherichia coli

Cel6B

8.0

7.0

60

CBHII (Ex-4) Cel6A

Pichia pastoris

Saccharomyces cerevisiae

Cel6C

Escherichia coli

Cel6A

Cel6B

Escherichia coli

8.0

Emalfrab et al., 2003715

CMC, labeled CMC, β-glucan

4.5

Cel6A

pNPC

Bauer et al., 2006487 Wang et al., 2012673

50

CMC

4

57

Cel6A

ref Chow et al., 1994714

soluble CMC, cello-oligosaccharides Glc3/Glc4/Glc5/Glc6, barley β-glucan and lichenan, Avicel pNPC, BMCC, filter paper, CMC

substrate specificity

5.5

substrate for opt

CBHII

temp opt (°C) CMC, filter paper, PASC, barley β-glucan

pH opt

Cel3

Cel6A

Saccharomyces cerevisiae Pichia pastoris X33 Pichia pastoris

Agaricus bisporus D649 Aspergillus nidulans FGSC A4 Chaetomium thermophilum Chrysosporium lucknowense Coprinopsis cinerea 5338 Coprinopsis cinerea 5338 Coprinopsis cinerea 5338 Humicola insolens Humicola insolens

enzyme

Escherichia coli

expression host

organism

Table 9. Summary of Biochemical Characterizations of Fungal GH6 Cellulases


Brown et al., 2007727

Poidevin et al., 2013,326 Ståhlberg et al., 1993304 Song et al., 2010520

PASC

Avicel, CMC, Glc3/Glc4/Glc5/Glc6, PASC

70

CMC-Na

GH6 cellulases pose a unique challenge in characterization of molecular mechanisms as a result of their catalytic mechanism and specificity. As such, our molecular-level understanding of cellulases in this family is somewhat limited in comparison to other fungal cellulases such as GH7 and GH5 enzymes. The following findings summarize the general consensus regarding features of GH6 cellulases: (1) GH6 cellulases employ a one-step, inverting mechanism to hydrolyze glycosidic linkages. (2) The GH6 catalytic base has not been definitively identified. Cellulases of this family most likely undertake hydrolysis via a water-wire mediated Grotthuss mechanism in which a water wire shuttles the resulting proton from the nucleophilic water molecule to a neighboring aspartate, namely Asp175 in TrCel6A. (3) The GH6 family consists of both CBHs and EGs, both of which are capable of endo-initiated attack with specificity toward the nonreducing end of crystalline cellulose. (4) GH6 CBHs exhibit disordered loops forming the characteristic tunnel-shaped architecture. The loops demonstrate a remarkable range of flexibility in response to both mutagenesis and ligand binding and may be related to the mechanism of processivity or the endoactive nature of GH6 CBHs. These loops may also play an

CBHII Saccharomyces cerevisiae

5.0

Avicel 45 5 Cel6A Pichia pastoris

Cel6A

CBHII

7.5. Conclusions

Talaromyces cellulolyticus CF2612 Thielavia terrestris Trichoderma reesei QM9414 Trichoderma viride CICC 13038

Cel6A Aspergillus oryzae Stilbella annulata

CMC-Na

Yamanobe et al., 2000726 Avicel

comments temp opt (°C) pH opt enzyme expression host organism

Table 9. continued

present, shifting from 5.9 to 6.7 for the protonated group, and 5.7 to 6.3 for the deprotonated group. This shift was attributed to the substrate excluding water from the active site.695 Recently, Bu et al. used molecular modeling to gain insights into the effects of pH on GH6 protein structure and function. Molecular modeling confirms, as proposed in the Damude et al. study, that substrate binding results in a basic shift.479 Bu et al. further demonstrated that pH affects the substrate ring conformation in the −2 subsite and active site loop flexibility, with more flexibility found at the optimal pH. Engineering proteins for pHs beyond the optimum requires consideration of both the overall protein stability as well as the catalytic function. A particularly successful pH engineering effort by Wohlfahrt et al. demonstrates that TrCel6A can be altered so as to maintain both stability and wild-type-like levels of activity at significantly increased pH, similar to what would be encountered under basic lignocellulosic pretreatment strategies.713 The authors targeted three carboxyl-carboxylate pairs in a stable arrangement with low solvent accessibility yet near the flexible active site loops. Under alkaline conditions, the carboxylic acid group of these pairs would likely be deprotonated leading to repulsion of the side chain from the neighboring carboxylate. Mutants were designed replacing the carboxylic acid with an amide partner. Circular dichroism and tryptophan fluorescence confirmed the strategy effectively improved protein stability under alkaline conditions relative to wild-type. The triple mutant (all three carboxylic acids replaced by amides) was the most stable of the proteins examined. The half-life of this protein was extended by 4-fold at pH 8, and activity on cellotetraose was unperturbed. Interestingly, the activity on BMCC was shown to be the same or higher than wild-type under alkaline conditions. 
Overall, this protein engineering strategy appears to represent a general approach to modification of pH optima. Furthermore, extrapolation of this approach toward reducing the pH optimum appears straightforward, namely by substituting amide-carboxylate pairs with carboxyl-carboxylate pairs.
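To make the pKa shift reported by Damude et al. (5.9 to 6.7 for the protonated group on substrate binding) more concrete, the sketch below applies the Henderson-Hasselbalch relation to a single ionizable group. Treating the catalytic residue as an isolated, independently titrating site is a simplifying assumption made here, not a claim from the cited work, and the function name is illustrative only.

```python
def fraction_protonated(ph: float, pka: float) -> float:
    """Henderson-Hasselbalch: fraction of a single titratable group in the
    protonated form, [HA] / ([HA] + [A-]) = 1 / (1 + 10**(pH - pKa))."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# pKa values for the protonated group quoted in the text (Damude et al.).
PKA_FREE = 5.9    # enzyme without substrate
PKA_BOUND = 6.7   # enzyme with substrate bound

if __name__ == "__main__":
    for ph in (5.0, 6.0, 7.0, 8.0):
        free = fraction_protonated(ph, PKA_FREE)
        bound = fraction_protonated(ph, PKA_BOUND)
        print(f"pH {ph:.1f}: protonated fraction {free:.2f} (free) vs {bound:.2f} (bound)")
```

In this simplified picture, a basic shift of 0.8 pKa units keeps a larger share of the group protonated at any given pH, which is one way to read why substrate binding and alkaline conditions both matter when engineering activity beyond the pH optimum.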

assay conditions 45 °C and pH 5.0

Wu et al., 2011719

substrate for opt

substrate specificity

ref


PASC

assay conditions 50 °C and pH 5.0


characteristic of enzymes capable of degrading equatorially oriented glycosidic linkages.730,732,733 More specifically, the GH-A clan, and thus GH5s, belong to the 4/7-superfamily of β/α-barrel glycosidases.730,731 Within this superfamily, the retaining mechanism is catalyzed by the conserved glutamates at the ends of β-strands 4 and 7, though very little sequence homology is otherwise observed.730 Given such a broad categorization of low-homology enzymes based on the (β/α)8 fold, significant variety emerges in enzymatic behavior, with at least 20 different experimentally determined enzyme classes within this family alone.734 A recent study revisiting GH5 classified structures given newly available structural data even went so far as to recategorize several enzymes as GH30s, illustrating the difficulty of sufficiently describing enzymatic activity based on sequence data or fold alone.735 Within the GH5 family, there are also 51 currently recognized subfamilies that have been defined by phylogenetic analysis, with several undefined subfamilies anticipated as sequence data continues to grow.734 Subfamily classification in GH5, or cellulase family A, began in 1990 with five original subfamily classifications proposed, A1−A5.736 Five additional subfamilies were defined in the 15 years that followed until it became clear that a robust sequence-based subfamily classification approach was possible.734,737−740 Still, these newly defined phylogenetic subfamilies encompass a variety of both specificities and species within the subfamily classification. Of the 51 identified GH5 subfamilies to date, experimentally categorized cellulolytic behavior (EC 3.2.1.4 and EC 3.2.1.91) has been observed in subfamilies 1, 2, 4, 5, 22, 26, 37, and 39, representing bacterial, archaeal, and eukaryotic taxonomies. 
Given the vast body of GH5 data available as well as the scope of this review, we focus our discussion of GH5 literature on fungal GH5s exhibiting cellulolytic behavior except where discussion of bacterial GH5 mechanisms provides relevant mechanistic insights. Saloheimo et al. reported the first isolation and sequencing of the egl3 gene coding for T. reesei EGIII.323 The total mass of EGIII was estimated at 49.8 kDa and was described as having modular organization as with other T. reesei cellulases characterized at that point. The 35 amino acid N-terminal region exhibited particularly high sequence homology with that of CBHII and the C-terminal regions of CBHI and EGI, which we now know derives from the characteristic T. reesei cellulase modular appendage of a family 1 CBM (see section 5). At nearly the same time, Ståhlberg et al. reported the isolation of the EGIII core domain from the 61-residue N-terminus region. The authors described the N-terminal domain as a heavily glycosylated structural element common to the other T. reesei cellulases and posited that modular organization may be a characteristic of all Trichoderma cellulases.741 Saloheimo et al. also identified EGIII as a glycoprotein, predicting one N-glycosylation site on the basis of a surface-exposed Asn-Phe-Thr motif and a great deal of O-glycosylation.323 EGIII was shown to cleave reducing sugars from cellodextrins, though at a rate 50−200 times slower than that of T. reesei EGI. Comparison of the new EGIII sequence with that of an unpublished S. commune EGI sequence indicated 30.4% sequence identity,323 and while this seems low at first glance, it would later become apparent that this family of cellulases is only loosely linked by sequence homology. Saloheimo et al. also noted that a protein previously identified by Shoemaker and Brown in 1978, EG IV,728,729 was likely actually EGIII, as the amino acid compositions were remarkably similar.

indirect role in stabilizing a water molecule as part of the catalytic mechanism. (5) GH6 cellulases serve a critical role in the synergistic activity of cellulase cocktails. They are the only known nonreducing end specific cellulases and have been shown to both create new free chain ends for GH7s as well as to remove obstacles from the path of more processive CBHs. (6) While ensuring uniform characterization of GH activity, degree of processivity, and initial mode of attack is an inherently difficult task, GH6s pose additional challenges as nonreducing end specific GHs. Attachment of reporter molecules to the nonreducing end of cellulose is nontrivial, and thus, few reliable model substrates exist for characterization of enzymes exhibiting this specificity. GH6 family cellulases play a significant role in synergistic biocatalytic conversion of lignocellulosic biomass, yet despite decades of careful study, many questions remain as to even the most basic aspect of their molecular function. The path to definitively answering these questions of GH6 function will encounter obstacles including those common to fungal GHs such as high-throughput heterologous expression and characterization as well as the unique aspects related to the GH6 catalytic mechanism. We anticipate the development of reporter molecules for nonreducing end specific CBHs on either cellulose or soluble substrates will represent one of the most meaningful advances toward understanding GH6 catalytic function. A recent publication demonstrated a step toward this goal while also acknowledging the possibility that more advanced reporter molecule substrates may exist.704 These types of substrates not only are capable of elucidating the molecular underpinnings of GH6 function; model substrates will also prove useful in understanding the extent of processivity and for selection of enzymes for maximum synergistic function as part of biomass conversion cocktails. 
Furthermore, development of GH6 variants in the search for thermal stability and/or activity improvements will profoundly benefit from a more accurate assessment of enzyme performance.

8. FAMILY 5 GLYCOSIDE HYDROLASES

GH5s have emerged as key EGs in biomass conversion cocktails. GH5 members were originally referred to as family A cellulases when some 21 β-glycanases were categorized into six different families using hydrophobic cluster analysis to group similar sequences.439 The two original fungal members of family A were EG III from T. reesei (TrCel5A) and EG I from Schizophyllum commune. Other members of this family were under investigation at that time,728,729 though without a known sequence, inclusion of these enzymes in the family A classification would not come until many years later. As a growing volume of sequence data became available, the nomenclature expanded to include hydrolytic activity beyond that of β-glycan hydrolysis, and family A cellulases became known as family 5 GHs.654 Today, the GH5 family is one of the largest and most diverse families of GHs with over 4000 entries.151,153 Though a single family designation, GH5s exhibit quite a large degree of variation in specificity and hydrolytic activity, likely as a result of divergence from a common ancestor.730,731 In addition to 1,4-glucanase activity, the GH5 family includes enzymes with reported 1,6-galactanase, 1,3-mannanase, 1,4-xylanase, and xyloglucanase activities. Along with 18 other GH families, GH5s belong to the GH-A clan exhibiting a (β/α)8 fold


The T. reesei family 5 EG is responsible for a significant portion of the fungus’s EG action, with some reports estimating its contribution at nearly 55% of total EG activity.742 As such, this EG is often incorporated as a key component in many industrial biomass conversion cocktails. Over the past four decades, many different groups have put a great deal of effort into characterizing its action and defining the limits of its stability. Importantly, we note that over this time this particular enzyme has been referred to in the literature by at least three different names. The EGIII moniker was given upon the original characterization by Saloheimo et al.323 Earlier, a report of a partial amino acid sequence determined by Edman degradation assigned the enzyme EGII.632 It eventually became clear that these were the same enzyme. Up until the late 1990s, most publications referred to this enzyme as EGIII, but more recently, the enzyme has frequently been referred to as EGII. As GH nomenclature evolved, the enzyme has also been referred to as Cel5A. To avoid confusion throughout the remainder of this discussion, we adopt the modern convention of Cel5A with reference to the T. reesei family 5 EG. A summary of GH5 structures that are discussed in this section is provided in Table 10.

families likely originates from sequence variations that directly contribute to both major and minor structural variations.751 These structural variations, such as the two minor β-bulges located at strands β3 and β7 of C. thermocellum CelC and the 54-residue subdomain insertion connecting the α6 helix and β6 strands,743 greatly affect overall function and appear to be related to thermal stability (Figure 53A).752 Many GH5 structures spanning a range of taxonomies and substrate specificities were solved in the interim between the appearance of the original CelC structure and the first fungal GH5 cellulase structure in 2002. Several of these structures from bacterial cellulases have been invaluable in defining the catalytic mechanism of GH5 cellulases and will be discussed separately with respect to catalytic function.206,754,755
8.1.1. Catalytic Function. As members of the GH5 family, fungal GH5 cellulases hydrolytically cleave the glycosidic linkages of their substrates using the double-displacement retaining mechanism, outlined in Section 3.157 Barras et al. first described this for the bacterial Erwinia chrysanthemi EG Z, identifying the retention of the anomeric carbon product configuration.756,757 The authors went on to speculate that two conserved glutamates served as the catalytic acid/base and enzymatic nucleophile, as we will discuss at length below. Much of the available insight into catalytic mechanisms of GH5s has been observed in bacterial representatives such as E. chrysanthemi. Thus, in the following section, we briefly deviate from our focus on fungal cellulases to discuss our current understanding of GH5 catalytic mechanisms in general. Initial characterization studies identified key catalytic residues through biochemical methods such as mutagenesis and use of labeled substrates. One of the first studies to investigate the catalytic mechanism of GH5s was conducted in 1990, shortly after some of the first GH5 enzymes were discovered.758 Baird et al. 
aligned 16 different amino acid sequences of known cellulolytic enzymes. Though none of the sequences exhibited greater than 25% similarity to the other sequences in the set, the multiple sequence alignment identified a recurring three-residue motif in all the cellulases examined. Baird et al. had uncovered the Asn-Glu-Pro motif defining in part the catalytic active site of GH5 cellulases. Site-directed mutagenesis of the Glu to Gln in two representative bacterial EGs confirmed a loss of activity upon mutation. The participation of this conserved Glu in catalysis was again confirmed in both C. thermocellum and E. chrysanthemi GH5s.759,760 Without the benefit of structural insight, these initial studies identified the catalytic acid/base of GH5 cellulases, though it would not be until later that the Glu was confirmed as such. Shortly after the discovery of the GH5 catalytic acid/base, the complementary nucleophile, also a glutamate, was confirmed. Wang et al. determined the nucleophilic residue of C. thermocellum CelC using a clever labeling approach that involved radiolabeling the enzyme with a tritiated inhibitor and spectroscopically analyzing the cleaved, labeled peptide fragments.747 Macarron et al. had previously speculated that this same glutamate was the catalytic nucleophile in TrCel5A.750 The family 5 nucleophilic glutamate generally belongs to a Glu-Xxx-Gly motif, where Xxx is typically an aromatic residue.761 Identification of the conserved acid/base and nucleophile glutamates served in part as the basis for refining GH classifications, including family A cellulases, and ultimately the development of the family 5 classification.732,733 The double-displacement retaining mechanism is also known to require a water molecule for nucleophilic attack. Evidence of

Table 10. Reported Fungal GH5 Crystal Structures

source and original name in primary citation | PDB code | resolution (Å) | brief highlights | ref
Thermoascus aurantiacus Cel5A | 1GZJ | 1.62 | first fungal GH5 structure reported | 740
Thermoascus aurantiacus Cel5A | 1H1N | 1.12 | apo wild-type | 753
Piromyces rhizinflata Egl1 | 3AYS | 2.20 | complex with cellotetraose | 762
Trichoderma reesei Cel5A | 3QR3 | 2.05 | apo wild-type | 30
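The two sequence motifs described in the text, Asn-Glu-Pro for the catalytic acid/base and Glu-Xxx-Gly (Xxx typically aromatic) for the nucleophile, can be located in a sequence with a simple pattern scan. The following is a minimal sketch only: the function name `find_motifs`, the choice of F/W/Y as the "aromatic" set, and the toy sequence are illustrative assumptions and are not taken from the cited studies.

```python
import re

# Acid/base motif from the text: Asn-Glu-Pro (one-letter code NEP).
ACID_BASE = re.compile(r"NEP")
# Nucleophile motif from the text: Glu-Xxx-Gly, Xxx typically aromatic.
# Restricting Xxx to F, W, or Y is an assumption made for this sketch.
NUCLEOPHILE = re.compile(r"E[FWY]G")


def find_motifs(seq):
    """Return 1-based positions of the glutamate in each motif hit."""
    seq = seq.upper()
    # In NEP the Glu is the second residue: 0-based start + 1, then +1 for 1-based.
    acid_base = [m.start() + 2 for m in ACID_BASE.finditer(seq)]
    # In E[FWY]G the Glu is the first residue: 0-based start, then +1 for 1-based.
    nucleophile = [m.start() + 1 for m in NUCLEOPHILE.finditer(seq)]
    return {"acid_base_glu": acid_base, "nucleophile_glu": nucleophile}


# Hypothetical toy sequence, NOT a real GH5 sequence.
toy = "MKTAYNEPRGDLSAWQTEWGKL"
print(find_motifs(toy))
```

A pattern hit is only a candidate: short motifs such as these can occur by chance, so any assignment of catalytic residues would still need the alignment, mutagenesis, or structural evidence discussed in the text.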

8.1. Structural Studies

The first solved structure of a GH5 enzyme was of the bacterial C. thermocellum EG CelC in 1995.743 Dominguez et al. observed that the general fold corresponded to the (β/α)8 barrel topology first observed in triose phosphate isomerase, which is referred to as a TIM barrel. This (β/α)8 tertiary structure is by far the most commonly observed fold, with an estimated 10% of all known enzyme structures falling into this class.744,745 It has been suggested that the variety of (β/α)8 barrel structures emerged as a result of divergent evolution from a common ancestor, thus accounting for the vast diversity of sequences and functionality allowing (β/α)8 barrel proteins to function as hydrolases, oxidoreductases, transferases, lyases, and isomerases.746 Within the GH5 family, a core set of amino acid residues distinguishes its members from other (β/α)8 barrel hydrolases. These include the catalytic glutamates, identified previously through biochemical characterization, as well as a handful of other residues thought to support catalysis or substrate binding.747−750 Solution of the C. thermocellum CelC structure captured these residues (Glu280, His198, and Trp313) in an arrangement suggesting the nucleophilic glutamate is assisted in catalysis by hydrogen bonding. The cellotriose-bound structure (3 Å) also illustrated carbohydrate−substrate interactions confirming Glu140 serves as the nucleophilic donor and that Asn139 and His90 contact the substrate occupying key positions near the catalytic glutamates. The vast substrate diversity within the GH5 family and throughout the over 30 (β/α)8 super-


Figure 53. GH5s are among the many proteins exhibiting the relatively common (β/α)8 topology. (A) In 1995, the first GH5 structure, originating from the bacterium C. thermocellum, was solved.743 The cellulolytic EG CelC, in purple cartoon, is shown aligned with the first fungal GH5 cellulase structure, T. aurantiacus Cel5A in green, solved nearly a decade later.740 (B) The GH5 structures share the same basic topology but exhibit extraordinary diversity in the loop regions connecting the primary β-sheets and α-helices. A third GH5 cellulase, also from T. aurantiacus in tan, was solved at nearly the same time as the first fungal Cel5A structure.753 C. thermocellum CelC exhibits a large subdomain insertion between the α6 helix and β6 strand, where the two T. aurantiacus Cel5A structures exhibit a very compact loop (highlighted by the dashed red circle). The cellotriose ligand from C. thermocellum CelC is shown in gray stick, and the positioning of the loops relative to the cellulose substrate suggests functionality in substrate recruiting, though likely using very different mechanisms.

The first step in the double-displacement hydrolytic mechanism of GH5s is the glycosylation step in which the glycosyl-enzyme intermediate is formed, Figure 54. At this point, a proton from the catalytic acid/base, Glu139, has been transferred to the leaving group sugar, a non-natural dinitrophenyl group in the Davies et al. study. The nucleophilic Glu228, having attacked the anomeric carbon of the −1 glucopyranose moiety, is then covalently bonded to the oligomeric substrate. The −1 glucopyranose sugar returns to its energetically preferential 1C4 state. With newfound space in the enzyme active site upon expulsion of the product sugar and relaxation of the substrate, a water molecule takes the place of the glycosidic oxygen. The authors managed to capture two different versions of the GH5 glycosyl-enzyme intermediate covalently bonded to a cellobiosyl and a cellotriosyl substrate. For the most part, the length of the substrate has little effect on the position of the catalytic residue. However, a conserved tyrosine residue appears to undergo a conformational change as a result of the shorter cellobiosyl covalent attachment. This residue has been identified as a contributor to catalysis,30,743,762 though the exact means by which this occurs remains unknown. The authors refrain from speculating as to the role of this tyrosine, though it appears the residue is critical to transition state stabilization intermittently interacting with the −1 sugar hydroxyl group and Glu228. Davies et al. do go on to propose that Arg62 is relevant to the correct positioning of the nucleophilic glutamate throughout the catalytic cycle. Finally, the nucleophilic water molecule catalyzes the second step of the double-displacement mechanism, deglycosylation. A proton from the water is donated to the catalytic acid/base, and the remaining hydroxide caps the cleaved glycosidic linkage. 
The nucleophilic glutamate is returned to its original negative charge state, and the enzyme is reset to its original state ready to

an appropriately positioned water molecule would not come until later, when the B. agaradhaerens Cel5A reaction pathway was structurally characterized by Davies et al.206 Other residues, including an asparagine, histidine, and arginine, have been hypothesized to participate in supporting catalytic roles.740,747 However, it is not clear whether these interactions are conserved across GH5s or what precise role each residue plays. Structural evidence of the double-displacement mechanism in GH5s was obtained in 1998 when Davies et al. captured the entire reaction coordinate in a series of structures from B. agaradhaerens Cel5A, a β-1,4-glycanase.206 Five different structures describe the individual states of the reaction including: the initial apo conformation of the active site, the Michaelis complex, two versions of the glycosyl-enzyme intermediate, and the product bound conformation. We illustrate these states in Figure 54, as this reaction coordinate is general for fungal GH5 cellulases. Prior to hydrolysis, the cello-oligomer substrate is initially bound within the GH5 active site in the catalytically competent Michaelis complex (Figure 54). The catalytic acid/base, Glu139 for B. agaradhaerens Cel5A, and the nucleophilic glutamate, Glu228, are positioned over the −1 glucopyranose so as to initiate catalysis. The proton of the acid/base hydrogen bonds with the glycosidic oxygen, while the nucleophilic glutamate hydrogen bonds with the anomeric carbon of the −1 glucopyranose ring. The −1 glucopyranose must adopt an energetically unfavorable skew conformation (1S3) to correctly position the leaving group sugar significantly higher than would otherwise naturally occur. The use of a fluorine-substituted dinitrophenyl substrate made capture of the intact glycosidic linkage spanning the −1/+1 binding sites possible.


Figure 54. Retaining mechanism of GH5 cellulases illustrated by B. agaradhaerens Cel5A structures captured along the reaction coordinate.206 The catalytic residues are shown in yellow stick. Glu139 is the catalytic acid/base, and Glu228 is the enzymatic nucleophile. Arg62 and Tyr202 appear to serve supportive roles. The substrate, representative of β-1,4-linked glucans, is shown in cyan stick. The water molecule that attacks the substrate anomeric carbon in the deglycosylation step is shown as a red sphere. The Michaelis complex (PDB 4A3H) is shown at left and illustrates the catalytically competent enzyme−substrate conformation. Two covalently bound glycosyl-enzyme intermediate states following the glycosylation step of the double-displacement mechanism have been captured (PDB 5A3H and 6A3H), and the length of the product side substrate appears to affect the conformation of Tyr202. The product, cleaved from the protein following the deglycosylation step, is shown at right (PDB 3A3H).

fundamental understanding of catalytic mechanisms in this family lags behind that of GH7 and offers an area for continued investigation. In the sections that follow, we focus on the four fungal GH5 cellulase structures solved to date noting that difficulties with protein expression and purification have hindered accumulation of a large fungal structural data set (Table 10).765 A multiple sequence alignment of the four fungal structures alongside the bacterial B. agaradhaerens Cel5A sequence is provided in Figure 55 to aid in comparative discussion. 8.1.2. T. aurantiacus Cel5A. Lo Leggio and Larsen reported the first fungal GH5 cellulase structure, a 1.62 Å EG from T. aurantiacus (PDB 1GZJ, Figure 53A).740 In addition to being the first fungal cellulase structure, the TaCel5A structure documented the first structural representative of the GH5 subfamily 5 classification (formerly A5/6). As with previous GH5 structures, the canonical (β/α)8 fold comprises the general tertiary structure. However, the TaCel5A structure is remarkably compact with few extensions to the loops connecting the α and β regions (Figure 53B). The TaCel5A structure does not include a bound ligand, but on the basis of appearance, Lo Leggio and Larsen suggest the EG may be capable of binding up to a celloheptaose oligomer within the −4 to +3 binding sites. The enzyme active site is formed by a wide and shallow groove lined with aromatic residues Trp278, Trp279, Trp273, Trp170, Trp174, and Tyr200 (Figure 56A). This aromatic-mediated substrate recognition

hydrolyze another glycosidic bond. Generally, most GH5s do not exhibit any significant degree of processive action. Thus, the substrate likely dissociates from the active site, and a new oligomer association forms prior to the next catalytic cycle. Overall, the Davies et al. study provides a firm structural basis for understanding the hydrolytic mechanisms of GH5 enzymes, including cellulases.206 However, it is not immediately clear which of the two steps, glycosylation or deglycosylation, is rate-limiting in GH5s or whether this is consistent across the family. Currently, our understanding of the rate at which each step proceeds is limited to theoretical studies of bacterial GH5 cellulases. Liu et al. performed QM/MM simulations of both the glycosylation and deglycosylation steps of Acidothermus cellulolyticus Cel5A.763 The glycosyl-enzyme intermediate was approximated from MD simulations in lieu of a structure-based approximation. The authors report free energy barriers of 25.7 and 29.4 kcal/mol for the glycosylation and deglycosylation reactions, respectively. However, the authors caution that the difference between the two barriers may not be statistically significant given the relatively low level of theory applied. Furthermore, the reported barriers are likely unreasonably high as a whole, again likely as a result of the level of theory. Similarly, Saharay et al. applied QM/MM to examine deglycosylation in B. agaradhaerens Cel5A, obtaining a free energy barrier of 24.2 kcal/mol.764 Saharay et al. do not explain the differences between their calculations and those of Liu et al. despite having used the same level of theory. In general, our


Figure 55. Sequence alignment of the four fungal GH5 structures discussed here alongside the bacterial B. agaradhaerens Cel5A (BaCel5A) from which catalytic function was elucidated. Notably, the sequence alignment illustrates the relatively low sequence similarity among the members of this GH family. Strictly conserved residues are shown in red block, and chemically similar residues in red text. The blue boxes indicate chemical similarity across a grouping of residues. The secondary structural elements of TrCel5A and BaCel5A are shown above and below the sequences, respectively. The catalytic acid/base motif is shown in a pink box, and the catalytic nucleophile is shown in a yellow box. The figure was generated with ESPript (http://espript.ibcp.fr).347
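The QM/MM free energy barriers quoted above (25.7, 29.4, and 24.2 kcal/mol) can be put in perspective by converting them to rate constants with the Eyring equation, k = (kB·T/h)·exp(−ΔG‡/RT). The sketch below is illustrative only: the temperature of 298.15 K and a transmission coefficient of 1 are assumptions made here, not conditions reported by the cited studies, and the function name `eyring_rate` is hypothetical.

```python
import math

KB = 1.380649e-23    # Boltzmann constant, J/K
H = 6.62607015e-34   # Planck constant, J*s
R = 1.987204e-3      # gas constant, kcal/(mol*K)


def eyring_rate(dg_kcal_mol, temp_k=298.15):
    """Eyring rate constant (1/s) for a free energy barrier in kcal/mol.

    Assumes a transmission coefficient of 1; temp_k defaults to an
    assumed 298.15 K.
    """
    return (KB * temp_k / H) * math.exp(-dg_kcal_mol / (R * temp_k))


# Barriers as quoted in the text (kcal/mol).
barriers = {
    "A. cellulolyticus Cel5A glycosylation": 25.7,
    "A. cellulolyticus Cel5A deglycosylation": 29.4,
    "B. agaradhaerens Cel5A deglycosylation": 24.2,
}

for label, dg in barriers.items():
    print(f"{label}: dG = {dg} kcal/mol -> k = {eyring_rate(dg):.2e} 1/s")
```

Under these assumptions, each barrier maps to a rate constant several orders of magnitude below one event per second, which is one way to see why the text describes the computed barriers as likely unreasonably high.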

mechanism is typical of GHs,766 yet of the tryptophan residues lining the active site, only Trp273 in the −1 binding site is strictly conserved. However, aromatic residues near the +1 and +2 binding sites are spatially conserved throughout GH5s suggesting functional relevance. The spatially conserved tryptophan in the +1 binding site of TaCel5A is Trp170.

Chemical modification of the homologous residue in TrCel5A decreased kcat/Km by nearly half.749 The authors hypothesize this residue serves a configurational function, positioning the glycopyranose moiety in the −1 binding site in the energetically unfavorable skew conformation necessary for catalysis. The tyrosine residue, Tyr200, located in the −2 binding site of


Figure 56. (A) First fungal GH5 cellulase structures, both T. aurantiacus Cel5A, exhibit an extensive network of aromatic residues comprising the wide and shallow active site groove.740,753 The aromatic residues, shown labeled in yellow stick, serve to mediate carbohydrate substrate interactions. Surprisingly, only Trp273 in the −1 binding site and the catalytically functional Tyr200 are strictly conserved among the aromatic residues. However, many others are spatially conserved by chemically similar residues. (B) A handful of conserved residues, cyan stick, are associated with GH5 cellulolytic function and differentiate these (β/α)8 structures from the thousands of others. Several, including Tyr200, Glu133, and Glu240, directly participate in catalysis. Other conserved residues contribute to structural stability through disulfide bonds, orange stick, and salt bridges, magenta stick.

observed for B. agaradhaerens Cel5A.206 More broadly, mounting structural evidence such as this and recent GH18 studies suggests flexibility of the catalytic residues is a hallmark of endo-active enzymes in general.206,768 A second, even higher resolution (1.12 Å) TaCel5A structure followed almost immediately after the first (PDB 1H1N), again without a bound substrate.753 Van Petegem et al. offer additional molecular level insights not explicitly discussed by Lo Leggio and Larsen.740 The aromatic lined active site groove of TaCel5A was again noted, but Van Petegem et al. describe an additional aromatic residue (Phe16) that may participate in carbohydrate stacking interactions in the −2 binding site. Though not explicitly described as such, perhaps the most interesting finding from the TaCel5A structure reported by Van Petegem et al. is the discussion of structural details that likely contribute to the thermal stability of this EG. Lo Leggio and Larsen report a single disulfide bond between Cys212 and Cys249, noting that it does not appear to be conserved among GH5s. In contrast, Van Petegem et al. describe the apparent conservation of the same disulfide bond in 20 different GH5 EGs. In both studies, it is unclear which subset of enzymes was used in the sequence comparisons, making it difficult to speculate why conservation of the disulfide bridge appears inconsistent. Nevertheless, disulfide bonds have long been associated with thermal stability and may contribute to the same in TaCel5A.769−771 A salt bridge between Arg154 and Asp188 is also reported, though not as strictly conserved. The Arg154 residue is located toward the end of the α4 helix, and Asp188 is located in a largely unstructured, solvent-exposed loop connecting the α5 helix to the β6 strand.
It is tempting to surmise that the presence of the salt bridge in a region otherwise susceptible to denaturation contributes some measure of stability, as, like disulfide bridges, the appearance of salt bridges in thermophilic proteins has been implicated in stability.772−775 Van Petegem et al. also very briefly discuss the function of the loop connecting the α6 and β6 structural elements. Specifically,

TaCel5A, is also strictly conserved. Prior evidence suggests that the tyrosine hydroxyl group may participate in catalysis through interaction with the nucleophile rather than through direct stacking interaction with the carbohydrate moieties.206,743,747 Outside the aromatic residues, Lo Leggio and Larsen put forth functional roles for several of the conserved residues. Through structural comparison, the authors identified conserved residues including two glycines (Gly8, Gly44), the catalytic glutamates (Glu133, Glu240), as well as the six canonically conserved active site residues (Asp132, Arg49, His93, His198, Tyr200, Trp273). Conservation in GH5s is almost entirely restricted to the active site, as shown in Figure 56B. The exceptions to this in TaCel5A are the two glycine residues and Arg49, whose functions remain unclear. The authors suggest Gly44 is a part of the Schellman C-terminal capping motif conserved throughout GH5s.767 The functional relevance of Gly8 is less obvious, as this residue lies in the middle of the β1 strand away from the substrate and conservation is unnecessary to maintain the (β/α)8 fold. Arg49 is located in the β2 strand. This arginine is conserved in all members of the 4/7 superfamily except those of GH10 and GH26. As with Gly44, this conserved residue is seemingly related to the Schellman C-terminal capping motif, as the motif is also not conserved in GH10 and GH26. The positions of the catalytic residues in the TaCel5A structure were also informative. As is often the case, the protein was crystallized under conditions (pH 9.0) far from its optimal acidic pH of approximately 4.0. Additionally, the structure did not contain a ligand bound in the active site. Despite these limitations, the catalytic residues were found in a catalytically competent position,740 making superimposition of cello-oligomers from other bacterial GH5 structures straightforward and informative from a mechanistic perspective.
The positioning of these residues under adverse conditions may be related to the apparent general flexibility of the GH5 EG active site as was


positioning given its proximity to Trp44, which is positioned directly over the −3 glucopyranose moiety. This suggests that while both PrEglA and TaCel5A use the same catalytic mechanism to hydrolyze the glycosidic linkage, the two enzymes make use of a different molecular level mechanism for substrate recruitment. The PrEglA structure also explicitly illustrated molecular phenomena related to catalysis in fungal GH5 cellulases. Tseng et al. note six water-mediated hydrogen bonds between the −1 and −2 glucopyranose moieties in the 3AYS structure (Figure 57).762 The water near the anomeric carbon of the −1 glucopyranose hydrogen bonding with Glu154 has been implicated as a participant in the catalytic mechanism of GH5s, specifically as a means of nucleophilic attack on the glycosyl-enzyme intermediate.206 The PrEglA ligand structure captures the water molecule near the −1 glucopyranose despite being catalytically inactivated by an E154A mutation. Superimposition of the Glu154 side chain from the wild-type structure with E154A confirms the catalytic acid/base points toward the O1 atom of the −1 glucopyranose at 1.7 Å, and the complementary nucleophile, Glu278, hydrogen bonds with the anomeric carbon of the −1 glucopyranose at 3.2 Å. This catalytic arrangement of residues, including nucleophilic attack by the water, is in line with that previously observed in a bacterial GH5 cellulase. 206 The conserved Tyr231 was also observed participating in an apparent catalytic arrangement by hydrogen bonding with the Glu278 side chain. The PrEglA cellotetraose-bound structure offers an intriguing explanation for the enzyme’s putative processive ability. 
The authors purport, on the basis of previous biochemical assays, that PrEglA is “both an endoglucanase and a cellobiohydrolase”, which we take to mean that the enzyme exhibits both processive and nonprocessive behavior.776 The authors do not attempt to further connect structure to function in an enzyme that certainly resembles an EG by all appearances. However, Tseng et al. indicate the bound cellotetraose ligand is nearly completely surrounded by protein at the −1 binding subsite with the remainder of the active site exposed to water. This observation is likely directly related to the observed processive ability of the enzyme in both increasing ligand binding free energy and maintaining contact with crystalline substrates, allowing processive hydrolytic events.580 Finally, the authors observe what they consider to be a potentially dimeric structure in which the O1 atom of the −1 sugar appears to hydrogen bond (3.2 Å) with the OE2 of the symmetry-related Glu242.762 Site-directed mutagenesis of E242A indicated overall catalytic function was retained, suggesting dimeric catalysis by the proposed mechanism was unlikely. Nevertheless, the authors go on to suggest that, because the first 100 amino acids encoded by the PrEglA cDNA fragment were similar to the last third of the (β/α)8 barrel sequence, PrEglA comprises at least two CDs acting in tandem. However, the structural evidence to date does not support this hypothesis. 8.1.4. T. reesei Cel5A (Formerly EG II/EG III). The most recent fungal GH5 cellulase structure originates from T. reesei (PDB 3QR3).30,566 The enzyme, Cel5A or formerly EGII/EGIII, accounts for up to 55% of the total EG activity of T. reesei and thus is a key component of many industrial biomass conversion cocktails.742 Thermal stability of the enzymes within these cocktails is key to effective deployment in commercial processes and has been a primary focus of recent protein engineering efforts.324,752,777−779 The thermal stability of

the authors describe a potential substrate positioning mechanism based on both an extensive observed hydrogen bonding network in this region as well as superimposition of the T. aurantiacus structure with the cellobiose-bound C. thermocellum CelC (Figure 53B). While proximity of this loop to the C. thermocellum CelC cellobiose does suggest function, little else can be said regarding the mechanism given the significant structural deviations between the two EGs in this region. C. thermocellum CelC exhibits an impressive subdomain insertion represented by a very compact, unstructured loop in TaCel5A. It is difficult to imagine these two drastically different features behave in a similar substrate-recruiting fashion. 8.1.3. P. rhizinflata EglA/CelA. The first fungal GH5 cellulase structure exhibiting a bound ligand was solved by Tseng et al. in 2011.762 The E. coli expressed EG from P. rhizinflata, described in literature as both EglA and CelA (PrEglA), captured the orientation of a cellotriose molecule in the −3 to −1 binding subsites at 2.2 Å resolution (PDB 3AYS). The authors reported that the N-terminal His tag attached for purification prevented soaking in a cellulose ligand by blocking the active site. Cocrystallization with the catalytically inactive PrEglA (E154A) was successful, however. These limitations likely apply to other fungal GH cellulases, explaining in part the long delay in solution of a substrate-bound structure. The authors also solved the complementary apo wild-type structure at 2.0 Å resolution (PDB 3AYR), illustrating that the enzyme underwent little to no change as a result of substrate binding.762 PrEglA is similar to the TaCel5A structure in several ways, including the presence of several spatially conserved aromatic residues lining the active site and the presence of a single disulfide bond.
Interestingly, the disulfide bond in PrEglA (Cys27 to Cys43) is located in a different loop from that observed in TaCel5A connecting the β1 strand to the α1 helix on nearly the opposite side of the protein near the −3 glucopyranose of the bound cellotriose (Figure 57). The PrEglA disulfide bond almost certainly plays an indirect role in substrate

Figure 57. Active site of P. rhizinflata EglA, the first fungal GH5 cellulase structure to capture the position of a bound ligand. The cellotriose-bound variant structure (3AYS) is shown in aquamarine cartoon and stick, and the apo wild-type structure (3AYR) in light pink cartoon and stick. Aromatic residues, Trp44 and Tyr231, are shown in yellow stick, and the cellotriose is shown in gray stick. Water molecules are shown as red spheres.


Figure 58. First structure of TrCel5A reveals key details regarding disulfide bonds and loop insertions that may aid in understanding contributions to thermal stability in GH5 EGs. (A) The TrCel5A structure, slate cartoon, is shown aligned with the homologous TaCel5A structure, green cartoon. Primary differences in the two structures appear in the noted loop regions connecting the primary structural elements. TrCel5A also exhibits a β-hairpin element, red cartoon, with a tryptophan residue that stacks against the protein face. (B) TrCel5A exhibits four disulfide pairs, orange stick, that connect primary structural elements, yet surprisingly may not significantly contribute to thermal stability. (C) The active site of TrCel5A contains several conserved residues, gray stick, that may contribute to stability of transition states in catalysis or substrate binding. The catalytic acid/base, Glu218, and the nucleophile, Glu329, are shown in yellow stick.
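Disulfide pairs such as the four shown for TrCel5A are commonly assigned from a structure by checking which cysteine sulfur (SG) atoms lie within bonding distance of one another. The following is a minimal sketch of that geometric check. The 2.5 Å cutoff, the function name `find_disulfides`, the residue numbers, and every coordinate are assumptions invented for illustration; none of the values are taken from PDB 3QR3 or any other deposited structure.

```python
import math
from itertools import combinations


def find_disulfides(sg_coords, cutoff=2.5):
    """Pair cysteines whose SG atoms lie within `cutoff` angstroms.

    sg_coords maps a residue number to the (x, y, z) position of its SG atom.
    The 2.5 A default is an assumed tolerance around a typical S-S bond
    length of roughly 2.0 to 2.1 A.
    """
    pairs = []
    for (res_i, xyz_i), (res_j, xyz_j) in combinations(sorted(sg_coords.items()), 2):
        if math.dist(xyz_i, xyz_j) <= cutoff:
            pairs.append((res_i, res_j))
    return pairs


# Hypothetical residue numbers and coordinates (angstroms), made up for this
# sketch: eight cysteines arranged so that they form four close pairs.
toy_sg = {
    5: (0.0, 0.0, 0.0),
    12: (2.05, 0.0, 0.0),
    40: (10.0, 0.0, 0.0),
    47: (10.0, 2.0, 0.0),
    90: (0.0, 10.0, 0.0),
    130: (0.0, 10.0, 2.1),
    135: (20.0, 20.0, 20.0),
    180: (21.2, 21.2, 21.2),
}

print(find_disulfides(toy_sg))
```

A distance test alone identifies where disulfides are, not what they contribute; as the caption notes, the number of disulfide pairs in TrCel5A does not by itself account for its thermal stability.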

TrCel5A (reported Tm of 69.5 °C) is relatively low compared to other hyperthermally stable GH5s and is a likely target for stability improvement. Lee et al. solved the 2.05 Å apo structure of TrCel5A to uncover the molecular level contributions to thermal stability by comparison; however, the structure raises as many questions as it answers.566 The TrCel5A structure’s closest homologue is TaCel5A with 29% sequence identity and 69% sequence similarity. Comparison of TrCel5A with the TaCel5A structure,740,753 a hyperthermally stable enzyme with a Tm of nearly 81 °C,780 can feasibly provide insight into the variation in molecular features contributing to reduced thermal stability in TrCel5A. At first glance, primary differences in the two structures include extended loops connecting the β1 sheet to the α1 helix, the β3 sheet to the α3 helix, and the α5 helix to the β6 sheet (Figure 58A). The TrCel5A structure also features a protruding β-hairpin element at residues 378−385 near the final α8 helix, wherein Trp384 stacks the β-hairpin against the globular face of the protein (Figure 58A). The functional role of the β-hairpin remains unknown. However, β-hairpins have been observed in other bacterial GH5s including Thermotoga maritima Cel5A781 and Clostridium cellulovorans EG D (PDB 3NDY, unpublished). Though Lee et al. do not investigate the contributions of the loop insertions to reduced thermal stability, it is tempting to hypothesize these increasingly disordered regions contribute in part to a reduced Tm relative to TaCel5A. Recent computational investigations of a related enzyme lend credence to this hypothesis, though it remains largely untested experimentally.752 Disulfide bonding in TrCel5A is also a key differentiator from the homologous TaCel5A.
TaCel5A exhibits a single disulfide bond pinning the α6 to β6 loop to the top of the α7 helix.740,753 PrEglA also exhibits a single disulfide bond and is a known hyperthermophilic enzyme.762 In contrast, TrCel5A contains 8 cysteine residues forming four disulfide bonds in the protein (Figure 58B). One of the disulfide bonds, between Cys302 and Cys338, corresponds to that observed in TaCel5A. The disulfide bond between Cys86 and Cys92 tethers the C- and N-terminal regions of the β1 to α1 loop. Oddly, the other two disulfide

bonds, Cys162 to Cys169 and Cys343 to Cys393, anchor (β/α)8 barrel structural elements directly including the α2 helix to β3 sheet and the α7 helix to the α8 helix, respectively. Intuitively, one would expect that, with the increased number of disulfide bonds, TrCel5A would exhibit a greater degree of structural stability and thus increased thermal tolerance over TaCel5A.769−771 However, on the basis of the structural evidence presented by Lee et al.,30 there does not appear to be a direct correlation of disulfide bonds and thermal stability. The authors report site-directed mutagenesis of several cysteines to serines resulted in insoluble protein expression in the E. coli recombinant host. Thus, there is little evidence by which to ascertain the contributions to thermal stability imparted by each nonconserved cysteine pair. Finally, Lee et al. observe catalytic residue orientations in line with the proposed GH5 catalytic mechanism.30,206 The catalytic residues Glu218 and Glu329 are separated by a distance of approximately 5 Å measured from the terminal oxygen atoms consistent with reports of catalytic residue flexibility (Figure 58C).206 The authors suggest residues Thr328, His288, and Glu218 function simultaneously to raise the donor carboxylate pKa promoting efficient catalysis. The nucleophilic glutamate, Glu329, is observed hydrogen bonding to the OE2 atom of Arg130 and the OE1 atom of Tyr290 as part of the retaining catalytic mechanism. As with T. aurantiacus,753 conserved residues His174 and Trp362 are reported to participate in substrate binding rather than directly in the catalytic mechanism despite their proximity to the active site.

8.2. Characterization of Activity and Specificity

In general, GH5 cellulolytic function can be classified as endo-initiated, with only 2 of the 4395 members classified as having β-1,4-cellobiosidase activity (EC 3.2.1.91), one from C. thermocellum and one from Teredinibacter turnerae. In the following sections, we briefly describe the wealth of biochemical characterization data available for fungal GH5 cellulases. Each of the enzymes described below has been classified as an endo-β-1,4-glucanase (EC 3.2.1.4), with some purported to exhibit moderate degrees of processive action.

DOI: 10.1021/cr500351c Chem. Rev. 2015, 115, 1308−1448

Chemical Reviews

Review

8.2.1. T. viride EG III. Some of the earliest reported biochemical characterizations of GH5s came from T. viride several years before cellulase classification would relate their function to other GH5s. In 1978, Shoemaker and Brown submitted two simultaneous reports on the discovery and characterization of a set of four EGs from T. viride demonstrating purity with SDS-PAGE.728,729 Up to this point, studies had purified fungal EGs (not necessarily GH5s) from T. viride, T. koningii, Fusarium solani, Sporotrichum pulverulentum, and Irpex lacteus, though homogeneity of these early purifications was uncertain.782−785 Shoemaker and Brown described what was ultimately the first biochemical characterization of pure fungal GH5 cellulases, T. viride EGIII and EGIV, both from subfamily 5. The enzymes were reported as 52 and 49.5 kDa, respectively.729 Specific activity assays suggested the EGs were active on CMC, PASC, and cellotriose and higher oligosaccharides with limited Avicel activity.728 Both EGs were reported to have pH optimums of 4.0 to 4.5 on PASC and CMC and a noted intolerance for high pH.729 The T. viride cellulases, both EGs and processive cellulases from other GH families, formed the basis of a commercial enzyme preparation, Maxazyme CL, from the Dutch company Gist-brocades (now owned by DSM) and another from Miles Laboratories. After the report from Shoemaker and Brown, several years passed before any additional characterization occurred. In 1985, Beldman et al. reported the purification and characterization of the commercial Maxazyme preparation, though much of the report focuses on the processive cellulase function.786 The authors followed up their characterization efforts with a description of the apparent synergy of the T. viride cellulases, which relies significantly on the presence of EGs such as EGIII and EGIV.787 After these initial reports, much of the literature related to T. 
viride GH5 EGs surrounds strain engineering for enhanced production with a few reports of recombinant expression for improved stability and pH optimum.

8.2.2. TrCel5A. T. reesei (now also known as Hypocrea jecorina) Cel5A (TrCel5A) is a model fungal GH5 EG (subfamily 5) having found application in some modern cellulase-based commercial enzyme preparations.788 As such, much of the fungal GH5 characterization efforts to date have focused on this enzyme. Early characterization focused on delineating activity between the suite of T. reesei cellulases and broadly describing specificity. In 1985, van Tilbeurgh and Claeyssens used 4-methylumbelliferyl-β-D-glycoside substrates in an attempt to explicitly differentiate T. reesei CBHs, EGs, and β-glucosidase activities.301 Notably, TrCel5A was the only cellulase to cleave the fluorescent phenol group from the fluorogenic 4-methylumbelliferyl-β-D-cellotrioside. This technique would later be used to characterize relative activities of commercial enzyme preparations, the results of which highlighted the need to pay close attention to the model substrates used in evaluation of overall activity and synergism, as it is now clear that it is impossible to differentiate CBH from EG activity using artificial fluorogenic substrates such as 4-methylumbelliferyl-β-D-glycoside.789 In 1987, Penttilä et al., having expressed and cloned the egl3 gene encoding TrCel5A, recombinantly expressed the EG in S. cerevisiae in an effort to understand the effects of expression host on activity.283 The authors found that while the recombinant enzymes remain active, the expression host affects both specificity and enzyme morphology. The change of morphology resulting from heterologous expression of TrCel5A compared to

the control strain was speculated to result from hyperglycosylation common in yeast expression. As previously mentioned, Saloheimo et al. described what appears to be the first isolated and sequenced fungal GH5 cellulase.323 Alongside this isolation, the authors characterized the activity of the new EG, as its sequence differed substantially from the known cellulases at the time, TrCel6A, TrCel7A, and TrCel7B, indicating potentially new function. TrCel5A, like TrCel6A, did not hydrolyze cellobiosides or lactosides, and thus, the authors speculated specificity was similar. Saloheimo et al. also reported a pH optimum of 4.0−5.0 on a cellotrioside substrate at 50 °C, comparable to the T. viride EGs. A flurry of activity surrounding the mode of TrCel5A action occurred in the early 1990s with a series of papers uncovering the catalytic nucleophile,750 described above, putative binding models,748 essential aromatic residues,749 and a high degree of endo-activity.304 With a basic understanding of TrCel6A and TrCel7A specificity and action in place at this point, Macarrón et al. set out to develop a similar level of understanding of TrCel5A action. Having purified the TrCel5A with an attached CBM, the authors report TrCel5A exhibits a stable pH range 4.0−6.3 at 55 °C, retaining 90% of its activity at 65 °C for 30 min.748 Chromophoric substrates were effectively used to identify TrCel5A as having a double-displacement catalytic mechanism, the first report of such. In the case of cleaving the 2-chloro-4-nitrophenol bond, the deglycosylation step was reported as the rate-limiting step, though nucleophilic competition experiments between methanol and water point to the glycosylation step as rate-limiting. Ultimately, no conclusion can be made regarding rate-limitation in the double-displacement mechanisms from the reported data. However, Macarrón et al. were able to propose a five-subsite binding model for TrCel5A based on observed cleavage products (Figure 59). 
To date, this remains to be validated from a structural standpoint, with only a single apo structure of the enzyme reported.30 Macarrón et al. later investigated the function of three tryptophan residues in binding and catalysis using N-bromosuccinimide modification.749 Given the likely role of conserved aromatic residues in carbohydrate substrate binding, the authors hypothesized chemically modified tryptophans

Figure 59. Proposed five-subsite binding model for TrCel5A, illustrated with two possible hydrolysis pathways for 4-methylumbelliferyl-β-D-cellotrioside, with the open rectangles indicating D-glucosyl moieties and filled rectangles indicating 4-methylumbelliferyl moieties. The lettered, indented block represents the binding site of TrCel5A. Catalysis takes place between the C and D subsites, as shown by the arrows marking the catalytic protein residues. The possibility of methanol transfer as part of the EG activity is indicated. Reprinted with permission from ref 748. Copyright 1993 the Biochemical Society.
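The subsite model in Figure 59 can be made concrete with a short enumeration. The Python sketch below lists the binding registers of a ligand on five subsites that place a bond across the C/D cleavage point. It assumes, purely for illustration, that every unit of the ligand must occupy a subsite and that the chain runs from subsite A toward E; the function name `productive_registers` and the unit labels are hypothetical and do not come from ref 748.

```python
from typing import List, Tuple

SUBSITES = ["A", "B", "C", "D", "E"]  # five subsites, as in the proposed model
CLEAVAGE_AFTER = "C"                  # catalysis occurs between subsites C and D


def productive_registers(ligand: List[str]) -> List[Tuple[List[str], List[str]]]:
    """Enumerate fully bound registers that place a bond across the C/D site.

    Assumption (not from the source): every ligand unit must occupy a subsite,
    and the first unit of `ligand` sits in the lowest-lettered occupied subsite.
    Returns (left fragment, right fragment) for each productive register.
    """
    cut_index = SUBSITES.index(CLEAVAGE_AFTER) + 1  # subsites to the left of the cut
    products = []
    for start in range(len(SUBSITES) - len(ligand) + 1):
        n_left = cut_index - start  # ligand units bound in subsites up to C
        if 0 < n_left < len(ligand):
            products.append((ligand[:n_left], ligand[n_left:]))
    return products


# 4-methylumbelliferyl-beta-D-cellotrioside: three glucosyl units plus the MU group
mu_cellotrioside = ["Glc", "Glc", "Glc", "MU"]
for left, right in productive_registers(mu_cellotrioside):
    print("-".join(left), "+", "-".join(right))
```

Under these assumptions the cellotrioside has two productive registers, one releasing the 4-methylumbelliferyl group and one releasing cellobiose. Whether these correspond exactly to the two pathways drawn in Figure 59 cannot be confirmed from the caption alone.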


greater extent than TrCel7A. This is likely related to the characteristics of the globular surface of TrCel5A, as it has been repeatedly reported that addition of bovine serum albumin (BSA) in assays enhances stability of TrCel5A.323,792,793 Here, addition of BSA significantly reduced the binding of TrCel5A to lignin. In a related study, Le Costaouëc et al. examined the effects of CBMs on hydrolysis for TrCel7A and TrCel5A as well as TaCel7A and TaCel5A on pretreated wheat straw and spruce.794 TaCel7A and TaCel5A, unlike the Trichoderma cellulases, naturally lack a CBM and presumably have evolved in such a way as to maintain similar levels of hydrolysis. The authors genetically modified the Thermoascus cellulases with attached CBMs for a direct comparison to intact TrCel7A and TrCel5A. A general conclusion from this study was that, at high substrate loadings, all enzymes performed better without the attached CBMs, most likely by virtue of the high population of free oligosaccharides reducing the need for substrate recognition by the CBM. The Thermoascus cellulases have likely evolved to function in high substrate loading contexts, as both enzymes outperformed the CBM-deficient Trichoderma cellulases. Le Costaouëc et al. also suggest that CBM−lignin interactions may constitute detrimental nonproductive binding that would be mitigated by removing the CBM entirely. Thus, even for Trichoderma cellulases, removal of the CBM for high substrate loading applications may result in better overall performance. Early characterization efforts were particularly successful in predicting extent of glycosylation in TrCel5A. Both Saloheimo et al. and Ståhlberg et al. described TrCel5A putatively exhibiting one N-linked glycan and heavy O-linked glycosylation on the N-terminal structural domain, i.e., the CBM and linker (as described in section 5).323,741 Nearly 15 years later, Hui et al. 
would follow up these hypotheses to firmly establish the nature and location of N-linked glycans and the extent of heterogeneity in O-linked glycans of the linker domain.418 Using capillary liquid chromatography-electrospray mass spectrometry and matrix-assisted laser desorption/ionization time-of-flight mass spectrometry, Hui et al. characterized the glycosylation patterns of all four major Trichoderma cellulases purified from a T. reesei RUT-C30 strain fermentation broth. The authors confirmed that TrCel5A exhibits a single N-linked GlcNAc at Asn103, the previously predicted location.323,418 The glycan is likely trimmed endogenously after secretion from a higher mannose form. The authors also indicate the linker exhibits anywhere from 32 to 42 O-linked mannose residues, only slightly fewer than the 39−46 attached to the long, glycosylated TrCel6A linker. The O-linked glycan moieties have been shown to play a key role in enzyme substrate binding, and such a high degree of glycosylation as reported here may account in part for the significant differences in adsorption between the intact enzyme and the CBM-deficient versions.792 Product inhibition of cellulases by cellobiose is a point of interest and noted concern when considering industrial conversion process conditions and design of experiments, as detailed in the previous two sections. Conversion rates of cellulase cocktails tend to drastically decline with increasing concentrations of glucose and cellobiose during the initial hydrolysis stage,581 though which enzymes directly contributed to this effect was not immediately clear. Gruno et al. directly examined product inhibition for TrCel7A and three Trichoderma EGs including TrCel5A to delineate contributions from each of these enzymes to the inhibition effect.611 As described

would affect binding and hydrolysis. Chemical modification was able to alter three tryptophans in total: one putatively part of the CBM, one likely very near the catalytic active site, and one unspecified modification not affecting activity or binding. The modification of the CBM-located tryptophan, most likely Trp5, reduced adsorption on Avicel by nearly 50% (see section 5). As TrCel5A is an EG, it is unclear whether this modification would significantly affect enzymatic activity in practice. Modification of the “active site” tryptophan, thought to be Trp255, reduced rates of hydrolysis on small soluble substrates, though primarily through its influence on substrate binding rather than directly through catalysis. This finding is directly related to the primarily endo-active function of TrCel5A.304 In later years, the choice of model substrate and kinetic screening tools became an important emphasis in cellulase research. Frequently, TrCel5A was included in these studies, while TrCel7A or TrCel6A was the primary focus. In developing the now ubiquitous disodium 2,2′-bicinchoninate assay to quickly assess initial reducing end sugar production from cellulases, Johnston et al. evaluated Michaelis−Menten kinetics of TrCel5A relative to the three other primary Trichoderma cellulases providing a consistent comparison of kinetic data accounting for initial rates of the biphasic systems.790 Kipper et al. similarly worked toward an accurate characterization of cellulase kinetic action and, in doing so, reported the remarkable differences in TrCel5A endo-activity versus TrCel6A and TrCel7A. Related processive characterization efforts also served in verifying the nonprocessive nature of TrCel5A.315 Jäger et al. 
investigated the merits of using an α-cellulosic substrate in the selection of cellulases for alkaline-pretreated biomass and briefly discussed TrCel5A action on this realistic substrate.791 Knowing that TrCel5A is a multimodular enzyme consisting of a CBM attached via a glycosylated linker to the catalytic domain, a natural follow-up question is to what extent the CBM plays a role in the enzyme’s action. Nidetzky et al., noting the inherent difficulties associated with appropriate selection of binding site models, set out to develop realistic relationships between cellulase binding and rate of substrate hydrolysis.792 As part of this study, the authors also evaluated the effects of the CBM on binding and hydrolysis. After incubation with filter paper substrate and fluorometric assessment of activity, Nidetzky et al. developed Langmuir isotherms describing filter paper adsorption of all the major Trichoderma cellulases. With respect to TrCel5A, the authors found that (1) the extent of adsorption increases with agitation, facilitating mass transfer to the solid substrate, (2) nearly 4 times as many binding sites were available for the intact TrCel5A compared to the CD alone, and (3) the relationship of binding to hydrolysis was nonlinear, suggesting nonproductive binding. These results indicate that the CD alone does little of the work in maintaining enzyme/substrate association, and thus, the CBM is critical to effective hydrolysis at relatively low substrate concentrations. The effect of CBM substrate binding to real biomass has also been examined to much the same result as described by Nidetzky et al. for filter paper.792 Palonen et al. compared the binding behavior of TrCel7A and TrCel5A both with and without CBMs on steam pretreated softwood and lignin.793 Again, separation of the catalytic core domains from the CBMs significantly decreased binding of the enzymes to the substrates, though the effect on hydrolysis as a result was not determined. 
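The Langmuir description used in these adsorption studies can be sketched in a few lines of Python. The single-site form below, the function name `langmuir_bound`, and all parameter values are illustrative assumptions, not the fitted model or data of Nidetzky et al.; the only feature taken from the text is the roughly 4-fold difference in available binding sites between the intact enzyme and the CD.

```python
def langmuir_bound(free_conc: float, a_max: float, k_ad: float) -> float:
    """Bound enzyme per gram of substrate for a single-site Langmuir isotherm.

    A = A_max * K_ad * E_free / (1 + K_ad * E_free)
    """
    if free_conc < 0 or a_max < 0 or k_ad < 0:
        raise ValueError("concentrations and parameters must be non-negative")
    return a_max * k_ad * free_conc / (1.0 + k_ad * free_conc)


# Hypothetical parameters (not measured values): the intact enzyme is given
# four times the binding capacity of the isolated CD and the same affinity.
A_MAX_CD = 0.05              # umol enzyme per g substrate, illustrative
A_MAX_INTACT = 4 * A_MAX_CD  # "nearly 4 times as many binding sites"
K_AD = 2.0                   # L/umol, illustrative

for e_free in (0.1, 0.5, 2.0, 10.0):
    cd = langmuir_bound(e_free, A_MAX_CD, K_AD)
    intact = langmuir_bound(e_free, A_MAX_INTACT, K_AD)
    print(f"E_free={e_free:5.1f}  CD={cd:.4f}  intact={intact:.4f}")
```

With equal affinities, the bound amounts differ by the capacity ratio at every free-enzyme concentration; a real fit could of course yield different affinities for the two forms as well.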
An interesting finding from this work was that the TrCel5A CD appears to adsorb to the alkaline isolated lignin to a much


TrCel5A is approximately 10 °C higher than any of the three other cellulases (Figure 60). With a Tm of 75 °C, TrCel5A is

previously, the primary cellobiose inhibition effects tend to derive from processive CBHs. The authors report TrCel5A hydrolysis of tritiated amorphous cellulose is virtually unaffected by product inhibition, as was the case for the other EGs in the study. Comparatively, inhibition of TrCel7A on tritiated BC was nearly 100-fold higher than any of the EGs. It is likely that the relatively weak adsorption of TrCel5A to carbohydrate substrates resulting from the open, shallow binding site cleft plays a significant role in allowing the cleaved cellobiose product to rapidly dissociate from the active site preventing any significant degree of inhibition. Using a completely different technique of monitoring unmodified cellulose hydrolysis by calorimetry, Murphy et al. would later corroborate this finding.615 Gruno et al. also report turnover numbers for the various EGs on amorphous cellulose. For TrCel5A, amorphous cellulose is hydrolyzed at a rate of 8.0 ± 0.1 s−1.611 Karlsson et al. previously reported a kcat of 65 s−1 for TrCel5A on cellopentaose.334 The disparity is likely not in the actual rate of hydrolysis, but as Gruno et al. point out, the differences may lie in the means by which hydrolysis was detected, namely formation of soluble sugars. In later years, accurate detection of cellulase kinetics has become a focus area of several research groups around the world.315,527,795 The means by which EGs and CBHs work together to deconstruct cellulosic substrates has intrigued the community for several decades. As described in section 4, the literature generally points toward the endo/exo synergism model, wherein EGs serve to randomly hydrolyze glycosidic linkages as a sort of preparation for the processive CBHs that rapidly cleave without dissociation.305 Nidetzky et al. set out to describe synergistic cellulase action between binary mixtures of cellulases on filter paper. 
The authors point out that a key reason behind many of the inconclusive and contradictory reports was likely insufficient enzyme purity clouding results. Paying close attention to this fact, the authors reported the complementary action of both CBH combinations and CBH/EG combinations. A particularly effective combination was the addition of TrCel5A to the TrCel7A CBH, which enhanced overall activity more than 2-fold. Medve et al. confirm this finding for hydrolysis on Avicel at 40 °C, but the authors go on to suggest that TrCel7A and TrCel5A compete for binding sites so as to negatively affect synergism, with TrCel7A as the more effective competitor.796 This latter finding suggests selection of enzyme ratios, erring on the low side of EGs, may effectively minimize enzyme costs while maintaining endo/exo synergism.797 The role of EGs in endo/exo synergism, in particular TrCel5A, continues to be a major research focus today.798−801 As new approaches for describing cellulase kinetics become available, models describing endo/exo synergism re-emerge, challenging the way we think about processive cellulase action and the role EGs play.557 Thermal stability of an enzyme is essential to industrial relevance, as increasingly higher temperatures facilitate biomass decomposition. Intuitively, it would seem that enzymes secreted from the same native host, such as TrCel6A, TrCel7A, TrCel7B, and TrCel5A, would have similarly evolved thermal stabilities given exposure to the same environmental conditions and stresses. However, this is not necessarily the case as reported in 1992 by Baker et al.802 Using differential scanning calorimetry and complementary tryptophan fluorescence monitoring, the authors discovered that thermal deactivation of all four enzymes is related to thermal unfolding, but that the overall stability of

Figure 60. Differential scanning microcalorimetry of four T. reesei cellulases, with concentration of ∼1 mg/mL in cell. In the reproduced figure, CBHII is TrCel6A, CBH I is TrCel7A, EGI is TrCel7B, and EGII is TrCel5A. Reprinted with permission from ref 802. Copyright 1992 The Humana Press, Inc.
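To illustrate what a 10 °C difference in Tm can mean at a given process temperature, the Python sketch below evaluates a reversible two-state unfolding model. This is a simplifying assumption: the text reports only that deactivation is related to unfolding, cellulase unfolding may in practice be irreversible, and the unfolding enthalpy used here (500 kJ/mol) and the 65 °C comparison Tm are hypothetical values chosen for illustration rather than values from ref 802.

```python
import math

R = 8.314  # gas constant, J/(mol*K)


def fraction_unfolded(temp_c: float, tm_c: float, delta_h: float) -> float:
    """Fraction unfolded for a reversible two-state model (van 't Hoff form).

    K_unfold = exp(-(dH / R) * (1/T - 1/Tm)); fraction = K / (1 + K).
    `delta_h` is the unfolding enthalpy in J/mol, assumed constant with T.
    """
    t = temp_c + 273.15
    tm = tm_c + 273.15
    k_unfold = math.exp(-(delta_h / R) * (1.0 / t - 1.0 / tm))
    return k_unfold / (1.0 + k_unfold)


DELTA_H = 5.0e5   # J/mol, hypothetical unfolding enthalpy
TM_CEL5A = 75.0   # deg C, Tm reported in the text for TrCel5A
TM_OTHER = 65.0   # deg C, hypothetical enzyme with a Tm about 10 deg C lower

for temp_c in (60, 65, 70, 75):
    f5a = fraction_unfolded(temp_c, TM_CEL5A, DELTA_H)
    foth = fraction_unfolded(temp_c, TM_OTHER, DELTA_H)
    print(f"{temp_c:3d} C  Tm=75: {f5a:.3f} unfolded   Tm=65: {foth:.3f} unfolded")
```

By construction, each model enzyme is half unfolded at its own Tm, so at 65 °C the lower-Tm enzyme is already 50% unfolded while the Tm = 75 °C enzyme remains largely folded under these assumed parameters.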

significantly more tolerant of adverse process conditions than any of its complements. We note that while a Tm of 75 °C is significant by comparison, TrCel5A is not considered a hyperthermophilic enzyme, and thus, protein engineering efforts tend to focus on further gains in thermal stability as we will discuss below. Protein stability in highly alkaline environments is also a favorable industrial attribute, particularly given the recent focus on ionic liquid pretreatment of biomass.91,803 Wahlström et al. investigated the effects of ionic liquids on TrCel5A and demonstrated that the ionic liquid environment had a markedly detrimental effect on hydrolysis.804 The authors report the CBM is particularly susceptible to the effects of ionic liquids as evidenced by side-by-side comparisons of hydrolysis both with and without a CBM domain.

8.2.3. H. insolens Cel5A. Few other GH5 fungal cellulases have been as well-characterized as TrCel5A. Motivated in part by the industrial applicability, H. insolens EGs have been catalogued and characterized, though sparingly. Prior to the initial report of cloning and sequencing,717 HiCel5A (EGII) was characterized as part of a set of H. insolens cellulases in studies focused on uncovering stereochemistry, specificity, and kinetics of several GH families.442,805 The stereochemistry of HiCel5A was indeterminate from the results of the Schou et al. study.442 However, a similar bacterial EG from family 5 (A at the time) was shown to follow a retaining stereochemical course in accordance with earlier reports of E. chrysanthemi EG Z.756 This account offered yet another clue that GH5 enzymes would all generally follow this stereochemistry.442 The characterization of HiCel5A by Schou et al. also suggested the enzyme active site consists of at least six subsites, as affinity for cellohexaitol was significantly higher than for cellopentaitol. This latter finding has yet to be structurally confirmed. 
Several years later, Schülein performed a follow-up study to identify pH activity sensitivity and ranges as well as to fill in the gaps regarding stereochemical course and kinetics.499 It had been noted previously that HiCel5A as well as several other H. insolens EGs exhibited specificity for CMC; thus, CMC was used


Table 11. Summary of Biochemical Characterizations of Fungal GH5 Cellulases

Columns: organism; expression host; enzyme; GH5 subfamily; pH opt; temp opt (°C); substrate for opt; substrate specificity; comments; ref.

Organisms: Acremonium sp. CBS265.95; Aspergillus aculeatus; Aspergillus fumigatus; Aspergillus nidulans; Aspergillus niger; Basidiomycete CBS495.95; Daldinia eschscholzii (Ehrenb.:Fr.); Dictyoglomus thermophilum; Fomitopsis palustris; Fomitopsis pinicola; Fusarium verticillioides; Gloeophyllum trabeum; Humicola grisea; Myceliophthora thermophila; Neocallimastix frontalis MCH3; Neurospora crassa; Orpinomyces joyonii.

References: Vlasenko et al., 2010 (332); Liu et al., 2011 (807); Chikamatsu et al., 1999 (808); Bauer et al., 2006 (487); Li et al., 2012 (809); Karnchanatat et al., 2008 (810); Shi et al., 2013 (811); Yoon et al., 2007 (812); Yoon et al., 2008 (813); de Almeida et al., 2013 (814); Cohen et al., 2005 (815); Kim et al., 2012 (816); Takashima et al., 1997 (817); Fujino et al., 1998 (818); Sun et al., 2011 (819); Qiu et al., 2000 (820).

Comment (GtCel5A): suggested to be a processive EG due to generation of cellobiose from Avicel.


Table 11. continued

Columns: organism; expression host; enzyme; GH5 subfamily; pH opt; temp opt (°C); substrate for opt; substrate specificity; comments; ref.

Organisms: Orpinomyces joyonii; Penicillium brasilianum; Penicillium canescens; Penicillium decumbens; Penicillium echinulatum; Penicillium janthinellum; Penicillium pinophilum KMJ601; Phanerochaete chrysosporium; Phialophora sp. G5; Piromyces equi; Piromyces rhizinflata; Talaromyces emersonii CBS 814.70; Thermoascus aurantiacus; Thermoascus aurantiacus IFO9748; Thielavia terrestris; Trametes hirsuta; Trichoderma sp. C-4; Trichoderma reesei; Trichoderma viride; Volvariella volvacea.

References: Qiu et al., 2000 (820); Krogh et al., 2009 (821); Chulkin et al., 2009 (822); Wei et al., 2010 (509); Liu et al., 2013 (823); Rubini et al., 2009 (824); Mernitz et al., 1996 (825); Jeya et al., 2010 (826); Uzcategui et al., 1991 (827); Zhao et al., 2012 (828); Eberhardt et al., 2000 (829); Liu et al., 2001 (776); Tsai et al., 2003 (724); Murray et al., 2001 (830); Parry et al., 2002 (780); Hong et al., 2003 (831); Vlasenko et al., 2010 (332); Nozaki et al., 2007 (832); Sul et al., 2004 (833); Qin et al., 2008 (324); Huang et al., 2009 (834); Ding et al., 2001 (835).

Comment (PdCel5A): suggested to be possibly processive.


to determine pH activity profiles. HiCel5A exhibited one of the broadest pH activity profiles out of the seven H. insolens cellulases examined, maintaining greater than 60% activity over a pH range 5.5−9. At the optimal pH of 7.5, HiCel5A exhibits a turnover rate of 8 s−1. Schülein also confirmed HiCel5A acts via the retaining stereochemical mechanism in this study.499 HiCel5A shares nearly 50% identity with TrCel5A, which is a striking degree of similarity given the relatively low homology of GH5s in general. Both of these fungal cellulases belong to subfamily 5 and exhibit an N-terminal CBM. Schülein was the first to describe the existence of the HiCel5A CBM explicitly in the literature, though the sequence had been annotated as such.499,717 Schülein also concludes that on the basis of the sequence homology, HiCel5A very likely folds according to the canonical (β/α)8 motif of GH5s. The relative effectiveness of CMC hydrolysis by a handful of industrially relevant EGs was more recently examined.806 Karlsson et al. concluded that HiCel5A was the most active hydrolyzer of the family 5, 7, 12, and 45 EGs examined alongside TrCel7B.806 The authors also showed that HiCel5A was one of the least inhibited enzymes in terms of accepting hydroxyl substitutions on the CMC substrate. This suggests HiCel5A may be a bit more forgiving in acceptance of nonideal substrates, which is advantageous for cellulose deconstruction but not for molecular probe applications.

8.2.4. Other Fungal GH5s. Vlasenko et al. 
recently undertook a broad study of EG substrate specificity to identify family-dependent specificities.332 The specificity of five fungal GH5 cellulases (Aspergillus aculeatus Cel5B, Basidiomycete CBS494.95 Cel5A, Basidiomycete CBS495.95 Cel5B, Myceliophthora thermophila Cel5A, and Thielavia terrestris Cel5A) against a variety of substrates was examined, including p-nitrophenyl substrates, CMC, PASC, Avicel, BC, pretreated corn stover, xyloglucan, wheat arabinoxylan, β-1,4-mannan, and galactomannan. Unsurprisingly, family 5 EGs appeared to exhibit a greater preference for PASC than for the other polymeric cellulose substrates. The EGs were also incapable of cleaving p-nitrophenyl substrates, potentially because binding of the small substrates to the long clefts (seven putative subsites) is too weak to induce activity. In comparing CMC and PASC activity levels across all the EGs, Vlasenko et al. suggested that the GH5 EGs appear to be more accommodating of O-substitution by methyl groups and rely less on hydrogen bonding of the substrate for hydrolysis in comparison to family 6, 7, 9, 12, and 45 EGs, a finding that lines up well with those of Karlsson et al.806 All of the GH5 EGs known to have cellulolytic activity were also remarkably effective at mannan degradation. It has been hypothesized that the GH5 family has resulted from divergence of a common ancestral gene,730,731 and this comprehensive specificity study provides evidence pointing toward divergent evolution where the ancestral gene exhibited mannanase activity. Many additional characterizations of optimum pH and temperature as well as substrate specificity studies accompanying gene cloning and sequencing efforts have been conducted. These results are summarized in Table 11.

enhancements have been attained through a variety of approaches including host engineering, heterologous expression, and site mutagenesis efforts. For the most part, efforts associated with GH5 protein engineering have targeted thermal stability improvements in industrially relevant cellulases. A wealth of data surrounding thermophilic and hyperthermophilic GH5 family members exists by which analogies can be made. To date, both rational design and directed evolution techniques have demonstrated moderate success.752,777,778,836,837 Recently, Liu et al. used directed evolution on a Clostridium phytofermentans GH5 to improve the half-life by 92% at 60 °C.778 The gain was made through identifying a triple point mutation, though the means by which this cluster contributed to stability was not described. As part of this same study, the authors also investigated effects of CBMs on GH5 activity noting that removal of native CBMs from the bacterial EG clearly had a detrimental effect on activity toward both regenerated cellulose and Avicel.778 However, chimerical CBM and Ig domain engineering frequently affected activity negatively as well. A similar approach toward engineering an uncultured bacterial GH5 EG resulted in a 7-fold increase in thermal stability with the final enzyme having six mutations and an appended non-native family 6 CBM.836 This latter CBM-variant GH5 EG exhibited improved activity toward Avicel. On the basis of this small sample of directed evolution/chimera engineering studies, one can conclude that directed evolution may be a helpful first approach toward uncovering locations contributing to thermal stability. However, the appending of non-native CBMs to the variant does not always result in hydrolytic gains toward increasingly crystalline cellulose substrates. Strategic implementation of disulfide bonds has also been shown to result in thermal stability gains in bacterial GH5 cellulases. Badieyan et al. 
performed MD simulations on 12 GH5 cellulases to develop a rules-based approach to engineering stability.752 Correlations with several protein structure dynamic factors were developed to identify which were more likely predictors of overall thermal stability. The authors confirmed that a negative correlation between protein flexibility and optimum activity temperature existed within the subset of GH5s (Figure 61). Using this information, Badieyan et al. identified a particularly thermally susceptible region of C. thermocellum CenC. This happened to be the large subdomain insertion between the α6 helix and β6 strand shown in Figure 53. A new disulfide bond was introduced in this location with the intent of stabilizing the protein without affecting activity. Assays of the wild-type and disulfide-linked mutant with pNPC suggested the disulfide bridge consistently resulted in improved activities at elevated temperatures. The Tm of C. thermocellum CenC was improved by approximately 4 °C with the additional disulfide bond. As has been noted previously in the literature,675 the mutation actually had little effect on overall protein fluctuations, but rather reduced flexibility was sequestered to the subdomain region. This latter observation likely accounts for the maintained activity with the addition of the disulfide bond. Hyperglycosylation of fungal GH5 cellulases by heterologous expression in yeast has been shown to positively impact thermal stability. In two separate studies, TrCel5A was heterologously expressed in S. cerevisiae and P. pastoris.779,838 In each case, the half-life of TrCel5A at 70 °C was at least doubled as a result of the hyperglycosylation. Both Qin et al. and Samanta et al. note that hyperglycosylation did not detrimentally affect specific activity on CMC, as is sometimes a concern when non-native glycans block substrate binding.779,838 This approach to

8.3. Protein Engineering

As with the catalytic mechanism, many of the noted successes in engineering GH5 cellulases pertain to bacterial GH5 EGs. Briefly discussing some of the more pertinent advances from bacterial representatives, we focus on published approaches to engineering specificity, activity, and stability in GH5s. These

DOI: 10.1021/cr500351c Chem. Rev. 2015, 115, 1308−1448

Chemical Reviews

Review

hydrolyzed. Mutation of most of the identified residues abolished activity on both CMC and cellopentaose. However, the authors noted that mutating one of the conserved aspartate residues to alanine abolished hydrolysis of cellopentaose while maintaining activity on CMC. A follow-up assay indicated the D232A M. phaseolina Egl1 variant was capable of hydrolyzing cellohexaose. The results suggest that substrate specificity in GH5 cellulases can be modified through a single point mutation, potentially in the same fashion across the family, as the mutated aspartate is a conserved residue. Nevertheless, the benefit of increasing minimum EG substrate size with respect to industrial applications is unclear. Perhaps one of the more well-known instances of GH5 engineering is the significant activity improvement from a single active site mutation in A. cellulolyticus Cel5A.845 Baker et al. mutated a tyrosine implicated in product inhibition (Y245G) to alleviate stacking interactions slowing the rate at which the cellobiose product leaves the active site. Activity improved by 40% over wild-type as a result of greatly weakened binding of cellobiose in the active site, reflected in a 1480% increase in the inhibition constant. Unfortunately, given that this residue is not conserved either in type or similarity, it is difficult to directly define a lesson that may be applied in engineering fungal GH5s. In fact, TaCel5A already displays a glycine in the same location, suggesting product inhibition in TaCel5A may have been addressed through natural evolutionary changes. TrCel5A displays an asparagine and may similarly be unaffected by product inhibition at this site. Finally, heterologous hosts capable of rapidly producing GH5s demonstrating near native activity are particularly useful for screening enzymes in the search for novel activities and properties. Nakazawa et al. describe successful recombinant expression of the key T. 
reesei EGs, Cel7B, Cel5A, and Cel12A.846 The authors report the first description of a prokaryotic host (E. coli) expressing all three EGs from T. reesei. The authors characterize the activity, stability, and specificity of each of the recombinant EGs, as they believe demonstration of properties similar to those of the natively expressed EGs represents a potentially advantageous opportunity to use the E. coli host in high-throughput mutagenesis studies. The recombinant TrCel5A CD, the only GH5 of the three EGs, behaved remarkably similarly to TrCel5A expressed in T. reesei. The TrCel5A CD, targeting only β-1,4 glycosidic linkages in model substrates, maintained a broad pH range of 4.5−6, with an optimum pH of 5.5 at 50 °C. The enzyme also maintained stability at 50 °C for 60 min, though activity dropped by 60% with a 10 °C increase. In general, this study offers a promising alternative host for high-throughput expression of TrCel5A mutants.
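The product-inhibition argument made for the Y245G variant can be put in numbers with a small calculation. The sketch below assumes simple competitive inhibition by cellobiose, which the text does not state, and reads the quoted 1480% figure as a 1480% increase in the inhibition constant (about 15.8-fold). All kinetic parameter values are hypothetical and chosen only for illustration; none are taken from Baker et al.

```python
def competitive_rate(s, p, vmax, km, ki):
    """Michaelis-Menten rate with competitive inhibition by product p."""
    return vmax * s / (km * (1.0 + p / ki) + s)


def fold_from_percent_increase(percent):
    """Convert a percent increase into a fold change (100% -> 2-fold)."""
    return 1.0 + percent / 100.0


# Hypothetical parameters, for illustration only.
VMAX = 1.0    # arbitrary rate units
KM = 1.0      # mM, substrate
KI_WT = 1.0   # mM, cellobiose inhibition constant assumed for the wild type
KI_FOLD = fold_from_percent_increase(1480.0)  # assumed reading of the 1480% figure
KI_MUT = KI_WT * KI_FOLD

if __name__ == "__main__":
    substrate, cellobiose = 5.0, 10.0  # mM, illustrative concentrations
    v_wt = competitive_rate(substrate, cellobiose, VMAX, KM, KI_WT)
    v_mut = competitive_rate(substrate, cellobiose, VMAX, KM, KI_MUT)
    print(f"K_i fold change: {KI_FOLD:.1f}")
    print(f"rate ratio, variant / wild type: {v_mut / v_wt:.2f}")
```

In this model the two enzymes give identical rates when no product is present, and the advantage of the weaker-binding variant grows as cellobiose accumulates, which is consistent with a modest overall activity gain despite a large change in the inhibition constant.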

Figure 61. Using MD simulation of a large set of GH5 EGs from hyperthermophilic, thermophilic, mesophilic, and psychrophilic organisms, Badieyan et al. illustrated a correlation between protein flexibility, as measured by root-mean-square deviation (RMSD), and the optimal activity temperature (OAT) of the enzyme. The authors performed 10 ns MD simulations of each enzyme at three different temperatures. The correlation appears to be linear. Reprinted with permission from ref 752. Copyright 2011 Wiley Periodicals, Inc.
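The kind of linear relationship described in the Figure 61 caption can be checked with an ordinary least-squares fit. The following is a minimal sketch: the flexibility and OAT values are invented placeholders, not data from Badieyan et al., so the trend they show carries no meaning, and the function name `linear_fit` is ours.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, plus Pearson r."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


# Placeholder values only: optimal activity temperature (deg C) per enzyme and
# a simulated flexibility metric (Angstrom) for the same enzymes.
oat = [20.0, 37.0, 50.0, 65.0, 80.0, 95.0]
flexibility = [1.9, 1.7, 1.5, 1.4, 1.2, 1.0]

if __name__ == "__main__":
    slope, intercept, r = linear_fit(oat, flexibility)
    print(f"slope = {slope:.4f} A per deg C, intercept = {intercept:.2f} A, r = {r:.3f}")
```

With real simulation output, one flexibility value per enzyme (for example, an average over residues and trajectory frames) would replace the placeholder list.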

improved thermal stability is inherently dependent on the location of readily glycosylated serine and threonine residues at the surface. Thus, it is unlikely that this is a universal approach to engineering thermal stability in fungal GH5 cellulases. Individual enzymes will have to be characterized to ensure specific activity is not adversely affected in each case, and the extent to which hyperglycosylation aids in stability may vary. Stability of TrCel5A at extreme pH values is also a motivator of several protein engineering studies, the alteration of which would enable harsher alkaline biomass pretreatment conditions, a pretreatment option under intense development currently.839−842 Wang et al. describe the use of directed evolution to increase the pH optimum of TrCel5A.843 The approach identified a surface-exposed asparagine, Asn321, that had been substituted by a threonine, shifting the optimal pH of TrCel5A from 4.8 to 5.4. Further site-directed mutagenesis at this site, N321H, broadened the optimal range to pH 6.0. The authors conclude this conserved residue is a key determinant of TrCel5A pH stability. Thus, this mechanism of improved alkaline resistance may similarly apply to other homologous GH5s. Qin et al. later applied the same approach to identify additional residues contributing to the specific pH profile of TrCel5A. In this latter study, a series of charged, surface-exposed residues were again shown to contribute to the pH optimum.324 Qin et al. tested several single point asparagine variants and three cluster variants, one of which interestingly included a substrate binding aromatic residue. The alkaline resistivity was improved to a maximum pH 6.2. The key takeaway message from both of these studies appears to be that charged residues at the surface of TrCel5A significantly affect pH stability. 
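A common textbook way to reason about a shifted pH optimum is the bell-shaped activity profile of an enzyme whose active form needs one ionizable group deprotonated and another protonated. The sketch below is generic and is not the analysis used by Wang et al. or Qin et al.; the pKa values are hypothetical, chosen only so that the modeled optimum moves from 4.8 to 5.4 as in the values quoted above. It describes the activity optimum only and says nothing about stability.

```python
def relative_activity(ph, pka_acidic, pka_basic):
    """Fraction of enzyme in the active protonation state at a given pH.

    Assumes two ionizable groups: the one with pka_acidic must be
    deprotonated and the one with pka_basic must be protonated.
    """
    return 1.0 / (1.0 + 10.0 ** (pka_acidic - ph) + 10.0 ** (ph - pka_basic))


def ph_optimum(pka_acidic, pka_basic):
    """The profile above peaks midway between the two pKa values."""
    return 0.5 * (pka_acidic + pka_basic)


# Hypothetical pKa pairs (acidic limb, basic limb); not measured values.
WILD_TYPE = (3.3, 6.3)  # modeled optimum near pH 4.8
VARIANT = (3.3, 7.5)    # modeled optimum near pH 5.4

if __name__ == "__main__":
    for label, pkas in (("wild type", WILD_TYPE), ("variant", VARIANT)):
        profile = ", ".join(
            f"pH {ph:.1f}: {relative_activity(ph, *pkas):.2f}"
            for ph in (4.0, 5.0, 6.0, 7.0)
        )
        print(f"{label} (optimum {ph_optimum(*pkas):.1f}) -> {profile}")
```

In this picture, raising the pKa of the group on the basic limb both moves the optimum upward and broadens the plateau, which is one simple way a change in surface charge could alter the profile.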
Notably, the gains in TrCel5A pH stability at increasingly higher pH do not yet reach alkaline levels, and significant surface residue remodeling is likely necessary to achieve such a goal. Site-directed mutagenesis has also been successfully implemented to alter substrate specificity and activity in fungal GH5s. After examining conserved residues in several GH5 sequences, Wang et al. identified several key residues putatively involved with substrate binding and catalysis in Macrophomina phaseolina Egl1.844 The wild-type enzyme was capable of hydrolyzing both CMC and cellopentaose, the shortest cello-oligomer able to be

8.4. Conclusions

GH5 enzymes represent one of the larger and more diverse groups of GHs with over 20 different hydrolytic modes of action and at least 51 different phylogenetic subfamilies. Though relatively few, the fungal GH5 cellulases have been well-characterized; the industrial relevance of these cellulases is paramount as this family contributes over 50% of fungal EG action in the T. reesei secretome.742 Structural and biochemical characterization efforts over the years have successfully elucidated the following features of GH5 cellulases: (1) All of the fungal GH5 cellulases discovered to date are EGs.


Table 12. Reported GH12 Crystal Structures

| source and original name in primary citation | PDB code | resolution (Å) | brief highlights | ref |
| --- | --- | --- | --- | --- |
| Archaeal Structure | | | | |
| Pyrococcus furiosus DSM3638 Cel12A | 3VGI | 1.07 | first archaea GH12 structure | 874 |
| Bacterial Structures | | | | |
| Bacillus licheniformis DSM13 BlXG12 | 2JEM | 1.78 | apo structure | 862 |
| Bacillus licheniformis DSM13 BlXG12 | 2JEN | 1.40 | ligand complex structure | 862 |
| Rhodothermus marinus ITI378 Cel12A | 1HOB | 1.80 | apo structure, first structure of a thermostable GH12 enzyme | 875 |
| Rhodothermus marinus ITI378 Cel12A | 2BW8 | 1.54 | apo structure | 884 |
| Rhodothermus marinus ITI378 Cel12A | 2BWA | 1.68 | bound cellotriose | 884 |
| Rhodothermus marinus ITI378 Cel12A | 2BWC | 2.15 | bound cellotetraose | 884 |
| Rhodothermus marinus ITI378 Cel12A | 3B7M | 2.10 | thermostable variant enzyme | 884 |
| Streptomyces lividans 1376 CelB | 1NLR | 1.75 | first GH12 structure | 871 |
| Streptomyces lividans 1376 CelB | 2NLR | 1.20 | first GH12 ligand complex structure | 872 |
| Streptomyces sp. 11AG8 Cel12A | 1OA4 | 1.50 | structure of a thermophilic GH12 enzyme | 876 |
| Thermotoga maritima MSB8 Cel12A | 3AMH | 2.09 | wild-type apo structure | 877 |
| Thermotoga maritima MSB8 Cel12A | 3AMM | 1.98 | wild-type soaked with cellotetraose | 877 |
| Thermotoga maritima MSB8 Cel12A | 3AMN | 1.47 | E134C variant soaked with cellobiose | 877 |
| Thermotoga maritima MSB8 Cel12A | 3AMP | 1.78 | E134C variant soaked with cellotetraose | 877 |
| Thermotoga maritima MSB8 Cel12A | 3AMQ | 1.80 | E134C variant cocrystallized with cellobiose | 877 |
| Thermotoga maritima MSB8 Cel12A | 3O7O | 2.41 | | 891 |
| Thermotoga maritima MSB8 Cel12A | 3VHN | 2.50 | bound cellobiose ligand | 887 |
| Thermotoga maritima MSB8 Cel12A | 3VHO | 1.93 | Y61GG insertion | 887 |
| Fungal Structures | | | | |
| Aspergillus aculeatus KSM510 Xeg1 | 3VL8 | 1.90 | | 878 |
| Aspergillus aculeatus KSM510 Xeg1 | 3VL9 | 1.20 | bound xyloglucan | 878 |
| Aspergillus niger CBS 120.49 EglA | 1KS4 | 2.50 | palladium complex | 879 |
| Aspergillus niger CBS 120.49 EglA | 1KS5 | 2.10 | | 879 |
| Humicola grisea ATCC 22081 Cel12A | 1OLR | 1.20 | apo structure | 880 |
| Humicola grisea ATCC 22081 Cel12A | 1UU4 | 1.49 | cellobiose ligand | 208 |
| Humicola grisea ATCC 22081 Cel12A | 1UU5 | 1.70 | cellotetraose ligand with mixed linkages binding from −4 to −1 subsites | 208 |
| Humicola grisea ATCC 22081 Cel12A | 1UU6 | 1.40 | cellotetraose binding from −2 to +2 subsites by cellopentaose soak | 208 |
| Humicola grisea ATCC 22081 Cel12A | 1W2U | 1.52 | thio-oligosaccharide ligand spanning active site | 208 |
| Hypocrea schweinitzii ATCC 66965 Cel12A | 1OA3 | 1.70 | | 876 |
| Trichoderma harzianum IOC-3844 Cel12A | 4H7M | 2.07 | | 881 |
| Trichoderma reesei QM9414 Cel12A | 1H8V | 1.9 | first fungal GH12 structure | 567 |
| Trichoderma reesei QM9414 Cel12A | 1OA2 | 1.5 | A35V variant | 876 |
| Trichoderma reesei QM9414 Cel12A | 1OLQ | 1.7 | P210C variant | 876 |

(2) GH5 cellulases exhibit the ubiquitous TIM barrel fold. The generality of this structure allows for significant variation of the loops connecting the major structural elements and likely accounts for the wide variety of observed substrate specificities, thermal stabilities, and pH optima. (3) Relatively few GH5 fungal structures have been determined (only 4 thus far), and much of what is known regarding mechanistic function has been determined from bacterial representatives. (4) The catalytic active site of GH5 enzymes is defined by two sequence motifs, an Asn-Glu-Pro motif on the β4 strand that includes the catalytic acid and a Glu-Xxx-Gly motif on the β7 strand (where Xxx is typically an aromatic residue) that contains the complementary Glu nucleophile. The location of the catalytic acid and base places GH5 cellulases in the 4/7 superfamily of enzymes. (5) Hydrolysis of the glycosidic linkage occurs through a two-step, retaining mechanism. The glycosylation step produces a glycosyl-enzyme intermediate resulting in glycosidic bond cleavage. A water molecule then conducts nucleophilic attack during the deglycosylation step, which resets the enzyme to its initial catalytic state. Studies have

not yet convincingly determined the rate-limiting step in GH5 catalysis. (6) Perhaps as a result of the significant sequence diversity within the family, GH5 cellulases respond particularly well to protein engineering efforts. Significant gains in thermal stability have been made through the introduction of disulfide bonds and hyperglycosylation with little adverse impact to activity. (7) GH5 EGs naturally occur both with and without appended CBMs. The appendage of this module appears to be evolutionarily related to the environmental conditions (i.e., at high-substrate loading conditions, GH5s do not require CBMs for effective cellulolytic conversion). Bacterial GH5s have been instrumental in developing our understanding of the catalytic mechanisms of cellulases within this family, and a general understanding of the role these enzymes play in industrial biomass conversion is clear. Nevertheless, the role the constellation of loops connecting the major TIM barrel structural elements plays in determining specificity and thermal stability remains an outstanding question. Given the diversity of specificities as well as the range of thermal and pH tolerances within this family of


4.9 5

Cel12A

5.5

5.5

4.5

Cel3A

Cel12A

Cel12A

Cel12A

Cel12A

Aspergillus niger

Aspergillus niger

Cel12A

Aspergillus niger

Gloeophyllum trabeum ATCC 11539 Humicola grisea ATCC 22081

3.5−4.0

50

52

50

45−55

80

CMC, Avicel, PASC, Glc3/Glc4/Glc5/Glc6 CMC, PASC, Avicel, Glc4/Glc5, barley β-glucan, glucomannan, filter paper

β-glucan

PASC, Avicel, BC, pretreated corn stover, CMC, xyloglucan, xylan, arabinoxylan CMC, amorphous cellulose, xylan, mannan, filter paper

azurine cross-linked hydroxyethyl cellulose, oNPC

azurine cross-linked hydroxyethyl cellulose, oNPC

CMC, Lichenan, barley β-glucan, Glc6, pachyman, laminarin, xyloglucan, glucomannan CMC, PASC, filter paper, Glc2/Glc3/Glc4/ Glc5/Glc6

CMC

oNPC

oNPC

CMC

CMC

o-nitrophenyl-β-D-cellobioside (oNPC)

CMC

CMC

Karlsson et al., 2002334

Ishihara et al., 2005899

assay conditions pH 5 and 37 °C Henriksson et al., 1999898

retained >50% activity at pH 7, and 25% at pH 4; at pH 5, retained 90% activity at 60 °C, and 90% activity pH 5−7.7 Ascomycota

Zygomycota; retained >80% activity after 1 h at 80 °C, and >50% for 4 h at 70 °C Ascomycota; retains 12% activity at 80 °C

Moriya et al. 2003926 Koga et al., 2008927 Wonganu et al., 2008928 Shulein et al, 2005,922 Vlasenko et al., 2010332

Zygomycota

Murashima et al. 2002925

Eberhardt et al., 2000829

CMC, PASC, barley β-glucan, lichenin

Ascomycota Zygomycota

Murashima et al. 2002925

Zhao et al., 2012723 Shimonaka et al., 2004924

CMC-Na, barley β-glucan, Avicel, filter paper CMC

CMC, Glc33, Glc4/Glc5/Glc6

Igarashi et al, 2008908

PASC, CMC, lichenan, barley β-glucan, glucomannan

Ascomycota; subfamily B; 90% relative activity retained at 70 °C and 30% at 80 °C Basidiomycota; subfamily C

max activity at 70 °C, though dropped to 10% after 20 min; stable for at least 1 h at 65 °C Zygomycota

Liu et al., 2010

921

Baba et al., 2005906

Zygomycota

Ascomycota; retains 22% activity at 70 °C Ascomycota; retained near 50% activity at pH 10; retained >60% activity at 70 °C Zygomycota

Shulein et al, 2005,922 Vlasenko et al., 2010332 Miettinen-Oinonen et al., 2004,502 Szijártó et al., 2008503 Baba et al., 2005906

Dalboege et al., 2007,923 Vlasenko et al., 2010332

Takashima et al., 2007915

Takashima et al., 2007915

Basidiomycota; retains 25% activity at 70 °C Ascomycota; retains 40% activity at 80 °C ascomycota; retained >75% activity at 80 °C for 10 min; stable at pH 3−12 at 4 ° C for 20 h Ascomycota; Retained >75% activity at 80 °C for 10 min; stable at pH 3−12 at 4 °C for 20 h Ascomycota; retains 83% activity at 80 °C

Shulein et al., 2005,922 Vlasenko et al., 2010332 Vlasenko et al., 2010332

comments Ascomycota

ref Emalfrab et al., 2003715

konjac glucomannan, PASC, CMC-Na, barley β-glucan, xylan, Avicel

CMC, Avicel

CMC, Avicel

CMC, PASC, Avicel, BC, pretreated corn stover, xyloglucan, xylan, arabinoxylan, mannan, galactomannan, azurine crosslinked hydroxyethylcellulose CMC, PASC, Avicel, BC, pretreated corn stover, xyloglucan, xylan, arabinoxylan, mannan, galactomannan CMC; hydroxyethylcellulose

CMC, Avicel

PASC, Avicel, BC, pretreated corn stover, CMC, xyloglucan, xylan, arabinoxylan, mannan, galactomannan PASC, Avicel, BC, pretreated corn stover, CMC, xyloglucan, xylan, arabinoxylan, mannan, galactomannan CMC, Avicel

CMC, filter paper, β-glucan, Avicel, xylan


Ascomycota; retains 12% activity at 80 °C a

Cel45A Volutella colletotrichoides

All GH45 enzymes belong to subfamily A unless otherwise noted in the comments.

CMC, PASC, Avicel, BC, pretreated corn stover, xyloglucan, xylan, arabinoxylan, mannan, galactomannan

Egl1 Ustilago maydis

comments

catalytic center. There is no obvious counterpart to the catalytic base of GH45 subfamily A and B, but the catalytic acid and surrounding residues of GH45s are conserved in the ThrXxxTyr/Phe and HisXxxAsp motifs in the β1 and β5 strands (Thr6, Tyr8, His119, and Asp121 in HiCel45A; Figure 69), both in expansins and swollenins and also in other classes of proteins recently discovered in fungi as discussed below. The conservation of the catalytic acid and its surroundings suggests that the machinery for protonation of a glycosidic oxygen may play an important role in their mechanism of action. Expansins were first discovered and characterized by Cosgrove and co-workers as proteins responsible for cell wall extension.931 In one of the original functional studies of expansins, McQueen-Mason and Cosgrove demonstrated that these proteins were able to mechanically weaken filter paper.931 Soon after, the same authors showed that the addition of hemicellulose greatly improved the binding of expansins to crystalline cellulose, suggesting that the proteins might target cellulose and hemicellulose simultaneously.932 Many studies have followed these original reports related to the potential mechanisms of expansins, including structure and function studies,933−938 hints regarding the molecular mechanism of expansin action in plant cell wall expansion,930,939−942 and their ability to synergize with GHs in cellulose or biomass hydrolysis. 
We note that observations of synergy with cellulases have been significantly mixed, with studies ranging from little to no synergy to those suggesting substrate disruption, the latter often in the absence of quantitative results or with only minor increases in synergy when substrate conversion is quantified.943−947 More recently, expansins have been shown to synergize more effectively with xylanases.948 Multiple families of expansins have also been characterized on the basis of available sequence data with the primary expansins studied to date termed α- and β-expansins.308,949−951 The nomenclature of expansins has been described by Kende et al.952 Expansins are vital to and ubiquitous in plants. Plant genomes typically contain on the order of 30 or more expansin genes, the majority of which are α-expansins. Expansin-like proteins are also found in prokaryotes and various nonplant eukaryotes suggesting that they are likely important for alternate functions in nonplant organisms, such as in root colonization.953−955 The protein structures of three expansins are currently available: two group-1 pollen allergen plant β-expansins from Phleum pratense (1N10, Major Timothy grass pollen allergen Phl P1; Fedorov, Ball, Leistler, Valenta, Almo, unpublished) and Zea mays (ZmEXPB1; 2HCZ),938 and the bacterial EXLX1 from Bacillus subtilis (3D30).937 The structures and sequence comparisons reveal a canonical two-domain fold of expansins. An N-terminal double-psi β-barrel domain of around 110−120 residues, homologous to the GH45 catalytic module, is packed against a C-terminal Ig-like β-sandwich domain of ∼100 residues; the domains are aligned to form a ∼60 Å long putative polysaccharide-binding surface (Figure 73). 
Recent structures of the bacterial EXLX1 in complex with β-1,3/1,4-glucan and cellulose oligosaccharides show a new type of ligand-mediated dimerization, where the oligosaccharide is sandwiched between the Ig-like domains of two EXLX1 proteins in opposite polarity.935 Both Yennawar et al.938 and Kerff et al.937 provide detailed analyses and discussions of structure and sequence similarities and differences between expansins, GH45s, and other related proteins. Among GH45 structures, MeCel45 is most closely related to expansins,937 and GH45 subfamily B shows higher sequence similarity with expansins than subfamily

Schauwecker et al, 1995929 Shulein et al, 2005,922 Vlasenko et al., 2010332

Karlsson et al., 2002334

ref substrate specificity

CMC, PASC, Avicel, Glc3/Glc4/Glc5, barley β-glucan, glucomannan, filter paper CMC β-glucan 60

substrate for opt pH opt

5 Cel45A

organism

expression host

enzyme


Trichoderma reesei

temp opt (° C)

Table 15. continued

Ascomycota; retained 70% of activity at pH 5 and 40−70 °C; subfamily B Basidiomycota


Figure 73. GH45 subfamily B is more similar to expansins than subfamily A. (A) Surface representation of the structure of HiCel45A (4ENG) in subfamily A, with catalytic residues Asp10 and Asp121 in red and Trp18 at subsite −4 in green. (B) Surface of MeCel45A (1WC2) in subfamily B with corresponding residues highlighted, Asp24, Asp132, and Trp64. (C) Surface of β-expansin EXPB1 from Zea mays (2HCZ), with residues Asp107 and Trp194 highlighted. (D) Superposition of MeCel45A (1WC2; green) with Asp24, Asp132, and Trp64 shown, and expansin EXPB1 (2HCZ; purple) with residues Asp107 and Trp194 shown. (E) Close-up of the active site of HiCel45A (blue stick), MeCel45A (green stick), and ZmEXPB1 (magenta stick) showing the tryptophan residue at subsite −4 (Trp18/64/194), the proposed catalytic base (Asp10/24), and the catalytic acid (Asp121/132/107) surrounded by conserved tyrosine, threonine, and histidine residues. The two cellotriose ligands in the 4ENG structure (white carbon atoms) are included in panels D and E.

of the GH45 domain, in particular, β1/β2 and β4/β5 (Figure 69), indicate that the putative sugar-binding surface may be more enclosed in swollenins, or that they may form an additional domain in analogy with EcMltA discussed below. So far, swollenins have only been found in a limited number of ascomycete fungi (11 species in UNIPROT). In addition to TrSWO1,956 disruptive effects on cellulose and increased adsorption and enhanced activity of cellulases have also been demonstrated for swollenins from Neosartorya fumigata (formerly Aspergillus fumigatus),957 Trichoderma asperellum,958 T. pseudokoningii,959 and Penicillium oxalicum.946 Interestingly, Wang et al. could produce bioactive T. asperellum SWO recombinantly in E. coli by using a cellulose-assisted refolding and purification process.958 They also devised a method to quantitatively measure the disruption activity of the swollenin, by using a bacterial GH5 EG with negligible activity on untreated crystalline cellulose. More recently the effects of TrSWO1 on pretreated corn stover were investigated.960 The swollenin primarily disrupted the hemicellulose fraction and

A. The catalytic center motifs ThrXxxTyr and HisXxxAsp are conserved, but some loops are shorter in the expansins making the putative sugar-binding surface even more shallow and nearly flat, similar to LPMOs as will be discussed in section 11. This may indicate that expansins act in very close proximity to an insoluble substrate surface. The name “swollenin” was coined by Saloheimo, Penttilä, and co-workers at VTT, Finland, upon discovery of a new protein, SWO1, from T. reesei that shows structure disruption effects on cotton fibers and filter paper.309 The expression of TrSWO1 was found to be highly upregulated under cellulase-inducing conditions pointing to an important role in biomass degradation. No structure of any swollenin is known to date, but sequence similarities indicate that the C-terminal half of the protein is homologous to expansins with one GH45-like and one Ig-like domain. In addition, the 475-residue TrSWO1 protein contains an N-terminal family 1 CBM followed by a typical Ser/Thr-rich linker, and a region/domain of unknown structure and function. Several insertions between the β-strands


Asp308 in EcMltA resulted in complete loss of activity.963 Interestingly, though, the mechanism of EcMltA differs from that of GH45s. The EcMltA enzyme cleaves β-1,4-glycosidic bonds in peptidoglycan with a nonhydrolytic mechanism where the 6-hydroxyl is joined to the anomeric carbon to form nonreducing 1,6-anhydro-muropeptides. These lytic transglycosylases thus offer an example of the utilization of the same constellation of amino acids in a nonhydrolytic mechanism, although analogous reaction products have not been demonstrated in GH45s, expansins, or swollenins. Inspired by the alternate mechanism of EcMltA, the apparent lack of a catalytic base, and the notion of a surprisingly long distance over the −1 subsite in HiCel45A,904 we speculate that, at least in some cases, a mechanism might be employed that utilizes strain in the polysaccharide concomitantly with protonation of the glycosidic oxygen as a means for glycosidic bond cleavage. Perhaps this may be combined with a nucleophilic attack on the anomeric carbon by a hydroxyl group on another polysaccharide resulting in lytic transglycosylation with inversion of the anomeric configuration and without formation of new reducing ends. Such a mechanism would make sense for expansins in particular, since the activity would be targeted to polysaccharides under tension and would allow for rearrangements in the polysaccharide network without compromising the integrity of the cell wall. Finally, we conclude that conservation of the catalytic acid and its environment seems to be a recurring theme, suggesting that protonation and glycosidic bond cleavage are involved also in the GH45-related proteins. However, the molecular mechanism for GH45 subfamily C, expansins, swollenin, loosenin, and other related proteins remains to be elucidated.
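The catalytic-acid motifs discussed in this section (ThrXxxTyr/Phe and HisXxxAsp) are simple enough to locate with regular expressions. The sketch below is illustrative only: the sequence is a made-up string, not a real GH45, expansin, or MltA sequence, and the function name `find_motifs` is ours. Because such short patterns also occur by chance, a real analysis would rely on a structure-based alignment such as the one in Figure 69 rather than on raw pattern hits.

```python
import re

# One-letter patterns for the motifs named in the text; "." matches any residue.
MOTIFS = {
    "ThrXxxTyr/Phe": r"T.[YF]",
    "HisXxxAsp": r"H.D",
}


def find_motifs(sequence):
    """Return {motif name: [(1-based start, matched residues), ...]}.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    seq = sequence.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        lookahead = re.compile(f"(?=({pattern}))")
        hits[name] = [(m.start() + 1, m.group(1)) for m in lookahead.finditer(seq)]
    return hits


# Made-up sequence for demonstration; not taken from any database entry.
EXAMPLE = "MKATGYWSAGLCTFYPNGHFDLSQ"

if __name__ == "__main__":
    for name, found in find_motifs(EXAMPLE).items():
        print(name, found)
```

A strict HisXxxAsp pattern would miss a variant such as the AspLeuAsp replacement described for BaLOOS1, so patterns like these are best treated as a first-pass filter.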

released substantial amounts of oligomeric and monomeric sugar. With respect to cellulose hydrolysis, synergism was small with the CBH TrCel7A or the EG TrCel5A, whereas pronounced synergism in terms of xylose yield was observed with TrCel5A as well as the xylanases TrXyn10A or TrXyn11A, the latter releasing about 3 times more xylose when combined with TrSWO1. A new type of GH45-related fungal protein, “loosenin” from the basidiomycete B. adusta (BaLOOS1), was discovered and characterized recently.961 BaLOOS1 binds strongly to both cellulose and chitin, affects the morphology of cotton fibers, and enhances the activity of cellulases, but did not exhibit detectable hydrolytic activity when acting alone. Fibers of Agave tequilana bagasse, known for high recalcitrance, were hydrolyzed 7.5 times faster by a cellulase/xylanase cocktail after BaLOOS1 pretreatment. The protein is only 109 amino acid residues in length and consists of a sole GH45-like domain, presumably similar in structure to that of expansins. The ThrPheTyr sequence is present in strand β1 and the catalytic acid appears to be conserved in strand β5, although the HisXxxAsp motif is replaced by AspLeuAsp (Figure 69). A protein-BLAST search returned over 200 hits from fungi with about 40% or higher sequence identities to BaLOOS1, suggesting that loosenin proteins are ubiquitous in fungi. Yet another group of fungal proteins related to GH45 and expansins is represented by EnCelA from Emericella nidulans (or A. nidulans) in the sequence alignment in Figure 69. The CelA-encoding gene is constitutively expressed in all developmental cycles of the fungus, but the protein is exclusively present in the cell walls of conidiospores.962 We note that the authors use the name EglD, while the sequence shown in the article is identical to that of EnCelA (Q5AVE5_EMENI) and different from that of E. nidulans EglD (Q5BCX8; EGLD_EMENI) in the Uniprot database. 
Investigations of the influence of gene expression on morphogenesis, growth, and germination of conidia indicate that the protein may be involved in cell wall remodeling during spore germination. The protein was extracted from conidiospores and assayed in vitro for CMCase activity, but no degradation was detected.962 The GH45-like domain is appended by a region of ∼95 residues, similar in length as in expansins, but it is uncertain whether it folds into a similar Ig-like domain. Furthermore, the GH45-domain is preceded by a region of ∼150 residues rich in Ser and Thr, which may serve as an O-glycosylated linker of a cell wall anchor. Many members lack this N-terminal region and may represent nonanchored homologues. However, whether they play any significant role in lignocellulose degradation appears to be unknown to date. Several of the sequences are annotated as “cellulase” but without reference to biochemical evidence. Significant sequence similarity is also recognized between GH45s and a family of lytic transglycosylases, bacterial enzymes involved in cell wall peptidoglycan processing. EcMltA from E. coli shows 31% and 23% sequence identity with MeCel45A and HiCel45A, respectively. The crystal structure of EcMltA reveals a two-domain structure where the larger A domain overlaps closely with that of GH45s (2AE0).963 However, EcMltA has a ∼140 residue long insertion between β1 and β2 of the canonical double-psi β-barrel topology, which folds into an additional β-barrel B domain. A large glycan-binding groove is formed at the interface between the two domains. Again, the catalytic center is strikingly similar to that of GH45s. The catalytic acid-associated motifs (ThrGlyTyr and HisPheAsp in EcMltA) are highly conserved within lytic transglycosylase family 2, and mutation of

10.4. Conclusions

GH45 cellulases are clearly industrially relevant proteins given their inclusion in commercial detergent and conditioning formulations. However, of all GH families of fungal cellulases, GH45 are by far the least well characterized. Accordingly, insight into both their structure and function is relatively limited. Here, we generalize the findings discussed above for GH45s: (1) GH45 CDs exhibit a six-stranded, double-psi β-barrel fold, which is among the smallest carbohydrate active enzyme protein folds. The small size may aid in access to small pores in the substrate. (2) Phylogenetic analysis suggests three different subfamily classifications (A, B, and C) may exist and are independent of kingdom. Currently, there is no structural representative from subfamily C. (3) The enzymes exhibit a wide range of cell wall polysaccharide specificities and utilize an inverting hydrolytic mechanism for turnover. (4) The inverting mechanism is not well-characterized in any GH45; however, in HiCel45A, the ligand-bound structure gives rise to the hypothesis that elongation of the substrate in the transition state may be required for hydrolysis.918 (5) Loop regions connecting the central β-strands contribute to specificity and ligand binding. Conformational changes of these loops upon binding have been observed and may be a part of the subfamily A hydrolytic mechanism. Subfamily B has a more open active site as a result of loop deletions relative to subfamily A. (6) GH45s are structurally similar to a host of other proteins including expansins, swollenins, loosenins, and lytic


Table 16. Reported LPMO Crystal Structures^a

source and original name in primary citation | PDB code | resolution (Å) | ion charge | brief highlights | ref

Fungal LPMO Structures
Trichoderma reesei GH61B | 2VTC | 1.60 | Ni2+ | first GH61 structure reported | 337
Thielavia terrestris GH61E | 3EII | 2.25 | Zn2+ | Zn-bound GH61 structure; noted similarities with aromatics on surface to family 1 CBMs | 967
Thielavia terrestris GH61E | 3EJA | 1.90 | Mg2+ | Mg-bound GH61 structure | 967
Thermoascus aurantiacus GH61 | 2YET | 1.50 | Cu2+ | demonstrated copper was the active metal and observed N-terminal histidine N-methylation | 51
Thermoascus aurantiacus GH61 | 3ZUD | 1.25 | Cu2+ | Cu(NO3)2-soaked GH61 to confirm copper binding | 51
Phanerochaete chrysosporium GH61D | 4B5Q | 1.75 | Cu2+ | first basidiomycete GH61 structure reported | 993
Neurospora crassa LPMO-2 | 4EIR | 1.10 | Cu2+ | first confirmed type 2 LPMO structure reported with activity for the C4 carbon | 995
Neurospora crassa LPMO-3 | 4EIS | 1.37 | Cu2+ | first confirmed type 3 LPMO structure reported with activity for the C1 and C4 carbons | 995
Aspergillus oryzae LPMO (AA11) | 4MAH | 1.55 | Zn2+ | first reported AA11 structure; Zn cation in the active site | 1024
Aspergillus oryzae LPMO (AA11) | 4MAI | 1.40 | Cu2+ | first reported AA11 structure; Cu(I) cation in the active site | 1024

Nonfungal LPMO Structures
Serratia marcescens CBP21 | 2BEM | 1.55 | N/A | first family 33 CBM structure reported | 972
Serratia marcescens CBP21 | 2BEN | 1.80 | N/A | Y54A mutant of CBP21 | 972
Enterococcus faecalis V583 CBM33A | 4A02 | 0.95 | N/A | apo structure of a CBM33 enzyme | 980
Serratia marcescens CBP21 | 2LHS | n/a (NMR) | N/A | NMR structure; demonstrated that CBP21 is a rigid macromolecule | 990
Bacillus amyloliquefaciens CBM33 | 2YOW | 1.80 | N/A | apo structure of CBM33 enzyme | 991
Bacillus amyloliquefaciens CBM33 | 2YOX | 1.90 | Cu1+ | Cu-bound structure with a Cu(I) ion; reduction was induced by X-rays | 991
Bacillus amyloliquefaciens CBM33 | 2YOY | 1.70 | Cu1+ | Cu-bound structure with a Cu(I) ion; reduction was induced by ascorbate | 991
Enterococcus faecalis CBM33A | 4ALC | 1.49 | Cu2+ | first reported CBM33/LPMO from family AA9 with a +2 copper oxidation state collected by a specialized diffraction method | 998
Enterococcus faecalis CBM33A | 4ALT | 1.49 | Cu2+ | photoreduced CBM33/LPMO from family AA9 with a +1 copper oxidation state | 998
Streptomyces coelicolor LPMO10B | 4OY6 | 1.29 | Cu2+ | one of two first cellulose active bacterial (AA10) LPMO structures solved | 991
Streptomyces coelicolor LPMO10C | 4OY7 | 1.50 | Cu1+ | second of two first cellulose active bacterial (AA10) LPMO structures solved | 991

^a Both fungal and nonfungal LPMOs are included.

decrystallization of individual chains requires thermodynamic work.130−132,580,964 Over 60 years ago, Reese and co-workers speculated that nature likely employs additional mechanisms to disrupt the crystalline lattice of cellulose.291 To this end, a discovery was made in 2010 that verified this hypothesis and overturned the conventional cellulose depolymerization paradigm.48 Namely, the characterization of a new family of biomass-degrading enzymes, now referred to as LPMOs,42,50,965 revealed a completely new mechanistic approach for cleaving glycosidic bonds in cellulose (and chitin). It was shown that LPMOs can be quite synergistic with the traditional GH enzyme cocktails described above, even before the basis for their mechanisms was reported.966,967 These enzymes were previously characterized either as family 33 CBMs or family 61 glycoside hydrolase (GH61) cellulases from nonfungal and fungal origin, respectively. LPMOs are now understood to be ubiquitous in nature, to be highly upregulated during biomass degradation, and to exhibit significant diversity both within and between biomass-degrading organisms.60,965,968,969 Given that LPMOs are emerging as key components of cellulase cocktails, likely due to their ability to create chain breaks in crystalline regions of cellulose where EGs would not be able to productively bind to the substrate, here we briefly review the discoveries thus far in this new field. We note that in

transglycosylases, many of which have conserved major components of the GH45 active site. Much remains to be accomplished toward understanding GH45 enzymes and their hydrolytic mechanisms. The information currently available cannot even definitively address the question of whether mechanisms are consistent across subfamilies. Even the simplest of questions, such as the identity of the catalytic base, is likely to remain unaddressed for some time given the difficulty in attributing this role in inverting enzymes (see GH6 discussion). Nevertheless, these enzymes and the GH45-like counterparts play a role in both cell wall deconstruction and generation, suggesting further examination of this interesting family is warranted both for their potential in industrial applications and for understanding their role in natural biomass-degrading contexts.

11. LYTIC POLYSACCHARIDE MONOOXYGENASES

The structures of ligand-bound GHs described above marked a turning point in our collective understanding of plant cell wall degradation starting in the 1990s that has enabled more than two decades of intense mechanistic research and development. However, as discussed above, cellulose and similar polysaccharides pack tightly into dense crystalline lattices on which GHs must act. Given the inherent recalcitrance of cellulose,


Figure 74. S. marcescens CBP21 structure and initial mechanistic discoveries. (A) The overall structure of S. marcescens CBP21 with the active site residues highlighted in stick format.972 PDB code: 2BEM, chain C. His28, Glu60, Ala112, His114, and Phe187 are shown in stick format. The coordination distances to the ion in the structure to the histidine residues are highlighted. (B) The S. marcescens CBP21 active site.972

histidine residues (His114) of CBP21 was mutated, which resulted in complete elimination of the synergistic benefits of action.966 A later report from Moser et al.973 additionally demonstrated that two CBM33s from T. fusca (denoted E7 and E8) were also able to synergize with S. marcescens chitinases in the depolymerization of β-chitin and to a lesser extent with T. fusca cellulases at low concentrations on filter paper assays.973 These reports marked the first in-depth biochemical and structural studies of CBM33 proteins and, especially in the case of S. marcescens CBP21, hinted at an unknown, powerful mechanism for disrupting crystalline polysaccharide substrates.966,972,974 In 2010, Harris and co-workers reported an in-depth study of GH61 enzymes from the fungi T. aurantiacus and T. terrestris.967 Before the study from Harris et al., GH61s were thought to exhibit weak EG activity, for example on CMC,487,975−977 and their considerable synergy with the T. reesei secretome had only been briefly reported.978 Harris and co-workers demonstrated that either GH61 enzyme added to a native T. reesei cellulase cocktail at low concentrations (4 mg GH61/g cellulose) significantly enhanced the T. reesei enzyme cocktail performance on pretreated corn stover (Figure 75).967 Indeed, the authors reported that overall protein loading could be reduced by a factor of 2 if a GH61 enzyme was added, a reduction with significant cost-savings implications for enzymatic hydrolysis of pretreated biomass.21−23,967 Intriguingly, the authors also showed that there was no synergy on various clean cellulose substrates (e.g., Avicel), which at the time was hypothesized to indicate GH61 action on either hemicellulose or lignin. The T. terrestris GH61E enzyme structure was solved (Figure 76), which was the second representative structure from this family (the first structural member was T. reesei GH61B (TrGH61B) reported two years earlier337).
In both the structural reports from Harris et al.967 and from Karkehabadi et al.,337 it was speculated that GH61 enzymes are not GHs due to the lack of conserved carboxylate pairs in sufficient proximity required for GH action.169,177 Karkehabadi et al. also noted a structural similarity to S. marcescens CBP21.337 The first two GH61 structures solved, similar to CBM33s,972 both exhibit a metal-binding site coordinated by two histidine residues on the flat face of the protein. Harris et al. both varied the metal and removed it altogether with EDTA, and demonstrated that the presence of a

this section we describe research into both fungal and nonfungal LPMOs as the mechanistic developments to date are closely linked. Unlike previous sections, we primarily follow a chronological description of the scientific developments and discoveries, as this field of cellulase research is quite nascent. At the end of the section, we then summarize the current state of the knowledge and several interesting future directions. Despite significant progress, there is still much to learn about the mechanisms employed by these fascinating, newly discovered enzymes. Lastly, the nomenclature related to these enzymes has evolved considerably and likely will continue to do so for some time.42,50,152 Although all enzymes described in this section are LPMOs, we generally refer to the enzymes as was done historically (e.g., we refer to them as “GH61” or “CBM33”, described below, when discussing studies before the term LPMO was coined). We note that the term LPMO is now reasonably well accepted and conveys the salient details of their currently reported function. Remarks on the likely nomenclature for these enzymes are provided at the end of this section. Additionally, descriptions of structures discussed in this section are provided in Table 16.

11.1. Initial Discoveries of Oxidative Function

Prior to 2005, family 33 CBMs from insect viruses had been shown to bind to chitin, but no mechanistic details were revealed.970,971 In 2005, Vaaje-Kolstad et al. reported the first family 33 CBM structure from the chitinolytic bacterium Serratia marcescens (Table 16).972 The protein was termed “chitin-binding protein 21” (CBP21), where the “21” denotes the fact that the protein is 21 kDa.966,972 The overall 3D structure of CBP21 was characterized as a “budded” fibronectin type III fold (Figure 74). In the same study, mutations of residues on the putative chitin-binding surface were conducted, all of which were shown to reduce the binding affinity to the substrate.972 The CBP21 structure harbors a metal ion between the conserved N-terminal histidine (His28) and another conserved histidine (His114), but its function was not initially realized (Figure 74). Shortly after, the same group published a study wherein they systematically demonstrated that CBP21 dramatically boosts β-chitin conversion in the presence of any of the three native GH18 chitinases from S. marcescens (two chitobiohydrolases, one endo-chitinase) or in mixtures with all three GH18 chitinases present.966 One of the conserved


spectrometry experiments with isotopically labeled oxygen atoms in either 18O2 or H218O, the authors demonstrated that the aldonic acid functionality incorporated one oxygen atom from O2 and one from H2O (Scheme 3). This led the

Scheme 3. Overall Reaction for S. marcescens CBP21 Action on Chitin^a

^a Separate mass spectrometry experiments with 18O-labeled molecular oxygen and water demonstrated that S. marcescens CBP21 action involves a tandem oxidation−hydrolysis mechanism.48
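The labeling result summarized for Scheme 3 can be checked with simple isotope arithmetic. The following is a minimal sketch, not the analysis used in the cited study: the function name is hypothetical, and it assumes that no label is lost through exchange with solvent. It computes the monoisotopic mass shift expected when a given number of product oxygen atoms derive from an 18O-labeled source.

```python
# Monoisotopic masses (Da) of the two oxygen isotopes.
MASS_16O = 15.994915
MASS_18O = 17.999160


def label_shift(n_labeled_oxygens: int) -> float:
    """Mass shift (Da) when n product oxygens are 18O instead of 16O.

    Hypothetical helper; assumes no back-exchange of label with solvent.
    """
    return n_labeled_oxygens * (MASS_18O - MASS_16O)


# One oxygen of the aldonic acid comes from O2 and one from H2O, so labeling
# either source alone should shift the product by one 18O substitution, and
# labeling both sources would shift it by two.
shift_single_source = label_shift(1)
shift_both_sources = label_shift(2)
print(f"one labeled source:  +{shift_single_source:.4f} Da")
print(f"both labeled sources: +{shift_both_sources:.4f} Da")
```

A shift of roughly 2 Da in each single-label experiment is what one would expect if exactly one oxygen atom is drawn from each source.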

Figure 75. Comparison of cellulase activity after 75 h of the specified T. reesei cellulases alone (light gray) or with T. aurantiacus GH61A included at 10% of the total protein loading of 2.5 mg/g of cellulose (dark gray). The T. reesei mixtures tested comprised ratios of 4.5:2.5:1.0 for Cel7A:Cel6A:Cel7B.967
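The loadings implied by the Figure 75 caption can be worked out explicitly. The sketch below is illustrative only: it assumes that the GH61 enzyme replaces 10% of a fixed total loading of 2.5 mg/g (rather than being added on top of it), and that the remaining protein is split according to the stated 4.5:2.5:1.0 ratio. The function and dictionary names are hypothetical.

```python
def component_loadings(total_mg_per_g=2.5, gh61_fraction=0.10, ratios=None):
    """Split a total protein loading (mg protein / g cellulose) into components.

    Assumes GH61 replaces a fraction of a fixed total loading, and that the
    remainder is divided among the cellulases in proportion to `ratios`.
    """
    if ratios is None:
        ratios = {"Cel7A": 4.5, "Cel6A": 2.5, "Cel7B": 1.0}
    gh61 = total_mg_per_g * gh61_fraction
    remainder = total_mg_per_g - gh61
    ratio_sum = sum(ratios.values())
    loadings = {name: remainder * r / ratio_sum for name, r in ratios.items()}
    loadings["GH61A"] = gh61
    return loadings


loadings = component_loadings()
for name, value in loadings.items():
    print(f"{name}: {value:.3f} mg/g cellulose")
```

Under these assumptions the GH61 component amounts to 0.25 mg/g cellulose, which illustrates how small the oxidative enzyme dose is relative to the hydrolases it enhances.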

authors to postulate a mechanism wherein both oxidation and hydrolysis occur as a result of CBP21 action. Another noteworthy observation from this work is that cyanide significantly inhibits CBP21 action. As cyanide is a well-known inhibitor of O2 binding to metal centers in enzymes and catalysts, this suggests that molecular oxygen binds directly to the histidine-coordinated metal center in the enzyme. Additionally, superoxide dismutase does not inhibit the reaction significantly, implying that the metal center in the oxygen-bound form of the enzyme is shielded from bulk solvent. Overall, these results suggest that CBP21 acts directly on the surface of β-chitin, to which it is known to bind with high affinity,966,972 and that molecular oxygen binds to the metal center primarily when the enzyme is actively engaged on the substrate for oxidation.48 It was noted that, given the similarity in structure and metal-binding site between CBM33s and GH61s, these enzymes may employ similar chemical mechanisms.48 This seminal study marked a turning point in biomass conversion research in that the conventional, nearly universal hydrolytic paradigm was overturned; a drastically different mechanism in the cleavage of glycosidic bonds in polysaccharides was revealed.48 It is now generally hypothesized that these oxidative enzymes produce chain breaks in crystalline regions of cellulose (and chitin),

divalent metal cation is essential for the observed performance enhancements. Mutation of both of the conserved, ligating histidine residues to alanine completely abolished activity as measured by synergistic performance with a T. reesei cocktail on pretreated corn stover.967 Lastly, Harris et al. noted a similarity in the orientation and spacing of three tyrosine residues on the flat face of T. terrestris GH61E to that of family 1 CBMs, perhaps suggesting that this is the putative GH61 binding face for the cellulose surface (Figure 76).346,351,352,355,362,967 Taken together, these studies showed that GH61s and CBM33s share similar structural functionalities and that the metal coordination was important for synergism with GHs.337,966,967,972 The major breakthrough in understanding the mechanistic basis of these enzymes was reported in 2010 when Vaaje-Kolstad et al. observed that CBP21 can depolymerize β-chitin alone in the presence of a reducing agent and molecular oxygen, pointing toward an oxidative mechanism for CBP21.48 Specifically, oligomers were released from β-chitin that exhibited an aldonic acid functionality at the C1 carbon. Using mass

Figure 76. T. terrestris GH61 enzyme and active site.967 PDB code: 3EII. (A) An overall structural view is shown with active site residues highlighted in yellow stick. The aromatic residues forming the putative cellulose binding face of the enzyme are shown in green stick. (B) The active site representation illustrates the coordination distances of the zinc ion (gray sphere) to His1, His68, and two water molecules shown as red spheres.


Figure 77. T. aurantiacus GH61 structure. (A) Overall structure of the T. aurantiacus GH61 (LPMO) with the active site residues shown in stick format.51 PDB code: 2YET, chain A. His1, His86, Gln173, Tyr175 shown in stick format. (B) Active site of the T. aurantiacus GH61 (LPMO) enzyme.51 Note that the Nε2 atom of the N-terminal histidine exhibits a methyl group of unknown function. The two red spheres coordinating the copper ion are water molecules. The coordination geometry of the copper ion in this enzyme is octahedral, indicating a +2 oxidation state.

electron paramagnetic resonance (EPR) spectroscopy, crystallographic studies, and activity assays to demonstrate that copper is the active metal.51 They found that copper binds very strongly to the metal-binding site (the Kd was shown to be tighter than 1 nM),51 and they also observed a methylation at the Nε2 atom of the N-terminal histidine residue in the enzyme active site (Figure 77). This histidine N-methylation was also present in the previously reported T. terrestris and T. reesei GH61 structures,337,967 but was not described in the papers describing the structures. In unrelated systems, this post-translational modification has been shown to modify copper-binding systems,988 perhaps by modulating the binding affinity of copper via a pKa shift of the histidine residues989,990 or by modifying the overall electronic structure of the system such that the oxygen activation and reactivity is altered.989−991 Overall, the active site geometry in the T. aurantiacus GH61 structure appears to harbor a Cu(II) cation and displays classic Jahn−Teller distortion with elongated coordination bonds in the axial positions. In the equatorial plane, the copper ion is ligated in a bidentate fashion to the N-terminal histidine and monodentate to a second histidine residue and a polyethylene glycol hydroxyl group, the latter from the crystallization buffer. The authors coined the term “histidine brace” for this structural motif that binds copper.51 Via activity assays, they also demonstrate that copper is essential to the reactivity of the enzyme, and that reducing agents such as ascorbate and gallate can potentiate activity.51 Action at the C6-carbon was also postulated, but evidence of this type of activity was not reported and has not been definitively verified to date in the open literature. 
Although at least one other study has presented indirect evidence that C6-oxidation may occur, it may be that these observations are actually C4 oxidation, as discussed below.992 Shortly after, Phillips et al. reported the detailed characterization of a N. crassa GH61 enzyme and its interaction with CDH from the same organism.50 Specifically, they knocked out the CDH-expressing gene in N. crassa, which resulted in a significant reduction in the cellulolytic performance of the secreted enzyme cocktail in the mutant strain. Add-back studies with CDH from M. thermophila resulted in a full rescue of performance suggesting that CDH is a primary driver of biomass depolymerization in N. crassa.50 The authors then systematically demonstrated that both metals and oxygen were important for

which likely is synergistic with conventional GH enzymes due to the presence of additional attachment and detachment sites for processive GHs. EGs, on the other hand, are thought to act primarily in amorphous regions where the substrate is more accessible for complexation.557 From the initial discovery of oxidative activity in CBP21, many studies have been subsequently published to characterize various features of the LPMO catalytic mechanisms, to understand substrate specificity and regioselectivity, and to understand the range of reducing agents able to potentiate this oxidative activity. Soon after the initial characterization of CBP21 oxidative action, Forsberg et al. reported that a two-domain CBM33 enzyme from Streptomyces coelicolor A3(2) is active on cellulose.979 Similar to CBP21, the S. coelicolor CBM33 primarily produced oxidized oligomers that had an even number of glycans in the product. Additional chitinolytic CBM33 structures have been reported from the same group, which exhibit a similar product profile to that of S. marcescens CBP21.980

11.2. Mechanistic and Structural Studies

Subsequently, four studies were published in rapid succession in 2011, several of which demonstrated that GH61s employ copper as the active metal and demonstrated that multiple chemical or enzymatic reducing agents can promote oxidative activity.49−51,981 Langston et al. demonstrated that CDH is able to act as a reducing agent for GH61 activity (specifically for the T. aurantiacus GH61 enzyme and H. insolens CDH).49 It has long been known that CDHs are coregulated with cellulases, and their specific function beyond cellobiose oxidation has been the subject of intense study.982,983 CDH consists of a heme domain and a flavin adenine dinucleotide (FAD) domain. The FAD domain of CDH is responsible for a 2-electron oxidation of cellobiose, and the heme domain is believed to transfer the electrons from the FAD domain to another electron acceptor, including in this case to GH61 enzymes. Structures of the heme and FAD domains are known,984,985 and kinetic measurements of electron transfer rates have been reported.986,987 Langston et al. also showed that the natively expressed CDH and GH61 enzymes from T. terrestris are synergistic in cellulose depolymerization. With the GH61 from T. aurantiacus expressed in A. oryzae, Quinlan et al. employed isothermal titration calorimetry,


Scheme 4. Originally Proposed Catalytic Mechanism for LPMO Action50,312,^a

^a The final hydroxylated product (top left) will undergo a rapid elimination reaction to form a C1-lactone and cleave the glycosidic bond. It is noted that an oxidative attack at the C4 carbon follows the same overall mechanism in terms of the elementary steps and differs only in the position of the oxidative attack and the final product (namely a 4-keto sugar instead of a C1-lactone).

the CDH enhancement of the N. crassa secretome by loading with EDTA and running the hydrolysis reaction anaerobically, respectively. Given these observations, the authors then analyzed GH61 proteins in the N. crassa secretome with tryptic digestions, LC-MS/MS analysis, and inductively coupled plasma-atomic emission spectroscopy (ICP-AES), which demonstrated that the GH61 enzymes from N. crassa bind copper in a 1:1 ratio. On the basis of the detailed chemical analysis conducted in their study, Phillips et al. thus independently concluded that copper was the active metal in GH61 enzymes and that CDH potentiates its activity. By analyzing multiple GH61s, the authors also discovered the presence of 2 types of specificities: some N. crassa GH61s produced C1-oxidized species (aldonic acids), dubbed “type 1”, whereas some GH61s were suggested to form 4-keto sugars via oxidation at the C4 carbon (dubbed “type 2”). Phillips and colleagues proposed that GH61 and CBM33 enzymes should be termed “polysaccharide monooxygenases”, which was later modified with the term “lytic”,42 thus generally becoming referred to as LPMOs.965 This study was also the first to propose a chemical mechanism for GH61 action, illustrated in Scheme 4. The mechanism proposed by Phillips et al. consists of the following steps: reduced CDH transfers an electron to the “resting” Cu(II)-bound LPMO to form Cu(I), which then binds molecular oxygen (Scheme 4). Oxygen then abstracts an electron from Cu(I) to form a Cu(II)−superoxo species (Cu(II)−O−O•), which can then abstract a hydrogen atom from either the C1 or C4 carbon of the substrate to form a Cu(II)−hydrosuperoxo species (Cu(II)−O−OH) and a radical on the substrate centered at the point of hydrogen atom abstraction. Another electron and proton are transferred from CDH to form water and a Cu(II)−oxyl species (Cu(II)−O•). 
The Cu(II)−oxyl species then hydroxylates the substrate, which undergoes a spontaneous and rapid elimination reaction to form a lactone or 4-keto sugar at the C1 or C4 carbon, respectively, resulting in glycosidic bond cleavage.50 The lactone form can

then further, either spontaneously or via enzymatic catalysis, form an aldonic acid, which aligns with the mass spectrometry experiments reported by Vaaje-Kolstad et al. in the initial report of CBP21 employing an oxidative catalytic mechanism.48 Directly following the reports from Quinlan et al.51 and Phillips et al.,50 Westereng and co-workers reported that P. chrysosporium GH61D, expressed heterologously in Pichia pastoris, is also a copper-dependent enzyme that is primarily specific to glycosidic bond cleavage at the C1 carbon in cellulose.981 Likely due to the expression in P. pastoris, the N-terminal histidine did not exhibit N-methylation as observed earlier in the T. aurantiacus,51 T. terrestris,967 or T. reesei structures.337 However, even without the N-methylation, the P. chrysosporium GH61D was still active on cellulose.981 The same observation has also been made for two GH61s from Podospora anserina, also expressed in P. pastoris.992 Direct comparisons of an oxidative enzyme with and without this post-translational modification to date have not been reported in the open literature. A more recent follow-up study was also published wherein the crystal structure of P. chrysosporium GH61D produced from P. pastoris was reported.993 This LPMO structure was the first reported from a basidiomycete fungus, which seems to utilize LPMOs quite ubiquitously in wood degradation.994 These initial mechanistic and structural characterizations of LPMO enzymes in 2011 demonstrated unequivocally that copper is a highly active cofactor in LPMO action, that CBM33s and GH61s share some commonality in their metal-binding sites that suggests an overall similar mechanism, and that a reducing agent, either enzymatic or from small molecules, and molecular oxygen are needed for catalysis by LPMOs.48,50,51,979,981 In 2012, more reports began to generally refer to these enzymes as “LPMOs”, and several additional details regarding LPMO action were reported. Beeson et al.
reported an additional characterization of N. crassa LPMO action wherein they more definitively assigned the type 2 LPMO products to be 4-keto sugars,312 as suggested earlier in Phillips et al.50 Soon after, the same group


published two structures of natively expressed N. crassa LPMOs (GH61s), one of which (a type 2 LPMO) was shown to be specific for oxidation at the C4 carbon, and the other of which is a type 3 LPMO that carried out oxidation both at the C1 and C4 carbons in cellulose.995 In both structures an N-methylation of the N-terminal histidine reported earlier was observed.51,995 Interestingly, the active site in both enzymes displays an oblong molecule coordinated end-on (η1) to the copper cation. The Cu−O1 distances reported are 2.96, 2.92, and 3.44 Å (chains A and B in N. crassa LPMO-2 and chain A in N. crassa LPMO-3, respectively).995 For the species in N. crassa LPMO-2, assuming that the species is an O−O species, the O−O bond distance is 1.16 Å. This observation led the authors to speculate that this may be a dioxygen species, namely superoxo, weakly bound to the copper atom.995 However, this observation has been questioned given the distance from the copper ion to the putative O1 oxygen. In Cu(II)−O−O• superoxide species, the Cu−O bond distance should be around 1.33 Å.996 Nonetheless, another interesting observation was made in the N. crassa LPMO-3 structure; the molecule crystallizes as a face-to-face dimer, and a tyrosine residue near the active site of its neighboring LPMO was hydroxylated to 3,4-dihydroxyphenylalanine, perhaps suggesting that the X-rays reduced copper and activated the enzyme, resulting in hydroxylation of the nearby tyrosine residue in the neighboring molecule.995 The study from Li et al. also presented sequence alignments grouped by LPMO type to ascertain which residues are important for regioselectivity.995 On the basis of these alignments, the authors described potential enzyme−substrate interactions that give rise to the different LPMO types via docking the LPMOs onto the cellulose surface.
They noted several differences between type 1, 2, and 3 LPMOs including differences in the number and placement of aromatic residues and putative N-glycans near the binding surface, informed by previous studies on family 1 CBM−cellulose interactions.346,352,371 More recently, the same group has developed further insights into the regioselectivity of LPMOs. Vu et al. combined phylogenetic analysis over a much larger set of LPMOs with experimental studies of additional N. crassa LPMOs to identify structural motifs that are important for regioselectivity.997 Vu et al. demonstrated that type 3 LPMOs exhibit an extra loop near the N-terminus, consisting of approximately 12 amino acid residues. Removal of this loop in an N. crassa LPMO resulted in the production of primarily aldonic acids (i.e., oxidation at the C1 carbon or conversion to a type 1 LPMO), suggesting that this loop is important for C4 specificity.997 Multiple additional conserved residues were found from their phylogenetic analysis, all of which now highlight the need for a large mutation campaign to further understand the structural aspects for LPMO regioselectivity. Interestingly, the comprehensive sequence analysis of fungal LPMOs from Vu et al. highlights two subfamilies of LPMOs that they mention have no structural or functional characterization to date (Figure 4 in ref 997), which is perhaps an area ripe for discovery of new LPMO activities or substrate specificities.997 In many of the earlier LPMO studies, it was assumed that the flat face of LPMOs is the primary binding face to crystalline substrates. To verify this hypothesis and to gain further insights into the copper binding in bacterial LPMOs (CBM33s), Aachmann and co-workers conducted a detailed NMR and isothermal titration calorimetry study on S. marcescens CBP21.990 Interestingly, they applied an elegant NMR method to demonstrate that the flat face harboring the copper-binding

site is indeed the binding face to crystalline chitin. Additionally, it is likely that LPMOs first require in situ reduction of Cu(II) in the active site to “activate” the enzyme for oxygen binding, as also proposed by Phillips et al.50 To understand the binding affinity differences between monovalent and divalent copper, Aachmann et al. used competition assays to demonstrate that Cu(I) binds more strongly to CBP21 than Cu(II) (1.2 nM compared to 55 nM, respectively). Similar to their previous report,48 the authors also show that cyanide directly inhibits CBP21, but with NMR, they demonstrate that cyanide interacts with the copper ion, thus more definitively showing that the metal takes part directly in the reaction.990 To further elucidate the nature of LPMOs with Cu(I) bound to the active site, Hemsworth et al. recently reported a structure photoreduced with X-rays.991 Similar to the study from Aachmann et al.,990 they demonstrate Cu(II) binds with very high affinity to a CBM33 from Bacillus amyloliquefaciens, ranging between 6 nM at pH 5 and 80 nM at pH 7.991 Interestingly, they also measure the melting temperature of the enzyme with and without copper bound, which shows an astounding 20 °C increase in the presence of copper. An X-ray photoreduced structure exhibits a T-shaped configuration, indicative of a Cu(I)-bound state. A primary outcome from the work of Hemsworth et al.
that should be noted in structural studies of LPMOs is that X-rays can reduce copper, and thus proper care should be taken when analyzing the active site geometries to note the coordination of the copper atom, as proper analysis of the coordination geometry can immediately suggest the oxidation state of the copper ion.991 Lastly, the authors give significant mechanistic weight to the differences in nonfungal LPMOs (CBM33s) and fungal GH61s in this study, as discussed in more detail below.991 As a further illustration of the sensitivity of LPMO active sites to X-ray radiation, Gudmundsson et al. also recently reported a series of bacterial LPMO structures from Enterococcus faecalis CBM33A (EfaCBM33A) along the reduction pathway from Cu2+ to Cu1+.998 The original EfaCBM33A structure was solved previously in an apo form.980 A specialized photoreduction method using helical X-ray diffraction was employed to collect structures along an increasingly intense dosage profile. These structures definitively revealed the change in LPMO active site architecture in the two oxidation states, namely with the Cu2+ state exhibiting a trigonal bipyramidal geometry and the Cu1+ state in a T-shaped configuration. Comparison of these structures to similar small-molecule crystal structures bound to copper unequivocally demonstrated that the oxidation states of copper in the low and high dosage structures are Cu2+ and Cu1+. Moreover, the method employed in this study provides a straightforward means to solve LPMO crystal structures in both copper oxidation states. Until the end of 2013, all LPMOs reported in the literature were only active on the insoluble substrates cellulose and chitin. Although likely an important feature of LPMOs utilized by biomass-degrading organisms, this substrate preference creates significant challenges in the evaluation of kinetics, study of the catalytic mechanism, and detailed molecular-level characterization of enzyme−substrate interactions. Isaksen et al.
recently described the biochemical characterization of the first reported LPMO acting on soluble oligosaccharides of cellulose.999 Namely, an N. crassa LPMO (NCU02916), or NcrLPMO9C, was found to exhibit substantial activity on cellopentaose and cellohexaose with specificity for the C4-carbon. For cellopentaose oxidation, a nonoxidized trimer and a C4-oxidized dimer

DOI: 10.1021/cr500351c Chem. Rev. 2015, 115, 1308−1448

Chemical Reviews

Review

Scheme 5. Overall Reaction Scheme for C1 and C4 Oxidation of Cellulose by LPMOs and Subsequent Hydrolysis

(A) C1 oxidation produces a lactone species, and subsequent hydrolysis produces an aldonic acid functionality at C1, overall resulting in glycosidic bond cleavage.48,50,51 (B) C4 oxidation produces a 4-keto sugar, which can then undergo hydrolysis to form a gem-diol species.50,312,999

Scheme 6. Copper−Oxyl Catalytic Mechanism Examined by Density Functional Theory Calculations

Density functional theory calculations predicted that a Cu(II)−oxyl species is the most likely reactive oxygen species (ROS) in the LPMO catalytic mechanism; this species has been predicted previously to exhibit significant oxidative power. The barrier for this reaction mechanism is predicted to be 18.8 kcal/mol. The rate-limiting step is hydrogen abstraction from the substrate. H-abstraction from the C1 and C4 carbons is predicted to be energetically equivalent.1003
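
To put the two computed barriers reported in this section in perspective (18.8 kcal/mol for the Cu(II)−oxyl pathway and 43.0 kcal/mol for the Cu(II)−superoxo pathway), the following minimal Python sketch converts each barrier into an order-of-magnitude rate constant with the Eyring equation of transition state theory. It is an illustration only and rests on assumptions that are not taken from the cited study: the potential energy barriers are treated as activation free energies, the temperature is set to 298.15 K, and the transmission coefficient is taken as 1. The function name `eyring_rate` is hypothetical.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
H = 6.62607015e-34   # Planck constant, J*s
R = 1.987204e-3      # gas constant, kcal/(mol*K)


def eyring_rate(barrier_kcal_per_mol, temperature_k=298.15):
    """Eyring estimate k = (k_B*T/h) * exp(-barrier/(R*T)), in 1/s.

    Assumption: the barrier is used as an activation free energy and
    the transmission coefficient is 1.
    """
    prefactor = K_B * temperature_k / H
    return prefactor * math.exp(-barrier_kcal_per_mol / (R * temperature_k))


# Barriers quoted in the text (kcal/mol)
k_oxyl = eyring_rate(18.8)      # Cu(II)-oxyl pathway
k_superoxo = eyring_rate(43.0)  # Cu(II)-superoxo pathway
ratio = k_oxyl / k_superoxo

print(f"Cu(II)-oxyl:     {k_oxyl:.2e} 1/s")
print(f"Cu(II)-superoxo: {k_superoxo:.2e} 1/s")
print(f"rate ratio:      {ratio:.2e}")
```

Under these assumptions the oxyl barrier corresponds to a rate constant on the order of 0.1 s⁻¹, whereas the superoxo barrier corresponds to roughly 10⁻¹⁹ s⁻¹, which illustrates why a 43.0 kcal/mol barrier is considered infeasible at biological conditions.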

are observed, suggesting binding in a −3 to +2 mode in the enzyme using the nomenclature for carbohydrate binding sites.175,176,999 This enabled more detailed characterization of the reaction products with NMR spectroscopy than was possible before, which when combined with MS/MS analysis definitively reveals a 4-keto sugar product that further hydrolyzes to a geminal diol in solution (Scheme 5).999 Even more importantly, this discovery opens considerable potential for experimental elucidation of the LPMO catalytic mechanism. As a result, substrate-bound structures may now be possible via NMR or X-ray crystallography. The development of LPMO kinetic assays will likely also follow soon, which will enable experimental determination of the rate-limiting steps in LPMO action. By using unreactive analogues (e.g., cyanide in place of oxygen), isolation of reactive intermediates will likely be

possible, which will enable dramatic advances in our collective understanding of these important oxidative enzymes. Using the same enzyme, the Eijsink group subsequently reported that it is active on xyloglucan.1000 This report highlights the potential scope of LPMO substrate specificity, and suggests that oxidative depolymerization of polysaccharides is likely more general than for just cellulose and chitin. Indeed, a recent study has identified an LPMO that is active on starch.1001 As noted in multiple studies since the initial report that LPMOs were oxidative enzymes, understanding the catalytic mechanism is one of the most important outstanding problems for LPMOs.48,50,51,312,991,995,1002 To that end, the first report of the LPMO catalytic mechanism was recently published.1003 Using density functional theory calculations, Kim et al. employed an active site, or “theozyme” model,1004 of the T. aurantiacus


Figure 78. A. oryzae AA11 LPMO structure. (A) Overall structure of the A. oryzae LPMO with the active site residues His1, Ala58, His60, Glu138, and Tyr140 shown in stick format.1024 PDB code: 4MAI. (B) Active site of the A. oryzae LPMO enzyme. The copper ion coordination state indicates that the copper ion is in a +1 oxidation state.

LPMO51 to first examine how oxygen binds to a Cu(I)-bound state in the LPMO active site. With this activated structure, two catalytic mechanisms for hydrogen abstraction and substrate hydroxylation were compared with transition state calculations, one with a copper−superoxo reactive oxygen species (ROS) and the other utilizing a copper−oxyl ROS. For oxygen binding to the LPMO active site, it was predicted that oxygen binds end-on (η1) to copper with an O−O bond distance of 1.31 Å, which is in good agreement with the ideal bond length of Cu(II) superoxide species of 1.33 Å.996 From there, a “superoxo” mechanism was tested wherein the LPMO−Cu(II)−superoxo species abstracts a hydrogen atom from C1 (or C4) carbon of a cellobiose unit, which was chosen to represent the cellulose chain. This reaction forms a radical centered on the substrate carbon atom (C1 or C4) and an LPMO−Cu(II)−hydroperoxo species. The O−O bond then breaks, and the hydroxyl group undergoes a rebound mechanism to form a covalent bond with the C1 (or C4) carbon. This results in a hydroxylated substrate and an LPMO− Cu(II)−oxyl species, and the oxyl group is then reduced to water by action of a reducing agent. The potential energy surface (PES) for this reaction exhibits a barrier of 43.0 kcal/mol, regardless of attack at either the C1 or C4 carbon.1003 Given such a high barrier for a Cu(II)−superoxo mechanism, it was hypothesized that a much more powerful ROS was needed for cellulose hydroxylation. Cu(II)−oxyl species have previously been predicted to exhibit a much stronger oxidative character than Cu(II)−superoxo.1005,1006 Given the likely very short lifetime of Cu(II)−oxyl, it has only been experimentally isolated in a single study to date.1007 Nonetheless, this species has previously been implicated in methane and aromatic hydroxylation reactions.1008−1010 Thus, Cu(II)−oxyl was subsequently hypothesized to be the ROS in LPMO action. 
The reaction cycle was proposed as follows (Scheme 6): a reducing agent first acts on the LPMO−Cu(II)−superoxo species to produce water and an LPMO−Cu(II)−oxyl species. LPMO−Cu(II)−oxyl abstracts a hydrogen from the substrate to form an LPMO−Cu(II)−hydroxyl species, which then undergoes an oxygen-rebound mechanism to hydroxylate the substrate. The overall barrier for this reaction mechanism was calculated to be 18.8 kcal/mol, which is a much more feasible energy barrier at biological conditions than 43.0 kcal/mol for a “superoxo” mechanism. The rate-limiting step in the Cu(II)−oxyl mechanism is C−H abstraction from the substrate, and

abstractions from the C1 and C4 carbon are energetically quite similar.1003 Although the two proposed reaction mechanisms from Kim et al.1003 draw heavily on the proposed catalytic cycle from Phillips and colleagues50 (Scheme 4) and the study of other copper oxygenases (refs 1005, 1006, 1009, 1011−1023), there are some important contrasting features. Namely, the mechanisms proposed by Kim et al. do not require formation of two radicals simultaneously, and do not employ a reducing agent at disparate parts of the catalytic cycle.1003 Additionally, N-methylation of the N-terminal histidine was also studied with the theozyme approach, but no appreciable differences were predicted on the basis of the theozyme method. This study provided the first detailed examination of the LPMO catalytic mechanism, and implicated a ROS for LPMO action of substantial oxidative power,1005,1006 concomitant with the incredibly strong glycosidic bonds in polysaccharides.71 Certainly, however, more work remains to be conducted to fully characterize the LPMO catalytic mechanism. Given the reclassification of GH61s and CBM33s as LPMOs, the curators of the CAZy database recently published an update in which many oxidative biomass-degrading enzymes including LPMOs were catalogued.152 With the diverse activity of this portfolio of enzymes, the large suite of oxidative enzymes was given the generic title “auxiliary activity” or AA. Fungal LPMOs, formerly GH61s, are classified as AA9 enzymes whereas nonfungal LPMOs are classified as AA10 enzymes.152 Additional families of LPMOs have also recently begun to emerge. Hemsworth and colleagues reported the discovery of a new LPMO family, classified as AA11 on the CAZy database.1024 The authors indicate that this class of LPMOs is phylogenetically distinct from the currently classified oxidative enzymes, and characterize a specific member of this new family, which is a chitinolytic LPMO from A. oryzae that forms aldonic acids. 
The structure of the AA11 LPMO is overall quite similar to those of previously characterized fungal and nonfungal LPMOs. A conserved alanine residue is contained in the active site, similar to nonfungal LPMOs, and a tyrosine residue resides axial to the copper ion. On the basis of this observation and detailed analysis of the EPR spectra, the authors suggest that AA11 LPMOs are intermediate between the two known families (Figure 78). In 2014, a report from the Eijsink group was also published which demonstrated that AA10 LPMOs exhibit a broader regioselectivity than just C1 oxidation.991 In a structural and


biochemical study from Forsberg et al., the authors solved high-resolution structures of a pair of AA10 LPMOs from S. coelicolor, and showed that one was a C1 oxidizer and the other specific to C4 oxidation. The enzymes were also shown to be synergistic in their degradation of cellulose. They also demonstrated that the copper affinities, redox potentials, and EPR spectra are quite similar for both the C1 and C4 oxidizing LPMOs. The outcome from this study, based on a structural comparison of the active sites, is that regioselectivity is likely directed by residues beyond the first “shell” from the copper binding site of LPMOs, similar to hypotheses for AA9 LPMOs. From a more industrial standpoint, additional discoveries have been made related to LPMO action. Related to the original observation from Harris et al.967 that LPMOs do not synergize with GH cocktails on clean cellulose substrates, Dimarogona and colleagues demonstrated that the presence of lignin specifically boosts the activity of a fungal LPMO.1025 Cannella et al. conducted a study with the LPMO-containing Cellic CTec2 cocktail (Novozymes, Inc.) at high solids loadings, and demonstrated that up to 4% of the glucose released was gluconic acid.626 The authors went on to further demonstrate that the β-glucosidase in Cellic CTec2 cleaves the glycosidic bond in cellobionic acid 10 times slower than in cellobiose and that gluconic acid is a significant inhibitor of the β-glucosidase. 
Similar to the results of Dimarogona et al.,1025 Cannella and colleagues demonstrate that the presence of lignin negates the need to add external reducing agents.626 In a subsequent study, Cannella and Jørgensen demonstrate that simultaneous saccharification and fermentation conditions reduce the amount of gluconic acid produced, likely due to oxygen uptake by the fermentative organism.627 Lastly, the analysis of heterogeneous, oxidized products from LPMOs has been a significant challenge in the field that has required the development of new analytical approaches. In a series of two papers, Isaksen et al. and Westereng et al. report detailed spectroscopic and chromatographic methods that enable chemically accurate and robust analysis of oxidized products.52,999

ROS via an oxygen-rebound mechanism for polysaccharide hydroxylation.1003 LPMOs are a recent discovery, and they are rapidly emerging as a very important enzyme addition to the canon of cellulolytic enzymes, which to date have primarily been hydrolytic. They are also attracting interest from a fundamental perspective, given their unique copper-binding sites and the fact that they catalyze cleavage of very strong glycosidic bonds.1027 LPMOs are able to synergize with GHs, likely as endo-acting enzymes that act directly on the surface of crystalline polysaccharides. This ability to act on crystalline regions offers a plausible explanation of the apparent synergy of LPMOs with EGs, which are thought to primarily act in more accessible, amorphous regions of cellulose. As further evidence toward this hypothesis, in the case of α-chitin, low crystallinity in the initial substrate reduces the boosting potential of CBP21.1028 As discussed in section 2, glycosidic linkages in polysaccharides are among the strongest covalent linkages found in nature.71 The hydrolytic enzymes that cleave these linkages are thus some of the most powerful enzymes known, and the discovery of LPMOs demonstrates that other enzymatic paradigms are also able to selectively cleave these bonds.48 Transition metals are usually required in enzymes or oxidation catalysts to circumvent spin-forbidden transitions in the triplet ground state of molecular oxygen. The oxidative power required to break strong bonds in carbohydrates coupled to the active site geometries revealed in LPMO structures has drawn comparison to methane monooxygenase, which activates one of the strongest known C−H bonds.50,51,1002,1009 Similarly, Kim et al. predicted that an incredibly powerful oxidative species, Cu(II)−oxyl, is likely responsible for polysaccharide hydroxylation in a fungal LPMO.1003 Going forward, there are undoubtedly many open questions related to the mechanisms of these fascinating enzymes. 
In a recent perspective on LPMOs,1002 Hemsworth, Davies, and Walton posited that nonfungal (CBM33, AA10) and fungal (GH61, AA9) LPMOs may employ different catalytic mechanisms based on differences in the active site structures, illustrated for representative structures in Figures 74 and 77, and their EPR spectra. Specifically, nonfungal LPMOs examined to date exhibit a conserved alanine residue in the active site that is absent in fungal AA9 LPMO structures and sequence alignments. This alanine residue may alter the initial binding of oxygen1003 or modulate the identity of the ROS.991,1002 Additionally, the identity of the axially coordinating residue was thought for some time to be mainly tyrosine in fungal AA9 LPMOs and phenylalanine in nonfungal AA10 LPMOs, but it is not known how universal this delineation is, or if the axial residue plays a significant role in the reaction. In fact, Harris et al. demonstrated in their initial study that the mutation of the tyrosine residue to phenylalanine in a fungal LPMO reduced the observed synergy, but did not remove it altogether, suggesting that the enzyme was still active.967 Later reports of AA10 LPMOs have shown that the axial residue can be tyrosine.1029 The report of the AA11 LPMO family was the first fungal LPMO that exhibits an axial phenylalanine residue and an alanine in the active site, in a seemingly intermediate active site arrangement between AA9 and AA10 LPMOs.1024 Additionally, there is a highly conserved residue in the active site that is typically glutamine in fungal LPMOs and glutamic acid in nonfungal LPMOs, but again its role in the oxidation reaction is unknown. Additionally, the lack of N-methylation on the N-terminal histidine in bacterially expressed AA10 LPMOs, or

11.3. Conclusions

In terms of LPMO mechanisms of action, to date, we collectively know the following: (1) LPMO action requires a reducing agent, molecular oxygen, and a copper ion in the active site.48,50,51,981 (2) The reducing agent required for LPMO action can be enzymatic (CDH), an externally added small molecule, or lignin-derived molecules found in biomass.48−51,312,626,965,967,979,995,997,999,1002,1024−1026 (3) LPMOs have been definitively shown to act on either C1 or C4 carbons,50,312,995,997,999 and several motifs have been identified from biochemical and structural studies that impart regioselectivity.997 Oxidative action at the C1 carbon yields a lactone, which can be hydrolyzed to an aldonic acid motif, whereas C4 oxidation yields a 4-keto sugar that can hydrolyze to a geminal diol. (4) There have been three “auxiliary activity” families characterized according to the CAZy database thus far on the basis of sequence differences.152,1024 Different LPMOs within the AA9 and AA10 families can act either on insoluble forms of cellulose or chitin and, in one reported case to date, on soluble cello-oligomers,999 xyloglucan,1000 and starch.1001 (5) A theoretical prediction has been reported that suggests the fungal LPMO mechanism employs a Cu(II)−oxyl


12. MODELING ENZYMATIC HYDROLYSIS Extremely valuable information regarding the catalytic and processive mechanisms of CBHs and EGs can be obtained by considering each component in isolation. However, synergism between multiple cellulase components cannot be captured by models that only consider one cellulase at a time. Kinetic models of multiple enzyme components (cocktails) allow for testing of predictions and quantifying the effects of varying enzyme and substrate concentrations and properties.33,556 We review here two basic categories of kinetic models that have been applied to the enzymatic hydrolysis of cellulose, namely ordinary differential equation (ODE)-based models and agent-based models. In what follows, we first highlight ODE-based models (Figure 79), followed by agent-based models (Figure 80).

fungal LPMO heterologously expressed in yeast, suggests that perhaps filamentous fungi are able to employ this posttranslational modification whereas bacteria and some yeast cannot. If this modification is significant for catalytic action, then perhaps it might lead to differences in activity or even a different mechanism. Lastly, X-ray radiation is able to rapidly photoreduce the copper ion from a copper(II) oxidation state to a copper(I) oxidation state in AA10 (and for at least one case, AA11) LPMOs. However, fungal LPMOs solved to date with similar X-ray crystallography approaches are able to harbor copper(II) ions seemingly with no consideration for photoreduction. This difference may suggest clues as to the differences in the mechanisms of these LPMO families. Regardless, although the LPMO enzymes in the AA9−11 families are distinctly separated in sequence space, it is not clear how universal the active site differences are between the three LPMO families, nor if this has significant ramifications on the oxidation mechanism. Thus far, LPMO products and substrates do not significantly differ between the families relative to the internal variation per family. Undoubtedly, the more information that we collectively elucidate regarding the mechanisms of these novel enzymes will likely warrant constant evaluation as to their classification. The differences in reactivity, now among the three LPMO families,1024 will require considerable biochemical, structural, and theoretical work to fully unravel. Additionally, to date very little has been reported on the electron transfer steps besides a study on the interaction of a fungal LPMO with two different CDHs.1026 Sygmund et al. examined the effect of the two native CDHs from N. 
crassa in their interaction with an LPMO from the same organism, and found that there are differences in the efficiency of CDH/ LPMO interaction, but the authors point out that this requires a significantly expanded scope to be able to ascertain LPMO− CDH intermolecular interactions and catalytic efficiencies.1026 Intramolecular electron transfer in LPMOs has been postulated to occur via clusters of aromatic residues,991,995 but the test of this hypothesis has not yet been reported. The need for further studies into the interactions of CDH or small-molecule reducing agents with LPMOs is obvious, and may be appropriate for biophysical studies such as those utilizing surface plasmon resonance, NMR, or even cocrystallization. Beyond the detailed, molecular mechanisms of LPMO action, there are many questions that need to be answered related to industrial applications of LPMOs. The design of optimal mixtures of GHs and LPMOs will be required for biomass depolymerization in a biofuels or chemicals context, and this work will require massive efforts similar to the decades of efforts already spent optimizing GH cocktails and understanding enzyme synergy. The application of sophisticated enzymology tools such as those reviewed above from Väljamäe et al.,307,531,557 Westh et al.,559,560 and Igarashi et al.476,558 will undoubtedly help elucidate the mechanisms of synergy in a more detailed fashion, which will accelerate the development of optimized GH/LPMO cocktails. Lastly, the discovery of LPMOs 20 years after the first structures of the first GH enzymes from fungi172,173,192 and 60 years after the C1−Cx hypothesis was put forth by Reese et al.291 highlights the fact that we still have much to learn about novel biomass depolymerization mechanisms in nature. There are likely more discoveries on the level of LPMOs that we still have yet to uncover about how fungi and organisms from other kingdoms of life degrade plant cell walls.

Figure 79. Illustrative example of an ODE-based model for the enzymatic degradation of crystalline cellulose. (A) An example of the discrete steps of CBH processive/hydrolytic activity that may be captured by an ODE based kinetic model. Kinetic schemes to capture the CBH processive action may assume (B) that CBH can only produce cellobiose or (C) that initial cuts produce glucose, cellobiose, and cellotriose with nonzero probability. Note that the cycle in part A does not correspond precisely to either of the schemes shown in parts B or C. Part A is reprinted with permission from ref 532. Copyright 2011 American Chemical Society. Parts B and C are reprinted with permission from ref 1040. Copyright 2011 Wiley Periodicals, Inc.

12.1. Ordinary Differential Equation-Based Models

ODE-based kinetic models first formulate a set of process steps (Figure 79A) and then develop a set of differential equations that describe these steps (Figure 79B,C). The rate constants necessary for each process step generally come from experiments (or sometimes from simulations) wherein the kinetics of individual steps have been isolated. Varying system inputs (e.g., starting concentrations of substrate or enzyme, substrate properties, the reaction rate constants, etc.) and solving the


enzyme loading exists in between these two limits, resulting in a maximum degree of synergy.1032 Eriksson et al. modeled the synergistic action of TrCel7A and TrCel7B (a CBH and an EG, respectively) on steam-pretreated spruce and attempted to account for the rapid rate decrease seen in enzymatic hydrolysis.569 Their experiments indeed exhibited this rate decline, and they were able to rule out enzyme thermal stability and product inhibition as major causes. Addition of TrCel7B increases the hydrolysis rate more than adding TrCel7A, which in turn has a greater effect than adding substrate. Their explanation for EG/CBH synergism centered on the ability of EGs to “remove” disordered cellulose chains that act as obstacles to processive CBH action. This proposal has gained considerable traction in recent years as convincing experimental evidence has pointed to steric obstacles as a rate-limiting factor for CBHs acting in isolation307 and their removal as a vital role for EGs.557 Zhang and Lynd developed a model for an enzyme cocktail with parameters chosen from the experimental literature for TrCel7A, TrCel6A, and TrCel7B.1034 Notably, they apply a single set of enzyme parameters to various “substrates” (by examining characteristic values of degree of polymerization DP and fraction of β-glucosidic bonds available to cellulase). They conclude that these two substrate parameters are sufficient to capture various phenomena such as the dependence of cellulase synergy on the substrate properties, enzyme loading, and reaction time. In addition, of the two substrate properties, they predict that enhancing substrate availability is more beneficial to increasing activity than decreasing DP.1034 Bommarius et al.596 compared two approaches to the problem of enzyme adsorption, diffusion, and reaction, namely fractal kinetics and jamming kinetics. 
Fractal kinetics accounts for the heterogeneous nature of the substrate that restricts cellulase diffusion, whereas jamming kinetics focuses on the obstruction of cellulase motion due to other bound cellulases. Experimentally, they studied the effect of using an array of different pretreatments on rate slowdown. They found that the rate of hydrolysis dropped by 2−3 orders of magnitude at high degrees of conversion, but that this drop did not depend on the pretreatment method. In fact, they concluded that pretreatment and fractal kinetics are irrelevant to the rate decline seen at high degrees of conversion. Their model instead explained the decline as being due to the jamming of enzymes bound to the substrate surface.596 Zhou et al. introduced a model that explicitly accounted for the changing morphology of the substrate due to enzymatic hydrolysis as well as the cellulose chain fragmentation.1035−1037 This is an important concept, as the evolution of accessible surface area is likely to be a complex function of multiple dynamic parameters throughout the course of hydrolysis. This model was applied to the T. reesei system of Cel7A, Cel6A, and Cel7B degrading Avicel,1035 and they were able to explicitly capture the hydrolytic evolution of cellulose substrate. Their results indicated that, in addition to averaged substrate characteristics such as DP, internal surface area, and enzyme accessibility fraction, the distribution of structural heterogeneities is an important characteristic to be considered. Their model captured the rapid rate retardation of enzymatic hydrolysis, and the authors proposed that this distribution is an important contributing factor in the slowdown. Their “surface ablation” model is conceptually similar to the lattice models of Sild et al.600,1038 discussed below. Within this model, there are two relevant substrate-related time scales: the time to degrade a

Figure 80. Illustrative example of an agent-based model for crystalline cellulose enzymatic degradation. (A) At the beginning of an adsorption/hydrolysis simulation the substrate surface lattice is perfectly ordered. (B) After adsorption equilibrium has been reached, nonproductively bound CBHs contribute to a reduced apparent processivity. (C) Surface erosion is exemplified after limited hydrolysis and (D) after extended hydrolysis. Reprinted with permission from ref 600. Copyright 2001 John Wiley and Sons.

resulting set of differential equations can reveal the dependency of various outputs (product concentrations, conversion, degree of synergism, etc.) on the inputs. In the earliest model of this type, Suga et al. predicted the molecular weight distribution of saccharides released from an insoluble substrate, considering EG and CBH (the latter acting exclusively at chain ends) both in isolation and in concert.1030 They modeled the substrate−enzyme interaction as Michaelis−Menten kinetics, but this is solely for the reaction and not for adsorption, which they did not consider. They assumed that all β-glycosidic bonds are accessible to the enzymes. They noted that, with EG alone, the production of monomers (in this case, cellobiose) is extremely low due to its random cleavage of internal bonds. On the other hand, CBHs produce only monomers, but have a low concentration of chain-end glycosidic bonds available to cleave. In concert, however, they demonstrated that the monomer production was drastically increased, thus capturing endo/exo synergism.1030 The model of Okazaki et al. added to the approach of Suga et al. the inclusion of β-glucosidase activity as well as consideration of product inhibition.1031 Their results predicted that CBH catalytic activity is inversely proportional to the initial substrate DP and that EG activity is not dependent on DP. However, the synergistic effect of employing EG and CBH in concert actually increases as the substrate DP increases.1031 Converse and Optekar considered a binary cocktail of EG and CBH1032 in order to rationalize the fact that the degree of synergism had been shown1033 to go through a maximum as the total enzyme concentration increased. Their model featured competitive adsorption of the two cellulases for a limited number of sites. At low enzyme concentrations, the system is also at low conversion, and the CBH is less dependent on the new chain ends created by the EG. 
At high enzyme concentrations, CBH dominates the surface sites and thus restricts the EG from creating new chain ends. An optimal


complexation-limited framework. Fox et al.’s calculation for apparent processivity was much lower than for intrinsic processivity; thus, they concluded that CBHs do have their processive runs halted by steric obstacles on the surface,532 consistent with Kurašin and Väljamäe.307 Griggs et al. developed a population balance model to capture the hydrolysis of cellulose by action of an EG and CBH.1041 For the CBH, distinct steps for the adsorption, complexation, processive hydrolysis, and desorption were modeled. They captured temporal development of the chain length distribution for EG in isolation, CBH in isolation, and EG/CBH in concert for both amorphous and structured substrate. In a follow-up study, β-glucosidase action was included to model a complete cocktail. The rate-limiting factor in this study was determined to be the availability of substrate.1042 One of the most difficult aspects of gaining mechanistic insight into how cellulases work is that surface interactions are very difficult to characterize.1043 Fox et al. utilized photoactivated localization microscopy (PALM), and applied it to quantify the binding affinity of six different CBMs for regions of a cellulosic cotton substrate of varying crystallinity.1044 Their results demonstrated quantitatively the invalidity of applying a purely binary classification to cellulose (amorphous vs crystalline). They quantitatively assessed the binding preference of the different CBMs studied and found a continuum of binding preferences. They then related these findings to binary synergistic enzyme activity by synthesizing chimeras composed of A. cellulolyticus GH5 EG, TrCel7A linker, and each of the six CBMs. Their results revealed a significant new element of cellulase synergism. If two cellulases worked in either totally different regions of cellulose or in the same region, there is no synergistic effect. 
However, enzymes with similar but nonidentical binding site preferences gave optimal synergism, by enhancing the susceptibility of the substrate for one another. These conclusions have clear implications for selecting enzymes for cocktails with optimal synergism. Gao et al. challenged the conventional paradigm that increased hydrolytic activity positively correlates with an increase in surface-bound enzyme.1045 With a cocktail of TrCel7A, TrCel6A, and TrCel7B, they showed that although the binding coefficient on cellulose III was only half that on native cellulose Iβ, its hydrolytic activity was increased. In fact, they showed that enzyme loading can be decreased by 5-fold on cellulose III to achieve hydrolysis rates comparable to those on cellulose Iβ. To rationalize this observation, they developed a kinetic model that couples enzyme binding, chain decrystallization/sliding, and hydrolysis as distinct steps in the enzymatic processive cycle. Their model showed that their nonintuitive experimental result can be reproduced if the enzymes’ processive ability (quantified by the rate constant k_slide) was enhanced while their initial chain-association ability (quantified by n·k_on, where n is the number of available binding sites and k_on is the adsorption rate constant) was diminished.
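
The qualitative argument attributed to Gao et al. above (less bound enzyme, yet faster hydrolysis) can be illustrated with a deliberately simplified two-step ODE sketch in Python. This is not the model of Gao et al.: it is a toy scheme, assumed here for illustration, in which free enzyme associates with chains at an effective rate `nk_on`, desorbs at rate `k_off`, and each bound enzyme releases product at rate `k_slide`. The function name `simulate`, the explicit Euler integration, and all parameter values are hypothetical and dimensionless.

```python
def simulate(nk_on, k_off, k_slide, e_total=1.0, t_end=50.0, dt=0.001):
    """Integrate a toy bind/slide scheme with explicit Euler steps.

    dE_bound/dt = nk_on * E_free - k_off * E_bound
    dP/dt       = k_slide * E_bound
    Returns (bound enzyme fraction at t_end, product formed by t_end).
    """
    e_bound = 0.0
    product = 0.0
    for _ in range(int(round(t_end / dt))):
        e_free = e_total - e_bound
        d_bound = nk_on * e_free - k_off * e_bound
        d_product = k_slide * e_bound
        e_bound += d_bound * dt
        product += d_product * dt
    return e_bound / e_total, product


# Hypothetical parameter sets (illustrative values, not fitted to data):
# "substrate_i" binds enzyme well but supports slow sliding;
# "substrate_iii" binds half as much enzyme but supports faster sliding.
frac_i, prod_i = simulate(nk_on=1.0, k_off=1.0, k_slide=1.0)
frac_iii, prod_iii = simulate(nk_on=1.0 / 3.0, k_off=1.0, k_slide=4.0)

print(f"substrate_i:   bound fraction {frac_i:.3f}, product {prod_i:.2f}")
print(f"substrate_iii: bound fraction {frac_iii:.3f}, product {prod_iii:.2f}")
```

In this toy scheme the steady-state bound fraction is nk_on/(nk_on + k_off), so the second parameter set binds half as much enzyme as the first, yet it forms more product because the per-enzyme sliding rate is higher. The sketch omits substrate depletion, decrystallization as a separate step, and product inhibition, all of which a quantitative model would need.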

single glucan chain (nonmorphological) and the time to completely degrade the substrate (which necessitates morphological consideration). The model is able to correlate the first time scale with the initial rapid rate slowdown,1037 a finding that is compatible with the steric obstacles hypothesis.307 However, the model is not able to match experiments in regard to the second time scale, leading to the conclusion that other inhibitory factors must be at work in this regime.1037 Levine et al. introduced a mechanistic model that included the following novel features: enzyme adsorption and complexation were modeled as distinct steps, the equilibrium assumption for adsorption was abandoned, and representation of cellulosic substrate was as a polydisperse distribution of spheres.1039 They applied this to the individual and synergistic behavior of Cel7A and Cel5A from T. reesei. Their model performed well except with low surface area, where enzyme competition for adsorption sites is particularly high and enzyme crowding is most relevant. Also, the experimentally observed retardation of hydrolysis could only be accounted for with abnormally strong product inhibition or short enzyme half-lives; thus, they concluded that neither of these were the primary causes of the slowdown. They also found that enzyme crowding cannot be the sole reason for the hydrolytic rate slowdown; model results only showed this effect with multiple enzymes present and with high substrate loadings, whereas experiments find the rate slowdown to be universal (even with single enzymes). They note that “structural heterogeneities” may play a significant role, but their model did not account for these. Levine et al. 
later expanded their previous model to track individual product formation (glucose, cellobiose, and cellotriose) instead of just overall hydrolytic activity,1040 with the goal of developing a rational method for determining the optimum cellulase ratios for hydrolysis experiments (using Cel7A, Cel6A, and Cel5A from T. reesei). They considered various combinations of substrate DP and surface area, including values representative of BMCC. The model results suggested that optimal enzyme ratios are 1:0:1 with Cel7B:Cel7A:Cel6A at shorter times (24 h) and 1:1:0 at longer times (72 h); the authors indicated this may be related to the relative thermal stabilities of the two CBHs. Informed by the kinetic models just described, Fox et al. experimentally examined the kinetics of CBH chain complexation and how EGs affect these kinetics.532 They employed long-time hydrolysis trials (>100 h) of BMCC by CBH Cel7A from T. longibrachiatum and EG II from T. emersonii. They measured initial-cut product release and equated this to initial chain complexation. They also measured the processive-cut products and deduced processivity from the ratio of processive-cut products to initial-cut products. Via material balances on the initial-cut products and the processive-cut products, they developed a system of ODEs. Even when chain ends for CBH action were in excess (due to a high concentration of EG), the rate of generation of initial-cut products was an order of magnitude lower than the hydrolysis rate of soluble cellohexaose. From this and other evidence, it was concluded that chain complexation is rate-limiting. This was in contrast to the findings of Kurašin and Väljamäe.307 However, Fox et al. also found that increasing EG concentration decreases the length of a processive run for the CBH while also increasing the hydrolytic conversion.
This finding perhaps supports the idea that EGs help to “rescue” CBHs that are trapped behind obstacles.557 This would explain the trends seen in processivity and conversion within the obstacle-limiting framework, though it does not fit as well in the

12.2. Agent-Based Models

Agent-based models (also known as cellular automata) treat the cellulosic substrate and enzymes as individual entities and assign behaviors or properties to each on the basis of previously published parameters. They have the advantage of giving a spatial dimension to traditional models that are solely based on differential equations. They also allow visual inspection, which can give insight into mechanisms and guide further model development (Figure 80).
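To make the idea of treating enzymes as individual entities on a spatial substrate concrete, the following minimal sketch places enzymes one at a time on a one-dimensional lattice of binding positions. It is an illustration only and does not reproduce any of the published models discussed in this section: adsorption is assumed irreversible, the lattice is one-dimensional rather than two- or three-dimensional, and the function name, lattice size, and footprint are hypothetical. Each enzyme occupies `footprint` consecutive sites, and candidate positions overlap, so gaps too small to host another enzyme accumulate as the surface fills.

```python
import random


def jammed_coverage(n_sites: int, footprint: int, attempts: int, seed: int = 0) -> float:
    """Random sequential adsorption of enzymes on a 1D lattice.

    Each attempt picks a random start position; the enzyme adsorbs only if all
    `footprint` sites it would occupy are free. Returns the covered fraction.
    """
    rng = random.Random(seed)
    occupied = [False] * n_sites
    covered = 0
    for _ in range(attempts):
        start = rng.randrange(n_sites - footprint + 1)
        span = range(start, start + footprint)
        if not any(occupied[i] for i in span):
            for i in span:
                occupied[i] = True
            covered += footprint
    return covered / n_sites


if __name__ == "__main__":
    for size in (1, 2, 5, 10):
        cov = jammed_coverage(n_sites=2000, footprint=size, attempts=200_000, seed=1)
        print(f"footprint = {size:2d} sites -> coverage = {cov:.3f}")
```

For footprints larger than one site, the lattice jams well short of full coverage even though enzyme keeps arriving, which is the kind of crowding effect that differential-equation models without a spatial dimension do not capture directly.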

Finally, this type of model more naturally captures physical phenomena on the substrate surface such as enzyme crowding and nonproductive binding to substrate. Sild et al. employed an infinite, anisotropic, two-dimensional lattice simulation with overlapping binding sites (of only one type) with the goal of estimating the amount of bound enzyme as a function of free enzyme.1038 Simple rate equations governed the binding and dissociation of CBH (e.g., Cel7A) to/from an initially perfect crystalline surface (e.g., Avicel). Though experimentally it had been shown that the adsorption of TrCel7A to cellulose could be accounted for by assuming two different types of binding sites (perhaps signifying regions of high and low crystallinity), Sild et al. demonstrated with their model that if the binding sites are allowed to overlap (as is the case experimentally),348 the data is fit equally well by a model that assumes only one type of binding site. Their data also indicate that, at high cellulase loadings, the surface reaches an apparent equilibrium before the theoretical maximum coverage due to surface crowding and slow rearrangement. This resulted in surface coverage that is as much as 40% below the theoretical maximum, depending on the shape of the adsorbing molecule.

Later, Väljamäe et al.600 coupled experiments with an expanded version of the Monte Carlo simulation model of Sild et al.1038 by adding two significant aspects: TrCel6A was included (in addition to TrCel7A), and the hydrolysis, as well as the binding, was considered. Experimentally, they find hydrolytic rate retardation at short times (in the first 5−10 min). Also, preincubation of BMCC with TrCel7A actually decreased the reaction velocity for subsequent hydrolysis with TrCel7A. However, preincubation with TrCel6A increased the reaction velocity for subsequent hydrolysis by TrCel7A. Thus, so-called exo/exo synergism was not dependent on the CBHs being incubated simultaneously. They subsequently sought to explain these results via Monte Carlo simulations. They attributed the initial rate retardation to two primary factors: steric hindrance by nonproductively bound cellulases and cellulose surface erosion due to CBH action. Their simulations support these two modes of CBH inhibition (Figure 80). The early stages of adsorption/desorption simulations featured the most drastic increase in bound CBHs causing a concomitant increase in enzyme obstacles. In addition, CBH processive action degraded the surface from an initially homogeneous one to a drastically eroded one; at long times, a steady-state erosion pattern having a constant retarding effect should be established.600

Fenske et al. examined the case of a single cellulase that has nonzero exo and endo capabilities acting on an insoluble polysaccharide substrate, modeled as a two-dimensional lattice of fixed DP.1046 Enzymes were allowed to move and cleave bonds on the surface in a stochastic fashion (with fixed probabilities). Their results indicated that substrate inhibition was present at high substrate loadings with only one cellulase component. This was due to a decrease in “autosynergism”, that is, the endo activity of an enzyme aiding the exo-activity of the same enzyme.1046

Warden et al. developed a cellular automata model to model the deconstruction of cellulose to glucose by cellulase cocktails.1047 They developed a three-dimensional spatial lattice model that included control over many physical and chemical variables involved in the saccharification of cellulose. They utilized published values for the cellulase system for T. reesei (section 4.3) and examined the effect on glucose production of varying enzyme loadings, adsorption strengths, and catalytic activities of the EGs and CBHs in the system. In general, they found that varying the enzyme/substrate ratio and adsorption strength (for either CBH or EG) from the literature values resulted in a decrease in glucose production. They found decreasing glucose production with time and attributed this to enzyme crowding on the surface and decreased cellulose surface area.

Asztalos et al. utilized a spatial model for cellulose degradation that “forms a bridge” between all-atom MD simulation and higher-level ODE models that accounted for enzyme crowding on the surface.1048 They developed a coarse-grained stochastic model of endo and exo cellulases with a two-dimensional surface modeling crystalline cellulose. They examined TrCel7A, TrCel6A, and TrCel7B and include the following reactive events: adsorption, interchain hydrogen bond breaking, hydrolysis of glycosidic bonds, and desorption. Endo/exo synergism was qualitatively reproduced.1048

In summary, both ODE- and agent-based models of cellulose degradation by enzyme cocktails have supplemented and improved experimental characterization of the substrate and enzymatic action. In particular, they have shed light on the mechanisms and influencing factors of significant experimentally observed phenomena including cellulase synergism and the rapid initial retardation in hydrolysis rate. These models offer the ability to test mechanistic explanations for these phenomena as well as to strip the system down to its basic essentials in order to offer these insights. As such, they can inform the selection of process conditions, such as optimal cellulase loadings and ratios between cocktail components. Finally, they have the potential of assisting in the determination of the relative rates of the individual steps of the processive cycle of CBHs. Going forward, these models have the potential to enhance mechanistic understanding, particularly as they are increasingly coupled to advanced experimental techniques (refs 102, 103, 128, 307, 476, 521, 532, 534, 535, 546, 548, 552, 557, 558, 560, 561, 564, 795, and 1044) and include the synergistic action of CBHs (with endo and exo capability and acting from both reducing-end and nonreducing-end), EGs, and LPMOs.

13. CONCLUDING REMARKS

Fungi have evolved to be the most powerful and prevalent biomass degrading organisms in nature, exhibiting a diverse range of lifestyles for the turnover of lignocellulosic material on Earth. Given their significant activity and ability to be readily produced at high titers on the industrial scale, fungal cellulase cocktails are an excellent starting point for industrial biorefinery applications. Indeed, industrial-scale lignocellulosic biorefineries, mostly slated to produce bioethanol at this point, are beginning to come online worldwide at the time of this review. Given the scale of fuel demand and the significant potential for biomass conversion to offset fossil fuel usage, the use of fungal cellulases will likely grow dramatically in the coming years, which in turn could make fungal cellulases the most widely produced industrial enzymes in the world by a large margin. As enzymes remain a major cost in renewable fuels production, this in turn makes even “small” gains in enzyme or cocktail performance of significant importance in the renewable energy economy.

As covered in this review, significant strides in our fundamental understanding of cellulase action have been made in the past several decades, especially driven by the structural biology efforts starting in the early 1990s. However, challenges to “well-accepted” models and the basic physical mechanisms of cellulose deconstruction have been reported in the last several years that highlight how much we have yet to understand. For example, the role of hydrolytic EGs has been recently revealed to be generation of “dissociation” points for CBHs, instead of the commonly cited attachment points (see section 6.2). Additionally, the recent discovery of LPMOs highlights the fact that basic cellulolytic paradigms are not fully characterized, and it is possible that other enzyme activities exist that have not yet been discovered, or like LPMOs, may be misclassified. Beyond those two examples, many open questions remain even regarding basic catalytic mechanisms in some cases (e.g., GH6 cellulases and LPMOs), the molecular basis for activity differences within the same family of cellulolytic enzymes, the basis of synergistic function in enzyme cocktails (e.g., why are multiple EGs needed? What function do LPMOs serve beyond EGs?), and enzyme−substrate interactions across the multiple length scales likely of importance in cellulose (and biomass) deconstruction.

Unlike most enzymes that work in solution, cellulases function at a solid−liquid interface on a physically heterogeneous substrate in the case of clean cellulose and physically and chemically heterogeneous substrate in the case of native or pretreated plant cell walls. Our lack of understanding of these solid substrates coupled to the inherent difficulties associated with effective kinetics measurements of single enzymes and enzyme cocktails makes the challenge of understanding and improving cellulase action a daunting technical challenge. Moreover, for both fundamental investigations and industrial applications, the substrate of choice must be carefully considered for interpretation of results, and in the latter case of pretreated biomass, the substrate treatment must be “co-optimized” with the enzyme cocktail design.

Taken together, these recent findings and open questions continue to challenge our collective basic premise of cellulose deconstruction and highlight the fact that exciting, impactful advances are yet to be made in the study of cellulose deconstruction in both natural and applied contexts. Going forward, these open questions will be addressed through the continued application of traditional structural biology and biochemical measurements, development and application of advanced biophysical measurements, kinetic and molecular modeling coupled to experimental findings, screening and mining natural diversity of both natural enzymes and secretomes, detailed substrate characterization, and the development of novel methods for engineering enzymes that inherently work at solid−liquid interfaces in a cocktail context. The elucidation of rate-limiting steps in “simple” cocktails via advanced methodologies developed in the last several years has begun to uncover some aspects of cellulolytic action that can be considered targets for enzyme engineering. Continued development and application of these types of quantitative methods in increasingly complex systems and across substrates will further elucidate the physical and chemical mechanisms of cellulase action, and highlight opportunities for improvement of enzyme performance.

In conclusion, fungal cellulases, some of which remain to be discovered, work in concert to accomplish one of the most important processes in nature, namely the turnover of recalcitrant cellulose. This process is of paramount importance in the global carbon cycle and may become one of the most industrially important enzymatic reactions given the desperately needed drive toward a renewable energy-based global society. Understanding the function, diversity, and mechanisms of fungal cellulases remains an extremely challenging, massive endeavor but one of considerable importance for both the natural and industrial world. Advances from the worldwide research community will be needed to comprehensively address these open questions and to harness the considerable potential of these fascinating enzymes.

AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected].

Author Contributions

# These authors contributed equally to the review.

Notes

The authors declare no competing financial interest.

Biographies

Christina M. Payne is an assistant professor in the Department of Chemical and Materials Engineering at the University of Kentucky and the August T. Larsson Guest Researcher at the Swedish University of Agricultural Sciences in Uppsala, Sweden. She received her Ph.D. in chemical engineering from Vanderbilt University in 2007. After a brief foray into the world of chemical process engineering, she undertook postdoctoral studies at the National Renewable Energy Laboratory (NREL) in 2011 and was promoted to staff scientist the same year. In 2012, she joined the faculty at the University of Kentucky where her research focuses on application of computational tools to investigate protein structure−function relationships and biocatalysis.

Brandon C. Knott is a Postdoctoral Research Fellow in the National Bioenergy Center at NREL and recipient of the NREL Director’s Fellowship. He obtained his Ph.D. from the University of California, Santa Barbara, in 2012 in Chemical Engineering for research into the mechanisms of nucleation from solution utilizing statistical mechanics, computer simulation, and experiments. His current research at NREL focuses on utilizing advanced computational approaches to elucidate mechanisms of enzymatic catalysis relevant to biofuels production.

Michael E. Himmel is a Research Fellow in the Biosciences Center at NREL. He obtained his Ph.D. from Colorado State University in 1980 in Biochemistry. At NREL, he has conducted, led, and designed research in protein biochemistry, recombinant technology, enzyme engineering, and new microorganism discovery. During the past three decades, he has contributed over 300 journal articles, 7 books, and 22 patents to the literature. In 2004, he was a recipient of an R&D 100 award for “Advanced Catalytic System for Biomass Conversion”.

Heather B. Mayes is a Ph.D. candidate in the Department of Chemical and Biological Engineering at Northwestern University in Evanston, IL, where she is using computational chemistry to elucidate thermochemical and enzymatic routes for cellulose decomposition to produce renewable fuels and chemicals. She is a Department of Energy Computational Science Graduate Fellow, Chicago ARCS Scholar, and active member of the Society of Women Engineers. Before returning to school, she worked as a chemical engineering consultant for Jacobs Consultancy, helping transportation fuel companies evaluate potential new processes and make existing processes safer and more energy-efficient.

Mats Sandgren is an Associate Professor at the Department of Chemistry and Biotechnology, Swedish University of Agricultural Sciences, Uppsala, Sweden, and has studied cellulose degrading enzymes since 1998. At Uppsala University he obtained his Ph.D. in molecular biology in 2003, in the group of Alwyn Jones, for research on structure−function relationships in cellulases. He and his research group moved to the Swedish University of Agricultural Sciences in 2007 and focus now on research on structure−function relationships in cellulases and other carbohydrate-active enzymes.

Henrik Hansson is a Researcher at the Department of Chemistry and Biotechnology, Swedish University of Agricultural Sciences, Uppsala, Sweden. He obtained his Ph.D. from the Royal Institute of Technology, Stockholm, in 2002 for research in protein−protein interaction using NMR and other biophysical techniques. A postdoctoral period followed at Uppsala University, Uppsala, Sweden, working with Dr. Gerard J. Kleywegt in the laboratory of Professor T. Alwyn Jones. In 2006, he went to the Swedish University of Agricultural Sciences and Department of Molecular Biology, now the Department of Chemistry and Biotechnology, to use crystallography and structural biology to study cellulases and other cellulose degrading enzymes.

Jerry Ståhlberg is an Associate Professor at the Department of Chemistry and Biotechnology, Swedish University of Agricultural Sciences, Uppsala, Sweden, and has studied cellulose-degrading enzymes since the mid-1980s. At Uppsala University he obtained his Ph.D. in biochemistry with Göran Pettersson, and he then joined the structural biology group of Alwyn Jones. He moved to the Swedish University of Agricultural Sciences in 1998 and is leading research on structure−function relationships in cellulases and other carbohydrate-active enzymes.

Gregg T. Beckham is a Senior Engineer in the National Bioenergy Center at NREL in Golden, Colorado. He obtained his Ph.D. from the Massachusetts Institute of Technology in 2007 in Chemical Engineering for research into solid-state nucleation mechanisms using statistical mechanics and molecular simulation. After joining NREL in 2008, his research interests have focused on understanding structure−function relationships in carbohydrate-active enzymes, chemical catalysis, and process development for carbohydrate and lignin valorization.

ACKNOWLEDGMENTS

We thank our colleagues, past and present, for many productive collaborations and engaging discussions that contributed to ideas in this review. C.M.P. acknowledges the August T. Larsson Guest Researcher Programme at the Swedish University of Agricultural Sciences for funding. B.C.K. and G.T.B. acknowledge the NREL Laboratory Directed Research and Development Program and Director’s Fellowship Program for funding. G.T.B. and M.E.H. acknowledge funding from the US Department of Energy BioEnergy Technologies Office. H.B.M. thanks the DOE Computational Science Graduate Fellowship (CSGF), provided under Grant DE-FG02-97ER25308 and the ARCS Foundation Inc., Chicago Chapter. H.H., M.S., and J.S. thank the faculty for Natural Resources and Agriculture at the Swedish University of Agricultural Sciences for support of research through the program “MicroDrivE”.

ABBREVIATIONS

AFM atomic force microscopy
BC bacterial cellulose
BMCC bacterial microcrystalline cellulose
CBD cellulose-binding domain
CBH cellobiohydrolase
CBM carbohydrate-binding module
CBM33 family 33 carbohydrate-binding module
CBP21 chitin-binding protein 21
CD catalytic domain
CDH cellobiose dehydrogenase
CI crystallinity index
CMC carboxymethyl cellulose
EG endoglucanase
EPR electron paramagnetic resonance
GH glycoside hydrolase
GH5 glycoside hydrolase family 5
GH6 glycoside hydrolase family 6
GH7 glycoside hydrolase family 7
GH12 glycoside hydrolase family 12
GH45 glycoside hydrolase family 45
GH61 glycoside hydrolase family 61
Glc3 cellotriose
Glc4 cellotetraose
Glc5 cellopentaose
Glc6 cellohexaose
Glc8 cellooctaose
Glc9 cellononaose
LPMO lytic polysaccharide monooxygenase
MCC microcrystalline cellulose
MD molecular dynamics
NMR nuclear magnetic resonance
PASC phosphoric acid swollen cellulose
oNPC o-nitrophenyl-β-D-cellobioside
pNPC p-nitrophenyl-β-D-cellobioside
pNPL p-nitrophenol-β-D-lactoside
RMSD root-mean-square deviation
SAXS small-angle X-ray scattering
SEM scanning electron microscopy
TEM transmission electron microscopy
XRD X-ray diffraction

REFERENCES

(1) Perlack, R. D.; Wright, L. L.; Turhollow, A. F.; Graham, R. L.; Stokes, B. J.; Erbach, D. C. Biomass as Feedstock for A Bioenergy and Bioproducts Industry: The Technical Feasibility of a Billion-Ton Annual Supply; DOE/GO-102005-2135; U.S. Department of Energy: Oak Ridge, TN, 2005. (2) Conti, J. J.; Holtberg, P. D.; Diefenderfer, J. R.; Napolitano, S. A.; Schaal, M.; Turnure, J. T.; Westfall, L. D. Annual Energy Outlook 2014; DOE/EIA-0383(2014); U.S. Energy Information Administration: Washington, DC, 2014. (3) Brandt, A. R.; Millard-Ball, A.; Ganser, M.; Gorelick, S. M. Environ. Sci. Technol. 2013, 47, 8031. (4) Pacala, S.; Socolow, R. Science 2004, 305, 968. (5) Ragauskas, A. J.; Williams, C. K.; Davison, B. H.; Britovsek, G.; Cairney, J.; Eckert, C. A.; Frederick, W. J.; Hallett, J. P.; Leak, D. J.; Liotta, C. L.; Mielenz, J. R.; Murphy, R.; Templer, R.; Tschaplinski, T. Science 2006, 311, 484. (6) Himmel, M. E.; Ding, S. Y.; Johnson, D. K.; Adney, W. S.; Nimlos, M. R.; Brady, J. W.; Foust, T. D. Science 2007, 315, 804. (7) Somerville, C.; Youngs, H.; Taylor, C.; Davis, S. C.; Long, S. P. Science 2010, 329, 790. (8) Somerville, C.; Bauer, S.; Brininstool, G.; Facette, M.; Hamann, T.; Milne, J.; Osborne, E.; Paredez, A.; Persson, S.; Raab, T.; Vorwerk, S.; Youngs, H. Science 2004, 306, 2206. (9) Scheller, H. V.; Ulvskov, P. Annu. Rev. Plant Biol. 2010, 61, 263. (10) Mohnen, D. Curr. Opin. Plant Biol. 2008, 11, 266. (11) Atmodjo, M. A.; Hao, Z. Y.; Mohnen, D. Annu. Rev. Plant Biol. 2013, 64, 747. (12) Boerjan, W.; Ralph, J.; Baucher, M. Annu. Rev. Plant Biol. 2003, 54, 519. (13) Carpita, N. C. Annu. Rev. Plant Physiol. Plant Mol. Biol. 1996, 47, 445. (14) Somerville, C. Annu. Rev. Cell Dev. Biol. 2006, 22, 53. (15) Morgan, J. L. W.; Strumillo, J.; Zimmer, J. Nature 2013, 493, 181. (16) Pauly, M.; Keegstra, K. Plant J. 2008, 54, 559. (17) Zakzeski, J.; Bruijnincx, P. C. A.; Jongerius, A. L.; Weckhuysen, B. M. Chem. Rev. 2010, 110, 3552.


(18) Ragauskas, A. J.; Beckham, G. T.; Biddy, M. J.; Chandra, R.; Chen, F.; Davis, M. F.; Davison, B. H.; Dixon, R. A.; Gilna, P.; Keller, M.; Langan, P.; Naskar, A. K.; Saddler, J. N.; Tschaplinski, T. J.; Tuskan, G. A.; Wyman, C. E. Science 2014, 344, 1246843. (19) Linger, J. G.; Vardon, D. R.; Guarnieri, M. T.; Karp, E. M.; Hunsinger, G. B.; Franden, M. A.; Johnson, C. W.; Chupka, G.; Strathmann, T. J.; Pienkos, P. T.; Beckham, G. T. Proc. Natl. Acad. Sci. U.S.A. 2014, 111, 12013. (20) Chundawat, S. P. S.; Beckham, G. T.; Himmel, M. E.; Dale, B. E. Annu. Rev. Chem. Biomol. Eng. 2011, 2, 121. (21) Aden, A.; Foust, T. Cellulose 2009, 16, 535. (22) Humbird, D.; Davis, R.; Tao, L.; Kinchin, C.; Hsu, D.; Aden, A.; Schoen, P.; Lukas, J.; Olthof, B.; Worley, M.; Sexton, D.; Dudgeon, D. Process Design and Economics for Biochemical Conversion of Lignocellulosic Biomass to Ethanol; NREL/TP-5100-47764; National Renewable Energy Laboratory: Golden, CO, 2011. (23) Davis, R.; Tao, L.; Tan, E.; Biddy, M. J.; Beckham, G. T.; Scarlata, C.; Jacobson, J.; Cafferty, K.; Ross, J.; Lukas, J.; Knorr, D.; Schoen, P. Process Design and Economics for the Conversion of Lignocellulosic Biomass to Hydrocarbons: Dilute-Acid Prehydrolysis and Enzymatic Hydrolysis Deconstruction of Biomass to Sugars and Biological Conversion of Sugars to Hydrocarbons; NREL/TP-5100-60223; National Renewable Energy Laboratory: Golden, CO, 2013. (24) Mosier, N.; Wyman, C.; Dale, B.; Elander, R.; Lee, Y. Y.; Holtzapple, M.; Ladisch, M. Bioresour. Technol. 2005, 96, 673. (25) Wyman, C. E.; Dale, B. E.; Elander, R. T.; Holtzapple, M.; Ladisch, M. R.; Lee, Y. Y. Bioresour. Technol. 2005, 96, 1959. (26) Wyman, C. E.; Dale, B. E.; Elander, R. T.; Holtzapple, M.; Ladisch, M. R.; Lee, Y. Y. Bioresour. Technol. 2005, 96, 2026. (27) Wyman, C. E.; Dale, B. E.; Elander, R. T.; Holtzapple, M.; Ladisch, M. R.; Lee, Y. Y.; Mitchinson, C.; Saddler, J. N. Biotechnol. Prog. 2009, 25, 333. (28) Garlock, R. J.; Balan, V.; Dale, B. 
E.; Pallapolu, V. R.; Lee, Y. Y.; Kim, Y.; Mosier, N. S.; Ladisch, M. R.; Holtzapple, M. T.; Falls, M.; Sierra-Ramirez, R.; Shi, J.; Ebrik, M. A.; Redmond, T.; Yang, B.; Wyman, C. E.; Donohoe, B. S.; Vinzant, T. B.; Elander, R. T.; Hames, B.; Thomas, S.; Warner, R. E. Bioresour. Technol. 2011, 102, 11063. (29) Tao, L.; Aden, A.; Elander, R. T.; Pallapolu, V. R.; Lee, Y. Y.; Garlock, R. J.; Balan, V.; Dale, B. E.; Kim, Y.; Mosier, N. S.; Ladisch, M. R.; Falls, M.; Holtzapple, M. T.; Sierra, R.; Shi, J.; Ebrik, M. A.; Redmond, T.; Yang, B.; Wyman, C. E.; Hames, B.; Thomas, S.; Warner, R. E. Bioresour. Technol. 2011, 102, 11105. (30) Wyman, C. E.; Balan, V.; Dale, B. E.; Elander, R. T.; Falls, M.; Hames, B.; Holtzapple, M. T.; Ladisch, M. R.; Lee, Y. Y.; Mosier, N.; Pallapolu, V. R.; Shi, J.; Thomas, S. R.; Warner, R. E. Bioresour. Technol. 2011, 102, 11052. (31) Klein-Marcuschamer, D.; Oleskowicz-Popiel, P.; Simmons, B. A.; Blanch, H. W. Biotechnol. Bioeng. 2012, 109, 1083. (32) Lynd, L. R.; Weimer, P. J.; van Zyl, W. H.; Pretorius, I. S. Microbiol. Mol. Biol. Rev. 2002, 66, 506. (33) Zhang, Y. H. P.; Lynd, L. R. Biotechnol. Bioeng. 2004, 88, 797. (34) Zhang, Y. H. P.; Himmel, M. E.; Mielenz, J. R. Biotechnol. Adv. 2006, 24, 452. (35) Stephanopoulos, G. Science 2007, 315, 801. (36) Lynd, L. R.; Laser, M. S.; Brandsby, D.; Dale, B. E.; Davison, B.; Hamilton, R.; Himmel, M.; Keller, M.; McMillan, J. D.; Sheehan, J.; Wyman, C. E. Nat. Biotechnol. 2008, 26, 169. (37) Atsumi, S.; Liao, J. C. Curr. Opin. Biotechnol. 2008, 19, 414. (38) Alonso, D. M.; Bond, J. Q.; Dumesic, J. A. Green Chem. 2010, 12, 1493. (39) Peralta-Yahya, P. P.; Zhang, F. Z.; del Cardayre, S. B.; Keasling, J. D. Nature 2012, 488, 320. (40) Jang, Y. S.; Kim, B.; Shin, J. H.; Choi, Y. J.; Choi, S.; Song, C. W.; Lee, J.; Park, H. G.; Lee, S. Y. Biotechnol. Bioeng. 2012, 109, 2437. (41) Martinez, D.; Berka, R. M.; Henrissat, B.; Saloheimo, M.; Arvas, M.; Baker, S. E.; Chapman, J.; Chertkov, O.; Coutinho, P. 
M.; Cullen, D.; Danchin, E. G. J.; Grigoriev, I. V.; Harris, P.; Jackson, M.; Kubicek, C. P.; Han, C. S.; Ho, I.; Larrondo, L. F.; de Leon, A. L.; Magnuson, J. K.; Merino, S.; Misra, M.; Nelson, B.; Putnam, N.; Robbertse, B.;

Salamov, A. A.; Schmoll, M.; Terry, A.; Thayer, N.; Westerholm-Parvinen, A.; Schoch, C. L.; Yao, J.; Barbote, R.; Nelson, M. A.; Detter, C.; Bruce, D.; Kuske, C. R.; Xie, G.; Richardson, P.; Rokhsar, D. S.; Lucas, S. M.; Rubin, E. M.; Dunn-Coleman, N.; Ward, M.; Brettin, T. S. Nat. Biotechnol. 2008, 26, 553. (42) Medie, F. M.; Davies, G. J.; Drancourt, M.; Henrissat, B. Nat. Rev. Microbiol. 2012, 10, 227. (43) Tien, M.; Kirk, T. K. Proc. Natl. Acad. Sci. U.S.A. 1984, 81, 2280. (44) Kirk, T. K.; Farrell, R. L. Annu. Rev. Microbiol. 1987, 41, 465. (45) Bugg, T. D. H.; Winfield, C. J. Nat. Prod. Rep. 1998, 15, 513. (46) Bugg, T. D. H.; Ahmad, M.; Hardiman, E. M.; Rahmanpour, R. Nat. Prod. Rep. 2011, 28, 1883. (47) Bugg, T. D. H.; Ahmad, M.; Hardiman, E. M.; Singh, R. Curr. Opin. Biotechnol. 2011, 22, 394. (48) Vaaje-Kolstad, G.; Westereng, B.; Horn, S. J.; Liu, Z. L.; Zhai, H.; Sørlie, M.; Eijsink, V. G. H. Science 2010, 330, 219. (49) Langston, J. A.; Shaghasi, T.; Abbate, E.; Xu, F.; Vlasenko, E.; Sweeney, M. D. Appl. Environ. Microbiol. 2011, 77, 7007. (50) Phillips, C. M.; Beeson, W. T.; Cate, J. H.; Marletta, M. A. ACS Chem. Biol. 2011, 6, 1399. (51) Quinlan, R. J.; Sweeney, M. D.; Lo Leggio, L.; Otten, H.; Poulsen, J.-C. N.; Johansen, K. S.; Krogh, K. B. R. M.; Jørgensen, C. I.; Tovborg, M.; Anthonsen, A.; Tryfona, T.; Walter, C. P.; Dupree, P.; Xu, F.; Davies, G. J.; Walton, P. H. Proc. Natl. Acad. Sci. U.S.A. 2011, 108, 15079. (52) Westereng, B.; Agger, J. W.; Horn, S. J.; Vaaje-Kolstad, G.; Aachmann, F. L.; Stenstrøm, Y. H.; Eijsink, V. G. H. J. Chromatogr. A 2013, 1271, 144. (53) Bayer, E. A.; Chanzy, H.; Lamed, R.; Shoham, Y. Curr. Opin. Struct. Biol. 1998, 8, 548. (54) Bayer, E. A.; Belaich, J. P.; Shoham, Y.; Lamed, R. Annu. Rev. Microbiol. 2004, 58, 521. (55) Doi, R. H.; Kosugi, A. Nat. Rev. Microbiol. 2004, 2, 541. (56) Fontes, C. M. G. A.; Gilbert, H. J. Annu. Rev. Biochem. 2010, 79, 655. (57) Brunecky, R.; Alahuhta, M.; Xu, Q.; Donohoe, B. 
S.; Crowley, M. F.; Kataeva, I. A.; Yang, S.-J.; Resch, M. G.; Adams, M. W. W.; Lunin, V. V.; Himmel, M. E.; Bomble, Y. J. Science 2013, 342, 1513. (58) Naas, A. E.; Mackenzie, A. K.; Mravec, J.; Schückel, J.; Willats, W. G. T.; Eijsink, V. G. H.; Pope, P. B. mBio 2014, 5, e01401. (59) Resch, M. G.; Donohoe, B. S.; Baker, J. O.; Decker, S. R.; Bayer, E. A.; Beckham, G. T.; Himmel, M. E. Energy Environ. Sci. 2013, 6, 1858. (60) Eastwood, D. C.; Floudas, D.; Binder, M.; Majcherczyk, A.; Schneider, P.; Aerts, A.; Asiegbu, F. O.; Baker, S. E.; Barry, K.; Bendiksby, M.; Blumentritt, M.; Coutinho, P. M.; Cullen, D.; de Vries, R. P.; Gathman, A.; Goodell, B.; Henrissat, B.; Ihrmark, K.; Kauserud, H.; Kohler, A.; LaButti, K.; Lapidus, A.; Lavin, J. L.; Lee, Y. H.; Lindquist, E.; Lilly, W.; Lucas, S.; Morin, E.; Murat, C.; Oguiza, J. A.; Park, J.; Pisabarro, A. G.; Riley, R.; Rosling, A.; Salamov, A.; Schmidt, O.; Schmutz, J.; Skrede, I.; Stenlid, J.; Wiebenga, A.; Xie, X. F.; Kues, U.; Hibbett, D. S.; Hoffmeister, D.; Hogberg, N.; Martin, F.; Grigoriev, I. V.; Watkinson, S. C. Science 2011, 333, 762. (61) Goodell, B.; Jellison, J.; Liu, J.; Daniel, G.; Paszczynski, A.; Fekete, F.; Krishnamurthy, S.; Jun, L.; Xu, G. J. Biotechnol. 1997, 53, 133. (62) Reese, E. T. Biotechnol. Bioeng. Symp. 1976, 6, 9. (63) Sørensen, A.; Lübeck, M.; Lübeck, P.; Ahring, B. Biomolecules 2013, 3, 612. (64) Brown, R. M., Jr.; Saxena, I. M. Plant Physiol. Biochem. 2000, 38, 57. (65) Doblin, M. S.; Kurek, I.; Jacob-Wilk, D.; Delmer, D. P. Plant Cell Physiol. 2002, 43, 1407. (66) Saxena, I. M.; Brown, R. M., Jr. Ann. Bot. 2005, 96, 9. (67) Hu, S. Q.; Gao, Y. G.; Tajima, K.; Sunagawa, N.; Zhou, Y.; Kawano, S.; Fujiwara, T.; Yoda, T.; Shimura, D.; Satoh, Y.; Munekata, M.; Tanaka, I.; Yao, M. Proc. Natl. Acad. Sci. U.S.A. 2010, 107, 17957. (68) Mazur, O.; Zimmer, J. J. Biol. Chem. 2011, 286, 17601.


(69) Sethaphong, L.; Haigler, C. H.; Kubicki, J. D.; Zimmer, J.; Bonetta, D.; DeBolt, S.; Yingling, Y. G. Proc. Natl. Acad. Sci. U.S.A. 2013, 110, 7512. (70) Omadjela, O.; Narahari, A.; Strumillo, J.; Melida, H.; Mazur, O.; Bulone, V.; Zimmer, J. Proc. Natl. Acad. Sci. U.S.A. 2013, 110, 17856. (71) Wolfenden, R.; Lu, X.; Young, G. J. Am. Chem. Soc. 1998, 120, 6814. (72) Wolfenden, R.; Snider, M. J. Acc. Chem. Res. 2001, 34, 938. (73) Jahren, A. H. Annu. Rev. Earth Planet. Sci. 2007, 35, 509. (74) Richter, S. L.; Johnson, A. H.; Dranoff, M. M.; LePage, B. A.; Williams, C. J. Geochim. Cosmochim. Acta 2008, 72, 2744. (75) Jahren, A. H.; Sternberg, L. S. Geology 2008, 36, 99. (76) Stankiewicz, B. A.; Briggs, D. E.; Evershed, R. P.; Flannery, M. B.; Wuttke, M. Science 1997, 276, 1541. (77) Ballantyne, A. P.; Rybczynski, N.; Baker, P. A.; Harington, C. R.; White, D. Palaeogeogr., Palaeoclimatol., Palaeoecol. 2006, 242, 188. (78) Wolfenden, R. Chem. Rev. 2006, 106, 3379. (79) Wolfenden, R.; Yuan, Y. J. Am. Chem. Soc. 2008, 130, 7548. (80) Dumas, J.-B. C. R. Hebd. Seances Acad. Sci. 1839, 8, 51. (81) Nishikawa, S.; Ono, S. Proc. Tokyo Math.-Phys. Soc. 1913, 7, 131. (82) Meyer, K. H.; Mark, H. Ber. Dtsch. Chem. Ges. 1928, 61, 593. (83) Meyer, K. H.; Misch, L. Helv. Chim. Acta 1937, 20, 232. (84) Honjo, G.; Watanabe, M. Nature 1958, 181, 326. (85) Gardner, K. H.; Blackwell, J. Biopolymers 1974, 13, 1975. (86) Atalla, R. H.; VanderHart, D. L. Science 1984, 223, 283. (87) VanderHart, D. L.; Atalla, R. H. Macromolecules 1984, 17, 1465. (88) Nishiyama, Y.; Langan, P.; Chanzy, H. J. Am. Chem. Soc. 2002, 124, 9074. (89) Nishiyama, Y.; Sugiyama, J.; Chanzy, H.; Langan, P. J. Am. Chem. Soc. 2003, 125, 14300. (90) Langan, P.; Nishiyama, Y.; Chanzy, H. Biomacromolecules 2001, 2, 410. (91) Swatloski, R. P.; Spear, S. K.; Holbrey, J. D.; Rogers, R. D. J. Am. Chem. Soc. 2002, 124, 4974. (92) Pinkert, A.; Marsh, K. N.; Pang, S.; Staiger, M. P. Chem. Rev. 2009, 109, 6712. 
(93) Langan, P.; Nishiyama, Y.; Chanzy, H. J. Am. Chem. Soc. 1999, 121, 9940. (94) Wada, M.; Chanzy, H.; Nishiyama, Y.; Langan, P. Macromolecules 2004, 37, 8548. (95) Chundawat, S. P. S.; Bellesia, G.; Uppugundla, N.; Sousa, L. D.; Gao, D. H.; Cheh, A. M.; Agarwal, U. P.; Bianchetti, C. M.; Phillips, G. N.; Langan, P.; Balan, V.; Gnanakaran, S.; Dale, B. E. J. Am. Chem. Soc. 2011, 133, 11163. (96) Fan, L. T.; Lee, Y.-H.; Beardmore, D. H. Biotechnol. Bioeng. 1980, 22, 177. (97) Atalla, R. H.; Ellis, J. D.; Schroeder, L. R. J. Wood Chem. Technol. 1984, 4, 465. (98) Nishiyama, Y. J. Wood Sci. 2009, 55, 241. (99) Dale, B. E.; Tsao, G. T. J. Appl. Polym. Sci. 1982, 27, 1233. (100) Kennedy, C. J.; Cameron, G. J.; Sturcova, A.; Apperley, D. C.; Altaner, C.; Wess, T. J.; Jarvis, M. C. Cellulose 2007, 14, 235. (101) Ha, M. A.; Apperley, D. C.; Evans, B. W.; Huxham, M.; Jardine, W. G.; Vietor, R. J.; Reis, D.; Vian, B.; Jarvis, M. C. Plant J. 1998, 16, 183. (102) Fernandes, A. N.; Thomas, L. H.; Altaner, C. M.; Callow, P.; Forsyth, V. T.; Apperley, D. C.; Kennedy, C. J.; Jarvis, M. C. Proc. Natl. Acad. Sci. U.S.A. 2011, 108, E1195. (103) Thomas, L. H.; Forsyth, V. T.; Sturcova, A.; Kennedy, C. J.; May, R. P.; Altaner, C. M.; Apperley, D. C.; Wess, T. J.; Jarvis, M. C. Plant Physiol. 2013, 161, 465. (104) Kimura, S.; Laosinchai, W.; Itoh, T.; Cui, X. J.; Linder, C. R.; Brown, R. M., Jr. Plant Cell 1999, 11, 2075. (105) Mueller, S. C.; Brown, R. M., Jr. J. Cell Biol. 1980, 84, 315. (106) Endler, A.; Persson, S. Mol. Plant 2011, 4, 199. (107) Herth, W. Planta 1983, 159, 347. (108) Ding, S. Y.; Himmel, M. E. J. Agric. Food. Chem. 2006, 54, 597. (109) Newman, R. H.; Hill, S. J.; Harris, P. J. Plant Physiol. 2013, 163, 1558.

(110) Bellesia, G.; Asztalos, A.; Shen, T. Y.; Langan, P.; Redondo, A.; Gnanakaran, S. Acta Crystallogr., Sect. D: Biol. Crystallogr. 2010, 66, 1184. (111) Beckham, G. T.; Bomble, Y. J.; Bayer, E. A.; Himmel, M. E.; Crowley, M. F. Curr. Opin. Biotechnol. 2011, 22, 231. (112) O’Sullivan, A. C. Cellulose 1997, 4, 173. (113) French, A. D. In Advances in Carbohydrate Chemistry and Biochemistry; Elsevier Academic Press Inc.: San Diego, 2012; Vol 67, pp 19−93. (114) Matthews, J. F.; Skopec, C. E.; Mason, P. E.; Zuccato, P.; Torget, R. W.; Sugiyama, J.; Himmel, M. E.; Brady, J. W. Carbohydr. Res. 2006, 341, 138. (115) Paavilainen, S.; Rog, T.; Vattulainen, I. J. Phys. Chem. B 2011, 115, 3747. (116) Zhao, Z.; Shklyaev, O. E.; Nili, A.; Mohamed, M. N. A.; Kubicki, J. D.; Crespi, V. H.; Zhong, L. H. J. Phys. Chem. A 2013, 117, 2580. (117) Matthews, J. F.; Beckham, G. T.; Bergenstrahle-Wohlert, M.; Brady, J. W.; Himmel, M. E.; Crowley, M. F. J. Chem. Theory Comput. 2012, 8, 735. (118) Matthews, J. F.; Bergenstrahle, M.; Beckham, G. T.; Himmel, M. E.; Nimlos, M. R.; Brady, J. W.; Crowley, M. F. J. Phys. Chem. B 2011, 115, 2155. (119) Hadden, J. A.; French, A. D.; Woods, R. J. Biopolymers 2013, 99, 746. (120) Srinivas, G.; Cheng, X. L.; Smith, J. C. J. Chem. Theory Comput. 2011, 7, 2539. (121) Bu, L. T.; Beckham, G. T.; Crowley, M. F.; Chang, C. H.; Matthews, J. F.; Bomble, Y. J.; Adney, W. S.; Himmel, M. E.; Nimlos, M. R. J. Phys. Chem. B 2009, 113, 10994. (122) Hynninen, A. P.; Matthews, J. F.; Beckham, G. T.; Crowley, M. F.; Nimlos, M. R. J. Chem. Theory Comput. 2011, 7, 2137. (123) Wohlert, J.; Berglund, L. A. J. Chem. Theory Comput. 2011, 7, 753. (124) Bellesia, G.; Chundawat, S. P. S.; Langan, P.; Redondo, A.; Dale, B. E.; Gnanakaran, S. J. Phys. Chem. B 2012, 116, 8031. (125) Chang, R.; Gross, A. S.; Chu, J. W. J. Phys. Chem. B 2012, 116, 8074. (126) Markutsya, S.; Devarajan, A.; Baluyut, J. Y.; Windus, T. L.; Gordon, M. S.; Lamm, M. H. J. Chem. Phys. 2013, 138, 214108. 
(127) Zhao, H.; Kwak, J. H.; Wang, Y.; Franz, J. A.; White, J. M.; Holladay, J. E. Energy Fuels 2006, 20, 807. (128) Ciesielski, P. N.; Matthews, J. F.; Tucker, M. P.; Beckham, G. T.; Crowley, M. F.; Himmel, M. E.; Donohoe, B. S. ACS Nano 2013, 7, 8011. (129) Habibi, Y.; Lucia, L. A.; Rojas, O. J. Chem. Rev. 2010, 110, 3479. (130) Beckham, G. T.; Matthews, J. F.; Peters, B.; Bomble, Y. J.; Himmel, M. E.; Crowley, M. F. J. Phys. Chem. B 2011, 115, 4118. (131) Payne, C. M.; Himmel, M. E.; Crowley, M. F.; Beckham, G. T. J. Phys. Chem. Lett. 2011, 2, 1546. (132) Cho, H. M.; Gross, A. S.; Chu, J. W. J. Am. Chem. Soc. 2011, 133, 14033. (133) Kwan, C. C.; Ghadiri, M.; Papadopoulos, D. G.; Bentham, A. C. Chem. Eng. Technol. 2003, 26, 185. (134) Hwang, J. W.; Yang, Y. K.; Hwang, J. K.; Pyun, Y. R.; Kim, Y. S. J. Biosci. Bioeng. 1999, 88, 183. (135) Iguchi, M.; Yamanaka, S.; Budhiono, A. J. Mater. Sci. 2000, 35, 261. (136) Bielecki, S.; Krystynowicz, A.; Turkiewicz, M.; Kalinowska, H. In Polysaccharides and Polyamides in the Food Industry; Steinbüchel, A., Rhee, S. K., Eds.; Wiley-Blackwell: Weinheim, Germany, 2005; pp 31−85. (137) Wanichapichart, P.; Kaewnopparat, S.; Buaking, K.; Puthai, W. J. Sci. Technol. 2002, 24, 855. (138) Horikawa, Y.; Sugiyama, J. Cellulose 2008, 15, 419. (139) Sugiyama, J.; Harada, H.; Fujiyoshi, Y.; Uyeda, N. Mokuzai Gakkaishi 1984, 30, 98. (140) Ek, R.; Gustafsson, C.; Nutt, A.; Iversen, T.; Nyström, C. J. Mol. Recognit. 1998, 11, 263. (141) Imai, T.; Sugiyama, J. Macromolecules 1998, 31, 6275.


(142) Koyama, M.; Sugiyama, J.; Itoh, T. Cellulose 1997, 4, 147. (143) Mihranyan, A.; Llagostera, A. P.; Karmhag, R.; Strømme, M.; Ek, R. Int. J. Pharm. 2004, 269, 433. (144) Walseth, C. S. Tappi 1952, 35, 228. (145) Whitmore, R. E.; Atalla, R. H. Int. J. Biol. Macromol. 1985, 7, 182. (146) Schenzel, K.; Fischer, S.; Brendler, E. Cellulose 2005, 12, 223. (147) Kataoka, Y.; Kondo, T. Macromolecules 1998, 31, 760. (148) Åkerholm, M.; Hinterstoisser, B.; Salmén, L. Carbohydr. Res. 2004, 339, 569. (149) Thygesen, A.; Oddershede, J.; Lilholt, H.; Thomsen, A. B.; Ståhl, K. Cellulose 2005, 12, 563. (150) Park, S.; Baker, J. O.; Himmel, M. E.; Parilla, P. A.; Johnson, D. K. Biotechnol. Biofuels 2010, 3, 10. (151) Cantarel, B. L.; Coutinho, P. M.; Rancurel, C.; Bernard, T.; Lombard, V.; Henrissat, B. Nucleic Acids Res. 2009, 37, D233. (152) Levasseur, A.; Drula, E.; Lombard, V.; Coutinho, P.; Henrissat, B. Biotechnol. Biofuels 2013, 6, 41. (153) Lombard, V.; Golaconda Ramulu, H.; Drula, E.; Coutinho, P. M.; Henrissat, B. Nucleic Acids Res. 2014, 42, D490. (154) Lairson, L. L.; Henrissat, B.; Davies, G. J.; Withers, S. G. Annu. Rev. Biochem. 2008, 77, 521. (155) Lombard, V.; Bernard, T.; Rancurel, C.; Brumer, H.; Coutinho, P. M.; Henrissat, B. Biochem. J. 2010, 432, 437. (156) Boraston, A. B.; Bolam, D. N.; Gilbert, H. J.; Davies, G. J. Biochem. J. 2004, 382, 769. (157) Koshland, D. E., Jr. Biol. Rev. 1953, 28, 413. (158) Jongkees, S. A. K.; Withers, S. G. Acc. Chem. Res. 2014, 47, 226. (159) Davies, G. J.; Planas, A.; Rovira, C. Acc. Chem. Res. 2012, 45, 308. (160) van der Kamp, M. W.; Mulholland, A. J. Biochemistry 2013, 52, 2708. (161) Vocadlo, D. J.; Davies, G. J. Curr. Opin. Chem. Biol. 2008, 12, 539. (162) Jencks, W. P. Annu. Rev. Biochem. 1962, 32, 639. (163) Phillips, D. C. Proc. Natl. Acad. Sci. U.S.A. 1967, 57, 484. (164) Vernon, C. A. Proc. R. Soc. London, Ser. B 1967, 167, 389. (165) Chipman, D. M.; Sharon, N. Science 1969, 165, 454. (166) Deslongchamps, P. 
Stereoelectronic Effects in Organic Chemistry; Pergamon Press: New York, 1983. (167) Post, C. B.; Karplus, M. J. Am. Chem. Soc. 1986, 108, 1317. (168) Sinnott, M. L. Adv. Phys. Org. Chem. 1988, 24, 113. (169) Sinnott, M. L. Chem. Rev. 1990, 90, 1171. (170) Davies, G.; Sinnott, M. L.; Withers, S. G. In Comprehensive Biological Catalysis: A Mechanistic Reference; Sinnott, M., Ed.; Academic Press: San Diego, 1998; Vol. I, pp 119−208. (171) Vocadlo, D. J.; Davies, G. J.; Laine, R.; Withers, S. G. Nature 2001, 412, 835. (172) Divne, C.; Ståhlberg, J.; Reinikainen, T.; Ruohonen, L.; Pettersson, G.; Knowles, J. K. C.; Teeri, T. T.; Jones, T. A. Science 1994, 265, 524. (173) Divne, C.; Ståhlberg, J.; Teeri, T. T.; Jones, T. A. J. Mol. Biol. 1998, 275, 309. (174) Sulzenbacher, G.; Driguez, H.; Henrissat, B.; Schülein, M.; Davies, G. J. Biochemistry 1996, 35, 15280. (175) Davies, G. J.; Wilson, K. S.; Henrissat, B. Biochem. J. 1997, 321, 557. (176) Biely, P.; Krátký, Z.; Vršanská, M. Eur. J. Biochem. 1981, 119, 559. (177) Davies, G.; Henrissat, B. Structure 1995, 3, 853. (178) Vasella, A.; Davies, G. J.; Bohm, M. Curr. Opin. Chem. Biol. 2002, 6, 619. (179) Schwarz, J. C. P. J. Chem. Soc., Chem. Commun. 1973, 14, 505. (180) Joint Commission on Biochemical Nomenclature. Eur. J. Biochem. 1980, 111, 295. (181) Cremer, D.; Pople, J. A. J. Am. Chem. Soc. 1975, 97, 1354. (182) Barnett, C. B.; Naidoo, K. J. Mol. Phys. 2009, 107, 1243. (183) Barnett, C. B.; Naidoo, K. J. J. Phys. Chem. B 2010, 114, 17142.

(184) Mayes, H. B.; Broadbelt, L. J.; Beckham, G. T. J. Am. Chem. Soc. 2014, 136, 1008. (185) Hill, A. D.; Reilly, P. J. J. Chem. Inf. Model. 2007, 47, 1031. (186) Blake, C. C. F.; Mair, G. A.; North, A. C. T.; Phillips, D. C.; Sarma, V. R. Proc. R. Soc. London, Ser. B 1967, 167, 365. (187) Blake, C. C. F.; Koenig, D. F.; Mair, G. A.; North, A. C. T.; Phillips, D. C.; Sarma, V. R. Nature 1965, 206, 757. (188) Schindler, M.; Assaf, Y.; Sharon, N.; Chipman, D. M. Biochemistry 1977, 16, 423. (189) Pincus, M. R.; Scheraga, H. A. Biochemistry 1981, 20, 3960. (190) Ford, L. O.; Johnson, L. N.; Machin, P. A.; Phillips, D. C.; Tjian, R. J. Mol. Biol. 1974, 88, 349. (191) Strynadka, N. C. J.; James, M. N. G. J. Mol. Biol. 1991, 220, 401. (192) Rouvinen, J.; Bergfors, T.; Teeri, T.; Knowles, J. K.; Jones, T. A. Science 1990, 249, 380. (193) Barr, B. K.; Hsieh, Y.-L.; Ganem, B.; Wilson, D. B. Biochemistry 1996, 35, 586. (194) Zou, J.-y.; Kleywegt, G. J.; Ståhlberg, J.; Driguez, H.; Nerinckx, W.; Claeyssens, M.; Koivula, A.; Teeri, T. T.; Jones, T. A. Structure 1999, 7, 1035. (195) Driguez, H. Top. Curr. Chem. 1997, 187, 85. (196) Rye, C. S.; Withers, S. G. Curr. Opin. Chem. Biol. 2000, 4, 573. (197) Diot, J.; García-Moreno, M. I.; Gouin, S. G.; Ortiz Mellet, C.; Haupt, K.; Kovensky, J. Org. Biomol. Chem. 2009, 7, 357. (198) Withers, S. G.; Street, I. P.; Bird, P.; Dolphin, D. H. J. Am. Chem. Soc. 1987, 109, 7530. (199) Withers, S. G.; Rupitz, K.; Street, I. P. J. Biol. Chem. 1988, 263, 7929. (200) White, A.; Tull, D.; Johns, K.; Withers, S. G.; Rose, D. R. Nat. Struct. Biol. 1996, 3, 149. (201) Withers, S. G.; Warren, R. A. J.; Street, I. P.; Rupitz, K.; Kempton, J. B.; Aebersold, R. J. Am. Chem. Soc. 1990, 112, 5887. (202) Withers, S. G.; Street, I. P. J. Am. Chem. Soc. 1988, 110, 8551. (203) Williams, S. J.; Withers, S. G. Carbohydr. Res. 2000, 327, 27. (204) Czjzek, M.; Cicek, M.; Zamboni, V.; Bevan, D. R.; Henrissat, B.; Esen, A. Proc. Natl. Acad. Sci. U.S.A. 
2000, 97, 13555. (205) Verdoucq, L.; Morinière, J.; Bevan, D. R.; Esen, A.; Vasella, A.; Henrissat, B.; Czjzek, M. J. Biol. Chem. 2004, 279, 31796. (206) Davies, G. J.; Mackenzie, L.; Varrot, A.; Dauter, M.; Brzozowski, A. M.; Schülein, M.; Withers, S. G. Biochemistry 1998, 37, 11707. (207) Varrot, A.; Davies, G. J. Acta Crystallogr., Sect. D: Biol. Crystallogr. 2003, 59, 447. (208) Sandgren, M.; Berglund, G. I.; Shaw, A.; Ståhlberg, J.; Kenne, L.; Desmet, T.; Mitchinson, C. J. Mol. Biol. 2004, 342, 1505. (209) Varrot, A.; Frandsen, T. P.; Driguez, H.; Davies, G. J. Acta Crystallogr., Sect. D: Biol. Crystallogr. 2002, 58, 2201. (210) Guérin, D. M. A.; Lascombe, M.-B.; Costabel, M.; Souchon, H.; Lamzin, V.; Béguin, P.; Alzari, P. M. J. Mol. Biol. 2002, 316, 1061. (211) Deslongchamps, P. Pure Appl. Chem. 1975, 43, 351. (212) Bennet, A. J.; Sinnott, M. L. J. Am. Chem. Soc. 1986, 108, 7287. (213) Deslongchamps, P. Pure Appl. Chem. 1993, 65, 1161. (214) Nerinckx, W.; Desmet, T.; Claeyssens, M. ARKIVOC 2006, 13, 90. (215) Warshel, A. Computer Modeling of Chemical Reactions in Enzymes and Solutions; Wiley-Interscience: New York, 1991. (216) Warshel, A.; Sharma, P. K.; Kato, M.; Xiang, Y.; Liu, H.; Olsson, M. H. M. Chem. Rev. 2006, 106, 3210. (217) Fort, S.; Coutinho, P. M.; Schülein, M.; Nardin, R.; Cottaz, S.; Driguez, H. Tetrahedron Lett. 2001, 42, 3443. (218) Walvoort, M. T. C.; van der Marel, G. A.; Overkleeft, H. S.; Codée, J. D. C. Chem. Sci. 2013, 4, 897. (219) Pauling, L. Chem. Eng. News 1946, 24, 1375. (220) Withers, S. G. Pure Appl. Chem. 1995, 67, 1673. (221) Smith, B. J. J. Am. Chem. Soc. 1997, 119, 2699. (222) Biarnés, X.; Ardèvol, A.; Planas, A.; Rovira, C.; Laio, A.; Parrinello, M. J. Am. Chem. Soc. 2007, 129, 10686. (223) Sega, M.; Autieri, E.; Pederiva, F. J. Chem. Phys. 2009, 130, 225102. (224) DeMarco, M. L.; Woods, R. J. Glycobiology 2008, 18, 426.


(225) Ryu, D. D. Y.; Mandels, M. Enzyme Microb. Technol. 1980, 2, 91. (226) Montenecourt, B. S. Trends Biotechnol. 1983, 1, 156. (227) Eveleigh, D. E. Philos. Trans. R. Soc. London, Ser. A 1987, 321, 435. (228) Persson, I.; Tjerneld, F.; Hahn-Hägerdal, B. Process Biochem. 1991, 26, 65. (229) Cherry, J. R.; Fidantsef, A. L. Curr. Opin. Biotechnol. 2003, 14, 438. (230) Mandels, M.; Reese, E. T. J. Bacteriol. 1957, 73, 269. (231) Gauss, W. F.; Suzuki, S.; Takagi, M. (Bio Research Center Company Limited) Manufacture of Alcohol from Cellulosic Materials Using Plural Ferments. Bio Research Center Company Limited. U.S. Patent 3,990,944, Nov 9, 1976. (232) Silver, R. S. (Gulf Research & Development Company) Saccharification Method. U.S. Patent 4,409,329, Oct. 11, 1983. (233) El Gogary, S.; Leite, A.; Crivellaro, O.; Dorry, H. E.; Eveleigh, D. E. In TRICEL 89: An International Symposium on Trichoderma Cellulases; Kubicek, C. P., Esterbauer, H., Eveleigh, D. E., Steiner, W., Eds., Royal Chemical Society: London, U.K., 1990; pp 200−211. (234) Sternberg, D.; Mandels, G. R. J. Bacteriol. 1979, 139, 761. (235) Mandels, M.; Weber, J.; Parizek, R. Appl. Microbiol. 1971, 21, 152. (236) Himmel, M. E.; Adney, W. S.; Rivard, C. J.; Baker, J. O. In Energy from Biomass and Wastes XVI: Proceedings of the Institute of Gas Technology Conference; Institute of Gas Technology: Orlando, FL, 1992; pp 529−543. (237) Ghose, T. K. Pure Appl. Chem. 1987, 59, 257. (238) Gallo, B. J.; Andreotti, R.; Roche, C.; Ryu, D.; Mandels, M. Biotechnol. Bioeng. Symp. 1978, 8, 89. (239) Ryu, D.; Andreotti, R.; Mandels, M.; Gallo, B.; Reese, E. Biotechnol. Bioeng. 1979, 21, 1887. (240) Montenecourt, B. S.; Eveleigh, D. E. In Hydrolysis of Cellulose: Mechanisms of Enzymatic and Acid Catalysis; Brown, R., Jurasek, L., Eds.; American Chemical Society: Washington, DC, 1979; pp 289−301. (241) Montenecourt, B. S.; Eveleigh, D. E. Appl. Environ. Microbiol. 1977, 34, 777. (242) Ghosh, A.; Alrabiai, S.; Ghosh, B. 
K.; Trimiño-Vazquez, H.; Eveleigh, D. E.; Montenecourt, B. S. Enzyme Microb. Technol. 1982, 4, 110. (243) Singhania, R.; Sukumaran, R.; Pandey, A. Appl. Biochem. Biotechnol. 2007, 142, 60. (244) Shoemaker, S. P.; Raymond, J. C.; Bruner, R. In Trends in the Biology of Fermentations for Fuels and Chemicals; Hollaender, A., Rabson, R., Rogers, P., Pietro, A., Valentine, R., Wolfe, R., Eds.; Plenum Press: New York, 1981; pp 89−109. (245) Portnoy, T.; Margeot, A.; Seidl-Seiboth, V.; Le Crom, S.; Ben Chaabane, F.; Linke, R.; Seiboth, B.; Kubicek, C. P. Eukaryotic Cell 2011, 10, 262. (246) Penttilä, M.; Nevalainen, H.; Rättö, M.; Salminen, E.; Knowles, J. Gene 1987, 61, 155. (247) Knowles, J.; Lehtovaara, P.; Penttilä, M.; Teeri, T.; Harkki, A.; Salovuori, I. Antonie van Leeuwenhoek 1987, 53, 335. (248) Uusitalo, J. M.; Helena Nevalainen, K. M.; Harkki, A. M.; Knowles, J. K. C.; Penttilä, M. E. J. Biotechnol. 1991, 17, 35. (249) Harkki, A.; Mäntylä, A.; Penttilä, M.; Muttilainen, S.; Bühler, R.; Suominen, P.; Knowles, J.; Nevalainen, H. Enzyme Microb. Technol. 1991, 13, 227. (250) Teeri, T. T.; Penttilä, M.; Keränen, S.; Nevalainen, H.; Knowles, J. K. C. In Biotechnology of Filamentous Fungi; Finkelstein, D. B., Ball, C., Eds.; Newnes: Boston, 1992; pp 417−445. (251) Srisodsuk, M.; Reinikainen, T.; Penttilä, M.; Teeri, T. T. J. Biol. Chem. 1993, 268, 20756. (252) Mach, R. L.; Schindler, M.; Kubicek, C. P. Curr. Genet. 1994, 25, 567. (253) Nakari-Setälä, T.; Penttilä, M. Appl. Environ. Microbiol. 1995, 61, 3650. (254) Keränen, S.; Penttilä, M. Curr. Opin. Biotechnol. 1995, 6, 534.

(255) Ilmén, M.; Thrane, C.; Penttilä, M. E. Mol. Gen. Genet. 1996, 251, 451. (256) Mach, R. L.; Strauss, J.; Zeilinger, S.; Schindler, M.; Kubicek, C. P. Mol. Microbiol. 1996, 21, 1273. (257) Takashima, S.; Iikura, H.; Nakamura, A.; Masaki, H.; Uozumi, T. FEMS Microbiol. Lett. 1996, 145, 361. (258) Ilmén, M.; Onnela, M. L.; Klemsdal, S.; Keränen, S.; Penttilä, M. Mol. Gen. Genet. 1996, 253, 303. (259) Ilmén, M.; Saloheimo, A.; Onnela, M.-L.; Penttilä, M. E. Appl. Environ. Microbiol. 1997, 63, 1298. (260) Pakula, T. M.; Uusitalo, J.; Saloheimo, M.; Salonen, K.; Aarts, R. J.; Penttilä, M. Microbiology 2000, 146, 223. (261) Saloheimo, A.; Aro, N.; Ilmén, M.; Penttilä, M. J. Biol. Chem. 2000, 275, 5817. (262) Aro, N.; Saloheimo, A.; Ilmén, M.; Penttilä, M. J. Biol. Chem. 2001, 276, 24309. (263) Vasara, T.; Salusjärvi, L.; Raudaskoski, M.; Keränen, S.; Penttilä, M.; Saloheimo, M. Mol. Microbiol. 2001, 42, 1349. (264) Valkonen, M.; Penttilä, M.; Saloheimo, M. Mol. Genet. Genomics 2004, 272, 443. (265) Nakari-Setälä, T.; Paloheimo, M.; Kallio, J.; Vehmaanperä, J.; Penttilä, M.; Saloheimo, M. Appl. Environ. Microbiol. 2009, 75, 4853. (266) Kubicek, C. P.; Mikus, M.; Schuster, A.; Schmoll, M.; Seiboth, B. Biotechnol. Biofuels 2009, 2, 1. (267) Steiger, M. G.; Vitikainen, M.; Uskonen, P.; Brunner, K.; Adam, G.; Pakula, T.; Penttilä, M.; Saloheimo, M.; Mach, R. L.; Mach-Aigner, A. R. Appl. Environ. Microbiol. 2011, 77, 114. (268) Seiboth, B.; Ivanova, C.; Seidl-Seiboth, V. In Biofuel Production-Recent Developments and Prospects; Bernardes, M. A. d. S., Ed., InTech: Rijeka, Croatia, 2011; pp 309−340. (269) Le Crom, S.; Schackwitz, W.; Pennacchio, L.; Magnuson, J. K.; Culley, D. E.; Collett, J. R.; Martin, J.; Druzhinina, I. S.; Mathis, H.; Monot, F.; Seiboth, B.; Cherry, B.; Rey, M.; Berka, R.; Kubicek, C. P.; Baker, S. E.; Margeot, A. Proc. Natl. Acad. Sci. U.S.A. 2009, 106, 16151. (270) Seidl, V.; Gamauf, C.; Druzhinina, I.; Seiboth, B.; Hartl, L.; Kubicek, C. 
BMC Genomics 2008, 9, 327. (271) England, G. R.; Kelley, A.; Mitchinson, C. (Genencor International, Inc.) Induction of Gene Expression Using a High Concentration Sugar Mixture. U.S. Publication 2010/0009408 A1, Jan. 14, 2010. (272) Geysens, S.; Pakula, T.; Uusitalo, J.; Dewerte, I.; Penttilä, M.; Contreras, R. Appl. Environ. Microbiol. 2005, 71, 2910. (273) Hui, J. P. M.; Lanthier, P.; White, T. C.; McHugh, S. G.; Yaguchi, M.; Roy, R.; Thibault, P. J. Chromatogr. B 2001, 752, 349. (274) Peterson, R.; Nevalainen, H. Microbiology 2012, 158, 58. (275) Datema, R.; Schwarz, R. T. Eur. J. Biochem. 1978, 90, 505. (276) Joutsjoki, V. V.; Kuittinen, M.; Torkkeli, T. K.; Suominen, P. L. FEMS Microbiol. Lett. 1993, 112, 281. (277) Kiiskinen, L.-L.; Kruus, K.; Bailey, M.; Ylösmäki, E.; Siika-aho, M.; Saloheimo, M. Microbiology 2004, 150, 3065. (278) Salles, B. C.; Te’o, V. S. J.; Gibbs, M. D.; Bergquist, P. L.; Filho, E. X. F.; Ximenes, E. A.; Nevalainen, K. M. H. Biotechnol. Lett. 2007, 29, 1195. (279) Nyyssönen, E.; Penttilä, M.; Harkki, A.; Saloheimo, A.; Knowles, J. K. C.; Keränen, S. Nat. Biotechnol. 1993, 11, 591. (280) Paloheimo, M.; Mäntylä, A.; Kallio, J.; Puranen, T.; Suominen, P. Appl. Environ. Microbiol. 2007, 73, 3215. (281) Shoemaker, S.; Schweickart, V.; Ladner, M.; Gelfand, D.; Kwok, S.; Myambo, K.; Innis, M. Bio/Technology 1983, 1, 691. (282) Teeri, T.; Salovuori, I.; Knowles, J. Bio/Technology 1983, 1, 696. (283) Penttilä, M. E.; André, L.; Saloheimo, M.; Lehtovaara, P.; Knowles, J. K. C. Yeast 1987, 3, 175. (284) Penttilä, M. E.; André, L.; Lehtovaara, P.; Bailey, M.; Teeri, T. T.; Knowles, J. K. C. Gene 1988, 63, 103. (285) Zurbriggen, B.; Bailey, M. J.; Penttilä, M. E.; Poutanen, K.; Linko, M. J. Biotechnol. 1990, 13, 267. (286) Bailey, M. J.; Siika-aho, M.; Valkeajarvi, A.; Penttilä, M. E. Biotechnol. Appl. Biochem. 1993, 17, 65.


(287) Stålbrand, H.; Saloheimo, A.; Vehmaanperä, J.; Henrissat, B.; Penttilä, M. Appl. Environ. Microbiol. 1995, 61, 1090. (288) Valkonen, M.; Penttilä, M.; Saloheimo, M. Appl. Environ. Microbiol. 2003, 69, 2065. (289) Ilmén, M.; den Haan, R.; Brevnova, E.; McBride, J.; Wiswall, E.; Froehlich, A.; Koivula, A.; Voutilainen, S. P.; Siika-aho, M.; la Grange, D. C.; Thorngren, N.; Ahlgren, S.; Mellon, M.; Deleault, K.; Rajgarhia, V.; van Zyl, W. H.; Penttilä, M. Biotechnol. Biofuels 2011, 4, 30. (290) Voutilainen, S. P.; Nurmi-Rantala, S.; Penttilä, M.; Koivula, A. Appl. Microbiol. Biotechnol. 2014, 98, 2991. (291) Reese, E. T.; Siu, R. G. H.; Levinson, H. S. J. Bacteriol. 1950, 59, 485. (292) Pedersen, K. O. In Les Protéines: Rapports et Discussions; Stoops, R., Ed.; Institut International de Chimie Solvay: Bruxelles, 1953; pp 19−62. (293) Mandels, M.; Reese, E. T. In Developments in Industrial Microbiology; Society for Industrial Microbiology: New York, 1964; pp 5−20. (294) Selby, K.; Maitland, C. C. Biochem. J. 1965, 94, 578. (295) Li, L. H.; Flora, R. M.; King, K. W. Arch. Biochem. Biophys. 1965, 111, 439. (296) Reese, E. T.; Mandels, M. In Cellulose and Cellulose Derivatives; Bikales, N. M., Segal, L., Eds.; Wiley Interscience: New York, 1971; pp 1079−1094. (297) Liu, T. H.; King, K. W. Arch. Biochem. Biophys. 1967, 120, 462. (298) Eriksson, K.-E. In Cellulases and Their Applications; Hajny, G. J., Reese, E. T., Eds.; American Chemical Society: Washington, DC, 1969; pp 83−104. (299) Halliwell, G.; Griffin, M.; Vincent, R. Biochem. J. 1972, 127, 43. (300) Berghem, L. E. R.; Pettersson, L. G. Eur. J. Biochem. 1973, 37, 21. (301) Van Tilbeurgh, H.; Pettersson, G.; Bhikabhai, R.; De Boeck, H.; Claeyssens, M. Eur. J. Biochem. 1985, 148, 329. (302) Chanzy, H.; Henrissat, B. FEBS Lett. 1985, 184, 285. (303) van Tilbeurgh, H.; Bhikhabhai, R.; Pettersson, L. G.; Claeyssens, M. FEBS Lett. 1984, 169, 215. (304) Ståhlberg, J.; Johansson, G.; Pettersson, G. Biochim. Biophys. 
Acta 1993, 1157, 107. (305) Henrissat, B.; Driguez, H.; Viet, C.; Schülein, M. Bio/Technology 1985, 3, 722. (306) Wood, T. M. Biochem. Soc. Trans. 1985, 13, 407. (307) Kurašin, M.; Väljamäe, P. J. Biol. Chem. 2011, 286, 169. (308) Sampedro, J.; Cosgrove, D. J. Genome Biol. 2005, 6, 242. (309) Saloheimo, M.; Paloheimo, M.; Hakola, S.; Pere, J.; Swanson, B.; Nyyssönen, E.; Bhatia, A.; Ward, M.; Penttilä, M. Eur. J. Biochem. 2002, 269, 4202. (310) Baker, J. O.; King, M. R.; Adney, W. S.; Decker, S. R.; Vinzant, T. B.; Lantz, S. E.; Nieves, R. E.; Thomas, S. R.; Li, L.-C.; Cosgrove, D. J.; Himmel, M. E. Appl. Biochem. Biotechnol. 2000, 84−86, 217. (311) Beeson, W. T.; Iavarone, A. T.; Hausmann, C. D.; Cate, J. H. D.; Marletta, M. A. Appl. Environ. Microbiol. 2011, 77, 650. (312) Beeson, W. T.; Phillips, C. M.; Cate, J. H. D.; Marletta, M. A. J. Am. Chem. Soc. 2012, 134, 890. (313) Kostylev, M.; Wilson, D. Biofuels 2012, 3, 61. (314) Kubicek, C. P.; Herrera-Estrella, A.; Seidl-Seiboth, V.; Martinez, D. A.; Druzhinina, I. S.; Thon, M.; Zeilinger, S.; Casas-Flores, S.; Horwitz, B. A.; Mukherjee, P. K.; Mukherjee, M.; Kredics, L.; Alcaraz, L. D.; Aerts, A.; Antal, Z.; Atanasova, L.; Cervantes-Badillo, M. G.; Challacombe, J.; Chertkov, O.; McCluskey, K.; Coulpier, F.; Deshpande, N.; von Döhren, H.; Ebbole, D.; Esquivel-Naranjo, E. U.; Fekete, E.; Flipphi, M.; Glaser, F.; Gómez-Rodríguez, E. Y.; Gruber, S.; Han, C.; Henrissat, B.; Hermosa, R.; Hernández-Oñate, M.; Karaffa, L.; Kosti, I.; Le Crom, S.; Lindquist, E.; Lucas, S.; Lübeck, M.; Lübeck, P.; Margeot, A.; Metz, B.; Misra, M.; Nevalainen, H.; Omann, M.; Packer, N.; Perrone, G.; Uresti-Rivera, E. E.; Salamov, A.; Schmoll, M.; Seiboth, B.; Shapiro, H.; Sukno, S.; Tamayo-Ramos, J. A.; Tisch, D.; Wiest, A.; Wilkinson, H. H.; Zhang, M.; Coutinho, P. M.; Kenerley, C. M.; Monte, E.; Baker, S. E.; Grigoriev, I. V. Genome Biol. 2011, 12, 1.

(315) Murphy, L.; Cruys-Bagger, N.; Damgaard, H. D.; Baumann, M. J.; Olsen, S. N.; Borch, K.; Lassen, S. F.; Sweeney, M.; Tatsumi, H.; Westh, P. J. Biol. Chem. 2012, 287, 1252. (316) Teeri, T. T. Trends Biotechnol. 1997, 15, 160. (317) Takashima, S.; Nakamura, A.; Hidaka, M.; Masaki, H.; Uozumi, T. J. Biochem. 1999, 125, 728. (318) Saloheimo, M.; Kuja-Panula, J.; Ylösmäki, E.; Ward, M.; Penttilä, M. Appl. Environ. Microbiol. 2002, 68, 4546. (319) Foreman, P. K.; Brown, D.; Dankmeyer, L.; Dean, R.; Diener, S.; Dunn-Coleman, N. S.; Goedegebuur, F.; Houfek, T. D.; England, G. J.; Kelley, A. S.; Meerman, H. J.; Mitchell, T.; Mitchinson, C.; Olivares, H. A.; Teunissen, P. J. M.; Yao, J.; Ward, M. J. Biol. Chem. 2003, 278, 31988. (320) Mach, R. L., Klonierung und Charakterisierung einiger Gene des Kohlenstoffmetabolismus von Trichoderma reesei. Ph.D. Thesis, Institute of Biochemistry and Technology, Vienna, Austria, 1993. (321) Korotkova, O. G.; Semenova, M. V.; Morozova, V. V.; Zorov, I. N.; Sokolova, L. M.; Bubnova, T. M.; Okunev, O. N.; Sinitsyn, A. P. Biochemistry 2009, 74, 569. (322) Karkehabadi, S.; Helmich, K. E.; Kaper, T.; Hansson, H.; Mikkelsen, N.-E.; Gudmundsson, M.; Piens, K.; Fujdala, M.; Banerjee, G.; Scott-Craig, J. S.; Walton, J. D.; Phillips, G. N.; Sandgren, M. J. Biol. Chem. 2014, 289, 31624. (323) Saloheimo, M.; Lehtovaara, P.; Penttilä, M.; Teeri, T. T.; Ståhlberg, J.; Johansson, G.; Pettersson, G.; Claeyssens, M.; Tomme, P.; Knowles, J. K. C. Gene 1988, 63, 11. (324) Qin, Y.; Wei, X.; Song, X.; Qu, Y. J. Biotechnol. 2008, 135, 190. (325) Teeri, T. T.; Lehtovaara, P.; Kauppinen, S.; Salovuori, I.; Knowles, J. Gene 1987, 51, 43. (326) Poidevin, L.; Feliu, J.; Doan, A.; Berrin, J.-G.; Bey, M.; Coutinho, P. M.; Henrissat, B.; Record, E.; Heiss-Blanquet, S. Appl. Environ. Microbiol. 2013, 79, 4220. (327) Boer, H.; Koivula, A. Eur. J. Biochem. 2003, 270, 841. (328) Becker, D.; Braet, C.; Brumer, H., III; Claeyssens, M.; Divne, C.; Fagerström, B. 
R.; Harris, M.; Jones, T. A.; Kleywegt, G. J.; Koivula, A.; Mahdi, S.; Piens, K.; Sinnott, M. L.; Ståhlberg, J.; Teeri, T. T.; Underwood, M.; Wohlfahrt, G. Biochem. J. 2001, 356, 19. (329) Penttilä, M.; Lehtovaara, P.; Nevalainen, H.; Bhikhabhai, R.; Knowles, J. Gene 1986, 45, 253. (330) Van Arsdell, J. N.; Kwok, S.; Schweickart, V. L.; Ladner, M. B.; Gelfand, D. H.; Innis, M. A. Bio/Technology 1987, 5, 60. (331) Biely, P.; Vršanská, M.; Claeyssens, M. Eur. J. Biochem. 1991, 200, 157. (332) Vlasenko, E.; Schülein, M.; Cherry, J.; Xu, F. Bioresour. Technol. 2010, 101, 2405. (333) Fowler, T.; Mitchinson, C. (Genencor International, Inc.) Mutant EGIII Cellulase, DNA Encoding Such EGIII Compositions and Methods for Obtaining Same. U.S. Patent US 6,187,732 B1, Feb. 13, 2001. (334) Karlsson, J.; Siika-aho, M.; Tenkanen, M.; Tjerneld, F. J. Biotechnol. 2002, 99, 63. (335) Saloheimo, A.; Henrissat, B.; Hoffrén, A.-M.; Teleman, O.; Penttilä, M. Mol. Microbiol. 1994, 13, 219. (336) Benkő, Z.; Siika-aho, M.; Viikari, L.; Réczey, K. Enzyme Microb. Technol. 2008, 43, 109. (337) Karkehabadi, S.; Hansson, H.; Kim, S.; Piens, K.; Mitchinson, C.; Sandgren, M. J. Mol. Biol. 2008, 383, 144. (338) Gilbert, H. J.; Knox, J. P.; Boraston, A. B. Curr. Opin. Struct. Biol. 2013, 23, 669. (339) Shoseyov, O.; Shani, Z.; Levy, I. Microbiol. Mol. Biol. Rev. 2006, 70, 283. (340) Hashimoto, H. Cell. Mol. Life Sci. 2006, 63, 2954. (341) Hilden, L.; Johansson, G. Biotechnol. Lett. 2004, 26, 1683. (342) Guillén, D.; Sánchez, S.; Rodríguez-Sanoja, R. Appl. Microbiol. Biotechnol. 2010, 85, 1241. (343) van Tilbeurgh, H.; Tomme, P.; Claeyssens, M.; Bhikhabhai, R.; Pettersson, G. FEBS Lett. 1986, 204, 223.


(376) Xiao, Z.; Gao, P.; Qu, Y.; Wang, T. Biotechnol. Lett. 2001, 23, 711. (377) Wang, L.; Zhang, Y.; Gao, P. Sci. China, Ser. C: Life Sci. 2008, 51, 620. (378) Hall, M.; Bansal, P.; Lee, J. H.; Realff, M. J.; Bommarius, A. S. Bioresour. Technol. 2011, 102, 2910. (379) Wang, Y. G.; Tang, R. T.; Tao, J.; Wang, X. N.; Zheng, B. S.; Feng, Y. J. Biol. Chem. 2012, 287, 29568. (380) Hall, M.; Rubin, J.; Behrens, S. H.; Bommarius, A. S. J. Biotechnol. 2011, 155, 370. (381) Voutilainen, S. P.; Puranen, T.; Siika-aho, M.; Lappalainen, A.; Alapuranen, M.; Kallio, J.; Hooman, S.; Viikri, L.; Vehmaanperä, J.; Koivula, A. Biotechnol. Bioeng. 2008, 101, 515. (382) Kim, T.-W.; Chokhawala, H. A.; Nadler, D.; Blanch, H. W.; Clark, D. S. Biotechnol. Bioeng. 2010, 107, 601. (383) Kern, M.; McGeehan, J. E.; Streeter, S. D.; Martin, R. N. A.; Besser, K.; Elias, L.; Eborall, W.; Malyon, G. P.; Payne, C. M.; Himmel, M. E.; Schnorr, K.; Beckham, G. T.; Cragg, S. M.; Bruce, N. C.; McQueen-Mason, S. J. Proc. Natl. Acad. Sci. U.S.A. 2013, 110, 10189. (384) Várnai, A.; Siika-aho, M.; Viikari, L. Biotechnol. Biofuels 2013, 6, 30. (385) Hoffrén, A. M.; Teeri, T. T.; Teleman, O. Protein Eng. 1995, 8, 443. (386) Mulakala, C.; Reilly, P. J. Proteins: Struct., Funct., Bioinf. 2005, 60, 598. (387) Nimlos, M. R.; Matthews, J. F.; Crowley, M. F.; Walker, R. C.; Chukkapalli, G.; Brady, J. V.; Adney, W. S.; Clearyl, J. M.; Zhong, L. H.; Himmel, M. E. Protein Eng., Des. Sel. 2007, 20, 179. (388) Zhong, L.; Matthews, J. F.; Crowley, M. F.; Rignall, T.; Talon, C.; Cleary, J. M.; Walker, R. C.; Chukkapalli, G.; McCabe, C.; Nimlos, M. R.; Brooks, C. L.; Himmel, M. E.; Brady, J. W. Cellulose 2008, 15, 261. (389) Zhong, L. H.; Matthews, J. F.; Hansen, P. I.; Crowley, M. F.; Cleary, J. M.; Walker, R. C.; Nimlos, M. R.; Brooks, C. L.; Adney, W. S.; Himmel, M. E.; Brady, J. W. Carbohydr. Res. 2009, 344, 1984. (390) Tavagnacco, L.; Mason, P. E.; Schnupf, U.; Pitici, F.; Zhong, L. H.; Himmel, M. 
E.; Crowley, M.; Cesaro, A.; Brady, J. W. Carbohydr. Res. 2011, 346, 839. (391) Nimlos, M. R.; Beckham, G. T.; Matthews, J. F.; Bu, L. T.; Himmel, M. E.; Crowley, M. F. J. Biol. Chem. 2012, 287, 20603. (392) Shiiba, H.; Hayashi, S.; Yui, T. Carbohydr. Res. 2013, 374, 96. (393) Payne, C. M.; Resch, M. G.; Chen, L. Q.; Crowley, M. F.; Himmel, M. E.; Taylor, L. E., II; Sandgren, M.; Ståhlberg, J.; Stals, I.; Tan, Z. P.; Beckham, G. T. Proc. Natl. Acad. Sci. U.S.A. 2013, 110, 14646. (394) Shiiba, H.; Hayashi, S.; Yui, T. Cellulose 2012, 19, 635. (395) Yui, T.; Shiiba, H.; Tsutsumi, Y.; Hayashi, S.; Miyata, T.; Hirata, F. J. Phys. Chem. B 2010, 114, 49. (396) Mackerell, A. D.; Feig, M.; Brooks, C. L. J. Comput. Chem. 2004, 25, 1400. (397) Sakon, J.; Irwin, D.; Wilson, D. B.; Karplus, P. A. Nat. Struct. Biol. 1997, 4, 810. (398) van Aalten, D. M. F.; Synstad, B.; Brurberg, M. B.; Hough, E.; Riise, B. W.; Eijsink, V. G. H.; Wierenga, R. K. Proc. Natl. Acad. Sci. U.S.A. 2000, 97, 5842. (399) Abuja, P. M.; Pilz, I.; Claeyssens, M.; Tomme, P. Biochem. Biophys. Res. Commun. 1988, 156, 180. (400) Abuja, P. M.; Schmuck, M.; Pilz, I.; Tomme, P.; Claeyssens, M.; Esterbauer, H. Eur. Biophys. J. 1988, 15, 339. (401) Abuja, P. M.; Pilz, I.; Tomme, P.; Claeyssens, M. Biochem. Biophys. Res. Commun. 1989, 165, 615. (402) Pilz, I.; Schwarz, E.; Kilburn, D. G.; Miller, R. C., Jr.; Warren, R. A. J.; Gilkes, N. R. Biochem. J. 1990, 271, 277. (403) Meinke, A.; Schmuck, M.; Gilkes, N. R.; Kilburn, D. G.; Miller, R. C., Jr.; Warren, R. A. J. Glycobiology 1992, 2, 321. (404) Langsford, M. L.; Gilkes, N. R.; Singh, B.; Moser, B.; Miller, R. C., Jr.; Warren, R. A. J.; Kilburn, D. G. FEBS Lett. 1987, 225, 163. (405) Shen, H.; Schmuck, M.; Pilz, I.; Gilkes, N. R.; Kilburn, D. G.; Miller, R. C., Jr.; Warren, R. A. J. J. Biol. Chem. 1991, 266, 11335.

(344) Tomme, P.; van Tilbeurgh, H.; Pettersson, G.; Van Damme, J.; Vandekerckhove, J.; Knowles, J.; Teeri, T.; Claeyssens, M. Eur. J. Biochem. 1988, 170, 575. (345) Gilkes, N. R.; Warren, R. A. J.; Miller, R. C., Jr.; Kilburn, D. G. J. Biol. Chem. 1988, 263, 10401. (346) Kraulis, P. J.; Clore, G. M.; Nilges, M.; Jones, T. A.; Pettersson, G.; Knowles, J.; Gronenborn, A. M. Biochemistry 1989, 28, 7241. (347) Gouet, P.; Robert, X.; Courcelle, E. Nucleic Acids Res. 2003, 31, 3320. (348) Ståhlberg, J.; Johansson, G.; Pettersson, G. Bio/Technology 1991, 9, 286. (349) Reinikainen, T.; Ruohonen, L.; Nevanen, T.; Laaksonen, L.; Kraulis, P.; Jones, T. A.; Knowles, J. K. C.; Teeri, T. T. Proteins 1992, 14, 475. (350) Reinikainen, T.; Teleman, O.; Teeri, T. T. Proteins 1995, 22, 392. (351) Linder, M.; Lindeberg, G.; Reinikainen, T.; Teeri, T. T.; Pettersson, G. FEBS Lett. 1995, 372, 96. (352) Linder, M.; Mattinen, M. L.; Kontteli, M.; Lindeberg, G.; Ståhlberg, J.; Drakenberg, T.; Reinikainen, T.; Pettersson, G.; Annila, A. Protein Sci. 1995, 4, 1056. (353) Linder, M.; Teeri, T. T. Proc. Natl. Acad. Sci. U.S.A. 1996, 93, 12251. (354) Linder, M.; Salovuori, I.; Ruohonen, L.; Teeri, T. T. J. Biol. Chem. 1996, 271, 21268. (355) Linder, M.; Teeri, T. T. J. Biotechnol. 1997, 57, 15. (356) Mattinen, M.-L.; Kontteli, M.; Kerovuo, J.; Drakenberg, T.; Annila, A.; Linder, M.; Reinikainen, T.; Lindeberg, G. Protein Sci. 1997, 6, 294. (357) Mattinen, M.-L.; Linder, M.; Teleman, A.; Annila, A. FEBS Lett. 1997, 407, 291. (358) Srisodsuk, M.; Lehtiö, J.; Linder, M.; Margolles-Clark, E.; Reinikainen, T.; Teeri, T. T. J. Biotechnol. 1997, 57, 49. (359) Mattinen, M.-L.; Linder, M.; Drakenberg, T.; Annila, A. Eur. J. Biochem. 1998, 256, 279. (360) Carrard, G.; Linder, M. Eur. J. Biochem. 1999, 262, 637. (361) Linder, M.; Nevanen, T.; Teeri, T. T. FEBS Lett. 1999, 447, 13. (362) Lehtiö, J.; Sugiyama, J.; Gustavsson, M.; Fransson, L.; Linder, M.; Teeri, T. T. Proc. Natl. Acad. Sci. U.S.A. 
2003, 100, 484. (363) Sugimoto, N.; Igarashi, K.; Wada, M.; Samejima, M. Langmuir 2012, 28, 14323. (364) Guo, J.; Catchmark, J. M. Biomacromolecules 2013, 14, 1268. (365) Takashima, S.; Ohno, M.; Hidaka, M.; Nakamura, A.; Masaki, H.; Uozumi, T. FEBS Lett. 2007, 581, 5891. (366) Creagh, A. L.; Ong, E.; Jervis, E.; Kilburn, D. G.; Haynes, C. A. Proc. Natl. Acad. Sci. U.S.A. 1996, 93, 12229. (367) Liu, Y. S.; Baker, J. O.; Zeng, Y. N.; Himmel, M. E.; Haas, T.; Ding, S. Y. J. Biol. Chem. 2011, 286, 11195. (368) Harrison, M. J.; Nouwens, A. S.; Jardine, D. R.; Zachara, N. E.; Gooley, A. A.; Nevalainen, H.; Packer, N. H. Eur. J. Biochem. 1998, 256, 119. (369) Beckham, G. T.; Dai, Z.; Matthews, J. F.; Momany, M.; Payne, C. M.; Adney, W. S.; Baker, S. E.; Himmel, M. E. Curr. Opin. Biotechnol. 2012, 23, 338. (370) Beckham, G. T.; Matthews, J. F.; Bomble, Y. J.; Bu, L. T.; Adney, W. S.; Himmel, M. E.; Nimlos, M. R.; Crowley, M. F. J. Phys. Chem. B 2010, 114, 1447. (371) Taylor, C. B.; Talib, M. F.; McCabe, C.; Bu, L.; Adney, W. S.; Himmel, M. E.; Crowley, M. F.; Beckham, G. T. J. Biol. Chem. 2012, 287, 3147. (372) Crooks, G. E.; Hon, G.; Chandonia, J.-M.; Brenner, S. E. Genome Res. 2004, 14, 1188. (373) Chen, L. Q.; Drake, M. R.; Resch, M. G.; Greene, E. R.; Himmel, M. E.; Chaffey, P. K.; Beckham, G. T.; Tan, Z. P. Proc. Natl. Acad. Sci. U.S.A. 2014, 111, 7612. (374) Din, N.; Gilkes, N. R.; Tekant, B.; Miller, R. C., Jr.; Warren, R. A. J.; Kilburn, D. G. Bio/Technology 1991, 9, 1096. (375) Gao, P.-J.; Chen, G.-J.; Wang, T.-H.; Zhang, Y.-S.; Liu, J. Acta Biochim. Biophys. Sin. 2000, 33, 13.

DOI: 10.1021/cr500351c Chem. Rev. 2015, 115, 1308−1448


(406) Boisset, C.; Borsali, R.; Schülein, M.; Henrissat, B. FEBS Lett. 1995, 376, 49. (407) Receveur, V.; Czjzek, M.; Schülein, M.; Panine, P.; Henrissat, B. J. Biol. Chem. 2002, 277, 40887. (408) von Ossowski, I.; Eaton, J. T.; Czjzek, M.; Perkins, S. J.; Frandsen, T. P.; Schülein, M.; Panine, P.; Henrissat, B.; Receveur-Bréchot, V. Biophys. J. 2005, 88, 2823. (409) Wright, P. E.; Dyson, H. J. J. Mol. Biol. 1999, 293, 321. (410) Dunker, A. K.; Brown, C. J.; Lawson, J. D.; Iakoucheva, L. M.; Obradovic, Z. Biochemistry 2002, 41, 6573. (411) Dunker, A. K.; Lawson, J. D.; Brown, C. J.; Williams, R. M.; Romero, P.; Oh, J. S.; Oldfield, C. J.; Campen, A. M.; Ratliff, C. M.; Hipps, K. W.; Ausio, J.; Nissen, M. S.; Reeves, R.; Kang, C.; Kissinger, C. R.; Bailey, R. W.; Griswold, M. D.; Chiu, W.; Garner, E. C.; Obradovic, Z. J. Mol. Graphics Modell. 2001, 19, 26. (412) Dunker, A. K.; Obradovic, Z. Nat. Biotechnol. 2001, 19, 805. (413) Dyson, H. J.; Wright, P. E. Nat. Rev. Mol. Cell Biol. 2005, 6, 197. (414) Lima, L. H. F.; Serpa, V. I.; Rosseto, F. R.; Sartori, G. R.; Neto, M. D.; Martinez, L.; Polikarpov, I. Cellulose 2013, 20, 1573. (415) Poon, D. K. Y.; Withers, S. G.; McIntosh, L. P. J. Biol. Chem. 2007, 282, 2091. (416) Beckham, G. T.; Bomble, Y. J.; Matthews, J. F.; Taylor, C. B.; Resch, M. G.; Yarbrough, J. M.; Decker, S. R.; Bu, L. T.; Zhao, X. C.; McCabe, C.; Wohlert, J.; Bergenstrahle, M.; Brady, J. W.; Adney, W. S.; Himmel, M. E.; Crowley, M. F. Biophys. J. 2010, 99, 3773. (417) Sammond, D. W.; Payne, C. M.; Brunecky, R.; Himmel, M. E.; Crowley, M. F.; Beckham, G. T. PLoS One 2012, 7, e48615. (418) Hui, J. P. M.; White, T. C.; Thibault, P. Glycobiology 2002, 12, 837. (419) Stals, I.; Sandra, K.; Geysens, S.; Contreras, R.; Van Beeumen, J.; Claeyssens, M. Glycobiology 2004, 14, 713. (420) Christiansen, M. N.; Kolarich, D.; Nevalainen, H.; Packer, N. H.; Jensen, P. H. Anal. Chem. 2010, 82, 3500. (421) Deshpande, N.; Wilkins, M. R.; Packer, N.; Nevalainen, H.
Glycobiology 2008, 18, 626. (422) Goto, M. Biosci. Biotechnol. Biochem. 2007, 71, 1415. (423) Cummings, R. D.; Doering, T. L. In Essentials of Glycobiology; Varki, A., Cummings, R. D., Esko, J. D., Freeze, H. H., Stanley, P., Bertozzi, C. R., Hart, G. W., Etzler, M. E., Eds.; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, NY, 2009. (424) Zhao, X.; Rignall, T. R.; McCabe, C.; Adney, W. S.; Himmel, M. E. Chem. Phys. Lett. 2008, 460, 284. (425) Ting, C. L.; Makarov, D. E.; Wang, Z. G. J. Phys. Chem. B 2009, 113, 4970. (426) Martinez, D.; Larrondo, L. F.; Putnam, N.; Gelpke, M. D. S.; Huang, K.; Chapman, J.; Helfenbein, K. G.; Ramaiya, P.; Detter, J. C.; Larimer, F.; Coutinho, P. M.; Henrissat, B.; Berka, R.; Cullen, D.; Rokhsar, D. Nat. Biotechnol. 2004, 22, 695. (427) Wymelenberg, A. V.; Minges, P.; Sabat, G.; Martinez, D.; Aerts, A.; Salamov, A.; Grigoriev, I.; Shapiro, H.; Putnam, N.; Belinky, P.; Dosoretz, C.; Gaskell, J.; Kersten, P.; Cullen, D. Fungal Genet. Biol. 2006, 43, 343. (428) King, A. J.; Cragg, S. M.; Li, Y.; Dymond, J.; Guille, M. J.; Bowles, D. J.; Bruce, N. C.; Graham, I. A.; McQueen-Mason, S. J. Proc. Natl. Acad. Sci. U.S.A. 2010, 107, 5345. (429) Todaka, N.; Moriya, S.; Saita, K.; Hondo, T.; Kiuchi, I.; Takasu, H.; Ohkuma, M.; Piero, C.; Hayashizaki, Y.; Kudo, T. FEMS Microbiol. Ecol. 2007, 59, 592. (430) Eichinger, L.; Pachebat, J. A.; Glockner, G.; Rajandream, M. A.; Sucgang, R.; Berriman, M.; Song, J.; Olsen, R.; Szafranski, K.; Xu, Q.; Tunggal, B.; Kummerfeld, S.; Madera, M.; Konfortov, B. A.; Rivero, F.; Bankier, A. 
T.; Lehmann, R.; Hamlin, N.; Davies, R.; Gaudet, P.; Fey, P.; Pilcher, K.; Chen, G.; Saunders, D.; Sodergren, E.; Davis, P.; Kerhornou, A.; Nie, X.; Hall, N.; Anjard, C.; Hemphill, L.; Bason, N.; Farbrother, P.; Desany, B.; Just, E.; Morio, T.; Rost, R.; Churcher, C.; Cooper, J.; Haydock, S.; van Driessche, N.; Cronin, A.; Goodhead, I.; Muzny, D.; Mourier, T.; Pain, A.; Lu, M.; Harper, D.; Lindsay, R.; Hauser, H.; James, K.; Quiles, M.; Madan Babu, M.; Saito, T.; Buchrieser, C.; Wardroper, A.; Felder, M.; Thangavelu, M.; Johnson,

D.; Knights, A.; Loulseged, H.; Mungall, K.; Oliver, K.; Price, C.; Quail, M. A.; Urushihara, H.; Hernandez, J.; Rabbinowitsch, E.; Steffen, D.; Sanders, M.; Ma, J.; Kohara, Y.; Sharp, S.; Simmonds, M.; Spiegler, S.; Tivey, A.; Sugano, S.; White, B.; Walker, D.; Woodward, J.; Winckler, T.; Tanaka, Y.; Shaulsky, G.; Schleicher, M.; Weinstock, G.; Rosenthal, A.; Cox, E. C.; Chisholm, R. L.; Gibbs, R.; Loomis, W. F.; Platzer, M.; Kay, R. R.; Williams, J.; Dear, P. H.; Noegel, A. A.; Barrell, B.; Kuspa, A. Nature 2005, 435, 43. (431) Sucgang, R.; Kuo, A.; Tian, X. J.; Salerno, W.; Parikh, A.; Feasley, C. L.; Dalin, E.; Tu, H.; Huang, E. Y.; Barry, K.; Lindquist, E.; Shapiro, H.; Bruce, D.; Schmutz, J.; Salamov, A.; Fey, P.; Gaudet, P.; Anjard, C.; Babu, M. M.; Basu, S.; Bushmanova, Y.; van der Wel, H.; Katoh-Kurasawa, M.; Dinh, C.; Coutinho, P. M.; Saito, T.; Elias, M.; Schaap, P.; Kay, R. R.; Henrissat, B.; Eichinger, L.; Rivero, F.; Putnam, N. H.; West, C. M.; Loomis, W. F.; Chisholm, R. L.; Shaulsky, G.; Strassmann, J. E.; Queller, D. C.; Kuspa, A.; Grigoriev, I. V. Genome Biol. 2011, 12, R20. (432) Kunii, M.; Yasuno, M.; Shindo, Y.; Kawata, T. Dev. Genes Evol. 2014, 224, 25. (433) Sethi, A.; Kovaleva, E. S.; Slack, J. M.; Brown, S.; Buchman, G. W.; Scharf, M. E. Arch. Insect Biochem. Physiol. 2013, 84, 175. (434) Bissett, F. H. J. Chromatogr. A 1979, 178, 515. (435) Fägerstam, L. G.; Pettersson, L. G. FEBS Lett. 1979, 98, 363. (436) Fägerstam, L. G.; Pettersson, L. G. FEBS Lett. 1980, 119, 97. (437) Shoemaker, S.; Watt, K.; Tsitovsky, G.; Cox, R. Nat. Biotechnol. 1983, 1, 687. (438) Fägerstam, L. G.; Pettersson, L. G.; Engström, J. Å. FEBS Lett. 1984, 167, 309. (439) Henrissat, B.; Claeyssens, M.; Tomme, P.; Lemesle, L.; Mornon, J.-P. Gene 1989, 81, 83. (440) Teeri, T. T.; Koivula, A.; Linder, M.; Wohlfahrt, G.; Divne, C.; Jones, T. A. Biochem. Soc. Trans. 1998, 26, 173. (441) Knott, B. C.; Momeni, M. H.; Crowley, M. F.; Mackenzie, L. F.; Götz, A.
W.; Sandgren, M.; Withers, S. G.; Ståhlberg, J. S.; Beckham, G. T. J. Am. Chem. Soc. 2013, 136, 321. (442) Schou, C.; Rasmussen, G.; Kaltoft, M.-B.; Henrissat, B.; Schülein, M. Eur. J. Biochem. 1993, 217, 947. (443) Knowles, J. K. C.; Lehtovaara, P.; Murray, M.; Sinnott, M. L. J. Chem. Soc., Chem. Commun. 1988, 21, 1401. (444) McCarter, J. D.; Withers, S. G. Curr. Opin. Struct. Biol. 1994, 4, 885. (445) Vrsanska, M.; Biely, P. Carbohydr. Res. 1992, 227, 19. (446) Ståhlberg, J.; Divne, C.; Koivula, A.; Piens, K.; Claeyssens, M.; Teeri, T. T.; Jones, T. A. J. Mol. Biol. 1996, 264, 337. (447) Tews, I.; Perrakis, A.; Oppenheim, A.; Dauter, Z.; Wilson, K. S.; Vorgias, C. E. Nat. Struct. Biol. 1996, 3, 638. (448) Sulzenbacher, G.; Schülein, M.; Davies, G. J. Biochemistry 1997, 36, 5902. (449) Momeni, M. H.; Payne, C. M.; Hansson, H.; Mikkelsen, N. E.; Svedberg, J.; Engström, Å.; Sandgren, M.; Beckham, G. T.; Ståhlberg, J. J. Biol. Chem. 2013, 288, 5861. (450) Kleywegt, G. J.; Zou, J. Y.; Divne, C.; Davies, G. J.; Sinning, I.; Ståhlberg, J.; Reinikainen, T.; Srisodsuk, M.; Teeri, T. T.; Jones, T. A. J. Mol. Biol. 1997, 272, 383. (451) Mackenzie, L. F.; Davies, G. J.; Schülein, M.; Withers, S. G. Biochemistry 1997, 36, 5893. (452) Mackenzie, L. F.; Sulzenbacher, G.; Divne, C.; Jones, T. A.; Wöldike, H. F.; Schülein, M.; Withers, S. G.; Davies, G. J. Biochem. J. 1998, 335, 409. (453) Namchuk, M. N.; McCarter, J. D.; Becalski, A.; Andrews, T.; Withers, S. G. J. Am. Chem. Soc. 2000, 122, 1270. (454) Withers, S. G.; Aebersold, R. Protein Sci. 1995, 4, 361. (455) Street, I. P.; Kempton, J. B.; Withers, S. G. Biochemistry 1992, 31, 9970. (456) Klarskov, K.; Piens, K.; Ståhlberg, J.; Høj, P. B.; Van Beeumen, J.; Claeyssens, M. Carbohydr. Res. 1997, 304, 143. (457) Davies, G. J.; Ducros, V.; Lewis, R. J.; Borchert, T. V.; Schülein, M. J. Biotechnol. 1997, 57, 91.


(458) Ubhayasekera, W.; Muñoz, I. G.; Vasella, A.; Ståhlberg, J.; Mowbray, S. L. FEBS J. 2005, 272, 1952. (459) Knott, B. C.; Crowley, M. F.; Himmel, M. E.; Ståhlberg, J.; Beckham, G. T. J. Am. Chem. Soc. 2014, 136, 8810. (460) Muñoz, I. G.; Ubhayasekera, W.; Henriksson, H.; Szabó, I.; Pettersson, G.; Johansson, G.; Mowbray, S. L.; Ståhlberg, J. J. Mol. Biol. 2001, 314, 1097. (461) von Ossowski, I.; Ståhlberg, J.; Koivula, A.; Piens, K.; Becker, D.; Boer, H.; Harle, R.; Harris, M.; Divne, C.; Mahdi, S.; Zhao, Y. X.; Driguez, H.; Claeyssens, M.; Sinnott, M. L.; Teeri, T. T. J. Mol. Biol. 2003, 333, 817. (462) Muñoz, I. G.; Mowbray, S. L.; Ståhlberg, J. Acta Crystallogr., Sect. D: Biol. Crystallogr. 2003, 59, 637. (463) Ståhlberg, J.; Henriksson, H.; Divne, C.; Isaksson, R.; Pettersson, G.; Johansson, G.; Jones, T. A. J. Mol. Biol. 2001, 305, 79. (464) Grassick, A.; Murray, P. G.; Thompson, R.; Collins, C. M.; Byrnes, L.; Birrane, G.; Higgins, T. M.; Tuohy, M. G. Eur. J. Biochem. 2004, 271, 4495. (465) Parkkinen, T.; Koivula, A.; Vehmaanpera, J.; Rouvinen, J. Protein Sci. 2008, 17, 1383. (466) Textor, L. C.; Colussi, F.; Silveira, R. L.; Serpa, V.; de Mello, B. L.; Muniz, J. R. C.; Squina, F. M.; Pereira, N.; Skaf, M. S.; Polikarpov, I. FEBS J. 2013, 280, 56. (467) Gao, J. L.; Truhlar, D. G. Annu. Rev. Phys. Chem. 2002, 53, 467. (468) Garcia-Viloca, M.; Gao, J.; Karplus, M.; Truhlar, D. G. Science 2004, 303, 186. (469) Zhang, Y.; Yan, S. H.; Yao, L. S. J. Phys. Chem. B 2013, 117, 8714. (470) Barnett, C. B.; Wilkinson, K. A.; Naidoo, K. J. J. Am. Chem. Soc. 2011, 133, 19474. (471) Li, J. H.; Du, L. K.; Wang, L. S. J. Phys. Chem. B 2010, 114, 15261. (472) Yan, S. H.; Li, T.; Yao, L. S. J. Phys. Chem. B 2011, 115, 4982. (473) Bolhuis, P. G.; Chandler, D.; Dellago, C.; Geissler, P. L. Annu. Rev. Phys. Chem. 2002, 53, 291. (474) Peters, B.; Beckham, G. T.; Trout, B. L. J. Chem. Phys. 2007, 127, 034109. (475) Peters, B.; Trout, B. L. J. Chem. Phys. 2006, 125, 054108. 
(476) Igarashi, K.; Uchihashi, T.; Koivula, A.; Wada, M.; Kimura, S.; Okamoto, T.; Penttilä, M.; Ando, T.; Samejima, M. Science 2011, 333, 1279. (477) Barnett, C. B.; Wilkinson, K. A.; Naidoo, K. J. J. Am. Chem. Soc. 2010, 132, 12800. (478) Granum, D. M.; Vyas, S.; Sambasivarao, S. V.; Maupin, C. M. J. Phys. Chem. B 2013, 118, 434. (479) Bu, L. T.; Crowley, M. F.; Himmel, M. E.; Beckham, G. T. J. Biol. Chem. 2013, 288, 12175. (480) Zhang, Y.; Yan, S.; Yao, L. Theor. Chem. Acc. 2013, 132, 1367. (481) Lin, Y.; Silvestre-Ryan, J.; Himmel, M. E.; Crowley, M. F.; Beckham, G. T.; Chu, J.-W. J. Am. Chem. Soc. 2011, 133, 16617. (482) Lin, Y.; Beckham, G. T.; Himmel, M. E.; Crowley, M. F.; Chu, J.-W. J. Phys. Chem. B 2013, 117, 10750. (483) Szijártó, N.; Horan, E.; Zhang, J.; Puranen, T.; Siika-Aho, M.; Viikari, L. Biotechnol. Biofuels 2011, 4, 2. (484) Takada, G.; Kawaguchi, T.; Sumitani, J.; Arai, M. Biosci. Biotechnol. Biochem. 1998, 62, 1615. (485) Takada, G.; Kawaguchi, T.; Sumitani, J.-I.; Arai, M. J. Ferment. Bioeng. 1998, 85, 1. (486) Kanamasa, S.; Mochizuki, M.; Takada, G.; Kawaguchi, T.; Sumitani, J.-I.; Arai, M. J. Biosci. Bioeng. 2003, 95, 627. (487) Bauer, S.; Vasu, P.; Persson, S.; Mort, A. J.; Somerville, C. R. Proc. Natl. Acad. Sci. U.S.A. 2006, 103, 11417. (488) Gielkens, M. M. C.; Dekkers, E.; Visser, J.; de Graaff, L. H. Appl. Environ. Microbiol. 1999, 65, 4340. (489) Kitamoto, N.; Go, M.; Shibayama, T.; Kimura, T.; Kito, Y.; Ohmiya, K.; Tsukagoshi, N. Appl. Microbiol. Biotechnol. 1996, 46, 538. (490) Kotaka, A.; Bando, H.; Kaya, M.; Kato-Murai, M.; Kuroda, K.; Sahara, H.; Hata, Y.; Kondo, A.; Ueda, M. J. Biosci. Bioeng. 2008, 105, 622.

(491) Luo, H.; Yang, J.; Yang, P.; Li, J.; Huang, H.; Shi, P.; Bai, Y.; Wang, Y.; Fan, Y.; Yao, B. Appl. Microbiol. Biotechnol. 2010, 85, 1015. (492) Zhang, Y.; Xu, X.; Zhou, X.; Chen, R.; Yang, P.; Meng, Q.; Meng, K.; Luo, H.; Yuan, J.; Yao, B.; Zhang, W. PLoS One 2013, 8, e81993. (493) Li, Y.-L.; Li, H.; Li, A.-N.; Li, D.-C. J. Appl. Microbiol. 2009, 106, 1867. (494) Gusakov, A. V.; Sinitsyn, A. P.; Salanovich, T. N.; Bukhtojarov, F. E.; Markov, A. V.; Ustinov, B. B.; van Zeijl, C.; Punt, P.; Burlingame, R. Enzyme Microb. Technol. 2005, 36, 57. (495) Müller, U.; Tenberge, K. B.; Oeser, B.; Tudzynski, P. Mol. Plant-Microbe Interact. 1997, 10, 268. (496) Kanokratana, P.; Chantasingh, D.; Champreda, V.; Tanapongpipat, S.; Pootanakit, K.; Eurwilaichitr, L. Protein Expression Purif. 2008, 58, 148. (497) Takashima, S.; Iikura, H.; Nakamura, A.; Hidaka, M.; Masaki, H.; Uozumi, T. J. Biochem. 1998, 124, 717. (498) Takashima, S.; Nakamura, A.; Hidaka, M.; Masaki, H.; Uozumi, T. J. Biotechnol. 1996, 50, 137. (499) Schülein, M. J. Biotechnol. 1997, 57, 71. (500) Xu, F.; Ding, H.; Tejirian, A. Enzyme Microb. Technol. 2009, 45, 203. (501) Hamada, N.; Ishikawa, K.; Fuse, N.; Kodaira, R.; Shimosaka, M.; Amano, Y.; Kanda, T.; Okazaki, M. J. Biosci. Bioeng. 1999, 87, 442. (502) Miettinen-Oinonen, A.; Londesborough, J.; Joutsjoki, V.; Lantto, R.; Vehmaanperä, J. Enzyme Microb. Technol. 2004, 34, 332. (503) Szijártó, N.; Siika-aho, M.; Tenkanen, M.; Alapuranen, M.; Vehmaanperä, J.; Réczey, K.; Viikari, L. J. Biotechnol. 2008, 136, 140. (504) Voutilainen, S. P.; Boer, H.; Linder, M. B.; Puranen, T.; Rouvinen, J.; Vehmaanperä, J.; Koivula, A. Enzyme Microb. Technol. 2007, 41, 234. (505) Voutilainen, S. P.; Boer, H.; Alapuranen, M.; Jänis, J.; Vehmaanperä, J.; Koivula, A. Appl. Microbiol. Biotechnol. 2009, 83, 261. (506) Karnaouri, A. C.; Topakas, E.; Christakopoulos, P. Appl. Microbiol. Biotechnol. 2014, 98, 231. (507) Hou, Y.; Wang, T.; Long, H.; Zhu, H. Acta Biochim. Biophys. Sin.
2007, 39, 101. (508) Gao, L.; Gao, F.; Wang, L. S.; Geng, C. L.; Chi, L. L.; Zhao, J.; Qu, Y. B. J. Biol. Chem. 2012, 287, 15906. (509) Wei, X. M.; Qin, Y. Q.; Qu, Y. B. J. Microbiol. Biotechnol. 2010, 20, 265. (510) Limam, F.; Chaabouni, S. E.; Ghrir, R.; Marzouki, N. Enzyme Microb. Technol. 1995, 17, 340. (511) Marjamaa, K.; Toth, K.; Bromann, P. A.; Szakacs, G.; Kruus, K. Enzyme Microb. Technol. 2013, 52, 358. (512) Uzcategui, E.; Ruiz, A.; Montesino, R.; Johansson, G.; Pettersson, G. J. Biotechnol. 1991, 19, 271. (513) Tuohy, M. G.; Walsh, D. J.; Murray, P. G.; Claeyssens, M.; Cuffe, M. M.; Savage, A. V.; Coughlan, M. P. Biochim. Biophys. Acta 2002, 1596, 366. (514) Texier, H.; Dumon, C.; Neugnot-Roux, V.; Maestracci, M.; O’Donohue, M. J. J. Ind. Microbiol. Biotechnol. 2012, 39, 1569. (515) Furniss, C. S. M.; Williamson, G.; Kroon, P. A. J. Sci. Food Agric. 2005, 85, 574. (516) Hong, J.; Tamaki, H.; Yamamoto, K.; Kumagai, H. Appl. Microbiol. Biotechnol. 2003, 63, 42. (517) Colussi, F.; Serpa, V.; Delabona, P. d. S.; Manzine, L. R.; Voltatodio, M. L.; Alves, R.; Mello, B. L.; Pereira, N., Jr.; Farinas, C. S.; Golubev, A. M.; Santos, M. A. M.; Polikarpov, I. J. Microbiol. Biotechnol. 2011, 21, 808. (518) Ganga, A.; González-Candelas, L.; Ramón, D.; Pérez-González, J. A. J. Agric. Food Chem. 1997, 45, 2359. (519) Mitrovic, A.; Flicker, K.; Steinkellner, G.; Gruber, K.; Reisinger, C.; Schirrmacher, G.; Camattari, A.; Glieder, A. J. Mol. Catal. B: Enzym. 2014, 103, 16. (520) Song, J.; Liu, B.; Liu, Z.; Yang, Q. Mol. Biol. Rep. 2010, 37, 2135. (521) Horn, S. J.; Sørlie, M.; Vårum, K. M.; Väljamäe, P.; Eijsink, V. G. H. Methods Enzymol. 2012, 510, 69. (522) Doner, L. W.; Irwin, P. L. Anal. Biochem. 1992, 202, 50.


(523) Irwin, D. C.; Spezio, M.; Walker, L. P.; Wilson, D. B. Biotechnol. Bioeng. 1993, 42, 1002. (524) Irwin, D.; Shin, D.-H.; Zhang, S.; Barr, B. K.; Sakon, J.; Karplus, P. A.; Wilson, D. B. J. Bacteriol. 1998, 180, 1709. (525) Koivula, A.; Kinnari, T.; Harjunpaa, V.; Ruohonen, L.; Teleman, A.; Drakenberg, T.; Rouvinen, J.; Jones, T. A.; Teeri, T. T. FEBS Lett. 1998, 429, 341. (526) Zhang, S.; Irwin, D. C.; Wilson, D. B. Eur. J. Biochem. 2000, 267, 3101. (527) Kipper, K.; Väljamäe, P.; Johansson, G. Biochem. J. 2005, 385, 527. (528) Vuong, T. V.; Wilson, D. B. Appl. Environ. Microbiol. 2009, 75, 6655. (529) Watson, B. J.; Zhang, H.; Longmire, A. G.; Moon, Y. H.; Hutcheson, S. W. J. Bacteriol. 2009, 191, 5697. (530) Velleste, R.; Teugjas, H.; Väljamäe, P. Cellulose 2010, 17, 125. (531) Jalak, J.; Väljamäe, P. Biotechnol. Bioeng. 2010, 106, 871. (532) Fox, J. M.; Levine, S. E.; Clark, D. S.; Blanch, H. W. Biochemistry 2012, 51, 442. (533) Zhang, Y. H. P.; Lynd, L. R. Biomacromolecules 2005, 6, 1510. (534) Bubner, P.; Plank, H.; Nidetzky, B. Biotechnol. Bioeng. 2013, 110, 1529. (535) Goacher, R. E.; Selig, M. J.; Master, E. R. Curr. Opin. Biotechnol. 2014, 27, 123. (536) White, A. R.; Brown, R. M., Jr. Proc. Natl. Acad. Sci. U.S.A. 1981, 78, 1047. (537) Chanzy, H.; Henrissat, B.; Vuong, R.; Schülein, M. FEBS Lett. 1983, 153, 113. (538) Chanzy, H.; Henrissat, B.; Vuong, R. FEBS Lett. 1984, 172, 193. (539) Blanchette, R. A.; Abad, A. R.; Cease, K. R.; Lovrien, R. E.; Leathers, T. D. Appl. Environ. Microbiol. 1989, 55, 2293. (540) Boisset, C.; Fraschini, C.; Schülein, M.; Henrissat, B.; Chanzy, H. Appl. Environ. Microbiol. 2000, 66, 1444. (541) Donohoe, B. S.; Selig, M. J.; Viamajala, S.; Vinzant, T. B.; Adney, W. S.; Himmel, M. E. Biotechnol. Bioeng. 2009, 103, 480. (542) Imai, T.; Boisset, C.; Samejima, M.; Igarashi, K.; Sugiyama, J. FEBS Lett. 1998, 432, 113. (543) Lee, H. J.; Brown, R. M., Jr. J. Biotechnol. 1997, 57, 127. (544) Nieves, R. A.; Ellis, R. P.; Todd, R. 
J.; Johnson, T. J. A.; Grohmann, K.; Himmel, M. E. Appl. Environ. Microbiol. 1991, 57, 3163. (545) Lee, I.; Evans, B. R.; Lane, L. M.; Woodward, J. Bioresour. Technol. 1996, 58, 163. (546) Ganner, T.; Bubner, P.; Eibinger, M.; Mayrhofer, C.; Plank, H.; Nidetzky, B. J. Biol. Chem. 2012, 287, 43215. (547) Lee, I.; Evans, B. R.; Woodward, J. Ultramicroscopy 2000, 82, 213. (548) Santa-Maria, M.; Jeoh, T. Biomacromolecules 2010, 11, 2000. (549) Wang, J. P.; Quirk, A.; Lipkowski, J.; Dutcher, J. R.; Hill, C.; Mark, A.; Clarke, A. J. Langmuir 2012, 28, 9664. (550) Jeoh, T.; Santa-Maria, M. C.; O’Dell, P. J. Carbohydr. Polym. 2013, 97, 581. (551) Wang, J. P.; Quirk, A.; Lipkowski, J.; Dutcher, J. R.; Clarke, A. J. Langmuir 2013, 29, 14997. (552) Bubner, P.; Dohr, J.; Plank, H.; Mayrhofer, C.; Nidetzky, B. J. Biol. Chem. 2012, 287, 2759. (553) Boisset, C.; Pétrequin, C.; Chanzy, H.; Henrissat, B.; Schülein, M. Biotechnol. Bioeng. 2001, 72, 339. (554) Luterbacher, J. S.; Walker, L. P.; Moran-Mirabal, J. M. Biotechnol. Bioeng. 2013, 110, 108. (555) Jung, J.; Sethi, A.; Gaiotto, T.; Han, J. J.; Jeoh, T.; Gnanakaran, S.; Goodwin, P. M. J. Biol. Chem. 2013, 288, 24164. (556) Bansal, P.; Hall, M.; Realff, M. J.; Lee, J. H.; Bommarius, A. S. Biotechnol. Adv. 2009, 27, 833. (557) Jalak, J.; Kurašin, M.; Teugjas, H.; Väljamäe, P. J. Biol. Chem. 2012, 287, 28802. (558) Igarashi, K.; Koivula, A.; Wada, M.; Kimura, S.; Penttilä, M.; Samejima, M. J. Biol. Chem. 2009, 284, 36186. (559) Cruys-Bagger, N.; Elmerdahl, J.; Praestgaard, E.; Borch, K.; Westh, P. FEBS J. 2013, 280, 3952.

(560) Cruys-Bagger, N.; Elmerdahl, J.; Praestgaard, E.; Tatsumi, H.; Spodsberg, N.; Borch, K.; Westh, P. J. Biol. Chem. 2012, 287, 18451. (561) Cruys-Bagger, N.; Tatsumi, H.; Ren, G. R.; Borch, K.; Westh, P. Biochemistry 2013, 52, 8938. (562) Praestgaard, E.; Elmerdahl, J.; Murphy, L.; Nymand, S.; McFarland, K. C.; Borch, K.; Westh, P. FEBS J. 2011, 278, 1547. (563) Horn, S. J.; Sikorski, P.; Cederkvist, J. B.; Vaaje-Kolstad, G.; Sørlie, M.; Synstad, B.; Vriend, G.; Vårum, K. M.; Eijsink, V. G. H. Proc. Natl. Acad. Sci. U.S.A. 2006, 103, 18089. (564) Nakamura, A.; Watanabe, H.; Ishida, T.; Uchihashi, T.; Wada, M.; Ando, T.; Igarashi, K.; Samejima, M. J. Am. Chem. Soc. 2014, 136, 4584. (565) Henrissat, B. Cellul. Commun. 1998, 5, 84. (566) Lee, T. M.; Farrow, M. F.; Arnold, F. H.; Mayo, S. L. Protein Sci. 2011, 20, 1935. (567) Sandgren, M.; Shaw, A.; Ropp, T. H.; Wu, S.; Bott, R.; Cameron, A. D.; Ståhlberg, J.; Mitchinson, C.; Jones, T. A. J. Mol. Biol. 2001, 308, 295. (568) Väljamäe, P.; Sild, V.; Nutt, A.; Pettersson, G.; Johansson, G. Eur. J. Biochem. 1999, 266, 327. (569) Eriksson, T.; Karlsson, J.; Tjerneld, F. Appl. Biochem. Biotechnol. 2002, 101, 41. (570) Wood, T. M.; McCrae, S. I. Biochem. J. 1972, 128, 1183. (571) Bu, L. T.; Beckham, G. T.; Shirts, M. R.; Nimlos, M. R.; Adney, W. S.; Himmel, M. E.; Crowley, M. F. J. Biol. Chem. 2011, 286, 18161. (572) Bu, L. T.; Nimlos, M. R.; Shirts, M. R.; Ståhlberg, J.; Himmel, M. E.; Crowley, M. F.; Beckham, G. T. J. Biol. Chem. 2012, 287, 24807. (573) Beckham, G. T.; Ståhlberg, J.; Knott, B. C.; Himmel, M. E.; Crowley, M. F.; Sandgren, M.; Sørlie, M.; Payne, C. M. Curr. Opin. Biotechnol. 2014, 27, 96. (574) Koivula, A.; Reinikainen, T.; Ruohonen, L.; Valkeajarvi, A.; Claeyssens, M.; Teleman, O.; Kleywegt, G. J.; Szardenings, M.; Rouvinen, J.; Jones, T. A.; Teeri, T. T. Protein Eng. 1996, 9, 691. (575) Nakamura, A.; Tsukada, T.; Auer, S.; Furuta, T.; Wada, M.; Koivula, A.; Igarashi, K.; Samejima, M. J. Biol. Chem. 
2013, 288, 13503. (576) GhattyVenkataKrishna, P. K.; Alekozai, E. M.; Beckham, G. T.; Schulz, R.; Crowley, M. F.; Uberbacher, E. C.; Cheng, X. Biophys. J. 2013, 104, 904. (577) Taylor, C. B.; Payne, C. M.; Himmel, M. E.; Crowley, M. F.; McCabe, C.; Beckham, G. T. J. Phys. Chem. B 2013, 117, 4924. (578) Payne, C. M.; Bomble, Y.; Taylor, C. B.; McCabe, C.; Himmel, M. E.; Crowley, M. F.; Beckham, G. T. J. Biol. Chem. 2011, 286, 41028. (579) Zakariassen, H.; Aam, B. B.; Horn, S. J.; Vårum, K. M.; Sørlie, M.; Eijsink, V. G. H. J. Biol. Chem. 2009, 284, 10610. (580) Payne, C. M.; Jiang, W.; Shirts, M. R.; Himmel, M. E.; Crowley, M. F.; Beckham, G. T. J. Am. Chem. Soc. 2013, 135, 18831. (581) Holtzapple, M.; Cognata, M.; Shu, Y.; Hendrickson, C. Biotechnol. Bioeng. 1990, 36, 275. (582) Gusakov, A. V.; Sinitsyn, A. P. Biotechnol. Bioeng. 1992, 40, 663. (583) Pingali, S. V.; O’Neill, H. M.; McGaughey, J.; Urban, V. S.; Rempe, C. S.; Petridis, L.; Smith, J. C.; Evans, B. R.; Heller, W. T. J. Biol. Chem. 2011, 286, 32801. (584) Andrić, P.; Meyer, A. S.; Jensen, P. A.; Dam-Johansen, K. Biotechnol. Adv. 2010, 28, 308. (585) Gan, Q.; Allen, S. J.; Taylor, G. Biochem. Eng. J. 2002, 12, 223. (586) Tengborg, C.; Galbe, M.; Zacchi, G. Enzyme Microb. Technol. 2001, 28, 835. (587) Xiao, Z. Z.; Zhang, X.; Gregg, D. J.; Saddler, J. N. Appl. Biochem. Biotechnol. 2004, 113, 1115. (588) Lee, Y. H.; Fan, L. T. Biotechnol. Bioeng. 1983, 25, 939. (589) Andrić, P.; Meyer, A. S.; Jensen, P. A.; Dam-Johansen, K. Biotechnol. Adv. 2010, 28, 407. (590) Gan, Q.; Allen, S. J.; Taylor, G. Process Biochem. 2003, 38, 1003. (591) Gusakov, A. V.; Sinitsyn, A. P.; Klyosov, A. A. Biotechnol. Bioeng. 1987, 29, 906. (592) Gavlighi, H. A.; Meyer, A. S.; Mikkelsen, J. D. Biotechnol. Lett. 2013, 35, 205. (593) Igarashi, K.; Samejima, M.; Eriksson, K. E. L. Eur. J. Biochem. 1998, 253, 101.


(594) Dekker, R. F. H.; Wallis, A. F. A. Biotechnol. Bioeng. 1983, 25, 3027. (595) Sternberg, D.; Vijayakumar, P.; Reese, E. T. Can. J. Microbiol. 1977, 23, 139. (596) Bommarius, A. S.; Katona, A.; Cheben, S. E.; Patel, A. S.; Ragauskas, A. J.; Knudson, K.; Pu, Y. Metab. Eng. 2008, 10, 370. (597) Teugjas, H.; Väljamäe, P. Biotechnol. Biofuels 2013, 6, 105. (598) Halliwell, G.; Griffin, M. Biochem. J. 1973, 135, 587. (599) Johnson, E. A.; Reese, E. T.; Demain, A. L. J. Appl. Biochem. 1982, 4, 64. (600) Väljamäe, P.; Sild, V.; Pettersson, G.; Johansson, G. Eur. J. Biochem. 1998, 253, 469. (601) Zhang, S.; Wolfgang, D. E.; Wilson, D. B. Biotechnol. Bioeng. 1999, 66, 35. (602) Bezerra, R. M. F.; Dias, A. A. Appl. Biochem. Biotechnol. 2004, 112, 173. (603) Oh, K. K.; Kim, S. W.; Jeong, Y. S.; Hong, S. I. Appl. Biochem. Biotechnol. 2000, 89, 15. (604) Philippidis, G. P.; Smith, T. K.; Wyman, C. E. Biotechnol. Bioeng. 1993, 41, 846. (605) Wald, S.; Wilke, C. R.; Blanch, H. W. Biotechnol. Bioeng. 1984, 26, 221. (606) Claeyssens, M.; van Tilbeurgh, H.; Tomme, P.; Wood, T. M.; McRae, S. I. Biochem. J. 1989, 261, 819. (607) van Tilbeurgh, H.; Claeyssens, M. FEBS Lett. 1985, 187, 283. (608) Vonhoff, S.; Piens, K.; Pipelier, M.; Braet, C.; Claeyssens, M.; Vasella, A. Helv. Chim. Acta 1999, 82, 963. (609) Hsu, T. A.; Gong, C. S.; Tsao, G. T. Biotechnol. Bioeng. 1980, 22, 2305. (610) Nidetzky, B.; Zachariae, W.; Gercken, G.; Hayn, M.; Steiner, W. Enzyme Microb. Technol. 1994, 16, 43. (611) Gruno, M.; Väljamäe, P.; Pettersson, G.; Johansson, G. Biotechnol. Bioeng. 2004, 86, 503. (612) van Tilbeurgh, H.; Loontiens, F. G.; De Bruyne, C. K.; Claeyssens, M. Methods Enzymol. 1988, 160, 45. (613) Du, F. Y.; Wolger, E.; Wallace, L.; Liu, A.; Kaper, T.; Kelemen, B. Appl. Biochem. Biotechnol. 2010, 161, 313. (614) Teugjas, H.; Väljamäe, P. Biotechnol. Biofuels 2013, 6, 104. (615) Murphy, L.; Bohlin, C.; Baumann, M. J.; Olsen, S. N.; Sørensen, T. H.; Anderson, L.; Borch, K.; Westh, P. 
Enzyme Microb. Technol. 2013, 52, 163. (616) Bezerra, R. M. F.; Dias, A. A. Appl. Biochem. Biotechnol. 2005, 126, 49. (617) Bezerra, R. M. F.; Dias, A. A.; Fraga, I.; Pereira, A. N. Appl. Biochem. Biotechnol. 2006, 134, 27. (618) Bezerra, R. M. F.; Dias, A. A.; Fraga, I.; Pereira, A. N. Appl. Biochem. Biotechnol. 2011, 165, 178. (619) Holtzapple, M. T.; Caram, H. S.; Humphrey, A. E. Biotechnol. Bioeng. 1984, 26, 753. (620) Berlin, A.; Balakshin, M.; Gilkes, N.; Kadla, J.; Maximenko, V.; Kubo, S.; Saddler, J. J. Biotechnol. 2006, 125, 198. (621) Kumar, R.; Wyman, C. E. Biotechnol. Bioeng. 2014, 111, 1341. (622) Qing, Q.; Yang, B.; Wyman, C. E. Bioresour. Technol. 2010, 101, 9624. (623) Kont, R.; Kurašin, M.; Teugjas, H.; Väljamäe, P. Biotechnol. Biofuels 2013, 6, 135. (624) Dekker, R. F. H. Biotechnol. Bioeng. 1986, 28, 1438. (625) Dale, M. P.; Ensley, H. E.; Kern, K.; Sastry, K. A. R.; Byers, L. D. Biochemistry 1985, 24, 3530. (626) Cannella, D.; Hsieh, C.-w. C.; Felby, C.; Jørgensen, H. Biotechnol. Biofuels 2012, 5, 26. (627) Cannella, D.; Jørgensen, H. Biotechnol. Bioeng. 2014, 111, 59. (628) Krisch, J.; Bencsik, O.; Papp, T.; Vagvolgyi, C.; Tako, M. Bioresour. Technol. 2012, 114, 555. (629) Christakopoulos, P.; Goodenough, P. W.; Kekos, D.; Macris, B. J.; Claeyssens, M.; Bhat, M. K. Eur. J. Biochem. 1994, 224, 379. (630) Gao, L.; Gao, F.; Zhang, D. Y.; Zhang, C.; Wu, G. H.; Chen, S. L. Bioresour. Technol. 2013, 147, 658.

(631) He, H. Y.; Qin, Y. L.; Chen, G. G.; Li, N.; Liang, Z. Q. Appl. Biochem. Biotechnol. 2013, 169, 870. (632) Bhikhabhai, R.; Pettersson, L. G. FEBS Lett. 1984, 167, 301. (633) Saarelainen, R.; Paloheimo, M.; Fagerström, R.; Suominen, P. L.; Nevalainen, K. M. H. Mol. Gen. Genet. 1993, 241, 497. (634) Xu, J.; Takakuwa, N.; Nogawa, M.; Okada, H.; Morikawa, Y. Appl. Microbiol. Biotechnol. 1998, 49, 718. (635) Fischer, W. H.; Spiess, J. Proc. Natl. Acad. Sci. U.S.A. 1987, 84, 3628. (636) Busby, W. H.; Quackenbush, G. E.; Humm, J.; Youngblood, W. W.; Kizer, J. J. Biol. Chem. 1987, 262, 8532. (637) Hinke, S. A.; Pospisilik, J. A.; Demuth, H.-U.; Mannhart, S.; Kühn-Wache, K.; Hoffmann, T.; Nishimura, E.; Pederson, R. A.; McIntosh, C. H. J. Biol. Chem. 2000, 275, 3827. (638) Van Coillie, E.; Proost, P.; Van Aelst, I.; Struyf, S.; Polfliet, M.; De Meester, I.; Harvey, D. J.; Van Damme, J.; Opdenakker, G. Biochemistry 1998, 37, 12672. (639) Podell, D. N.; Abraham, G. N. Biochem. Biophys. Res. Commun. 1978, 81, 176. (640) Dana, C. M.; Dotson-Fagerstrom, A.; Roche, C. M.; Kal, S. M.; Chokhawala, H. A.; Blanch, H. W.; Clark, D. S. Biotechnol. Bioeng. 2014, 111, 842. (641) Dana, C. M.; Saija, P.; Kal, S. M.; Bryan, M. B.; Blanch, H. W.; Clark, D. S. Biotechnol. Bioeng. 2012, 109, 2710. (642) Varki, A. Glycobiology 1993, 3, 97. (643) Hart, G. W.; Copeland, R. J. Cell 2010, 143, 672. (644) Maras, M.; DeBruyn, A.; Schraml, J.; Herdewijn, P.; Claeyssens, M.; Fiers, W.; Contreras, R. Eur. J. Biochem. 1997, 245, 617. (645) Jeoh, T.; Michener, W.; Himmel, M. E.; Decker, S. R.; Adney, W. S. Biotechnol. Biofuels 2008, 1, 10. (646) Stals, I.; Sandra, K.; Devreese, B.; Van Beeumen, J.; Claeyssens, M. Glycobiology 2004, 14, 725. (647) Sheirneiss, G.; Montenecourt, B. S. Appl. Microbiol. Biotechnol. 1984, 20, 46. (648) Adney, W. S.; Jeoh, T.; Beckham, G. T.; Chou, Y. C.; Baker, J. O.; Michener, W.; Brunecky, R.; Himmel, M. E. Cellulose 2009, 16, 699. (649) Voutilainen, S. P.; Murray, P. 
G.; Tuohy, M. G.; Koivula, A. Protein Eng., Des. Sel. 2010, 23, 69. (650) Heinzelman, P.; Komor, R.; Kanaan, A.; Romero, P.; Yu, X.; Mohler, S.; Snow, C.; Arnold, F. Protein Eng., Des. Sel. 2010, 23, 871. (651) Komor, R. S.; Romero, P. A.; Xie, C. B.; Arnold, F. H. Protein Eng., Des. Sel. 2012, 25, 827. (652) Smith, M. A.; Bedbrook, C. N.; Wu, T.; Arnold, F. H. ACS Synth. Biol. 2013, 2, 690. (653) Ducros, V. M.-A.; Tarling, C. A.; Zechel, D. L.; Brzozowski, A. M.; Frandsen, T. P.; von Ossowski, I.; Schülein, M.; Withers, S. G.; Davies, G. J. Chem. Biol. 2003, 10, 619. (654) Henrissat, B. Biochem. J. 1991, 280, 309. (655) Mertz, B.; Kuczenski, R. S.; Larsen, R. T.; Hill, A. D.; Reilly, P. J. Biopolymers 2005, 79, 197. (656) Koivula, A.; Ruohonen, L.; Wohlfahrt, G.; Reinikainen, T.; Teeri, T. T.; Piens, K.; Claeyssens, M.; Weber, M.; Vasella, A.; Becker, D.; Sinnott, M. L.; Zou, J.-Y.; Kleywegt, G. J.; Szardenings, M.; Ståhlberg, J.; Jones, T. A. J. Am. Chem. Soc. 2002, 124, 10015. (657) Henrissat, B.; Teeri, T. T.; Warren, R. A. J. FEBS Lett. 1998, 425, 352. (658) Spezio, M.; Wilson, D. B.; Karplus, P. A. Biochemistry 1993, 32, 9906. (659) Varrot, A.; Hastrup, S.; Schülein, M.; Davies, G. J. Biochem. J. 1999, 304, 297. (660) Meinke, A.; Damude, H. G.; Tomme, P.; Kwan, E.; Kilburn, D. G.; Miller, R. C., Jr.; Warren, R. A. J.; Gilkes, N. R. J. Biol. Chem. 1995, 270, 4383. (661) Damude, H. G.; Withers, S. G.; Kilburn, D. G.; Miller, R. C., Jr.; Warren, R. A. J. Biochemistry 1995, 34, 2220. (662) Varrot, A.; Schülein, M.; Davies, G. J. Biochemistry 1999, 38, 8884. (663) Davies, G. J.; Brzozowski, A. M.; Dauter, M.; Varrot, A.; Schülein, M. Biochem. J. 2000, 348, 201.


(664) de Grotthuss, C. J. T. Ann. Chim. 1806, 58, 54. (665) Varrot, A.; Frandsen, T. P.; von Ossowski, I.; Boyer, V.; Cottaz, S.; Driguez, H.; Schülein, M.; Davies, G. J. Structure 2003, 11, 855. (666) Varrot, A.; Macdonald, J.; Stick, R. V.; Pell, G.; Gilbert, H. J.; Davies, G. J. Chem. Commun. 2003, 946. (667) Varrot, A.; Leydier, S.; Pell, G.; Macdonald, J. M.; Stick, R. V.; Henrissat, B.; Gilbert, H. J.; Davies, G. J. J. Biol. Chem. 2005, 280, 20181. (668) Yoshida, M.; Sato, K.; Kaneko, S.; Fukuda, K. Biosci. Biotechnol. Biochem. 2009, 73, 67. (669) Liu, Y.; Igarashi, K.; Kaneko, S.; Tonozuka, T.; Samejima, M.; Fukuda, K.; Yoshida, M. Biosci. Biotechnol. Biochem. 2009, 73, 1432. (670) Liu, Y.; Yoshida, M.; Kurakata, Y.; Miyazaki, T.; Igarashi, K.; Samejima, M.; Fukuda, K.; Nishikawa, A.; Tonozuka, T. FEBS J. 2010, 277, 1532. (671) Tamura, M.; Miyazaki, T.; Tanaka, Y.; Yoshida, M.; Nishikawa, A.; Tonozuka, T. FEBS J. 2012, 279, 1871. (672) Thompson, A. J.; Heu, T.; Shaghasi, T.; Benyamino, R.; Jones, A.; Friis, E. P.; Wilson, K. S.; Davies, G. J. Acta Crystallogr., Sect. D: Biol. Crystallogr. 2012, 68, 875. (673) Wang, X.-J.; Peng, Y.-J.; Zhang, L.-Q.; Li, A.-N.; Li, D.-C. Appl. Microbiol. Biotechnol. 2012, 95, 1469. (674) Wu, I.; Arnold, F. H. Biotechnol. Bioeng. 2013, 110, 1874. (675) Eijsink, V. G. H.; Bjørk, A.; Gåseidnes, S.; Sirevåg, R.; Synstad, B.; van den Burg, B.; Vriend, G. J. Biotechnol. 2004, 113, 105. (676) Eijsink, V. G. H.; Gåseidnes, S.; Borchert, T. V.; van den Burg, B. Biomol. Eng. 2005, 22, 21. (677) Matthews, B. W.; Nicholson, H.; Becktel, W. J. Proc. Natl. Acad. Sci. U.S.A. 1987, 84, 6663. (678) Watanabe, K.; Masuda, T.; Ohashi, H.; Mihara, H.; Suzuki, Y. Eur. J. Biochem. 1994, 226, 277. (679) Prajapati, R. S.; Das, M.; Sreeramulu, S.; Sirajuddin, M.; Srinivasan, S.; Krishnamurthy, V.; Ranjani, R.; Ramakrishnan, C.; Varadarajan, R. Proteins: Struct., Funct., Bioinf. 2007, 66, 480. (680) Wolfgang, D. E.; Wilson, D. B. 
Biochemistry 1999, 38, 9746. (681) André, G.; Kanchanawong, P.; Palma, R.; Cho, H.; Deng, X.; Irwin, D.; Himmel, M. E.; Wilson, D. B.; Brady, J. W. Protein Eng. 2003, 16, 125. (682) Vuong, T. V.; Wilson, D. B. FEBS J. 2009, 276, 3837. (683) Sandgren, M.; Wu, M.; Karkehabadi, S.; Mitchinson, C.; Kelemen, B. R.; Larenas, E. A.; Ståhlberg, J.; Hansson, H. J. Mol. Biol. 2013, 425, 622. (684) Ruohonen, L.; Koivula, A.; Reinikainen, T.; Valkeajärvi, A.; Teleman, A.; Claeyssens, M.; Szardenings, M.; Jones, T. A.; Teeri, T. T. In Trichoderma reesei Cellulases and Other Hydrolases: Enzyme Structures, Biochemistry, Genetics and Applications; Suominen, P., Reinikainen, T., Eds.; Foundation for Biotechnical and Industrial Fermentation Research: Espoo, Finland, 1993; pp 87−96. (685) Zhang, S.; Wilson, D. B. J. Biotechnol. 1997, 57, 101. (686) Wu, M.; Bu, L.; Vuong, T. V.; Wilson, D. B.; Crowley, M. F.; Sandgren, M.; Ståhlberg, J.; Beckham, G. T.; Hansson, H. J. Biol. Chem. 2013, 288, 33107. (687) Barr, B. K.; Wolfgang, D. E.; Piens, K.; Claeyssens, M.; Wilson, D. B. Biochemistry 1998, 37, 9220. (688) Larsson, A. M.; Bergfors, T.; Dultz, E.; Irwin, D. C.; Roos, A.; Driguez, H.; Wilson, D. B.; Jones, T. A. Biochemistry 2005, 44, 12915. (689) Dougherty, D. A. Science 1996, 271, 163. (690) van Tilbeurgh, H.; Loontiens, F. G.; Engelborgs, Y.; Claeyssens, M. Eur. J. Biochem. 1989, 184, 553. (691) Teleman, A.; Koivula, A.; Reinikainen, T.; Valkeajärvi, A.; Teeri, T. T.; Drakenberg, T.; Teleman, O. Eur. J. Biochem. 1995, 231, 250. (692) Taylor, J. S.; Teo, B. T.; Wilson, D. B.; Brady, J. W. Protein Eng. 1995, 8, 1145. (693) Harjunpää, V.; Teleman, A.; Koivula, A.; Ruohonen, L.; Teeri, T. T.; Teleman, O.; Drakenberg, T. Eur. J. Biochem. 1996, 240, 584. (694) Konstantinidis, A. K.; Marsden, I.; Sinnott, M. L. Biochem. J. 1993, 291, 883. (695) Damude, H. G.; Ferro, V.; Withers, S. G.; Warren, R. A. J. Biochem. J. 1996, 315, 467.

(696) Zhang, S.; Barr, B. K.; Wilson, D. B. Eur. J. Biochem. 2000, 267, 244. (697) Calza, R. E.; Irwin, D. C.; Wilson, D. B. Biochemistry 1985, 24, 7797. (698) Ghangas, G. S.; Wilson, D. B. Appl. Environ. Microbiol. 1988, 54, 2521. (699) Amano, Y.; Shiroishi, M.; Nisizawa, K.; Hoshino, E.; Kanda, T. J. Biochem. 1996, 120, 1123. (700) Quirk, A.; Lipkowski, J.; Vandenende, C.; Cockburn, D.; Clarke, A. J.; Dutcher, J. R.; Roscoe, S. G. Langmuir 2010, 26, 5007. (701) Okada, G.; Nisizawa, K. J. Biochem. 1975, 78, 297. (702) Okada, G.; Nisizawa, T.; Nisizawa, K. Biochem. J. 1966, 99, 214. (703) Wood, T. M. Biochem. J. 1971, 121, 353. (704) Wu, M.; Nerinckx, W.; Piens, K.; Ishida, T.; Hansson, H.; Sandgren, M.; Ståhlberg, J. FEBS J. 2013, 280, 184. (705) Ai, Y.-C.; Wilson, D. B. Enzyme Microb. Technol. 2002, 30, 804. (706) Ai, Y.-C.; Zhang, S.; Wilson, D. B. Enzyme Microb. Technol. 2003, 32, 331. (707) Lantz, S. E.; Goedegebuur, F.; Hommes, R.; Kaper, T.; Kelemen, B. R.; Mitchinson, C.; Wallace, L.; Ståhlberg, J.; Larenas, E. A. Biotechnol. Biofuels 2010, 3, 20. (708) Heinzelman, P.; Snow, C. D.; Smith, M. A.; Yu, X. L.; Kannan, A.; Boulware, K.; Villalobos, A.; Govindarajan, S.; Minshull, J.; Arnold, F. H. J. Biol. Chem. 2009, 284, 26229. (709) Heinzelman, P.; Snow, C. D.; Wu, I.; Nguyen, C.; Villalobos, A.; Govindarajan, S.; Minshull, J.; Arnold, F. H. Proc. Natl. Acad. Sci. U.S.A. 2009, 106, 5610. (710) Wu, I.; Heel, T.; Arnold, F. H. Biochim. Biophys. Acta 2013, 1834, 1539. (711) Denman, S.; Xue, G.-P.; Patel, B. Appl. Environ. Microbiol. 1996, 62, 1889. (712) Li, X.-L.; Chen, H.; Ljungdahl, L. G. Appl. Environ. Microbiol. 1997, 63, 4721. (713) Wohlfahrt, G.; Pellikka, T.; Boer, H.; Teeri, T. T.; Koivula, A. Biochemistry 2003, 42, 10095. (714) Chow, C.-M.; Yagüe, E.; Raguz, S.; Wood, D. A.; Thurston, C. F. Appl. Environ. Microbiol. 1994, 60, 2779. (715) Emalfarb, M. A.; Burlingame, R. P.; Olson, P. T.; Sinitsyn, A. P.; Parriche, M.; Bousson, J. C.; Pynnonen, C. M.; Punt, P. J.; Van Zeijl, C. M. J. (Dyadic International, Inc.) Transformation System in the Field of Filamentous Fungal Hosts. U.S. Patent 6,573,086 B1, June 3, 2003. (716) Moriya, T.; Watanabe, M.; Sumida, N.; Okakura, K.; Murakami, T. Biosci. Biotechnol. Biochem. 2003, 67, 1434. (717) Dalbøge, H.; Heldt-Hansen, H. P. Mol. Gen. Genet. 1994, 243, 253. (718) Toda, H.; Nagahata, N.; Amano, Y.; Nozaki, K.; Kanda, T.; Okazaki, M.; Shimosaka, M. Biosci. Biotechnol. Biochem. 2008, 72, 3142. (719) Wu, W.; Lange, L.; Skovlund, D. A.; Liu, Y. (Novozymes A/S) Polypeptides Having Cellobiohydrolase II Activity and Polynucleotides Encoding Same. U.S. Patent 7,867,744 B2, Jan. 11, 2011. (720) Wang, H.-C.; Chen, Y.-C.; Huang, C.-T.; Hseu, R.-S. Protein Expression Purif. 2013, 90, 153. (721) Chen, H.; Li, X.-L.; Blum, D. L.; Ximenes, E. A.; Ljungdahl, L. G. Appl. Biochem. Biotechnol. 2003, 108, 775. (722) Gao, L.; Wang, F.; Gao, F.; Wang, L.; Zhao, J.; Qu, Y. Bioresour. Technol. 2011, 102, 8339. (723) Zhao, J.; Shi, P.; Li, Z.; Yang, P.; Luo, H.; Bai, Y.; Wang, Y.; Yao, B. Bioresour. Technol. 2012, 121, 404. (724) Tsai, C.-F.; Qiu, X.; Liu, J.-H. Anaerobe 2003, 9, 131. (725) Harhangi, H. R.; Freelove, A. C. J.; Ubhayasekera, W.; van Dinther, M.; Steenbakkers, P. J. M.; Akhmanova, A.; van der Drift, C.; Jetten, M. S. M.; Mowbray, S. L.; Gilbert, H. J.; Op den Camp, H. J. M. Biochim. Biophys. Acta, Gene Struct. Expression 2003, 1628, 30. (726) Yamanobe, T.; Watanabe, M.; Hamaya, T.; Sumida, N.; Aoyagi, K.; Murakami, T. (Japan as represented by Director General of Agency of Industrial Science and Technology, Meiji Seika Kaisha Ltd.) Protein Having Cellulase Activities and Process for Producing the Same. U.S. Patent 6,127,160, Oct. 3, 2000.


(727) Brown, K.; Harris, P.; De Leon, A. L.; Merino, S. T. (Novozymes, Inc.) Polypeptides Having Cellobiohydrolase Activity and Polynucleotides Encoding Same. U.S. Patent 7,220,565, May 22, 2007. (728) Shoemaker, S. P.; Brown, R. D., Jr. Biochim. Biophys. Acta 1978, 523, 133. (729) Shoemaker, S. P.; Brown, R. D., Jr. Biochim. Biophys. Acta 1978, 523, 147. (730) Jenkins, J.; Lo Leggio, L.; Harris, G.; Pickersgill, R. FEBS Lett. 1995, 362, 281. (731) Pickersgill, R.; Harris, G.; Lo Leggio, L.; Mayans, O.; Jenkins, J. Biochem. Soc. Trans. 1998, 26, 190. (732) Henrissat, B.; Bairoch, A. Biochem. J. 1996, 316, 695. (733) Henrissat, B.; Callebaut, I.; Fabrega, S.; Lehn, P.; Mornon, J. P.; Davies, G. Proc. Natl. Acad. Sci. U.S.A. 1995, 92, 7090. (734) Aspeborg, H.; Coutinho, P. M.; Wang, Y.; Brumer, H., III; Henrissat, B. BMC Evol. Biol. 2012, 12, 186. (735) St John, F. J.; Gonzalez, J. M.; Pozharski, E. FEBS Lett. 2010, 584, 4435. (736) Béguin, P. Annu. Rev. Microbiol. 1990, 44, 219. (737) Hilge, M.; Gloor, S. M.; Rypniewski, W.; Sauer, O.; Heightman, T. D.; Zimmermann, W.; Winterhalter, K.; Piontek, K. Structure 1998, 6, 1433. (738) Larsson, A. M.; Anderson, L.; Xu, B.; Muñoz, I. G.; Usón, I.; Janson, J.-C.; Stålbrand, H.; Ståhlberg, J. J. Mol. Biol. 2006, 357, 1500. (739) Lo Leggio, L.; Parry, N. J.; Van Beeumen, J.; Claeyssens, M.; Bhat, M. K.; Pickersgill, R. W. Acta Crystallogr., Sect. D: Biol. Crystallogr. 1997, 53, 599. (740) Lo Leggio, L.; Larsen, S. FEBS Lett. 2002, 523, 103. (741) Ståhlberg, J.; Johansson, G.; Pettersson, G. Eur. J. Biochem. 1988, 173, 179. (742) Suominen, P. L.; Mäntylä, A. L.; Karhunen, T.; Hakola, S.; Nevalainen, H. Mol. Gen. Genet. 1993, 241, 523. (743) Dominguez, R.; Souchon, H.; Spinelli, S.; Dauter, Z.; Wilson, K. S.; Chauvaux, S.; Béguin, P.; Alzari, P. M. Nat. Struct. Biol. 1995, 2, 569. (744) Reardon, D.; Farber, G. K. FASEB J. 1995, 9, 497. (745) Copley, R. R.; Bork, P. J. Mol. Biol. 2000, 303, 627. (746) Nagano, N.; Orengo, C. A.; Thornton, J. M. J. Mol. Biol. 2002, 321, 741. (747) Wang, Q. P.; Tull, D.; Meinke, A.; Gilkes, N. R.; Warren, R. A. J.; Aebersold, R.; Withers, S. G. J. Biol. Chem. 1993, 268, 14096. (748) Macarrón, R.; Acebal, C.; Castillón, M. P.; Dominguez, J. M.; de la Mata, I.; Pettersson, G.; Tomme, P.; Claeyssens, M. Biochem. J. 1993, 289, 867. (749) Macarrón, R.; Henrissat, B.; Claeyssens, M. Biochim. Biophys. Acta, Gen. Subj. 1995, 1245, 187. (750) Macarron, R.; Van Beeumen, J.; Henrissat, B.; de la Mata, I.; Claeyssens, M. FEBS Lett. 1993, 316, 137. (751) Murzin, A. G.; Brenner, S. E.; Hubbard, T.; Chothia, C. J. Mol. Biol. 1995, 247, 536. (752) Badieyan, S.; Bevan, D. R.; Zhang, C. Biotechnol. Bioeng. 2012, 109, 31. (753) Van Petegem, F.; Vandenberghe, I.; Bhat, M. K.; Van Beeumen, J. Biochem. Biophys. Res. Commun. 2002, 296, 161. (754) Ducros, V.; Czjzek, M.; Belaich, A.; Gaudin, C.; Fierobe, H. P.; Belaich, L. P.; Davies, G. J.; Haser, R. Structure 1995, 3, 939. (755) Sakon, J.; Adney, W. S.; Himmel, M. E.; Thomas, S. R.; Karplus, P. A. Biochemistry 1996, 35, 10648. (756) Barras, F.; Bortoli-German, I.; Bauzan, M.; Rouvier, J.; Gey, C.; Heyraud, A.; Henrissat, B. FEBS Lett. 1992, 300, 145. (757) Gebler, J.; Gilkes, N. R.; Claeyssens, M.; Wilson, D. B.; Béguin, P.; Wakarchuk, W. W.; Kilburn, D. G.; Miller, R. C., Jr.; Warren, R. A. J.; Withers, S. G. J. Biol. Chem. 1992, 267, 12559. (758) Baird, S. D.; Hefford, M. A.; Johnson, D. A.; Sung, W. L.; Yaguchi, M.; Seligy, V. L. Biochem. Biophys. Res. Commun. 1990, 169, 1035. (759) Navas, J.; Béguin, P. Biochem. Biophys. Res. Commun. 1992, 189, 807. (760) Py, B.; Bortoli-German, I.; Haiech, J.; Chippaux, M.; Barras, F. Protein Eng. 1991, 4, 325.

(761) Clarke, A. J.; Drummelsmith, J.; Yaguchi, M. FEBS Lett. 1997, 414, 359. (762) Tseng, C.-W.; Ko, T.-P.; Guo, R.-T.; Huang, J.-W.; Wang, H.-C.; Huang, C.-H.; Cheng, Y.-S.; Wang, A. H. J.; Liu, J.-R. Acta Crystallogr., Sect. F: Struct. Biol. Cryst. Commun. 2011, 67, 1189. (763) Liu, J.; Wang, X.; Xu, D. J. Phys. Chem. B 2010, 114, 1462. (764) Saharay, M.; Guo, H.-B.; Smith, J. C.; Guo, H. In Computational Modeling in Lignocellulosic Biofuel Production; Nimlos, M. R., Crowley, M. F., Eds.; American Chemical Society: Washington, DC, 2010; pp 135−154. (765) Medve, J.; Lee, D.; Tjerneld, F. J. Chromatogr. A 1998, 808, 153. (766) Quiocho, F. A. Annu. Rev. Biochem. 1986, 55, 287. (767) Aurora, R.; Rose, G. D. Protein Sci. 1998, 7, 21. (768) Payne, C. M.; Baban, J.; Horn, S. J.; Backe, P. H.; Arvai, A. S.; Dalhus, B.; Bjørås, M.; Eijsink, V. G. H.; Sørlie, M.; Beckham, G. T.; Vaaje-Kolstad, G. J. Biol. Chem. 2012, 287, 36322. (769) Matsumura, M.; Signor, G.; Matthews, B. W. Nature 1989, 342, 291. (770) Vieille, C.; Zeikus, G. J. Microbiol. Mol. Biol. Rev. 2001, 65, 1. (771) Volkin, D. B.; Klibanov, A. M. J. Biol. Chem. 1987, 262, 2945. (772) Elcock, A. H. J. Mol. Biol. 1998, 284, 489. (773) Hendsch, Z. S.; Jonsson, T.; Sauer, R. T.; Tidor, B. Biochemistry 1996, 35, 7621. (774) Perutz, M. F.; Raidt, H. Nature 1975, 255, 256. (775) Waldburger, C. D.; Schildbach, J. F.; Sauer, R. T. Nat. Struct. Mol. Biol. 1995, 2, 122. (776) Liu, J. H.; Tsai, C. F.; Liu, J. W.; Cheng, K. J.; Cheng, C. L. Enzyme Microb. Technol. 2001, 28, 582. (777) Lin, L.; Meng, X.; Liu, P.; Hong, Y.; Wu, G.; Huang, X.; Li, C.; Dong, J.; Xiao, L.; Liu, Z. Appl. Microbiol. Biotechnol. 2009, 82, 671. (778) Liu, W.; Zhang, X.-Z.; Zhang, Z.; Zhang, Y. H. P. Appl. Environ. Microbiol. 2010, 76, 4914. (779) Samanta, S.; Basu, A.; Halder, U. C.; Sen, S. K. J. Microbiol. 2012, 50, 518. (780) Parry, N. J.; Beever, D. E.; Owen, E.; Nerinckx, W.; Claeyssens, M.; Van Beeumen, J.; Bhat, M. K. Arch. Biochem. Biophys. 2002, 404, 243. (781) Pereira, J. H.; Chen, Z.; McAndrew, R. P.; Sapra, R.; Chhabra, S. R.; Sale, K. L.; Simmons, B. A.; Adams, P. D. J. Struct. Biol. 2010, 172, 372. (782) Berghem, L. E. R.; Pettersson, L. G.; Axiö-Fredriksson, U. B. Eur. J. Biochem. 1976, 61, 621. (783) Okada, G. J. Biochem. 1975, 77, 33. (784) Okada, G. J. Biochem. 1976, 80, 913. (785) Selby, K.; Maitland, C. C. Biochem. J. 1967, 104, 716. (786) Beldman, G.; Searle-Van Leeuwen, M. F.; Rombouts, F. M.; Voragen, F. G. J. Eur. J. Biochem. 1985, 146, 301. (787) Beldman, G.; Voragen, A. G. J.; Rombouts, F. M.; Pilnik, W. Biotechnol. Bioeng. 1988, 31, 173. (788) Vinzant, T. B.; Adney, W. S.; Decker, S. R.; Baker, J. O.; Kinter, M. T.; Sherman, N. E.; Fox, J. W.; Himmel, M. E. Appl. Biochem. Biotechnol. 2001, 91−93, 99. (789) Claeyssens, M.; Aerts, G. Bioresour. Technol. 1992, 39, 143. (790) Johnston, D. B.; Shoemaker, S. P.; Smith, G. M.; Whitaker, J. R. J. Food Biochem. 1998, 22, 301. (791) Jäger, G.; Wu, Z.; Garschhammer, K.; Engel, P.; Klement, T.; Rinaldi, R.; Spiess, A. C.; Büchs, J. Biotechnol. Biofuels 2010, 3, 18. (792) Nidetzky, B.; Steiner, W.; Claeyssens, M. Biochem. J. 1994, 303, 817. (793) Palonen, H.; Tjerneld, F.; Zacchi, G.; Tenkanen, M. J. Biotechnol. 2004, 107, 65. (794) Le Costaouëc, T.; Pakarinen, A.; Várnai, A.; Puranen, T.; Viikari, L. Bioresour. Technol. 2013, 143, 196. (795) Cruys-Bagger, N.; Ren, G.; Tatsumi, H.; Baumann, M. J.; Spodsberg, N.; Andersen, H. D.; Gorton, L.; Borch, K.; Westh, P. Biotechnol. Bioeng. 2012, 109, 3199. (796) Medve, J.; Karlsson, J.; Lee, D.; Tjerneld, F. Biotechnol. Bioeng. 1998, 59, 621.


(797) Karlsson, J.; Medve, J.; Tjerneld, F. Appl. Biochem. Biotechnol. 1999, 82, 243. (798) Billard, H.; Faraj, A.; Lopes Ferreira, N.; Menir, S.; Heiss-Blanquet, S. Biotechnol. Biofuels 2012, 5, 9. (799) Donnelly, M. K.; Moran-Mirabal, J. M.; Corgie, S. C.; Craighead, H. G.; Walker, L. P. Biophys. J. 2010, 98, 749A. (800) Jeoh, T.; Wilson, D. B.; Walker, L. P. Biotechnol. Prog. 2006, 22, 270. (801) Tambor, J. H.; Ren, H.; Ushinsky, S.; Zheng, Y.; Riemens, A.; St-Francois, C.; Tsang, A.; Powlowski, J.; Storms, R. Appl. Microbiol. Biotechnol. 2012, 93, 203. (802) Baker, J. O.; Tatsumoto, K.; Grohmann, K.; Woodward, J.; Wichert, J. M.; Shoemaker, S. P.; Himmel, M. E. Appl. Biochem. Biotechnol. 1992, 34−35, 217. (803) Li, C.; Knierim, B.; Manisseri, C.; Arora, R.; Scheller, H. V.; Auer, M.; Vogel, K. P.; Simmons, B. A.; Singh, S. Bioresour. Technol. 2010, 101, 4900. (804) Wahlström, R.; Rovio, S.; Suurnäkki, A. RSC Adv. 2012, 2, 4472. (805) Schülein, M.; Tikhomirov, D. F.; Schou, C. In Trichoderma reesei Cellulases and Other Hydrolases: Enzyme Structures, Biochemistry, Genetics, and Applications: Proceedings of the TRICEL93 Symposium, June 2−5, 1993, Espoo, Finland; Suominen, P., Reinikainen, T., Eds.; Foundation for Biotechnical and Industrial Fermentation Research: Helsinki, 1993; pp 109−116. (806) Karlsson, J.; Momcilovic, D.; Wittgren, B.; Schülein, M.; Tjerneld, F.; Brinkmalm, G. Biopolymers 2002, 63, 32. (807) Liu, D.; Zhang, R.; Yang, X.; Xu, Y.; Tang, Z.; Tian, W.; Shen, Q. Protein Expression Purif. 2011, 79, 176. (808) Chikamatsu, G.; Shirai, K.; Kato, M.; Kobayashi, T.; Tsukagoshi, N. FEMS Microbiol. Lett. 1999, 175, 239. (809) Li, C.-H.; Wang, H.-R.; Yan, T.-R. Molecules 2012, 17, 9774. (810) Karnchanatat, A.; Petsom, A.; Sangvanicha, P.; Piapukiew, J.; Whalley, A. J. S.; Reynolds, C. D.; Gadd, G. M.; Sihanonth, P. Enzyme Microb. Technol. 2008, 42, 404. (811) Shi, R.; Li, Z.; Ye, Q.; Xu, J.; Liu, Y. Bioresour. Technol. 2013, 142, 338. 
(812) Yoon, J.-J.; Cha, C.-J.; Kim, Y.-S.; Son, D.-W.; Kim, Y.-K. J. Microbiol. Biotechnol. 2007, 17, 800. (813) Yoon, J.-J.; Cha, C.-J.; Kim, Y.-S.; Kim, W. Biotechnol. Lett. 2008, 30, 1373. (814) de Almeida, M. N.; Falkoski, D. L.; Guimaraes, V. M.; Ramos, H. J.; Visser, E. M.; Maitan-Alfenas, G. P.; de Rezende, S. T. Bioresour. Technol. 2013, 143, 413. (815) Cohen, R.; Suzuki, M. R.; Hammel, K. E. Appl. Environ. Microbiol. 2005, 71, 2412. (816) Kim, H. M.; Lee, Y. G.; Patel, D. H.; Lee, K. H.; Lee, D.-S.; Bae, H.-J. J. Ind. Microbiol. Biotechnol. 2012, 39, 1081. (817) Takashima, S.; Nakamura, A.; Masaki, H.; Uozumi, T. Biosci. Biotechnol. Biochem. 1997, 61, 245. (818) Fujino, Y.; Ogata, K.; Nagamine, T.; Ushida, K. Biosci. Biotechnol. Biochem. 1998, 62, 1795. (819) Sun, J.; Phillips, C. M.; Anderson, C. T.; Beeson, W. T.; Marletta, M. A.; Glass, N. L. Protein Expression Purif. 2011, 75, 147. (820) Qiu, X.; Selinger, B.; Yanke, L.; Cheng, K. Gene 2000, 245, 119. (821) Krogh, K. B. R. M.; Kastberg, H.; Jørgensen, C. I.; Berlin, A.; Harris, P. V.; Olsson, L. Enzyme Microb. Technol. 2009, 44, 359. (822) Chulkin, A. M.; Loginov, D. S.; Vavilova, E. A.; Abyanova, A. R.; Zorov, I. N.; Kurzeev, S. A.; Koroleva, O. V.; Benevolenskii, S. V. Biochemistry (Moscow) 2009, 74, 655. (823) Liu, G.; Qin, Y.; Hu, Y.; Gao, M.; Peng, S.; Qu, Y. Enzyme Microb. Technol. 2013, 52, 190. (824) Rubini, M. R.; Dillon, A. J.; Kyaw, C. M.; Faria, F. P.; Pocas-Fonseca, M. J.; Silva-Pereira, I. J. Appl. Microbiol. 2010, 108, 1187. (825) Mernitz, G.; Koch, A.; Henrissat, B.; Schulz, G. Curr. Genet. 1996, 29, 490. (826) Jeya, M.; Joo, A.-R.; Lee, K.-M.; Sim, W.-I.; Oh, D.-K.; Kim, Y.-S.; Kim, I.-W.; Lee, J.-K. Appl. Microbiol. Biotechnol. 2010, 85, 1005. (827) Uzcategui, E.; Johansson, G.; Ek, B.; Pettersson, G. J. Biotechnol. 1991, 21, 143.

(828) Zhao, J.; Shi, P.; Huang, H.; Li, Z.; Yuan, T.; Yang, P.; Luo, H.; Bai, Y.; Yao, B. Appl. Microbiol. Biotechnol. 2012, 95, 947. (829) Eberhardt, R. Y.; Gilbert, H. J.; Hazlewood, G. P. Microbiology 2000, 146, 1999. (830) Murray, P. G.; Grassick, A.; Laffey, C. D.; Cuffe, M. M.; Higgins, T.; Savage, A. V.; Planas, A.; Tuohy, M. G. Enzyme Microb. Technol. 2001, 29, 90. (831) Hong, J.; Tamaki, H.; Yamamoto, K.; Kumagai, H. Biotechnol. Lett. 2003, 25, 657. (832) Nozaki, K.; Seki, T.; Matsui, K.; Mizuno, M.; Kanda, T.; Amano, Y. Biosci. Biotechnol. Biochem. 2007, 71, 2375. (833) Sul, O. J.; Kim, J. H.; Park, S. J.; Son, Y. J.; Park, B. R.; Chung, D. K.; Jeong, C. S.; Han, I. S. Appl. Biochem. Biotechnol. 2004, 66, 63. (834) Huang, X. M.; Yang, Q.; Liu, Z. H.; Fan, J. X.; Chen, X. L.; Song, J. Z.; Wang, Y. Appl. Biochem. Biotechnol. 2010, 162, 103. (835) Ding, S. J.; Ge, W.; Buswell, J. A. Eur. J. Biochem. 2001, 268, 5687. (836) Telke, A. A.; Zhuang, N.; Ghatge, S. S.; Lee, S.-H.; Shah, A. A.; Khan, H.; Um, Y.; Shin, H.-D.; Chung, Y. R.; Lee, K. H.; Kim, S.-W. PLoS One 2013, 8, e65727. (837) Xiao-Zhou, Z.; Zhang, Y. H. P. Microb. Biotechnol. 2011, 4, 98. (838) Qin, Y.; Wei, X.; Liu, X.; Wang, T.; Qu, Y. Protein Expression Purif. 2008, 58, 162. (839) Banerjee, G.; Car, S.; Scott-Craig, J.; Hodge, D.; Walton, J. Biotechnol. Biofuels 2011, 4, 16. (840) Li, Q.; Gao, Y.; Wang, H.; Li, B.; Liu, C.; Yu, G.; Mu, X. Bioresour. Technol. 2012, 125, 193. (841) Banerjee, G.; Car, S.; Liu, T.; Williams, D. L.; Meza, S. L.; Walton, J. D.; Hodge, D. B. Biotechnol. Bioeng. 2012, 109, 922. (842) Karp, E. M.; Donohoe, B. S.; O’Brien, M. H.; Ciesielski, P. N.; Mittal, A.; Biddy, M. J.; Beckham, G. T. ACS Sustainable Chem. Eng. 2014, 2, 1481. (843) Wang, T.; Liu, X.; Yu, Q.; Zhang, X.; Qu, Y.; Gao, P.; Wang, T. Biomol. Eng. 2005, 22, 89. (844) Wang, H.; Jones, R. W. Appl. Microbiol. Biotechnol. 1997, 48, 225. (845) Baker, J. O.; McCarley, J. R.; Lovett, R.; Yu, C. H.; Adney, W. S.; Rignall, T. R.; Vinzant, T. B.; Decker, S. R.; Sakon, J.; Himmel, M. E. Appl. Biochem. Biotechnol. 2005, 121, 129. (846) Nakazawa, H.; Okada, K.; Kobayashi, R.; Kubota, T.; Onodera, T.; Ochiai, N.; Omata, N.; Ogasawara, W.; Okada, H.; Morikawa, Y. Appl. Microbiol. Biotechnol. 2008, 81, 681. (847) Håkansson, U.; Fägerstam, L.; Pettersson, G.; Andersson, L. Biochim. Biophys. Acta 1978, 524, 385. (848) Ülker, A.; Sprey, B. FEMS Microbiol. Lett. 1990, 57, 215. (849) Sprey, B.; Uelker, A. FEMS Microbiol. Lett. 1992, 71, 253. (850) Hayn, M.; Klinger, R.; Esterbauer, H. In Trichoderma Reesei Cellulases and Other Hydrolases; Suominen, P., Reinikainen, T., Eds.; Foundation for Biotechnical and Industrial Fermentation Research: Helsinki, 1993; pp 147−151. (851) Ward, M.; Wu, S.; Dauberman, J.; Weiss, G.; Larenas, E.; Bower, B.; Rey, M.; Clarkson, K.; Bott, R. The Tricell 93 Symposium; Espoo, Finland, 1993; pp 153−158. (852) Okada, H.; Tada, K.; Sekiya, T.; Yokoyama, K.; Takahashi, A.; Tohda, H.; Kumagai, H.; Morikawa, Y. Appl. Environ. Microbiol. 1998, 64, 555. (853) Goedegebuur, F.; Fowler, T.; Phillips, J.; van der Kley, P.; van Solingen, P.; Dankmeyer, L.; Power, S. D. Curr. Genet. 2002, 41, 89. (854) Master, E. R.; Zheng, Y.; Storms, R.; Tsang, A.; Powlowski, J. Biochem. J. 2008, 411, 161. (855) Song, B.-C.; Kim, K.-Y.; Yoon, J.-J.; Sim, S.-H.; Lee, K.; Kim, Y.-S.; Kim, Y.-K.; Cha, C.-J. J. Microbiol. Technol. 2008, 18, 404. (856) Takeda, T.; Takahashi, M.; Nakanishi-Masuno, T.; Nakano, Y.; Saitoh, H.; Hirabuchi, A.; Fujisawa, S.; Terauchi, R. Appl. Microbiol. Biotechnol. 2010, 88, 1113. (857) Zechel, D. L.; He, S.; Dupont, C.; Withers, S. G. Biochem. J. 1998, 336, 139. (858) Sprey, B.; Bochem, H.-P. FEMS Microbiol. Lett. 1992, 97, 113.


(859) Huang, Y.; Krauss, G.; Cottaz, S.; Driguez, H.; Lipps, G. Biochem. J. 2005, 385, 581. (860) Yuan, S.; Wu, Y.; Cosgrove, D. J. Plant Physiol. 2001, 127, 324. (861) Damasio, A. R.; Ribeiro, L. F.; Ribeiro, L. F.; Furtado, G. P.; Segato, F.; Almeida, F. B.; Crivellari, A. C.; Buckeridge, M. S.; Souza, T. A.; Murakami, M. T.; Ward, R. J.; Prade, R. A.; Polizeli, M. L. Biochim. Biophys. Acta 2012, 1824, 461. (862) Gloster, T. M.; Ibatullin, F. M.; Macauley, K.; Eklöf, J. M.; Roberts, S.; Turkenburg, J. P.; Bjørnvad, M. E.; Jørgensen, P. L.; Danielsen, S.; Johansen, K. S.; Borchert, T. V.; Wilson, K. S.; Brumer, H.; Davies, G. J. J. Biol. Chem. 2007, 282, 19177. (863) Wicher, K.; Abou-Hachem, M.; Halldórsdóttir, S.; Thorbjarnadóttir, S.; Eggertsson, G.; Hreggvidsson, G.; Nordberg Karlsson, E.; Holst, O. Appl. Microbiol. Biotechnol. 2001, 55, 578. (864) Bok, J.-D.; Yernool, D. A.; Eveleigh, D. E. Appl. Environ. Microbiol. 1998, 64, 4774. (865) Powlowski, J.; Mahajan, S.; Schapira, M.; Master, E. R. Carbohydr. Res. 2009, 344, 1175. (866) Grishutin, S. G.; Gusakov, A. V.; Dzedzyulya, E. I.; Sinitsyn, A. P. Carbohydr. Res. 2006, 341, 218. (867) Bauer, M. W.; Driskill, L. E.; Callen, W.; Snead, M. A.; Mathur, E. J.; Kelly, R. M. J. Bacteriol. 1999, 181, 284. (868) Kim, H.; Ahn, J.-H.; Görlach, J. M.; Caprari, C.; Scott-Craig, J. S.; Walton, J. D. Mol. Plant-Microbe Interact. 2001, 14, 1436. (869) Park, Y. B.; Cosgrove, D. J. Plant Physiol. 2012, 158, 1933. (870) Okada, H.; Mori, K.; Tada, K.; Nogawa, M.; Morikawa, Y. J. Mol. Catal. B: Enzym. 2000, 10, 249. (871) Sulzenbacher, G.; Shareck, F.; Morosoli, R.; Dupont, C.; Davies, G. J. Biochemistry 1997, 36, 16032. (872) Sulzenbacher, G.; Mackenzie, L. F.; Wilson, K. S.; Withers, S. G.; Dupont, C.; Davies, G. J. Biochemistry 1999, 38, 4826. (873) Keitel, T.; Simon, O.; Borriss, R.; Heinemann, U. Proc. Natl. Acad. Sci. U.S.A. 1993, 90, 5287. (874) Kim, H.-W.; Kataoka, M.; Ishikawa, K. FEBS Lett. 2012, 586, 1009. (875) Crennell, S. J.; Hreggvidsson, G. O.; Nordberg Karlsson, E. J. Mol. Biol. 2002, 320, 883. (876) Sandgren, M.; Gualfetti, P. J.; Shaw, A.; Gross, L. S.; Saldajeno, M.; Day, A. G.; Jones, T. A.; Mitchinson, C. Protein Sci. 2003, 12, 848. (877) Cheng, Y.-S.; Ko, T.-P.; Wu, T.-H.; Ma, Y.; Huang, C.-H.; Lai, H.-L.; Wang, A. H. J.; Liu, J.-R.; Guo, R.-T. Proteins: Struct., Funct., Bioinf. 2011, 79, 1193. (878) Yoshizawa, T.; Shimizu, T.; Hirano, H.; Sato, M.; Hashimoto, H. J. Biol. Chem. 2012, 287, 18710. (879) Khademi, S.; Zhang, D.; Swanson, S. M.; Wartenberg, A.; Witte, K.; Meyer, E. F. Acta Crystallogr., Sect. D: Biol. Crystallogr. 2002, 58, 660. (880) Sandgren, M.; Gualfetti, P. J.; Paech, C.; Paech, S.; Shaw, A.; Gross, L. S.; Saldajeno, M.; Berglund, G. I.; Jones, T. A.; Mitchinson, C. Protein Sci. 2003, 12, 2782. (881) Prates, É. T.; Stankovic, I.; Silveira, R. L.; Liberato, M. V.; Henrique-Silva, F.; Pereira, N., Jr.; Polikarpov, I.; Skaf, M. S. PLoS One 2013, 8, e59069. (882) Robeva, A.; Politi, V.; Shannon, J. D.; Bjarnason, J. B.; Fox, J. W. Biomed. Biochim. Acta 1991, 50, 769. (883) Muilu, J.; Törrönen, A.; Peräkylä, M.; Rouvinen, J. Proteins: Struct., Funct., Bioinf. 1998, 31, 434. (884) Crennell, S. J.; Cook, D.; Minns, A.; Svergun, D.; Andersen, R. L.; Nordberg Karlsson, E. J. Mol. Biol. 2006, 356, 57. (885) Törrönen, A.; Harkki, A.; Rouvinen, J. EMBO J. 1994, 13, 2493. (886) van Solingen, P.; Meijer, D.; van der Kleij, W. A. H.; Barnett, C.; Bolle, R.; Power, S. D.; Jones, B. E. Extremophiles 2001, 5, 333. (887) Cheng, Y.-S.; Ko, T.-P.; Huang, J.-W.; Wu, T.-H.; Lin, C.-Y.; Luo, W.; Li, Q.; Ma, Y.; Huang, C.-H.; Wang, A. J.; Liu, J.-R.; Guo, R.-T. Appl. Microbiol. Biotechnol. 2012, 95, 661. (888) Nakazawa, H.; Okada, K.; Onodera, T.; Ogasawara, W.; Okada, H.; Morikawa, Y. Appl. Microbiol. Biotechnol. 2009, 83, 649. (889) Spilliaert, R.; Hreggvidsson, G. O.; Kristjansson, J. K.; Eggertsson, G.; Palsdottir, A. Eur. J. Biochem. 1994, 224, 923.

(890) Sandgren, M.; Ståhlberg, J.; Mitchinson, C. Prog. Biophys. Mol. Biol. 2005, 89, 246. (891) Forse, G. J.; Ram, N.; Banatao, D. R.; Cascio, D.; Sawaya, M. R.; Klock, H. E.; Lesley, S. A.; Yeates, T. O. Protein Sci. 2011, 20, 168. (892) Murao, S.; Sakamoto, R.; Arai, M. Methods Enzymol. 1988, 160, 274. (893) Sakamoto, S.; Tamura, G.; Ito, K.; Ishikawa, T.; Iwano, K.; Nishiya, N. Curr. Genet. 1995, 27, 435. (894) Van Den Broeck, H. C.; De Graaff, L. H.; Visser, J.; Van Ooijen, A. J. J. (Gist-Brocades, B.V.) Fungal Cellulases. U.S. Patent 6,190,890 B1, Feb. 20, 2001. (895) Hasper, A. A.; Dekkers, E.; van Mil, M.; van de Vondervoort, P. J. I.; de Graaff, L. H. Appl. Environ. Microbiol. 2002, 68, 1556. (896) Narra, M.; Dixit, G.; Divecha, J.; Kumar, K.; Madamwar, D.; Shah, A. R. Int. Biodeterior. Biodegrad. 2014, 88, 150. (897) Shimokawa, T.; Shibuya, H.; Nojiri, M.; Yoshida, S.; Ishihara, M. Appl. Environ. Microbiol. 2008, 74, 5857. (898) Henriksson, G.; Nutt, A.; Henriksson, H.; Pettersson, B.; Ståhlberg, J.; Johansson, G.; Pettersson, G. Eur. J. Biochem. 1999, 259, 88. (899) Ishihara, H.; Imamura, K.; Kita, M.; Aimi, T.; Kitamoto, Y. Mycoscience 2005, 46, 148. (900) Henrissat, B.; Bairoch, A. Biochem. J. 1993, 293, 781. (901) Gilbert, H. J.; Hall, J.; Hazlewood, G. P.; Ferreira, L. M. A. Mol. Microbiol. 1990, 4, 759. (902) Rasmussen, G.; Mikkelsen, J. M.; Schuelein, M.; Patkar, S. A.; Hagen, F.; Hjort, C. M.; Hastrup, S. (Novo Nordisk A/S) A Cellulase Preparation Comprising an Endoglucanase Enzyme. W.O. Patent 1991017243 A1, Nov. 14, 1991. (903) Davies, G. J.; Dodson, G. G.; Hubbard, R. E.; Tolley, S. P.; Dauter, Z.; Wilson, K. S.; Hjort, C.; Mikkelsen, J. M.; Rasmussen, G.; Schülein, M. Nature 1993, 365, 362. (904) Davies, G. J.; Tolley, S. P.; Henrissat, B.; Hjort, C.; Schülein, M. Biochemistry 1995, 34, 16210. (905) Couturier, M.; Feliu, J.; Haon, M.; Navarro, D.; Lesage-Meessen, L.; Coutinho, P. M.; Berrin, J. G. Microb. Cell Fact. 2011, 10, 103. (906) Baba, Y.; Shimonaka, A.; Koga, J.; Kubota, H.; Kono, T. J. Bacteriol. 2005, 187, 3045. (907) Yang, J. C.; Madupu, R.; Durkin, A. S.; Ekborg, N. A.; Pedamallu, C. S.; Hostetler, J. B.; Radune, D.; Toms, B. S.; Henrissat, B.; Coutinho, P. M.; Schwarz, S.; Field, L.; Trindade-Silva, A. E.; Soares, C. A. G.; Elshahawi, S.; Hanora, A.; Schmidt, E. W.; Haygood, M. G.; Posfai, J.; Benner, J.; Madinger, C.; Nove, J.; Anton, B.; Chaudhary, K.; Foster, J.; Holman, A.; Kumar, S.; Lessard, P. A.; Luyten, Y. A.; Slatko, B.; Wood, N.; Wu, B.; Teplitski, M.; Mougous, J. D.; Ward, N.; Eisen, J. A.; Badger, J. H.; Distel, D. L. PLoS One 2009, 4, e6085. (908) Igarashi, K.; Ishida, T.; Hori, C.; Samejima, M. Appl. Environ. Microbiol. 2008, 74, 5628. (909) Sakamoto, K.; Toyohara, H. Comp. Biochem. Physiol., Part B: Biochem. Mol. Biol. 2009, 152, 390. (910) Sievers, F.; Wilm, A.; Dineen, D.; Gibson, T. J.; Karplus, K.; Li, W.; Lopez, R.; McWilliam, H.; Remmert, M.; Söding, J.; Thompson, J. D.; Higgins, D. G. Mol. Syst. Biol. 2011, 7, 539. (911) Okonechnikov, K.; Golosova, O.; Fursov, M.; UGENE Team. Bioinformatics 2012, 28, 1166. (912) Valjakka, J.; Rouvinen, J. Acta Crystallogr., Sect. D: Biol. Crystallogr. 2003, 59, 765. (913) Hirvonen, M.; Papageorgiou, A. C. Acta Crystallogr., Sect. D: Biol. Crystallogr. 2002, 58, 336. (914) Hirvonen, M.; Papageorgiou, A. C. J. Mol. Biol. 2003, 329, 403. (915) Takashima, S.; Iikura, H.; Nakamura, A.; Hidaka, M.; Masaki, H.; Uozumi, T. J. Biotechnol. 1999, 67, 85. (916) Xu, B. Z.; Hellman, U.; Ersson, B.; Janson, J. C. Eur. J. Biochem. 2000, 267, 4970. (917) Xu, B. Z. Endoglucanase and Mannanase from Blue Mussel, Mytilus edulis: Purification, Characterization, Gene and Three Dimensional Structure. Ph.D. Thesis, Center for Surface Biotechnology, Uppsala Biomedical Center, Uppsala University, Uppsala, Sweden, 2002.


(952) Kende, H.; Bradford, K. J.; Brummell, D. A.; Cho, H. T.; Cosgrove, D. J.; Fleming, A. J.; Gehring, C.; Lee, Y.; McQueen-Mason, S.; Rose, J. K. C.; Voesenek, L. Plant Mol. Biol. 2004, 55, 311. (953) Darley, C. P.; Li, Y.; Schaap, P.; McQueen-Mason, S. J. FEBS Lett. 2003, 546, 416. (954) Qin, L.; Kudla, U.; Roze, E. H. A.; Goverse, A.; Popeijus, H.; Nieuwland, J.; Overmars, H.; Jones, J. T.; Schots, A.; Smant, G.; Bakker, J.; Helder, J. Nature 2004, 427, 30. (955) Brotman, Y.; Briff, E.; Viterbo, A.; Chet, I. Plant Physiol. 2008, 147, 779. (956) Jäger, G.; Girfoglio, M.; Dollo, F.; Rinaldi, R.; Bongard, H.; Commandeur, U.; Fischer, R.; Spiess, A. C.; Büchs, J. Biotechnol. Biofuels 2011, 4, 1. (957) Chen, X.-a.; Ishida, N.; Todaka, N.; Nakamura, R.; Maruyama, J.-i.; Takahashi, H.; Kitamoto, K. Appl. Environ. Microbiol. 2010, 76, 2556. (958) Wang, T.-Y.; Chen, H.-L.; Lu, M.-Y. J.; Chen, Y.-C.; Sung, H.-M.; Mao, C.-T.; Cho, H.-Y.; Ke, H.-M.; Hwa, T.-Y.; Ruan, S.-K.; Hung, K.-Y.; Chen, C.-K.; Li, J.-Y.; Wu, Y.-C.; Chen, Y.-H.; Chou, S.-P.; Tsai, Y.-W.; Chu, T.-C.; Shih, C.-C. A.; Li, W.-H.; Shih, M.-C. Biotechnol. Biofuels 2011, 4, 24. (959) Zhou, Q.; Lv, X.; Zhang, X.; Meng, X.; Chen, G.; Liu, W. World J. Microbiol. Biotechnol. 2011, 27, 1905. (960) Gourlay, K.; Hu, J.; Arantes, V.; Andberg, M.; Saloheimo, M.; Penttilä, M.; Saddler, J. Bioresour. Technol. 2013, 142, 498. (961) Quiroz-Castañeda, R. E.; Martínez-Anaya, C.; Cuervo-Soto, L. I.; Segovia, L.; Folch-Mallol, J. L. Microb. Cell Fact. 2011, 10, 8. (962) Bouzarelou, D.; Billini, M.; Roumelioti, K.; Sophianopoulou, V. Fungal Genet. Biol. 2008, 45, 839. (963) van Straaten, K. E.; Dijkstra, B. W.; Vollmer, W.; Thunnissen, A.-M. W. H. J. Mol. Biol. 2005, 352, 1068. (964) Beckham, G. T.; Crowley, M. F. J. Phys. Chem. B 2011, 115, 4516. (965) Horn, S. J.; Vaaje-Kolstad, G.; Westereng, B.; Eijsink, V. G. H. Biotechnol. Biofuels 2012, 5, 45. (966) Vaaje-Kolstad, G.; Horn, S. J.; van Aalten, D. M. 
F.; Synstad, B.; Eijsink, V. G. H. J. Biol. Chem. 2005, 280, 28492. (967) Harris, P. V.; Welner, D.; McFarland, K. C.; Re, E.; Poulsen, J. C. N.; Brown, K.; Salbo, R.; Ding, H. S.; Vlasenko, E.; Merino, S.; Xu, F.; Cherry, J.; Larsen, S.; Lo Leggio, L. Biochemistry 2010, 49, 3305. (968) Wymelenberg, A. V.; Gaskell, J.; Mozuch, M.; Sabat, G.; Ralph, J.; Skyba, O.; Mansfield, S. D.; Blanchette, R. A.; Martinez, D.; Grigoriev, I.; Kersten, P. J.; Cullen, D. Appl. Environ. Microbiol. 2010, 76, 3599. (969) Yakovlev, I.; Vaaje-Kolstad, G.; Hietala, A. M.; Stefanczyk, E.; Solheim, H.; Fossdal, C. G. Appl. Microbiol. Biotechnol. 2012, 95, 979. (970) Schnellmann, J.; Zeltins, A.; Blaak, H.; Schrempf, H. Mol. Microbiol. 1994, 13, 807. (971) Li, Z.; Li, C.; Yang, K.; Wang, L.; Yin, C.; Gong, Y.; Pang, Y. Virus Res. 2003, 96, 113. (972) Vaaje-Kolstad, G.; Houston, D. R.; Riemen, A. H. K.; Eijsink, V. G. H.; van Aalten, D. M. F. J. Biol. Chem. 2005, 280, 11313. (973) Moser, F.; Irwin, D.; Chen, S. L.; Wilson, D. B. Biotechnol. Bioeng. 2008, 100, 1066. (974) Eijsink, V. G. H.; Vaaje-Kolstad, G.; Vårum, K. M.; Horn, S. J. Trends Biotechnol. 2008, 26, 228. (975) Karlsson, J.; Saloheimo, M.; Siika-aho, M.; Tenkanen, M.; Penttilä, M.; Tjerneld, F. Eur. J. Biochem. 2001, 268, 6498. (976) Koseki, T.; Mese, Y.; Fushinobu, S.; Masaki, K.; Fujii, T.; Ito, K.; Shiono, Y.; Murayama, T.; Iefuji, H. Appl. Microbiol. Biotechnol. 2008, 77, 1279. (977) Saloheimo, M.; Nakari-Setälä, T.; Tenkanen, M.; Penttilä, M. Eur. J. Biochem. 1997, 249, 584. (978) Merino, S.; Cherry, J. In Biofuels; Olsson, L., Ed.; Springer: Berlin, 2007; pp 95−120. (979) Forsberg, Z.; Vaaje-Kolstad, G.; Westereng, B.; Bunæs, A. C.; Stenstrøm, Y.; MacKenzie, A.; Sørlie, M.; Horn, S. J.; Eijsink, V. G. H. Protein Sci. 2011, 20, 1479.

(918) Davies, G. J.; Dodson, G.; Moore, M. H.; Tolley, S. P.; Dauter, Z.; Wilson, K. S.; Rasmussen, G.; Schülein, M. Acta Crystallogr., Sect. D: Biol. Crystallogr. 1996, 52, 7. (919) Castillo, R. M.; Mizuguchi, K.; Dhanaraj, V.; Albert, A.; Blundell, T. L.; Murzin, A. G. Structure 1999, 7, 227. (920) Brock, V.; Kennedy, V. S. J. Exp. Mar. Biol. Ecol. 1992, 159, 51. (921) Liu, G.; Wei, X.; Qin, Y.; Qu, Y. J. Gen. Appl. Microbiol. 2010, 56, 223. (922) Shulein, M.; Henriksen, T.; Lassen, S. F.; Kauppinen, M. S. (Novozymes A/S) Endoglucanases. U.S. Patent 6,855,531 B2, Feb. 15, 2005. (923) Dalboege, H.; Diderichsen, B.; Sandal, T.; Kauppinen, S. (Novozymes A/S) Method of Providing Novel DNA Sequences. W.O. Patent 1997043409 A3, Feb. 26, 1998. (924) Shimonaka, A.; Baba, Y.; Koga, J.; Nakane, A.; Kubota, H.; Kono, T. Biosci. Biotechnol. Biochem. 2004, 68, 2299. (925) Murashima, K.; Nishimura, T.; Nakamura, Y.; Koga, J.; Moriya, T.; Sumida, N.; Yaguchi, T.; Kono, T. Enzyme Microb. Technol. 2002, 30, 319. (926) Moriya, T.; Murashima, K.; Nakane, A.; Yanai, K.; Sumida, N.; Koga, J.; Murakami, T.; Kono, T. J. Bacteriol. 2003, 185, 1749. (927) Koga, J.; Baba, Y.; Shimonaka, A.; Nishimura, T.; Hanamura, S.; Kono, T. Appl. Environ. Microbiol. 2008, 74, 4210. (928) Wonganu, B.; Pootanakit, K.; Boonyapakron, K.; Champreda, V.; Tanapongpipat, S.; Eurwilaichitr, L. Protein Expression Purif. 2008, 58, 78. (929) Schauwecker, F.; Wanner, G.; Kahmann, R. Biol. Chem. HoppeSeyler 1995, 376, 617. (930) Cosgrove, D. J. Nature 2000, 407, 321. (931) McQueen-Mason, S.; Cosgrove, D. J. Proc. Natl. Acad. Sci. U.S.A. 1994, 91, 6574. (932) McQueen-Mason, S. J.; Cosgrove, D. J. Plant Physiol. 1995, 107, 87. (933) Wang, T.; Park, Y. B.; Caporini, M. A.; Rosay, M.; Zhong, L. H.; Cosgrove, D. J.; Hong, M. Proc. Natl. Acad. Sci. U.S.A. 2013, 110, 16444. (934) Georgelis, N.; Tabuchi, A.; Nikolaidis, N.; Cosgrove, D. J. J. Biol. Chem. 2011, 286, 16814. (935) Georgelis, N.; Yennawar, N. H.; Cosgrove, D. 
J. Proc. Natl. Acad. Sci. U.S.A. 2012, 109, 14830. (936) Kim, I. J.; Ko, H.-J.; Kim, T.-W.; Choi, I.-G.; Kim, K. H. Biotechnol. Bioeng. 2013, 110, 401. (937) Kerff, F.; Amoros, A.; Herman, R.; Sauvage, E.; Petrella, S.; Filée, P.; Charlier, P.; Joris, B.; Tabuchi, A.; Nikolaidis, N.; Cosgrove, D. J. Proc. Natl. Acad. Sci. U.S.A. 2008, 105, 16876. (938) Yennawar, N. H.; Li, L.-C.; Dudzinski, D. M.; Tabuchi, A.; Cosgrove, D. J. Proc. Natl. Acad. Sci. U.S.A. 2006, 103, 14664. (939) Cosgrove, D. J. Plant Physiol. 1998, 118, 333. (940) Darley, C. P.; Forrester, A. M.; McQueen-Mason, S. J. Plant Mol. Biol. 2001, 47, 179. (941) Cosgrove, D. J. Nat. Rev. Mol. Cell Biol. 2005, 6, 850. (942) Lipchinsky, A. Acta Physiol. Plant. 2013, 35, 3277. (943) Kim, E. S.; Lee, H. J.; Bang, W. G.; Choi, I. G.; Kim, K. H. Biotechnol. Bioeng. 2009, 102, 1342. (944) Wei, W.; Yang, C.; Luo, J.; Lu, C.; Wu, Y.; Yuan, S. J. Plant Physiol. 2010, 167, 1204. (945) Lin, H.; Shen, Q.; Zhan, J. M.; Wang, Q.; Zhao, Y. H. PLoS One 2013, 8, e75022. (946) Kang, K.; Wang, S.; Lai, G.; Liu, G.; Xing, M. BMC Biotechnol. 2013, 13, 42. (947) Georgelis, N.; Nikolaidis, N.; Cosgrove, D. J. Carbohydr. Polym. 2014, 100, 17. (948) Lee, H. J.; Kim, I. J.; Kim, J. F.; Choi, I. G.; Kim, K. H. Bioresour. Technol. 2013, 149, 516. (949) Li, Y.; Jones, L.; McQueen-Mason, S. Curr. Opin. Plant Biol. 2003, 6, 603. (950) Lee, Y.; Choi, D.; Kende, H. Curr. Opin. Plant Biol. 2001, 4, 527. (951) Cosgrove, D. J. Curr. Opin. Plant Biol. 2000, 3, 73. 1446

DOI: 10.1021/cr500351c Chem. Rev. 2015, 115, 1308−1448

(980) Vaaje-Kolstad, G.; Bøhle, L. A.; Gåseidnes, S.; Dalhus, B.; Bjørås, M.; Mathiesen, G.; Eijsink, V. G. H. J. Mol. Biol. 2012, 416, 239.
(981) Westereng, B.; Ishida, T.; Vaaje-Kolstad, G.; Wu, M.; Eijsink, V. G. H.; Igarashi, K.; Samejima, M.; Ståhlberg, J.; Horn, S. J.; Sandgren, M. PLoS One 2011, 6, e27807.
(982) Henriksson, G.; Johansson, G.; Pettersson, G. J. Biotechnol. 2000, 78, 93.
(983) Henriksson, G.; Ander, P.; Pettersson, B.; Pettersson, G. Appl. Microbiol. Biotechnol. 1995, 42, 790.
(984) Hallberg, B. M.; Bergfors, T.; Bäckbro, K.; Pettersson, G.; Henriksson, G.; Divne, C. Structure 2000, 8, 79.
(985) Hallberg, B. M.; Henriksson, G.; Pettersson, G.; Divne, C. J. Mol. Biol. 2002, 315, 421.
(986) Igarashi, K.; Momohara, I.; Nishino, T.; Samejima, M. Biochem. J. 2002, 365, 521.
(987) Igarashi, K.; Yoshida, M.; Matsumura, H.; Nakamura, N.; Ohno, H.; Samejima, M.; Nishino, T. FEBS J. 2005, 272, 2869.
(988) Tickler, A. K.; Smith, D. G.; Ciccotosto, G. D.; Tew, D. J.; Curtain, C. C.; Carrington, D.; Masters, C. L.; Bush, A. I.; Cherny, R. A.; Cappai, R.; Wade, J. D.; Barnham, K. J. J. Biol. Chem. 2005, 280, 13355.
(989) Paiva, A. C. M.; Juliano, L.; Boschcov, P. J. Am. Chem. Soc. 1976, 98, 7645.
(990) Aachmann, F. L.; Sørlie, M.; Skjåk-Bræk, G.; Eijsink, V. G. H.; Vaaje-Kolstad, G. Proc. Natl. Acad. Sci. U.S.A. 2012, 109, 18779.
(991) Hemsworth, G. R.; Taylor, E. J.; Kim, R. Q.; Gregory, R. C.; Lewis, S. J.; Turkenburg, J. P.; Parkin, A.; Davies, G. J.; Walton, P. H. J. Am. Chem. Soc. 2013, 135, 6069.
(992) Bey, M.; Zhou, S.; Poidevin, L.; Henrissat, B.; Coutinho, P. M.; Berrin, J.-G.; Sigoillot, J.-C. Appl. Environ. Microbiol. 2013, 79, 488.
(993) Wu, M.; Beckham, G. T.; Larsson, A. M.; Ishida, T.; Kim, S.; Payne, C. M.; Himmel, M. E.; Crowley, M. F.; Horn, S. J.; Westereng, B.; Igarashi, K.; Samejima, M.; Ståhlberg, J.; Eijsink, V. G. H.; Sandgren, M. J. Biol. Chem. 2013, 288, 12828.
(994) Floudas, D.; Binder, M.; Riley, R.; Barry, K.; Blanchette, R. A.; Henrissat, B.; Martinez, A. T.; Otillar, R.; Spatafora, J. W.; Yadav, J. S.; Aerts, A.; Benoit, I.; Boyd, A.; Carlson, A.; Copeland, A.; Coutinho, P. M.; de Vries, R. P.; Ferreira, P.; Findley, K.; Foster, B.; Gaskell, J.; Glotzer, D.; Gorecki, P.; Heitman, J.; Hesse, C.; Hori, C.; Igarashi, K.; Jurgens, J. A.; Kallen, N.; Kersten, P.; Kohler, A.; Kues, U.; Kumar, T. K. A.; Kuo, A.; LaButti, K.; Larrondo, L. F.; Lindquist, E.; Ling, A.; Lombard, V.; Lucas, S.; Lundell, T.; Martin, R.; McLaughlin, D. J.; Morgenstern, I.; Morin, E.; Murat, C.; Nagy, L. G.; Nolan, M.; Ohm, R. A.; Patyshakuliyeva, A.; Rokas, A.; Ruiz-Duenas, F. J.; Sabat, G.; Salamov, A.; Samejima, M.; Schmutz, J.; Slot, J. C.; John, F. S.; Stenlid, J.; Sun, H.; Sun, S.; Syed, K.; Tsang, A.; Wiebenga, A.; Young, D.; Pisabarro, A.; Eastwood, D. C.; Martin, F.; Cullen, D.; Grigoriev, I. V.; Hibbett, D. S. Science 2012, 336, 1715.
(995) Li, X.; Beeson, W. T.; Phillips, C. M.; Marletta, M. A.; Cate, J. H. D. Structure 2012, 20, 1051.
(996) Dietzel, P. D. C.; Kremer, R. K.; Jansen, M. J. Am. Chem. Soc. 2004, 126, 4689.
(997) Vu, V. V.; Beeson, W. T.; Phillips, C. M.; Cate, J. H. D.; Marletta, M. A. J. Am. Chem. Soc. 2014, 136, 562.
(998) Gudmundsson, M.; Kim, S.; Wu, M.; Ishida, T.; Haddad Momeni, M.; Vaaje-Kolstad, G.; Lundberg, D.; Royant, A.; Ståhlberg, J.; Eijsink, V. G. H.; Beckham, G. T.; Sandgren, M. J. Biol. Chem. 2014, in press.
(999) Isaksen, T.; Westereng, B.; Aachmann, F. L.; Agger, J. W.; Kracher, D.; Kittl, R.; Ludwig, R.; Haltrich, D.; Eijsink, V. G. H.; Horn, S. J. J. Biol. Chem. 2014, 289, 2632.
(1000) Agger, J. W.; Isaksen, T.; Várnai, A.; Vidal-Melgosa, S.; Willats, W. G. T.; Ludwig, R.; Horn, S. J.; Eijsink, V. G. H.; Westereng, B. Proc. Natl. Acad. Sci. U.S.A. 2014, 111, 6287.
(1001) Vu, V. V.; Beeson, W. T.; Span, E. A.; Farquhar, E. R.; Marletta, M. A. Proc. Natl. Acad. Sci. U.S.A. 2014, 111, 13822.
(1002) Hemsworth, G. R.; Davies, G. J.; Walton, P. H. Curr. Opin. Struct. Biol. 2013, 23, 660.
(1003) Kim, S.; Ståhlberg, J.; Sandgren, M.; Paton, R. S.; Beckham, G. T. Proc. Natl. Acad. Sci. U.S.A. 2014, 111, 149.
(1004) Tantillo, D. J.; Chen, J. G.; Houk, K. N. Curr. Opin. Chem. Biol. 1998, 2, 743.
(1005) Gherman, B. F.; Tolman, W. B.; Cramer, C. J. J. Comput. Chem. 2006, 27, 1950.
(1006) Huber, S. M.; Ertem, M. Z.; Aquilante, F.; Gagliardi, L.; Tolman, W. B.; Cramer, C. J. Chem. Eur. J. 2009, 15, 4886.
(1007) Schroder, D.; Holthausen, M. C.; Schwarz, H. J. Phys. Chem. B 2004, 108, 14407.
(1008) Comba, P.; Knoppe, S.; Martin, B.; Rajaraman, G.; Rolli, C.; Shapiro, B.; Stork, T. Chem. Eur. J. 2008, 14, 344.
(1009) Himes, R. A.; Karlin, K. D. Curr. Opin. Chem. Biol. 2009, 13, 119.
(1010) Kunishita, A.; Teraoka, J.; Scanlon, J. D.; Matsumoto, T.; Suzuki, M.; Cramer, C. J.; Itoh, S. J. Am. Chem. Soc. 2007, 129, 7248.
(1011) Klinman, J. P. Chem. Rev. 1996, 96, 2541.
(1012) Solomon, E. I.; Sundaram, U. M.; Machonkin, T. E. Chem. Rev. 1996, 96, 2563.
(1013) Aboelella, N. W.; Kryatov, S. V.; Gherman, B. F.; Brennessel, W. W.; Young, V. G.; Sarangi, R.; Rybak-Akimova, E. V.; Hodgson, K. O.; Hedman, B.; Solomon, E. I.; Cramer, C. J.; Tolman, W. B. J. Am. Chem. Soc. 2004, 126, 16896.
(1014) Chen, P.; Solomon, E. I. J. Am. Chem. Soc. 2004, 126, 4991.
(1015) Klinman, J. P. J. Biol. Chem. 2006, 281, 3013.
(1016) Gherman, B. F.; Heppner, D. E.; Tolman, W. B.; Cramer, C. J. JBIC, J. Biol. Inorg. Chem. 2006, 11, 197.
(1017) Maiti, D.; Fry, H. C.; Woertink, J. S.; Vance, M. A.; Solomon, E. I.; Karlin, K. D. J. Am. Chem. Soc. 2007, 129, 264.
(1018) Cramer, C. J.; Tolman, W. B. Acc. Chem. Res. 2007, 40, 601.
(1019) Cramer, C. J.; Gour, J. R.; Kinal, A.; Włoch, M.; Piecuch, P.; Shahi, A. R. M.; Gagliardi, L. J. Phys. Chem. A 2008, 112, 3754.
(1020) Itoh, S. In Copper-Oxygen Chemistry; John Wiley & Sons, Inc.: Hoboken, NJ, 2011; pp 225−282.
(1021) Osborne, R. L.; Klinman, J. P. In Copper-Oxygen Chemistry; John Wiley & Sons, Inc.: Hoboken, NJ, 2011; pp 1−22.
(1022) Peterson, R. L.; Himes, R. A.; Kotani, H.; Suenobu, T.; Tian, L.; Siegler, M. A.; Solomon, E. I.; Fukuzumi, S.; Karlin, K. D. J. Am. Chem. Soc. 2011, 133, 1702.
(1023) Suess, A. M.; Ertem, M. Z.; Cramer, C. J.; Stahl, S. S. J. Am. Chem. Soc. 2013, 135, 9797.
(1024) Hemsworth, G. R.; Henrissat, B.; Davies, G. J.; Walton, P. H. Nat. Chem. Biol. 2013, 10, 122.
(1025) Dimarogona, M.; Topakas, E.; Olsson, L.; Christakopoulos, P. Bioresour. Technol. 2012, 110, 480.
(1026) Sygmund, C.; Kracher, D.; Scheiblbrandner, S.; Zahma, K.; Felice, A. K. G.; Harreither, W.; Kittl, R.; Ludwig, R. Appl. Environ. Microbiol. 2012, 78, 6161.
(1027) Solomon, E. I.; Heppner, D. E.; Johnston, E. M.; Ginsbach, J. W.; Cirera, J.; Qayyum, M.; Kieber-Emmons, M. T.; Kjaergaard, C. H.; Hadt, R. G.; Tian, L. Chem. Rev. 2014, 114, 3659.
(1028) Nakagawa, Y. S.; Eijsink, V. G. H.; Totani, K.; Vaaje-Kolstad, G. J. Agric. Food. Chem. 2013, 61, 11061.
(1029) Forsberg, Z.; Mackenzie, A. K.; Sørlie, M.; Røhr, Å. K.; Helland, R.; Arvai, A. S.; Vaaje-Kolstad, G.; Eijsink, V. G. H. Proc. Natl. Acad. Sci. U.S.A. 2014, 111, 8446.
(1030) Suga, K.; Vandedem, G.; Mooyoung, M. Biotechnol. Bioeng. 1975, 17, 433.
(1031) Okazaki, M.; Mooyoung, M. Biotechnol. Bioeng. 1978, 20, 637.
(1032) Converse, A. O.; Optekar, J. D. Biotechnol. Bioeng. 1993, 42, 145.
(1033) Woodward, J.; Lima, M.; Lee, N. E. Biochem. J. 1988, 255, 895.
(1034) Zhang, Y. H. P.; Lynd, L. R. Biotechnol. Bioeng. 2006, 94, 888.
(1035) Zhou, W.; Hao, Z. Q.; Xu, Y.; Schuttler, H. B. Biotechnol. Bioeng. 2009, 104, 275.
(1036) Zhou, W.; Schuttler, H. B.; Hao, Z. Q.; Xu, Y. Biotechnol. Bioeng. 2009, 104, 261.
(1037) Zhou, W.; Xu, Y.; Schuttler, H. B. Biotechnol. Bioeng. 2010, 107, 224.

(1038) Sild, V.; Ståhlberg, J.; Pettersson, G.; Johansson, G. FEBS Lett. 1996, 378, 51.
(1039) Levine, S. E.; Fox, J. M.; Blanch, H. W.; Clark, D. S. Biotechnol. Bioeng. 2010, 107, 37.
(1040) Levine, S. E.; Fox, J. M.; Clark, D. S.; Blanch, H. W. Biotechnol. Bioeng. 2011, 108, 2561.
(1041) Griggs, A. J.; Stickel, J. J.; Lischeske, J. J. Biotechnol. Bioeng. 2012, 109, 665.
(1042) Griggs, A. J.; Stickel, J. J.; Lischeske, J. J. Biotechnol. Bioeng. 2012, 109, 676.
(1043) Igarashi, K. Nat. Chem. Biol. 2013, 9, 350.
(1044) Fox, J. M.; Jess, P.; Jambusaria, R. B.; Moo, G. M.; Liphardt, J.; Clark, D. S.; Blanch, H. W. Nat. Chem. Biol. 2013, 9, 356.
(1045) Gao, D. H.; Chundawat, S. P. S.; Sethi, A.; Balan, V.; Gnanakaran, S.; Dale, B. E. Proc. Natl. Acad. Sci. U.S.A. 2013, 110, 10922.
(1046) Fenske, J. J.; Penner, M. H.; Bolte, J. P. J. Theor. Biol. 1999, 199, 113.
(1047) Warden, A. C.; Little, B. A.; Haritos, V. S. Biotechnol. Biofuels 2011, 4, 39.
(1048) Asztalos, A.; Daniels, M.; Sethi, A.; Shen, T. Y.; Langan, P.; Redondo, A.; Gnanakaran, S. Biotechnol. Biofuels 2012, 5, 55.
