The Enzyme Portal: an integrative tool for enzyme information and analysis.

Enzymes play essential roles in all life processes and are used extensively in the biomedical and biotechnological fields. However, enzyme-related information is spread across multiple resources, making its retrieval time-consuming. In response to this challenge, the Enzyme Portal has been established to facilitate enzyme research by providing a freely available hub where researchers can easily find and explore enzyme-related information. It integrates relevant enzyme data for a wide range of species from various resources such as UniProtKB, PDBe and ChEMBL. Here, they describe what type of enzyme-related data the Enzyme Portal provides, how the information is organized, and, by showcasing two potential use cases, how to access and retrieve it.

Enzymes are found in the proteomes of species across all kingdoms of life, where they catalyze the chemical reactions required for essential functions such as DNA replication, energy production and food digestion. They represent between 20 and 30% of both bacterial and eukaryotic proteomes; for instance, about 21% (> 4000 proteins) of the human proteome are enzymes, while in E. coli, they represent about 37% of its proteins. The range of reactions catalyzed by these proteins is broad, and our understanding of the role these biological catalysts play is continuously evolving. Currently, the Enzyme Commission (EC) numbering system, which provides a numerical classification of enzyme catalytic function based on the chemical reactions they catalyze, describes 7 general classes with more than 6000 distinct subgroups of enzymes, with many more waiting to be added. Enzymes can be specific for a unique substrate or can act on a broad range of substrates. The substrate specificities of orthologous enzymes can also vary between species.

The crucial role enzymes play in sustaining life is highlighted by the growing number of debilitating or lethal human diseases known to be caused by mutations in the amino acid sequence (variants) that result in abnormal enzyme expression or regulation, or loss of enzymatic activity. An analysis using data in the protein database UniProtKB (release 2021_01) shows that of the 4515 human proteins associated with a disease, 1413 (31%) are enzymes. Often the deleterious mutations affect active site amino acid residues that are involved in catalysis. For example, mutations affecting the cofactor binding sites in the exonuclease TREX1 are linked to Aicardi-Goutières syndrome and to an increase in susceptibility to systemic lupus erythematosus.

In addition to being human drug targets, enzymes are often considered ideal targets for the development of antibacterial, antiviral, and antiparasitic agents. By binding to and inhibiting D-alanyl-D-alanine carboxypeptidase, an enzyme crucial for peptidoglycan biosynthesis, penicillin became one of the most successful antibiotics used to treat bacterial infections. Research in this field is crucial, not only to address the rise in resistance to available antibiotics but also to tackle other microbial infections such as those caused by parasites, which are a huge health burden for several countries. Indeed, several parasitic enzyme families are currently being investigated as potential drug targets, as illustrated by a study that screened the kinome of Trypanosoma cruzi, the causative agent of Chagas disease, to identify potential drug targets.

Enzymes are also invaluable tools for scientific research and medical diagnostics. The use of restriction enzymes and the recent development of the CRISPR (clustered regularly interspaced short palindromic repeats) technique have revolutionized the field of molecular biology. Enzymes have also been successfully used in the biotechnology field to synthesize compounds, offering a more efficient and less toxic alternative to traditional production methods. For example, the synthesis of the antidiabetic compound sitagliptin using an engineered transaminase is more efficient than the traditional method, which requires high-pressure conditions and a rhodium-based chiral catalyst.

Information about enzyme biology and biochemistry is therefore essential to understand how changes in their catalytic function lead to diseases, to design inhibitory or activating drugs, or to engineer enzymes with new or improved catalytic activity. For instance, to design a drug, access to the protein 3D structure and the positions of residues required for catalysis, such as active site(s) and cofactor and ligand binding sites, is fundamental. However, other important aspects also need to be considered: Will the compound interfere with the enzymatic activity of closely related enzymes? For antimicrobial drugs, will they affect the host enzymes? Is the target enzyme involved in other pathways? If yes, how will they be affected, and will this effect be detrimental?

In the postgenomic era, databases play an essential role in extracting and gathering biological information from the literature and making it freely available to the scientific community. However, most databases tend to specialize in one specific aspect of protein biology. For example, the Protein Data Bank (PDB) is a repository for protein 3D structures, while a database such as M-CSA (Mechanism and Catalytic Site Atlas) provides information about enzymatic reaction mechanisms. The spread of enzyme-related data across multiple databases makes the gathering of information difficult and time-consuming. The Enzyme Portal was established to address this challenge. It combines publicly available information on enzymes for many species, including the major model organisms, using data extracted from multiple disparate resources to provide a concise summary of information on common names, functions, enzyme classification, reaction mechanisms, biochemical pathways, and related drug compounds. Here, they provide a description of its main features and, through two case studies, describe how it can be used.

Zaru, R.; Onwubiko, J.; Ribeiro, A. J. M.; Cochrane, K.; Tyzack, J. D.; Muthukrishnan, V.; Pravda, L.; Thornton, J. M.; O’Donovan, C.; Velankar, S.; Orchard, S.; Leach, A.; Martin, M. J. The Enzyme Portal: An Integrative Tool for Enzyme Information and Analysis. The FEBS Journal 2021.