How to find information in KEGG

How to make use of the KEGG metabolic database

How do I find a standard metabolic pathway?
From the KEGG table of contents, click on the link 'Metabolic pathways' under the pathway category. A list of all pathways is displayed. To find the pathway link for 'Lysine biosynthesis', scroll down to the group of pathways called 'amino acid metabolism' and click on the 'Lysine biosynthesis' link. You should now see the standard pathway map MAP00300 for lysine biosynthesis.
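The same map can also be addressed by its identifier rather than by clicking through the table of contents. The following is a minimal sketch, assuming the public KEGG REST service at rest.kegg.jp and its 'get' operation, and assuming that current KEGG writes the map number in lower case (map00300 rather than MAP00300). The function names are made up for this illustration, and fetch_entry is defined but not called because it needs network access.

```python
from urllib.request import urlopen

# Assumption: base address of the public KEGG REST service.
KEGG_REST = "https://rest.kegg.jp"


def reference_map_id(number: int) -> str:
    """Format a pathway number as a reference map ID, e.g. 300 -> 'map00300'."""
    return f"map{number:05d}"


def get_url(entry_id: str) -> str:
    """Build the URL of the REST 'get' operation for one KEGG entry."""
    return f"{KEGG_REST}/get/{entry_id}"


def fetch_entry(entry_id: str) -> str:
    """Download the plain-text record for an entry (requires network access)."""
    with urlopen(get_url(entry_id), timeout=30) as response:
        return response.read().decode("utf-8")


lysine_map = reference_map_id(300)
print(lysine_map)           # map00300
print(get_url(lysine_map))  # https://rest.kegg.jp/get/map00300
```

Calling fetch_entry(lysine_map) should return the text record for the lysine biosynthesis map, provided the service is reachable.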

How do I find species-specific pathway maps?
Once in the standard pathway map, select the species name (e.g. Escherichia coli) in the 'Go To' window and click on the Exec button. You should now see the same pathway, with the species name (e.g. Escherichia coli) shown in the window. All known E. coli enzymes for which a database entry exists are labeled in green. In this example, not all enzymes are colored. An uncolored enzyme means either that the corresponding enzyme (pathway) does not exist in this organism or that no information is found in any of the public databases (Swiss-Prot, GenBank, Protein Data Bank, etc.). Since the complete genome of E. coli has been sequenced, this very likely indicates that the pathway does not exist in this Gram-negative bacterium. However, about 39% of all predicted genes (called ORFs, or open reading frames) are not associated with any known protein or with any biochemical or physiological activity, so not all pathways are known and described in KEGG. Some organisms, such as mouse and human, are not listed for lysine biosynthesis but do show a map for glutamate, for example. L-lysine, unlike glutamate, is an essential amino acid for both human and mouse: it has to be part of our diet because we lack the enzymes necessary for its biosynthesis.
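The coloring rule described above amounts to a set comparison: an enzyme box is green when the enzyme on the reference map also has an entry for the selected organism. The sketch below illustrates only that logic. The EC number sets are placeholders chosen for illustration and were not retrieved from KEGG, the function names are hypothetical, and 'eco' is assumed to be the KEGG organism code for E. coli.

```python
def organism_map_id(org_code: str, number: int) -> str:
    """Format an organism-specific map ID, e.g. ('eco', 300) -> 'eco00300'."""
    return f"{org_code}{number:05d}"


def split_by_coloring(map_enzymes, organism_enzymes):
    """Return (colored, uncolored) EC numbers for one pathway map.

    colored:   on the map and annotated in the organism (shown in green)
    uncolored: on the map but with no entry for the organism
    """
    colored = sorted(ec for ec in map_enzymes if ec in organism_enzymes)
    uncolored = sorted(ec for ec in map_enzymes if ec not in organism_enzymes)
    return colored, uncolored


# Placeholder data for illustration only, NOT taken from KEGG.
reference_map = {"4.1.1.20", "5.1.1.7", "1.4.1.16"}
organism_annotated = {"4.1.1.20", "5.1.1.7", "2.7.2.4"}

colored, uncolored = split_by_coloring(reference_map, organism_annotated)
print(organism_map_id("eco", 300))  # eco00300
print("green:", colored)            # green: ['4.1.1.20', '5.1.1.7']
print("uncolored:", uncolored)      # uncolored: ['1.4.1.16']
```

An uncolored result here carries the same ambiguity as on the map: the enzyme may be absent from the organism, or it may simply not have a database entry yet.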

How do I find a chemical structure or metabolite information from a pathway map?
Note that the immediate precursor of L-lysine also serves as a substrate for peptidoglycan synthesis to form the activated polypeptide

UDP-N-acetylmuramoyl-L-alanyl-D-glutamyl-meso-2,6-diaminopimeloyl-D-alanyl-D-alanine

the structure of which can be found by clicking on the small circle next to the compound name. You will see an information page for KEGG entry C04882. This example shows how to find information about a metabolite: its chemical structure, formula, KEGG entry number, and the numbers of the pathway maps in which it is an intermediate. Clicking on a substrate name (or its circle) on a pathway map page is the easiest way to find relevant chemical information about a substrate and the pathways it takes part in. The same can be done for an enzyme by clicking on its E.C. number box, or for an intersecting pathway, which is shown in a box with rounded corners. For example, a link exists from the lysine biosynthesis map to the 'lysine degradation' pathway map. Clicking on the box marked 'Lysine degradation' brings up the corresponding catabolic processes. Note that the species selection does not change (we last selected E. coli pathways above). The new pathway map number is MAP00310.
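The three kinds of clickable objects on a map (compound circles, E.C. number boxes, and linked pathway boxes) correspond to three identifier formats. As a rough sketch, the patterns below recognize each format from its shape alone. They are assumptions inferred from the examples in this text (C04882, MAP00310, EC 1.14.15.5), not an official KEGG grammar, and the function name is hypothetical.

```python
import re

# Assumed identifier shapes, inferred from the examples in the text.
PATTERNS = {
    "compound": re.compile(r"C\d{5}"),                      # e.g. C04882
    "pathway": re.compile(r"[a-z]{2,4}\d{5}", re.IGNORECASE),  # e.g. MAP00310
    "enzyme": re.compile(r"\d+\.\d+\.\d+\.\d+"),            # e.g. 1.14.15.5
}


def classify_kegg_id(identifier: str) -> str:
    """Return 'compound', 'pathway', 'enzyme', or 'unknown' for an identifier."""
    for kind, pattern in PATTERNS.items():
        if pattern.fullmatch(identifier):
            return kind
    return "unknown"


for example in ["C04882", "MAP00310", "1.14.15.5", "lysine"]:
    print(example, "->", classify_kegg_id(example))
```

A plain word such as 'lysine' matches none of the patterns, which is the case handled by keyword search rather than by a direct entry lookup.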

How do I find a chemical structure or metabolite information by keyword search?
To find a pathway metabolite or enzyme, the table of contents offers a direct link to the DBGET Ligand database at KEGG. This search mode can be found on the 'table of contents' page under the 'enzyme' category, DBGET search. Click on the link called 'Ligand' to access a generic search mode that accepts keywords. Note that the DBGET database does not require an exact enzyme number or compound number. To find information about lysine or L-lysine, type in 'lysine' and press the Return (Enter) key. You will receive a list of 159 hits: the search returns all KEGG entries that contain the word 'lysine' anywhere in the enzyme or compound name. The list contains 45 enzyme links (ec: x.x.x.xx) and 51 compound links (cpd: Cxxxxx), one being L-lysine (cpd:C00047) and all others derivatives thereof. Clicking on the cpd number brings you to the chemical structure information sheet. This sheet lists the compound entry number for L-lysine (note that D-lysine has a different entry, of course), its common name(s), formula, and structure, all pathway maps that contain L-lysine as a metabolite (5 maps for L-lysine, including synthesis and degradation, biotin metabolism, alkaloid biosynthesis II, and aminoacyl-tRNA biosynthesis), and finally a list of all known enzymes that use L-lysine as a substrate.
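A keyword search of this kind can also be scripted. The sketch below assumes the 'find' operation of the KEGG REST service, which is expected to return one hit per line as an entry ID, a tab, and a semicolon-separated list of names. The sample text is hand-written for illustration: only cpd:C00047 comes from the text above, and the 'ec:0.0.0.0' line is a placeholder, not a real entry. The function names are hypothetical.

```python
from collections import defaultdict
from urllib.parse import quote


def find_url(database: str, keyword: str) -> str:
    """Build the URL of a REST keyword search, e.g. in 'compound' or 'enzyme'."""
    return f"https://rest.kegg.jp/find/{database}/{quote(keyword)}"


def parse_hits(text: str):
    """Parse 'ID<TAB>name1; name2' lines into (entry_id, [names]) tuples."""
    hits = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        entry_id, _, names = line.partition("\t")
        hits.append((entry_id, [n.strip() for n in names.split(";") if n.strip()]))
    return hits


def group_by_database(hits):
    """Group entry IDs by their prefix ('cpd' for compounds, 'ec' for enzymes)."""
    groups = defaultdict(list)
    for entry_id, _names in hits:
        groups[entry_id.split(":", 1)[0]].append(entry_id)
    return dict(groups)


# Hand-written sample; the second line is a placeholder, not a KEGG entry.
SAMPLE = (
    "cpd:C00047\tL-Lysine; Lysine acid; 2,6-Diaminohexanoic acid\n"
    "ec:0.0.0.0\tplaceholder enzyme name containing lysine\n"
)

hits = parse_hits(SAMPLE)
print(find_url("compound", "lysine"))  # https://rest.kegg.jp/find/compound/lysine
print(group_by_database(hits))         # {'cpd': ['cpd:C00047'], 'ec': ['ec:0.0.0.0']}
```

Grouping by prefix mirrors the way the hit list above separates enzyme links (ec:) from compound links (cpd:).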

General information on biological molecules
One additional, very helpful feature is the set of molecular catalogue entries, specifically the 'compound classification'. This link leads to a catalogue of metabolites classified according to their functional class, e.g. carbohydrates, fatty acids, phospholipids, neurotransmitters, etc. If you want to look up the structures of a class of molecules, such as the amino acids or various hexoses, this link gives the best and broadest result for quickly finding what you need. Use this link as a reference for structure information. As an example, suppose we are interested in the general structure of steroid hormones. A link in the category 'Lipids' leads to a page containing the names and chemical structures of seven cholesterol-derived steroid hormones. Clicking on the name link 'aldosterone' connects to a structure information page that provides a link to the pathway map for C21 steroid hormone metabolism (MAP00140). Following the pathway map link displays the standard pathway for steroid hormone metabolism, with the position of aldosterone marked by a red circle (because we started our search from aldosterone). Selecting the Homo sapiens version of the map shows a variety of pathways, whereas no corresponding bacterial map for E. coli exists. This is not surprising, because microorganisms lack the capability to synthesize steroid hormones. The database offers links to eukaryotic organisms only.

Why are some enzymes not colored even though they are part of a pathway of enzymes that are colored?
Sometimes an enzyme is not marked where you would expect it to be, as in the aldosterone pathway above. This pathway map shows all known reactions, summarized in a standard pathway map. Species-specific enzymes are marked green. Missing enzymes that appear to interrupt a pathway occur when no entry for this enzyme (gene, amino acid sequence, protein structure) exists in any database, not only in KEGG. The enzyme with the entry EC 1.14.15.5 is corticosterone 18-monooxygenase, which converts corticosterone into aldosterone. Following the E.C. link for this enzyme to the entry in GenBank (mirrored from NCBI) shows one nucleic acid sequence report, for rat (exon 9 of the rat CYP11B2 gene for aldosterone synthase). A human homologue is likely to exist for this monooxygenase, but no sequence has been reported yet.