HOW file format
From Christoph's Personal Wiki
Revision as of 08:49, 9 June 2007 by Christoph
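The HOW file format (or just HOW file) stores protein sequences together with their secondary structure. Each entry consists of a header line giving the sequence length and an identifier, followed by the amino acid sequence and a secondary structure string of the same length.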
Secondary structure format (DSSP)
The secondary structure format uses the DSSP assignment:
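- G: 3-10 helix
- I: pi-helix
- H: alpha-helix
- E: extended beta-sheet
- B: beta-bridge
- T: hydrogen-bonded turn
- S: bend
- L: other/loop
In the example file further down, residues without an assignment are written as "." rather than L.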
Note: Prediction servers typically use just three categories: H, E, and L. Sometimes H, G, and I are merged into H, and sometimes E and B are merged into E. L covers everything else, as on the PredictProtein server.
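The merging described in the note can be sketched in a few lines of Python. This is only an illustration: the function name `reduce_to_three_states` and its two flags are hypothetical, and the flags exist because the note says the merges are applied only "sometimes".

```python
def reduce_to_three_states(dssp: str,
                           merge_helices: bool = True,
                           merge_bridges: bool = True) -> str:
    """Map a DSSP string to the three classes H, E and L.

    Hypothetical helper: the flags select whether G/I count as helix
    and whether B counts as strand; every other code becomes L.
    """
    helix = set("HGI") if merge_helices else {"H"}
    strand = set("EB") if merge_bridges else {"E"}
    reduced = []
    for code in dssp:
        if code in helix:
            reduced.append("H")
        elif code in strand:
            reduced.append("E")
        else:
            # Turns, bends, "." and anything else fall into the rest class.
            reduced.append("L")
    return "".join(reduced)


print(reduce_to_three_states("GGG.EEEB"))  # HHHLEEEE
```

With both flags set to False only H and E keep their class, and G, I, and B are counted as L.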
Example HOW file
178 1cdy.-
KKVVLGKKGDTVELTCTASQKKSIQFHWKNSNQIKILGNQGSFLTKSPSKLNDRADSRRSLWDQGNFPLIIKNLKIEDSD 80
TYICEVEDQKEEVQLLVFGLTANSDTHLLQGQSLTLTLESPPGSSPSVQCRSPRGKNIQGGKTLSVSQLELQDSGTWTCT 160
VLQNQKKVEFKIDIVVLA
.EEEEEETTS.EEE..B..SSSS..EEEEETTS.EEEEEETTEEEE.S.TTGGGEE..GGGGGGTB..EEE.S..GGG.E 80
EEEEEETTEEEEEEEEEEEEEE.S.SEEETT..EEEEEE..TT....B..B.TTS.B..BSSEEEESS..STT.EEEEEE 160
EEETTEEEEEEEEEEEE.
405 1phb.-
NLAPLPPHVPEHLVFDFDMYNPSNLSAGVQEAWAVLQESNVPDLVWTRCNGGHWIATRGQLIREAYEDYRHFSSECPFIP 80
REAGEAYDFIPTSMDPPEQRQFRALANQVVGMPVVDKLENRIQELACSLIESLRPQGQCNFTEDYAEPFPIRIFMLLAGL 160
PEEDIPHLKYLTDQMTRPDGSMTFAEAKEALYDYLIPIIEQRRQKPGTDAISIVANGQVNGRPITSDEAKRMCGLLLVGG 240
LDTVVNFLSFSMEFLAKSPEHRQELIERPERIPAACEELLRRFSLVADGRILTSDYEFHGVQLKKGDQILLPQMLSGLDE 320
RENACPMHVDFSRQKVSHTTFGHGSHLCLGQHLARREIIVTLKEWLTRIPDFSIAPGAQIQHKSGIVSGVQALPLVWDPA 400
TTKAV
......TTS.GGGB....TTS.TTGGG.HHHHHGGGGSTTS.SEEEE.GGG.EEEE.SHHHHHHHHH.TTTEETTS.SSS 80
HHHHHH...TTTT..TTTHHHHHHHHHHHHSHHHHHHHHHHHHHHHHHHHHHHGGGSEEEHHHHTTTHHHHHHHHHHHT. 160
.GGGHHHHHHHHHHHHS..SSS.HHHHHHHHHHHHHHHHHHHHHS..SSHHHHHHT.EETTEE..HHHHHHHHHHHHHHH 240
HHHHHHHHHHHHHHHHH.HHHHHHHHH.GGGHHHHHHHHHHHT..B..EEEESS.EEETTEEE.TT.EEE..GGGTTT.T 320
TTSSSTTS..TT.S.....TT..GGG..TTHHHHHHHHHHHHHHHHHH....EE.TT....EE.SSB.EES..EEE..GG 400
G....
344 1hle.A
MEQLSTANTHFAVDLFRALNESDPTGNIFISPLSISSALAMIFLGTRGNTAAQVSKALYFDTVEDIHSRFQSLNADINKP 80
GAPYILKLANRLYGEKTYNFLADFLASTQKMYGAELASVDFQQAPEDARKEINEWVKGQTEGKIPELLVKGMVDNMTKLV 160
LVNAIYFKGNWQQKFMKEATRDAPFRLNKKDTKTVKMMYQKKKFPYNYIEDLKCRVLELPYQGKELSMIILLPDDIEDES 240
TGLEKIEKQLTLDKLREWTKPENLYLAEVNVHLPRFKLEESYDLTSHLARLGVQDLFNRGKADLSGMSGARDLFVSKIIH 320
KSFVDLNEEGTEAAAATAGTILLA
.HHHHHHHHHHHHHHHHHHHHH.SSS.EEE.HHHHHHHHHHHHHT..HHHHHHHHHHHTGGGSTTHHHHHHHHHHHHT.S 80
S.SSEEEEEEEEEEETT....HHHHHHHHHHH..EEEEE.TTT.HHHHHHHHHHHHHHHTTTSSS.SS.TTSS.TTEEEE 160
EEEEEEEEEEBSS...GGG.EEEEEESSSS.EEEEEEEEEEEEEEEEEEGGGTEEEEEEEBTTSSEEEEEEEESS..SSS 240
SS.HHHHHT..HHHHHHHH.GGG.EEEEEEEEEE.EEEEEEEE.HHHHHHHT..GGG.TTT...HHHHSSS.EEEEEEEE 320
EEEEEE.SSEEEEEEEEEEEEEE.
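The example above suggests a simple way to read HOW data. The following is a minimal sketch, not a reference implementation. It assumes that each record is a header (length, then identifier), followed by the sequence and then the structure string, and that the numbers after full lines are running residue counts that never follow the last line of a block. The names `HowRecord` and `parse_how` are hypothetical. Parsing by whitespace-separated tokens rather than by fixed columns keeps the sketch independent of the exact padding.

```python
from __future__ import annotations

from typing import Iterator, NamedTuple


class HowRecord(NamedTuple):
    name: str       # identifier from the header line, e.g. "1cdy.-"
    sequence: str   # amino acid sequence
    structure: str  # secondary structure, one character per residue


def _read_block(tokens: list[str], pos: int, length: int) -> tuple[str, int]:
    """Join chunks starting at tokens[pos] until `length` characters are read."""
    chunks: list[str] = []
    read = 0
    while read < length:
        if pos >= len(tokens):
            raise ValueError("HOW data ended in the middle of a record")
        token = tokens[pos]
        pos += 1
        if token.isdigit():
            # Assumed to be the running residue count after a full line.
            continue
        chunks.append(token)
        read += len(token)
    if read != length:
        raise ValueError(f"expected {length} characters, got {read}")
    return "".join(chunks), pos


def parse_how(text: str) -> Iterator[HowRecord]:
    """Yield one HowRecord per entry found in `text`."""
    tokens = text.split()
    pos = 0
    while pos < len(tokens):
        length, name = int(tokens[pos]), tokens[pos + 1]
        pos += 2
        sequence, pos = _read_block(tokens, pos, length)
        structure, pos = _read_block(tokens, pos, length)
        yield HowRecord(name, sequence, structure)


# Toy record (made up for illustration, with short lines instead of 80 columns).
toy = "     6 demo.A\nMKV 3\nLAG\n.HH 3\nH..\n"
for record in parse_how(toy):
    print(record.name, len(record.sequence))  # demo.A 6
```

The length from the header drives the parsing, so a record whose sequence or structure does not match the declared length raises a `ValueError` instead of silently shifting into the next entry.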
Note: In the "old style" HOW output format, the sequence length occupies the first five positions of the header line, right-adjusted. The new HOW format reserves the first six positions for the sequence length.
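The difference between the two header styles comes down to the field width of the length. A minimal sketch, assuming a single space between the length field and the identifier (as in the example file) and using the hypothetical name `format_how_header`:

```python
def format_how_header(length: int, name: str, old_style: bool = False) -> str:
    """Build a HOW header line with a right-adjusted sequence length.

    Old style uses five positions for the length, new style six.
    The single separating space before the name is an assumption.
    """
    width = 5 if old_style else 6
    return f"{length:>{width}} {name}"


print(repr(format_how_header(178, "1cdy.-")))                  # '   178 1cdy.-'
print(repr(format_how_header(178, "1cdy.-", old_style=True)))  # '  178 1cdy.-'
```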
External links
- WebLogo — a web based application designed to make the generation of sequence logos as easy and painless as possible.
- plogo — Protein Sequence Logos using Relative Entropy
- RNA Structure Logo
- PredictProtein
- A Gallery of Sequence Logos
- Visualizing DNA binding sites: Sequence Logos and Walkers
- How Can I Make Sequence Logos on My Own Computer?
- STRING — Search Tool for the Retrieval of Interacting Proteins