Dr. Carlos J. Camacho Laboratory

The Dr. Carlos J. Camacho Laboratory is where I did scientific research from October 2004 to July 2005, from March 2006 to September 2006 (in absentia), and from December 2006 to February 2007 (in absentia).

Scientific programming

Probably the most complicated programming project I have worked on was an attempt to predict how two proteins will interact (see SmoothDock). Since each protein/protein complex requires billions of calculations (a minimum of 2.7 x 10^10), I had to write purpose-built C code optimised to run in parallel on a dedicated cluster of 256 CPUs (using MPICH, an implementation of MPI). As a side note, I also had to translate some of the original Fortran77 code into C so that it could be compiled with mpicc.
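As a rough illustration of how a workload of this size might be divided among processes, the sketch below splits a fixed number of independent calculations into contiguous blocks, one per process. It is written in Python for brevity rather than in the C/MPI of the actual system, and the function name `block_range` and the even block decomposition are assumptions made for this example, not details taken from the SmoothDock code.

```python
def block_range(total, n_workers, rank):
    """Return the half-open range [start, stop) of work items owned by `rank`.

    Hypothetical helper: items are split into contiguous blocks whose
    sizes differ by at most one.
    """
    if not 0 <= rank < n_workers:
        raise ValueError("rank must satisfy 0 <= rank < n_workers")
    base, extra = divmod(total, n_workers)
    # The first `extra` ranks each take one additional item.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


TOTAL = 27_000_000_000  # the 2.7 x 10^10 minimum quoted above
N_WORKERS = 256         # one worker per CPU in the cluster

if __name__ == "__main__":
    for rank in (0, 1, N_WORKERS - 1):
        start, stop = block_range(TOTAL, N_WORKERS, rank)
        print(rank, start, stop, stop - start)
```

With these numbers the division is exact: each of the 256 workers receives 105,468,750 calculations. In an MPI program the `rank` argument would typically come from the MPI runtime rather than from a loop.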

We wanted to make our algorithm available to the general scientific community, so we decided that a web server would be the best implementation. What we needed was a simple, user-friendly interface to the back-end algorithm. Transferring the user input (here, the coordinates of each atom in a protein) to the cluster required a great deal of pre-processing (data parsing, formatting, and error-checking). Likewise, the results returned by the cluster required post-processing before they could be sent (via email) to the user.
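The kind of parsing and error-checking described above can be sketched as follows. This example assumes the coordinates arrive as PDB-format ATOM/HETATM records (fixed columns, with x, y, and z in columns 31-54); the page does not state the input format, so that is an assumption, as are the names `Atom` and `parse_atoms`. It is written in Python for illustration and is not the server's actual pre-processing code.

```python
import math
from typing import NamedTuple


class Atom(NamedTuple):
    name: str
    residue: str
    chain: str
    x: float
    y: float
    z: float


def parse_atoms(lines):
    """Parse PDB-style coordinate records.

    Returns (atoms, errors), where errors is a list of
    (line_number, reason) pairs for records that were rejected.
    """
    atoms, errors = [], []
    for lineno, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line.startswith(("ATOM", "HETATM")):
            continue  # ignore headers, remarks, and other record types
        if len(line) < 54:
            errors.append((lineno, "record too short for coordinates"))
            continue
        try:
            # Columns 31-38, 39-46, 47-54 (1-based) hold x, y, z.
            x, y, z = (float(line[i:i + 8]) for i in (30, 38, 46))
        except ValueError:
            errors.append((lineno, "non-numeric coordinate"))
            continue
        if not all(math.isfinite(v) for v in (x, y, z)):
            errors.append((lineno, "non-finite coordinate"))
            continue
        atoms.append(Atom(line[12:16].strip(), line[17:20].strip(),
                          line[21], x, y, z))
    return atoms, errors
```

Collecting rejected records with their line numbers, rather than stopping at the first problem, makes it possible to report every issue back to the user in a single message.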

The entire system had to run autonomously (controlled via crontab scheduling). As the administrator of this setup, I was responsible for keeping the system up at all times. However, with thousands of lines of code, a crash would have been difficult to trace without extensive log files. I set these up to be easily parsable and had the system periodically email me a report on its "health".
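To show why easily parsable logs help here, the following sketch summarises a log into a short health report of the sort that could be emailed on a schedule. The tab-separated layout (timestamp, level, component, message) and the names `summarize_log` and `health_report` are hypothetical; the real log format is not described on this page, and the example is in Python purely for illustration.

```python
from collections import Counter


def summarize_log(lines):
    """Count entries per level and remember the most recent ERROR entry.

    Assumed (hypothetical) line format, tab-separated:
    timestamp, level, component, message
    """
    counts = Counter()
    last_error = None
    malformed = 0
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        parts = line.split("\t", 3)
        if len(parts) != 4:
            malformed += 1  # keep going, but report unparsable lines
            continue
        timestamp, level, component, message = parts
        counts[level] += 1
        if level == "ERROR":
            last_error = (timestamp, component, message)
    return counts, last_error, malformed


def health_report(lines):
    """Build a plain-text report body suitable for a periodic email."""
    counts, last_error, malformed = summarize_log(lines)
    healthy = counts["ERROR"] == 0 and malformed == 0
    body = ["status: " + ("OK" if healthy else "ATTENTION")]
    for level in sorted(counts):
        body.append(f"{level}: {counts[level]}")
    body.append(f"malformed lines: {malformed}")
    if last_error is not None:
        body.append("last error: " + " | ".join(last_error))
    return "\n".join(body)
```

A cron job could run a script like this against the latest log file and pass the resulting text to whatever mail command the host provides.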

The algorithm described above is implemented through a combination of Fortran77 and C code. The initial data (input) and results (output), however, pass through multiple pipes as a series of I/O streams handled by Perl, awk/gawk, sed, and bash scripts. These scripts are all driven by makefiles and make extensive use of regular expressions.

I would say that working with the command line interface (CLI) and scripting languages is my main strength. These skills have been developed through over seven years of active data mining through literally hundreds of terabytes of data in a wide array of formats and from multiple sources.

Note: This server remains up and running as of 22 February 2017.

Research topics

Results

The research I have done in the laboratory has so far yielded two published papers[1][2] and four web servers:

Version 1.0: Programmer, Server architect, and administrator; July 2005–December 2006.
Version 2.0: Programmer, Server architect, and administrator; January 2007–present.
Programmer, Server architect, and administrator; January 2005–present (note: This server uses code optimised and run in parallel on 256 processors).
Programmer, Server architect, and administrator; November 2004–present.
Server architect and administrator; September 2004–present.

CAPRI

I participated in "Round 6" (17-Jan-2005) and "Round 7" (29-May-2005) of CAPRI. Our group presented the "SmoothDock Server".

Benchmarks

See: Protein-Protein Docking Benchmark

Keywords

protein-protein interactions; protein-DNA interactions; docking

See also

References

  1. Camacho CJ, Ma H, Champ PC (2006). Scoring a diverse set of high-quality docked conformations: A metascore based on electrostatic and desolvation interactions. Proteins, 63(4):868-77. DOI:10.1002/prot.20932.
  2. Champ PC, Camacho CJ (2007). FastContact: a free energy scoring tool for protein-protein complex structures. Nucleic Acids Research (Web Issue). DOI:10.1093/nar/gkm326.

Further reading

  • Katchalski-Katzir E, Shariv I, Eisenstein M, Friesem AA, Aflalo C, Vakser IA (1992). Molecular surface recognition: Determination of geometric fit between proteins and their ligands by correlation techniques. Proc Natl Acad Sci USA, 89(6):2195-9.
  • Gray JJ, Moughon S, Wang C, Schueler-Furman O, Kuhlman B, Rohl CA, Baker D (2003). Protein-protein docking with simultaneous optimization of rigid-body displacement and side-chain conformations. J Mol Biol, 331(1):281-99.
  • Proteins: Structure, Function, and Genetics (special edition) Volume 52, Issue 1, 2003, all pages.
  • Bueno M, Camacho CJ, Sancho J (2007). SIMPLE estimate of the free energy change due to aliphatic mutations: superior predictions based on first principles. Proteins, 68(4):850-62; PMID: 17523191.
  • Bueno M, Camacho CJ (2007). Acidic groups docked to well defined wetted pockets at the core of the binding interface: A tale of scoring and missing protein interactions in CAPRI. Proteins, [Epub]; PMID: 17803211.

External links