
Title page for ETD etd-07102007-123722


Type of Document Dissertation
Author Brenner, Paul Raymond
Author's Email Address pbrenne1@nd.edu
URN etd-07102007-123722
Title Parallel Algorithms and Distributed Systems for Computational Biophysics
Degree Doctor of Philosophy
Department Computer Science and Engineering
Advisory Committee
  Advisor Name          Title
  Jonathan Sapirstein   Committee Chair
  Aaron Striegel        Committee Member
  Doug Thain            Committee Member
  Jeff Peng             Committee Member
  Jesus Izaguirre       Committee Member
Keywords
  • distributed systems
  • replica exchange
  • algorithms
  • computational biophysics
Date of Defense 2007-07-06
Availability restricted
Abstract
The understanding of atomic-scale biomolecular function is a key component in the prevention and treatment of disease. Computational biophysics has proven essential in this regard, accelerating the development and analysis of new biomolecular theories. The effective contribution of biophysical simulation is limited by the computational complexity of the existing models. In this work, new computationally efficient parallel algorithms and distributed system frameworks are developed to extend the capability of biophysical simulation. In tandem with this development, I present the simulation and analysis of a target protein domain linked to cancer, Huntington disease, and Alzheimer disease.

The Replica Exchange Method is a popular biomolecular sampling algorithm that utilizes multiple simulations (replicas) to more rapidly overcome energy landscape boundaries and accelerate sampling. The method has limitations in scale related to the size of the biomolecular system and the required number of replicas. I introduce a novel all-pairs exchange implementation of the algorithm that provides an asymptotically fourfold speedup of conformation traversal for replica counts of 8 and larger with typical exchange rates. Experimental tests with the blocked alanine dipeptide show a 100% sampling improvement according to potential energy averages and an ergodic measure. The cluster sampling rate for a target protein domain was nearly twice that of the single-exchange near-neighbor method. The method meets the detailed balance criterion for Monte Carlo methods and introduces no new parameterizations, biases, or heuristics.
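
For readers unfamiliar with replica exchange, the baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of the textbook temperature replica exchange Metropolis criterion with near-neighbor pairing, not the all-pairs implementation introduced in the dissertation. The function names, the energy units (kcal/mol), and the alternating even/odd pairing controlled by `offset` are assumptions made for this example.

```python
import math
import random

# Boltzmann constant in kcal/(mol*K); the unit choice is an assumption.
K_B = 0.0019872041


def exchange_probability(energy_i, temp_i, energy_j, temp_j):
    """Metropolis acceptance probability for swapping two replicas.

    Configuration i (potential energy energy_i) currently sits at temp_i,
    configuration j at temp_j. Accepting with probability
    min(1, exp[(beta_i - beta_j) * (E_i - E_j)]) satisfies detailed balance.
    """
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_neighbor_exchanges(energies, temps, rng, offset=0):
    """Attempt swaps between adjacent temperature levels (hypothetical helper).

    energies[c] is the potential energy of configuration c, which starts
    at temperature level c. Returns config_at, where config_at[k] is the
    configuration held at temperature level k after the sweep.
    offset=0 tries pairs (0,1), (2,3), ...; offset=1 tries (1,2), (3,4), ...
    """
    config_at = list(range(len(temps)))
    for k in range(offset, len(temps) - 1, 2):
        i, j = config_at[k], config_at[k + 1]
        p = exchange_probability(energies[i], temps[k], energies[j], temps[k + 1])
        if rng.random() < p:
            config_at[k], config_at[k + 1] = j, i
    return config_at


if __name__ == "__main__":
    rng = random.Random(0)
    temps = [300.0, 350.0, 400.0, 450.0]
    energies = [-12.0, -15.0, -9.0, -7.5]  # made-up illustrative values
    print(attempt_neighbor_exchanges(energies, temps, rng, offset=0))
```

In this baseline each sweep considers only adjacent temperature pairs, which is the "single exchange near neighbor" scheme the abstract refers to; how the all-pairs variant selects and accepts exchanges is described in the dissertation itself and is not reproduced here.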

The development of distributed systems for scientific computation is an active research field propelled by the growing number of research projects relying on computationally complex simulations as part of the discovery process. Many proposed frameworks have been successfully matched with unique applications to provide the computational capacity required. Only recently has more focus been targeted toward the efficient management of the distributed data. I introduce a ‘processing in network storage’ distributed system framework that efficiently couples computation with data management over heterogeneous, autonomous, and distributed resources. The framework provides a fault-tolerant, scalable, and bandwidth-conserving approach through the utilization of existing grid software utilities and a new hybrid database/filesystem developed with our collaborators. The performance is evaluated during the generation of 500 biomolecular simulations producing over 1 million output files distributed over volunteer resources.

The correlation of atomic-scale simulations with existing experimental techniques provides complementary data sets that cross-validate and more thoroughly map biomolecular motion of interest. This correlation, however, is complicated by the often disjoint nature of the observables accessible from simulation and experiment. In this work, biophysical simulations of the isomerase PIN1 WW domain reveal insight into promising reaction coordinates to help map simulation-observed recognition loop motion to experimental nuclear magnetic resonance (NMR) results. Post-processing analysis methods and metrics, including dihedral distributions, conformational clustering, hydrogen bond determination, and committor probability calculations, indicate that the observed motion of the arginine 12 residue is coupled to the multivariate conformational changes of the recognition loop.

Files
  Filename                      Size      Approximate Download Time (Hours:Minutes:Seconds)
                                          28.8 Modem   56K Modem   ISDN (64 Kb)   ISDN (128 Kb)   Higher-speed Access
  [campus] BrennerP072007.pdf   2.85 Mb   00:13:12     00:06:47    00:05:56       00:02:58        00:00:15
[campus] indicates that a file or directory is accessible from the campus network only.
