Background: There is significant demand in the life sciences for pipelines, or workflows, that chain a number of discrete compute- and data-intensive analysis tasks into sophisticated analysis procedures. This need has led to the development of general as well as domain-specific workflow environments that are either complex desktop applications or Internet-based applications. Complexities can arise when configuring these applications in heterogeneous compute and storage environments if the execution and data access models are not designed appropriately. These complexities manifest themselves through limited access to available HPC resources, significant overhead required to configure tools, and the inability of users to easily manage files across heterogeneous HPC storage infrastructure.
Results: In this paper, we describe the architecture of a software system that is adaptable to a range of pluggable execution and data backends, in an open source implementation called Yabi. With seamless and transparent access to heterogeneous HPC environments at its core, Yabi provides an analysis workflow environment that can create and reuse workflows as well as manage large amounts of both raw and processed data in a secure and flexible way across geographically distributed compute resources. Yabi can be used via a web-based environment to drag-and-drop tools to create sophisticated workflows. Yabi can also be accessed through the Yabi command line, which is designed for users who are more comfortable with writing scripts or for enabling external workflow environments to leverage the features in Yabi. Configuring tools can be a significant overhead in workflow environments. Yabi greatly simplifies this task by enabling system administrators to configure as well as manage running tools via a web-based environment, without the need to write or edit software programs or scripts. In this paper, we highlight Yabi's capabilities through a range of bioinformatics use cases that arise from large-scale biomedical data analysis.
Conclusion: The Yabi system encapsulates considered design of both execution and data models, while abstracting technical details away from users who are not skilled in HPC and providing an intuitive, scalable, web-based drag-and-drop workflow environment where the same tools can also be accessed via a command line. Yabi is currently in use and deployed at multiple institutions and is available at http://ccg.murdoch.edu.au/yabi.