Cloud-based scientific data management (storage, transfer, analysis, and inference extraction) is attracting interest. In this paper, we propose a next-generation cloud deployment model suited to data-intensive applications. Our model is a flexible, self-service, container-based infrastructure that delivers network, computing, and storage resources together with the logic to manage these components dynamically and holistically. We demonstrate the strength of our model with a bioinformatics application. Dynamic algorithms for resource provisioning and job allocation suited to the chosen data set are packaged and delivered in a privileged virtual machine as part of the container. We tested the model on our private internal experimental cloud, which is built on low-cost commodity hardware. We demonstrate the capability of our model to create the required network and computing resources and to allocate submitted jobs. The results show that increased automation both significantly shortens the time to complete a data analysis and reduces its cost. For a 15 GB analysis, the proposed algorithms reduced the cost of the analysis by 50%, and the total time between submitting a job and writing the results fell by more than 1 hour.