In recent years, the amount of data generated has been increasing exponentially. The data comes from many sources, such as machine logs, gene sequencing, sensor networks, network flows, and social media. Researchers in the education and research sector, in areas such as bioinformatics, computer science, astronomy, and environmental science, have huge data sets and would like to analyze them without worrying about their scale. There is therefore a growing demand for putting this data to work by storing and processing it in a horizontally scalable way. The same period has seen a rise in commercially backed, open source distributed software from major global actors such as Google, Yahoo, Twitter, and Facebook. This software uses commodity hardware to store big data and can process it locally, on the nodes where it is stored, thus providing good economies of scale.
The Data Analysis as a Service (DaaS) project will investigate the possibility of providing a common infrastructure where researchers can store their data and process it with advanced algorithms at large scale. In this way, we can contribute to building an ecosystem in which researchers can analyze their big data sets and share not just data but also whole processing pipelines. This will create good opportunities for collaboration among researchers across institutions and nations. Moreover, researchers can help each other evolve the ecosystem by adding new functionality, thereby improving both their research and the DaaS platform.