Project Report: SegAnnDB Feature Development

Introduction

I have been working on the SegAnnDB genomic segmentation web app as part of my GSoC project. SegAnnDB is a Python-based web app used for annotating chromosomes and DNA in order to identify copy number.

My GSoC project focused on improving the existing application and adding new features to it. This blog post summarizes my work on the project.

Proposal

The proposal application can be found here

Code Commits

To collaborate, I forked the original SegAnnDB code from here

During the course of my project I worked in these repositories:

  1. SegAnnDB - This is the main repository. All feature development was done here.

  2. SegAnnDB-login - This repository holds the code for the login module, which is also published on PyPI.

  3. SegAnnDB-docker - This repository contains the Dockerfiles and scripts for the SegAnnDB Docker images.

  4. SegAnnDB-tests - This repository contains the test files for SegAnnDB.

  5. Docker repository - This contains both of the Docker images I developed.

All of my commits to the main repository: https://github.com/abstatic/SegAnnDB/commits/master

My fork of the code also contains two additional branches, chrom-explorer-view and google-login. They are interim branches that I used to develop features, and all of their code has been merged into the master branch.

Work Done

I will cover the work done during my GSoC with the help of my blog posts, as I have written about every work item.

1. Introduction to SegAnnDB

This was during the community bonding period. The post covers an introduction to SegAnnDB and the plans for the summer.

2. Selenium for testing SegAnnDB

During this task I developed a Selenium-based test suite for SegAnnDB. Towards the end of the project I revisited the test suite and changed it to work with the OAuth2-based login system.

3. New Chromosome Viewer for SegAnnDB

This was one of the major tasks of the project and it took a considerable amount of time.

This detailed post covers how I conceptualized and designed the new chromosome viewer for SegAnnDB. It contains the design decisions and how they were going to be implemented.

It also covers all the changes made to the file keeping scheme of SegAnnDB.

I added code to split the images into smaller images so that they can be used in the new chromosome viewer.
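The splitting step can be pictured with a minimal sketch. This is not the actual SegAnnDB code: the function name `tile_boxes`, the fixed tile width, and the choice to split only along the horizontal axis are all assumptions made for illustration. It only computes the crop boxes for each smaller image, using the (left, upper, right, lower) convention that image libraries such as Pillow accept for cropping.

```python
def tile_boxes(image_width, image_height, tile_width):
    """Return (left, upper, right, lower) crop boxes covering an image from left to right.

    Hypothetical helper for illustration; the real SegAnnDB splitting code may differ.
    """
    if tile_width <= 0:
        raise ValueError("tile_width must be positive")
    boxes = []
    for left in range(0, image_width, tile_width):
        # The last tile may be narrower than tile_width.
        right = min(left + tile_width, image_width)
        boxes.append((left, 0, right, image_height))
    return boxes


# Example: a 2500 px wide, 200 px tall profile image cut into 1000 px wide tiles.
boxes = tile_boxes(2500, 200, 1000)
```

Each box could then be passed to an image library's crop call and saved as its own file, so the viewer only has to load the tiles it is currently showing.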

4. Further Improving the chromosome viewer

This post is about further improvements to the chromosome viewer of SegAnnDB. It describes the steps taken to make it more interactive.

I also added the ability to switch between the new chromosome viewer and the old one, in both directions. I made sure that both parts of the application worked as expected and that there were no bugs in them.

While working on the above tasks, I also cleaned up the Python code (to follow PEP 8) and the JS code.

5. Session vs Token Based Authorization

During the course of GSoC we found out that our current login system, based on Mozilla Persona, would be shut down this November. This forced us to immediately find an alternative for authentication in the application. Having worked with OAuth2 before, I chose to go that way.

This post covers the basics of authentication systems. I also learned a lot about how authentication works in modern web applications.
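The difference between the two approaches named in the title can be sketched in a few lines of standard-library Python. This is an illustration only, not SegAnnDB's login code and not OAuth2 itself: the names `SESSIONS`, `create_session`, `create_token`, and the hard-coded `SECRET_KEY` are all hypothetical, and expiry handling is left out. With sessions the server keeps the state and the client holds an opaque ID. With tokens the server keeps nothing and instead verifies a signature on the data the client sends back.

```python
import base64
import hashlib
import hmac
import json
import secrets

SECRET_KEY = b"demo-secret"  # hypothetical; a real app would load this from configuration

# Session-based: the server stores the state, the client only holds an opaque ID.
SESSIONS = {}


def create_session(user):
    session_id = secrets.token_hex(16)
    SESSIONS[session_id] = {"user": user}
    return session_id


def user_from_session(session_id):
    record = SESSIONS.get(session_id)
    return record["user"] if record else None


# Token-based: the server stores nothing, the client holds signed claims.
def create_token(user):
    payload = base64.urlsafe_b64encode(json.dumps({"user": user}).encode())
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + signature


def user_from_token(token):
    try:
        payload, signature = token.rsplit(".", 1)
    except ValueError:
        return None  # no separator, so not a token produced by create_token
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison; reject the token before decoding if it does not match.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))["user"]
```

Deleting an entry from `SESSIONS` logs a user out immediately, whereas a signed token stays valid until the key changes or an expiry check (omitted here) rejects it. That trade-off is the usual reason to weigh one approach against the other.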

During this work item I started experimenting with various plugins for OAuth2 login in a Pyramid web application. Some of them were Authomatic, Velruse, and pyramid_google_login.

6. SegAnnDB Login System

This post covers all the details of the new login system. As I could not find a satisfactory plugin for OAuth2 login, I forked the pyramid_google_login repository, made it compatible with SegAnnDB, and uploaded it to PyPI.

7. Docker for SegAnnDB

This was the last part of the project. We had always thought that installing SegAnnDB was tricky for new users, and we wanted to simplify that process. In this task I created two Docker images for SegAnnDB. One image uses code from Toby’s repository and the other contains code from my repository.

I also created Dockerfiles and wrapper scripts for the Docker images. They can be found at https://github.com/abstatic/seganndb-docker

The Docker images are hosted at https://hub.docker.com/u/abstatic/
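To give a feel for what a wrapper script around such an image does, here is a minimal Python sketch. It is not taken from the seganndb-docker repository: the function name, the image name `example/seganndb`, the container name, and port 8080 are placeholders chosen for illustration, and the real scripts may be written differently. The sketch only builds the `docker run` argument list, using the standard `-d`, `--name`, and `-p` options.

```python
def docker_run_command(image, host_port, container_port, name):
    """Build the argument list for starting a container in the background.

    All values passed in below are placeholders, not the real image name or port.
    """
    return [
        "docker", "run",
        "-d",                                   # run detached, in the background
        "--name", name,                         # give the container a fixed name
        "-p", f"{host_port}:{container_port}",  # publish the container port on the host
        image,
    ]


# Hypothetical values for illustration only.
command = docker_run_command("example/seganndb", 8080, 8080, "seganndb")
```

The resulting list can be handed to `subprocess.run(command, check=True)` on a machine with Docker installed. Keeping the command as a list rather than a single string avoids shell quoting problems.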

Video

This video gives a brief glimpse of the work done during my GSoC: https://www.youtube.com/watch?v=cXRxkDfHjtA

Work left to be done

Although a lot of new features were developed during this GSoC, there are still many missing features. Some of them are:

  1. Travis CI - We would like to have Travis CI builds for SegAnnDB, along with code coverage tools.
  2. Permission System
  3. More tests - Although the testing suite is in place, we need more tests.
  4. Sharing Annotations

Conclusion

During the course of my GSoC project, a lot of new features were introduced to SegAnnDB. With the completion of my project, new users of SegAnnDB will find it much easier to install and use. The new features should also lead to more development of SegAnnDB.

Moreover, it will improve collaboration between various researchers using SegAnnDB.

Although a lot remains to be done, I will remain in touch with my mentor and keep on improving this application.

Acknowledgements

I would like to express my sincere gratitude to my mentor Toby Hocking for his continuous support through the summer and for his patience, motivation and enthusiasm. I could not have imagined coming this far without his help.

By the end of this program I have also learnt a lot about how open source works and how to work with large codebases. This was a great opportunity for me.

Contact

In case of any queries, send an email to xabhishekflyhigh(at)gmail.com
