Jahangheer Shaik Research

Recent progress in biological research has led to the accumulation of large amounts of biomedical data. Dr. Shaik’s research focuses on designing applications that use automated computational methods to reduce complexity, to improve data-driven representation, integration, analysis, and interpretation, and to discover meaningful patterns and relationships in large biomedical datasets. Developing efficient and scalable data mining methods that uncover interesting patterns in disparate biological data sources requires knowledge that bridges multiple fields. Dr. Shaik’s training and expertise span Electrical Engineering, Bioinformatics, Biostatistics, Computer Science, and Biology, a combination that distinguishes him from the many well-qualified and highly trained researchers and professionals engaged in similar work.


The advent of next-generation sequencing technologies has required bioinformaticians to devise innovative strategies for handling the resulting data. Some commercial applications provide pipelines to analyze these data, but most are customized for the human and mouse genomes. Groups that cannot afford commercial software, that work with other genomes, or that want to use the latest methods for improved performance can build pipelines from the many freely available tools. Because these tools were developed by different groups, each tends to be limited to one particular paradigm or tailored to one particular technology. A user working with a different technology or paradigm must do additional work to adapt a tool to individual needs. As the community converges on standard data formats and more tools support multiple technologies, this problem is expected to diminish over time. At present, however, executing a pipeline requires some software development skills, especially for creating adapters that take the output of one tool and present it as input to the next. Different tuning parameters produce different results for each tool, and finding the set of tuning parameters that works best can itself be a challenge. Dr. Shaik has dealt extensively with these problems in analyzing high-throughput genomic data, including next-generation sequencing data.


Dr. Shaik’s current research involves employing bioinformatics strategies for understanding: i) mechanisms underlying sex in Leishmania, ii) virulence factors in Leishmania, iii) phylogenetic analysis to assess divergence in Leishmania, iv) chromatin landscape in Leishmania major, v) role of viruses in Leishmaniasis and vi) Toxoplasma gondii invasion, transmission, intracellular survival and virulence.