Distributed Framework for Gene Finding using OpenMPI
Background Information:
Why we need a gene finder that can run across multiple machines
Related Information:
https://www.genome.gov/genetics-glossary/Open-Reading-Frame
https://byjus.com/biology/difference-between-gene-and-dna/
Brief description of what you want to do, including why it is useful, the data and software you will use, and the software you will write, if any.
I will write a program that finds genes in a DNA/RNA sequence and runs on a multi-node cluster (multiple computers).
It will find all open reading frames (ORFs) in the sequence and use a user-defined standard to judge whether each ORF is or is not a gene.
An example standard:
ORF must begin with a start codon
ORF must contain at least 96 bp (32 amino acids)
ORF must contain a CpG site
A CpG site should appear in the first third of the sequence
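To make the example standard concrete, here is a minimal C++ sketch of a forward-strand ORF scan with a user-defined filter. It is only an illustration under stated assumptions: the names `Orf`, `OrfCriteria`, and `find_orfs` are hypothetical, the start codon is taken to be ATG and the stop codons TAA, TAG, and TGA, the 96 bp minimum is counted without the stop codon, and the CpG rule is read as "the dinucleotide CG occurs within the first third of the ORF". The reverse strand and RNA input (U instead of T) are not handled.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Half-open interval [begin, end) on the forward strand; end includes the stop codon.
struct Orf {
    std::size_t begin;
    std::size_t end;
};

// Hypothetical user-defined standard; field names and defaults are illustrative.
struct OrfCriteria {
    std::size_t min_length_bp = 96;          // 32 codons, stop codon not counted (assumption)
    bool require_cpg_in_first_third = true;  // one reading of the CpG rule (assumption)
};

// True if the codon starting at position i is TAA, TAG, or TGA.
inline bool is_stop(const std::string& s, std::size_t i) {
    return s.compare(i, 3, "TAA") == 0 || s.compare(i, 3, "TAG") == 0 ||
           s.compare(i, 3, "TGA") == 0;
}

// True if the dinucleotide "CG" lies entirely within the first third of [begin, end).
inline bool cpg_in_first_third(const std::string& s, std::size_t begin, std::size_t end) {
    const std::size_t limit = begin + (end - begin) / 3;
    for (std::size_t i = begin; i + 1 < limit; ++i) {
        if (s[i] == 'C' && s[i + 1] == 'G') return true;
    }
    return false;
}

// Scan the three forward reading frames and keep the ORFs that satisfy the criteria.
std::vector<Orf> find_orfs(const std::string& seq, const OrfCriteria& c) {
    std::vector<Orf> out;
    for (std::size_t frame = 0; frame < 3; ++frame) {
        bool open = false;       // are we currently inside an ORF?
        std::size_t start = 0;   // position of the ATG that opened it
        for (std::size_t i = frame; i + 3 <= seq.size(); i += 3) {
            if (!open) {
                if (seq.compare(i, 3, "ATG") == 0) {
                    open = true;
                    start = i;
                }
            } else if (is_stop(seq, i)) {
                const std::size_t end = i + 3;
                const bool long_enough = (i - start) >= c.min_length_bp;
                const bool cpg_ok = !c.require_cpg_in_first_third ||
                                    cpg_in_first_third(seq, start, end);
                if (long_enough && cpg_ok) out.push_back({start, end});
                open = false;
            }
        }
    }
    return out;
}
```

Keeping the standard in a separate struct means each rule can be relaxed or switched off without touching the scan itself, which is one way to support a user-defined standard.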
A detailed description of the related work.
You should search for research papers and projects that solve the same or a similar problem.
TODO: survey related research papers and projects.
Brief plan of action, including any insights you have, the various steps of the project, the software libraries or packages you will be using, and the software you will be developing on your own.
Plan of action:
Write a multithreaded library in C++ that reads and parses FASTA files
Implement an ORF finder using OpenMPI that can run on a multi-node cluster