DATABASES | Federal Research Centre "Fundamentals of Biotechnology"

Website with bacterial gene annotations
The site lists the most likely biological functions of the nucleotide sequence under study. At the same number of false positives, the effectiveness of its annotations is about 19% higher than that of all existing methods (http://genefunction.ru)

Database of periodic DNA regions in major genomes
The database contains information on regions with different types of periodicity in various genomes. In eukaryotic genomes, these regions occupy ~8% of the genome on average (http://victoria.biengi.ac.ru/cgi-bin/indelper/index.cgi)

Database of potential reading frame shifts in coding sequences
The database contains information about potential reading frame shift mutations in a variety of coding sequences (CDS) from eukaryotic genomes. On average, about 23% of CDS contain such mutations (http://victoria.biengi.ac.ru/cgi-bin/frameshift/index.cgi)

Website for finding potential reading frame shifts in CDS
The server finds potential reading frame shift mutations in any CDS (http://victoria.biengi.ac.ru/fsfinder/)
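For readers unfamiliar with the term, the short sketch below shows what a reading frame shift does to a coding sequence. It is only an illustration of the concept and not the detection method used by the server; the function names and the toy sequence are hypothetical.

```python
from typing import List


def codons(cds: str, frame: int = 0) -> List[str]:
    """Split a coding sequence into complete codons, starting at `frame`."""
    return [cds[i:i + 3] for i in range(frame, len(cds) - 2, 3)]


def delete_base(cds: str, pos: int) -> str:
    """Return the sequence with the base at 0-based position `pos` removed."""
    return cds[:pos] + cds[pos + 1:]


# Toy sequence (hypothetical): four codons, ending in a stop codon.
original = "ATGGCCAAATGA"
# Deleting one base changes every codon downstream of the deletion.
shifted = delete_base(original, 4)

print(codons(original))
print(codons(shifted))
```

After the deletion, all codons downstream of the deleted base are read in a different frame, so the stop codon of the toy sequence is no longer read in frame.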

Database of potential promoter sequences in the rice genome
The database is located at: http://victoria.biengi.ac.ru/cgi-bin/dbPPS/index.cgi
The database contains over 150 thousand potential promoter sequences. Its creation became possible because a new mathematical method for constructing multiple alignments of nucleotide sequences was developed. Potential applications include biotechnology and genetic engineering.
There are no similar databases, because previously developed mathematical algorithms cannot identify these promoter sequences. Created by the staff of the group for mathematical analysis of DNA and protein sequences (Head: Dr. E.V. Korotkov).

Rice Genome SINE Repeat Database
The database is located at: http://victoria.biengi.ac.ru/sinerice/
The database contains tens of thousands of new and known SINE repeats from 45 different families. Its creation became possible only because a new mathematical method for constructing multiple alignments of nucleotide sequences was developed.
Potential applications include biotechnology and genetic engineering.
There are no similar databases for the rice genome, because previously developed mathematical algorithms cannot detect new, previously unknown SINE sequences. Created by the staff of the group for mathematical analysis of DNA and protein sequences (Head: Dr. E.V. Korotkov).

Database of dispersed repeats in plant genomes
The database contains dispersed repeats found in the genomes of Arabidopsis thaliana, Capsicum annuum, Daucus carota, Oryza sativa and Zea mays. The repeats were detected with the IP method, which is based on optimization of position-weight matrices and two-dimensional dynamic programming and can detect repeats with weak similarity. The method is described in detail in the publications: https://doi.org/10.3390/ijms241310964 and https://doi.org/10.3390/ijms25084441.
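
As a rough illustration of the two ingredients named above, the sketch below aligns a sequence against a position-weight matrix with two-dimensional dynamic programming. It is a generic textbook-style global alignment, not the IP method from the cited publications; the matrix values, the gap penalty and the function names are assumptions made for the example.

```python
from typing import Dict, List

PWM = List[Dict[str, float]]


def toy_pwm(consensus: str, hit: float = 1.0, miss: float = -1.0) -> PWM:
    """Build a hypothetical matrix: one column of base weights per position."""
    return [{b: (hit if b == c else miss) for b in "ACGT"} for c in consensus]


def align_to_pwm(seq: str, pwm: PWM, gap: float = -2.0) -> float:
    """Best global alignment score of `seq` against the columns of `pwm`."""
    n, m = len(seq), len(pwm)
    # score[i][j]: best score for seq[:i] aligned with the first j columns
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap  # bases with no matrix column
    for j in range(1, m + 1):
        score[0][j] = j * gap  # matrix columns with no base
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Unknown symbols (e.g. N) are scored like a gap (assumption).
            weight = pwm[j - 1].get(seq[i - 1], gap)
            score[i][j] = max(
                score[i - 1][j - 1] + weight,  # base placed in column j
                score[i - 1][j] + gap,         # extra base in the sequence
                score[i][j - 1] + gap,         # column skipped
            )
    return score[n][m]


pwm = toy_pwm("ACGT")
print(align_to_pwm("ACGT", pwm))  # exact copy of the consensus
print(align_to_pwm("ACT", pwm))   # copy with one base missing
```

Because insertions and deletions are handled by the gap terms, a diverged copy of a repeat can still obtain a positive score against the matrix, which is the general reason this kind of alignment suits repeats with weak similarity.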