Operons provides two types of information for each genome: (i) all its operons and (ii) all its directon gene pairs (adjacent genes on the same strand with no intervening gene transcribed on the opposite strand). For each operon, the database provides the gene names, gene GIs, and gene locus tags. For each directon gene pair, Operons reports whether the two genes are predicted to be in the same operon, together with the estimated confidence of that prediction. Values near 1 (operonic genes) or near 0 (non-operonic genes) are high-confidence predictions, whereas values near 0.5 are low-confidence predictions. The Operons database is implemented with MySQL as the database management system, Tomcat as the web server, and Java Servlets, CSS, and JavaScript for the dynamic web pages. The operon prediction method is based on:

  • The intergenic distance between contiguous genes, that is, how far apart they are in nucleotides.
  • Their STRING functional relationship value or, alternatively, a probability based on whether they are conserved near each other in other genomes.
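The directon gene pairs and intergenic distances described above can be illustrated with a minimal Python sketch. It is not the Operons implementation: the `Gene` record, the `directon_pairs` function, and the coordinate convention (1-based, inclusive, so the distance is `next.start - previous.end - 1`) are assumptions made for illustration, and the sketch ignores the wrap-around pair on a circular chromosome.

```python
from typing import List, NamedTuple, Tuple


class Gene(NamedTuple):
    """Hypothetical gene record; coordinates assumed 1-based and inclusive."""
    locus_tag: str
    start: int
    end: int
    strand: str  # "+" or "-"


def directon_pairs(genes: List[Gene]) -> List[Tuple[str, str, int]]:
    """Return (locus_tag1, locus_tag2, intergenic_distance) for every pair of
    neighbouring genes that lie on the same strand."""
    ordered = sorted(genes, key=lambda g: g.start)
    pairs = []
    for first, second in zip(ordered, ordered[1:]):
        # Neighbours in genome order have no gene between them, so a shared
        # strand is enough to make them a directon pair.
        if first.strand == second.strand:
            # Assumed convention: a negative distance means the genes overlap.
            distance = second.start - first.end - 1
            pairs.append((first.locus_tag, second.locus_tag, distance))
    return pairs


example = [
    Gene("g1", 1, 100, "+"),
    Gene("g2", 151, 300, "+"),
    Gene("g3", 400, 500, "-"),
    Gene("g4", 520, 600, "-"),
]
pairs = directon_pairs(example)
```

Here `g2` and `g3` are neighbours but lie on opposite strands, so they do not form a directon pair and no prediction would be made for them.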

Pairs of genes from the organism

The operon information for this option is presented in a table in which each pair of adjacent genes on the same strand is predicted either to be in the same operon or not. The table includes the intergenic distance between the genes (how far apart they are in nucleotides), their relationship value (either their STRING relationship value or their neighborhood conservation probability), and a confidence value, which is an estimate of the probability that the classification of the pair is correct. In the example below, genes 1 and 2 are predicted to belong to the same operon with a confidence of 94%, while genes 2 and 3 are predicted not to belong to the same operon with a confidence of 89%.

SysName1  SysName2  Name1  Name2  Distance  Relation Value  Classification  Confidence
gene 1    gene 2    GeneA  GeneB  50        980             operon          94%
gene 2    gene 3    GeneB  GeneC  90        300             no-operon       89%

Column Description:

SysName1: Systematic name of the first gene
SysName2: Systematic name of the second gene
Name1: Ordinary name of the first gene
Name2: Ordinary name of the second gene
Distance: Intergenic distance, that is, how far apart the genes are in nucleotides
Relation Value: STRING value or co-occurrence neighborhood probability
Classification: Whether the gene pair belongs to the same operon or not
Confidence: Confidence value of the classification
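The sketch below shows one way the columns could be held in a record, and how a single same-operon probability might map onto the Classification and Confidence columns. It is only an assumption that the displayed values are derived this way: the 0.5 cutoff, the `PairPrediction` class, and the `make_row` helper are hypothetical and are not taken from the Operons code.

```python
from dataclasses import dataclass


@dataclass
class PairPrediction:
    """One table row; field names mirror the columns described above."""
    sys_name1: str
    sys_name2: str
    name1: str
    name2: str
    distance: int            # intergenic distance in nucleotides
    relation_value: float    # STRING value or neighborhood probability
    classification: str      # "operon" or "no-operon"
    confidence: float        # confidence of the classification, 0..1


def make_row(sys_name1: str, sys_name2: str, name1: str, name2: str,
             distance: int, relation_value: float,
             p_same_operon: float) -> PairPrediction:
    """Build a row from a same-operon probability (assumed 0.5 cutoff)."""
    if not 0.0 <= p_same_operon <= 1.0:
        raise ValueError("p_same_operon must lie between 0 and 1")
    if p_same_operon >= 0.5:
        classification, confidence = "operon", p_same_operon
    else:
        # A value near 0 is a confident "no-operon" call.
        classification, confidence = "no-operon", 1.0 - p_same_operon
    return PairPrediction(sys_name1, sys_name2, name1, name2,
                          distance, relation_value,
                          classification, confidence)


row_a = make_row("gene 1", "gene 2", "GeneA", "GeneB", 50, 980, 0.94)
row_b = make_row("gene 2", "gene 3", "GeneB", "GeneC", 90, 300, 0.11)
print(f"{row_a.classification} {row_a.confidence:.0%}")
print(f"{row_b.classification} {row_b.confidence:.0%}")
```

Under this reading, a probability close to 0.5 gives a confidence close to 50% whichever class is reported, which matches the description of low-confidence predictions.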

The program analyses co-occurrence only for genes that have no STRING functional value, that are transcribed in the same direction, and that are separated by four or fewer genes. An artificial neural network estimates the confidence from the intergenic distance and the STRING functional value. Each pair of genes is analyzed separately.
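The three conditions for co-occurrence analysis can be written as a small predicate. This is a sketch of one reading of the rule, not the program's code: the function name, its parameters, and the use of `None` to mean "no STRING value" are assumptions for illustration.

```python
from typing import Optional

# Limit stated in the text: genes separated by four or fewer genes.
MAX_GENES_BETWEEN = 4


def needs_cooccurrence_analysis(string_value: Optional[float],
                                strand1: str,
                                strand2: str,
                                genes_between: int) -> bool:
    """Return True when a gene pair meets all three stated conditions."""
    has_no_string_value = string_value is None
    same_direction = strand1 == strand2
    close_enough = 0 <= genes_between <= MAX_GENES_BETWEEN
    return has_no_string_value and same_direction and close_enough
```

For example, a pair with no STRING value, both genes on the "+" strand, and two genes between them would qualify, whereas the same pair with a STRING value of 980 would not.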