We are designing a system that evaluates changes in splice site strength using information theory-based models of donor and acceptor splice sites, and that predicts the formation and relative use of cryptic splice sites.

The prediction of cryptic sites is based on the total exon splicing information content, which is the sum of the information content of the acceptor site, the information content of the donor site, and a gap surprisal term that depends on the length of the exon.

Ri total* = Ri acceptor + Ri donor + Gap Surprisal.

Gap Surprisal is a penalty assigned according to the length of the exon. In the tables below it appears as a negative value.
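
As a minimal sketch of the sum above, the following Python snippet adds the three terms for one row of the tables that follow. The function name `ri_total` and its parameter names are hypothetical and chosen only for illustration. The gap surprisal is taken as a given input here, because this page does not state how it is derived from the exon length.

```python
def ri_total(ri_acceptor: float, ri_donor: float, gap_surprisal: float) -> float:
    """Total exon splicing information in bits (hypothetical helper name).

    gap_surprisal is passed in as-is; it is a penalty, so it is
    normally zero or negative.
    """
    return ri_acceptor + ri_donor + gap_surprisal


# Values from the natural-site row (Relative_dist 0) of the IDS IVS3-2A>G table:
# acceptor Ri_site = 4.5, natural donor Ri = 11.1, gap surprisal = -0.5
example = ri_total(ri_acceptor=4.5, ri_donor=11.1, gap_surprisal=-0.5)
print(round(example, 1))  # 15.1
```

The result matches the Ri_total of 15.1 reported for that row.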

The mutations listed below are taken from the paper "Information Analysis of Human Splice Site Mutations".

(* The definition will be updated in the future to incorporate other splicing regulatory sequences.)


Gene: IDS

The mutation taking place is IVS3-2A>G, 146477843 acceptor M58342
 

Ri total initial is
Pos: 146477843             Ri_total: 23.3

After the mutation takes place, the values are as follows:

Ri content After Mutation Taking place

Relative_dist position Ri_site Ri_natural (donor) nat_pos (donor) gap_surprisal Ri_total ∆Ri_Total Min. Fold Change
-51 146477792 0.8 11.1 146477753 -1.5 10.4 4.7 0.04
-1 146477842 2.5 11.1 146477753 -0.5 13.1 2.0 0.25
-37 146477806 4.4 11.1 146477753 -1.1 14.4 0.7 0.62
0 146477843 4.5 11.1 146477753 -0.5 15.1 NA NA
-2 146477841 0.3 11.1 146477753 -0.4 11.1 4.0 0.06
+1 146477844 16.5 11.1 146477753 -0.2 27.4 12.3 0.00
+2 146477845 -0.5 11.1 146477753 -0.6 10.1 5.0 0.03
-40 146477803 -0.9 11.1 146477753 -1.2 9.0 6.1 0.01

 

Mutation: IVS6-2A>G 146467246 acceptor M58342
Ri total initial is
Pos: 146467246           Ri_total: 21.3

After the mutation takes place, the values are as follows:


Ri content After Mutation Taking place
Relative_dist position Ri_site Ri_natural  (donor) nat_pos (donor) gap_surprisal Ri_total ∆Ri_Total Min. Fold Change
+6 146467252 1.8 9.0 146467118 -0.6 10.2 2.7 0.15
-43 146467203 -1.1 9.0 146467118 -0.2 7.8 5.1 0.03
-51 146467195 5.9 9.0 146467118 -0.6 14.3 1.4 0.38
+3 146467249 2.2 9.0 146467118 -0.6 10.6 2.3 0.20
-37 146467209 -2.2 9.0 146467118 -0.2 6.6 6.3 0.01
0 146467246 4.4 9.0 146467118 -0.5 12.9 NA NA
+7 146467253 6.7 9.0 146467118 -0.6 15.1 2.2 0.22
-42 146467204 -1.4 9.0 146467118 -0.6 7.1 5.8 0.02
-3 146467243 -0.1 9.0 146467118 -0.5 8.4 4.5 0.04

 

Mutation: IVS7-1G>C 146463904 acceptor M58342
Ri total initial is
Pos: 146463904           Ri_total: 7.5

After the mutation takes place, the values are as follows:


Ri content After Mutation Taking place
Relative_dist position Ri_site Ri_natural (donor) nat_pos (donor) gap_surprisal Ri_total ∆Ri_Total Min. Fold Change
+6 146463910 0.1 4.5 146463729 -1.2 3.4 0.1 0.93
+3 146463907 -1.8 4.5 146463729 -1.1 1.6 1.9 0.27
+17 146463921 1.3 4.5 146463729 -1.7 4.2 0.7 0.62
0 146463904 0.0 4.5 146463729 -1.1 3.5 NA NA
+7 146463911 2.5 4.5 146463729 -1.8 5.3 1.8 0.29
-12 146463892 1.4 4.5 146463729 -0.9 5.0 1.5 0.35

Mutation: IVS7-8T>G 146463904 acceptor M58342
 

Ri total initial is
Pos: 146463904             Ri_total: 7.5

After the mutation takes place, the values are as follows:


Ri content After Mutation Taking place
Relative_dist position Ri_site Ri_natural (donor) nat_pos (donor) gap_surprisal Ri_total ∆Ri_Total Min. Fold Change
+6 146463910 -0.6 4.5 146463729 -1.2 2.7 3.2 0.11
+17 146463921 1.3 4.5 146463729 -1.7 4.2 1.7 0.31
+3 146463907 0.0 4.5 146463729 -1.1 3.4 2.5 0.18
0 146463904 2.4 4.5 146463729 -1.1 5.9 NA NA
+7 146463911 11.3 4.5 146463729 -1.8 14.1 8.2 0.00
-12 146463892 -1.3 4.5 146463729 -0.9 2.3 3.6 0.08

 Gene: F9

type is acceptor
The mutation taking place is IVS4-2A>G, 136575301 acceptor M11309
 

Ri total initial is
Pos: 136575301 Ri_total: 12.98

After the mutation takes place, the values are as follows:


Ri content After Mutation Taking place
Relative_dist position Ri_site Ri_natural (donor) nat_pos (donor) gap_surprisal Ri_total ∆Ri_Total Min. Fold Change
+28 136575329 -3.7 2.4 136575431 -0.5 -1.7 6.5 0.01
+7 136575308 -2.9 2.4 136575431 -0.5 -1.0 5.8 0.02
-10 136575291 -3.6 2.4 136575431 -0.5 -1.6 6.4 0.01
-19 136575282 -6.4 2.4 136575431 -0.7 -4.6 9.4 0.00
+2 136575303 0.4 2.4 136575431 -0.3 2.5 2.3 0.20
+38 136575339 -6.5 2.4 136575431 -0.2 -4.3 9.1 0.00
-1 136575300 4.5 2.4 136575431 -0.4 6.6 1.8 0.29
+3 136575304 -1.3 2.4 136575431 -0.7 0.5 4.3 0.05
+6 136575307 -0.6 2.4 136575431 -0.6 1.2 3.6 0.08
+9 136575310 -1.3 2.4 136575431 -0.6 0.5 4.3 0.05
0 136575301 3.0 2.4 136575431 -0.6 4.8 NA NA
-14 136575287 -6.7 2.4 136575431 -0.8 -5.1 9.9 0.00
-2 136575299 -5.9 2.4 136575431 -0.6 -4.1 8.9 0.00
+20 136575321 -5.0 2.4 136575431 -0.2 -2.7 7.5 0.01

Mutation: IVS4-1G>C 136575301 acceptor M11309
 

Ri total initial is
Pos: 136575301             Ri_total: 12.98

After the mutation takes place, the values are as follows:


Ri content After Mutation Taking place
Relative_dist position Ri_site Ri_natural (donor) nat_pos (donor) gap_surprisal Ri_total ∆Ri_Total Min. Fold Change
+28 136575329 -3.7 2.4 136575431 -0.5 -1.7 7.4 0.01
+7 136575308 -1.3 2.4 136575431 -0.5 0.6 5.1 0.03
-10 136575291 -3.6 2.4 136575431 -0.5 -1.6 7.3 0.01
-19 136575282 -6.4 2.4 136575431 -0.7 -4.6 10.3 0.00
+52 136575353 -5.2 2.4 136575431 -0.6 -3.4 9.1 0.00
+2 136575303 6.2 2.4 136575431 -0.3 8.3 2.6 0.16
+38 136575339 -6.5 2.4 136575431 -0.2 -4.3 10.0 0.00
+6 136575307 1.4 2.4 136575431 -0.6 3.2 2.5 0.18
-1 136575300 -5.0 2.4 136575431 -0.4 -2.9 8.6 0.00
+3 136575304 -0.8 2.4 136575431 -0.7 0.9 4.8 0.04
-14 136575287 -6.7 2.4 136575431 -0.8 -5.0 10.7 0.00
+9 136575310 -0.1 2.4 136575431 -0.6 1.7 4.0 0.06
0 136575301 3.9 2.4 136575431 -0.6 5.7 NA NA
+20 136575321 -3.3 2.4 136575431 -0.2 -1.0 6.7 0.01

Mutation: IVS5+13A>G 136575431 donor M11309
 

Ri total initial is
Pos: 136575431             Ri_total: 12.98

After the mutation takes place, the values are as follows:


Ri content After Mutation Taking place
Relative_dist position Ri_site Ri_natural (acceptor) nat_pos (acceptor) gap_surprisal Ri_total ∆Ri_Total Min. Fold Change
-20 136575411 -8.3 11.2 136575301 -0.4 2.5 10.8 0.00
+22 136575453 -6.8 11.2 136575301 -0.8 3.6 9.7 0.00
+46 136575477 -5.5 11.2 136575301 -1.4 4.3 9.0 0.00
+41 136575472 -6.5 11.2 136575301 -1.3 3.4 9.9 0.00
+12 136575443 3.7 11.2 136575301 -0.5 14.4 1.1 0.47
+8 136575439 -0.9 11.2 136575301 -0.8 9.5 3.8 0.07
+27 136575458 -2.0 11.2 136575301 -0.8 8.4 4.9 0.03
0 136575431 2.4 11.2 136575301 -0.3 13.3 NA NA
-14 136575417 -9.5 11.2 136575301 -0.5 1.2 12.1 0.00
+39 136575470 -7.7 11.2 136575301 -1.0 2.4 10.9 0.00
-17 136575414 -5.0 11.2 136575301 -0.5 5.7 7.6 0.01
+10 136575441 -10.2 11.2 136575301 -0.8 0.2 13.1 0.00
+35 136575466 -0.9 11.2 136575301 -1.3 9.0 4.3 0.05
+33 136575464 -7.8 11.2 136575301 -0.9 2.5 10.8 0.00
-4 136575427 -2.6 11.2 136575301 -0.6 7.9 5.4 0.02
-12 136575419 1.8 11.2 136575301 -0.2 12.8 0.5 0.71
-16 136575415 -10.2 11.2 136575301 -0.5 0.5 12.8 0.00
 

Gene: ADA

 IVS 10 acceptor G->A 35066 35099

The mutation taking place is IVS10-34G>A

Ri total initial is
Pos: 43887472             Ri_total: 19.4

After the mutation takes place, the values are as follows:


Ri content After Mutation Taking place
Relative_dist position Ri_site Ri_natural (donor) nat_pos (donor) gap_surprisal Ri_total ∆Ri_Total Min. Fold Change
+22 43887494 -0.3 9.6 43887368 -0.6 8.7 10.5 0.00
0 43887472 10.0 9.6 43887368 -0.5 19.2 NA NA
+25 43887497 -0.8 9.6 43887368 -0.7 8.2 11.0 0.00
+10 43887482 0.8 9.6 43887368 -0.5 9.9 9.3 0.00
+32 43887504 9.3 9.6 43887368 -0.4 18.4 0.8 0.57
+26 43887498 4.8 9.6 43887368 -0.3 14.1 5.1 0.03
+33 43887505 0.0 9.6 43887368 -0.8 8.9 10.3 0.00

Gene: PAH

IVS10, acceptor, G>A, 76 86
 

The mutation taking place is IVS10-11G>A, 103170505 acceptor U49897
 

Ri total initial is
Pos: 103170505             Ri_total: 14.9

After the mutation takes place, the values are as follows:


Ri content After Mutation Taking place
Relative_dist position Ri_site Ri_natural (donor) nat_pos (donor) gap_surprisal Ri_total ∆Ri_Total Min. Fold Change
+30 103170535 1.2 10.4 103170370 -1.3 10.3 4.3 0.05
-15 103170490 -2.0 10.4 103170370 -0.6 7.8 6.8 0.01
-36 103170469 -2.0 10.4 103170370 -0.4 8.0 6.6 0.01
-13 103170492 2.2 10.4 103170370 -0.6 12.0 2.6 0.16
-39 103170466 2.1 10.4 103170370 -0.6 11.9 2.7 0.15
-24 103170481 -6.7 10.4 103170370 -0.5 3.2 11.4 0.00
+2 103170507 -5.5 10.4 103170370 -0.8 4.1 10.5 0.00
-3 103170502 0.0 10.4 103170370 -0.6 9.8 4.8 0.04
-18 103170487 -3.3 10.4 103170370 -0.6 6.6 8.0 0.00
-37 103170468 -3.9 10.4 103170370 -0.5 6.0 8.6 0.00
0 103170505 4.8 10.4 103170370 -0.6 14.6 NA NA
+9 103170514 2.7 10.4 103170370 -0.9 12.2 2.4 0.19
+10 103170515 0.0 10.4 103170370 -0.5 9.9 4.7 0.04
-5 103170500 -4.7 10.4 103170370 -0.3 5.4 9.2 0.00
-42 103170463 -2.0 10.4 103170370 -0.6 7.9 6.7 0.01
+13 103170518 -6.1 10.4 103170370 -0.5 3.9 10.7 0.00

Gene: HBB

accession V00497

Mutation: IVS1-15T>G 5207067 acceptor V00497
 

Ri total initial is
Pos: 5207067             Ri_total: 16.5

After the mutation takes place, the values are as follows:


Ri content After Mutation Taking place
Relative_dist position Ri_site Ri_natural (donor) nat_pos (donor) gap_surprisal Ri_total ∆Ri_Total Min. Fold Change
-30 5207037 1.0 9.7 5206843 -1.9 8.8 6.0 0.02
-8 5207059 -5.5 9.7 5206843 -2.5 1.7 13.1 0.00
-22 5207045 -5.2 9.7 5206843 -1.9 2.6 12.2 0.00
-28 5207039 2.7 9.7 5206843 -1.7 10.7 4.1 0.06
+8 5207075 0.0 9.7 5206843 -2.6 7.1 7.7 0.00
+1 5207068 -5.7 9.7 5206843 -2.8 1.2 13.6 0.00
-7 5207060 1.9 9.7 5206843 -2.4 9.2 5.6 0.02
+2 5207069 0.0 9.7 5206843 -2.6 7.1 7.7 0.00
+27 5207094 -4.4 9.7 5206843 -3.4 1.8 13.0 0.00
+6 5207073 -0.9 9.7 5206843 -2.9 5.9 8.9 0.00
-1 5207066 -3.0 9.7 5206843 -2.6 4.1 10.7 0.00
+43 5207110 2.0 9.7 5206843 -3.5 8.2 6.6 0.01
+39 5207106 -6.1 9.7 5206843 -3.7 -0.1 14.9 0.00
+19 5207086 -3.3 9.7 5206843 -3.1 3.2 11.6 0.00
-4 5207063 -2.9 9.7 5206843 -2.4 4.4 10.4 0.00
-16 5207051 -7.1 9.7 5206843 -2.1 0.5 14.3 0.00
+4 5207071 -6.7 9.7 5206843 -2.9 0.1 14.7 0.00
+14 5207081 5.1 9.7 5206843 -2.9 11.8 3.0 0.12
+22 5207089 -5.8 9.7 5206843 -3.4 0.5 14.3 0.00
+7 5207074 0.0 9.7 5206843 -2.9 6.7 8.1 0.00
-10 5207057 -5.8 9.7 5206843 -2.1 1.8 13.0 0.00
-21 5207046 -4.1 9.7 5206843 -2.1 3.5 11.3 0.00
+37 5207104 -6.5 9.7 5206843 -3.7 -0.6 15.4 0.00
-11 5207056 0.0 9.7 5206843 -2.5 7.2 7.6 0.01
0 5207067 7.7 9.7 5206843 -2.5 14.8 NA NA
+20 5207087 1.4 9.7 5206843 -3.1 8.0 6.8 0.01

Mutation: IVS2-2A>G 5205994 acceptor V00497
 

Ri total initial is
Pos: 5205994             Ri_total: -16.4

After the mutation takes place, the values are as follows:


Ri content After Mutation Taking place
Relative_dist position Ri_site Ri_natural (donor) nat_pos (donor) gap_surprisal Ri_total ∆Ri_Total Min. Fold Change
-30 5205964 -4.5 -25.9 5205732 -2.6 -33.0 8.8 0.00
-8 5205986 -5.7 -25.9 5205732 -3.4 -35.1 10.9 0.00
+1 5205995 2.9 -25.9 5205732 -3.7 -26.7 2.5 0.18
-7 5205987 0.3 -25.9 5205732 -3.3 -29.0 4.8 0.04
+2 5205996 7.0 -25.9 5205732 -3.7 -22.7 1.5 0.35
+27 5206021 -5.8 -25.9 5205732 -3.2 -34.9 10.7 0.00
-43 5205951 -5.3 -25.9 5205732 -2.6 -33.9 9.7 0.00
+6 5206000 -4.4 -25.9 5205732 -3.2 -33.6 9.4 0.00
-5 5205989 -7.0 -25.9 5205732 -3.8 -36.8 12.6 0.00
-12 5205982 -5.8 -25.9 5205732 -3.1 -34.8 10.6 0.00
+4 5205998 -3.9 -25.9 5205732 -3.7 -33.6 9.4 0.00
-6 5205988 3.0 -25.9 5205732 -3.1 -26.0 1.8 0.29
+28 5206022 -2.3 -25.9 5205732 -3.6 -31.8 7.6 0.01
+5 5205999 -5.7 -25.9 5205732 -3.5 -35.2 11.0 0.00
+22 5206016 -4.9 -25.9 5205732 -3.9 -34.8 10.6 0.00
+7 5206001 -7.2 -25.9 5205732 -3.8 -37.0 12.8 0.00
-44 5205950 -5.9 -25.9 5205732 -2.7 -34.6 10.4 0.00
-36 5205958 -6.9 -25.9 5205732 -2.6 -35.5 11.3 0.00
-39 5205955 -4.5 -25.9 5205732 -2.6 -33.1 8.9 0.00
-3 5205991 -5.0 -25.9 5205732 -3.1 -34.1 9.9 0.00
-18 5205976 -3.9 -25.9 5205732 -3.1 -32.9 8.7 0.00
+3 5205997 -3.1 -25.9 5205732 -3.3 -32.3 8.1 0.00
-11 5205983 -3.9 -25.9 5205732 -3.4 -33.3 9.1 0.00
0 5205994 5.2 -25.9 5205732 -3.4 -24.2 NA NA
-49 5205945 -0.2 -25.9 5205732 -2.5 -28.6 4.4 0.05
+20 5206014 -3.8 -25.9 5205732 -3.4 -33.2 9.0 0.00
-47 5205947 -4.2 -25.9 5205732 -2.5 -32.6 8.4 0.00
+13 5206007 -3.6 -25.9 5205732 -3.8 -33.4 9.2 0.00

Gene: CFTR 
 

type is donor
The mutation taking place is IVS13+1G>A, 116771195 donor M28668
 

Ri total initial is
Pos: 116771195             Ri_total: 20.1

After the mutation takes place, the values are as follows:


Ri content After Mutation Taking place
Relative_dist position Ri_site Ri_natural (acceptor) nat_pos (acceptor) gap_surprisal Ri_total ∆Ri_Total Min. Fold Change
-29 116771166 -4.8 11.7 116771107 -1.2 5.7 1.7 0.31
-7 116771188 -5.1 11.7 116771107 -0.6 5.9 1.5 0.35
-14 116771181 -4.3 11.7 116771107 -0.6 6.8 0.6 0.66
-5 116771190 -4.2 11.7 116771107 -0.5 6.9 0.5 0.71
+4 116771199 -4.2 11.7 116771107 -0.6 6.9 0.5 0.71
-20 116771175 -5.0 11.7 116771107 -0.8 5.8 1.6 0.33
0 116771195 -3.9 11.7 116771107 -0.4 7.4 NA NA

Mutation: IVS2-3C>T 116689788 acceptor M28668

Ri total initial is
Pos: 116689788             Ri_total: 21.2

After the mutation takes place, the values are as follows:


Ri content After Mutation Taking place
Relative_dist position Ri_site Ri_natural (donor) nat_pos (donor) gap_surprisal Ri_total ∆Ri_Total Min. Fold Change
+2 116689790 5.7 8.1 116689898 -0.4 13.3 6.2 0.01
-17 116689771 -3.7 8.1 116689898 -0.7 3.7 15.8 0.00
+28 116689816 -5.0 8.1 116689898 -0.6 2.5 17.0 0.00
+5 116689793 -4.3 8.1 116689898 -0.5 3.3 16.2 0.00
-10 116689778 -3.3 8.1 116689898 -0.6 4.1 15.4 0.00
-24 116689764 -4.5 8.1 116689898 -0.5 3.1 16.4 0.00
-3 116689785 1.9 8.1 116689898 -0.1 9.8 9.7 0.00
0 116689788 11.6 8.1 116689898 -0.2 19.5 NA NA

 

Legend:

Relative_dist: Distance of the coordinate from the respective natural site.

position:    Position of the site in genomic coordinates.

Ri_site:     Information content of the site.

Ri_natural: Information content of the natural site of opposite polarity.

nat_pos:    Position of the natural site of opposite polarity in genomic coordinates.

gap_surprisal:  Gap surprisal corresponding to the distance between the considered site and the natural site of opposite polarity.

Ri_total:     Ri-total of the considered exon.

∆Ri_Total:  Difference between the Ri total at the site and the Ri total of the natural site.

Min. Fold Change: 1 / (2^∆Ri_Total).

        indicates the row corresponding to the natural site (Relative_dist 0).

          indicates the row corresponding to the cryptic site indicated in the reference paper.

         indicates the information content prior to the mutation taking place.
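
The last two legend entries can be sketched in Python as follows. The function names are hypothetical. Taking the absolute value of the difference is an assumption inferred from the tables, where a site stronger than the natural site (for example the +1 row of the IDS IVS3-2A>G table) still has a positive difference.

```python
def delta_ri_total(ri_total_natural: float, ri_total_site: float) -> float:
    """Difference in bits between the natural-site Ri_total and a candidate site's Ri_total.

    Assumption: the tables appear to report the magnitude of the difference.
    """
    return abs(ri_total_natural - ri_total_site)


def min_fold_change(delta_ri: float) -> float:
    """Minimum fold change, 1 / 2**delta_ri, as given in the legend."""
    return 1.0 / (2.0 ** delta_ri)


# IDS IVS3-2A>G: natural site Ri_total = 15.1, site at -1 has Ri_total = 13.1
delta = delta_ri_total(15.1, 13.1)
print(round(delta, 1), round(min_fold_change(delta), 2))  # 2.0 0.25
```

These values agree with the ∆Ri_Total of 2.0 and the Min. Fold Change of 0.25 listed for that row.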