Full model selection results:
Cross-validation was performed on the positive acceptor model to select the number of states (bw%i)
of the fully connected HMM behind the splice site.
The first part is modelled by a 'linear' HMM (observation frequencies only).
The accuracy (c:%f), error rate (w:%f), false positive rate (fp:%f) and true positive rate (tp:%f) are shown for the
connected model (linear + fully connected) on the validation set (positive model only).
The optimal bias term (tresh*:%f) for the validation data is also shown. This threshold was used on the test set.
For each run (r1 to r5), the model with the highest validation accuracy was chosen.
results/ag_pos_bw10_tr1000_r1.out:of 5000 samples (c:0.906000,w:0.094000,fp:0.089859,tp:0.899151,tresh*:-63.968183821887123)
results/ag_pos_bw10_tr1000_r2.out:of 5000 samples (c:0.911600,w:0.088400,fp:0.072075,tp:0.885100,tresh*:-63.44361013041501)
results/ag_pos_bw10_tr1000_r3.out:of 5000 samples (c:0.909400,w:0.090600,fp:0.084912,tp:0.900413,tresh*:-64.069333771450928)
results/ag_pos_bw10_tr1000_r4.out:of 5000 samples (c:0.917800,w:0.082200,fp:0.072633,tp:0.902401,tresh*:-63.565525460489745)
results/ag_pos_bw10_tr1000_r5.out:of 4999 samples (c:0.915183,w:0.084817,fp:0.082130,tp:0.910733,tresh*:-63.879775429161818)
results/ag_pos_bw11_tr1000_r1.out:of 5000 samples (c:0.907600,w:0.092400,fp:0.087291,tp:0.899151,tresh*:-63.925766751211668)
results/ag_pos_bw11_tr1000_r2.out:of 5000 samples (c:0.912800,w:0.087200,fp:0.071105,tp:0.886674,tresh*:-63.527268108751741)
results/ag_pos_bw11_tr1000_r3.out:of 5000 samples (c:0.908000,w:0.092000,fp:0.096016,tp:0.914345,tresh*:-64.399914576175647)
results/ag_pos_bw11_tr1000_r4.out:of 5000 samples (c:0.916600,w:0.083400,fp:0.060311,tp:0.879436,tresh*:-63.13628951331107)
results/ag_pos_bw11_tr1000_r5.out:of 4999 samples (c:0.914183,w:0.085817,fp:0.070581,tp:0.888948,tresh*:-63.483397423617134)
results/ag_pos_bw12_tr1000_r1.out:of 5000 samples (c:0.905600,w:0.094400,fp:0.089859,tp:0.898089,tresh*:-63.918659892605078)
results/ag_pos_bw12_tr1000_r2.out:of 5000 samples (c:0.909000,w:0.091000,fp:0.073045,tp:0.879853,tresh*:-63.461752628769773)
results/ag_pos_bw12_tr1000_r3.out:of 5000 samples (c:0.909800,w:0.090200,fp:0.087198,tp:0.905057,tresh*:-64.10367658325643)
results/ag_pos_bw12_tr1000_r4.out:of 5000 samples (c:0.918400,w:0.081600,fp:0.071012,tp:0.901357,tresh*:-63.687753525937566)
results/ag_pos_bw12_tr1000_r5.out:of 4999 samples (c:0.912583,w:0.087417,fp:0.068335,tp:0.880978,tresh*:-63.519830879532478)
results/ag_pos_bw13_tr1000_r1.out:of 5000 samples (c:0.908800,w:0.091200,fp:0.081515,tp:0.892781,tresh*:-63.685128594473312)
results/ag_pos_bw13_tr1000_r2.out:of 5000 samples (c:0.909400,w:0.090600,fp:0.081448,tp:0.894544,tresh*:-63.780551705101878)
results/ag_pos_bw13_tr1000_r3.out:of 5000 samples (c:0.906000,w:0.094000,fp:0.088831,tp:0.897833,tresh*:-64.102562741509558)
results/ag_pos_bw13_tr1000_r4.out:of 5000 samples (c:0.914000,w:0.086000,fp:0.068093,tp:0.885177,tresh*:-63.276745820762422)
results/ag_pos_bw13_tr1000_r5.out:of 4999 samples (c:0.914583,w:0.085417,fp:0.074110,tp:0.895855,tresh*:-63.793257377491983)
results/ag_pos_bw14_tr1000_r1.out:of 5000 samples (c:0.906800,w:0.093200,fp:0.070603,tp:0.869427,tresh*:-63.345317096139794)
results/ag_pos_bw14_tr1000_r2.out:of 5000 samples (c:0.908200,w:0.091800,fp:0.073368,tp:0.878279,tresh*:-63.444006351635878)
results/ag_pos_bw14_tr1000_r3.out:of 5000 samples (c:0.907200,w:0.092800,fp:0.084585,tp:0.894221,tresh*:-64.005264175459629)
results/ag_pos_bw14_tr1000_r4.out:of 5000 samples (c:0.919600,w:0.080400,fp:0.078470,tp:0.916493,tresh*:-63.947605675070307)
results/ag_pos_bw14_tr1000_r5.out:of 4999 samples (c:0.914783,w:0.085217,fp:0.073789,tp:0.895855,tresh*:-63.720775219056897)
results/ag_pos_bw5_tr1000_r1.out:of 5000 samples (c:0.909200,w:0.090800,fp:0.079268,tp:0.890127,tresh*:-63.650506682608956)
results/ag_pos_bw5_tr1000_r2.out:of 5000 samples (c:0.910600,w:0.089400,fp:0.079832,tp:0.895068,tresh*:-63.77234083101326)
results/ag_pos_bw5_tr1000_r3.out:of 5000 samples (c:0.906400,w:0.093600,fp:0.088831,tp:0.898865,tresh*:-63.987551855087368)
results/ag_pos_bw5_tr1000_r4.out:of 5000 samples (c:0.917800,w:0.082200,fp:0.064202,tp:0.888831,tresh*:-63.260087565341408)
results/ag_pos_bw5_tr1000_r5.out:of 4999 samples (c:0.911982,w:0.088018,fp:0.072506,tp:0.886291,tresh*:-63.41828339581663)
results/ag_pos_bw6_tr1000_r1.out:of 5000 samples (c:0.907400,w:0.092600,fp:0.081194,tp:0.888535,tresh*:-63.600547326210915)
results/ag_pos_bw6_tr1000_r2.out:of 5000 samples (c:0.909600,w:0.090400,fp:0.070459,tp:0.877230,tresh*:-63.358571068550845)
results/ag_pos_bw6_tr1000_r3.out:of 5000 samples (c:0.906400,w:0.093600,fp:0.086871,tp:0.895769,tresh*:-63.930492550969205)
results/ag_pos_bw6_tr1000_r4.out:of 5000 samples (c:0.918200,w:0.081800,fp:0.073281,tp:0.904489,tresh*:-63.596195660178857)
results/ag_pos_bw6_tr1000_r5.out:of 4999 samples (c:0.911582,w:0.088418,fp:0.070902,tp:0.882572,tresh*:-63.344601108587113)
results/ag_pos_bw7_tr1000_r1.out:of 5000 samples (c:0.906400,w:0.093600,fp:0.094994,tp:0.908705,tresh*:-64.024538411238112)
results/ag_pos_bw7_tr1000_r2.out:of 5000 samples (c:0.910600,w:0.089400,fp:0.085973,tp:0.905037,tresh*:-63.930507968418844)
results/ag_pos_bw7_tr1000_r3.out:of 5000 samples (c:0.905800,w:0.094200,fp:0.092423,tp:0.902993,tresh*:-64.094041207568438)
results/ag_pos_bw7_tr1000_r4.out:of 5000 samples (c:0.921600,w:0.078400,fp:0.059987,tp:0.891962,tresh*:-63.212251572104591)
results/ag_pos_bw7_tr1000_r5.out:of 4999 samples (c:0.910182,w:0.089818,fp:0.093680,tp:0.916578,tresh*:-64.182676493704719)
results/ag_pos_bw8_tr1000_r1.out:of 5000 samples (c:0.911200,w:0.088800,fp:0.074775,tp:0.888004,tresh*:-63.601450924669457)
results/ag_pos_bw8_tr1000_r2.out:of 5000 samples (c:0.911400,w:0.088600,fp:0.073368,tp:0.886674,tresh*:-63.399431693215419)
results/ag_pos_bw8_tr1000_r3.out:of 5000 samples (c:0.910800,w:0.089200,fp:0.080993,tp:0.897833,tresh*:-63.9338746644857)
results/ag_pos_bw8_tr1000_r4.out:of 5000 samples (c:0.920200,w:0.079800,fp:0.074254,tp:0.911273,tresh*:-63.717673904608851)
results/ag_pos_bw8_tr1000_r5.out:of 4999 samples (c:0.910382,w:0.089618,fp:0.080526,tp:0.895324,tresh*:-63.659707348414614)
results/ag_pos_bw9_tr1000_r1.out:of 5000 samples (c:0.906600,w:0.093400,fp:0.077022,tp:0.879512,tresh*:-63.497271945960478)
results/ag_pos_bw9_tr1000_r2.out:of 5000 samples (c:0.910400,w:0.089600,fp:0.074661,tp:0.886149,tresh*:-63.553673239244745)
results/ag_pos_bw9_tr1000_r3.out:of 5000 samples (c:0.906600,w:0.093400,fp:0.085565,tp:0.894221,tresh*:-64.041768856800203)
results/ag_pos_bw9_tr1000_r4.out:of 5000 samples (c:0.917400,w:0.082600,fp:0.065175,tp:0.889353,tresh*:-63.206573387151913)
results/ag_pos_bw9_tr1000_r5.out:of 4999 samples (c:0.911982,w:0.088018,fp:0.090792,tp:0.916578,tresh*:-64.18305729286061)
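The result lines above follow a fixed layout, so they can be read back mechanically. The following is a minimal Python sketch, not the original tooling: the regular expression, the function name parse_result_line and the dictionary keys are assumptions made for illustration, derived only from the labels visible in the listing.

```python
import re

# Hypothetical pattern for one result line; the field labels (c, w, fp, tp, tresh*)
# are copied from the listing, everything else is an illustrative assumption.
LINE_RE = re.compile(
    r"ag_pos_bw(?P<bw>\d+)_tr(?P<tr>\d+)_r(?P<run>\d+)\.out:"
    r"of (?P<n>\d+) samples \("
    r"c:(?P<c>[-\d.]+),w:(?P<w>[-\d.]+),"
    r"fp:(?P<fp>[-\d.]+),tp:(?P<tp>[-\d.]+),"
    r"tresh\*:(?P<thresh>[-\d.]+)\)"
)

INT_FIELDS = ("bw", "tr", "run", "n")


def parse_result_line(line):
    """Return the fields of one result line as a dict, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        key: (int(value) if key in INT_FIELDS else float(value))
        for key, value in m.groupdict().items()
    }


example = (
    "results/ag_pos_bw10_tr1000_r1.out:of 5000 samples "
    "(c:0.906000,w:0.094000,fp:0.089859,tp:0.899151,tresh*:-63.968183821887123)"
)
rec = parse_result_line(example)
# Accuracy and error rate are complementary, so they should sum to 1.
assert abs(rec["c"] + rec["w"] - 1.0) < 1e-9
```

The check on c + w is a cheap sanity test that a line was parsed with the fields in the right order.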
Best models:
ag_pos_bw8_tr1000_r1.out:of 5000 samples (c:0.911200,w:0.088800,fp:0.074775,tp:0.888004,tresh*:-63.601450924669457)
ag_pos_bw11_tr1000_r2.out:of 5000 samples (c:0.912800,w:0.087200,fp:0.071105,tp:0.886674,tresh*:-63.527268108751741)
ag_pos_bw8_tr1000_r3.out:of 5000 samples (c:0.910800,w:0.089200,fp:0.080993,tp:0.897833,tresh*:-63.9338746644857)
ag_pos_bw7_tr1000_r4.out:of 5000 samples (c:0.921600,w:0.078400,fp:0.059987,tp:0.891962,tresh*:-63.212251572104591)
ag_pos_bw10_tr1000_r5.out:of 4999 samples (c:0.915183,w:0.084817,fp:0.082130,tp:0.910733,tresh*:-63.879775429161818)
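The selection listed above (one model per run, highest validation accuracy) can be sketched as follows. This is an illustration under assumptions: the function name best_per_run and the (bw, run, accuracy) tuple layout are hypothetical, and the tie-breaking rule (first record seen wins) is a guess, since the results do not say how ties were handled. Only a subset of the records for runs 4 and 5 is used.

```python
def best_per_run(records):
    """Pick the record with the highest accuracy for each run.

    records: iterable of (bw, run, accuracy) tuples.
    Returns a dict mapping run -> (bw, accuracy).
    Ties keep the first record encountered (an assumption, not from the source).
    """
    best = {}
    for bw, run, acc in records:
        if run not in best or acc > best[run][1]:
            best[run] = (bw, acc)
    return best


# A few (bw, run, c) values copied from the listing, runs 4 and 5 only.
records = [
    (7, 4, 0.921600), (8, 4, 0.920200), (14, 4, 0.919600),
    (10, 5, 0.915183), (13, 5, 0.914583), (14, 5, 0.914783),
]
selected = best_per_run(records)
# selected[4] is the bw7 model and selected[5] is the bw10 model,
# matching the corresponding "Best models" entries.
```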
Performance of the best joined (positive with corresponding negative) models on the validation data with optimal threshold:
1) error rate
a) over all 5 runs
0.0242 0.0308 0.0314 0.0286 0.0312
b) mean
0.0292
c) standard deviation
0.0030
2) specificity (1 - false positive rate)
a) over all 5 runs
0.9833 0.9712 0.9703 0.9737 0.9724
b) mean
0.9742
c) standard deviation
0.0053
3) sensitivity (true positive rate)
a) over all 5 runs
0.9634 0.9659 0.9659 0.9676 0.9628
b) mean
0.9651
c) standard deviation
0.0020
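The mean and standard deviation entries can be recomputed from the per-run values. The sketch below assumes the sample standard deviation (n - 1 in the denominator), which is what reproduces the reported specificity and sensitivity figures; the helper name summarize is hypothetical.

```python
from statistics import mean, stdev


def summarize(values):
    """Return (mean, sample standard deviation), both rounded to 4 decimals."""
    return round(mean(values), 4), round(stdev(values), 4)


# Per-run values copied from sections 2) and 3) above.
specificity = [0.9833, 0.9712, 0.9703, 0.9737, 0.9724]
sensitivity = [0.9634, 0.9659, 0.9659, 0.9676, 0.9628]

spec_summary = summarize(specificity)  # (0.9742, 0.0053)
sens_summary = summarize(sensitivity)  # (0.9651, 0.002)
```

The same helper can be applied to the per-run error rates in section 1).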

