FastSP: linear time calculation of alignment accuracy

A General Observation

Observation

From the result, we could see that:

1
2
3
4
5
SP-Score: 19/26 = 0.7307…
Modeler: 19/27 = 0.7037…
SP-FN: (26 - 19)/26 = 0.2692….
SP-FP: (27 - 19)/27 = 0.2962….
TC: 3/8 = 0.375

Approach

From the last section, we know our propose is to calculate:

1
2
3
4
5
1. Number of shared homologies.
2. Number of homologies in the reference alignment.
3. Number of homologies in the estimated alignment.
4. Number of correctly aligned columns.
5. Number of aligned columns in the reference alignment.

At the same time, we have a general thinking which is:

1
2
3
2,3 —> 1
5 is the easiest
1 — > 4

First of all, it gives these definitions:

1
2
3
Si represent the i-th sequence in the alignment.
Ai represent the i-th alignment.
Ni,j represent the j-th site in Si.

I will give you an example:

Example

Then, we could know these (Explaining an example):

Approach

Algorithm

This section is for programming.

Let’s look at some difinitions first:

1
2
3
4
5
6
7
n represent the number of sequences in the alignment.
k represent the biggest length of sequences in the alignment.
S[i,j] represent a n•k matrix which equals (a, b) means Ni,j appears in site a for the reference alignment and in site b for the estimated alignment.
bx represent the number of non-gapped entries in the x-th site.
mi represent the number of elements in the i-th equivalence class.
Nx represent the number of homologies in the estimated alignment that are shared with the x-th site in the reference alignment.
hi represent the number of homologous pairs in alignment Ai.

Explaining them by this example:

Example

The pseudo code is like this:

Pseudo Code

Make a mapping:

Mapping

The result could be:

1
2
3
SP-Score: N/h1
Modeler: N/h2
TC: cor_num/k

Evaluation

Calculating the matrix S: O(nk)

Calculating combination number: O(1)

Calculating each Nx: O(n)

As for the FOR loop: O(nk)

So:

The time complexity is O(n•k).

The space complexity is O(n•k).

Comparing to the other programs:

evaluation



The link of this page is https://blog.nooa.tech/articles/bd4539a2/ . Welcome to reproduce it!

© 2018.02.08 - 2024.05.25 Mengmeng Kuang  保留所有权利!

:D 获取中...

Creative Commons License