Champuru 2

Flot (2007) Champuru 1.0: a computer software for unraveling mixtures of two DNA sequences of unequal lengths. Molecular Ecology Notes 7 (6), 974-977

Champuru makes it possible to determine the haplotypes of heterozygous individuals without cloning, simply by analyzing the patterns of double peaks in the forward and reverse chromatograms.

Forward sequence:    

The forward sequence (IUPAC one-letter code) of the heterozygous individual.

Reverse sequence:    

The reverse sequence (IUPAC one-letter code) of the heterozygous individual.
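Both input fields use the IUPAC one-letter code, in which a double peak is written as a single ambiguity letter (for example W for A or T). The following is a minimal Python sketch of how such input could be checked and expanded into sets of possible bases. The names IUPAC and expand are hypothetical, and the table covers the full IUPAC nucleotide alphabet; which of these letters Champuru itself accepts is not stated here.

```python
# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand(seq):
    """Turn an IUPAC string into a list of sets of possible bases.

    Raises ValueError on any character outside the IUPAC table.
    """
    result = []
    for pos, char in enumerate(seq.upper()):
        if char not in IUPAC:
            raise ValueError(f"invalid character {char!r} at position {pos}")
        result.append(set(IUPAC[char]))
    return result


if __name__ == "__main__":
    # Forward sequence of the short example data on this page.
    for code, bases in zip("ATSYKRMKY", expand("ATSYKRMKY")):
        print(code, sorted(bases))
```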

Score calculation:    

Which score calculation method to use. Currently the following methods are implemented:

Paper: The score calculation method described in the Champuru 1.0 paper.
Ambiguity correction: A modification of the score calculation method described in the Champuru 1.0 paper. This modified score corrects for the fact that ambiguous characters (e.g. W) can match multiple other characters (e.g. A, T). Preliminary results suggest that this method works better when the reconstructed consensus sequences contain many ambiguities. However, it seems to work less well on short input sequences.
Longest Length: Take the length of the longest run of consecutive matching nucleotides as the score.
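As an illustration of the Longest Length idea, here is a minimal Python sketch that scores one alignment of the two sequences by its longest run of matching positions. It assumes that two IUPAC letters "match" when they share at least one base, and that an offset shifts the reverse sequence relative to the forward one; both are assumptions for this example, and the function names are hypothetical rather than taken from Champuru's source.

```python
# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def compatible(a, b):
    """Assumed match rule: the two codes share at least one base."""
    return bool(set(IUPAC[a]) & set(IUPAC[b]))


def longest_run_score(fwd, rev, offset):
    """Longest run of consecutive compatible positions.

    rev[j] is compared with fwd[j + offset]; positions that fall
    outside fwd end the current run.
    """
    best = run = 0
    for j, r in enumerate(rev):
        i = j + offset
        if 0 <= i < len(fwd) and compatible(fwd[i], r):
            run += 1
            best = max(best, run)
        else:
            run = 0
    return best


if __name__ == "__main__":
    # W (A or T) is compatible with T, but C is not compatible with G.
    print(longest_run_score("AWCA", "ATGA", 0))
```

Under this match rule an ambiguous letter such as W is compatible with more letters than an unambiguous one, which is the effect the Ambiguity correction method is described as compensating for.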

Analyze further offset pairs:

When this checkbox is checked, Champuru not only uses the best offset pair for sequence reconstruction but also runs the sequence reconstruction algorithm for further offset pairs. This makes it possible to see whether another offset pair works better, or whether the given input has multiple solutions. Because this can take some time, the checkbox is unchecked by default.
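To make the idea of "further offset pairs" concrete, the following Python sketch scores every possible offset between the two sequences, ranks the offsets, and lists candidate pairs drawn from the top few. It is only an illustration under assumptions: the score is the longest run of compatible positions, pairs are ranked by the sum of their two scores, and the names rank_offsets and candidate_pairs are hypothetical. How Champuru actually selects and orders offset pairs may differ.

```python
from itertools import combinations

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def compatible(a, b):
    """Assumed match rule: the two codes share at least one base."""
    return bool(set(IUPAC[a]) & set(IUPAC[b]))


def longest_run(fwd, rev, offset):
    """Longest run of compatible positions with rev[j] against fwd[j + offset]."""
    best = run = 0
    for j, r in enumerate(rev):
        i = j + offset
        if 0 <= i < len(fwd) and compatible(fwd[i], r):
            run += 1
            best = max(best, run)
        else:
            run = 0
    return best


def rank_offsets(fwd, rev):
    """Score every offset with at least one overlapping position, best first."""
    offsets = range(-(len(rev) - 1), len(fwd))
    scored = [(o, longest_run(fwd, rev, o)) for o in offsets]
    # Ties are broken by smaller absolute offset, then by the offset itself.
    return sorted(scored, key=lambda t: (-t[1], abs(t[0]), t[0]))


def candidate_pairs(fwd, rev, top=3):
    """All pairs among the `top` best offsets, ordered by summed score."""
    best = rank_offsets(fwd, rev)[:top]
    pairs = [((a[0], b[0]), a[1] + b[1]) for a, b in combinations(best, 2)]
    return sorted(pairs, key=lambda p: -p[1])


if __name__ == "__main__":
    # Short example data from this page.
    for pair, total in candidate_pairs("ATSYKRMKY", "WWKCTGAST"):
        print(pair, total)
```

The first pair in the list plays the role of the best offset pair; the remaining pairs are the ones a more exhaustive analysis would also try, which is why the extra work grows with the number of offsets considered.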



To get a quick feel for how Champuru works, try the following data:
Forward sequence: ATSYKRMKY
Reverse sequence: WWKCTGAST


This example shows how Champuru can resolve ambiguities in the consensus sequences (Step 3):
Forward sequence: CTRAATTCAAATCACACTCGCGAAAWYMWKRAA
Reverse sequence: YWRAWTYMAAWYMMMMYYSSSRAAATCATGAA


A more complex example with lots of ambiguities in the consensus sequence:
Forward sequence: AAATSSYKRWTYMMMMMRSSRMSGSCCYWRSMWWMCCSRRGGRWYSGRARR
Reverse sequence: MRMTGMTKMWYMMMRCRRCGRCSSYMSYAKMMYMSMSGRRKSRKMRGRRKA


The following example has multiple solutions (for offset pairs 5,1 and 4,2):
Forward sequence: SYKRWTYMMMMMRSSRMSGSCCYWRSMWWMCCSRRGGRWYSGRAR
Reverse sequence: MRMTGMTKMWYMMMRCRRCGRCSSYMSYAKMMYMSMSGRRKSRKMRGRRKA


And here is a 'real-life' example (original chromatograms: Forward and Reverse):
Forward sequence: ATCGTGGGCGGCCGGCCCTGCCACAGACGGGGTTGTCACCCTCTGCGACGTGCCGTTCCAAGCAACTTAGGCAGGGCCCCTCCGCCTAGAAAGTGCTTCTCGCAACTACAACTCGCCGATGCAGGCATCGGAGATTTCAAATTTGAGCTCTTCCCGCTTCACTCGCCGTTACTGGGGGAATCCTTGTTAGTTTCTTTTCCTCCGCTTATTAATATGCTTAAATTCAGCGGGTAGCCTTGCCTGATCTGAGGTCTGGAAGGCGATTCCTTTTTTCCTTTGAGATGCCGCCACCGCTACCCGGCGGCAGCAGAAAAAAGAATCGAATGGAGAAAGATTTGTTCCGTCAAAGCGATAGAGCCGTGGCCGTTTGGGGTACATTGTTCTATGATCCCCGCGCGACACCGGATGTCGCTTGGCGGATCTTTCTCCCTGAATTTCAAGGGACGCGGTAAACCGACCGGTCGGGCCGAGCAGCACCAGGGCTGGCTAGCTAGCGCACGACCGGTCATCTCGACCGCGACCCTCAACGCCGCACGAACCCGTTCACGGCGGGCGCGYSYCSSSSCCMYMYSCKMYASASRSGGRSMMSRSGSRSRCGCGCRCRCGSRKWYKCRCRMKRKRKRTKTKWRWAKASACWCWSASASASAYRYKCYYSKGRGARMMCMMRAGMGCSMYWTKYGYKYWMARAKWYKMKRWKWYWCWSWRWWYTCYGCWWYTCACWCTWYWTMTCRSMMTCKMKCTGA
Reverse sequence: GTGCRMTMTCAACACMCSASTCTCGMRACRCATMKYGKGSGSSSSSSSCYSYSMCASASRSGGKKKTSWCMCYCTSYGMSRYGYSSYKYYMMRMRMMWYWKRSRSRGSSCCYCYSCSYMKARARWGYKYYTCKCRMMWMYAMMWCKCSSMKRYRSRSRYMKSRGAKWTYWMAWWTKWGMKCTYYYCSCKYYWCWCKCSSYKWYWSKGGGRRWMYYYKTKWKWKTYTYTTYYYCYSCKYWTWWWWATRYKYWWAWWYWSMGSGKRKMSYYKYSYSWKMTSWGRKSTSKRRRRSGMKWYYYYTTTTYYYYTKWGAKRYSSCSMCMSCKMYMCSSSGSSRSMRSARAAAARARWMKMRWRKRGARARAKWTKTKYYSYSWMARMGMKAKAGMSSYGKSSSYKTKKGGKRYAYWKTKYTMTRWKMYCCSCGCGMSACMSSRKRTSKCKYKKSGSRKMTYTYTCYCYSWRWWTYWMRRGRSRCGSKRWAMMSMSMSSKSKSGSSSMGMRSMRCMMSRGSKSKSKMKMKMKMGCRCRMSMSSKSWYMTCKMSMSCGMSMCYCWMMRCSSCRCRMRMMCSYKYWCRSSGSGSGCGCGTCCCGGCCCCATCCGCTACAGACGGGGACCAGGCGGACGCGCGCACGCGGATTCGCACGATGGATGTTTGAATAGACACTCAGACAGACATGCTCCTGGGAGAACCCAAGAGCGCCATTTGCGTTCAAAGATTCGATGATTCACTGAAT