Recently while implementing the Small Parsimony Problem I had the need to generate a binary tree from a string in Python.

The pseudo-code in the question implicitly assumes you have some functionality that will generate a DNA sequence like CAAATCCC into a binary tree, then run SmallParsimony on it. I couldn't find that functionality anywhere, so I decided to implement something simple in networkx. Hopefully it's useful.

```python

# !/usr/bin/env python

""" Shane Dowling, 27 Oct 2015 To run simply call generate_leaf_graph and pass in s string. """ def count_predecessors(T): pred = 0 for v in T.nodes(): pred += 1 return pred

def generate_leaf_graph(s): """ Will generate a leaf graph and return it along with the Character dictionary representing the characters at each leaf node position """ T = nx.DiGraph() Character = {} i = 0 parI = len(s) for (son, daughter) in zip(s[0::2], s[1::2]): Character[i] = son Character[i+1] = daughter T.add_edge(parI, i) T.add_edge(parI, i+1) i += 2 parI += 1

while count_predecessors(T) > 1: for v in T.nodes(): T.add_edge(parI, v) T.add_edge(parI, v+1) parI += 1 return [T, Character] ```