{"id":60,"date":"2015-05-09T01:24:06","date_gmt":"2015-05-09T06:24:06","guid":{"rendered":"http:\/\/www.playwithlua.com\/?p=60"},"modified":"2020-05-22T13:51:00","modified_gmt":"2020-05-22T18:51:00","slug":"telegrams","status":"publish","type":"post","link":"http:\/\/www.playwithlua.com\/?p=60","title":{"rendered":"Telegrams"},"content":{"rendered":"<p>A few days ago at work, a friend of mine came up with this problem. I call it the &#8220;telegram&#8221; problem: you get a string consisting of words run together without spaces, and figure out where spaces can go to separate out the words. For example, a <a onclick=\"javascript:pageTracker._trackPageview('\/outgoing\/en.wikipedia.org\/wiki\/The_world_wonders');\"  href=\"http:\/\/en.wikipedia.org\/wiki\/The_world_wonders\">famous misunderstanding<\/a>:<\/p>\n\n<div class=\"my_syntax_box\"><div class=\"my_syntax\"><div class=\"code\"><pre class=\"lua\" style=\"font-family:monospace;\">parse_telegram<span style=\"color: #66cc66;\">&#40;<\/span><span style=\"color: #ff6666;\">&quot;turkeytrotstowaterwhereistaskforcethirtyfourtheworldwonders&quot;<\/span><span style=\"color: #66cc66;\">&#41;<\/span>\n<span style=\"color: #808080; font-style: italic;\">-- turkey trots to water where is task force thirty four the world wonders<\/span><\/pre><\/div><\/div><\/div>\n\n<p>I wrote a solution last night.<\/p>\n<p><!--more--><\/p>\n<h2>The dictionary<\/h2>\n<p>First things first, we&#8217;re going to need a dictionary of words. If you&#8217;re on a Mac or most Linux distros you already have one, at <code>\/usr\/share\/dict\/words<\/code>. If not, I recommend Googling for the OSPD, the Official Scrabble Player&#8217;s Dictionary.<\/p>\n<p>The dictionary comes to us as a text file with one word per line. Mine, at least, contains a bunch of proper nouns, words with apostrophes, and single-letter words, which I filtered out (adding back &#8220;i&#8221; and &#8220;a,&#8221; the only two single-letter words that really exist).<\/p>\n<p>We need to turn this into some kind of data structure we can easily query. Specifically, we want to quickly find out if there are any words that begin with a certain substring, for example, there are words that begin with &#8220;appl&#8221; but none that begin with &#8220;appq.&#8221; The data structure we&#8217;ll use for this is called a trie. It&#8217;s a tree with each node representing a letter, and the edges leading out of the node representing letters that could follow this one. So, for example, here&#8217;s a trie containing the words &#8220;an,&#8221; &#8220;ant,&#8221; &#8220;app,&#8221; &#8220;apple,&#8221; and &#8220;pear:&#8221;<\/p>\n<img src=\"http:\/\/www.playwithlua.com\/wp-content\/tfo-graphviz\/0377f80a1eb2b9e680d3a69289f010b7.png\" class=\"graphviz\" \/>\n<p>A &#8220;$&#8221; node represents the end of a word, so, if a node has a &#8220;$&#8221; as a child, that means that letter can complete a word (but doesn&#8217;t have to, if there are other children).<\/p>\n<p>So, here&#8217;s a function that reads a dictionary file, filters out the words we don&#8217;t want, and inserts them into a trie:<\/p>\n\n<div class=\"my_syntax_box\"><div class=\"my_syntax\"><div class=\"code\"><pre class=\"lua\" style=\"font-family:monospace;\"><span style=\"color: #aa9900; font-weight: bold;\">function<\/span> load_dictionary<span style=\"color: #66cc66;\">&#40;<\/span>filename<span style=\"color: #66cc66;\">&#41;<\/span>\n    <span style=\"color: #aa9900; font-weight: bold;\">local<\/span> dictionary <span style=\"color: #66cc66;\">=<\/span> <span style=\"color: #66cc66;\">&#123;<\/span><span style=\"color: #66cc66;\">&#125;<\/span>\n&nbsp;\n    <span style=\"color: #aa9900; font-weight: bold;\">for<\/span> line <span style=\"color: #aa9900; font-weight: bold;\">in<\/span> <span style=\"color: #0000aa;\">io.lines<\/span><span style=\"color: #66cc66;\">&#40;<\/span>filename<span style=\"color: #66cc66;\">&#41;<\/span> <span style=\"color: #aa9900; font-weight: bold;\">do<\/span>\n        <span style=\"color: #aa9900; font-weight: bold;\">if<\/span> line<span style=\"color: #66cc66;\">:<\/span>match<span style=\"color: #66cc66;\">&#40;<\/span><span style=\"color: #ff6666;\">&quot;^%l%l+$&quot;<\/span><span style=\"color: #66cc66;\">&#41;<\/span> <span style=\"color: #aa9900; font-weight: bold;\">then<\/span>\n            insert<span style=\"color: #66cc66;\">&#40;<\/span>dictionary<span style=\"color: #66cc66;\">,<\/span> line <span style=\"color: #66cc66;\">..<\/span> <span style=\"color: #ff6666;\">&quot;$&quot;<\/span><span style=\"color: #66cc66;\">&#41;<\/span>\n        <span style=\"color: #aa9900; font-weight: bold;\">end<\/span>\n    <span style=\"color: #aa9900; font-weight: bold;\">end<\/span>\n&nbsp;\n    <span style=\"color: #808080; font-style: italic;\">-- We're cutting all single-letter words, but<\/span>\n    <span style=\"color: #808080; font-style: italic;\">-- two of them are real words<\/span>\n    insert<span style=\"color: #66cc66;\">&#40;<\/span>dictionary<span style=\"color: #66cc66;\">,<\/span> <span style=\"color: #ff6666;\">&quot;a$&quot;<\/span><span style=\"color: #66cc66;\">&#41;<\/span>\n    insert<span style=\"color: #66cc66;\">&#40;<\/span>dictionary<span style=\"color: #66cc66;\">,<\/span> <span style=\"color: #ff6666;\">&quot;i$&quot;<\/span><span style=\"color: #66cc66;\">&#41;<\/span>\n&nbsp;\n    <span style=\"color: #aa9900; font-weight: bold;\">return<\/span> dictionary\n<span style=\"color: #aa9900; font-weight: bold;\">end<\/span><\/pre><\/div><\/div><\/div>\n\n<p>And here&#8217;s the corresponding <code>insert<\/code> function, to actually put them in the trie:<\/p>\n\n<div class=\"my_syntax_box\"><div class=\"my_syntax\"><div class=\"code\"><pre class=\"lua\" style=\"font-family:monospace;\"><span style=\"color: #aa9900; font-weight: bold;\">function<\/span> insert<span style=\"color: #66cc66;\">&#40;<\/span>dictionary<span style=\"color: #66cc66;\">,<\/span> word<span style=\"color: #66cc66;\">&#41;<\/span>\n    <span style=\"color: #aa9900; font-weight: bold;\">local<\/span> first <span style=\"color: #66cc66;\">=<\/span> word<span style=\"color: #66cc66;\">:<\/span>sub<span style=\"color: #66cc66;\">&#40;<\/span><span style=\"color: #cc66cc;\">1<\/span><span style=\"color: #66cc66;\">,<\/span><span style=\"color: #cc66cc;\">1<\/span><span style=\"color: #66cc66;\">&#41;<\/span>\n    dictionary<span style=\"color: #66cc66;\">&#91;<\/span>first<span style=\"color: #66cc66;\">&#93;<\/span> <span style=\"color: #66cc66;\">=<\/span> dictionary<span style=\"color: #66cc66;\">&#91;<\/span>first<span style=\"color: #66cc66;\">&#93;<\/span> <span style=\"color: #aa9900; font-weight: bold;\">or<\/span> <span style=\"color: #66cc66;\">&#123;<\/span><span style=\"color: #66cc66;\">&#125;<\/span>\n    <span style=\"color: #aa9900; font-weight: bold;\">if<\/span> first <span style=\"color: #66cc66;\">~=<\/span> <span style=\"color: #ff6666;\">&quot;$&quot;<\/span> <span style=\"color: #aa9900; font-weight: bold;\">then<\/span>\n        insert<span style=\"color: #66cc66;\">&#40;<\/span>dictionary<span style=\"color: #66cc66;\">&#91;<\/span>first<span style=\"color: #66cc66;\">&#93;<\/span><span style=\"color: #66cc66;\">,<\/span> word<span style=\"color: #66cc66;\">:<\/span>sub<span style=\"color: #66cc66;\">&#40;<\/span><span style=\"color: #cc66cc;\">2<\/span><span style=\"color: #66cc66;\">&#41;<\/span><span style=\"color: #66cc66;\">&#41;<\/span>\n    <span style=\"color: #aa9900; font-weight: bold;\">end<\/span>\n<span style=\"color: #aa9900; font-weight: bold;\">end<\/span><\/pre><\/div><\/div><\/div>\n\n<h2>Finding all possible words<\/h2>\n<p>The first step to my solution is to find all the possible words in the telegram, matched to their place in the string. So, for the telegram &#8220;hellothere&#8221; we will make a list of words like this:<\/p>\n<pre>\r\nhellothere\r\nhe\r\nhell\r\nhello\r\n-ell\r\n---lo\r\n---lot\r\n---loth\r\n----other\r\n-----the\r\n-----there\r\n------he\r\n------her\r\n------here\r\n-------ere\r\n--------re\r\n<\/pre>\n<p>(Your list may be slightly different using the OSPD instead of Ubuntu&#8217;s dictionary like I am.)<\/p>\n<p><code>find_words<\/code> takes a telegram and a trie and returns a list of <code>{position, word}<\/code> tuples for each word. The way it works is pretty straightforward: for each letter in the telegram, search the trie for words that start with that letter. So, we look for a &#8220;$&#8221; node under the first &#8220;h,&#8221; if we don&#8217;t find one we move down to &#8220;e&#8221; and then look for a &#8220;$&#8221; node, if we find one we insert &#8220;he&#8221; and then move down to &#8220;l,&#8221; and so on, until we reach a point where the current trie node doesn&#8217;t have a child for the next letter in the telegram. Then we stop and try again with the next starting letter.<\/p>\n\n<div class=\"my_syntax_box\"><div class=\"my_syntax\"><div class=\"code\"><pre class=\"lua\" style=\"font-family:monospace;\"><span style=\"color: #aa9900; font-weight: bold;\">function<\/span> find_words<span style=\"color: #66cc66;\">&#40;<\/span>telegram<span style=\"color: #66cc66;\">,<\/span> dictionary<span style=\"color: #66cc66;\">&#41;<\/span>\n    <span style=\"color: #aa9900; font-weight: bold;\">local<\/span> words <span style=\"color: #66cc66;\">=<\/span> <span style=\"color: #66cc66;\">&#123;<\/span><span style=\"color: #66cc66;\">&#125;<\/span> <span style=\"color: #808080; font-style: italic;\">-- list of tuples, {start, word}<\/span>\n&nbsp;\n    <span style=\"color: #aa9900; font-weight: bold;\">for<\/span> start <span style=\"color: #66cc66;\">=<\/span> <span style=\"color: #cc66cc;\">1<\/span><span style=\"color: #66cc66;\">,<\/span> <span style=\"color: #66cc66;\">#<\/span>telegram <span style=\"color: #aa9900; font-weight: bold;\">do<\/span>\n        <span style=\"color: #aa9900; font-weight: bold;\">local<\/span> current_index <span style=\"color: #66cc66;\">=<\/span> start\n        <span style=\"color: #aa9900; font-weight: bold;\">local<\/span> current_letter <span style=\"color: #66cc66;\">=<\/span> telegram<span style=\"color: #66cc66;\">:<\/span>sub<span style=\"color: #66cc66;\">&#40;<\/span>start<span style=\"color: #66cc66;\">,<\/span> start<span style=\"color: #66cc66;\">&#41;<\/span>\n        <span style=\"color: #aa9900; font-weight: bold;\">local<\/span> current_node <span style=\"color: #66cc66;\">=<\/span> dictionary\n&nbsp;\n        <span style=\"color: #aa9900; font-weight: bold;\">while<\/span> current_node<span style=\"color: #66cc66;\">&#91;<\/span> current_letter <span style=\"color: #66cc66;\">&#93;<\/span> <span style=\"color: #aa9900; font-weight: bold;\">do<\/span>\n            current_node <span style=\"color: #66cc66;\">=<\/span> current_node<span style=\"color: #66cc66;\">&#91;<\/span> current_letter <span style=\"color: #66cc66;\">&#93;<\/span>\n            <span style=\"color: #aa9900; font-weight: bold;\">if<\/span> current_node<span style=\"color: #66cc66;\">&#91;<\/span><span style=\"color: #ff6666;\">'$'<\/span><span style=\"color: #66cc66;\">&#93;<\/span> <span style=\"color: #aa9900; font-weight: bold;\">then<\/span>\n                <span style=\"color: #0000aa;\">table.insert<\/span><span style=\"color: #66cc66;\">&#40;<\/span>words<span style=\"color: #66cc66;\">,<\/span> <span style=\"color: #66cc66;\">&#123;<\/span>start<span style=\"color: #66cc66;\">,<\/span> telegram<span style=\"color: #66cc66;\">:<\/span>sub<span style=\"color: #66cc66;\">&#40;<\/span>start<span style=\"color: #66cc66;\">,<\/span> current_index<span style=\"color: #66cc66;\">&#41;<\/span><span style=\"color: #66cc66;\">&#125;<\/span><span style=\"color: #66cc66;\">&#41;<\/span>\n            <span style=\"color: #aa9900; font-weight: bold;\">end<\/span>\n&nbsp;\n            current_index <span style=\"color: #66cc66;\">=<\/span> current_index <span style=\"color: #66cc66;\">+<\/span> <span style=\"color: #cc66cc;\">1<\/span>\n            current_letter <span style=\"color: #66cc66;\">=<\/span> telegram<span style=\"color: #66cc66;\">:<\/span>sub<span style=\"color: #66cc66;\">&#40;<\/span>current_index<span style=\"color: #66cc66;\">,<\/span> current_index<span style=\"color: #66cc66;\">&#41;<\/span>\n        <span style=\"color: #aa9900; font-weight: bold;\">end<\/span>\n    <span style=\"color: #aa9900; font-weight: bold;\">end<\/span>\n&nbsp;\n    <span style=\"color: #aa9900; font-weight: bold;\">return<\/span> words\n<span style=\"color: #aa9900; font-weight: bold;\">end<\/span><\/pre><\/div><\/div><\/div>\n\n<h2>Culling the word list<\/h2>\n<p>Looking at our list of words, it should be clear that some of these can&#8217;t be part of any solution. For example, look at the word &#8220;lot&#8221; starting at position 4: any solution that contains &#8220;lot&#8221; must also contain a word that ends at position 3, as well as one that begins with position 7 (the left and right neighbors of &#8220;lot&#8221;). &#8220;Here&#8221; begins at 7, so we have a right neighbor, but nothing ends at 3, so, &#8220;lot&#8221; cannot be part of a solution.<\/p>\n<p>We need to go through the list of tuples and remove any words that don&#8217;t have left and right neighbors. Of course, those words we remove might themselves be the neighbors of other words, meaning that removing them might leave <em>those<\/em> words without a neighbor. So we need to keep culling the list until there&#8217;s nothing left that can be removed.<\/p>\n<p>We&#8217;ll make this fast by keeping a count of all the words that start and end at each position, and checking each word to ensure that that count is positive for each of them.<\/p>\n\n<div class=\"my_syntax_box\"><div class=\"my_syntax\"><div class=\"code\"><pre class=\"lua\" style=\"font-family:monospace;\"><span style=\"color: #aa9900; font-weight: bold;\">function<\/span> cull_words<span style=\"color: #66cc66;\">&#40;<\/span>words<span style=\"color: #66cc66;\">,<\/span> telegram_length<span style=\"color: #66cc66;\">&#41;<\/span>\n    <span style=\"color: #808080; font-style: italic;\">-- Map from start \/ end position, to word count<\/span>\n    <span style=\"color: #aa9900; font-weight: bold;\">local<\/span> by_start <span style=\"color: #66cc66;\">=<\/span> <span style=\"color: #0000aa;\">setmetatable<\/span><span style=\"color: #66cc66;\">&#40;<\/span> <span style=\"color: #66cc66;\">&#123;<\/span><span style=\"color: #66cc66;\">&#125;<\/span><span style=\"color: #66cc66;\">,<\/span> <span style=\"color: #66cc66;\">&#123;<\/span>__index<span style=\"color: #66cc66;\">=<\/span><span style=\"color: #aa9900; font-weight: bold;\">function<\/span><span style=\"color: #66cc66;\">&#40;<\/span><span style=\"color: #66cc66;\">&#41;<\/span> <span style=\"color: #aa9900; font-weight: bold;\">return<\/span> <span style=\"color: #cc66cc;\">0<\/span> <span style=\"color: #aa9900; font-weight: bold;\">end<\/span><span style=\"color: #66cc66;\">&#125;<\/span> <span style=\"color: #66cc66;\">&#41;<\/span>\n    <span style=\"color: #aa9900; font-weight: bold;\">local<\/span> by_end <span style=\"color: #66cc66;\">=<\/span> <span style=\"color: #0000aa;\">setmetatable<\/span><span style=\"color: #66cc66;\">&#40;<\/span> <span style=\"color: #66cc66;\">&#123;<\/span><span style=\"color: #66cc66;\">&#125;<\/span><span style=\"color: #66cc66;\">,<\/span> <span style=\"color: #66cc66;\">&#123;<\/span>__index<span style=\"color: #66cc66;\">=<\/span><span style=\"color: #aa9900; font-weight: bold;\">function<\/span><span style=\"color: #66cc66;\">&#40;<\/span><span style=\"color: #66cc66;\">&#41;<\/span> <span style=\"color: #aa9900; font-weight: bold;\">return<\/span> <span style=\"color: #cc66cc;\">0<\/span> <span style=\"color: #aa9900; font-weight: bold;\">end<\/span><span style=\"color: #66cc66;\">&#125;<\/span> <span style=\"color: #66cc66;\">&#41;<\/span>\n&nbsp;\n    <span style=\"color: #808080; font-style: italic;\">-- Initialize these counts<\/span>\n    <span style=\"color: #aa9900; font-weight: bold;\">for<\/span> _<span style=\"color: #66cc66;\">,<\/span> w <span style=\"color: #aa9900; font-weight: bold;\">in<\/span> <span style=\"color: #0000aa;\">ipairs<\/span><span style=\"color: #66cc66;\">&#40;<\/span>words<span style=\"color: #66cc66;\">&#41;<\/span> <span style=\"color: #aa9900; font-weight: bold;\">do<\/span>\n        <span style=\"color: #aa9900; font-weight: bold;\">local<\/span> l <span style=\"color: #66cc66;\">=<\/span> w<span style=\"color: #66cc66;\">&#91;<\/span><span style=\"color: #cc66cc;\">1<\/span><span style=\"color: #66cc66;\">&#93;<\/span>\n        <span style=\"color: #aa9900; font-weight: bold;\">local<\/span> r <span style=\"color: #66cc66;\">=<\/span> w<span style=\"color: #66cc66;\">&#91;<\/span><span style=\"color: #cc66cc;\">1<\/span><span style=\"color: #66cc66;\">&#93;<\/span> <span style=\"color: #66cc66;\">+<\/span> <span style=\"color: #66cc66;\">#<\/span><span style=\"color: #66cc66;\">&#40;<\/span>w<span style=\"color: #66cc66;\">&#91;<\/span><span style=\"color: #cc66cc;\">2<\/span><span style=\"color: #66cc66;\">&#93;<\/span><span style=\"color: #66cc66;\">&#41;<\/span> <span style=\"color: #66cc66;\">-<\/span> <span style=\"color: #cc66cc;\">1<\/span>\n&nbsp;\n        by_start<span style=\"color: #66cc66;\">&#91;<\/span>l<span style=\"color: #66cc66;\">&#93;<\/span> <span style=\"color: #66cc66;\">=<\/span> by_start<span style=\"color: #66cc66;\">&#91;<\/span>l<span style=\"color: #66cc66;\">&#93;<\/span> <span style=\"color: #66cc66;\">+<\/span> <span style=\"color: #cc66cc;\">1<\/span>\n        by_end<span style=\"color: #66cc66;\">&#91;<\/span>r<span style=\"color: #66cc66;\">&#93;<\/span> <span style=\"color: #66cc66;\">=<\/span> by_end<span style=\"color: #66cc66;\">&#91;<\/span>r<span style=\"color: #66cc66;\">&#93;<\/span> <span style=\"color: #66cc66;\">+<\/span> <span style=\"color: #cc66cc;\">1<\/span>\n    <span style=\"color: #aa9900; font-weight: bold;\">end<\/span>\n&nbsp;\n    <span style=\"color: #808080; font-style: italic;\">-- Loop until we run out of things to remove<\/span>\n    <span style=\"color: #aa9900; font-weight: bold;\">while<\/span> <span style=\"color: #aa9900;\">true<\/span> <span style=\"color: #aa9900; font-weight: bold;\">do<\/span>\n        <span style=\"color: #aa9900; font-weight: bold;\">local<\/span> new_words <span style=\"color: #66cc66;\">=<\/span> <span style=\"color: #66cc66;\">&#123;<\/span><span style=\"color: #66cc66;\">&#125;<\/span>\n&nbsp;\n        <span style=\"color: #aa9900; font-weight: bold;\">for<\/span> i<span style=\"color: #66cc66;\">,<\/span> w <span style=\"color: #aa9900; font-weight: bold;\">in<\/span> <span style=\"color: #0000aa;\">ipairs<\/span><span style=\"color: #66cc66;\">&#40;<\/span>words<span style=\"color: #66cc66;\">&#41;<\/span> <span style=\"color: #aa9900; font-weight: bold;\">do<\/span>\n            <span style=\"color: #aa9900; font-weight: bold;\">local<\/span> left <span style=\"color: #66cc66;\">=<\/span> w<span style=\"color: #66cc66;\">&#91;<\/span><span style=\"color: #cc66cc;\">1<\/span><span style=\"color: #66cc66;\">&#93;<\/span>\n            <span style=\"color: #aa9900; font-weight: bold;\">local<\/span> right <span style=\"color: #66cc66;\">=<\/span> w<span style=\"color: #66cc66;\">&#91;<\/span><span style=\"color: #cc66cc;\">1<\/span><span style=\"color: #66cc66;\">&#93;<\/span> <span style=\"color: #66cc66;\">+<\/span> <span style=\"color: #66cc66;\">#<\/span><span style=\"color: #66cc66;\">&#40;<\/span>w<span style=\"color: #66cc66;\">&#91;<\/span><span style=\"color: #cc66cc;\">2<\/span><span style=\"color: #66cc66;\">&#93;<\/span><span style=\"color: #66cc66;\">&#41;<\/span> <span style=\"color: #66cc66;\">-<\/span> <span style=\"color: #cc66cc;\">1<\/span>\n&nbsp;\n            <span style=\"color: #aa9900; font-weight: bold;\">local<\/span> left_neighbor <span style=\"color: #66cc66;\">=<\/span> <span style=\"color: #66cc66;\">&#40;<\/span>left <span style=\"color: #66cc66;\">==<\/span> <span style=\"color: #cc66cc;\">1<\/span><span style=\"color: #66cc66;\">&#41;<\/span> <span style=\"color: #aa9900; font-weight: bold;\">or<\/span> <span style=\"color: #66cc66;\">&#40;<\/span>by_end<span style=\"color: #66cc66;\">&#91;<\/span>left<span style=\"color: #66cc66;\">-<\/span><span style=\"color: #cc66cc;\">1<\/span><span style=\"color: #66cc66;\">&#93;<\/span> <span style=\"color: #66cc66;\">&gt;<\/span> <span style=\"color: #cc66cc;\">0<\/span><span style=\"color: #66cc66;\">&#41;<\/span>\n            <span style=\"color: #aa9900; font-weight: bold;\">local<\/span> right_neighbor <span style=\"color: #66cc66;\">=<\/span> <span style=\"color: #66cc66;\">&#40;<\/span>right <span style=\"color: #66cc66;\">==<\/span> telegram_length<span style=\"color: #66cc66;\">&#41;<\/span> <span style=\"color: #aa9900; font-weight: bold;\">or<\/span> <span style=\"color: #66cc66;\">&#40;<\/span>by_start<span style=\"color: #66cc66;\">&#91;<\/span>right<span style=\"color: #66cc66;\">+<\/span><span style=\"color: #cc66cc;\">1<\/span><span style=\"color: #66cc66;\">&#93;<\/span> <span style=\"color: #66cc66;\">&gt;<\/span> <span style=\"color: #cc66cc;\">0<\/span><span style=\"color: #66cc66;\">&#41;<\/span>\n&nbsp;\n            <span style=\"color: #aa9900; font-weight: bold;\">if<\/span> left_neighbor <span style=\"color: #aa9900; font-weight: bold;\">and<\/span> right_neighbor <span style=\"color: #aa9900; font-weight: bold;\">then<\/span>\n                <span style=\"color: #0000aa;\">table.insert<\/span><span style=\"color: #66cc66;\">&#40;<\/span>new_words<span style=\"color: #66cc66;\">,<\/span> w<span style=\"color: #66cc66;\">&#41;<\/span>\n            <span style=\"color: #aa9900; font-weight: bold;\">else<\/span>\n                by_start<span style=\"color: #66cc66;\">&#91;<\/span>left<span style=\"color: #66cc66;\">&#93;<\/span> <span style=\"color: #66cc66;\">=<\/span> by_start<span style=\"color: #66cc66;\">&#91;<\/span>left<span style=\"color: #66cc66;\">&#93;<\/span> <span style=\"color: #66cc66;\">-<\/span> <span style=\"color: #cc66cc;\">1<\/span>\n                by_end<span style=\"color: #66cc66;\">&#91;<\/span>right<span style=\"color: #66cc66;\">&#93;<\/span> <span style=\"color: #66cc66;\">=<\/span> by_end<span style=\"color: #66cc66;\">&#91;<\/span>right<span style=\"color: #66cc66;\">&#93;<\/span> <span style=\"color: #66cc66;\">-<\/span> <span style=\"color: #cc66cc;\">1<\/span>\n            <span style=\"color: #aa9900; font-weight: bold;\">end<\/span>\n        <span style=\"color: #aa9900; font-weight: bold;\">end<\/span>\n&nbsp;\n        <span style=\"color: #aa9900; font-weight: bold;\">if<\/span> <span style=\"color: #66cc66;\">#<\/span>new_words <span style=\"color: #66cc66;\">==<\/span> <span style=\"color: #66cc66;\">#<\/span>words <span style=\"color: #aa9900; font-weight: bold;\">then<\/span> <span style=\"color: #aa9900; font-weight: bold;\">break<\/span> <span style=\"color: #aa9900; font-weight: bold;\">end<\/span>\n        words <span style=\"color: #66cc66;\">=<\/span> new_words\n    <span style=\"color: #aa9900; font-weight: bold;\">end<\/span>\n&nbsp;\n    <span style=\"color: #aa9900; font-weight: bold;\">return<\/span> words\n<span style=\"color: #aa9900; font-weight: bold;\">end<\/span><\/pre><\/div><\/div><\/div>\n\n<h2>Enumerating the solutions<\/h2>\n<p>At this point, everything in our word list can be part of a solution. We can organize our words into a tree, like so:<\/p>\n<pre>\r\niamatelegram\r\ni\r\n-a\r\n-am\r\n--ma\r\n--mate\r\n---a\r\n---ate\r\n----telegram\r\n------leg\r\n---------ram\r\n<\/pre>\n<img src=\"http:\/\/www.playwithlua.com\/wp-content\/tfo-graphviz\/dbaae6f2c686814b19c971820a478ec3.png\" class=\"graphviz\" \/>\n<p>Here&#8217;s a simple function to do that:<\/p>\n\n<div class=\"my_syntax_box\"><div class=\"my_syntax\"><div class=\"code\"><pre class=\"lua\" style=\"font-family:monospace;\"><span style=\"color: #aa9900; font-weight: bold;\">function<\/span> build_word_tree<span style=\"color: #66cc66;\">&#40;<\/span>words<span style=\"color: #66cc66;\">&#41;<\/span>\n    <span style=\"color: #aa9900; font-weight: bold;\">local<\/span> <span style=\"color: #aa9900; font-weight: bold;\">function<\/span> find_by_start<span style=\"color: #66cc66;\">&#40;<\/span>start<span style=\"color: #66cc66;\">&#41;<\/span>\n        <span style=\"color: #aa9900; font-weight: bold;\">local<\/span> arr <span style=\"color: #66cc66;\">=<\/span> <span style=\"color: #66cc66;\">&#123;<\/span><span style=\"color: #66cc66;\">&#125;<\/span>\n        <span style=\"color: #aa9900; font-weight: bold;\">for<\/span> _<span style=\"color: #66cc66;\">,<\/span>w <span style=\"color: #aa9900; font-weight: bold;\">in<\/span> <span style=\"color: #0000aa;\">ipairs<\/span><span style=\"color: #66cc66;\">&#40;<\/span>words<span style=\"color: #66cc66;\">&#41;<\/span> <span style=\"color: #aa9900; font-weight: bold;\">do<\/span>\n            <span style=\"color: #aa9900; font-weight: bold;\">if<\/span> w<span style=\"color: #66cc66;\">&#91;<\/span><span style=\"color: #cc66cc;\">1<\/span><span style=\"color: #66cc66;\">&#93;<\/span> <span style=\"color: #66cc66;\">==<\/span> start <span style=\"color: #aa9900; font-weight: bold;\">then<\/span> <span style=\"color: #0000aa;\">table.insert<\/span><span style=\"color: #66cc66;\">&#40;<\/span>arr<span style=\"color: #66cc66;\">,<\/span> w<span style=\"color: #66cc66;\">&#91;<\/span><span style=\"color: #cc66cc;\">2<\/span><span style=\"color: #66cc66;\">&#93;<\/span><span style=\"color: #66cc66;\">&#41;<\/span> <span style=\"color: #aa9900; font-weight: bold;\">end<\/span>\n        <span style=\"color: #aa9900; font-weight: bold;\">end<\/span>\n        <span style=\"color: #aa9900; font-weight: bold;\">return<\/span> arr\n    <span style=\"color: #aa9900; font-weight: bold;\">end<\/span>\n&nbsp;\n    <span style=\"color: #aa9900; font-weight: bold;\">local<\/span> <span style=\"color: #aa9900; font-weight: bold;\">function<\/span> tree_for_start<span style=\"color: #66cc66;\">&#40;<\/span>start<span style=\"color: #66cc66;\">&#41;<\/span>\n        <span style=\"color: #aa9900; font-weight: bold;\">local<\/span> tree <span style=\"color: #66cc66;\">=<\/span> <span style=\"color: #66cc66;\">&#123;<\/span><span style=\"color: #66cc66;\">&#125;<\/span>\n        <span style=\"color: #aa9900; font-weight: bold;\">for<\/span> _<span style=\"color: #66cc66;\">,<\/span> w <span style=\"color: #aa9900; font-weight: bold;\">in<\/span> <span style=\"color: #0000aa;\">ipairs<\/span><span style=\"color: #66cc66;\">&#40;<\/span>find_by_start<span style=\"color: #66cc66;\">&#40;<\/span>start<span style=\"color: #66cc66;\">&#41;<\/span><span style=\"color: #66cc66;\">&#41;<\/span> <span style=\"color: #aa9900; font-weight: bold;\">do<\/span>\n            tree<span style=\"color: #66cc66;\">&#91;<\/span>w<span style=\"color: #66cc66;\">&#93;<\/span> <span style=\"color: #66cc66;\">=<\/span> tree_for_start<span style=\"color: #66cc66;\">&#40;<\/span> start <span style=\"color: #66cc66;\">+<\/span> <span style=\"color: #66cc66;\">#<\/span>w <span style=\"color: #66cc66;\">&#41;<\/span>\n        <span style=\"color: #aa9900; font-weight: bold;\">end<\/span>\n        <span style=\"color: #aa9900; font-weight: bold;\">return<\/span> tree\n    <span style=\"color: #aa9900; font-weight: bold;\">end<\/span>\n&nbsp;\n    <span style=\"color: #aa9900; font-weight: bold;\">return<\/span> tree_for_start<span style=\"color: #66cc66;\">&#40;<\/span><span style=\"color: #cc66cc;\">1<\/span><span style=\"color: #66cc66;\">&#41;<\/span>\n<span style=\"color: #aa9900; font-weight: bold;\">end<\/span><\/pre><\/div><\/div><\/div>\n\n<p>Now, every path through this tree is a possible solution. So, we simply do a depth-first traversal of the tree and list out all the paths:<\/p>\n\n<div class=\"my_syntax_box\"><div class=\"my_syntax\"><div class=\"code\"><pre class=\"lua\" style=\"font-family:monospace;\"><span style=\"color: #aa9900; font-weight: bold;\">function<\/span> print_tree<span style=\"color: #66cc66;\">&#40;<\/span>word_tree<span style=\"color: #66cc66;\">&#41;<\/span>\n    <span style=\"color: #aa9900; font-weight: bold;\">local<\/span> strings <span style=\"color: #66cc66;\">=<\/span> <span style=\"color: #66cc66;\">&#123;<\/span><span style=\"color: #66cc66;\">&#125;<\/span> <span style=\"color: #808080; font-style: italic;\">-- The paths through the tree<\/span>\n&nbsp;\n    <span style=\"color: #aa9900; font-weight: bold;\">local<\/span> <span style=\"color: #aa9900; font-weight: bold;\">function<\/span> dft<span style=\"color: #66cc66;\">&#40;<\/span>path<span style=\"color: #66cc66;\">,<\/span> current_node<span style=\"color: #66cc66;\">&#41;<\/span>\n        <span style=\"color: #aa9900; font-weight: bold;\">for<\/span> word<span style=\"color: #66cc66;\">,<\/span> children <span style=\"color: #aa9900; font-weight: bold;\">in<\/span> <span style=\"color: #0000aa;\">pairs<\/span><span style=\"color: #66cc66;\">&#40;<\/span>current_node<span style=\"color: #66cc66;\">&#41;<\/span> <span style=\"color: #aa9900; font-weight: bold;\">do<\/span>\n            <span style=\"color: #aa9900; font-weight: bold;\">local<\/span> new_path <span style=\"color: #66cc66;\">=<\/span> path <span style=\"color: #aa9900; font-weight: bold;\">and<\/span> path <span style=\"color: #66cc66;\">..<\/span> <span style=\"color: #ff6666;\">&quot; &quot;<\/span> <span style=\"color: #66cc66;\">..<\/span> word <span style=\"color: #aa9900; font-weight: bold;\">or<\/span> word\n&nbsp;\n            <span style=\"color: #aa9900; font-weight: bold;\">if<\/span> <span style=\"color: #aa9900; font-weight: bold;\">not<\/span> <span style=\"color: #0000aa;\">next<\/span><span style=\"color: #66cc66;\">&#40;<\/span>children<span style=\"color: #66cc66;\">&#41;<\/span> <span style=\"color: #aa9900; font-weight: bold;\">then<\/span>\n                <span style=\"color: #0000aa;\">table.insert<\/span><span style=\"color: #66cc66;\">&#40;<\/span>strings<span style=\"color: #66cc66;\">,<\/span> new_path<span style=\"color: #66cc66;\">&#41;<\/span>\n            <span style=\"color: #aa9900; font-weight: bold;\">else<\/span>\n                dft<span style=\"color: #66cc66;\">&#40;<\/span>new_path<span style=\"color: #66cc66;\">,<\/span> children<span style=\"color: #66cc66;\">&#41;<\/span>\n            <span style=\"color: #aa9900; font-weight: bold;\">end<\/span>\n        <span style=\"color: #aa9900; font-weight: bold;\">end<\/span>\n    <span style=\"color: #aa9900; font-weight: bold;\">end<\/span>\n&nbsp;\n    dft<span style=\"color: #66cc66;\">&#40;<\/span><span style=\"color: #aa9900;\">nil<\/span><span style=\"color: #66cc66;\">,<\/span> word_tree<span style=\"color: #66cc66;\">&#41;<\/span>\n&nbsp;\n    <span style=\"color: #aa9900; font-weight: bold;\">for<\/span> _<span style=\"color: #66cc66;\">,<\/span> s <span style=\"color: #aa9900; font-weight: bold;\">in<\/span> <span style=\"color: #0000aa;\">ipairs<\/span><span style=\"color: #66cc66;\">&#40;<\/span>strings<span style=\"color: #66cc66;\">&#41;<\/span> <span style=\"color: #aa9900; font-weight: bold;\">do<\/span>\n        <span style=\"color: #0000aa;\">print<\/span><span style=\"color: #66cc66;\">&#40;<\/span>s<span style=\"color: #66cc66;\">&#41;<\/span>\n    <span style=\"color: #aa9900; font-weight: bold;\">end<\/span>\n<span style=\"color: #aa9900; font-weight: bold;\">end<\/span><\/pre><\/div><\/div><\/div>\n\n<h2>Putting it together<\/h2>\n<p>Now that we have all the tools, we just run them all together:<\/p>\n\n<div class=\"my_syntax_box\"><div class=\"my_syntax\"><div class=\"code\"><pre class=\"lua\" style=\"font-family:monospace;\"><span style=\"color: #aa9900; font-weight: bold;\">function<\/span> parse_telegram<span style=\"color: #66cc66;\">&#40;<\/span>telegram<span style=\"color: #66cc66;\">&#41;<\/span>\n    <span style=\"color: #aa9900; font-weight: bold;\">local<\/span> words <span style=\"color: #66cc66;\">=<\/span> find_words<span style=\"color: #66cc66;\">&#40;<\/span>telegram<span style=\"color: #66cc66;\">,<\/span> dictionary<span style=\"color: #66cc66;\">&#41;<\/span>\n    <span style=\"color: #aa9900; font-weight: bold;\">local<\/span> culled <span style=\"color: #66cc66;\">=<\/span> cull_words<span style=\"color: #66cc66;\">&#40;<\/span>words<span style=\"color: #66cc66;\">,<\/span> <span style=\"color: #66cc66;\">#<\/span>telegram<span style=\"color: #66cc66;\">&#41;<\/span>\n    <span style=\"color: #aa9900; font-weight: bold;\">local<\/span> tree <span style=\"color: #66cc66;\">=<\/span> build_word_tree<span style=\"color: #66cc66;\">&#40;<\/span>culled<span style=\"color: #66cc66;\">&#41;<\/span>\n&nbsp;\n    print_tree<span style=\"color: #66cc66;\">&#40;<\/span>tree<span style=\"color: #66cc66;\">&#41;<\/span>\n<span style=\"color: #aa9900; font-weight: bold;\">end<\/span><\/pre><\/div><\/div><\/div>\n\n<p>Let&#8217;s test it out:<\/p>\n\n<div class=\"my_syntax_box\"><div class=\"my_syntax\"><div class=\"code\"><pre class=\"lua\" style=\"font-family:monospace;\">parse_telegram<span style=\"color: #66cc66;\">&#40;<\/span><span style=\"color: #ff6666;\">&quot;turkeytrotstowaterwhereistaskforcethirtyfourtheworldwonders&quot;<\/span><span style=\"color: #66cc66;\">&#41;<\/span>\n<span style=\"color: #808080; font-style: italic;\">-- turkey trots to water where is task force thirty four the world wonders<\/span><\/pre><\/div><\/div><\/div>\n\n<p>Again, you may get more possible solutions if you use a different dictionary.<\/p>\n<h2>Code, next steps<\/h2>\n<p>As always, the complete code is <a onclick=\"javascript:pageTracker._trackPageview('\/outgoing\/gist.github.com\/randrews\/26d4a5cd2a59b95bb376');\"  href=\"https:\/\/gist.github.com\/randrews\/26d4a5cd2a59b95bb376\">on Github<\/a>. If you want to play with it, try messing around with the <code>print_tree<\/code> function: look for the path through the tree with the highest average word length (hint: the total word length will always be constant).<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A few days ago at work, a friend of mine came up with this problem. I call it the &#8220;telegram&#8221; problem: you get a string consisting of words run together without spaces, and figure out where spaces can go to separate out the words. For example, a famous misunderstanding: parse_telegram&#40;&quot;turkeytrotstowaterwhereistaskforcethirtyfourtheworldwonders&quot;&#41; &#8212; turkey trots to water [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[],"_links":{"self":[{"href":"http:\/\/www.playwithlua.com\/index.php?rest_route=\/wp\/v2\/posts\/60"}],"collection":[{"href":"http:\/\/www.playwithlua.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.playwithlua.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.playwithlua.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.playwithlua.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=60"}],"version-history":[{"count":0,"href":"http:\/\/www.playwithlua.com\/index.php?rest_route=\/wp\/v2\/posts\/60\/revisions"}],"wp:attachment":[{"href":"http:\/\/www.playwithlua.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=60"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.playwithlua.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=60"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.playwithlua.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=60"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}