ngrams, a Python code which analyzes a string or text against the observed frequencies of ngrams (strings of 1, 2, 3, 4, or 5 alphabetic characters).
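As a rough illustration of what counting ngrams involves, the following minimal sketch tallies every run of n consecutive letters in a string and converts the tallies to relative frequencies. The function names count_ngrams and relative_frequencies are hypothetical, and the choices to fold text to lower case and to keep ngrams from spanning word boundaries are assumptions; the ngrams code itself may handle these details differently.

```python
import re
from collections import Counter


def count_ngrams(text, n):
    """Count each run of n consecutive letters inside the words of text.

    Assumptions (not taken from the ngrams code): text is folded to
    lower case, only the letters a-z count, and an ngram never spans
    a word boundary.
    """
    counts = Counter()
    for word in re.findall(r"[a-z]+", text.lower()):
        # Slide a window of width n along the word.
        for i in range(len(word) - n + 1):
            counts[word[i:i + n]] += 1
    return counts


def relative_frequencies(counts):
    """Convert raw counts to fractions of the total number of ngrams."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: c / total for gram, c in counts.items()}
```

For example, count_ngrams("The then", 2) finds the 2-grams "th" and "he" twice each and "en" once. Frequencies computed this way from a sample text could then be compared against a table of observed English frequencies.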
The information on this web page is distributed under the MIT license.
ngrams is available in a Python version.
german, a dataset directory which contains some short German texts;
ngrams, a dataset directory which contains information about the observed frequency of "ngrams" (particular sequences of n letters) in English text;
text, a dataset directory which contains some short texts in English;
text_to_wordlist, a Python code which shows how to start with a text file, read its information into a single long string, and divide that string into individual words. This allows an investigator to analyze the text for patterns.
words, a dataset directory which contains lists of words;
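The text-to-words step described for text_to_wordlist above (read a file into one long string, then divide that string into words) might be sketched as follows. This is not the text_to_wordlist code itself: the function names read_text and split_into_words are hypothetical, and the rule that a word is a run of letters with an optional internal apostrophe is an assumption made for the example.

```python
import re


def read_text(filename):
    """Read an entire text file into a single string (UTF-8 assumed)."""
    with open(filename, "r", encoding="utf-8") as f:
        return f.read()


def split_into_words(text):
    """Divide a string into words.

    Assumption: a word is a run of letters A-Z or a-z, optionally
    followed by an apostrophe and more letters, as in "it's".
    Digits and punctuation act as separators.
    """
    return re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
```

A call such as split_into_words(read_text("sample.txt")), where sample.txt is a placeholder file name, yields a list of words that can then be examined for patterns such as ngram frequencies.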