Skip to content

davstott/davsFilters

Repository files navigation

Dav's initial wanderings after the Intro to AI course, finally put into a repos to keep things tidy and sensible. Expect incomplete randomness.

It feels possible to use less ram with the merge sort

The ngram code is starting to come together a little bit, although I've only trained it with one of Google's files so far. Using 2-grams doesn't give much contrast between English and French, but limiting it to 3 and 4 does better

echo "the queen sometimes likes pie" | python ngramQuery.py
echo "en mon vacance, je m'apelle Dav" | python ngramQuery.py


charCountGraph.py filename1 filename2 ... filenameN
  uses matplotlib to overlay line graphs of the character frequencies in a list of filenames or a single standard input.  Useful to compare between languages.

charFreqCreate.py
  reads the data files englishFreqs and otherFreqs and creates a python structure containing letter frequencies by language, stored in the data file pickledCharFreqs

charFreqCount.py filename1 filename2
  using the letter frequency counts stored in pickledCharFreqs, reads all the data from all the filenames specified and returns a relative probability score for each language. 
  The scores are not in any particular units (well, technically natural log postieror probabilities) so just use the numbers to sort the language names, but a higher number means a greater likelihood that the provided text matches any particular language.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages