Uses a dictionary of words and spell checks individual words. JUnit test files were included by BYU professors, and the code is tailored to pass those tests.

For most spell checkers, a candidate word is considered to be spelled correctly if it is found in a long list of valid words called a dictionary. Google provides a more powerful spell corrector for validating the keywords we type into the input text box. It not only checks against a dictionary, but, if it doesn’t find the keyword in the dictionary, it suggests a most likely replacement. To do this it associates with every word in the dictionary a frequency, the percent of the time that word is expected to appear in a large document. When a keyword is not found in the dictionary, Google suggests a “similar” word (“similar” will be defined later) whose frequency is greater than or equal to that of any other “similar” word.

In this project you will create such a spell corrector. Our spell checker will only validate a single word rather than each word in a list of words.

For this program we need a dictionary similar to Google’s. Our dictionary is generated using a large text file. The text file contains a large number of unsorted words. Every time your program runs it will create the dictionary from this text file. The dictionary will contain every word in the text file and a frequency for that word. Instead of storing a percent, your program need only store the number of times the word appears in the text file.

When storing or looking up a word in the dictionary we want the match to be case insensitive. That is, a new or candidate word matches a word in the dictionary independent of the case of the letters in the respective words. If the strings “apple”, “Apple”, and “APPLE” each appear once in the input file, then your representation of the word (it could be any of the three words or some other variation) would appear once and would have a frequency of 3 associated with it.

You are required to implement your dictionary as a Trie (pronounced “try”). A Trie is a tree-based data structure designed to store items that are sequences of characters from an alphabet. Each Trie-Node stores a count and a sequence of Nodes, one for each element in the alphabet. Each Trie-Node has a single parent except for the root of the Trie, which does not have a parent.
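The Trie described above can be sketched in Java roughly as follows. This is a minimal illustration, not the assignment's required interface: the names `WordTrie`, `Node`, `add`, and `count` are made up for this example, and it assumes the alphabet is the 26 letters a to z. Each node holds a count and one child slot per letter, and words are lowercased before they are stored or looked up so that matching is case insensitive.

```java
// Sketch only: WordTrie, Node, add, and count are hypothetical names,
// not the API required by the assignment or its JUnit tests.
final class WordTrie {
    // Assumption: words contain only the letters a-z (after lowercasing).
    private static final int ALPHABET_SIZE = 26;

    private static final class Node {
        int count; // number of times the word ending at this node was added
        final Node[] children = new Node[ALPHABET_SIZE]; // one slot per letter
    }

    // The root is the only node without a parent; it represents the empty prefix.
    private final Node root = new Node();

    /** Records one occurrence of the word, ignoring letter case. */
    void add(String word) {
        Node node = root;
        for (char c : word.toLowerCase(java.util.Locale.ROOT).toCharArray()) {
            int i = indexOf(c);
            if (node.children[i] == null) {
                node.children[i] = new Node(); // create the path lazily
            }
            node = node.children[i];
        }
        node.count++;
    }

    /** Returns how often the word was added (0 if never), ignoring letter case. */
    int count(String word) {
        Node node = root;
        for (char c : word.toLowerCase(java.util.Locale.ROOT).toCharArray()) {
            node = node.children[indexOf(c)];
            if (node == null) {
                return 0; // path does not exist, so the word was never stored
            }
        }
        return node.count;
    }

    // Maps 'a'..'z' to 0..25 and rejects anything outside the assumed alphabet.
    private static int indexOf(char c) {
        if (c < 'a' || c > 'z') {
            throw new IllegalArgumentException("unsupported character: " + c);
        }
        return c - 'a';
    }
}
```

With this sketch, adding “apple”, “Apple”, and “APPLE” once each leaves a single path in the Trie whose final node has a count of 3, while a prefix such as “app” that was never added on its own still reports 0.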