I am working on a program that takes user text, puts it in a sorted array, then displays the words and their frequencies. I am stuck on the frequencies: I assume I need an int[], but I am confused about how to return both the word (String) and its frequency (int) to the tester program. I picture a two-dimensional array, but cannot pull it together in code. I have read the postings on word counts and the recently answered Set question Duane posed, but I am still unsure how my code should look.
public int wordCount() { st = new StringTokenizer(string, " \t\n\r\f.?,;:!"); return st.countTokens(); }
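public String listWordsInOrder() {
    String returnString = "";
    st = new StringTokenizer(string, " \t\n\r\f.?,;:!");
    String[] sa = new String[wordCount()];
    int index = 0;
    // Copy each token into the array, then sort it (needs import java.util.Arrays).
    while (st.hasMoreTokens()) {
        sa[index++] = st.nextToken();
    }
    Arrays.sort(sa);
    // Append one word per line; note sa[i], not the array itself.
    for (int i = 0; i < sa.length; i++) {
        returnString = returnString + sa[i] + "\n";
    }
    return returnString;
}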
}
CODE FOR TESTER PROGRAM:
public class StringInfoTester extends Object {
    public static void main(String[] args) {
        StringInfo si = new StringInfo("Hello, my name is Victor. \n What is your name?");
        System.out.println(si.listWords() + "\n");
        System.out.println("There are " + si.wordCount() + " words in the above list.");
        System.out.println(" ");
        System.out.println("The sorted list is: ");
        System.out.println(si.listWordsInOrder());
    }
}
I am also wondering how I might use the Comparator Interface with this program.
import java.util.Arrays;
import java.util.Enumeration;
import java.util.Hashtable;
import java.util.StringTokenizer;

public class TestWordFreq {
    String text;
    StringTokenizer tokens;
    Hashtable freq = new Hashtable();   // word -> Integer count
    int count = 0;                      // total number of tokens seen
    String[] orderedWords = null;       // distinct words, sorted

    public TestWordFreq(String text) {
        this.text = text;
        tokens = new StringTokenizer(text, " \t\n\r\f.?,;:!\"\'");
        while (tokens.hasMoreTokens()) {
            count++;
            String st = (String) tokens.nextToken();
            //System.out.println(st);
            if (freq.containsKey(st)) {
                // Word seen before: bump its count.
                int v = ((Integer) freq.get(st)).intValue() + 1;
                freq.put(st, Integer.valueOf(v));
            } else {
                freq.put(st, Integer.valueOf(1));
            }
        }
        // Copy the distinct words into an array and sort them.
        orderedWords = new String[freq.keySet().size()];
        int i = 0;
        Enumeration keys = freq.keys();
        while (keys.hasMoreElements()) {
            String key = (String) keys.nextElement();
            orderedWords[i++] = key;
        }
        Arrays.sort(orderedWords);
    }

    public static void main(String[] args) {
        String s = "this is the test and this is a \"very very\" long text line including abcdefghijklmnopqrstuvwxyz and 0123456789 and pi = 3.14159265358979323 and \'epsilon\' = 1e-40 some math functions: sin, cos, tangent; log, pow and others: costh, sinth, and tangenth.";
        TestWordFreq twf = new TestWordFreq(s);
        System.out.println(">>>Total words = " + twf.count);
        System.out.println(">>>Word frequency: ");
        Enumeration ks = twf.freq.keys();
        while (ks.hasMoreElements()) {
            String k = (String) ks.nextElement();
            int v = ((Integer) twf.freq.get(k)).intValue();
            System.out.println(k + ":\t" + v);
        }
        System.out.println(">>>Ordered Word: ");
        for (int i = 0; i < twf.orderedWords.length; ++i) {
            System.out.println(twf.orderedWords[i]);
        }
    }
}
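On the original question of handing both the word and its count back to the tester: one option, sketched here as an assumption rather than as part of the code above, is to return a Map instead of a two-dimensional array. A TreeMap keeps its keys in natural String order (uppercase sorts before lowercase), so the words come out sorted without a separate array. The class and method names (WordFrequencySketch, countWords, listWordsWithCounts) are made up for illustration, and the generics need Java 5 or later.

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class WordFrequencySketch {

    // Returns word -> count. TreeMap iterates its keys in natural String order.
    public static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        StringTokenizer tokens = new StringTokenizer(text, " \t\n\r\f.?,;:!");
        while (tokens.hasMoreTokens()) {
            String word = tokens.nextToken();
            Integer old = counts.get(word);            // null if not seen yet
            counts.put(word, old == null ? 1 : old + 1);
        }
        return counts;
    }

    // Formats one "word: count" pair per line, in the map's sorted order.
    public static String listWordsWithCounts(String text) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Integer> e : countWords(text).entrySet()) {
            sb.append(e.getKey()).append(": ").append(e.getValue()).append("\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(listWordsWithCounts("Hello, my name is Victor. \n What is your name?"));
    }
}
```

The tester can then either print the formatted string or walk the returned Map itself, so the word and its frequency always travel together.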
If you want a different sort order, write a class that implements the Comparator interface and pass an instance of it to Arrays.sort(Object[], Comparator) instead of calling Arrays.sort(Object[]).
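As a rough illustration of the Comparator route, the sketch below sorts the distinct words by descending frequency and breaks ties alphabetically. The names FrequencyOrderSketch, ByFrequencyDesc and sortByFrequency are hypothetical, and it assumes the counts are already in a Map<String, Integer> (Java 7 or later for Integer.compare).

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

public class FrequencyOrderSketch {

    // Hypothetical comparator: higher counts first, ties in natural String order.
    static class ByFrequencyDesc implements Comparator<String> {
        private final Map<String, Integer> freq;

        ByFrequencyDesc(Map<String, Integer> freq) {
            this.freq = freq;
        }

        public int compare(String a, String b) {
            // Arguments are swapped so that larger counts sort earlier.
            int byCount = Integer.compare(freq.get(b), freq.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        }
    }

    // Returns the map's keys ordered by the comparator above.
    public static String[] sortByFrequency(Map<String, Integer> freq) {
        String[] words = freq.keySet().toArray(new String[0]);
        Arrays.sort(words, new ByFrequencyDesc(freq));
        return words;
    }

    public static void main(String[] args) {
        Map<String, Integer> freq = new HashMap<String, Integer>();
        freq.put("Hello", 1);
        freq.put("my", 1);
        freq.put("name", 2);
        freq.put("is", 2);
        for (String w : sortByFrequency(freq)) {
            System.out.println(w + ":\t" + freq.get(w));
        }
    }
}
```

If all you need is a case-insensitive alphabetical list, the built-in String.CASE_INSENSITIVE_ORDER comparator can be passed to Arrays.sort in the same way, with no custom class.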