The Artima Developer Community

Java Answers Forum
word count/tokenizer/comparator

1 reply on 1 page. Most recent reply: May 19, 2003 10:46 AM by chad yu

Victor VanAlter

Posts: 4
Nickname: cadetvva
Registered: May, 2003

word count/tokenizer/comparator Posted: May 3, 2003 9:52 PM
Sir/Madame:

I am working on a program that takes user text, puts the words in a sorted array, then displays each word and its frequency. I am stuck on the frequencies: I assume I need an int[], but I am confused about how to return both the word (String) and its frequency (int) to the tester program. I picture a two-dimensional array, but cannot pull it together in code. I have read the postings on word counts and the recently answered Set question Duane posed, but I am still unsure how my code should look.

CODE:
import java.util.*;

public class StringInfo extends Object
{

private StringTokenizer st;
private String string;

public StringInfo(String string)
{

this.string = string;
}

public String listWords()
{
st = new StringTokenizer(string, " \t\n\r\f.?,;:!");
String returnString = "";

while(st.hasMoreTokens())
{
returnString = returnString + st.nextToken() + "\n";
}

return returnString;
}

public int wordCount()
{
st = new StringTokenizer(string, " \t\n\r\f.?,;:!");
return st.countTokens();
}

public String listWordsInOrder()
{
String returnString = "";
st = new StringTokenizer(string, " \t\n\r\f.?,;:!");
// Size the array from this tokenizer; calling wordCount() here would replace st.
String[] sa = new String[st.countTokens()];
int index = 0;




while(st.hasMoreTokens())
{
sa[index++] = st.nextToken();
}


Arrays.sort(sa);

for(int i = 0; i < sa.length; i++)
{
returnString = returnString + sa[i] + "\n";
}

return returnString;
}

}




CODE FOR TESTER PROGRAM:
public class StringInfoTester extends Object
{

public static void main(String[] args)
{
StringInfo si = new StringInfo("Hello, my name is Victor. \n What is your name?");
System.out.println(si.listWords() + "\n");
System.out.println("There are " + si.wordCount() + " words in the above list.");
System.out.println(" ");
System.out.println("The sorted list is: ");
System.out.println(si.listWordsInOrder());
}
}


I am also wondering how I might use the Comparator Interface with this program.

Any help would be very much appreciated!

CVA


chad yu

Posts: 4
Nickname: ccy
Registered: May, 2003

Re: word count/tokenizer/comparator Posted: May 19, 2003 10:46 AM
Please see the program:

import java.util.*;

public class TestWordFreq {
String text;
StringTokenizer tokens;
Hashtable freq = new Hashtable();
int count = 0;
String[] orderedWords = null;

public TestWordFreq(String text) {
this.text = text;
tokens = new StringTokenizer(text, " \t\n\r\f.?,;:!\"\'");
while (tokens.hasMoreTokens()) {
count++;
String st = (String) tokens.nextToken();
//System.out.println(st);
if (freq.containsKey(st)) {
int v = ((Integer) freq.get(st)).intValue() + 1;
freq.put(st, new Integer(v));
} else {
freq.put(st, new Integer(1));
}
}

orderedWords = new String[freq.keySet().size()];
int i=0;
Enumeration keys = freq.keys();
while (keys.hasMoreElements()) {
String key = (String) keys.nextElement();
orderedWords[i++] = key;
}
Arrays.sort(orderedWords);
}

public static void main(String[] args) {
String s = "this is the test and this is a \"very very\" long text line including abcdefghijklmnopqrstuvwxyz and 0123456789 and pi = 3.14159265358979323 and \'epsilon\' = 1e-40 some math functions: sin, cos, tangent; log, pow and others: costh, sinth, and tangenth.";
TestWordFreq twf = new TestWordFreq(s);
System.out.println(">>>Total words = " + twf.count);
System.out.println(">>>Word frequency: ");
Enumeration ks = twf.freq.keys();
while (ks.hasMoreElements()) {
String k = (String) ks.nextElement();
int v = ((Integer) twf.freq.get(k)).intValue();
System.out.println(k+":\t"+v);
}
System.out.println(">>>Ordered Word: ");
for(int i=0; i<twf.orderedWords.length; ++i) {
System.out.println(twf.orderedWords[i]);
}
}
}
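
On the original question of how to hand both the word and its frequency back to a tester program: one option is to return a Map instead of a two-dimensional array. The following is only a rough sketch, not the only way to do it. It assumes generics and autoboxing (Java 5 or later), and the class name WordFrequencySketch and the method countWords are made up for illustration. It reuses the delimiter set from the StringInfo class, and a TreeMap keeps the words in sorted (case-sensitive) order.

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical class name; not part of the programs posted in this thread.
class WordFrequencySketch {

    // Same delimiter set as the StringInfo class in the question.
    private static final String DELIMITERS = " \t\n\r\f.?,;:!";

    // Returns each distinct word mapped to its count.
    // TreeMap keeps the keys in natural (case-sensitive) String order.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> freq = new TreeMap<String, Integer>();
        StringTokenizer st = new StringTokenizer(text, DELIMITERS);
        while (st.hasMoreTokens()) {
            String word = st.nextToken();
            Integer old = freq.get(word);
            // First sighting starts at 1, otherwise add one to the old count.
            freq.put(word, old == null ? 1 : old + 1);
        }
        return freq;
    }

    public static void main(String[] args) {
        Map<String, Integer> freq =
            countWords("Hello, my name is Victor. \n What is your name?");
        // The tester gets word and count together from each map entry.
        for (Map.Entry<String, Integer> e : freq.entrySet()) {
            System.out.println(e.getKey() + ":\t" + e.getValue());
        }
    }
}
```

Because the sort is case-sensitive, capitalized words such as "Hello" come before lowercase ones such as "is".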

If you want a different sort order, write your own class that implements the Comparator interface and pass an instance to Arrays.sort(Object[], Comparator) instead of calling Arrays.sort(Object[]).
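
As an illustration of the Comparator route, here is a small sketch that orders words by descending frequency and breaks ties alphabetically. Treat it as one possible approach under some assumptions: it uses generics (Java 5 or later), the class name FrequencyOrderSketch and the method sortByFrequency are hypothetical, and the counts in main are hand-entered sample values rather than output from the programs above.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

// Hypothetical class name, used only for this illustration.
class FrequencyOrderSketch {

    // Returns the map's words ordered by descending count,
    // with ties broken by natural (case-sensitive) String order.
    static String[] sortByFrequency(final Map<String, Integer> freq) {
        String[] words = freq.keySet().toArray(new String[0]);
        Arrays.sort(words, new Comparator<String>() {
            public int compare(String a, String b) {
                // Compare b's count against a's so that higher counts come first.
                int byCount = freq.get(b).compareTo(freq.get(a));
                return byCount != 0 ? byCount : a.compareTo(b);
            }
        });
        return words;
    }

    public static void main(String[] args) {
        // Sample counts entered by hand for the demonstration.
        Map<String, Integer> freq = new HashMap<String, Integer>();
        freq.put("name", 2);
        freq.put("is", 2);
        freq.put("Hello", 1);
        freq.put("my", 1);

        String[] ordered = sortByFrequency(freq);
        for (int i = 0; i < ordered.length; i++) {
            System.out.println(ordered[i] + ":\t" + freq.get(ordered[i]));
        }
    }
}
```

With these sample counts the words come out as is, name, Hello, my. If all you need is a case-insensitive alphabetical sort, the built-in String.CASE_INSENSITIVE_ORDER comparator can be passed to Arrays.sort in the same way.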


Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use