<p>I am using Lucene 1.4.2 to build my system. However, I am not building a search engine in the way Lucene is typically used on the web, where results are displayed in response to a user's query. What I am building is a text analyzer that retrieves chat discussions from Internet Relay Chat (IRC) and analyzes them to find the discussion topic of each chatroom.</p>
<p>First, I preprocess the text with stop-word removal and stemming, and this is done using Lucene 1.4.2. I have managed to get the stemmed words from the chatrooms and store them in a database. My problem is the steps after stopping and stemming: I do not know how to write the code that adds a document for each chatroom and produces the <b>document term frequency matrix</b>, then calculates the <b>term weight</b> and <b>inverse document frequency (idf)</b> and presents the <b>document-term weights</b> as a matrix.</p>
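<p>For illustration, the target output can be sketched in plain Java, independent of Lucene. This is only a minimal sketch under stated assumptions: each chatroom is one document given as a list of already-stemmed terms, the class and method names (<code>ChatroomTfIdf</code>, <code>termFrequencyMatrix</code>, and so on) are hypothetical, and the weight is the common tf × ln(N / df) variant, which may not be the exact weighting wanted here.</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene: builds a document-term matrix
// from stemmed terms and weights it with tf * idf.
public class ChatroomTfIdf {

    // Sorted list of every distinct term across all documents (the matrix columns).
    static List<String> vocabulary(List<List<String>> docs) {
        TreeSet<String> terms = new TreeSet<>();
        for (List<String> doc : docs) {
            terms.addAll(doc);
        }
        return new ArrayList<>(terms);
    }

    // tf[d][t] = number of times term t occurs in document d.
    static int[][] termFrequencyMatrix(List<List<String>> docs, List<String> vocab) {
        int[][] tf = new int[docs.size()][vocab.size()];
        for (int d = 0; d < docs.size(); d++) {
            for (String term : docs.get(d)) {
                tf[d][vocab.indexOf(term)]++;
            }
        }
        return tf;
    }

    // idf[t] = ln(N / df_t), where df_t is the number of documents containing term t.
    // Assumes every column has df_t >= 1, which holds when the vocabulary
    // is built from the same documents.
    static double[] inverseDocumentFrequency(int[][] tf) {
        int n = tf.length;
        int termCount = (n == 0) ? 0 : tf[0].length;
        double[] idf = new double[termCount];
        for (int t = 0; t < termCount; t++) {
            int df = 0;
            for (int d = 0; d < n; d++) {
                if (tf[d][t] > 0) {
                    df++;
                }
            }
            idf[t] = Math.log((double) n / df);
        }
        return idf;
    }

    // weight[d][t] = tf[d][t] * idf[t]
    static double[][] weightMatrix(int[][] tf, double[] idf) {
        double[][] w = new double[tf.length][idf.length];
        for (int d = 0; d < tf.length; d++) {
            for (int t = 0; t < idf.length; t++) {
                w[d][t] = tf[d][t] * idf[t];
            }
        }
        return w;
    }

    public static void main(String[] args) {
        // Two made-up chatrooms, already stopped and stemmed.
        List<List<String>> docs = Arrays.asList(
                Arrays.asList("lucen", "index", "index"),
                Arrays.asList("lucen", "chat"));
        List<String> vocab = vocabulary(docs);
        int[][] tf = termFrequencyMatrix(docs, vocab);
        double[][] w = weightMatrix(tf, inverseDocumentFrequency(tf));
        System.out.println("terms: " + vocab);
        for (int d = 0; d < w.length; d++) {
            System.out.println("doc " + d + " tf=" + Arrays.toString(tf[d])
                    + " weight=" + Arrays.toString(w[d]));
        }
    }
}
```

<p>In this sketch a term that appears in every chatroom gets weight 0, since ln(N / N) = 0, so only terms that distinguish one chatroom from another carry weight.</p>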
<p>My question is: can Lucene 1.4.2 do this? If so, could anyone please give me some sample code? I have read about the index package (org.apache.lucene.index), which has classes such as TermFreqVector and something similar, but I do not really understand how these classes are implemented or how to call them to suit my system. Which class or interface should I call first, and what are the steps? What does the code for adding documents look like?</p><br>