Java Answers Forum - Code termFreq, documents in Lucene



BM

Posts: 1
Nickname: plumbird
Registered: Jan, 2005

Code termFreq, documents in Lucene

Posted: Jan 18, 2005 6:40 AM

I'm using Lucene 1.4.2 to build my system, but I'm not building a search engine of the usual Lucene kind, where results are displayed in response to a user's query.
What I'm building is a text analyzer that retrieves chat discussions from Internet Relay Chat (IRC) and analyzes them to identify the topics discussed in each chatroom.

First, I analyze the text with a pre-processing step, i.e. stopping and stemming, which is done with Lucene 1.4.2. I managed to get the stemmed words from the chatrooms and store them in a database. My problem is the steps after stopping and stemming: I don't know how to write the code to add a document for every chatroom and get the document-term frequency matrix, or how to calculate the term weights and inverse document frequency (idf) and present the document-term weights as a matrix.
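
The matrix and weighting steps described above can be sketched independently of Lucene. The following is a minimal sketch in plain Java, assuming each chatroom's tokens have already been stopped and stemmed and are held in memory. The class name `TfIdfSketch`, its method names, and the sample tokens are hypothetical, and the weighting is one common choice, w(t, d) = tf(t, d) * ln(N / df(t)), not Lucene's own scoring formula. With Lucene itself, the per-document counts would presumably come from stored term vectors (`TermFreqVector`) rather than the in-memory lists used here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene: builds a document-term matrix
// and tf-idf weights from tokens that are already stopped and stemmed.
class TfIdfSketch {

    // Sorted vocabulary across all documents; these are the matrix columns.
    static List<String> vocabulary(Map<String, List<String>> docs) {
        TreeSet<String> terms = new TreeSet<>();
        for (List<String> tokens : docs.values()) {
            terms.addAll(tokens);
        }
        return new ArrayList<>(terms);
    }

    // Raw term counts: one row per document (in map order), one column per term.
    static int[][] termFrequencies(Map<String, List<String>> docs, List<String> vocab) {
        int[][] tf = new int[docs.size()][vocab.size()];
        int row = 0;
        for (List<String> tokens : docs.values()) {
            for (String token : tokens) {
                // indexOf is linear; acceptable for a sketch, use a map for large vocabularies.
                tf[row][vocab.indexOf(token)]++;
            }
            row++;
        }
        return tf;
    }

    // idf(t) = ln(N / df(t)), where df(t) is the number of documents containing t.
    static double[] inverseDocumentFrequencies(int[][] tf) {
        int n = tf.length;
        int cols = (n == 0) ? 0 : tf[0].length;
        double[] idf = new double[cols];
        for (int j = 0; j < cols; j++) {
            int df = 0;
            for (int i = 0; i < n; i++) {
                if (tf[i][j] > 0) df++;
            }
            idf[j] = (df == 0) ? 0.0 : Math.log((double) n / df);
        }
        return idf;
    }

    // Document-term weight matrix: w[i][j] = tf[i][j] * idf[j].
    static double[][] weights(int[][] tf, double[] idf) {
        double[][] w = new double[tf.length][idf.length];
        for (int i = 0; i < tf.length; i++) {
            for (int j = 0; j < idf.length; j++) {
                w[i][j] = tf[i][j] * idf[j];
            }
        }
        return w;
    }

    public static void main(String[] args) {
        // Made-up sample data: one "document" per chatroom.
        Map<String, List<String>> docs = new LinkedHashMap<>();
        docs.put("#java", Arrays.asList("lucen", "index", "search", "index"));
        docs.put("#music", Arrays.asList("guitar", "song", "search"));

        List<String> vocab = vocabulary(docs);
        int[][] tf = termFrequencies(docs, vocab);
        double[][] w = weights(tf, inverseDocumentFrequencies(tf));

        System.out.println("terms: " + vocab);
        int row = 0;
        for (String room : docs.keySet()) {
            System.out.println(room + " tf=" + Arrays.toString(tf[row])
                    + " w=" + Arrays.toString(w[row]));
            row++;
        }
    }
}
```

In this sketch a term that occurs in every chatroom gets an idf of zero, so it drops out of the weight matrix; that is the intended effect of idf when the goal is to find terms that distinguish one chatroom's topic from another's.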


My question is: is this possible with Lucene 1.4.2? If yes, could anyone please give me some sample code to do that? I have read about the org.apache.lucene.index package, which has classes such as TermFreqVector or something like that, but I don't really understand how these classes are implemented or how to call them to suit my system. Which class or interface should I call first, and what are the steps? What does the code to add documents look like?

Thanks to anyone who kindly helps and replies.
