Announcement: Text::Statistics::Latin 0.04

Posted by Rodrigo Panchiniak Fernandes on July 9, 2007, 3:54 pm


Text::Statistics::Latin 0.04 has been released.

Description:


Text::Statistics::Latin writes a seven-column CSV file with one line per
token per text. The input is a UTF-8 Latin-coded corpus whose file names
follow the sequence '1 (1).txt', '1 (2).txt', ..., '1 (n).txt', that is,
the pattern
1 \(([1-9]|[1-9][0-9]+)\)\.txt
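As a quick check of which file names the pattern above accepts, here is a minimal sketch. The helper name doc_id and the sample names are made up for illustration and are not part of the module; the regex is the one quoted above, anchored at both ends.

```perl
use strict;
use warnings;

# The file-name pattern quoted in the announcement, anchored at both ends.
my $name_re = qr/^1 \(([1-9]|[1-9][0-9]+)\)\.txt$/;

# Hypothetical helper: return the document number n for a matching
# name, or undef if the name does not follow the pattern.
sub doc_id {
    my ($name) = @_;
    return $name =~ $name_re ? $1 : undef;
}

for my $name ('1 (1).txt', '1 (12).txt', '1 (0).txt', '2 (3).txt') {
    my $id = doc_id($name);
    printf "%-12s %s\n", $name, defined $id ? "document $id" : 'no match';
}
```

Names such as '1 (0).txt' are rejected because the number may not start with a zero.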
The columns store statistical information:
(1) number of word forms in document d;
(2) number of tokens in d;
(3) ID number of d, i.e., n;
(4) frequency of term t in d;
(5) corpus frequency of t;
(6) document frequency of t (number of documents where t occurs at
least once);
(7) t, the UTF-8 Latin-coded token string.
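To make the seven columns concrete, here is a rough sketch that computes them for a tiny in-memory corpus. It is not the module's own code: the function name term_rows, the sample texts, and the tokenization (lowercased, whitespace-separated strings, with "word forms" taken to mean distinct tokens) are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical mini-corpus: document ID => text. Real input would be
# read from the '1 (n).txt' files.
my %corpus = (
    1 => 'arma virumque cano',
    2 => 'cano arma arma',
);

# Build one row per (document, term) in the seven-column order above.
sub term_rows {
    my (%docs) = @_;
    my (%tf, %cf, %df, %tokens);
    for my $d (keys %docs) {
        # Assumption: tokens are whitespace-separated, lowercased strings.
        for my $t (split ' ', lc $docs{$d}) {
            $df{$t}++ unless $tf{$d}{$t}++;   # count d once per term
            $cf{$t}++;                        # corpus frequency of t
            $tokens{$d}++;                    # number of tokens in d
        }
    }
    my @rows;
    for my $d (sort { $a <=> $b } keys %tf) {
        my $forms = keys %{ $tf{$d} };        # distinct word forms in d
        for my $t (sort keys %{ $tf{$d} }) {
            push @rows,
                [ $forms, $tokens{$d}, $d, $tf{$d}{$t}, $cf{$t}, $df{$t}, $t ];
        }
    }
    return @rows;
}

print join(',', @$_), "\n" for term_rows(%corpus);
```

Each printed line corresponds to one term in one document, so a term that occurs in both sample texts appears twice, with the same corpus and document frequencies on both lines.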

The main output file is named '1 (n + 5).txt' and is stored in the same
directory as the corpus itself, together with residual files for each
input file, which carry the ad hoc extensions .txu and .txv.

Example:
use Text::Statistics::Latin;
&LATIN("4"); # 3 (4-1) texts will be analysed.

Note:

(1) 1 \(([1-9]|[1-9][0-9]+)\)\.txt is the pattern Windows Explorer
uses when renaming sets of files.
(2) This module can be used for testing information retrieval
weighting functions or text indexing.
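As an illustration of note (2), one common weighting function, tf-idf, could be computed from columns (4) and (6) of the output. This is only a sketch: the function name tf_idf, the sample rows, and the corpus size are assumptions, and the module itself is not stated to compute any weights.

```perl
use strict;
use warnings;

# tf-idf weight: term frequency times log(N / document frequency).
sub tf_idf {
    my ($tf, $df, $n_docs) = @_;
    return $tf * log($n_docs / $df);
}

# Hypothetical rows in the seven-column order described above.
my @rows = (
    [ 3, 3, 1, 1, 3, 2, 'arma' ],
    [ 3, 3, 1, 1, 1, 1, 'virumque' ],
);
my $n_docs = 2;   # assumed number of documents in the corpus

for my $row (@rows) {
    my ($tf, $df, $term) = @{$row}[3, 5, 6];
    printf "%s\t%.4f\n", $term, tf_idf($tf, $df, $n_docs);
}
```

A term that occurs in every document gets weight 0 under this formula, which is why the document frequency in column (6) matters for such experiments.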

Research supported by CAPES BEX-09323-5


