get-mbooks.pl
A few months ago I wrote a program called get-mbooks.pl, which was used to harvest MARC data from the University of Michigan’s OAI repository of public domain Google Books. You can download the...
Code4Lib Journal Perl module (version .003)
I hacked together a Code4Lib Journal Perl module providing read-only access to the Journal’s underlying WordPress (MySQL) database. You can download the distribution, and the following is from the...
Google Onebox module to search LDAP
This posting describes a Google Search Appliance Onebox module for searching an LDAP directory. At my work I help administer a Google Search Appliance. It is used to index the university’s website. The...
Text mining: Books and Perl modules
This posting simply lists some of the books I’ve read and Perl modules I’ve explored in the field of text mining. Through my explorations of term frequency/inverse document frequency...
Lingua::EN::Bigram (version 0.01)
Below is the POD (Plain O’ Documentation) file describing a Perl module I wrote called Lingua::EN::Bigram. The purpose of the module is to: 1) extract all of the two-word phrases from a given text,...
Lingua::EN::Bigram (version 0.02)
I have written and uploaded to CPAN version 0.02 of my Perl module Lingua::EN::Bigram. From the README file: This module is designed to: 1) pull out all of the two-, three-, and four-word phrases in a...
Where in the world are windmills, my man Friday, and love?
This posting describes how a Perl module named Lingua::Concordance allows the developer to illustrate where in the continuum of a text words or phrases appear, and how often. Windmills, my man Friday,...