Sunday, September 5, 2010

2010/08/23-26,08/30-09/02 - chugging away

I was absent-minded the last couple of weeks, and completely forgot to update my bLog. Looking back through my e-mail, it looks like I spent most of my time on a few things. First, I helped Clint implement a few VuFind patches before he kicked off a new full copy from Voyager to VuFind.

On Thursday 08/26, Aaron, Bonnie, Midge, and I met with Drew and Iryana from the assessment office to further discuss their plans to implement Digital Measures to standardize the faculty annual review and assessment process. Auburn University will require her faculty to submit material for annual review to our Digital Measures server, from which department heads, deans, and university management may generate reports and metrics on teaching and research performance, from individual faculty on up to the university level.

At the meeting we discussed how assessment's Digital Measures project might dovetail with the library's nascent plans to implement an institutional repository through which faculty may publish research artifacts for open access by the general public. It turns out that the university's new guidelines for university-funded research request that faculty make their funded research available for open access, either through Auburn's own "scholarly repository" (which the library is supposedly developing) or through some other online repository.

Anyway, the Digital Measures project and the open access clause in the funding guidelines have combined to make Bonnie and Aaron very eager for the library to quickly establish a repository for faculty research. Some library staff will join Drew and Iryana's group of Digital Measures beta testers, and I'll set up some collections in our DSpace server on repo that will be ready to accept faculty research papers.

Although I spent a lot of words describing the Digital Measures effort above, I actually spent much more time over the last two weeks on the tool to marry Claudine's scans of ACES documents with the appropriate MARC records in Voyager, from which she wants to pull metadata. Each metadata file from the scanner specifies a title that includes series information, like "Bulletin No. 5 of the Alabama Soils Bla bla: Peanut Performance in Alabama", while the matching MARC record stores the "Peanut Performance in Alabama" part as the title and stores the series information in other parts of the record. From the start I should have just used Lucene or tapped into the Solr server that backs our VuFind install at http://catalog.lib.auburn.edu, but I thought I already had the set of 1200 or so MARC records to draw from, so I wound up implementing my own simple title-search algorithm.

Everything looked good for my set of 4 test records, but when I ran a sanity check against the full set of over 1000 scans, it turned out that many of the MARC records I had pulled based on an earlier exchange with Claudine were not needed, and many needed MARC records were missing. I eventually collected a larger set of MARC records to draw from, but the amount of data I'm working with now is large enough that my simple title-searcher takes a long time to run over the full record set. Next week I'll run some more tests over a subset of the data, then run the full data set into a Google Docs spreadsheet and ask Claudine to go through the title matches and manually correct the MARC bib IDs for titles that the code does not match correctly. Once we have that spreadsheet, I'll feed it to the code I have to generate the Dublin Core for import into DSpace. Anyway, the code is cool and fun, but it's freaking frustrating to trip over so many problems for this stupid little project that could have been done by hand by now.
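For the curious, the title-matching idea boils down to something like the Python sketch below. This is not the code I actually wrote, just a minimal illustration under a few assumptions: that the series statement always ends at the first colon of the scanner title, that fuzzy string similarity from the standard library's difflib is good enough, and that the MARC titles have already been pulled into a plain dictionary keyed by bib ID. The function names and the bib IDs in the example are made up.

```python
import re
from difflib import SequenceMatcher


def normalize(title):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", title.lower())
    return " ".join(cleaned.split())


def strip_series(scan_title):
    """Drop a leading series statement, assumed to end at the first colon."""
    head, sep, tail = scan_title.partition(":")
    return tail.strip() if sep and tail.strip() else scan_title


def best_match(scan_title, marc_titles):
    """Return (bib_id, score) for the MARC title closest to the scan title.

    marc_titles maps a bib ID to that record's title string.
    """
    target = normalize(strip_series(scan_title))
    best_id, best_score = None, 0.0
    for bib_id, title in marc_titles.items():
        score = SequenceMatcher(None, target, normalize(title)).ratio()
        if score > best_score:
            best_id, best_score = bib_id, score
    return best_id, best_score


# Hypothetical bib IDs and titles, for illustration only.
marc_titles = {
    "1001": "Peanut performance in Alabama /",
    "1002": "Cotton variety trials.",
}
scan = "Bulletin No. 5 of the Alabama Soils Bla bla: Peanut Performance in Alabama"
match = best_match(scan, marc_titles)
print(match)
```

Comparing every scan against every record this way gets slow as the record set grows, which is the same wall I hit. Low-scoring matches are the ones worth flagging for manual review in the spreadsheet.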

Finally, I made some good progress on the minnows morphology database project we're working on with Professor Jon Armbruster in biology. I read over an introduction to "geometric morphometrics", and I think I came up with a good way of tricking our Minnows DSpace server into storing morphometric data in a Dublin Core field attached to the item record for a minnow sample's reference photograph. We'll see what Jon thinks next week.
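To give a rough idea of the trick, here is a small Python sketch. It assumes the morphometric data is a list of (x, y) landmark coordinates digitized from the reference photograph, that the list is packed into a single text value as JSON, and that the value lives in a description field. Which field and which encoding we end up using is still undecided, so the element, the qualifier, and the function names here are placeholders. The XML follows the dublin_core.xml layout that DSpace's simple archive format uses for batch import.

```python
import json
import xml.etree.ElementTree as ET


def landmarks_to_text(landmarks):
    """Pack (x, y) landmark coordinates into one compact JSON string."""
    return json.dumps([[x, y] for x, y in landmarks], separators=(",", ":"))


def text_to_landmarks(text):
    """Recover the list of (x, y) tuples from the stored string."""
    return [tuple(pair) for pair in json.loads(text)]


def dublin_core_xml(title, landmarks, element="description", qualifier="none"):
    """Build a dublin_core.xml document with the landmarks in one dcvalue.

    The element and qualifier defaults are assumptions, not a settled choice.
    """
    root = ET.Element("dublin_core")
    title_node = ET.SubElement(root, "dcvalue", element="title", qualifier="none")
    title_node.text = title
    data_node = ET.SubElement(root, "dcvalue", element=element, qualifier=qualifier)
    data_node.text = landmarks_to_text(landmarks)
    return ET.tostring(root, encoding="unicode")


# Made-up landmark coordinates for one reference photograph.
landmarks = [(12.5, 40.0), (88.0, 41.5)]
packed = landmarks_to_text(landmarks)
xml_text = dublin_core_xml("Minnow sample reference photograph", landmarks)
print(xml_text)
```

The nice part of keeping everything in one text field is that DSpace does not need to know anything about morphometrics; whatever tool reads the item back just has to unpack the string.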
