I sent several long e-mails over the last couple of days. I'm sure the recipients find them annoying, because looking at them now annoys me. The take-home messages are that Microsoft Hyper-V licensing sucks, LC does a poor job of publishing authority records, and I'm dubious about the value of the eXtensible Catalog.
I also published updates to the ebscoX and vygr2vfnd tools to Google Code.
--- Reuben Pasquini 3/22/2010 1:09 PM ---
Hi Hellen!

I downloaded the authorities data available from http://id.loc.gov , and it looks like the download only includes "subject" authorities. That's also what LC indicates on its web site: http://id.loc.gov/authorities/about.html

I extracted a summary of the data here: http://devcat.lib.auburn.edu/tests/summary.txt

We could write a tool to harvest authority records from LC's Voyager-backed authority web site, http://authorities.loc.gov/ , but that's not an elegant solution. I sent the following e-mail to LC. I'll let you know if I get a reply.

Cheers,
Reuben

---------------------

Thank you for posting the LCSH authority data at http://id.loc.gov/authorities/

I notice that the id.loc.gov data set only includes subject data. Does LC offer bulk access to its name, title, and keyword authority databases, or is that data only available one record at a time from http://authorities.loc.gov/ ?

Also, does LC publish updates to its authority database via RSS? We would like to set up a mechanism that automatically synchronizes our catalog's authority database with LC's. The combination of bulk download plus RSS updates would give us everything we need.

Thanks again,
Reuben
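The RSS half of that synchronization would be easy to consume if LC ever offers it. Here's a minimal sketch in Java, assuming a hypothetical feed URL and standard RSS 2.0 items (LC publishes no such feed today, as far as I know); it shows the poll-and-compare loop, not the actual record merge:

    import java.io.InputStream;
    import java.net.URL;
    import java.time.ZonedDateTime;
    import java.time.format.DateTimeFormatter;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.NodeList;

    public class AuthorityFeedPoller {
        // Hypothetical feed location; LC does not actually publish this.
        static final String FEED_URL = "http://id.loc.gov/authorities/updates.rss";

        public static void main(String[] args) throws Exception {
            // Timestamp of our last sync, e.g. "2010-03-01T00:00:00Z";
            // a real tool would load this from local state.
            ZonedDateTime lastSync = ZonedDateTime.parse(args[0]);

            try (InputStream in = new URL(FEED_URL).openStream()) {
                Document feed = DocumentBuilderFactory.newInstance()
                        .newDocumentBuilder().parse(in);
                NodeList items = feed.getElementsByTagName("item");
                for (int i = 0; i < items.getLength(); ++i) {
                    Element item = (Element) items.item(i);
                    String link = text(item, "link");
                    ZonedDateTime updated = ZonedDateTime.parse(
                            text(item, "pubDate"),
                            DateTimeFormatter.RFC_1123_DATE_TIME);
                    // Fetch and merge any record touched since the last sync.
                    if (updated.isAfter(lastSync)) {
                        System.out.println("fetch and merge: " + link);
                    }
                }
            }
        }

        static String text(Element parent, String tag) {
            return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
        }
    }

The bulk download would seed the local authority table; a poller like this would keep it current between full reloads.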
--- Reuben Pasquini 3/23/2010 10:24 AM ---
Hi Rod!

I just got some more information from Jon about Hyper-V, and it looks like it's more trouble than it's worth. Sorry to put you through all the work and e-mail.

I'm now inclined to just install DSpace directly onto your Windows 2008 server. I can drive out there some morning, and we can install everything together. That will probably take a couple of hours to get a basic server up, and we can spend another couple of hours setting up a Tuskegee look and feel. I can come out again a week after that to answer any questions that come up after you play around with the server a bit. What do you think?

Cheers,
Reuben

--- Jon Bell 3/22/2010 4:01 PM ---
... The licensing for Windows Server 2008 R2 Hyper-V, as far as I can tell, requires that you not use the server for any other role (web server, file server, shares, etc.) while you are using the 4 free licenses. Keeping a test VM here for a long time would prevent us from using the server as a Ghosting server, which is what it was originally purchased for. I was under the assumption you'd build it here and hand it off to them, then we could start to use the server for Ghosting.

It looks to me that with all these licensing troubles, the simpler solution would be Linux and VMware Server 2.0, both free.

Jon

--- Reuben Pasquini 3/22/2010 1:32 PM ---
Hi Rod,

Hope you're having a good Monday. I just wanted to give you a quick update on the Tuskegee DSpace project.

At this point we believe we'll need a license to run Windows in a virtual machine. Jon, one of our IT experts, is trying to verify that, but that's our impression. Assuming Microsoft requires a license to run Windows in a VM, we'll need a license from Tuskegee to set up the VM here at Auburn. If we set up the VM with an Auburn license, then when we transfer the VM to your server, Tuskegee will be running an Auburn-licensed server, which probably violates Auburn's contract with Microsoft. This licensing thing looks like a mess.

If you have easy access to two Windows licenses (it doesn't matter which flavor of Windows) that you can send us (one for the production VM, another for the test VM), then we can continue down the road of running Windows in a VM. If not, then we can change our plan and either run DSpace in a Linux VM (Linux is free: no licensing), or install DSpace directly onto your Windows Server 2008 box without a VM. The Linux option is my preference, but we can make either one work.

Anyway, let me know whether you can supply a couple of Windows license keys or install media if we need them, or which of the other options you prefer if Windows in a VM won't work out for us.

Cheers,
Reuben
--- Reuben Pasquini 3/23/2010 11:59 AM ---
Hi Dave,

I'm Reuben Pasquini, one of Auburn's software people. I'll try to lay out my impression of what XC is and what we at Auburn would hope to get out of it if we invest in it. I'd be grateful if you have time to read this over; let us know what you think. I don't speak for anyone else at the Auburn library.

I'm excited about the potential to join a team of developers working on software tools for use at multiple libraries. It seems a shame for small software teams at different libraries to code different solutions to the same problems rather than somehow combine our efforts and attack more and bigger challenges. On the other hand, I'm not convinced that XC in itself offers something useful to us at Auburn. It's not my decision, but I suspect we will only join XC if we see a way to directly use the XC software in our environment. We would also not be happy merely accepting coding assignments from the core XC team; we would require input into the design, development process, and priorities of the project.

I'm involved with several initiatives at Auburn. We have deployed a VuFind discovery service http://catalog.lib.auburn.edu/code/ and a DSpace-based ETD server http://etd.auburn.edu/ . We hope to make progress on an institutional repository this year http://repo.lib.auburn.edu/ . We also want to explore ERM solutions and an open-source ILS, and we're beta testing a commercial discovery service that integrates article-level search across many of the e-journals and databases we subscribe to.

Here's my general impression of what XC is, based on watching the screencast videos ( http://www.screencast.com/users/eXtensibleCatalog ) and a quick browse of some of the code on Google Code:

*. A Solr-based metadata database
*. Java tools for OAI and ILS metadata harvesting into Solr, using XSL for format conversion
*. PHP-in-Drupal web tools for interacting with Solr and the importers

XC is not an ILS, an ERM, an IR, or a federated search engine. XC could support article-level search only if we can get the article content into Solr. Drupal brings along a platform of services, but Auburn has access to a commercial CMS via the university, and we have a lot of in-house experience with MediaWiki and WordPress, so we're not particularly excited about Drupal. Does that sound right?

If so, then integrating XC with VuFind is one possible area of collaboration. XC seems very similar to VuFind: VuFind stores metadata in a Solr server, has a PHP-and-JavaScript web frontend that accesses that data, and includes Java-based tools for harvesting metadata into Solr. We customized our Auburn VuFind instance http://catalog.lib.auburn.edu/ to remove its dependency on MARC, and we have an XSL-based crosswalk in place to harvest data from our CONTENTdm digital library http://diglib.auburn.edu/ , for example: http://catalog.lib.auburn.edu/vufind/Search/Home?lookfor=Caroline+Dean&type=all&submit=Find

There's slightly dated information on some of our code here:
http://catalog.lib.auburn.edu/code/
http://lib.auburn.edu/auburndevcat/

The core http://vufind.org project has several weaknesses, including variable code quality, a lack of regression tests, and the need for implementors to customize the code base. The VuFind code Auburn runs is different from the code Michigan or Illinois runs. We have posted our code online, but most other sites do not.
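As an aside, the crosswalk step itself is mechanically simple. Here's a minimal sketch of the transform using plain JAXP; the stylesheet and record file names are placeholders, not the actual files our harvester uses:

    import java.io.File;
    import java.io.StringWriter;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.transform.stream.StreamSource;

    /**
     * Sketch of an XSL crosswalk step: take a harvested OAI record
     * (e.g. oai_dc from CONTENTdm) and transform it into a Solr
     * <add><doc>...</doc></add> update message.
     */
    public class CrosswalkSketch {
        public static void main(String[] args) throws Exception {
            // Placeholder paths; the real crosswalk stylesheet is site-specific.
            File stylesheet = new File("oai_dc-to-solr.xsl");
            File record = new File("harvested-record.xml");

            Transformer xform = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(stylesheet));

            StringWriter solrDoc = new StringWriter();
            xform.transform(new StreamSource(record), new StreamResult(solrDoc));

            // A real harvester would POST this to Solr's /update handler.
            System.out.println(solrDoc);
        }
    }

The hard part is writing and maintaining the stylesheets, not running them, which is why sharing crosswalks across libraries looks attractive.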
The choice of PHP as an implementation language and the lack of discipline in locking down APIs mean that merging updates from the core project into our code base is often a chore, so it's easier to just fork and grab the occasional patches that interest us.

I could imagine a 12-month XC-VuFind collaboration that looks something like this:

*. Phase 1 - integrate Auburn VuFind with XC
x. Merge the Solr schemas
x. Merge the import tools; leverage the XC import manager, if such a beast exists, for ILS import and OAI harvest into Solr
x. End of phase 1: all the XC participating libraries run XC-VuFind discovery instances
*. Phase 2 - cloud support
x. Extend XC-VuFind to support multiple Solr backends (see the sketch after this e-mail)
x. Host a shared Solr server that holds the common open-access records that all the XC-VuFind libraries search against
x. Integrate easy cross-institution ILL into the discovery layer

Anyway, that's one way I could imagine using XC at Auburn. I imagine that you and the XC team have a different vision for the project. I'm interested to hear what you have in mind via e-mail before we commit to attending your meeting.

Cheers,
Reuben
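To make that phase 2 "multiple Solr backends" item concrete, here's a rough sketch of federating one query across a local index and a shared open-access index. Both backend URLs are made up, and the raw-score merge is naive; real federation would need score normalization or Solr's own distributed search support:

    import java.io.InputStream;
    import java.net.URL;
    import java.net.URLEncoder;
    import java.nio.charset.StandardCharsets;
    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.List;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;

    public class MultiSolrSearch {
        // One local index plus a hypothetical shared open-access index.
        static final String[] BACKENDS = {
            "http://localhost:8983/solr/biblio",
            "http://shared.example.org/solr/openaccess",
        };

        static class Hit {
            String id;
            double score;
            String backend;
        }

        public static void main(String[] args) throws Exception {
            String q = URLEncoder.encode(args.length > 0 ? args[0] : "*:*",
                    StandardCharsets.UTF_8);
            List<Hit> merged = new ArrayList<>();

            for (String backend : BACKENDS) {
                URL url = new URL(backend
                        + "/select?wt=xml&rows=10&fl=id,score&q=" + q);
                try (InputStream in = url.openStream()) {
                    Document resp = DocumentBuilderFactory.newInstance()
                            .newDocumentBuilder().parse(in);
                    NodeList docs = resp.getElementsByTagName("doc");
                    for (int i = 0; i < docs.getLength(); ++i) {
                        Hit hit = new Hit();
                        hit.backend = backend;
                        NodeList fields = docs.item(i).getChildNodes();
                        for (int j = 0; j < fields.getLength(); ++j) {
                            Node f = fields.item(j);
                            if (!(f instanceof Element)) continue;
                            String name = ((Element) f).getAttribute("name");
                            if ("id".equals(name)) hit.id = f.getTextContent();
                            if ("score".equals(name))
                                hit.score = Double.parseDouble(f.getTextContent());
                        }
                        merged.add(hit);
                    }
                }
            }
            // Naive merge: sort by raw Lucene score across both indexes.
            merged.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
            for (Hit h : merged) {
                System.out.println(h.score + "\t" + h.id + "\t" + h.backend);
            }
        }
    }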