Bush Library Mechanic aka JCU Library Technologies

Thursday, August 13, 2015

Urquhart's Law: really?

Sorry, this is just fluff, but this morning's Campus Morning Mail had another bit of fluff that I felt drawn to comment on:

Urquhart’s (really old) other law

CMM is a great believer in Urquhart’s Law of Political Communication, (as per the original House of Cards) that the correct response to scandalous statements is “you may say that but I could not possibly comment.” But it turns out another Urquhart had a different law. A reader found it and wanted to know what it means; “the inter-library loan demand for a periodical is as a rule a measure of its total use.” Readers under 50 ask an old librarian why anybody would need to get a library to loan a printed journal

And mystupidbrain wondered about the common sense assumption of using inter library loan to measure a journal's use. Even in a print world that makes no sense. ILL is for items you don't have in your library. Arguably what you do have in your library is the most used stuff, or the stuff your clients most want (because it's easier than sending ILL requests).

Assume other libraries do the same, and that there is a core collection of journals everyone has (say, for example, New Scientist) - then you will never get a request for it from another library. So to say the journal is low-use because no-one ILLs it is patently flawed logic.

What ILL requests might tell you is how unique a particular subscription is - you may be the only library subscribing in the country. I think your outgoing ILL requests tell you a lot more about your collection, and its relationship to your clients' needs, than incoming requests ever could.

Of course all that runs through my head without knowing anything about the context of Urquhart's Law - I'd never heard of it. So I looked it up.

Donald J. Urquhart (1909-1994) (BSc, PhD) was a revered figure in British library circles. He founded the National Lending Library for Science and Technology in the early 1960s, against the opposition of the then current thinking of much of the library profession. It was a response to the post-war explosion in scientific information. Eventually the NLLST was absorbed by the British Library in 1973 to become the revered British Library Document Supply Service.

Urquhart's Law was an application of scientific methods (specifically probability) to collection management and document delivery for an entire (huge) sector of research - not a particular library's clientele. He published it in 1977, and it was still being referenced in library science lit in the last decade.

I feel like a first year library student again. I know nothing, Jon Snow.

Urquhart's autobiographical history of his creation - available via Amazon (who are the source of the image)

Monday, March 2, 2015

Readings & Past Exams/Reserve Online/Masterfile access issues

Not of interest to anyone outside of JCU, just using my blog to list the workarounds for a local issue:

Have logged this in ServiceNow:

Connection to Masterfile is causing warning errors in all 5 major browsers

Apparent cause is masterfile.jcu.edu.au presenting a certificate issued for mfileprod.infra.jcu.edu.au

Affects staff embedding Digital Library items in LearnJCU and the Masterfile web UI

Student access to Readings & Past Exams appears to be unaffected
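For anyone wanting to confirm this kind of fault without clicking through browser warnings, here is a minimal Python sketch of the check the browsers are doing: does any DNS name on the certificate cover the hostname you asked for? The function names are my own, the matching is a simplified version of the real rules, and `fetch_dns_names` assumes the certificate chains to a CA your machine already trusts (an internal certificate may not).

```python
import socket
import ssl


def name_matches(pattern: str, hostname: str) -> bool:
    """Compare one certificate DNS name with a hostname (simplified matching)."""
    pattern, hostname = pattern.lower(), hostname.lower()
    if pattern.startswith("*."):
        # A wildcard covers exactly one left-most label.
        return "." in hostname and hostname.split(".", 1)[1] == pattern[2:]
    return pattern == hostname


def cert_covers(dns_names: list[str], hostname: str) -> bool:
    """True if any DNS name on the certificate covers the requested hostname."""
    return any(name_matches(name, hostname) for name in dns_names)


def fetch_dns_names(hostname: str, port: int = 443) -> list[str]:
    """Fetch the subjectAltName DNS entries from a live server (needs network).

    Hostname checking is switched off so a mismatched certificate can still be
    inspected; chain validation stays on, so an untrusted chain still raises.
    """
    context = ssl.create_default_context()
    context.check_hostname = False
    with socket.create_connection((hostname, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            cert = tls.getpeercert()
    return [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]
```

Calling `cert_covers(fetch_dns_names("masterfile.jcu.edu.au"), "masterfile.jcu.edu.au")` from a machine on the network would be the live version of the check; a result of False corresponds to the warnings listed below.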

*Firefox*

Allows login but blocks display of PDFs

Workaround: Grey shield can be used to enable content view

*Chrome*

Your connection is not private

Attackers might be trying to steal your information from masterfile.jcu.edu.au (for example, passwords, messages or credit cards).

This server could not prove that it is masterfile.jcu.edu.au; its security certificate is from mfileprod.infra.jcu.edu.au. This may be caused by a misconfiguration or an attacker intercepting your connection.

Proceed to masterfile.jcu.edu.au (unsafe)

Workaround: click on “Proceed to masterfile.jcu.edu.au (unsafe)”

*IE*

There is a problem with this website's security certificate.

The security certificate presented by this website was issued for a different website's address.

Security certificate problems may indicate an attempt to fool you or intercept any data you send to the server.

We recommend that you close this webpage and do not continue to this website.

Click here to close this webpage.

Continue to this website (not recommended).

More information

Workaround: click on “Continue to this website (not recommended).”

*Safari*

Cannot Verify Server Identity

Workaround: click on “Continue”

Cheers, Alan

Friday, February 6, 2015

Link Resolution Quirk in ASTM Journals (Resolved)

ASTM have fixed this issue (see postscript)
No great drama here, just thought this might interest people who get frustrated with Link Resolvers but don't know who to blame.

This issue was reported by a staff member who, using Summon, got a 360 Link to:

Zhang, F., Guo, S., & Wang, B. (2014). Experimental research on cohesive sediment deposition and consolidation based on settlement column system. Geotechnical Testing Journal, 37(3), 20130054. doi:10.1520/GTJ20130054

The link demanded payment (add to cart and pay $25) for an article we have subscription access to:

Our sub was active in 360 and you can browse to the article from the eJournal portal and get access. What the heck is going on?

What's Going On?

360 Link resolves to:
http://www.astm.org.elibrary.jcu.edu.au/DIGITAL_LIBRARY/JOURNALS/GEOTECH/PAGES/GTJ20130054.htm
so it's clearly going through EZproxy.
Browsing from the ejournal portal you get this URL:
http://www.astm.org.elibrary.jcu.edu.au/SUBSCRIPTION/DIGITAL_LIBRARY/JOURNALS/GEOTECH/PAGES/GTJ20130054.htm
i.e. the exact same path with SUBSCRIPTION inserted after the domain name.

So why the different URLs for the same article? I'm assuming the ASTM platform has been set up so that objects in the SUBSCRIPTION directory get an IP range check to see if you have access. But for public access to abstracts and sales they have cloned everything but the PDF into a DIGITAL_LIBRARY folder in root, and have set up the DOI to point at that rather than at the IP restricted pages.
From previous experience, a DOI resolved via CrossRef in 360 Link overrides any OpenURL-to-publisher conversion scripts that 360 Link might have.

So ASTM's solution is pragmatic and unsophisticated - and hamstrings our users coming from any source other than the journal TOC.

I've submitted a request to Proquest for any workaround they can implement. I thought maybe parsing the DOI and inserting the SUBSCRIPTION folder in the path - but that assumes that 360 Link plays any role in handing the DOI link back to the client. I guess another option is for Proquest to add ASTM to their list of IEDL publishers.
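The path rewrite I had in mind would look something like the Python sketch below. It is only an illustration of the idea, not anything Proquest or ASTM actually run: the function name is made up, and it assumes the only difference between the public and subscriber URLs is the leading SUBSCRIPTION folder, as in the two URLs above.

```python
from urllib.parse import urlsplit, urlunsplit


def add_subscription_folder(url: str, folder: str = "SUBSCRIPTION") -> str:
    """Insert the subscriber folder as the first path segment of a URL.

    Hypothetical helper: assumes public and subscriber pages differ only by
    this leading folder.
    """
    parts = urlsplit(url)
    segments = parts.path.split("/")  # segments[0] is "" for an absolute path
    if len(segments) > 1 and segments[1] == folder:
        return url  # already pointing at the subscriber copy
    new_path = "/" + folder + parts.path
    return urlunsplit(parts._replace(path=new_path))


public_url = (
    "http://www.astm.org.elibrary.jcu.edu.au"
    "/DIGITAL_LIBRARY/JOURNALS/GEOTECH/PAGES/GTJ20130054.htm"
)
subscriber_url = add_subscription_folder(public_url)
```

The hostname (including the EZproxy suffix) is left untouched, so the rewritten link would still go through the proxy.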

If only there were just one publisher platform for ejournals.

Postscript

I noticed in my endless rechecking of red flagged jobs for follow up that ASTM have fixed this issue. They now check the requester's IP address and, if it matches a subscriber, ask the user whether they wish to access the full text under that subscription:

The subscriber name in our case looks like it's come from a very old registration with OCLC/NLA but that's a job for another day.

It works the same via the link resolver as it does just doing a straight dx.doi.org resolution (so the 360 Link helper frame isn't causing any iframe conflicts).

I don't know if logging a request through Proquest had any impact as neither Proquest nor ASTM have contacted me. But happy to close the job.

Monday, January 5, 2015

Random notes from "Data for ROI and Benchmarking Ebook Collections"

I registered for Library Journal's webcast "Data for ROI and Benchmarking Ebook Collections" but as usual couldn't make it. They did record it (the webcast can now be viewed on demand), so I'm just writing down the points that stuck out for me.

Ying Zhang (Acquisitions Librarian from University of Central Florida) did some analysis of her institution's use of ebooks acquired using one of three possible purchase methods. She measured ROI by calculating how many uses ebooks got for each $10 invested in each of the three methods.

Patron Driven Acquisition (PDA)

Titles acquired based on user demand
2.7 uses/$10

Firm

Titles selected and acquired individually
0.5 uses/$10

Package

Large pre-defined static collections acquired as one time purchases
4.4 uses/$10

Although this ROI measure makes it look like 'Package' is the best for ROI, Ying makes the point that each method has its place:

Package has the highest ROI but is stagnant and use will drop over time. The library and its users can't weed out unwanted titles or add wanted ones.
Firm is how you shape and customise your collection, but has the highest admin costs.
PDA provides a low input mechanism by which the collection updates itself to the user's needs.
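The uses-per-$10 measure itself is simple enough to sketch in a few lines of Python. The use counts and spend figures below are invented purely so the arithmetic reproduces the ratios quoted above; they are not Ying's data, and the function name is mine.

```python
def uses_per_ten_dollars(total_uses: int, total_spend: float) -> float:
    """Return how many uses were recorded for every $10 spent."""
    if total_spend <= 0:
        raise ValueError("total_spend must be positive")
    return round(total_uses * 10 / total_spend, 1)


# Hypothetical (uses, dollars spent) pairs, chosen only to match the
# ratios reported in the webcast.
sample = {
    "PDA": (2700, 10000.0),
    "Firm": (500, 10000.0),
    "Package": (4400, 10000.0),
}

roi = {
    method: uses_per_ten_dollars(uses, spend)
    for method, (uses, spend) in sample.items()
}
best_method = max(roi, key=roi.get)
```

Ranking by this single number picks 'Package', which is exactly why the caveats about each method matter.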

Michael Levine-Clark (Associate Dean for Scholarly Communication and Collections Services, University of Denver) examined some global usage data on 340k+ books from one provider (EBL) that covered a bunch of libraries.

He found that titles were roughly evenly spread between Social Sciences, Arts & Humanities, and STEM, with Social Sciences having 15% more titles than STEM, which in turn had about 1.5% more titles than Arts & Humanities.

Reverse engineering from his bar chart I get a 'use ratio' for Social Science titles that is 32% higher than STEM titles (in spite of the higher number of titles), which in turn has a 'use ratio' 15% higher than Arts & Humanities.

This is sort of counterintuitive given A&H's perceived reliance on monographs, but as Michael continues to look at the stats interesting things pop up:

STEM titles average far more pages per session (scanning?)
Arts & Humanities spend the most time per session and per book (immersive reading?)
His graph of the number of titles available compared to usage, by LC classification, shows that L (Education) titles get the most usage, followed by N, J, T, H, M, R, D, E and Z, gradually tailing off with the last 5 classifications being Q, P, K, U and F

He concludes that just using cost per access does not tell the true story of how your resources are being used.

Tuesday, April 1, 2014

Knowledge Unlatched - Open Access for all, funded by libraries

We've become a charter member of the 'Knowledge Unlatched' scheme/project/pilot along with over half of Australian university libraries (we finally beat New Zealand at something. Just.).

Basically it's an attempt to pay publishers for book production to make items open access (via Creative Commons licensing). The FAQ explains the what, why, how, who, when. The site says the items will be available through OAPEN.org and the Hathitrust - though my rudimentary tests show that so far they are just available through OAPEN. Interestingly the titles can also be purchased (from publishers, Amazon etc). Also interesting is that they are in Google Books, but not as full text.

Membership of the pilot has closed, but they are taking expressions of interest for the next stage. The more libraries that join, the less we all pay per OA book. That sounds appealing to me.

The pilot worked out at about $46 per book - and we've purchased global open access for all people, not just our constituent populations.

I also like that publishers can still sell the books if someone wants to own a copy. The partner publishers are for the most part University Presses but with success maybe it will be seen as a viable model for others.

KU in 60 Seconds

I sent a feeler out to Proquest because I suddenly had nightmare visions of IP blocking of Hathitrust content for non-North American countries. I hadn't realised that of course Proquest have long since sorted the Hathitrust into multiple collections to bypass this issue, and we track the global open access part in 360 Core for inclusion in Summon and 360 Link.

Anyway one of my key themes for this year's work is increasing exposure of Open Education Resources through Summon so getting KU in our instance of Summon sounds like a nice kick start.

Monday, March 31, 2014

Windows 8, EZproxy and Secret Sauce AdWare

As of writing we've had 3 reported cases of Windows 8 not displaying content through EZproxy. Basically any proxied content doesn't display - no errors, no friendly error messages, just a blank screen. Everything else 'webby' works fine.

With 360 Link 1-Click with the helper frame you see the helper frame (it's not proxied by EZproxy) but the iframe content is blank - viewing source shows that no html has been transported to the browser.

Originally I suspected some sort of mixed content problem (we've had a lot of that this year in Chrome and Firefox) but there were no grey shields - or possibly a caching issue (we've had problems with expired EZproxy sessions being cached so students are denied access to content but are not prompted to start a new session). Scarily it affects all three of the main browsers (IE, Firefox and Chrome). Even connecting to the EZproxy login screen gives a blank screen.

But thanks to Anthony Warrell, the ranking smart guy at our IT Helpdesk, we have a diagnosis and resolution. As he says:

The culprit turned out to be a piece of Ad-ware called Secret Sauce which had somehow managed to sneak past Norton 360 and find its way onto her computer. The tool I used to remove it is called ComboFix. I would not recommend providing the client with self-help instructions on how to clean their machine but have them obtain IT assistance with running ComboFix and removing the Ad-ware.

Instructions on using ComboFix can be found here: http://www.bleepingcomputer.com/combofix/how-to-use-combofix

Malwaretips.com says: "Some of the programs that are known to bundle Secret Sauce include 1ClickDownload, Superfish, Yontoo and FBPhotoZoom". All three students mentioned seeing unsolicited advertisements in their browser. Malwaretips says that product images in Facebook will display a 'See Similar' button that links to ads which earn the adware's makers money for each click.

We're fortunate that ITR are willing to perform the clean for on campus students but I'm not sure how we'll help an off campus student - particularly when part of the process is restarting Windows 8 in safe mode, so RDC or similar remote operation isn't an option.

I hunted high and low for any mention of the way Secret Sauce interferes with EZproxy and came up with nothing. Anyone else at other institutions seeing this? No mention of it in OCLC's user support areas. Are we the canary in the coalmine or is there something unique about our network environment exacerbating this?

I suspect that Secret Sauce doesn't like EZproxy rewriting URLs but I really don't know. I have no idea how widespread Secret Sauce is but have written to OCLC asking if by any chance they've heard about it.

Obviously prevention would be the ideal solution, but how do you get people to be more suspicious of free software, especially when half the time we're advocating for open source and open access? Even something as seemingly obvious as always selecting the 'customise' option when installing software? Even Java wants to install the Ask Toolbar.

Anyway I'm posting this for the future you who is googling 'Windows 8 EZproxy blank screens'. I hope it helps.

Sunday, February 9, 2014

VALA Day 1 Wrap Up

The first plenary was delivered by Christine Borgman (UCLA) talking us through the issues around research data management.

Persistent URL: http://www.vala.org.au/vala2014-proceedings/vala2014-plenary-1-borgman

She laid out some cautionary thoughts - librarians thinking they can handle 'data' the way we handle other information bundles is naive in the extreme. She was complimentary of Australia's 'broad' approach to research data management, but a black box approach to data obscures the complexity of what we are dealing with.


A preview of Borgman's upcoming book 'Big Data, Little Data, No Data'

No-one can actually define data, and even when examining specific cases multitudes of complexity occur:
Different fields have different technologies, behaviours, and structures for dealing with, presenting and interpreting data. Different geolocations within a field will have significant variances.

Some of the issues to be dealt with revolve around the personal. Researchers often feel the data they have is their 'dowry' and makes them valuable to their institutions, and that sharing diminishes its value. Others don't mind sharing but they want personal contact with the person they are sharing with for a number of reasons, and feel uncomfortable with unmediated access. On the receiving end researchers don't necessarily like using someone else's data because of issues of trust.

When you start examining the data, or as I coined it 'metareality', even more issues of complexity arise. One neat example was the fact that there are several taxonomies for describing drosophila genomes - how open is the data if it's coded in a way the user doesn't use?

And what about access? At what level of granularity do you provide access? A single link to a zip file of everything? URIs for individual files, or as Borgman mentioned 'A DOI for every cell in a spreadsheet'? What about format? What role do we have in ensuring the format is accessible over time?

What are the effects of knowledge infrastructures (institutional, national, discipline) on authority, influence and power? Recommended reading: http://knowledgeinfrastructures.org/

A strong case was made for skill sets not normally associated with librarians - economics, records management (determining at time of storage how long you're going to store it) and archival practice (provenance is a huge factor in building trust). It's long been said that some areas of JCU assume the Library will take on maintenance of research data management - we really need to be aware of just what is involved and how much we will need to be embedded in research areas as partners and the resources and skills that will demand. In an aside Christine mentioned a 200 page book telling you how simple data citation is!

Hacking the library catalogue: a voyage of discovery

Persistent URL: http://www.vala.org.au/vala2014-proceedings/vala2014-session-2-kreunen

A kind of interesting talk from UniMelb's University Digitisation Centre (UDC) about the work its 5.4 staff do, particularly in trying to manage metadata for scans from the print collections, and how they managed to scrape data from the catalogue as automatically as they could. I could relate to the speedbumps they kept hitting with things that should work but don't. For example, they figured out how to get their catalogue to present a record in XML, but the scanning software wouldn't accept the XML as valid even though the vendor insisted it was, so it was workaround on workaround.

Marking up NSW: Wikipedia, newspapers and the State Library

Persistent URL: http://www.vala.org.au/vala2014-proceedings/vala2014-session-2-cootes

A cool project involving the NSW State Library and some interested public libraries that identified 20 volunteers to enrich Wikipedia entries for local newspapers that had been digitised and added to Trove. There was lots of liaison with Wikipedia because that community is very sensitive to lots of changes in a narrow section. So the word was sent out that the volunteers would be doing work and that other contributors should be 'nice' while they found their feet. One interesting observation was that it was 'unlibrarianlike' to have to accept that once you've added content in Wikipedia it is no longer 'yours' and you have no veto over how it's changed, any more than any of the thousands of people who actively contribute to Wikipedia.

I think we should consider this as another step in processing records for NQHeritage. I think as a profession we should be augmenting and improving Wikipedia with reputable resources. As was pointed out: Wikipedia is the 5th most popular site on the web - users are going to use it so let's make it better - and expose our resources through this incredibly popular portal.

Journey into the user experience: creating a library website that's not for librarians

Persistent URL: http://www.vala.org.au/vala2014-proceedings/vala2014-session-4-murdoch

I did a standing ovation at the end of this one. Loved that they went all Steve Krug on usability testing. Loved the heat map (I will have to find a way of creating one for our site through the Google Analytics data), which effectively showed that 3 links were pretty much covering 90+% of usage.

Jealous that their university web/marketing section trusted them enough to build templates outside the corporate templates (in fact the library laid the path for the rest of the university).

Check out what they ended up with!
http://www.library.aut.ac.nz/

Influences of technology on collaboration between academics and librarians

Persistent URL: http://www.vala.org.au/vala2014-proceedings/vala2014-session-5-pham

Report about an in-depth case study of academic/librarian relations revolving around information literacy training and the learning management system at Monash. Pleased to say we have moved beyond some of the problems identified (like lecturers not seeing why a librarian doing IL for their students would need access to the subject in Blackboard) but I think our advances are patchy.

Just accept it! Increasing researcher input into the business of research outputs

Persistent URL: http://www.vala.org.au/vala2014-proceedings/vala2014-session-5-ogle

A report on the University of Newcastle's improvements to automating the HERDC reporting, and again it was nice to see we were more advanced in this area - their next big project was to integrate their research database with their institutional repository, something our Library, Research Services and ITR sorted a couple of years ago. I wonder what they would have thought of David Beitey's work on the Research Portfolios.

The day was rounded off by a roller coaster presentation 'Social media as an agent of socio-economic change: analytics and applications' by Johan Bollen from Indiana U

Persistent URL: http://www.vala.org.au/vala2014-proceedings/vala2014-plenary-2-bollen

Johan is probably best known for work showing a correlation between Twitter 'mood' and stock market movements. This came out of a project that assigned 'emotion' to tweets through 'big data' analysis applying Affective Norms for English Words (ANEW) which:

"...provides a set of normative emotional ratings for a large number of words in the English language. This set of verbal materials have been rated in terms of pleasure, arousal, and dominance in order to create a standard for use in studies of emotion and attention."

Turns out that 3 days after a spike in negative feeling in the twittersphere the Dow Jones will drop, and a positive spike will correlate with an increase after the same lag.

Bollen was an engaging champion for the wisdom of crowds, pulling anecdotes and findings from all over the place to make his convincing case.

1 in 3 people in the world has internet access - North America tops the list with nearly 80% penetration and Africa is at the bottom with below 20%. There has been massive recent growth in Asia and the Middle East.

If the Twitter 'nation' were a nation, only China and India would have more people, and analysing linkages and traffic can tell you much about the macro movements resulting from the micro actions of individuals.

Flickr (orange) and Twitter (blue) map created by Eric Fischer

While some think that the internet, by giving everyone a voice, will turn society into an 'idiocracy' Johan does not share this pessimism and gave a couple of examples where crowds do actually make better decisions than experts, and in fact showed us a mathematical formula (Condorcet's jury theorem) that shows a jury made up of people who are right even slightly more than 50% of the time gets more and more accurate the higher the number of jurors. Then pondered whether the current American political situation was because too many people are right just under 50% of the time which by the same formula magnifies the wrongness of the crowd.
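I didn't copy the formula down from the slide, but the standard statement of Condorcet's jury theorem is easy to sketch in Python: with n independent jurors who are each right with probability p, add up the binomial probabilities of every outcome where more than half are right. The function name is mine and the independence of jurors is the theorem's own (strong) assumption.

```python
from math import comb


def majority_correct(p: float, n: int) -> float:
    """Probability that a simple majority of n independent jurors is right,
    when each juror is right with probability p."""
    if n < 1 or n % 2 == 0:
        raise ValueError("use an odd number of jurors so there are no ties")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    needed = n // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(needed, n + 1)
    )
```

Comparing `majority_correct(0.55, 11)` with `majority_correct(0.55, 101)` shows the accuracy climbing with jury size, while swapping in 0.45 shows the same formula dragging the crowd further towards the wrong answer.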

Johan had 80 slides both whimsical and insightful, hope the video is up soon.

The finale was his mojito-fueled solution for research funding that eliminated the tortuous and wasteful process of writing and reviewing proposals. Basically allocate an equal share of the total amount available to all the researchers, but require the researchers to give half to a researcher whose work they think is valuable. A bunch of checks would need to be in place, but you would be crowdsourcing research funding allocation to the researchers themselves. The modelling shows that any 'waste' would be less than the amount of resources spent in maintaining the current proposal/review system. Personally I worried that a standout researcher would receive way more money than required, but I wasn't Mojito assisted. Johan's explanation was much more cogent than mine.
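To make the mechanics concrete for myself, here is a toy, single-round Python sketch of the scheme as I understood it. It is not Bollen's model: the researcher names, the one-recipient-each rule and the single round are all my own simplifications, and none of the 'checks' he mentioned are included.

```python
def redistribute(total: float, chosen: dict[str, str]) -> dict[str, float]:
    """One round of peer allocation.

    `chosen` maps each researcher to the one colleague they pass half of
    their equal share to (hypothetical names, toy rules).
    """
    share = total / len(chosen)
    funds = {name: share / 2 for name in chosen}  # everyone keeps half
    for giver, recipient in chosen.items():
        if recipient == giver or recipient not in chosen:
            raise ValueError(f"{giver} must choose a different, known researcher")
        funds[recipient] += share / 2  # the other half goes to the chosen colleague
    return funds


# Three made-up researchers sharing $300; two of them rate B's work highly.
result = redistribute(300.0, {"A": "B", "B": "C", "C": "B"})
```

Even in this tiny example the popular researcher ends up with three times what the least-chosen one gets, which is the concentration I was worrying about.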

Some links and thoughts to ponder:

Digital Humanities experiment with big data http://www.themacroscope.org/
Mood analysis shows that negative tweeps tend to cluster in networks with each other
In networking terms a retweet is as valuable as the original tweet
Bollen mistrusts Altmetrics because the major drivers are media outlets, not individuals

For another view of all the bits I missed or mucked up try Deborah Fitchett's or Hugh Rundle's blogs.