Episode 14 - Digital access to National Library of Scotland
Transcript
Kirsty McIntosh 0:00
Hello and welcome to the Scottish Tech Army podcast. I'm Kirsty McIntosh.
Graham Johnston 0:04
And I'm Graham Johnston and welcome to episode number 14 of the Scottish Tech Army podcast and we've got a cracking episode this week where we are going to explore a particular project that the Scottish Tech Army supported the National Library of Scotland with.
Kirsty McIntosh 0:19
Yes, hello and welcome to Tech Army towers, our volunteer Alec Davies, and the Associate Director of Digital at the National Library of Scotland, Stuart Lewis. Hello, everybody.
Alec Davies 0:30
Hello
Stuart Lewis 0:30
Thank you very much.
Kirsty McIntosh 0:32
Alec, if I could come to you first. You are a volunteer with the Scottish Tech Army. Could you tell us a little bit about how you got here, please?
Alec Davies 0:40
Yes, I walked into the local park walking my dog, Molly, and I came across Peter Jaco, and we usually talk about old things to do with IT. I was in IT for 40 years, believe it or not, and he said there was this new organisation being set up, the Scottish Tech Army, and would I like to join? So, yes, I did and I've been doing that. It keeps me out of the pub with Molly, so it's good.
Kirsty McIntosh 1:10
Well, welcome aboard today, thank you very much indeed. Stuart, could you tell us a little bit about the work that you've been doing at the National Library of Scotland in relation in particular to this topic of Opentexts, please?
Stuart Lewis 1:23
Yes. The idea first came about three or four years ago. So we have quite an ambitious strategy at the National Library of Scotland that says we want to have a third of our collections in digital format. Now bear in mind, we are what's known as a legal deposit library, also commonly referred to sometimes as a copyright library - that means we're entitled to a copy of everything that's published in the UK. And a lot of people think, well, that's just books, but it also covers magazines, football programmes, newspapers, pretty much anything that's published in one way or another, and a large amount of that even is sort of digital now, so we get around about a million digital publications every single year coming into the National Library. So we have around about 31 million items now, so that's well over a hundred miles of bookshelves that we have. So that's the sort of scale that we work at. And we said we're going to have, you know, a third of those collections in digital format, so it means we obviously have to do a lot of what we would call digitisation. So that's essentially where we take books, we photograph them page by page, and then we can put them online for free for anybody to read. When we were thinking about that problem a few years ago, we were sort of thinking, well, dozens of libraries, hundreds of libraries are doing this worldwide, and it would be great, really, if we could avoid duplicating any efforts. So if someone's already digitised a book in another library, then we don't want to do that, we'll move on to other books that haven't been digitised. But it turns out no one's ever really sort of compiled a list of everything that's been digitised worldwide, and we reckon that number's probably around about 40 million items. And so in a sense, that was sort of where the idea first came from.
And then we've been doing a project over the past year or so, it's a government research funded project with Glasgow University, a few other libraries, and an organisation called HathiTrust in North America, basically saying, could we do this, so we did a sort of a trial on it. That project finished, we proved we could do it, in theory, and then obviously COVID came along. And with lockdown, and everything that had to close, you know, one of the biggest impacts of that for a lot of people was libraries had to close, you know, everything from school libraries, public libraries, university libraries, national libraries. And whilst a lot of libraries, you know, do have digital services now, there's obviously millions of books that are unavailable because of that. And so in a sense, it gave us that impetus to get going with this project, really. And actually, just to say, right, let's gather a list of as many digitised books worldwide as we can get online.
Kirsty McIntosh 3:59
That's absolutely amazing.
Graham Johnston 4:01
It's incredible. And I really want to... love that stat that you said earlier, because I missed it. You talked about 31 million. And then you said how many miles? Tell us how many miles.
Stuart Lewis 4:12
Oh, I lose track, I have a feeling it's about 120 miles, I think that we've got of shelves.
Graham Johnston 4:18
That is incredible. 120 miles of shelves!
Stuart Lewis 4:20
It is and the team that look after the items, they are extremely skilled because as you can imagine if we ever lost one - if we put one of those books back in the wrong place, we'd never find it again, so yeah, we have a fab team that looks after all of those.
Graham Johnston 4:36
I mean, I know what it's like to lose books in my house, right? And I've probably got like less than 100 and there's several I can't find, so that's, that's unbelievable.
Kirsty McIntosh 4:46
Now I know what I want to be when I grow up! That sounds like the perfect job for me, looking after all those books. That's fantastic. So the project request that you put into the Tech Army asked for help in which particular, you know, to do exactly what?
Stuart Lewis 5:01
Well, so as a library, in a sense, we were able to do a lot of the work with the data and actually compile that data, so sort of work with other libraries that we have relationships with, gather all that data together, so we were able to do that. But what we really needed help with was making that available online. And so actually, yeah, we had three really, really good volunteers, obviously Alec, who you'll be talking to in a bit, sort of managing it and keeping us all on track. But then we had a designer, Sarah, I think she described herself as a designer who can code, so in a sense it's a brilliant combination and so she was the one that was able to make it look really good. She's quite an expert in usability and UX, graphic design, and so forth, so she was able to bring that experience to it. And then we also worked with Brian, who's a programmer. He's actually a games programmer, is his background. So something very, very different. But he just took to it sort of, I suppose, like a duck to water. And he did then essentially make Sarah's design come to life and interact with that data. And make a really usable and very efficient site.
Kirsty McIntosh 6:09
Alec, tell us a little bit about how you got this project kind of kicked off and off the ground, and, you know, your version of events, if you like.
Alec Davies 6:18
Okay. I was one of the original volunteers, I think I was in the first 100, I think. So in those days, there was absolutely no admin software at all, basically, you had the site you logged on to and you had all the chats going on in the different areas, project management, business analysts, et cetera, et cetera. And basically, this project came along, and I thought it was really, really quite exciting, I loved the whole concept of it, I thought it would be fantastic. And I don't think Stuart really came out there about why it came out from the COVID point of view, it was the fact that no one could actually go to libraries. That was really the whole point of it. So it meant that researchers and academics and university students etc. could actually go along to see everything online, rather than being effectively physically banned from getting into the libraries to actually see this stuff. That was the whole point of the project, really, at the start. I basically got involved in it pure and simply because, as I said, I thought it was one of the most exciting ones I could see, you know, and I like the interesting ones. It's as simple as that from my point of view. And that's really it. So when we kicked it off, I would say that the kickoff was probably the hardest part of the whole project from my point of view. We talked quite a lot, we spent weeks really trying to understand the skill sets of the people that we actually needed, if you agree with me on that one, Stuart. You know, we thought we needed programmers, we thought we needed designers, we thought we needed database analysis done, etc. at the beginning. We weren't 100% sure what we were going to do about testing, we wanted publicity etc. And at the end of the day, it really came down to two main people, to be quite frank with you, who did all the work. And it certainly wasn't me.
And the two main people were really Stuart on the data side and Sarah Semark, who we've talked about, who did all the application work, and really did 95% of it. And as Stuart said, Brian came along at the end, really it was the last month or so, to help Sarah sort out a lot of the bugs, did a great job as well. But the vast amount of work was Stuart and Sarah. My job, if you really want to know, was basically organising meetings so we could chat each week, and I organised a whole bunch of testers. I had all my old friends from the IT days, who came in and actually tested the system. And I also organised publicity to go on to The Times newspaper, I'm sure you saw the article in the paper. So that was really my part in the whole thing.
Kirsty McIntosh 9:03
That sounds like the perfect boss, hands off!
Alec Davies 9:06
Well, again, if people are actually wanting to talk about projects, I mean, I was at IBM, Hewlett Packard, FIS, you know, all these big corporate IT shops, you know, and my passion was always to try and keep projects as simple as possible. And you know, I really tried to do that with this. It was really, you know, people don't need to go into all this project management stuff. I mean, I've got project management stuff coming out of my ears, you know, qualifications. But do you really need, with small projects, to have issue logs and risk logs and PMOs and everything else? You don't. Keep it as simple as possible. Use what you need, would be the advice I would give to people who are running a lot of these projects.
Kirsty McIntosh 9:47
Yeah, I think that's very interesting, especially working with volunteers as well, is that, you know, one of the joys about volunteering at the Tech Army is that there's no sort of prescriptive methodology. You know, we've tried to keep it as loose as possible so that people deliver the projects in the best way that they know how, and then they have the opportunity to kind of learn new skills. So that's, that's absolutely fabulous.
Alec Davies 10:13
And Sarah was very... you know because she was key to a lot of this, she was very familiar with GitHub, so that's what we used. I basically said to Stuart and Sarah we'll use whatever tools you're most familiar with, we don't want to start something else off. So that was the way it was done as well.
Graham Johnston 10:32
Incredible stuff. So if I'm on the website now, what do I see? What do I get? How do I explore the content? And sort of what's different now than before you kicked the project off?
Stuart Lewis 10:44
Yeah, so in a sense, I suppose, I mean, we try to make it, it's a bit of a cliche maybe, I suppose, but as sort of Google-like as possible. So it's a very simple search box, put something in there. You know, particularly I suppose, if you haven't got something in mind, you can be very sort of serendipitous and just put some words in and see what comes up. Obviously, if you know very much what you're looking for, a title, an author, a publisher, anything like that, you can search for those. And in some ways, actually just to sort of help people, because it's actually one of the issues sometimes we have at the National Library with 31 million items, it's almost - where do you start? You know, it's the same when you pick up the, you know, Fringe flyer each year in Edinburgh, it's like, it's just so big, you don't know where to start. And so we've actually got sort of some suggested searches on the homepage as well. So volcanoes in New Zealand, love poems, A Midsummer Night's Dream, you can actually search on those things, just to give you sort of examples of what's in there. And then yeah, you go in and basically, it will just list you items from all the different libraries that we've got that sort of match your search.
Graham Johnston 11:50
I've just gone on as we speak, and just typed in, like, so I'm a Hibs fan, so I typed in - you mentioned football programmes.
Alec Davies 12:00
Right, time to go (laughter)
Graham Johnston 12:01
We've just gained lots and lots of listeners, and maybe lost a few. But, so I just typed in, because you mentioned programmes, which was something that was, you know, interesting, I didn't imagine that would be one of the elements on there. But yeah, it's amazing. You can go back and find previous Hibs programmes, and there are other football programmes that you can go and search as well, right. It's not just Hibs, but obviously, they're the only ones you'd really want to search. But that's incredible and it just documents it as you say, and you can click through and it's a phenomenal resource!
Stuart Lewis 12:31
We've got over 8 million items in there now and, as I say, we reckon as a sort of educated guess, there's probably about 40 million books that have been digitised worldwide. So we're, you know, we're 20% of the way there. So a long, long way to go and in a sense, the collection right now isn't as diverse as it could be. It's mainly sort of from English speaking countries, the items. Having said that, what these libraries have collected and digitised is a little bit more diverse than that. So we have, oh, I added it up the other day, I can't remember what it is now, but I think about 50 or 60 different languages are represented in the collections here. So yeah, we've got everything from, where are we, 36 items in Esperanto, 37 in Tahitian, 45 in Vietnamese, 55 in Mohawk, 66 in Aramaic, and so forth. So you know, we've got, yeah, a fascinating collection from that perspective as well.
Graham Johnston 13:28
It really is and thinking about, you know, during the pandemic, and people not able to get into the libraries I mean, this must have helped a significant amount of people who were doing, you know, academic research or studies or trying to research any element, to be able to go on there and have that resource at their fingertips and be able to access that information that they previously couldn't have. I can't - it must be kind of quite rewarding to think about just how many people that has actually helped.
Stuart Lewis 13:55
Yeah, definitely. And we get, you know, we do get sort of nice comments, particularly sort of through Twitter and things, with people saying, you know, it is really helpful, or just, it's distracting, because, you know, once you're in there, it's a bit like a bookshop, you can just get lost for an hour or two quite easily, which is really nice. But what's also sort of really interesting, and one of the reasons we built it as well, is actually just around the sort of the data element of this as well. So a lot of research now can be done at a different scale. So if you imagine maybe 20 years ago, research in the arts and humanities or whatever, you know, you might go to a library, you get a book out, you read that book for a week, you read another book for a week, and you read another book for a week, and that's the scale that you can do your research. Whereas what we can do now that these are digitised, we can provide someone with 100,000 books. And they can write a computer programme to read those hundred thousand books or whatever, and do research at that scale. So it's what's called digital scholarship - text and data mining, things like that. So we can actually bring this up another level. And so that was, again, one of the features we wanted to make sure was in there, so you actually get a sort of export button at the bottom of your search results as well. So if you've done a search, you can then export those results and, sort of, if you're in that sort of field of research, then download them all and, yeah, sort of do your work, you know, looking at that hundred thousand or whatever of items across the world.
Kirsty McIntosh 15:18
Is this available to schools? Schoolchildren?
Alec Davies 15:21
Anyone, and it's a very important point from the start as well that I don't think we covered: it is free, completely and utterly free. And when the items come up on the list, that means that they're available, so it should never hopefully come up and show you an item that you click on and then see it's not available. Everything is there for anyone to read at any given time. That was a decision made right in the first few weeks, I think, Stuart, on that. So it should be intuitive as well, it's quite easy to use.
Kirsty McIntosh 15:57
I love that. I love the idea of encouraging children to think of books as something like a data mine basically, something that they can research and kind of muck about with, just make them think of all those words slightly differently as well. That's so exciting. It's fantastic. Did you have any kind of major snags on the way, Alec? Did you hit any major obstacles on the road?
Alec Davies 16:19
Very little, it was really just trying to get the resource put together. That was the hardest part of the whole project. After we got the resource, basically got Sarah, etc. on board, everything went to plan. Yeah, it really did, even under the testing. I had an Italian friend of mine who can speak four languages, he did some of the testing to make sure it was okay. I have a book publisher, one of my neighbours. So the journalist who wrote the story is my neighbour and the book publisher is also a neighbour, it's a bit of a literary street, but she checked it to make sure there was nothing under copyright as well, which was her big thing, because obviously, she basically sells books. So yeah, so that's all been checked out as well. So it was good.
Kirsty McIntosh 17:06
So what's the oldest book in the repository, then? Do you know?
Stuart Lewis 17:11
Really good question, but I'm guessing it's gonna be, probably, you know, in the libraries that we've got so far, the Gutenberg Bible. So that, you know, was the first sort of printed book in Europe, so I'm pretty sure that, yeah, I suspect that that's what we've got. So yeah, around about 1455.
Graham Johnston 17:37
And what's next now then, so you've done this, this has been a great project. What's next on the roadmap for the digitisation of the National Library of Scotland?
Stuart Lewis 17:49
Yeah, so in a sense, I mean, we continue our digitisation, so we digitise in the order of about somewhere between 100,000 and 200,000 items a year. So that's us. But obviously libraries worldwide are digitising, everything from, you know, small local libraries digitising their local collections, maybe their local newspapers, their local parish magazines, that sort of thing, through to your national libraries digitising in the hundreds of thousands, and then even the likes of Google Books, you may have heard of them, you know, Google are digitising millions of items a year. And so for us, I suppose there's two areas, really, with the site and with the project. One is to actually continue developing the site. So it is still very much sort of quite a young site, I guess, almost what you might call a minimum viable product. It works, it's searchable. However, there's an awful lot of functionality, you know, we still need to build, but obviously we wanted to get this resource out there, you know, because we could work on this for another six months or a year, you know, but we wanted to get it out there. So there's a lot of functionality we still need to build, you know, things like advanced search capability, that sort of thing, but the like, that can come over time. So that's one thing, we need to keep sort of developing the site in that respect, and then the other half is we need to start obviously gathering more data from libraries worldwide. So as I said, we've got about 8 million items so far, so that's probably, you know, about another 32 million items to go, so yeah, we need to start working with libraries and making those relationships with them and then gathering that data.
Kirsty McIntosh 19:22
How friendly is the library world? I mean, are you able to negotiate globally with other libraries to have access and stuff like that? You are friends together? I mean, I have to say I do note on Twitter that Orkney and Shetland have quite the fight, which is just wonderful on Twitter. They compete with one another but I know it's very tongue in cheek.
Stuart Lewis 19:41
Oh yeah. I mean, libraries actually, it's the wonderful thing about working in libraries, you know, everything we do, we get to share openly. You know, we have no commercial secrets. We have no non disclosure agreements, that type of thing. You know, everything we do we do so that we can make it available and share it with people. And so yeah, we have a bit of rivalry as you can imagine with sort of peer libraries - who's digitising the most, who's got the biggest percentage online, that sort of thing.
Kirsty McIntosh 20:09
The biggest scanner.
Stuart Lewis 20:12
We have, we have a scanner we've recently bought called the Dragon. It was the first in the UK, I think, and everyone's quite, quite jealous of that. But yeah, in a sense, what we're doing with this site is actually in a sense, amplifying a lot of the work of other libraries as well, because we're just making things they have even more available again. And then we drive that traffic to their site. So yeah, sort of from like, you know, from the perspective of other libraries, it's a very positive development as well.
Graham Johnston 20:40
Sounds like a documentary, this just sounds like, you know, a filmmaker should be doing a documentary on this - it sounds amazing, you know, the project of digitised libraries and the unknown competitive streak between the different libraries, you know, I'd never even thought about that. But you know what, I think that would grab viewers' attention. So for all the budding filmmakers out there, this is a great opportunity to get in touch.
Kirsty McIntosh 21:04
So you've done this project, Alec, and you've obviously really, really enjoyed it. So are you up for another project within the Tech Army? Or have you had enough for now?
Alec Davies 21:14
No, I'm picking up another project for the Scottish Government. And the trouble that I've got with this one is just trying to go through all the security checks, etc. Nothing seems to work very quickly, to be frank.
Kirsty McIntosh 21:28
They're very thorough.
Alec Davies 21:28
I mean, you fill in these forms and you wonder if anything's ever going to happen again. So I have been waiting about a week for things to come through. So it's quite an interesting project as well.
Graham Johnston 21:42
So I'm just on the site now. And I can see that you can contribute and you can feedback as well. So is this stuff that you're really looking for people to provide the feedback and also the contributions to it, Stuart?
Stuart Lewis 21:53
Absolutely. Both of those. So the contributions, yeah, really is looking for other libraries right across the world to give us data that we can, you know, more digitised books that we can put onto here. And then the feedback, yeah, we obviously absolutely love the feedback, because that's what helps us to refine the site to make it better, and so forth. So whether it's sort of positive feedback, negative feedback, or just telling us what worked, what didn't, yeah, the feedback's always invaluable for us.
Kirsty McIntosh 22:21
Yeah, it's the best way to test a product, isn't it, put it out there and let everybody tell you how good or bad it is and then that gives you a path to follow to make it better, doesn't it? Alec, can you tell me a little bit about Sarah, you said that she was a designer who could code. I had a conversation, two conversations actually, in the last week with... one was a woman who's a paralegal who has also trained in code, but it's to complement the job that she does. And the other was a chap who has also put himself through CodeClan, because he wanted complementary coding skills to the career he already has. It sounds to me like Sarah's sort of very much the same. Would you say that's what she does? That's how she describes herself?
Alec Davies 23:01
Yeah, she's primarily a designer at the end of the day, and it was all her screens that you're seeing up there on Opentexts.world. We were struggling to actually get a coder, if we call it that. And so she basically did it herself. I think she, to be fair, she got some help from her partner, as well, he could do some work as well. And, you know, it was how it was done.
Kirsty McIntosh 23:27
Interesting, it's the future of work, I think is those complementary skills that just about every job you have, you also have to code which I think is a good message to get out there.
Graham Johnston 23:35
I think it's so important, I really do. I'm seeing that more and more and more as well in my line of work is that people are you know, cross skilled and that's brilliant because a) because they want to do it and it adds more variety, but b) because just examples like that, where you can just get stuff done by having those different skills. Amazing.
Alec Davies 23:55
So I could talk about that for a long time, actually. Remember I've been in IT since 1980. And it was much more generalist pieces of work. I mean, for example, I was a DBA, I was a CICS programmer. You know, I did the networks, did everything in those days. And because of outsourcing and trying to get cheaper people to do it, IT compartmentalised an awful lot of work. And we've lost a lot of the generalists in IT. Too many people only know a specific thing in IT these days without actually knowing the bigger picture. That's the old guys like me speaking by the way, we all think the same thing.
Kirsty McIntosh 24:39
I think that's very true that you need the people that can see the bigger picture as much as you need the people that can deal with the detail. I think it's always very much you need both.
Alec Davies 24:50
And Sarah very much had that attitude of "well I can do it. I'll get on with it", which was fantastic.
Kirsty McIntosh 24:56
Have you got a lot of coders in your librarian pool there, Stuart?
Stuart Lewis 25:00
Yeah, I mean, we're lucky, we actually have a team of four software developers, you know, a lot of people don't really realise I suppose that a library would have software developers or have the need for software developers. But you know, so much of what we do, obviously, is online and digital now. And so much of what we do is handling data in one way or another. So I suppose for those that like numbers, it might mean nothing to a lot of people, but just, for example, the storage now that we have at the National Library of Scotland is around about five and a half petabytes. So we're talking sort of serious amounts of storage here. You know, that's the sort of scale that we work at. And so obviously, yeah, having a team of software developers is great. But obviously with lockdown, you know, they were heads down and extremely busy, working on all of our digital services and moving even more things online. And so that was actually, yeah, why, again, Scottish Tech Army was absolutely fab in being able to come on and help us with this and add even more resource for getting our library materials available during lockdown.
Kirsty McIntosh 26:03
It's absolutely brilliant.
Graham Johnston 26:05
That's an astonishing amount of data. And all the mathematicians are trying to sort of work out how much that would cost them on an iCloud account right now.
Kirsty McIntosh 26:16
Well, listen, I just think it's great. I can't wait to go back in and I'm not going to be looking for Hibs programmes or Hearts programmes for that matter, but I've got a few ideas about some of the things I want to go in..
Alec Davies 26:27
I think if you just type in wee team that comes up with the Hibs stuff.
Graham Johnston 26:31
Oh, dear, it's descended into that.
Alec Davies 26:36
I'm actually a rugby fan. Rugby's my passion, I'm an Edinburgh season ticket holder.
Kirsty McIntosh 26:43
Well, listen, Stuart, thank you very much for your time today and congratulations on such a fantastic collection there and Alec, well done and congratulations on leading a team to deliver it for them. I think it's absolutely fabulous and I can't wait for everybody to hear about it. It's great. Thanks very much for your time.
Alec Davies 27:03
Thank you very much.
Graham Johnston 27:04
Well done guys, thank you.
Kirsty McIntosh 27:11
Before we go from this episode I’d just like to express my thanks to all of the people who’ve got in touch to give us feedback on last week’s mental health episode. Graham and I are very grateful for your feedback, Rab and Julie were fantastic guests and we very much hope we’ll be able to return to the topic again in the future. In the meantime, please don’t forget to look out for one another. Bye for now.