This workshop is a hands-on data exploration and a challenge to become a derived data-set author on the British Library’s open data-set platform (https://data.bl.uk). Participants will be able to upskill with new or familiar tools, gain familiarity with the British Library's data, and develop ideas for research projects they may want to work on in the future.
- Do you want to understand some of the challenges of working with cultural heritage data in a large national library such as the British Library?
- Do you want to explore and get some 'hands-on' experience of working with the British Library’s digital collections and data?
- Do you want to leave a ‘legacy’ of being a data-set author/creator/curator on the British Library’s data-set platform?
- Do you have some digital literacy in using familiar data exploration tools such as Microsoft Excel (see 'GUIDANCE FOR THIS WORKSHOP' below)?
If the answer is 'Yes' to any of these, then this workshop could be for you!
Mahendra Mahey, manager of British Library Labs (BL Labs), will examine some of the British Library’s digital collections/data and discuss the challenges he has faced in making this cultural heritage data available openly or onsite at the British Library.
In this workshop, you will get to explore data-sets already available on https://data.bl.uk.
The workshop will conclude with reflections from the delegates and may highlight a number of derived data-sets generated by participants on the day that could then be published on https://data.bl.uk. If selected, these new derived data-sets will be attributed with the creators'/authors' details, and each will have its own citable Digital Object Identifier (DOI). These new data-sets would then be available for reuse by any researcher in the world.
GUIDANCE FOR THIS WORKSHOP
We strongly recommend you come to this workshop with an appropriate device, such as a laptop, pre-installed with appropriate tools to analyse different kinds of data-sets, e.g. Microsoft Excel may work with smaller data-sets such as metadata (see other data exploration tools below). If you don't have one and would still like to attend, please request to 'pair up' with someone who is willing to share and has already signed up.
Other data exploration tools include: Notepad++ (e.g. for viewing text and XML); OpenRefine (e.g. for cleaning data); Tableau Public (e.g. for visualising data); Google Fusion Tables (e.g. for visualising geo-spatial data); spaCy (e.g. for text and data mining); RStudio (an open-source statistical package); MATLAB (a data analysis tool); and NLTK (for natural language processing).
Please note that this workshop is NOT about training you to use any of these tools; it is about using tools you may already be familiar with to explore and find patterns in our data.
File formats you may be examining in this workshop could include: .ZIP, .PDF, .TXT, .CSV, .TSV, .XLS, .XLSX, RDF, .nt, XML (TEI, ALTO and bespoke), .JSON, .JPG, .JPEG, .TIFF and .WARC.
Please ensure you are able to read these files on your device before the workshop if you are interested in exploring them during our session.
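If you are comfortable with Python, one possible way to check a few of the text-based formats ahead of the session is sketched below. This is only an illustration, not an official BL Labs tool: the function name check_readable is hypothetical, the files are assumed to be UTF-8 encoded, and only formats that the Python standard library can parse are covered (not PDF, Excel, image or WARC files).

```python
import csv
import json
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path


def check_readable(path):
    """Try to parse a file based on its extension and return a short summary.

    Hypothetical helper for illustration; assumes UTF-8 text files.
    Raises an exception if the file cannot be parsed.
    """
    path = Path(path)
    ext = path.suffix.lower()
    if ext in (".csv", ".tsv"):
        # Tab-separated for .tsv, comma-separated for .csv
        delimiter = "\t" if ext == ".tsv" else ","
        with path.open(newline="", encoding="utf-8") as handle:
            rows = list(csv.reader(handle, delimiter=delimiter))
        return f"{len(rows)} rows"
    if ext == ".json":
        with path.open(encoding="utf-8") as handle:
            data = json.load(handle)
        return f"JSON {type(data).__name__}"
    if ext == ".xml":
        # Works for any well-formed XML, e.g. TEI or ALTO
        root = ET.parse(str(path)).getroot()
        return f"XML root <{root.tag}>"
    if ext == ".zip":
        with zipfile.ZipFile(path) as archive:
            return f"{len(archive.namelist())} archive members"
    if ext == ".txt":
        text = path.read_text(encoding="utf-8")
        return f"{len(text)} characters"
    raise ValueError(f"no check defined for {ext!r}")
```

Calling check_readable on a downloaded file, for example check_readable("metadata.csv") with a file name of your own, either returns a one-line summary or raises an error indicating that the file could not be read with these assumptions.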
Places are free but limited, so please register via the Eventbrite link!