Title: | Interact with Metadata Records and Media on the Europeana Repository |
---|---|
Description: | Interact with the Europeana repository, which is modeled with the Europeana Data Model (EDM), through a variety of API endpoints. The repository contains digital collections from thousands of institutions around Europe. This translates to millions of Cultural Heritage Objects in the form of image, text, video, sound and 3D, accompanied by rich metadata. The design of the Data Model follows the core principles and best practices of the Semantic Web and Linked Data efforts to which Europeana contributes (see, e.g., Doerr, Martin, et al. The europeana data model (edm). World Library and Information Congress: 76th IFLA general conference and assembly. Vol. 10. 2010.). The package also provides methods for bulk downloads of specific subsets of items, including both their metadata and their associated media files. |
Authors: | Alexandros Kouretsis [aut, cre], Andreas Giannakoulopoulos [aut], Laida Limniati [aut] |
Maintainer: | Alexandros Kouretsis <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.1.0 |
Built: | 2025-02-20 03:29:36 UTC |
Source: | https://github.com/alekoure/europeanar |
Downloads media using the response object of the Europeana Search API. It uses the 'type' and 'edmIsShownBy' fields to retrieve the items and stores them in a local folder.
download_media(resp, download_dir = NULL, type_ = NULL, quiet = TRUE)
resp | an S3 object of type 'europeana_search_api' or 'cursored_search' |
download_dir | destination directory. If 'NULL' then 'tempdir()' is used |
type_ | string in 'c("TEXT", "IMAGE", "SOUND", "VIDEO", "3D")' |
quiet | boolean to suppress download file messages |
destination folder
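The filtering and path-building step described above can be sketched in base R. This is only an illustration under stated assumptions: `plan_downloads()` is a hypothetical helper (not a package function), the item table and its URLs are made up, and naming each file after the last segment of its URL is a guess rather than the package's documented behaviour.

```r
# Hypothetical items carrying the 'type' and 'edmIsShownBy' fields
items <- data.frame(
  type = c("IMAGE", "TEXT", "IMAGE"),
  edmIsShownBy = c(
    "https://example.org/media/a.jpg",
    "https://example.org/media/b.pdf",
    "https://example.org/media/c.png"
  ),
  stringsAsFactors = FALSE
)

# plan_downloads() is an illustrative helper, not part of the package
plan_downloads <- function(items, download_dir = NULL, type_ = NULL) {
  # Fall back to the session's temporary directory, as documented for download_dir
  if (is.null(download_dir)) download_dir <- tempdir()
  # Keep only the requested media type(s), if any were given
  if (!is.null(type_)) items <- items[items$type %in% type_, , drop = FALSE]
  data.frame(
    url = items$edmIsShownBy,
    destfile = file.path(download_dir, basename(items$edmIsShownBy)),
    stringsAsFactors = FALSE
  )
}

plan <- plan_downloads(items, type_ = "IMAGE")
print(plan)
```

Each row of `plan` could then be passed to `utils::download.file(url, destfile, quiet = TRUE)`; the download itself is left out so the sketch runs without network access.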
# set your API key with set_key(api_key = "XXXX")

# example_1
resp <- query_search_api("arioch", rows = 2)
download_media(resp, type_ = "IMAGE")

# example_2 bulk download
res_bulk <- tidy_cursored_search(query = "animal", qf = "when:17 AND what:painting", max_items = 3)
download_media(res_bulk)
This function is a simple wrapper around the base function 'Sys.getenv'. It gets the value of the environment variable 'EUROPEANA_KEY'.
get_key()
Character string with the API key stored in the environment variable 'EUROPEANA_KEY'.
The Record API provides direct access to the Europeana data, which is modeled using the Europeana Data Model (EDM). While EDM is an open, flexible data model featuring various kinds of resources and relations between them, the Record API (and the Europeana Collections Portal) supports the retrieval of a segment of EDM for practical purposes.
These "atomic" EDM segments typically contain one Cultural Heritage Object (CHO), aggregation information that connects the metadata and digital representations, and a number of contextual resources related to the CHO, such as agents, locations, concepts, and time.
query_record_api(id, path = "/record/v2", ...)
id | string with the 'RECORD_ID' in the form of '/DATASET_ID/LOCAL_ID' |
path | string that indicates version of the API |
... | other parameters passed as query parameters |
S3 object of class 'europeana_record_api'. Contains the parsed content, the path, and the API response compatible with 'httr' methods.
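Because the 'id' argument is expected in the form '/DATASET_ID/LOCAL_ID', a quick shape check before calling the API can catch malformed identifiers early. The sketch below assumes only that format; `is_record_id()` is a hypothetical helper and the identifiers are made up.

```r
# is_record_id() is an illustrative helper, not part of the package.
# It checks for exactly two non-empty segments, each preceded by a slash.
is_record_id <- function(id) grepl("^/[^/]+/[^/]+$", id)

ids <- c("/123/abc_456", "123/abc_456", "/123")
valid <- is_record_id(ids)
print(valid)
```

Only the first identifier passes: the second lacks the leading slash and the third has no LOCAL_ID segment.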
https://pro.europeana.eu/page/record
Doerr M, Gradmann S, Hennicke S, Isaac A, Meghini C, Van de Sompel H (2010). “The europeana data model (edm).” In World Library and Information Congress: 76th IFLA general conference and assembly, volume 10, 15.
Wickham H (2020). httr: Tools for Working with URLs and HTTP. https://httr.r-lib.org/, https://github.com/r-lib/httr.
Ooms J (2014). “The jsonlite Package: A Practical and Consistent Mapping Between JSON Data and R Objects.” arXiv:1403.2805 [stat.CO]. https://arxiv.org/abs/1403.2805.
# set your API key with set_key(api_key = "XXXX")

# query search API
res <- query_search_api("arioch", qf = "1712", media = TRUE)

# get results in tidy format
dat <- tidy_search_items(res)

# query records API for each item
lapply(dat$id, query_record_api)
The Search API lets you search the Europeana repository for metadata records and media, and it is the most straightforward of the APIs to use. It works in a similar fashion to the Europeana website when it comes to interacting with the data. You can use the API to search for keywords, and it will return any entries that contain those words. You can refine your search with more advanced queries such as Boolean searches, or you can filter out parts of the results with advanced filtering.
query_search_api(
  query = NULL,
  rows = NULL,
  profile = NULL,
  qf = NULL,
  reusability = NULL,
  media = NULL,
  thumbnail = NULL,
  landingpage = NULL,
  colourpalette = NULL,
  facet = NULL,
  limit = NULL,
  start = NULL,
  sort = NULL,
  path = "/record/v2/search.json",
  ...
)
query | (character) string with the search term(s) |
rows | (numeric) that indicates the number of records to return |
profile | (character) Profile parameter controls the format and richness of the response. |
qf | (character) Facet filtering query. This parameter can be defined more than once. |
reusability | (character) Filter by copyright status. |
media | (character) Filter by records where a URL to the full media file is present in the edm:isShownBy or edm:hasView metadata and is resolvable. |
thumbnail | (character) Filter by records where a thumbnail image has been generated for any of the WebResource media resources (thumbnail available in the edmPreview field). |
landingpage | (character) Filter by records where the link to the original object on the provider's website (edm:isShownAt) is present and verified to be working. |
colourpalette | (character) Filter by images where one of the colours of an image matches the provided colour code. You can provide this parameter multiple times; the search will then do an 'AND' search on all the provided colours. See colour palette. |
facet | (character) Name of an individual facet. See individual facets. |
limit | (numeric) The number of records to return. Maximum is 100. Default: 10 |
start | (numeric) The item in the search results to start with. The first item is 1. Default: 1 |
sort | (character) Sort by a field, e.g., |
path | (character) URL signature with the API version |
... | parameters passed in get request |
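The 'start' argument is 1-based and a single request returns at most 100 records, so paging through a larger result set means stepping 'start' by the page size. A minimal sketch of that arithmetic follows; `page_starts()` is a hypothetical helper and the total of 250 results is invented.

```r
# page_starts() is an illustrative helper, not part of the package.
# It returns the 1-based 'start' value of every page needed to cover 'total' results.
page_starts <- function(total, page_size = 100) {
  stopifnot(page_size >= 1, page_size <= 100)  # documented maximum per request
  if (total < 1) return(numeric(0))
  seq(1, total, by = page_size)
}

starts <- page_starts(250)
print(starts)
```

Each value could then be supplied as 'start' in a separate `query_search_api()` call with the same query and page size.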
The Search API inherently supports the Apache Lucene Query Syntax in the 'query' parameter. Users can consult the Lucene and Apache SOLR guides to get the most out of the Europeana repository.
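Boolean Lucene expressions are plain strings, so they can be assembled in R before being passed as 'query' or 'qf'. The sketch below uses a hypothetical helper, `combine_terms()`, and reuses the field-qualified terms that appear in the package examples.

```r
# combine_terms() is an illustrative helper, not part of the package.
# It joins terms with a Lucene boolean operator surrounded by spaces.
combine_terms <- function(..., op = c("AND", "OR")) {
  op <- match.arg(op)
  paste(c(...), collapse = paste0(" ", op, " "))
}

filter_expr <- combine_terms("when:17", "what:painting")
query_expr <- combine_terms("animal", "bird", op = "OR")
print(filter_expr)
print(query_expr)
```

The resulting strings could be supplied as, for example, `query_search_api(query = query_expr, qf = filter_expr)`.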
The response is always formatted in JSON and will contain a number of fields that present information about the handling of the request, while the concrete information about the record is presented in the "items" field (see Metadata Sets). The parsed information is stored in the 'content' field of the S3 object returned.
S3 object of class 'europeana_search_api'. Contains the parsed content, the path, and the API response compatible with 'httr' methods.
https://pro.europeana.eu/page/search
Doerr M, Gradmann S, Hennicke S, Isaac A, Meghini C, Van de Sompel H (2010). “The europeana data model (edm).” In World Library and Information Congress: 76th IFLA general conference and assembly, volume 10, 15.
Wickham H (2020). httr: Tools for Working with URLs and HTTP. https://httr.r-lib.org/, https://github.com/r-lib/httr.
Ooms J (2014). “The jsonlite Package: A Practical and Consistent Mapping Between JSON Data and R Objects.” arXiv:1403.2805 [stat.CO]. https://arxiv.org/abs/1403.2805.
# set your API key with set_key(api_key = "XXXX")

# query search API
res <- query_search_api("arioch", qf = "1712", media = TRUE)
This function is a simple wrapper around the base function 'Sys.setenv'. It sets the value of the environment variable 'EUROPEANA_KEY'. Alternatively, use .Renviron to set the key. Get an API key at the following link: https://pro.europeana.eu/page/get-api.
set_key(api_key)
api_key | string with the API key |
No return value; called for its side effect of setting the environment variable 'EUROPEANA_KEY'.
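The round trip between setting and reading the key can be illustrated with the two base functions the package wraps. The stand-ins below are assumptions written for illustration, not the package source, and "XXXX" is a placeholder rather than a real key.

```r
# Illustrative stand-ins for set_key() and get_key(); not the package source
set_key_sketch <- function(api_key) {
  Sys.setenv(EUROPEANA_KEY = api_key)
  invisible(NULL)
}
get_key_sketch <- function() Sys.getenv("EUROPEANA_KEY")

set_key_sketch("XXXX")  # placeholder value
stored <- get_key_sketch()
print(stored)

# Sys.getenv() returns an empty string for an unset variable,
# so nzchar() is a simple way to detect a missing key
has_key <- nzchar(get_key_sketch())
```

The .Renviron alternative amounts to a single line of the form `EUROPEANA_KEY=XXXX` in that file, which R reads at startup.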
This function is a "runner" for a particular query: it makes consecutive API requests until the maximum number of items is reached or all related items have been collected.
tidy_cursored_search(query, max_items = 10000, ...)
query | string with the search term(s) |
max_items | numeric that indicates max items collected |
... | params passed to get request, see also 'query_search_api()' |
S3 object of type 'cursored_search'. Contains a 'data.table' with all the responses transformed to tabular format, the path to the first request that starts the cursored search, and the corresponding response object compatible with 'httr' methods.
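The stopping rule of such a runner (stop at 'max_items' or when no further page exists) can be sketched without network access. Everything here is hypothetical: `fetch_page()` reads from an in-memory list instead of calling the API, and the convention of starting at cursor "*" and following a 'nextCursor' value is an assumption about how the cursored requests are chained.

```r
# Hypothetical in-memory "API": three pages of item ids linked by cursors
pages <- list(
  "*"  = list(items = c("/1/a", "/1/b"), nextCursor = "c1"),
  "c1" = list(items = c("/1/c", "/1/d"), nextCursor = "c2"),
  "c2" = list(items = c("/1/e"), nextCursor = NULL)
)
fetch_page <- function(cursor) pages[[cursor]]

# collect_items() is an illustrative loop, not the package implementation
collect_items <- function(max_items = 10000) {
  collected <- character(0)
  cursor <- "*"
  # Keep requesting while a next page exists and the cap is not yet reached
  while (!is.null(cursor) && length(collected) < max_items) {
    page <- fetch_page(cursor)
    collected <- c(collected, page$items)
    cursor <- page$nextCursor
  }
  # The last page may overshoot the cap, so trim the result
  head(collected, max_items)
}

capped <- collect_items(max_items = 3)
everything <- collect_items()
print(capped)
```

Trimming after the loop keeps the cap exact even when the page size does not divide 'max_items' evenly.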
# set your API key with set_key(api_key = "XXXX")

# query search API up to 3 items
res <- tidy_cursored_search(query = "animal", max_items = 3, theme = "art", media = TRUE)
head(res$data[, 1:3])
Transforms API response to a tidy 'data.table' for easier manipulation
tidy_search_items(resp)
resp | an S3 object of type 'europeana_search_api' |
'data.table' with stacked results collected from the search api. Each row corresponds to a Cultural Heritage Object.
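The idea of stacking one row per Cultural Heritage Object can be shown with base R on a mocked response. This is a sketch only: the `content` list and its field values are invented, and it builds a plain 'data.frame' rather than the 'data.table' the package returns.

```r
# Hypothetical parsed content with an 'items' field, one entry per object
content <- list(
  items = list(
    list(id = "/1/a", type = "IMAGE", title = "First object"),
    list(id = "/1/b", type = "TEXT", title = "Second object")
  )
)

# Turn each item into a one-row data.frame, then stack the rows
rows <- lapply(content$items, function(x) {
  data.frame(id = x$id, type = x$type, title = x$title, stringsAsFactors = FALSE)
})
tidy <- do.call(rbind, rows)
print(tidy)
```

Real responses carry many more fields per item, some of them nested, so an actual implementation has to handle missing and list-valued fields that this sketch ignores.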
# set your API key with set_key(api_key = "XXXX")

resp <- query_search_api("arioch")
tidy_search_items(resp)