Abstract
The Piccolo Film Super 8 Home Cinema Dataset contains information on 5,425 reduction
prints for private sell-through from the German distributor piccolo film. The dataset is
presented in a table in CSV format. It was created by the DiCi-Hub project team at Johannes
Gutenberg-Universität Mainz.
The entries in the dataset are based on mail-order catalogues from piccolo film from 1974 to
1982, courtesy of Andreas Chmielewski. The catalogues were digitised by the Service
Centre for Digitisation and Photo Documentation at Johannes Gutenberg-Universität Mainz's
university library, and the text was extracted using the optical character recognition tool
Tesseract. This text was then transformed into a table using ChatGPT. Finally, the table was
checked manually for errors, and missing information was added.
Each catalogue entry, and therefore our dataset, contains the following information
about the reduction prints: order number, title, price category, length, whether it is in colour or
black and white, and whether it has sound or is silent. The films were sorted into marketing
categories based on genre or prestigious production companies, such as Walt Disney. Some
films came with additional information, such as being split into several parts or having age
restrictions due to violence or erotic content. Additionally, we added to our table the page on
which the reduction print was listed in the catalogue, the catalogue's title and the year. Film
titles and marketing categories were transcribed exactly as they appeared in the catalogues.
Consequently, similar categories are not combined under one term; for example, children’s
tales can be found under both Märchen and Märchenfilme.
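Because the categories are transcribed verbatim, analyses that count films per category may first want to merge spelling variants. The following is a minimal sketch of one way to do this, not part of the dataset or its tooling. The column names (`order_number`, `title`, `category`), the sample rows and the mapping are hypothetical placeholders; only the two labels Märchen and Märchenfilme come from the description above.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows: column names and titles are placeholders,
# not values taken from the actual dataset.
SAMPLE_CSV = """order_number,title,category
0001,Beispielfilm A,Märchen
0002,Beispielfilm B,Märchenfilme
0003,Beispielfilm C,Walt Disney
"""

# Assumed working mapping from catalogue spellings to one analysis label.
# The original transcription stays untouched in the CSV itself.
CATEGORY_MAP = {
    "Märchen": "Märchen",
    "Märchenfilme": "Märchen",
}


def normalise_category(raw):
    """Return the working label for a catalogue category string."""
    label = raw.strip()
    # Unknown categories pass through unchanged.
    return CATEGORY_MAP.get(label, label)


def count_categories(csv_text):
    """Count rows per normalised category in CSV text with a 'category' column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(normalise_category(row["category"]) for row in reader)


counts = count_categories(SAMPLE_CSV)
for category, n in sorted(counts.items()):
    print(category, n)
```

Keeping the mapping in a separate dictionary leaves the verbatim transcription intact and makes every merging decision explicit and reversible.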
To enrich the data with additional information from TMDB, our student assistant Oğuz
Can Ayverdi developed a TMDB reconciliation service (API) for OpenRefine. Further information
about installation, use and implementation can be found in the ReadMe file provided with the
API. The API was developed as an alternative to Wikidata and IMDb.
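For readers unfamiliar with how OpenRefine talks to such a service, the sketch below builds a batch of title queries in the shape used by the general OpenRefine reconciliation protocol (a `queries` form parameter holding a JSON object of keyed queries). It is an illustration under assumptions, not the project's implementation: the function names and the example titles are hypothetical, and the actual endpoint and supported parameters are documented in the ReadMe of the service.

```python
import json
from urllib.parse import urlencode


def build_queries(titles, limit=3):
    """Build a reconciliation query batch keyed q0, q1, ... (hypothetical helper)."""
    return {
        f"q{i}": {"query": title, "limit": limit}
        for i, title in enumerate(titles)
    }


def encode_form_body(queries):
    """Encode the batch as a form body with a single 'queries' parameter."""
    return urlencode({"queries": json.dumps(queries, ensure_ascii=False)})


# Placeholder titles, not taken from the dataset.
batch = build_queries(["Beispielfilm A", "Beispielfilm B"])
body = encode_form_body(batch)
print(body)
```

In practice OpenRefine generates these requests itself once the service URL has been added as a reconciliation source; the sketch only shows what is exchanged.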
In addition to conserving part of home movie culture, the dataset was designed to enable
researchers and lecturers to begin working with digital methods and tools. At an earlier
stage, the dataset was tested on B.A. and M.A. courses to provide insights into data
collection, modelling, enrichment, visualisation and critique.
TMDB Reconciliation Service: https://codeberg.org/oguzcanayverdi/tmdb-reconciliation-service
Unless otherwise specified, this item is licensed under: Creative Commons - Attribution - ShareAlike