Libraries perpetually balance our patrons' right to privacy against our
organizational need for usage data. The best way to ensure patron data
is not compromised is to not keep it in the first place, but we need
some level of data to ensure we know where our collection is and who's
responsible for it. Further, we often want to maintain some broader
statistical data about which types of items get checked out and when so we
can make larger planning decisions. What information is truly necessary? What information should or should not be stored in Koha's data?
Libraries can help protect patron privacy with patron category settings and system preferences that can either store or anonymize patron history. Pseudonymization is a tool that allows libraries to retain statistical information about transactions without linking it to any individual borrower.
Koha's Privacy Tools, or, what to read if you are short on time
Patron categories can be configured with privacy settings that control how long circulation history is kept, if at all. The options are:
Never: delete circulation history when an item is returned
Forever: keep history forever
Default: delete history after a period of time
Koha has two features that give libraries a way to report on usage
statistics without retaining direct links between transactions and
borrowers. What's the difference?
Batch anonymization removes
borrower numbers from circulation history after materials are returned
and attributes all circulations to an "anonymous" patron. This allows
libraries to report on circulation statistics without any borrower
information attached. The AnonymousPatron system preference tells Koha which patron account to use as the anonymous patron. Whether or not history is anonymized is set in the patron category configuration.
Pseudonymization lets libraries store a lot more
data about patrons and transactions, but without a meaningful link to
individual patrons. Transaction and borrower data is also recorded in
"pseudonymized" tables in the database. Borrower numbers are encrypted
in these tables. With pseudonymization, a library could retain and
report on borrower ages, cities, the library something was borrowed
from, etc.
For ByWater partner libraries, the system preference Pseudonymization is turned off by default. A library can enable this feature and then also specify what data to keep.
Patron Data
Libraries can help protect patron privacy with patron category settings
and system preferences.
Personally identifiable information (PII) is any data that could potentially
identify a specific individual. Any information that can be used to
distinguish one person from another and can be used to deanonymize
previously anonymous data is considered PII.
Koha can automatically anonymize patron circulation history by replacing the patron who borrowed an item with an "anonymous borrower" after the item is returned.
Libraries can specify how patron circulation data is anonymized for each patron category. System preferences control when and how staff and patrons
can see patron reading history, holds history, and suggestion history.
Circulation statistics are retained when patron history is anonymized.
Patron Categories
Default privacy settings can be assigned for each patron category. The default
privacy setting controls how long a patron’s checkout history is kept for new patrons. Note: changing this setting does not alter the privacy settings of existing patrons.
Default privacy options:
"Never" anonymizes patron checkouts immediately on return
"Forever" keeps patron checkout history indefinitely
"Default" keeps patron checkout history for a period of time controlled by the cron job
batch_anonymise.pl. This cron job can be set to anonymize after a
certain period, such as 30, 180, or 365 days.
System preferences control whether patrons can independently choose to keep
their reading history and holds history. If patrons are allowed to
choose their privacy settings, these choices will override what is set
at the patron category level. If a library sets a patron category to
never keep history and an individual patron chooses to keep circulation
history forever, that history is recorded and is visible to the patron
and potentially to staff. See below for system preferences controlling
visibility.
Note for those libraries using Aspen Discovery with Koha: "Default" retains reading history in Aspen for new patrons. They must opt out. "Never" does not retain reading history in Aspen. New patrons must opt in.
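The way a patron's own choice overrides the category default can be sketched as a small function. This is only an illustrative model, not Koha's code: the function name `effective_privacy`, its parameters, and the lowercase option strings are assumptions made for the example, and `patrons_may_choose` simply stands in for the library allowing patrons to pick their own setting.

```python
from typing import Optional

PRIVACY_OPTIONS = {"never", "default", "forever"}


def effective_privacy(category_default: str,
                      patron_choice: Optional[str],
                      patrons_may_choose: bool) -> str:
    """Return the privacy option that applies to a patron's checkouts.

    Hypothetical model: the patron's own choice wins only when the
    library lets patrons choose; otherwise the category default applies.
    """
    for value in (category_default, patron_choice):
        if value is not None and value not in PRIVACY_OPTIONS:
            raise ValueError(f"unknown privacy option: {value!r}")
    if patrons_may_choose and patron_choice is not None:
        return patron_choice
    return category_default


# Category says "never", but the patron opted to keep history forever.
print(effective_privacy("never", "forever", patrons_may_choose=True))   # forever
# Same patron, but the library does not let patrons choose.
print(effective_privacy("never", "forever", patrons_may_choose=False))  # never
```

The first call mirrors the case described above: the category never keeps history, yet the patron's choice to keep it forever is what takes effect.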
Patron Privacy Preferences
OPACPrivacy - When set to allow, patrons can choose their privacy settings for their reading history. This requires opacreadinghistory and AnonymousPatron.
opacreadinghistory - When set to allow, patrons will be able to see what books they have checked out in the past.
intranetreadinghistory - When set to allow, staff can view a patron’s checkout history if the library or patron has chosen to record it.
intranetreadinghistoryholds - When set to allow, staff will have access to a patron's hold history if the library or patron has chosen to record it.
AnonymousPatron - This setting assigns a number that will replace the borrower number
when patron history is anonymized. This is used for anonymous
suggestions, holds, and checkout history. In order to use this, the library must have an "anonymous" patron account.
OPACHoldsHistory - When set to allow, patrons will see the list of their past holds on the OPAC.
The Koha manual has examples of how staff will see patron circulation history and holds history if intranetreadinghistory or intranetreadinghistoryholds are set to "Allow".
If OPACPrivacy and opacreadinghistory are set to allow, patrons will have a section of their account called "Your privacy." This is where they can choose to keep or delete their history. More
information and examples of what patrons see in the OPAC can be found in
the Koha manual.
Note: If a library has set the system preference StoreLastBorrower to "Store", the patron will also see a note about how this information
is being stored in the library: "Please note, the last person to return
an item is tracked for the management of items returned damaged."
Transaction Data
As items get checked out, each transaction is recorded in the Koha
issues table. This records the itemnumber of the item and the
borrowernumber of the patron, thereby linking the item and patron.
After an item is returned,
the transaction moves from the issues table to the old_issues table. If
patron reading history is not kept, the patron's borrower number is
replaced with the anonymous patron number.
The issues table shows the patron's borrowernumber: 19. Here's that same checkout after the item was returned:
The borrowernumber has been replaced with the anonymous borrowernumber: 54.
The issues and old_issues tables know that this checkout happened and
when it happened, but they can't accurately tell us anything about the
patron who was involved.
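The move from issues to old_issues can be sketched with a small in-memory database. This is a simplified illustration, not Koha's schema or code: the tables carry only three columns, the `return_item` function and the item number 1001 are made up for the example, and SQLite stands in for Koha's real database. The borrower numbers 19 and 54 are the ones from the example above.

```python
import sqlite3

ANONYMOUS_PATRON = 54  # stand-in for the AnonymousPatron system preference

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE issues (
        issue_id INTEGER PRIMARY KEY, borrowernumber INTEGER, itemnumber INTEGER);
    CREATE TABLE old_issues (
        issue_id INTEGER PRIMARY KEY, borrowernumber INTEGER, itemnumber INTEGER);
    INSERT INTO issues VALUES (1, 19, 1001);  -- patron 19 checks out item 1001
""")


def return_item(conn, issue_id, keep_history):
    """Move one checkout from issues to old_issues, anonymizing if needed."""
    row = conn.execute(
        "SELECT issue_id, borrowernumber, itemnumber FROM issues WHERE issue_id = ?",
        (issue_id,),
    ).fetchone()
    if row is None:
        raise LookupError(f"no open checkout with id {issue_id}")
    # Keep the real borrower only when reading history is being kept.
    borrower = row[1] if keep_history else ANONYMOUS_PATRON
    conn.execute("INSERT INTO old_issues VALUES (?, ?, ?)", (row[0], borrower, row[2]))
    conn.execute("DELETE FROM issues WHERE issue_id = ?", (issue_id,))


return_item(conn, 1, keep_history=False)
print(conn.execute("SELECT borrowernumber, itemnumber FROM old_issues").fetchall())
# [(54, 1001)]
```

After the return, the row still records that item 1001 circulated, but the only borrower attached to it is the anonymous patron.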
Transaction data is also recorded in the statistics table; however, the data in the
statistics table is not well anonymized and is also impacted by deletion
of items or patrons.
Setting up Batch Anonymization
A cron job, batch_anonymise.pl, can be enabled to run in the background to remove
borrower numbers from circulation history.
When borrower history is removed, Koha replaces patron data with an anonymous patron. If your library does not have a patron defined as
the ‘anonymous patron’, you can create one. Add the borrower number of the anonymous patron to the system preference AnonymousPatron. If this
system preference is not filled out, the cron job will not run.
Next, your library will need to determine which patron categories will have
their patron history anonymized. In Administration > Patron Categories, a library can choose among three privacy settings.
Default: Patrons in this category will be affected by this cron job. The library
determines how long history is kept and how often the cron job (batch anonymize) runs.
Forever: If the library wants to retain patron history for a category, set it
to Forever and the batch anonymize cron job will ignore these patrons.
Never: Setting a patron category to Never tells Koha to anonymize checkouts on return.
For the patron categories that are set to Default, the batch anonymize cron job
can be set up to run daily and will pick up patron history that is
older than X number of days. The library can choose the number of days
the patron history is kept prior to being removed from the system.
Cronjob
This is an example of the
cron line that would remove patron history (for patron categories set to
Default) after 5 days:
cronjobs/batch_anonymise.pl --days 5
If the library's Koha server is hosted by ByWater Solutions, please submit a ticket to request this cron.
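To make the effect of the --days cutoff concrete, here is a rough model of what a batch anonymization pass does. It is a sketch under stated assumptions, not the logic of batch_anonymise.pl itself: the `batch_anonymise` function, the text-valued `privacy` column, the `returndate` column stored as ISO text, and the sample rows are all invented for illustration, with SQLite standing in for Koha's database.

```python
import sqlite3
from datetime import date, timedelta

ANONYMOUS_PATRON = 54  # stand-in for the AnonymousPatron system preference


def batch_anonymise(conn, days, today):
    """Anonymize returned checkouts older than `days` for 'default' patrons."""
    cutoff = (today - timedelta(days=days)).isoformat()
    cur = conn.execute(
        """
        UPDATE old_issues
           SET borrowernumber = ?
         WHERE returndate < ?
           AND borrowernumber IN (
               SELECT borrowernumber FROM borrowers WHERE privacy = 'default')
        """,
        (ANONYMOUS_PATRON, cutoff),
    )
    return cur.rowcount  # number of checkouts anonymized


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE borrowers (borrowernumber INTEGER PRIMARY KEY, privacy TEXT);
    CREATE TABLE old_issues (
        issue_id INTEGER PRIMARY KEY, borrowernumber INTEGER, returndate TEXT);
    INSERT INTO borrowers VALUES (19, 'default'), (20, 'forever');
    INSERT INTO old_issues VALUES
        (1, 19, '2024-01-01'),  -- old, default privacy: anonymized
        (2, 19, '2024-01-09'),  -- returned yesterday: kept for now
        (3, 20, '2024-01-01');  -- old, but this patron keeps history forever
""")

print(batch_anonymise(conn, days=5, today=date(2024, 1, 10)))  # 1
print(conn.execute(
    "SELECT issue_id, borrowernumber FROM old_issues ORDER BY issue_id").fetchall())
# [(1, 54), (2, 19), (3, 20)]
```

Only the checkout that is both older than the cutoff and owned by a patron with default privacy is reassigned to the anonymous patron. Running the pass daily means each checkout is anonymized shortly after it crosses the cutoff.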
Koha has a feature called pseudonymization that lets libraries choose to store a lot
more transaction data and to do so in a way that cannot be
connected back to a specific patron. Pseudonymization is controlled with system preferences. See the Koha Community Manual. Turning it on will not change anything about the data processes in the issues, old_issues, and statistics tables. However, for each transaction Koha will also record
data in the pseudonymized_transactions table. This is a more effective replacement for Koha's statistics table. If using pseudonymization, run Koha's database cleanup cron job to regularly delete statistics data.
Here is a checkout and return for the same example patron used
above:
Instead of the actual borrowernumber, there is a hashed_borrowernumber. That's a bunch of gibberish that cannot be connected back to an actual patron. The hashed value is unique and
consistent, allowing a count of distinct borrowers, but not who they were.
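The "unique and consistent" property is what makes distinct-borrower counts possible. The sketch below illustrates the idea with a keyed hash from Python's standard library. It is not Koha's hashing scheme, which this article does not describe: HMAC-SHA-256, the `pseudonymize` function, the secret key, and the sample borrower numbers are all stand-ins chosen for the example.

```python
import hashlib
import hmac

# Hypothetical secret; in any real setup it would live outside the database.
SECRET_KEY = b"example-secret-kept-outside-the-database"


def pseudonymize(borrowernumber: int) -> str:
    """Return a keyed hash of a borrowernumber (stand-in for Koha's hashing)."""
    message = str(borrowernumber).encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


# Five checkouts made by three different patrons.
checkouts = [19, 19, 23, 19, 42]
hashed = [pseudonymize(b) for b in checkouts]

print(len(hashed))       # 5 transactions
print(len(set(hashed)))  # 3 distinct borrowers
```

The same borrower always produces the same hash, so repeat visits can be counted, while the stored value itself does not reveal the borrower number.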
Libraries can also record extended patron attribute data in this
manner. See the Koha Community Manual for instructions. If the attribute is set for
retention and the patron has a value for it, that value will get
recorded in pseudonymized_borrower_attributes at the point of
transaction. Note that this table joins back to
pseudonymized_transactions on transaction_id.
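A report that uses a retained attribute joins the two tables on that key. The sketch below shows the shape of such a query against a toy SQLite database. Only the table names, hashed_borrowernumber, and transaction_id come from this article. The other columns (`id`, `transaction_type`, `code`, `attribute`), the SCHOOL attribute, and the sample rows are assumptions for illustration, so check your own Koha schema before adapting it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pseudonymized_transactions (
        id INTEGER PRIMARY KEY, hashed_borrowernumber TEXT, transaction_type TEXT);
    CREATE TABLE pseudonymized_borrower_attributes (
        transaction_id INTEGER, code TEXT, attribute TEXT);
    INSERT INTO pseudonymized_transactions VALUES
        (1, 'a1f3', 'issue'), (2, 'a1f3', 'return'), (3, '9c0e', 'issue');
    INSERT INTO pseudonymized_borrower_attributes VALUES
        (1, 'SCHOOL', 'North'), (2, 'SCHOOL', 'North'), (3, 'SCHOOL', 'South');
""")

# Count checkouts per attribute value without touching any patron record.
rows = conn.execute("""
    SELECT a.attribute, COUNT(*) AS checkouts
      FROM pseudonymized_transactions t
      JOIN pseudonymized_borrower_attributes a ON a.transaction_id = t.id
     WHERE t.transaction_type = 'issue' AND a.code = 'SCHOOL'
     GROUP BY a.attribute
     ORDER BY a.attribute
""").fetchall()
print(rows)  # [('North', 1), ('South', 1)]
```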
All pseudonymized fields record the data that existed at point of checkout. If the item later moves to a
new collection code or the patron later changes categories, the values
in these pseudonymized tables will not change. Further, these tables are
not impacted by deletions of items or patrons.
If you turn on pseudonymization and set
up some new reports to use this new data, you will have a more robust
and stable collection of statistical data than you could achieve without
this feature.
System Preferences
Click on any system preference to see full details in the Koha manual.
StoreLastBorrower - When set to store, staff will be able to view the last patron to return an item in the history section of an item record, even if a patron has anonymized their history. This setting is independent of opacreadinghistory and AnonymousPatron.
Delete Old Data for Better Privacy
Data in the old_issues and statistics tables can be deleted regularly with the cron job cleanup_database.pl. If your library has support and hosting from ByWater Solutions, ask to have this set up.
Storing pseudonymized data allows libraries to let go
of old patron-identifying data that they may have been holding onto for
reporting purposes.
Go ahead and clean up that database!
FAQ
What's the Difference Between Last Returned By and Last Borrower?
On the history portion of item details pages, 'Last Returned By' is saved by the system preference StoreLastBorrower. 'Last Borrower' and corresponding 'Previous Borrower' come from circulation history in old_issues.
The system preference StoreLastBorrower exists to help circulation staff find out who last had an item when damage or missing parts are discovered after return, especially when anonymization is in play. This preference records the borrowernumber of the last account that had the item checked out in a separate table, items_last_borrower, and displays the card number.
Many libraries, mindful of privacy, anonymize transactions to some degree, whether by letting patrons set their own privacy settings, anonymizing all checkouts on return by default, or keeping checkout history for only a few days before anonymizing. Anonymization replaces the patron's borrowernumber stored in the old_issues table with that of the account set as the anonymous borrower by the AnonymousPatron system preference. (In our partner sites, this is usually one of the first few system accounts created as part of the default suite of accounts.)
In the following screenshot, you can see that 'Last Returned By' shows a card number, while 'Last Borrower' displays as anonymous, due to account privacy settings.
Without StoreLastBorrower on, a librarian trying to reach the previous borrower of that item would not be able to find out who had the item because the privacy on that account was set to anonymize on return.
Currently, StoreLastBorrower does not have a process by which it is eventually anonymized, though an enhancement is working its way through the Koha community to do so.
Can Libraries Collect Year of Birth?
Date of birth is one field libraries can collect during patron registration,
often to confirm patron identity, or in some cases, maintain a separate
patron category based on age, as public libraries will often do. This
field is not a default requirement in Koha, unless a library sets it to
be in the BorrowerMandatoryField system preference.
Sometimes, libraries want to be able to collect enough information to distinguish
patrons but not collect the full MM-DD-YYYY birthday. Unfortunately, the
date of birth field requires all three pieces: month, day, and year,
which Koha saves in the database in ISO format of YYYY-MM-DD. Anything
deviating from that will be an invalid date and can cause system errors,
so Koha has guardrails against being able to enter and save invalid
dates.
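To see why a year on its own cannot go into a date field, the check below uses Python's standard date parser as a rough analogy. It is not Koha's validation code, and the `is_full_iso_date` helper and the sample values are made up for the example.

```python
from datetime import date


def is_full_iso_date(text: str) -> bool:
    """Return True only for a complete, real calendar date in ISO format."""
    try:
        date.fromisoformat(text)
    except ValueError:
        return False
    return True


print(is_full_iso_date("1985-07-14"))  # True: year, month, and day present
print(is_full_iso_date("1985"))        # False: year only
print(is_full_iso_date("1985-00-00"))  # False: month and day are not real

# A full date can always be reduced to a year for statistics.
print(date.fromisoformat("1985-07-14").year)  # 1985
```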
A library can opt not to collect date of
birth, or simply not make it required if it largely wants to collect
it but wants patrons to be able to opt out of providing that information.
As another option, a library that otherwise requires that field can adopt
a standardized internal convention for those patrons, such as
saving 01-01-1901 as the date of birth for very privacy-minded patrons,
and train its front-line staff accordingly. As a third option, if a
library would like to collect year of birth for statistical purposes but
doesn't need the rest of the date, it could set up and
use a patron attribute (or the sort1 or sort2 fields, if they are not
already in use) instead of the date of birth field. That way
the data remains available for reports later.