Importing Records into Koha

Koha can import MARC records from a variety of sources, including bibliographic records with items and authority records. To import successfully, the records need to be formatted correctly. Imported records can be matched to existing catalog records and overlaid as part of an acquisitions or cataloging workflow. Batch files from third-party vendors can be imported as part of regular catalog maintenance.

Bibliographic Records

There are a number of sources for bibliographic records. Records can be imported in a few different ways:
  1. Individually via Z39.50
  2. Batch files
  3. Pushed into the catalog via a service
Records can come in as new bibliographic records or can match and overlay existing ones. Record matching rules have to be configured in order to match and overlay records during import. These rules can be set up in Administration > Record Matching Rules. A typical record matching rule matches on the ISBN or a control number, but it is possible to match on other fields.

Notes
MARC records should be in a file with the .mrc extension.

Preparing Item Records

Item records are usually imported as part of bibliographic records. Item records can be added to the bibliographic records by a vendor through a gridding and mapping process when placing orders. Item records can also be added to bibliographic records in record editing tools like MarcEdit. It is important that the item records are formatted correctly; otherwise, the import file may be rejected or item fields will not be populated.

Required Item Fields

Item records in Koha are placed in the 952 MARC tag. Records that are prepared by vendors may place item data in a different tag, such as the 949 or 975. In that situation, Koha will map the vendor item tag to the 952 through the system preference MarcItemFieldsToOrder. Records prepared by vendors only use a different tag for item data when they are used to create baskets in Acquisitions, with or without EDI. An item record must have a minimum of three fields in order to import successfully.

Field | Field Label                          | Field Data
------|--------------------------------------|---------------
952$a | Home library                         | Branch code
952$b | Current library (or Holding library) | Branch code
952$y | Koha item type                       | Item type code
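
Before staging a file, it can help to confirm that every item carries the three required subfields. The following is a minimal sketch in Python, not part of Koha: it assumes each item has already been read into a plain dictionary keyed by 952 subfield code, and the function name and sample values are hypothetical.

```python
# Hypothetical helper: an item is modelled as a plain dict of 952 subfields.
REQUIRED_ITEM_SUBFIELDS = {
    "a": "Home library (branch code)",
    "b": "Current library (branch code)",
    "y": "Koha item type (item type code)",
}


def missing_required_subfields(item):
    """Return the required 952 subfield codes that are absent or blank."""
    return [
        code
        for code in REQUIRED_ITEM_SUBFIELDS
        if not str(item.get(code, "")).strip()
    ]


# Sample values are invented for illustration.
complete_item = {"a": "E", "b": "E", "y": "BOOK", "p": "30001000123456"}
incomplete_item = {"a": "E", "p": "30001000123457"}

print(missing_required_subfields(complete_item))    # []
print(missing_required_subfields(incomplete_item))  # ['b', 'y']
```

An empty list means the item has all three required subfields. Anything else names the subfields to fix before importing.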

Optional Item Fields

Additional item fields can be included with an import. Some fields require an authorized value code in order to populate the field in Koha. Authorized value codes can be found in Administration > Authorized Values. An import file that includes 952 item records can include many of the available fields. Records prepared with a vendor item tag such as 949 or 975 are limited to the fields that can be mapped.

Warning
For fields that use authorized values, be sure to enter the authorized value code rather than its description. For example, the 952$7 (not for loan) authorized value for "Ordered" might be -1 in your system.

Notes
The only item status that can be mapped with the MarcItemFieldsToOrder system preference is the not for loan status. The system preference does allow libraries to include acquisition data for quantity and budget code.

Field | Field Label                   | Field Data
------|-------------------------------|---------------------------
952$0 | Withdrawn status              | WITHDRAWN authorized value
952$1 | Lost status                   | LOST authorized value
952$2 | Source of classification      | Classification source code
952$3 | Materials specified           | text
952$4 | Damaged status                | DAMAGED authorized value
952$5 | Use restrictions              | text
952$7 | Not for loan                  | NOT_LOAN authorized value
952$8 | Collection code               | CCODE authorized value
952$c | Shelving location             | LOC authorized value
952$d | Date acquired                 | YYYY-MM-DD
952$e | Source of acquisition         | Vendor ID
952$f | Coded location qualifier      | three-character code
952$g | Cost, normal purchase price   | 0.00 (no dollar sign)
952$h | Serial enumeration/chronology | text
952$i | Inventory number              | Cutter, date, or term
952$o | Full call number              | text
952$p | Barcode                       | text, usually a number
952$t | Copy number                   | text
952$v | Cost, replacement price       | 0.00 (no dollar sign)
952$w | Price effective from          | YYYY-MM-DD
952$x | Non-public note               | text
952$z | Public note                   | text
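
The date and price formats in the table are easy to get wrong when item data comes from a spreadsheet or a vendor. Below is a minimal Python sketch, not a Koha feature, that checks those two formats. It assumes items are held as dictionaries keyed by 952 subfield code and treats the formats listed in the table (YYYY-MM-DD, and 0.00 with no currency symbol) as the rule. The helper names are hypothetical.

```python
import re
from datetime import datetime

# Subfields taken from the table above; the helper names are hypothetical.
DATE_SUBFIELDS = {"d", "w"}    # Date acquired, Price effective from
PRICE_SUBFIELDS = {"g", "v"}   # Normal purchase price, Replacement price
PRICE_PATTERN = re.compile(r"\d+\.\d{2}")


def is_valid_date(value):
    """True if value is a real calendar date written as YYYY-MM-DD."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    return True


def is_valid_price(value):
    """True if value looks like 0.00: digits, a point, two decimals."""
    return PRICE_PATTERN.fullmatch(value) is not None


def format_problems(item):
    """List the date and price subfields that do not follow the table."""
    problems = []
    for code, value in item.items():
        if code in DATE_SUBFIELDS and not is_valid_date(value):
            problems.append(f"952${code}: expected YYYY-MM-DD, got {value!r}")
        if code in PRICE_SUBFIELDS and not is_valid_price(value):
            problems.append(f"952${code}: expected 0.00 with no currency symbol, got {value!r}")
    return problems


for problem in format_problems({"d": "03/15/2024", "g": "$12.99"}):
    print(problem)
```

The regular expression check runs before the calendar check because date parsing alone would accept unpadded values such as 2024-3-5.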

Add item records with MARC modification templates

If an incoming file of bibliographic records does not contain item records, items can still be added by using MARC modification templates. Koha requires a minimum of three fields for item records:
  1. 952$a Home library (branch code)
  2. 952$b Current library (branch code)
  3. 952$y Koha item type (item type code)
The first action in the template is "Add new field". Each succeeding action is "Update existing or add new field". For example, a template can add an item record for a BOOK item type at the East Branch (branch code E).


This template will add one item record to a bibliographic record. This method could be useful for adding items to e-book records, which typically do not have item records.

Notes
This feature was made available in Koha 25.05 with Bug 26869.

Import Settings

Go to Cataloging > Stage records for import to access the 'Stage MARC records for import' tool. From here, choose the file of records from your desktop to import into Koha. The next page has options to tell Koha how you would like the file to be processed.

When a file is staged for import, there are some options on how Koha should handle the incoming records. The first step is to decide whether to match to existing records and then decide what to do with item records. The most typical setting is to match and overlay bibliographic records and add item records.
  1. Comments: This section will allow you to make a note about this file, which can be seen from the imported set of files.
  2. Record Type: Bibliographic or authority records.
  3. Character Encoding: UTF-8 is set by default, and is generally the correct choice. However, if you find that you are importing records with incorrect diacritics, this encoding may need to be set to MARC8.
  4. Format: MARC/MARCXML. Again, this generally will be MARC, but if your file is in XML format, this should be adjusted.
  5. MARC modification templates: This option will allow a library to apply a template to make changes to the MARC records during the import.
  6. Matching Records: This tells Koha whether to look for matches in this new import with records already in your system. If you choose to look for matches, this setting determines how Koha will treat them.
  7. Embedded item record data: For bib files, this allows you to choose whether or not to import any item data (found in field 952) in the MARC records.
Libraries can create an Import Profile for commonly-used selections.




The options for processing items include:
  1. Always add items
  2. Add items only if matching bib was found
  3. Add items only if no matching bib was found
  4. Replace items if matching bib was found (only for existing items)
The option selected depends on the cataloging workflow, but the most common option is to Always add items.

Managing Matches

The list of results will show whether records in the imported file match any existing records in your catalog. A staff member with the necessary permissions will be able to select any individual record with a match and choose the action that should be taken before the records are imported.


If you choose 'Ignore matches', Koha will follow the rules you set up for the import. For example, if you choose to 'add record if no match is found', choosing 'Ignore' will add the record to the system.

Authority Records

Authority records are imported in the same way as bibliographic records through Stage MARC records. When preparing the import profile, be sure to select Authority as the record type.


Record matching rules can be defined to match authority records on control numbers such as the 001, 010, 024, and 035.

Third-party services prepare authority batch files based on a library's existing catalog data. These records are then uploaded and overlaid a few times a year. Because these files are large, the import and processing of the files can take a bit of time. It can sometimes help to break the files up into smaller batch sizes.

Can I import records from a spreadsheet?

Yes, it is possible to import records from a spreadsheet. This requires installing and configuring a CSV2MARC plugin. The instructions for the plugin require mapping the columns in the spreadsheet to MARC tags and subfields. The record data then needs to be entered in the corresponding columns of the spreadsheet. The plugin works well, but it requires close attention to the configuration to ensure the mapping is formatted correctly. This method is recommended for importing brief MARC records. It is possible to import full MARC records, but the mapping is more difficult to troubleshoot if a typo prevents the import from succeeding.
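
To show what mapping columns to MARC tags and subfields means, here is a small Python sketch. It does not use the CSV2MARC plugin's configuration syntax, which is described in the plugin's own instructions. The column names, the mapping, and the function name are all assumptions made for this example.

```python
import csv
import io

# Hypothetical column-to-MARC mapping; the real plugin has its own syntax.
COLUMN_MAP = {
    "isbn": ("020", "a"),
    "title": ("245", "a"),
    "author": ("100", "a"),
}


def rows_to_brief_records(csv_text, column_map=COLUMN_MAP):
    """Turn each spreadsheet row into a list of (tag, subfield, value)."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = []
        for column, (tag, subfield) in column_map.items():
            value = (row.get(column) or "").strip()
            if value:  # skip empty cells instead of emitting blank subfields
                record.append((tag, subfield, value))
        records.append(record)
    return records


sample = 'isbn,title,author\n9780141439518,Pride and Prejudice,"Austen, Jane"\n'
for record in rows_to_brief_records(sample):
    print(record)
```

A misspelled column header in the spreadsheet would simply not be picked up by the mapping. This is the kind of typo that is harder to spot as the number of mapped columns grows.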

Once the plugin is installed, CSV2MARC will be an option in the Import Profile in Stage MARC records.


Can I import my records using TCP/IP in Record Manager?

Record Manager is a service provided by OCLC for downloading MARC records. OCLC provides an option to push the records into a Koha catalog using a TCP/IP protocol. Historically, this connection has not worked well, but we have received reports from libraries that it is working. The instructions for setting this connection up within OCLC Record Manager are located in the Koha Wiki.

Can I import records via EDI?

EDI is an electronic delivery method used by some vendors. Libraries order materials and download MARC records as they prepare their orders. The workflow may vary by vendor or library. The process is described in the documentation on setting up EDI. Generally speaking, records still need to be imported via Stage MARC records into the baskets. The bibliographic records themselves are not sent via EDI.

How many records can I import at one time?

The answer depends on the size of the library system and the match points. We generally recommend importing no more than 10,000 to 15,000 records at a time. Large batches require a bit of patience while the system processes them.
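
One way to stay within that range is to split a large .mrc file into smaller files before staging it. The sketch below is a minimal Python example, not a Koha or ByWater tool. It relies on the fact that each record in a binary MARC (ISO 2709) file ends with the record terminator byte 0x1D. The function names and the output file naming are hypothetical, and the default batch size of 10,000 comes from the recommendation above.

```python
RECORD_TERMINATOR = b"\x1d"  # end-of-record byte in binary MARC (ISO 2709)


def split_marc_records(data):
    """Split raw .mrc bytes into individual records, keeping terminators."""
    return [
        chunk + RECORD_TERMINATOR
        for chunk in data.split(RECORD_TERMINATOR)
        if chunk.strip()  # ignore the empty remainder after the last record
    ]


def batch_records(records, batch_size=10_000):
    """Group records into lists of at most batch_size records."""
    return [
        records[i:i + batch_size]
        for i in range(0, len(records), batch_size)
    ]


def write_batches(source_path, batch_size=10_000):
    """Write source_path as numbered part files; return the paths written."""
    with open(source_path, "rb") as handle:
        records = split_marc_records(handle.read())
    paths = []
    for number, batch in enumerate(batch_records(records, batch_size), start=1):
        path = f"{source_path}.part{number:03d}.mrc"  # hypothetical naming
        with open(path, "wb") as out:
            out.write(b"".join(batch))
        paths.append(path)
    return paths
```

Calling write_batches("vendor_file.mrc") on a 25,000-record file would produce three part files, each of which can be staged separately.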

Importing Large Numbers of Bibliographic Records

  1. Libraries can split records up and import them using Koha's import tool. This may take some time for Koha to process.
  2. Libraries can upload the files to a Workdrive, and ByWater's Data team can load them from the back end using the command line. This can be scheduled for off hours by opening a ticket with ByWater.

Notes
While imports are taking place, Elasticsearch will still be generating and updating its indexes, which can cause some temporary abnormalities to appear in the catalog.

What is the Reservoir in Cataloging Search?

Records imported using the 'Stage MARC records for import' tool remain in the reservoir until they are cleaned. Any Z39.50 searches will appear in the reservoir as well.

The 'Coming from' column will include the file name or, for Z39.50 searches, the search target. From the 'Action' button, libraries can view a MARC preview or card preview, or can import the record.



How Long Do Records Remain in the Reservoir?

Records will remain in the reservoir until one of two things happens:
  1. A librarian 'cleans' that file of records from 'Manage staged MARC records'
  2. The cleanup_database cron job purges records older than X days from the reservoir
For ByWater partners, unless specified otherwise, the default cleanup period is 180 days. If your library would like to adjust that timing, please submit a ticket and we will be happy to help.

Troubleshooting

My records match, why are they not overlaying?

The Search index field in the record matching rule needs to match the indexes for the search engine Koha is using. The Koha Manual includes search index listings for both the Zebra and Elasticsearch search engines. The index names are similar between the search engines, but there are a few differences. The indexes also vary between authority and bibliographic records. If the records are not matching and overlaying as expected, double-check the search index.

Bibliographic Indexes:
  1. Zebra
  2. Elastic (MARC21)
Authority Indexes:
  1. Elastic (MARC21)
Why did some of my item data not populate the correct field?

Item data may not populate a field as expected for a few different reasons. The most common reason is an incorrect authorized value.
  1. Using the authorized value description instead of the authorized value "code".
  2. Using an authorized value in the wrong field. For example using a collection code (CCODE) in the shelving location (LOC) field.
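
Both mistakes can be caught before import by comparing item values against a list of your authorized values. The following Python sketch is only an illustration: the codes and descriptions are invented, and the real values for your system are listed under Administration > Authorized Values. The function name is hypothetical.

```python
# Hypothetical sample of authorized values: category -> {code: description}.
# Real codes are listed under Administration > Authorized Values.
AUTHORIZED_VALUES = {
    "LOC": {"GEN": "General stacks", "REF": "Reference"},
    "CCODE": {"FIC": "Fiction", "NFIC": "Non-fiction"},
}
# Subfield-to-category pairs taken from the optional item fields table.
SUBFIELD_CATEGORY = {"c": "LOC", "8": "CCODE"}


def diagnose_value(subfield, value):
    """Explain why a 952 subfield value would or would not populate."""
    category = SUBFIELD_CATEGORY[subfield]
    if value in AUTHORIZED_VALUES[category]:
        return "ok"
    # Mistake 1: the description was entered instead of the code.
    for code, description in AUTHORIZED_VALUES[category].items():
        if value.lower() == description.lower():
            return f"description used; enter the code {code!r} instead"
    # Mistake 2: a valid code from a different category was entered.
    for other, values in AUTHORIZED_VALUES.items():
        if other != category and value in values:
            return f"{value!r} is a {other} code, not a {category} code"
    return "unknown value"


print(diagnose_value("c", "REF"))        # ok
print(diagnose_value("c", "Reference"))  # description used; enter the code 'REF' instead
print(diagnose_value("c", "FIC"))        # 'FIC' is a CCODE code, not a LOC code
```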

Why am I getting a 500 loading error when trying to view the bibliographic detail view of a record?

This can happen if the incoming item record does not contain a required field. Koha requires a minimum of three fields for item records:
  1. 952$a Home library (branch code)
  2. 952$b Current library (branch code)
  3. 952$y Koha item type (item type code)
If one of these fields is missing, the item holdings table of a bibliographic record will fail to load. If this happens, the import can be reversed. Once the records have been corrected, the import file should come in without errors.

Why are my authority records not matching as expected?

We have seen some batch authority records not match because of extra spaces in the matching control fields. Occasionally, a third-party vendor will prepare the records and either add or remove spaces. These extra spaces cause the records to not match when doing an exact match. To correct this, either the existing records in Koha need to be edited to add or remove spaces or the records of the import file need to be corrected. In either case, the fields need to match, including the spaces.
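
To find out whether spacing is the only difference, compare the control numbers twice: once exactly and once with all whitespace removed. This is a minimal Python sketch with invented control numbers and hypothetical function names. It only reports the differences and does not change how Koha matches records.

```python
def exact_match(incoming, existing):
    """True only if the two control numbers are identical, spaces included."""
    return incoming == existing


def whitespace_insensitive_match(incoming, existing):
    """Compare two control numbers after removing all whitespace."""
    return "".join(incoming.split()) == "".join(existing.split())


def mismatched_only_by_spaces(pairs):
    """Return the pairs that fail an exact match purely because of spacing."""
    return [
        (incoming, existing)
        for incoming, existing in pairs
        if not exact_match(incoming, existing)
        and whitespace_insensitive_match(incoming, existing)
    ]


# (incoming control number, control number already in the catalog)
pairs = [
    ("n  79021164", "n 79021164"),   # spacing differs
    ("n 79021164", "n 79021164"),    # identical
    ("n 79021164", "n 79021165"),    # genuinely different
]
print(mismatched_only_by_spaces(pairs))  # [('n  79021164', 'n 79021164')]
```

Pairs returned by this check are the ones where correcting the spacing on one side should allow the exact match to succeed.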

My import file appears to be stuck. What do I do?

Import files that do not seem to complete their processing when staging or importing could be stuck in a background process queue. This requires investigation by ByWater's systems team. Very often, clearing the stuck job will allow the import to finally process. Large import files can take some time to process. Contact support if the files do not process after a reasonable time frame.

Why did my import file fail?

The most common reason an import file fails is bad data, such as:
  1. Missing a required field in the item record data, such as the 952$a, 952$b, or the 952$y
  2. A bad authorized value
  3. Hidden characters or spaces

Why do my records have weird symbols?

The � symbol sometimes appears in records. This is caused by an encoding error affecting special characters or punctuation, also known as diacritics. The encoding of the incoming records needs to match the encoding setting in the import profile. Switching the encoding from UTF-8 to MARC8 (or vice versa) can sometimes help.
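
The short Python sketch below shows how the symbol arises when bytes are read with the wrong encoding. Python's standard library has no MARC-8 codec, so Latin-1 is used here purely as a stand-in for "an encoding other than UTF-8". The mechanism is the same, but the specific bytes differ from real MARC-8 data.

```python
# Latin-1 is only a stand-in here: Python's standard library has no MARC-8
# codec, but the failure mode (bytes read with the wrong encoding) is the same.
name = "Gabriel García Márquez"

latin1_bytes = name.encode("latin-1")

# Reading those bytes as UTF-8 replaces each undecodable byte with U+FFFD.
wrong = latin1_bytes.decode("utf-8", errors="replace")
# Reading them with the encoding they were written in restores the text.
right = latin1_bytes.decode("latin-1")

print(wrong)   # Gabriel Garc�a M�rquez
print(right)   # Gabriel García Márquez
```

Once the replacement character has been saved into a record, the original letter is lost, so the encoding setting should be corrected and the file staged again rather than repaired after import.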

Additional documentation can be found in our article on How to Prevent Diacritics from Misbehaving in Koha.