How to Use and Read ONIX Book Files: Getting the most out of your metadata

Posted: December 6, 2017  |  Updated: February 8, 2023

A no-nonsense handbook to navigating and getting the most out of your metadata distribution system


What is ONIX for Books?

First things first: what exactly is ONIX? Is it the British spelling of a black stone? Is it a Pokémon? No, and yes, but not here. Developed by a company called EDItEUR in 2001, ONIX for Books is the global standard format for creating, transmitting, and communicating book product and bibliographic information electronically.

“ONIX is an XML-based standard for rich book metadata, providing a consistent way for publishers, retailers and their supply chain partners to communicate rich information about their products.”

If you’ve ever seen HTML code or a movie with hackers, you’ve seen what an ONIX file looks like. The kind folks at EDItEUR created a standard of fields using codes so that no matter what language you speak, you can accurately communicate the information about your book that retailers need to sell it. It’s like digital Esperanto.

The files, which ONIX calls “messages,” are sent from publishers to distributors and retailers through a variety of systems. Since the format is a guideline and not a product, delivery can be as simple as an email attachment or as sophisticated as third-party tools and File Transfer Protocol (FTP) providers.

The individual files can be viewed either in an internet browser (we recommend Chrome) or in a simple text editor (Notepad on Windows, TextEdit on Mac). Since ONIX messages are built to speak between computers, they can be difficult for humans to read. Third-party software that breaks down each code into easy question-and-answer fields is available. Some of the most popular options include ONIXEdit, Book Connect, OnixSuite, Title Manager, BookSonix and BiblioLive.

How does ONIX work?

Once completed, the ONIX messages are transmitted to the specific retailers, who use the information to populate fields for the product display as well as within their own cataloging and search functions.

Why does complete ONIX metadata matter?

In one word: sales. Books with complete metadata sell more copies, across both digital and nondigital platforms.

In 2016, the Nielsen Company, Bowker, and Baker and Taylor published the findings of their US survey of book sales and metadata, entitled “Nielsen Book US Study: The importance of Metadata for Discoverability and Sales.” It reinforced the findings of the 2012 Nielsen Book UK study, which found a strong link between books with complete and relevant metadata and increased sales, including for offline retailers.

Their results support the well-founded idea that discoverability (“the ease with which a particular product can be found”) hinges on complete metadata. They also make the important distinction that it’s not just about direct consumer discoverability, but about gatekeepers in the book industry supply chain too, most notably librarians and booksellers:

“Providing accurate data on properties such as publication date, price, supplier and physical attributes aids booksellers in planning their stock management, from scheduling future orders, to planning shelf space or storage allocations, to ensuring shipments are made on the most economical terms (through referencing physical attribute data).

“Maintaining an efficient supply chain ensures that booksellers can focus on selling books – and maximizing sales for publishers and themselves. Where this valuable supply chain data isn’t available to the bookseller, at best they will need to carry out additional work (leading to decreased efficiency) and at worst they may not order the product due to an inability to plan for it effectively.”

While that all may sound daunting, the 2012 study examined just ten attributes out of the 3,000+ ONIX code entries available for completion. In the 2016 study, Nielsen examined the metadata of the top 100,000 bestselling titles from July 2015 to July 2016. They began with the fields that they identified as the basic level of completeness (ISBN, Title, Format/Binding, Publication Date, BISAC Subject Code, Retail Price, Sales Rights, Cover Image, and Contributor) and found that titles with all of them saw average sales that were 75% higher than titles without.

They also looked at two critical groupings of data that revealed more positive sales correlations:

Books with complete descriptive data (title description, author biography and review) saw 72% higher sales than those without
 Books with keywords saw 34% higher sales than those without.

Quick guide: Reading the XML

Get into the nitty-gritty of the ONIX messages!

I. Navigating ONIX’s Most Important Fields

With the exception of reviews and high-quality keywords, all of this vital information is most likely at hand. You have it, you just need to implement it.

If you have access to a third-party system this may be easy enough, but since each is built to speak the ONIX language, it is important to know:

What the fields are
 Where they live in the message
 Details you can further include to strengthen your book sales.

The creators of the ONIX system maintain an exhaustive code list that they update routinely. Most publishers still use the ONIX 2.0 series as their system, but the data collective has unleashed the power of ONIX 3.0: a much more thorough listing that takes digital formats into account.

An ONIX message’s details are divided among 230 sections, including sections as specific as “price constraints,” “Chinese school grade code,” and “supply date info,” but the most vital are at the beginning.

While the messages can look confusing, they do follow a solid logic, building on information as it is presented. Just like written language, there are phrases that are opened and closed to derive meaning and relationships from each statement.

We’ll pull apart each section and explain it and then bring it back together below.

Each detail is listed on its own line, indented to show its place in the order. It opens with an information tag in angle brackets and closes with the same tag with a “/” in front, like this. (Note: an ellipsis, as seen below between the Product Identifier tags, means that you can expand the section to see the information it contains.)

<Product>
  <ProductIdentifier>...</ProductIdentifier> *
  <Title>
    <TitleType>#</TitleType>
    <TitlePrefix>The</TitlePrefix>
    <TitleWithoutPrefix>TEXT</TitleWithoutPrefix>
  </Title>
  <Contributor>...</Contributor>
</Product>
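
To see the open-and-close logic in action, it can help to walk a small record with a standard XML parser. The following is a minimal sketch in Python using only the standard library. The sample record and the `outline` helper are made up for illustration, and the record is only a fragment, not a complete ONIX message.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample record for illustration; not a complete ONIX message.
SAMPLE = """
<Product>
  <ProductIdentifier>
    <ProductIDType>15</ProductIDType>
    <IDValue>9781000000000</IDValue>
  </ProductIdentifier>
  <Title>
    <TitleType>01</TitleType>
    <TitlePrefix>The</TitlePrefix>
    <TitleWithoutPrefix>Book Title Example</TitleWithoutPrefix>
  </Title>
</Product>
"""


def outline(element, depth=0):
    """Return one indented line per tag, showing how the tags nest."""
    text = (element.text or "").strip()
    line = "  " * depth + element.tag + (": " + text if text else "")
    lines = [line]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


print("\n".join(outline(ET.fromstring(SAMPLE))))
```

Each level of indentation in the printed outline corresponds to one tag that has been opened but not yet closed, which is the same relationship the tabs in a message are meant to show.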

II. Formatting

In ONIX messages you can use standard HTML syntax to begin and end statements and to denote formatting differences. Here are the basics:

<d104 textformat="02"> - add this after the major section heading to note that the markup that follows obeys HTML rules. End it by closing the statement with </d104>.
<Heading> Field Entry </Heading>
 
 *Less-than and greater-than brackets surround each organizational heading and mark where it begins
 *Data pertaining to the heading goes between the opening and closing <Heading> tags
 *A bracket followed by a forward slash denotes the end of a phrase.
 
Within the Field Entry, you can note formatting for e-commerce sites (otherwise it will populate as standard text and in a large block if there is a lot of content):
 
<b> Bold text </b>
 <strong> Important text </strong>
 <i> Italic text </i> 
 <em> Emphasized text </em> 
 <sub> Subscript text </sub> 
 <sup> Superscript text</sup> 
 <u> Underline </u>
 <p> - paragraph break
 <br> - line break
 <li> - list item

III. Descriptive Data Identifiers

1. Keywords

Entries to this part of the message are used specifically for indexing and search purposes. While they are not normally intended for display, best practice is to integrate those that make sense and seem to perform well into your other descriptive fields.

Where?

Under the <subject> heading.

How?

Code Number: 20

<subject>
  <b067>20</b067>
  <b070>keywords; separated by; semi-colons; can be long tail; or short; tail; terms; BUT; avoid; title, subject, and series; terms; that; are; duplicative; because most retailers; only allow; a certain number of characters;</b070>*
</subject>
*StoryFit Metadata Keywords are optimized to meet retailer requirements.
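
Before sending keywords out, it can be useful to see them as a clean list. Here is a rough Python sketch, assuming the short-tag layout shown above (`<b067>` for the scheme code, `<b070>` for the keyword text). The sample keywords and the `extract_keywords` helper are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample using the short tags shown above.
SAMPLE = """
<subject>
  <b067>20</b067>
  <b070>cozy mystery; small town; amateur sleuth; ; Small Town</b070>
</subject>
"""


def extract_keywords(subject):
    """Split the keyword text on semicolons, dropping blanks and repeats."""
    if subject.findtext("b067") != "20":  # 20 = keywords
        return []
    seen, keywords = set(), []
    for raw in (subject.findtext("b070") or "").split(";"):
        keyword = " ".join(raw.split())  # collapse stray whitespace
        if keyword and keyword.lower() not in seen:
            seen.add(keyword.lower())
            keywords.append(keyword)
    return keywords


print(extract_keywords(ET.fromstring(SAMPLE)))
```

Dropping repeats matters because, as noted in the example, most retailers only allow a certain number of characters.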

2. BISAC Subject Code

Found nested under <Product>, the <subject> scheme identifiers include 112 options for subject descriptions: everything from Dewey Decimal and Library of Congress organization categories to various European country standards, location by postal code, and Key Character Names. The scheme code goes in the <b067> field. The code for the BISAC category is “10.”

<Product>
  <subject>
    <b067>10</b067>
    <b069>BISAC Subject Code (FIC031000, for example)</b069>
  </subject>
</Product>

For a full list of BISAC categories, visit the Book Industry Study Group’s listing of complete BISAC headings for fiction and nonfiction subjects. Note: ONIX also supplies the <b067>22</b067> field for BISAC Merchandising Theme, which follows the same syntax as the subject code, with the <b069> field following the <b067>22</b067> entry.

3. ISBN

Where?

Under <ProductIdentifier>

What?

Noted as <ProductIDType> code number </ProductIDType>:

ISBN-13

<ProductIDType>15</ProductIDType>

ISBN-10

<ProductIDType>02</ProductIDType>

ISBN-A

<ProductIDType>26</ProductIDType>

How?

Following the Product Identifier code as <IDValue> 9781000000000 </IDValue>

So the full ISBN entry would look like this:

<Product>
  <ProductIdentifier>
    <ProductIDType>15</ProductIDType>
    <IDValue>9781000000000</IDValue>
  </ProductIdentifier>
</Product>
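
A quick sanity check on the ISBN can catch typos before a retailer does. The sketch below, in Python, pulls the ISBN-13 out of a record laid out like the one above and verifies its check digit using the standard ISBN-13 rule (weights alternating 1 and 3). The helper names and the sample record are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample record with both an ISBN-10 and an ISBN-13.
SAMPLE = """
<Product>
  <ProductIdentifier>
    <ProductIDType>02</ProductIDType>
    <IDValue>0306406152</IDValue>
  </ProductIdentifier>
  <ProductIdentifier>
    <ProductIDType>15</ProductIDType>
    <IDValue>9780306406157</IDValue>
  </ProductIdentifier>
</Product>
"""


def isbn13_is_valid(value):
    """Check the length, the digits, and the ISBN-13 check digit."""
    if len(value) != 13 or not value.isdigit():
        return False
    # Weights alternate 1, 3 across the first twelve digits.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(value[:12]))
    return (10 - total % 10) % 10 == int(value[12])


def find_isbn13(product):
    """Return the IDValue of the first identifier typed 15, or None."""
    for identifier in product.findall("ProductIdentifier"):
        if identifier.findtext("ProductIDType") == "15":
            return (identifier.findtext("IDValue") or "").strip()
    return None


isbn = find_isbn13(ET.fromstring(SAMPLE))
print(isbn, isbn13_is_valid(isbn))
```

Note that the placeholder 9781000000000 used in this guide is only a stand-in. It would fail this check, so swap in your real ISBN.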

4. Title

Where?

Under <Title>

What?

Lots of options for the Title:

<TitleType> code number </TitleType>

The most used code number is “01,” which signals a distinctive title in a book and the cover title for a serial. Other options include:

00 - undefined
 02 - ISSN key title of serial
 03 - Title in original language
 04 - Title acronym or initialism
 05 - Abbreviated Title
 06 - Title in other language
 07 - Thematic title of journal issue
 08 - Former title
 10 - Distributor’s Title
 11 - Alternative title on cover
 12 - Alternative title on back
 13 - Expanded title
 14 - Alternative title
<TitleText>Full Title</TitleText>
 <TitlePrefix>A, An, The, etc.</TitlePrefix>
 <TitleWithoutPrefix>Title, but without the prefix</TitleWithoutPrefix>
How?

The full title entry looks like this:

<Title>
  <TitleType>01</TitleType>
  <TitlePrefix>The</TitlePrefix>
  <TitleWithoutPrefix>Book Title Example</TitleWithoutPrefix>
 </Title>
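
Because the title can arrive either whole or split into a prefix and a remainder, anything that displays it has to put the pieces back together. This is one possible way to do that in Python. The `display_title` helper is an illustration only, and preferring <TitleText> when it is present is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample matching the title entry shown above.
SAMPLE = """
<Title>
  <TitleType>01</TitleType>
  <TitlePrefix>The</TitlePrefix>
  <TitleWithoutPrefix>Book Title Example</TitleWithoutPrefix>
</Title>
"""


def display_title(title):
    """Use TitleText if given, else join the prefix and the remainder."""
    full = (title.findtext("TitleText") or "").strip()
    if full:
        return full
    parts = [
        (title.findtext("TitlePrefix") or "").strip(),
        (title.findtext("TitleWithoutPrefix") or "").strip(),
    ]
    return " ".join(part for part in parts if part)


print(display_title(ET.fromstring(SAMPLE)))
```

Keeping the prefix separate is what lets a retailer sort “The Book Title Example” under B rather than T.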
 

5. Format/Binding

Where?

Directly under <Product>

What?

Noted as <ProductForm>code</ProductForm>:

There are 135 format options with details from binding and paper type to operating system and file type. The most used codes are:

BA - Book
 BB - Hardback
 BC - Paperback / softback
 BH - Board Book
 AA - Audio
 AJ - Downloadable Audio file
 EA - Digital (delivery method unspecified)
 EB - Digital Download
 

Can I add specific format details about the Product?

There are 256 options. The most commonly used codes are:

B101 - Mass market (rack) paperback
 B102 - Trade paperback (US)
 B103 - Digest format paperback
 B104 - A-format paperback
 B105 - B-format paperback
 B106 - Trade paperback (UK)
 B107 - Tall rack paperback (US)
 B315 - Trade binding
 A103 - MP3 format
 A104 - WAV format
 B401 - Cloth over boards
 B221 - Picture book
 E101 - EPUB
 E116 - Amazon Kindle
 E121 - eReader
 E126 - Microsoft Reader
 E133 - Google Edition
 E134 - Book ‘app’ for iOS
 E135 - Book ‘app’ for Android
 E136 - Book ‘app’ for other operating system
 E141 - iBook
 B501 - With dust jacket
 B502 - With printed dust jacket

How?

The full format entry looks like this:

<Product>
  <ProductForm>BB</ProductForm>
  <ProductFormDetail>B501</ProductFormDetail>
</Product>
 

6. Publication Date

The date itself is straightforward and found only under Product as:

<Product>
  <PublicationDate>YearMonthDay</PublicationDate>
 *Note: make sure to denote in the ONIX message what your standard date format is. This is found under <DateFormat> (CodeList Number 55)
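
If your dates follow the year-month-day pattern shown above with no separators (for example, 20171206), a few lines of Python can confirm that a value is a real calendar date. This sketch assumes that eight-digit format. The `parse_publication_date` helper is hypothetical, and other date formats would need their own patterns.

```python
from datetime import date, datetime


def parse_publication_date(value):
    """Parse an eight-digit YearMonthDay string; raise ValueError if invalid."""
    return datetime.strptime(value.strip(), "%Y%m%d").date()


published = parse_publication_date("20171206")
print(published.isoformat())
# A forthcoming title is simply one whose date is still in the future.
print(published > date.today())
```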
How do I add specific details about the Publication Date?

There are plenty of juicy details to add to an ONIX message about a pending publication surrounding its release.

Publishing Status

<Product>
  <PublishingStatus>code</PublishingStatus> 

The options are:

00 - Unspecified
 01  - Cancelled
 02  - Forthcoming
 03  - Postponed indefinitely
 04  - Active
 05  - No longer our product
 06  - Out of stock indefinitely
 07  - Out of print
 08  - Inactive
 09  - Unknown
 10  - Remaindered
 11  - Withdrawn from sale
 12  - Not available in this market
 13  - Active, but not sold separately
 14  - Active, with market restrictions
 15  - Recalled
 16  - Temporarily withdrawn from sale
 17  - Permanently withdrawn from sale
Availability
<Product>
  <SupplyDetail>
  <ProductAvailability>Code</ProductAvailability> 

The options are:

01 - Cancelled
 09 - Not yet available, postponed indefinitely
 10 - Not yet available
 11 - Awaiting stock
 12 - Not yet available, will be POD
 20 - Available
 21 - In stock
 22 - To order
 23 - POD
 30 - Temporarily unavailable
 31 - Out of stock
 32 - Reprinting
 33 - Awaiting reissue
 34 - Temporarily withdrawn from sale
 40 - Not available (reason unspecified)
 41 - Not available, replaced by new product
 42 - Not available, other format available
 43 - No longer supplied by us
 44 - Apply direct
 45 - Not sold separately
 46 - Withdrawn from sale
 47 - Remaindered
 48 - Not available, replaced by POD
 49 - Recalled
 50 - Not sold as set
 51 - Not available, publisher indicates OP
 52 - Not available, publisher no longer sells product in this market
 97 - No recent update received
 98 - No longer receiving updates
 99 - Contact supplier

7. Retail Price

With the price it is important to note several things:

a. Currency

It’s important to note the currency for each ISBN. You can do so in the header of your message as

<DefaultCurrencyCode>CODE</DefaultCurrencyCode>
or under the <price> field as
<CurrencyCode>CODE</CurrencyCode>.
b. Price type

There are 26 possibilities here. These are the most common:

03 - Fixed retail price excluding tax
 04 - Fixed retail price including tax
 05 - Supplier’s net price excluding tax
 07 - Supplier’s net price including tax
 41 - Publishers retail price excluding tax
 42 - Publishers retail price including tax
c. Price type qualifiers

There are 16 possibilities here. These are the most common:

01 - Member/subscriber price
 02 - Export Price
 03 - Reduced price applicable when the item is purchased as part of a set (or series, or collection)
 05 - Consumer Price
 06 - Corporate / Library / Education price
 07 - Reservation order price
 08 - Promotional offer price
 10 - Library Price
 11 - Education Price
 12 - Corporate price
 13 - Subscription service price
 14 - School library price
 15 - Academic library price
 16 - Public library price
d. Sales Rights

Nested under <Product>. The options are:

00 - Sales rights unknown or unstated for any reason
 01 - For sale with exclusive rights in the specified countries or territories
 02 - For sale with non-exclusive rights in the specified countries or territories
 03 - Not for sale in the specified countries or territories (reason unspecified)
 04 - Not for sale in the specified countries (but publisher holds exclusive rights in those countries or territories)
 05 - Not for sale in the specified countries (publisher holds non-exclusive rights in those countries or territories)
 06 - Not for sale in the specified countries (because publisher does not hold rights in those countries or territories)
 07 - For sale with exclusive rights in the specified countries or territories (sales restriction applies)
 08 - For sale with non-exclusive rights in the specified countries or territories (sales restriction applies)
e. Territory:
<RightsTerritory>TERRITORIES</RightsTerritory>
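
To make the currency rule from part (a) concrete, here is a small Python sketch that lists each price with its currency, falling back to the header default when a price carries no currency of its own. The message layout, the tag names inside <Price> (PriceTypeCode, PriceAmount), the amounts, and the `list_prices` helper are all assumptions made for this example, so check them against your own files.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical message fragment; tag layout is an assumption for illustration.
SAMPLE = """
<ONIXMessage>
  <Header>
    <DefaultCurrencyCode>USD</DefaultCurrencyCode>
  </Header>
  <Product>
    <SupplyDetail>
      <Price>
        <PriceTypeCode>41</PriceTypeCode>
        <PriceAmount>16.99</PriceAmount>
      </Price>
      <Price>
        <PriceTypeCode>41</PriceTypeCode>
        <PriceAmount>12.99</PriceAmount>
        <CurrencyCode>GBP</CurrencyCode>
      </Price>
    </SupplyDetail>
  </Product>
</ONIXMessage>
"""


def list_prices(message):
    """Return (price type, amount, currency) for every price in the message."""
    default_currency = message.findtext("Header/DefaultCurrencyCode")
    prices = []
    for price in message.iter("Price"):
        # A currency on the price itself wins over the header default.
        currency = price.findtext("CurrencyCode") or default_currency
        amount = Decimal(price.findtext("PriceAmount"))
        prices.append((price.findtext("PriceTypeCode"), amount, currency))
    return prices


for price_type, amount, currency in list_prices(ET.fromstring(SAMPLE)):
    print(price_type, amount, currency)
```

Amounts are read as `Decimal` rather than floating-point numbers so that a price like 16.99 stays exactly 16.99.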
 

8. Contributor

Nested under <Product>. Using the following codes you can note everything from the contributor to co-author, editors, pseudonyms, and more (there are a total of 108 designated authorial tags).

<Product>
  <contributor>
  <b035>#</b035>
  <b039>First Name</b039>
  <b040>Last Name</b040>
 </contributor> 
Between <b035> # </b035> will be: 

A01 - By (author)
 A02 - With
 A08 - By (photographer)
 A09 - Created by
 A12 - Illustrated by
 A13 - Photographs by
 A14 - Text by
 A15 - Preface by
 A16 - Prologue by
 A19 - Afterword by
 A22 - Epilogue by
 A23 - Foreword by
 A24 - Introduction by
 A26 - Memoir by
 A29 - Introduction and notes by
 A32 - Contributions by
 A36 - Cover design or artwork by
 A38 - Original author
 A39 - Maps by
 A43 - Interviewer
 B01 - Edited by
 B02 - Revised by
 B03 - Retold by
 B04 - Abridged by
 B05 - Adapted by
 B06 - Translated by
 B07 - As told by
 B10 - Edited and translated by
 C01 - Compiled by
 E07 - Read by
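
Role codes are easier to proofread once they are turned back into words. The following Python sketch reads the short-tag contributor entries shown above and prints a credit line for each. The names are invented, the `contributor_credits` helper is hypothetical, and the lookup table covers only three of the codes listed above.

```python
import xml.etree.ElementTree as ET

# A few role codes from the list above; extend as needed.
ROLE_LABELS = {
    "A01": "By (author)",
    "A12": "Illustrated by",
    "B06": "Translated by",
}

# Hypothetical sample with invented names.
SAMPLE = """
<Product>
  <contributor>
    <b035>A01</b035>
    <b039>Jane</b039>
    <b040>Doe</b040>
  </contributor>
  <contributor>
    <b035>A12</b035>
    <b039>Sam</b039>
    <b040>Roe</b040>
  </contributor>
</Product>
"""


def contributor_credits(product):
    """Return one credit line per contributor, e.g. 'Illustrated by Sam Roe'."""
    credits = []
    for contributor in product.findall("contributor"):
        role = contributor.findtext("b035") or ""
        first = (contributor.findtext("b039") or "").strip()
        last = (contributor.findtext("b040") or "").strip()
        name = " ".join(part for part in (first, last) if part)
        # Fall back to the raw code when the role is not in the table.
        credits.append(f"{ROLE_LABELS.get(role, role)} {name}")
    return credits


print("\n".join(contributor_credits(ET.fromstring(SAMPLE))))
```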

9. Title Description

Found under <TextType> or <OtherText> (List Number 153). There are 24 fields that can be populated, and they should be if you have the information.

<othertext>
 <d102>Code Number</d102>
 <d104 textformat="02">Description</d104>
 </othertext> 
Here are the most common/most important for increased discoverability and sales:
01 - Sender-defined text: text which (a) is not for general distribution and (b) cannot be coded elsewhere.
 02 - Short description/annotation: limited to a maximum of 350 characters
 03 - Description: length unrestricted
 04 - Table of contents
 05 - Flap / cover copy
 06 - Review quote
 07 - Review quote: previous edition
 08 - Review quote: previous work
 09 - Endorsement
 10 - Promotional headline
 11 - Feature: describes an attention-grabbing feature of a product for promotional purposes.
 12 - Biographical note: a note referring to all contributors to a product, NOT linked to a single contributor
 13 - Publisher’s notice: publisher statement of contractual obligations (disclaimer, sponsor statement, or legal notice, etc.)
 16 - Short description/annotation for collection (of which the product is a part): limited to a maximum of 350 characters
 17 - Description for collection (of which the product is a part): length unrestricted

10. Review

In addition to the information relayed on reviews in the title data, you can include full reviews and award citations:
<Product>
 <CitedContentType> OR <PrizeorAwardorAchievement>
 <CitationType>Code Name or Value</CitationType>
 information
 </CitedContentType> OR </PrizeorAwardorAchievement>
Cited Content Type
 01 Review
 02 Bestseller list
 03 Media mention
 04 ‘One locality, one book’ program
 05 Curated list 
Prize/Award/Achievement
 01     Winner
 02     Runner-up
 03     Commended
 04     Short-listed
 05     Long-listed
 06     Joint winner
 07     Nominated

More Resources:

“A non-technical, beginners’ guide to ONIX for Books” by BookMachine.

“Three Ways To Do More With ONIX” by Digital Book World