How to Use and Read ONIX Book Files: Getting the most out of your metadata
Posted: December 6, 2017 | Updated: February 8, 2023
A no-nonsense handbook to navigating and getting the most out of your metadata distribution system
What is ONIX for Books?
First things first: what exactly is ONIX? Is it the British spelling of a black stone? Is it a Pokémon? No, and yes, but not here. Developed by the trade standards body EDItEUR in 2001, ONIX for Books is the global standard format for creating, transmitting, and communicating book product and bibliographic information electronically.
“ONIX is an XML-based standard for rich book metadata, providing a consistent way for publishers, retailers and their supply chain partners to communicate rich information about their products.”
If you’ve ever seen HTML code or seen a movie with hackers, you’ve seen what an ONIX file looks like. The kind folks at EDItEUR created a standard set of coded fields so that no matter what language you speak, you can accurately communicate the information about your book that retailers need to sell it. It’s like digital Esperanto.
The files, which ONIX calls “messages,” are sent from publishers to distributors and retailers through a variety of systems. Since the format is a guideline and not a product, delivery can be as simple as an email attachment or as sophisticated as third-party tools and File Transfer Protocol (FTP) providers.
The individual files can be viewed either in an internet browser (we recommend Chrome) or in a simple text editor (Notepad on Windows, TextEdit on Mac). Since ONIX messages are built for communication between computers, they can be difficult for humans to read. Third-party software that breaks down each code into easy question-and-answer fields is available. Some of the most popular options include ONIXEdit, Book Connect, OnixSuite, Title Manager, BookSonix and BiblioLive.
How does ONIX work?
Once completed, the ONIX messages are transmitted to the specific retailers, who use the information to populate fields for the product display as well as within their own cataloging and search functions.
Why does complete ONIX metadata matter?
In one word: sales. Books with complete metadata sell more copies, across both digital and nondigital platforms.
In 2016, the Nielsen Company, Bowker, and Baker and Taylor published the findings of their US survey of book sales and metadata, entitled “Nielsen Book US Study: The importance of Metadata for Discoverability and Sales.” It reinforced the findings of the 2012 Nielsen Book UK study, which found a strong link between complete, relevant metadata and increased sales, including for offline retailers.
Their results support the well-founded idea that discoverability (“the ease with which a particular product can be found”) hinges on complete metadata. They also make the important distinction that discoverability matters not just for consumers directly, but also for gatekeepers in the book industry supply chain, most notably librarians and booksellers:
“Providing accurate data on properties such as publication date, price, supplier and physical attributes aids booksellers in planning their stock management, from scheduling future orders, to planning shelf space or storage allocations, to ensuring shipments are made on the most economical terms (through referencing physical attribute data).
“Maintaining an efficient supply chain ensures that booksellers can focus on selling books – and maximizing sales for publishers and themselves. Where this valuable supply chain data isn’t available to the bookseller, at best they will need to carry out additional work (leading to decreased efficiency) and at worst they may not order the product due to an inability to plan for it effectively.”
While that all may sound daunting, the 2012 study examined just ten attributes out of the 3,000+ ONIX code entries available for completion. In the 2016 study, Nielsen examined the metadata of the top 100,000 bestselling titles from July 2015 to July 2016. They began with eight basic fields that they identified as the basic level of completeness (ISBN, Title, Format/Binding, Publication Date, BISAC Subject Code, Retail Price, Sales Rights, Cover image, and Contributor) and found that conforming titles saw average sales 75% higher than titles that did not conform.
They also looked at two critical groupings of data that revealed more positive sales correlations:
Books with complete descriptive data (title description, author biography and review) saw 72% higher sales than those without.
Books with keywords saw 34% higher sales than those without.
Quick guide: Reading the XML
Get into the nitty-gritty of the ONIX messages!
I. Navigating ONIX’s Most Important Fields
With the exception of reviews and high-quality keywords, all of this vital information is most likely at hand. You have it, you just need to implement it.
If you have access to a third-party system this may be easy enough, but since each is built to speak the ONIX language, it is important to know:
What the fields are
Where they live in the message
Details you can further include to strengthen your book sales
The creators of the ONIX system maintain an exhaustive code list that they update routinely. Most publishers still use the ONIX 2.0 series, but the group has since released ONIX 3.0: a much more thorough specification that takes digital formats into account.
An ONIX message’s details are divided among 230 sections, including sections as specific as “price constraints,” “Chinese school grade code,” and “supply date info,” but the most vital are at the beginning.
While the messages can look confusing, they do follow a solid logic, building on information as it is presented. Just like written language, there are phrases that are opened and closed to derive meaning and relationships from each statement.
We’ll pull apart each section and explain it and then bring it back together below.
Each detail is listed on its own line, indented with tabs that show its place in the order. It opens with an information tag in angle brackets and closes with the same tag with a “/” in front, like this. (Note: an ellipsis, as seen below between the Product Identifier tags, means that you can expand the section to see the information it contains.)
<Product>
    <ProductIdentifier>...</ProductIdentifier>
    <Title>
        <TitleType>#</TitleType>
        <TitlePrefix>The</TitlePrefix>
        <TitleWithoutPrefix>TEXT</TitleWithoutPrefix>
    </Title>
    <Contributor>
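If you would rather let a script show you the nesting than squint at the indentation, here is a minimal Python sketch using the standard library's XML parser. The sample fragment and the `outline` helper are made up for illustration, and the values are placeholders rather than real book data.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment modeled on the outline above; values are placeholders.
SAMPLE = """
<Product>
  <ProductIdentifier>
    <ProductIDType>15</ProductIDType>
    <IDValue>9781000000000</IDValue>
  </ProductIdentifier>
  <Title>
    <TitleType>01</TitleType>
    <TitlePrefix>The</TitlePrefix>
    <TitleWithoutPrefix>Book Title Example</TitleWithoutPrefix>
  </Title>
</Product>
"""

def outline(element, depth=0):
    """Return one line per tag, indented by nesting depth, with any leaf text."""
    text = (element.text or "").strip()
    lines = ["  " * depth + element.tag + (f": {text}" if text else "")]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

print("\n".join(outline(ET.fromstring(SAMPLE))))
```

Each opening tag becomes one indented line, so the parent and child relationships described above are easy to see at a glance.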
II. Formatting
In ONIX messages you can use standard HTML syntax to begin and end statements and to denote formatting differences. Here are the basics:
<d104 textformat="02"> - add this after the major section heading to note that the markup that follows obeys HTML rules. End it by closing the statement with </d104>
<Heading> Field Entry </Heading>
*Less-than and greater-than brackets surround each organizational heading and mark where it begins
*Data pertaining to the heading goes between the opening <Heading> and closing </Heading> tags
*A less-than bracket followed by a forward slash denotes the end of a phrase.
Within the Field Entry, you can note formatting for e-commerce sites (otherwise it will populate as standard text and in a large block if there is a lot of content):
<b> Bold text </b>
<strong> Important text </strong>
<i> Italic text </i>
<em> Emphasized text </em>
<sub> Subscript text </sub>
<sup> Superscript text</sup>
<u> Underline </u>
<p> - paragraph break
<br> - line break
<li> - list item
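As a rough illustration of how formatted text might be packed into a field, the Python sketch below wraps an HTML snippet in a <d104 textformat="02"> element. The `build_text_field` helper is hypothetical, and the choice to escape the markup (rather than use CDATA or inline tags) is an assumption you should confirm with whoever receives your files.

```python
import xml.etree.ElementTree as ET

def build_text_field(html_text):
    """Wrap HTML-formatted copy in a d104 element flagged as HTML (textformat 02)."""
    field = ET.Element("d104", {"textformat": "02"})
    # ElementTree escapes < and > when serializing, so the HTML travels as text.
    field.text = html_text
    return ET.tostring(field, encoding="unicode")

print(build_text_field("<p>A <b>bold</b> claim.</p>"))
```

Parsing the result back with the same library returns the original HTML string unchanged, which is a quick way to check that nothing was lost in the round trip.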
III. Descriptive Data Identifiers
1. Keywords
Entries in this part of the message are used specifically for indexing and search purposes. While they are not normally intended for display, best practice is to integrate those that make sense and seem to perform well into your other descriptive fields.
Where?
Under the <subject> heading.
How?
Code Number: 20
<subject>
    <b067>20</b067>
    <b070>keywords; separated by; semi-colons; can be long tail; or short; tail; terms; BUT; avoid; title, subject, and series; terms; that; are; duplicative; because most retailers; only allow; a certain number of characters;</b070>
</subject>
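To make the keyword advice concrete, here is a small Python sketch that joins keywords with semicolons, drops duplicates and terms already in the title, and stops before a character limit. The `build_keywords` helper and the 500-character default are assumptions for illustration; actual limits vary by retailer.

```python
def build_keywords(candidates, title_words, max_chars=500):
    """Join keywords with semicolons, skipping duplicates and title terms.

    max_chars is an assumed limit, not a value taken from any retailer.
    """
    skip = {word.lower() for word in title_words}
    seen = set()
    kept = []
    length = 0
    for keyword in candidates:
        cleaned = keyword.strip()
        key = cleaned.lower()
        if not key or key in seen or key in skip:
            continue
        # Account for the "; " separator on every keyword after the first.
        extra = len(cleaned) + (2 if kept else 0)
        if length + extra > max_chars:
            break
        kept.append(cleaned)
        seen.add(key)
        length += extra
    return "; ".join(kept)

print(build_keywords(["dragons", "Dragons", "epic fantasy", "Example"], ["Example"]))
```

The returned string is what would go between the <b070> tags.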
2. BISAC Subject Code
Found nested under <Product>, the <subject> product identifiers include 112 options for subject descriptions: everything from Dewey Decimal and Library of Congress categories to various European country standards, location by postal code, and Key Character Names. The scheme code goes in the <b067> field, and the code for the BISAC category is “10.”
<Product>
    <subject>
        <b067>10</b067>
        <b069>BISAC Subject Code (FIC031000, for example)</b069>
    </subject>
For a full list of BISAC categories, visit the Book Industry Study Group’s listing of complete BISAC headings for fiction and nonfiction subjects. Note: ONIX also supplies the <b067>22</b067> code for BISAC Merchandising Theme, which follows the same syntax as the subject code, with the <b069> field following the <b067>22</b067> entry.
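Because every <subject> block carries its scheme code in <b067>, a script can sort subjects by scheme. The Python sketch below is one way to do that; the sample record and the `subjects_by_scheme` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical record with one BISAC subject (10) and one keyword entry (20).
SAMPLE = """
<Product>
  <subject><b067>10</b067><b069>FIC031000</b069></subject>
  <subject><b067>20</b067><b070>thriller; suspense</b070></subject>
</Product>
"""

def subjects_by_scheme(product):
    """Group subject values by their b067 scheme code."""
    result = {}
    for subject in product.findall("subject"):
        scheme = subject.findtext("b067")
        # Coded subjects use b069; free-text entries such as keywords use b070.
        value = subject.findtext("b069") or subject.findtext("b070")
        if scheme and value:
            result.setdefault(scheme.strip(), []).append(value.strip())
    return result

print(subjects_by_scheme(ET.fromstring(SAMPLE)))
```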
3. ISBN
Where?
Under <ProductIdentifier>
What?
Noted as <ProductIDType> code number </ProductIDType>:
ISBN-13
<ProductIDType>15</ProductIDType>
ISBN-10
<ProductIDType>02</ProductIDType>
ISBN-A
<ProductIDType>26</ProductIDType>
How?
Following the Product Identifier code as <IDValue> 9781000000000 </IDValue>
So the full ISBN entry would look like this:
<Product>
    <ProductIdentifier>
        <ProductIDType>15</ProductIDType>
        <IDValue>9781000000000</IDValue>
    </ProductIdentifier>
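Since a mistyped ISBN can derail everything downstream, it can be worth checking the value before it goes into <IDValue>. This Python sketch applies the standard ISBN-13 check-digit rule (alternating weights of 1 and 3 over the first twelve digits); the `is_valid_isbn13` name is just for illustration. Note that the placeholder 9781000000000 used above is not a real ISBN and fails the check.

```python
def is_valid_isbn13(value):
    """Return True if value is 13 digits with a correct ISBN-13 check digit."""
    digits = value.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not digits.isdigit():
        return False
    # Weights alternate 1, 3, 1, 3, ... across the first twelve digits.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits[:12]))
    return (10 - total % 10) % 10 == int(digits[12])

print(is_valid_isbn13("978-0-306-40615-7"))
print(is_valid_isbn13("9781000000000"))
```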
4. Title
Where?
Under <Title>
What?
Lots of options for the Title:
<TitleType> code number </TitleType>
Most used code number will be “01,” which signals a distinctive title in a book and the cover title for a serial. Other options include:
00 - Undefined
02 - ISSN key title of serial
03 - Title in original language
04 - Title acronym or initialism
05 - Abbreviated title
06 - Title in other language
07 - Thematic title of journal issue
08 - Former title
10 - Distributor’s title
11 - Alternative title on cover
12 - Alternative title on back
13 - Expanded title
14 - Alternative title
<TitleText>Full Title</TitleText>
<TitlePrefix>A, An, The, etc.</TitlePrefix>
<TitleWithoutPrefix>Title, but without the prefix</TitleWithoutPrefix>
How?
The full title entry looks like this:
<Title>
    <TitleType>01</TitleType>
    <TitlePrefix>The</TitlePrefix>
    <TitleWithoutPrefix>Book Title Example</TitleWithoutPrefix>
</Title>
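A recipient has to stitch the prefix and the rest of the title back together for display. Here is a minimal Python sketch of that step, assuming the tag names shown above; the `display_title` helper is hypothetical, and it prefers <TitleText> when one is supplied.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<Title>
  <TitleType>01</TitleType>
  <TitlePrefix>The</TitlePrefix>
  <TitleWithoutPrefix>Book Title Example</TitleWithoutPrefix>
</Title>
"""

def display_title(title):
    """Return TitleText if present, otherwise prefix plus the rest of the title."""
    full = (title.findtext("TitleText") or "").strip()
    if full:
        return full
    prefix = (title.findtext("TitlePrefix") or "").strip()
    rest = (title.findtext("TitleWithoutPrefix") or "").strip()
    return " ".join(part for part in (prefix, rest) if part)

print(display_title(ET.fromstring(SAMPLE)))
```

Keeping the prefix separate is what lets a retailer sort the book under "B" rather than "T" while still showing the full title.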
5. Format/Binding
Where?
Under <Product>
What?
Noted as <ProductForm>code</ProductForm>:
There are 135 format options, with details from binding and paper type to operating system and file type. The most used codes are:
BA - Book
BB - Hardback
BC - Paperback / softback
BH - Board book
AA - Audio
AJ - Downloadable audio file
EA - Digital (delivery method unspecified)
EB - Digital download
Can I add specific format details about the Product?
There are 256 options. The most commonly used codes are:
B101 - Mass market (rack) paperback
B102 - Trade paperback (US)
B103 - Digest format paperback
B104 - A-format paperback
B105 - B-format paperback
B106 - Trade paperback (UK)
B107 - Tall rack paperback (US)
B315 - Trade binding
A103 - MP3 format
A104 - WAV format
B401 - Cloth over boards
B221 - Picture book
E101 - EPUB
E116 - Amazon Kindle
E121 - eReader
E126 - Microsoft Reader
E133 - Google Edition
E134 - Book ‘app’ for iOS
E135 - Book ‘app’ for Android
E136 - Book ‘app’ for other operating system
E141 - iBook
B501 - With dust jacket
B502 - With printed dust jacket
How?
The full format entry looks like this:
<Product>
    <ProductForm>BB</ProductForm>
    <ProductFormDetail>B501</ProductFormDetail>
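If you want to sanity-check format codes without memorizing them, a simple lookup does the job. The Python sketch below copies a handful of labels from the lists above; the dictionaries are deliberately incomplete and the `describe_format` helper is hypothetical.

```python
# Labels copied from the code lists above; only a few codes are included.
PRODUCT_FORMS = {
    "BA": "Book",
    "BB": "Hardback",
    "BC": "Paperback / softback",
    "BH": "Board book",
    "AJ": "Downloadable audio file",
}
FORM_DETAILS = {
    "B101": "Mass market (rack) paperback",
    "B102": "Trade paperback (US)",
    "B501": "With dust jacket",
    "E101": "EPUB",
}

def describe_format(form_code, detail_code=None):
    """Turn a product form code and optional detail code into readable text."""
    label = PRODUCT_FORMS.get(form_code, f"Unknown form {form_code}")
    if detail_code:
        detail = FORM_DETAILS.get(detail_code, f"unknown detail {detail_code}")
        return f"{label} ({detail})"
    return label

print(describe_format("BB", "B501"))
```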
6. Publication Date
The date itself is straightforward and found only under Product as:
<Product>
    <PublicationDate>YearMonthDay</PublicationDate>
*Note: make sure to denote in the ONIX message what your standard date format is. This is found under <DateFormat> (CodeList Number 55).
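Here is a small Python sketch for reading a publication date, assuming the year-month-day value is written as eight digits (YYYYMMDD). If your message declares a different date format, the pattern string would need to change; the `parse_publication_date` name is made up for this example.

```python
from datetime import date, datetime

def parse_publication_date(value):
    """Parse an eight-digit YYYYMMDD string; raises ValueError if it does not fit."""
    return datetime.strptime(value.strip(), "%Y%m%d").date()

print(parse_publication_date("20171206"))
```

A value that does not match the pattern raises an error instead of being silently misread, which is usually what you want when checking a feed.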
How do I add specific details about the Publication Date?
There are plenty of juicy details to add to an ONIX message about a pending publication surrounding its release.
Publishing Status
<Product> <PublishingStatus>code</PublishingStatus>
The options are:
00 - Unspecified
01 - Cancelled
02 - Forthcoming
03 - Postponed indefinitely
04 - Active
05 - No longer our product
06 - Out of stock indefinitely
07 - Out of print
08 - Inactive
09 - Unknown
10 - Remaindered
11 - Withdrawn from sale
12 - Recalled
13 - Active, but not sold separately
14 - Recalled
15 - Recalled
16 - Temporarily withdrawn from sale
17 - Permanently withdrawn from sale
Availability
<Product> <SupplyDetail> <ProductAvailability>Code</ProductAvailability>
The options are:
01 - Cancelled
09 - Not yet available, postponed indefinitely
10 - Not yet available
11 - Awaiting stock
12 - Not yet available, will be POD
20 - Available
21 - In stock
22 - To order
23 - POD
30 - Temporarily unavailable
31 - Out of stock
32 - Reprinting
33 - Awaiting reissue
34 - Temporarily withdrawn from sale
40 - Not available (reason unspecified)
41 - Not available, replaced by new product
42 - Not available, other format available
43 - No longer supplied by us
44 - Apply direct
45 - Not sold separately
46 - Withdrawn from sale
47 - Remaindered
48 - Not available, replaced by POD
49 - Recalled
50 - Not sold as set
51 - Not available, publisher indicates OP
52 - Not available, publisher no longer sells product in this market
97 - No recent update received
98 - No longer receiving updates
99 - Contact supplier
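The availability codes above appear to cluster by number range: the teens are not yet available, the twenties are available, and so on. The Python sketch below turns that observation into a rough grouping. The ranges and group names are our reading of the list, not an official classification, and `availability_group` is a hypothetical helper.

```python
def availability_group(code):
    """Map a two-digit availability code to a broad group (assumed ranges)."""
    n = int(code)
    if n == 1:
        return "cancelled"
    if n == 9 or 10 <= n <= 19:
        return "not yet available"
    if 20 <= n <= 29:
        return "available"
    if 30 <= n <= 39:
        return "temporarily unavailable"
    if 40 <= n <= 59:
        return "not available"
    if 97 <= n <= 99:
        return "uncertain"
    return "unknown code"

print(availability_group("21"))
```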
7. Retail Price
With the price it is important to note several things:
a. Currency
It’s important to note the currency for each ISBN. You can do so in the header of your message as
<DefaultCurrencyCode>CODE</DefaultCurrencyCode>
or under the <price> field as
<CurrencyCode>CODE</CurrencyCode>.
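To show how the two currency locations work together, this Python sketch reads each price and falls back to the header default when a price has no currency of its own. The sample message, its nesting, and the <Header>, <Price>, <PriceTypeCode>, and <PriceAmount> tag names are assumptions for illustration, as is the `prices_with_currency` helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical message: one price relies on the header default, one overrides it.
SAMPLE = """
<ONIXMessage>
  <Header><DefaultCurrencyCode>USD</DefaultCurrencyCode></Header>
  <Product>
    <SupplyDetail>
      <Price><PriceTypeCode>41</PriceTypeCode><PriceAmount>16.99</PriceAmount></Price>
      <Price><PriceTypeCode>41</PriceTypeCode><PriceAmount>12.99</PriceAmount><CurrencyCode>GBP</CurrencyCode></Price>
    </SupplyDetail>
  </Product>
</ONIXMessage>
"""

def prices_with_currency(message):
    """Return (amount, currency) pairs, using the header default as a fallback."""
    default = message.findtext("Header/DefaultCurrencyCode")
    pairs = []
    for price in message.iter("Price"):
        currency = price.findtext("CurrencyCode") or default
        pairs.append((price.findtext("PriceAmount"), currency))
    return pairs

print(prices_with_currency(ET.fromstring(SAMPLE)))
```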
b. Price type
There are 26 possibilities here. These are the most common:
03 - Fixed retail price excluding tax
04 - Fixed retail price including tax
05 - Supplier’s net price excluding tax
07 - Supplier’s net price including tax
41 - Publishers retail price excluding tax
42 - Publishers retail price including tax
c. Price type qualifiers
There are 16 possibilities here. These are the most common:
01 - Member/subscriber price
02 - Export price
03 - Reduced price applicable when the item is purchased as part of a set (or series, or collection)
05 - Consumer price
06 - Corporate / Library / Education price
07 - Reservation order price
08 - Promotional offer price
10 - Library price
11 - Education price
12 - Corporate price
13 - Subscription service price
14 - School library price
15 - Academic library price
16 - Public library price
d. Sales Rights
Nested under <Product>. Nine options:
00 - Sales rights unknown or unstated for any reason
01 - For sale with exclusive rights in the specified countries or territories
02 - For sale with non-exclusive rights in the specified countries or territories
03 - Not for sale in the specified countries or territories (reason unspecified)
04 - Not for sale in the specified countries (but publisher holds exclusive rights in those countries or territories)
05 - Not for sale in the specified countries (publisher holds non-exclusive rights in those countries or territories)
06 - Not for sale in the specified countries (because publisher does not hold rights in those countries or territories)
07 - For sale with exclusive rights in the specified countries or territories (sales restriction applies)
08 - For sale with non-exclusive rights in the specified countries or territories (sales restriction applies)
e. Territory:
<RightsTerritory>TERRITORIES</RightsTerritory>
8. Contributor
Nested under <Product>. Using the following codes you can note everything from the contributor to co-author, editors, pseudonyms, and more (there are a total of 108 designated authorial tags).
<Product>
    <contributor>
        <b035>#</b035>
        <b039>First Name</b039>
        <b040>Last Name</b040>
    </contributor>
Between <b035> # </b035> will be:
A01 - By (author)
A02 - With
A08 - By (photographer)
A09 - Created by
A12 - Illustrated by
A13 - Photographs by
A14 - Text by
A15 - Preface by
A16 - Prologue by
A19 - Afterword by
A22 - Epilogue by
A23 - Foreword by
A24 - Introduction by
A26 - Memoir by
A29 - Introduction and notes by
A32 - Contributions by
A36 - Cover design or artwork by
A38 - Original author
A39 - Maps by
A43 - Interviewer
B01 - Edited by
B02 - Revised by
B03 - Retold by
B04 - Abridged by
B05 - Adapted by
B06 - Translated by
B07 - As told by
B10 - Edited and translated by
C01 - Compiled by
E07 - Read by
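To see how the role code and name fields combine into a credit line, here is a short Python sketch. The sample contributors are invented, the role dictionary holds only a few labels from the list above, and `contributor_lines` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# A few role labels copied from the list above; extend as needed.
ROLES = {
    "A01": "By (author)",
    "A12": "Illustrated by",
    "B01": "Edited by",
    "B06": "Translated by",
    "E07": "Read by",
}

# Hypothetical record with an author and a translator.
SAMPLE = """
<Product>
  <contributor><b035>A01</b035><b039>Jane</b039><b040>Doe</b040></contributor>
  <contributor><b035>B06</b035><b039>John</b039><b040>Roe</b040></contributor>
</Product>
"""

def contributor_lines(product):
    """Build one readable credit line per contributor block."""
    lines = []
    for contributor in product.findall("contributor"):
        role = contributor.findtext("b035", default="").strip()
        first = contributor.findtext("b039", default="").strip()
        last = contributor.findtext("b040", default="").strip()
        name = " ".join(part for part in (first, last) if part)
        # Fall back to the raw code when the role is not in the lookup.
        lines.append(f"{ROLES.get(role, role)} {name}")
    return lines

print(contributor_lines(ET.fromstring(SAMPLE)))
```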
9. Title Description
Found in the <TextType> or <OtherText> fields (List Number 153), there are 24 fields that can be populated, and should be if you have the information.
<othertext>
    <d102>Code Number</d102>
    <d104 textformat="02">Description</d104>
</othertext>
Here are the most common/most important for increased discoverability and sales:
01 - Sender-defined text: Text which (a) is not for general distribution and (b) cannot be coded elsewhere.
02 - Short description/annotation: Limited to a maximum of 350 characters
03 - Description: Length unrestricted
04 - Table of contents
05 - Flap / cover copy
06 - Review quote
07 - Review quote: previous edition
08 - Review quote: previous work
09 - Endorsement
10 - Promotional headline
11 - Feature: Describing an attention-grabbing feature of a product for promotional purposes.
12 - Biographical note: A note referring to all contributors to a product, NOT linked to a single contributor
13 - Publisher’s notice: Publisher statement of contractual obligations (disclaimer, sponsor statement, or legal notice, etc.)
16 - Short description/annotation for collection (of which the product is a part): Limited to a maximum of 350 characters
17 - Description for collection (of which the product is a part): Length unrestricted
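Since the short description types (02 and 16) are capped at 350 characters, a quick length check can catch overruns before a retailer truncates your copy. This Python sketch counts every character in the field, including any markup, which is an assumption about how the limit is applied; the `overlong_short_descriptions` helper is hypothetical.

```python
import xml.etree.ElementTree as ET

SHORT_TYPES = {"02", "16"}  # text types listed above with a 350-character limit

def overlong_short_descriptions(product, limit=350):
    """Return (type code, length) for each short description over the limit."""
    problems = []
    for other in product.findall("othertext"):
        code = other.findtext("d102", default="").strip()
        text = other.findtext("d104", default="")
        if code in SHORT_TYPES and len(text) > limit:
            problems.append((code, len(text)))
    return problems

# Build a hypothetical record with a 351-character short description.
product = ET.Element("Product")
other = ET.SubElement(product, "othertext")
ET.SubElement(other, "d102").text = "02"
ET.SubElement(other, "d104", {"textformat": "02"}).text = "x" * 351
print(overlong_short_descriptions(product))
```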
10. Review
In addition to the review information relayed in the title data, you can include full reviews and award citations:
<Product>
    <CitedContentType> OR <PrizeorAwardorAchievement>
        <CitationType>Code Name or Value</CitationType>
        information
    </CitedContentType> OR </PrizeorAwardorAchievement>
Cited Content Type
01 - Review
02 - Bestseller list
03 - Media mention
04 - ‘One locality, one book’ program
05 - Curated list
Prize/Award/Achievement
01 - Winner
02 - Runner-up
03 - Commended
04 - Short-listed
05 - Long-listed
06 - Joint winner
07 - Nominated
More Resources:
“A non-technical, beginners’ guide to ONIX for Books” by BookMachine.
“Three Ways To Do More With ONIX” by Digital Book World