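ISO 2709
ISO 2709 is the international standard that defines the record structure used to exchange MARC records.
MARC is short for Machine Readable Catalog(ue): a catalog recorded on computer storage media in coded form and with a defined structure, so that a computer can identify and read it. A MARC record is entered once and can be reused many times; it is a product of advances in information technology and of the demand for resource sharing.
MARC data originated in the United States. In 1961 the Library of Congress began to consider library automation. As computer technology advanced, in 1963 the Library organized a feasibility study on using electronic computers in its internal operations. In January 1966 this produced the proposal for a standard machine-readable catalog record format, known as the MARC-1 format. MARC-2 was proposed in 1967; it is the parent of the machine-readable catalog formats in use today. In 1969 the Library began distributing bibliographic tapes in the MARC II format nationwide, and the MARC II format was named USMARC, the United States machine-readable catalog.
Although it dates from the early days of computing, the format was defined with careful attention to what library bibliographic data needs for physical description, content description, and retrieval: it has many fields; its description is detailed; many fields are searchable; it combines fixed-length and variable-length fields flexibly; it retains the main entry and other features of traditional cataloging; it is easy to extend and revise; and it has continued to develop in practice. USMARC suits conditions in the United States, and the United Kingdom, France, and other countries created their own machine-readable catalog formats to suit theirs. To coordinate these formats and promote international exchange, the International Federation of Library Associations (IFLA) developed the Universal MARC format, UNIMARC, on the basis of USMARC. Many countries now catalog with UNIMARC.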
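Introduction to CNMARC
CNMARC stands for China Machine-Readable Catalogue. It is used to exchange bibliographic information in a standard computer-readable form between China's national bibliographic agency and those of other countries, and among libraries and information services within China. Work on a Chinese machine-readable catalog began in the 1970s. In 1979 the National Technical Committee on Information and Documentation Standardization was founded, along with the Beijing-area MARC development group. In 1982 the national standards bureau published the national standard "Format for bibliographic information interchange on magnetic tape" (GB2901-82), drafted with reference to ISO 2709, which laid the foundation for standardizing a Chinese MARC format. A Chinese translation of UNIMARC appeared in 1986. On that basis, and in light of conditions in China, a discussion draft of the "China MARC Format" was compiled, and the format was formally published in February 1992 as CNMARC. CNMARC provides the data-structure foundation for standardizing China's machine-readable catalogs and aligning them with international practice.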
// The following content was added by lakechang on 2008-03-10. It describes the ISO 2709 format in detail.
LEADER
The leader is the first field in the record and has a fixed length of 24 octets (character positions 0-23). Only ASCII graphic characters are allowed in the Leader. The structure of the leader as defined in MARC 21 is represented schematically below. The numbers indicate the character positions occupied by each part of the leader.
Structure of the Leader in MARC 21 Records
00-04    RECORD_LENGTH
05       RECORD_STATUS
06       TYPE_OF_RECORD
07-08    IMPLEMENTATION-DEFINED
09       CHARACTER_CODING_SCHEME
10       INDICATOR_COUNT
11       SUBFIELD_CODE_LENGTH
12-16    BASE_ADDRESS_OF_DATA
17-19    IMPLEMENTATION-DEFINED
20-23    ENTRY_MAP
Record length (character positions 00-04), contains a five-character ASCII numeric string equal to the length of the entire record, including itself and the record terminator. The five-character numeric string is right justified and unused positions contain zeroes (zero fill). The maximum length of a record is 99999 octets.
Record status (character position 05), contains an ASCII graphic character which indicates the relation of the record to a file (e.g., new, updated, etc.).
Type of record (character position 06), contains an ASCII graphic character which specifies the characteristics and defines the components of the record.
Implementation-defined (character positions 07-08). ANSI Z39.2 and ISO 2709 reserve character positions 07-08 for definition by a particular implementation. The individual MARC 21 formats define these character positions if needed. Positions may contain only ASCII graphic characters. Any position not defined contains a blank.
- Bibliographic level (bibliographic record, character position 07): contains an ASCII graphic character which also provides information about the components and characteristics of the record.
- Kind of data (community information record, character position 07): contains an ASCII graphic character which also provides information about the components and characteristics of the record.
- Type of control (bibliographic record, character position 08): contains an ASCII graphic character which also provides information about the components and characteristics of the record.
Character coding scheme (character position 09), contains a code that identifies the character coding scheme used in a record.
Indicator count (character position 10), contains one ASCII numeric character specifying the number of indicators occurring in each variable data field. In MARC 21 records, the indicator count is always 2.
Subfield code length (character position 11), contains one ASCII numeric character specifying the sum of the lengths of the delimiter and the data element identifier used in the record. In MARC 21 records, the subfield code length is always 2. The ANSI Z39.2 and ISO 2709 name for this data element is identifier length.
Base address of data (character positions 12-16), contains five ASCII numeric characters that specify the first character position of the first variable field in the record. It is equal to the sum of the lengths of the leader and the directory, including the field terminator at the end of the directory. The number is right justified and unused positions contain zeroes (zero fill).
Implementation-defined (character positions 17-19). ANSI Z39.2 and ISO 2709 reserve character positions 17-19 for definition by a particular implementation. The individual MARC 21 formats define these character positions if needed. Positions may contain only ASCII graphic characters. Any position not defined contains a blank.
Entry map (character positions 20-23), contains four single digit ASCII numeric characters that specify the structure of the entries in the directory.
- Length of length-of-field (character position 20): specifies the length of that part of each directory entry; in MARC 21 records, it is always set to 4.
- Length of starting-character-position (character position 21): specifies the length of that part of each directory entry; in MARC 21 records, it is always set to 5.
- Length of implementation-defined (character position 22): specifies the length of that part of each directory entry; in MARC 21 records, a directory entry does not contain an implementation-defined portion, therefore this position is always set to 0.
- Undefined (character position 23): this character position is undefined; it is always set to 0.
Structure of an Entry Map in MARC 21 Record
20    LENGTH OF LENGTH-OF-FIELD PART
21    LENGTH OF STARTING-CHARACTER-POSITION PART
22    LENGTH OF IMPLEMENTATION-DEFINED PART
23    UNDEFINED
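The leader layout described above can be sketched as a fixed-position slice of the first 24 octets. This is a minimal illustration, not a reference implementation: the `Leader` class, its field names, and the sample leader string are made up for this example, and no validation of the individual code values is attempted.

```python
from dataclasses import dataclass

LEADER_LENGTH = 24  # the leader occupies character positions 00-23


@dataclass(frozen=True)
class Leader:
    """Hypothetical container for the leader elements (names are illustrative)."""
    record_length: int         # positions 00-04
    record_status: str         # position 05
    type_of_record: str        # position 06
    implementation_07_08: str  # positions 07-08
    character_coding: str      # position 09
    indicator_count: int       # position 10
    subfield_code_length: int  # position 11
    base_address: int          # positions 12-16
    implementation_17_19: str  # positions 17-19
    entry_map: str             # positions 20-23


def parse_leader(record: bytes) -> Leader:
    """Slice the first 24 octets of a record into the leader elements."""
    if len(record) < LEADER_LENGTH:
        raise ValueError("record is shorter than the 24-octet leader")
    # Only ASCII graphic characters are allowed in the leader.
    text = record[:LEADER_LENGTH].decode("ascii")
    return Leader(
        record_length=int(text[0:5]),
        record_status=text[5],
        type_of_record=text[6],
        implementation_07_08=text[7:9],
        character_coding=text[9],
        indicator_count=int(text[10]),
        subfield_code_length=int(text[11]),
        base_address=int(text[12:17]),
        implementation_17_19=text[17:20],
        entry_map=text[20:24],
    )


# Illustrative leader, made up for this example.
leader = parse_leader(b"00714cam a2200205 a 4500")
print(leader.record_length, leader.base_address, leader.entry_map)
```

In this made-up leader the base address of data is 205, which is consistent with a directory of 15 entries: 24 (leader) + 15 × 12 (entries) + 1 (field terminator).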
--------------------------------------------------------------------------------
DIRECTORY
A directory entry in MARC 21 is made up of a tag, length-of-field, and field starting position. The directory begins in character position 24 of the record and ends with a field terminator. It is of variable length and consists of a series of fixed fields, referred to as "entries." One entry is associated with each variable field (control or data) present in the record. Each directory entry is 12 characters in length; the structure of each entry as defined in MARC 21 is represented schematically below. The numbers indicate the character positions occupied by the parts of the entry.
Structure of a Directory Entry in MARC 21 Records
00-02    TAG
03-06    LENGTH_OF_FIELD
07-11    STARTING_CHARACTER_POSITION
Tag (character positions 00-02), consists of three ASCII numeric characters or ASCII alphabetic characters (uppercase or lowercase, but not both) used to identify or label an associated variable field. The MARC 21 formats have used only numeric tags. The tag is stored only in the directory entry for the field; it does not appear in the variable field itself.
Length of field (character positions 03-06), contains four ASCII numeric characters which give the length, expressed as a decimal number, of the variable field to which the entry corresponds. This length includes the indicators, subfield codes, data and field terminator associated with the field. A field length number of fewer than four digits is right justified and unused positions contain zeroes (zero fill). MARC 21 sets the length of the length of field portion of the entry at four characters, thus a field may contain a maximum of 9999 octets.
Starting character position (character positions 07-11), contains five ASCII numeric characters which give the starting character position, expressed as a decimal number, of the variable field to which the entry corresponds relative to the base address of data of the record. A starting character position of fewer than five digits is right justified and unused positions contain zeroes (zero fill).
Order of entries. Directory entries for control fields precede entries for data fields. Entries for control fields are sequenced by tag in increasing numerical order. Entries for data fields are arranged in ascending order according to the first character of the tag, with numeric characters preceding alphabetic characters. See Variable Fields below for order requirements for the fields to which the directory entries point.
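The directory layout can be sketched as a loop over 12-character entries between the leader and the base address of data. The function and class names below are hypothetical. The sketch also assumes that the field terminator is the octet 1E (hex); that value comes from the MARC 21 record structure specification and is not stated in the text above.

```python
from typing import List, NamedTuple

LEADER_LENGTH = 24
ENTRY_LENGTH = 12        # 3 (tag) + 4 (length of field) + 5 (starting position)
FIELD_TERMINATOR = 0x1E  # assumption: value not given in the text above


class DirectoryEntry(NamedTuple):
    tag: str     # entry positions 00-02
    length: int  # entry positions 03-06
    start: int   # entry positions 07-11, relative to the base address of data


def parse_directory(record: bytes) -> List[DirectoryEntry]:
    """Read the directory entries that sit between the leader and the data."""
    base_address = int(record[12:17].decode("ascii"))
    # The directory ends with a field terminator just before the base address.
    if record[base_address - 1] != FIELD_TERMINATOR:
        raise ValueError("directory does not end with a field terminator")
    directory = record[LEADER_LENGTH:base_address - 1]
    if len(directory) % ENTRY_LENGTH != 0:
        raise ValueError("directory length is not a multiple of 12")
    entries = []
    for offset in range(0, len(directory), ENTRY_LENGTH):
        entry = directory[offset:offset + ENTRY_LENGTH].decode("ascii")
        entries.append(
            DirectoryEntry(entry[0:3], int(entry[3:7]), int(entry[7:12]))
        )
    return entries


def field_bytes(record: bytes, entry: DirectoryEntry) -> bytes:
    """Return the octets of one variable field, including its terminator."""
    base_address = int(record[12:17].decode("ascii"))
    begin = base_address + entry.start
    return record[begin:begin + entry.length]
```

Because the tag is stored only in the directory, a reader has to pair each `DirectoryEntry` with the octets returned by `field_bytes` to know which field it is looking at.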
VARIABLE FIELDS
The variable fields follow the leader and the directory in the record and consist of control fields and data fields. Control fields precede data fields in the record and are arranged in the same sequence as the corresponding entries in the directory. The sequence in which data fields are stored in the record is not necessarily the same as the order of the corresponding directory entries.
Control fields in MARC 21 formats are assigned tags beginning with two zeroes. They are comprised of data and a field terminator; they do not contain indicators or subfield codes. The control number field is assigned tag 001 and contains the control number of the record. Each record contains only one control number field (with tag 001), which is to be located at the base address of data.
Data fields in MARC 21 formats are assigned tags beginning with ASCII numeric characters other than two zeroes. Such fields contain indicators and subfield codes, as well as data and a field terminator. There are no restrictions on the number, length, or content of data fields other than those already stated or implied, e.g., those resulting from the limitation of total record length. The structure of a data field is shown schematically below.
Structure of a Variable Data Field in MARC 21 Records
INDICATOR_1 INDICATOR_2 DELIMITER DATA_ELEMENT_IDENTIFIER_1
DATA_ELEMENT_1 ... DELIMITER DATA_ELEMENT_IDENTIFIER_n
DATA_ELEMENT_n FT
Indicators are the first two characters in every variable data field, preceding any subfield code (delimiter plus data element identifier) which may be present. Each indicator is one character and every data field in the record includes two indicators, even if values have not been defined for the indicators in a particular field. Indicators supply additional information about the field, and are defined individually for each field. Indicator values are interpreted independently; meaning is not ascribed to the two indicators taken together. Indicators may be any ASCII lowercase alphabetic, numeric, or blank. A blank is used in an undefined indicator position, and may also have a defined meaning in a defined indicator position. The numeric character 9 is reserved for local definition as an indicator.
Subfield codes identify the individual data elements within the field, and precede the data elements they identify. Each data field contains at least one subfield code. The subfield code consists of a delimiter (ASCII 1F (hex)) followed by a data element identifier. Data element identifiers defined in MARC 21 may be any ASCII lowercase alphabetic or numeric character. In general, numeric identifiers are defined for data used to process the field, or coded data needed to interpret the field. Alphabetic identifiers are defined for the separate elements which constitute the data content of the field. The character 9 and the following ASCII graphic symbols are reserved for local definition as data element identifiers:
! " # $ % & ' ( ) * + , - . / : ; < = > ? { } _ ^ ` ~ [ ]
A data field may contain more than one data element, depending upon the definition of the field. The last character in a data field is the field terminator, which follows the last data element in the field.
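The data field structure above (two indicators, then delimiter-prefixed data elements, then a field terminator) can be sketched as follows. The function name is hypothetical, and the sketch works on an already decoded string, because how octets map to characters depends on the character coding scheme in leader position 09. The delimiter value 1F (hex) is stated above; the field terminator value 1E (hex) is an assumption taken from the MARC 21 record structure specification.

```python
from typing import List, Tuple

SUBFIELD_DELIMITER = "\x1f"  # ASCII 1F (hex), as stated in the text
FIELD_TERMINATOR = "\x1e"    # assumption: value not given in the text above


def parse_data_field(field: str) -> Tuple[str, str, List[Tuple[str, str]]]:
    """Split one variable data field into indicators and subfields."""
    if field.endswith(FIELD_TERMINATOR):
        field = field[:-1]
    if len(field) < 2:
        raise ValueError("a data field must start with two indicators")
    indicator_1, indicator_2 = field[0], field[1]
    subfields = []
    # Everything after the indicators is a run of delimiter + identifier + data.
    # The first chunk of the split is the (empty) text before the first delimiter.
    for chunk in field[2:].split(SUBFIELD_DELIMITER)[1:]:
        if not chunk:
            raise ValueError("delimiter without a data element identifier")
        subfields.append((chunk[0], chunk[1:]))
    return indicator_1, indicator_2, subfields


# Illustrative field content, made up for this example.
print(parse_data_field("10\x1faTitle :\x1fbsubtitle\x1e"))
```

Control fields (tags beginning with two zeroes) carry no indicators or subfield codes, so they should not be passed to this function.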
--------------------------------------------------------------------------------
DESIGN PRINCIPLES FOR MARC 21
A MARC 21 format is a set of codes and content designators defined for encoding a particular type of machine-readable record. The MARC 21 formats as a group serve as a vehicle for authority, bibliographic, classification, community information, and holdings data of all types. These formats are intended to be communication formats and are primarily designed to provide specifications for the exchange of information between systems. The following description of design principles repeats, in some cases, information given above but is given again for completeness.
--------------------------------------------------------------------------------
Content Designation
The purpose of content designation is to identify and characterize the data elements which comprise a MARC record with sufficient precision to support manipulation of the data for a variety of functions. The MARC 21 formats have attempted to preserve consistency of content designation across formats where this is appropriate.
The MARC 21 content designation supports the sorting of data only to a limited extent. In general, sorting must be accomplished through the application of external algorithms to the data.
The MARC 21 formats provide for using content designation, e.g., tag values or indicators, to specify recommended display constants. A display constant is a term, phrase, and/or spacing or punctuation convention that may be system generated under prescribed circumstances to make a visual presentation of data in a record more meaningful to a user. The display constant text is not carried in the data, but may be supplied for display by the processing system.
--------------------------------------------------------------------------------
Variable Fields and Tags
The data in a MARC 21 record is organized into fields, each identified by a three-character tag. Although ANSI Z39.2 and ISO 2709 allow both alphabetic and numeric characters, MARC 21 formats use only numeric tags. The tag is stored in the directory entry for the field, not in the field itself. Variable field tags are defined in blocks according to the first character of the tag, which, with some exceptions, identifies the general function of the field's data within a record. The type of information in the field is identified by the remainder of the tag. The meaning of these blocks depends upon the type of record.
The bibliographic format blocks are:
0XX Control information, numbers, and codes
1XX Main entry
2XX Titles and title paragraph (title, edition, imprint)
3XX Physical description, etc.
4XX Series statements
5XX Notes
6XX Subject access fields
7XX Added entries other than subject or series; linking fields
8XX Series added entries; location, and alternate graphics
9XX Reserved for local implementation
The authority format blocks are:
0XX Control information, numbers, and codes
1XX Heading
2XX Complex see references
3XX Complex see also references
4XX See from tracings
5XX See also from tracings
6XX Reference notes, treatment decisions, notes, etc.
7XX Heading linking entries
8XX Location and alternate graphics
9XX Reserved for local implementation
The classification format blocks are:
0XX Control information, numbers, and codes
1XX Classification numbers and terms
2XX Complex see references
3XX Complex see also references
4XX Invalid number tracings
5XX Valid number tracings
6XX Note fields
70X-75X Index term fields
76X Number building fields
8XX Location and alternate graphics
9XX Reserved for local implementation
The community information format blocks are:
0XX Control information, numbers, and codes
1XX Primary names
2XX Titles, addresses
3XX Physical information, etc.
4XX Series information
5XX Notes
6XX Subject access fields
7XX Added entries other than subject
8XX Location and alternate graphics
9XX Reserved for local implementation
The holdings format blocks are:
0XX Control information, numbers, and codes
1XX Not defined
2XX Not defined
3XX Not defined
4XX Not defined
5XX Notes
6XX Not defined
7XX Not defined
8XX Holdings and location data, notes
9XX Reserved for local implementation
Within some blocks of variable fields, parallels of content designation are preserved, e.g., bibliographic records (1XX, 4XX, 6XX, 7XX, 8XX), authority records (1XX, 4XX, 5XX, 7XX), classification records (70X-75X), and community information records (1XX, 4XX, 6XX, 7XX). The following meanings are generally given to the final two characters of the tag of fields in these blocks:
X00 Personal names
X10 Corporate names
X11 Meeting names
X30 Uniform titles
X40 Bibliographic titles
X50 Topical terms
X51 Geographic names
X55 Genre/form terms
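The block and suffix conventions above lend themselves to a small lookup. The sketch below covers the bibliographic format only and uses a hypothetical function name. Since the suffix meanings are only "generally" given, the result is a hint about a tag, not a definition of the field.

```python
from typing import Optional, Tuple

# Bibliographic format blocks, keyed by the first character of the tag.
BIBLIOGRAPHIC_BLOCKS = {
    "0": "Control information, numbers, and codes",
    "1": "Main entry",
    "2": "Titles and title paragraph (title, edition, imprint)",
    "3": "Physical description, etc.",
    "4": "Series statements",
    "5": "Notes",
    "6": "Subject access fields",
    "7": "Added entries other than subject or series; linking fields",
    "8": "Series added entries; location, and alternate graphics",
    "9": "Reserved for local implementation",
}

# Meanings generally given to the final two characters of the tag.
SUFFIX_MEANINGS = {
    "00": "Personal names",
    "10": "Corporate names",
    "11": "Meeting names",
    "30": "Uniform titles",
    "40": "Bibliographic titles",
    "50": "Topical terms",
    "51": "Geographic names",
    "55": "Genre/form terms",
}

# Bibliographic blocks in which the parallel content designation is preserved.
PARALLEL_BLOCKS = {"1", "4", "6", "7", "8"}


def describe_bibliographic_tag(tag: str) -> Tuple[str, Optional[str]]:
    """Return (block meaning, general suffix meaning or None) for a tag."""
    if len(tag) != 3 or not (tag.isascii() and tag.isdigit()):
        raise ValueError("MARC 21 tags are three ASCII numeric characters")
    suffix = SUFFIX_MEANINGS.get(tag[1:]) if tag[0] in PARALLEL_BLOCKS else None
    return BIBLIOGRAPHIC_BLOCKS[tag[0]], suffix


print(describe_bibliographic_tag("650"))
```

The other formats (authority, classification, community information, holdings) would need their own block tables and their own sets of parallel blocks.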
--------------------------------------------------------------------------------
Note Fields
Rules have been developed for the MARC 21 formats that guide when a separate field should be defined for note data and when the data should be included in a general note field. For the MARC 21 bibliographic format, a specific 5XX note field is defined when at least one of the following is true:
1) Categorical indexing or retrieval is required on the data defined for the note. The note is used for structured access purposes but does not have the nature of a controlled access point.
2) Special manipulation of that specific category of data is a routine requirement. Such manipulation includes special print or display formatting or selection or suppression from display or printed product.
3) Specialized structuring of information for reasons other than those given above, e.g., to support particular standards of data content when they cannot be supported in existing fields.
For the MARC 21 authority format, the specifications for notes are covered in the following two conditions:
1) A specific note field is needed when special manipulation of that specific category of data is a routine requirement. Such manipulation includes special print or display formatting or selection or suppression from display or printed product.
2) Multiple notes are generally not established to accommodate the same type of information for different types of authorities. Notes are thus not differentiated by or limited to subject, name, or series if the same information applies to more than one type.
--------------------------------------------------------------------------------
Local Fields
Certain tags have been reserved for local implementation. The MARC 21 formats specify no structure or meaning for local fields. Communication of such fields between systems is governed by mutual agreements on the content and content designation of the fields communicated.
In general, any tag containing the character 9 is reserved for local implementation within the block structure. Specifically the 9XX block is reserved for local implementation as indicated above. The historical development of the MARC 21 formats has left one exception to this general principle: field 490 (Series Statement) in the bibliographic format. There are several obsolete fields with tags containing the character 9 (e.g., 009 (Physical Description Fixed-Field for Archival Collection) and 039 (Level of Bibliographic Control and Coding Detail)). The indicator value 9 and subfield 9 are also reserved for local implementation.
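The general rule for local tags in the bibliographic format can be sketched as a one-line check. The function name is hypothetical, and only the exception named above (field 490) is encoded. The obsolete fields with a 9 in the tag, such as 009 and 039, are not special-cased, so the function reports them as local.

```python
# Bibliographic tags that contain a 9 but are not local, per the text above.
NON_LOCAL_EXCEPTIONS = {"490"}


def is_local_tag(tag: str) -> bool:
    """Apply the 'any tag containing the character 9' rule to one tag."""
    return "9" in tag and tag not in NON_LOCAL_EXCEPTIONS


for tag in ("245", "490", "590", "900"):
    print(tag, is_local_tag(tag))
```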
--------------------------------------------------------------------------------
Repeatability
Theoretically, all fields, except 001 (Control Number) and 005 (Date and Time of Latest Transaction), and subfields may be repeated. The nature of the data, however, often precludes repetition (e.g., a bibliographic or community information record may contain only one field 245 (Title Statement); an authority or classification record may contain only one 1XX heading field). The repeatability or nonrepeatability of each field and subfield is specified in the MARC 21 formats.
--------------------------------------------------------------------------------
Coded Data
In addition to content designation, the MARC 21 formats include specifications for the content of certain data elements, particularly those that provide for the representation of data by coded values. Coded values consist of fixed-length ASCII character strings. Individual elements within a coded-data field or subfield are identified by relative character position. Although coded data occur most frequently in the leader, directory, and variable control fields, any field or subfield may be defined for coded data.
Certain common values for codes used in coded data have been defined: