
BFD

The linker accesses object and archive files using the BFD libraries. These libraries allow the linker to use the same routines to operate on object files regardless of the object file format. A different object file format can be supported simply by creating a new BFD back end and adding it to the library. To conserve runtime memory, however, the linker and associated tools are usually configured to support only a subset of the object file formats available. You can use objdump -i (see section `objdump' in The GNU Binary Utilities) to list all the formats available for your configuration.

As with most implementations, BFD is a compromise between several conflicting requirements. The major factor influencing BFD design was efficiency: any time spent converting between formats is time that would not have been spent had BFD not been involved. This is partly offset by abstraction payback: since BFD simplifies applications and back ends, more time and care may be spent optimizing algorithms for greater speed.

One minor artifact of the BFD solution that you should bear in mind is the potential for information loss. There are two places where useful information can be lost using the BFD mechanism: during conversion and during output. See section Information Loss.

How it works: an outline of BFD

When an object file is opened, BFD subroutines automatically determine the format of the input object file. They then build a descriptor in memory with pointers to the routines that will be used to access elements of the object file's data structures.

As different information from the object files is required, BFD reads from different sections of the file and processes them. For example, a very common operation for the linker is processing symbol tables. Each BFD back end provides a routine for converting between the object file's representation of symbols and an internal canonical format. When the linker asks for the symbol table of an object file, it calls through a memory pointer to the routine from the relevant BFD back end, which reads the table and converts it into a canonical form. The linker then operates upon the canonical form. When the link is finished and the linker writes the output file's symbol table, another BFD back end routine is called to take the newly created symbol table and convert it into the chosen output format.
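The dispatch described above can be sketched in a few lines. This is a toy model, not the BFD API: the two "formats" (`toy-a`, `toy-b`), their magic strings, and every function name below are assumptions invented for illustration. It only shows the shape of the mechanism: probe the input to pick a back end, call through that back end's routine pointers to get canonical symbols, then hand the canonical table to the output back end.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CanonicalSymbol:
    """Internal, format-independent view of a symbol (hypothetical)."""
    name: str
    value: int


@dataclass
class BackEnd:
    """Descriptor holding 'pointers' to one format's routines."""
    name: str
    probe: Callable[[bytes], bool]
    read_symbols: Callable[[bytes], List[CanonicalSymbol]]
    write_symbols: Callable[[List[CanonicalSymbol]], bytes]


MAGIC_A = b"TOYA\n"  # toy-a stores "name=hexvalue" lines
MAGIC_B = b"TOYB\n"  # toy-b stores "decimalvalue,name" lines


def read_a(data: bytes) -> List[CanonicalSymbol]:
    body = data[len(MAGIC_A):].decode("ascii")
    pairs = (line.split("=") for line in body.splitlines() if line)
    return [CanonicalSymbol(name, int(value, 16)) for name, value in pairs]


def write_a(symbols: List[CanonicalSymbol]) -> bytes:
    body = "".join(f"{s.name}={s.value:x}\n" for s in symbols)
    return MAGIC_A + body.encode("ascii")


def read_b(data: bytes) -> List[CanonicalSymbol]:
    body = data[len(MAGIC_B):].decode("ascii")
    pairs = (line.split(",") for line in body.splitlines() if line)
    return [CanonicalSymbol(name, int(value)) for value, name in pairs]


def write_b(symbols: List[CanonicalSymbol]) -> bytes:
    body = "".join(f"{s.value},{s.name}\n" for s in symbols)
    return MAGIC_B + body.encode("ascii")


BACK_ENDS = [
    BackEnd("toy-a", lambda d: d.startswith(MAGIC_A), read_a, write_a),
    BackEnd("toy-b", lambda d: d.startswith(MAGIC_B), read_b, write_b),
]


def open_object(data: bytes) -> BackEnd:
    """Determine the input format by asking each back end to probe it."""
    for back_end in BACK_ENDS:
        if back_end.probe(data):
            return back_end
    raise ValueError("unrecognized object file format")


def link_symbols(inputs: List[bytes], output_format: str) -> bytes:
    """Read every input into canonical form, then write one output table."""
    merged: List[CanonicalSymbol] = []
    for data in inputs:
        merged.extend(open_object(data).read_symbols(data))
    out = next(b for b in BACK_ENDS if b.name == output_format)
    return out.write_symbols(merged)
```

Calling `link_symbols` with one `toy-a` input and one `toy-b` input works without the caller ever naming the input formats, which is the point of routing everything through the canonical form.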

Information Loss

Information can be lost during output. The output formats supported by BFD do not provide identical facilities, and information which can be described in one format may have nowhere to go in another. One example of this is alignment information in b.out. There is nowhere in an a.out format file to store alignment information on the contained data, so when a file is linked from b.out and an a.out image is produced, alignment information will not propagate to the output file. (The linker will still use the alignment information internally, so the link is performed correctly.)
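A small sketch may make the distinction concrete: alignment is honored while laying out sections, yet it disappears from an output header that has no field for it. The header layouts below are hypothetical stand-ins and do not reproduce the real a.out or b.out structures; all names are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Section:
    name: str
    size: int
    alignment_power: int  # section must start at a multiple of 2**alignment_power


def align_up(address: int, power: int) -> int:
    """Round address up to the next multiple of 2**power."""
    mask = (1 << power) - 1
    return (address + mask) & ~mask


def lay_out(sections: List[Section], base: int = 0) -> Dict[str, int]:
    """Assign addresses in order, honoring each section's alignment."""
    addresses: Dict[str, int] = {}
    cursor = base
    for section in sections:
        cursor = align_up(cursor, section.alignment_power)
        addresses[section.name] = cursor
        cursor += section.size
    return addresses


def header_with_alignment(sections: List[Section], addresses: Dict[str, int]) -> List[dict]:
    """Toy output format that has a slot for alignment."""
    return [
        {"name": s.name, "address": addresses[s.name], "size": s.size,
         "alignment_power": s.alignment_power}
        for s in sections
    ]


def header_without_alignment(sections: List[Section], addresses: Dict[str, int]) -> List[dict]:
    """Toy output format with no slot for alignment: the field is dropped."""
    return [
        {"name": s.name, "address": addresses[s.name], "size": s.size}
        for s in sections
    ]
```

In both toy outputs the addresses are identical, because layout used the alignment before anything was written. Only the record of the alignment requirement is missing from the second one.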

Another example is COFF section names. COFF files may contain an unlimited number of sections, each one with a textual section name. If the target of the link is a format which does not have many sections (e.g., a.out) or has sections without names (e.g., the Oasys format), the link cannot be done simply. You can circumvent this problem by describing the desired input-to-output section mapping with the linker command language.

Information can be lost during canonicalization. The BFD internal canonical form of the external formats is not exhaustive; there are structures in input formats for which there is no direct representation internally. This means that the BFD back ends cannot maintain all possible data richness through the transformation from external to internal and back to external formats.

This limitation is only a problem when an application reads one format and writes another. Each BFD back end is responsible for maintaining as much data as possible, and the internal BFD canonical form has structures which are opaque to the BFD core and exported only to the back ends. When a file is read in one format, the canonical form is generated for BFD and the application. At the same time, the back end saves away any information which may otherwise be lost. If the data is then written back in the same format, the back end routine will be able to use the canonical form provided by the BFD core as well as the information it prepared earlier. Since there is a great deal of commonality between back ends, there is no information lost when linking or copying big endian COFF to little endian COFF, or a.out to b.out. When a mixture of formats is linked, information is lost only from the files whose format differs from the destination.
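The save-away behavior can be illustrated with a toy model. Here a hypothetical "rich" format carries a comment field that the canonical form has no place for; its back end stashes the comment in a private area that only it understands. Nothing below is BFD's actual data layout, and the format names, the `private` field, and the functions are all assumptions for the sake of the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class CanonicalFile:
    source_format: str
    symbols: Dict[str, int]
    # Opaque to the core; only the back end that filled it knows the layout.
    private: Dict[str, Any] = field(default_factory=dict)


def read_rich(raw: dict) -> CanonicalFile:
    """Back end for a toy format with an extra 'comment' field."""
    return CanonicalFile("rich", dict(raw["symbols"]), {"comment": raw["comment"]})


def write_rich(cf: CanonicalFile) -> dict:
    # Private data is trusted only if this same back end produced it.
    comment = cf.private.get("comment", "") if cf.source_format == "rich" else ""
    return {"format": "rich", "symbols": dict(cf.symbols), "comment": comment}


def read_plain(raw: dict) -> CanonicalFile:
    """Back end for a toy format that stores symbols only."""
    return CanonicalFile("plain", dict(raw["symbols"]))


def write_plain(cf: CanonicalFile) -> dict:
    # No slot for a comment: whatever the source carried is dropped here.
    return {"format": "plain", "symbols": dict(cf.symbols)}
```

Reading and writing with the same back end reproduces the input exactly, while crossing from `rich` to `plain` keeps the symbols and loses the comment, mirroring the rule that loss occurs only when source and destination formats differ.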

The BFD canonical object-file format

The greatest potential for loss of information occurs when there is the least overlap between the information provided by the source format, that stored by the canonical format, and that needed by the destination format. A brief description of the canonical form may help you understand which kinds of data you can count on preserving across conversions.

files: Information stored on a per-file basis includes target machine architecture, particular implementation format type, a demand pageable bit, and a write protected bit. Information like Unix magic numbers is not stored here, only the magic numbers' meaning, so a ZMAGIC file would have both the demand pageable bit and the write protected text bit set. The byte order of the target is stored on a per-file basis, so that big- and little-endian object files may be used with one another.
sections: Each section in the input file contains the name of the section, the section's original address in the object file, size and alignment information, various flags, and pointers into other BFD data structures.
symbols: Each symbol contains a pointer to the information for the object file which originally defined it, its name, its value, and various flag bits. When a BFD back end reads in a symbol table, it relocates all symbols to make them relative to the base of the section where they were defined. Doing this ensures that each symbol points to its containing section. Each symbol also has a varying amount of hidden private data for the BFD back end. Since the symbol points to the original file, the private data format for that symbol is accessible. ld can therefore operate on a collection of symbols of wildly different formats without problems. Normal global and simple local symbols are maintained on output, so an output file (no matter its format) will retain symbols pointing to functions and to global, static, and common variables. Some symbol information is not worth retaining; in a.out, type information is stored in the symbol table as long symbol names. This information would be useless to most COFF debuggers; the linker has command line switches to allow users to throw it away. There is one word of type information within the symbol, so if the format supports symbol type information within symbols (for example, COFF, IEEE, Oasys) and the type is simple enough to fit within one word (nearly everything but aggregates), the information will be preserved.
relocation level: Each canonical BFD relocation record contains a pointer to the symbol to relocate to, the offset of the data to relocate, the section the data is in, and a pointer to a relocation type descriptor. Relocation is performed by passing messages through the relocation type descriptor and the symbol pointer. Therefore, relocations can be performed on output data using a relocation method that is only available in one of the input formats. For instance, Oasys provides a byte relocation format. A relocation record requesting this relocation type would point indirectly to a routine to perform this, so the relocation may be performed on a byte being written to a 68k COFF file, even though 68k COFF has no such relocation type.
line numbers: Object formats can contain, for debugging purposes, some form of mapping between symbols, source line numbers, and addresses in the output file. These addresses have to be relocated along with the symbol information. Each symbol with an associated list of line number records points to the first record of the list. The head of a line number list consists of a pointer to the symbol, which allows finding out the address of the function whose line numbers are being described. The rest of the list is made up of pairs: offsets into the section and line numbers. Any format which can simply derive this information can pass it successfully between formats (COFF, IEEE and Oasys).
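The symbol and relocation entries above fit together as follows, shown as a toy sketch rather than BFD's real structures. Symbols hold a section-relative value plus a pointer to their section, and each relocation record points to a type descriptor whose routine does the patching. The class names, the `howto` field name, and the two relocation types are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Section:
    name: str
    address: int          # final address assigned by the linker
    contents: bytearray


@dataclass
class Symbol:
    name: str
    section: Section      # the section that contains the symbol
    value: int            # offset from the start of `section`, not an absolute address


@dataclass
class RelocType:
    """Relocation type descriptor: carries the routine that applies it."""
    name: str
    apply: Callable[[bytearray, int, int], None]  # (contents, offset, address)


@dataclass
class Reloc:
    symbol: Symbol        # symbol to relocate to
    offset: int           # where in `section` the patched data lives
    section: Section      # section the data is in
    howto: RelocType      # pointer to the type descriptor


def apply_byte(contents: bytearray, offset: int, address: int) -> None:
    """Patch a single byte with the low 8 bits of the address."""
    contents[offset] = address & 0xFF


def apply_word32_be(contents: bytearray, offset: int, address: int) -> None:
    """Patch four bytes with the address in big-endian order."""
    contents[offset:offset + 4] = (address & 0xFFFFFFFF).to_bytes(4, "big")


BYTE = RelocType("byte", apply_byte)
WORD32 = RelocType("word32-be", apply_word32_be)


def symbol_address(symbol: Symbol) -> int:
    """Section-relative value plus the section's final address."""
    return symbol.section.address + symbol.value


def perform(reloc: Reloc) -> None:
    """Relocate by calling through the descriptor; the output format is never consulted."""
    reloc.howto.apply(reloc.section.contents, reloc.offset,
                      symbol_address(reloc.symbol))
```

Because `perform` only follows the descriptor's routine pointer, a byte relocation that originated in one input format can be applied to data destined for an output format that has no such relocation type of its own.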
