<html>
<head>
  <title>Linker</title>
</head>
<body>


<h1>
  Inside the Linker 
</h1>
<div class="doc_author">
  <p>Written by <a href="mailto:kledzik@apple.com">Nick Kledzik</a></p>
</div>


<h2>
  <a name="introduction">Introduction</a>
</h2>

<p>The Darwin linker is a new generation of linker.  It is not "section" based
like traditional linkers, which mostly just interlace sections from multiple
object files into the output file.  The Darwin linker is based on "Atoms".
Traditional section-based linking works well for simple linking, but its model
makes advanced linking features difficult to implement.  Features like dead code 
stripping, reordering functions for locality, and C++ coalescing require the
linker to work at a finer grain.
</p>

<p>An atom is an indivisible chunk of code or data.  An atom has a set of
attributes, such as: name, scope, content-type, alignment, etc.  An atom also
has a list of Fixups.  A Fixup contains: a kind, an optional offset, an optional
addend, and an optional target atom.</p>

<p>The Atom model allows the linker to use standard graph theory models for 
linking data structures.  Each atom is a node, and each Fixup is an edge. 
Dead code stripping is implemented by following edges to mark
all live atoms and then deleting the non-live atoms.</p>
<br>
<h2>
  <a name="Atom model">Atom model</a>
</h2>

<p>An atom is an indivisible chunk of code or data.  Typically each
user-written function or global variable is an atom.  In addition, the compiler may
emit other atoms, such as for literal c-strings or floating point constants, or
for runtime data structures like DWARF unwind info or pointers to initializers.
</p>

<p>A simple "hello world" object file would be modeled like this:</p>
<img src="hello.png" alt="hello world graphic"/>
<p>There are two atoms: main and an anonymous atom containing the c-string
literal "hello world".  The Atom "main" has two fixups.  One is the call site
for the call to printf, and the other is a fixup for the instruction that loads
the address of the c-string literal. </p>
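<p>The two-atom graph described above can be written down as a small data
structure.  This is only an illustrative sketch in Python, not the linker's
actual C++ object model: the class and field names, the "target_name" field
used for the not-yet-resolved printf reference, and the byte offsets are
assumptions made for the example.</p>
<pre>
```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Fixup:
    kind: str                           # e.g. "call32" or "pcrel32"
    offset: int = 0                     # byte offset inside the owning atom
    addend: int = 0
    target: Optional["Atom"] = None     # edge to an atom in the same file
    target_name: Optional[str] = None   # hypothetical: name of an external target


@dataclass(eq=False)
class Atom:
    name: Optional[str] = None          # None for an anonymous atom
    content_type: str = "code"
    content: bytes = b""
    fixups: list = field(default_factory=list)


def successors(atom):
    """Atoms reachable from `atom` by following one Fixup edge."""
    return [f.target for f in atom.fixups if f.target is not None]


# The two atoms of the hello world example.
cstring = Atom(content_type="cstring", content=b"hello world\0")
main = Atom(name="_main", fixups=[
    Fixup(kind="pcrel32", offset=7, target=cstring),         # load string address
    Fixup(kind="call32", offset=14, target_name="_printf"),  # call site
])
```
</pre>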

<br>
<h2>
  <a name="File model">File model</a>
</h2>

<p>The linker views the input files as basically containers of Atoms and Fixups,
 with just a few attributes of their own.  The linker works with three kinds
of files: object files, static libraries, and dynamic libraries.  Each kind
of file has a reader object which presents the file in the model expected by 
the linker.</p>
<h4> <a>Object File</a> 
</h4>
<p>An object file is just a container of atoms.  When linking with
an object file, all of its atoms are added to the initial graph of atoms.</p>

<h4> <a>Static Library (Archive)</a> 
</h4>
<p>This is the traditional Unix static archive, which is just a collection of
object files with a "table of contents". When linking with a static library,
by default nothing is added to the initial graph of atoms. Instead, if there
are unresolved references (dangling edges) in the master graph of all atoms, 
and the table of contents for a static library says that one of the object files 
in the library defines one of the missing symbols, 
the set of atoms from that object file is added 
to the master graph of atoms.</p>
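<p>The on-demand loading of archive members can be sketched as a worklist
loop.  This is a simplified illustration, not the linker's implementation:
the function name and the shape of the "toc" and "members" arguments are
assumptions made for the example.</p>
<pre>
```python
def load_archive_members(undefined, defined, toc, members):
    """Load archive members on demand and return their names in load order.

    undefined: names with dangling edges in the master graph
    defined:   set of names already defined (updated in place)
    toc:       table of contents, symbol name to member name
    members:   member name to (names it defines, names it references)
    """
    loaded = []
    worklist = sorted(undefined)
    while worklist:
        symbol = worklist.pop()
        if symbol in defined or symbol not in toc:
            continue                      # satisfied, or not in this archive
        member = toc[symbol]
        if member in loaded:
            continue
        defines, references = members[member]
        loaded.append(member)
        defined.update(defines)
        # Atoms from the new member may introduce new dangling edges.
        worklist.extend(sorted(references - defined))
    return loaded
```
</pre>
<p>Members that define nothing the graph is missing are never loaded, which
is why an unused object file in an archive adds no atoms.</p>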

<h4> <a>Dynamic Library (Shared Object)</a> 
</h4>
<p>Dynamic libraries are unique in that they don't directly add any atoms.  
Their purpose is to check at build time that all references are resolved and
to provide a list of dynamic libraries (DT_NEEDED in ELF terms) that will be needed at runtime. 
The way this is modeled in the linker is that a dynamic library contributes
no atoms to the initial graph of atoms.  Instead (as with static libraries), if
there are unresolved references (dangling edges) in the master graph of all atoms 
and a dynamic library exports a required symbol, then a "proxy" atom is 
instantiated by the linker.  The proxy atom allows the master atom graph to have
all edges resolved and also records which dynamic library a symbol came from.</p>
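<p>A minimal sketch of proxy atom creation follows.  The dictionary layout
and the rule that the first dynamic library in link order wins are
assumptions made for illustration only.</p>
<pre>
```python
def make_proxies(undefined, dylibs):
    """Return (proxy atoms by name, names that are still undefined).

    dylibs is a list of (install name, set of exported names) in link order.
    """
    proxies = {}
    for name in sorted(undefined):
        for install_name, exports in dylibs:
            if name in exports:
                # The proxy records which library the symbol came from.
                proxies[name] = {"name": name, "definition": "proxy",
                                 "dylib": install_name}
                break                     # assumption: first library wins
    return proxies, set(undefined) - set(proxies)
```
</pre>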

<br>
<h2>
  <a name="Linking Steps">Linking Steps</a>
</h2>
<p>Through the use of abstract Atoms, the core of linking is architecture 
independent and file format independent.  All command line parsing is factored
out into a separate "options" abstraction which enables the linker to be driven
with different command line sets.</p>
<p>The overall steps in linking are:</p>
<ol>
  <li>Command line processing</li>
  <li>Parsing input files</li>
  <li>Resolving</li>
  <li>Passes/Optimizations</li>
  <li>Generate output file</li>
</ol>

<p>The Resolving and Passes steps are done purely on the master graph of atoms, 
so they have no notion of file formats such as mach-o or ELF.</p>

<h4> <a>Resolving</a> 
</h4>
<p>The resolving step takes the atom graphs from each object file and 
combines them into one master graph.  Unfortunately, it is not as simple
as appending the atom list from each file into one big list.  There are many
cases where two or more atoms need to be coalesced into one atom.
This is necessary to support C language
 "tentative definitions", C++ weak symbols for templates and inlines defined
in headers, and merging of duplicate constants such as c-strings and floating
point constants.</p>

<p>The linker supports coalescing by-name and by-content. By-name is used for
tentative definitions and weak symbols.  By-content is used for constant data
that can be merged. </p>

<p>When one atom has a reference (Fixup) to another atom, the reference also has a binding
type: by-name, direct, or indirect. A Fixup contains a tagged union whose
interpretation depends on the binding type.  If the binding type is by-name, the union holds a pointer to a c-string.  If
the binding type is direct, the union holds a pointer to an Atom.  If the binding
type is indirect, the union holds an index into a table of pointers to Atoms. Below
is a graphical representation of the binding types:</p>
<img src="bindings.png" alt="binding types graphic"/>
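<p>The three binding types can also be sketched in code.  Python has no
tagged unions, so this illustration uses a "binding" tag plus a "value" field
whose meaning depends on the tag; all names here are assumptions, not the
linker's C++ declarations.</p>
<pre>
```python
from dataclasses import dataclass
from enum import Enum


class Binding(Enum):
    BY_NAME = 1
    DIRECT = 2
    INDIRECT = 3


@dataclass(eq=False)
class Atom:
    name: str


@dataclass(eq=False)
class Fixup:
    binding: Binding
    value: object    # str if BY_NAME, Atom if DIRECT, int index if INDIRECT


def target_of(fixup, indirect_table):
    """Return the target Atom, or None if the reference is not bound yet."""
    if fixup.binding is Binding.DIRECT:
        return fixup.value
    if fixup.binding is Binding.INDIRECT:
        return indirect_table[fixup.value]   # may still be None
    return None                              # by-name: only a string so far
```
</pre>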

<p>Input file Atoms contain only direct and by-name references.  Direct 
references are used for atoms defined in the same object file for which the 
target atom is either unnamed or cannot change.  For instance, calling 
a static function in a translation unit will result in a direct reference 
to the static function's atom.  Also, the FDE (DWARF unwind info) for a function
has a direct reference to its function.  On the other hand, references to 
global symbols (e.g. a call to printf) use by-name binding in object files.
</p>

<p>The resolving process maintains some global linking "state", including:
a "symbol table" which is a map from c-string to Atom*, an indirect symbol
table which is a growable array of Atom*, and, for each kind of coalescable
constant, a map from content to Atom*.  With these data structures,
the linker walks all atoms in all input files. For each
atom, it checks if the atom should be in the symbol table or one of the 
coalescing tables.  If so, it attempts to add the atom.  If there already is
a matching atom in that table, the current atom needs to be 
coalesced with the found atom.  
</p>
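<p>The walk over all input atoms can be sketched as follows.  This is a
simplified illustration: atoms are plain dictionaries, the key names are
assumptions, and the first atom seen always wins, whereas a real linker
applies precedence rules when two atoms match.</p>
<pre>
```python
def add_atoms(atoms):
    """Combine atoms from all files; return (kept atoms, symbol table).

    Each atom is a dict with a "type" key and optional "name", "scope",
    "combine" and "content" keys.
    """
    symbol_table = {}       # name to chosen atom
    content_tables = {}     # one content-to-atom map per mergeable type
    kept = []
    for atom in atoms:
        if atom.get("combine") == "byContent":
            table = content_tables.setdefault(atom["type"], {})
            key = atom["content"]
        elif (atom.get("name") is not None
              and atom.get("scope") != "translationUnit"):
            table, key = symbol_table, atom["name"]
        else:
            kept.append(atom)     # belongs in no table, so never coalesced
            continue
        if key in table:
            continue              # coalesce with the atom already found
        table[key] = atom
        kept.append(atom)
    return kept, symbol_table
```
</pre>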

<p>To support coalescing, all references to coalescable atoms are changed to
indirect binding, and an entry is added to the indirect table which points
to the currently chosen atom.  When all input atoms have been processed by
the resolver, there should be only direct and indirect bindings left.  If
there are any NULL entries in the indirect table, that means there are  
undefined references.  The linker then looks to the supplied libraries (both
static and dynamic) to resolve those references.  
</p>
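<p>A sketch of the indirect table shows why the extra level of indirection
helps: references hold only a slot index, so choosing a different atom for a
name updates every reference at once, and empty slots identify undefined
references.  The class and method names are hypothetical.</p>
<pre>
```python
class IndirectTable:
    def __init__(self):
        self.slots = []       # growable array of atoms; None means undefined
        self.index_of = {}    # name to slot index

    def slot_for(self, name):
        """Return the slot index for `name`, creating an empty slot if new."""
        if name not in self.index_of:
            self.index_of[name] = len(self.slots)
            self.slots.append(None)
        return self.index_of[name]

    def define(self, name, atom):
        """Make `atom` the currently chosen atom for `name`."""
        self.slots[self.slot_for(name)] = atom

    def undefined_names(self):
        return sorted(name for name, index in self.index_of.items()
                      if self.slots[index] is None)
```
</pre>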

<p>Dead code stripping (if requested) is done at the end of resolving.  The
linker does a simple mark-and-sweep. It starts with "root" atoms (like "main"
in a main executable), follows each reference, and marks each Atom that
it visits as "live".  When done, all atoms not marked "live" are removed.
</p>
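<p>The mark-and-sweep can be sketched in a few lines.  Atoms are represented
here by their names and the graph by a dictionary of edges; both are
simplifications for the example.</p>
<pre>
```python
def dead_strip(atoms, roots, edges):
    """Return the atoms reachable from the roots, in their original order.

    atoms: list of atom names
    roots: names that are always live (for example "_main")
    edges: atom name to the list of names it references
    """
    live = set()
    worklist = list(roots)
    while worklist:                       # mark
        atom = worklist.pop()
        if atom in live:
            continue
        live.add(atom)
        worklist.extend(edges.get(atom, []))
    return [atom for atom in atoms if atom in live]   # sweep
```
</pre>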

<h4> <a>Passes</a> 
</h4>
<p>The Passes step
is an open-ended set of routines that each get a chance to modify or enhance
the master graph of atoms. Passes are only run if the master graph of 
atoms is completely resolved (no dangling edges). 
The current set of Passes in the Darwin linker is:</p>
<ul>
  <li>Objective-C optimizations (Apple)</li>
  <li>stub (PLT) generation</li>
  <li>GOT instantiation</li>
  <li>TLV instantiation (Apple)</li>
  <li>order_file optimization</li>
  <li>branch island generation</li>
  <li>branch shim generation</li>
  <li>dtrace probe processing (Apple)</li>
  <li>compact unwind encoding (Apple)</li>
</ul>
<p>Some of these passes are specific to Apple's runtime environments, but many
of the passes are applicable to any OS (such as generating branch islands for 
out-of-range branch instructions).</p>

<p>The general structure of a pass is to walk the master graph, inspecting each
atom and doing something.  For instance, the stub pass walks the graph looking
for atoms with call sites to proxy atoms (e.g. a call to printf).  It then
instantiates a "stub" atom (PLT entry) and a "lazy pointer" atom for each 
proxy atom needed, and these new atoms are added to the master graph.  Next,
all the noted call sites to proxy atoms are replaced with calls to the 
corresponding stub atom.</p>  
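<p>The stub pass described above can be sketched as follows.  Atoms are plain
dictionaries, and the "$stub" and "$lazy_ptr" name suffixes and the fixup
kinds are invented for the example.  For brevity the sketch retargets each
call site as it finds it instead of noting the call sites first.</p>
<pre>
```python
def run_stub_pass(atoms):
    """Route calls to proxy atoms through newly created stub atoms.

    atoms is a list of dicts with "name", "type" and "fixups" keys; each
    fixup is a dict with "kind" and "target" (an atom name).
    """
    by_name = {atom["name"]: atom for atom in atoms}
    stub_for = {}                        # proxy name to stub name
    new_atoms = []
    for atom in atoms:
        for fixup in atom["fixups"]:
            target = by_name[fixup["target"]]
            if fixup["kind"] != "call32" or target["type"] != "proxy":
                continue
            proxy = target["name"]
            if proxy not in stub_for:    # one stub per proxy, not per call
                lazy = {"name": proxy + "$lazy_ptr", "type": "lazy-pointer",
                        "fixups": [{"kind": "pointer64", "target": proxy}]}
                stub = {"name": proxy + "$stub", "type": "stub",
                        "fixups": [{"kind": "pcrel32", "target": lazy["name"]}]}
                new_atoms += [stub, lazy]
                stub_for[proxy] = stub["name"]
            fixup["target"] = stub_for[proxy]   # call the stub instead
    atoms.extend(new_atoms)
    return atoms
```
</pre>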

<h4><a>Generate Output File</a> 
</h4>
<p>Once the passes are done, the output file generator is given a sorted list
of atoms.  Its job is to create the wrapper of the executable file and place
the content of the atoms into it. 
</p>
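<p>One part of placing atom content is assigning each atom an offset that
respects its alignment.  The sketch below assumes power-of-two alignment only
and identifies atoms by illustrative names; it is not the layout algorithm of
the Darwin linker.</p>
<pre>
```python
def assign_offsets(atoms):
    """Place atoms in order; return (name to offset, total size).

    atoms is the sorted list of (name, size in bytes, alignment power).
    """
    offsets = {}
    cursor = 0
    for name, size, power in atoms:
        align = 2 ** power
        cursor = (cursor + align - 1) // align * align   # round up
        offsets[name] = cursor
        cursor += size
    return offsets, cursor
```
</pre>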


<h2>
  <a name="Future Directions">Future Directions</a>
</h2>

<h4><a>Sections</a> 
</h4>
<p>The current use of sections in mach-o .o files over-constrains the linker.
By default, the linker should preserve the section an atom is in.  But since
all sections must be contiguous in the output, that limits the ability of
the linker to order atoms for locality.  It would be helpful to enrich the
object file with the reason an atom is in its section.  For instance,
is the section found at runtime? Or was the use of a section just a quick
way to group some content together?
</p>
<p>The ELF model for sections is a little better than mach-o because ELF
sections have write and execute bits, whereas mach-o sections must be in some
segment and the segment has the write and execute bits.  
</p>

<h4><a>Mach-o Object File Format</a> 
</h4>
<p>
The messiest part of the linker is the mach-o parser. This is because mach-o
is a traditional section- and symbol-based file format.  The parser must infer
atom boundaries using two approaches.  The first is that some section types have  
well-defined content which the linker can parse into atoms (e.g.  __cstring, 
__eh_frame). The other approach is a naming convention (which the compiler follows)
by which the linker breaks sections into atoms at any non-local (not starting 
with 'L') symbol. The processing the linker has to do to parse mach-o .o files is a
significant part of the link time. 
</p>
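<p>Both approaches to inferring atom boundaries can be sketched briefly.  The
function names are hypothetical, and the sketch ignores details such as
content before the first symbol or a final string with no terminating NUL.</p>
<pre>
```python
def split_at_symbols(section_size, symbols):
    """Break a section into (name, start, end) atoms at non-local symbols.

    symbols is a list of (name, offset); names starting with 'L' are local
    labels and do not start a new atom.
    """
    starts = sorted((offset, name) for name, offset in symbols
                    if not name.startswith("L"))
    atoms = []
    for i, (offset, name) in enumerate(starts):
        is_last = i + 1 == len(starts)
        end = section_size if is_last else starts[i + 1][0]
        atoms.append((name, offset, end))
    return atoms


def split_cstrings(data):
    """Split __cstring content into one atom per NUL-terminated string."""
    return [s + b"\0" for s in data.split(b"\0")[:-1]]
```
</pre>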

<p>Given that the assembler writes object files once, whereas the linker reads
them many times (during development), it would make sense to optimize the object
file format to be something the linker can read/parse efficiently.</p>  

<h4><a>New Object File Model</a> 
</h4>
<p>LLVM has a nice model for its IR.  There are three representations:
the binary bitcode file, the in-memory object model, and a textual 
representation.  LLVM contains utility code for converting between these
representations.  The same model makes sense for atoms too.  There should be
three representations for atoms: binary file, in-memory, and textual. The Darwin 
linker already has an in-memory C++ object model for Atoms.  All we need is a 
textual representation and binary file format.
</p>
<p>Note: in the Darwin linker the binary format for input object files is  
independent of the output executable format.  That is, we could have one 
universal object file format which the linker could use as input to produce 
mach-o, ELF, or PE executables.</p>
<p>
The object file binary format should be designed to instantiate into atoms
as fast as possible.  The obvious way to do that is for the 
file format to be an array of atoms.  The linker just mmaps in the file,
looks at the header to see how many atoms there are, and instantiates that many atoms
with the atom attribute information coming from that array.  The trick is 
designing this in a way that can be extended as the Atom model evolves and new
attributes are added.
</p>
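<p>As a rough illustration of an "array of atoms" file, the sketch below
uses a header followed by fixed-size records.  The magic value, the field
layout, and the idea of storing the record size in the header (so that a
newer writer can append fields which an older reader skips) are all
assumptions, not a proposed format.</p>
<pre>
```python
import struct

HEADER = struct.Struct("=4sII")    # magic, atom count, bytes per record
RECORD = struct.Struct("=IIBBH")   # name offset, size, scope, type, alignment


def write_atoms(records):
    """Serialize a list of 5-field tuples into header plus record array."""
    out = bytearray(HEADER.pack(b"ATOM", len(records), RECORD.size))
    for record in records:
        out += RECORD.pack(*record)
    return bytes(out)


def read_atoms(buf):
    """Decode atom records from any buffer, such as an mmap of the file."""
    magic, count, stride = HEADER.unpack_from(buf, 0)
    if magic != b"ATOM" or RECORD.size > stride:
        raise ValueError("not a usable atom file")
    # Striding by the stored record size lets a newer writer append fields
    # that this reader simply skips.
    return [RECORD.unpack_from(buf, HEADER.size + i * stride)
            for i in range(count)]
```
</pre>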
<p>
In designing a textual format we want something easy for humans to read and
easy for the linker to parse.  Since an atom has lots of attributes, most of
which usually have their default value, we should define default values for 
every attribute so that those can be omitted from the text representation.
One possible format is YAML.  Here are the atoms for a simple hello world
program expressed in YAML.
</p>
<pre>
---
target-triple:   x86_64-apple-darwin11
source:

atoms:
    - name:    _main
      scope:   linkage-unit
      type:    code
      alignment: 
          power: 4
      content: [ 55, 48, 89, e5, 48, 8d, 3d, 00, 00, 00, 00, 30, c0, e8, 00, 00,
                 00, 00, 31, c0, 5d, c3 ]
      fixups:
      - offset: 07
        kind:   pcrel32
        target: 2
      - offset: 0E
        kind:   call32
        target: _printf

    - type:    c-string
      merge:   by-content
      content: [ 73, 5A, 00 ]

...
</pre>
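<p>The idea of omitting default-valued attributes can be sketched as below.
The default values chosen here are hypothetical, and the writer emits only
flat key/value lines; it does not use a real YAML library.</p>
<pre>
```python
# Hypothetical defaults; the real set would cover every atom attribute.
DEFAULTS = {"scope": "translation-unit", "type": "code", "merge": "never"}


def emit_atom(atom):
    """Render one atom as YAML-style lines, leaving out default values."""
    shown = [(key, value) for key, value in atom.items()
             if DEFAULTS.get(key) != value]
    lines = []
    for i, (key, value) in enumerate(shown):
        prefix = "- " if i == 0 else "  "
        lines.append(prefix + key + ": " + str(value))
    return "\n".join(lines)


def with_defaults(parsed):
    """Fill in every attribute that the text form left out."""
    return {**DEFAULTS, **parsed}
```
</pre>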

<p>One big use for the textual format will be writing test cases. The Darwin
linker test suite test cases are written mostly in C/C++, with a few assembly
files.  The use of C means the same test case can be compiled for different
architectures.  But writing test cases in C is problematic because the compiler 
may vary its output over time for its own optimization reasons, which may 
inadvertently disable or break the linker feature being tested. By 
writing test cases in the linker's own textual format, we can exactly specify 
every attribute of every atom and thus target specific linker logic.
</p>

<h4><a>Debug Info</a> 
</h4>
<p>Around 2005, when Apple switched from using STABS to using DWARF for debug 
information, we made a design decision to have the linker ignore DWARF in
.o files.  This improves linking performance because the linker is not
copying tons of debug info.  Instead, the linker adds "debug notes" into
the output binary that contain the paths of the original .o files. During development
the Darwin debugger will notice the debug notes and then load the DWARF
debug information from the original object files.  For release builds,
a tool named dsymutil is run on the program.  It finds the debug notes and
then the original object files, then reads, merges, and optimizes all the DWARF
debug information into one .dSYM file which can be loaded by the debugger
if needed.</p>

<p>The current way DWARF is generated is that all debug information for all
functions in a translation unit is merged and optimized into sections based 
on debug info kind.  For instance, the mapping of instructions to source line
numbers for all functions is compressed and put in one section. This does not
play well in an Atom-based file format.  One idea is to have the compiler
emit some intermediate representation of debug information (one which is 
partitioned per atom) into the Atom-based file format.  The linker could 
then have code to convert that intermediate debug info to final DWARF.
This is still an open question.</p>

<h4><a>Extending Atom attributes to ELF and XCOFF</a> 
</h4>
<p>The current set of attributes defined for Atoms in the Darwin linker
was chosen to meet the requirements of developing code to run on iOS and 
Mac OS X.  Below is a list of the attributes and their possible values.
Supporting ELF and XCOFF may just require adding more values, or
new attributes may need to be added to capture new functionality.
</p>
<ul>
  <li>Name</li>
  <li>Size</li>
  <li>Section (I'd like to get rid of this)</li>
  <li>ContentType (currently some of this comes from section)
  <ul>
	  <li>code</li>
	  <li>stub</li>
	  <li>data</li>
	  <li>zeroFill</li>
	  <li>initializerPointer</li>
	  <li>objc1Class</li>
	  <li>objc2Class</li>
	  <li>objcClassPointer</li>
	  <li>objc2CategoryList</li>
	  <li>non-lazy-pointer</li>
	  <li>lazy-pointer</li>
	  <li>constant</li>
	  <li>literal4</li>
	  <li>literal8</li>
	  <li>literal16</li>
	  <li>cstring</li>
	  <li>cstringPointer</li>
	  <li>utf16string</li>
	  <li>CFString</li>
	  <li>CFI</li>
	  <li>LSDA</li>
	  </ul>
  </li>
  <li>Scope
  <ul>
	  <li>translationUnit  (static functions)</li>
	  <li>linkageUnit      (visibility hidden)</li>
	  <li>global</li>
	  </ul>
  </li>
  <li>DefinitionKind
  <ul>
	  <li>regular</li>
	  <li>tentative         (ANSI C feature)</li>
	  <li>absolute          (assembly code feature)</li>
	  <li>proxy             (stand-in for dynamic library symbol)</li>
  </ul>
  </li>
  <li>Combine
  <ul>
	  <li>never</li>
	  <li>byName          (weak symbols)</li>
	  <li>byContent       (simple constants)</li>
	  <li>byContentAndReferences (complex constants)</li>
  </ul>
  </li>
  <li>SymbolTableStatus
  <ul>
	  <li>in</li>
	  <li>notIn              (anonymous)</li>
	  <li>inAsAbsolute       (assembly code feature)</li>
	  <li>inAndNeverStrip    (tell strip tool to leave)</li>
	  <li>inWithRandomName   (mach-o .o feature)</li>
  </ul>
  </li>
  <li>Alignment
  <ul>
	  <li>powerOfTwo</li>
	  <li>modulus</li>
  </ul>
  </li>
  <li>NeverDeadStrip (boolean)</li>
  <li>IsThumb (ARM specific)</li>
</ul>
<p>Where does dllexport fit in here?  Where do visibility protected and 
internal fit?  Protected seems like scope=global plus a rule not to make 
references to it indirect.  Internal is like hidden, plus it enables some
compiler optimizations.  I'm not sure the linker needs to know about internal.
</p>

</body>
</html>
Commit	Line	Data
ebf6f434 A	1	<html>
	2	<head>
	3	<title>Linker</title>
	4	</head>
	5	<body>
	6
	7
	8	<h1>
	9	Inside the Linker
	10	</h1>
	11	<div class="doc_author">
	12	<p>Written by <a href="mailto:kledzik@apple.com">Nick Kledzik</a></p>
	13	</div>
	14
	15
	16	<h2>
	17	<a name="introduction">Introduction</a>
	18	</h2>
	19
	20	<p>The Darwin linker is a new generation of linker. It is not "section" based
	21	like traditional linkers which mostly just interlace sections from multiple
	22	object files into the output file. The Darwin linker is based on "Atoms".
	23	Traditional section based linking work well for simple linking, but their model
	24	makes advanced linking features difficult to implement. Features like dead code
	25	stripping, reordering functions for locality, and C++ coalescing require the
	26	linker to work at a finer grain.
	27	</p>
	28
	29	<p>An atom is an indivisible chunk of code or data. An atom has a set of
	30	attributes, such as: name, scope, content-type, alignment, etc. An atom also
	31	has a list of Fixups. A Fixup contains: a kind, an optional offset, an optional
	32	addend, and an optional target atom.</p>
	33
	34	<p>The Atom model allows the linker to use standard graph theory models for
	35	linking data structures. Each atom is a node, and each Fixup is an edge.
	36	The feature of dead code stripping is implemented by following edges to mark
	37	all live atoms, and then delete the non-live atoms.</p>
	38	<br>
	39	<h2>
	40	<a name="Atom model">Atom model</a>
	41	</h2>
	42
	43	<p>An atom is an indivisible chuck of code or data. Typically each user
	44	written function or global variable is an atom. In addition, the compiler may
	45	emit other atoms, such as for literal c-strings or floating point constants, or
	46	for runtime data structures like dwarf unwind info or pointers to initializers.
	47	</p>
	48
	49	<p>A simple "hello world" object file would be modeled like this:</p>
	50	<img src="hello.png" alt="hello world graphic"/>
	51	<p>There are two atoms: main and an anonymous atom containing the c-string
	52	literal "hello world". The Atom "main" has two fixups. One is the call site
	53	for the call to printf, and the other is a fixup for the instruction that loads
	54	the address of the c-string literal. </p>
	55
	56	<br>
	57	<h2>
	58	<a name="File model">File model</a>
	59	</h2>
	60
	61	<p>The linker views the input files as basically containers of Atoms and Fixups,
	62	and just a few attributes of their own. The linker works with three kinds
	63	of files: object files, static libraries, and dynamic libraries. Each kind
	64	of file has reader object which presents the file in the model expected by
65	the linker.</p>
66	<h4> <a>Object File</a>
67	</h4>
68	An object file is just a container of atoms. When linking with
69	an object file, all atoms are added to the initial graph of atoms.
70
71	<h4> <a>Static Library (Archive)</a>
72	</h4>
73	This is the traditional unix static archive which is just a collection of
74	object files with a "table of contents". When linking with a static library,
75	by default nothing is added to the initial graph of atoms. Instead, if there
76	are unresolved references (dangling edges) in the master graph of all atoms,
77	and the table of contents for a static library says that one of the object files
78	in the library defines one of the missing symbols (dangling edge),
79	the set of atoms from the specified object file in the static library is added
80	to the master graph of atoms.
81
82	<h4> <a>Dynamic Library (Shared Object)</a>
83	</h4>
84	Dynamic libraries are unique in that the don't directly add add any atoms.
85	Their purpose is to check at build time that all references are resolved and
86	provide a list of dynamic libraries (SO_NEEDED) that will be needed at runtime.
87	The way this is modeled in the linker is that a dynamic library contributes
88	no atoms to the initial graph of atoms. Instead, (like static libraries) if
89	there are unresolved references (dangling edges) in the master graph of all atoms,
90	if a dynamic library exports a required symbol, then a "proxy" atom is
91	instantiated by the linker. The proxy atom allows the master atom graph to have
92	all edges resolved and also records from which dynamic library a symbol came.</p>
93
94	<br>
95	<h2>
96	<a name="Linking Steps">Linking Steps</a>
97	</h2>
98	<p>Through the use of abstract Atoms, the core of linking is architecture
99	independent and file format independent. All command line parsing is factored
100	out into a separate "options" abstraction which enables the linker to be driven
101	with different command line sets.</p>
102	<p>The overall steps in linking are:<p>
103	<ol>
104	<li>Command line processing</li>
105	<li>Parsing input files</li>
106	<li>Resolving</li>
107	<li>Passes/Optimizations</li>
108	<li>Generate output file</li>
109	</ol>
110
111	<p>The Resolving and Passes steps are done purely on the master graph of atoms,
112	so they have no notion of file formats such as mach-o or ELF.</p>
113
114	<h4> <a>Resolving</a>
115	</h4>
116	<p>The resolving step takes all the atoms graphs from each object file and
117	combines them into one master object graph. Unfortunately, it is not as simple
118	as appending the atom list from each file into one big list. There are many
119	cases where atoms need to be coalesced. That is, two or more atoms need to
120	be coalesced into one atom. This is necessary to support: C language
121	"tentative definitions", C++ weak symbols for templates and inlines defined
122	in headers, and for merging copies of constants like c-strings and floating
123	point constants.</p>
124
125	<p>The linker support coalescing by-name and by-content. By-name is used for
126	tentative definitions and weak symbols. By-content is used for constant data
127	that can be merged. </p>
128
129	<p>When one atom has a reference (FixUp) to another atom, there is also a binding
130	type: by-name, direct, or indirect. A Fixup contains a tagged union that if
131	the binding type is by-name, the union field is a pointer to a c-string. If
132	the binding type is direct, the union is a pointer to an Atom. If the binding
133	type is indirect, the union is a index into a table of pointers to Atoms. Below
134	is a graphical representation of the binding types:</p>
135	<img src="bindings.png" alt="binding types graphic"/>
136
137	<p>Input file Atoms contain only direct and by-name references. Direct
138	references are used for atoms defined in the same object file for which the
139	target atom is either unnamed or cannot change. For instance, calling
140	a static function in a translation unit will result in a direct reference
141	to the static functions's atom. Also the FDE (dwarf unwind info) for a function
142	has a direct reference to its function. On the other hand references to
143	global symbols (e.g. call to printf) use by-name binding in object files.
144	</p>
145
146	<p>The resolving process maintains some global linking "state", including:
147	a "symbol table" which is a map from c-string to Atom*, an indirect symbol
148	table which is a growable array of Atom*, and for each kind of coalesable
149	constants there is a content to Atom* map. With these data structures,
150	the linker walks all atoms in all input files. For each
151	atom, it checks if the atom should be in one symbol table or one of the
152	coalescing tables. If so, it attempts to add the atom. If there already is
153	a matching atom in that table, that means the current atom needs to be
154	coalesced with the found atom.
155	</p>
156
157	<p>To support coalescing, all references to coalesable atoms are changed to
158	indirect binding and an entry is added to the indirect table which points
159	to the current chosen atom. When all input atoms have been processed by
160	the resolver, there should be only direct and indirect bindings left. If
161	there are any NULL entries in the indirect table, that means there are
162	undefined references. The linker then looks to the supplied libraries (both
163	static and dynamic) to resolve those references.
164	</p>
165
166	<p>Dead code stripping (if requested) is done at the end of resolving. The
167	linker does a simple mark-and-sweep. It starts with "root" atoms (like "main"
168	in a main executable) and follows each references and marks each Atom that
169	it visits as "live". When done, all atoms not marked "live" are removed.
170	</p>
171
172	<h4> <a>Passes</a>
173	</h4>
174	<p>The Passes step
175	is an open ended set of routines that each get a change to modify or enhance
176	the master graph of atoms. Passes are only run if the master graph of
177	atoms is completely resolved (no dangling edges).
178	The current set of Passes in the Darwin linker are:</p>
179	<ul>
180	<li>Objective-C optimizations (Apple)</li>
181	<li>stub (PLT) generation</li>
182	<li>GOT instantiation</li>
183	<li>TLV instantiation (Apple)</li>
184	<li>order_file optimization</li>
185	<li>branch island generation</li>
186	<li>branch shim generation</li>
187	<li>dtrace probe processing (Apple)</li>
188	<li>compact unwind encoding (Apple)</li>
189	</ul>
190	<p>Some of these passes are specific to Apple's runtime environments. But many
191	of the passes are applicable to any OS (such as generating branch island for
192	out of range branch instructions).</p>
193
194	<p>The general structure of a pass is to walk the master graph inspecting each
195	atom and doing something. For instance, the stub pass, walks the graph looking
196	for atoms with call sites to proxy atoms (e.g. call to printf). It then
197	instantiates a "stub" atom (PLT entry) and a "lazy pointer" atom for each
198	proxy atom needed, and these new atoms are added to the master graph. Next
199	all the noted call sites to proxy atoms are replaced with calls to the
200	corresponding stub atom.</p>
201
202	<h4><a>Generate Output File</a>
203	</h4>
204	<p>Once the passes are done, the output file generator is given a sorted list
205	of atoms. Its job is to create the executable content file wrapper and place
206	the content of the atoms into it.
207	</p>
208
209
210	<h2>
211	<a name="Future Directions">Future Directions</a>
212	</h2>
213
214	<h4><a>Sections</a>
215	</h4>
216	<p>The current use of sections in mach-o .o files over-constrains the linker.
217	By default, the linker should preserve the section an atom is in. But since
218	all sections must be contiguous in the output, that limits the ability of
219	the linker to order atoms for locality. It would be helpful to enrich the
220	object file with with reason something is in the section it is. For instance,
221	is the section found at runtime? Or was the use of a section just a quick
222	way to group some content together?
223	</p>
224	<p>The ELF model for sections is a little better than mach-o because ELF
225	sections have write and execute bits, whereas mach-o sections must be in some
226	segment and the segment has the write and execute bits.
227	</p>
228
229	<h4><a>Mach-o Object File Format</a>
230	</h4>
231	<p>
232	The messiest part of the linker is the mach-o parser. This is because mach-o
233	is a traditional section and symbols based file format. The parser must infer
234	atom boundaries using two approaches. The first is that some section types have
235	well defined content which the linker can parse into atoms (e.g. __cstring,
236	__eh_frame). The other approach is a naming convention (which the compiler follows)
237	by which the linker breaks sections into atoms at any non-local (not starting
238	with 'L') symbol. The processing the linker has to do parse mach-o .o files is a
239	significant part of the link time.
240	</p>
241
242	<p>Given that the assembler writes object files once, whereas the linker reads
243	them many times (during development), it would make sense to optimize the object
244	file format to be something the linker can read/parse efficiently.</p>
245
246	<h4><a>New Object File Model</a>
247	</h4>
248	<p>LLVM has a nice model for its IR. There are three representations:
249	the binary bit code file, the in-memory object model, and a textual
250	representation. LLVM contains utility possible code for converting between these
251	representations. The same model makes sense for atoms too. There should be
252	three representations for atoms: binary file, in-memory, and textual. The Darwin
253	linker already has an in-memory C++ object model for Atoms. All we need is a
254	textual representation and binary file format.
255	</p>
256	<p>Note: in the darwin linker the binary format for input object files is
257	independent of the output executable format. That is, we could have one
258	universal object file format which the linker could use as input to produce
259	mach-o, ELF, or PE executables.</p>
260	<p>
261	The object file binary format should be designed to instantiate into atoms
262	as fast as possible. The obvious way to do that is that the
263	file format would be an array of atoms. The linker just mmaps in the file and
264	looks at the header to see how many atoms there and instantiate that many atoms
265	with the atom attribute information coming from that array. The trick is
266	designing this in a way that can be extended as the Atom mode evolves and new
267	attributes are added.
268	</p>
269	<p>
270	In designing a textual format we want something easy for humans to read and
271	easy for the linker to parse. Since an atom has lots of attributes most of
272	which are usually just the default, we should define default values for
273	every attribute so that those can be omitted from the text representation.
274	One possile format is YAML. Here is the atoms for a simple hello world
275	program expressed in YAML.
276	</p>
<pre>
---
target-triple:   x86_64-apple-darwin11
source:

atoms:
    - name:    _main
      scope:   linkage-unit
      type:    code
      alignment:
          power: 4
      content: [ 55, 48, 89, e5, 48, 8d, 3d, 00, 00, 00, 00, 30, c0, e8, 00, 00,
                 00, 00, 31, c0, 5d, c3 ]
      fixups:
      - offset: 07
        kind:   pcrel32
        target: 2
      - offset: 0E
        kind:   call32
        target: _fprintf

    - type:    c-string
      merge:   by-content
      content: [ 73, 5A, 00 ]

...
</pre>
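<p>The idea of omitting defaulted attributes from the text form can be
sketched as follows.  The field names and default values here are assumptions
chosen for illustration (the document does not specify the defaults), and the
output is only YAML-like, not produced by a real YAML library.</p>

```python
from dataclasses import dataclass, fields


@dataclass
class Atom:
    # Field names and default values are assumptions for illustration.
    name: str = ""
    scope: str = "translation-unit"
    type: str = "code"
    merge: str = "never"
    dead_strip: str = "normal"


def non_default_attributes(atom):
    """Return only the attributes whose value differs from the declared default."""
    return {
        f.name: getattr(atom, f.name)
        for f in fields(atom)
        if getattr(atom, f.name) != f.default
    }


def to_text(atom):
    """Render one atom as a YAML-like list item, omitting defaulted attributes."""
    items = list(non_default_attributes(atom).items())
    if not items:
        return "- {}"
    lines = []
    for index, (key, value) in enumerate(items):
        prefix = "- " if index == 0 else "  "
        lines.append(f"{prefix}{key.replace('_', '-')}: {value}")
    return "\n".join(lines)
```

<p>With these assumed defaults, an anonymous c-string atom needs only its
<code>type</code> and <code>merge</code> lines, much like the second atom in
the example above.</p>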

<p>One big use for the textual format will be writing test cases.  The Darwin
linker test suite test cases are written mostly in C/C++, plus a few assembly
files.  The use of C means the same test case can be compiled for different
architectures.  But writing test cases in C is problematic because the compiler
may vary its output over time for its own optimization reasons, which may
inadvertently disable or break the linker feature being tested.  By
writing test cases in the linker's own textual format, we can exactly specify
every attribute of every atom and thus target specific linker logic.
</p>

<h4><a>Debug Info</a>
</h4>
<p>Around 2005, when Apple switched from using STABS to using DWARF for debug
information, we made a design decision to have the linker ignore DWARF in
.o files.  This improves linking performance because the linker is not
copying tons of debug info.  Instead, the linker adds "debug notes" into the
output binary that contain the paths of the original .o files.  During development
the Darwin debugger will notice the debug notes and then load the DWARF
debug information from the original object files.  For release builds,
a tool named dsymutil is run on the program.  It finds the debug notes and
then the original object files, then reads, merges, and optimizes all the DWARF
debug information into one .dSYM file which can be loaded by the debugger
if needed.</p>

<p>The current way DWARF is generated is that all debug information for all
functions in a translation unit is merged and optimized into sections based
on debug info kind.  For instance, the mapping of instructions to source line
numbers for all functions is compressed and put in one section.  This does not
play well in an Atom based file format.  One idea is to have the compiler
emit some intermediate representation of the debug information (one which is
partitioned per atom) into the Atom based file format.  The linker could
then have code to convert that intermediate debug info into final DWARF.
This is still an open question.</p>

<h4><a>Extending Atom attributes to ELF and XCOFF</a>
</h4>
<p>The current set of attributes defined for Atoms in the Darwin linker
was chosen to meet the requirements of developing code to run on iOS and
Mac OS X.  Below is a list of the attributes and their possible values.
Supporting ELF and XCOFF may just require adding more values, or new
attributes may need to be added to capture new functionality.
</p>
<ul>
  <li>Name</li>
  <li>Size</li>
  <li>Section (I'd like to get rid of this)</li>
  <li>ContentType (currently some of this comes from section)
    <ul>
      <li>code</li>
      <li>stub</li>
      <li>data</li>
      <li>zeroFill</li>
      <li>initializerPointer</li>
      <li>objc1Class</li>
      <li>objc2Class</li>
      <li>objcClassPointer</li>
      <li>objc2CategoryList</li>
      <li>non-lazy-pointer</li>
      <li>lazy-pointer</li>
      <li>constant</li>
      <li>literal4</li>
      <li>literal8</li>
      <li>literal16</li>
      <li>cstring</li>
      <li>cstringPointer</li>
      <li>utf16string</li>
      <li>CFString</li>
      <li>CFI</li>
      <li>LSDA</li>
    </ul>
  </li>
  <li>Scope
    <ul>
      <li>translationUnit (static functions)</li>
      <li>linkageUnit (visibility hidden)</li>
      <li>global</li>
    </ul>
  </li>
  <li>DefinitionKind
    <ul>
      <li>regular</li>
      <li>tentative (ANSI C feature)</li>
      <li>absolute (assembly code feature)</li>
      <li>proxy (stand-in for dynamic library symbol)</li>
    </ul>
  </li>
  <li>Combine
    <ul>
      <li>never</li>
      <li>byName (weak symbols)</li>
      <li>byContent (simple constants)</li>
      <li>byContentAndReferences (complex constants)</li>
    </ul>
  </li>
  <li>SymbolTableStatus
    <ul>
      <li>in</li>
      <li>notIn (anonymous)</li>
      <li>inAsAbsolute (assembly code feature)</li>
      <li>inAndNeverStrip (tell strip tool to leave)</li>
      <li>inWithRandomName (mach-o .o feature)</li>
    </ul>
  </li>
  <li>Alignment
    <ul>
      <li>powerOfTwo</li>
      <li>modulus</li>
    </ul>
  </li>
  <li>NeverDeadStrip (boolean)</li>
  <li>IsThumb (ARM specific)</li>
</ul>
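<p>To make the list above concrete, here is a minimal sketch of how a few of
these attributes might be modeled, and how the Combine attribute could drive
coalescing.  The class layout and the <code>coalesce</code> helper are
hypothetical; they assume that byName atoms merge on their name and byContent
atoms merge on their bytes, and they leave out byContentAndReferences and the
real rules a linker applies when choosing among weak definitions.</p>

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    TRANSLATION_UNIT = "translationUnit"
    LINKAGE_UNIT = "linkageUnit"
    GLOBAL = "global"


class Combine(Enum):
    NEVER = "never"
    BY_NAME = "byName"
    BY_CONTENT = "byContent"


@dataclass
class Atom:
    # Only a small subset of the attributes listed above; defaults are assumed.
    name: str = ""
    content: bytes = b""
    scope: Scope = Scope.TRANSLATION_UNIT
    combine: Combine = Combine.NEVER


def coalesce(atoms):
    """Keep the first atom per merge key; Combine.NEVER atoms are always kept."""
    seen = set()
    kept = []
    for atom in atoms:
        if atom.combine is Combine.BY_NAME:
            key = ("name", atom.name)
        elif atom.combine is Combine.BY_CONTENT:
            key = ("content", atom.content)
        else:
            kept.append(atom)
            continue
        if key not in seen:
            seen.add(key)
            kept.append(atom)
    return kept
```

<p>In this sketch two identical c-string atoms marked byContent collapse to
one, while two atoms marked never are both kept even if they share a name.</p>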
<p>Where does dllexport fit in here?  Where do protected and internal
visibility fit?  Protected seems like scope=global plus a rule that
references to it are not indirected.  Internal is like hidden, plus it enables
some compiler optimizations.  I'm not sure the linker needs to know about internal.
</p>

</body>
</html>