Role of the Linker in C++

This is only a rough draft. A clearer explanation is coming soon.

Main Roles of the Linker

The role of the linker is to combine the object files produced by the compiler into a single file that can be loaded into memory and executed. In the early days of programming, linking was done by hand; nowadays it is done automatically by a program called a linker.

Linking comes in two types: static linking and dynamic linking.

A static linker takes several object files and combines them into one final executable file. Each object file contains separate sections for instructions, initialized data, and uninitialized data. The linker joins matching sections together and produces a single executable that can be loaded into memory and run.

Dynamic linking uses shared libraries to avoid copying the same code into every program. A shared library is loaded into memory at load time or run time and linked with the program at that point, rather than when the executable is built. This work is done by a dynamic linker.

Static Linker:

1. Symbol Resolution

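Before we dive into symbol resolution, we first need to understand what symbols and symbol tables are. Note that symbols are not the same as program variables: a symbol is a name the linker can see, such as a function or a global variable, while ordinary local variables inside a function do not appear in the symbol table.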

Each object file has a symbol table that stores information about the names (symbols) it defines and uses. In linking, symbols fall into three types: global symbols, which are defined in the module and can be used by other modules; external symbols, which are used in the module but defined elsewhere; and local symbols, which are defined and used only inside the same module. Local symbols cannot be accessed by other modules.
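As a rough illustration of the three kinds, the sketch below shows how C declarations usually map to symbol table entries. The names request_count, bump, and next_request_id are made up for this example, and puts stands in for an external symbol because it is defined in the C library rather than in this file.

```c
#include <stdio.h>

/* Global symbol: defined here and visible to other modules. */
int request_count = 0;

/* Local symbol: `static` gives internal linkage, so other modules cannot refer to it. */
static int bump(int value) {
    return value + 1;
}

/* Global symbol (a function) that uses the local helper and an external symbol. */
int next_request_id(void) {
    request_count = bump(request_count);
    puts("allocated a request id");   /* external symbol: defined in the C library */
    return request_count;
}
```

Running nm on the compiled object file typically marks the global names with an uppercase letter, the static helper with a lowercase letter (if the compiler did not inline it away), and puts with U for undefined.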

The linker resolves external references between object files.

Example:

file1.c

extern int x;

void func() { x = 5; }

file2.c

int x;

  • The compiler alone cannot resolve x, because file1.c only declares it
  • The linker matches the reference to x in file1.o with its definition in file2.o
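The matching step can be sketched as a lookup over the definitions exported by all object files. This is a simplified model, not how a production linker stores its data: struct definition and resolve_symbol are hypothetical names introduced for illustration.

```c
#include <stddef.h>
#include <string.h>

/* One definition exported by some object file (hypothetical model). */
struct definition {
    const char *name;    /* symbol name, e.g. "x" */
    const char *module;  /* object file that defines it, e.g. "file2.o" */
};

/* Return the module that defines `name`, or NULL if no module defines it. */
const char *resolve_symbol(const struct definition *defs, size_t count,
                           const char *name) {
    for (size_t i = 0; i < count; i++) {
        if (strcmp(defs[i].name, name) == 0) {
            return defs[i].module;
        }
    }
    return NULL;  /* a real linker would report an undefined reference here */
}
```

With definitions for func in file1.o and x in file2.o, looking up x returns "file2.o", while looking up a name nobody defines returns NULL.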

2. Address Binding

The linker assigns final (virtual) memory addresses to:

  • Code (functions)
  • Global/static variables
  • Constants

Example:

  • main() → address 0x400500
  • x → address 0x601040
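One way to picture address assignment is placing the output sections one after another from a base address while respecting each section's alignment. The sketch below assumes that simple model; struct output_section and assign_addresses are hypothetical names, and real linkers also follow linker scripts and page and segment rules that are ignored here.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified output section (hypothetical model). */
struct output_section {
    const char *name;
    uint64_t size;     /* size in bytes */
    uint64_t align;    /* required alignment, must be a power of two */
    uint64_t address;  /* filled in by assign_addresses */
};

/* Round `value` up to the next multiple of `align` (a power of two). */
static uint64_t align_up(uint64_t value, uint64_t align) {
    return (value + align - 1) & ~(align - 1);
}

/* Place sections back to back starting at `base`; returns the first free address. */
uint64_t assign_addresses(struct output_section *sections, size_t count,
                          uint64_t base) {
    uint64_t cursor = base;
    for (size_t i = 0; i < count; i++) {
        cursor = align_up(cursor, sections[i].align);
        sections[i].address = cursor;
        cursor += sections[i].size;
    }
    return cursor;
}
```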

3. Relocation

  • Adjusts addresses in machine code based on the final memory layout
  • Fixes jump and call instructions

Example:

call func

  • The assembler leaves a placeholder in the call instruction
  • The linker replaces it with the actual function address
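To make the patching step concrete, here is a minimal sketch of a PC-relative fix-up of the kind used for a call on x86-64, where the stored value is target + addend - address of the field. The function name apply_pc_relative and its parameters are assumptions for this example; a real linker reads these values from relocation entries.

```c
#include <stdint.h>

/* Patch a 4-byte little-endian PC-relative field inside `code`.
 *   code_address : run-time address of code[0]
 *   field_offset : offset of the 4-byte placeholder inside code
 *   target       : final address of the called function
 *   addend       : constant from the relocation entry (commonly -4 for call) */
void apply_pc_relative(uint8_t *code, uint64_t code_address,
                       uint64_t field_offset, uint64_t target, int64_t addend) {
    uint64_t place = code_address + field_offset;  /* address of the field */
    uint32_t value = (uint32_t)(target + (uint64_t)addend - place);
    for (int i = 0; i < 4; i++) {
        code[field_offset + i] = (uint8_t)(value >> (8 * i));  /* low byte first */
    }
}
```

For a call opcode at address 0x1000 with its placeholder at offset 1 and a target at 0x2000, the stored displacement is 0xFFB, which is the distance from the next instruction (0x1005) to the target.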

4. Combining Object Files

The linker merges:

  • Code sections (.text)
  • Data sections (.data, .bss)
  • Read-only constants (.rodata)

This produces a single executable file.
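Merging a section can be sketched as laying each object file's contribution end to end and remembering where each one starts. The struct contribution type and merge_section function are hypothetical, and alignment padding between contributions is left out to keep the idea visible.

```c
#include <stddef.h>
#include <stdint.h>

/* One input object's contribution to a section (hypothetical model). */
struct contribution {
    const char *object;  /* e.g. "main.o" */
    uint64_t size;       /* bytes this object adds to the section */
    uint64_t offset;     /* filled in: start within the merged section */
};

/* Lay the contributions end to end and return the merged section size. */
uint64_t merge_section(struct contribution *parts, size_t count) {
    uint64_t total = 0;
    for (size_t i = 0; i < count; i++) {
        parts[i].offset = total;
        total += parts[i].size;
    }
    return total;
}
```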


5. Linking Libraries

  • Static libraries (.a or .lib) → code copied into the executable at link time
  • Dynamic/shared libraries (.so / .dll) → references resolved at load time or run time

Example:

printf("Hello\n");

  • Linker resolves printf from the C standard library

6. Producing the Executable

After symbol resolution and relocation, the linker writes out an executable file (e.g., a.out, program.exe) that is ready to be loaded into memory and run.

Example Flow

Source code:

// main.c

extern void greet(void);

int main(void) { greet(); return 0; }

// greet.c

#include <stdio.h>

void greet(void) { printf("Hello\n"); }

Compilation steps:

  1. gcc -c main.c → main.o
  2. gcc -c greet.c → greet.o
  3. gcc main.o greet.o -o program → the linker combines them into an executable

Object File

An object file typically contains the following parts:

  • Header
  • Section table
  • Text section
  • Data section
  • BSS section
  • Symbol table
  • Relocation information
  • Debugging information

1. File Header

The file header describes the object file itself.

Contains:

  • File type (relocatable, executable, shared)
  • Target architecture (x86-64, ARM)
  • Endianness
  • Offset to section table

Example:

ELF64, x86-64, relocatable
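For ELF specifically, these header fields sit at fixed positions near the start of the file, so they can be read with a few byte accesses. The sketch below assumes the standard ELF layout (magic bytes, class at index 4, data encoding at index 5, type at offset 16, machine at offset 18); struct elf_summary and read_elf_summary are hypothetical names for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct elf_summary {
    int is_64bit;          /* 1 for ELF64, 0 for ELF32 */
    int is_little_endian;  /* 1 for little-endian data encoding */
    uint16_t type;         /* 1 = relocatable, 2 = executable, 3 = shared object */
    uint16_t machine;      /* e.g. 62 = x86-64, 40 = ARM */
};

/* Combine two bytes into a 16-bit value using the file's byte order. */
static uint16_t read_u16(const uint8_t *p, int little_endian) {
    return little_endian ? (uint16_t)(p[0] | (p[1] << 8))
                         : (uint16_t)(p[1] | (p[0] << 8));
}

/* Read the leading ELF header fields from `bytes`.
 * Returns 0 on success, -1 if the buffer is too short or is not ELF. */
int read_elf_summary(const uint8_t *bytes, size_t length,
                     struct elf_summary *out) {
    static const uint8_t magic[4] = {0x7f, 'E', 'L', 'F'};
    if (length < 20 || memcmp(bytes, magic, sizeof magic) != 0) {
        return -1;
    }
    if (bytes[4] < 1 || bytes[4] > 2 || bytes[5] < 1 || bytes[5] > 2) {
        return -1;  /* unknown class or data encoding */
    }
    out->is_64bit = (bytes[4] == 2);
    out->is_little_endian = (bytes[5] == 1);
    out->type = read_u16(bytes + 16, out->is_little_endian);
    out->machine = read_u16(bytes + 18, out->is_little_endian);
    return 0;
}
```

Fed the first 20 bytes of an object file built for x86-64, this would be expected to report a 64-bit, little-endian, relocatable file with machine value 62.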


2. Section Table

A table that lists all sections in the file.

Each entry describes:

  • Section name
  • Section size
  • File offset
  • Attributes (read/write/execute)

The linker uses this to locate everything.


3. Sections (Core Content)

🔹 .text — Code Section

  • Contains machine instructions
  • Read-only, executable

Example:

mov eax, 5


🔹 .data — Initialized Global Data

  • Global/static variables with initial values

int x = 10;

Stored in .data.


🔹 .bss — Uninitialized Data

  • Global/static variables with no initializer or initialized to zero

int y;

No actual data is stored in the file, only the size to reserve. The memory is filled with zeros when the program is loaded.


🔹 .rodata — Read-Only Data

  • Constants and string literals

"Hello World"
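Putting the four sections side by side, the sketch below annotates where each declaration typically ends up with GCC or Clang on an ELF target. The variable and function names are invented for this example, and exact placement can vary with compiler, flags, and platform.

```c
int initialized_total = 10;            /* .data: nonzero initial value */
int zeroed_total;                      /* .bss: no initializer, starts at zero */
static int call_count;                 /* .bss: static, also zero at start */
const int step = 5;                    /* .rodata: read-only constant */
const char *greeting = "Hello World";  /* pointer is writable data, string bytes in .rodata */

/* .text: the machine code for this function */
int advance(void) {
    call_count++;
    zeroed_total += step;
    return initialized_total + zeroed_total;
}
```

Compiling this with gcc -c and inspecting the result with objdump -h or size is one way to see the section sizes change as declarations are added or removed.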


4. Symbol Table (.symtab)

The symbol table lists all identifiers that matter to the linker.

Each symbol entry contains:

  • Name (main, x)
  • Type (function, variable)
  • Scope, called binding in ELF (local/global/weak)
  • Section index
  • Offset within section

Example:

main → .text + 0x20

x    → .data + 0x00
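Once the linker has chosen a base address for each section, a symbol's final address is that base plus the symbol's offset. The sketch below models this with a hypothetical struct symbol and symbol_address function; the section base addresses used in the note after it are assumed values, picked only so the results line up with the illustrative addresses used earlier.

```c
#include <stdint.h>

/* Simplified symbol table entry: a name plus a (section, offset) pair. */
struct symbol {
    const char *name;
    int section_index;  /* index into the array of section base addresses */
    uint64_t offset;    /* offset within that section */
};

/* Final address = base address chosen for the section + offset within it. */
uint64_t symbol_address(const struct symbol *sym, const uint64_t *section_bases) {
    return section_bases[sym->section_index] + sym->offset;
}
```

Assuming .text is placed at 0x4004e0 and .data at 0x601040, main at .text + 0x20 lands at 0x400500 and x at .data + 0x00 lands at 0x601040.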


5. Relocation Tables (.rel.* or .rela.*)

Used when the code references symbols whose addresses are not yet known.

Example:

call printf

The assembler:

  • Inserts a placeholder address
  • Creates a relocation entry

Relocation entry says:

At offset X, fix address of symbol printf

The linker resolves this later.


6. Debug Information (Optional)

If the code is compiled with debug flags (-g), the object file also contains sections like:

  • .debug_info
  • .debug_line
  • .debug_abbrev

These are used by debuggers (gdb, lldb).


Example: Simple Object File Contents

Source:

int x = 5;

int main() { return x; }

Object file contains:

  • .text → machine code for main
  • .data → value 5
  • .symtab → symbols main, x
  • .rela.text → relocation for the reference to x (.rel.text on targets that use REL entries)