Low Level Programming
Introduction
- A model of computation is a set of basic operations and their respective costs.
- Most models of computation are also abstract machines, e.g. Church's lambda calculus, Turing machines, the von Neumann machine.
- Computer architecture describes the functionality, organization and implementation of computer systems.
- The von Neumann architecture was robust and easy to program, hence its popularity.
- Description:
- It consists of one processor and one memory bank connected to a common bus.
- The CPU executes instructions fetched from memory by a control unit.
- The ALU (arithmetic logic unit) performs the needed calculations.
- Memory stores only bits; it holds both the encoded instructions and the data they operate on.
- One byte = 8 bits
- The assembly language for a chosen processor is a programming language consisting of mnemonics for each possible binary-encoded instruction (machine code).
- Unlike a model of computation, an architecture does not always define a precise instruction set.
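A minimal sketch of the stored-program idea described above, assuming a made-up accumulator machine. The opcodes, the mnemonics and the `assemble` and `run` helpers are hypothetical and do not correspond to any real instruction set. One memory list holds both the encoded instructions and the data, a loop plays the role of the control unit, and the addition stands in for the ALU.

```python
# Toy von Neumann machine: instructions and data share one memory.
# Opcodes and mnemonics are invented for illustration only.
HALT, LOAD, ADD, STORE = 0, 1, 2, 3
MNEMONICS = {"halt": HALT, "load": LOAD, "add": ADD, "store": STORE}


def assemble(lines):
    """Translate 'mnemonic operand' lines into a flat list of numbers."""
    code = []
    for line in lines:
        parts = line.split()
        code.append(MNEMONICS[parts[0]])
        code.append(int(parts[1]) if len(parts) > 1 else 0)
    return code


def run(memory):
    """Fetch, decode and execute until HALT; returns the final memory."""
    pc = 0   # address of the next instruction
    acc = 0  # single accumulator register
    while True:
        opcode, operand = memory[pc], memory[pc + 1]  # fetch
        pc += 2
        if opcode == LOAD:
            acc = memory[operand]
        elif opcode == ADD:
            acc = acc + memory[operand]  # the "ALU" step
        elif opcode == STORE:
            memory[operand] = acc
        elif opcode == HALT:
            return memory
        else:
            raise ValueError(f"invalid opcode {opcode}")


program = assemble(["load 8", "add 9", "store 10", "halt"])
# Code occupies cells 0..7; data lives in cells 8..10 of the same memory.
memory = program + [2, 3, 0]
result = run(memory)
print(result[10])  # prints 5
```

Because code and data sit in the same memory, nothing but the program counter distinguishes an instruction from a number; that is the property the notes attribute to the von Neumann design.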
Introduction 2 (Electronics)
- Electric charge: causes matter to experience a force. Unit: coulomb (C).
- Electric current: the flow of charge. Unit: ampere (A).
- Voltage: the electric potential difference between two points; it acts as the "pressure" that pushes current, so more voltage across the same resistance means more current. Unit: volt (V).
- Resistance: how difficult it is for current to flow through a material. Unit: ohm (Ω).
- One ampere corresponds to one coulomb per second.
- Ohm's law: the current flowing from one point to another equals the voltage across the points divided by the resistance between them, I = V/R.
- AC vs DC: alternating current periodically reverses direction, direct current flows in one direction only.
- Kirchhoff's voltage law: the sum of the voltages around any closed loop of a circuit is zero.
- Resistors connected in series: the total resistance is the sum of the individual resistances.
- Light-emitting diode (LED): lights up when current flows through it.
- Diode: an electronic component that allows current to flow in one direction only; it has low resistance in one direction (allow) and high resistance in the other (impede).
- Forward voltage: the voltage dropped across an LED when current flows through it.
- Transistor: a device used to switch or amplify current.
- Types: BJTs (bipolar junction transistors) and FETs (field-effect transistors).
- BJTs have three terminals: base, collector and emitter.
- Two types of BJT: NPN and PNP. They differ in how they respond when current is applied at the base terminal.
- NPN: applying a small current at the base allows a larger current to flow from the collector to the emitter.
- Logic gates: circuit elements that implement logical functions, where inputs and outputs are high and low voltages.
- Combinational logic: logic gates combined so that the output is a function of the present inputs only.
- Sequential logic: the output is a function of both present and past inputs.
- Integrated circuits (ICs).
- ICs in a dual in-line package (DIP).
- Resistor-transistor logic (RTL), diode-transistor logic (DTL) and transistor-transistor logic (TTL) circuits.
- Pinout diagram.
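The difference between combinational and sequential logic noted above can be sketched in software, assuming 1 and 0 stand for high and low voltage. The function names and the `SRLatch` class are illustrative only, and the latch is modelled behaviourally rather than as cross-coupled gates. All gates are built from a single NAND primitive.

```python
def nand(a, b):
    """NAND gate: output is low only when both inputs are high."""
    return 0 if (a and b) else 1


def not_(a):
    return nand(a, a)


def and_(a, b):
    return not_(nand(a, b))


def or_(a, b):
    return nand(not_(a), not_(b))


def xor_(a, b):
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))


def half_adder(a, b):
    """Combinational: (sum, carry) depends only on the present inputs."""
    return xor_(a, b), and_(a, b)


class SRLatch:
    """Sequential: the output also depends on what was stored earlier."""

    def __init__(self):
        self.q = 0  # stored bit

    def step(self, s, r):
        if s and r:
            raise ValueError("set and reset must not both be high")
        if s:
            self.q = 1
        elif r:
            self.q = 0
        return self.q  # unchanged when both inputs are low


latch = SRLatch()
print(half_adder(1, 1))  # prints (0, 1)
print(latch.step(1, 0), latch.step(0, 0))  # prints 1 1
```

The half adder returns the same result for the same inputs every time, whereas the latch returns 1 for the inputs (0, 0) only because it was set on the previous step.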
Von Neumann Architecture Extensions
- Registers:
- Memory cells placed directly on the CPU chip.
- Faster than main memory, but more complicated and expensive.
- They don't use the bus.
- Registers are built from transistors, while main memory uses capacitors (condensers).
- Main memory could be implemented the same way as registers, but:
- Registers are more expensive.
- Instructions encode the register number as part of their code, hence more registers means bigger instructions.
- Registers need extra circuitry to address them, and more complex circuits are harder to speed up; it is not easy to build a large register file that works at 5 GHz.
- Locality of reference
- Temporal locality: accesses to the same address are likely to be close together in time.
- Spatial locality: after an access to address X, the next access will likely be to an address close to X.
- General purpose registers
- They are interchangeable and can be used in many different commands.
- Names can be numeric (r1, r2, r3, ..., r15) or reflect the meaning a register bears for some special instructions, e.g. rcx (cycle counter), rax (accumulator).
- Registers do not have addresses, as they are implemented differently from main memory.
- Some registers have a special role, e.g. rip stores the address of the next instruction and rflags reflects the current program state; registers of system-wide importance can only be modified by the OS.
- There are also registers for floating-point numbers and specialised parallel (SIMD) instructions, e.g. the multimedia registers xmm0 to xmm15.
- SIMD (Single Instruction, Multiple Data) allows a single instruction to be applied simultaneously to multiple data items.
- System registers
- Designed to be used specifically by the OS.
- They don't hold values used in computations; instead they store information required by system-wide data structures.
- E.g. cr2, cr3 (virtual memory), cr8 (interrupts).
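The point above that more registers means bigger instructions can be made concrete with a small sketch. It assumes a hypothetical three-operand instruction format with an 8-bit opcode; the layout and the `encode` helper are invented for illustration and do not match any real encoding.

```python
OPCODE_BITS = 8  # assumed opcode width for this made-up format


def register_field_bits(num_registers):
    """Bits needed to name one of `num_registers` registers."""
    return max(1, (num_registers - 1).bit_length())


def encode(opcode, dst, src1, src2, num_registers):
    """Pack opcode | dst | src1 | src2 and return (word, size in bits)."""
    bits = register_field_bits(num_registers)
    word = opcode
    for reg in (dst, src1, src2):
        if not 0 <= reg < num_registers:
            raise ValueError(f"no such register: {reg}")
        word = (word << bits) | reg
    return word, OPCODE_BITS + 3 * bits


for count in (16, 32, 256):
    _, size = encode(1, 3, 1, 2, count)
    print(count, "registers ->", size, "bits per instruction")
```

With 16 registers each operand field takes 4 bits (20 bits in total for this format); with 256 registers each field takes 8 bits (32 bits in total), so the same instruction grows purely because of register numbering.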
- Hardware Stack:
- A stack in general is a data structure that supports pushing and popping.
- It is used in computations, to store local variables, and to implement the function call sequence.
- There is hardware support for this data structure.
- It is useful for implementing function calls in higher-level languages and for saving the context of computations.
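A minimal model of a hardware stack, assuming a stack that grows towards lower addresses in 8-byte slots, as it does on x86-64. The `ToyStack` class, the starting address and the `call_and_return` helper are hypothetical names used only for this sketch.

```python
class ToyStack:
    WORD = 8  # bytes per slot, assuming a 64-bit machine

    def __init__(self, top=0x1000):
        self.memory = {}  # address -> stored value
        self.sp = top     # stack pointer register

    def push(self, value):
        self.sp -= self.WORD          # stack grows downwards
        self.memory[self.sp] = value

    def pop(self):
        if self.sp not in self.memory:
            raise IndexError("stack is empty")
        value = self.memory.pop(self.sp)
        self.sp += self.WORD
        return value


def call_and_return(stack, return_address, local_value):
    """Mimic a function call: save where to resume, use a local, return."""
    stack.push(return_address)  # "call" saves the return address
    stack.push(local_value)     # the callee stores a local variable
    stack.pop()                 # the callee discards its local
    return stack.pop()          # "ret" restores the saved address


stack = ToyStack()
print(hex(call_and_return(stack, 0x400123, 42)))  # prints 0x400123
```

The caller gets back exactly the address it saved, and the stack pointer ends where it started, which is what lets calls nest to any depth.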
- Interrupts:
- Allow the execution order to change based on events external to the program itself.
- Execution is suspended, the registers are saved and the CPU executes the corresponding handling routine.
- E.g. a signal from an external device, division by zero, an invalid instruction, a privileged instruction executed in a non-privileged environment.
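The suspend, save, handle, resume sequence above can be sketched with a toy CPU. Everything here (the `ToyCPU` class, the event names, the two-field instruction tuples) is invented for illustration. As a simplification, the toy resumes at the instruction after the one that triggered the event, which is not how every real interrupt or fault behaves.

```python
class ToyCPU:
    """Toy CPU whose run loop can be diverted to an interrupt handler."""

    def __init__(self):
        self.registers = {"acc": 0, "pc": 0}
        self.handlers = {}  # event name -> handling routine
        self.log = []       # events seen, kept for inspection

    def interrupt(self, event):
        saved = dict(self.registers)  # save the interrupted context
        self.handlers[event](self)    # run the handling routine
        self.registers = saved        # restore it and resume

    def run(self, program):
        while self.registers["pc"] < len(program):
            op, arg = program[self.registers["pc"]]
            self.registers["pc"] += 1
            if op == "add":
                self.registers["acc"] += arg
            elif op == "div" and arg != 0:
                self.registers["acc"] //= arg
            elif op == "div":
                self.interrupt("zero_division")
            else:
                self.interrupt("invalid_instruction")


def make_handler(name):
    def handler(cpu):
        cpu.log.append(name)
        cpu.registers["acc"] = -1  # the handler clobbers a register on purpose
    return handler


cpu = ToyCPU()
for event in ("zero_division", "invalid_instruction"):
    cpu.handlers[event] = make_handler(event)
cpu.run([("add", 10), ("div", 0), ("bogus", 0), ("add", 5)])
print(cpu.registers["acc"], cpu.log)
```

The handlers overwrite `acc`, yet the program still finishes with 15 in it, because the saved registers are restored before execution resumes.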
- Protection rings:
- A mechanism designed to limit an application's capabilities for security and robustness reasons.
- Invented for the Multics OS, a predecessor of Unix.
- Each ring corresponds to a certain privilege level.
- Each instruction type is linked with one or more privilege levels and is not executable on others.
- A CPU is always in a state corresponding to one of the so-called protection rings.
- Each ring defines a set of allowed instructions.
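A small sketch of the ring check described above, assuming the x86 numbering in which ring 0 is the most privileged and ring 3 the least. The instruction table is illustrative: `hlt` is a real privileged x86 instruction, while `load_page_table` is a made-up name, and `execute` and `PrivilegeFault` are hypothetical.

```python
# Instruction -> least-privileged ring that may still execute it.
# The table is an assumption made for this example.
MAX_RING = {"add": 3, "mov": 3, "hlt": 0, "load_page_table": 0}


class PrivilegeFault(Exception):
    """Raised when an instruction is not allowed in the current ring."""


def execute(instruction, current_ring):
    if instruction not in MAX_RING:
        raise ValueError(f"unknown instruction: {instruction}")
    if current_ring > MAX_RING[instruction]:
        raise PrivilegeFault(f"{instruction} is not allowed in ring {current_ring}")
    return f"{instruction} executed in ring {current_ring}"


print(execute("add", 3))
print(execute("hlt", 0))
try:
    execute("hlt", 3)
except PrivilegeFault as fault:
    print("fault:", fault)
```

On real hardware the fault would be delivered through the interrupt mechanism to the OS rather than as a language-level exception.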
- Virtual Memory:
- This is an abstraction over physical memory, which helps distribute it between programs in a safe and effective way.
- Also isolates programs from one another.
- Parallelism and exploiting locality of data will be key.
- CPU speeds increased faster than memory speeds.
- Memory keeps getting further away from the CPU as new levels of cache are introduced.
- CPUs with caches: as transistor density increased, cache capabilities were integrated onto CPUs.
- GPUs grew out of the need to provide hardware support for displays as the use of graphics grew.
- GPU speed also increased faster than memory speed.
- GPUs were also integrated onto CPUs as transistor density increased.
- NIC (Network Interface Controller) capabilities were also integrated onto CPUs.
- Work/Time = Work/Instruction (path length) * Instructions/Cycle (IPC) * Cycles/Time (frequency)
- Better Algorithm - Same work with fewer instructions
- Compiler can optimize for fewer instructions, choose those with better IPC.
- Cache efficient algorithms: Higher IPC.
- Vectorization: same work with fewer instructions.
- Parallelization: more instructions per cycle.
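The Work/Time relation above can be tried out with made-up numbers. The figures below (one million work items, a 3 GHz clock, the instruction counts and IPC values) are assumptions chosen only to show how each factor moves the result, and `work_per_second` is a hypothetical helper.

```python
def work_per_second(work_items, instructions, ipc, frequency_hz):
    """Work/Time = Work/Instruction * Instructions/Cycle * Cycles/Time."""
    work_per_instruction = work_items / instructions
    return work_per_instruction * ipc * frequency_hz


ITEMS = 1_000_000
CLOCK = 3_000_000_000  # 3 GHz, assumed

baseline = work_per_second(ITEMS, 1_000_000_000, 1.0, CLOCK)
# Vectorization: same work with a quarter of the instructions.
vectorized = work_per_second(ITEMS, 250_000_000, 1.0, CLOCK)
# Cache-efficient algorithm: same instructions, but a higher IPC.
cache_friendly = work_per_second(ITEMS, 1_000_000_000, 2.0, CLOCK)

print("vectorized speedup:", round(vectorized / baseline, 2))
print("cache-friendly speedup:", round(cache_friendly / baseline, 2))
```

Shortening the path length and raising IPC both multiply throughput, and they compound when applied together.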
Unix
- Unix postulates that everything is a file, in the sense that it looks like a stream of bytes.
- This abstracts things like:
- data access on a hard drive/ssd.
- data exchange between programs.
- interaction with external devices.
- The purpose of the OS is to abstract and manage resources via a set of routines that handle communication with devices, files and programs.
- A program cannot bypass the OS and interact with resources directly.
- Communication happens via system calls provided to user applications.
- Unix identifies a file by its descriptor as soon as it is opened by a program; the descriptor is nothing but an integer value.
- A file is opened by invoking the open system call; however, three files are already open when a program starts: stdin, stdout and stderr, with descriptors 0, 1 and 2.
- The write system call, e.g. invoked for stdout, writes a given number of bytes from memory, starting at a given address, to the file with a given descriptor.
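The descriptor-based calls above can be exercised from Python, whose `os.open`, `os.write`, `os.read` and `os.close` functions are thin wrappers over the corresponding system calls. This is a sketch for a Unix-like system; the helper names and the temporary file name `demo.txt` are made up for the example.

```python
import os
import tempfile


def write_message(fd, message):
    """Write bytes to an open descriptor; returns the number written."""
    return os.write(fd, message)


def roundtrip(message):
    """Open a file, write through its descriptor, then read it back."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "demo.txt")
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o600)
    try:
        written = os.write(fd, message)
    finally:
        os.close(fd)
    fd = os.open(path, os.O_RDONLY)
    try:
        data = os.read(fd, len(message))
    finally:
        os.close(fd)
    os.remove(path)
    os.rmdir(directory)
    return fd, written, data


if __name__ == "__main__":
    # Descriptor 1 is stdout, already open when the program starts.
    write_message(1, b"hello via descriptor 1\n")
    fd, written, data = roundtrip(b"stream of bytes")
    print(fd, written, data)
```

The descriptor returned by `os.open` is a plain integer, and it is usually 3 or higher because 0, 1 and 2 are already taken by stdin, stdout and stderr.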