Low Level Programming

Introduction

Model of computation is a set of basic operations and their respective costs.
Most models of computations are also abstract machines. i.e church Lambda calculus, turing machines, von neumann
Computer Architecture describes the functionality, organization and implementation of computer systems.
Von Neumann was robust and easy to program hence its popularity.
Desc:
- This consists of one processor, one memory bank connected to a common bus
- CPU execute instructions fetched from memory by a control unit.
- ALU perfroms the needed calcs.
- Memory stores only bits and stores both encoded instructions and data to operate on.
- One byte = 8 bits
Assembly Language for a chosen processor is a pl consisting of mnemonics for each possible binary encoded instruction(machine code).
An architecture does not always define a precise instruction set unlike a model of computation.

Introduction 2(Electronics)

Electric charge - causes matter to experience a force - Coulombs
Electric current - Flow of charge - Amps
Voltage - electric potential difference between two points, voltage increases current, current pressure - Volts
Resistance - Difficulty of current flow through a material - Ohms
One Amp corresponds to one coulomb per second.
Ohms Law - Current flowing from one point to another is equal to voltage across the points divided by the resistance between them. I =V/R.
AC vs DC
Kirchhoff's Voltage Law - Sum of volatages around a circuit is zero.
Resistance serially connected , total resistance is just the sum.
Light Emitting Diodes (LED) - lights up when current flows via it.
Diode - electronic component that allows current to flow through it in one direction, low resistance in one direction(allow) and high in the other(impede)
Forward voltage - voltage amount dropped across LED when current flows through it.
Transistor - device used to switch or amplify current.
Types: BJTs - Bipolar Junction Transistor, FETs - Field Effective Transistors.
BJTs have three terminals: base, collector and emitter.
Two types of BJTs: NPN and PNP. Differ in how they respond when current is applied at base terminal.
NPN - applying a small current at the base allows a larger current to flow from the collector to the emitter.
Logic gates - circuit elements that implement logical functions where inputs and outputs are high and low voltages.
Combinational Logic circuit - combine logic gates in such a way that output is a function of the present inputs.
Sequential logic - output is a function of both present and past inputs.
Integrated circuits
Ics in Dual in-line package
Resistor-Transistor Logic circuits, Diode-Transistor Logic, Transistor-transistor Logic
Pinout diagram.

Von Neumann Arch Extensions

Registers:
- Memory cells placed directly on the CPU chip.
- Faster but more complicated and expensive.
- Don't use the bus.
- Registers are based on transistors while main memory uses condensers.
- Could implement main memory as registers are but:
  - Registers are more expensive.
  - Instructions encode the register number as part of their code hence more registers equals bigger instruction sizes.
  - Registers add complexity to circuits to address them, more complex circuits are harder to speed up, not easy to set up a large register file to work on 5ghz
- Locality of reference
  - Temporal locality - accesses to one address are likely to be close in time.
  - Spatial locality - accesses X, the next one wil likely be close to X
- General purpose registers
  - They are interchangeable and can be used in many different commands.
  - Names can be r1,r2,r3 ..... r15, represent meaning they bear for some special instruction i.e rc1 - cycle, rcx - cycle counter, rax- accumulator
  - registers do not have addresses as they are implemented differently to main memory
  - Some registers have system wide importance hence only modified by the OS. i.e rip register - stores address of next instruction. rflags - current program state.
  - Registers for floating point numbers and specialised parallel instructions(SIMD). i.e multimedia xmm0 - xmm15.
  - SIMD(Single Instruction Multiple Data) allow a single instruction to be applied simultaneously to multiple data items.
- System registers
  - designed to be used specifically by the OS.
  - They don't hold values in computations instead store information required by system-wide data structures.
  - i.e cr2,cr3 - virtual memory, cr8 - interrupts.
Hardware Stack:
- Stack is a ds in general that supports pushing and popping.
- Used in computations and store of local variables and implement function call sequence.
- There is hardware support for such data structure.
- Useful to implement function calls in higher-level languages, save the context of computations.
Interrupts:
- Allows one to change execution order based on events external to the program itself.
- Execution suspended, Registers saved and CPU executes corresponding handling routine.
- i.e signal from external device, zero division, invalid instruction, privilege instruction in non-privileged environment.
Protection rings:
- Mechanism designed to limit the applications capabilities for security and robustness reasons.
- Invented for Multics OS predecessor to Unix.
- Each ring corresponds to certain privilege level.
- Each instruction type is linked with one or more privilege levels and is not executable on others.
- A CPU is always in a state corresponding to one of the so-called protection rings.
- Each defines a set of allowed instructions.
Virtual Memory:
- This is an abstraction over physical memory, which helps distribute it between programs in a safe and effective way.
- Also isolates programs from one another.


- Parallelism will be key and exploitation of locality of data.
- CPU speeds increased faster than memory speeds.
- Memory keeps getting further with the introduction of new levels of caches.
 
- CPUs with caches...as transistor density increased, cache capabilities were integrated onto CPUs.
- GPUs grew out of need to provide hardware support for displays as use of graphics use grew.
- GPU speed also increase faster tha memory speed.
- GPU also integrated onto CPUs as transistor density increased.
- NIC(Network Integration Controller) capabilities also integrated onto CPUs.

- Work/Time = Work/Instructions(Path Length) * Instruction/Cycle(IPC) * Cycle/Time(Frequency)
- Better Algorithm - Same work with fewer instructions
- Compiler can optimize for fewer instructions, choose those with better IPC.
- Cache efficient algorithms: Higher IPC.
- Vectorization: same work with fewer instructions.
- Parallelization: more instructions per cycle.

Unix

Unix postulates that everything is a file in the sense that looks like a stream of bytes.
Abstract things like:
- data access on a hard drive/ssd.
- data exchange between programs.
- interaction with external devices.
OS purpose is to abstract and manage resources via a set of routines to handle communication with devices, files, programs.
A programs cannot bypass the OS and interact with resources directly.
Communication happens via system calls provided to user applications.
Unix identifies a file with its descriptor as soon as it is opened by a program...this is nothing but an integer value.
File is opened by invoking the open system call however 3 other files opened stdin, stdout, stderror. 0, 1, 2.
Write system call invoked for stdout - writes a given amount of bytes from memory starting at a given address to a file with a given descriptor