A multiprocessor is a computer that uses two or more processing units under integrated control. Multiprocessing, correspondingly, is the use of two or more CPUs within a single computer. As the name indicates, a multiprocessor system can support more than one processor at the same time. The processors are usually organized in parallel, so a large number of instructions can be executed at the same time, including the same instructions applied several times over to different data. A related definition describes multiprocessing as the sharing of program execution among several interconnected microprocessors, using tightly coupled or loosely coupled technology. A multiprocessing workload is often described as two activities carried out simultaneously: one processor performs an interactive task such as editing while another handles the data processing.
One hardware realization is a multiprocessor device that places a plurality of processors on a single semiconductor chip: a first group of processors coupled to a first bus, a second group of processors coupled to a second bus, a first external bus interface to which the first bus is coupled, and a second external bus interface to which the second bus is coupled.
The term multiprocessing is also used to refer to a computer that has many independent processing elements. The processing elements are almost full computers in their own right. The main difference is that they have been freed from the encumbrance of communication with peripherals.
MULTIPROCESSORS IN TERMS OF ARCHITECTURE
Processors are usually built from small- and medium-scale integrated circuits (ICs), which may contain a small or a large number of transistors. Multiprocessors are also distinguished by their computer architecture. Most common multiprocessor systems today use a symmetric multiprocessing (SMP) architecture. In the case of multi-core processors, the SMP architecture applies to the cores, treating them as separate processors. SMP systems allow any processor to work on any task no matter where the data for that task are located in memory; with proper operating system support, SMP systems can easily move tasks between processors to balance the workload efficiently.
- Increased processing power
- Scale resource use to application requirements
Additional operating system responsibilities
- All processors remain busy
- Even distribution of processes throughout the system
- All processors work on consistent copies of shared data
- Execution of related processes synchronized
- Mutual exclusion enforced
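The goal of keeping all processors busy with an even distribution of work can be illustrated with a small scheduling sketch. This is only one possible policy (a greedy "least-loaded first" assignment), not the method of any particular operating system; the function name assign_tasks and the task costs are hypothetical.

```python
import heapq


def assign_tasks(task_costs, num_processors):
    """Greedily assign each task to the currently least-loaded processor."""
    # Min-heap of (current_load, processor_id): the lightest processor pops first.
    heap = [(0, p) for p in range(num_processors)]
    heapq.heapify(heap)
    assignment = {p: [] for p in range(num_processors)}

    # Placing the most expensive tasks first tends to give a more even split.
    for task_id, cost in sorted(enumerate(task_costs), key=lambda t: -t[1]):
        load, p = heapq.heappop(heap)
        assignment[p].append(task_id)
        heapq.heappush(heap, (load + cost, p))

    loads = {p: sum(task_costs[t] for t in tasks) for p, tasks in assignment.items()}
    return assignment, loads


# Hypothetical workload: six tasks with different costs, three processors.
assignment, loads = assign_tasks([5, 3, 8, 2, 7, 4], 3)
print(loads)
```

With this assumed workload the three processors end up with loads of 10, 10 and 9 units, so no processor sits idle while another is overloaded.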
Multiprocessing is a type of processing in which two or more processors work together to process more than one program simultaneously. Systems that have more than one processor are therefore known as multiprocessor systems.
In a master/slave multiprocessor system, one processor is the master and the others are slaves. If a slave processor fails, the master can assign its task to another slave processor. If the master fails, however, the entire system fails, because the master is the central part of the multiprocessor. All processors share the hard disk, the memory, and other storage devices.
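The reassignment of work away from a failed slave can be sketched as a small simulation. This is a toy model under stated assumptions: the master is represented by the function itself, processors are plain strings, and the names run_master_slave, slaves, and failed are hypothetical.

```python
from collections import deque


def run_master_slave(tasks, slaves, failed):
    """Master hands out tasks round-robin; tasks sent to a failed slave are reassigned."""
    healthy = [s for s in slaves if s not in failed]
    if not healthy:
        raise RuntimeError("no healthy slave processors left")

    pending = deque(tasks)
    completed = {s: [] for s in healthy}
    turn = 0
    while pending:
        task = pending.popleft()
        slave = slaves[turn % len(slaves)]
        turn += 1
        if slave in failed:
            # The master detects the failure and puts the task back at the
            # front of the queue so the next slave in turn picks it up.
            pending.appendleft(task)
            continue
        completed[slave].append(task)
    return completed


result = run_master_slave(
    tasks=["t0", "t1", "t2", "t3", "t4", "t5"],
    slaves=["cpu1", "cpu2", "cpu3"],
    failed={"cpu2"},
)
print(result)
```

In this run every task still completes even though cpu2 has failed. The sketch has no way to survive a failure of the master, which mirrors the single point of failure described above.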
Examples of multiprocessors
1. Quad-Processor Pentium Pro
- SMP, bus interconnection.
- 4 x 200 MHz Intel Pentium Pro processors.
- 8 + 8 KB L1 cache per processor.
- 512 KB L2 cache per processor.
- Snoopy cache coherence.
- Compaq, HP, IBM, NetPower.
- Windows NT, Solaris, Linux, etc.
2. SGI Origin 2000
- NUMA, hypercube interconnection.
- Up to 128 (64 x 2) MIPS R10000 processors.
- 32 + 32 KB L1 cache per processor.
- 4 MB L2 cache per processor.
- Distributed directory-based cache coherence.
- Automatic page migration/replication.
- SGI IRIX with Pthreads
Classifications of multiprocessor architecture
- Nature of data path
- Interconnection scheme
- How processors share resources
A) Distributed-Memory Architectures
- Separate address space for each processor.
- Processors communicate via message passing.
B) Shared-Memory Architectures
- Single address space shared by all processors.
- Processors communicate by memory read/write.
- SMP or NUMA.
- Cache coherence is an important issue.
1. Classifying Sequential and Parallel Architectures (DATA PATH)
- Stream: sequence of bytes
- Data stream
- Instruction stream
- Flynn’s classifications:
MISD multiprocessing: MISD multiprocessing offers mainly the advantage of redundancy, since multiple processing units perform the same tasks on the same data, reducing the chances of incorrect results if one of the units fails. MISD architectures may involve comparisons between processing units to detect failures. Apart from the redundant and fail-safe character of this type of multiprocessing, it has few advantages, and it is very expensive. It does not improve performance. It can be implemented in a way that is transparent to software. It is used in array processors and is implemented in fault-tolerant machines.
MIMD multiprocessing: MIMD multiprocessing architecture is suitable for a wide variety of tasks in which completely independent and parallel execution of instructions touching different sets of data can be put to productive use. For this reason, and because it is easy to implement, MIMD predominates in multiprocessing.
Processing is divided into multiple threads, each with its own hardware processor state, within a single software-defined process or within multiple processes. Insofar as a system has multiple threads awaiting dispatch (either system or user threads), this architecture makes good use of hardware resources.
MIMD does raise issues of deadlock and resource contention, however, since threads may collide in their access to resources in an unpredictable way that is difficult to manage efficiently. MIMD requires special coding in the operating system of a computer but does not require application changes unless the programs themselves use multiple threads (MIMD is transparent to single-threaded programs under most operating systems, if the programs do not voluntarily relinquish control to the OS). Both system and user software may need to use software constructs such as semaphores (also called locks or gates) to prevent one thread from interfering with another if they should happen to cross paths in referencing the same data. This gating or locking process increases code complexity, lowers performance, and greatly increases the amount of testing required, although not usually enough to negate the advantages of multiprocessing.
Similar conflicts can arise at the hardware level between processors (cache contention and corruption, for example), and must usually be resolved in hardware, or with a combination of software and hardware (e.g., cache-clear instructions).
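The software-level locking described above can be sketched with Python's standard threading module. This is a minimal illustration, not a model of any specific operating system: the function run_counter is hypothetical, and CPython threads are used here only to show the locking discipline, not to demonstrate true parallel speedup.

```python
import threading


def run_counter(num_threads, increments):
    """Several threads update one shared counter, guarded by a lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            # Only one thread at a time may execute this read-modify-write,
            # so no update to the shared counter is lost.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


print(run_counter(num_threads=4, increments=10_000))  # prints 40000
```

The lock makes the result deterministic, at the price the text mentions: every increment now pays for acquiring and releasing the lock.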
SISD multiprocessing: In a single instruction stream, single data stream computer, one processor sequentially processes instructions, and each instruction processes one data item.
SIMD multiprocessing: In a single instruction stream, multiple data stream computer, one processor handles a stream of instructions, each one of which can perform calculations in parallel on multiple data locations. SIMD multiprocessing is well suited to parallel or vector processing, in which a very large set of data can be divided into parts that are individually subjected to identical but independent operations. A single instruction stream directs the operation of multiple processing units to perform the same manipulations simultaneously on potentially large amounts of data. For certain types of computing applications, this type of architecture can produce enormous increases in performance, in terms of the elapsed time required to complete a given task. However, a drawback to this architecture is that a large part of the system falls idle when programs or system tasks are executed that cannot be divided into units that can be processed in parallel.
2. Interconnection scheme
Describes how the system’s components, such as processors and memory modules, are connected
- Consists of nodes (components or switches) and links (connections)
- Parameters used to evaluate interconnection schemes
- Node degree
- Bisection width
- Network diameter
- Cost of the interconnection scheme
- Shared bus
- Single communication path between all nodes
- Contention can build up for shared bus
- Fast for small multiprocessors
- Form supernodes by connecting several components with a shared bus; use a more scalable interconnection scheme to connect supernodes
- Dual-processor Intel Pentium
Shared bus multiprocessor organization.
- Crossbar-switch matrix
- Separate path from every processor to every memory module (or from every node to every other node when nodes consist of both processors and memory modules)
- High fault tolerance, performance and cost
- Sun UltraSPARC-III
Crossbar-switch matrix multiprocessor organization.
- Hypercube
- An n-dimensional hypercube has 2^n nodes, in which each node is connected to n neighbor nodes
- Faster, more fault tolerant, but more expensive than a 2-D mesh network
- nCUBE (up to 8192 processors)
- Multistage network
- Switch nodes act as hubs routing messages between nodes
- Cheaper, less fault tolerant, worse performance compared to a crossbar-switch matrix
- IBM POWER4
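The evaluation parameters listed earlier (node degree, bisection width, network diameter) can be worked out concretely for the hypercube scheme. The sketch below assumes the usual convention that nodes are numbered 0 to 2^n - 1 and that two nodes are linked when their binary addresses differ in exactly one bit; the function names are hypothetical.

```python
def hypercube_neighbors(node, n):
    """Neighbors of `node` in an n-dimensional hypercube: flip each of its n address bits."""
    return [node ^ (1 << k) for k in range(n)]


def hop_distance(a, b):
    """Minimum number of links between two nodes = number of differing address bits."""
    return bin(a ^ b).count("1")


def hypercube_metrics(n):
    """Evaluation parameters of an n-dimensional hypercube."""
    nodes = 2 ** n
    return {
        "nodes": nodes,
        "node_degree": n,  # every node has exactly n links
        "network_diameter": max(hop_distance(0, b) for b in range(nodes)),
        "bisection_width": nodes // 2,  # links cut when splitting along one dimension
    }


print(hypercube_neighbors(0b101, 3))  # [4, 7, 1]
print(hypercube_metrics(3))
```

For n = 3 this gives 8 nodes, node degree 3, diameter 3 and bisection width 4. With n = 13 the node count is 8192, which matches the nCUBE figure quoted in the list.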
COUPLING of PROCESSORS
- Tightly coupled systems
- Processors share most resources including memory
- Communicate over shared buses using shared physical memory
- Tasks and/or processors communicate in a highly synchronized fashion
- Communicate through a common shared memory
- Shared memory system
- Loosely coupled systems
- Processors do not share most resources
- Most communication through explicit messages or shared virtual memory (although not shared physical memory)
- Tasks or processors do not communicate in a synchronized fashion
- Communicate by passing message packets
- Overhead for data exchange is high
- Distributed memory system
Comparison between them
- Loosely coupled systems: more flexible, fault tolerant, scalable
- Tightly coupled systems: more efficient, less burden to operating system programmers
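The two communication styles can be contrasted in a short sketch. Threads and a queue from Python's standard library stand in for processors and an interconnect here, which is a simplifying assumption; the function names are hypothetical.

```python
import queue
import threading


def run_workers(worker, chunks):
    """Start one worker thread per chunk and wait for all of them."""
    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def sum_shared_memory(chunks):
    """Tightly coupled style: workers write into one shared total, guarded by a lock."""
    total = 0
    lock = threading.Lock()

    def worker(chunk):
        nonlocal total
        partial = sum(chunk)
        with lock:  # synchronized access to the shared location
            total += partial

    run_workers(worker, chunks)
    return total


def sum_message_passing(chunks):
    """Loosely coupled style: workers share no variables and send results as messages."""
    mailbox = queue.Queue()

    def worker(chunk):
        mailbox.put(sum(chunk))  # explicit message to the collector

    run_workers(worker, chunks)
    return sum(mailbox.get() for _ in chunks)


data = [[1, 2, 3], [4, 5], [6]]
print(sum_shared_memory(data), sum_message_passing(data))  # 21 21
```

Both versions compute the same result. The shared-memory version needs explicit synchronization, while the message-passing version pays for packaging and delivering each result.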
Multiprocessor Operating System Organizations
Classify systems based on how processors share operating system responsibilities
- Master/slave
- Separate kernels
- Symmetrical organization
1) Master/slave organization
- Master processor executes the operating system
- Slaves execute only user processes
- Hardware asymmetry
- Low fault tolerance
- Good for computationally intensive jobs
2) Separate kernels organization
- Each processor executes its own operating system
- Some globally shared operating system data
- Loosely coupled
- Catastrophic failure unlikely, but failure of one processor results in termination of processes on that processor
- Little contention over resources
Example: Tandem system
3) Symmetrical organization
- Operating system manages a pool of identical processors
- High amount of resource sharing
- Need for mutual exclusion
- Highest degree of fault tolerance of any organization
- Some contention for resources
Example: BBN Butterfly
Memory Access Architectures
- Can classify multiprocessors based on how processors share memory
- Goal: Fast memory access from all processors to all memory
- Contention in large systems makes this impractical
1) Uniform memory access (UMA) multiprocessor
- All processors share all memory
- Access time to any memory page is nearly the same for all processors and all memory modules (disregarding cache hits)
- Typically uses shared bus or crossbar-switch matrix
- Also called symmetric multiprocessing (SMP)
- Small multiprocessors (typically two to eight processors)
2) Nonuniform memory access (NUMA) multiprocessor
- Each node contains a few processors and a portion of system memory, which is local to that node
- Access to local memory faster than access to global memory (rest of memory)
- More scalable than UMA (fewer bus collisions)
3) Cache-only memory architecture (COMA) multiprocessor
- Physically interconnected in the same way as a NUMA system
- Local memory vs. global memory
- Main memory is viewed as a cache and called an attraction memory (AM)
- Allows system to migrate data to node that most often accesses it at granularity of a memory line (more efficient than a memory page)
- Reduces the number of cache misses serviced remotely
- Duplicated data items
- Complex protocol to ensure all updates are received at all processors
4) No-remote-memory-access (NORMA) multiprocessor
- Does not share physical memory
- Some implement the illusion of shared physical memory, called shared virtual memory (SVM)
- Loosely coupled
- Communication through explicit messages
- Distributed systems
- Not networked system
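The difference between local and remote access in a NUMA system can be made concrete with a weighted average. The sketch below is a simple model only; the latencies of 100 ns (local) and 300 ns (remote) are assumed values chosen for illustration, not measurements of any real machine, and the function name is hypothetical.

```python
def average_access_time(local_fraction, local_ns, remote_ns):
    """Expected memory access time when a fraction of accesses hit local memory."""
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns


# Assumed latencies: 100 ns for local memory, 300 ns for remote memory.
for fraction in (0.5, 0.9, 1.0):
    t = average_access_time(fraction, local_ns=100.0, remote_ns=300.0)
    print(f"{fraction:.0%} local accesses -> about {t:.0f} ns on average")
```

Under these assumptions, raising the share of local accesses from 50% to 90% lowers the average from about 200 ns to about 120 ns. This is why NUMA and COMA systems try to keep data on the node that uses it most.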
Features of the multiprocessors
- Many multiprocessors share one address space
- They conceptually share memory.
- A multiprocessor is sometimes implemented just like a multicomputer
- Communication is implicit: processors communicate through reads and writes to the shared memory.
- Multiprocessors are usually characterized by complex behaviour.
- The MPU handles high-level tasks, including axis profile generation, host/controller communication, user-program execution, and safety event handling.
- Advanced real time algorithm and special filter execution
- Digital encoder input up to 20 million counts per second
- Analog Sin-Cos encoder input and interpolation up to a multiplication factor of 65,536
- Fast, high-rate Position Event Generator (PEG) to trigger external devices
- Fast position registration (Mark) to capture position on input event
- High resolution analog or PWM command generation to the drive
- High Speed Synchronous Interface channel (HSSI) to manage fast communication with remote axes or I/O expansion modules
Advantages of Multiprocessor Systems
Some advantages of multiprocessor systems are as follows:
- Reduced Cost: Multiple processors share the same resources. A separate power supply or motherboard for each chip is not required, which reduces the cost.
- Increased Reliability: The reliability of the system is also increased. The failure of one processor does not stop the other processors, though it will slow down the machine. Several mechanisms are required to achieve increased reliability. If a processor fails, a job running on that processor also fails. The system must be able to reschedule the failed job or to alert the user that the job was not successfully completed.
- More work: As the number of processors increases, more work can be done in less time. If more than one processor cooperates on a task, the task takes less time to complete.
- If functions are divided among several processors, the failure of one processor does not halt the system, although it does reduce its speed. Suppose a system has five processors and one of them fails: the remaining four processors share the work of the failed processor. The system keeps running, but more slowly.
- Multiprocessor systems save more money than multiple single-processor systems, because the processors can share power supplies, memory, and peripherals.
- Increased Throughput: An increase in the number of processors completes the work in less time. It is important to note that doubling the number of processors does not halve the time to complete a job. This is due to the overhead of communication between processors, contention for shared resources, and so on.
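The observation that doubling the processors does not halve the completion time can be illustrated with a simple cost model. The model below is an assumption made for this sketch, not a formula from the source: a job has a serial part, a part that divides evenly across processors, and a communication overhead that grows with each added processor. All numbers and names are hypothetical.

```python
def completion_time(processors, serial=10.0, parallel=90.0, overhead_per_cpu=1.0):
    """Assumed model: serial work + evenly divided parallel work + communication overhead."""
    return serial + parallel / processors + overhead_per_cpu * (processors - 1)


def speedup(processors, **model):
    """How many times faster the job runs than on a single processor."""
    return completion_time(1, **model) / completion_time(processors, **model)


for p in (1, 2, 4, 8):
    print(f"{p} processors: time {completion_time(p):.2f}, speedup {speedup(p):.2f}")
```

With these assumed values, one processor needs 100 time units and two processors need 56, not 50. Eight processors need 28.25 units, a speedup of roughly 3.5 rather than 8.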
Morris Mano, “Computer System Architecture”, Prentice Hall, 2007