
Solution of Assignment Synopsis & Project Dissertation Report


PRODUCT DETAILS

Online-Typing-and-Filling

Title Name Amity Solved Assignment Computer Architecture and Parallel Processing
University AMITY
Service Type Assignment
Course MCA
Semester Semester-V Course: MCA
Short Name or Subject Code COMPUTER ARCHITECTURE AND PARALLEL PROCESSING
Commerce line item Type Semester-V Course: MCA
Product Assignment of MCA Semester-V (AMITY)

Solved Assignment


Questions:

COMPUTER ARCHITECTURE AND PARALLEL PROCESSING

Section A 

Q1.    Explain Flynn's classification of various computers based on notions of instruction and data streams.

Q2.    What are static and dynamic networks of a multiprocessor? Give two examples of each.

Q3.    Differentiate between RISC and CISC processors.

Q4.    Explain a nonlinear pipeline processor with a suitable example.

Q5.    Differentiate between multiprocessors and multicomputers.

 

Section B

Case Study

ABC.com is a website where you can watch original movie DVDs.  It currently maintains the list of visitors and details of their visits.  The website gets almost 1 billion visitors every day, and at midnight it processes all the information.  Processing takes almost 5 hours, and the system remains down for that long, causing the company a huge loss.  The company decided to buy a supercomputer for faster analysis.  The supercomputer has 10 processors.  Now the need is to design parallel algorithms for the following problems: we now have the list of visitors for the day and the number of movies they watched.

Q1.      Design a parallel algorithm that would sort the names alphabetically.

Q2.      Now write a parallel search algorithm that would find a visitor "John" in this sorted list and show how many movies he watched.

Q3.      Can either sorting or searching achieve super-linear speedup?

 

 

Section C

Question No. 1          Marks - 10
What was the period of the third generation of electronic computers?
Options
  1. 1955-64
  2. 1965-74
  3. 1970-80
  4. 1975-90

Question No. 2          Marks - 10
Which of the following is an example of shared-memory multiprocessor models?
Options
  1. Uniform Memory Access (UMA)
  2. Non-uniform Memory Access (NUMA)
  3. Cache Only Memory Access (COMA)
  4. All of the above.

Question No. 3          Marks - 10
Network latency refers to …
Options
  1. The maximum data transfer rate, in terms of Mbytes/s, transmitted through the network.
  2. The worst-case time delay for a unit message to be transferred through the network.
  3. Implementation costs such as those for wires, switches, connectors, arbitration, etc.
  4. The ability of the network to be modularly expandable with scalable performance with increasing machine resources.

Question No. 4          Marks - 10
Which of the following is an example of a dynamic connection network?
Options
  1. Digital bus
  2. Omega network
  3. Crossbar switch network
  4. All of the above

Question No. 5          Marks - 10
VLIW stands for…
Options
  1. Very Large Integration Word
  2. Very Long Integration Word
  3. Very Long Instruction Word
  4. Very Light Image Word

Question No. 6          Marks - 10
The compiler used in implicit parallelism is…
Options
  1. Parallelizing compiler
  2. Concurrency preserving compiler
  3. Simple HLL compiler
  4. None of these.

Question No. 7          Marks - 10
The number of nodes in a graph is called the…
Options
  1. In-degree
  2. Out-degree
  3. Total degree
  4. Network size.

Question No. 8          Marks - 10
The Intel i960CA model is an example of…
Options
  1. CISC scalar processors
  2. RISC scalar processors
  3. Superscalar processors
  4. Vector processors

Question No. 9          Marks - 10
Which of the following is an example of a static connection network?
Options
  1. 3-cube
  2. Switch modules
  3. Systolic array
  4. a. and c.

Question No. 10          Marks - 10
Paging and segmentation are associated with…
Options
  1. Cache memory
  2. Associative memory
  3. Virtual memory
  4. Volatile memory

Question No. 11          Marks - 10
In distributed-memory multicomputers, each node is an autonomous computer consisting of a…
Options
  1. Processor
  2. Local memory
  3. Attached disk
  4. All of the above

Question No. 12          Marks - 10
PRAM stands for…
Options
  1. Programmable Random Access Memory.
  2. Parallel Random Access Memory.
  3. Parallel Random Access Machine.
  4. None of the above.

Question No. 13          Marks - 10
Synchronization of all PEs in an SIMD computer is done by using…
Options
  1. Hardware
  2. Software
  3. Firmware
  4. Both a. and b.

Question No. 14          Marks - 10
Parallelism can be achieved using…
Options
  1. Hardware
  2. Software
  3. Both hardware and software.
  4. It cannot be achieved using hardware or software.

Question No. 15          Marks - 10
Network routing algorithms could be…
Options
  1. Static only
  2. Dynamic only
  3. Both static and dynamic.
  4. Depends on the network user.

Question No. 16          Marks - 10
Network scalability refers to…
Options
  1. The ability of a network to be modularly expandable with scalable performance with increasing machine resources.
  2. The maximum data transfer rate, in terms of Mbytes/s, transmitted through the network.
  3. The worst-case time delay for a unit message to be transferred through the network.
  4. Implementation costs such as those for wires, switches, connectors, arbitration and interface logic.

Question No. 17          Marks - 10
The baseline network is an example of…
Options
  1. Static connection network.
  2. Dynamic connection network.
  3. It depends on the design.
  4. None of the above.

Question No. 18          Marks - 10
The crossbar network is an example of…
Options
  1. Static connection network.
  2. Dynamic connection network.
  3. It depends on the design.
  4. None of the above.

Question No. 19          Marks - 10
Which of the following are speedup performance laws?
Options
  1. Amdahl's law for a fixed workload.
  2. Fixed-load speedup.
  3. Amdahl's law revisited.
  4. All of the above.

Question No. 20          Marks - 10
Which of the following are scalability metrics?
Options
  1. Machine size.
  2. Clock rate.
  3. I/O demand.
  4. All of the above.

Question No. 21          Marks - 10
Instruction-level parallelism could be achieved through…
Options
  1. Instruction pipelining.
  2. Vector processing.
  3. Array processing.
  4. Multiprocessor architecture.

Question No. 22          Marks - 10
Which of the following is the fastest device in terms of speed?
Options
  1. Cache memory
  2. CPU registers
  3. Main memory
  4. Disk storage

Question No. 23          Marks - 10
Cache locality of reference refers to which of the following?
Options
  1. Temporal locality
  2. Sequential locality
  3. Both a. and b.
  4. None of the above.

Question No. 24          Marks - 10
Cache hit ratio is?
Options
  1. Number of times data is found in cache / total number of accesses.
  2. Number of times data is not found in cache / total number of accesses.
  3. Number of times data is found in RAM / total number of accesses.
  4. Number of times data is found on the hard disk / total number of accesses.

Question No. 25          Marks - 10
Which techniques are examples of virtual memory?
Options
  1. Demand-paged memory management.
  2. Segmented memory management.
  3. Segmented paged memory management.
  4. a. and b. only.

Question No. 26          Marks - 10
Virtual memory could be achieved using which memory device?
Options
  1. Cache memory
  2. Main memory
  3. CPU register
  4. Secondary memory

Question No. 27          Marks - 10
Which is the fastest memory replacement technique?
Options
  1. FCFS
  2. LRU
  3. OPT
  4. Depends on the demanded size and order.

Question No. 28          Marks - 10
Which of the following are cache addressing models?
Options
  1. Physical address caches.
  2. Virtual address caches.
  3. Both a. and b.
  4. None of the above.

Question No. 29          Marks - 10
In which cache mapping scheme is the word size of the cache smallest?
Options
  1. Direct mapping
  2. Associative mapping
  3. Set-associative mapping
  4. In all the above cases, the word size of the cache is the same.

Question No. 30          Marks - 10
Cache coherence may occur in…
Options
  1. Multicomputer systems.
  2. Multiprocessor systems.
  3. Single-processor systems.
  4. Both a. and b.

Question No. 31          Marks - 10
Cache coherence is the problem in which…
Options
  1. Duplicate data is in different caches of processors in multiprocessor systems.
  2. Duplicate data is in cache and in RAM.
  3. Duplicate data is in cache, RAM and hard disk.
  4. All of the above.

Question No. 32          Marks - 10
Write-through caches apply the technique…
Options
  1. Whenever cache data is modified, main memory is updated only once, at the end, with the value finally modified in the cache.
  2. Whenever cache data is modified, main memory is updated simultaneously.
  3. There is no relation between data modification in the cache and in main memory.
  4. It varies from processor to processor.

Question No. 33          Marks - 10
Write-back caches apply the technique…
Options
  1. Whenever cache data is modified, main memory is updated only once, at the end, with the value finally modified in the cache.
  2. Whenever cache data is modified, main memory is updated simultaneously.
  3. There is no relation between data modification in the cache and in main memory.
  4. It varies from processor to processor.

Question No. 34          Marks - 10
What is the basic unit in store-and-forward routing?
Options
  1. Input/output stream.
  2. Packet.
  3. Byte.
  4. Message.

Question No. 35          Marks - 10
A masking instruction is an instruction of which category?
Options
  1. Pipeline instruction
  2. Vector instruction
  3. Scalar instruction
  4. Array instruction

Question No. 36          Marks - 10
The DEC VAX 9000 is a…
Options
  1. Mainframe.
  2. Mini supercomputer.
  3. Minicomputer.
  4. Multicomputer architecture.

Question No. 37          Marks - 10
Multi-pipelining could be achieved in…
Options
  1. Multiprocessor systems.
  2. Multicomputer systems.
  3. Supercomputer.
  4. Mainframe.

Question No. 38          Marks - 10
Level-1 cache may be…
Options
  1. Inside the processor chip.
  2. Outside the processor chip, on the board.
  3. A part of the RAM.
  4. A part of the hard disk.

Question No. 39          Marks - 10
Master-slave architecture is associated with…
Options
  1. Multiprocessor architecture.
  2. Multicomputer architecture.
  3. Multilevel cache coherence.
  4. All of the above.

Question No. 40          Marks - 10
Using Flynn's taxonomy, the various architectures can be divided into ___ categories.
Options
  1. 3
  2. 4
  3. 5
  4. 6

Answers:

COMPUTER ARCHITECTURE AND PARALLEL PROCESSING

Section A

Q1. Explain Flynn's classification of various computers based on notions of instruction and data streams.

Answer:

Classifications

The four classifications defined by Flynn are based upon the number of concurrent instruction (or control) streams and data streams available in the architecture.

Flynn's taxonomy (multiprogramming context)

                          Single instruction   Multiple instruction
                          stream               streams
  Single data stream      SISD                 MISD
  Multiple data streams   SIMD                 MIMD

In the multiprogramming context, the multiple-data-stream row is further split by program count: a single program on multiple data streams is SPMD; multiple programs on multiple data streams is MPMD.

 

SISD (Single instruction stream, single data stream)

A sequential computer which exploits no parallelism in either the instruction or data streams. A single control unit (CU) fetches a single instruction stream (IS) from memory. The CU then generates appropriate control signals to direct a single processing element (PE) to operate on a single data stream (DS), i.e. one operation at a time.

Examples of SISD architecture are the traditional uniprocessor machines like a PC (currently manufactured PCs have multiple cores) or old mainframes.

SIMD (Single instruction stream, multiple data streams)

A computer which exploits multiple data streams against a single instruction stream to perform operations which may be naturally parallelized; for example, an array processor or GPU.

MISD (Multiple instruction streams, single data stream)

Multiple instructions operate on a single data stream. This is an uncommon architecture, generally used for fault tolerance: heterogeneous systems operate on the same data stream and must agree on the result. Examples include the Space Shuttle flight control computer.

MIMD (Multiple instruction streams, multiple data streams)

Multiple autonomous processors simultaneously execute different instructions on different data. Distributed systems are generally recognized to be MIMD architectures, exploiting either a single shared memory space or a distributed memory space. A multi-core superscalar processor is an MIMD processor.

Diagram comparing classifications

[Diagram omitted: the SISD, MISD, SIMD and MIMD layouts, where each "PU" is a processing unit of a uni-core or multi-core CPU.]

 

 

Q2. What are static and dynamic networks of a multiprocessor? Give two examples of each.

Answer:

Interconnection networks are a major factor differentiating modern multiprocessor architectures. They can be categorized according to a number of criteria, such as topology, routing strategy and switching technique. Interconnection networks are built up of switching elements; topology is the pattern in which the individual switches are connected to other elements, like processors, memories and other switches.

Direct topologies connect each switch directly to a node, while in indirect topologies at least some of the switches connect to other switches.

Using switching technique as a criterion, one can distinguish several classes:

  • circuit switching, in which the entire path through the network is reserved before a message is transferred;
  • packet switching with virtual cut-through, in which a packet is forwarded immediately after it determines an appropriate switch output;
  • wormhole routing, which relaxes the requirement, typical of packet switching, of completely buffering blocked packets in a single switch.

Wormhole routing is currently the most popular technique in commercial parallel machines.

Queuing technique is another important factor in switching network architectures, since it strongly influences the aggregate bandwidth of the network. In the simple input-queuing technique, packets queue at the switch input awaiting the availability of the desired switch output; higher performance is offered by output queuing, which in turn is difficult to implement. A solution is to form multiple queues at each switch input, or to create a central buffer shared among all switch inputs and outputs.

The networks can be classified as static or dynamic. Static interconnection networks are mainly used in message-passing architectures; the following types are commonly defined:

  • Completely-connected network.
  • Star-connected network.
  • Linear array or ring of processors.
  • Mesh network (in 2D or 3D). Each processor has a direct link to four (2D) or six (3D) neighbouring processors. An extension of this kind of network is the wraparound mesh, or torus. Commercial examples are the Intel Paragon XP/S and Cray T3D/E. These examples also belong to another class, namely the direct network.
  • Tree network of processors. The communication bottleneck likely to occur in large configurations can be alleviated by increasing the number of communication links for processors closer to the root, which results in the fat-tree topology, efficiently used in the TMC CM5 computer. The CM5 could also serve as an example of an indirect network.
  • Hypercube network. Classically this is a multidimensional mesh of processors with exactly two processors in each dimension. An example of such a system is the Intel iPSC/860 computer. Some newer designs incorporate the idea of several processors in each node, which results in a fat hypercube, i.e. an indirect network. An example is the SGI/Cray Origin2000 computer.

Dynamic interconnection networks implement one of four main alternatives:

  • Bus-based networks - the simplest and most cost-effective solution when a moderate number of processors is involved (see Table 1). The main drawbacks are the bottleneck to memory as the number of processors becomes large, and the single point of failure. To overcome these problems, several parallel buses are sometimes incorporated. The classic example of such a machine is the SGI Power Challenge computer with a packet data bus.

 

Table 1: Properties of various types of multiprocessor interconnections

  Property          Bus     Crossbar   Multistage
  Speed             low     high       high
  Cost              low     high       moderate
  Reliability       low     high       high
  Configurability   high    low        moderate
  Complexity        low     high       moderate

 

  • Crossbar switching networks, which employ a grid of switching elements. The network is nonblocking: the connection of a processor to a memory bank does not block the connection of any other processor to any other memory bank. In spite of their high speed, their use is limited by nonlinear complexity (O(p²), where p is the number of processors) and by cost (cf. Table 1). They are applied mostly in multiprocessor vector computers (like the Cray YMP) and in multiprocessors with multilevel interconnections (e.g. HP/Convex Exemplar SPP). One outstanding example is the Fujitsu VPP500, which incorporates a 224x224 crossbar switch.
  • Multistage interconnection networks form the most advanced pure solution, lying between the two extremes (Table 1). A typical example is the omega network, which consists of log2(p) stages, where p is the number of inputs and outputs (usually the number of processors and of memory banks); see the routing sketch after this list. Its complexity, O(p log p), is less than that of the crossbar switch. However, in the omega network some memory accesses can be blocked. Although machines with this kind of interconnection offer a virtual global memory programming model and ease of use, they are still not very popular. Past examples include the BBN Butterfly and IBM RP-3 computers; at present the IBM RS6K SP incorporates multistage interconnections with the Vulcan switch.
  • Multilevel interconnection networks seem to be a relatively recent development. The idea comes directly from clusters of computers and consists of two or more levels of connections with different aggregate bandwidths. Typical examples are the SGI/Cray Origin2000, the IBM RS6K SP with PowerPC604 SMP nodes, and the HP/Convex Exemplar. This kind of architecture is attracting the most interest at present.
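As an illustration of the destination-tag routing used in omega networks, here is a minimal C sketch (purely illustrative, not modelled on any of the machines above) that routes one message through an 8-input omega network; the source and destination ports are arbitrary example values.

#include <stdio.h>

int main(void)
{
    int p = 8, stages = 3;          /* an omega network has log2(p) stages of 2x2 switches */
    int src = 5, dst = 2;           /* illustrative source and destination ports */
    int pos = src;

    for (int s = 0; s < stages; s++) {
        /* perfect shuffle between stages: rotate the port number left by one bit */
        pos = ((pos << 1) | (pos >> (stages - 1))) & (p - 1);
        /* each 2x2 switch sends the message to its upper (0) or lower (1)
           output according to the next destination bit, most significant first */
        int bit = (dst >> (stages - 1 - s)) & 1;
        pos = (pos & ~1) | bit;
        printf("after stage %d: port %d\n", s + 1, pos);
    }
    printf("delivered to port %d (destination was %d)\n", pos, dst);
    return 0;
}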

 

 

Q3. Differentiate between RISC and CISC processors.

Answer:

CISC Computer:

  • The acronym is variously used. Read as above (i.e. as "CISC computer"), it means a computer that has a Complex Instruction Set Chip as its CPU. It is also referred to as CISC computing, and sometimes as a CISC "chip", a near tautology best resolved by reading it as a Complex Instruction Set Computer chip.
  • CISC chips have an increasing number of components and an ever-growing instruction set, and so are slower and less powerful at executing "common" instructions.
  • CISC chips execute an instruction in two to ten machine cycles.
  • CISC chips do all of the processing themselves.
  • CISC chips are more common in computers that have a wider range of instructions to execute.

RISC Computer:

  • The acronym is variously used. Read as above (i.e. as "RISC computer"), it means a computer that has a Reduced Instruction Set Chip as its CPU. It is also referred to as RISC computing, and sometimes as a RISC "chip", again best read as a Reduced Instruction Set Computer chip.
  • RISC chips have fewer components and a smaller instruction set, allowing faster access to "common" instructions.
  • RISC chips execute an instruction in one machine cycle.
  • RISC chips distribute some of their processing to other chips.
  • RISC chips are finding their way into components that need faster processing of a limited number of instructions, such as printers and games machines.

 

 

CISC [Complex Instruction Set Computing]

  1. Very large instruction sets, reaching up to and above three hundred separate instructions.
  2. Performance was improved by allowing the simplification of program compilers, as the range of more advanced instructions available meant fewer refinements had to be made at compilation. However, the complexity of the resulting processor hardware and architecture can make such chips difficult to understand and program for, and also expensive to produce.
  3. More specialized addressing modes and registers, with variable-length instruction codes.
  4. Instruction pipelining cannot be implemented easily.
  5. Many complex instructions can access memory, such as direct addition between data in two memory locations.
  6. Mainly used in ordinary PCs, workstations and servers.
  7. CISC systems shorten execution time by reducing the number of instructions per program.
  8. Examples of CISC processors: Intel x86.

 

RISC [Reduced Instruction Set Computing]

  1. Small set of instructions.
  2. Simplified and reduced instruction set, numbering one hundred instructions or fewer. Because of the simple instructions, RISC chips require fewer transistors. The reduced instruction set also means the processor can execute instructions more quickly, potentially allowing greater speeds. However, allowing only such simple instructions places a greater burden on the software itself: fewer instructions in the instruction set mean a greater emphasis on writing efficient software with the instructions that are available.
  3. Addressing modes are simplified to four or fewer, and instruction length is fixed to allow standardization across the instruction set.
  4. Instruction pipelining can be implemented easily.
  5. Only LOAD/STORE instructions can access memory.
  6. Mainly used for real-time applications.
  7. RISC systems shorten execution time by reducing the clock cycles per instruction (simple instructions take less time to interpret).
  8. Examples of RISC processors: Atmel AVR, PIC, ARM.

 

 

Q4. Explain a nonlinear pipeline processor with a suitable example.

Answer:

Non-Linear Pipelines

The template function parallel_pipeline supports only linear pipelines. It does not directly handle more baroque plumbing, such as the non-linear filter graph of the original diagram (not reproduced here), in which filters A and B feed filter C, and C feeds filters D and E.

However, you can still use a pipeline for this: just topologically sort the filters into a linear order, such as A, B, C, D, E.

In the linearized pipeline, the light gray arrows of the original figure are the arrows that become implied by transitive closure of the others. It might seem that a lot of parallelism is lost by forcing a linear order on the filters, but in fact the only loss is in the latency of the pipeline, not the throughput. The latency is the time it takes a token to flow from the beginning to the end of the pipeline. Given a sufficient number of processors, the latency of the original non-linear pipeline is three filters, because filters A and B could process the token concurrently, and likewise filters D and E could process the token concurrently.

In the linear pipeline, the latency is five filters. The behavior of filters A, B, D and E may need to be modified to pass along objects they do not otherwise act upon to the next filter in the pipeline.

The throughput remains the same because, regardless of the topology, throughput is still limited by the throughput of the slowest serial filter. If parallel_pipeline supported non-linear pipelines, it would add a lot of programming complexity without improving throughput; the linear limitation of parallel_pipeline is a good tradeoff of gain versus pain.
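To make the linearization step concrete, here is a minimal C sketch that topologically sorts the filter graph described above (A and B feed C; C feeds D and E, a structure inferred from the latency discussion) into a valid linear order using Kahn's algorithm; it is an illustration, not part of parallel_pipeline itself.

#include <stdio.h>

#define N 5                             /* filters A..E */

int main(void)
{
    /* adjacency matrix inferred from the text: A->C, B->C, C->D, C->E */
    int edge[N][N] = {0};
    int indeg[N] = {0}, done[N] = {0};
    edge[0][2] = edge[1][2] = edge[2][3] = edge[2][4] = 1;

    for (int u = 0; u < N; u++)
        for (int v = 0; v < N; v++)
            if (edge[u][v]) indeg[v]++;

    printf("Linear filter order:");
    for (int emitted = 0; emitted < N; emitted++) {
        for (int u = 0; u < N; u++) {
            if (!done[u] && indeg[u] == 0) {   /* next filter with no pending inputs */
                printf(" %c", 'A' + u);
                done[u] = 1;
                for (int v = 0; v < N; v++)    /* release this filter's successors */
                    if (edge[u][v]) indeg[v]--;
                break;
            }
        }
    }
    printf("\n");                              /* prints: A B C D E */
    return 0;
}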

 

 

 

Q5. Differentiate between multiprocessors and multicomputers.

Answer: 

Multiprogramming

One time, I was at the post office standing in line waiting my turn to be served. My turn came, but I was not fully prepared because my mail was not prepackaged. They gave me an empty box so I could do that on the side. I started packaging my mail while another customer occupied my spot. It does not make sense to block the whole line while packaging my mail; it is a better idea to let other customers proceed and get served in the meantime. This example is (to some extent) very similar in concept to the multiprogramming model, where programs are like customers and the CPU is like the post office assistant. Assuming one assistant (a single-processor system), only one customer can be served at a time. While a customer is being served, he or she continues until finished or until stepping aside to wait. As long as the assistant is helping a customer, he does not switch to serve other customers.

In a multiprogramming system there are one or more programs (processes, or customers) resident in the computer's main memory, ready to execute. Only one program at a time gets the CPU for execution while the others wait their turn. The whole idea of a multiprogrammed system is to optimize system utilization (more specifically, CPU time). The currently executing program gets interrupted by the operating system between tasks (for example, while waiting for I/O; recall the mail-packaging example), and control is transferred to another program in line (another customer). The running program keeps executing until it voluntarily gives the CPU back or blocks for I/O. The design goal is very clear: processes waiting for I/O should not block other processes, which would waste CPU time. The idea is to keep the CPU busy as long as there are processes ready to execute.

Note that in order for such a system to function properly, the operating system must be able to load multiple programs into separate partitions of main memory and provide the required protection, because one process could otherwise be modified by another. Another problem that must be addressed when multiple programs reside in memory is fragmentation as programs enter or leave (are swapped in and out of) main memory. A further issue is that large programs may not fit in memory at once, which can be solved by using virtual memory. In modern operating systems programs are split into equally sized chunks called pages, but this is beyond the scope of this article.

In summary, a multiprogramming system allows multiple processes to reside in main memory, of which only one is running at any instant. The running program keeps executing until it blocks for I/O, and then the next program in line takes its turn. The goal is to optimize CPU utilization by reducing CPU idle time. Finally, note that multiprogramming is an old term, because in modern operating systems the whole program is not loaded completely into main memory.

 

Multiprocessing

Multiprocessing sometimes refers to executing multiple processes (programs) at the same time. This is confusing, because we already have multiprogramming (defined earlier) and multitasking, which better describe multiple processes running at the same time. Using the right terminology leaves less room for confusion, so what is multiprocessing, then?

Multiprocessing actually refers to the CPU units rather than to running processes. If the underlying hardware provides more than one processor, then that is multiprocessing. There are many variations on the basic scheme, for example multiple cores on one die, multiple dies in one package, or multiple packages in one system. In summary, multiprocessing refers to the underlying hardware (multiple CPUs or cores), while multiprogramming refers to the software (multiple programs or processes). Note that a system can be both multiprogrammed, by having multiple programs running at the same time, and multiprocessing, by having more than one physical processor.

 

 

Section B

Case Study 

ABC.com is a website where you can watch original movie DVDs.  It currently maintains the list of visitors and details of their visits.  The website gets almost 1 billion visitors every day, and at midnight it processes all the information.  Processing takes almost 5 hours, and the system remains down for that long, causing the company a huge loss.  The company decided to buy a supercomputer for faster analysis.  The supercomputer has 10 processors.  Now the need is to design parallel algorithms for the following problems:

We now have the list of visitors for the day and the number of movies they watched.

 

Q1. Design a parallel algorithm that would sort the names alphabetically.

Answer:

#include <stdio.h>
#include <string.h>

int main(void)
{
    char name[10][8], tname[10][8], temp[8];
    int i, j, n;

    printf("Enter the value of n: ");
    scanf("%d", &n);
    printf("Enter %d names\n", n);
    for (i = 0; i < n; i++) {
        scanf("%7s", name[i]);          /* %7s keeps each name within its 8-byte slot */
        strcpy(tname[i], name[i]);      /* keep a copy of the original input order */
    }
    /* exchange sort: compare every pair and swap out-of-order names */
    for (i = 0; i < n - 1; i++) {
        for (j = i + 1; j < n; j++) {
            if (strcmp(name[i], name[j]) > 0) {
                strcpy(temp, name[i]);
                strcpy(name[i], name[j]);
                strcpy(name[j], temp);
            }
        }
    }
    printf("------------------------------------------\n");
    printf("Input Names\tSorted names\n");
    printf("------------------------------------------\n");
    for (i = 0; i < n; i++)
        printf("%-10s %-10s\n", tname[i], name[i]);
    printf("------------------------------------------\n");
    return 0;
}
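The program above is sequential. As a hedged sketch of how the comparison work could be spread over the supercomputer's 10 processors, the following odd-even transposition sort uses OpenMP; the programming model, NAME_LEN, and the thread count are illustrative assumptions, since the case study does not specify them.

#include <omp.h>
#include <string.h>

#define NAME_LEN 8

/* Odd-even transposition sort: n phases; within a phase the compared
   pairs are disjoint, so they can be swapped in parallel. */
void parallel_sort(char name[][NAME_LEN], int n)
{
    for (int phase = 0; phase < n; phase++) {
        int start = phase % 2;   /* even phases: pairs (0,1),(2,3)...; odd: (1,2),(3,4)... */
        #pragma omp parallel for num_threads(10)
        for (int i = start; i < n - 1; i += 2) {
            if (strcmp(name[i], name[i + 1]) > 0) {
                char temp[NAME_LEN];
                strcpy(temp, name[i]);
                strcpy(name[i], name[i + 1]);
                strcpy(name[i + 1], temp);
            }
        }
    }
}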

 

 

Q2. Now write a parallel search algorithm that would find a visitor "John" in this sorted list and show how many movies he watched.

Answer:

function next_character(candidates, w, s, i) =
  if (i == #w) then candidates
  else
    let letter     = w[i];
        next_l     = s -> {c + i : c in candidates};
        candidates = {c in candidates; n in next_l | n == letter};
    in next_character(candidates, w, s, i + 1);

function string_search(w, s) =
  next_character([0 : #s - #w + 1], w, s, 0);

string_search("John", "john watched movie");
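Closer to the case study's setting (a sorted array of visitor names, searched in parallel for "John"), here is a hedged C sketch in which each of the 10 processors binary-searches its own slice of the list; the OpenMP model, NAME_LEN, and the movies[] array mentioned in the usage comment are illustrative assumptions.

#include <string.h>
#include <omp.h>

#define NAME_LEN 16
#define NPROC 10   /* the supercomputer's processor count */

/* Each processor binary-searches its own slice of the sorted list.
   In a duplicate-free sorted list at most one slice holds the target,
   so at most one thread ever writes to found. */
int parallel_search(char name[][NAME_LEN], int n, const char *target)
{
    int found = -1;
    #pragma omp parallel for num_threads(NPROC)
    for (int p = 0; p < NPROC; p++) {
        int lo = p * n / NPROC, hi = (p + 1) * n / NPROC - 1;
        while (lo <= hi) {                       /* binary search in this slice */
            int mid = (lo + hi) / 2;
            int cmp = strcmp(name[mid], target);
            if (cmp == 0) { found = mid; break; }
            else if (cmp < 0) lo = mid + 1;
            else hi = mid - 1;
        }
    }
    return found;
}

/* usage (movies[] is a hypothetical per-visitor count, aligned with name[]):
     int idx = parallel_search(name, n, "John");
     if (idx >= 0) printf("John watched %d movies\n", movies[idx]);      */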

 

 

Q3. Can either sorting or searching achieve super-linear speedup?

Answer:

Although parallel computation pays a cost in inter-processor communication that sequential computation does not, it can still achieve super-linear speedup by utilizing resources more efficiently. One example is the reduction of RAM access time.

This happens when the working set of a problem is greater than the cache size when executed sequentially, but can fit nicely in each available cache when executed in parallel.

This leads to my first experiment in a shared-memory multiprocessor model and my second experiment in a distributed system.

Setting up data nodes is another experiment in speedup for the distributed model. By partitioning the data so that all of it fits in the caches of multiple data nodes, we can achieve faster RAM access in a distributed system. However, this places a high requirement on Ethernet speed. I will describe this later.

In computer architecture, speedup is a metric for improvement in performance between two systems processing the same problem. More technically, it is the improvement in speed of execution of a task executed on two similar architectures with different resources. The notion of speedup was established by Amdahl's law, which was particularly focused on parallel processing. However, speedup can be used more generally to show the effect of any resource enhancement on performance.

Sometimes a speedup of more than A when using A processors is observed in parallel computing; this is called super-linear speedup. Super-linear speedup rarely happens and often confuses beginners, who believe the theoretical maximum speedup should be A when A processors are used.
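In symbols (a standard definition, not specific to this source): if T(1) is the execution time on one processor and T(A) the time on A processors, the speedup is S = T(1) / T(A); linear speedup means S = A, and super-linear speedup is the case S > A.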

One possible reason for super-linear speedup in low-level computations is the cache effect resulting from the different memory hierarchies of a modern computer: in parallel computing, not only does the number of processors change, but so does the size of the accumulated caches from the different processors. With the larger accumulated cache size, more (or even all) of the working set can fit into caches, and memory access time reduces dramatically, causing extra speedup on top of that from the actual computation.

An analogous situation occurs when searching large datasets, such as the genomic data searched by BLAST implementations. There the accumulated RAM from each of the nodes in a cluster enables the dataset to move from disk into RAM, thereby drastically reducing the time required by, e.g., mpiBLAST to search it.

Super-linear speedups can also occur when performing backtracking in parallel: an exception in one thread can cause several other threads to backtrack early, before they reach the exception themselves.

Super-linear speedups can also occur in parallel implementations of branch-and-bound for optimization: the processing of one node by one processor may affect the work other processors need to do for the other nodes.

 

 

 

Section C

 

Question No.  1          Marks - 10

________________________________________

What was the period of the third generation of electronic computers?

Options                      

  1. 1955-64
  2. 1965-74
  3. 1970-80
  4. 1975-90

 

 

Question No.  2          Marks - 10

________________________________________

Which of the following is an example of shared-memory multiprocessor models?

Options                      

  1. Uniform Memory Access (UMA)
  2. Non-uniform Memory Access (NUMA)
  3. Cache Only Memory Access (COMA)
  4. All of the above.

 

 

Question No.  3          Marks - 10

________________________________________

Network latency refers to …              

Options          

  1. The maximum data transfer rate, in terms of Mbytes/s, transmitted through the network.
  2. The worst-case time delay for a unit message to be transferred through the network.
  3. Implementation costs such as those for wires, switches, connectors, arbitration, etc.
  4. The ability of the network to be modularly expandable with scalable performance with increasing machine resources.

 

 

Question No.  4          Marks - 10

________________________________________

Which of the following is an example of a dynamic connection network?

Options                      

  1. Digital Bus
  2. Omega Network
  3. Crossbar Switch Network
  4. All of the above

 

 

Question No.  5          Marks - 10

________________________________________

VLIW stands for…

Options                      

  1. Very Large Integration Word
  2. Very Long Integration Word
  3. Very Long Instruction Word
  4. Very Light Image Word

 

 

Question No.  6          Marks - 10

________________________________________

The compiler used in implicit parallelism is…

Options                      

  1. Parallelizing compiler
  2. Concurrency preserving compiler
  3. Simple HLL compiler
  4. None of these.

 

 

Question No.  7          Marks - 10

________________________________________

The number of nodes in a graph is called the…

Options                      

  1. In- degree
  2. Out- degree
  3. Total- degree
  4. Network size.

 

 

Question No.  8          Marks - 10

________________________________________

The Intel i960CA model is an example of…

Options                      

  1. CISC scalar processors
  2. RISC scalar processors
  3. Super scalar processors
  4. Vector processors

 

 

Question No.  9          Marks - 10

________________________________________ 

Which of the following is an example of a static connection network?

Options                      

  1. 3-cube
  2. Switch- modules
  3. Systolic array
  4. a. and c.

 

 

Question No.  10        Marks - 10

________________________________________

Paging and segmentation are associated with…

Options                      

  1. Cache- memory
  2. Associative memory
  3. Virtual memory
  4. Volatile memory

 

 

Question No.  11        Marks - 10

________________________________________

In distributed-memory multicomputers, each node is an autonomous computer consisting of a…

Options                      

  1. Processor
  2. Local Memory
  3. Attached Disk
  4. All of the above

 

 

Question No.  12        Marks - 10

________________________________________ 

PRAM stands for…….          

Options                      

  1. Programmable Random Access Memory.
  2. Parallel Random Access Memory.
  3. Parallel Random Access Machine.
  4. None of the above.

 

 

Question No.  13         Marks - 10

________________________________________

Synchronization of all PEs in an SIMD computer is done by using…

Options                      

  1. Hardware
  2. Software
  3. Firmware
  4. Both a. and b.

 

Question No.  14        Marks - 10

________________________________________

Parallelism can be achieved using…

Options                      

  1. Hardware
  2. Software
  3. Both hardware and software.
  4. It cannot be achieved using hardware or software.

 

Question No.  15        Marks - 10

________________________________________

Network routing algorithms could be…

Options                      

  1. Static only
  2. Dynamic only
  3. Both static and dynamic.
  4. Depends on network user.

 


 

Question No.  16        Marks - 10

________________________________________

Network scalability refers to ……    

Options          

  1. The ability of a network to be modularly expandable with scalable performance with increasing machine resources.
  2. The maximum data transfer rate, in terms of Mbytes/s, transmitted through the network.
  3. The worst-case time delay for a unit message to be transferred through the network.
  4. Implementation costs such as those for wires, switches, connectors, arbitration and interface logic.

 

Question No.  17        Marks - 10

________________________________________

The baseline network is an example of…

Options          

  1. Static connection network.
  2. Dynamic connection network.
  3. It depends on the design.
  4. None of the above.

 

Question No.  18        Marks - 10

________________________________________

The crossbar network is an example of…

Options                      

  1. Static connection network.
  2. Dynamic connection network.
  3. It depends on the design.
  4. None of the above.

 

Question No.  19        Marks - 10

________________________________________

Which of the following are speedup performance laws? 

Options                      

  1. Amdahl's law for a fixed workload.
  2. Fixed-load speedup.
  3. Amdahl's law revisited.
  4. All of the above.

 

Question No.  20        Marks - 10

________________________________________ 

Which of the following are scalability metrics?       

Options          

  1. Machine size.
  2. Clock rate.
  3. I/O demand.
  4. All of the above.

 

Question No.  21        Marks - 10

________________________________________

Instruction-level parallelism could be achieved through…

Options                      

  1. Instruction Pipelining.
  2. Vector Processing.
  3. Array Processing
  4. Multi Processor Architecture.

 

Question No.  22        Marks - 10

________________________________________ 

Which of the following is the fastest device in terms of speed?

Options                      

  1. Cache Memory
  2. CPU Registers
  3. Main Memory
  4. Disk Storage

 

Question No.  23        Marks - 10

________________________________________

Cache locality of reference refers to which of the following?

Options                      

  1. Temporal Locality
  2. Sequential Locality
  3. Both a. and b.
  4. None of the above.

 

Question No.  24        Marks - 10

________________________________________

Cache hit ratio is?       

Options          

  1. Number of times data is found in cache / total number of accesses.
  2. Number of times data is not found in cache / total number of accesses.
  3. Number of times data is found in RAM / total number of accesses.
  4. Number of times data is found on the hard disk / total number of accesses.

 

Question No.  25        Marks - 10

________________________________________

Which techniques are examples of virtual memory?          

Options                      

  1. Demand-paged memory management.
  2. Segmented memory management.
  3. Segmented paged memory management.
  4. a. and b. only.

 

Question No.  26        Marks - 10

________________________________________

Virtual memory could be achieved using which memory device?

Options                      

  1. Cache memory
  2. Main memory
  3. CPU Register
  4. Secondary Memory

 

Question No.  27        Marks - 10

________________________________________

Which is the fastest memory replacement technique?         

Options                      

  1. FCFS
  2. LRU
  3. OPT
  4. Depends on demand of size and order.

 

Question No.  28        Marks - 10

________________________________________ 

Which of the following are cache addressing models?

Options                      

  1. Physical address caches.
  2. Virtual address caches.
  3. Both a. and b.
  4. None of the above.

 

Question No.  29        Marks - 10

________________________________________

In which cache mapping scheme is the word size of the cache smallest?

Options          

  1. Direct mapping
  2. Associative mapping
  3. Set associative mapping
  4. In all the above cases, the word size of the cache is the same.

 

Question No.  30        Marks - 10

________________________________________ 

Cache coherence may occur in …..

Options                      

  1. Multicomputer systems.
  2. Multiprocessor systems.
  3. Single-processor systems.
  4. Both a. and b.

 

Question No.  31        Marks - 10

________________________________________

Cache coherence is the problem in which…..           

Options                      

  1. Duplicate data is in different caches of processors in multiprocessor systems.
  2. Duplicate data is in cache and in RAM.
  3. Duplicate data is in cache, RAM and hard disk.
  4. All of the above.

 

Question No.  32        Marks - 10

________________________________________

Write-through caches apply the technique…

Options                      

  1. Whenever cache data is modified, main memory is updated only once, at the end, with the value finally modified in the cache.
  2. Whenever cache data is modified, main memory is updated simultaneously.
  3. There is no relation between data modification in the cache and in main memory.
  4. It varies from processor to processor.

 

Question No.  33        Marks - 10

________________________________________

Write-back caches apply the technique…

Options                      

  1. Whenever cache data is modified, main memory is updated only once, at the end, with the value finally modified in the cache.
  2. Whenever cache data is modified, main memory is updated simultaneously.
  3. There is no relation between data modification in the cache and in main memory.
  4. It varies from processor to processor.

 

Question No.  34        Marks - 10

________________________________________ 

What is the basic unit in store-and-forward routing?      

Options                      

  1. Input/output stream.
  2. Packet.
  3. Byte.
  4. Message.

 

Question No.  35        Marks - 10

________________________________________

A masking instruction is an instruction of which category?

Options                      

  1. Pipeline Instruction
  2. Vector Instruction
  3. Scalar Instruction
  4. Array instruction

 

Question No.  36        Marks - 10

________________________________________

The DEC VAX 9000 is a…

Options          

  1. Mainframe.
  2. Mini supercomputer.
  3. Minicomputer.
  4. Multicomputer architecture.

 

Question No.  37        Marks - 10

________________________________________

Multi-pipelining could be achieved in ….   

Options          

  1. Multiprocessor systems.
  2. Multicomputer systems.
  3. Supercomputer.
  4. Mainframe.

 

Question No.  38        Marks - 10

________________________________________

Level-1 cache may be…

Options          

  1. Inside the processor chip.
  2. Outside the processor chip, on the board.
  3. A part of the RAM.
  4. A part of the hard disk.

 

Question No.  39        Marks - 10

________________________________________

Master-slave architecture is associated with…      

Options                      

  1. Multiprocessor architecture.
  2. Multicomputer architecture.
  3. Multilevel cache coherence.
  4. All of the above.

 

Question No.  40        Marks - 10

________________________________________

Using Flynn's taxonomy, the various architectures can be divided into ___ categories.

Options                      

  1. 3
  2. 4
  3. 5
  4. 6
