The figure above shows the basic architecture of the AVR.
The AVR architecture has several parts, which are discussed in later sections. In this part we will learn about:
- What kind of architecture does it have?
  - Harvard architecture
  - Von Neumann architecture
- What kind of instruction set does it use?
- How are instructions executed?
A Harvard architecture has separate data and instruction buses, allowing transfers to be performed simultaneously on both buses.
A Von Neumann architecture has only one bus, which is used for both data transfers and instruction fetches; data transfers and instruction fetches must therefore be scheduled, since they cannot be performed at the same time, as you can see in the figure below:
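This scheduling difference can be sketched with a toy cycle-counting model (the numbers are purely illustrative, not real bus timings): on a single shared bus every fetch and every data transfer needs its own slot, while with two buses a fetch and a data transfer can share a slot.

```python
# Toy model (hypothetical numbers): count the bus slots needed to run a
# workload of instruction fetches plus data transfers.
def von_neumann_cycles(fetches, data_transfers):
    # One shared bus: every access is serialized.
    return fetches + data_transfers

def harvard_cycles(fetches, data_transfers):
    # Two buses: a fetch and a data transfer can occur in the same slot,
    # so the total is limited by whichever bus is busier.
    return max(fetches, data_transfers)

print(von_neumann_cycles(100, 30))  # 130 slots on one shared bus
print(harvard_cycles(100, 30))      # 100 slots with separate buses
```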
It is possible to have two separate memory systems for a Harvard architecture. As long as data and instructions can be fed in at the same time, it does not matter whether they come from a cache or from memory. But there are problems with this. Compilers generally embed data (literal pools) within the code, and it is often also necessary to be able to write to the instruction memory space, for example in the case of self-modifying code or, if an ARM debugger is used, to set software breakpoints in memory. If there are two completely separate, isolated memory systems, this is not possible; there must be some kind of bridge between the memory systems to allow it.
Using a simple, unified memory system together with a Harvard architecture is highly inefficient. Unless it is possible to feed data onto both buses at the same time, it might be better to use a von Neumann architecture processor.
Use of caches:
At higher clock speeds, caches are useful as the memory speed is proportionally slower. Harvard architectures tend to be targeted at higher performance systems, and so caches are nearly always used in such systems.
Von Neumann architectures usually have a single unified cache, which stores both instructions and data. The proportion of each in the cache is variable, which may be a good thing. It would in principle be possible to have separate instruction and data caches, but this would probably not be very useful, as only one cache could ever be accessed at a time.
Caches for Harvard architectures are very useful. Such a system would have a separate cache for each bus. Trying to use a shared cache on a Harvard architecture would be very inefficient, since only one bus could then be fed at a time. Having two caches means it is possible to feed both buses simultaneously, which is exactly what a Harvard architecture requires.
This also makes it possible to have a very simple unified memory system, using the same address space for both instructions and data, which gets around the problem of literal pools and self-modifying code. What it does mean, however, is that when starting with empty caches, instructions and data must both be fetched from the single memory system. Two memory accesses are therefore needed before the core has everything it requires, and performance is no better than a von Neumann architecture. As the caches fill up, however, it becomes much more likely that the instruction or data value has already been cached, so only one of the two has to be fetched from memory; the other can be supplied directly from the cache with no additional delay. The best performance is achieved when both instructions and data are supplied by the caches, with no need to access external memory at all.
This is the most sensible compromise, and it is the architecture used by ARM's Harvard processor cores. Two separate memory systems can perform better, but would be difficult to implement.
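The warm-up effect described above can be sketched with a rough model (the hit rates and costs are illustrative assumptions, not figures for any real core): misses from both caches must be serialized through the single memory system, so the average number of memory slots per instruction-and-data pair falls as the caches fill.

```python
# Toy model: separate instruction and data caches in front of one
# unified memory. A cache hit is assumed free; each miss costs one
# memory slot, and the single memory serves only one miss at a time.
def avg_memory_slots(inst_hit_rate, data_hit_rate):
    inst_miss = 1.0 - inst_hit_rate
    data_miss = 1.0 - data_hit_rate
    # Both misses go through the same memory, so their costs add.
    return inst_miss + data_miss

print(avg_memory_slots(0.0, 0.0))  # 2.0 -> cold caches: no better than von Neumann
print(avg_memory_slots(0.9, 0.8))  # warm caches: far fewer memory accesses
print(avg_memory_slots(1.0, 1.0))  # 0.0 -> everything served from the caches
```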
So, in general, we can say that RISC is more favourable for people who program in high-level languages such as C and C++, and it makes the job tougher for those who code in assembly language.
Another main advantage is that more than 95% of the instructions execute in one clock cycle; the remaining 5% can also be executed in a single cycle through code scheduling.
A further advantage is that instructions in RISC are executed using pipelining, which reduces the execution time and makes RISC faster than CISC.
What is pipelining?
In general, pipelining can be understood from the diagrams below as an overlapping style of execution.
Referring to the figure below: in normal execution, an instruction is fetched in time t1 and then decoded in time t2, one instruction at a time. With pipelining, during the interval t1+t2 the first instruction is fetched and executed, and the second instruction is fetched while the first is executing. In the same total time T, normal execution completes 3 instructions, whereas pipelined execution completes 5.
Therefore we can say that pipelining decreases the execution time for an MCU.
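This counting argument can be sketched numerically (a minimal model, assuming an idealised two-stage fetch/execute pipeline with equal stage times, in the spirit of the AVR's fetch/execute overlap):

```python
# Toy model of a 2-stage (fetch + execute) pipeline.
def sequential_time(n_instructions, stage_time=1):
    # Without pipelining, each instruction runs both stages back to back.
    return n_instructions * 2 * stage_time

def pipelined_time(n_instructions, stage_time=1):
    # The first instruction fills the pipeline (2 stages); after that, one
    # instruction completes every stage_time, because the fetch of the
    # next instruction overlaps with the execute of the current one.
    return (2 + (n_instructions - 1)) * stage_time

print(sequential_time(5))  # 10 time units without pipelining
print(pipelined_time(5))   # 6 time units with overlap
```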
In the AVR, the contents of a general instruction are:
So the pipelining process is represented as: