Novel Memory Architecture Dual Core Processor on Altera FPGA DENano Board

Mr. Venkategowda N.¹
Assistant Professor, ECE Dept., MITE, Moodbidre

Mr. Ajay Pinto²
MITE

Naveen Pai G.³
Dr. MVSIT

Mr. Shivraj G.⁴
Student, NMIT, Mangalore
VTU University, Belagum, Karnataka State, India

Abstract - This work presents a framework for designing a shared-memory dual-core processor on a programmable platform. A complete design flow is proposed, comprising a programming model and an architecture. The effectiveness of the shared memory blocks is demonstrated by implementation on an FPGA. The aim of the project is to develop an architecture for a dual-core processor and to design, implement and test it on an Altera DE1 FPGA board. Four processors are connected in a star topology and share a common memory for program and data. Depending on the application, the processor is expected to run at 150 MHz, giving an effective instruction rate of 600 MHz (150 x 4).

Index terms - FPGA, programming model, CPU, HW, SW, Dual-core processor, shared memory, RISC instruction set.

I. INTRODUCTION:

Multiprocessing represents a new paradigm for processor design. This trend has been pursued by most of the semiconductor industry, both for high-performance computers and for complex embedded systems. Reconfigurable platforms have emerged as an important alternative to ASIC design, offering flexibility at the cost of relatively lower performance. However, implementing a multiprocessor on an FPGA raises many problems. Most of these issues, such as cache coherence, multiprocessor interrupt management, processor identification and synchronization, must be addressed by the system designer. This paper presents a platform that simplifies the design of a shared-memory architecture for a dual-core processor on an FPGA. We introduce a synthesizable architecture and a programming model, written in Verilog, that run on a commercial FPGA. In many multiprocessor projects there has been a heavy focus on either HW or SW to provide facilities for inter-processor communication. Both approaches limit speed and flexibility. A dual-core processor on an FPGA offers more flexibility together with high speed and low cost. The importance of multiprocessors grew throughout the 1990s, as designers sought a way to build servers and supercomputers with higher performance than a single microprocessor.

II. CONVENTIONAL APPROACH:

In this approach, the system is designed for parallel execution. It consists of several processor modules (processor cores P1, P2, P3, P4, ...) and a dedicated data memory, which the processors read and write through a memory arbiter.

If processor 4 is to communicate with processor 1, transferring the contents from processor 1 to processor 4 takes several clock cycles and therefore adds delay: processor 4 must place a request with the arbiter; the arbiter must approve the request and send it to processor 1; processor 1 must return the required data/instruction to the arbiter; and the arbiter must pass the data/instruction on to processor 4.
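
The handshake described above can be sketched as a simple cycle count. The following is a minimal Python model, assuming each hop costs exactly one clock cycle; the step names, the function name and the one-cycle-per-hop figure are illustrative assumptions rather than measurements from the design.

```python
# Hypothetical model of an arbiter-mediated transfer between two processors.
# Assumption: every hop in the handshake takes the same number of clock cycles.
ARBITER_STEPS = [
    "requester places request with arbiter",
    "arbiter approves and sends request to owner",
    "owner returns data/instruction to arbiter",
    "arbiter passes data/instruction to requester",
]


def arbiter_transfer_cycles(cycles_per_step=1):
    """Clock cycles for one processor-to-processor transfer via the arbiter."""
    return len(ARBITER_STEPS) * cycles_per_step


if __name__ == "__main__":
    for number, step in enumerate(ARBITER_STEPS, start=1):
        print(f"cycle {number}: {step}")
    print("total cycles:", arbiter_transfer_cycles())
```

Under the one-cycle-per-hop assumption the model gives four cycles per transfer, which is the cost the shared-memory design aims to avoid.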

III. PROPOSED METHOD:

The disadvantages of the conventional approach are that it consumes more power, uses more gates, uses four clock cycles and requires one clock per request for each processor, needs a memory arbiter to communicate with all the processors, needs a separate logic circuit to be designed, and gives each processor its own dedicated memory.
To overcome these disadvantages, a shared-memory architecture for a dual-core processor is designed with fewer gates. The system consumes less power and has lower latency. The architecture of the shared-memory dual-core processor is shown in the figure.

**Processor:** Non-pipelined execution, instruction fetch, instruction decode, instruction execution, register read/write.

**Shared Memory:** Data input, data output, address and read/write operations are all carried out during the same clock. The shared memory used in this approach is a dual-port memory: any processor can read from or write to the memory at the same time.

**Methodology:** The steps involved in building the dual-core processor are: designing a processor, designing the shared memory, connecting the bus architecture, simulation, synthesis and implementation, and testing on the FPGA board.

**Advantages:** Only one clock cycle is required to execute a full instruction; gate count, chip area and power are reduced; latency is decreased; and the architecture is scalable.
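
The dual-port behaviour described under Shared Memory can be illustrated with a small behavioural model. This is a sketch in Python, not the authors' Verilog. The class name, the 256-word depth (chosen to match an 8-bit address), the read-before-write ordering and the rule that port B wins a same-address write conflict are all assumptions made for illustration.

```python
class DualPortMemory:
    """Behavioural model of an 8-bit-wide dual-port RAM (hypothetical)."""

    def __init__(self, depth=256):
        self.cells = [0] * depth

    def clock(self, port_a, port_b):
        """Apply one clock edge.

        Each port is a tuple (address, write_enable, data_in).
        Returns (data_out_a, data_out_b).
        """
        ports = (port_a, port_b)
        # Assumption: reads return the contents held before this edge's writes.
        outputs = tuple(self.cells[address] for address, _, _ in ports)
        # Assumption: on a same-address conflict, port B is applied last and wins.
        for address, write_enable, data_in in ports:
            if write_enable:
                self.cells[address] = data_in & 0xFF
        return outputs


mem = DualPortMemory()
# Both cores write to different addresses in the same clock.
mem.clock((0x10, True, 0xAB), (0x20, True, 0xCD))
# In the next clock, each core reads the value the other core wrote.
out_a, out_b = mem.clock((0x20, False, 0), (0x10, False, 0))
print(hex(out_a), hex(out_b))
```

Because both ports are serviced in one call to `clock`, neither core waits for an arbiter, which is the property the proposed architecture relies on.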

I. **PROCESSOR DESIGN:**

An unpipelined implementation is neither the most economical nor the highest-performance implementation. Instead, it is designed to lead naturally to a pipelined implementation. The number of dependent steps varies with the machine architecture. A non-pipelined processor executes only a single instruction at a time. The specification of the processor design is as follows: 23 instructions (6 arithmetic + 8 logical + 4 datapath + 5 branching); pipelined architecture; four-stage instruction execution (IF, ID, IE, ST); Harvard memory architecture (separate code memory and main memory); 8-bit data bus and 8-bit address bus; 8-bit memory-mapped I/O register; 1 special-purpose status register; 13 general-purpose CPU registers; uniform instruction width for all instructions.
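
As an illustration of executing a single instruction at a time through the four stages, the following Python sketch models a toy core with 13 general-purpose 8-bit registers and a status register. The opcodes (`LDI`, `ADD`, `AND`), the instruction tuple format and the zero/carry flags are hypothetical; the paper does not list its instruction encoding.

```python
class TinyCore:
    """Toy non-pipelined core: each instruction completes IF, ID, IE and ST
    before the next one is fetched. Opcodes and flags are hypothetical."""

    NUM_REGS = 13  # general-purpose registers, per the specification

    def __init__(self, program):
        self.program = program          # code memory, separate from data (Harvard)
        self.regs = [0] * self.NUM_REGS
        self.zero = False               # assumed status-register flags
        self.carry = False
        self.pc = 0

    def step(self):
        instruction = self.program[self.pc]       # IF: instruction fetch
        opcode, dest, operand = instruction       # ID: instruction decode
        if opcode == "LDI":                       # IE: instruction execute
            result = operand                      # load immediate value
        elif opcode == "ADD":
            result = self.regs[dest] + self.regs[operand]
        elif opcode == "AND":
            result = self.regs[dest] & self.regs[operand]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        self.carry = result > 0xFF                # ST: store result and flags
        self.regs[dest] = result & 0xFF
        self.zero = self.regs[dest] == 0
        self.pc += 1

    def run(self):
        while self.pc < len(self.program):
            self.step()


core = TinyCore([("LDI", 0, 200), ("LDI", 1, 100), ("ADD", 0, 1)])
core.run()
print(core.regs[0], core.carry, core.zero)
```

The 8-bit mask on the stored result mirrors the 8-bit data bus: 200 + 100 wraps to 44 and sets the assumed carry flag.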

II. **SHARED MEMORY:**

Shared-memory machines usually support the caching of both shared and private data. Private data are used by a single processor, while shared data are used by multiple processors, essentially providing communication among the processors through reads and writes of the shared data.
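
Communication through reads and writes of shared data can be sketched as a one-word mailbox. The Python example below is an assumption-laden illustration: the addresses `DATA_ADDR` and `FLAG_ADDR`, the flag protocol and the function names are hypothetical and are not taken from the design.

```python
DATA_ADDR = 0x00   # hypothetical mailbox data word
FLAG_ADDR = 0x01   # hypothetical "data ready" flag


def producer_step(shared, value):
    """Core A: publish a value, then raise the ready flag."""
    shared[DATA_ADDR] = value & 0xFF
    shared[FLAG_ADDR] = 1


def consumer_step(shared):
    """Core B: return the value if the flag is set (and clear it), else None."""
    if shared[FLAG_ADDR] == 1:
        shared[FLAG_ADDR] = 0
        return shared[DATA_ADDR]
    return None


shared = [0] * 256                    # shared memory visible to both cores
first_poll = consumer_step(shared)    # nothing published yet
producer_step(shared, 0x5A)
second_poll = consumer_step(shared)   # value is now available
third_poll = consumer_step(shared)    # flag was cleared by the previous read
print(first_poll, second_poll, third_poll)
```

Writing the data word before the flag means the consumer never observes the flag without valid data, provided the writes become visible in program order.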

When a private item is cached, its location is migrated to the cache, reducing the average access time as well as the memory bandwidth required. Since no other processor uses the data, the program behavior is identical to that in a uniprocessor. Each individual processor is simulated and verified, and the processors are then combined into the dual-core design. The output of each operation appears sequentially on the out bus, as shown in the simulation figure.

Figure 6: Schematic of memory

Figure 8: RTL Schematic of memory design

Figure 9: Memory with dual core design

Figure 7: Flow chart of shared memory

Each processor core was simulated, implemented on hardware and verified for its arithmetic and logical functions. The dual-core design was also implemented on hardware, and parallel data operations were performed. The time a single processor takes to perform multiple operations is reduced by the dual core, which performs the operations in parallel. After synthesis, the logic resource utilization was low compared with the resources available on the device.
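
The timing argument can be made concrete with an idealized cycle count. The Python sketch below assumes one instruction per clock per core, fully independent operations and no contention for the shared memory; it gives an upper bound on the benefit, not a measured result, and the function names are illustrative.

```python
import math


def cycles_needed(num_ops, num_cores, cycles_per_instruction=1):
    """Ideal cycle count when independent operations are split evenly across cores."""
    return math.ceil(num_ops / num_cores) * cycles_per_instruction


def effective_rate_mhz(clock_mhz, num_cores):
    """Ideal aggregate instruction rate: clock frequency times number of cores."""
    return clock_mhz * num_cores


if __name__ == "__main__":
    print("single core, 100 ops:", cycles_needed(100, 1), "cycles")
    print("dual core, 100 ops:", cycles_needed(100, 2), "cycles")
    print("150 MHz x 4 cores:", effective_rate_mhz(150, 4), "MHz")
```

With these assumptions, 150 MHz across four cores reproduces the 600 MHz figure quoted in the abstract, while two cores would give 300 MHz.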

**AUTHORS PROFILE:**

**Mr. Venkategowda N** received his Bachelor of Engineering degree in Medical Electronics from Visveswaraya Technical University, Belgaum, Karnataka, India in 2010 and his Post Graduate degree in VLSI Design and Embedded Systems from VTU University, Karnataka, India in 2012. He started his career as an Assistant Professor at AIT Tumkur and Sampoorna Institute of Technology and Research, VTU. He is currently working as an Assistant Professor at Mangalore Institute of Technology and Engineering, Mangalore, Karnataka, India. His research interests include VLSI, embedded systems, image processing and medical electronics.

**Mr. Basavaraj H.J** received his Bachelor of Engineering degree in Electronics and Instrumentation from Kuvempu University, Belgaum, Karnataka, India in 2007 and his Post Graduate degree in Digital Signal Processing from VTU University, Karnataka, India in 2012. He started his career as an Assistant Professor at MITE, Mangalore. He is currently working as an Assistant Professor at Mangalore Institute of Technology and Engineering, Mangalore, Karnataka, India.

**Mr. Ajay Pinto** received his Bachelor of Engineering degree in Electronics and Communication from Visveswaraya Technical University, Belgaum, Karnataka, India in 2010 and his Post Graduate degree in Digital Electronics and Communication Systems from VTU, Karnataka, India in 2012. He started his career as an Assistant Professor at SSE Mukka, Mangalore. He is currently working as an Assistant Professor at Mangalore Institute of Technology and Engineering, Mangalore, Karnataka, India. His research interests include electronics and communication systems and power systems.

**Mr. Naveen Pai** received his Bachelor of Engineering degree in Electronics and Communication from Visveswaraya Technical University, Belgaum, Karnataka, India in 2009 and his Post Graduate degree in Digital Electronics and Communication Systems from VTU, Karnataka, India in 2013. He started his career as an Assistant Professor at Dr MVSIT, Mangalore, India. His research interests include electronics and communication systems, power systems and VLSI.