Wednesday, March 17, 2010

C to Verilog


Welcome to C-to-Verilog

 

C-to-Verilog is a free service for circuit designers. Visitors can create hardware circuits using the service provided on this website: users submit their C programs and download a Verilog module that can then be embedded on FPGAs. The website automatically synthesizes the C program into a Verilog module. For additional information on how to use the website, watch the screencast.

Please log in at:

http://www.c-to-verilog.com/index.html

Setup and Hold times

Many designers are familiar with the definitions of setup and hold time; however, few can correctly identify the launch and capture edges, and the slack or violation, between two flops during timing analysis. In this post, we will cover setup and hold times in a design with clear examples.

Setup time is defined as the minimum amount of time BEFORE the clock’s active edge by which the data must be stable for it to be latched correctly. Any violation in this minimum required time causes incorrect data to be captured and is known as setup violation.

Hold time is defined as the minimum amount of time AFTER the clock’s active edge during which the data must be stable. Any violation in this required time causes incorrect data to be latched and is known as hold violation.

The setup time in a design determines the maximum frequency at which the chip can run without timing failures. The factors affecting setup analysis are the clock period Tclk, the clock-to-Q propagation delay of the launch flop Tck->q, the negative clock skew Tskew, the setup time requirement of the capture flop Tfs, and the combinational logic delay Tcomb between the two flops being timed. The following condition must be satisfied:

Tfs <= Tclk – Tck->q – Tskew – Tcomb


Hold analysis depends on Tck->q, the combinational logic delay, the clock skew and the hold time requirement Tfh of the capture flop. It is independent of the clock frequency. The condition below must be satisfied:

Tck->q + Tskew + Tcomb >= Tfh
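
The two conditions can also be checked directly in code. The snippet below is only a minimal sketch that evaluates the setup and hold slack for one flop-to-flop path using the symbols defined above; the numbers are illustrative placeholders, and the skew follows the sign convention used in the two conditions.

#include <stdio.h>

/* Minimal sketch: evaluate the setup and hold conditions for one
   flop-to-flop path. All times are in nanoseconds; the names mirror
   the symbols used in the text. Illustrative values only. */
int main(void) {
    double Tclk  = 8.0;   /* clock period                          */
    double Tckq  = 0.1;   /* clock-to-Q delay of the launch flop    */
    double Tskew = 0.25;  /* clock skew (sign convention as above)  */
    double Tcomb = 5.0;   /* combinational logic delay              */
    double Tfs   = 0.1;   /* setup requirement of the capture flop  */
    double Tfh   = 0.1;   /* hold requirement of the capture flop   */

    double setup_slack = (Tclk - Tckq - Tskew - Tcomb) - Tfs;
    double hold_slack  = (Tckq + Tskew + Tcomb) - Tfh;

    printf("setup slack = %.2f ns (%s)\n", setup_slack,
           setup_slack >= 0 ? "met" : "violated");
    printf("hold slack  = %.2f ns (%s)\n", hold_slack,
           hold_slack >= 0 ? "met" : "violated");
    return 0;
}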



Consider the figure below depicting a flop to flop path in the same domain with some combinational logic between them. We will now calculate the setup and hold time slacks in the design based on the given timing parameters.


 Setup and Hold time illustration - Full cycle transfer




For setup checks in single-cycle paths, the relevant clock edges are shown in the figure above. The data required time for the capture flop B to meet setup is



Data Required time = (Clock Period + Clock Insertion Delay + Clock Skew - Setup time of the flop) = 8 + 2 + 0.25 - 0.1 = 10.15 ns

The data arrival time from the launch flop is



Data Arrival time = (Clock Insertion Delay + CK->Q Delay of the launch flop + Combinational logic Delay) = 2 + 0.1 + 5 = 7.1 ns.



Setup slack is



Setup Margin = Data Required Time - Data Arrival Time = 10.15 - 7.10 = 3.05 ns



Similarly, for hold checks, assuming the hold time requirement of flop B is 100 ps, the data expected time is

Data expected time = (Clock Insertion Delay + Clock Skew + Hold time requirement of the flop) = 2 + 0.25 + 0.1 = 2.35 ns.



So the hold time slack is



Hold Margin = Data Arrival time - Data expected time = 7.10 - 2.35 = 4.85 ns



Consider the case where the clock to flop B is inverted (or the flop is negative-edge triggered). In this case, the relevant edges for setup/hold are as shown in the figure below.






Setup and Hold time illustration - Half cycle transfer


In this scenario, with all the other parameters unchanged, the data required time is



Data Required time = (Half Clock Period + Clock Insertion Delay + Clock Skew - Setup time required for flop B) = 4 + 2 + 0.25 - 0.1 = 6.15 ns


Since the Data Arrival time remains the same, there is a setup violation of


Setup violation = 6.15 ns - 7.10 ns = -1.05 ns



There is no hold violation since the data arrival time remains the same, while the data expected time is any time after (Clock Skew + Hold time requirement of flop B):


Data expected time = 0.25 + 0.1 = 0.35 ns



Hold Margin = 7.10 - 0.35 = 6.75 ns



courtesy : http://nigamanth.net/vlsi/2007/09/13/setup-and-hold-times/

Hardware Description Languages

What is an HDL? 

HDL (Hardware Description Language): a hardware description language is any language from a class of computer languages for the formal description of electronic circuits. It can describe a circuit's operation, its design, and the tests used to verify its operation by means of simulation.

HDLs are used to write specifications of a piece of hardware.

An HDL specifies a model for the expected behaviour of a circuit before that circuit is designed and built. The end result is a silicon chip that is manufactured in a fab.

A simulation program, designed to implement the underlying semantics of the language statements, coupled with simulating the progress of time, provides the hardware designer with the ability to model a piece of hardware before it is physically created.

HDLs find applications in Programmable Logic Devices (PLDs), from simple PLDs to complex devices such as the CPLD (Complex Programmable Logic Device) and the FPGA (Field Programmable Gate Array).

HDLs in use today – ABEL, PALASM, CUPL for less complex devices;  VHDL, Verilog for larger CPLD & FPGA devices.

Two applications of HDL processing: Simulation and Synthesis.

Thanks to Moore’s Law, the number of programmable logic gates (e.g. AND gates, NAND gates, etc.) in today’s chips is now in the millions.

With such electronic capacities on a single chip, it is now possible to place whole electronic systems on a chip.

Using HDLs, it is almost as easy to program hardware as it is to program software. However, one needs to understand the principles of digital electronic design (e.g. multiplexers, flip-flops, buffers, counters, etc.).

Why simulate first?

Physical bread-boarding is not possible as designs reach higher levels of integration.

A simulator interprets the HDL description and produces a readable output, such as a timing diagram, that predicts how the hardware will behave before it is actually fabricated.

Simulation allows the detection of functional errors in a design without having to physically create the circuit.

Logic Simulation

The stimulus that tests the functionality of the design is called a test bench.

To simulate, the design is first described in an HDL and then verified by simulating it and checking it against a test bench, which is also written in HDL.

Logic simulation is a fast, accurate method of analyzing a circuit by checking its functionality using waveforms.


HDL’s in demand today

Two standard HDLs are supported by the IEEE:

  • VHDL (VHSIC Hardware Description Language, where VHSIC stands for Very-High-Speed Integrated Circuit) – developed from an initiative by the U.S. Dept. of Defense.
  • Verilog HDL – developed by Gateway Design Automation (later acquired by Cadence Design Systems) and transferred to a consortium called Open Verilog International (OVI).


Tutorials in Communication and Signal Processing







Hi friends, here I have attached links to detailed descriptions of the following subjects. Please go through the links.

For a direct link to the 'Table of contents', please click here

For publications click here



Multirate DSP

As per the request from one of our viewers:


Multi-rate processing and sample rate conversion, or interpolation and decimation as they are known, are clever digital signal processing (DSP) techniques that broadband and wireless design engineers can employ during the system design process. Using these techniques, design engineers gain an added degree of freedom that can improve the overall performance of a system architecture.

Multi-rate processing finds use in signal processing systems where sub-systems with differing sample or clock rates need to be interfaced. At other times, multi-rate processing is used to reduce the computational overhead of a system. For example, suppose an algorithm requires k operations per sample. By reducing the sample rate of a signal or system by a factor of M, the arithmetic bandwidth requirement is reduced from k·fs operations per second to k·fs/M operations per second.
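
To make the saving concrete, here is a minimal C sketch of a decimator: the output is computed only at every M-th input instant, so the multiply-accumulate rate drops by the factor M. The 4-tap moving-average filter standing in for the anti-alias filter is purely an assumption for illustration, not a recommended design.

#include <stddef.h>

#define NTAPS 4  /* assumed toy anti-alias filter length */

/* Decimate x[0..n-1] by the integer factor M (M >= 1): filter with a
   4-tap moving average and keep only every M-th output sample.
   Returns the number of output samples written to y (roughly n/M). */
size_t decimate(const double *x, size_t n, int M, double *y) {
    static const double h[NTAPS] = {0.25, 0.25, 0.25, 0.25};
    size_t out = 0;
    /* The filter is evaluated only at the retained instants
       i = 3, 3+M, 3+2M, ..., so the arithmetic per second drops by M. */
    for (size_t i = NTAPS - 1; i < n; i += (size_t)M) {
        double acc = 0.0;
        for (int t = 0; t < NTAPS; t++)
            acc += h[t] * x[i - (size_t)t];
        y[out++] = acc;
    }
    return out;
}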

The above applications provide a glimpse of just a few of the communication applications where using multi-rate processing and sample-rate conversion makes sense. But to implement these techniques, designers must first better understand how they work.

For more details, please go through the link:
http://www.commsdesign.com/showArticle.jhtml?articleID=16504259


pdf


www2.ee.ic.ac.uk/hp/staff/pnaylor/notes/DSP5.pdf
www.spsc.tugraz.at/courses/dsplab/multirate/multirate.pdf
eceweb.uccs.edu/wickert/ece5650/notes/sampling_theory/multirate_sim.pdf

ppt

vada.skku.ac.kr/ClassInfo/system_level_design/sdr_slides/Multirate-1.ppt
vada.skku.ac.kr/ClassInfo/system_level_design/sdr_slides/Multirate-2.ppt

DRAM types

Asynchronous DRAM is the basic form, from which all others are derived. An asynchronous DRAM chip has power connections, some number of address inputs (typically 12), and a few (typically 1 or 4) bidirectional data lines. There are four active-low control signals:
  • /RAS, the Row Address Strobe. The address inputs are captured on the falling edge of /RAS, and select a row to open. The row is held open as long as /RAS is low.
  • /CAS, the Column Address Strobe. The address inputs are captured on the falling edge of /CAS, and select a column from the currently open row to read or write.
  • /WE, Write Enable. This signal determines whether a given falling edge of /CAS is a read (if high) or write (if low). If low, the data inputs are also captured on the falling edge of /CAS.
  • /OE, Output Enable. This is an additional signal that controls output to the data I/O pins. The data pins are driven by the DRAM chip if /RAS and /CAS are low, /WE is high, and /OE is low. In many applications, /OE can be permanently connected low (output always enabled), but it can be useful when connecting multiple memory chips in parallel.
This interface provides direct control of internal timing. When /RAS is driven low, a /CAS cycle must not be attempted until the sense amplifiers have sensed the memory state, and /RAS must not be returned high until the storage cells have been refreshed. When /RAS is driven high, it must be held high long enough for precharging to complete.
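
The read-path gating just described can be summarized in a few lines of code. The helper below is only a behavioral sketch of when the chip drives its data pins, written against hypothetical boolean pin levels; it ignores all of the timing requirements mentioned above.

#include <stdbool.h>

/* Behavioral sketch: the DRAM drives its data pins only when a row is
   open (/RAS low), a column is strobed (/CAS low), the access is a
   read (/WE high) and outputs are enabled (/OE low). The arguments are
   the raw levels of the active-low pins. */
bool data_pins_driven(bool ras_n, bool cas_n, bool we_n, bool oe_n) {
    return !ras_n && !cas_n && we_n && !oe_n;
}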

Asynchronous DRAM Types

  1. FPM RAM
  2. EDO RAM
  3. Burst EDO RAM
  4. Video RAM
Fast Page Mode (FPM) DRAM or FPRAM
Fast page mode DRAM is also called FPM DRAM, Page mode DRAM, Fast page mode memory, or Page mode memory.

In page mode, a row of the DRAM can be kept "open" by holding /RAS low while performing multiple reads or writes with separate pulses of /CAS, so that successive reads or writes within the row do not suffer the delay of precharging and re-accessing the row. This increases the performance of the system when reading or writing bursts of data.

Static column is a variant of page mode in which the column address does not need to be strobed in, but rather, the address inputs may be changed with /CAS held low, and the data output will be updated accordingly a few nanoseconds later.

Nibble mode is another variant in which four sequential locations within the row can be accessed with four consecutive pulses of /CAS. The difference from normal page mode is that the address inputs are not used for the second through fourth /CAS edges; they are generated internally starting with the address supplied for the first /CAS edge.

Extended Data Out (EDO) DRAM
EDO DRAM is similar to Fast Page Mode DRAM with the additional feature that a new access cycle can be started while keeping the data output of the previous cycle active. This allows a certain amount of overlap in operation (pipelining), allowing somewhat improved speed. It was 5% faster than Fast Page Mode DRAM, which it began to replace in 1993.

To be precise, EDO DRAM begins data output on the falling edge of /CAS, but does not stop the output when /CAS rises again. It holds the output valid (thus extending the data output time) until either /RAS is deasserted, or a new /CAS falling edge selects a different column address.

Single-cycle EDO has the ability to carry out a complete memory transaction in one clock cycle. Otherwise, each sequential RAM access within the same page takes two clock cycles instead of three, once the page has been selected. EDO's speed and capabilities allowed it to somewhat replace the then-slow L2 caches of PCs. It created an opportunity to reduce the immense performance loss associated with a lack of L2 cache, while making systems cheaper to build. This was also good for notebooks due to difficulties with their limited form factor, and battery life limitations. An EDO system with L2 cache was tangibly faster than the older FPM/L2 combination.

Single-cycle EDO DRAM became very popular on video cards towards the end of the 1990s. It was very low cost, yet nearly as efficient for performance as the far more costly VRAM.

Burst EDO (BEDO) DRAM
An evolution of the former, Burst EDO DRAM, could process four memory addresses in one burst, for a maximum of 5-1-1-1, saving an additional three clocks over optimally designed EDO memory. It was done by adding an address counter on the chip to keep track of the next address. BEDO also added a pipelined stage allowing page-access cycle to be divided into two components. During a memory-read operation, the first component accessed the data from the memory array to the output stage (second latch). The second component drove the data bus from this latch at the appropriate logic level. Since the data is already in the output buffer, faster access time is achieved (up to 50% for large blocks of data) than with traditional EDO.

Although BEDO DRAM showed additional optimization over EDO, by the time it was available the market had made a significant investment towards synchronous DRAM, or SDRAM . Even though BEDO RAM was superior to SDRAM in some ways, the latter technology gained significant traction and quickly displaced BEDO.

Video DRAM (VRAM)


VRAM is a dual-ported variant of DRAM which was once commonly used to store the frame-buffer in some graphics adaptors.

It was invented by F. Dill and R. Matick at IBM Research in 1980, with a patent issued in 1985 (US Patent 4,541,075). The first commercial use of VRAM was in the high resolution graphics adapter introduced in 1986 by IBM with the PC/RT system.

VRAM has two sets of data output pins, and thus two ports that can be used simultaneously. The first port, the DRAM port, is accessed by the host computer in a manner very similar to traditional DRAM. The second port, the video port, is typically read-only and is dedicated to providing a high-speed data channel for the graphics chipset.

Typical DRAM arrays normally access a full row of bits (i.e. a word line), up to 1024 bits at a time, but use only one or a few of these for actual data; the remainder is discarded. Since DRAM cells are destructively read, each bit accessed must be sensed and re-written, so typically 1024 sense amplifiers are used. VRAM operates by not discarding the excess bits which must be accessed, but making full use of them in a simple way. If each horizontal scan line of a display is mapped to a full word, then upon reading one word and latching all 1024 bits into a separate row buffer, these bits can subsequently be serially streamed to the display circuitry. This leaves the DRAM array free to be accessed (read or write) for many cycles, until the row buffer is almost depleted. A complete DRAM read cycle is only required to fill the row buffer, leaving most DRAM cycles available for normal accesses.

Such operation is described in the paper "All points addressable raster display memory" by R. Matick, D. Ling, S. Gupta, and F. Dill, IBM Journal of R&D, Vol 28, No. 4, July 1984, pp379-393. To use the video port, the controller first uses the DRAM port to select the row of the memory array that is to be displayed. The VRAM then copies that entire row to an internal row-buffer which is a shift-register. The controller can then continue to use the DRAM port for drawing objects on the display. Meanwhile, the controller feeds a clock called the shift clock (SCLK) to the VRAM's video port. Each SCLK pulse causes the VRAM to deliver the next datum, in strict address order, from the shift-register to the video port. For simplicity, the graphics adapter is usually designed so that the contents of a row, and therefore the contents of the shift-register, corresponds to a complete horizontal line on the display.
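
The dual-port behavior can be pictured with a small toy model: one call stands for the row transfer triggered through the DRAM port, and each SCLK pulse then pops the next bit for the display, leaving the DRAM port free in between. The 1024-bit row and the one-array-entry-per-bit representation are assumptions made purely for illustration.

#include <stdint.h>
#include <string.h>

#define ROW_BITS 1024  /* assumed row width */

/* Toy model of the VRAM serial (video) port. */
struct vram_serial_port {
    uint8_t row_buffer[ROW_BITS]; /* internal shift register, one entry per bit */
    unsigned next;                /* next bit to stream out */
};

/* Corresponds to the full-row DRAM read cycle that fills the shift register. */
void load_row(struct vram_serial_port *p, const uint8_t row[ROW_BITS]) {
    memcpy(p->row_buffer, row, ROW_BITS);
    p->next = 0;
}

/* One SCLK pulse: deliver the next bit in strict address order
   (wraps around here for simplicity; real hardware needs a reload). */
uint8_t sclk_shift(struct vram_serial_port *p) {
    return p->row_buffer[p->next++ % ROW_BITS];
}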

In the late 1990s, standard DRAM technologies (e.g. SDRAM) became cheap, dense, and fast enough to completely displace VRAM, even though it was only single-ported and some memory bits were wasted.

Synchronous Dynamic RAM (SDRAM)

Single Data Rate (SDR) SDRAM is a synchronous form of DRAM.

  1. SDR SDRAM
  2. DDR SDRAM
  3. DDR2 SDRAM
  4. DDR3 SDRAM
  5. DDR4 SDRAM
  6. Rambus RAM
  7. VC-RAM
  8. SGRAM
  9. GDDR2
  10. GDDR3
  11. GDDR4
  12. GDDR5
SDR SDRAM
Single Data Rate (SDR) SDRAM is a synchronous form of DRAM.

DDR SDRAM
DDR SDRAM (double data rate synchronous dynamic random access memory) is a class of memory integrated circuit used in computers. It achieves nearly twice the bandwidth of the preceding [single data rate] SDRAM by double pumping (transferring data on the rising and falling edges of the clock signal) without increasing the clock frequency.

With data being transferred 64 bits at a time, DDR SDRAM gives a transfer rate of (memory bus clock rate) × 2 (for double data rate) × 64 (number of bits transferred) / 8 (number of bits per byte). Thus, with a bus frequency of 100 MHz, DDR SDRAM gives a maximum transfer rate of 1600 MB/s.
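
The same arithmetic applies to the other speed grades; the small helper below simply evaluates bus clock × 2 × bus width / 8 and reproduces the 1600 MB/s figure for a 100 MHz, 64-bit bus.

#include <stdio.h>

/* Peak transfer rate in MB/s for a double-pumped bus:
   clock (MHz) x 2 transfers per clock x bus width (bits) / 8 bits per byte. */
static double ddr_peak_mb_per_s(double bus_clock_mhz, int bus_width_bits) {
    return bus_clock_mhz * 2.0 * bus_width_bits / 8.0;
}

int main(void) {
    printf("%.0f MB/s\n", ddr_peak_mb_per_s(100.0, 64)); /* prints 1600 */
    return 0;
}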

JEDEC has set standards for DDR SDRAM speeds, divided into two parts: the first specification is for memory chips and the second is for memory modules. As DDR SDRAM has been superseded by the newer DDR2 SDRAM, the older version is sometimes referred to as DDR1 SDRAM.



All of the standard speed grades (except DDR-300 [1]) are specified by JEDEC as JESD79. RAM speeds in between or above these specifications are not standardized by JEDEC — most often they are simply manufacturer optimizations using higher-tolerance or overvolted chips.

The package sizes in which DDR SDRAM is manufactured are also standardized by JEDEC.

There is no architectural difference between DDR SDRAM designed for different clock frequencies, e.g. PC-1600 (designed to run at 100 MHz) and PC-2100 (designed to run at 133 MHz). The number simply designates the speed that the chip is guaranteed to run at, hence DDR SDRAM can be run at either lower[2] or higher clock speeds than those for which it was made. These practices are known as underclocking and overclocking respectively.

DDR SDRAM DIMMs for desktop computers have 184 pins (as opposed to 168 pins on SDRAM, or 240 pins on DDR2 SDRAM), and can be differentiated from SDRAM DIMMs by the number of notches (DDR SDRAM has one, SDRAM has two). DDR SO-DIMMs for notebook computers have 200 pins, which is the same number of pins as DDR2 SO-DIMMs. These two specifications are notched very similarly, and care must be taken during insertion if you are unsure of a correct match. DDR SDRAM operates at a voltage of 2.5 V, compared to 3.3 V for SDRAM, which can significantly reduce power usage. Chips and modules built to the DDR-400/PC-3200 standard have a nominal voltage of 2.6 V.

Many new chipsets use these memory types in dual-channel configurations, which doubles or quadruples the effective bandwidth.

DDR2 SDRAM
DDR2 SDRAM or double-data-rate two synchronous dynamic random access memory is a random access memory technology used in electronic engineering for high speed storage of the working data of a computer or other digital electronic device.

It is a part of the SDRAM (synchronous dynamic random access memory) family of technologies, which is one of many DRAM (dynamic random access memory) implementations, and is an evolutionary improvement over its predecessor, DDR SDRAM.

Its primary benefit is the ability to operate the external data bus twice as fast as DDR SDRAM. This is achieved by improved bus signaling, and by operating the memory cells at half the clock rate (one quarter of the data transfer rate), rather than at the clock rate as in the original DDR. DDR2 memory at the same clock speed as DDR will provide the same bandwidth but markedly higher latency, providing worse performance.


Like all SDRAM implementations, DDR2 stores memory in memory cells that are activated with the use of a clock signal to synchronize their operation with an external data bus. Like DDR before it, DDR2 cells transfer data both on the rising and falling edge of the clock (a technique called "double pumping"). The key difference between DDR and DDR2 is that in DDR2 the bus is clocked at twice the speed of the memory cells, so four bits of data can be transferred per memory cell cycle. Thus, without speeding up the memory cells themselves, DDR2 can effectively operate at twice the bus speed of DDR.

DDR2's bus frequency is boosted by electrical interface improvements, on-die termination, prefetch buffers and off-chip drivers. However, latency is greatly increased as a trade-off. The DDR2 prefetch buffer is 4 bits deep, whereas it is 2 bits deep for DDR and 8 bits deep for DDR3. While DDR SDRAM has typical read latencies of between 2 and 3 bus cycles, DDR2 may have read latencies between 4 and 6 cycles. Thus, DDR2 memory must be operated at twice the bus speed to achieve the same latency.

Another cost of the increased speed is the requirement that the chips are packaged in a more expensive and more difficult to assemble BGA package as compared to the TSSOP package of the previous memory generations such as DDR SDRAM and SDR SDRAM. This packaging change was necessary to maintain signal integrity at higher speeds.

Power savings are achieved primarily due to an improved manufacturing process through die shrinkage, resulting in a drop in operating voltage (1.8 V compared to DDR's 2.5 V). The lower memory clock frequency may also enable power reductions in applications that do not require the highest available speed.

According to JEDEC[1] the maximum recommended voltage is 1.9 volts and should be considered the absolute maximum when memory stability is an issue (such as in servers or other mission critical devices). In addition, JEDEC states that memory modules must withstand up to 2.3 volts before incurring permanent damage (although they may not actually function correctly at that level).

DDR3 SDRAM

In electronic engineering, DDR3 SDRAM or double-data-rate three synchronous dynamic random access memory is a random access memory technology used for high speed storage of the working data of a computer or other digital electronic device.

DDR3 is part of the SDRAM family of technologies and is one of the many DRAM (dynamic random access memory) implementations. DDR3 SDRAM is an improvement over its predecessor, DDR2 SDRAM.

The primary benefit of DDR3 is the ability to transfer I/O data at eight times the speed of the memory cells it contains, thus enabling faster bus speeds and higher peak throughput than earlier memory technologies. However, there is no corresponding reduction in latency, which is therefore proportionally higher. In addition, the DDR3 standard allows for chip capacities of 512 megabits to 8 gigabits, effectively enabling a maximum memory module size of 16 gigabytes.

DDR3 memory promises a power consumption reduction of 30% compared to current commercial DDR2 modules due to DDR3's 1.5 V supply voltage, compared to DDR2's 1.8 V or DDR's 2.5 V. The 1.5 V supply voltage works well with the 90 nanometer fabrication technology used for most DDR3 chips. Some manufacturers further propose using "dual-gate" transistors to reduce leakage of current.[1]

According to JEDEC[2] the maximum recommended voltage is 1.575 volts and should be considered the absolute maximum when memory stability is the foremost consideration, such as in servers or other mission critical devices. In addition, JEDEC states that memory modules must withstand up to 1.975 volts before incurring permanent damage, although they may not actually function correctly at that level.

The main benefit of DDR3 comes from the higher bandwidth made possible by DDR3's 8 bit deep prefetch buffer, in contrast to DDR2's 4 bit prefetch buffer or DDR's 2 bit buffer.

DDR3 modules can transfer data at an effective clock rate of 800–1600 MHz using both the rising and falling edges of a 400–800 MHz I/O clock. In comparison, DDR2's current range of effective data transfer rates is 400–800 MHz using a 200–400 MHz I/O clock, and DDR's range is 200–400 MHz based on a 100–200 MHz I/O clock. To date, the graphics card market has been the driver of such bandwidth requirements, where fast data transfer between framebuffers is required.

DDR3 prototypes were announced in early 2005. Products in the form of motherboards are appearing on the market as of mid-2007[3] based on Intel's P35 "Bearlake" chipset and memory DIMMs at speeds up to DDR3-1600 (PC3-12800).[4] AMD's roadmap indicates their own adoption of DDR3 in 2008.

DDR3 DIMMs have 240 pins, the same number as DDR2, and are the same size, but they are electrically incompatible and have a different key notch location.

DDR4 SDRAM

DDR4 SDRAM will be the successor to DDR3 SDRAM. It was revealed at the Intel Developer Forum in San Francisco, is currently in the design stage, and is expected to be released in 2012.

The new chips are expected to run at 1.2 volts or below, versus the 1.5 volts of DDR3 chips, and to achieve data transfer rates in excess of 2000 million transfers per second.

RDRAM

The first PC motherboards with support for RDRAM debuted in 1999. They supported PC-800 RDRAM, which operated at 400 MHz and delivered 1600 MB/s of bandwidth over a 16-bit bus using a 184-pin RIMM form factor. Data is transferred on both the rising and falling edges of the clock signal, a technique known as double data rate. For marketing reasons the physical clock rate was multiplied by two (because of the DDR operation); therefore, the 400 MHz Rambus standard was named PC-800. This was significantly faster than the previous standard, PC-133 SDRAM, which operated at 133 MHz and delivered 1066 MB/s of bandwidth over a 64-bit bus using a 168-pin DIMM form factor.

Moreover, if a mainboard has a dual- or quad-channel memory subsystem, all of the memory channels must be upgraded simultaneously. Sixteen-bit modules provide one channel of memory, while 32-bit modules provide two channels. Therefore, a dual channel mainboard accepting 16-bit modules must have RIMMs added or removed in pairs. A dual channel mainboard accepting 32-bit modules can have single RIMMs added or removed as well.


  • PC600: 16-bit, single channel RIMM, specified to operate at 300 MHz clock speed, 1200 MB/s bandwidth
  • PC700: 16-bit, single channel RIMM, specified to operate at 355 MHz clock speed, 1420 MB/s bandwidth
  • PC800: 16-bit, single channel RIMM, specified to operate at 400 MHz clock speed, 1600 MB/s bandwidth
  • PC1066 (RIMM 2100): 16-bit, single channel RIMM specified to operate at 533 MHz clock speed, 2133 MB/s bandwidth
  • PC1200 (RIMM 2400): 16-bit, single channel RIMM specified to operate at 600 MHz clock speed, 2400 MB/s bandwidth
  • RIMM 3200: 32-bit, dual channel RIMM specified to operate at 400 MHz clock speed, 3200 MB/s bandwidth
  • RIMM 4200: 32-bit, dual channel RIMM specified to operate at 533 MHz clock speed, 4200 MB/s bandwidth
  • RIMM 4800: 32-bit, dual channel RIMM specified to operate at 600 MHz clock speed, 4800 MB/s bandwidth
  • RIMM 6400: 32-bit, dual channel RIMM specified to operate at 800 MHz clock speed, 6400 MB/s bandwidth

Video game consoles: Rambus's RDRAM saw use in several video game consoles, beginning in 1996 with the Nintendo 64. The Nintendo console used 4 MB of RDRAM running with a 500 MHz clock on a 9-bit bus, providing 500 MB/s of bandwidth. RDRAM allowed the N64 to be equipped with a large amount of memory bandwidth while maintaining a lower cost due to design simplicity: RDRAM's narrow bus allows circuit board designers to use simpler design techniques to minimize cost. The memory, however, was disliked for its high random access latencies. In the N64, the RDRAM modules are cooled by a passive heatspreader assembly.

Sony used RDRAM in the PlayStation 2. The PS2 was equipped with 32 MB of the memory and implemented a dual-channel configuration, resulting in 3200 MB/s of available bandwidth. The PlayStation 3 uses 256 MB of Rambus's XDR DRAM, which can be considered a successor to RDRAM, on a 64-bit bus at 400 MHz with an octal data rate [1] (cf. double data rate), for an effective 3.2 GHz data rate and a large 204.8 Gbit/s (25.6 GB/s) bandwidth.

Video cards: Cirrus Logic implemented RDRAM support in their Laguna graphics chip, with two members of the family: the 2D-only 5462 and the 5464, a 2D chip with 3D acceleration. RDRAM offered a cost advantage while being potentially faster than competing DRAM technologies thanks to its high bandwidth. The chips were used on the Creative Graphics Blaster MA3xx series, among others.


VC-RAM


Virtual Channel Random Access Memory (VC-RAM for short; also called VC-SDRAM, VCSDRAM, VCDRAM, or VCM) was a proprietary type of SDRAM produced by NEC, but released as an open standard with no licensing fees. VCM creates a state in which the various system processes can be assigned their own virtual channel, thus increasing the overall system efficiency by avoiding the need for processes to share buffer space. This is accomplished by creating different "blocks" of memory, allowing each individual memory block to interface separately with the memory controller and have its own buffer space. The only motherboards ever able to support VC-RAM were for AMD Athlon and Intel Pentium 3 processors. A VC-RAM module is physically similar to an SDR SDRAM module, so VC-RAM capable motherboards are also able to use standard SDR SDRAM, but the two cannot be mixed.

VC-RAM is faster than SDRAM because it has significantly lower latencies. The technology was a potential competitor to Rambus RDRAM, because VC-RAM was not nearly as expensive as RDRAM; however, the technology did not catch on. Instead, NEC adopted double data rate (DDR) SDRAM as the successor to SDR SDRAM.

NEC claimed a 3% boost in Quake benchmarks and a 20% increase in system performance.

GDDR2

The first commercial product to claim using the "DDR2" technology was the NVIDIA GeForce FX 5800 graphics card. However, it is important to note that this GDDR-2 memory used on graphics cards is not DDR2 per se, but rather an early midpoint between DDR and DDR2 technologies. Using "DDR2" to refer to GDDR-2 is a colloquial misnomer. In particular, the performance-enhancing doubling of the I/O clock rate is missing. It had severe overheating issues due to the nominal DDR voltages. ATI has since designed the GDDR technology further into GDDR3, which is more true to the DDR2 specifications, though with several additions suited for graphics cards.

GDDR3 is now commonly used in modern graphics cards and some tablet PCs. However, further confusion has been added to the mix with the appearance of budget and mid-range graphics cards which claim to use "GDDR2". These cards actually use standard DDR2 chips designed for use as main system memory. These chips cannot achieve the clock speeds that GDDR3 can but are inexpensive enough to be used as memory on mid-range cards.

GDDR3


GDDR3, Graphics Double Data Rate 3, is a graphics card-specific memory technology, designed by ATI Technologies with the collaboration of JEDEC.

It has much the same technological base as DDR2, but the power and heat dispersal requirements have been reduced somewhat, allowing for higher-speed memory modules, and simplified cooling systems. Unlike the DDR2 used on graphics cards, GDDR3 is unrelated to the JEDEC DDR3 specification. This memory uses internal terminators, enabling it to better handle certain graphics demands. To improve bandwidth, GDDR3 memory transfers 4 bits of data per pin in 2 clock cycles.

The GDDR3 interface transfers two 32-bit wide data words per clock cycle from the I/O pins. Corresponding to the 4n prefetch, a single write or read access consists of a 128-bit wide, one-clock-cycle data transfer at the internal memory core and four corresponding 32-bit wide, one-half-clock-cycle data transfers at the I/O pins. Single-ended, unidirectional read and write data strobes are transmitted simultaneously with read and write data, respectively, in order to capture data properly at the receivers of both the graphics SDRAM and the controller. Data strobes are organized per byte of the 32-bit wide interface.

GDDR4


GDDR4 SDRAM (Graphics Double Data Rate, version 4) is a type of graphics card memory specified by the JEDEC Semiconductor Memory Standard. Its main competitor appears to be Rambus's XDR DRAM. GDDR4 is the memory successor to GDDR3. It should be noted that neither is related to the JEDEC DDR3 memory standard.

GDDR5


GDDR5 (Graphics Double Data Rate, version 5) is a type of graphics card memory whose standards are set out in the GDDR5 specification by JEDEC. GDDR5 is the successor to GDDR4 and, unlike its predecessors, has two parallel DQ links which provide doubled I/O throughput compared to GDDR4. GDDR5 SGRAM is a high-speed dynamic random-access memory designed for applications requiring high bandwidth. GDDR5 SGRAM uses an 8n prefetch architecture and a DDR interface to achieve high-speed operation, and can be configured to operate in x32 mode or x16 (clamshell) mode, which is detected during device initialization. The GDDR5 interface transfers two 32-bit wide data words per WCK clock cycle to/from the I/O pins. Corresponding to the 8n prefetch, a single write or read access consists of a 256-bit wide, two-CK-clock-cycle data transfer at the internal memory core and eight corresponding 32-bit wide, one-half-WCK-clock-cycle data transfers at the I/O pins.

A quick and easy intro to writing device drivers for Linux like a true kernel developer!


Pre-requisites

In order to develop Linux device drivers, it is necessary to have an understanding of the following:
  • C programming. Some in-depth knowledge of C programming is needed, like pointer usage, bit manipulating functions, etc.
  • Microprocessor programming. It is necessary to know how microcomputers work internally: memory addressing, interrupts, etc. All of these concepts should be familiar to an assembler programmer.
There are several different types of devices in Linux. For simplicity, this brief tutorial will only cover char devices loaded as modules. Kernel 2.6.x will be used (in particular, kernel 2.6.8 under Debian Sarge, which is now Debian Stable).

User space and kernel space

When you write device drivers, it’s important to make the distinction between “user space” and “kernel space”.

Kernel space. Linux (which is a kernel) manages the machine’s hardware in a simple and efficient manner, offering the user a simple and uniform programming interface. In the same way, the kernel, and in particular its device drivers, form a bridge or interface between the end-user/programmer and the hardware. Any subroutines or functions forming part of the kernel (modules and device drivers, for example) are considered to be part of kernel space.
User space. End-user programs, like the UNIX shell or other GUI-based applications (kpresenter, for example), are part of user space. Obviously, these applications need to interact with the system's hardware. However, they don't do so directly, but through functions supported by the kernel.

All of this is shown in figure 1.


Figure 1: User space where applications reside, and kernel space where modules or device drivers reside

Interfacing functions between user space and kernel space

The kernel offers several subroutines or functions in user space, which allow the end-user application programmer to interact with the hardware. Usually, in UNIX or Linux systems, this dialogue is performed through functions or subroutines in order to read and write files. The reason for this is that in Unix devices are seen, from the point of view of the user, as files.

On the other hand, in kernel space Linux also offers several functions or subroutines to perform the low level interactions directly with the hardware, and allow the transfer of information from kernel to user space.

Usually, for each function in user space (allowing the use of devices or files), there exists an equivalent in kernel space (allowing the transfer of information from the kernel to the user and vice-versa). This is shown in Table 1, which is, at this point, empty. It will be filled when the different device drivers concepts are introduced.



Interfacing functions between kernel space and the hardware device

There are also functions in kernel space which control the device or exchange information between the kernel and the hardware. Table 2 illustrates these concepts. This table will also be filled as the concepts are introduced.



The first driver: loading and removing the driver in user space

I’ll now show you how to develop your first Linux device driver, which will be introduced in the kernel as a module.

For this purpose I’ll write the following program in a file named nothing.c

=

#include <linux/module.h>
MODULE_LICENSE("Dual BSD/GPL");

Since the release of kernel version 2.6.x, compiling modules has become slightly more complicated. First, you need to have a complete, compiled kernel source-code-tree. If you have a Debian Sarge system, you can follow the steps in Appendix B (towards the end of this article). In the following, I’ll assume that a kernel version 2.6.8 is being used.

Next, you need to generate a makefile. The makefile for this example, which should be named Makefile, will be:

=

obj-m := nothing.o
Unlike with previous versions of the kernel, it’s now also necessary to compile the module using the same kernel that you’re going to load and use the module with. To compile it, you can type:
$ make -C /usr/src/kernel-source-2.6.8 M=`pwd` modules

This extremely simple module belongs to kernel space and will form part of it once it’s loaded.

In user space, you can load the module as root by typing the following into the command line:

# insmod nothing.ko

The insmod command allows the installation of the module in the kernel. However, this particular module isn’t of much use.

It is possible to check that the module has been installed correctly by looking at all installed modules:

# lsmod

Finally, the module can be removed from the kernel using the command:

# rmmod nothing

By issuing the lsmod command again, you can verify that the module is no longer in the kernel.

The summary of all this is shown in Table 3.


The “Hello world” driver: loading and removing the driver in kernel space

When a module device driver is loaded into the kernel, some preliminary tasks are usually performed like resetting the device, reserving RAM, reserving interrupts, and reserving input/output ports, etc.

These tasks are performed, in kernel space, by two functions which need to be present (and explicitly declared): module_init and module_exit; they correspond to the user space commands insmod and rmmod , which are used when installing or removing a module. To sum up, the user commands insmod and rmmod use the kernel space functions module_init and module_exit.

Let’s see a practical example with the classic program Hello world:

=

#include <linux/init.h>
#include <linux/module.h>
#include <linux/kernel.h>

MODULE_LICENSE("Dual BSD/GPL");

static int hello_init(void) {
printk("<1> Hello world!\n");
return 0;
}

static void hello_exit(void) {
printk("<1> Bye, cruel world\n");
}

module_init(hello_init);
module_exit(hello_exit);


The actual functions hello_init and hello_exit can be given any name desired. However, in order for them to be identified as the corresponding loading and removing functions, they have to be passed as parameters to the functions module_init and module_exit.

The printk function has also been introduced. It is very similar to the well known printf apart from the fact that it only works inside the kernel. The <1> symbol shows the high priority of the message (low number). In this way, besides getting the message in the kernel system log files, you should also receive this message in the system console.

This module can be compiled using the same command as before, after adding its name into the Makefile.

=

obj-m := nothing.o hello.o

In the rest of the article, I have left the Makefiles as an exercise for the reader. A complete Makefile that will compile all of the modules of this tutorial is shown in Appendix A.

When the module is loaded or removed, the messages that were written in the printk statement will be displayed in the system console. If these messages do not appear in the console, you can view them by issuing the dmesg command or by looking at the system log file with cat /var/log/syslog.

Table 4 shows these two new functions.



The complete driver “memory”: initial part of the driver

I’ll now show how to build a complete device driver: memory.c. This device will allow a character to be read from or written into it. This device, while normally not very useful, provides a very illustrative example since it is a complete driver; it’s also easy to implement, since it doesn’t interface to a real hardware device (besides the computer itself).

To develop this driver, several new #include statements which appear frequently in device drivers need to be added:

=

/* Necessary includes for device drivers */
#include <linux/init.h>
#include <linux/config.h>
#include <linux/module.h>
#include <linux/kernel.h> /* printk() */
#include <linux/slab.h> /* kmalloc() */
#include <linux/fs.h> /* everything... */
#include <linux/errno.h> /* error codes */
#include <linux/types.h> /* size_t */
#include <linux/proc_fs.h>
#include <linux/fcntl.h> /* O_ACCMODE */
#include <asm/system.h> /* cli(), *_flags */
#include <asm/uaccess.h> /* copy_from/to_user */

MODULE_LICENSE("Dual BSD/GPL");

/* Declaration of memory.c functions */
int memory_open(struct inode *inode, struct file *filp);
int memory_release(struct inode *inode, struct file *filp);
ssize_t memory_read(struct file *filp, char *buf, size_t count, loff_t *f_pos);
ssize_t memory_write(struct file *filp, char *buf, size_t count, loff_t *f_pos);
void memory_exit(void);
int memory_init(void);

/* Structure that declares the usual file */
/* access functions */
struct file_operations memory_fops = {
read: memory_read,
write: memory_write,
open: memory_open,
release: memory_release
};

/* Declaration of the init and exit functions */
module_init(memory_init);
module_exit(memory_exit);

/* Global variables of the driver */
/* Major number */
int memory_major = 60;
/* Buffer to store data */
char *memory_buffer;

After the #include files, the functions that will be defined later are declared. The common functions which are typically used to manipulate files are declared in the definition of the file_operations structure. These will also be explained in detail later. Next, the initialization and exit functions—used when loading and removing the module—are declared to the kernel. Finally, the global variables of the driver are declared: one of them is the major number of the driver, the other is a pointer to a region in memory, memory_buffer, which will be used as storage for the driver data.

The “memory” driver: connection of the device with its files

In UNIX and Linux, devices are accessed from user space in exactly the same way as files are accessed. These device files are normally subdirectories of the /dev directory.

To link normal files with a kernel module two numbers are used: major number and minor number. The major number is the one the kernel uses to link a file with its driver. The minor number is for internal use of the device and for simplicity it won’t be covered in this article.

To achieve this, a file (which will be used to access the device driver) must be created, by typing the following command as root:

# mknod /dev/memory c 60 0

In the above, c means that a char device is to be created, 60 is the major number and 0 is the minor number.

Within the driver, in order to link it with its corresponding /dev file in kernel space, the register_chrdev function is used. It is called with three arguments: major number, a string of characters showing the module name, and a file_operations structure which links the call with the file functions it defines. It is invoked, when installing the module, in this way:

=

int memory_init(void) {
int result;

/* Registering device */
result = register_chrdev(memory_major, "memory", &memory_fops);
if (result < 0) {
printk("<1>memory: cannot obtain major number %d\n", memory_major);
return result;
}

/* Allocating memory for the buffer */
memory_buffer = kmalloc(1, GFP_KERNEL);
if (!memory_buffer) {
result = -ENOMEM;
goto fail;
}
memset(memory_buffer, 0, 1);

printk("<1>Inserting memory module\n");
return 0;

fail:
memory_exit();
return result;
}

Also, note the use of the kmalloc function. This function is used for memory allocation of the buffer in the device driver which resides in kernel space. Its use is very similar to the well known malloc function. Finally, if registering the major number or allocating the memory fails, the module acts accordingly.

The “memory” driver: removing the driver

In order to remove the module, the function unregister_chrdev needs to be called inside the memory_exit function. This will free the major number for the kernel.

=

void memory_exit(void) {
/* Freeing the major number */
unregister_chrdev(memory_major, "memory");

/* Freeing buffer memory */
if (memory_buffer) {
kfree(memory_buffer);
}

printk("<1>Removing memory module\n");

}

The buffer memory is also freed in this function, in order to leave a clean kernel when removing the device driver.

The “memory” driver: opening the device as a file

The kernel space function, which corresponds to opening a file in user space (fopen), is the member open: of the file_operations structure in the call to register_chrdev. In this case, it is the memory_open function. It takes as arguments: an inode structure, which sends information to the kernel regarding the major number and minor number; and a file structure with information relative to the different operations that can be performed on a file. Neither of these functions will be covered in depth within this article.

When a file is opened, it’s normally necessary to initialize driver variables or reset the device. In this simple example, though, these operations are not performed.

The memory_open function can be seen below:

=

int memory_open(struct inode *inode, struct file *filp) {

/* Success */
return 0;
}

This new function is now shown in Table 5.

The “memory” driver: closing the device as a file

The corresponding function for closing a file in user space (fclose) is the release: member of the file_operations structure in the call to register_chrdev. In this particular case, it is the function memory_release, which has as arguments an inode structure and a file structure, just like before.

When a file is closed, it’s usually necessary to free the used memory and any variables related to the opening of the device. But, once again, due to the simplicity of this example, none of these operations are performed.

The memory_release function is shown below:

=

int memory_release(struct inode *inode, struct file *filp) {

/* Success */
return 0;
}

This new function is shown in Table 6.


The “memory” driver: reading the device

To read a device with the user function fread or similar, the member read: of the file_operations structure is used in the call to register_chrdev. This time, it is the function memory_read. Its arguments are: a type file structure; a buffer (buf), from which the user space function (fread) will read; a counter with the number of bytes to transfer (count), which has the same value as the usual counter in the user space function (fread); and finally, the position of where to start reading the file (f_pos).

In this simple case, the memory_read function transfers a single byte from the driver buffer (memory_buffer) to user space with the function copy_to_user:

=

ssize_t memory_read(struct file *filp, char *buf,
size_t count, loff_t *f_pos) {

/* Transfering data to user space */
copy_to_user(buf,memory_buffer,1);

/* Changing reading position as best suits */
if (*f_pos == 0) {
*f_pos+=1;
return 1;
} else {
return 0;
}
}

The reading position in the file (f_pos) is also changed. If the position is at the beginning of the file, it is increased by one and the number of bytes that have been properly read is given as a return value, 1. If not at the beginning of the file, an end of file (0) is returned since the file only stores one byte.

In Table 7 this new function has been added.


The “memory” driver: writing to a device

To write to a device with the user function fwrite or similar, the member write: of the file_operations structure is used in the call to register_chrdev. It is the function memory_write, in this particular example, which has the following as arguments: a type file structure; buf, a buffer in which the user space function (fwrite) will write; count, a counter with the number of bytes to transfer, which has the same values as the usual counter in the user space function (fwrite); and finally, f_pos, the position of where to start writing in the file.

=

ssize_t memory_write( struct file *filp, char *buf,
size_t count, loff_t *f_pos) {

char *tmp;

tmp=buf+count-1;
copy_from_user(memory_buffer,tmp,1);
return 1;
}

In this case, the function copy_from_user transfers the data from user space to kernel space.

In Table 8 this new function is shown.



The complete “memory” driver

By joining all of the previously shown code, the complete driver is achieved:

=









Before this module can be used, you will need to compile it in the same way as with previous modules. The module can then be loaded with:

# insmod memory.ko

It’s also convenient to unprotect the device:

# chmod 666 /dev/memory

If everything went well, you will have a device /dev/memory to which you can write a string of characters and it will store the last one of them. You can perform the operation like this:

$ echo -n abcdef >/dev/memory

To check the content of the device you can use a simple cat:

$ cat /dev/memory

The stored character will not change until it is overwritten or the module is removed.
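
The same check can also be done from a small user-space C program using the ordinary file API instead of echo and cat. This is only a convenience sketch; it assumes the /dev/memory node created above and the driver exactly as shown in this article.

=

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Write a string to /dev/memory, then reopen the device and read back
   the single stored byte; the driver keeps only the last character written. */
int main(void) {
    char c;
    int fd = open("/dev/memory", O_WRONLY);
    if (fd < 0) { perror("open /dev/memory"); return 1; }
    write(fd, "abcdef", 6);             /* the driver stores only the final 'f' */
    close(fd);

    fd = open("/dev/memory", O_RDONLY); /* fresh open, so f_pos starts at 0 */
    if (fd < 0) { perror("open /dev/memory"); return 1; }
    if (read(fd, &c, 1) == 1)
        printf("stored byte: %c\n", c);
    close(fd);
    return 0;
}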


The real “parlelport” driver: description of the parallel port

I’ll now proceed by modifying the driver that I just created to develop one that does a real task on a real device. I’ll use the simple and ubiquitous computer parallel port and the driver will be called parlelport.

The parallel port is effectively a device that allows the input and output of digital information. More specifically it has a female D-25 connector with twenty-five pins. Internally, from the point of view of the CPU, it uses three bytes of memory. In a PC, the base address (the one from the first byte of the device) is usually 0x378. In this basic example, I’ll use just the first byte, which consists entirely of digital outputs.

The connection of the above-mentioned byte with the external connector pins is shown in figure 2.
The “parlelport” driver: initializing the module

The previous memory_init function needs modification—changing the RAM memory allocation for the reservation of the memory address of the parallel port (0x378). To achieve this, use the function for checking the availability of a memory region (check_region), and the function to reserve the memory region for this device (request_region). Both have as arguments the base address of the memory region and its length. The request_region function also accepts a string which defines the module.

=

/* Registering port */
port = check_region(0x378, 1);
if (port) {
printk("<1>parlelport: cannot reserve 0x378\n");
result = port;
goto fail;
}
request_region(0x378, 1, "parlelport");

The “parlelport” driver: removing the module

It will be very similar to the memory module but substituting the freeing of memory with the removal of the reserved memory of the parallel port. This is done by the release_region function, which has the same arguments as check_region.

=

/* Make port free! */
if (!port) {
release_region(0x378,1);
}

The “parlelport” driver: reading the device

In this case, a real device-reading action needs to be added to allow the transfer of this information to user space. The inb function achieves this; its argument is the address of the parallel port, and it returns the content of the port.

=

/* Reading port */
parlelport_buffer = inb(0x378);

Table 9 (the equivalent of Table 2) shows this new function.

The “parlelport” driver: writing to the device

Again, you have to add a "writing to the device" function to be able to transfer data received from user space to the device. The outb function accomplishes this; it takes as arguments the content to write and the port address.

=

/* Writing to the port */
outb(parlelport_buffer,0x378);

Table 10 summarizes this new function.


The complete “parlelport” driver

I’ll proceed by looking at the whole code of the parlelport module. You have to replace the word memory with the word parlelport throughout the code of the memory module. The final result is shown below:

=









Initial section

In the initial section of the driver, a different major number is used (61). Also, the global variable memory_buffer is changed to port, and two more #include lines are added: ioport.h and io.h.

=

/* Necessary includes for drivers */
#include <linux/init.h>
#include <linux/config.h>
#include <linux/module.h>
#include <linux/kernel.h> /* printk() */
#include <linux/slab.h> /* kmalloc() */
#include <linux/fs.h> /* everything... */
#include <linux/errno.h> /* error codes */
#include <linux/types.h> /* size_t */
#include <linux/proc_fs.h>
#include <linux/fcntl.h> /* O_ACCMODE */
#include <linux/ioport.h>
#include <asm/system.h> /* cli(), *_flags */
#include <asm/uaccess.h> /* copy_from/to_user */
#include <asm/io.h> /* inb, outb */

MODULE_LICENSE("Dual BSD/GPL");

/* Function declaration of parlelport.c */
int parlelport_open(struct inode *inode, struct file *filp);
int parlelport_release(struct inode *inode, struct file *filp);
ssize_t parlelport_read(struct file *filp, char *buf,
size_t count, loff_t *f_pos);
ssize_t parlelport_write(struct file *filp, char *buf,
size_t count, loff_t *f_pos);
void parlelport_exit(void);
int parlelport_init(void);

/* Structure that declares the common */
/* file access functions */
struct file_operations parlelport_fops = {
read: parlelport_read,
write: parlelport_write,
open: parlelport_open,
release: parlelport_release
};

/* Driver global variables */
/* Major number */
int parlelport_major = 61;

/* Control variable for memory */
/* reservation of the parallel port*/
int port;

module_init(parlelport_init);
module_exit(parlelport_exit);

Module init

In this module-initializing routine, I’ll introduce the reservation of the parallel port’s memory region, as described before.

=
int parlelport_init(void) {
int result;

/* Registering device */
result = register_chrdev(parlelport_major, "parlelport",
&parlelport_fops);
if (result < 0) {
printk("<1>parlelport: cannot obtain major number %d\n",
parlelport_major);
return result;
}

/* Registering port */
port = check_region(0x378, 1);
if (port) {
printk("<1>parlelport: cannot reserve 0x378\n");
result = port;
goto fail;
}
request_region(0x378, 1, "parlelport");

printk("<1>Inserting parlelport module\n");
return 0;

fail:
parlelport_exit();
return result;
}

Removing the module

This routine will include the modifications previously mentioned.

=

void parlelport_exit(void) {

/* Make major number free! */
unregister_chrdev(parlelport_major, "parlelport");

/* Make port free! */
if (!port) {
release_region(0x378,1);
}

printk("<1>Removing parlelport module\n");
}

Opening the device as a file

This routine is identical to the memory driver.

=

int parlelport_open(struct inode *inode, struct file *filp) {

/* Success */
return 0;

}

Closing the device as a file

Again, the match is perfect.

=

int parlelport_release(struct inode *inode, struct file *filp) {

/* Success */
return 0;
}

Reading the device

The reading function is similar to the memory one with the corresponding modifications to read from the port of a device.

=

ssize_t parlelport_read(struct file *filp, char *buf,
size_t count, loff_t *f_pos) {

/* Buffer to read the device */
char parlelport_buffer;

/* Reading port */
parlelport_buffer = inb(0x378);

/* We transfer data to user space */
copy_to_user(buf,&parlelport_buffer,1);

/* We change the reading position as best suits */
if (*f_pos == 0) {
*f_pos+=1;
return 1;
} else {
return 0;
}
}


Writing to the device

It is analogous to the memory one except for writing to a device.

=

ssize_t parlelport_write( struct file *filp, char *buf,
size_t count, loff_t *f_pos) {

char *tmp;

/* Buffer writing to the device */
char parlelport_buffer;

tmp=buf+count-1;
copy_from_user(&parlelport_buffer,tmp,1);

/* Writing to the port */
outb(parlelport_buffer,0x378);

return 1;
}


LEDs to test the use of the parallel port

In this section I’ll detail the construction of a piece of hardware that can be used to visualize the state of the parallel port with some simple LEDs.

WARNING: Connecting devices to the parallel port can harm your computer. Make sure that you are properly earthed and that your computer is turned off when connecting the device. Any problems that arise from undertaking these experiments are your sole responsibility.

The circuit to build is shown in figure 3. You can also read "PC & Electronics: Connecting Your PC to the Outside World" by Zoller as a reference.


In order to use it, you must first ensure that all hardware is correctly connected. Next, switch off the PC and connect the device to the parallel port. The PC can then be turned on and all device drivers related to the parallel port should be removed (for example, lp, parport, parport_pc, etc.). The hotplug module of the Debian Sarge distribution is particularly annoying and should be removed. If the file /dev/parlelport does not exist, it must be created as root with the command:

# mknod /dev/parlelport c 61 0

Then it needs to be made readable and writable by anybody with:

# chmod 666 /dev/parlelport

The parlelport module can now be installed. You can check that it is effectively reserving the input/output port address 0x378 with the command:

$ cat /proc/ioports

To turn on the LEDs and check that the system is working, execute the command:

$ echo -n A >/dev/parlelport

This should turn on LEDs zero and six, leaving all of the others off.

You can check the state of the parallel port by issuing the command:

$ cat /dev/parlelport


Final application: flashing lights

Finally, I’ll develop a pretty application which will make the LEDs flash in succession. To achieve this, a user-space program needs to be written that writes only one bit at a time to the /dev/parlelport device.

=

#include <stdio.h>
#include <unistd.h>


int main() {
unsigned char byte,dummy;
FILE * PARLELPORT;

/* Opening the device parlelport */
PARLELPORT=fopen("/dev/parlelport","w");
/* We remove the buffer from the file i/o */
setvbuf(PARLELPORT,&dummy,_IONBF,1);

/* Initializing the variable to one */
byte=1;

/* We make an infinite loop */
while (1) {
/* Writing to the parallel port */
/* to turn on a LED */
printf("Byte value is %d\n",byte);
fwrite(&byte,1,1,PARLELPORT);
sleep(1);

/* Updating the byte value */
byte<<=1;
if (byte == 0) byte = 1;
}
fclose(PARLELPORT);
}

It can be compiled in the usual way:

$ gcc -o lights lights.c

and can be executed with the command:

$ lights

The lights will flash successively one after the other! The flashing LEDs and the Linux computer running this program are shown in figure 4.


Conclusion


Having followed this brief tutorial you should now be capable of writing your own complete device driver for simple hardware like a relay board (see Appendix C), or a minimal device driver for complex hardware. Learning to understand some of these simple concepts behind the Linux kernel allows you, in a quick and easy way, to get up to speed with respect to writing device drivers. And, this will bring you another step closer to becoming a true Linux kernel developer.


Bibliography
  • A. Rubini, J. Corbet. 2001. Linux Device Drivers (second edition). O’Reilly. This book is available for free on the internet.
  • Jonathan Corbet. 2003/2004. Porting device drivers to the 2.6 kernel. This is a very valuable resource for porting drivers to the new 2.6 Linux kernel and also for learning about Linux device drivers.
  • B. Zoller. 1998. PC & Electronics: Connecting Your PC to the Outside World (Productivity Series). Nowadays it is probably easier to surf the web for hardware projects like this one.
  • M. Waite, S. Prata. 1990. C Programming. Any other good book on C programming would suffice.
Appendix A. Complete Makefile

=

obj-m := nothing.o hello.o memory.o parlelport.o

Appendix B. Compiling the kernel on a Debian Sarge system

To compile a 2.6.x kernel on a Debian Sarge system you need to perform the following steps, which should be run as root:

1. Install the “kernel-image-2.6.x” package.
2. Reboot the machine to make this the running kernel image. This is done semi-automatically by Debian. You may need to tweak the lilo configuration file /etc/lilo.conf and then run lilo to achieve this.
3. Install the “kernel-source-2.6.x” package.
4. Change to the source code directory, cd /usr/src and unzip and untar the source code with bunzip2 kernel-source-2.6.x.tar.bz2 and tar xvf kernel-source-2.6.x.tar. Change to the kernel source directory with cd /usr/src/kernel-source-2.6.x
5. Copy the default Debian kernel configuration file to your local kernel source directory cp /boot/config-2.6.x .config.
6. Make the kernel and the modules with make and then make modules.


Reference

http://www.freesoftwaremagazine.com/articles/drivers_linux
