Numerically Controlled Oscillator for Software Radio Applications

Ishmael Zibani\textsuperscript{1}, Kelebaone Tsamaase\textsuperscript{2}, Kagiso Motshidisi\textsuperscript{3}, Pran Mahindroo\textsuperscript{4}

\textsuperscript{1,2,3,4}(Electrical Department, University of Botswana, Botswana)

Abstract: A Digital Down Converter (DDC) performs the critical frequency translation needed to recover the information from a digitized modulated signal. In terms of system performance, the critical component in digital down conversion is the Numerically Controlled Oscillator (NCO). The NCO synthesizes a range of frequencies from a fixed time base. It generates sine values from a ROM lookup table (LUT). The size of the LUT is one of the main concerns as it has to store enough sine samples in order to generate sine values of varying frequency. The size of the LUT will therefore limit the bandwidth of the sine output. A progression of state approach for the NCO is proposed. As we advance through the states, outputs are generated to represent sine and cosine values. The state machine is clocked with a varying clock signal in order to generate various frequency outputs. It is known that only a quarter of the sine/cosine function need be stored, and the rest of the sinusoid can be generated by applying appropriate sign convention. In this study, it is demonstrated how this can be achieved.

Keywords: DDC, NCO, ROM-LUT, progression of state.

I. Introduction

In Software Radio, much of the signal processing is done in software using a reconfigurable hardware platform. The DDC performs the frequency translation to recover the original signal \cite{1}, \cite{3}, \cite{10}. Inside the DDC, the digitized modulated input from the Analog to Digital Converter (ADC) is mixed with a locally generated sinusoid to shift the spectrum of the signal. The mixed signal has to be filtered to isolate the portion of the spectrum containing the signal of interest. The filter typically has to be a narrow-band filter with fairly high rejection of the unwanted spectrum. This translates to an expensive filter if it is done at the input sample rate. Instead, a multi-rate approach can be used, in which the signal is first decimated to a much lower sample rate using a less computationally intensive filter. The signal is then cleaned up with a second, more complex filter working at the decimated sample rate \cite{2}, \cite{5}, \cite{8}, \cite{9}. See Figure 1.

\begin{figure}[h]
\centering
\includegraphics[width=\textwidth]{fig1.png}
\caption{Block diagram of a Digital Down Converter}
\end{figure}

The rest of the paper is organized as follows. Section II gives a brief discussion of the basic operation of an NCO. The motivation for the proposal is given in Section III. Sections IV and V give a detailed outline and design of the proposed system. Sections VI and VII present a design example and discuss the results, and Section VIII concludes.

DOI: 10.9790/2834-1106040513 www.iosrjournals.org 5 | Page
II. Numerically Controlled Oscillator

The NCO, sometimes called a local oscillator, generates digital samples of two sine waves offset by exactly 90 degrees in phase, creating sine and cosine signals [8], [10], [11] (see Figure 2). It uses a digital phase accumulator (adder) and sine/cosine look-up tables. The ADC clock is fed into the local oscillator, so the digital samples out of the local oscillator are generated at a sampling rate exactly equal to the ADC sample clock frequency, $f_s$. Since the data rates from the two mixer input sources are both at the ADC sampling rate, $f_s$, the complex mixer also outputs samples at $f_s$. The sine and cosine inputs from the local oscillator create in-phase and quadrature (I and Q) outputs that are important for maintaining the phase information contained in the input signal (see Figure 1). The decimating low-pass filter accepts input samples from the mixer output at the full ADC sampling frequency, $f_s$. It uses digital signal processing to implement an FIR (finite impulse response) filter transfer function. The filter passes all signals from 0 Hz up to a programmable cutoff frequency or bandwidth, and rejects all signals above that cutoff frequency. The digital filter is a complex filter which processes both I and Q signals from the mixer [2], [4], [7].
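As a rough illustration of the conventional scheme described above, the following Python sketch models a phase accumulator driving a full-cycle sine/cosine table. The function name `conventional_nco`, the 16-bit accumulator, and the 256-entry table are assumptions chosen for illustration and are not taken from the paper.

```python
import math

def conventional_nco(phase_increment, num_samples, acc_bits=16, lut_bits=8):
    """Phase-accumulator NCO with a full-cycle sine/cosine lookup table.

    acc_bits and lut_bits are illustrative assumptions, not values from the paper.
    """
    lut_size = 1 << lut_bits
    # ROM contents: one full cycle of sine and cosine (left unquantized for clarity)
    sin_lut = [math.sin(2 * math.pi * k / lut_size) for k in range(lut_size)]
    cos_lut = [math.cos(2 * math.pi * k / lut_size) for k in range(lut_size)]

    mask = (1 << acc_bits) - 1
    acc = 0
    samples = []
    for _ in range(num_samples):
        addr = acc >> (acc_bits - lut_bits)    # top bits of the accumulator address the ROM
        samples.append((sin_lut[addr], cos_lut[addr]))
        acc = (acc + phase_increment) & mask   # accumulator wraps modulo 2**acc_bits
    return samples

# A phase increment of 2**14 steps a quarter cycle per output sample.
samples = conventional_nco(1 << 14, 4)
```

With these assumed widths, each output sample advances a quarter cycle, so the output frequency would be $f_s/4$. A larger phase increment skips more table entries per clock, which is how this style of NCO raises its output frequency.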

![Figure 2. Conventional NCO](image)

III. Motivation For The Proposed NCO

The conventional NCO uses a ROM LUT to store the required sine/cosine samples [1], [5], [12]. The ROM has a fixed structure, so the designer cannot take advantage of redundancy in the system being designed. Furthermore, the sine/cosine function follows a predetermined sequence, so random addressing is not required. In a ROM, the addressing is performed by an address decoder which consists of multiplexers (Figure 3). To store $n$ words of length $m$, the ROM needs $n \times m$ memory cells. It also needs $m$ multiplexers, each with $n$ inputs and $s$ select lines, where $s = \log_2 n = \frac{\ln n}{\ln 2}$. With the proposed method, only $s$ memory cells are required, together with some combinational logic to form the required output.

A simple example is given in Figure 3. One quarter of the cosine and sine samples is to be stored. The sample rate is $30^\circ$ and the word length is 4 bits. The multiplexer (Figure 3a) consists of 7 gates. So to produce sine samples of word length 4, 16 memory cells and $7 \times 4$ logic gates are required. Therefore, to produce both sine and cosine samples, 56 gates and 32 flip-flops are required.

Using the proposed method, we use 2 flip-flops and 7 combinational logic gates. With the ROM LUT approach, the number of memory cells grows exponentially compared to the proposed method, and the decoder also becomes very complex.
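The counting in this example can be reproduced with a few lines of Python. This is only a sketch: the helper names are hypothetical, and the figure of 7 gates per multiplexer is the one quoted for the 4-word decoder of Figure 3a, so it is assumed here rather than derived.

```python
import math

def rom_lut_cost(n_words, word_len, gates_per_mux=7):
    """Memory cells and decoder gates for one ROM LUT, following the text's counting.

    gates_per_mux=7 is the value quoted for Figure 3a (an assumption for other sizes).
    """
    cells = n_words * word_len          # one memory cell per stored bit
    gates = gates_per_mux * word_len    # one multiplexer per output bit
    return cells, gates

def proposed_state_cells(n_words):
    """Flip-flops needed to step through n_words states: s = log2(n)."""
    return math.ceil(math.log2(n_words))

cells, gates = rom_lut_cost(n_words=4, word_len=4)
total_cells, total_gates = 2 * cells, 2 * gates   # separate sine and cosine ROMs
state_cells = proposed_state_cells(4)
print(total_cells, total_gates, state_cells)      # 32 56 2
```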

![Figure 3a Conventional ROM LUT implementation](image)
The block diagram of the proposed NCO is given in Figure 4. Here, a $\frac{1}{4}$ Sin_Cos Function is used instead of a Sin/Cos LUT. The three components driving a ROM LUT (i.e., phase increment register, adder, and address register) are replaced by the frequency synthesizer. The additional logic is used to generate appropriate sin_cos values in the other three quadrants. Compare figures 2 and 4.
V. Design Of Sin-Cos Using Progression Of States

The ¼ Sin_Cos function can be implemented using a state machine, with appropriate outputs at each state. Each time we go through a state cycle, we cover one quadrant (shaded in Figure 5). The output PULSE (Figure 4) signifies the beginning of a new state cycle, hence a new quadrant. With the proper sign convention, the same samples are reused in the other quadrants.

Figure 5. Analyzing the Sine Function

Figure 6b shows the flow chart for the Sine Function. The first 2 quadrants are defined by counting up and down the states. A reflection along the dc line will define the last 2 quadrants. See Figure 6a.

Suppose we are at the beginning of the first quadrant. dc will be (externally) set to 1 to signify that we are in the ‘positive’ quadrants. We start counting at the beginning of the quadrant and continue counting up as long as we are in the first (odd) quadrant, until we reach the end of that quadrant. Then we count down for as long as we are in the second (even) quadrant, until we reach the beginning of the next quadrant. At this point, dc is set to 0, so that repeating the above procedure produces sine values for the third and fourth quadrants. Then the whole cycle repeats. A similar argument applies to the cosine function.
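The up/down counting procedure described above can be sketched in Python as follows. This is a behavioural model only, under stated assumptions: the function names are hypothetical, the state outputs are floating-point sine values rather than hardware words, and the sketch toggles dc internally instead of receiving it from an external counter.

```python
import math

def quarter_sine_table(s):
    """Outputs attached to states 0..s: sine samples from 0 to 90 degrees."""
    return [math.sin(math.pi / 2 * k / s) for k in range(s + 1)]

def sine_by_state_progression(s, num_samples):
    """Count up through the states in odd quadrants and down in even quadrants.

    dc = 1 marks the 'positive' quadrants; it flips after every second quadrant.
    """
    table = quarter_sine_table(s)
    state, direction, dc = 0, +1, 1
    out = []
    for _ in range(num_samples):
        value = table[state]
        out.append(value if dc else -value)   # reflection about the dc line
        state += direction
        if state == s:        # end of an odd quadrant: start counting down
            direction = -1
        elif state == 0:      # end of an even quadrant: count up again, flip dc
            direction = +1
            dc ^= 1
    return out

wave = sine_by_state_progression(s=90, num_samples=720)
```

With `s=90` the states are one degree apart, and only the first-quadrant samples are ever stored; the remaining three quadrants come from the counting direction and the dc sign.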

Figure 6a. The Sine Function

Figure 6b. Flow Chart.
Figure 7a shows the behavior of sine and cosine functions in all four quadrants. If we assume absolute values for sine and cosine, we can see that when sine is increasing, cosine is decreasing but with different increments. Therefore inverting sine bits won’t generate cosine bits. So sine and cosine values are generated as previously shown in Figure 3.

The output of the 2-bit down counter (Figure 4) identifies the current quadrant. dc(sin) = 1 in the first two quadrants. Therefore \(\text{dc(sin)} = \text{MSB} \cdot \text{LSB} + \text{MSB} \cdot \overline{\text{LSB}} = \text{MSB}\).

Similarly, dc(cos) = 1 in the first and fourth quadrants. Therefore \(\text{dc(cos)} = \text{MSB} \cdot \text{LSB} + \overline{\text{MSB}} \cdot \overline{\text{LSB}} = \overline{\text{MSB} \oplus \text{LSB}}\).

For argument's sake, let us assume that we store sine samples for the whole \(360^\circ\). The top of Figure 8 shows some of the original samples. In the ROM LUT approach, a change of frequency is achieved by altering the phase angle (see also Figure 2). For example, to increase the frequency of the output sine wave, the phase angle is increased so that some of the samples are ‘dropped’. That is, we read samples a, c, e, etc. instead of a, b, c, d, e, f, etc. (shown in the middle of Figure 8). To achieve the same frequency with the proposed method, the same samples are generated faster. For example, instead of generating samples at \(t_0, t_1, t_2, t_3\), etc., we generate them at \(t_0\), \(t_0 + \frac{T}{2}\), \(t_1\), \(t_1 + \frac{T}{2}\), \(t_2\), etc., where \(T = t_1 - t_0\) is the original sample spacing (shown at the bottom of Figure 8).

![Figure 7a. Sin and Cos](image)

<table>
<thead>
<tr>
<th>Quadrant</th>
<th>MSB</th>
<th>LSB</th>
<th>dc (cos)</th>
<th>dc (sin)</th>
</tr>
</thead>
<tbody>
<tr>
<td>first</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>second</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>third</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>fourth</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
</tr>
</tbody>
</table>

![Figure 7b. Truth Table for dc](image)
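The truth table of Figure 7b can be checked against simple Boolean expressions. In the sketch below, the function names are hypothetical; the expressions (dc(sin) equal to the MSB, dc(cos) equal to the XNOR of MSB and LSB) are read off the table rows rather than taken from the logic diagram.

```python
def dc_sin(msb, lsb):
    """dc for sine: 1 in the first and second quadrants, i.e. whenever MSB = 1."""
    return msb

def dc_cos(msb, lsb):
    """dc for cosine: 1 in the first and fourth quadrants, i.e. XNOR of MSB and LSB."""
    return 1 - (msb ^ lsb)

# Rows of Figure 7b: (MSB, LSB) -> (dc(cos), dc(sin)), in down-counter order 11, 10, 01, 00
truth_table = {
    (1, 1): (1, 1),   # first quadrant
    (1, 0): (0, 1),   # second quadrant
    (0, 1): (0, 0),   # third quadrant
    (0, 0): (1, 0),   # fourth quadrant
}

for (msb, lsb), (cos_expected, sin_expected) in truth_table.items():
    assert dc_cos(msb, lsb) == cos_expected
    assert dc_sin(msb, lsb) == sin_expected
```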

The frequency synthesizer outputs different clock waveforms in response to different frequency input words. It basically consists of a frequency input register and a binary counter (see Figure 9). Note that the quad input reflects the sine of the first quadrant about the y axis to obtain the sine in the second quadrant. The half sine wave thus obtained is reflected about the x axis (dc) to obtain the third and fourth quadrants, hence the controlled inverters operated by dc (see Figures 7 and 9). The output clock, \(\text{clk}_0\), is a pulse. Since it is used for clocking purposes, there is no need to add a square-wave shaper. The frequency of \(\text{clk}_0\), \(f_{\text{out}}\), is given by \(f_{\text{out}} = f_{\text{ref}} / (w_v + 1)\), where \(f_{\text{ref}}\) is the frequency of the reference clock and \(w_v\) is the value of the frequency input word. For example, if \(w_v = 0\), then \(f_{\text{out}} = f_{\text{ref}}\).
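One way to obtain a pulse rate of \(f_{\text{ref}} / (w_v + 1)\) from a register and a counter is a reloadable down counter. The Python sketch below models that behaviour cycle by cycle; the reload-on-zero structure and the name `synthesizer_pulses` are assumptions for illustration and may differ from the circuit of Figure 9.

```python
def synthesizer_pulses(freq_word, ref_cycles):
    """Return one 0/1 value per reference clock cycle; 1 marks an output pulse.

    Assumed structure: a down counter that is reloaded from the frequency
    input register each time it reaches zero.
    """
    count = freq_word
    pulses = []
    for _ in range(ref_cycles):
        if count == 0:
            pulses.append(1)      # emit a clock pulse
            count = freq_word     # reload from the frequency input register
        else:
            pulses.append(0)
            count -= 1
    return pulses

every_cycle = synthesizer_pulses(freq_word=0, ref_cycles=12)   # pulse on every cycle
divided_by_4 = synthesizer_pulses(freq_word=3, ref_cycles=12)  # one pulse per 4 cycles
```

In this model a frequency word of 0 passes the reference clock straight through, and a word of 3 produces one pulse in every four reference cycles, matching the relation given in the text.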

The bandwidth or frequency span of \(f_{\text{out}}\) is \(\Delta f_{\text{out}} = f_{\text{ref}} \cdot (1 - (w_{\text{max}} + 1)^{-1})\), where \(w_{\text{max}}\) is the maximum value of the input frequency word. Finally, the bandwidth of the sine (or cosine) output is given by \(\Delta f_{\text{sin}} = \Delta f_{\text{out}} / ((s + 1) + 3s)\), where \(s\) is the number of non-zero samples in the first quadrant. Its maximum is \(s_{\text{max}} = 2^n - 1\), where \(n\) is the word length for \(\frac{1}{4}\) sine or \(\frac{1}{4}\) cosine.
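These relations are easy to evaluate numerically. The helpers below simply transcribe the formulas in the text; the function names and the numbers in the usage lines are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def f_out(f_ref, w_v):
    """Output clock frequency for frequency input word w_v."""
    return f_ref / (w_v + 1)

def delta_f_out(f_ref, w_max):
    """Span of the output clock: f_ref minus the lowest output frequency."""
    return f_ref * (1 - 1 / (w_max + 1))

def delta_f_sin(f_ref, w_max, s):
    """Span of the sine/cosine output, with (s + 1) + 3s clocks per cycle as in the text."""
    return delta_f_out(f_ref, w_max) / ((s + 1) + 3 * s)

# Hypothetical numbers: 100 Hz reference, 2-bit frequency word, 1 non-zero sample per quadrant
span_clock = delta_f_out(100.0, w_max=3)      # 100 - 100/4 = 75
span_sine = delta_f_sin(100.0, w_max=3, s=1)  # 75 / 5 = 15
```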

Again, note that the dc signals also form part of the sin and cos outputs. However, they are generated outside the \(\frac{1}{4}\) sin_cos function to minimize its complexity.

Figure 8 Generating Sine samples

Figure 9 Logic diagram of the proposed NCO
VI. Altera Design Example

For software radio applications, the word length \( n+1 \) of each complete sine/cosine sample is 16 bits [6]. For the following example, \( n+1=16 \) was used. As pointed out earlier, we design for a quarter, i.e. 15 bits, so \( n=15 \) (word length for the quarter). The 16th bit is supplied externally and is also used to differentiate between positive and negative quadrants. For \( n=15 \), we have \( 2^{15} \) samples (including the all-zeros sample). If we take the maximum value of the sine function to be 1, then the least significant bit (LSB) has a weight of \( \frac{1}{2^{15}-1} \). This corresponds to a sample rate/angle resolution of \( \left( \frac{90}{2^{15}-1} \right)^\circ \). For our demonstration purposes, this resolution was found to be too high. An angle resolution of \( 1^\circ \) was used in this example.

For \( 0^\circ<\theta<90^\circ \), \( 0<\sin\theta<1 \). So we need to multiply \( \sin\theta \) by a suitable multiplying factor \( m_\theta \) so that we can represent sine values in binary form. For word length \( n \), \( m_\theta=2^n-1 \). In our example, the binary equivalent of \( \sin 90^\circ \) is \( (2^{15}-1)\sin 90^\circ = 111111111111111 \). All other entries were computed in the same way (see Figure 10).
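The scaling step can be sketched in Python as follows. The function name `quantize_quarter` is hypothetical, and rounding to the nearest integer is an assumption: the text does not say whether the table entries were rounded or truncated.

```python
import math

def quantize_quarter(n_bits, step_deg=1):
    """Scale sin(theta) by m = 2**n_bits - 1 for theta = 0, step, ..., 90 degrees.

    Rounding to nearest is an assumption; the paper does not state the rule used.
    """
    scale = (1 << n_bits) - 1
    return [round(scale * math.sin(math.radians(angle)))
            for angle in range(0, 91, step_deg)]

table = quantize_quarter(15)          # 91 entries at 1-degree resolution
print(format(table[90], "015b"))      # 111111111111111
```

The last entry is the all-ones 15-bit word, since \( \sin 90^\circ = 1 \) maps to \( 2^{15}-1 \). The sign bit that completes the 16-bit word is not part of this table.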

In the final stage, the MSB is supplied externally from the two bit down counter and added to make a 16 bit word. (See Figure 9).

<table>
<thead>
<tr>
<th>Present State</th>
<th>Next State</th>
<th>Output</th>
</tr>
</thead>
<tbody>
<tr>
<td>S0</td>
<td>S1</td>
<td>S1</td>
</tr>
<tr>
<td>S1</td>
<td>S0</td>
<td>S2</td>
</tr>
<tr>
<td>S2</td>
<td>S1</td>
<td>S3</td>
</tr>
<tr>
<td>S3</td>
<td>S2</td>
<td>S4</td>
</tr>
</tbody>
</table>

State Table for the \(\frac{1}{4}\) Sin_Cos Function (columns: Present State, Next State, Input, Output)

VII. Discussion Of The Results

Figure 10 shows the sine and cosine output waveforms. The results from Figure 10 were plotted using Microsoft Excel to obtain the graphs in Figure 11. The positive sine and cosine were generated using \( \frac{1}{4} \) sine and \( \frac{1}{4} \) cosine respectively. The dc value was used to ‘raise’ the sine and cosine so that they appear above the x-axis. The results obtained were as expected.

The Delay Matrix (Figure 13) gives the critical speed paths that limit the design performance. From the Delay Matrix, the longest delay is 25.8 ns. If we choose to use 26 ns instead, we get a bandwidth of 38.46 MHz. From \( f_{\text{sin(max)}} = f_{\text{max}} / ((s+1)+3s) \) and \( f_{\text{max}} = f_{\text{ref}} = 38.46 \) MHz, the maximum sine output frequency is \( f_{\text{sin(max)}} = 38.46\,\text{MHz} / ((s+1)+3s) \). After the design has been verified, it can be migrated to an Altera HardCopy device. This involves removing the reprogrammability of the original FPGA device, resulting in increased speed and reduced complexity, area and power consumption.
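The timing arithmetic can be checked with a short calculation. In this sketch the value `s = 90` is an assumption: it corresponds to non-zero samples at 1°, 2°, ..., 90°, which follows from the 1° angle resolution of the design example but is not stated explicitly alongside the timing result. The variable names are hypothetical.

```python
longest_delay_s = 26e-9                  # 25.8 ns from the Delay Matrix, rounded up to 26 ns
f_max = 1.0 / longest_delay_s            # highest usable clock rate, in Hz

s = 90                                   # assumed: non-zero samples per quadrant at 1 degree
clocks_per_cycle = (s + 1) + 3 * s       # clock count per sine cycle used in the text
f_sin_max = f_max / clocks_per_cycle     # highest sine/cosine output frequency, in Hz

print(round(f_max / 1e6, 2))             # 38.46
```

Under this assumption the highest sine output frequency works out to roughly 106.5 kHz, which shows how quickly the usable output frequency falls as the number of samples per quadrant grows.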

For example, the error in the sine output is the difference between the true sine value, $\sin_{\text{true}}$, and the 16-bit binary value (reconverted to decimal for easy manipulation). The decimal equivalents of the sine binary values, $\sin_{\text{dec}}$, are also shown in Figure 10. We can calculate the average % error, $\%err_{\text{ave}}$, using either sine or cosine values. Thus we have $\%err_{\text{ave}} = \left( \frac{1}{89} \sum_{i=1}^{89} |\sin_{\text{true}}(i) - \sin_{\text{dec}}(i)| \right) \times 100\%$ ($0^\circ$ and $90^\circ$ have been left out since they give an error of zero). Using values from the complete table of Figure 10 and the true sine values, $\%err_{\text{ave}}$ came to 0.00218003%.
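The averaging described above can be sketched as follows. The function name is hypothetical, and the sketch assumes 15-bit magnitudes rounded to the nearest integer, so its result depends on that rounding rule and need not reproduce the figure quoted from Figure 10.

```python
import math

def average_percent_error(n_bits=15):
    """Mean absolute difference between true and quantized sine at 1..89 degrees, in percent.

    Assumes round-to-nearest quantization with scale 2**n_bits - 1.
    """
    scale = (1 << n_bits) - 1
    total = 0.0
    for angle in range(1, 90):                               # 0 and 90 degrees are left out
        true_value = math.sin(math.radians(angle))
        decimal_value = round(scale * true_value) / scale    # binary word reconverted to decimal
        total += abs(true_value - decimal_value)
    return total / 89 * 100

err = average_percent_error()
```

With round-to-nearest quantization, each term is bounded by half an LSB, so the average cannot exceed \( 50 / (2^{15}-1) \) percent.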

Angle resolution is related to the sampling rate. A high sampling rate means good angle resolution but slower output speed, so the trade-off between angular resolution and speed must be understood during design. By looking at the slowest sine/cosine output required, the number of desired samples per quarter can be set. The number of (non-zero) samples to be used is $\min(\text{desired}, 2^n - 1)$; beyond that, duplicate samples would be stored. For this example, a sample rate of $1^\circ$ was chosen. This gives an accuracy of 1 part in 90, or 1.1%.

Figure 11. Graphs for sine and cosine functions

Figure 12 Sine and Cosine output waveforms.

![Image](https://example.com/image.png)

**Figure 13.** Delay Matrix for Sine and Cosine.

**References**

