

Volume-12, Issue-12, December 2023 JOURNAL OF COMPUTING TECHNOLOGIES (JCT) International Journal Page Number: 01-08

# A Low Area And High Speed VLSI Architecture of The Wavelet Filter For Image De-Noising

Vaibhav Verma<sup>1</sup>, Prof. Amit Chauhan<sup>2</sup> <sup>1</sup>M.Tech Student, <sup>2</sup>Assistant Professor, <sup>1,2</sup>Department of Electronics and Communication, <sup>1,2</sup>Vidhyapeeth Institute of Science & Technology (VIST), RGPV, Bhopal, INDIA

Abstract: This work presents an efficient VLSI filter implementation for image denoising with enhanced performance. The test images are of different sizes and resolutions. The compression performance is measured objectively, by peak signal-to-noise ratio, and subjectively, by the visual quality of the image, and the Daubechies 6-tap wavelet filter is found to perform best. The optimized filters perform well under special conditions, although they no longer adhere to the mathematical properties of wavelets, namely the orthogonality between the forward and inverse filters. Simulation is performed using MATLAB and Xilinx 14.7: the image processing is visualized in MATLAB, and the filter architecture is optimized in Xilinx 14.7.

Keywords: Image processor, wavelet filters, VLSI, MATLAB, Xilinx 14.7.

## I. INTRODUCTION

## A. VLSI In Image Processing

Interpolation, or scaling, of digital images has attracted considerable interest recently for several reasons. Image scaling is the act of resizing a digital image, and because it resizes the pixels that make up the image, it is a nontrivial operation that involves a tradeoff between efficiency, smoothness, and sharpness. Image scalers are now used extensively in portable medical devices, digital electronic equipment, digital cameras, digital picture frames, mobile phones, touch panel computers, and similar products. The design of a low-cost, high-quality, and high-performance image scaler for multimedia devices using VLSI technology has become a key trend in recent years. The need for image scaling is becoming more apparent as the graphic and video applications available on mobile devices continue to develop and expand. Interpolation-based image scaling algorithms fall into two primary categories: linear and nonlinear techniques. The simplest linear interpolation method is the low-complexity nearest-neighbour algorithm, which nevertheless produces scaled pictures with blocking and aliasing artifacts. Bilinear interpolation is the most commonly used scaling method; it obtains the target pixel by applying the linear interpolation model in both the horizontal and vertical directions. Bicubic interpolation is another well-known polynomial-based approach, which uses an extended cubic model to obtain the target pixel from a 2D regular grid. Compared with linear methods, nonlinear methods result in a significant improvement in image quality by reducing blocking, aliasing, and blurring effects.
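As an illustration of the bilinear scheme described above, the following is a minimal software sketch, not the hardware architecture discussed in this paper. The function name `bilinear_resize` and the corner-aligned coordinate mapping are assumptions made for the example; the image is a plain list of rows of grayscale values.

```python
def bilinear_resize(image, new_h, new_w):
    """Resize a 2-D grayscale image (list of rows) with bilinear interpolation."""
    old_h, old_w = len(image), len(image[0])
    # Assumed mapping: the corner pixels of the input and output grids coincide.
    y_scale = (old_h - 1) / (new_h - 1) if new_h > 1 else 0.0
    x_scale = (old_w - 1) / (new_w - 1) if new_w > 1 else 0.0
    out = []
    for r in range(new_h):
        y = r * y_scale
        y0 = int(y)                      # row above the target position
        y1 = min(y0 + 1, old_h - 1)      # row below, clamped at the border
        dy = y - y0
        row = []
        for c in range(new_w):
            x = c * x_scale
            x0 = int(x)                  # column to the left
            x1 = min(x0 + 1, old_w - 1)  # column to the right, clamped
            dx = x - x0
            # Linear interpolation along x on the two neighbouring rows ...
            top = image[y0][x0] * (1 - dx) + image[y0][x1] * dx
            bottom = image[y1][x0] * (1 - dx) + image[y1][x1] * dx
            # ... followed by linear interpolation along y.
            row.append(top * (1 - dy) + bottom * dy)
        out.append(row)
    return out


small = [[0, 10],
         [20, 30]]
scaled = bilinear_resize(small, 3, 3)
for line in scaled:
    print(line)
```

Each output pixel costs four reads and a handful of multiply-accumulate operations, which is why bilinear interpolation is attractive for low-cost hardware.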

## **B. Image Processing**

The term "image processing" refers to a series of operations that are carried out on a picture in order to produce an improved version of the image or to derive some helpful information from it. It is a sort of signal processing in which the input is an image and the output might either be the picture itself or the characteristics or features that are connected with that image.

## **II. LITERATURE REVIEW**

S. S. Wu et al. [11] present the CMWMF hardware design, which can implement weighted mode, median, and joint bilateral filters using a constant amount of memory. The study addresses the high memory and computing demands of processing depth maps with a large number of depth candidates. To reduce the static random access memory (SRAM) size for hardware implementation, the architecture exploits the geometric smoothness of natural pictures. Instead of depending on the number of labels or the size of the local supporting window, the design always keeps the same number of disparity values. To make the process hardware-friendly, the authors present a new weighted median search algorithm that computes on each cycle of input, and they propose an index-checking approach to handle out-of-order joint histograms. Because these methods support many filter types with a constant amount of SRAM, the design reduces SRAM by 92.4% without sacrificing much speed. Tests on the KITTI and Middlebury datasets, as well as with real-world depth cameras, indicate that the retained data are adequate. The architecture is a strong option when several depth refinement candidates are possible.

H. Sumi et al. [12] note that multiple near-infrared (multi-NIR) spectral CMOS image sensors (CIS) and camera systems have been developed recently, enabling novel applications. The multi-NIR filter is an essential technology for making multi-NIR camera systems usable in consumer cameras. A multi-NIR signal processing technology based on a Fabry-Perot structure has been developed. On a 5-Mpixel BSI-CIS, three distinct NIR filter types are laid out in a Bayer pattern with a pixel size of $2 \times 2\ \mu\text{m}^2$, and thickness differences among the three band-pass filter types are suppressed to below 75 nm. Signal processing technology has also been developed that analyses and blends each signal of the multi-NIR output with low-intensity visible-light pictures, allowing use in surveillance, automotive, and fundus cameras for health management applications. A high picture signal-to-noise ratio (SNR) allows state transitions to be detected easily even at 0.1 lux.

M. V. Siva et al. [13] observe that resizing images, or image scaling, is a common operation in digital image processing. Their work devises an effective technique and architecture for picture scaling that yields high-quality scaled images with low hardware area. Edge detection is performed with a linear space-variant edge detector, the blurring effects of bilinear interpolation are mitigated by a spatial sharpening filter, and a simplified bilinear interpolation is used for hardware efficiency. Both the proposed and the existing methods are implemented in MATLAB, and image quality is evaluated with the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM). The hardware is implemented in GPDK 90 nm CMOS technology, with the Verilog hardware description language synthesized in the Cadence Genus tool.

S. Y. Huang et al. [14] note that, for phase-based processing including video magnification, frame interpolation, and view synthesis, the complex steerable pyramid (CSP) is often used to split pictures into multi-scale and oriented subbands. This requires bandpass filtering in the frequency domain, which is traditionally implemented with the fast Fourier transform (FFT). Hardware implementation of the FFT, however, requires high-precision computing and intricate memory access. The work investigates how to design finite impulse response filters for the CSP with little impact on performance or memory use. To interpolate frames with a PSNR of 38.6 dB, the authors use Kaiser windowing for the filter designs and implement 9-tap radial and 11-tap angular filters. They then discuss VLSI architectural designs, proposing a stripe-based computation flow for the 2-D CSP that reduces the line buffer size to 15%. Using TSMC 40 nm technology, they build two VLSI circuits for testing. The first is a 1-D CSP engine that can process 4K UHD at 30 frames per second while using 67.8 percent fewer logic gates than an FFT-based version. The second is a 2-D CSP engine for full high definition at 60 frames per second, with 3.5M logic gates and 32 KB of SRAM. They also use an FPGA system to build a 2-D CSP engine that runs at 80 MHz, producing 1024x1024 video at 16 frames per second.

M. V. Siva et al. [15] consider the case where the source and destination devices have different resolutions, so that the picture must be scaled. Interpolating an image means filling in a missing pixel using data from neighbouring pixels. Because of its simplicity, bilinear interpolation is often employed for picture scaling. Using a spatial sharpening filter as a pre-filter before bilinear interpolation helps mitigate the blurring that may occur in a scaled picture, and adaptive edge detection is used to prevent unnecessary edge loss. The work offers a low-cost VLSI architecture for edge-enhanced picture scaling and an approximation method to implement it. An analysis of the scaled pictures shows that their quality is on par with that of conventional image scaling techniques, while the proposed architecture takes up far less area than the alternatives.

## III. PROBLEM FORMULATION AND OBJECTIVES

Digital filters are not subject to the component non-linearities that affect analogue filters, which makes their design much simpler. The electrical components that make up an analogue filter are imperfect: their values are specified only to within a tolerance (for example, resistor values often have a tolerance of 5%), and those values may change with temperature and drift over time. As the order of an analogue filter grows (a third-order filter, for example), so does its component count, and the impact of component variation increases substantially. Since the coefficient values of a digital filter are stored in memory, digital filters are far more reliable and accurate. Filtering is one of the most important steps in signal processing; its purpose is to remove undesired signals and noise from the source signal. Filters are used to distinguish between frequencies that carry relevant information and those that do not.

## **A. Problem Identification**

After an examination of the relevant literature, the following issues were identified:

- The existing standard wavelet design uses a large number of adder circuits and has a long critical-path delay.
- The adder circuits in the existing system occupy a large area.
- There is currently no effective VLSI design of a wavelet-based filter for picture de-noising.
- The existing designs require a greater number of components overall.
- The path delay is significant, which results in increased power consumption.
- The throughput (data rate) and the operating frequency are low.



## **IV. PROPOSED METHODOLOGY**

Fig.1: Flow Chart

- Step 1: An input picture is loaded in the MATLAB environment.
- Step 2: Image preprocessing, including picture scaling, resizing, and enhancement.
- Step 3: Image initialization and feature extraction.
- Step 4: Implementation of the wavelet and inverse wavelet (iwavelet) in very large scale integration (VLSI).
- Step 5: Application of the inverse feature transform.
- Step 6: Calculation of all parameters and verification of the results on the test bench.

## A. Methodology

The proposed methodology works as follows.

## Image Initialization and Preprocessing

In this module, several image preprocessing tasks are performed, such as reducing picture noise, equalising histograms, and estimating image sizes. The pixels of the input picture are read and arranged in a matrix. We then check whether the input picture consists of a single sample per pixel. If it has more than one sample per pixel, it is converted into a single-sample intensity image (grayscale image).
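The single-sample conversion can be sketched as follows. The text does not state which channel weights are used, so the common luminance weights 0.299, 0.587, and 0.114 are an assumption here, as is the helper name `to_grayscale`; the image is represented as rows of `(r, g, b)` tuples.

```python
def to_grayscale(image):
    """Collapse a multi-sample (RGB) image to one intensity sample per pixel."""
    gray = []
    for row in image:
        gray_row = []
        for pixel in row:
            if isinstance(pixel, (int, float)):
                # Already a single-sample pixel: keep it unchanged.
                gray_row.append(pixel)
            else:
                r, g, b = pixel
                # Assumed luminance weights; not specified in the text.
                gray_row.append(round(0.299 * r + 0.587 * g + 0.114 * b))
        gray.append(gray_row)
    return gray


rgb = [[(255, 255, 255), (0, 0, 0)],
       [(255, 0, 0), (0, 0, 255)]]
gray = to_grayscale(rgb)
print(gray)
```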


First, the binary points are extracted from the input picture, and the features are then constructed by applying the Feature transform over an angular range of 0 to 360 degrees. The Feature transform represents the image as a set of projections along a variety of directions. It is built on a Feature function, which computes the projections of the image along given directions in the rotated $x'$ and $y'$ axes.

$$R_{\theta}(x') = \int_{-\infty}^{\infty} f(x'\cos\theta - y'\sin\theta, x'\sin\theta + y'\cos\theta) \, dy'$$

To compute a projection in the Feature transform, the equation above is applied at an angle of $\theta$.

Here

$$\begin{bmatrix} x'\\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta\\ -\sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x\\ y \end{bmatrix}$$

The Feature points may be produced by making use of the Feature function and performing the Feature transform.
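A coarse discrete version of the projection $R_{\theta}(x')$ can be sketched by rotating each pixel coordinate with the matrix above and accumulating intensities that share the same rounded $x'$. This is only an illustrative approximation under stated assumptions: the origin is placed at pixel (0, 0), nearest-bin accumulation replaces the integral, and the function name `project` is hypothetical.

```python
import math


def project(image, theta_deg):
    """Sum pixel intensities into bins indexed by the rotated coordinate x'."""
    theta = math.radians(theta_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    bins = {}
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            # First row of the rotation matrix: x' = x cos(theta) + y sin(theta)
            x_rot = round(x * cos_t + y * sin_t)
            bins[x_rot] = bins.get(x_rot, 0) + value
    # Return the projection ordered by increasing x'.
    return [bins[key] for key in sorted(bins)]


image = [[1, 2, 3],
         [4, 5, 6]]
columns = project(image, 0)   # theta = 0: integrate along y for each x
rows = project(image, 90)     # theta = 90: integrate along x for each y
print(columns, rows)
```

At 0 degrees the projection reduces to column sums and at 90 degrees to row sums, which gives a quick sanity check on the rotation convention.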

## Implement Wavelet

The picture is divided into $K \times K$ pixel blocks before it is passed through the WAVELET encoding process, where $K$ may take any value from 2 to 6, and so on. For a sequence $f(i)$ of length $K$, the WAVELET is calculated as

$$D(u) = \alpha(u) \sum_{i=0}^{K-1} f(i) \cos\left[\frac{\pi(2i+1)u}{2K}\right]$$

Here, $u$ ranges over $0, 1, \ldots, K-1$ and $D(u)$ denotes the WAVELET coefficients.

The coefficients of the WAVELET transform are derived from each block of the input data. The calculated WAVELET is then applied to the Feature points in order to encode them. WAVELET offers strong energy compaction for pictures that have a high degree of correlation.

## Implement Inverse Wavelet

Inverse Wavelet is used to recover (decode) the projections at the receiver, which are then used to reconstruct the picture. The inverse Wavelet (IWAVELET) is expressed as

$$f(i) = \sum_{u=0}^{K-1} \alpha(u) D(u) \cos\left[\frac{\pi(2i+1)u}{2K}\right]$$
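The forward and inverse expressions for $D(u)$ and $f(i)$ can be exercised numerically on a single block. The scaling factor $\alpha(u)$ is not defined in the text; the sketch below assumes the orthonormal choice $\alpha(0) = \sqrt{1/K}$ and $\alpha(u) = \sqrt{2/K}$ for $u > 0$, which is the usual companion of cosine sums of this form and makes the two expressions exact inverses. The function names are illustrative only.

```python
import math


def alpha(u, K):
    # Assumed normalisation (not given in the text): orthonormal scaling.
    return math.sqrt(1.0 / K) if u == 0 else math.sqrt(2.0 / K)


def forward(f):
    """D(u) = alpha(u) * sum_i f(i) * cos(pi * (2i + 1) * u / (2K))."""
    K = len(f)
    return [
        alpha(u, K) * sum(
            f[i] * math.cos(math.pi * (2 * i + 1) * u / (2 * K))
            for i in range(K)
        )
        for u in range(K)
    ]


def inverse(D):
    """f(i) = sum_u alpha(u) * D(u) * cos(pi * (2i + 1) * u / (2K))."""
    K = len(D)
    return [
        sum(
            alpha(u, K) * D[u] * math.cos(math.pi * (2 * i + 1) * u / (2 * K))
            for u in range(K)
        )
        for i in range(K)
    ]


block = [52, 55, 61, 66]          # one row of a K = 4 block
coeffs = forward(block)
restored = inverse(coeffs)
print([round(c, 3) for c in coeffs])
print([round(v, 6) for v in restored])
```

For a smooth block such as this one, most of the energy lands in the first coefficient, which illustrates the energy compaction mentioned above.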

## C. Apply Inverse Feature Transform

Formulas for the explicit and efficient inversion of the Feature transform and its dual are known and may be used in a variety of applications. The formula allows the Feature transform to be inverted in $n$ dimensions, and the power of the Laplacian $(-\Delta)^{(n-1)/2}$ may be defined as a pseudo-differential operator using the Fourier transform if required.

Finally, the reconstructed picture is obtained. To evaluate how well the proposed image compression method performs, we compute PSNR and MSE using equations 1 and 2, respectively.
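The two quality metrics can be sketched as below. The definitions used are the standard ones, stated here as an assumption: MSE is the mean of the squared pixel differences, and PSNR is $10 \log_{10}(\text{peak}^2 / \text{MSE})$ with a peak value of 255 for 8-bit images. The function names and the sample data are illustrative.

```python
import math


def mse(original, reconstructed):
    """Mean squared error between two equally sized grayscale images."""
    total = 0.0
    count = 0
    for row_o, row_r in zip(original, reconstructed):
        for a, b in zip(row_o, row_r):
            total += (a - b) ** 2
            count += 1
    return total / count


def psnr(original, reconstructed, peak=255):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(original, reconstructed)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)


original = [[10, 20], [30, 40]]
reconstructed = [[12, 20], [30, 36]]
print(mse(original, reconstructed))
print(round(psnr(original, reconstructed), 2))
```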

## V. SIMULATION AND RESULT

**MATLAB-** MATLAB® is a high-level technical computing language and interactive environment for building algorithms, visualising data, analysing data, and performing numerical computations. Compared with more conventional programming languages such as C, C++, and Fortran, MATLAB allows technical computing problems to be solved more rapidly.

## Simulation

System-level test programs must be written in an HDL, and system-level testing may be performed with ISim or the ModelSim logic simulator. Test bench programs may include simulated waveforms of the input signals, or monitors that observe and verify the outputs of the device under test. ModelSim or ISim may be used to perform several types of simulation.

**Synthesis** Xilinx's patented synthesis algorithms allow designs to run up to 30% faster than competing programs and permit greater logic density, both of which reduce project time and costs. The increasing complexity of FPGA fabric, which now includes memory and I/O blocks, has also led to the development of more complex synthesis algorithms that separate unrelated modules into slices.

## Image 1:



Fig. 2: Image initialization

Fig. 2 shows the image initialization step: once an image is selected, it is loaded into MATLAB.



**Fig. 3: Extract Feature points** 

Fig. 3 shows the extraction of feature points (pixels) from a picture. Feature points (sometimes called "corners") are points in a photo that remain the same regardless of the viewer's perspective or the zoom level.



Fig. 4: (a) Reshape image (b) Transformed image

Once the picture has been preprocessed, the reshaped image and the transformed image are obtained, as shown in Fig. 4. The filter files are opened in the Xilinx startup window, as shown in Fig. 5, where the files, the simulation tabs, and the implementation tabs can be seen.



Fig. 5: Top module of filter in Xilinx environment

**Fig. 5** shows the top module of the filter together with its input and output ports.



Fig.6: Complete RTL View

All signals in Fig. 6 are in their initial U or X state, demonstrating the high-impedance condition of the Xilinx test bench.



Fig. 8: Assign clock and reset

Fig. 8 shows the clock and reset pulses; both the clock and the reset pulse need to be set to 1 for the trigger to operate.




Fig. 9 (a) Output Resized Image (b) Gray Image

Gray scaling is the process of transforming a picture from a colour space such as RGB, CMYK, or HSV to shades of grey, which range from completely black to completely white. Fig. 9 shows the resized image and the grey image.



Fig. 10: (a) Inverse Transformed Image (b) Output Filtered Image

Fig. 10 displays the inverse-transformed picture and the filtered output picture. When a picture is filtered, the values of its pixels are changed to alter its overall look. Filters may be applied to photographs to adjust their brightness and sharpness and to introduce a wide range of additional effects.



Fig. 11: (a) Original Image (b) White image



Fig. 12: (a) Gray scale (b) Feature Extracted



Fig. 13: Activate filter



Fig.14: (a) Output Resized Image (b) Gray Image



Fig. 15: (a) Inverse Transformed Image (b) Output Filtered Image

Area or number of components

TABLE I: Device utilization summary (estimated values)

| Logic Utilization                 | Used | Available | Utilization |
|-----------------------------------|------|-----------|-------------|
| Number of Slice Registers         | 349  | 126800    | 0%          |
| Number of Slice LUTs              | 418  | 63400     | 0%          |
| Number of fully used LUT-FF pairs | 254  | 513       | 49%         |
| Number of bonded IOBs             | 57   | 210       | 27%         |
| Number of Block RAM/FIFO          | 1    | 135       | 0%          |
| Number of BUFG/BUFGCTRLs          | 1    | 32        | 3%          |
| Number of DSP48E1s                | 8    | 240       | 3%          |

Table I summarises the device utilization of the proposed implementation. Only 418 of the 63400 available slice LUTs are used, 254 of the 513 LUT-flip-flop pairs are fully used, and 57 of the 210 bonded input/output blocks are used. From these figures the overall area can be determined: the proposed digital filter occupies a total of 11.71 percent of the available resources.
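The text does not state how the 11.71 percent figure is derived. One reading that reproduces it, offered here only as an assumption, is the arithmetic mean of the seven values in the Utilization column of Table I:

```python
# Utilization column of Table I, in percent, as reported by the synthesis tool.
utilization_percent = {
    "Slice Registers": 0,
    "Slice LUTs": 0,
    "Fully used LUT-FF pairs": 49,
    "Bonded IOBs": 27,
    "Block RAM/FIFO": 0,
    "BUFG/BUFGCTRLs": 3,
    "DSP48E1s": 3,
}

# Assumed aggregation: unweighted mean over the seven resource types.
mean_utilization = sum(utilization_percent.values()) / len(utilization_percent)
print(round(mean_utilization, 2))
```

Because the mean is unweighted, a small resource pool such as the LUT-FF pairs dominates the result even though the large pools are almost unused.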

## Memory

The XST synthesis run takes 34 seconds of real time and 34.92 seconds of CPU time, with a total memory usage of 4649268 bytes.

**TABLE II: Simulation results** 

| Sr No. | Parameter           | Value               |
|--------|---------------------|---------------------|
| 1      | Method              | Wavelet filter      |
| 2      | Area                | 11.71 %             |
| 3      | Delay               | 0.897 ns            |
| 4      | Power               | 0.082 W             |
| 5      | Power Delay product | 81.5                |
| 6      | Frequency           | 1114 MHz            |
| 7      | Throughput          | 89120000 pixels/sec |

**TABLE III: Comparison of results**

| Sr No. | Parameter               | Compared Work       | Proposed Work       |
|--------|-------------------------|---------------------|---------------------|
| 1      | Filter Type             | Bilateral Filter    | Wavelet Filter      |
| 2      | Delay                   | NA                  | 0.897 ns            |
| 3      | Frequency               | 236.697 MHz         | 1114 MHz            |
| 4      | Slice LUTs              | 5142                | 418                 |
| 5      | Fully used LUT-FF pairs | 1782                | 254                 |
| 6      | Bonded IOBs             | 69                  | 57                  |
| 7      | Number of DSP48E1s      | 36                  | 8                   |
| 8      | Throughput              | 59171103 pixels/sec | 89120000 pixels/sec |

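Table III compares the proposed work with the existing bilateral filter design. The proposed wavelet filter uses fewer slice LUTs (418 versus 5142), fewer fully used LUT-FF pairs (254 versus 1782), fewer bonded IOBs (57 versus 69), and fewer DSP48E1s (8 versus 36), while operating at a higher frequency (1114 MHz versus 236.697 MHz) and a higher throughput (89120000 versus 59171103 pixels/sec).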
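The reported timing and comparison figures can be cross-checked with a few lines of arithmetic. The sketch assumes that the maximum frequency is the reciprocal of the reported delay; all numbers are taken from Tables II and III, and the variable names are illustrative.

```python
# Values reported in Tables II and III.
delay_ns = 0.897
proposed = {"frequency_mhz": 1114, "slice_luts": 418, "throughput": 89120000}
compared = {"frequency_mhz": 236.697, "slice_luts": 5142, "throughput": 59171103}

# Assumption: maximum clock frequency is the reciprocal of the path delay.
frequency_from_delay_mhz = 1e3 / delay_ns

# Relative figures of merit between the two designs.
lut_saving = 1 - proposed["slice_luts"] / compared["slice_luts"]
frequency_ratio = proposed["frequency_mhz"] / compared["frequency_mhz"]
throughput_ratio = proposed["throughput"] / compared["throughput"]

print(f"frequency from delay: {frequency_from_delay_mhz:.1f} MHz")
print(f"slice LUT saving:     {lut_saving:.1%}")
print(f"frequency ratio:      {frequency_ratio:.2f}x")
print(f"throughput ratio:     {throughput_ratio:.2f}x")
```

Under that assumption the 0.897 ns delay is consistent with the tabulated 1114 MHz, and the LUT count falls by roughly nine tenths relative to the compared design.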

## **VI. CONCLUSION**

This work proposes a VLSI filter implementation for image denoising with performance optimization. An orthogonal filter with the discrete wavelet transform (DWT) is used as a tool for generating feature points and for the encoding procedure. DWT is investigated as a method for encoding the feature points and the inverse DWT (IDWT) as a method for decoding them. The reconstructed picture is produced using the inverse Feature transform. The filter architecture work covered both modifying the filter section of the required wavelet transforms and optimising the filter bank section of the proposed design.

The transform function is intended to make the image processing approach more effective. The primary emphasis of this study was on image processing using FPGA technology, and these objectives are to be accomplished by optimising the filter and transform architecture. In most cases, DWT transformations are used to carry out a comparative examination of various picture compression methods. Each method comes with its own advantages and disadvantages with respect to maintaining good picture quality in the reconstructed image.

## REFERENCES

- [1] C. Y. Lien, C. H. Tang, P. Y. Chen, Y. T. Kuo and Y.-L. Deng, "A Low-Cost VLSI Architecture of the Bilateral Filter for Real-Time Image Denoising," in IEEE Access, vol. 8, pp. 64278-64283, 2020, doi: 10.1109/ACCESS.2020.2984688.
- [2] M. Mody, R. Allu, J. Villarreal, W. Wallace, N. Nandan and A. Baranwal, "High Throughput VLSI Architecture for Bilateral Filtering in Computer Vision," 2022 IEEE International Conference on Electronics, Computing and Communication Technologies (CONECCT), Bangalore, India, 2022, pp. 1-4, doi: 10.1109/CONECCT55679.2022.9865827.
- [3] C. Saranya, M. V. Vijayananth, S. Mouleeshwaran and D. Kaviyarasu, "A Picture Handling Application: Plan of VLSI based Reciprocal Channel utilizing Distance Grid Technique," 2022 Second International Conference on Advanced Technologies in Intelligent Control, Environment, Computing & Communication Engineering (ICATIECE), Bangalore, India, 2022, pp. 1-5, doi: 10.1109/ICATIECE56365.2022.10047701.
- [4] S. D. Palekar, J. Kalambe and R. M. Patrikar, "Biochemical Blood Sensing Platform With CMOS Image Sensor and Software-Based Wavelength Filter," in IEEE Sensors Journal, vol. 22, no. 22, pp. 21753-21760, 15 Nov. 2022, doi: 10.1109/JSEN.2022.3208810.
- [5] K, R. Karthickkeyan, S. Kishore and R. Sharan, "Stall Multiplier-Based Strong Model of FIR Channels for VLSI Applications," 2022 6th International Conference on Electronics, Communication and Aerospace Technology, Coimbatore, India, 2022, pp. 249-254, doi: 10.1109/ICECA55336.2022.10009401.
- [6] P. T. L. Pereira, G. Paim, P. Ü. L. d. Costa, E. A. C. d. Costa, S. J. M. de Almeida and S. Bampi, "Architectural Exploration for Energy-Efficient Fixed-Point Kalman Filter VLSI Design," in IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 29, no. 7, pp. 1402-1415, July 2021, doi: 10.1109/TVLSI.2021.3075379.
- [7] I.-S. Joe et al., "Development of Advanced Inter-Color-Filter Grid on Sub-Micron-Pixel CMOS Image Sensor for Mobile Cameras with High Sensitivity and High Resolution," 2021 Symposium on VLSI Technology, Kyoto, Japan, 2021, pp. 1-2.
- [8] A. Yang, "Research on image filtering method to combine mathematics morphology with adaptive median filter," 9th International Conference on Optical Communications and Networks, Nanjing, 2021, pp. 55-59, doi: 10.1049/cp.2021.1152.

- [9] Y. Mishra and R. Rastogi, "Plan of FIR Channel utilizing New Window Capability to eliminate Boisterous Sign," 2021 3rd International Conference on Advances in Computing, Communication Control and Networking (ICAC3N), Greater Noida, India, 2021, pp. 1011-1017.
- [10] W. He et al., "A Low-Cost High-Speed Object Tracking VLSI System Based on Unified Textural and Dynamic Compressive Features," in IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 68, no. 3, pp. 1013-1017, March 2021,

