Resource Efficient Single Precision Floating Point Multiplier Using Karatsuba Algorithm

ABSTRACT


INTRODUCTION
Floating-point arithmetic operations are widely used in applications such as signal processing and numerical computing. The floating-point number standard is defined by IEEE [3] for different formats such as single precision and double precision. Among floating-point operations, multiplication is the most complex and is the performance-deciding block. Hence, efficient implementation of floating-point multipliers is a major concern.
Over the last few decades, a lot of work has been done at both the algorithmic and implementation levels to improve the performance of floating-point computations [13]-[15]. Several works have also focused on implementation on FPGA platforms [11], [20]. In spite of these efforts, this arithmetic is still often the bottleneck in many computations.
The mantissa multiplication is the dominant part of floating-point multiplication with respect to performance. For single-precision numbers the mantissa is 24 bits long, and multipliers of this width generally require considerable hardware [1]-[2]. In this work, an approach to the mantissa multiplication of single-precision floating-point numbers is presented that reduces complexity while delivering high performance.
The contribution of this paper can be summarized as follows:
• First, a single-precision floating-point multiplier using the Karatsuba algorithm is designed with a Vedic multiplier.
• Further, in the multiplication process, the full adder block is replaced with modified 2x1 multiplexers and modified 3:2 and 4:2 compressor techniques wherever addition is performed with a full adder, to improve the performance of the multiplication block.
• Furthermore, a performance analysis is summarized for the proposed single-precision floating-point multiplication.
The rest of the paper is organized as follows: Section 2 explains the implementation of floating-point multiplication, the design approach of the Karatsuba algorithm is discussed in Section 3, and the Karatsuba algorithm using existing techniques is described in Section 4. In Section 5, the Karatsuba algorithm using the proposed techniques is presented, and simulation results are discussed in Section 6.

FLOATING POINT MULTIPLICATION
The floating-point binary format is defined by the IEEE-754 standard, which is used for representing floating-point numbers [12]. This standard specifies the format for single- and double-precision numbers, i.e., 32-bit and 64-bit respectively. FPU arithmetic computations comprise addition, subtraction, multiplication, division, inverse, etc. Generally, arithmetic operations on floating-point numbers involve operating on the mantissa, exponent and sign parts of the operands and then combining them after rounding and normalization. A brief overview of the computational flow of this arithmetic operation is given below.
Floating-point multiplication is like normal multiplication, except that it entails a complex mantissa multiplication which needs a large 24-bit multiplier. This places a restriction on the performance of the hardware design. It can be performed in the following steps [3]:
• XOR operation of the MSB (sign) bits to acquire the sign bit of the final product.
• Addition of the exponent parts of the given numbers.
• Multiplication of the mantissa parts of the given numbers.
• Rounding operation is performed on the final mantissa product.
• The final step is to normalize the acquired result to adjust the exponent and mantissa parts.

Floating Point Multiplication
• The logical XOR of the sign bits of both operands gives the output sign: S_out = S_in1 ⊕ S_in2.
• The output exponent is given by the addition of both input exponents, adjusted by the bias:
Out1 = Exp_in1 + Exp_in2 − 127 (SPFPM)
Out2 = Exp_in1 + Exp_in2 − 1023 (DPFPM) (1)
• Generally, for any given floating-point format the bias is obtained from the expression 2^(e−1) − 1, where e is the number of exponent bits.
• The value of the final product is Z = (−1)^S × 2^(E−Bias) × (1.M).
• If an additional carry is generated after multiplication, the product is right-shifted by 1 bit and the exponent is incremented by one to normalize the result.
• Rounding is required in order to trim the mantissa multiplication result back to its representable width (the 48-bit product to 24 bits for single precision, the 106-bit product to 53 bits for double precision). This is done as per the IEEE standard [16]-[17].
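The steps above can be modeled behaviorally. The following Python sketch (not the paper's Verilog implementation) emulates the single-precision datapath described: sign XOR, exponent addition with bias removal, 24-bit mantissa multiplication with the hidden bit restored, then normalization. It is illustrative only: subnormals, infinities and NaNs are not handled, and rounding is simple truncation rather than IEEE round-to-nearest-even.

```python
import struct

def fp32_mul(a, b):
    """Emulate single-precision floating-point multiplication field by field.
    Sketch only: special values are ignored and rounding is truncation."""
    ai = struct.unpack('>I', struct.pack('>f', a))[0]
    bi = struct.unpack('>I', struct.pack('>f', b))[0]

    sa, ea, ma = ai >> 31, (ai >> 23) & 0xFF, ai & 0x7FFFFF
    sb, eb, mb = bi >> 31, (bi >> 23) & 0xFF, bi & 0x7FFFFF

    s = sa ^ sb                               # sign: XOR of input signs
    e = ea + eb - 127                         # exponent: add, remove one bias (Eq. 1)
    pa, pb = (1 << 23) | ma, (1 << 23) | mb   # restore hidden '1.' bits
    prod = pa * pb                            # 24x24 -> 48-bit mantissa product

    if prod & (1 << 47):                      # extra carry generated:
        prod >>= 1                            # right-shift product by 1 and
        e += 1                                # increment exponent to normalize
    m = (prod >> 23) & 0x7FFFFF               # keep 23 fraction bits (truncate)

    out = (s << 31) | ((e & 0xFF) << 23) | m
    return struct.unpack('>f', struct.pack('>I', out))[0]
```

For example, `fp32_mul(1.5, 1.5)` exercises the normalization branch, since 1.5 × 1.5 = 2.25 produces a mantissa product of at least 2.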

DESIGN APPROACH OF KARATSUBA ALGORITHM FOR MULTIPLICATION
The Karatsuba algorithm for floating-point multiplication follows a divide-and-conquer method. The basic steps of this algorithm are discussed below [10]. Consider two floating-point numbers A and B; the following steps explain how the algorithm performs the multiplication: • First, divide each mantissa into three parts of equal width, i.e., 8 bits each, namely A0, A1, A2 and B0, B1, B2 for the inputs A and B respectively. • The procedure for Karatsuba multiplication is expressed using the equations drafted below. This formulation uses an appropriate number of adder blocks in place of a few multipliers to improve hardware efficiency.
• From equation (4), it is observed that the Karatsuba algorithm needs 9 multipliers and 8 adders to obtain the result.
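The three-way split can be sketched as follows. This Python model (equation (4)'s exact adder grouping is not reproduced in this excerpt, so the sketch simply assembles the product from the nine 8x8 limb products mentioned above) is a behavioral reference, not the hardware design:

```python
def split3(x, w=8):
    """Split a 24-bit operand into three w-bit limbs (limb 0 = LSBs)."""
    mask = (1 << w) - 1
    return x & mask, (x >> w) & mask, (x >> 2 * w) & mask

def karatsuba3_mul(a, b, w=8):
    """Three-way split multiplication: each 24-bit mantissa is cut into
    three 8-bit limbs A0..A2 and B0..B2, and the product is assembled
    from the nine limb products, shifted by their combined limb weight."""
    a_limbs = split3(a, w)
    b_limbs = split3(b, w)
    total = 0
    for i, ai in enumerate(a_limbs):
        for j, bj in enumerate(b_limbs):
            total += (ai * bj) << (w * (i + j))   # nine 8x8 multipliers
    return total
```

In hardware, each `ai * bj` term corresponds to one of the nine 8x8 multiplier blocks, and the shifted additions correspond to the adder tree.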
• In this design, the mantissa multiplication is performed with a Vedic multiplier combined with different techniques such as multiplexers and compressors [6], [4]. Figure 2 illustrates the multiplication of two 24-bit mantissa numbers using the Vedic multiplier with ripple-carry addition.

Vedic Multiplier
The Vedic multiplier is developed based on the Urdhva Tiryagbhyam (UT) sutra. Partial-product generation is done in a vertical and crosswise manner, and the parallel addition of these partial products is then performed using different available adders. In this paper, the mantissa bits of both numbers are multiplied using a Vedic multiplier, which works on the principle of UT. The generation of the partial products and their summation are produced in one step, which minimizes the carry propagation from LSB to MSB. Because of this, the speed of the multiplier is improved compared to the available multipliers [7]-[9]. Figure 3 illustrates the single-precision floating-point multiplication of two 24-bit numbers using the Vedic multiplier with a ripple-carry adder using multiplexers.
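The vertical-and-crosswise principle can be illustrated behaviorally. In this Python sketch (a software model, not the paper's hardware), each result column k gathers all crosswise bit products a[i]·b[k−i] at once, so the carry ripples once per column rather than once per partial-product row:

```python
def vedic_mul(a, b, n=4):
    """Urdhva Tiryagbhyam (vertical-and-crosswise) multiplication model
    for two n-bit operands: generate each column's crosswise partial
    products together, add them with the incoming carry, keep one result
    bit and pass the rest on as carry to the next column."""
    abits = [(a >> i) & 1 for i in range(n)]
    bbits = [(b >> i) & 1 for i in range(n)]
    result, carry = 0, 0
    for k in range(2 * n - 1):                    # one pass per column
        col = carry + sum(abits[i] * bbits[k - i]
                          for i in range(n) if 0 <= k - i < n)
        result |= (col & 1) << k                  # column result bit
        carry = col >> 1                          # remaining bits carry on
    result |= carry << (2 * n - 1)
    return result
```

The width `n=4` is chosen here for illustration; the paper applies the same scheme to 24-bit mantissas.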

EXISTING WORK
The various existing techniques are listed below:
• A ripple-carry adder is used for partial-product addition in the Vedic multiplier used in the Karatsuba algorithm.
• Next, in place of the ripple-carry adder, multiplexers are used for partial-product addition.
• Then, different 3:2 techniques with XOR-Mux and XOR-XNOR-Mux logics are used for partial-product addition.
• Finally, the performance constraints of all existing techniques are discussed in this paper.

Karatsuba Algorithm with RCA using Full Adder based Multiplication
The ripple-carry adder built from full adders is used to perform partial-product addition. In the Karatsuba algorithm, the multiplier uses this RCA for addition.

4x1 Multiplexer based SPFP Multiplication using Karatsuba Algorithm
The full adder block is designed using two 4x1 multiplexers to perform partial-product addition, as shown in Figure 4. In Figure 4, A, B and C are the three inputs; among these, B and C act as the selection lines for both multiplexers, and I0, I1, I2, I3 are the multiplexer inputs. In the first multiplexer, the inputs I0 and I3 are tied to A and I1, I2 are tied to Ā, generating the sum output. In the second multiplexer, I1 and I2 are tied to A, while I0 is connected to logic 0 and I3 to logic 1, generating the carry output.
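The wiring just described can be checked against the full-adder truth table with a short behavioral model (Python, standing in for the hardware description):

```python
def mux4(i0, i1, i2, i3, s1, s0):
    """Behavioral 4x1 multiplexer: select lines {s1,s0} pick one input."""
    return (i0, i1, i2, i3)[(s1 << 1) | s0]

def full_adder_mux4(a, b, c):
    """Full adder from two 4x1 muxes per the description of Figure 4:
    b and c drive the select lines of both muxes; the sum mux is fed
    A on I0/I3 and not-A on I1/I2, the carry mux gets 0, A, A, 1."""
    na = a ^ 1
    s = mux4(a, na, na, a, b, c)     # I0=I3=A, I1=I2=~A  -> sum = a^b^c
    cout = mux4(0, a, a, 1, b, c)    # I0=0, I1=I2=A, I3=1 -> majority carry
    return s, cout
```

Enumerating all eight input combinations confirms that this mux wiring reproduces sum = a ⊕ b ⊕ c and carry = majority(a, b, c).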

Karatsuba Algorithm with 3:2 Compressor based multiplication
In multiplication, different types of compressors are used for partial-product addition to improve performance [15]. Compressors are usually employed to reduce the critical path, which is important for improving performance at the partial-product reduction stage. The symbolic representations of 3:2 compressors using XOR-Mux and XOR-XNOR-Mux logics are shown in Figure 5 and Figure 6.
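A 3:2 compressor is functionally a full adder: it compresses three input bits into a sum bit and a carry bit. The following Python sketch models the XOR-Mux decomposition (one of the two logic styles named above; the XOR-XNOR-Mux variant differs only in gate-level realization):

```python
def compressor_3_2(a, b, c):
    """3:2 compressor in XOR-Mux form: p = a^b feeds a second XOR for
    the sum and acts as the select of a 2x1 mux for the carry
    (carry = c when p = 1, else a)."""
    p = a ^ b
    s = p ^ c              # sum = a ^ b ^ c
    carry = c if p else a  # mux replaces the AND-OR carry logic
    return s, carry
```

The defining property is a + b + c = sum + 2·carry, which holds for all eight input combinations.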

PROPOSED WORK
The contribution of this paper lies in improving the performance of single-precision floating-point multiplication. The Karatsuba algorithm is analyzed and, to improve it further, different modified techniques are used in the multiplier part. A brief outline of the modifications in the mantissa multiplication part is given below:
• First, the mantissa multiplication for single-precision numbers is performed with modified multiplexer models 1 and 2 using two 2x1 multiplexers. The adder in the above-mentioned multiplication is replaced with 4x1 and 2x1 multiplexers to improve area and speed.
• The mantissa multiplication for single-precision numbers is also performed using modified 4:2 compressor techniques with XOR-Mux and modified 2x1 multiplexers, and likewise with XOR-XNOR-Mux and 2x1 multiplexers, to improve area and speed.
• The performance constraints of the proposed techniques are summarized and a conclusion is drawn. In the next section, a detailed explanation is given for each proposed model designed with multiplexer and 4:2 compressor logics.

Modified 2x1 Multiplexer based Multiplication
Here, multiplexers are used in place of the full adder to add partial products. The full adder block is designed using two 2x1 multiplexers, as explained in Figure 7 and Figure 8. In Figure 7, the outputs of the XNOR and XOR of A, B are the inputs to the first multiplexer, which gives the sum output. The outputs of the AND and OR of A, B are the inputs to the second multiplexer, which produces the final carry. Here, cin is the common select line for both multiplexers. The performance comparison of the modified multiplexer-based single-precision floating-point multiplication models is carried out with respect to area and delay.
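This two-mux structure can again be verified behaviorally. In the sketch below (a Python model of the described logic; the mux input ordering with cin = 0 selecting XOR/AND is an assumption consistent with full-adder behavior, as the figures are not reproduced here):

```python
def mux2(i0, i1, sel):
    """Behavioral 2x1 multiplexer."""
    return i1 if sel else i0

def full_adder_mux2(a, b, cin):
    """Full adder from two 2x1 muxes with cin as the common select line
    (per Figures 7 and 8): the sum mux chooses between XOR and XNOR of
    a, b; the carry mux chooses between AND and OR of a, b."""
    s = mux2(a ^ b, (a ^ b) ^ 1, cin)   # cin=0 -> XOR, cin=1 -> XNOR
    cout = mux2(a & b, a | b, cin)      # cin=0 -> AND, cin=1 -> OR
    return s, cout
```

The identity behind it: a ⊕ b ⊕ 1 = XNOR(a, b) gives the sum when cin = 1, and the majority carry reduces to AND(a, b) when cin = 0 and OR(a, b) when cin = 1.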

Modified Compressor based Multiplication
Here, compressors are used in place of the full adder to add partial products, to achieve better results in terms of delay and power. To design the 4:2 compressor, two types of modules are involved: an XOR-XNOR-Mux block and an XOR-Mux block. The representations of 4:2 compressors using both logics are shown in Figure 9 and Figure 10.
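A 4:2 compressor takes four operand bits plus an incoming carry and produces a sum, a carry, and an outgoing carry for the next column. One common realization, sketched here in Python, chains two 3:2 (XOR-plus-mux) stages; this is a plausible model of the structure in Figures 9-10, which are not reproduced in this excerpt:

```python
def compressor_4_2(a, b, c, d, cin):
    """4:2 compressor from two cascaded 3:2 stages in XOR-Mux form.
    Stage 1 compresses a, b, c; stage 2 folds in d and the incoming
    carry cin. cout feeds the next column's cin."""
    p1 = a ^ b
    s1 = p1 ^ c
    cout = c if p1 else a          # first-stage mux carry
    p2 = s1 ^ d
    s = p2 ^ cin
    carry = cin if p2 else d       # second-stage mux carry
    return s, carry, cout
```

The defining arithmetic property is a + b + c + d + cin = sum + 2·(carry + cout), which holds for all 32 input combinations; crucially, cout does not depend on cin, so the carry chain does not ripple across columns.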

SIMULATION RESULTS
The modules involved in floating-point multiplication are implemented using the Verilog hardware description language. All blocks are simulated and synthesized for FPGA targets with Xilinx ISE. The Karatsuba algorithm with the Vedic multiplier, using the different existing and proposed techniques, is considered for simulation and synthesis. Further, a performance comparison of these existing and proposed techniques is made in terms of delay and area. From Table 2, it is noticed that the SPFPM using the Karatsuba algorithm with the Vedic multiplier and the 4:2 compressor with XOR-Mux logic provides better results in terms of delay and area. Table 3 likewise shows that the 4:2 compressor with XOR-Mux logic provides the best results in terms of area and delay.

CONCLUSION
In this paper, floating-point multiplication for single-precision numbers is developed using the Karatsuba algorithm with a Vedic multiplier, with different existing techniques (full adder, multiplexers, and 3:2 and 4:2 compressors) and proposed techniques (modified 2x1 multiplexers and modified 3:2 and 4:2 compressors). Further, a performance comparison among these techniques is made in terms of area and delay. From the simulation results, it is inferred that single-precision multiplication using the Karatsuba algorithm with the 4:2 compressor and XOR-Mux logic combination provides better results in terms of area and delay than the other techniques.