Fast Algorithm for Computing the Discrete Hartley Transform of Type-II

The generalized discrete Hartley transforms (GDHTs) have proved to be an efficient alternative to the generalized discrete Fourier transforms (GDFTs) for real-valued data applications. In this paper, a radix-2 decimation-in-time (DIT) algorithm for the direct fast computation of the GDHT of type-II (DHT-II) is developed. The mathematical analysis and the implementation of the developed algorithm are presented, showing that the algorithm possesses a regular structure and can be implemented in-place for efficient memory utilization. The performance of the proposed algorithm is analyzed and its computational complexity is calculated for different transform lengths. A comparison between this algorithm and existing DHT-II algorithms shows that it can be considered a good compromise between structural and computational complexity.


Introduction
In recent years, the discrete Hartley transform (DHT) has gained popularity in the fields of digital signal and image processing, as it possesses many desirable properties that are used in many applications [1][2][3]. In general, there are four types of this transform, known as the type-I, -II, -III and -IV DHTs respectively. In the literature, the type-I DHT is simply called the DHT, and the other types are called the generalized discrete Hartley transforms (GDHTs), because their definitions contain shifts in the time index, the frequency index, or both. The GDHTs are also known as the W transforms, which were introduced by Z. Wang [4,5]; the only difference between them is that they use different scaling factors in their definitions. If these scaling factors are ignored, the same fast algorithms can be used for both the GDHTs and the W transforms. The GDHT/W transforms have proven to be important tools in signal processing and related fields, for example for the fast computation of different types of convolutions [6][7][8][9], filter banks, signal representations [10] and many other applications.
The direct computation of the GDHTs is intensive and requires on the order of $N^2$ arithmetic operations, where N is the transform length; therefore fast GDHT (GFHT) algorithms have been introduced to reduce the arithmetic complexity and implementation costs. Among them, Hu et al. [11] proposed several fast algorithms for computing the GDHTs, and Bi and Chen [12] derived a split-radix algorithm for the computation of GDFTs and GDHTs. Chiper [13] introduced an algorithm for decomposing a DHT-II of length N using two adjacent N/2 sets of coefficients. The aforementioned algorithms focus on sequences whose length N is a power of two. A fast algorithm for calculating the DHT-II with length N being a power of three was introduced by Shu et al. [14], and one for composite transform lengths was developed by Bi et al. [15]. Moreover, Shu et al. [16] developed a fast algorithm for the direct computation of the DHT-II, based on the decomposition of a length-N DHT-II into two DHT-IIs of length N/2.
Shu's algorithm is based on the decimation-in-frequency (DIF) approach. However, for any transform to stand as a good candidate for real-time applications, its complete set of fast algorithms needs to be developed, including the decimation-in-time (DIT) approach. Therefore, the aim of this paper is to develop such an algorithm for the fast computation of the DHT-II.
The rest of the paper is organized as follows. Section 2 presents the derivation of the new fast DIT algorithm for computing the DHT-II. The computational complexity of the developed algorithm is analyzed and compared with that of Hu's algorithm in Section 3. Finally, Section 4 concludes the paper.

Algorithm Derivation
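The DHT-II of a real-valued sequence $x(n)$ of length N is defined as [1]:
$$X(k) = \sum_{n=0}^{N-1} x(n)\,\operatorname{cas}\!\left[\frac{\alpha(2k+1)n}{2}\right], \qquad k = 0, 1, \ldots, N-1 \tag{1}$$
and the corresponding inverse transform (called the type-III DHT) is given by:
$$x(n) = \frac{1}{N}\sum_{k=0}^{N-1} X(k)\,\operatorname{cas}\!\left[\frac{\alpha(2k+1)n}{2}\right], \qquad n = 0, 1, \ldots, N-1 \tag{2}$$
where $\operatorname{cas}(\theta) = \cos(\theta) + \sin(\theta)$, $\alpha = 2\pi/N$ and N is the transform length.
The decimation-in-time algorithm derivation begins by dividing the input sequence $x(n)$ into its even part $x(2n)$ and odd part $x(2n+1)$. Therefore (1) can be decomposed as:
$$X(k) = X_e(k) + G(k) \tag{3}$$
$$X_e(k) = \sum_{n=0}^{N/2-1} x(2n)\,\operatorname{cas}\!\left[\alpha(2k+1)n\right] \tag{4}$$
$$G(k) = \sum_{n=0}^{N/2-1} x(2n+1)\,\operatorname{cas}\!\left[\alpha(2k+1)n + \frac{\alpha(2k+1)}{2}\right] \tag{5}$$
where $X_e(k)$ is the N/2-point DHT-II of the even part $x(2n)$. Using the following cas property
$$\operatorname{cas}(a+b) = \cos(b)\operatorname{cas}(a) + \sin(b)\operatorname{cas}(-a) \tag{6}$$
the kernel of (5) can be simplified to:
$$\operatorname{cas}\!\left[\alpha(2k+1)n + \frac{\alpha(2k+1)}{2}\right] = \cos\!\left[\frac{\alpha(2k+1)}{2}\right]\operatorname{cas}\!\left[\alpha(2k+1)n\right] + \sin\!\left[\frac{\alpha(2k+1)}{2}\right]\operatorname{cas}\!\left[-\alpha(2k+1)n\right] \tag{7}$$
Therefore, $G(k)$ can be decomposed further to yield:
$$G(k) = \cos\!\left[\frac{\alpha(2k+1)}{2}\right]\sum_{n=0}^{N/2-1} x(2n+1)\operatorname{cas}\!\left[\alpha(2k+1)n\right] + \sin\!\left[\frac{\alpha(2k+1)}{2}\right]\sum_{n=0}^{N/2-1} x(2n+1)\operatorname{cas}\!\left[-\alpha(2k+1)n\right] \tag{8}$$
Since $\alpha N n = 2\pi n$, the second summation of (8) can be simplified further to:
$$\sum_{n=0}^{N/2-1} x(2n+1)\operatorname{cas}\!\left[-\alpha(2k+1)n\right] = \sum_{n=0}^{N/2-1} x(2n+1)\operatorname{cas}\!\left[\alpha N n - \alpha(2k+1)n\right] = \sum_{n=0}^{N/2-1} x(2n+1)\operatorname{cas}\!\left[\alpha(N-2k-1)n\right] = \sum_{n=0}^{N/2-1} x(2n+1)\operatorname{cas}\!\left[\alpha\left(2\left(\tfrac{N}{2}-k-1\right)+1\right)n\right] \tag{9}$$
Substituting (9) into (8) we get:
$$G(k) = \cos\!\left[\frac{\alpha(2k+1)}{2}\right]X_o(k) + \sin\!\left[\frac{\alpha(2k+1)}{2}\right]X_o\!\left(\tfrac{N}{2}-k-1\right) \tag{10}$$
where $X_o(k)$ and $X_o(N/2-k-1)$ can be identified as two points of the N/2-point DHT-II of the odd part $x(2n+1)$ of $x(n)$, given by:
$$X_o(k) = \sum_{n=0}^{N/2-1} x(2n+1)\,\operatorname{cas}\!\left[\alpha(2k+1)n\right] \tag{11}$$
Substituting (4) and (10) into (3), we obtain the following recursive formula:
$$X(k) = X_e(k) + \cos\!\left[\frac{\alpha(2k+1)}{2}\right]X_o(k) + \sin\!\left[\frac{\alpha(2k+1)}{2}\right]X_o\!\left(\tfrac{N}{2}-k-1\right) \tag{12}$$
For the radix-2 algorithm, another point $X(k+N/2)$ needs to be computed. Using trigonometric identities and the periodicity property of the DHT-II, we get:
$$X\!\left(k+\tfrac{N}{2}\right) = X_e(k) - \cos\!\left[\frac{\alpha(2k+1)}{2}\right]X_o(k) - \sin\!\left[\frac{\alpha(2k+1)}{2}\right]X_o\!\left(\tfrac{N}{2}-k-1\right) \tag{13}$$
Examining (12) and (13), we see that in-place computation is not possible with these two decompositions alone, because computing the points $X(k)$ and $X(k+N/2)$ requires $X_o(k)$ as well as $X_o(N/2-k-1)$. However, this difficulty can be resolved by additionally considering the decompositions for the points $X(N/2-k-1)$ and $X(N-k-1)$, as follows:
$$X\!\left(\tfrac{N}{2}-k-1\right) = X_e\!\left(\tfrac{N}{2}-k-1\right) + \sin\!\left[\frac{\alpha(2k+1)}{2}\right]X_o(k) - \cos\!\left[\frac{\alpha(2k+1)}{2}\right]X_o\!\left(\tfrac{N}{2}-k-1\right) \tag{14}$$
$$X(N-k-1) = X_e\!\left(\tfrac{N}{2}-k-1\right) - \sin\!\left[\frac{\alpha(2k+1)}{2}\right]X_o(k) + \cos\!\left[\frac{\alpha(2k+1)}{2}\right]X_o\!\left(\tfrac{N}{2}-k-1\right) \tag{15}$$
for $k = 0, 1, \ldots, N/4-1$. Equations (12)-(15) can be implemented using the in-place butterfly structure shown in Figure 1. An example of calculating a 16-point DHT-II using the developed algorithm is shown in Figure 2.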
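The four-point decomposition in (12)-(15) can be checked numerically with a short sketch. The code below is only an illustration under stated assumptions: it takes the DHT-II kernel to be cas((2k+1)nπ/N), which is consistent with the index pattern k, k+N/2, N/2−k−1, N−k−1 and with the count of four multiplications and six additions per butterfly, and the function names `dht2_direct` and `dht2_dit` are hypothetical. For clarity it recurses on new lists rather than working in place.

```python
import math


def cas(t):
    """cas(t) = cos(t) + sin(t)."""
    return math.cos(t) + math.sin(t)


def dht2_direct(x):
    """O(N^2) reference, assuming the kernel cas((2k+1) * n * pi / N)."""
    N = len(x)
    return [sum(x[n] * cas((2 * k + 1) * n * math.pi / N) for n in range(N))
            for k in range(N)]


def dht2_dit(x):
    """Recursive radix-2 DIT sketch; len(x) must be a power of two."""
    N = len(x)
    if N == 1:
        return [x[0]]
    E = dht2_dit(x[0::2])  # N/2-point transform of the even part x(2n)
    O = dht2_dit(x[1::2])  # N/2-point transform of the odd part x(2n+1)
    if N == 2:
        # Degenerate case: the four butterfly indices collapse to two,
        # and the twiddle angle is pi/2 (cos = 0, sin = 1).
        return [E[0] + O[0], E[0] - O[0]]
    half = N // 2
    X = [0.0] * N
    for k in range(N // 4):
        j = half - k - 1                      # partner index N/2 - k - 1
        c = math.cos((2 * k + 1) * math.pi / N)
        s = math.sin((2 * k + 1) * math.pi / N)
        p = c * O[k] + s * O[j]               # 2 multiplications, 1 addition
        q = s * O[k] - c * O[j]               # 2 multiplications, 1 addition
        X[k] = E[k] + p                       # point k
        X[k + half] = E[k] - p                # point k + N/2
        X[j] = E[j] + q                       # point N/2 - k - 1
        X[j + half] = E[j] - q                # point N - k - 1
    return X


if __name__ == "__main__":
    x = [float((7 * n) % 11 - 5) for n in range(16)]
    err = max(abs(a - b) for a, b in zip(dht2_dit(x), dht2_direct(x)))
    print("max |fast - direct| for N = 16:", err)
```

Each pass of the loop touches exactly the four outputs of one butterfly and reads only `E[k]`, `E[j]`, `O[k]` and `O[j]`, which is the property that makes an in-place arrangement possible once the input has been suitably reordered.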

Computational Complexity
The radix-2 DHT-II DIT algorithm combines four points together to form the in-place butterfly shown in Figure 1. Each butterfly calculates four points and requires four multiplications and six additions. In general, the algorithm requires $\log_2 N$ stages of butterfly computations, in which each stage uses $N$ multiplications and $3N/2$ additions. Therefore, the arithmetic complexity of the whole transform satisfies:
$$M(N) = 2M(N/2) + N, \qquad A(N) = 2A(N/2) + \frac{3N}{2} \tag{16}$$
where $M(N)$ and $A(N)$ stand for the number of multiplications and additions respectively. Note that (16) gives the arithmetic complexity of the radix-2 DIT algorithm when a single general butterfly is used throughout; thus the trivial twiddle factors ±1 are also counted as multiplications. The total number of multiplications and additions can be reduced further by using more than one butterfly.
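The per-stage figures follow from the butterfly counts: a stage contains N/4 four-point butterflies, each costing four multiplications and six additions, so a stage costs N multiplications and 3N/2 additions. A minimal sketch of the resulting single-butterfly recurrences M(N) = 2M(N/2) + N and A(N) = 2A(N/2) + 3N/2 is given below. The function name is hypothetical, and the recursion is assumed to bottom out at N = 1 with zero operations so that there are log2(N) stages in total.

```python
def single_butterfly_counts(N):
    """Return (multiplications, additions) for a power-of-two length N,
    counting every twiddle factor (including +/-1) as a multiplication.
    Assumes zero cost at N = 1, i.e. log2(N) stages overall."""
    if N == 1:
        return (0, 0)
    m, a = single_butterfly_counts(N // 2)
    # Two half-length transforms plus one stage of N/4 butterflies.
    return (2 * m + N, 2 * a + 3 * N // 2)


if __name__ == "__main__":
    for N in (8, 16, 32, 64):
        print(N, single_butterfly_counts(N))
```

Under these assumptions the recurrences unroll to N·log2(N) multiplications and (3N/2)·log2(N) additions, which is an upper bound on the counts obtained once trivial twiddle factors are handled by separate butterflies.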
The arithmetic complexities given by (16) are recursive. To obtain the complexity in closed form, the initial values of these recursions are required. In this case, the initial values can be taken as the number of operations needed by a length-4 DHT-II, namely $M(4) = 2$ multiplications and $A(4) = 6$ additions. Solving (16) by repeated substitution of the initial values and using multiple butterflies, we get:
$$M(N) = N\log_2 N - 2N + 2, \qquad A(N) = \frac{3N}{2}\log_2 N - 2N + 2 \tag{17}$$
A comparison has been made between this algorithm and Hu's radix-2 DIT algorithm [11] in terms of the number of multiplications and additions, as shown in Table 1. The comparison reveals that the total number of multiplications and additions required by the proposed algorithm is lower than that of Hu's algorithm. Furthermore, comparing the signal flow graph of Hu's algorithm (Figure 1 of [11]) with that of the proposed algorithm shown in Figure 2, we can readily see the high regularity and structural simplicity of the developed algorithm in contrast to Hu's algorithm.

Table 1. Number of multiplications and additions required by the proposed algorithm and by Hu's radix-2 DIT algorithm [11].

N       Proposed algorithm              Hu's algorithm [11]
        Mult.    Add.     Total         Mult.    Add.     Total
8       10       22       32            8        28       36
16      34       66       100           28       84       112
32      98       178      276           84       220      304
64      258      450      708           228      540      768
128     642      1090     1732          580      1276     1856
256     1538     2562     4100          1412     2940     4352
512     3586     5890     9476          3332     6652     9984
1024    8194     13314    21508         7684     14844    22528
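As a sanity check on the tabulated counts, the sketch below evaluates the closed forms M(N) = N·log2(N) − 2N + 2 and A(N) = (3N/2)·log2(N) − 2N + 2 and compares them with Table 1. Two assumptions are made here and should be read as such: these expressions are taken to be the closed-form solution because they reproduce the stated initial values M(4) = 2 and A(4) = 6, and the first group of columns in Table 1 is taken to belong to the proposed algorithm (it is the group with the lower totals). The names `proposed_counts`, `TABLE_PROPOSED` and `TABLE_HU` are hypothetical.

```python
def proposed_counts(N):
    """Closed-form (multiplications, additions) for a power-of-two N >= 4.
    Assumed form: M = N*log2(N) - 2N + 2, A = (3N/2)*log2(N) - 2N + 2."""
    p = N.bit_length() - 1          # exact log2 for a power of two
    return (N * p - 2 * N + 2, 3 * N * p // 2 - 2 * N + 2)


# (multiplications, additions) per transform length, copied from Table 1.
TABLE_PROPOSED = {8: (10, 22), 16: (34, 66), 32: (98, 178), 64: (258, 450),
                  128: (642, 1090), 256: (1538, 2562), 512: (3586, 5890),
                  1024: (8194, 13314)}
TABLE_HU = {8: (8, 28), 16: (28, 84), 32: (84, 220), 64: (228, 540),
            128: (580, 1276), 256: (1412, 2940), 512: (3332, 6652),
            1024: (7684, 14844)}


def total_savings():
    """Total operations saved relative to Hu's algorithm, per length."""
    return {N: sum(TABLE_HU[N]) - sum(TABLE_PROPOSED[N]) for N in TABLE_HU}


if __name__ == "__main__":
    for N, row in TABLE_PROPOSED.items():
        print(N, row, proposed_counts(N) == row, total_savings()[N])
```

The table shows a trade-off rather than a uniform win: the proposed algorithm uses slightly more multiplications than Hu's at every length but fewer additions, and the net total is lower.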

Conclusion
This paper has presented a new fast algorithm for the direct computation of the DHT-II. The presented radix-2 decimation-in-time GFHT algorithm has a regular signal flow graph that provides flexibility for different transform lengths and substantially reduces the arithmetic complexity compared with the indirect algorithms. The developed algorithm has been implemented through the DIT approach, and its computational complexity has been analyzed and compared with existing algorithms, showing that it significantly reduces the structural complexity, with a better indexing scheme and easier implementation.