Randomized Lattice Decoding
Abstract
Sphere decoding achieves maximum-likelihood (ML) performance at the cost of exponential complexity; lattice-reduction-aided successive interference cancellation (SIC) significantly reduces the decoding complexity, but exhibits a widening gap to ML performance as the dimension increases. To bridge the gap between them, this paper presents randomized lattice decoding based on Klein's sampling technique, which is a randomized version of Babai's nearest plane algorithm (i.e., SIC). To find the closest lattice point, Klein's algorithm is used to sample some lattice points, and the closest among those samples is chosen. Lattice reduction increases the probability of finding the closest lattice point, and only needs to be run once during preprocessing. Further, the sampling can operate very efficiently in parallel. The technical contribution of this paper is twofold: we analyze and optimize the performance of randomized lattice decoding resulting in reduced decoding complexity, and propose a very efficient implementation of random rounding. Simulation results demonstrate near-ML performance achieved by a moderate number of samples, when the dimension is not too large. Compared to existing decoders, a salient feature of randomized lattice decoding is that it will sample a closer lattice point with higher probability. A byproduct is that boundary errors for finite constellations can be partially compensated if we discard the samples falling outside of the constellation.
I Introduction
Decoding for the linear multi-input multi-output (MIMO) channel is a problem of high relevance in multi-antenna, cooperative and other multi-terminal communication systems. The computational complexity associated with maximum-likelihood (ML) decoding poses significant challenges for hardware implementation. When the codebook forms a lattice, ML decoding corresponds to solving the closest lattice vector problem (CVP). The worst-case complexity for solving the CVP optimally for generic lattices is non-deterministic polynomial-time (NP)-hard. The best CVP algorithms to date are Kannan's [1], which has been shown to be of complexity $n^{O(n)}$, where $n$ is the lattice dimension (see [2]), and whose space requirement is polynomial in $n$, and the recent algorithm by Micciancio and Voulgaris [3], which has complexity $2^{O(n)}$ with respect to both time and space. In digital communications, a finite subset of the lattice is used due to the power constraint. ML decoding for a finite lattice can be realized efficiently by sphere decoding [4, 5, 6], whose average complexity grows exponentially with $n$ for any fixed SNR [7]. This limits sphere decoding to low dimensions. The decoding complexity is especially felt in coded systems. For instance, to decode the perfect code [8], one has to search in a 32-dimensional (real-valued) lattice. State-of-the-art sphere decoding is slow at this dimension. Although some fast-decodable codes have been proposed recently [9], the decoding still relies on sphere decoding.
Thus, we often have to resort to an approximate solution. The problem of solving CVP approximately was first addressed by Babai in [10], which in essence applies zero-forcing (ZF) or successive interference cancellation (SIC) on a reduced lattice. This technique is referred to as lattice-reduction-aided decoding [11, 12]. It is known that Lenstra, Lenstra and Lovász (LLL) reduction achieves full diversity in MIMO fading channels [13, 14] and that lattice-reduction-aided decoding has a constant gap to (infinite) lattice decoding [15]. It was further shown in [16] that minimum mean square error (MMSE)-based lattice-reduction-aided decoding achieves the optimal diversity and multiplexing tradeoff. In [17], it was shown that Babai's decoding using MMSE can provide near-ML performance for small-size MIMO systems. However, the analysis in [15] revealed a widening gap to lattice decoding. Thus, for high-dimensional systems and high-level modulation such as 64-QAM, the performance loss relative to ML is still large.
In this work, we present randomized lattice decoding to narrow the gap between lattice-reduction-aided SIC and sphere decoding. We use Klein's randomized CVP algorithm [18], which is a randomized version of Babai's nearest plane algorithm (i.e., SIC). The core of Klein's algorithm is randomized rounding, which generalizes standard rounding by not necessarily rounding to the nearest integer. Thus far, Klein's algorithm has mostly remained a theoretic tool in the lattice literature, and we are unaware of any experimental work on Klein's algorithm in the MIMO literature. In this paper, we sample some lattice points by using Klein's algorithm and choose the closest from the list of sampled lattice points. By varying the list size $K$, the decoder enjoys a flexible tradeoff between complexity and performance. It is worth noting that Klein applied his algorithm to find the closest lattice point only when it is very close to the input vector. We do not have this restriction in this paper, although in essence the decoder is also a probabilistic bounded-distance decoder. The technical contribution of this paper is twofold: we analyze and optimize the performance of randomized lattice decoding, which leads to reduced decoding complexity, and propose a very efficient implementation of Klein's random rounding. Simulation results demonstrate near-ML performance achieved by a moderate number of samples when the dimension is not too large. The performance-complexity tradeoff of randomized lattice decoding is comparable to that of the new decoding algorithms proposed very recently in [19, 20].
Randomized lattice decoding distinguishes itself from previous list-based detectors [21, 22, 23, 24] in several ways. Firstly, the way it builds its list is distinct. More precisely, it randomly samples lattice points with a discrete distribution centered at the received signal and returns the closest among them. Hence, randomized lattice decoding is more likely to find the closest lattice point than [24], where a list of candidate lattice points is built in the vicinity of the SIC output. Secondly, the expensive lattice reduction is only performed once during preprocessing, so the sampling cost is the only complexity added on top of that of lattice reduction. In [22], a bank of parallel lattice-reduction-aided detectors was used. The coset-based lattice detection scheme in [23] also needs lattice reduction many times. Thirdly, randomized lattice decoding enjoys a proven gain given the list size $K$; all previous schemes might be viewed as various heuristics apparently without such proven gains. Note that list-based detectors (including our algorithm) may prove useful in the context of incremental lattice decoding [25], as they provide a fallback strategy when SIC starts failing due to the variation of the lattice.
It is worth mentioning that Klein's sampling technique is emerging as a fundamental building block in a number of new lattice algorithms [26, 27]. Thus, our analysis and implementation may benefit those algorithms as well.
The paper is organized as follows: Section II presents the transmission model and lattice decoding, followed by a description of Klein's randomized decoding algorithm in Section III. In Section IV the fine-tuning and analysis of Klein's decoding is given, and the efficient implementation and extensions to complex-valued systems and MMSE are proposed in Section V. Section VI evaluates the performance and complexity by computer simulation. Some concluding remarks are offered in Section VII.
Notation: Matrices and column vectors are denoted by upper- and lowercase boldface letters (unless otherwise stated), and the transpose, inverse, and pseudoinverse of a matrix by $(\cdot)^T$, $(\cdot)^{-1}$, and $(\cdot)^{\dagger}$, respectively. The inner product in the Euclidean space between vectors $\mathbf{u}$ and $\mathbf{v}$ is defined as $\langle \mathbf{u}, \mathbf{v} \rangle = \mathbf{u}^T\mathbf{v}$, and the Euclidean length is $\|\mathbf{u}\| = \sqrt{\langle \mathbf{u}, \mathbf{u} \rangle}$. $\lceil x \rfloor$ rounds $x$ to a closest integer, while $\lfloor x \rfloor$ rounds to the closest integer smaller than or equal to $x$. The $\Re$ and $\Im$ prefixes denote the real and imaginary parts. We use the standard big-O notation $O(\cdot)$ and small-o notation $o(\cdot)$.
II Lattice Coding and Decoding
Consider a flat-fading MIMO system model consisting of $n_T$ transmitters and $n_R$ receivers

$\mathbf{Y} = \mathbf{H}\mathbf{X} + \mathbf{N}$   (1)

where $\mathbf{X} \in \mathbb{C}^{n_T \times T}$ and $\mathbf{Y}, \mathbf{N} \in \mathbb{C}^{n_R \times T}$, of block length $T$, denote the channel input, output and noise, respectively, and $\mathbf{H} \in \mathbb{C}^{n_R \times n_T}$ is the full-rank channel gain matrix, all of its elements being i.i.d. complex Gaussian random variables $\mathcal{CN}(0,1)$. The entries of $\mathbf{N}$ are i.i.d. complex Gaussian with variance $\sigma^2$ each. The codewords satisfy the average power constraint $\mathbb{E}[\|\mathbf{X}\|_F^2]/T \leq P$. Hence, the signal-to-noise ratio (SNR) at each receive antenna is $P/\sigma^2$.
When a lattice space-time block code is employed, the codeword $\mathbf{X}$ is obtained by forming an $n_T \times T$ matrix from the vector $\mathbf{s}$, where $\mathbf{s}$ is obtained by multiplying the QAM vector $\mathbf{x}$ by the generator matrix $\mathbf{G}$ of the encoding lattice, i.e., $\mathbf{s} = \mathbf{G}\mathbf{x}$. By column-by-column vectorization of the matrices $\mathbf{Y}$ and $\mathbf{N}$ in (1), i.e., $\mathbf{y} = \mathrm{vec}(\mathbf{Y})$ and $\mathbf{n} = \mathrm{vec}(\mathbf{N})$, the received signal at the destination can be expressed as

$\mathbf{y} = (\mathbf{I}_T \otimes \mathbf{H})\,\mathbf{G}\mathbf{x} + \mathbf{n}$   (2)
When $T = 1$ and $\mathbf{G} = \mathbf{I}$, (2) reduces to the model $\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n}$ for uncoded MIMO communication. Further, we can equivalently write

$\begin{bmatrix} \Re(\mathbf{y}) \\ \Im(\mathbf{y}) \end{bmatrix} = \begin{bmatrix} \Re(\mathbf{H}) & -\Im(\mathbf{H}) \\ \Im(\mathbf{H}) & \Re(\mathbf{H}) \end{bmatrix} \begin{bmatrix} \Re(\mathbf{x}) \\ \Im(\mathbf{x}) \end{bmatrix} + \begin{bmatrix} \Re(\mathbf{n}) \\ \Im(\mathbf{n}) \end{bmatrix}$   (3)
which gives an equivalent real-valued model. We can also obtain an equivalent real model for coded MIMO like (3). The QAM constellations can be interpreted as shifted and scaled versions of a finite subset of the integer lattice $\mathbb{Z}^n$, where the scale factor arises from energy normalization. For example, for square M-QAM signalling the per-dimension levels $\{\pm 1, \pm 3, \ldots, \pm(\sqrt{M}-1)\}$ arise from the integers $\{0, 1, \ldots, \sqrt{M}-1\}$ after shifting and scaling.
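As a concrete illustration of this shift-and-scale view, the following sketch builds a square M-QAM constellation from integer lattice coordinates and normalizes it to unit average energy. The normalization convention is our own assumption for illustration; the exact factor used in the paper is not reproduced above.

```python
import numpy as np

# Square M-QAM as a scaled, shifted subset of the integer lattice Z^2:
# each I/Q level in {±1, ±3, ..., ±(√M−1)} equals 2z − (√M − 1) for an
# integer z in {0, ..., √M−1}.  The scale factor a normalizes the
# constellation to unit average energy (an assumed convention).
M = 16
m = int(np.sqrt(M))
z = np.arange(m)                          # integer lattice coordinates
levels = 2 * z - (m - 1)                  # QAM levels per real dimension
points = np.array([[i, q] for i in levels for q in levels], dtype=float)
a = 1.0 / np.sqrt(np.mean(np.sum(points ** 2, axis=1)))
constellation = a * points                # shift+scale image of Z^2 subset
```

Decoding can therefore ignore the shift and scale and work directly on integer coordinates, which is what the generic model (4) below does.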
Therefore, with scaling and shifting, we consider the generic $n \times n$ real-valued MIMO system model

$\mathbf{r} = \mathbf{B}\mathbf{x} + \mathbf{n}$   (4)

where $\mathbf{B} \in \mathbb{R}^{n \times n}$, given by the real-valued equivalent of the overall channel matrix, can be interpreted as the basis matrix of the decoding lattice. The data vector $\mathbf{x}$ is drawn from a finite subset $\mathcal{X} \subset \mathbb{Z}^n$ to satisfy the power constraint.
A lattice in the $n$-dimensional Euclidean space $\mathbb{R}^n$ is generated as the integer linear combinations of a set of linearly independent vectors $\mathbf{b}_1, \ldots, \mathbf{b}_n$ [28, 29]:

$\Lambda = \left\{ \sum_{i=1}^{n} x_i \mathbf{b}_i \;:\; x_i \in \mathbb{Z} \right\}$   (5)

where $\mathbb{Z}$ is the set of integers, and $\{\mathbf{b}_i\}$ represents a basis of the lattice $\Lambda$. In matrix form, $\Lambda = \{\mathbf{B}\mathbf{x} : \mathbf{x} \in \mathbb{Z}^n\}$. The lattice can have infinitely many different bases other than $\mathbf{B}$. In general, a matrix $\tilde{\mathbf{B}} = \mathbf{B}\mathbf{U}$, where $\mathbf{U}$ is a unimodular matrix, i.e., $\det(\mathbf{U}) = \pm 1$ and all elements of $\mathbf{U}$ are integers, is also a basis of $\Lambda$.
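The basis-equivalence statement is easy to check numerically: multiplying a basis by an integer matrix with determinant ±1 yields another basis of the same lattice, because the unimodular matrix maps $\mathbb{Z}^n$ onto $\mathbb{Z}^n$ bijectively. A small sketch with toy matrices of our own choosing:

```python
import numpy as np

# Two bases B and B·U with U unimodular (integer entries, det = ±1)
# generate the same lattice.
B = np.array([[2.0, 1.0],
              [0.0, 3.0]])
U = np.array([[1, 1],
              [1, 2]])          # det = 1; integer inverse [[2, -1], [-1, 1]]
B2 = B @ U

# Any point of L(B2) lies in L(B): its coefficients w.r.t. B are integers.
x = np.array([3, -1])
p = B2 @ x                       # a point of the lattice generated by B2
x_in_B = np.linalg.solve(B, p)   # coefficients of the same point w.r.t. B
```

Here `x_in_B` comes out integer, as it must for every integer `x`; swapping `U` for a non-unimodular integer matrix would break this.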
Since the vector $\mathbf{B}\mathbf{x}$ can be viewed as a lattice point, MIMO decoding can be formulated as a lattice decoding problem. The ML decoder computes

$\hat{\mathbf{x}}_{\mathrm{ML}} = \arg\min_{\mathbf{x} \in \mathcal{X}} \|\mathbf{r} - \mathbf{B}\mathbf{x}\|^2$   (6)

which amounts to solving a closest-vector problem (CVP) in a finite subset $\mathcal{X}$ of the lattice $\Lambda$. Note that the complexity of standard ML decoding using exhaustive search is exponential in $n$, and also increases with the alphabet size. ML decoding may be accomplished by sphere decoding. However, the expected complexity of sphere decoding is exponential for fixed SNR [7].
A promising approach to reducing the computational complexity of sphere decoding is to relax the finite constellation $\mathcal{X}$ to the infinite lattice $\Lambda$ and to solve

$\hat{\mathbf{x}} = \arg\min_{\mathbf{x} \in \mathbb{Z}^n} \|\mathbf{r} - \mathbf{B}\mathbf{x}\|^2$   (7)

which could benefit from lattice reduction. The downside is that the lattice point found will not necessarily be a valid point in the constellation.
This search can be carried out more efficiently by lattice-reduction-aided decoding [12]. The basic idea is to use lattice reduction in conjunction with traditional low-complexity decoders. With lattice reduction, the basis $\mathbf{B}$ is transformed into a new basis consisting of roughly orthogonal vectors

$\tilde{\mathbf{B}} = \mathbf{B}\mathbf{U}$   (8)

where $\mathbf{U}$ is a unimodular matrix. Indeed, we have the equivalent channel model $\mathbf{r} = \tilde{\mathbf{B}}\mathbf{x}' + \mathbf{n}$ with $\mathbf{x}' = \mathbf{U}^{-1}\mathbf{x}$.
Then conventional decoders (ZF or SIC) are applied on the reduced basis, and the resulting estimate $\hat{\mathbf{x}}'$ is transformed back into $\hat{\mathbf{x}} = \mathbf{U}\hat{\mathbf{x}}'$. Since the equivalent channel $\tilde{\mathbf{B}}$ is much more likely to be well-conditioned, the effect of noise enhancement will be moderated. Again, the resulting estimate is not necessarily in $\mathcal{X}$; remapping of $\hat{\mathbf{x}}$ onto the finite constellation is required whenever $\hat{\mathbf{x}} \notin \mathcal{X}$.
Babai preprocessed the basis with lattice reduction, then applied either rounding off (i.e., ZF) or the nearest plane algorithm (i.e., SIC) [10]. For SIC, one performs the QR decomposition $\mathbf{B} = \mathbf{Q}\mathbf{R}$, where $\mathbf{Q}$ has orthogonal columns and $\mathbf{R}$ is an upper triangular matrix [30]. Multiplying (4) on the left with $\mathbf{Q}^T$ we have

$\mathbf{y}' = \mathbf{Q}^T\mathbf{r} = \mathbf{R}\mathbf{x} + \mathbf{Q}^T\mathbf{n}$   (9)

In SIC, the last symbol is estimated first as $\hat{x}_n = \lceil y'_n / r_{n,n} \rfloor$. Then the estimate is substituted to remove the interference term when $x_{n-1}$ is being estimated. The procedure is continued until the first symbol is detected. That is, we have the following recursion:

$\hat{x}_i = \left\lceil \Big( y'_i - \sum_{j=i+1}^{n} r_{i,j}\hat{x}_j \Big) \Big/ r_{i,i} \right\rfloor$   (10)

for $i = n, n-1, \ldots, 1$.
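The recursion above is straightforward to implement. A minimal sketch of SIC (Babai's nearest plane algorithm) on an arbitrary basis, with variable names of our own choosing:

```python
import numpy as np

def sic_decode(B, r):
    """Babai's nearest-plane algorithm (SIC) on basis B for target r.

    Returns the integer coefficient vector x_hat such that B @ x_hat is
    (heuristically) close to r.  A sketch; no lattice reduction is applied.
    """
    n = B.shape[1]
    Q, R = np.linalg.qr(B)            # B = Q R, R upper triangular
    y = Q.T @ r                       # rotate the received vector
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):    # detect the last symbol first
        resid = y[i] - R[i, i + 1:] @ x[i + 1:]   # cancel detected symbols
        x[i] = np.round(resid / R[i, i])
    return x.astype(int)
```

When the target is itself a lattice point, the procedure recovers the exact coefficients; on a noisy target it succeeds whenever the noise stays within the decision region described next.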
It is known that SIC finds the closest lattice point if the distance from the input vector to the lattice is less than half the length of the shortest Gram-Schmidt vector. In other words, for SIC the minimum distance from a lattice point to the boundary of its decision region is given by

$d_{\mathrm{SIC}} = \frac{1}{2} \min_{1 \leq i \leq n} \|\hat{\mathbf{b}}_i\|$   (11)

Here the Gram-Schmidt vectors corresponding to a basis $\mathbf{b}_1, \ldots, \mathbf{b}_n$ are the vectors $\hat{\mathbf{b}}_1, \ldots, \hat{\mathbf{b}}_n$, where $\hat{\mathbf{b}}_i$ is the projection of $\mathbf{b}_i$ orthogonal to the vector space generated by $\mathbf{b}_1, \ldots, \mathbf{b}_{i-1}$. These are the vectors found by the Gram-Schmidt algorithm for orthogonalization.
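These quantities are easy to compute. The following sketch derives the Gram-Schmidt vectors of a toy basis of our own and the resulting SIC radius of (11):

```python
import numpy as np

def gram_schmidt(B):
    """Return the Gram-Schmidt vectors of the basis columns of B."""
    n = B.shape[1]
    G = np.zeros_like(B, dtype=float)
    for i in range(n):
        v = B[:, i].astype(float)
        for j in range(i):
            gj = G[:, j]
            # remove the component of b_i along the earlier b_hat_j
            v = v - (B[:, i] @ gj) / (gj @ gj) * gj
        G[:, i] = v
    return G

# half the shortest Gram-Schmidt length is SIC's correct-decision radius
B = np.array([[2.0, 1.0],
              [0.0, 3.0]])
G = gram_schmidt(B)
d_sic = 0.5 * min(np.linalg.norm(G[:, i]) for i in range(B.shape[1]))
```

For this basis the Gram-Schmidt vectors are (2, 0) and (0, 3), so the radius is 1: any target within distance 1 of a lattice point is decoded correctly by SIC.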
In order to quantify the worst-case loss in the minimum squared Euclidean distance relative to infinite lattice decoding (ILD), [15] defined the proximity factor for SIC

$F_{\mathrm{SIC}} \triangleq \sup \frac{d^2_{\mathrm{ILD}}}{d^2_{\mathrm{SIC}}} = \sup \frac{\lambda_1^2(\Lambda)}{\min_i \|\hat{\mathbf{b}}_i\|^2}$   (12)

where $\lambda_1(\Lambda)$ is the length of a shortest nonzero vector of $\Lambda$ and the supremum is taken over all bases output by the reduction algorithm, and proved that under LLL reduction

$F_{\mathrm{SIC}} \leq \beta^{\,n-1}, \qquad \beta = \frac{1}{\delta - 1/4}$   (13)

where $\delta$ is a parameter associated with LLL reduction [31].
where is a parameter associated with LLL reduction [31]. Meanwhile, if one applies dual KZ reduction, then [15]
(14) 
Obviously, the gap to optimum decoding widens with the dimension $n$, although SIC has very low complexity excluding the QR decomposition.
III Randomized Lattice Decoding
Klein [18] proposed a randomized algorithm that pushes up the minimum distance beyond (11).
The parameter $A$ of the algorithm could take any value, but the algorithm is only useful in an intermediate regime, since in the other regions Babai's and Kannan's algorithms are more efficient. Its complexity is polynomial for fixed $A$. Klein described his randomized algorithm in the recursive form shown in Table I.
Function Klein$(\mathbf{b}_1, \ldots, \mathbf{b}_n; \mathbf{y})$

1: if $n = 0$ then
2: return $\mathbf{0}$
3: else
4: Let $\mu_n = \langle \mathbf{y}, \hat{\mathbf{b}}_n \rangle / \|\hat{\mathbf{b}}_n\|^2$ be the projection of $\mathbf{y}$ in the direction of $\hat{\mathbf{b}}_n$
5: $c_n = A \|\hat{\mathbf{b}}_n\|^2$
6: $z_n = $ Rand_Round$(c_n, \mu_n)$
7: $\mathbf{y} = \mathbf{y} - z_n \mathbf{b}_n$
8: return $z_n \mathbf{b}_n + $ Klein$(\mathbf{b}_1, \ldots, \mathbf{b}_{n-1}; \mathbf{y})$
9: end if
In essence, Klein's algorithm is a randomized version of SIC, where the standard rounding in SIC is replaced by randomized rounding. Here, we rewrite it in the non-recursive form more familiar to the communications community. It is summarized by the pseudocode of the function Rand_SIC in Table II. Rather than returning a vector in the lattice as in Table I, it returns the data estimate $\hat{\mathbf{x}}$. We also assume that the preprocessing of (9) has been done, hence the input is $\mathbf{y}'$ rather than $\mathbf{r}$. This reduces the complexity, since we will call the function many times.
Function Rand_SIC$(\mathbf{R}, \mathbf{y}')$

1: for $i = n$ down to $1$ do
2: $\mu_i = \big( y'_i - \sum_{j=i+1}^{n} r_{i,j}\hat{x}_j \big) / r_{i,i}$
3: $\hat{x}_i = $ Rand_Round$(c_i, \mu_i)$
4: end for
5: return $\hat{\mathbf{x}} = (\hat{x}_1, \ldots, \hat{x}_n)^T$
Randomized SIC randomly samples a lattice point that is close to $\mathbf{y}'$. To obtain the closest lattice point, one calls Rand_SIC $K$ times, with $K$ large, and chooses the closest among the lattice points returned. The function Rand_Round$(c, r)$ rounds $r$ randomly to an integer $q$ according to the following discrete Gaussian distribution [18]

$P(q) = \frac{e^{-c(q-r)^2}}{\sum_{q' \in \mathbb{Z}} e^{-c(q'-r)^2}}, \qquad q \in \mathbb{Z}$   (15)

If $c$ is large, Rand_Round reduces to standard rounding (i.e., the decision is confident); if $c$ is small, it makes a guess (i.e., the decision is unconfident).
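A direct way to realize Rand_Round in code is to normalize the weights $e^{-c(q-r)^2}$ over a finite window of integers, which is accurate because the tails decay super-exponentially. This naive version is our own sketch; Section V discusses a more efficient truncated implementation.

```python
import numpy as np

def rand_round(c, r, rng, width=6):
    """Sample an integer q with P(q) proportional to exp(-c (q - r)^2).

    Naive sketch: normalize over `width` integers on each side of r;
    the neglected tail mass decays like exp(-c t^2).
    """
    center = int(np.floor(r))
    qs = np.arange(center - width, center + width + 1)
    w = np.exp(-c * (qs - r) ** 2)
    return int(rng.choice(qs, p=w / w.sum()))

rng = np.random.default_rng(0)
# very large c: behaves like standard (deterministic) rounding
samples = [rand_round(500.0, 3.7, rng) for _ in range(100)]
```

With a small `c` the same function would spread its samples over several integers, which is exactly the guessing behavior exploited by the decoder.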
Lemma 1
([18])
The proof of the lemma was given in [18] and is omitted here. The next lemma states the probability that Klein's algorithm or Rand_SIC returns a given lattice point.
Lemma 2
([18]) Let $\mathbf{x}$ be a vector in $\mathbb{Z}^n$ and $\mathbf{B}\mathbf{x}$ be a vector in $\Lambda$. The probability that Klein's algorithm or Rand_SIC returns $\mathbf{x}$ is lower-bounded by
(16) 
Proof. The proof of the lemma was given in [18] for the recursive Klein algorithm in Table I. Here, we give a more straightforward proof for Rand_SIC. Consider the $i$-th invocation of Rand_Round in Rand_SIC. Using Lemma 1 and (15), the probability of the event $\hat{x}_i = x_i$ is at least

(17)  

By multiplying these probabilities over $i = n, \ldots, 1$, we obtain a lower bound on the probability that Rand_SIC returns $\mathbf{x}$

(18)  

which is the bound stated in Lemma 2.
A salient feature of (16) is that the closest lattice point is the most likely one to be sampled. In particular, the bound resembles a Gaussian distribution: the closer $\mathbf{B}\mathbf{x}$ is to the input vector, the more likely it is to be sampled.
Klein suggested a specific choice of the parameter and showed that the resulting probability of returning the closest lattice point is

(19) 

The significance of lattice reduction can be seen here, as increasing the length of the shortest Gram-Schmidt vector will increase the probability (19).
As lattice-reduction-aided decoding normally ignores the boundary of the constellation, the samples returned by Rand_SIC come from an extended version of the original constellation. In the final step, we need to remove those samples that happen to lie outside the boundary of the original constellation and choose the closest among the remaining lattice points.
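Putting the pieces together: the decoder draws $K$ samples with a randomized nearest-plane pass and keeps the closest; for finite constellations the out-of-bounds samples would additionally be discarded before the minimum-distance selection. A self-contained sketch, with a constant per-level parameter `c` as a simplifying assumption of our own:

```python
import numpy as np

def rand_round(c, r, rng, width=6):
    """Sample integer q with P(q) proportional to exp(-c (q - r)^2)."""
    center = int(np.floor(r))
    qs = np.arange(center - width, center + width + 1)
    w = np.exp(-c * (qs - r) ** 2)
    return int(rng.choice(qs, p=w / w.sum()))

def randomized_decode(B, r, K, c, rng):
    """Draw K randomized nearest-plane samples and return the coefficient
    vector of the closest sampled lattice point and its distance."""
    n = B.shape[1]
    Q, R = np.linalg.qr(B)          # preprocessing, done once
    y = Q.T @ r
    best_x, best_d = None, np.inf
    for _ in range(K):
        x = np.zeros(n)
        for i in range(n - 1, -1, -1):          # randomized SIC pass
            mu = (y[i] - R[i, i + 1:] @ x[i + 1:]) / R[i, i]
            x[i] = rand_round(c, mu, rng)
        d = np.linalg.norm(r - B @ x)
        if d < best_d:                          # keep the closest sample
            best_x, best_d = x.astype(int), d
    return best_x, best_d

rng = np.random.default_rng(0)
B = np.array([[2.0, 1.0], [0.0, 3.0]])
x_true = np.array([1, -2])
r = B @ x_true + np.array([0.03, -0.04])   # lattice point plus small noise
x_hat, d_hat = randomized_decode(B, r, K=30, c=4.0, rng=rng)
```

In a full receiver one would also run Babai's deterministic pass once (see Section V) and apply lattice reduction beforehand; both are omitted here for brevity.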
IV Analysis and Optimization
The list size $K$ is often limited in communications. Given $K$, Klein's choice of the parameter is not necessarily optimum. In this Section, we want to answer the following questions about randomized lattice decoding:

Given $K$, what is the optimum value of the parameter?

Given $K$ and the associated optimum parameter, how much is the gain in decoding performance?

What is the limit of randomized lattice decoding?

Indeed, there exists an optimum value of the parameter when $K$ is finite, since one extreme corresponds to uniform sampling of the entire lattice while the other corresponds to Babai's algorithm. We shall present an approximate analysis of the optimum parameter for a given $K$ in the sense of maximizing the minimum distance, and then estimate the decoding gain. The analysis is not exact, since it is based on the minimum distance only; nonetheless, it serves as a useful guideline for determining these parameters in practical implementations of Klein's algorithm.
IV-A Optimum Parameter
The choice of the parameter has a significant impact on the probability that Rand_SIC returns $\mathbf{x}$. We introduce a normalized parameter, which is to be optimized, and use the bound of Lemma 2 to estimate this probability:
(20)  
Hence
(21)  
With this choice of parameter , (16) is lowerbounded by
(22) 
Now, let $\mathbf{B}\mathbf{x}$ be a point in the lattice within a prescribed distance of the input vector. With $K$ calls to Klein's algorithm, the probability of missing $\mathbf{B}\mathbf{x}$ is not larger than the $K$-th power of the complementary probability. Therefore, any such lattice point is found with high probability. From (22), we obtain
(23) 
$K$ calls to Rand_SIC can find the closest lattice point if the distance from the input vector to the lattice is less than this radius (such a point may not exist if the radius is too small). Of course, this only matters when there is more than one such lattice point. In this sense, the radius can be thought of as the bounded decoding distance of Rand_SIC. We point out that it is not exactly the minimum distance, and it could be larger than the minimum distance of the ML decoder, but we are mostly interested in the smaller regime for complexity reasons. Moreover, it gives a tractable measure to optimize. For this reason, defining the pseudo minimum distance $d_{\mathrm{rand}}$, we can derive from (23)
(24) 
It is natural that is chosen to maximize the value of for the best decoding performance. Let the derivative of the right side of (24) with respect to be zero:
(25) 
Because , we have
(26) 
Consequently, the optimum can be determined from the following equation
(27) 
By substituting (27) back into (24), we get
(28) 
To further see the relation between the optimum parameter and $K$, we calculate the derivative of the optimum parameter, viewed as a function of $K$. It follows that

Hence

the optimum parameter is a monotonically decreasing function of $K$. Then, we can check that a large value of the parameter is required for a small list size $K$, while it has to be decreased for a large list size $K$. It is easy to see that Klein's choice of the parameter is only optimum for one particular list size. If we instead fix the parameter to reduce the implementation complexity, some loss relative to the optimum choice is incurred.
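The tension this optimization resolves can be seen numerically: the probability that randomized rounding returns the nearest integer grows monotonically with the parameter, so a large parameter mimics Babai's rounding (best for tiny $K$), while a small one spreads probability over many candidates (useful only when $K$ is large). A small exact computation, with names of our own:

```python
import numpy as np

def p_nearest(c, r, width=20):
    """Exact probability that randomized rounding with parameter c returns
    the integer nearest to r (weights normalized over a window wide enough
    that the neglected tails are negligible)."""
    qs = np.arange(int(np.floor(r)) - width, int(np.floor(r)) + width + 1)
    w = np.exp(-c * (qs - r) ** 2)
    return float(w[qs == round(r)][0] / w.sum())

# large c: essentially deterministic rounding; small c: near-uniform guessing
probs = {c: p_nearest(c, 0.3) for c in (0.01, 0.1, 1.0, 5.0, 100.0)}
```

The monotone trend in `probs` is exactly why the optimum parameter decreases with $K$: each extra sample makes it affordable to gamble on less likely, but potentially closer, candidates.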
IV-B Effect of K on Performance Gain
We shall determine the performance gain of randomized lattice decoding over Babai's decoding. Following [15], we estimate the gain in squared minimum distance.
Since $d_{\mathrm{rand}}$ is just a pseudo minimum distance, this estimate of the gain can be optimistic. From (11) and (28), we get
(29) 
By substituting (29) in (27), we have
(30) 
Equation (30) reveals the relation between the gain and $K$: a larger gain requires a larger $K$. For a fixed performance gain, the computational complexity of randomized lattice decoding is polynomial in $n$.
Table III shows the computational complexity required to achieve a range of performance gains. It can be seen that, if the gain is moderate and the dimension is not too big, the required $K$ is affordable to recover a significant fraction of the SNR loss relative to ML decoding.
Gain in dB  Complexity  

To achieve near-ML performance, the gain should be approximately equal to the proximity factor. It is known that the real gap to ML decoding is much smaller than the worst-case bounds (13) and (14). Thus, we can run simulations to estimate the gap, which is often less than 10 dB when $n$ is not too large. Therefore, near-ML performance is achievable for small to moderate values of $n$ if the following condition is satisfied:
(31) 
Then, we can determine the list size $K$ from (27).
We point out that Table III should be used with caution, as the estimate of the gain is optimistic. The real gain certainly cannot be larger than the gap to ML decoding. Moreover, the closer Klein's algorithm performs to ML decoding, the more optimistic the estimate will be. This is because the minimum distance alone does not precisely characterize the performance.
IV-C Limits
Randomized lattice decoding has its limits, because equation (29) only holds over a restricted range of the parameter, which caps the achievable radius $d_{\mathrm{rand}}$. That is, bounded-distance decoding up to this radius is achieved at exponential complexity in $n$. Despite the exponential complexity, this is actually more encouraging than Klein's original analysis of the complexity.
On the other hand, if we do want to achieve ML performance, randomized lattice decoding will not be useful, because in that regime the sampling becomes essentially uniform. One can still apply Klein's choice of the parameter, but it will be less efficient than uniform sampling, and the required $K$ is superexponential in $n$. Therefore, as $n$ grows, randomized lattice decoding might be even worse than sphere decoding if one sticks to ML decoding.
V Implementation
In this Section, we address several issues of implementation. In particular, we propose an efficient implementation of Klein’s decoder, extend it to complexvalued lattices, and to MMSE.
V-A Efficient Randomized Rounding
The core of Klein's decoder is randomized rounding with respect to the discrete Gaussian distribution (15). Unfortunately, it cannot be generated by simply quantizing the continuous Gaussian distribution. A rejection algorithm is given in an exercise of [32] to generate a random variable with the discrete Gaussian distribution from the continuous Gaussian distribution, which is efficient only when the variance is large. From (15), the variance in our problem is limited. From the analysis in Section IV, we recognize that the parameter can be large, especially for small $K$. Therefore, the implementation complexity can be high.
Here, we propose an efficient implementation of random rounding by truncating the discrete Gaussian distribution, and prove the accuracy of this truncation. Efficient generation of the random integers results in high decoding speed.
In order to generate a random integer with distribution (15), a naive way is to calculate the cumulative distribution function

$F(q) = \sum_{q' \leq q} P(q'), \qquad q \in \mathbb{Z}$   (32)

Obviously, $\lim_{q \to \infty} F(q) = 1$. Therefore, we generate a real-valued random number $u$ that is uniformly distributed on $[0, 1)$; then we let the sample be $q$ if $F(q-1) \leq u < F(q)$. A problem is that this has to be done online, since the distribution depends on $c$ and $r$. The implementation complexity can be high, which will slow down decoding.
We now try to find a good approximation to distribution (15). Write $r = \lfloor r \rfloor + r'$, where $r' \in [0, 1)$. Distribution (15) can be rewritten as follows

(33) 

where $q$ is an integer and

Because $r' \in [0, 1)$, for every invocation of Rand_Round the probability mass concentrates on the integers near $\lfloor r \rceil$. We use this bound to estimate the probability that $r$ is rounded to an integer in a truncated set centered at $\lfloor r \rceil$. The probability that the sample is not one of these points can be bounded as
(34)  
But, since and , we have
(35)  
Hence
(36) 
Since the tail bound (35) decays very fast, it is almost certain that a call to Rand_Round returns an integer in the truncated set, as long as the set is not too small.
Therefore, we can approximate distribution (15) by a finite-point discrete distribution as follows.
(37) 
where
Fig. 2 shows the distribution (15) for representative parameter values. The distribution tends to concentrate on two dominant integers, with probabilities 0.9 and 0.08, respectively. Fig. 3 compares the bit error rates associated with different truncation sizes for an uncoded system. It is seen that a small truncation size is indistinguishable from larger ones. In fact, it is often adequate to choose a 3-point approximation, as the probability mass in the central 3 points is almost 1.
The following lemma provides a theoretical explanation to the above observations.
Lemma 3
Let $D$ be the non-truncated discrete Gaussian distribution, and $D'$ be the truncated finite-point distribution. Then the statistical distance between $D$ and $D'$ satisfies:
As a consequence, the statistical distance between the tuples of distributions used by the $K$ calls to Klein's algorithm corresponding to the non-truncated and truncated Gaussians is small. An important property of the statistical distance is that an algorithm behaves similarly if fed two nearby distributions. More precisely, if the output satisfies a property with some probability when the algorithm uses a distribution $D$, then the property is still satisfied with nearly the same probability if $D'$ is fed instead of $D$ (see [33, Chap. 8]).
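Lemma 3's closeness claim is easy to check numerically: comparing a very wide (effectively non-truncated) normalization against a 3-point truncation gives a tiny statistical distance. A sketch with parameter values of our own choosing:

```python
import numpy as np

def dgauss(c, r, width):
    """Discrete Gaussian on {floor(r)-width, ..., floor(r)+width},
    with P(q) proportional to exp(-c (q - r)^2)."""
    qs = np.arange(int(np.floor(r)) - width, int(np.floor(r)) + width + 1)
    w = np.exp(-c * (qs - r) ** 2)
    return qs, w / w.sum()

# statistical distance between a wide (effectively non-truncated)
# distribution and an aggressive 3-point truncation
c, r = 2.0, 0.37
qs_full, p_full = dgauss(c, r, 30)   # wide support, near-exact
qs_tr, p_tr = dgauss(c, r, 1)        # 3-point truncation
# embed the truncated distribution into the full support
p_tr_full = np.zeros_like(p_full)
for q, p in zip(qs_tr, p_tr):
    p_tr_full[qs_full == q] = p
delta = 0.5 * float(np.abs(p_full - p_tr_full).sum())
```

Here `delta` comes out below one percent, consistent with the Gaussian tail bound: truncation changes the sampler's behavior, and hence the decoder's error probability, only negligibly.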
V-B Complex Randomized Lattice Decoding
Since the traditional lattice formulation is only directly applicable to a real-valued channel matrix, randomized lattice decoding was given above for the real-valued equivalent of the complex-valued channel matrix. This approach doubles the channel matrix dimension and may lead to higher complexity. From the complex lattice viewpoint [34], we study complex randomized lattice decoding. The advantage of this algorithm is that it reduces the computational complexity by incorporating complex LLL reduction [34].
Due to the orthogonality of the real and imaginary parts of the complex subchannel, the real and imaginary parts of the transmit symbols are decoded in the same step. This allows us to derive complex randomized lattice decoding by performing randomized rounding for the real and imaginary parts of the received vector separately.
In this sense, given the real part of the input, randomized lattice decoding returns the real part of the symbol with probability

(38) 

Similarly, given the imaginary part of the input, randomized lattice decoding returns the imaginary part of the symbol with probability

(39) 

By multiplying these two probabilities, we get a lower bound on the probability that complex randomized lattice decoding returns the symbol

(40)  
With the parameter defined as in the real case, and along the same lines as the analysis in the preceding Section, we can easily obtain
(41) 
Given $K$ calls, inequality (41) implies the following choice of the optimum parameter:
(42) 
and minimum distance of complex randomized lattice decoding
(43) 
Let us compare this with the corresponding real randomized lattice decoding
(44) 
We have
(45) 
which means that real randomized lattice decoding and complex randomized lattice decoding have the same decision region. They also have the same parameter for the same $K$.
V-C MMSE-Based Randomized Lattice Decoding
The MMSE detector takes the SNR term into account and thereby achieves improved performance. As shown in [17], the MMSE detector is equal to ZF with respect to an extended system model. To this end, we define the extended channel matrix $\underline{\mathbf{B}}$ and the extended receive vector $\underline{\mathbf{r}}$ by

$\underline{\mathbf{B}} = \begin{bmatrix} \mathbf{B} \\ \sigma \mathbf{I} \end{bmatrix}, \qquad \underline{\mathbf{r}} = \begin{bmatrix} \mathbf{r} \\ \mathbf{0} \end{bmatrix}$

This viewpoint allows us to incorporate the MMSE criterion in the real and complex randomized lattice decoding schemes.
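The extended-model equivalence is verifiable in a few lines: least-squares (ZF) on the stacked matrix coincides with the regularized MMSE solution. A sketch with random toy data; the dimensions and noise level are our own assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, sigma = 4, 0.5
B = rng.standard_normal((n, n))
r = rng.standard_normal(n)

# extended model: ZF on [B; sigma*I] with target [r; 0] ...
B_ext = np.vstack([B, sigma * np.eye(n)])
r_ext = np.concatenate([r, np.zeros(n)])
x_zf_ext = np.linalg.lstsq(B_ext, r_ext, rcond=None)[0]

# ... equals the MMSE (Tikhonov-regularized) estimate on B itself
x_mmse = np.linalg.solve(B.T @ B + sigma**2 * np.eye(n), B.T @ r)
```

Because the extended matrix is just another lattice basis, the whole randomized machinery (reduction, Rand_SIC, sampling) applies to it unchanged.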
V-D Other Issues
Each call to Rand_SIC incurs only $O(n^2)$ complexity. Thus, the complexity of randomized lattice decoding is $O(Kn^2)$, excluding preprocessing (lattice reduction and the QR decomposition). Meanwhile, randomized lattice decoding allows for a fully parallel implementation, since the samples can be taken independently of each other. Thus the decoding speed could be as high as that of a standard lattice-reduction-aided decoder if implemented in parallel.
Since Klein's decoding is random, there is a small chance that all the samples are farther from the received vector than the Babai point. Therefore, it is worthwhile to always run Babai's algorithm at the very beginning.
The calls can be stopped as soon as the nearest sample found has distance less than $\frac{1}{2}\min_i \|\hat{\mathbf{b}}_i\|$, since such a sample is necessarily the closest lattice point.
VI Simulation Results
This section examines the performance of randomized lattice decoding. We assume perfect channel state information at the receiver. For comparison purposes, the performance of Babai's decoding, lattice-reduction-aided MMSE-SIC decoding and ML decoding is also shown.
Fig. 3 shows the bit error rate for an uncoded system with QAM signalling and LLL reduction ($\delta = 0.99$). Observe that even with a small number of samples, the performance of real Klein decoding enhanced by LLL reduction is considerably better than that of Babai's decoding. MMSE-based real Klein decoding achieves a further improvement. We found that a moderate number of samples is sufficient for real MMSE-based Klein decoding to obtain near-optimum performance for the uncoded systems tested; the SNR loss is small. The complex version of MMSE Klein decoding suffers a slight loss at low BER when compared to the real version. Note that the complex LLL algorithm has half the complexity of the real LLL algorithm. At high dimensions, the real LLL algorithm seems to be slightly better than complex LLL, although their performances are indistinguishable at low dimensions [34].
Fig. 5 and Fig. 6 show the achieved performance of randomized lattice decoding for the Golden code [35] and the perfect code [8], both using QAM. The decoding lattices are of dimension 8 and 32 in the real space, respectively. In Fig. 5, the real MMSE-based Klein decoder enjoys a 2-dB gain. In Fig. 6, the complex MMSE-based Klein decoder enjoys 3-dB, 4-dB and 5-dB gains for increasing list sizes, respectively. This again confirms that the proposed randomized lattice decoding bridges the gap to ML performance. Reference [19] proposed a decoding scheme for the Golden code that suffers a loss with respect to ML decoding, i.e., its performance is about the same as that of LR-MMSE-SIC. These experimental results are expected, as LLL reduction has been shown to increase the probability of finding the closest lattice point, and increasing the list size available to the decoder improves its performance gain. Varying the number of samples allows us to negotiate a tradeoff between performance and computational complexity.
Fig. 7 compares the average complexity of Babai's decoding, Klein's decoding and sphere decoding for uncoded MIMO systems using QAM. The channel matrix remains constant throughout a block, and the preprocessing is only performed once at the beginning of each block. For the preprocessing, the effective LLL reduction has low average complexity [36], and the LLL algorithm can output the matrices $\mathbf{Q}$ and $\mathbf{R}$ of the QR decomposition. It can be seen that the average number of flops of Klein's decoding increases slowly with the dimension, while that of sphere decoding is exponential in the dimension. The computational complexity gap between Klein's decoding and Babai's decoding is nearly constant for fixed $K$.
VII Conclusions
In this paper, we studied sampling-based randomized lattice decoding, where the standard rounding in SIC is replaced by random rounding. We refined the analysis of Klein's algorithm and applied it to uncoded and coded MIMO systems. In particular, given the number of samples $K$, we derived the optimum parameter that maximizes the pseudo minimum distance $d_{\mathrm{rand}}$, thereby optimizing the performance of Klein's randomized decoding algorithm. For a fixed performance gain, we proved that the required value of $K$ retains the polynomial complexity of the decoding scheme. We also proposed an efficient implementation of random rounding which exhibits indistinguishable performance, supported by the statistical-distance argument for the truncated discrete Gaussian distribution. The simulations conducted verified that the performance of the proposed randomized decoding is superior to that of Babai's decoding. With the new approach, a significant fraction of the gap to ML decoding can be recovered for practical values of $K$. It is particularly easy to recover the first 3 dB of the loss of Babai's decoding, which needs only a small number of samples. The computational structure of the proposed decoding scheme is straightforward and allows for an efficient parallel implementation.
References
 [1] R. Kannan, “Minkowski’s convex body theorem and integer programming,” Math. Oper. Res., vol. 12, pp. 415–440, Aug. 1987.
 [2] G. Hanrot and D. Stehlé, “Improved analysis of Kannan’s shortest vector algorithm,” in Crypto 2007, Santa Barbara, California, USA, Aug. 2007.
 [3] D. Micciancio and P. Voulgaris, “A deterministic single exponential time algorithm for most lattice problems based on Voronoi cell computations,” in Proc. STOC’10, 2010, to appear. [Online]. Available: http://eccc.hpiweb.de/report/2010/014/
 [4] M. O. Damen, H. E. Gamal, and G. Caire, “On maximum likelihood detection and the search for the closest lattice point,” IEEE Trans. Inf. Theory, vol. 49, pp. 2389–2402, Oct. 2003.
 [5] E. Viterbo and J. Boutros, “A universal lattice code decoder for fading channels,” IEEE Trans. Inf. Theory, vol. 45, pp. 1639–1642, Jul. 1999.
 [6] E. Agrell, T. Eriksson, A. Vardy, and K. Zeger, “Closest point search in lattices,” IEEE Trans. Inf. Theory, vol. 48, pp. 2201–2214, Aug. 2002.
 [7] J. Jalden and B. Ottersten, “On the complexity of sphere decoding in digital communications,” IEEE Trans. Signal Process., vol. 53, pp. 1474–1484, Apr. 2005.
 [8] F. Oggier, G. Rekaya, J.-C. Belfiore, and E. Viterbo, “Perfect space-time block codes,” IEEE Trans. Inf. Theory, vol. 52, pp. 3885–3902, Sep. 2006.
 [9] E. Biglieri, Y. Hong, and E. Viterbo, “On fast-decodable space-time block codes,” IEEE Trans. Inf. Theory, vol. 55, pp. 524–530, Feb. 2009.
 [10] L. Babai, “On Lovász’ lattice reduction and the nearest lattice point problem,” Combinatorica, vol. 6, no. 1, pp. 1–13, 1986.
 [11] H. Yao and G. W. Wornell, “Lattice-reduction-aided detectors for MIMO communication systems,” in Proc. Globecom’02, Taipei, China, Nov. 2002, pp. 17–21.
 [12] C. Windpassinger and R. F. H. Fischer, “Low-complexity near-maximum-likelihood detection and precoding for MIMO systems using lattice reduction,” in Proc. IEEE Information Theory Workshop, Paris, France, Mar. 2003, pp. 345–348.
 [13] M. Taherzadeh, A. Mobasher, and A. K. Khandani, “LLL reduction achieves the receive diversity in MIMO decoding,” IEEE Trans. Inf. Theory, vol. 53, pp. 4801–4805, Dec. 2007.
 [14] X. Ma and W. Zhang, “Performance analysis for V-BLAST systems with lattice-reduction-aided linear equalization,” IEEE Trans. Commun., vol. 56, pp. 309–318, Feb. 2008.
 [15] C. Ling, “On the proximity factors of lattice-reduction-aided decoding,” IEEE Trans. Signal Process., submitted for publication. [Online]. Available: http://www.commsp.ee.ic.ac.uk/~cling/
 [16] J. Jalden and P. Elia, “LR-aided MMSE lattice decoding is DMT optimal for all approximately universal codes,” in Proc. Int. Symp. Inform. Theory (ISIT’09), Seoul, Korea, 2009.
 [17] D. Wuebben, R. Boehnke, V. Kuehn, and K. D. Kammeyer, “Near-maximum-likelihood detection of MIMO systems using MMSE-based lattice reduction,” in Proc. IEEE Int. Conf. Commun. (ICC’04), Paris, France, Jun. 2004, pp. 798–802.
 [18] P. Klein, “Finding the closest lattice vector when it’s unusually close,” in Proc. ACM-SIAM Symposium on Discrete Algorithms, 2000, pp. 937–941.
 [19] G. R.-B. Othman, L. Luzzi, and J.-C. Belfiore, “Algebraic reduction for the Golden code,” in Proc. IEEE Int. Conf. Commun. (ICC’09), Dresden, Germany, Jun. 2009.
 [20] L. Luzzi, G. R.-B. Othman, and J.-C. Belfiore, “Augmented lattice reduction for MIMO decoding,” 2010, submitted. [Online]. Available: http://arxiv.org/abs/1001.1625
 [21] D. W. Waters and J. R. Barry, “The Chase family of detection algorithms for multiple-input multiple-output channels,” IEEE Trans. Signal Process., vol. 56, pp. 739–747, Feb. 2008.
 [22] C. Windpassinger, L. H.-J. Lampe, and R. F. H. Fischer, “From lattice-reduction-aided detection towards maximum-likelihood detection in MIMO systems,” in Proc. Int. Conf. Wireless and Optical Communications, Banff, Canada, Jul. 2003.
 [23] K. Su and F. R. Kschischang, “Coset-based lattice detection for MIMO systems,” in Proc. Int. Symp. Inform. Theory (ISIT’07), Jun. 2007, pp. 1941–1945.
 [24] J. Choi and H. X. Nguyen, “SIC based detection with list and lattice reduction for MIMO channels,” IEEE Trans. Veh. Technol., vol. 58, pp. 3786–3790, Sep. 2009.
 [25] H. Najafi, M. E. D. Jafari, and M. O. Damen, “On the robustness of lattice reduction over correlated fading channels,” 2010, submitted. [Online]. Available: http://www.ece.uwaterloo.ca/~modamen/submitted/Journal_TWC.pdf
 [26] P. Q. Nguyen and T. Vidick, “Sieve algorithms for the shortest vector problem are practical,” J. of Mathematical Cryptology, vol. 2, no. 2, 2008.
 [27] C. Gentry, C. Peikert, and V. Vaikuntanathan, “Trapdoors for hard lattices and new cryptographic constructions,” in 40th Annual ACM Symposium on Theory of Computing, Victoria, Canada, 2008, pp. 197–206.
 [28] P. M. Gruber and C. G. Lekkerkerker, Geometry of Numbers. Amsterdam, Netherlands: Elsevier, 1987.
 [29] J. W. S. Cassels, An Introduction to the Geometry of Numbers. Berlin, Germany: Springer-Verlag, 1971.
 [30] R. A. Horn and C. R. Johnson, Matrix Analysis. Cambridge, UK: Cambridge University Press, 1985.
 [31] A. K. Lenstra, H. W. Lenstra, Jr., and L. Lovász, “Factoring polynomials with rational coefficients,” Math. Ann., vol. 261, pp. 515–534, 1982.
 [32] L. Devroye, Non-Uniform Random Variate Generation. New York: Springer-Verlag, 1986, p. 117.
 [33] D. Micciancio and S. Goldwasser, Complexity of Lattice Problems: A Cryptographic Perspective. Boston: Kluwer Academic, 2002.
 [34] Y. H. Gan, C. Ling, and W. H. Mow, “Complex lattice reduction algorithm for low-complexity full-diversity MIMO detection,” IEEE Trans. Signal Process., vol. 57, pp. 2701–2710, Jul. 2009.
 [35] J.-C. Belfiore, G. Rekaya, and E. Viterbo, “The Golden code: A 2 x 2 full-rate space-time code with nonvanishing determinants,” IEEE Trans. Inf. Theory, vol. 51, pp. 1432–1436, Apr. 2005.
 [36] C. Ling and N. Howgrave-Graham, “Effective LLL reduction for lattice decoding,” in Proc. Int. Symp. Inform. Theory (ISIT’07), Nice, France, Jun. 2007.