diff --git a/README.md b/README.md
index b7f6457..d2506b6 100644
--- a/README.md
+++ b/README.md
@@ -1,43 +1,41 @@
-# LLLplus
+# LLLplus.jl

 [![Build Status](https://travis-ci.org/christianpeel/LLLplus.jl.svg?branch=master)](https://travis-ci.org/christianpeel/LLLplus.jl)
+
+Lattice reduction and related lattice tools are used in
+cryptography, digital communication, and integer programming. LLLplus
+includes Lenstra-Lenstra-Lovász (LLL) lattice reduction, Seysen
+lattice reduction, V-BLAST matrix decomposition, and a closest vector
+problem (CVP) solver.
+
+[LLL](https://en.wikipedia.org/wiki/Lenstra%E2%80%93Lenstra%E2%80%93Lov%C3%A1sz_lattice_basis_reduction_algorithm)
+[1] lattice reduction is a powerful tool that is widely used in
+cryptanalysis, in cryptographic system design, in digital
+communications, and to solve other integer problems. LLL reduction is
+often used as an approximate solution to the [shortest vector problem](https://en.wikipedia.org/wiki/Lattice_problem#Shortest_vector_problem_.28SVP.29)
-(SVP).
-Seysen [2] introduced a lattice reduction which focuses on global
-optimization rather than local optimization as in LLL.
-The
-[closest vector problem](https://en.wikipedia.org/wiki/Lattice_problem#Closest_vector_problem_.28CVP.29)
-(CVP) is related to the SVP; in the context of multi-antenna decoding
-it is referred to as
-[sphere-decoding](https://en.wikipedia.org/wiki/Lattice_problem#Sphere_decoding).
-
-Finally, we include code to do a
-[V-BLAST](https://en.wikipedia.org/wiki/Bell_Laboratories_Layered_Space-Time)
-(Vertical-Bell Laboratories Layered Space-Time) matrix
-decomposition. This decomposition is used in a detection algorithm [3]
-for decoding spatially-multiplexed streams of data on multiple
-antennas or other multi-terminal systems. V-BLAST is not as widely
-used outside of the wireless communication community as lattice
-reduction and CVP techniques such as the sphere decoder. In the Julia
-package [MUMIMO.jl](https://github.com/christianpeel/MUMIMO.jl) we use
-the LLL, sphere-decoder, and V-BLAST functions in this package to
-decode multi-user, multi-antenna signals.
+(SVP). We also include Gauss/Lagrange 2-dimensional lattice reduction
+as a learning tool, and Seysen [2] lattice reduction, which
+simultaneously reduces a basis `B` and its dual `inv(B)'`. The LLL and
+Seysen algorithms are based on [3]. The
+[CVP](https://en.wikipedia.org/wiki/Lattice_problem#Closest_vector_problem_.28CVP.29)
+solver is based on [4] and can handle lattices and bounded integer
+constellations.
+
+We also include code to do a
+[Vertical-Bell Laboratories Layered Space-Time](https://en.wikipedia.org/wiki/Bell_Laboratories_Layered_Space-Time)
+(V-BLAST) [5] matrix decomposition, which is used in digital
+communications. In that context (see
+[MUMIMO.jl](https://github.com/christianpeel/MUMIMO.jl)) the LLL,
+Seysen, V-BLAST, and CVP functions are used to solve (exactly or
+approximately) CVP problems in encoding and decoding multi-terminal
+signals.
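As a one-line taste before the fuller examples below, the Gauss/Lagrange reduction mentioned in the new intro is small enough to try by hand; this sketch simply reuses the `gauss` docstring example that appears in `src/lll.jl` later in this diff:

```julia
using LLLplus

# Reduce a two-column basis. Per the gauss docstring, [1 2; 3 3]
# reduces to [1 0; 0 3], whose columns are short lattice vectors.
H = [1 2; 3 3]
B = gauss(H)
```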
### Examples
@@ -52,19 +50,19 @@ using LLLplus
 N = 1000;
 H = randn(N,N) + im*randn(N,N);
 println("Testing LLL on $(N)x$(N) complex matrix...")
-@time (B,T,Q,R) = lll(H);
+@time B,T,Q,R = lll(H);
 M = 200;
 println("Testing VBLAST on $(M)x$(M) chunk of same matrix...")
-@time (W,P,B) = vblast(H[1:M,1:M]);
+@time W,P,B = vblast(H[1:M,1:M]);

 # Time LLL, Seysen decompositions of a 100x100 Int64 matrix with
 # rand entries distributed uniformly between -100:100
 N = 100;
 H = rand(-100:100,N,N);
 println("Testing LLL on $(N)x$(N) real matrix...")
-@time (B,T,Q,R) = lll(H);
+@time B,T,Q,R = lll(H);
 println("Testing Seysen on same $(N)x$(N) matrix...")
-@time (B,T) = seysen(H);
+@time B,T = seysen(H);
 ```

### Execution Time results
@@ -76,31 +74,42 @@ In the tests we time execution of the lattice-reduction functions,
 average the results over multiple random matrices, and show results as
 a function of the size of the matrix and of the data type.

-We first show how the time varies with matrix size (1,2,4,...64); the
+We first show how the time varies with matrix size (1,2,4,...256); the
 vertical axis shows execution time on a logarithmic scale; the x-axis
 is also logarithmic. The generally linear nature of the LLL curve
 supports the polynomial-time nature of the algorithm. Each data point
-is the average of execution time of 10 runs of a lattice-reduction
+is the average of execution time of 40 runs of a lattice-reduction
 technique, where the matrices used were generated using 'randn' to
 emulate unit-variance Gaussian-distributed values.

-![Time vs matrix size](benchmark/perfVsNfloat32.png)
+
+![Time vs matrix size](benchmark/perfVsNfloat64.png)

 Though the focus of the package is on floating-point,
 all the modules can handle a variety of data types. In the next figure
 we show execution time for several datatypes (Int32, Int64,
-Int128, Float64, BitInt, and BigFloat) which are used to
-generate 100 4x4 matrices, over which execution time for the lattice
+Int128, Float32, Float64, Double64, BigInt, and BigFloat), which are used to
+generate 40 128x128 matrices, over which execution time for the lattice
 reduction techniques is averaged. The vertical axis is a
 logarithmic representation of execution time as in the previous
-figure. ![Time vs data type](benchmark/perfVsDataTypeN16.png)
+figure.
+
+![Time vs data type](benchmark/perfVsDataType.png)
+
+The algorithm pseudocode in the monograph [6] and the survey paper [3]
+were very helpful in writing the lattice reduction tools in LLLplus
+and are a good resource for further study.
+
+LLLplus does not pretend to be a tool for number-theory work. For
+that, see the [Nemo.jl](https://github.com/wbhart/Nemo.jl) package,
+which uses the [FLINT](http://flintlib.org/) C library to do LLL
+reduction on Nemo-specific data types.

### Future

Possible improvements include:
-* Change from "LLLplus" to "LLLtoy" or some such to emphasize
-  the nature of this package.
 * Add Block-Korkin-Zolotarev lattice reduction, with improvements
-  as in [4], and Brun lattice reduction
+  as in [7], code for Babai's simple CVP approximation [8], and
+  Brun's integer relation decomposition.
 * The [SVP](http://www.latticechallenge.org/svp-challenge/) Challenge
   and the
   [Ideal](http://www.latticechallenge.org/ideallattice-challenge/)
@@ -109,26 +118,33 @@ Possible improvements include:
   performance tests. The main
   [Lattice](http://www.latticechallenge.org/) Challenge also lists
   references which could be used to replicate tests.
-* Compare with the [Number Theory Library](http://www.shoup.net/ntl/).
+* Compare with the [fplll](https://github.com/fplll/fplll) library,
+  the [Number Theory Library](http://www.shoup.net/ntl/), and
+  NEMO/FLINT.
+

### References

-[1] Lenstra, A. K.; Lenstra, H. W., Jr.; Lovász, L. (1982). "Factoring
-polynomials with rational coefficients". Mathematische Annalen 261
-(4): 515–534.
+[1] A. K. Lenstra; H. W. Lenstra Jr.; L. Lovász, ["Factoring polynomials with rational coefficients"](http://ftp.cs.elte.hu/~lovasz/scans/lll.pdf). Mathematische Annalen 261, 1982.
+
+[2] M. Seysen, ["Simultaneous reduction of a lattice basis and its reciprocal basis"](http://link.springer.com/article/10.1007%2FBF01202355). Combinatorica, 1993.
+
+[3] D. Wuebben, D. Seethaler, J. Jalden, and G. Matz, ["Lattice Reduction - A Survey with Applications in Wireless Communications"](http://www.ant.uni-bremen.de/sixcms/media.php/102/10740/SPM_2011_Wuebben.pdf). IEEE Signal Processing Magazine, 2011.
+
+[4] A. Ghasemmehdi, E. Agrell, ["Faster Recursions in Sphere Decoding"](https://publications.lib.chalmers.se/records/fulltext/local_141586.pdf). IEEE
+Transactions on Information Theory, vol 57, issue 6, June 2011.
+
+[5] P. W. Wolniansky, G. J. Foschini, G. D. Golden, R. A. Valenzuela, ["V-BLAST: An Architecture for Realizing Very High Data Rates Over the Rich-Scattering Wireless Channel"](http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=738086). Proc. URSI
+ISSSE: 295–300, 1998.

-[2] M. Seysen,
-["Simultaneous reduction of a lattice basis and its reciprocal basis"]
-(http://link.springer.com/article/10.1007%2FBF01202355) Combinatorica,
-Vol 13, no 3, pp 363-376, 1993.
+[6] M. R. Bremner, ["Lattice Basis Reduction: An Introduction to the LLL
+Algorithm and Its Applications"](https://www.amazon.com/Lattice-Basis-Reduction-Introduction-Applications/dp/1439807027). CRC Press, 2012.

-[3] P. W. Wolniansky, G. J. Foschini, G. D. Golden, R. A. Valenzuela
-(September 1998). ["V-BLAST: An Architecture for Realizing Very High
-Data Rates Over the Rich-Scattering Wireless Channel"]
-(http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=738086). Proc. URSI
-ISSSE: 295–300.
+[7] Y. Chen, P. Q. Nguyen, ["BKZ 2.0: Better Lattice Security Estimates"](http://www.iacr.org/archive/asiacrypt2011/70730001/70730001.pdf). Proc. ASIACRYPT 2011.

-[4] Y. Chen, P. Q. Nguyen (2011) ["BKZ 2.0: Better Lattice Security Estimates"]
-(http://www.iacr.org/archive/asiacrypt2011/70730001/70730001.pdf).
-Proc. ASIACRYPT 2011.
+[8] L. Babai, ["On Lovász' lattice reduction and the nearest lattice point problem"](https://link.springer.com/article/10.1007/BF02579403),
+Combinatorica, 1986.
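Taken together, the reworked README and the new solvers can be exercised end-to-end. The sketch below strings together the `lll` and `cvp` docstring examples that appear later in this diff; the `cvp(Q'*y, R)` call signature and the expected zero decoding error are taken from those docstrings rather than independently verified:

```julia
using LLLplus
using LinearAlgebra

# LLL-reduce a random basis; B is the reduced basis, with B = H*T.
H = randn(100, 100)
B, T, Q, R = lll(H)

# Use the CVP solver to recover an integer vector from a noisy
# observation: cvp takes the rotated observation Q'y and the
# upper-triangular factor R.
n = 100
Hc = randn(n, n)
Qc, Rc = qr(Hc)
u = rand(0:10, n)            # integer data vector
y = Hc * u + rand(n) / 100   # small perturbation
uhat = cvp(Qc' * y, Rc)
sum(abs.(u - uhat))          # expected to be 0.0 on success
```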
diff --git a/benchmark/lrtest.jl b/benchmark/lrtest.jl index 3228899..91f7c05 100644 --- a/benchmark/lrtest.jl +++ b/benchmark/lrtest.jl @@ -19,7 +19,7 @@ function lrtest(Ns::Int,N::Array{Int,1},L::Array{Int,1}, # Packages that need to be loaded for lrtest to work include PyPlot and # BenchmarkTools - + #lrAlgs = [lll, lllrecursive,seysen] lrAlgs = [lll, seysen] @@ -35,16 +35,18 @@ if length(N)>1 for s = 1:length(N) push!(out,lrsim(Ns,N[s],L[1],dataType[1], distType, lrAlgs)); end - plotfun = PyPlot.loglog; + xscale=:log10; + yscale=:log10; xval = N; xlab = "Matrix Size"; - tstr = @sprintf("Ns=%d,L=%d,Type=%s,dist=%s", - Ns,L[1],string(dataType[1]),distType); + tstr = @sprintf("Ns=%d,Type=%s,dist=%s", + Ns,string(dataType[1]),distType); elseif length(L)>1 for s = 1:length(L) push!(out,lrsim(Ns,N[1],L[s],dataType[1], distType, lrAlgs)); end - plotfun = semilogy; + xscale = :identity; + yscale = :log10; xval = L; xlab = "L"; tstr = @sprintf("Ns=%d,N=%d,Type=%s,dist=%s", @@ -53,11 +55,17 @@ elseif length(dataType)>1 for s = 1:length(dataType) push!(out,lrsim(Ns,N[1],L[1],dataType[s], distType, lrAlgs)); end - plotfun = semilogy; + xscale = :identity; + yscale = :log10; xval = 1:length(dataType) xtickStrs = map(string,dataType) xlab = "dataType"; tstr = @sprintf("Ns=%d,N=%d,L=%d,dist=%s",Ns,N[1],L[1],distType); + for n=xval + if dataType[n]== DoubleFloat{Float64} + xtickStrs[n] = "Double64" + end + end end pColor = ["r-","b.-","k-","g-","c-","m-", @@ -67,31 +75,33 @@ pIdx = 1; Nout = size(out,1); -f=figure(num=1,figsize=(6.5,4.5)) -clf() -#plt.style[:use]("ggplot") times = zeros(Nout); +orthf = zeros(Nout); +pltT = plot(legend=:topleft); +#pltO = plot(legend=:topleft); + for a=1:length(out[1][2]) for k=1:Nout - times[k] = out[k][1][a]; + times[k] = out[k][2][a]; + orthf[k] = out[k][3][a]; end - plotfun(xval,times,pColor[pIdx],label=out[1][2][a]); + plot!(pltT,xval,times,xscale=xscale,yscale=yscale,label=out[1][1][a],linewidth=3); +# plot!(pltO,xval,orthf,xscale=xscale,yscale=yscale,label=out[1][1][a],linewidth=3); pIdx = pIdx==length(pColor) ? 
1 : pIdx + 1; end -xlabel(xlab); -ylabel("execution time (sec)"); -legend(loc=2); +plot!(pltT,xlabel=xlab,ylabel="execution time (sec)") +#plot!(pltO,xlabel=xlab,ylabel="orthogonalization factor") if ~isempty(xtickStrs) - xticks(xval,xtickStrs) + plot!(pltT,xticks=(xval,xtickStrs)) +# plot!(pltO,xticks=(xval,xtickStrs)) end -grid(figure=f,which="both",axis="y") -grid(figure=f,which="major",axis="x") +display(pltT) +#display(pltO) -return end ####################################################################### @@ -130,12 +140,12 @@ for ix = 1:Ns # CPUtic(); # (B,T) = lrAlgs[ax](data); # times[ax,ix] = CPUtoq(); - times[ax,ix] = @belapsed (B,T) = $lrAlgs[$ax]($data) samples=2 seconds=1 - # detB = abs(det(B)) - # # Hermite factor - # hermitef[ax,ix] = norm(B[:,1])/detB^(1/N) - # # Orthogonality defect - # orthf[ax,ix] = prodBi(B,N)/detB + times[ax,ix] = @elapsed (B,T) = lrAlgs[ax](data) + detB = abs(det(B)) + # Hermite factor + hermitef[ax,ix] = norm(B[:,1])/detB^(1/N) + # Orthogonality defect + orthf[ax,ix] = prodBi(B,N)/detB # if abs(abs(det(T))-1.0)>1e-6 # println("For $(lrAlgs[ax]), det(T) is $(abs(det(T)))") @@ -152,7 +162,7 @@ end for ax = 1:length(lrAlgs) algNames[ax] = string(lrAlgs[ax]); end -@printf("%8d %6d %6d %10s", Ns, N, L, string(dataType)) +@printf("%8d %6d %6d %10s", Ns, N, L, string(dataType)[1:min(end,10)]) #outtimes = mean(times,2); outtimes = zeros(Nalgs,1); @@ -162,25 +172,18 @@ mean(x) = sum(x)/length(x) for ax = 1:Nalgs # stimes = sort(vec(times[ax,:])); # outtimes[ax] = mean(stimes[1:floor(Ns*2/3)]); +# outtimes[ax] = minimum(times[ax,:]); outtimes[ax] = mean(times[ax,:]); outHF[ax] = mean(hermitef[ax,:]); outOF[ax] = mean(orthf[ax,:]); end -# BOLD= "\x1B[1;30m" -# RESET="\x1B[0;0m" -# (mn,mx)=findmin(outtimes) for ax = 1:min(Nalgs,7) - # if ax==mx - # @printf(" %s%10.4f%s",BOLD,outtimes[ax]*1000,RESET) - # else - @printf(" %10.4f",outtimes[ax]*1000) -# @printf(" %10.4f",outHF[ax]) - # end + @printf(" %10.4f",outtimes[ax]*1000) end @printf("\n") -return outtimes, algNames +return algNames, outtimes, outOF, outHF end ####################################################################### @@ -191,7 +194,7 @@ function prodBi(B,N) for m=1:N pn+=conj(B[m,n])*B[m,n] end - println("pn=$(pn), typeof(pn)=$(typeof(pn))") + # println("pn=$(pn), typeof(pn)=$(typeof(pn))") prod*=sqrt(pn) end return prod diff --git a/benchmark/perfVsDataType.png b/benchmark/perfVsDataType.png new file mode 100644 index 0000000..c6d65bc Binary files /dev/null and b/benchmark/perfVsDataType.png differ diff --git a/benchmark/perfVsNfloat32.png b/benchmark/perfVsNfloat32.png deleted file mode 100644 index 53219ee..0000000 Binary files a/benchmark/perfVsNfloat32.png and /dev/null differ diff --git a/benchmark/perfVsNfloat64.png b/benchmark/perfVsNfloat64.png new file mode 100644 index 0000000..de2ce9f Binary files /dev/null and b/benchmark/perfVsNfloat64.png differ diff --git a/benchmark/perftest.jl b/benchmark/perftest.jl index c00bea9..9ab0643 100644 --- a/benchmark/perftest.jl +++ b/benchmark/perftest.jl @@ -1,14 +1,15 @@ -using PyPlot +using Plots using BenchmarkTools using LLLplus using Printf using Random using LinearAlgebra +using DoubleFloats include("lrtest.jl") -lrtest(5,2 .^[2],[100],[Int32,Int64,Int128,Float64,BigInt,BigFloat],"rand") -savefig("benchmark/perfVsDataTypeN16.png") # run from root directory +lrtest(40,2 .^[7],[100],[Int32,Int64,Int128,Float32,Float64,Double64,BigInt,BigFloat],"rand") +savefig("perfVsDataType.png") -lrtest(5,2 .^[0:5;],[1],[Float64],"randn") 
-savefig("benchmark/perfVsNfloat32.png") # run from root directory +lrtest(40,2 .^[0:8;],[1],[Float64],"randn") +savefig("perfVsNfloat64.png") diff --git a/src/LLLplus.jl b/src/LLLplus.jl index c143999..6b648ec 100644 --- a/src/LLLplus.jl +++ b/src/LLLplus.jl @@ -1,16 +1,28 @@ module LLLplus using LinearAlgebra +using Printf export lll, + cvp, + svp, + gauss, seysen, vblast, + subsetsum, + integerfeasibility, hard_sphere -include("lll.jl") +include("lll.jl") # lll, gauss +include("cvp.jl") # cvp, svp include("seysen.jl") include("vblast.jl") -include("hard_sphere.jl") +include("applications.jl") # subsetsum, integerFeasibility + +include("hard_sphere.jl") # may be deprecated in future + +roundf(r::Td) where {Td<:Complex} = round(real(r)) + im*round(imag(r)); +roundf(r) = round(r); end # LLLplus module diff --git a/src/applications.jl b/src/applications.jl new file mode 100644 index 0000000..ccd93e4 --- /dev/null +++ b/src/applications.jl @@ -0,0 +1,148 @@ +""" + integerfeasibility(A,d,nullVecs=false) + +Given a linear system `Ax=d`, return an integer vector `x` which satisfies the +system. This is the *integer programming feasibility problem*. + +If `nullVecs==true`, then as well as returning a solution `x`, also return a +matrix `xNull` of vectors in the null space of `A` which could be added to the +`x` vector to find a solution which satisfies a constraint such as `0 .≤ x +.≤ u`; see the paper below. + +*Solving A System Of Diophantine Equations With Bounds On The Variables* by +Karen Aardal, Cor Hurkens, and Arjen Lenstra in Integer Programming and +Combinatorial Optimization, 6th International IPCO Conference, vol 1412, pp +229-242, 1998. See http://softlib.rice.edu/pub/CRPC-TRs/reports/CRPC-TR98782.pdf + +# Examples +```jldoctest +julia> A=[10 1 -9; 1 8 8]; xtrue=[0; 2; 9]; d=A*xtrue; +julia> integerfeasibility(A,d) +3-element Array{Int64,1}: + 0 + 2 + 9 + +julia> n=20;m=30; A = rand(-10:10,n,m); xtrue = rand(0:10,m); d=A*xtrue; +julia> sum(abs.(xtrue-integerfeasibility(A,d))) +0 + +``` +""" +function integerfeasibility(A::AbstractArray{Td,2}, + d::AbstractArray{Td,1}, flag=false) where {Td<:Number} + m,n=size(A) + N1=10^3 + N2=10^4 + B = [[I zeros(Td,n,1)]; + zeros(Td,1,n) N1; + N2*A -N2*d] + + Bhat,_ = lll(B) + Bprime = Bhat[1:n+1,1:n-m+1] + x_d = Bprime[1:n,end] + + if flag + xNull=Bprime[1:n,1:n-m] + return x_d,xNull + else + return x_d + end +end + + + +""" + x = subsetsum(a,s) + +For a vector of integers `a`, and an integer `s`, try to find a binary +vector `x` such that `x'*a=s`. We use the LLL algorithm to find the +solution. + +This follows the technique described by Lagarias and Odlyzko in +"Solving Low-Density Subset Sum Problems" in Journal of ACM, Jan 1985. +Code based on http://web.eecs.umich.edu/~cpeikert/lic15/lec05.pdf +The original technique came from Lagarias and Odlyzko. + +It's odd that permuting the `a` vector in the second example given below +causes the alg to often not find a solution. The example doesn't have the +required density, so maybe we should be asking why it finds an answer at +all. 
+
+# Examples
+```jldoctest
+julia> a=[1,2,4,9,20,38]; s=30; x=subsetsum(a,s); s-x'*a
+0
+
+julia> a=[32771,65543,131101,262187,524387,1048759, # from Bremner p 117
+          2097523,4195057,8390143,16780259,33560539,
+          67121039,134242091,268484171,536968403];
+julia> s=891221976; x=subsetsum(a,s); s-x'*a
+0
+
+julia> N=40;a=rand(1:2^BigInt(256),N);xtrue=rand(Bool,N); s=a'*xtrue;
+julia> setprecision(BigFloat,300); x=subsetsum(a,s); s-x'*a
+0
+```
+"""
+function subsetsum(a::AbstractArray{Ti,1},s::Ti) where {Ti<:Integer}
+    # page numbers below refer to lecture note above
+
+    n = length(a)
+
+    flag = false;
+    if s1))
+            binarySolution= true
+            xb = flag ? mod.(x .+ 1,2) : x
+        end
+        x = -x
+        if !(any(x.<0)|any(x.>1))
+            binarySolution= true
+            xb = flag ? mod.(x .+ 1,2) : x
+        end
+    end
+
+    if binarySolution
+        return xb
+    else
+        if maximum(a)<2^(n^2/2)
+            density = n/maximum(log2.(a))
+            @printf("The density (%4.2f) of the 'a' vector is not as low as required\n",
+                    density)
+            @printf("(%4.2f) for Lagarias-Odlyzko to work. We'll look for a ",2/n)
+            @printf("solution\nanyway... ")
+        end
+
+        if length(ixMatch)>=1
+            print("A non-binary solution was found\n")
+            return Bp[:,ixMatch[1][2]]
+        else
+            print("A solution was not found\n")
+        end
+    end
+    return NaN
+end
diff --git a/src/cvp.jl b/src/cvp.jl
new file mode 100644
index 0000000..6473c4c
--- /dev/null
+++ b/src/cvp.jl
@@ -0,0 +1,167 @@
+"""
+    x=cvp(y,H,infinite=Val{true},Umax=1)
+
+Solve the problem `argmin_x ||y-Hx||` for integer x using the technique from
+the paper below. The input vector `y` is of length `n`, with `H` of
+dimension `n` by `n`, and the returned vector `x` of length `n`. If
+`infinite==Val{true}` then we search the (infinite) lattice; otherwise we
+search integers in `[-Umax,Umax]`. At present cvp does not handle complex
+numbers.
+
+Uses the algorithm from "Faster Recursions in Sphere Decoding" by Arash
+Ghasemmehdi and Erik Agrell, IEEE Transactions on Information Theory,
+vol 57, issue 6, June 2011.
+
+# Examples
+```jldoctest
+julia> H=[1 2; 3 4]; Q,R=qr(H); uhat = cvp(Q'*[0,2],R)
+2-element Array{Float64,1}:
+  2.0
+ -1.0
+
+julia> n=100;H=randn(n,n);Q,R=qr(H);
+julia> u=Int.(rand(0:1e10,n));y=H*u+rand(n)/100;
+julia> uhat=cvp(Q'*y,R); sum(abs.(u-uhat))
+0.0
+
+julia> n=500;H=randn(n,n);Q,R=qr(H);
+julia> u=Int.(rand(-1:1,n));y=H*u+rand(n)/10;
+julia> uhat=cvp(Q'*y,R,Val{false},1); sum(abs.(u-uhat))
+0.0
+```
+"""
+function cvp(r::AbstractArray{Td,1},G::AbstractArray{Td,2},
+             infinite=Val{true},Umax=1) where {Td<:Number}
+
+    roundGA(x) = roundFinite(x,Td(Umax))
+#    nx=1
+    n = length(r)
+#    mxNx=Int(ceil(log2(n)*100000))
+    C = Inf
+    i = n+1
+    d = ones(Int,n)*n
+    λ = zeros(n+1)
+    F = zeros(Td,n,n)
+    F[:,n] = r
+    p = zeros(n)
+    u = NaN*zeros(Int,n)
+    û = NaN*zeros(Int,n)
+    Δ = zeros(Int,n)
+
+    begin
+        @label LOOP
+
+#        nx+=1
+        while λ[i]mxNx
+            if i==n
+                # if nx>mxNx
+                #     @warn("terminated search after $(nx) iterations.")
+                # end
+                return û
+            else
+                i+=1
+                if infinite ≠ Val{true}
+                    y = typemax(Td)
+                end
+                u[i]+=Δ[i]
+                Δ[i]=-Δ[i]-signGA(Δ[i])
+                if infinite ≠ Val{true}
+                    if -Umax ≤ u[i] ≤ Umax
+                        y = (p[i]-u[i])*G[i,i]
+                    else
+                        u[i]+=Δ[i]
+                        Δ[i]=-Δ[i]-signGA(Δ[i])
+                        if -Umax ≤ u[i] ≤ Umax
+                            y = (p[i]-u[i])*G[i,i]
+                        end
+                    end
+                else
+                    y = (p[i]-u[i])*G[i,i]
+                end
+                λ[i] = λ[i+1]+y*y
+            end
+        end
+        d[m:i-1] .= i
+        for j=m-1:-1:1
+            if d[j]Umax ? Umax*signGA(y) : y
+end
+
+
+
+
+"""
+    b = svp(B)
+
+Find a shortest nonzero vector `b` in the lattice generated by the matrix
+`B`. This solves the 'shortest vector problem' (SVP).
+
+We use the [`cvp`](@ref) function in the library, which is not necessarily
+the fastest SVP solver. Roughly follows the CVP-to-SVP reduction in
+http://web.eecs.umich.edu/~cpeikert/lic15/lec06.pdf
+
+# Examples
+```jldoctest
+julia> H=[1 2; 3 4]; svp(H)
+2-element Array{Int64,1}:
+ -1
+  1
+
+julia> H= BigFloat.([2.5 2; 3 4]); svp(H)
+2-element Array{BigFloat,1}:
+  0.50
+ -1.0
+
+```
+"""
+function svp(B::AbstractArray{Td,2}) where Td
+    m,n= size(B)
+    V = zeros(Td,m,n)
+    c = zeros(Td,n)
+
+    for i = 1:n
+        Bi = copy(B)
+        Bi[:,i] *=2
+        Q,R = qr(Bi)
+        vc = cvp(Q'*B[:,i],R)
+        V[:,i] = Bi*vc-B[:,i]
+        c[i] = V[:,i]'*V[:,i]
+    end
+    idx = sortperm(c)
+    return V[:,idx[1]]
+end
diff --git a/src/hard_sphere.jl b/src/hard_sphere.jl
index 7739976..78289f6 100644
--- a/src/hard_sphere.jl
+++ b/src/hard_sphere.jl
@@ -12,7 +12,7 @@
 #   columns of X.
 #
 # Examples:
-#     X = hard_sphere([1 2]', [1 2; 3 4],2)
+#     X = hard_sphere([1; 2], [1 2; 3 4],2)
 #     X = hard_sphere(rand(0:20,2,15), [1 2; 3 4],2)
 # """ ->
 function hard_sphere(Y::AbstractArray{Td,2},H::AbstractArray{Td,2},Nc::Integer) where {Td}
diff --git a/src/lll.jl b/src/lll.jl
index e4b890a..4192c3a 100644
--- a/src/lll.jl
+++ b/src/lll.jl
@@ -1,25 +1,32 @@
-function lll(H::Array{Td,2},δ::Float64=3/4) where {Td}
-# (B,T,Q,R) = LLL(H,δ=3/4)
-# Do Lenstra–Lenstra–Lovász lattice reduction of matrix H using optional
-# parameter δ. The output is B, an LLL-reduced basis; T, a unimodular
-# (det(T)=+/-1) transformation matrix such that B= H*T; Q and R such that
-# B=Q*R and Q is orthonormal, and R is upper triangular. So H = Q*R*inv(T)
-#
-# H can be of Integer, FloatingPoint, BigInt, or BigFloat types. The core
-# algorithm is designed for floating-point.
-#
-# Example with large real matrix:
-#      N=500;H = randn(N,N); (B,T) = lll(H)
-# Example with 2x2 complex matrix:
-#      N=2;H = randn(N,N)+im*randn(N,N); (B,T) = lll(H)
-# Example with 2x2 BigInt matrix:
-#      N=2;H = ones(BigInt,N,N); rand!(H,-1e10:1e10); (B,T) = lll(H)
-
-# Follows LLL in D. Wueben, et al, MMSE-Based Lattice-Reduction for Near-ML
-# Detection of MIMO Systems International IEEE Workshop on Smart Antennas,
-# Munich, March 2004.
-
-# A few cycles can be saved by skipping updates of the Q matrix.
+"""
+    B,T,Q,R = lll(H,δ=3/4)
+
+Do Lenstra–Lenstra–Lovász lattice reduction of matrix `H` using optional
+parameter `δ`. The output is `B`, an LLL-reduced basis; `T`, a unimodular
+(meaning `det(T)=+/-1`) transformation matrix such that `B= H*T`; and
+finally `Q` and `R` which are a QR decomposition of `B`. So `H = B*inv(T) =
+Q*R*inv(T)`.
+
+Follows D. Wuebben, et al, "Lattice Reduction - A Survey with Applications
+in Wireless Communications". IEEE Signal Processing Magazine, 2011. See
+[`subsetsum`](@ref) for an application of `lll`.
+
+# Examples
+```jldoctest
+julia> H= [1 2; 3 4];B,_ = lll(H); B
+2×2 Array{Int64,2}:
+ 1  -1
+ 1   1
+
+julia> H= BigFloat.([1.5 2; 3 4]) .+ 2im; B,_= lll(H); B
+2×2 Array{Complex{BigFloat},2}:
+ 0.50+0.0im  0.0+1.0im
+  1.0+0.0im  0.0+0.0im
+
+julia> N=500;H = randn(N,N); B,T = lll(H);
+```
+"""
+function lll(H::AbstractArray{Td,2},δ::Float64=3/4) where {Td<:Number}

 if !(0.25 < δ < 1.0)
     error("δ must be between 1/4 and 1.");
@@ -27,16 +34,28 @@ end

 B = copy(H);
 L = size(B,2);
-(Qt,R) = qr(B);
-Q = Matrix(Qt);
+Qt,R = qr(B);
+Q = Matrix(Qt); # A few cycles can be saved by skipping updates of the Q matrix.

-roundf(r::Td) where {Td<:Complex} = round(real(r)) + im*round(imag(r));
-roundf(r) = round(r);
+# These explicit mappings from element type Td to integer type Ti should
+# eventually be replaced with dispatch on separate methods.
+if Td<:BigInt || Td<:BigFloat
+    Ti = BigInt
+elseif Td==Float32 || Td==Int32
+    Ti=Int32
+elseif Td==Int16
+    Ti=Int16
+elseif Td==Int8
+    Ti=Int8
+else
+    Ti=Int
+end
+#Ti = (Td<:BigInt || Td<:BigFloat) ? BigInt : Int

 if Td<:Complex
-    T = Matrix{Complex{Int}}(I, L, L)
+    T = Matrix{Complex{Ti}}(I, L, L)
 else
-    T = Matrix{Int}(I, L, L)
+    T = Matrix{Ti}(I, L, L)
 end

 lx = 2;
@@ -46,10 +65,13 @@ while lx <= L
     for k=lx-1:-1:1
         rk = R[k,lx]/R[k,k]
         mu = roundf(rk)
-        if abs(mu)>0
-            B[:,lx]   -= mu .* B[:,k]
-            R[1:k,lx] -= mu .* R[1:k,k]
-            T[:,lx]   -= mu .* T[:,k]
+        if abs(mu)>zero(Ti)
+            # B[:,lx]   -= mu * B[:,k]
+            # R[1:k,lx] -= mu * R[1:k,k]
+            # T[:,lx]   -= mu * T[:,k]
+            B[:,lx]   .-= mu .* view(B,:,k)
+            R[1:k,lx] .-= mu .* view(R,1:k,k)
+            T[:,lx]   .-= mu .* view(T,:,k)
         end
     end

@@ -57,26 +79,68 @@ while lx <= L
     if δ*abs(R[lx-1,lx-1])^2 > nrm^2

         # swap columns lx-1 and lx in B, T and R
-        B[:,[lx-1,lx]] = B[:,[lx,lx-1]];
-        T[:,[lx-1,lx]] = T[:,[lx,lx-1]];
-        R[1:lx,[lx-1,lx]] = R[1:lx,[lx,lx-1]];
+        # B[:,[lx-1,lx]] = B[:,[lx,lx-1]]
+        # T[:,[lx-1,lx]] = T[:,[lx,lx-1]]
+        # R[1:lx,[lx-1,lx]] = R[1:lx,[lx,lx-1]]
+        B[:,[lx-1,lx]] .= view(B,:,[lx,lx-1])
+        T[:,[lx-1,lx]] .= view(T,:,[lx,lx-1])
+        R[1:lx,[lx-1,lx]] .= view(R,1:lx,[lx,lx-1])

         # upper triangular by Givens rotation
-        # mult with matrix Θ achieves R[lx,lx-1] = 0
         cc = R[lx-1,lx-1] / nrm  # nrm = ||R[lx-1:lx,lx-1]|| after swapping
         ss = R[lx,lx-1] / nrm
         Θ = [cc' ss; -ss cc]

-        # Don't have BLAS working here yet
-        R[lx-1:lx,lx-1:end] = Θ * R[lx-1:lx,lx-1:end]
-        #gemm!('N','N',1.0,Θ,R[lx-1:lx,lx-1:end],0.0,R[lx-1:lx,lx-1:end])
-        Q[:,lx-1:lx] = Q[:,lx-1:lx] * Θ'
-        #gemm!('N','C', 1.0, Q[:,lx-1:lx], Θ, 0.0, Q[:,lx-1:lx])
+        # the following multiplication by Θ results in R[lx,lx-1] = 0
+        R[lx-1:lx,lx-1:end] .= Θ * R[lx-1:lx,lx-1:end]
+        # Q[:,lx-1:lx] = Q[:,lx-1:lx] * Θ'
+        Q[:,lx-1:lx] .= view(Q,:,lx-1:lx) * Θ'

         lx = max(lx-1,2)
     else
         lx = lx+1;
     end
 end
+return B,T,Q,R
+end
+

-return (B,T,Q,R)
+"""
+    B = gauss(H)
+
+Do Gauss (also known as Lagrange) reduction on the lattice defined by the
+two columns of `H`.
+
+Follows Fig 2.3 of "Lattice Basis Reduction: An Introduction to the LLL
+Algorithm and Its Applications" by Murray R. Bremner, CRC Press, 2012.
+
+# Examples
+```jldoctest
+julia> H = [1 2; 3 3]; B = gauss(H)
+2×2 Array{Float64,2}:
+ 1.0  0.0
+ 0.0  3.0
+```
+"""
+function gauss(H::AbstractArray{Td,2}) where Td
+    if size(H,2)!=2
+        error("Gauss reduction only works on two columns.")
+    end
+    x = float.(H[:,1])
+    y = float.(H[:,2])
+    if norm(x)<=norm(y)
+        v1,v2=x,y
+    else
+        v1,v2=y,x
+    end
+    finished = false;
+    while !finished
+        m = round((v2'*v1)/(v1'*v1))
+        v2 -= m*v1
+        if norm(v1)<=norm(v2)
+            finished = true
+        else
+            v1,v2 = v2,v1
+        end
+    end
+    return [v1 v2]
 end
diff --git a/src/seysen.jl b/src/seysen.jl
index 997f25f..07583a6 100644
--- a/src/seysen.jl
+++ b/src/seysen.jl
@@ -1,15 +1,30 @@
+"""
+    B,T,B_dual,num_it = seysen(H::Array{Td,2}) where Td
+
+Do greedy Seysen lattice reduction on the matrix `H`, returning `B`, the
+reduced lattice basis; `T`, a unimodular matrix that reduces `H` (i.e. `B =
+H*T`); `B_dual`, the dual lattice basis (i.e., `B_dual = pinv(B)'`); and
+`num_it`, the number of iterations (basis updates). See also [`lll`](@ref).
+
+Follows the Seysen algorithm in "Lattice Reduction - A Survey with
+Applications in Wireless Communications" by D. Wuebben, et al, IEEE Signal
+Processing Magazine, 2011.
+
+# Examples
+```jldoctest
+julia> H= [1 2; 3 4];B,_ = seysen(H); B
+2×2 Array{Int64,2}:
+ 1  -1
+ 1   1
+
+julia> H= BigFloat.([1.5 2; 3 4]) .+ 2im; B,_= seysen(H); B
+2×2 Array{Complex{BigFloat},2}:
+ 0.0+1.0im  0.50+0.0im
+ 0.0+0.0im   1.0+0.0im
+
+```
+"""
 function seysen(H::Array{Td,2}) where {Td}
-# [B,T,B_dual,num_it] = seysen(H::Array{Td,2}) where Td
-#
-# Do greedy Seysen lattice reduction on the matrix H (real or
-# complex-valued), returning T a unimodular matrix that reduces H;
-# B, the reduced lattice basis (i.e. B = H*T);
-# B_dual, dual lattice basis (i.e., B_dual = pinv(B)); and
-# the number of iterations (the number of basis updates).
-
-# Follows LLL in D. Wueben, et al, MMSE-Based Lattice-Reduction for Near-ML
-# Detection of MIMO Systems International IEEE Workshop on Smart Antennas,
-# Munich, March 2004.

 if Td<:BigInt || Td<:BigFloat
     Ti= Td<:Complex ? Complex{BigInt} : BigInt
@@ -20,12 +35,8 @@ else
     Ti= Td<:Complex ? Complex{Int} : Int
   end
 end
-roundf(r) = Td<:Complex ? round(real(r)) + im*round(imag(r)) : round(r);

-#H = H*1.0
-
-# get size of input
-(n, m) = size(H); # m-dimensional lattice in an n-dimensional space
+n,m = size(H); # m-dimensional lattice in an n-dimensional space

 # initialization, outputs
 B      = copy(H);          # reduced lattice basis
@@ -35,8 +46,8 @@ A = (H'*H)*1.0;            # Gram matrix of H
 Adual  = inv(A);           # Inverse gram matrix of H
 B_dual = H*Adual;          # Dual basis

-# calculate all possible update values Λ[s,t)
-# and their corresponding reduction Δ[s,t) in Seysen's measure
+# calculate update values Λ[s,t] and the corresponding reduction Δ[s,t] in
+# Seysen's measure
 Λ = zeros(Ti,m,m);
 Δ = zeros(Td,m,m)*0.0;

 for s = 1:m
 end # - end calculation of Λ and Δ

 # find maximum reduction in Seysen's measure (greedy approach)
-(zw,max_ind) = findmax(abs.(Δ[:]));
-(s, t) = Tuple(CartesianIndices((m,m))[max_ind])
+zw,max_ind = findmax(abs.(Δ[:]));
+s,t = Tuple(CartesianIndices((m,m))[max_ind])

 # init loop
 do_reduction = true;

-# if no improvement can be achieved
-if Δ[s,t] == 0
+if Δ[s,t] == 0 # if no improvement can be achieved
     do_reduction = false;
 end

@@ -71,23 +81,12 @@ while do_reduction

     num_it = num_it + 1;

-    # perform basis update
-    try
-        B[:,s] = B[:,s] + Λ[s,t]*B[:,t];
-    catch
-        println("B[:,s] = $(B[:,s]), typeof(B[:,s]) = $(typeof(B[:,s]))")
-        println("Λ[s,t] = $(Λ[s,t]), typeof(Λ[s,t]) = $(typeof(Λ[s,t]))")
-    end
-
-    # compute corresponding unimodular trasformation matrix
-    T[:,s] = T[:,s] + Λ[s,t]*T[:,t]; # basis transformation matrix
+
+    B[:,s] = B[:,s] + Λ[s,t]*B[:,t];                 # perform basis update
+    T[:,s] = T[:,s] + Λ[s,t]*T[:,t];                 # update the unimodular transformation
+    B_dual[:,t] = B_dual[:,t] - Λ[s,t]'*B_dual[:,s]; # update dual basis

-    # update corresponding dual basis
-    B_dual[:,t] = B_dual[:,t] - Λ[s,t]'*B_dual[:,s];
-
-    # Update Gram and inverse Gram matrix
     for ind = 1:m
-        # update Gram matrix
         if ind != s
             A[s,ind] = A[s,ind]+Λ[s,t]'*A[t,ind];
@@ -95,7 +94,6 @@ while do_reduction
         else
             A[s,ind] = norm(B[:,s]).^2;
         end
-        # update inverse Gram matrix
         if ind != t
             Adual[t,ind] = Adual[t,ind]-Λ[s,t]*Adual[s,ind];
@@ -103,8 +101,7 @@ while do_reduction
         else
             Adual[t,ind] = norm(B_dual[:,t])^2;
         end
-
-    end # - end update Gram und inverse Gram matrix
+    end

     # update all possible update values Λ[s,t]
     # and their corresponding reduction Δ[s,t] in Seysen's measure
@@ -113,20 +110,10 @@ while do_reduction
             if ((ind1==s) | (ind1==t) | (ind2==s) | (ind2==t)) & (ind1!=ind2)
                 x = 0.5*(Adual[ind2,ind1]/Adual[ind1,ind1]-A[ind2,ind1]/
                          A[ind2,ind2]);
-                ## There is an
intermitent bug that I haven't yet figured
-                ## out that this try/catch code is aimed at. Yes, it's ugly...
-                try
-                    Λ[ind1,ind2] = roundf(x);
-                catch
-                    println("x = $(x), typeof(x) = $(typeof(x))")
-                    println("A[ind2,ind2] = $(A[ind2,ind2]), "*
-                            "typeof(A[ind2,ind2]) = $(typeof(A[ind2,ind2]))")
-                    println("roundf(x)) = $(roundf(x)) "*
-                            "typeof(roundf(x))) = $(typeof(roundf(x)))")
-                    println("Λ[ind1,ind2]) = $(Λ[ind1,ind2]) "*
-                            "typeof(Λ[ind1,ind2])) = $(typeof(Λ[ind1,ind2]))")
-                    println(" ")
-                end
+
+                # Try Float64 if the following fails with Float32
+                Λ[ind1,ind2] = Ti(roundf(x));
+
                 AbsΛ = abs(Λ[ind1,ind2])^2;
                 if AbsΛ != 0
                     zw = real(Λ[ind1,ind2])*real(x)+imag(Λ[ind1,ind2])*imag(x);
@@ -136,19 +123,19 @@ while do_reduction
             end
         end
     end
-    end # - end update Λ and Δ
+    end

     # find maximum reduction in Seysen's measure (greedy approach)
-    (zw, max_ind) = findmax(abs.(Δ[:]));
-    (s, t) = Tuple(CartesianIndices((m,m))[max_ind])
+    zw,max_ind = findmax(abs.(Δ[:]));
+    s,t = Tuple(CartesianIndices((m,m))[max_ind])

     # if no reduction is possible, exit loop
     if Δ[s,t] == 0
         do_reduction = false;
     end
-end # - end lattice reduction loop
+end # while do_reduction

-return (B,T,B_dual,num_it)
+return B,T,B_dual,num_it
 end
diff --git a/test/runtests.jl b/test/runtests.jl
index 3cf1e5d..e8a8965 100644
--- a/test/runtests.jl
+++ b/test/runtests.jl
@@ -83,31 +83,33 @@ println("Error Rate is $(errRate). It should be zero or very small.\n")

 # --------------
 println("Testing now with 200x200 matrix from latticechallenge.org.")
-println("The col norm of the input should be 233, col norm of the "*
-        "reduced bases should be 30.")
+println("The min norm of the input should be 30; the min norm of the "*
+        "reduced bases should be smaller.")

 mat = readdlm("challenge-200.mod",Int64) #run from parent directory
+mat = Matrix(mat');
+N = size(mat,2)

-nrms = zeros(200,1)
-for ix=1:200
+nrms = zeros(N,1);
+for ix=1:N
     nrms[ix] = norm(mat[:,ix])
 end
-println("max col-norm of input is $(maximum(nrms))")
+println("min norm of input is $(minimum(nrms))")

 @time (B,T) = lll(mat);
-nrms = zeros(200,1)
-for ix=1:200
-    nrms[ix] = norm(B[:,ix])
+nrms = zeros(N,1);
+for ix=1:N
+    nrms[ix] = norm(B[:,ix]);
 end
-maxlll = maximum(nrms)
-println("max col-norm of lll-reduced basis is $(maxlll)")
-@test maxlll<=30+1e-6
+mlll=minimum(nrms)
+println("min norm of lll-reduced basis is $(mlll)")
+@test mlll<=30+1e-6

 @time (B,T) = seysen(mat);
-nrms = zeros(200,1)
-for ix=1:200
+nrms = zeros(N,1)
+for ix=1:N
     nrms[ix] = norm(B[:,ix])
 end
-maxSeysen = maximum(nrms)
-println("max column norm of seysen-reduced basis is $(maxSeysen)")
-@test maxSeysen<=30+1e-6
+mSeysen = minimum(nrms)
+println("min norm of seysen-reduced basis is $(mSeysen)")
+@test mSeysen<=30+1e-6
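The application-level entry points added in `src/applications.jl` can be exercised the same way. The following sketch is assembled directly from the `subsetsum` and `integerfeasibility` docstring examples in this diff; nothing beyond those documented calls is assumed:

```julia
using LLLplus

# Subset-sum: find a binary x with x'*a == s via the
# Lagarias-Odlyzko lattice embedding and LLL.
a = [1, 2, 4, 9, 20, 38]
s = 30
x = subsetsum(a, s)
s - x' * a                    # 0 when a binary solution is found (1+9+20 == 30)

# Integer feasibility: find an integer x satisfying A*x == d.
A = [10 1 -9; 1 8 8]
xtrue = [0; 2; 9]
d = A * xtrue
x2 = integerfeasibility(A, d)
A * x2 == d                   # true for this docstring example
```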