
Findfeatures fails for LROC NAC images #4396

Closed
lwellerastro opened this issue Apr 9, 2021 · 22 comments

@lwellerastro
Contributor

ISIS version(s) affected: isis3.9.1, isis4.3.0, isis5.0.0-RC1

Description
findfeatures fails for two LROC NAC images with the following error:

./runFF.cmd 
terminate called recursively
terminate called recursively
terminate called recursively
./runFF.cmd: line 1: 68596 Aborted                 (core dumped) findfeatures algorithm=fast/brief/parameters@saverenderedimages:true@savepath:Rendered maxpoints=5000 maxthreads=7 match=M102128467LE.cub from=M102142784LE.cub fastgeom=true geomtype=camera epitolerance=3.0 ratio=0.9 hmgtolerance=3.0 filter=SOBEL onet=FF.net tolist=FF.lis pointid='M102128467LE_fast_?????' networkid=M102128467LE description='Create image-image control network' debug=true debuglog=FF.log

The above is for images without footprint information. When footprintinit has been run on the images, findfeatures adds this line to the error message shown above:
terminate called after throwing an instance of 'cv::Exception'

There is very little recorded in the output FF.log file and it never manages to render the images.

How to reproduce
See /work/users/lweller/Isis3Tests/FindFeatures/LROCNAC/runFF.cmd for the command execution. It contains the same command shown in the error message above.

See proc.scr for the pre-processing, which includes ingestion, spiceinit, and footprintinit (which I don't think is necessary, but I included it because I was trying everything to get findfeatures to work). It fails on calibrated images as well; calibration isn't a necessity for findfeatures either, but I had that data available so I tried it.

I am currently running findfeatures successfully on Kaguya TC images as well as Titan and have not experienced this sort of failure. I believe folks outside of Astro have successfully run findfeatures on LROC NAC images, which is partly why I went back to older versions.

Possible Solution

Additional context
I have limited the number of keypoints because I have found that when I don't, and there is a lot of overlap and very interesting terrain, the program fails/aborts. Here it fails regardless; it seems it is just not getting past step 1 to use any of this information.

Let me know if there is something else worth testing.

@lwellerastro lwellerastro added the Products Issues which are impacting the products group label Apr 9, 2021
@KrisBecker
Contributor

Hi @lwellerastro,

Let's see if we can narrow the problem down. First, try to turn off the SOBEL filtering. If it fails, then turn off FASTGEOM.

@jessemapel jessemapel added this to the 5.0.1 milestone Apr 12, 2021
@lwellerastro
Contributor Author

Thanks for the suggestions @KrisBecker.

It continues to fail if filtering is turned off and fastgeom=true.
Findfeatures will run to completion if fastgeom=false. That is completely unexpected.

Leaving fastgeom=false is ok for these particular images because they have similar geometry to begin with. Because of that I would not expect findfeatures to fail while setting fastgeom=true.

@KrisBecker
Contributor

Ok, good. A couple of possibilities come to mind.

LROC images can be big, long noodles. I am wondering if you are running into a memory problem. Note that Sobel filtering makes a whole new copy of the image (in 8-bit) for both images after fast geom'ing, which also makes an image-space projected copy, similar to cam2cam. But it looks like it's not making it out of geom'ing, so that appears to be a potential source of the problem.

There is a pretty good chance the fast geom (homography) transform is incomplete or ill-formed. By default, only 25 points are used to create an image perspective warp transform - it doesn't apply rigorous RubberSheet projections like cam2cam. This can cause trouble for non-square image datasets. I suspect there are not enough points to create a well-formed transform, so increase FASTGEOMPOINTS in increments of 25 or 50 points to see if that improves things. However, if you can get by without fast geom, that may significantly reduce memory and mapping issues like this.

Of course, if you are not FASTGEOM'ing, then you should use scale- and orientation-invariant matchers/detectors if the images differ significantly spatially.

@lwellerastro
Contributor Author

@KrisBecker thanks again for lending direction.

Memory does not seem to be an issue here. I got on bigmem interactively so it would have all the memory it needed and findfeatures only used <3G whether it was successful (fastgeom=no) or not (fastgeom=yes), with or without filtering.

I did get a slightly different error message when I set fastgeom=yes and fastgeompoints=50 (no filter applied to images) - oh, and in this case I accidentally increased maxpoints to 500000 (I meant there to be one less 0):

./runFF_3.cmd
terminate called after throwing an instance of 'cv::Exception'
  what():  /feedstock_root/build_artefacts/opencv_1497784813564/work/opencv-3.2.0/modules/imgproc/src/imgwarp.cpp:4956: error: (-215) dst.cols < SHRT_MAX && dst.rows < SHRT_MAX && src.cols < SHRT_MAX && src.rows < SHRT_MAX in function remap

./runFF_3.cmd: line 1: 55301 Aborted                 (core dumped) findfeatures algorithm=fast@type:TYPE_5_8/brief maxpoints=500000 maxthreads=7 match=M102128467LE.cub from=M102142784LE.cub fastgeom=true geomtype=camera fastgeompoints=50 epitolerance=3.0 ratio=0.9 hmgtolerance=3.0 onet=FF.net tolist=FF.lis pointid='M102128467LE_fast_?????' networkid=M102128467LE description='Create image-image control network' debug=true debuglog=FF.log

This might be a red herring, as some would say, since I don't get this particular message when maxpoints=50000; I just get the "terminate recursively" message and it dies.

@jessemapel
Contributor

jessemapel commented Apr 14, 2021

Okay, I looked around a bit and it seems that the image may be too large for OpenCV to do the fastgeom warp. @lwellerastro Can you try reducing the images and running again? This won't be a workaround, but it will give us an idea of what issue we're seeing.

Also, we should be properly handling the opencv exceptions and not just letting them fall through here.

If it is the image size, then it looks like we can use an OpenCV matrix with different indexing types to get around this issue: https://answers.opencv.org/question/205847/assertion-failed-error-when-using-cv2remap/

@lwellerastro
Contributor Author

Good call @jessemapel. I reduced both images by 2, ran it the way I originally intended (fastgeom=yes, filter=sobel), and findfeatures was successful.

Original input had 5064 samples and 52224 lines, and the reduced data were 2532 samples and 26112 lines. I think 52k lines is near the maximum length for that data, but I'm not 100% sure.

@jessemapel
Contributor

jessemapel commented Apr 14, 2021

Yeah, OpenCV attempts to really optimize its space usage, and it defaults to using 16-bit signed integers to store indices. The largest 16-bit signed value is 32,767 (SHRT_MAX is the maximum signed short, i.e., signed 16-bit, value), so it's going to fail on anything with more lines or samples than that.

The good news is I think we can fix this by just changing our matrix indexing type.
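To make the limit concrete, here is a small sketch in plain Python (illustrative only, not ISIS or OpenCV source) of the size check that OpenCV's short-indexed remap effectively enforces:

```python
# Sketch of the SHRT_MAX size limit described above; names are illustrative.
SHRT_MAX = 2**15 - 1  # 32,767, the largest signed 16-bit value

def fits_remap_limit(samples, lines):
    """Return True if an image is small enough for OpenCV's
    short-indexed remap (both dimensions must be < SHRT_MAX)."""
    return samples < SHRT_MAX and lines < SHRT_MAX

# The failing LROC NAC image: 5064 samples x 52224 lines
print(fits_remap_limit(5064, 52224))   # False: 52224 lines exceed 32767
# The reduced-by-2 version that succeeded: 2532 x 26112
print(fits_remap_limit(2532, 26112))   # True
```

This matches the observed behavior: the full-size image trips the assertion in `remap`, while the half-size image passes.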

@jessemapel jessemapel added the bug Something isn't working label Apr 14, 2021
@lwellerastro
Contributor Author

Ah, well, at least there is a hard number to keep in mind for other datasets.

@KrisBecker
Contributor

Glad that worked out for you. Unfortunately, the sample/line image measure coordinates are in corresponding scale. This means you will need to keep a complete set of scaled/cropped image versions to run through the entirety of the bundle adjustment process - unless there is a way around it that I am not aware of. This could also increase uncertainty in the solution.

If you were successful without fastgeom, then you could apply an invariant feature matcher and increase the EPITOLERANCE and HMGTOLERANCE tolerances, and potentially adjust RATIO. You can then compensate for this in pointreg by adding 2 * max(EPITOLERANCE, HMGTOLERANCE) to the search chip size in the def file. I have found that a tolerance of 5 or higher really adds a lot of points where the spatial differences of fastgeom'ed images are most prominent. Increasing these values will still need to be considered, as the robust outlier detection algorithm tests residuals using spatial homography correlation regardless of the matcher used.
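The chip-size compensation above is simple arithmetic; a minimal sketch (the helper name is hypothetical, not part of pointreg):

```python
# Hypothetical sketch of the pointreg chip-size compensation described above.
def padded_search_chip(base_size, epitolerance, hmgtolerance):
    """Add 2 * max(EPITOLERANCE, HMGTOLERANCE) to the search chip size."""
    return base_size + 2 * max(epitolerance, hmgtolerance)

# e.g. a 35-pixel search chip with the EPITOLERANCE=3.0, HMGTOLERANCE=3.0
# values used in this issue's findfeatures command:
print(padded_search_chip(35, 3.0, 3.0))  # 41.0
```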

A second, perhaps more value-added option would be to add a ScaleTransform to findfeatures that scales the input images just as you did in your tests. The inverse of that scale would then be applied to the control measure coordinates prior to writing out the control network. This should be really easy to implement (add an interp option, please).
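The scale-then-invert idea can be sketched as follows (plain Python with hypothetical names; findfeatures' actual ScalingTransform is C++):

```python
# Illustrative sketch of the proposed scale/inverse-scale bookkeeping;
# function names are hypothetical, not findfeatures API.
def scale_image_size(samples, lines, scale):
    """Forward: size of the scaled image fed to the matcher."""
    return samples * scale, lines * scale

def unscale_measure(sample, line, scale):
    """Inverse: map a measure found on the scaled image back to
    full-resolution cube coordinates before writing the network."""
    return sample / scale, line / scale

# Scale the 5064 x 52224 image by 0.5, match, then invert a measure.
print(scale_image_size(5064, 52224, 0.5))    # (2532.0, 26112.0)
print(unscale_measure(1000.0, 9000.0, 0.5))  # (2000.0, 18000.0)
```

The appeal is that the output network would stay in full-resolution coordinates, so no scaled copies of the cubes need to be kept for bundle adjustment.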

I agree with @jessemapel that findfeatures should catch OpenCV exceptions.

@lwellerastro
Contributor Author

Glad that worked out for you. Unfortunately, the sample/line image measure coordinates are in corresponding scale. This means you will need to keep a complete set of scaled/cropped image versions to run through the entirety of the bundle adjustment process - unless there is a way around it that I am not aware of. This could also increase uncertainty in the solution.

Nah, not worth the bother in this case. I was trying to help out someone who needs to control stereo LROC NAC pairs and thought it was worth trying findfeatures because I had an enormous amount of success using it for Kaguya TC polar images. I was applying what I used for that data to test images to make sure it worked before handing it off to someone and came across the problem. Since they only have two images to deal with they can just brute force their way through pointreg, but it will be challenging for them. Although there is no emergency at the moment, I think we should improve the software to handle this particular data set. At least there is an understanding of the problem.

A second, perhaps more value-added option would be to add a ScaleTransform to findfeatures that scales the input images just as you did in your tests. The inverse of that scale would then be applied to the control measure coordinates prior to writing out the control network. This should be really easy to implement (add an interp option, please).

I like this idea a lot. And I agree OpenCV exceptions need to be captured better. I've run into a number of them under different circumstances, and I never know what actually went wrong. I just sort of flub my way through making it work or abandon ship altogether. Just depends on the situation.

@KrisBecker
Contributor

KrisBecker commented Apr 15, 2021

Well... it turns out ScalingTransform already exists in findfeatures; it just isn't applied anywhere in the code. It should be really easy to implement. In findfeatures, I would add a scaling transform (around line 257) after the images have been initialized/loaded and before other filtering, such as Sobel, is done.

You could also perform a runtime test that automatically scales images if the OpenCV maximum size is exceeded and issues a warning about it.

You could also smarten up that implementation. I see ScalingTransform uses the OpenCV resize routine, which may have the same limitations as warp, so it may need an alternative scaling implementation. You could add parameters to optionally apply scaling based upon a min/max specification, as well as a resolution-type parameter to test.

@acpaquette
Collaborator

@KrisBecker @jessemapel What is the best way to handle this? It looks like there are a couple of potential solutions, and I would like to clarify the approach we want to take.

@jessemapel
Contributor

It looks like just using a different index type won't work, as remap has to define things internally that are hard-coded to use shorts. I found the actual bug report on OpenCV, and they recommend a tiling approach.

Looking at the warpPerspective docs and a comment on the bug post, we could chop the image into pieces, warp each of them individually, and then re-combine them.

I will think more about the math we need to do to modify the transformations and stitch them back together.

@KrisBecker
Contributor

At a minimum, we should probably add a catch for OpenCV cv::Exceptions (note that cv::Exception inherits std::exception) and try to produce a sensible error message.

I looked into adding a ScalingTransform prior to FastGeom in findfeatures, and it has issues. The most glaring problem is that when FastGeom is dispatched, it will not use any existing transforms to potentially adjust (i.e., via a transform inverse operation) the line/sample coordinates when correlating geometric coordinates between the query and train images. This mapping occurs in ImageSource, which has no access to transforms at that scope.

I'll have to think about how this can be done. Perhaps abstract the fastgeom operation into the MatchImage class.

Additionally, a more robust use of scale could be built upon the ScalingTransform, which attempts to intelligently match the resolution of the source image. This would necessarily require a scaling transform per train image. It's doable, but not until the geometry mapping includes a scaling capability.

@jessemapel
Contributor

@acpaquette How about we band-aid this for right now: properly catch the exception from OpenCV and raise an IException. Then we can re-slot this for the next products sprint, and we'll hopefully have a better solution figured out by then.
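A minimal sketch of the band-aid, in plain Python (ISIS would do this in C++ by catching cv::Exception and throwing an IException; every name here is illustrative):

```python
# Illustrative sketch only: wrap a low-level failure in an
# application-level exception with a readable message, instead of
# letting it abort the program.
class IExceptionLike(Exception):
    """Stand-in for ISIS's IException."""

def warp_with_guard(warp_fn, *args):
    """Call an OpenCV-style operation and re-raise any failure as an
    application error (cv::Exception inherits std::exception in C++)."""
    try:
        return warp_fn(*args)
    except Exception as err:
        raise IExceptionLike(f"OpenCV operation failed: {err}") from err

def too_big_remap(lines):
    # Simulate OpenCV's assertion failure for oversized images.
    if lines >= 32767:
        raise RuntimeError("(-215) dst.rows < SHRT_MAX in function remap")
    return "ok"

try:
    warp_with_guard(too_big_remap, 52224)
except IExceptionLike as e:
    print(e)  # prints the wrapped, readable message
```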

@jessemapel
Contributor

jessemapel commented Apr 20, 2021

Okay, I think I've got the tiling logic worked out, based on this implementation: https://gist.github.com/kinchungwong/141bfa2d996cb5ae8f42

First, we need a new transformation class like GenericTransform that handles a tiled transformation - maybe GenericTiledTransform; it could even subclass GenericTransform. It will take everything GenericTransform does plus a tile size. Then, it will perform a tiled transformation instead of a direct transformation.

In the constructor, it should determine the size of the transformed image and then chop that into tiles. For example, with a tile size of 30,000 and a transformed size of 45,000 x 36,000, it should create the following tiles: 0-30,000 x 0-30,000; 30,000-45,000 x 0-30,000; 0-30,000 x 30,000-36,000; and 30,000-45,000 x 30,000-36,000. The next place we need to modify is GenericTransform::render. Here's what it'll need to do:

  1. Loop over all the tiles
  2. Get the tile ROI. This is the slice of the destination matrix for the tile.
  3. Compute the corresponding ROI in the source image for the tile using the inverse homography
  4. Allocate a matrix of pixel coordinates to store the pixel mapping between the tile and the source ROI
  5. For every pixel in the tile compute the corresponding pixel in the source image. This is tile coordinate transformed through the inverse homography minus the source ROI offset. This then gets stored in the tile pixel minus the tile offset.
  6. Use cv::remap with the two ROIs and the pixel mapping matrix

You can see each of these steps in the linked gist (step 1 is implicit there).

Here's what it would look like for our example. Let's assume the homography is just swap line and sample:

  1. First tile of 0-30,000 x 0-30,000
    a. The tile ROI is dest(0-30,000, 0-30,000)
    b. Compute the source ROI, because our line and sample ranges are the same the source ROI is src(0-30,000, 0-30,000).
    c. Allocate a 30,000 by 30,000 matrix of float 2D points for our mapping
    d. Loop over the pixels in the tile and compute the source pixel. So, tile pixel (0, 1) maps to source pixel (1, 0), (0, 2) maps to (2, 0), etc. Our ROI offsets are just (0, 0) so we just store (1, 0) in mapping(0, 1), (2, 0) in mapping(0, 2), etc.
    e. Use remap(src_ROI, dest_ROI, mapping) to transform the source ROI into our destination ROI. Because our destination ROI is just a slice, this actually maps into the full destination image.
  2. Second tile of 30,000-45,000 x 0-30,000
    a. The tile ROI is dest(30,000-45,000, 0-30,000)
    b. Compute the source ROI, again we swap line and sample so our source ROI is src(0-30,000, 30,000-45,000)
    c. Allocate a 15,000 by 30,000 mapping matrix
    d. Loop over the pixels in the tile and compute the source pixel. So, the tile pixel (30,000, 0) maps to source pixel (0, 30,000), (30,001, 0) maps to (0, 30,001), etc. Then, we need to subtract off the source ROI offset, (0, 30,000), to get the source ROI pixel. So, the tile pixel (30,000, 0) maps to the source ROI pixel (0, 0), (30,001, 0) maps to (0, 1), etc. Finally we account for the tile to tile ROI and subtract off the tile ROI offset, (30,000, 0). So, we store the source ROI pixel (0, 0) in mapping(0, 0), (0, 1) in mapping(1, 0), etc.
    e. Use remap(src_ROI, dest_ROI, mapping) to transform the source ROI into our destination ROI.
  3. Third tile of 0-30,000 x 30,000-36,000
    a. The tile ROI is dest(0-30,000, 30,000-36,000)
    b. Compute the source ROI, again we swap line and sample so our source ROI is src(30,000-36,000, 0-30,000)
    c. Allocate a 30,000 by 6,000 mapping matrix
    d. Loop over the pixels in the tile and compute the source pixel. So, the tile pixel (0, 30,000) maps to source pixel (30,000, 0), (1, 30,000) maps to (30,000, 1), etc. Then, we need to subtract off the source ROI offset, (30,000, 0), to get the source ROI pixel. So, the tile pixel (0, 30,000) maps to the source ROI pixel (0, 0), (1, 30,000) maps to (0, 1), etc. Finally we account for the tile to tile ROI and subtract off the tile ROI offset, (0, 30,000). So, we store the source ROI pixel (0, 0) in mapping(0, 0), (0, 1) in mapping(1, 0), etc.
    e. Use remap(src_ROI, dest_ROI, mapping) to transform the source ROI into our destination ROI.
  4. Fourth tile of 30,000-45,000 x 30,000-36,000
    a. The tile ROI is dest(30,000-45,000, 30,000-36,000)
    b. Compute the source ROI, again we swap line and sample so our source ROI is src(30,000-36,000, 30,000-45,000)
    c. Allocate a 15,000 by 6,000 mapping matrix
    d. Loop over the pixels in the tile and compute the source pixel. So, the tile pixel (30,000, 30,000) maps to source pixel (30,000, 30,000), (30,001, 30,000) maps to (30,000, 30,001), etc. Then, we need to subtract off the source ROI offset, (30,000, 30,000), to get the source ROI pixel. So, the tile pixel (30,000, 30,000) maps to the source ROI pixel (0, 0), (30,001, 30,000) maps to (0, 1), etc. Finally we account for the tile to tile ROI and subtract off the tile ROI offset, (30,000, 30,000). So, we store the source ROI pixel (0, 0) in mapping(0, 0), (0, 1) in mapping(1, 0), etc.
    e. Use remap(src_ROI, dest_ROI, mapping) to transform the source ROI into our destination ROI.
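The tiling walk-through above can be sketched end to end on a tiny example (pure Python, no OpenCV; the homography is the same swap-line-and-sample transform used above, and an explicit per-pixel copy stands in for cv::remap):

```python
# Sketch of the tiled-transform idea; not the proposed ISIS C++ code.
def tile_ranges(length, tile):
    """Chop [0, length) into tiles of at most `tile` (step 1 above)."""
    return [(s, min(s + tile, length)) for s in range(0, length, tile)]

def tiled_transpose(src, tile):
    """Apply the swap-line-and-sample 'homography' tile by tile.
    src is a list of rows; dest has swapped dimensions."""
    rows, cols = len(src), len(src[0])
    dest = [[None] * rows for _ in range(cols)]
    for r0, r1 in tile_ranges(cols, tile):        # dest-row tiles
        for c0, c1 in tile_ranges(rows, tile):    # dest-col tiles
            # Steps 2-3: dest ROI (r0:r1, c0:c1) maps, through the
            # inverse homography, to src ROI (c0:c1, r0:r1).
            # Steps 4-6: per-pixel mapping + copy stands in for cv::remap.
            for r in range(r0, r1):
                for c in range(c0, c1):
                    dest[r][c] = src[c][r]
    return dest

# The 45,000 x 36,000 example above, tile size 30,000:
print(tile_ranges(45000, 30000))      # [(0, 30000), (30000, 45000)]
print(tile_ranges(36000, 30000))      # [(0, 30000), (30000, 36000)]

src = [[1, 2, 3], [4, 5, 6]]          # tiny 2 x 3 "image"
print(tiled_transpose(src, 2))        # [[1, 4], [2, 5], [3, 6]]
```

Each tile stays under the size limit, so every individual remap-style copy is legal even when the full destination image is not.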

@jessemapel jessemapel removed this from the 5.0.1 milestone May 28, 2021
@AustinSanders AustinSanders self-assigned this Jun 1, 2021
@AustinSanders AustinSanders added this to the 6.1.0 milestone Aug 5, 2021
@AustinSanders AustinSanders removed their assignment Aug 17, 2021
@amystamile-usgs amystamile-usgs self-assigned this Aug 19, 2021
@amystamile-usgs amystamile-usgs removed their assignment Sep 7, 2021
@AustinSanders AustinSanders modified the milestones: 6.1.0, 6.2, 6.2.0 Oct 19, 2021
@github-actions

Thank you for your contribution!

Unfortunately, this issue hasn't received much attention lately, so it is labeled as 'stale.'

If no additional action is taken, this issue will be automatically closed in 180 days.

@github-actions github-actions bot added the inactive Issue that has been inactive for at least 6 months label Aug 14, 2022
@acpaquette
Collaborator

This is still active

@lwellerastro
Contributor Author

This would be extremely useful for project work coming up.

@github-actions github-actions bot removed the inactive Issue that has been inactive for at least 6 months label Sep 12, 2022
@rfergason rfergason moved this to In Progress in FY23 Q1 Software Support Dec 1, 2022
@Kelvinrr Kelvinrr moved this to In Progress in FY23 Q2 Support Feb 22, 2023
@lwellerastro
Contributor Author

Has there been any progress on #5091 script to help address this issue?

@Kelvinrr
Collaborator

@lwellerastro Those changes should be released with the RC coming after this support sprint is over, the week of 12/4/23.

@github-project-automation github-project-automation bot moved this from In Progress to Done in FY23 Q1 Software Support Nov 13, 2023
@github-project-automation github-project-automation bot moved this from In Progress to Done in FY23 Q2 Support Nov 13, 2023
@github-project-automation github-project-automation bot moved this from Todo to Done in ASC Software Support Nov 13, 2023
@lwellerastro
Contributor Author

Hi @Kelvinrr, I am not seeing findfeaturesSegment in the release candidate isis8.1.0-RC1. Did it miss the boat or do I need to do something special to invoke it?
