Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added findfeaturesSegment script #5091

Merged
merged 21 commits into from
Aug 18, 2023
Merged

Conversation

Kelvinrr
Copy link
Collaborator

@Kelvinrr Kelvinrr commented Dec 5, 2022

Description

Adds a script that splits images into segments and runs findfeatures on them.

Related Issue

#4396

How Has This Been Validated?

Ran in a jupyter notebook and had the resulting network reviewed by @lwellerastro

Still needs more adjustments so more changes incoming.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Documentation change (update to the documentation; no code change)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Infrastructure change (changes to things like CI or the build system that do not impact users)

Checklist:

  • I have read and agree to abide by the Code of Conduct
  • I have read the CONTRIBUTING document.
  • My change requires a change to the documentation and I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • I have added myself to the .zenodo.json document.
  • I have added any user impacting changes to the CHANGELOG.md document.

Licensing

This project is mostly composed of free and unencumbered software released into the public domain, and we are unlikely to accept contributions that are not also released into the public domain. Somewhere near the top of each file should have these words:

This work is free and unencumbered software released into the public domain. In jurisdictions that recognize copyright laws, the author or authors of this software dedicate any and all copyright interest in the software to the public domain.

  • I dedicate any and all copyright interest in this software to the public domain. I make this dedication for the benefit of the public at large and to the detriment of my heirs and successors. I intend this dedication to be an overt act of relinquishment in perpetuity of all present and future rights to this software under copyright law.

Comment on lines 181 to 184
new_params["networkid"] = og_networkid + f"_match_segment{match_segment_n}_from_segment{from_segment_n}"
new_params["pointid"] = og_pointid + f"_match_segment{match_segment_n}_from_segment{from_segment_n}"
new_params["onet"] = f"{og_onet.parent/og_onet.stem}.match_segment{match_segment_n}.from_segment{from_segment_n}.net"
new_params["tolist"] = f"{og_tolist.parent/og_tolist.stem}.match_segment{match_segment_n}.from_segment{from_segment_n}.lis"
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should probably be changed to include the image names, not just match/from and segment number

Comment on lines 26 to 37
def parse_parameters():
# Use a regular expression to extract the parameter name and value from each
# key-value pair in the parameter string.
pattern = r'(\w+)=(\S+)'

# Create a dictionary to store the parameter names and values.
params = {}
for arg in sys.argv[1:]:
# Extract the parameter name and value from the argument.
matches = re.findall(pattern, arg)
for match in matches:
name = match[0]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kinda ghetto for now, but if we can expose ISIS's UI objects into python, we can write this in a way that is indistinguishable from a C++ ISIS app using Python (I think)

@Kelvinrr Kelvinrr marked this pull request as ready for review January 3, 2023 17:13
@lwellerastro
Copy link
Contributor

lwellerastro commented Jan 4, 2023

I'm running some tests on this now and seeing that the pointid's are a bit on the long side.
My command line includes 'M1316912174RE_fastsobel_?????'

and my output is like the following:
M1316912174RE_fastsobel_00013_M1316912174RE_cal_echo.segment1_from_images_segment1

Would it be possible to drop the "M1316912174RE_cal_echo." at the very least if it is important to keep the "segment' references for tracking?

If it could be boiled down to "M1316912174RE_fastsobel_00013_segment#" that would be ideal.

Edit - this request would also apply to the some of the intermediate output files which have similarly extremely long names:

M1316912174RE_cubes_ff.M1316912174RE_cal_echo.segment1.from_images_segment1.lis
M1316912174RE_cubes_ff.M1316912174RE_cal_echo.segment1.from_images_segment2.lis
M1316912174RE_cubes_ff.M1316912174RE_cal_echo.segment2.from_images_segment2.lis
M1316912174RE_ff.M1316912174RE_cal_echo.segment1_from_images_segment1.log
M1316912174RE_ff.M1316912174RE_cal_echo.segment1.from_images_segment1.net
M1316912174RE_ff.M1316912174RE_cal_echo.segment1_from_images_segment2.log
M1316912174RE_ff.M1316912174RE_cal_echo.segment1.from_images_segment2.net
M1316912174RE_ff.M1316912174RE_cal_echo.segment2_from_images_segment2.log
M1316912174RE_ff.M1316912174RE_cal_echo.segment2.from_images_segment2.net

@lwellerastro
Copy link
Contributor

I'm having a hard time telling what's going on while with my cluster jobs. My findfeaturesSegment command line includes maxthreads=7 but it seems it's using a lot more than that. Do I need to ask for a different value? Is this being used at all?

I need to know in order to ask for the appropriate amount of threads and most importantly memory for a single image findfeatures job execution. This is particularly important when there are limited resources, numerous images to work on (jobs) and multiple people likely to do so at the same time.

@lwellerastro
Copy link
Contributor

I'm getting *PROGRAMMER ERROR** Can't invert empty matrix in GenericTransform.cpp at 139. errors for some of the image segments. I'm familiar with this error, or at least some of its undesirable affects.

I believe there are "overlapping" segments that are too slim for findfeatures to work with so it dies. There is no easy way to know what image segment in the fromlist is creating the problem for the match image so it just quits and no images get worked on (a findfeatures problem that is already in another post).

The findfeatures workflow starts by passing a config file to isisminer which evaluates the overlap ratios of the match and the images contained in its overlaps.csv file. Overlap ratios that are too small are not allowed to go into the findfeatures fromlist. This is for the whole images. The findfeaturesSegment script seems to evaluate if the match and from segment overlap, but it does not seem to be evaluating if those overlaps as being suitable for the program.

This may be why 13 of the 18 jobs (images) I sent to cluster have not finished after 30+ minutes and all seem to be stuck on something, but I can't really tell what is going yet. I don't have any completed networks to look at yet and I'm having trouble tracing some things because issues with having to write to a single print.prt and information not being captured where I usually expect.

I can point someone to the data I'm looking at (on our scratch directory) when necessary via chat or email.

@lwellerastro
Copy link
Contributor

I would very much like to help debug some of the problems I am experiencing, but I am having an especially difficult time going through my slurm stdout log file because of the enormous amount of output and seemingly numerous calls to findfeatures. I have no useful print file to look at, and in the worst failure cases (like the example below) there is no findfeatures log file to look through.

One thing I am observing, is that it seems there are image segments being passed to findfeatures via the fromlist that do not actually overlap the match file segment being operated on. This may be why most of my jobs hung or returned empty matrix errors. None of my test jobs ran to a successful completion and after 3+ hours of doing nothing I killed the 12 jobs that hung.

I'm looking at an image that has some output in it's working directory in the form of from_images_segment#.lis and trying to correlate that back to the text in the slurm log file to see if there is anything useful.

Maybe I'm not interpreting things properly, but in the following findfeatures command (copied from the slurm output) I have opened the match image and the files in the fromlist in qmos and can clearly see there is a lack of overlaps for one of the segments (green) with the match image (blue). Maybe there is something going on with that aspect of the script?

I've removed all of the relevant path information:

findfeatures algorithm=fast/brief maxthreads=7 match=Lev1/M156671044LE.lev1.segment1.cub fastgeom=true geomtype=camera maxpoints=15000 epitolerance=9.0 ratio=0.9 hmgtolerance=9.0 filter=sobel networkid=M156671044LE_M156671044LE.lev1.segment1_from_images_segment2 pointid=M156671044LE_fastsobel_?????_M156671044LE.lev1.segment1_from_images_segment2 onet=M156671044LE_Network/M156671044LE_ff.M156671044LE.lev1.segment1.from_images_segment2.net tolist=M156671044LE_Network/M156671044LE_cubes_ff.M156671044LE.lev1.segment1.from_images_segment2.lis tonotmatched=M156671044LE_Network/M156671044LE_notmatched_ff.lis description=Create debug=true debuglog=M156671044LE_Network/M156671044LE_ff.M156671044LE.lev1.segment1_from_images_segment2.log fromlist=M156671044LE_Network/from_images_segment2.lis

contents of fromlist w/ paths removed with comments inline:

Lev1/M1214450277RE.lev1.segment2.cub        very small ovl with M156671044LE.lev1.segment1.cub at corners
Lev1/M1214450277LE.lev1.segment2.cub         no overlap at all with M156671044LE.lev1.segment1.cub

See qmos footprint screen capture. Like I said, it was difficult parsing through the slurm output file, so maybe I'm misreading something.

FFSegment_Qmos

@lwellerastro
Copy link
Contributor

This is in reference to my last set of comments and failures possibly due to segments not overlapping.

To the best of my ability, I am going though the script steps manually that precede the call to findfeatures for the 3 segments above. The match image is M156671044LE.lev1.segment1.cub and in the above case, segment 2 images are trying to be matched to it, M1214450277LE.lev1.segment2.cub and M1214450277RE.lev1.segment2.cub. I'll share my steps below and the results that lead to suggesting changes for determining overlaps before creating a segment fromlist for findfeatures.

I'm piecing this together based on the bits and pieces I can pull from what is going to the slurm standard out file since none of it is properly captured to screen or print file since I have to disable those things in my Preference file in order to get the script to run. There may be flaws below due to the limited information. There are a number of temporary files created via script that I am attempting to mimic in my steps below. The filenames are my own.

segment from=M156671044LE.lev1.cub nl=30000 overlap=0
segment from=M1214450277LE.lev1.cub nl=30000 overlap=0
segment from=M1214450277RE.lev1.cub nl=30000 overlap=0
ls *segment*cub > segments.lis

foreach i ( `cat segments.lis ` )
footprintinit from=${i}
end
 --> I would use a linc/sinc larger than the default 100 because it takes a long time to run and at least the longer segments can handle it.  increaseprecision can always be used as well if the linc/since is too large for some segments

Attempting to mimic a findfeatures command and input from above:
ls M156671044LE.lev1.segment1.cub > M156671044LE_Seg1_From_Seg2.lis
ls M1214450277LE.lev1.segment2.cub >> M156671044LE_Seg1_From_Seg2.lis
ls M1214450277RE.lev1.segment2.cub >> M156671044LE_Seg1_From_Seg2.lis

findimageoverlaps froml=M156671044LE_Seg1_From_Seg2.lis overlaplist=overlaps_M156671044LE_Seg1_From_Seg2.dat

overlapstats fromlist=M156671044LE_Seg1_From_Seg2.lis overlaplist=overlaps_M156671044LE_Seg1_From_Seg2.dat detail=full to=ovlstats_M156671044LE_Seg1_From_Seg2.csv

Best I can tell, the script is evaluating the basic Group=Results output from overlapstats to determine if there are overlaps in the input file:

  Group = Results
    ThicknessMinimum            = 6.96146608705005e-04
    ThicknessMaximum            = 0.024623438838304
    ThicknessAverage            = 0.012659792723505
    ThicknessStandardDeviation  = 0.016919150590982
    ThicknessVariance           = 2.86257656720319e-04
    AreaMinimum                 = 711456.45971044
    AreaMaximum                 = 892790.20723427
    AreaAverage                 = 802123.33347235
    AreaStandardDeviation       = 128222.32253207
    AreaVariance                = 16440963995.518
    ImageStackMinimum           = 2.0
    ImageStackMaximum           = 2.0
    ImageStackAverage           = 2.0
    ImageStackStandardDeviation = 0.0
    ImageStackVariance          = 0.0
    PolygonCount                = 5
  End_Group

  Group = Results
    ThicknessMinimum            = 6.96146608705005e-04
    ThicknessMaximum            = 0.024623438838304
    ThicknessAverage            = 0.012659792723505
    ThicknessStandardDeviation  = 0.016919150590982
    ThicknessVariance           = 2.86257656720319e-04
    AreaMinimum                 = 711456.45971044
    AreaMaximum                 = 892790.20723427
    AreaAverage                 = 802123.33347235
    AreaStandardDeviation       = 128222.32253207
    AreaVariance                = 16440963995.518
    ImageStackMinimum           = 2.0
    ImageStackMaximum           = 2.0
    ImageStackAverage           = 2.0
    ImageStackStandardDeviation = 0.0
    ImageStackVariance          = 0.0
    PolygonCount                = 5
  End_Group

This is not especially useful in this case because it does not specify who overlaps who. Notice that I included detail=full to=ovlstats_M156671044LE_Seg1_From_Seg2.csv in my command. Here are the contents of the output csv file showing who overlaps who and by how much:

cat ovlstats_M156671044LE_Seg1_From_Seg2.csv 
Overlap ID,Thickness,Area,Image Count,Serial Numbers in Overlap,Image Files in Overlap
3,0.0006961466087050050,892790.2072342676,2,LRO/1/481503810:23259/NACL,M1214450277LE.lev1.segment2.cub,LRO/1/481503810:23259/NACR,M1214450277RE.lev1.segment2.cub
4,0.02462343883830421,711456.4597104399,2,LRO/1/323724577:00452/NACL,M156671044LE.lev1.segment1.cub,LRO/1/481503810:23259/NACR,M1214450277RE.lev1.segment2.cub

Sorry for the length of these files.

The output shows in the second line that our match segment file of interest M156671044LE.lev1.segment1.cub only overlaps M1214450277RE.lev1.segment2.cub and not the other segment2 image. The first line shows that the two segment2 images overlap each other. M1214450277LE.lev1.segment2.cub should NOT be included in the findfeatures fromlist since it does not overlap the match file we are working on. This is no doubt the sort of thing that will make findfeatures choke.

I think the script needs to be reworked to determine the properly overlapping segments to be passed to the findfeatures runs.

I further think there needs to be some inspection and filtering on the contents of the detailed overlap statistics to determine if the overlap is large enough to pass to findfeatures. An overlap ratio would be more useful than the thickness or area information being captured by overlapstats which are notoriously difficult to work with. That said, I think setting that value needs to come from the user, not the script since the user will be more familiar with the data. Until having the opportunity to research a cutoff I can't recommend a cutoff for either statistic for exclusion from a fromlist.

@lwellerastro
Copy link
Contributor

Something just occurred to me that might be another issue for this script, but I'm not sure and won't until I am able to run the script.

In the links above and through personal communication, @Kelvinrr said the script is parallelized/threaded(?) which may be a problem if that means findfeatures is running multiple instances in the same directory on the same match and fromlist images.

Findfeatures is explicitly run in separate match image directories for file management reasons, but also because it reads train (froml) and query (match) images to memory and sometimes to disk (when using the rendered image option) while finding and matching interesting points. I'm probably not articulating this properly and may even have the concepts ill conceived, but actual experience has demonstrated that running multiple instances of findfeatures on multiple images that are involved with not only building up their own network but also included as a fromlist image in other networks (the norm), and doing so all in the same directory, will result scrambled, incomplete, and often undesired results. Separating out the findfeatures runs into individual image network directories fixes this.

Again, probably not describing this quite right, but when all actions take place in the same directory, the query (match) and trainer (froml) image overlaps are going to the feature detection algorithm in a projected form via memory (I use fastgeom=true; though I might not be interpreting this aspect properly). This is ok if there is one run for a match with multiple froms - the froms are rotated and scaled to match the match image. And when rendered have names like M1214450277RE.lev1.train.png and M156671044LE.lev1.query.png, which mimics the input filenames. But what happens when there are multiple segments running around and multiple runs of findfeatures occurring at once in the same directory? Match.segment1 passed to findfeatures along with From.segment2 at the same time as Match.segment1 From.segment1 could be a problem.

I just don't know, but I think it needs to be thought about for a second if the answer is not obvious. I understand and appreciate that running things in parallel will make the work go quicker, but it could be a problem for this application.

If @KrisBecker has any time and/or interest in commenting, that would be greatly appreciated, but understood if he chooses not to considering he is not directly involved with this particular work. I would only ask if the concept of running multiple findfeatures concurrently in the same directory on the same images might pose a problem. Thanks.

@KrisBecker
Copy link
Contributor

@lwellerastro sorry you are having so much trouble. There is a lot going on here and the cluster may be contributing to the problems to some degree (i.e., mostly due to job constraints, perhaps).

After reading through your posts, I think there are a few things that are contributing to the issues you are seeing. findfeatures's FastGeom algorithm is quite challenged with the very small overlapping images. The FastGeom::Grid algorithm will iterate way too long trying to construct a perspective transform from common ground points, which are computed from a priori SPICE. The very small sliver/corner of overlap shown in your qmos graphic is exactly the kind of case that will cause findfeatures to iterate seemingly forever. It will eventually fail, killing the job, because it can't get enough common points to create the transformation matrix in FastGeom. And ofttimes, when it does get enough points, it creates a very bad matrix leading to the matrix inversion issue.

Threading is likely contributing to some of your problems, particularly if you are running on a cluster that (typically) manages memory and CPU utilization. If its true that this findfeature segmentation script is threaded, and you are requesting threads in findfeatures, the cluster itself could impose limiting that resource such that it will kill the job when you start requesting more threads. It seems to make sense that essentially, if you are running the segmentation on the cluster, which is threaded, it is using all the allocated/specified threads. When a findfeatures run is dispatched on the cluster, it will only get one CPU (or none, which causes a suspension or the job is killed?). And if FastGeom is in its ground point collection loop, it will not finish any time soon, that's for sure.

I suggest that if the images are nearly spatially consistent, you may be able to forego using FastGeom. If you do that, be sure to use matching algorithms that are scale and rotation invariant.

For cluster management, it might be nice to have the segmenting script submit the individual findfeatures jobs back to the cluster for each segment so that threading is not contentious/competing for the same resource.

To answer your specific question if running findfeatures in the same directory is a problem: the only issue I think might be a problem is if you are requesting to save the FastGeom'ed images as PNGs. This could cause the same filenames to be generated from like-named input files and potentially cause file I/O collisions.

Finally, I have a long standing findfeatures PR waiting for me to finalize that addresses many of these issues. These modifications greatly improve the FastGeom::Grid algorithm by providing indexed looping ranges, adds a new one shot FastGeom::Radial algorithm, won't fail when sufficent common ground points cannot be found (writes failed image names to a file) and resolves the matrix inversion issues. I am working to finish that PR ASAP - in the next week - that should help with some of these problems.

@lwellerastro
Copy link
Contributor

Thanks for the thoughtful and insightful comments @KrisBecker. There is a lot here to digest and consider and we appreciate you taking the time to provide such useful feedback.

@blandoplanet
Copy link

Ping this for a review. It's really important that we get this code merged as soon as we can.

@Kelvinrr
Copy link
Collaborator Author

pinged some people I think are appropriate specifically.

@AustinSanders
Copy link
Contributor

@Kelvinrr I'm about 90% of the way through this review, but I don't have a lot of experience with the swig interface files, so I'm moving slowly. I expect to be done with the review in the next couple hours.

Copy link
Collaborator

@acpaquette acpaquette left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple questions regarding some CmakeLists and swig

@Kelvinrr
Copy link
Collaborator Author

Gonna add some quick tests and I think this might be good for re-review.

@Kelvinrr Kelvinrr requested a review from AustinSanders April 7, 2023 18:43
@Kelvinrr
Copy link
Collaborator Author

Kelvinrr commented Apr 7, 2023

This should be good for re-review

@AustinSanders
Copy link
Contributor

Code changes look good, but this is failing a few tests that are passing on dev. Is that expected? Looks like:

1439 - DefaultCube.FunctionalTestCkwriterDefault (Failed)
1440 - DefaultCube.FunctionalTestCkwriterFromlist (Failed)
994 - lro_module_test_lroc (Failed)
1036 - mex_module_test_hrsc (Failed)

@Kelvinrr
Copy link
Collaborator Author

@AustinSanders these pass locally, so maybe a red herring?

AustinSanders
AustinSanders previously approved these changes Jun 28, 2023
Copy link
Collaborator Author

@Kelvinrr Kelvinrr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some annotations on recent changes

Comment on lines +6 to +37
def test_module_import():
name = 'astroset'
if name in sys.modules:
print(f"{name!r} already in sys.modules")
assert (spec := importlib.util.find_spec(name)) is not None

def test_check_pip_install():
package_name = "astroset"
try:
# Check if the package is installed by running the "pip show" command
result = subprocess.check_output(['pip', 'show', package_name])
package_info = result.decode('utf-8')

# Parse the package information to check if it contains the package name
assert(package_name.lower() in package_info.lower())

except subprocess.CalledProcessError:
assert False

def test_app_install():
app_name = 'findFeaturesSegment.py'
try:
# Check if the package is installed by running the "pip show" command
result = subprocess.check_output([app_name, '-H'])
help_string = result.decode('utf-8')

# Parse the package information to check if it contains the package name
assert("from" in help_string.lower())
assert("fromlist" in help_string.lower())

except subprocess.CalledProcessError:
assert False
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some simple install scripts

Comment on lines +120 to +121
cd $WORKSPACE/isis/python_bindings/tests
pytest .
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added the tests to jenkins

install(FILES ${ASTROSET_APP_XML_FILES} DESTINATION ${CMAKE_INSTALL_PREFIX}/bin/xml/)

# Setup to run setup tools on install
install(CODE "execute_process(COMMAND $ENV{CONDA_PREFIX}/bin/python ${CMAKE_CURRENT_BINARY_DIR}/setup.py install --single-version-externally-managed --prefix=${CMAKE_INSTALL_PREFIX} --record=record.txt
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now using the up to date way of installing

AustinSanders
AustinSanders previously approved these changes Aug 8, 2023
Copy link
Contributor

@AustinSanders AustinSanders left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is ready to go!

@Kelvinrr
Copy link
Collaborator Author

new tests pass and seem to match dev, I think this is good for re-review @AustinSanders

@Kelvinrr Kelvinrr merged commit 53e3bc0 into DOI-USGS:dev Aug 18, 2023
@lwellerastro
Copy link
Contributor

I really look forward to using this @Kelvinrr !!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants