Rust call analysis #452

Merged
merged 25 commits into from
Aug 15, 2023
Changes from 7 commits
595a77b
Rust analysis
another-rex Jul 20, 2023
b74370c
Fully support rust call analysis
another-rex Jul 26, 2023
e41dcb6
Merge remote-tracking branch 'upstream/main' into rust-call-graph
another-rex Jul 27, 2023
bbbe369
Fix missing line + minor refactor in logic
another-rex Jul 27, 2023
b490584
Move "ar" library to third_party
another-rex Jul 28, 2023
130f41c
Fix occasional tests failing from stdout routed incorrectly
another-rex Jul 28, 2023
47b5cba
Clarify and address PR comments
another-rex Jul 28, 2023
be8a7ad
Move go fixtures
another-rex Jul 28, 2023
cb95bad
Minor refactor
another-rex Jul 28, 2023
d812ef4
Verify govulncheck behavior
another-rex Jul 31, 2023
d1e8e52
Verify govulncheck behavior
another-rex Jul 31, 2023
2563ab6
Refactor, add additional optimisations and corrections, add test
another-rex Jul 31, 2023
75d71be
Optimise test
another-rex Jul 31, 2023
70c1823
Fix linter issues
another-rex Jul 31, 2023
6ab1643
Tests working
another-rex Aug 1, 2023
07d923e
Make it work with go 1.19
another-rex Aug 1, 2023
c690e8c
Merge branch 'main' into rust-call-graph
another-rex Aug 2, 2023
92dff00
docs update
another-rex Aug 2, 2023
dce94c5
Update docs more
another-rex Aug 2, 2023
09b835a
Update docs even more
another-rex Aug 2, 2023
8b5dd76
Change formatting
another-rex Aug 2, 2023
fcfcb0b
Fix spacing
another-rex Aug 2, 2023
7600773
Fix kramdown toc generation level
another-rex Aug 2, 2023
a8413d5
Update output doc to include examples for call analysis
another-rex Aug 10, 2023
94c7eab
Merge remote-tracking branch 'upstream/main' into rust-call-graph
another-rex Aug 15, 2023
6 changes: 3 additions & 3 deletions cmd/osv-scanner/main.go
@@ -115,11 +115,11 @@ func run(args []string, stdout, stderr io.Writer) int {

  switch format {
  case "json":
- r = reporter.NewJSONReporter(stdout, stderr)
+ r = reporter.NewJSONReporter(context.App.Writer, context.App.ErrWriter)
  case "table":
- r = reporter.NewTableReporter(stdout, stderr, false)
+ r = reporter.NewTableReporter(context.App.Writer, context.App.ErrWriter, false)
  case "markdown":
- r = reporter.NewTableReporter(stdout, stderr, true)
+ r = reporter.NewTableReporter(context.App.Writer, context.App.ErrWriter, true)
  default:
  return fmt.Errorf("%v is not a valid format", format)
  }
1 change: 1 addition & 0 deletions go.mod
@@ -33,6 +33,7 @@ require (
github.com/go-git/gcfg v1.5.1-0.20230307220236-3a3c6141e376 // indirect
github.com/goark/errs v1.1.0 // indirect
github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da // indirect
github.com/ianlancetaylor/demangle v0.0.0-20230524184225-eabc099b10ab
github.com/imdario/mergo v0.3.15 // indirect
github.com/jbenet/go-context v0.0.0-20150711004518-d14ea06fba99 // indirect
github.com/kevinburke/ssh_config v1.2.0 // indirect
2 changes: 2 additions & 0 deletions go.sum
@@ -42,6 +42,8 @@ github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da h1:oI5xCqsCo564l
github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da/go.mod h1:cIg4eruTrX1D+g88fzRXU5OdNfaM+9IcxsU14FzY7Hc=
github.com/google/go-cmp v0.5.9 h1:O2Tfq5qg4qc4AmwVlvv0oLiVAGB7enBSJ2x2DqQFi38=
github.com/google/go-cmp v0.5.9/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY=
github.com/ianlancetaylor/demangle v0.0.0-20230524184225-eabc099b10ab h1:BA4a7pe6ZTd9F8kXETBoijjFJ/ntaa//1wiH9BZu4zU=
github.com/ianlancetaylor/demangle v0.0.0-20230524184225-eabc099b10ab/go.mod h1:gx7rwoVhcfuVKG5uya9Hs3Sxj7EIvldVofAWIUtGouw=
github.com/imdario/mergo v0.3.15 h1:M8XP7IuFNsqUx6VPK2P9OSmsYsI/YFaGil0uD21V3dM=
github.com/imdario/mergo v0.3.15/go.mod h1:WBLT9ZmE3lPoWsEzCh9LPo3TiwVN+ZKEjmz+hD27ysY=
github.com/jbenet/go-context v0.0.0-20150711004518-d14ea06fba99 h1:BQSFePA1RWJOlocH6Fxy8MmwDt+yVQYULKfN0RoTN8A=
1 change: 1 addition & 0 deletions internal/sourceanalysis/go.go
@@ -54,6 +54,7 @@ func matchAnalysisWithPackageVulns(pkgs []models.PackageVulns, idToFindings map[
fillNotImportedAnalysisInfo(vulnsByID, vulnID, pv, analysis)
continue
}
// TODO: There feels like something's wrong here, not sure what
(*analysis)[vulnID] = models.AnalysisInfo{
Called: moduleToCalled[pv.Package.Name],
}
283 changes: 283 additions & 0 deletions internal/sourceanalysis/rust.go
@@ -0,0 +1,283 @@
package sourceanalysis

import (
"bytes"
"debug/dwarf"
"debug/elf"
"errors"
"fmt"
"io"
"log"
"os"
"os/exec"
"path/filepath"
"strings"

"github.com/google/osv-scanner/internal/cachedregexp"
"github.com/google/osv-scanner/internal/thirdparty/ar"
"github.com/google/osv-scanner/pkg/models"
"github.com/google/osv-scanner/pkg/reporter"
"github.com/ianlancetaylor/demangle"
)

const (
RustFlagsEnv = "RUSTFLAGS=-C opt-level=3 -C debuginfo=1"
RustLibExtension = ".rcgu.o/"
)

func rustAnalysis(r reporter.Reporter, pkgs []models.PackageVulns, source models.SourceInfo) {
binaryPaths, err := rustBuildSource(r, source)
if err != nil {
r.PrintError(fmt.Sprintf("failed to build cargo/rust project from source: %s", err))
return
}

// This map stores 3 states for each vuln ID
// - There is function level vuln info, but it **wasn't** called (false)
// - There is function level vuln info, and it **is** called (true)
// - There is **no** function level vuln info, so we don't know whether it is called (key doesn't exist)
isCalledVulnMap := map[string]bool{}

for _, path := range binaryPaths {
// analysisPath is the file actually passed to parseDWARFData. For libraries this is
// the object file extracted from the .rlib archive, not the archive itself.
analysisPath := path
if strings.HasSuffix(path, ".rlib") {
// Is a library, so need an extra step to extract the object binary file before passing to parseDWARFData
objFilePath, err := extractRlibArchive(path)
if err != nil {
r.PrintError(fmt.Sprintf("failed to analyse '%s': %s", path, err))
continue
}
// TODO: Do we need to care about error on deletion here?
defer os.Remove(objFilePath)
analysisPath = objFilePath
}
calls, err := parseDWARFData(r, analysisPath)
if err != nil {
r.PrintError(fmt.Sprintf("failed to analyse '%s': %s", path, err))
continue
}

for _, pv := range pkgs {
for _, v := range pv.Vulnerabilities {
for _, a := range v.Affected {
// Example of RUSTSEC function level information:
//
// "affects": {
// "os": [],
// "functions": [
// "smallvec::SmallVec::grow"
// ],
// "arch": []
// }
ecosystemAffects, ok := a.EcosystemSpecific["affects"].(map[string]interface{})
if !ok {
continue
}
affectedFunctions, ok := ecosystemAffects["functions"].([]interface{})
if !ok {
continue
}
for _, f := range affectedFunctions {
if funcName, ok := f.(string); ok {
_, called := calls[funcName]
// Once one advisory marks this vuln as called, always mark as called
isCalledVulnMap[v.ID] = isCalledVulnMap[v.ID] || called
}
}
}
}
}

for _, pv := range pkgs {
for groupIdx := range pv.Groups {
for _, vulnID := range pv.Groups[groupIdx].IDs {
analysis := &pv.Groups[groupIdx].ExperimentalAnalysis
if *analysis == nil {
*analysis = make(map[string]models.AnalysisInfo)
}

called, hasFuncInfo := isCalledVulnMap[vulnID]
if hasFuncInfo {
(*analysis)[vulnID] = models.AnalysisInfo{
Called: called,
}
}
}
}
}
}
}
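The three states described for `isCalledVulnMap` rely on Go's comma-ok map lookup. A minimal standalone sketch of that idea follows; the helper names `markCalled` and `calledState` and the vulnerability IDs are hypothetical and do not appear in the PR.

```go
package main

import "fmt"

// markCalled mirrors the OR-accumulation used in rustAnalysis: once any
// affected function of a vuln is seen as called, the vuln stays "called".
// (Hypothetical helper, not part of the PR.)
func markCalled(m map[string]bool, vulnID string, called bool) {
	m[vulnID] = m[vulnID] || called
}

// calledState turns the map's three states into a readable label.
func calledState(m map[string]bool, vulnID string) string {
	called, hasFuncInfo := m[vulnID]
	switch {
	case !hasFuncInfo:
		return "unknown" // no function level info was recorded for this ID
	case called:
		return "called"
	default:
		return "not called"
	}
}

func main() {
	m := map[string]bool{}
	markCalled(m, "VULN-A", false)
	markCalled(m, "VULN-A", true)  // a later match flips it to called
	markCalled(m, "VULN-A", false) // and it stays called afterwards
	markCalled(m, "VULN-B", false)

	for _, id := range []string{"VULN-A", "VULN-B", "VULN-C"} {
		fmt.Printf("%s: %s\n", id, calledState(m, id))
	}
}
```

Only IDs present in the map get an `AnalysisInfo` entry, so a vulnerability without function level data is left unannotated rather than reported as uncalled.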

func parseDWARFData(_ reporter.Reporter, binaryPath string) (map[string]struct{}, error) {
output := map[string]struct{}{}
file, err := elf.Open(binaryPath)
if err != nil {
return nil, fmt.Errorf("failed to open binary %s: %w", binaryPath, err)
}
dwarfData, err := file.DWARF()
if err != nil {
return nil, fmt.Errorf("failed to extract debug symbols from binary %s: %w", binaryPath, err)
}
entryReader := dwarfData.Reader()

for {
entry, err := entryReader.Next()
if errors.Is(err, io.EOF) || entry == nil {
// We've reached the end of DWARF entries
break
}
if err != nil {
return nil, fmt.Errorf("error parsing binary DWARF data: %w", err)
}

// We only care about contents in functions
if entry.Tag != dwarf.TagSubprogram {
continue
}
// Go through fields
for _, field := range entry.Field {
// We only care about linkage names (including function names)
if field.Attr != dwarf.AttrLinkageName {
continue
}

val, err := demangle.ToString(field.Val.(string), demangle.NoClones)
if err != nil {
// most likely not a rust function, so just ignore it
continue
}

val = cleanRustFunctionSymbols(val)
output[val] = struct{}{}
}
}

return output, nil
}

// extractRlibArchive returns the file path to a temporary ELF object file extracted from the given rlib.
//
// It is the caller's responsibility to remove the temporary file.
func extractRlibArchive(rlibPath string) (string, error) {
file, err := os.CreateTemp("", "rust-*.o")
if err != nil {
return "", fmt.Errorf("failed to create temporary file: %w", err)
}
rlibFile, err := os.Open(rlibPath)
if err != nil {
return "", fmt.Errorf("failed to open .rlib file '%s': %w", rlibPath, err)
}

reader, err := ar.NewReader(rlibFile)
if err != nil {
return "", fmt.Errorf(".rlib file '%s' is not valid ar archive: %w", rlibPath, err)
}
for {
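header, err := reader.Next()
if err != nil {
// This includes io.EOF, i.e. the archive ended before an object file was found.
// Return the error rather than exiting the whole process from library code.
return "", fmt.Errorf("failed to find object file in ar archive '%s': %w", rlibPath, err)
}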
if header.Name == "//" { // "//" is used in GNU ar format as a store for long file names
fileBuf := bytes.Buffer{}
// Read the whole long file name store into memory so its contents can be checked
_, err = io.Copy(&fileBuf, reader)
if err != nil {
return "", fmt.Errorf("failed to read // store in ar archive: %w", err)
}
// There should only be one file (since we set codegen-units=1)
if !strings.HasSuffix(fileBuf.String(), RustLibExtension) {
// TODO: Verify this, and return an error here instead.
log.Printf("rlib archive contents were unexpected: %s\n", fileBuf.String())
}
}
// /0 indicates the first file mentioned in the "//" store
if header.Name == "/0" || strings.HasSuffix(header.Name, RustLibExtension) {
break
}
}
_, err = io.Copy(file, reader)
if err != nil {
return "", fmt.Errorf("failed to write to temporary file '%s': %w", file.Name(), err)
}

filePath := file.Name()
err = file.Close()
if err != nil {
return "", fmt.Errorf("failed to close the temporary file '%s': %w", file.Name(), err)
}

return filePath, nil
}
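For readers unfamiliar with the `//` and `/0` names checked above: in the GNU ar variant, the `//` member holds a table of long file names, each terminated by `/` and a newline, and a header named `/<offset>` refers to the name starting at that byte offset in the table. The sketch below illustrates that lookup in isolation. `resolveLongName` is a hypothetical helper and the table contents are made up; the PR itself only checks for `/0` and the `.rcgu.o/` suffix.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// resolveLongName looks up a GNU ar header name of the form "/<offset>" in the
// contents of the "//" long file name table. (Hypothetical helper for illustration.)
func resolveLongName(table, headerName string) (string, error) {
	if len(headerName) < 2 || !strings.HasPrefix(headerName, "/") {
		return "", fmt.Errorf("%q is not a long name reference", headerName)
	}
	offset, err := strconv.Atoi(headerName[1:])
	if err != nil {
		return "", fmt.Errorf("invalid offset in %q: %w", headerName, err)
	}
	if offset < 0 || offset >= len(table) {
		return "", fmt.Errorf("offset %d is outside the name table", offset)
	}
	rest := table[offset:]
	end := strings.Index(rest, "/\n") // each entry ends with "/" and a newline
	if end < 0 {
		return "", fmt.Errorf("unterminated entry at offset %d", offset)
	}
	return rest[:end], nil
}

func main() {
	// Made-up table with two entries; the second starts at byte offset 24.
	table := "first_long_name.rcgu.o/\nsecond.rcgu.o/\n"
	for _, ref := range []string{"/0", "/24"} {
		name, err := resolveLongName(table, ref)
		fmt.Println(ref, name, err)
	}
}
```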

func rustBuildSource(r reporter.Reporter, source models.SourceInfo) ([]string, error) {
projectBaseDir := filepath.Dir(source.Path)

cmd := exec.Command("cargo", "build", "--all-targets", "--release")
cmd.Env = append(cmd.Environ(), RustFlagsEnv)
cmd.Dir = projectBaseDir
if errors.Is(cmd.Err, exec.ErrDot) {
cmd.Err = nil
}

stdoutBuffer := bytes.Buffer{}
stderrBuffer := bytes.Buffer{}
cmd.Stdout = &stdoutBuffer
cmd.Stderr = &stderrBuffer

r.PrintText("Begin building rust/cargo project\n")

if err := cmd.Run(); err != nil {
r.PrintError(fmt.Sprintf("cargo stdout:\n%s", stdoutBuffer.String()))
r.PrintError(fmt.Sprintf("cargo stderr:\n%s", stderrBuffer.String()))

return nil, fmt.Errorf("failed to run cargo build: %w", err)
}

// cargo is invoked with --release above, so build output lands in target/release
outputDir := filepath.Join(projectBaseDir, "target", "release")
entries, err := os.ReadDir(outputDir)
if err != nil {
return nil, fmt.Errorf("failed to read \"%s\" dir: %w", outputDir, err)
}

resultBinaryPaths := []string{}
for _, de := range entries {
// We only want .d files, which cargo generates for each output binary.
// Each file contains the full path of the output binary/library file.
// This is a reasonably reliable way to identify the output in a cross-platform way.
if de.IsDir() || !strings.HasSuffix(de.Name(), ".d") {
continue
}

file, err := os.ReadFile(filepath.Join(outputDir, de.Name()))
if err != nil {
return nil, fmt.Errorf("failed to read \"%s\": %w", filepath.Join(outputDir, de.Name()), err)
}

fileSplit := strings.Split(string(file), ": ")
if len(fileSplit) != 2 {
// TODO: this can probably be fixed with more effort
return nil, fmt.Errorf("file path contains ': ', which is unsupported")
}
resultBinaryPaths = append(resultBinaryPaths, fileSplit[0])
}

return resultBinaryPaths, nil
}
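A cargo `.d` file is a Makefile-style dependency line: the output path, then `: `, then the source files it depends on. The sketch below shows one way to pull out the output path using `strings.Cut` instead of `strings.Split`. This is an alternative for illustration, not what the PR does; `outputPathFromDepFile` and the paths are hypothetical. It shares the PR's limitation that an output path containing `: ` is not handled correctly.

```go
package main

import (
	"fmt"
	"strings"
)

// outputPathFromDepFile returns the build output path from the contents of a
// cargo .d file, i.e. everything before the first ": " separator.
// (Hypothetical helper; the PR uses strings.Split and checks for exactly two parts.)
func outputPathFromDepFile(contents string) (string, bool) {
	target, _, found := strings.Cut(contents, ": ")
	if !found || target == "" {
		return "", false
	}
	return target, true
}

func main() {
	// Made-up example contents of target/release/app.d
	dep := "/proj/target/release/app: /proj/src/main.rs /proj/src/lib.rs\n"
	path, ok := outputPathFromDepFile(dep)
	fmt.Println(path, ok)
}
```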

// cleanRustFunctionSymbols takes in demangled rust symbols and makes them fit the format of
// the common function level advisory information
func cleanRustFunctionSymbols(val string) string {
// Used to remove generics from functions and types as they are not included in function calls
// in advisories:
// E.g.: `smallvec::SmallVec<A>::new` => `smallvec::SmallVec::new`
Review comment (Collaborator):

I wonder if we should start a discussion here if the data can be formatted in a more convenient way to begin with?

Regex seems potentially very error prone here.

//
// Usage: antiGenericRegex.ReplaceAllString(val, "")
var antiGenericRegex = cachedregexp.MustCompile(`<[\w,]+>`)
val = antiGenericRegex.ReplaceAllString(val, "")

// Used to remove fully qualified trait implementation indicators from the function type,
// since those are generally not included in advisory:
// E.g.: `<libflate::gzip::MultiDecoder as std::io::Read>::read` => `libflate::gzip::MultiDecoder::read`
var antiTraitImplRegex = cachedregexp.MustCompile(`<(.*) as .*>`)
val = antiTraitImplRegex.ReplaceAllString(val, "$1")

return val
}
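To make the behaviour of the two regexes concrete, here is a standalone sketch that applies the same patterns with the standard `regexp` package (the PR uses the internal `cachedregexp` wrapper instead). `cleanSymbol` is a hypothetical stand-in for `cleanRustFunctionSymbols`, and the last two inputs are made-up symbols chosen to show the kind of case the review comment is worried about.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Same patterns as in the PR, compiled with the standard library.
	genericRe   = regexp.MustCompile(`<[\w,]+>`)
	traitImplRe = regexp.MustCompile(`<(.*) as .*>`)
)

// cleanSymbol applies the two substitutions in the same order as the PR.
func cleanSymbol(val string) string {
	val = genericRe.ReplaceAllString(val, "")
	return traitImplRe.ReplaceAllString(val, "$1")
}

func main() {
	inputs := []string{
		"smallvec::SmallVec<A>::new",
		"<libflate::gzip::MultiDecoder as std::io::Read>::read",
		// Left unchanged: the generic pattern allows neither spaces nor "::".
		"Foo<A, B>::bar",
		"alloc::vec::Vec<alloc::string::String>::push",
	}
	for _, in := range inputs {
		fmt.Printf("%s => %s\n", in, cleanSymbol(in))
	}
}
```

The first two inputs reproduce the examples from the code comments. The last two pass through untouched, which would make a lookup against advisory function names miss if symbols of that shape occur in practice.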
4 changes: 4 additions & 0 deletions internal/sourceanalysis/sourceanalysis.go
@@ -30,4 +30,8 @@ func Run(r reporter.Reporter, source models.SourceInfo, pkgs []models.PackageVul
if source.Type == "lockfile" && filepath.Base(source.Path) == "go.mod" {
goAnalysis(r, pkgs, source)
}

if source.Type == "lockfile" && filepath.Base(source.Path) == "Cargo.lock" {
rustAnalysis(r, pkgs, source)
}
}
19 changes: 19 additions & 0 deletions internal/thirdparty/ar/COPYING
@@ -0,0 +1,19 @@
Copyright (c) 2013 Blake Smith <[email protected]>

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.