Automated creating and updating scala_x_x.bzl repository files #1608

Open · wants to merge 36 commits into master

36 commits
7a0374c
testing new convention of file
lm1nrt Sep 3, 2024
c1004da
testing new convention of file
lm1nrt Sep 3, 2024
b91159d
wrong format of file
lm1nrt Sep 4, 2024
aa4778b
valid format of file
lm1nrt Sep 4, 2024
1e46caf
additional deps
lm1nrt Sep 4, 2024
2d51c6f
removed not existing deps
lm1nrt Sep 4, 2024
f349b78
removed runtime dependencies from deps
lm1nrt Sep 4, 2024
8662cf3
fastparse update
lm1nrt Sep 4, 2024
dfef0f3
missing jar in artifact
lm1nrt Sep 4, 2024
c0f7769
removed jar
lm1nrt Sep 4, 2024
d60190c
added scala_collection_compat dependencies
lm1nrt Sep 4, 2024
1f7d150
updated scala_2_13 with script
lm1nrt Sep 4, 2024
222b708
updated scala_2_11 with script
lm1nrt Sep 6, 2024
f8b83f8
update
lm1nrt Sep 6, 2024
4bfe939
updated scalapb runtime
lm1nrt Sep 6, 2024
df4c3b8
added deps to "scala_proto_rules_scalapb_runtime"
lm1nrt Sep 9, 2024
2ba1724
rollback to old version
lm1nrt Sep 9, 2024
23ecb4d
update scala 2_11
lm1nrt Sep 9, 2024
2be7038
added @com_lihaoyi_fastparse to defalt scalapb compile deps
lm1nrt Sep 9, 2024
4a2b15e
removed @com_lihaoyi_fastparse from compile deps
lm1nrt Sep 9, 2024
2d8eb67
rollback to old version of 2_11
lm1nrt Sep 9, 2024
f9442ff
update scala 3_1
lm1nrt Sep 9, 2024
3d362ba
update scala 3_1
lm1nrt Sep 9, 2024
ecb0cb2
update scala 3_2
lm1nrt Sep 9, 2024
3248e01
update scala 3_3
lm1nrt Sep 9, 2024
1e28b46
update scala 3_4
lm1nrt Sep 10, 2024
1d203ad
update scala 2_11
lm1nrt Sep 10, 2024
8446cc6
update scala 2_11
lm1nrt Sep 10, 2024
5b1ff1c
update scala parser
lm1nrt Sep 10, 2024
6175950
update scala_2_11.bzl
lm1nrt Sep 10, 2024
0d5cab3
update scala_3_5.bzl
lm1nrt Sep 13, 2024
193b9f6
script for generating .bzl files
lm1nrt Sep 13, 2024
36eec77
update script and scala_3_5.bzl
lm1nrt Sep 16, 2024
9485f2e
update deps for scala3-library
lm1nrt Sep 16, 2024
807ff84
code refactor
lm1nrt Sep 18, 2024
17d3836
added readme for script
lm1nrt Sep 20, 2024
1 change: 1 addition & 0 deletions scala/scalafmt/toolchain/setup_scalafmt_toolchain.bzl
@@ -10,6 +10,7 @@ _SCALAFMT_DEPS = [
"@org_scalameta_scalafmt_core",
"@org_scalameta_scalameta",
"@org_scalameta_trees",
"@org_scala_lang_modules_scala_collection_compat",
]

def setup_scalafmt_toolchain(
61 changes: 61 additions & 0 deletions scripts/README.md
@@ -0,0 +1,61 @@
# Update/create scala_x_x.bzl repository file script

- [About](#about)
- [Usage](#usage)
- [Examples](#examples)
- [Debugging](#debugging)
- [Requirements](#requirements)

### About
The script updates a given `scala_x_x.bzl` repository file and its content (artifacts, SHAs, dependencies) based on the value of the `root_scala_versions` variable.
It can also be used to create a file for a Scala version that does not exist yet.
It uses [Coursier](https://get-coursier.io/docs/) to **resolve** the transitive dependencies of the root artifacts and to **fetch** their JARs.
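
Under the hood this boils down to two Coursier invocations; a minimal sketch of the same `subprocess` calls used in `create_repository.py` (the root artifact below is only an example):
```
import subprocess

root_artifacts = ["org.scalatest:scalatest_3:3.2.9"]  # example root artifact

# Resolve the full transitive dependency list of the root artifacts.
resolved = subprocess.run(
    f'cs resolve {" ".join(root_artifacts)}',
    capture_output=True, text=True, shell=True,
).stdout.splitlines()

# Fetch the JARs of everything resolved and dump the dependency graph to out.json.
subprocess.call(f'cs fetch {" ".join(resolved)} --json-output-file out.json', shell=True)
```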

### Usage
Usage from `/rules_scala/scripts`:
````
python3 create_repository.py
````

### Examples
Current value of `root_scala_versions`:
```
root_scala_versions = ["2.11.12", "2.12.19", "2.13.14", "3.1.3", "3.2.2", "3.3.3", "3.4.3", "3.5.0"]
```

To **update** the content of the `scala_3_4.bzl` file:
```
root_scala_versions = ["2.11.12", "2.12.19", "2.13.14", "3.1.3", "3.2.2", "3.3.3", "3.4.4", "3.5.0"]
^^^^^^^ <- updated version
```

To **create** a new `scala_3_6.bzl` file:
```
root_scala_versions = ["2.11.12", "2.12.19", "2.13.14", "3.1.3", "3.2.2", "3.3.3", "3.4.3", "3.5.0", "3.6.0"]
^^^^^^^ <- new version
```
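
Running the script then rewrites (or creates) the matching file under `third_party/repositories/`. A rough sketch of the generated shape (the label, artifact, sha256 and deps below are illustrative):
```
scala_version = "3.6.0"

artifacts = {
    "io_bazel_rules_scala_scala_library": {
        "artifact": "org.scala-lang:scala3-library_3:3.6.0",
        "sha256": "<sha256 of the downloaded JAR>",
        "deps": [
            "@io_bazel_rules_scala_scala_library_2",
        ],
    },
}
```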

### Debugging
A given dependency version may not support the chosen Scala version, e.g.:
```
kind_projector_version = "0.13.2" if scala_major < "2.13" else "0.13.3"
```

Because of this, there may be situations where the script won't work. To debug such a problem, adjust the values of these hard-coded variables:
```
scala_test_major = "3" if scala_major >= "3.0" else scala_major
scala_fmt_major = "2.13" if scala_major >= "3.0" else scala_major
kind_projector_version = "0.13.2" if scala_major < "2.13" else "0.13.3"
f"org.scalameta:scalafmt-core_{scala_fmt_major}:{"2.7.5" if scala_major == "2.11" else scala_fmt_version}"
```
It can also help to print the output of these two subprocesses:

```
output = subprocess.run(f'cs fetch {artifact}', capture_output=True, text=True, shell=True).stdout.splitlines()
```

```
command = f'cs resolve {" ".join(root_artifacts)}'
output = subprocess.run(command, capture_output=True, text=True, shell=True).stdout.splitlines()
```

### Requirements
[Coursier](https://get-coursier.io/) and [Python 3](https://www.python.org/downloads/) must be installed.
198 changes: 198 additions & 0 deletions scripts/create_repository.py
@@ -0,0 +1,198 @@
import subprocess
from dataclasses import dataclass
from pathlib import Path
from typing import List
from typing import Dict
import urllib.request
import urllib.error
import time
import hashlib
import json
import ast
import copy
import glob
import os

root_scala_versions = ["2.11.12", "2.12.19", "2.13.14", "3.1.3", "3.2.2", "3.3.3", "3.4.3", "3.5.0"]
scala_test_version = "3.2.9"
scala_fmt_version = "3.0.0"

@dataclass
class MavenCoordinates:
    group: str
    artifact: str
    version: str
    coordinate: str

@dataclass
class ResolvedArtifact:
    coordinates: MavenCoordinates
    checksum: str
    direct_dependencies: List[MavenCoordinates]

def select_root_artifacts(scala_version) -> List[str]:
    scala_major = ".".join(scala_version.split(".")[:2])
    scala_test_major = "3" if scala_major >= "3.0" else scala_major
    scala_fmt_major = "2.13" if scala_major >= "3.0" else scala_major
    kind_projector_version = "0.13.2" if scala_major < "2.13" else "0.13.3"
lm1nrt (author): Some versions are not compatible with, or don't exist for, certain Scala versions.


    common_root_artifacts = [
        f"org.scalatest:scalatest_{scala_test_major}:{scala_test_version}",
        f"org.scalameta:scalafmt-core_{scala_fmt_major}:{'2.7.5' if scala_major == '2.11' else scala_fmt_version}"
    ]

    scala_artifacts = [
        f'org.scala-lang:scala3-library_3:{scala_version}',
        f'org.scala-lang:scala3-compiler_3:{scala_version}',
        f'org.scala-lang:scala3-interfaces:{scala_version}',
        f'org.scala-lang:tasty-core_3:{scala_version}'
    ] if scala_major[0] == "3" else [
        f'org.scala-lang:scala-library:{scala_version}',
        f'org.scala-lang:scala-compiler:{scala_version}',
        f'org.scala-lang:scala-reflect:{scala_version}',
        f'org.scalameta:semanticdb-scalac_{scala_version}:4.9.9',
        f'org.typelevel:kind-projector_{scala_version}:{kind_projector_version}'
    ]

    return common_root_artifacts + scala_artifacts

def get_maven_coordinates(artifact) -> MavenCoordinates:
    splitted = artifact.split(':')
    version = splitted[2] if splitted[2][0].isnumeric() else splitted[3]
lm1nrt (author), Sep 20, 2024: The position of the version in Maven coordinates isn't fixed, so in some cases the version sits one index further (index 3 instead of 2) after splitting on ':'.
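
For illustration (the second coordinate is a hypothetical one with an extra qualifier before the version):
```
get_maven_coordinates("org.scala-lang:scala-library:2.13.14").version
# -> "2.13.14" (index 2 starts with a digit)
get_maven_coordinates("org.scala-lang:scala-library:jar:2.13.14").version
# -> "2.13.14" (index 2 is "jar", so the version is taken from index 3)
```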

    return MavenCoordinates(splitted[0], splitted[1], version, artifact)

def get_mavens_coordinates_from_json(artifacts) -> List[MavenCoordinates]:
    return list(map(lambda artifact: get_maven_coordinates(artifact), artifacts))

def get_artifact_checksum(artifact) -> str:
    output = subprocess.run(f'cs fetch {artifact}', capture_output=True, text=True, shell=True).stdout.splitlines()
    possible_url = [o for o in output if "https" in o][0]
    possible_url = possible_url[possible_url.find("https"):].replace('https/', 'https://')
    try:
        with urllib.request.urlopen(possible_url) as value:
            body = value.read()
            return hashlib.sha256(body).hexdigest()
    except urllib.error.HTTPError as e:
        print(f'RESOURCES NOT FOUND: {possible_url}')

def get_json_dependencies(artifact) -> List[MavenCoordinates]:
    with open('out.json') as file:
        data = json.load(file)
        return get_mavens_coordinates_from_json(dependency["directDependencies"]) if any((dependency := d)["coord"] == artifact for d in data["dependencies"]) else []

def get_label(coordinate) -> str:
lm1nrt (author): There are five label patterns, related to the structure of certain groups and artifacts and their custom naming.
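
A rough sketch of the mapping (the versions are only illustrative; `get_label` ignores them):
```
get_label(MavenCoordinates("org.scalatest", "scalatest_3", "3.2.9", "org.scalatest:scalatest_3:3.2.9"))
# -> "io_bazel_rules_scala_scalatest"
get_label(MavenCoordinates("com.thesamet.scalapb", "scalapb-runtime_2.13", "0.11.17", "com.thesamet.scalapb:scalapb-runtime_2.13:0.11.17"))
# -> "scala_proto_rules_scalapb_runtime"
```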

if ("org.scala-lang" in coordinate.group or "org.scalatest" in coordinate.group or "org.scalactic" in coordinate.group or "com.twitter" in coordinate.group or "javax.annotation" in coordinate.group) and "scala-collection" not in coordinate.artifact and "scalap" not in coordinate.artifact:
return "io_bazel_rules_scala_" + coordinate.artifact.split('_')[0].replace('-', '_')
elif "org.openjdk.jmh" in coordinate.group or "org.ow2.asm" in coordinate.group or "net.sf.jopt-simple" in coordinate.group or "org.apache.commons" in coordinate.group or "junit" in coordinate.group or "org.hamcrest" in coordinate.group or "org.specs2" in coordinate.group:
return "io_bazel_rules_scala_" + coordinate.group.replace('.', '_').replace('-', '_') + '_' + coordinate.artifact.split('_')[0].replace('-', '_')
elif "mustache" in coordinate.group or "guava" in coordinate.group or "scopt" in coordinate.group:
return "io_bazel_rules_scala_" + coordinate.group.split('.')[-1]
elif "com.thesamet.scalapb" in coordinate.group or "io." in coordinate.group or "com.google.guava" in coordinate.group:
return "scala_proto_rules_" + coordinate.artifact.split('_')[0].replace('-', '_')
else:
return (coordinate.group.replace('.', '_').replace('-', '_') + '_' + coordinate.artifact.split('_')[0].replace('-', '_')).replace('_v2', '')

def map_to_resolved_artifacts(output) -> List[ResolvedArtifact]:
    resolved_artifacts = []
    subprocess.call(f'cs fetch {" ".join(output)} --json-output-file out.json', shell=True)
lm1nrt (author): Creates a local file (out.json) to get the directDependencies of the fetched artifacts.
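
For reference, a rough sketch of the only part of `out.json` the script reads (field names as used in the code; the coordinates are illustrative):
```
data = {
    "dependencies": [
        {
            "coord": "org.scalatest:scalatest_3:3.2.9",
            "directDependencies": [
                "org.scalatest:scalatest-core_3:3.2.9",
            ],
        },
    ],
}
```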

    for o in output:
        replaced = o.replace(':default','')
        coordinates = get_maven_coordinates(replaced)
        checksum = get_artifact_checksum(replaced)
        direct_dependencies = get_json_dependencies(replaced)
        resolved_artifacts.append(ResolvedArtifact(coordinates, checksum, direct_dependencies))
    return resolved_artifacts

def resolve_artifacts_with_checksums_and_direct_dependencies(root_artifacts) -> List[ResolvedArtifact]:
    command = f'cs resolve {" ".join(root_artifacts)}'
    output = subprocess.run(command, capture_output=True, text=True, shell=True).stdout.splitlines()
    return map_to_resolved_artifacts(output)

def to_rules_scala_compatible_dict(artifacts, version) -> Dict[str, Dict]:
    temp = {}

    for a in artifacts:
        label = get_label(a.coordinates).replace('scala3_', 'scala_').replace('scala_tasty_core', 'scala_scala_tasty_core')
lm1nrt (author): This is related to the custom naming convention of certain labels; simplifying it is TBD.

        deps = [f'@{get_label(dep)}_2' if "scala3-library_3" in a.coordinates.artifact else f'@{get_label(dep)}' for dep in a.direct_dependencies]

        temp[label] = {
            "artifact": f"{a.coordinates.coordinate}",
            "sha256": f"{a.checksum}",
        } if not deps else {
            "artifact": f"{a.coordinates.coordinate}",
            "sha256": f"{a.checksum}",
            "deps:": deps,
        }

    return temp

def is_that_trailing_coma(content, char, indice) -> bool:
    return content[indice] == char and content[indice+1] != ',' and content[indice+1] != ':' and content[indice+1] != '@' and not content[indice+1].isalnum()

def get_with_trailing_commas(content) -> str:
lm1nrt (author): After mapping to the dict there are no trailing commas, so this is a workaround to make the file follow the .bzl/Starlark convention and avoid CI-check failures.
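
A minimal illustration of the transformation (the trailing newline matters, since the loop never looks at the last character):
```
get_with_trailing_commas('[\n    "@dep_a"\n]\n')
# -> '[\n    "@dep_a",\n],\n'
```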

    copied = copy.deepcopy(content)
    content_length = len(copied)
    i = 0

    while i < content_length - 1:
        if is_that_trailing_coma(copied, '"', i):
            copied = copied[:i] + '",' + copied[i + 1:]
            content_length = content_length + 1
            i = i+2
        elif is_that_trailing_coma(copied, ']', i):
            copied = copied[:i] + '],' + copied[i + 1:]
            content_length = content_length + 1
            i = i+2
        elif is_that_trailing_coma(copied, '}', i):
            copied = copied[:i] + '},' + copied[i + 1:]
            content_length = content_length + 1
            i = i+2
        else:
            i = i+1

    return copied

def write_to_file(artifact_dict, version, file):
    with file.open('w') as data:
        data.write(f'scala_version = "{version}"\n')
        data.write('\nartifacts = ')
        data.write(f'{get_with_trailing_commas(json.dumps(artifact_dict, indent=4).replace("true", "True").replace("false", "False"))}\n')

def create_file(version):
    path = os.getcwd().replace('/scripts', '/third_party/repositories')
    file = Path(f'{path}/scala_{"_".join(version.split(".")[:2])}.bzl')

    if not file.exists():
lm1nrt (author), Sep 20, 2024: If the file doesn't exist, the script copies the content of the last file in the directory (default sort order) and updates it. So it won't work when creating e.g. scala_2_10.bzl, since in that case the heuristic would copy the content of scala_3_5.bzl.
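
A quick illustration of the 'last file, default sort' heuristic (a plain lexicographic sort of the file names):
```
sorted(["scala_2_12.bzl", "scala_2_13.bzl", "scala_3_5.bzl"])[-1]
# -> 'scala_3_5.bzl' (also what would be copied when creating scala_2_10.bzl)
```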

        file_to_copy = Path(sorted(glob.glob(f'{path}/*.bzl'))[-1])
        with file.open('w+') as data, file_to_copy.open('r') as data_to_copy:
            for line in data_to_copy:
                data.write(line)

    with file.open('r+') as data:
        excluded_artifacts = ["org.scala-lang.modules:scala-parser-combinators_2.11:1.0.4"]
lm1nrt (author): Some artifact versions cause CI-check failures and exceptions in the logs after an update, so they are excluded.

        root_artifacts = select_root_artifacts(version)
        read_data = data.read()
        replaced_data = read_data[read_data.find('{'):]

        original_artifact_dict = ast.literal_eval(replaced_data)
        labels = original_artifact_dict.keys()

        transitive_artifacts: List[ResolvedArtifact] = resolve_artifacts_with_checksums_and_direct_dependencies(root_artifacts)
        generated_artifact_dict = to_rules_scala_compatible_dict(transitive_artifacts, version)
        generated_labels = generated_artifact_dict.keys()

        for label in labels:
            if label in generated_labels and generated_artifact_dict[label]["artifact"] not in excluded_artifacts:
                artifact = generated_artifact_dict[label]["artifact"]
                sha = generated_artifact_dict[label]["sha256"]
                deps = generated_artifact_dict[label]["deps:"] if "deps:" in generated_artifact_dict[label] else []
                original_artifact_dict[label]["artifact"] = artifact
                original_artifact_dict[label]["sha256"] = sha
                if deps:
                    dependencies = [d for d in deps if d[1:] in labels and "runtime" not in d and "runtime" not in artifact]
lm1nrt (author): Runtime dependencies are not modified, to avoid CI-check failures.

                    if dependencies:
                        original_artifact_dict[label]["deps"] = dependencies

        write_to_file(original_artifact_dict, version, file)

for version in root_scala_versions:
    create_file(version)