Merge pull request #39 from glevita-uc/v2.0
V2.0
glen-uc authored Oct 28, 2022
2 parents a135a79 + b0bf96a commit c2ed605
Showing 23 changed files with 9,257 additions and 2,349 deletions.
141 changes: 92 additions & 49 deletions Readme.md
@@ -1,4 +1,4 @@
# cic-beautify-state-codes
# cic-beautify-state-codes-framework

Welcome to the Code Improvement Commission

@@ -8,15 +8,14 @@ This repository contains software that transforms official codes from ugly .rtf

Currently this code supports following states:

1. ###Georgia (GA):
1. ###Alaska (AK):

**Code repo:** https://github.com/UniCourt/cic-code-ga
**Code repo:** https://github.com/UniCourt/cic-code-ak

**Code pages:** https://unicourt.github.io/cic-code-ga
**Code pages:** https://unicourt.github.io/cic-code-ak

**Original RTF:** https://archive.org/download/gov.ga.ocga.2018
**Original RTF:** https://archive.org/download/gov.ak.code



2. ###Arkansas (AR):

@@ -27,67 +26,97 @@ Currently this code supports following states:
**Original RTF:** https://archive.org/download/gov.ar.code


3. ###Mississippi (MS):
3. ###Colorado (CO):

**Code repo:** https://github.com/UniCourt/cic-code-ms
**Code repo:** https://github.com/UniCourt/cic-code-co

**Code pages:** https://unicourt.github.io/cic-code-ms
**Code pages:** https://unicourt.github.io/cic-code-co

**Original RTF:** https://archive.org/download/gov.ms.code.ann.2018
**Original RTF:** https://archive.org/download/gov.co.crs.bulk


4. ###Georgia (GA):

**Code repo:** https://github.com/UniCourt/cic-code-ga

**Code pages:** https://unicourt.github.io/cic-code-ga

**Original RTF:** https://archive.org/download/gov.ga.ocga.2018


4. ###Tennessee (TN):
5. ###Idaho (ID):

**Code repo:** https://github.com/UniCourt/cic-code-tn
**Code repo:** https://github.com/UniCourt/cic-code-id

**Code pages:** https://unicourt.github.io/cic-code-tn
**Code pages:** https://unicourt.github.io/cic-code-id

**Original RTF:** https://archive.org/details/gov.tn.tca
**Original files can be found here:** https://archive.org/details/govlaw?and%5B%5D=subject%3A%22idaho.gov%22+AND+subject%3A%222020+Code%22&sin=&sort=titleSorter

5. ###Kentucky (KY):

6. ###Kentucky (KY):

**Code repo:** https://github.com/UniCourt/cic-code-ky

**Code pages:** https://unicourt.github.io/cic-code-ky

**Original RTF:** https://archive.org/details/gov.ky.code

6. ###Colorado (CO):

7. ###Mississippi (MS):

**Code repo:** https://github.com/UniCourt/cic-code-co
**Code repo:** https://github.com/UniCourt/cic-code-ms

**Code pages:** https://unicourt.github.io/cic-code-co
**Code pages:** https://unicourt.github.io/cic-code-ms

**Original RTF:** https://archive.org/download/gov.co.crs.bulk
**Original RTF:** https://archive.org/download/gov.ms.code.ann.2018


7. ###Idaho (ID):
8. ###North Carolina (NC):

**Code repo:** https://github.com/UniCourt/cic-code-id
**Code repo:** https://github.com/UniCourt/cic-code-nc

**Code pages:** https://unicourt.github.io/cic-code-id
**Code pages:** https://unicourt.github.io/cic-code-nc

**Original files can be found here:** https://archive.org/details/govlaw?and%5B%5D=subject%3A%22idaho.gov%22+AND+subject%3A%222020+Code%22&sin=&sort=titleSorter
**Original RTF:** https://archive.org/download/gov.nc.code


8. ###Virginia (VA):
9. ###North Dakota (ND):

**Code repo:** https://github.com/UniCourt/cic-code-va

**Code pages:** https://unicourt.github.io/cic-code-va

**Original RTF:** https://archive.org/download/gov.va.code/
**Code repo:** https://github.com/UniCourt/cic-code-nd
**Code pages:** https://unicourt.github.io/cic-code-nd
**Original RTF:** https://archive.org/details/gov.nd.code


9. ###Vermont (VT):
10. ###Tennessee (TN):

**Code repo:** https://github.com/UniCourt/cic-code-vt
**Code repo:** https://github.com/UniCourt/cic-code-tn

**Code pages:** https://unicourt.github.io/cic-code-tn

**Original RTF:** https://archive.org/details/gov.tn.tca


11. ###Vermont (VT):

**Code pages:** https://unicourt.github.io/cic-code-vt
**Code repo:** https://github.com/UniCourt/cic-code-vt

**Code pages:** https://unicourt.github.io/cic-code-vt

**Original RTF:** https://archive.org/download/gov.vt.code


12. ###Virginia (VA):

**Original RTF:** https://archive.org/download/gov.vt.code
**Code repo:** https://github.com/UniCourt/cic-code-va

**Code pages:** https://unicourt.github.io/cic-code-va

**Original RTF:** https://archive.org/download/gov.va.code/

10. ###Wyoming (WY):

13. ###Wyoming (WY):

**Code repo:** https://github.com/UniCourt/cic-code-wy

@@ -98,8 +127,10 @@ Currently this code supports following states:

In subsequent months, we intend to add two more features:

1. Extend the code to handle the official codes Colorado and Idaho.
2. Add a "redline" capability to show diffs.
1. Extend the code to handle the official codes Rhode Island and other states.
2. Add a "redline" capability to show diffs.
3. Adding citation to external links.


**REQUIREMENTS AND INSTALLATION**

@@ -127,10 +158,9 @@ In subsequent months, we intend to add two more features:
│ │ file012.py
|
└───transforms
│ └───ga
│ └───ocga
│ └───raw
│ title_01.html
│ └───co
│ └───occo
│ └───title_01.html


5. Python3.8 should be installed in development environment to run this project
@@ -139,12 +169,25 @@ In subsequent months, we intend to add two more features:

**Usage:** python html_parser/html_parse_runner.py

[--state_key (GA)]
[--release_label (Release-75)]
[--release_date (DD-MM-YYYY)]
[--input_file_name (gov.ga.ocga.title.01.html) This is an optional argument,
if this argument is not passed all the files for provided release label will be parsed]
[--state_key (CO)]

[--path This argument can be in three different types,
To run single file : (/co/occo/r80/gov.co.code.title.01.html)
To run all files from particular release : (/co/occo/r80/)
To run all the release of particular state : (/co/occo/) ]

[--run_after_release (83) This is an optional argument,this helps to run all releases after the mentioned release]
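
The three shapes of `--path` described above can be told apart from the path alone. The following is a minimal sketch, not the framework's own code: `classify_path` is a hypothetical helper, and it assumes release directories are always named `r<number>` and input files always end in `.html`, as in the README examples.

```python
import re
from pathlib import PurePosixPath


def classify_path(path: str) -> str:
    """Return 'file', 'release' or 'state' for a --path value (hypothetical helper)."""
    # PurePosixPath drops a trailing slash; the leading "/" root is filtered out.
    parts = [p for p in PurePosixPath(path).parts if p != "/"]
    if parts and parts[-1].endswith(".html"):
        return "file"      # e.g. /co/occo/r80/gov.co.code.title.01.html
    if parts and re.fullmatch(r"r\d+", parts[-1]):
        return "release"   # e.g. /co/occo/r80/
    return "state"         # e.g. /co/occo/


print(classify_path("/co/occo/r80/"))
```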


**Additional required files:**

Release_dates.txt :
This is a file where all states release dates are stored in the format <state_key>_r<release_number>< ><release_date>
eg: [CO_r71 2020.08.01]
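
A line in this format could be read as sketched below. This is an illustration under stated assumptions, not the framework's parser: `parse_release_dates` is a hypothetical name, the square brackets in the example are assumed not to be part of the file, and the date is assumed to be `YYYY.MM.DD` (the single example `2020.08.01` does not rule out another ordering).

```python
import re
from datetime import date, datetime

# <state_key>_r<release_number>< ><release_date>, e.g. "CO_r71 2020.08.01"
LINE_RE = re.compile(
    r"^(?P<state>[A-Z]{2})_r(?P<release>\d+)\s+(?P<date>\d{4}\.\d{2}\.\d{2})$"
)


def parse_release_dates(text: str) -> dict:
    """Map (state_key, release_number) to a date (hypothetical helper)."""
    releases = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        match = LINE_RE.match(line)
        if match is None:
            raise ValueError(f"unrecognised release line: {line!r}")
        key = (match["state"], int(match["release"]))
        # Assumption: the date is year.month.day.
        releases[key] = datetime.strptime(match["date"], "%Y.%m.%d").date()
    return releases


print(parse_release_dates("CO_r71 2020.08.01\n"))
```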

**Implementation of Child class:**

Child class name format : <state_key>_html_parser eg:co_html_parser.
Mandatory functions in child :
pre_process :
convert_paragraph_to_alphabetical_ol_tags
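
One way to picture the "mandatory functions" rule is with an abstract base class. The sketch below is only an illustration: `BaseHtmlParser` is a hypothetical stand-in for the framework's parent class, whose real name and method signatures are not shown in this README, and the method bodies are placeholders.

```python
from abc import ABC, abstractmethod


class BaseHtmlParser(ABC):
    """Hypothetical stand-in for the framework's parent parser class."""

    @abstractmethod
    def pre_process(self):
        """State-specific setup performed before parsing."""

    @abstractmethod
    def convert_paragraph_to_alphabetical_ol_tags(self):
        """State-specific conversion of paragraphs into ordered lists."""


class co_html_parser(BaseHtmlParser):
    """Child class named <state_key>_html_parser, as the README requires."""

    def pre_process(self):
        # Placeholder: the real Colorado logic is not part of this sketch.
        return "pre_process called"

    def convert_paragraph_to_alphabetical_ol_tags(self):
        # Placeholder: the real list-building logic is not part of this sketch.
        return "convert called"


parser = co_html_parser()
print(parser.pre_process())
```

With this arrangement, a child class that omits either method cannot be instantiated, so a missing mandatory function surfaces as a `TypeError` at start-up rather than partway through a run.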
15 changes: 10 additions & 5 deletions html_parser/ak_html_parser.py
@@ -819,9 +819,9 @@ def clean_html_and_add_cite(self):
title_id = id_reg.group("title").strip().zfill(2)

if os.path.isfile(
f"../../code-ak/transforms/ak/ocak/r{self.release_number}/gov.ak.code.title.{title_id}.html"):
f"/home/mis/PycharmProjects/cic-code-ak-1/transforms/ak/ocak/r{self.release_number}/gov.ak.code.title.{title_id}.html"):
with open(
f"../../code-ak/transforms/ak/ocak/r{self.release_number}/gov.ak.code.title.{title_id}.html",
f"/home/mis/PycharmProjects/cic-code-ak-1/transforms/ak/ocak/r{self.release_number}/gov.ak.code.title.{title_id}.html",
'r') as firstfile:

for line in firstfile:
@@ -843,7 +843,7 @@ def clean_html_and_add_cite(self):

tag.clear()
text = re.sub(fr'\s{re.escape(match)}',
f' <cite class="ocnd"><a href="{a_id}" target="{target}">{match}</a></cite>',
f' <cite class="ocak"><a href="{a_id}" target="{target}">{match}</a></cite>',
inside_text,
re.I)
tag.append(text)
@@ -863,7 +863,7 @@ def clean_html_and_add_cite(self):

tag.clear()
text = re.sub(fr'\s{re.escape(match)}',
f' <cite class="ocnd"><a href="{a_id}" target="{target}">{match}</a></cite>',
f' <cite class="ocak"><a href="{a_id}" target="{target}">{match}</a></cite>',
inside_text,
re.I)
tag.append(text)
@@ -937,8 +937,13 @@ def write_soup_to_file(self):
soup_str = re.sub(rf'{tag}', rf'{cleansed_tag}', soup_str, re.I)

print("validating")
with open(f"../../code-ak/transforms/ak/ocak/r{self.release_number}/{self.html_file_name}", "w") as file:
with open(f"../../cic-code-ak-1/transforms/ak/ocak/r{self.release_number}/{self.html_file_name}", "w") as file:
# soup_str = re.sub(r'&(?!amp;)', '&amp;', soup_str)
# file.write(soup_str)

soup_str = re.sub(r'&(?!amp;)', '&amp;', soup_str)
soup_str = re.sub('<br/>', '<br />', soup_str)
soup_str = re.sub(r'<span class.*?>\s*</span>', '', soup_str)
file.write(soup_str)

def create_Notes_to_Decisions_ul_con(self):
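
Review note on the `re.sub(..., re.I)` calls in the hunks above: the fourth positional parameter of `re.sub` is `count`, not `flags`, so `re.I` (integer value 2) is read as "replace at most two matches" and the match stays case-sensitive. The standalone sketch below shows the difference on made-up sample text; it is not taken from the parser.

```python
import re

sample = "Sec. 1, SEC. 2, sec. 3"  # made-up text for illustration

# re.I passed positionally lands in `count`: case-sensitive, at most 2 replacements.
as_count = re.sub(r"sec\.", "AS", sample, re.I)

# Passing the flag by keyword gives the case-insensitive behaviour.
as_flag = re.sub(r"sec\.", "AS", sample, flags=re.I)

print(as_count)
print(as_flag)
```

If case-insensitive matching is what these calls intend, writing `flags=re.I` would make that explicit.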
9 changes: 0 additions & 9 deletions html_parser/ar_html_parser.py
@@ -144,13 +144,7 @@ def replace_tags(self):
for key, value in tag_dict.items():
ul = self.soup.new_tag("ul", Class="leaders")
while True:



p_tag = self.soup.find('p', {"class": key})



if not p_tag or p_tag.has_attr('Class') and p_tag['Class'] == 'transformation':
break
p_tag.name = value
@@ -1406,9 +1400,6 @@ def create_case_notes_nav_tag(self):
header['id'] = header_id.strip('#')
nav_tag.append(new_ul)
case_notes_nav.replace_with(nav_tag)



print('created analysis tag')

