Replace rtf conversion script with actual PDFs #135
Comments
PDFs are not a substitute for the RTF conversion, because the HTML is a
much better web experience.
On Mon, Mar 19, 2018 at 2:23 PM, Regina Compton wrote:
The rtf conversion script for NYC sometimes requires longer than 15
minutes to complete (which delays NYC data imports).
Let's replace the RTF --> HTML with the actual PDFs. It should be possible
via this PR
<opencivicdata/python-legistar-scraper#64>.
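Can you say more about the "better web experience"? For example, how does this NYC bill <https://laws.council.nyc.gov/legislation/int-241-2018/> (HTML) compare with this Chicago bill <https://chicago.councilmatic.org/legislation/o-2018-2260/> (PDF)?
PDF cons: With the PDF, you need to scroll if the bill has multiple pages; the PDF also looks rather small in mobile view.
HTML cons: With the HTML, we lose detail in the original bill document, which can make it difficult to read in the mobile view (see example above).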
- It's much better for SEO.
- It's much better for web accessibility.
- It's much better for finding things on a page (the normal browser search
just works).
What details do we lose in the original bill document for NYC?
On Mon, Mar 19, 2018 at 10:19 PM, Regina Compton wrote:
Can you say more about the "better web experience"?
For example, how does this NYC bill
<https://laws.council.nyc.gov/legislation/int-241-2018/> (HTML) compare
with this Chicago bill
<https://chicago.councilmatic.org/legislation/o-2018-2260/> (PDF)?
*PDF cons*
With the PDF, you need to scroll, if the bill has multiple pages; the PDF
also looks rather small in mobile view.
*HTML cons*
With the HTML, we lose detail in the original bill document, which can
make it difficult to read in the mobile view (see example above).
Those are important points. As for details, we mainly lose header and footer information, so nothing crucial. In that sense, it's an aesthetic issue. However, I still think that for longer bills with several indents we sacrifice readability (particularly in the mobile view). I might be projecting my subjective experience, though. If we decide to maintain the RTF converter, then we must remember that it's imperfect: we should render PDFs as a "back-up" when a bill does not have HTML. But would such inconsistency look strange to users?
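The "back-up" idea above could be sketched roughly as follows. This is only an illustration under assumptions: the `Bill` class, its `html_text` and `pdf_url` fields, and the `choose_display` helper are hypothetical names, not the actual django-councilmatic model or API.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Bill:
    # Hypothetical fields for illustration; not the real django-councilmatic model.
    identifier: str
    html_text: Optional[str] = None  # output of the RTF converter, if it produced any
    pdf_url: Optional[str] = None    # link to the original PDF, if one was scraped


def choose_display(bill: Bill) -> Tuple[str, Optional[str]]:
    """Prefer the converted HTML; fall back to the PDF; otherwise show nothing."""
    if bill.html_text and bill.html_text.strip():
        return ("html", bill.html_text)
    if bill.pdf_url:
        return ("pdf", bill.pdf_url)
    return ("none", None)
```

A template could then branch on the first element of the returned tuple, so that bills without HTML still show the original document.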
We could render the footer and header.
I was able to speed up the RTF conversion script via datamade/django-councilmatic#230 (per issue #155). We should still consider scraping the PDF links, but this seems like an enhancement to the current system. I will mark it as such.