WIP Prod for legacy #3384

Closed
wants to merge 190 commits into from

Conversation

kaplun
Member

@kaplun kaplun commented Jul 21, 2015

This is just a mega branch with every patch done on INSPIRE that applied without conflict on top of the Invenio legacy branch. Maybe it could be used as a reference for pushing some quick fixes upstream.
(cc @egabancho)

jmartinm and others added 30 commits July 21, 2015 16:41
Co-authored-by: Alessio Deiana <[email protected]>
Signed-off-by: Alessio Deiana <[email protected]>
* Improves the standard Journal tokenizer to keep page-end
  when indexing, as well as adding new journal,volume pairs.

* Amends the search_engine to specially handle searches in
  the journal field with page ranges to make use of the new
  tokenized strings. This fixes issues seen in BibRank and
  other search engine clients that pass a valid full page
  range and obtains no result (because it was not indexed!).

Co-authored-by: Alessio Deiana <[email protected]>
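
The tokenizer change described above can be pictured with a minimal sketch. It assumes publication info in the `journal,volume,pages` form; the function name `tokenize_journal` and the exact set of tokens are illustrative assumptions, not the code from the patch.

```python
def tokenize_journal(pubinfo):
    """Return index tokens for a 'journal,volume,pages' string.

    Hypothetical helper: the real tokenizer in the patch may emit
    a different set of tokens.
    """
    parts = [part.strip() for part in pubinfo.split(",")]
    if len(parts) != 3:
        # Unknown layout: index the string unchanged.
        return [pubinfo]
    journal, volume, pages = parts
    page_start = pages.split("-", 1)[0]
    tokens = [
        "%s,%s" % (journal, volume),                 # journal,volume pair
        "%s,%s,%s" % (journal, volume, page_start),  # without page-end
    ]
    if page_start != pages:
        # Keep the full page range so that queries with a page-end match.
        tokens.append("%s,%s,%s" % (journal, volume, pages))
    return tokens
```

With this sketch a record indexed as `Phys.Rev.,D50,1140-1150` can be found both with and without the page-end part.
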
* For certain journal search queries with INSPIRE style page
  ranges the user is now presented with an alternative
  result without the page-end part of the query, should the
  original query return no results.

Co-authored-by: Alessio Deiana <[email protected]>
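
The fallback described above might look roughly like the following. This is only a sketch: `search_with_fallback`, the `search` callable and the regular expression are assumptions made for illustration, not the search_engine code.

```python
import re

# Matches '<anything>,<page-start>-<page-end>' and captures everything
# up to and including the page-start.
PAGE_RANGE = re.compile(r"^(?P<head>.+,\d+)-\d+$")


def search_with_fallback(query, search):
    """Run `search(query)`; on no hits, retry without the page-end.

    Returns (hits, alternative_query). alternative_query is None when
    the original query was used as-is.
    """
    hits = search(query)
    if hits:
        return hits, None
    match = PAGE_RANGE.match(query)
    if match is None:
        # Not a page-range query: nothing else to try.
        return hits, None
    alternative = match.group("head")
    return search(alternative), alternative
```

Returning the alternative query alongside the hits lets the caller tell the user that the result comes from a modified query.
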
* Session is now correctly initialized before use.
* Now correctly handles repeatable affiliations.
* Optimised ticketing system sync, significantly reduces server polling load.
* Introduces a new parameter in the datacacher that defaults to 30s.
  This is the minimum time the data is cached. That means that calling
  recreate_cache_if_needed() does nothing if it was called less than 30s ago.
* When bibrank -w citation fails because we lost too many citations,
  displays in the log for each record how many citations they lost.
* Clicking on a name on a paper now correctly leads to the author page
  instead of a person search page. Solves some minor ambiguity problems.
* BibTasks are typically loading large quantities
  of data from the database. This is not easily
  cacheable, therefore query_cache_type is locally
  disabled.

Reported-by: Thorsten Schwander <[email protected]>
Signed-off-by: Samuele Kaplun <[email protected]>
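
The minimum caching time introduced for the datacacher (the 30s parameter mentioned above) can be sketched as follows. The class name `MinIntervalCache`, its constructor arguments and the simplified reload logic are assumptions; only `recreate_cache_if_needed()` and the 30s default come from the commit message.

```python
import time


class MinIntervalCache:
    """Cache that ignores refresh requests made too soon after the last one."""

    def __init__(self, loader, min_interval=30.0, clock=time.monotonic):
        self._loader = loader              # callable producing fresh data
        self._min_interval = min_interval  # seconds; 30s as in the commit
        self._clock = clock                # injectable for testing
        self._last_check = None
        self.cache = None

    def recreate_cache_if_needed(self):
        """Reload the data unless we did so less than min_interval ago.

        Returns True when the loader was called, False otherwise.
        """
        now = self._clock()
        if (self._last_check is not None
                and now - self._last_check < self._min_interval):
            return False  # called again too early: do nothing
        self._last_check = now
        self.cache = self._loader()
        return True
```

A real data cacher would presumably also compare timestamps to decide whether the data is stale; the sketch only shows the throttling part.
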
* Silences "request data read error" IOError seen when
  the client closes a connection upon POSTing.

* Clean up exception raising in general.

Signed-off-by: Samuele Kaplun <[email protected]>
* Now /author/manage_profile/ correctly displays an error message
  when visited without the expected argument.

Signed-off-by: Samuele Carli <[email protected]>
Reviewed-by: Samuele Kaplun <[email protected]>
* If the client sends an invalid header containing '/n', sends back
  a 400 message instead of raising an exception.
* Extracts transform links out of the HTMLWasher class to allow
  easy use in other context

Co-authored-by: Jan Aage Lavik <[email protected]>
Co-authored-by: Jan Aage Lavik <[email protected]>
Co-authored-by: Alessio Deiana <[email protected]>
* Adds a try..except around __write and raises an exception in case of GET
  so that we interrupt the processing of request (prevents useless computation)

Signed-off-by: Alessio Deiana <[email protected]>
* Since try...excepts are all over the place, we need extra code to silence
  this exception.
* Gives priority to site/dist-packages when importing
  HTMLParser. In this way, when on Python-2.7 and
  HTMLParser is installed from pypi, it will use the latter.

Signed-off-by: Samuele Kaplun <[email protected]>
Reviewed-by: Javier Martin Montull <[email protected]>
Reviewed-by: Jan Aage Lavik <[email protected]>
* The assertion message was generating an exception due to a bad
  string replacement.

Signed-off-by: Carli Samuele <[email protected]>
* The system crashed if something went wrong and the
  ORCID was not yet assigned to a profile when the import
  was called. Now this is handled gracefully.

Signed-off-by: Carli Samuele <[email protected]>
* Updates the run_shell_command to only escape_shell_args when
  there are actual args to escape. Otherwise errors related to
  Python string formatting may happen.

Signed-off-by: Jan Aage Lavik <[email protected]>
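
To illustrate why escaping must be skipped when there are no arguments, here is a minimal sketch. `build_command` is a hypothetical stand-in and not the real run_shell_command; it assumes the command is a `%s`-style template filled with shell-quoted arguments.

```python
import shlex


def build_command(cmd, args=()):
    """Fill a '%s'-style command template with shell-quoted arguments.

    Hypothetical helper. When no arguments are given, the command is
    returned untouched, so a literal '%' in it (as in 'date +%s') is
    never interpreted by Python string formatting.
    """
    if not args:
        return cmd
    return cmd % tuple(shlex.quote(str(arg)) for arg in args)
```

Without the guard, `"date +%s" % ()` would raise a `TypeError` because the template expects an argument that was never supplied.
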
* Fixes UnboundLocalError when the number of server retries reaches
  its maximum and the code tries to use a variable that
  is not defined.

* Fixes a few PEP8 issues.

Signed-off-by: Jan Aage Lavik <[email protected]>
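
The class of bug fixed above typically appears when a variable is only assigned inside a retry loop. A minimal sketch of the safe pattern, with the hypothetical names `fetch_with_retries` and `fetch`, could be:

```python
def fetch_with_retries(fetch, max_retries=3):
    """Call `fetch` up to max_retries times and return its result.

    `response` is bound before the loop, so it is always defined when
    we inspect it after the loop, even if every attempt failed.
    """
    response = None
    last_error = None
    for _ in range(max_retries):
        try:
            response = fetch()
            break
        except IOError as err:
            last_error = err  # remember why the attempt failed
    if response is None:
        raise RuntimeError(
            "giving up after %d attempts: %s" % (max_retries, last_error))
    return response
```

If `response` were assigned only inside the `try` block, reading it after the last failed attempt would raise `UnboundLocalError` instead of a meaningful error.
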
* When 'login_method' is missing from user preferences, the
  exception is now gracefully raised and execution continues
  using default values.

Co-authored-by: Samuele Kaplun <[email protected]>
* Do not report KeyboardInterrupt exception. (fixes inveniosoftware#1701)

Signed-off-by: Jan Aage Lavik <[email protected]>
 * Reverts some changes made in f5969cf to fix the 'keywords' tab,
   as exceptions were raised for bad use of parameters.

 * Moves the desired additions made in f5969cf into the BibClassify
   module instead of keeping them in search_engine.

Signed-off-by: Jan Aage Lavik <[email protected]>

Conflicts:
	modules/websearch/lib/search_engine.py
* Adds a readable representation of BibFormatObject so that
  it can be used in debugging etc.

Signed-off-by: Jan Aage Lavik <[email protected]>
jmartinm and others added 19 commits July 21, 2015 16:45
* Adds possibility to insert into holdingpen through robotupload.

Signed-off-by: Javier Martin Montull <[email protected]>
Signed-off-by: Samuele Kaplun <[email protected]>
* Improves the referencecount tokenizer by using get_record and
  directly counting the effective number of references rather than the
  number of subfields.

Signed-off-by: Samuele Kaplun <[email protected]>
* In case of DB connection errors, sleeps 30s to account for
  when the database is simply restarted (e.g. after unattended
  upgrades).

Signed-off-by: Samuele Kaplun <[email protected]>
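
The sleep-and-retry behaviour described above can be sketched like this. `DatabaseUnavailable`, `run_with_reconnect` and the single retry are assumptions for illustration; only the 30s pause comes from the commit message.

```python
import time


class DatabaseUnavailable(Exception):
    """Hypothetical error standing in for a DB connection failure."""


def run_with_reconnect(operation, retries=1, delay=30.0, sleep=time.sleep):
    """Run `operation`, pausing `delay` seconds after a connection error.

    The pause gives a restarting database (e.g. after unattended
    upgrades) time to come back before the next attempt.
    """
    for attempt in range(retries + 1):
        try:
            return operation()
        except DatabaseUnavailable:
            if attempt == retries:
                raise  # out of retries: propagate the error
            sleep(delay)
```

Passing `sleep` as a parameter keeps the sketch testable without actually waiting 30 seconds.
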
* Sorry Thorsten... I know what you would say... :-)
  BTW I couldn't say to you goodbye!

Signed-off-by: Samuele Kaplun <[email protected]>
* Improves the authorlist conversion to also ignore lowercase
  "undefined" in values.

Signed-off-by: Jan Aage Lavik <[email protected]>
Signed-off-by: Jan Aage Lavik <[email protected]>
Signed-off-by: Samuele Kaplun <[email protected]>
* Finally a proper fix for the fix of the fix for the emergency fix.
  (in particular "'foo' in bar", where bar is an instance of an
  exception, does not do what one would expect, but simply returns
  False, even if str(bar) actually contains 'foo').

Signed-off-by: Samuele Kaplun <[email protected]>
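
The pitfall mentioned above is easy to avoid by testing against the message text explicitly. The helper name `error_mentions` is an assumption; the sketch is written for Python 3, where applying `in` to an exception instance raises `TypeError` rather than returning False as described for the Python 2 code base.

```python
def error_mentions(exc, needle):
    """Return True if the text of exception `exc` contains `needle`.

    The check is made against str(exc); applying `in` directly to the
    exception object does not look inside the message.
    """
    return needle in str(exc)
```

For example, `error_mentions(ValueError("foo happened"), "foo")` is true, whereas `"foo" in ValueError("foo happened")` is not a substring test at all.
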
* When BibUpload spots that two records are duplicates
  (e.g. because a DOI is being added to record 1 while
  it already existed for record 2), it creates a ticket
  in Asana with sensible links, prefixing the title
  of the link if any of the involved records has, in
  one way or another, a 773__m field for Erratum.

Signed-off-by: Samuele Kaplun <[email protected]>
* Fixes crash when calculating next starting time for a rule, in case
  webcoll has not run, or bibindex has not run for all indexes.

Signed-off-by: Samuele Kaplun <[email protected]>
@kaplun
Member Author

kaplun commented Jul 21, 2015

@egabancho fixing conflicts is quite a painful process. What do you suggest is the best way to proceed? Maybe one day we should simply sit together, sift through all the commits that we have never integrated upstream, and see which ones you are interested in?

@egabancho
Member

I would say, at least, make a PR module by module.
Ideally I would go commit by commit, if they are not closely related (i.e. one can't live without the other), and make PRs for them, like you did already for kaplun@64333fe (I can imagine it is doable for most of them).

Thinking with my CDS hat on, the smaller the granularity of the PR, the easier it is to deploy and test.
I know you can't change the past, but at least for the future I suggest you make a PR to your repo and to inveniosoftware with the same branch; it just takes a few seconds and it will save you a few headaches.

which ones you are interested in?

[Un]fortunately it's not only us who might be interested in INSPIRE contributions but anyone using Invenio 1.x. Potentially all the enhancements/fixes should be welcome, but of course I can give you our opinion about which ones are particularly interesting for CDS.

@kaplun
Member Author

kaplun commented Jul 21, 2015

I did several PRs in the past, but since they were not integrated, they became stale and eventually incompatible. Check e.g. #2365, #2824, #2787, #2254, #2021, #2758, #2841, #2361, (and almost #2526).
Since at INSPIRE we are basically no longer actively working on legacy (and we were based on maint-1.1), it is quite painful for me to go back, resurrect all patches and amend them against the latest legacy, only to find out that no one needs them or that they are simply not integrated... Additionally, unfortunately, many patches are not nicely self-contained and are cross-module.

I know you can't change the past, but at least for the future I suggest you make a PR to your repo and to inveniosoftware with the same branch; it just takes a few seconds and it will save you a few headaches.

Ah that is already the case, and is made easy by the new course of split-up modules.

@jirikuncar jirikuncar added this to the v1.3.0 milestone Jul 30, 2015
@kaplun kaplun closed this Sep 30, 2015

10 participants