Display transform function for full names #805

torfjelde · 2023-10-10T02:25:18Z

Is your feature request related to a problem? Please describe.
I often find myself wanting to search through my bibliographies using the first name. This might just be because I remember the first name but not the last name, or there are too many authors with the same last name.

Describe the solution you'd like
I'm thinking something like (basically just a copy-paste of citar--shorten-names):

;; Similar to `citar--shorten-name' and `citar--shorten-names'
;; but using the full name instead of just the last name.
(defun citar--full-name (name)
  "Return full NAME, i.e. `given family`, from `family, given' string.

Otherwise, return as is."
  (let (split-names (split-string name ", "))
    (if (> (length split-names) 1)
        (concat (cadr split-names) " " (car split-names))
      name)))

(defun citar--full-names (namestr &optional truncate andstr)
  "Return a list of full names from a list of full NAMESTR.

With an integer TRUNCATE, will shorten the list, and ANDSTR will
replace last comma."
  (let* ((namelist (split-string (or namestr "") " and "))
         (namelength (length namelist))
         (tnamelist (seq-take namelist (or truncate namelength)))
         (tnamelength (length tnamelist)))
    (mapconcat
     (lambda (n)
       (let* ((shortname (citar--full-name n))
              (pos (citar--shorten-name-position tnamelist n))
              (suffix
               (cond
                ;; if last name in the list and we're truncating add et al.; otherwise, no suffix
                ((equal pos tnamelength)
                 (if (< tnamelength namelength) " et al." ""))
                ;; if second to last in the list, and ANDSTR, use that
                ((and andstr (equal pos (- tnamelength 1)))
                 (concat " " andstr " "))
                ;; otherwise, use a comma
                (t ", "))))
         (concat shortname suffix)))
     tnamelist "")))

in combination with

(setq citar-display-transform-functions
   `((sn . (citar--shorten-names))
     (etal . (citar--shorten-names 3 "&"))
     (fn . (citar--full-names))))

so I can use this in templates with the transform key fn.

Describe alternatives you've considered
One alternative is to "maintain" the usage of "and" in the author list, but IMO this becomes quite cluttered when searching, and so I prefer the "given1 family1, given2 family2, ..." instead of `given1 family1 and given2 family2 and ...".

The text was updated successfully, but these errors were encountered:

bdarcus · 2023-10-10T10:19:53Z

Given this behavior is already configurable (just create your own function and use that in your templates), is there really any point in changing the default function there?

Or maybe I don't understand what you're asking for.

torfjelde · 2023-10-20T19:20:58Z

Of course, that is also completely reasonable:)

But given that the sn and the etal transform functions are already there, it seems like citar.el thinks it is useful to provide some "default" transform functions, right? And so I think it's just a question of whether the full name transformation function should be considered one of these?

Personally I think the answer is yes (because I use it myself 😅 ) but I completely understand if that opinion is not shared. Nonetheless, I thought it would be useful to open an issue so other people who desired the same could potentially discover a "solution":)

bdarcus · 2023-10-21T12:35:07Z

OK, so you're not asking to change the default, but to add another option?

I don't see any particular problem doing that. Can you submit a PR?

https://github.com/emacs-citar/citar/blob/main/CONTRIBUTING.org

torfjelde · 2023-10-21T12:57:48Z

Ah yes, I probably wasn't too clear on that! Awesome! I'll do that soon-ish then:)

…

On Sat, Oct 21, 2023, 2:35 PM bdarcus ***@***.***> wrote: OK, so you're not asking to change the default, but to add another option? I don't see any particular problem doing that. Can you submit a PR? https://github.com/emacs-citar/citar/blob/main/CONTRIBUTING.org — Reply to this email directly, view it on GitHub <#805 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ACUPZZFE7RU63IQIWA6CQPLYAO6QNAVCNFSM6AAAAAA5ZTRYXOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTONZTG43TOMBYGE> . You are receiving this because you authored the thread.Message ID: ***@***.***>

bdarcus · 2023-10-21T13:03:46Z

One design option worth considering (I'm not myself sure):

Perhaps citar--shorten-names should be generalized, where this becomes a parameter on that?

Whether or not to do that probably comes down to the clarity and maintainability of the code, which is why I'm not sure ATM.

Add new display transform function for full names. Generalize existing `citar--shorten-names` to `citar--render-names` to allow code-sharing between completion rendering of family names and full names. Refs: emacs-citar#805

the authors rather than just surnames. Related: emacs-citar/citar#805

aikrahguzar · 2023-10-28T18:32:39Z

I think a better way to solve this problem would be allow display transform functions to make some part of the string invisible. However, this would mess up the alignment since truncate-string-to-width doesn't take visibility into account. I experimented a bit and it is not too hard to write a version that takes into account invisible portions of the string when fitting it to a given width:

(defun citar-format--truncate-string-to-column (str end-column &optional ellipsis ellipsis-text-property)
  (with-current-buffer (get-buffer-create " *truncate-string*" t)
    (insert str)
    (if (<= (current-column) end-column)
        (insert (make-string (- end-column (current-column)) ?\s))
      (when (not (stringp ellipsis))
        (setq ellipsis (truncate-string-ellipsis)))
      (setq end-column (- end-column (if ellipsis (string-width ellipsis) 0)))
      (let ((col (move-to-column end-column)))
        (unless (eq col end-column)
          (forward-char -1)
          (cl-callf concat ellipsis
            (make-string (- end-column col) ?\s))))
      (if ellipsis-text-property
          (put-text-property (point) (point-max) 'display (or ellipsis ""))
        (delete-region (point) (point-max))
        (insert ellipsis)))
    (delete-and-extract-region (point-min) (point-max))))

To see how performant it is, I evaluted

(defvar citar-format--test (string-split (buffer-substring-no-properties (point-min) (point-max)) "\n" t))

so that citar-format--test has lines of citar-format.el as list elements. Then I benchmarked,

(benchmark-run-compiled 1000
    (dolist (line citar-format--test)
      (truncate-string-to-width line 25 nil ?\s t t)))
;;  (1.477754321 6 0.9875454620000141)

(benchmark-run-compiled 1000
  (dolist (line citar-format--test)
    (citar-format--truncate-string-to-column line 25 t t)))
;; (3.7252026 6 0.9839875159999565)

So citar-format--truncate-string-to-column is about 2.5x slower than truncate-string-to-width. However most of it is buffer switching overhead due to with-current-buffer,

(benchmark-run-compiled 1000
  (with-current-buffer (get-buffer-create " *truncate-string*" t)
    (dolist (line citar-format--test)
      (citar-format--truncate-string-to-column line 25 t t))))
;; (1.647513596 6 0.9943548589999978)

i.e. wrapping each dolist to make the temporary buffer current makes the overhead go down to ~ 15%. This means that if wrap the formatting of all the candidates with this call, the overhead is not too high. However it does vary with the desired width,

| Width |   citar-format--truncate-string-to-column   |  truncate-string-to-width          |
|-------|---------------------------------------------|------------------------------------|
| 5     | (1.968308915 7 1.1616637719999972)          | (1.555034720 7 1.1581623760000070) |
| 10    | (1.988653394 7 1.1613642060000302)          | (1.565650172 7 1.1428396089999637) |
| 25    | (1.809252233 6 0.9900703910000175)          | (1.493039642 6 0.9859764029999951) |
| 50    | (1.501522882 5 0.8258050510000317)          | (1.410216259 5 0.8243931129999851) |
| 75    | (1.116250921 4 0.6578333830000247)          | (1.280658874 4 0.6582537240000192) |
| 100   | (1.086847893 4 0.6582314920000272)          | (1.270738270 4 0.6523561829999949) |

so citar-format--truncate-string-to-column goes from being about 33% slower when every string need to be truncated so about 20% faster when every string need to be padded.

So with that change it is possible to have a midway between full names and last names which is to put propertize first names with invisible property so that they appear in string but can still be searched.

bdarcus · 2023-10-28T18:45:32Z

I think a better way to solve this problem would be allow display transform functions to make some part of the string invisible.

I was wondering about this, though didn't have time to look into it.

@torfjelde - not sure you know this, but we already use a similar strategy for the default names when a long list of authors gets truncated. You can still search those candidates because the truncated part is still there.

Well, probably more precisely: this is true for any field, I believe. E.g. a long title that gets truncated.

It provides a nice balance of allowing the general case to be pretty compact, but still allowing more of the content to be completed against.

roshanshariff · 2023-10-28T18:53:06Z

I've been thinking about a potential change to how Citar generates and formats candidate strings that could help with this. In short, the completion system uses candidate strings only for matching against the user input, so the strings we generate needn't be formatted for display. They can just be concatenated, untruncated text for all the fields, including invisible text like full author names.

Then we actually format the text for display in the affixation function. This would truncate and ellipsize the text in each column, for which the column boundaries would have to be encoded as text properties.

The major advantage is that the affixation function is only called for displayed candidates, not for all possible completions (e.g. Vertico only calls it for the candidates it shows in the minibuffer). This opens the door for more complicated and expensive reformatting: for example, we could use variable-pitch text and then dynamically decide where to truncate and ellipsize it to fit the frame width, like vtable.el does for in-buffer text.

aikrahguzar · 2023-10-28T19:04:03Z

I think a potential downside would be that affixation function would also need to highlight the candidate since losing all highlighting of matches would be too big a loss. It is probably not too hard to overcome: completion-all-completions or completion-try-completion with only one string in table will probably do it.

bdarcus · 2023-10-28T19:05:46Z

I think a potential downside would be that affixation function would also need to highlight the candidate since losing all highlighting of matches would be too big a loss.

If I understand you right, yes, this is the downside of annotation and affixation, and the primary reasons why the current approach.

roshanshariff · 2023-10-28T19:06:28Z

@aikrahguzar I was under the impression that the completion system does highlighting of the candidate string after it has been transformed by the affixation function? As long as the affixation function merely hides or replaces text using text properties, it shouldn't affect highlighting, right?

aikrahguzar · 2023-10-28T19:20:01Z

@aikrahguzar I was under the impression that the completion system does highlighting of the candidate string after it has been transformed by the affixation function? As long as the affixation function merely hides or replaces text using text properties, it shouldn't affect highlighting, right?

Looking at the elisp manual, I don't know at what stage affixation function is called. But I think you are right, and my impression was from remembering annotation-function and not affixation-function.

torfjelde · 2023-10-29T10:00:08Z

not sure you know this, but we already use a similar strategy for the default names when a long list of authors gets truncated. You can still search those candidates because the truncated part is still there.

Gotcha:) This was actually how I initially thought it was implemented and why I sort of "expected" to be able to search full names.

So do we then close this PR in favour of the invisibility-solution? Or do we try to get this merged first, and then invisibility-solution afterwards?

bdarcus · 2023-10-29T11:09:11Z

@torfjelde

So do we then close this PR in favour of the invisibility-solution? Or do we try to get this merged first, and then invisibility-solution afterwards?

Indeed, this is the question.

@roshanshariff @aikrahguzar - thoughts? He's referring to #808.

Edit: it does occur to me contributor names are unique among the fields, in that it might make sense to by default include the full name, or the given name only, in the invisible part of the candidate, but still leave room for people to include it in the visible part?

roshanshariff · 2023-10-29T17:03:10Z

I think it makes sense to make the name-shortening functions work by hiding (rather than deleting) the extra text. This should also work well with highlighting, since the entire text will match but only the visible parts will be highlighted. It'll need @aikrahguzar's modified truncation function to allow that though. It'll still be useful to have both full-name and last-name transformers, because people will have different preferences about what should be shown.

torfjelde added the enhancement New feature or request label Oct 10, 2023

torfjelde mentioned this issue Oct 23, 2023

Generalize rendering of names for completion #807

Closed

torfjelde mentioned this issue Oct 23, 2023

Generalize citar--shorten-names + add display transform function for full names #808

Closed

torfjelde added a commit to torfjelde/dotfiles that referenced this issue Oct 25, 2023

Added functionality for Citar.el so that I can search the full name of

4d141e4

the authors rather than just surnames. Related: emacs-citar/citar#805

bdarcus changed the title ~~Display transform function for full namees~~ Display transform function for full names Feb 23, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Display transform function for full names #805

Display transform function for full names #805

torfjelde commented Oct 10, 2023

bdarcus commented Oct 10, 2023 •

edited

Loading

torfjelde commented Oct 20, 2023

bdarcus commented Oct 21, 2023

torfjelde commented Oct 21, 2023 via email

bdarcus commented Oct 21, 2023

aikrahguzar commented Oct 28, 2023

bdarcus commented Oct 28, 2023

roshanshariff commented Oct 28, 2023 •

edited

Loading

aikrahguzar commented Oct 28, 2023

bdarcus commented Oct 28, 2023 •

edited

Loading

roshanshariff commented Oct 28, 2023

aikrahguzar commented Oct 28, 2023

torfjelde commented Oct 29, 2023

bdarcus commented Oct 29, 2023 •

edited

Loading

roshanshariff commented Oct 29, 2023

Display transform function for full names #805

Display transform function for full names #805

Comments

torfjelde commented Oct 10, 2023

bdarcus commented Oct 10, 2023 • edited Loading

torfjelde commented Oct 20, 2023

bdarcus commented Oct 21, 2023

torfjelde commented Oct 21, 2023 via email

bdarcus commented Oct 21, 2023

aikrahguzar commented Oct 28, 2023

bdarcus commented Oct 28, 2023

roshanshariff commented Oct 28, 2023 • edited Loading

aikrahguzar commented Oct 28, 2023

bdarcus commented Oct 28, 2023 • edited Loading

roshanshariff commented Oct 28, 2023

aikrahguzar commented Oct 28, 2023

torfjelde commented Oct 29, 2023

bdarcus commented Oct 29, 2023 • edited Loading

roshanshariff commented Oct 29, 2023

bdarcus commented Oct 10, 2023 •

edited

Loading

roshanshariff commented Oct 28, 2023 •

edited

Loading

bdarcus commented Oct 28, 2023 •

edited

Loading

bdarcus commented Oct 29, 2023 •

edited

Loading