Implement RefChecker in JabKit

⚠ This is a bigger "first issue". Only take it if you have enough time for it. ⚠

## Context

Fake references are becoming more common. JabRef already has the infrastructure to check references, but the pieces need to be wired together.

The check should cover a whole `.bib` file.

There is a Python script, "RefChecker", that does this, but we want the check integrated into JabRef.

## Related Work

See https://github.com/markrussinovich/refchecker

(LinkedIn-Post: https://www.linkedin.com/posts/markrussinovich_github-markrussinovichrefchecker-a-tool-activity-7355654490076696576-WycH?utm_source=share&utm_medium=member_desktop&rcm=ACoAAACCUVQBYmlu_A9exTiDRiuXB95v-LNYD4c)


## 1. Implement logic

### Goal

For each `BibEntry`, fetch authoritative metadata via its **own identifiers** and compare **local vs fetched** to classify into groups.

### Groups to ensure (create if missing)

```
refcheck
 ├─ real paper
 ├─ unsure
 └─ fake paper
```

Implementation can mirror `org.jabref.gui.groups.GroupTreeViewModel#addSuggestedGroups`.
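
A minimal sketch of the "create if missing" behavior, assuming a simplified tree node. `GroupNodeSketch`, `ensureChild`, and `ensureRefCheckGroups` are hypothetical names for illustration only and are not JabRef's group API; the real implementation would work on JabRef's group tree as `addSuggestedGroups` does.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a group tree node; not a JabRef class. */
final class GroupNodeSketch {
    final String name;
    final List<GroupNodeSketch> children = new ArrayList<>();

    GroupNodeSketch(String name) {
        this.name = name;
    }

    /** Returns the child with the given name, creating it only if it is missing. */
    GroupNodeSketch ensureChild(String childName) {
        for (GroupNodeSketch child : children) {
            if (child.name.equals(childName)) {
                return child;
            }
        }
        GroupNodeSketch created = new GroupNodeSketch(childName);
        children.add(created);
        return created;
    }

    /** Ensures that "refcheck" and its three subgroups exist below the given root. */
    static GroupNodeSketch ensureRefCheckGroups(GroupNodeSketch root) {
        GroupNodeSketch refcheck = root.ensureChild("refcheck");
        for (String subgroup : List.of("real paper", "unsure", "fake paper")) {
            refcheck.ensureChild(subgroup);
        }
        return refcheck;
    }
}
```

Calling `ensureRefCheckGroups` repeatedly on the same root is idempotent in this sketch: the second call finds the existing nodes and adds nothing.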

### Algorithm (per BibEntry)

0. **Convert text to BibEntry**

   * In Preferences > Web Search, a "Default plain citation parser" is configured. This one should be used.
   * Use `org.jabref.logic.importer.plaincitation.SeveralPlainCitationParser` to turn a text into a list of `BibEntry` objects.

1. **Resolve by DOI (preferred)**

   * If `StandardField.DOI` present:
     fetch `authoritativeEntry` via `org.jabref.logic.importer.fetcher.DoiFetcher#performSearchById(doi)`.
   * Else try to find a DOI via `org.jabref.logic.importer.fetcher.CrossRef#findIdentifier(entry)`; if found, fetch via `DoiFetcher` and store as `authoritativeEntry`

2. **Fallback: resolve by arXiv** (`authoritativeEntry` still null)

   * If an arXiv ID is present or found via `org.jabref.logic.importer.fetcher.ArXivFetcher#findIdentifier(entry)`, fetch its metadata and store it in `authoritativeEntry`

3. **Compare: local vs `authoritativeEntry`**

   Use `org.jabref.logic.database.DuplicateCheck#isDuplicate` to determine whether the local entry is a duplicate of `authoritativeEntry`.

   If yes: add to group `real paper`. If not: add to group `fake paper`.

   Return.

---
   
From here on, `authoritativeEntry` is still null: neither the DOI nor the arXiv lookup produced a result.

4. **Search paper using fetcher**

   Look up the paper using `org.jabref.logic.importer.fetcher.CompositeSearchBasedFetcher`.

   If something is found: check whether any result is a duplicate of the local entry. If yes: add to group `real paper`. If not: add to group `fake paper`.
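
The control flow of steps 1 to 4 can be sketched independently of JabRef's classes. In the sketch below, the resolvers (DOI, arXiv), the search, and the duplicate check are passed in as plain functions; `RefCheckFlowSketch`, `Verdict`, and `check` are hypothetical names, and the real implementation would plug in `DoiFetcher`, `ArXivFetcher`, `CompositeSearchBasedFetcher`, and `DuplicateCheck`. The `NOT_FOUND` outcome is an assumption, since the steps above do not say what happens when the search returns nothing.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.BiPredicate;
import java.util.function.Function;

/** Hypothetical sketch of the per-entry flow; E stands in for BibEntry. */
final class RefCheckFlowSketch {
    enum Verdict { REAL_PAPER, FAKE_PAPER, NOT_FOUND }

    static <E> Verdict check(E local,
                             List<Function<E, Optional<E>>> resolvers,
                             Function<E, List<E>> search,
                             BiPredicate<E, E> isDuplicate) {
        // Steps 1 and 2: try each identifier-based resolver in order (DOI first, then arXiv).
        for (Function<E, Optional<E>> resolver : resolvers) {
            Optional<E> authoritative = resolver.apply(local);
            if (authoritative.isPresent()) {
                // Step 3: compare the local entry against the authoritative entry and return.
                return isDuplicate.test(local, authoritative.get())
                        ? Verdict.REAL_PAPER
                        : Verdict.FAKE_PAPER;
            }
        }
        // Step 4: no authoritative entry, so fall back to a search-based lookup.
        List<E> candidates = search.apply(local);
        if (candidates.isEmpty()) {
            // Assumption: the issue does not define this case.
            return Verdict.NOT_FOUND;
        }
        boolean anyDuplicate = candidates.stream()
                                         .anyMatch(candidate -> isDuplicate.test(local, candidate));
        return anyDuplicate ? Verdict.REAL_PAPER : Verdict.FAKE_PAPER;
    }
}
```

Keeping the fetchers behind function parameters also makes step 2 of this issue (tests) easier, because the flow can be exercised without network access.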

---

The current proposal does not make use of the group `unsure`. The `DuplicateCheck` class may need to be adapted accordingly.
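
One possible way to populate `unsure` is to replace the yes/no duplicate decision with a graded similarity and two thresholds. The sketch below is only an illustration of that idea: the token-overlap (Jaccard) similarity on titles, the threshold values `0.9` and `0.5`, and all names (`ThreeWayVerdictSketch`, `titleSimilarity`, `classify`) are assumptions and do not reflect how `DuplicateCheck` works today.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical three-way classification based on a title similarity score. */
final class ThreeWayVerdictSketch {
    enum Group { REAL_PAPER, UNSURE, FAKE_PAPER }

    // Assumed thresholds; suitable values would have to be determined with test data.
    static final double REAL_THRESHOLD = 0.9;
    static final double FAKE_THRESHOLD = 0.5;

    /** Jaccard similarity of the lower-cased word sets of both titles, in [0, 1]. */
    static double titleSimilarity(String localTitle, String fetchedTitle) {
        Set<String> left = tokens(localTitle);
        Set<String> right = tokens(fetchedTitle);
        if (left.isEmpty() && right.isEmpty()) {
            return 1.0;
        }
        Set<String> intersection = new HashSet<>(left);
        intersection.retainAll(right);
        Set<String> union = new HashSet<>(left);
        union.addAll(right);
        return (double) intersection.size() / union.size();
    }

    private static Set<String> tokens(String text) {
        // Split on anything that is neither a letter nor a digit.
        String[] parts = text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
        Set<String> result = new HashSet<>(Arrays.asList(parts));
        result.remove("");
        return result;
    }

    /** Maps a similarity score to one of the three groups. */
    static Group classify(double similarity) {
        if (similarity >= REAL_THRESHOLD) {
            return Group.REAL_PAPER;
        }
        if (similarity < FAKE_THRESHOLD) {
            return Group.FAKE_PAPER;
        }
        return Group.UNSURE;
    }
}
```

A real implementation would likely compare more fields than the title (authors, year, venue), in the spirit of the field-wise comparisons listed under "Code hints".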

## 2. Add test

Tests need to be crafted for step 1. Think of TDD and add tests before or while coding.

## 3. Wire into CLI

A. Include `refcheck [--online|--offline] <file.bib>` in `org.jabref.cli.ArgumentProcessor`

B. Include `refcheck [--online|--offline] <file.pdf>` in `org.jabref.cli.ArgumentProcessor`

Note that `--online` and `--offline` are optional. If not given, the default plain citation parser is used.

For B:

Import the references from the PDF into a `.bib` library using "New library based on references". Users can pass `--online` or `--offline`; `--online` is the default if AI is available. It is an error if `--online` is given and no AI is available.

<img width="374" height="186" alt="Image" src="https://github.com/user-attachments/assets/f2030df8-1da1-41f1-9a92-44f1c9bba819" />
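
The flag handling for B can be sketched as a small decision function. `ParserModeSketch`, `Mode`, and `resolve` are hypothetical names; how `aiAvailable` is determined is left open. Falling back to offline parsing when no flag is given and no AI is available is an assumption, since the text above only defines the default for the case that AI is available.

```java
import java.util.Optional;

/** Hypothetical sketch of resolving --online / --offline for PDF input. */
final class ParserModeSketch {
    enum Mode { ONLINE, OFFLINE }

    /**
     * @param requested   the mode given on the command line, empty if neither flag was passed
     * @param aiAvailable whether an AI-based parser can be used
     */
    static Mode resolve(Optional<Mode> requested, boolean aiAvailable) {
        if (requested.isEmpty()) {
            // Default: online if AI is available; otherwise offline (assumption).
            return aiAvailable ? Mode.ONLINE : Mode.OFFLINE;
        }
        if (requested.get() == Mode.ONLINE && !aiAvailable) {
            // Explicit --online without AI is an error.
            throw new IllegalStateException("--online was requested, but no AI is available");
        }
        return requested.get();
    }
}
```

For example, `resolve(Optional.empty(), true)` yields `ONLINE`, while `resolve(Optional.of(Mode.ONLINE), false)` throws.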

## 4. Wire into GUI

Create "Tools" > "Ref Checker"

Content:

Tab "Citations" and Tab "PDF File"

Tab "Citations": Text field with citations

Tab "PDF File": Filename with "Browse" button

At the bottom of each tab: a "Check" button that triggers the check. On success, a new library is created in JabRef.

## Code hints

Similar comparisons are done at

- `org.jabref.gui.mergeentries.newmergedialog.FieldRowViewModel#autoSelectBetterValue` (most similar approach)
- `org.jabref.logic.database.DuplicateCheck#isDuplicate` / `org.jabref.logic.database.DuplicateCheck#compareFieldSet` (but cannot be used, as we really want to rely on a "high quality" `BibEntry`)
