feat: Enhance DMM and IMDB import processes with metadata tracking and configuration updates

- Introduced `DmmService` and `ImdbFileService` for handling import metadata and tracking last import dates.
- Updated `DmmScraping` and `ImdbMetadataLoader` to utilize new services for checking last import times and storing import metadata.
- Added `DmmLastImport` and `ImdbLastImport` classes to track import statistics.
- Modified `DmmFileDownloader` to respect a minimum re-download interval.
- Updated `DmmSyncState` to load and save parsed pages using `DmmService`.
- Enhanced `compose.yaml` with new environment variables for configuration options.
- Updated `README.md` with detailed configuration options and examples.
- Added new database configurations and entity mappings for import metadata and parsed pages.
- Refined global usings and service collection extensions to include new services and configurations.
- Adjusted Python requirements installation script to use the default Python version.
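
The interplay between the stored last-import time and the minimum re-download interval could look roughly like the sketch below. It is a Python illustration only, not the project's C# code: `LastImport`, `should_download`, and the field names are hypothetical, and the 30-minute default is taken from the `compose.yaml` comment in this commit.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class LastImport:
    """Hypothetical stand-in for a stored import metadata record."""
    occurred_at: datetime
    entry_count: int = 0


def should_download(last: Optional[LastImport],
                    now: datetime,
                    min_interval_minutes: int = 30) -> bool:
    """Return True when no import is recorded or the interval has elapsed."""
    if last is None:
        # Nothing recorded yet, so the first download always runs.
        return True
    return now - last.occurred_at >= timedelta(minutes=min_interval_minutes)


now = datetime(2024, 9, 10, 12, 0, tzinfo=timezone.utc)
recent = LastImport(occurred_at=now - timedelta(minutes=10))
stale = LastImport(occurred_at=now - timedelta(minutes=45))

print(should_download(None, now))    # no metadata stored yet
print(should_download(recent, now))  # inside the 30-minute window
print(should_download(stale, now))   # window has elapsed
```

Keeping the check as a pure function of the stored timestamp and the current time makes it easy to test without a database or a clock.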
iPromKnight committed Sep 10, 2024
1 parent 355f308 commit fec3e96
Showing 36 changed files with 437 additions and 95 deletions.
8 changes: 4 additions & 4 deletions .run/Zilean.DmmScraper.run.xml
@@ -1,13 +1,13 @@
<component name="ProjectRunConfigurationManager">
<configuration default="false" name="Zilean.DmmScraper" type="DotNetProject" factoryName=".NET Project">
<option name="EXE_PATH" value="$PROJECT_DIR$/src/Zilean.DmmScraper/bin/Release/net8.0/dmmscraper" />
<option name="EXE_PATH" value="$PROJECT_DIR$/src/Zilean.DmmScraper/bin/Debug/net8.0/dmmscraper.exe" />
<option name="PROGRAM_PARAMETERS" value="" />
<option name="WORKING_DIRECTORY" value="$PROJECT_DIR$/src/Zilean.DmmScraper/bin/Release/net8.0" />
<option name="WORKING_DIRECTORY" value="$PROJECT_DIR$/src/Zilean.DmmScraper/bin/Debug/net8.0" />
<option name="PASS_PARENT_ENVS" value="1" />
<envs>
<env name="Zilean__ElasticSearch__Url" value="http://localhost:9200" />
<env name="ZILEAN_PYTHON_PYLIB" value="/opt/homebrew/opt/python@3.11/Frameworks/Python.framework/Versions/3.11/lib/libpython3.11.dylib" />
<env name="ZILEAN_PYTHON_VENV" value="C:\Users\Prom3theu5\.conda\envs\dev" />
<env name="ZILEAN_PYTHON_PYLIB" value="C:\Python311\python311.dll" />
<env name="ZILEAN_PYTHON_VENV" value="C:\Python311" />
</envs>
<option name="USE_EXTERNAL_CONSOLE" value="0" />
<option name="USE_MONO" value="0" />
97 changes: 40 additions & 57 deletions README.md
@@ -11,66 +11,49 @@ The DMM import reruns on missing pages every hour.

## Configuration

There is no configuration for Zilean.
For persistence, you can mount a volume on the path `/app/data`.
---

## Compose
```yaml
volumes:
zilean_data:

services:
zilean:
build:
context: .
dockerfile: Dockerfile
ports:
- "8181:8181"
volumes:
- zilean_data:/app/data
````
---

## Endpoints

### Search Endpoint

Endpoint
```bash
POST /dmm/search
```

Request Body
```json
{
"queryText": "string"
"Zilean": {
"Dmm": {
"EnableScraping": true,
"EnableEndpoint": true,
"ScrapeSchedule": "0 * * * *",
"MaxFilteredResults": 200,
"MinimumScoreMatch": 0.85,
"ImportBatched": false
},
"Database": {
"ConnectionString": "Host=localhost;Database=zilean;Username=postgres;Password=postgres;Include Error Detail=true;Timeout=300;CommandTimeout=300;"
},
"Prowlarr": {
"EnableEndpoint": true
},
"Imdb": {
"EnableImportMatching": false,
"EnableEndpoint": true,
"MinimumScoreMatch": 0.85
}
}
}
```

### Responses
Success:
```json
[
{
"filename": "string",
"infoHash": "string",
"filesize": "long"
},
{
"filename": "string",
"infoHash": "string",
"filesize": "long"
}
]
```
Every option shown above can be set as an environment variable. The variable name is the JSON path with double underscores instead of dots.
For example, `Zilean__Dmm__EnableScraping` is the environment variable for `Zilean.Dmm.EnableScraping`.
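
As an illustration of that naming rule, the following Python sketch flattens a nested settings object into environment variable names. The `flatten` helper is hypothetical and not part of Zilean; it only demonstrates the path-to-name mapping described above.

```python
import json
from typing import Any, Dict


def flatten(config: Dict[str, Any], prefix: str = "") -> Dict[str, str]:
    """Map nested JSON settings to NAME -> value pairs joined by '__'."""
    flat: Dict[str, str] = {}
    for key, value in config.items():
        name = f"{prefix}__{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into nested sections, carrying the path so far.
            flat.update(flatten(value, name))
        else:
            # Strings pass through; other scalars use their JSON spelling.
            flat[name] = value if isinstance(value, str) else json.dumps(value)
    return flat


settings = {
    "Zilean": {
        "Dmm": {"EnableScraping": True, "MinimumScoreMatch": 0.85},
        "Imdb": {"EnableEndpoint": True},
    }
}

for name, value in flatten(settings).items():
    print(f"{name}={value}")
```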

A breakdown of all configuration options:

- `Zilean__Dmm__EnableScraping`: Whether to enable the DMM scraping service.
- `Zilean__Dmm__EnableEndpoint`: Whether to enable the DMM search endpoint.
- `Zilean__Dmm__ScrapeSchedule`: The cron schedule for the DMM scraping service.
- `Zilean__Dmm__MaxFilteredResults`: The maximum number of results to return from the DMM search endpoint.
- `Zilean__Dmm__MinimumScoreMatch`: The minimum score required for a search result to be returned. Values between 0 and 1. Defaults to 0.85.
- `Zilean__Dmm__ImportBatched`: Whether to import DMM pages in batches. This is for low end systems. Defaults to false. Will make the initial import take longer. A lot longer.
- `Zilean__Database__ConnectionString`: The connection string for the database (Postgres).
- `Zilean__Prowlarr__EnableEndpoint`: Whether to enable the Prowlarr search endpoint. (Unused currently).
- `Zilean__Imdb__EnableImportMatching`: Whether to enable the IMDB import matching service. Defaults to true. Disabling this will improve import speed at the cost of not having IMDB data.
- `Zilean__Imdb__EnableEndpoint`: Whether to enable the IMDB search endpoint.
- `Zilean__Imdb__MinimumScoreMatch`: The minimum score required for a search result to be returned. Values between 0 and 1. Defaults to 0.85.
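
To make the effect of `MinimumScoreMatch` and `MaxFilteredResults` concrete, here is a minimal Python sketch of one way such a filter could behave. The `filter_results` function, the inclusive comparison, and the descending sort are assumptions for illustration; the actual matching runs in the database and may differ.

```python
from typing import List, Tuple


def filter_results(scored: List[Tuple[str, float]],
                   minimum_score_match: float = 0.85,
                   max_filtered_results: int = 200) -> List[Tuple[str, float]]:
    """Keep results at or above the threshold, best first, capped in number."""
    if not 0.0 <= minimum_score_match <= 1.0:
        raise ValueError("minimum_score_match must be between 0 and 1")
    kept = [item for item in scored if item[1] >= minimum_score_match]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:max_filtered_results]


# Hypothetical (title, score) pairs.
candidates = [("a", 0.90), ("b", 0.50), ("c", 0.95), ("d", 0.85)]
print(filter_results(candidates))
print(filter_results(candidates, max_filtered_results=1))
```

Raising the threshold towards 1 returns fewer, closer matches, while lowering it widens the result set up to the configured cap.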
---

Error:
```json
{
"type": "https://tools.ietf.org/html/rfc7231#section-6.5.1",
"title": "string",
"status": 500,
"detail": "string"
}
```
## Compose Example
See the file [compose.yaml](https://github.com/iPromKnight/zilean/blob/main/compose.yaml) for an example of how to run Zilean.
1 change: 1 addition & 0 deletions Zilean.sln.DotSettings.user
@@ -3,6 +3,7 @@
<s:String x:Key="/Default/CodeInspection/ExcludedFiles/FilesAndFoldersToSkip2/=7020124F_002D9FFC_002D4AC3_002D8F3D_002DAAB8E0240759_002Ff_003ADbContextBulkExtensions_002Ecs_002Fl_003A_002E_002E_003F_002E_002E_003FUsers_003FProm3theu5_003FAppData_003FRoaming_003FJetBrains_003FRider2024_002E2_003Fresharper_002Dhost_003FSourcesCache_003Fd1f8f68864fcaf754b3aefb6cb6777e56a17cfd9217db17992967674aa5cd1d_003FDbContextBulkExtensions_002Ecs/@EntryIndexedValue">ForceIncluded</s:String>
<s:String x:Key="/Default/CodeInspection/ExcludedFiles/FilesAndFoldersToSkip2/=7020124F_002D9FFC_002D4AC3_002D8F3D_002DAAB8E0240759_002Ff_003ADbContextBulkTransaction_002Ecs_002Fl_003A_002E_002E_003F_002E_002E_003FUsers_003FProm3theu5_003FAppData_003FRoaming_003FJetBrains_003FRider2024_002E2_003Fresharper_002Dhost_003FSourcesCache_003F1dfe90a67dda13ce28d4a1975cde7e7445f47ecbb8b4d73dd1bcc0c01acd88_003FDbContextBulkTransaction_002Ecs/@EntryIndexedValue">ForceIncluded</s:String>
<s:String x:Key="/Default/CodeInspection/ExcludedFiles/FilesAndFoldersToSkip2/=7020124F_002D9FFC_002D4AC3_002D8F3D_002DAAB8E0240759_002Ff_003AExceptionDispatchInfo_002Ecs_002Fl_003A_002E_002E_003F_002E_002E_003FUsers_003FProm3theu5_003FAppData_003FRoaming_003FJetBrains_003FRider2024_002E2_003Fresharper_002Dhost_003FSourcesCache_003Fbd1d5c50194fea68ff3559c160230b0ab50f5acf4ce3061bffd6d62958e2182_003FExceptionDispatchInfo_002Ecs/@EntryIndexedValue">ForceIncluded</s:String>
<s:String x:Key="/Default/CodeInspection/ExcludedFiles/FilesAndFoldersToSkip2/=7020124F_002D9FFC_002D4AC3_002D8F3D_002DAAB8E0240759_002Ff_003AJsonSerializer_002Ecs_002Fl_003A_002E_002E_003F_002E_002E_003F_002E_002E_003FUsers_003FProm3theu5_003FAppData_003FRoaming_003FJetBrains_003FRider2024_002E2_003Fresharper_002Dhost_003FDecompilerCache_003Fdecompiler_003F5148e388a7db4994ad2ab3750386454116e910_003F72_003Fc63c4ed4_003FJsonSerializer_002Ecs/@EntryIndexedValue">ForceIncluded</s:String>
<s:String x:Key="/Default/Environment/UnitTesting/UnitTestSessionStore/Sessions/=1182152c_002De1fa_002D4b35_002D8711_002D79eeef4fbfc0/@EntryIndexedValue">&lt;SessionState ContinuousTestingMode="0" Name="VerifyShawshankMatches_WithoutThe_ShouldMatch" xmlns="urn:schemas-jetbrains-com:jetbrains-ut-session"&gt;&#xD;
&lt;TestAncestor&gt;&#xD;
&lt;TestId&gt;xUnit::3F8985A4-7E1C-48A0-89A5-47790470023E::net8.0::Zilean.Tests.Tests.PttPythonTests&lt;/TestId&gt;&#xD;
9 changes: 6 additions & 3 deletions compose.yaml
@@ -15,9 +15,12 @@ services:
- zilean_data:/app/data
environment:
Zilean__Database__ConnectionString: "Host=postgres;Port=5432;Database=zilean;Username=postgres;Password=postgres"
# Zilean__Dmm__ImportBatched: "true" Allows enabling batched import - this is for low-end systems.
# Zilean__Dmm__MaxFilteredResults: 200 Allows changing the maximum number of filtered results returned by the DMM API. 200 is the default.
# Zilean__Dmm__MinimumScoreMatch: 0.85 Allows changing the minimum score match for the DMM API. 0.85 is the default. Values between 0 and 1 are accepted.
# Zilean__Dmm__ImportBatched: "true" Allows enabling batched import - this is for low-end systems.
# Zilean__Dmm__MaxFilteredResults: 200 Allows changing the maximum number of filtered results returned by the DMM API. 200 is the default.
# Zilean__Dmm__MinimumScoreMatch: 0.85 Allows changing the minimum score match for the DMM API. 0.85 is the default. Values between 0 and 1 are accepted.
# Zilean__Imdb__MinimumScoreMatch: 0.85 Allows changing the minimum score match for Imdb Matching API. 0.85 is the default. Values between 0 and 1 are accepted.
# Zilean__Dmm__MinimumReDownloadIntervalMinutes: 30 Minimum number of minutes between downloads from the DMM Repo - defaults to `30`
# Zilean__Imdb__EnableImportMatching: true Whether IMDB functionality is enabled (true/false); defaults to `true`. Disabling it drastically improves import speed on the initial run, but Zilean will then do no internal IMDB matching, so it is up to the upstream project using Zilean to implement that.
healthcheck:
test: curl --connect-timeout 10 --silent --show-error --fail http://localhost:8181/healthchecks/ping
timeout: 60s
2 changes: 1 addition & 1 deletion eng/install-python-reqs-dmmscraper.ps1
@@ -1,3 +1,3 @@
Remove-Item -Path ../src/Zilean.DmmScraper/python -Recurse -Force
New-Item -Path ../src/Zilean.DmmScraper/python -ItemType Directory
python3.11 -m pip install -r ../requirements.txt -t ../src/Zilean.DmmScraper/python/ --no-user
python -m pip install -r ../requirements.txt -t ../src/Zilean.DmmScraper/python/ --no-user
3 changes: 2 additions & 1 deletion src/Zilean.Database/Functions/SearchImdbProcedure.cs
@@ -78,6 +78,7 @@ RETURNS TABLE(
"Container" TEXT,
"Extension" TEXT,
"Torrent" BOOLEAN,
"Score" REAL,
"ImdbId" TEXT,
"ImdbCategory" TEXT,
"ImdbTitle" TEXT,
@@ -132,8 +133,8 @@ RETURN QUERY
t."Container",
t."Extension",
t."Torrent",
t."ImdbId",
similarity(t."ParsedTitle", query) AS "Score",
t."ImdbId",
i."Category" AS "ImdbCategory",
i."Title" AS "ImdbTitle",
i."Year" AS "ImdbYear",
1 change: 1 addition & 0 deletions src/Zilean.Database/GlobalUsings.cs
@@ -15,3 +15,4 @@
global using Zilean.Shared.Features.Configuration;
global using Zilean.Shared.Features.Dmm;
global using Zilean.Shared.Features.Imdb;
global using Zilean.Shared.Features.Statistics;

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

@@ -1,4 +1,5 @@
using Microsoft.EntityFrameworkCore.Migrations;
using System.Text.Json;
using Microsoft.EntityFrameworkCore.Migrations;

#nullable disable

@@ -25,6 +26,30 @@ protected override void Up(MigrationBuilder migrationBuilder)
table.PrimaryKey("PK_ImdbFiles", x => x.ImdbId);
});

migrationBuilder.CreateTable(
name: "ImportMetadata",
columns: table => new
{
Key = table.Column<string>(type: "text", nullable: false),
Value = table.Column<JsonDocument>(type: "jsonb", nullable: false)
},
constraints: table =>
{
table.PrimaryKey("PK_ImportMetadata", x => x.Key);
});

migrationBuilder.CreateTable(
name: "ParsedPages",
columns: table => new
{
Page = table.Column<string>(type: "text", nullable: false),
EntryCount = table.Column<int>(type: "integer", nullable: false)
},
constraints: table =>
{
table.PrimaryKey("PK_ParsedPages", x => x.Page);
});

migrationBuilder.CreateTable(
name: "Torrents",
columns: table => new
@@ -108,6 +133,12 @@ protected override void Up(MigrationBuilder migrationBuilder)
/// <inheritdoc />
protected override void Down(MigrationBuilder migrationBuilder)
{
migrationBuilder.DropTable(
name: "ImportMetadata");

migrationBuilder.DropTable(
name: "ParsedPages");

migrationBuilder.DropTable(
name: "Torrents");


Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

@@ -7,7 +7,7 @@
namespace Zilean.Database.Migrations;

/// <inheritdoc />
public partial class FunctionsIndexes : Migration
public partial class FunctionsAndIndexes : Migration
{
/// <inheritdoc />
protected override void Up(MigrationBuilder migrationBuilder)