Feature/source site polymorphism #115

Merged
53 commits
16c9482
Start of a more extensible way to store source sites. First up, Thing…
anaximander23 Dec 10, 2019
cbc9767
Source sites for the big three. Just the data model; not being create…
anaximander23 Dec 11, 2019
4bc800a
Using SourceSite for personal websites.
anaximander23 Dec 16, 2019
59e239f
Merge branch 'master' into feature/source-site-polymorphism
anaximander23 Dec 16, 2019
3abd40f
Tidying up mini parse logic.
anaximander23 Dec 19, 2019
bfa40ea
[WIP] Moved things out into separate projects and added a basic Thing…
anaximander23 Dec 20, 2019
06e9e89
Adding in a local settings file which will be ignored by git, for dev…
anaximander23 Dec 23, 2019
346696a
Thingiverse parsing converted to new system. Seems to work.
anaximander23 Dec 23, 2019
740d95e
Listing supported sites on the submission page
anaximander23 Dec 23, 2019
30005e3
Shapeways parsing
anaximander23 Jan 2, 2020
3efd1b4
Handling pre-existing creators
anaximander23 Jan 2, 2020
0dfca33
Naively merging creators based on matching usernames.
anaximander23 Jan 2, 2020
4b59e58
Fixed migrations.
anaximander23 Jan 2, 2020
46d9ede
Recording source sites against minis. Also allows for one mini to hav…
anaximander23 Jan 2, 2020
bb83f86
Removing superfluous check for existing mini.
anaximander23 Jan 2, 2020
7476dc9
Parser for Gumroad.
anaximander23 Jan 3, 2020
6695c19
Removed redundant check when validating Gumroad URLs.
anaximander23 Jan 3, 2020
1290dca
Unifying source sites to use a common username column to avoid databa…
anaximander23 Jan 6, 2020
1b76491
Removed unneeded Gumroad parse method.
anaximander23 Jan 6, 2020
917f65c
Added missing Gumroad parser site label.
anaximander23 Jan 6, 2020
04645b8
Removing parts that are no longer used.
anaximander23 Jan 7, 2020
f673624
Style fixes.
anaximander23 Jan 7, 2020
e4d595f
Opportunistic cleanup of a few warnings.
anaximander23 Jan 7, 2020
f7aef73
Rough merge; migration for Starred needs adding back in.
anaximander23 Jan 7, 2020
c32dee5
Recreating the Starred migration
anaximander23 Jan 7, 2020
ccca120
Fixing migrations to avoid deployment errors
anaximander23 Jan 7, 2020
9460e2e
Two minor changes to possibly improve preprod.
aluhrs13 Jan 8, 2020
3b8fea8
Disabling logging entirely as a test.
aluhrs13 Jan 8, 2020
de792ff
Disabling AppInsights, removing index telemetry, and trying to set ev…
aluhrs13 Jan 8, 2020
f8d4d9a
Re-adding AppInsights.
aluhrs13 Jan 8, 2020
8722bd4
Attempted performance improvements.
anaximander23 Jan 13, 2020
6e6a673
Minor code cleanup.
anaximander23 Jan 13, 2020
81be349
Simplifying the creator browse page; seems to be faster now.
anaximander23 Jan 13, 2020
288328f
Performance improvement for mini details page.
anaximander23 Jan 13, 2020
f720178
Removed duplicated Starred migration
anaximander23 Jan 13, 2020
24dd1c9
Clearing and rebuilding the logging config to remove console logging,…
anaximander23 Jan 13, 2020
4291f27
Re-adding local config stuff I removed for no real reason, fixing mer…
aluhrs13 Jan 13, 2020
fcbc088
Adding MyMiniFactory support
aluhrs13 Jan 13, 2020
e3f9a65
Using public_url for Thingiverse models, which is the site URL rather…
anaximander23 Jan 14, 2020
cc4d0a0
Correctly attaching to existing source sites rather than creating dup…
anaximander23 Jan 14, 2020
2607a78
Simple fix for mini links, leaving in support for anything not yet mi…
anaximander23 Jan 14, 2020
49c9db7
Fixed parsing of Gumroad hash-format URLs (or more accurately, sidest…
anaximander23 Jan 14, 2020
c34614f
Fixing some bugs in MyMiniFactory
aluhrs13 Jan 14, 2020
1a3f9a3
rename MyMiniFactory parser
aluhrs13 Jan 14, 2020
dc6e06b
[BROKEN] First attempt at finding a matching creator should match bot…
aluhrs13 Jan 14, 2020
6b51183
Fixed issue with finding pre-existing sources, and added a migration …
anaximander23 Jan 15, 2020
95a1e0d
Merge branch 'master' into feature/source-site-polymorphism
aluhrs13 Jan 15, 2020
913a921
Adding Gumroad website migration fix
aluhrs13 Jan 15, 2020
c34e523
Ordering browse creators page by number of Minis
aluhrs13 Jan 16, 2020
2bfacf7
Making the "submit to index" button on a creator's Thingiverse tab in…
anaximander23 Jan 16, 2020
cf3a840
Added a migration to fix up Gumroad source links.
anaximander23 Jan 17, 2020
c84325a
Turned Facebook auth back on.
anaximander23 Jan 17, 2020
138c711
Adding telemetry on Mini creation attempted.
aluhrs13 Jan 17, 2020
11 changes: 11 additions & 0 deletions .editorconfig
@@ -33,3 +33,14 @@ dotnet_diagnostic.CA1710.severity = none

# CA1303: Do not pass literals as localized parameters
dotnet_diagnostic.CA1303.severity = silent

# CA2227: Collection properties should be read only
dotnet_diagnostic.CA2227.severity = silent

dotnet_naming_symbols.local.capitalisation = camel_case
dotnet_naming_symbols.property.capitalisation = pascal_case



# CA1052: Static holder types should be Static or NotInheritable
dotnet_diagnostic.CA1052.severity = silent
16 changes: 16 additions & 0 deletions MiniIndex.Core/CoreServiceInfo.cs
@@ -0,0 +1,16 @@
using MiniIndex.Core.Minis;
using System.Collections.Generic;
using System.Linq;

namespace MiniIndex.Core
{
    public class CoreServiceInfo
    {
        public CoreServiceInfo(IEnumerable<IParser> parsers)
        {
            SupportedSites = parsers.Select(p => p.Site).ToList();
        }

        public IEnumerable<string> SupportedSites { get; }
    }
}
42 changes: 42 additions & 0 deletions MiniIndex.Core/CoreServices.cs
@@ -0,0 +1,42 @@
using System;
using Lamar;
using MediatR;
using Microsoft.Extensions.DependencyInjection;
using MiniIndex.Core.Http;
using MiniIndex.Core.Minis;
using MiniIndex.Core.Minis.Parsers.Thingiverse;

namespace MiniIndex.Core
{
    public class CoreServices : ServiceRegistry
    {
        public CoreServices()
        {
            RegisterMediatrTypes();

            this.AddHttpClient<ThingiverseClient>()
                .SetHandlerLifetime(TimeSpan.FromMinutes(10))
                .ApplyResiliencePolicies();
        }

        private void RegisterMediatrTypes()
        {
            Scan(scan =>
            {
                scan.TheCallingAssembly();

                scan.ConnectImplementationsToTypesClosing(typeof(IRequestHandler<,>));
                scan.ConnectImplementationsToTypesClosing(typeof(INotificationHandler<>));

                scan.AddAllTypesOf<IParser>();
            });

            For<IMediator>()
                .Use<Mediator>()
                .Transient();

            For<ServiceFactory>()
                .Use(ctx => ctx.GetInstance);
        }
    }
}
36 changes: 36 additions & 0 deletions MiniIndex.Core/Http/HttpClientConfigurationExtensions.cs
@@ -0,0 +1,36 @@
using System;
using System.Net;
using System.Net.Http;
using Microsoft.Extensions.DependencyInjection;
using Polly;
using Polly.CircuitBreaker;

namespace MiniIndex.Core.Http
{
    public static class HttpClientConfigurationExtensions
    {
        public static IHttpClientBuilder ApplyResiliencePolicies(this IHttpClientBuilder builder)
        {
            Random jitter = new Random();

            AsyncCircuitBreakerPolicy<HttpResponseMessage> circuitBreaker = Policy
                .Handle<HttpRequestException>()
                .OrResult<HttpResponseMessage>(x => x.StatusCode >= HttpStatusCode.InternalServerError)
                .AdvancedCircuitBreakerAsync(
                    failureThreshold: 0.5,
                    samplingDuration: TimeSpan.FromSeconds(10),
                    minimumThroughput: 6,
                    durationOfBreak: TimeSpan.FromMinutes(1));

            builder
                .AddTransientHttpErrorPolicy(transient => transient
                    .WaitAndRetryAsync(3, (retryCount) =>
                        TimeSpan.FromSeconds(Math.Pow(retryCount, 2))
                        + TimeSpan.FromMilliseconds(jitter.NextDouble() * 500d)))
                .AddPolicyHandler(Policy.TimeoutAsync<HttpResponseMessage>(TimeSpan.FromSeconds(30)))
                .AddPolicyHandler(circuitBreaker);

            return builder;
        }
    }
}
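
The retry policy in `ApplyResiliencePolicies` waits `retryCount` squared seconds plus up to 500 ms of random jitter before each of its 3 retries. The following is a minimal, dependency-free sketch of just that delay arithmetic; the class and method names (`RetryDelaySketch`, `BaseDelay`, `DelayWithJitter`) are hypothetical and not part of this PR, and it assumes Polly passes a 1-based retry attempt number to the delay callback.

```csharp
using System;

// Hypothetical helper (not part of the PR) that mirrors the delay formula
// used in the WaitAndRetryAsync callback above.
public static class RetryDelaySketch
{
    // Deterministic part of the back-off: retryCount squared, in seconds.
    public static TimeSpan BaseDelay(int retryCount) =>
        TimeSpan.FromSeconds(Math.Pow(retryCount, 2));

    // Full delay: the base delay plus a random jitter in [0, 500) milliseconds.
    public static TimeSpan DelayWithJitter(int retryCount, Random jitter) =>
        BaseDelay(retryCount) + TimeSpan.FromMilliseconds(jitter.NextDouble() * 500d);
}
```

With three retries the base delays are 1 s, 4 s and 9 s, so the back-off grows quadratically rather than exponentially; the jitter only spreads concurrent retries apart by up to half a second.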
24 changes: 24 additions & 0 deletions MiniIndex.Core/MiniIndex.Core.csproj
@@ -0,0 +1,24 @@
<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <TargetFramework>netstandard2.1</TargetFramework>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="HtmlAgilityPack" Version="1.11.17" />
    <PackageReference Include="Lamar" Version="4.0.0" />
    <PackageReference Include="MediatR" Version="7.0.0" />
    <PackageReference Include="Microsoft.Extensions.Configuration.Abstractions" Version="3.1.0" />
    <PackageReference Include="Microsoft.Extensions.DependencyInjection" Version="3.1.0" />
    <PackageReference Include="Microsoft.Extensions.Http" Version="3.1.0" />
    <PackageReference Include="Microsoft.Extensions.Http.Polly" Version="3.1.0" />
    <PackageReference Include="Newtonsoft.Json" Version="12.0.3" />
    <PackageReference Include="Polly" Version="7.2.0" />
  </ItemGroup>

  <ItemGroup>
    <ProjectReference Include="..\MiniIndex.Models\MiniIndex.Models.csproj" />
    <ProjectReference Include="..\MiniIndex.Persistence\MiniIndex.Persistence.csproj" />
  </ItemGroup>

</Project>
15 changes: 15 additions & 0 deletions MiniIndex.Core/Minis/IParser.cs
@@ -0,0 +1,15 @@
using MiniIndex.Models;
using System;
using System.Threading.Tasks;

namespace MiniIndex.Core.Minis
{
    public interface IParser
    {
        string Site { get; }

        bool CanParse(Uri url);

        Task<Mini> ParseFromUrl(Uri url);
    }
}
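
The `IParser` contract pairs a `CanParse` check with a `ParseFromUrl` call, which suggests callers pick the first registered parser that accepts a URL. The code that does this dispatch is not shown in this part of the diff, so the following is only a sketch under that assumption. `ISiteParser` is a trimmed-down stand-in for `IParser` (without `ParseFromUrl`, to avoid the `Mini` model), and `HostParser` and `ParserSelection` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down stand-in for IParser; ParseFromUrl is omitted so the
// sketch compiles without the MiniIndex model types.
public interface ISiteParser
{
    string Site { get; }

    bool CanParse(Uri url);
}

// Hypothetical parser that accepts any URL on one exact host name.
public sealed class HostParser : ISiteParser
{
    private readonly string _host;

    public HostParser(string site, string host)
    {
        Site = site;
        _host = host;
    }

    public string Site { get; }

    public bool CanParse(Uri url) =>
        url.Host.Equals(_host, StringComparison.OrdinalIgnoreCase);
}

public static class ParserSelection
{
    // Returns the first parser that accepts the URL, or null if none does.
    public static ISiteParser FindParser(IEnumerable<ISiteParser> parsers, Uri url) =>
        parsers.FirstOrDefault(p => p.CanParse(url));
}
```

A caller would treat a `null` result as "unsupported site", which lines up with the list of supported site names that `CoreServiceInfo` exposes.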
116 changes: 116 additions & 0 deletions MiniIndex.Core/Minis/Parsers/Gumroad/GumroadParser.cs
@@ -0,0 +1,116 @@
using HtmlAgilityPack;
using MiniIndex.Core.Utilities;
using MiniIndex.Models;
using MiniIndex.Models.SourceSites;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

namespace MiniIndex.Core.Minis.Parsers.Gumroad
{
    public class GumroadParser : IParser
    {
        public string Site => "Gumroad";

        public bool CanParse(Uri url)
        {
            bool isGumroadUrl = url.Host.Replace("www.", "").Equals("gumroad.com", StringComparison.OrdinalIgnoreCase);

            if (!isGumroadUrl)
            {
                return false;
            }

            return IsLFormatUrl(url) || IsHashFormatUrl(url);
        }

        public async Task<Mini> ParseFromUrl(Uri url)
        {
            if (IsHashFormatUrl(url))
            {
                string id = url.ToString().Split('#').Last();
                url = new Uri($"https://www.gumroad.com/l/{id}");
            }

            HtmlWeb web = new HtmlWeb();
            HtmlDocument htmlDoc = await web.LoadFromWebAsync(url, null, null);

            HtmlNode creatorLink = htmlDoc.DocumentNode.SelectNodes("//a[@class=\"js-creator-profile-link\"]")
                .FirstOrDefault();

            string creatorUrl = creatorLink.GetAttributeValue("href", null);
            string creatorName = new Uri(creatorUrl).AbsolutePath.Skip(1).AsString();

            Dictionary<string, string> miniProperties = htmlDoc.DocumentNode.SelectNodes("//*[@itemprop]")
                .Select(node => new
                {
                    property = node.GetAttributeValue("itemprop", null),
                    value = GetNodeContent(node)
                })
                .Where(node => !String.IsNullOrWhiteSpace(node.property))
                .ToDictionary(k => k.property, v => v.value);

            Creator creator = new Creator
            {
                Name = creatorName
            };
            GumroadSource source = new GumroadSource(creator, creatorName);
            creator.Sites.Add(source);

            Mini mini = new Mini()
            {
                Creator = creator,
                Name = miniProperties["name"],
                Thumbnail = miniProperties["image"],
                Link = miniProperties["url"]
            };
            mini.Sources.Add(new MiniSourceSite(mini, source, url));

            return mini;
        }

        public string GetNodeContent(HtmlNode node)
        {
            switch (node.GetAttributeValue("itemprop", null))
            {
                case "url":
                    return node.GetAttributeValue("href", null);

                case "image":
                    return node.GetAttributeValue("src", null);

                default:
                    string directText = node.GetDirectInnerText().Trim();

                    if (!String.IsNullOrWhiteSpace(directText))
                    {
                        return directText;
                    }

                    foreach (HtmlNode innerNode in node.ChildNodes)
                    {
                        string innerText = GetNodeContent(innerNode);

                        if (!String.IsNullOrWhiteSpace(innerText))
                        {
                            return innerText;
                        }
                    }
                    return node.GetAttributeValue("content", null);
            }
        }

        private static bool IsHashFormatUrl(Uri url)
        {
            return !String.IsNullOrWhiteSpace(url.Fragment)
                && url.Fragment.StartsWith("#");
        }

        private static bool IsLFormatUrl(Uri url)
        {
            return !String.IsNullOrWhiteSpace(url.LocalPath)
                && url.LocalPath.StartsWith("/l/");
        }
    }
}
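
`GumroadParser.ParseFromUrl` sidesteps hash-format URLs by rewriting them to the `/l/{id}` form before fetching the page. The following is a small standalone sketch of that rewrite. It reads the id from `Uri.Fragment` instead of splitting the full URL string, which is a variation on the PR's approach rather than the PR's own code; `GumroadUrlSketch`, `Normalize` and the sample URLs in the usage note are made up for illustration.

```csharp
using System;

// Hypothetical helper (not part of the PR) showing the hash-to-/l/ rewrite.
public static class GumroadUrlSketch
{
    public static Uri Normalize(Uri url)
    {
        // Uri.Fragment is empty when the URL has no fragment;
        // otherwise it includes the leading '#'.
        if (string.IsNullOrWhiteSpace(url.Fragment))
        {
            return url;
        }

        string id = url.Fragment.TrimStart('#');

        // Same target format the parser builds for hash-format URLs.
        return new Uri($"https://www.gumroad.com/l/{id}");
    }
}
```

For example, `Normalize(new Uri("https://gumroad.com/somecreator#AbCd"))` yields `https://www.gumroad.com/l/AbCd`, while a URL that is already in `/l/` form is returned unchanged.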
71 changes: 71 additions & 0 deletions MiniIndex.Core/Minis/Parsers/MyMiniFactory/MyMiniFactoryParser.cs
@@ -0,0 +1,71 @@
using HtmlAgilityPack;
using MiniIndex.Core.Utilities;
using MiniIndex.Models;
using MiniIndex.Models.SourceSites;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

namespace MiniIndex.Core.Minis.Parsers.MyMiniFactory
{
    public class MyMiniFactoryParser : IParser
    {
        public string Site => "MyMiniFactory";

        public bool CanParse(Uri url)
        {
            bool isMyMiniFactoryUrl = url.Host.Replace("www.", "").Equals("myminifactory.com", StringComparison.OrdinalIgnoreCase);

            if (!isMyMiniFactoryUrl)
            {
                return false;
            }

            bool mmfFormat1 = !String.IsNullOrWhiteSpace(url.LocalPath)
                && url.LocalPath.StartsWith("/object/");

            return mmfFormat1;
        }

        public async Task<Mini> ParseFromUrl(Uri url)
        {
            HtmlWeb web = new HtmlWeb();
            HtmlDocument htmlDoc = await web.LoadFromWebAsync(url, null, null);

            HtmlNode creatorLink = htmlDoc.DocumentNode.SelectNodes("//a[@class=\"under-hover\"]")
                .FirstOrDefault();

            string creatorUrl = creatorLink.GetAttributeValue("href", null);
            string creatorName = Uri.UnescapeDataString(creatorUrl.Split('/').Last());

            Creator creator = new Creator
            {
                Name = creatorName
            };
            MyMiniFactorySource source = new MyMiniFactorySource(creator, creatorName);
            creator.Sites.Add(source);

            Mini mini = new Mini()
            {
                Creator = creator,
                Name = htmlDoc.DocumentNode.SelectNodes("//h1").FirstOrDefault().InnerText.Trim(),
                Thumbnail = htmlDoc.DocumentNode.SelectNodes("//meta").Where(n => n.Attributes.Any(a => a.Value == "og:image")).First()
                    .Attributes.Where(a => a.Name == "content").First().Value,
                Link = url.ToString()
            };

            int cost = 0;
            HtmlNodeCollection priceNode = htmlDoc.DocumentNode.SelectNodes("//span[@class=\"price-title\"]");
            if (priceNode != null)
            {
                cost = Int32.Parse(priceNode.First().InnerText.Remove(0, 1).Split(".").First());
            }
            mini.Cost = cost;

            mini.Sources.Add(new MiniSourceSite(mini, source, url));

            return mini;
        }
    }
}
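
`MyMiniFactoryParser` derives `Cost` by dropping the first character of the price text, cutting at the decimal point and calling `Int32.Parse`, which throws on any unexpected text. The following is a more forgiving sketch of the same idea that falls back to 0 instead. It is not the PR's code: `PriceSketch` and `WholeUnits` are hypothetical names, and the sample price strings in the usage note are assumptions about what the page might contain.

```csharp
using System;
using System.Globalization;

// Hypothetical helper (not part of the PR): whole currency units from price text.
public static class PriceSketch
{
    public static int WholeUnits(string priceText)
    {
        if (string.IsNullOrWhiteSpace(priceText))
        {
            return 0;
        }

        string trimmed = priceText.Trim();

        // Skip any leading non-digit characters, such as a currency symbol.
        int start = 0;
        while (start < trimmed.Length && !char.IsDigit(trimmed[start]))
        {
            start++;
        }

        // Keep only the part before the decimal point.
        string whole = trimmed.Substring(start).Split('.')[0];

        // Fall back to 0 when the remaining text is not a whole number.
        return int.TryParse(whole, NumberStyles.AllowThousands, CultureInfo.InvariantCulture, out int cost)
            ? cost
            : 0;
    }
}
```

With this sketch `WholeUnits("$12.99")` returns 12 and `WholeUnits("Free")` returns 0, whereas `Int32.Parse` would throw a `FormatException` on the second input.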