mirror of
https://github.com/Kareadita/Kavita.git
synced 2025-06-03 21:54:47 -04:00
* Staging the code for the new scan loop.
* Implemented a basic idea of changes on drives triggering scan loop. Issues: 1. Scan by folder does not work, 2. Queuing system is very hacky and needs a separate thread, 3. Performance degradation could be very real.
* Started writing unit test for new loop code
* Implemented a basic method to scan a folder path with ignore support (not implemented, code in place)
* Added some code to the parser to build out the idea of processing series in batches based on some top level folder.
* Scan Series now uses the new code (folder based parsing) and now handles the LocalizedSeries issue.
* Got library scan working with the new folder-based scan loop. Updated code to set FolderPath (for improved scan times and partial scan support).
* Wrote some notes on update library scan loop.
* Removed migration for merge
* Reapplied the SeriesFolder migration after merge
* Refactored a check that used multiple db calls into one.
* Made lots of progress on ignore support, but some confusion on underlying library. Ticket created. On hold till then.
* Updated Scan Library and Scan Series to exit early if no changes are on the underlying folders that need to be scanned.
* Implemented the ability to have .kavitaignore files within your directories and Kavita will parse them and ignore files and directories based on rules within them.
* Fixed an issue where nested ignore files wouldn't stack with higher level ignores
* Wrote out some basic code that showcases how we can scan series or library based on file events on the underlying system. Very buggy, needs lots of edge case testing and logging and duplication checking.
* Things are working kinda. I'm getting lost in my own code and complexity. I'm not sure it's worth it.
* Refactored ScanFiles out to Directory Service.
* Refactored more code out to keep the code clean.
* More unit tests
* Refactored the signature of ParsedSeries to use IList. Started writing unit tests and reworked the UpdateLibrary to work how it used to with new scan loop code (note: using async update library/series does not work).
* Fixed the bug where processSeriesInfos was being invoked twice per series and made the code work very similar to old code (except loose leaf files don't work) but with folder based scanning.
* Prep for unit tests (updating broken ones with new implementations)
* Just some notes. Not sure I want to finish this work.
* Refactored the LibraryWatcher with some comments and state variables.
* Undid the migrations in case I don't move forward with this branch
* Started to clean the code and prepare for finishing this work.
* Fixed a bad merge
* Updated signatures to cleanup the code and commit to the new strategy for scanning.
* Swapped out the code with async processing of series on a small library
* The new scan loop is working in both Sync and Async methods. The code is slow and not optimized. This represents a good point to start polling and applying optimizations.
* Refactored UpdateSeries out of Scanner and into a dedicated file.
* Refactored how ProcessTasks are awaited to allow more async
* Fixed an issue where side nav item wouldn't show correct highlight and migrated to OnPush
* Moved where we start the stopwatch to encapsulate the full scan
* Cleaned up SignalR events to report correctly (still needs a redesign)
* Remove the "remove" code until I figure it out
* Put in extremely expensive series deletion code for library scan.
* Have Genre and Tag update the DB immediately to avoid dup issues
* Taking a break
* Moving to a lock with People was successful. Need to apply to others.
* Refactored code for series level and tag and genre with new locking strategy.
* New scan loop works. Next up optimization
* Swapped out the Kavita logo with SVG for faster load
* Refactored metadata updates to occur when the series are being updated.
* Code cleanup
* Added a new type of generic message (Info) to inform the user.
* Code cleanup
* Implemented an optimization which prevents any I/O (other than an attribute lookup) for Library/Series Scan. This can bring a recently updated library on network storage (650 series) to fully process in 2 seconds. Fixed a bug where File Analysis was running every time for each non-epub file.
* Fixed ARM x64 builds not being able to view PDF cover images due to a bad update in DocNet.
* Some code cleanup
* Added experimental SignalR update code to have a more natural refresh of library-detail page
* Hooked in ability to send new series events to UI
* Moved all scan (file scan only) tasks into Scan Queue. Made it so scheduled ScanLibraries will now check if any existing task is being run and reschedule for 3 hours, and 10 mins for scan series.
* Implemented the info event in the events widget and added a clear all button to dismiss all infos and errors. Added --event-widget-info-bg-color
* Remove --drawer-background-color since it's not used
* When new series added, inject directly into the view.
* Some debug code cleanup
* Fixed up the unit tests
* Ensure all config directories exist on startup
* Disabled Library Watching (that will go in next build)
* Ensure update for series is admin only
* Lots of code changes, scan series kinda works, specials are splitting, optimizations are failing. Demotivated on this work again.
* Removed SeriesFolder migration
* Added the SeriesFolder migration
* Added a new pipe for dates so we can provide some nicer defaults. Added folder path to the series detail.
* The scan optimizations now work for NTFS systems.
* Removed a TODO
* Migrated all the times to use DateTime.Now and not Utc.
* Refactored some repo calls to use the includes flag pattern
* Implemented a check for the library scan optimization check to validate if the library was updated (type change, library rename, folder change, or series deleted) and let the optimization be bypassed.
* Added another optimization which will use just folder attribute of last write time if the drive is not NTFS.
* Fixed a unit test
* Some code cleanup
345 lines
16 KiB
C#
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;
using API.Comparators;
using API.Data;
using API.Data.Metadata;
using API.Data.Repositories;
using API.Data.Scanner;
using API.Entities;
using API.Entities.Enums;
using API.Extensions;
using API.Helpers;
using API.Services.Tasks.Metadata;
using API.SignalR;
using Hangfire;
using Microsoft.AspNetCore.SignalR;
using Microsoft.Extensions.Logging;

namespace API.Services;

public interface IMetadataService
{
    /// <summary>
    /// Recalculates cover images for all entities in a library.
    /// </summary>
    /// <param name="libraryId"></param>
    /// <param name="forceUpdate"></param>
    [DisableConcurrentExecution(timeoutInSeconds: 60 * 60 * 60)]
    [AutomaticRetry(Attempts = 3, OnAttemptsExceeded = AttemptsExceededAction.Delete)]
    Task GenerateCoversForLibrary(int libraryId, bool forceUpdate = false);
    /// <summary>
    /// Performs a forced refresh of cover images just for a series and its nested entities
    /// </summary>
    /// <param name="libraryId"></param>
    /// <param name="seriesId"></param>
    /// <param name="forceUpdate">Overrides any cache logic and forces execution</param>
    Task GenerateCoversForSeries(int libraryId, int seriesId, bool forceUpdate = true);

    Task GenerateCoversForSeries(Series series, bool forceUpdate = false);
    Task RemoveAbandonedMetadataKeys();
}

public class MetadataService : IMetadataService
{
    private readonly IUnitOfWork _unitOfWork;
    private readonly ILogger<MetadataService> _logger;
    private readonly IEventHub _eventHub;
    private readonly ICacheHelper _cacheHelper;
    private readonly IReadingItemService _readingItemService;
    private readonly IDirectoryService _directoryService;
    private readonly ChapterSortComparerZeroFirst _chapterSortComparerForInChapterSorting = new ChapterSortComparerZeroFirst();
    private readonly IList<SignalRMessage> _updateEvents = new List<SignalRMessage>();
    public MetadataService(IUnitOfWork unitOfWork, ILogger<MetadataService> logger,
        IEventHub eventHub, ICacheHelper cacheHelper,
        IReadingItemService readingItemService, IDirectoryService directoryService)
    {
        _unitOfWork = unitOfWork;
        _logger = logger;
        _eventHub = eventHub;
        _cacheHelper = cacheHelper;
        _readingItemService = readingItemService;
        _directoryService = directoryService;
    }

    /// <summary>
    /// Updates the cover image for a Chapter
    /// </summary>
    /// <param name="chapter"></param>
    /// <param name="forceUpdate">Force updating cover image even if underlying file has not been modified or chapter already has a cover image</param>
    /// <returns>True if a new cover image was generated for the chapter</returns>
    private Task<bool> UpdateChapterCoverImage(Chapter chapter, bool forceUpdate)
    {
        var firstFile = chapter.Files.MinBy(x => x.Chapter);

        if (!_cacheHelper.ShouldUpdateCoverImage(_directoryService.FileSystem.Path.Join(_directoryService.CoverImageDirectory, chapter.CoverImage), firstFile, chapter.Created, forceUpdate, chapter.CoverImageLocked))
            return Task.FromResult(false);

        if (firstFile == null) return Task.FromResult(false);

        _logger.LogDebug("[MetadataService] Generating cover image for {File}", firstFile.FilePath);
        chapter.CoverImage = _readingItemService.GetCoverImage(firstFile.FilePath, ImageService.GetChapterFormat(chapter.Id, chapter.VolumeId), firstFile.Format);
        _unitOfWork.ChapterRepository.Update(chapter); // BUG: CoverImage isn't saving for Monter Masume with new scan loop
        _updateEvents.Add(MessageFactory.CoverUpdateEvent(chapter.Id, MessageFactoryEntityTypes.Chapter)); // TODO: IDEA: Instead of firing here where it's not yet saved, maybe collect the ids and fire after save
        return Task.FromResult(true);
    }

    private void UpdateChapterLastModified(Chapter chapter, bool forceUpdate)
    {
        var firstFile = chapter.Files.MinBy(x => x.Chapter);
        if (firstFile == null || _cacheHelper.HasFileNotChangedSinceCreationOrLastScan(chapter, forceUpdate, firstFile)) return;

        firstFile.UpdateLastModified();
    }

    /// <summary>
    /// Updates the cover image for a Volume
    /// </summary>
    /// <param name="volume"></param>
    /// <param name="forceUpdate">Force updating cover image even if underlying file has not been modified or chapter already has a cover image</param>
    private Task<bool> UpdateVolumeCoverImage(Volume volume, bool forceUpdate)
    {
        // We need to check if Volume coverImage matches first chapters if forceUpdate is false
        if (volume == null || !_cacheHelper.ShouldUpdateCoverImage(
                _directoryService.FileSystem.Path.Join(_directoryService.CoverImageDirectory, volume.CoverImage),
                null, volume.Created, forceUpdate)) return Task.FromResult(false);

        volume.Chapters ??= new List<Chapter>();
        var firstChapter = volume.Chapters.MinBy(x => double.Parse(x.Number), _chapterSortComparerForInChapterSorting);
        if (firstChapter == null) return Task.FromResult(false);

        volume.CoverImage = firstChapter.CoverImage;
        //await _eventHub.SendMessageAsync(MessageFactory.CoverUpdate, MessageFactory.CoverUpdateEvent(volume.Id, MessageFactoryEntityTypes.Volume), false);
        _updateEvents.Add(MessageFactory.CoverUpdateEvent(volume.Id, MessageFactoryEntityTypes.Volume));

        return Task.FromResult(true);
    }

    /// <summary>
    /// Updates cover image for Series
    /// </summary>
    /// <param name="series"></param>
    /// <param name="forceUpdate">Force updating cover image even if underlying file has not been modified or chapter already has a cover image</param>
    private Task UpdateSeriesCoverImage(Series series, bool forceUpdate)
    {
        if (series == null) return Task.CompletedTask;

        if (!_cacheHelper.ShouldUpdateCoverImage(_directoryService.FileSystem.Path.Join(_directoryService.CoverImageDirectory, series.CoverImage),
                null, series.Created, forceUpdate, series.CoverImageLocked))
            return Task.CompletedTask;

        series.Volumes ??= new List<Volume>();
        var firstCover = series.Volumes.GetCoverImage(series.Format);
        string coverImage = null;
        if (firstCover == null && series.Volumes.Any())
        {
            // If firstCover is null and one volume, the whole series is Chapters under Vol 0.
            if (series.Volumes.Count == 1)
            {
                coverImage = series.Volumes[0].Chapters.OrderBy(c => double.Parse(c.Number), _chapterSortComparerForInChapterSorting)
                    .FirstOrDefault(c => !c.IsSpecial)?.CoverImage;
            }

            if (!_cacheHelper.CoverImageExists(coverImage))
            {
                coverImage = series.Volumes[0].Chapters.MinBy(c => double.Parse(c.Number), _chapterSortComparerForInChapterSorting)?.CoverImage;
            }
        }
        series.CoverImage = firstCover?.CoverImage ?? coverImage;
        //await _eventHub.SendMessageAsync(MessageFactory.CoverUpdate, MessageFactory.CoverUpdateEvent(series.Id, MessageFactoryEntityTypes.Series), false);
        _updateEvents.Add(MessageFactory.CoverUpdateEvent(series.Id, MessageFactoryEntityTypes.Series));
        return Task.CompletedTask;
    }

    /// <summary>
    /// Updates cover images for every chapter and volume of a series, then for the series itself
    /// </summary>
    /// <param name="series"></param>
    /// <param name="forceUpdate"></param>
    private async Task ProcessSeriesCoverGen(Series series, bool forceUpdate)
    {
        _logger.LogDebug("[MetadataService] Processing series {SeriesName}", series.OriginalName);
        try
        {
            var volumeIndex = 0;
            var firstVolumeUpdated = false;
            foreach (var volume in series.Volumes)
            {
                var firstChapterUpdated = false; // Only tracks whether the first chapter of the volume was updated
                var index = 0;
                foreach (var chapter in volume.Chapters)
                {
                    var chapterUpdated = await UpdateChapterCoverImage(chapter, forceUpdate);
                    // If the cover was updated, either the file has changed or this is the first scan, so we should force a metadata update
                    UpdateChapterLastModified(chapter, forceUpdate || chapterUpdated);
                    if (index == 0 && chapterUpdated)
                    {
                        firstChapterUpdated = true;
                    }

                    index++;
                }

                var volumeUpdated = await UpdateVolumeCoverImage(volume, firstChapterUpdated || forceUpdate);
                if (volumeIndex == 0 && volumeUpdated)
                {
                    firstVolumeUpdated = true;
                }
                volumeIndex++;
            }

            await UpdateSeriesCoverImage(series, firstVolumeUpdated || forceUpdate);
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "[MetadataService] There was an exception during updating metadata for {SeriesName}", series.Name);
        }
    }

    /// <summary>
    /// Refreshes Cover Images for a whole library
    /// </summary>
    /// <remarks>This can be heavy on memory first run</remarks>
    /// <param name="libraryId"></param>
    /// <param name="forceUpdate">Force updating cover image even if underlying file has not been modified or chapter already has a cover image</param>
    [DisableConcurrentExecution(timeoutInSeconds: 60 * 60 * 60)]
    [AutomaticRetry(Attempts = 3, OnAttemptsExceeded = AttemptsExceededAction.Delete)]
    public async Task GenerateCoversForLibrary(int libraryId, bool forceUpdate = false)
    {
        var library = await _unitOfWork.LibraryRepository.GetLibraryForIdAsync(libraryId, LibraryIncludes.None);
        _logger.LogInformation("[MetadataService] Beginning metadata refresh of {LibraryName}", library.Name);

        _updateEvents.Clear();

        var chunkInfo = await _unitOfWork.SeriesRepository.GetChunkInfo(library.Id);
        var stopwatch = Stopwatch.StartNew();
        var totalTime = 0L;
        _logger.LogInformation("[MetadataService] Refreshing Library {LibraryName}. Total Items: {TotalSize}. Total Chunks: {TotalChunks} with {ChunkSize} size", library.Name, chunkInfo.TotalSize, chunkInfo.TotalChunks, chunkInfo.ChunkSize);

        await _eventHub.SendMessageAsync(MessageFactory.NotificationProgress,
            MessageFactory.CoverUpdateProgressEvent(library.Id, 0F, ProgressEventType.Started, $"Starting {library.Name}"));

        for (var chunk = 1; chunk <= chunkInfo.TotalChunks; chunk++)
        {
            if (chunkInfo.TotalChunks == 0) continue;
            totalTime += stopwatch.ElapsedMilliseconds;
            stopwatch.Restart();

            // chunk is 1-based, so the first series of this chunk sits at offset (chunk - 1) * ChunkSize
            var chunkStart = (chunk - 1) * chunkInfo.ChunkSize;

            _logger.LogInformation("[MetadataService] Processing chunk {ChunkNumber} / {TotalChunks} with size {ChunkSize}. Series ({SeriesStart} - {SeriesEnd})",
                chunk, chunkInfo.TotalChunks, chunkInfo.ChunkSize, chunkStart, chunkStart + chunkInfo.ChunkSize);

            var nonLibrarySeries = await _unitOfWork.SeriesRepository.GetFullSeriesForLibraryIdAsync(library.Id,
                new UserParams()
                {
                    PageNumber = chunk,
                    PageSize = chunkInfo.ChunkSize
                });
            _logger.LogDebug("[MetadataService] Fetched {SeriesCount} series for refresh", nonLibrarySeries.Count);

            var seriesIndex = 0;
            foreach (var series in nonLibrarySeries)
            {
                // Position of this series within the whole library, used for the progress fraction
                var index = chunkStart + seriesIndex;
                var progress = Math.Max(0F, Math.Min(1F, index * 1F / chunkInfo.TotalSize));

                await _eventHub.SendMessageAsync(MessageFactory.NotificationProgress,
                    MessageFactory.CoverUpdateProgressEvent(library.Id, progress, ProgressEventType.Updated, series.Name));

                try
                {
                    await ProcessSeriesCoverGen(series, forceUpdate);
                }
                catch (Exception ex)
                {
                    _logger.LogError(ex, "[MetadataService] There was an exception during metadata refresh for {SeriesName}", series.Name);
                }
                seriesIndex++;
            }

            await _unitOfWork.CommitAsync();

            await FlushEvents();

            _logger.LogInformation(
                "[MetadataService] Processed {SeriesStart} - {SeriesEnd} out of {TotalSeries} series in {ElapsedScanTime} milliseconds for {LibraryName}",
                chunkStart, chunkStart + nonLibrarySeries.Count, chunkInfo.TotalSize, stopwatch.ElapsedMilliseconds, library.Name);
        }

        // Include the time spent on the final chunk in the total
        totalTime += stopwatch.ElapsedMilliseconds;

        await _eventHub.SendMessageAsync(MessageFactory.NotificationProgress,
            MessageFactory.CoverUpdateProgressEvent(library.Id, 1F, ProgressEventType.Ended, "Complete"));

        _logger.LogInformation("[MetadataService] Updated metadata for {SeriesNumber} series in library {LibraryName} in {ElapsedMilliseconds} milliseconds total", chunkInfo.TotalSize, library.Name, totalTime);
    }

    public async Task RemoveAbandonedMetadataKeys()
    {
        await _unitOfWork.TagRepository.RemoveAllTagNoLongerAssociated();
        await _unitOfWork.PersonRepository.RemoveAllPeopleNoLongerAssociated();
        await _unitOfWork.GenreRepository.RemoveAllGenreNoLongerAssociated();
        await _unitOfWork.CollectionTagRepository.RemoveTagsWithoutSeries();
        await _unitOfWork.AppUserProgressRepository.CleanupAbandonedChapters();
    }

    /// <summary>
    /// Refreshes cover images for a Series. Forces updates unless <paramref name="forceUpdate"/> is set to false.
    /// </summary>
    /// <param name="libraryId"></param>
    /// <param name="seriesId"></param>
    /// <param name="forceUpdate">Overrides any cache logic and forces execution</param>
    public async Task GenerateCoversForSeries(int libraryId, int seriesId, bool forceUpdate = true)
    {
        var series = await _unitOfWork.SeriesRepository.GetFullSeriesForSeriesIdAsync(seriesId);
        if (series == null)
        {
            _logger.LogError("[MetadataService] Series {SeriesId} was not found on Library {LibraryId}", seriesId, libraryId);
            return;
        }

        await GenerateCoversForSeries(series, forceUpdate);
    }

    /// <summary>
    /// Generate Cover for a Series. This is used by Scan Loop and should not be invoked directly via User Interaction.
    /// </summary>
    /// <param name="series">A full Series, with metadata, chapters, etc</param>
    /// <param name="forceUpdate"></param>
    public async Task GenerateCoversForSeries(Series series, bool forceUpdate = false)
    {
        var sw = Stopwatch.StartNew();
        await _eventHub.SendMessageAsync(MessageFactory.NotificationProgress,
            MessageFactory.CoverUpdateProgressEvent(series.LibraryId, 0F, ProgressEventType.Started, series.Name));

        await ProcessSeriesCoverGen(series, forceUpdate);

        if (_unitOfWork.HasChanges())
        {
            await _unitOfWork.CommitAsync();
            _logger.LogInformation("[MetadataService] Updated cover images for {SeriesName} in {ElapsedMilliseconds} milliseconds", series.Name, sw.ElapsedMilliseconds);
        }

        await _eventHub.SendMessageAsync(MessageFactory.NotificationProgress,
            MessageFactory.CoverUpdateProgressEvent(series.LibraryId, 1F, ProgressEventType.Ended, series.Name));

        await _eventHub.SendMessageAsync(MessageFactory.CoverUpdate, MessageFactory.CoverUpdateEvent(series.Id, MessageFactoryEntityTypes.Series), false);
        await FlushEvents();
    }

    private async Task FlushEvents()
    {
        // Send all events out now that entities are saved
        _logger.LogDebug("Dispatching {Count} update events", _updateEvents.Count);
        foreach (var updateEvent in _updateEvents)
        {
            await _eventHub.SendMessageAsync(MessageFactory.CoverUpdate, updateEvent, false);
        }
        _updateEvents.Clear();
    }
}