Kavita/API/Controllers/BookController.cs
Joseph Milazzo 33db123e81
v0.4.8 Release (#720)
* Bump versions by dotnet-bump-version.

* Bump versions by dotnet-bump-version.

* Workflow updates (#658)

# Added
- Added: Added automatic character parsing for discord notifier. Now if the PR is over a certain character limit, it will trim and add an appropriate link to the full changelog. (Release for Stable, PR for Dev)

# Removed
- Removed: Removed Sentry map task from the workflow since Sentry is no longer used.

* Bump versions by dotnet-bump-version.

* Misc Updates (#665)

* Do not allow non-admins to change their passwords when authentication is disabled

* Clean up the login page so that input field text is black

* cleanup some resizing when typing a password and having a lot of users

* Changed the LastActive for a user to not just be login, but also when they open an already authenticated session.

* Bump versions by dotnet-bump-version.

* Logging Cleanup (#668)

* Do not allow non-admins to change their passwords when authentication is disabled

* Clean up the login page so that input field text is black

* cleanup some resizing when typing a password and having a lot of users

* Changed the LastActive for a user to not just be login, but also when they open an already authenticated session.

* Removed some verbose debugging statements and moved some debug to information to be more prevelant to logs for default installs.

* In Progress now sends progress information on the Series

* Add ability to add cards to recently added when new series are added in backend

* Implemented the ability to click the glasses icon to turn off incognito mode from within the reader so you can start tracking progress

* Don't warn the user about authentication when they don't touch that control

* Bump versions by dotnet-bump-version.

* Changed the stats that are sent back to stat server from installed server.

* Revert "Changed the stats that are sent back to stat server from installed server."

This reverts commit 644cb6d1f67de9531ea1a1dfd3853709e0329ce7.

* Bump versions by dotnet-bump-version.

* Bump versions by dotnet-bump-version.

* Bulk Add to Collection (#674)

* Fixed the typeahead not having the same size input box as other inputs

* Implemented the ability to add multiple series to a collection through bulk operations flow. Updated book parser to handle "@import url('...');" syntax as well as @import '...';

* Implemented the ability to create a new Collection tag via bulk operations flow.

* Bump versions by dotnet-bump-version.

* Bulk Operations for In Progress and Recently Added (#677)

* Don't log a message about bad match if the file is a cover image

* Enable bulk operations for In Progress and Recently Added

* Fixed a bad logic case

* Bump versions by dotnet-bump-version.

* Regression Fix (#680)

* Ensure we mount the backups directory for Docker users

* Fixed a huge logic bug that deleted files in users libraries

* Bump versions by dotnet-bump-version.

* Change chunk size to be a fixed 50 to validate if it's causing issue with refresh. Added some try catches to see if exceptions are causing issues. (#681)

* Bump versions by dotnet-bump-version.

* Fixed a bug where searching on localized name would fail to show on the search. Fixed a bug where extra spaces would cause the search results not to show properly. (#682)

* Bump versions by dotnet-bump-version.

* When we have a special marker, ensure we fall back to folder parsing to try and group correctly to the actual series before just accepting what we parsed. (#684)

Fixed a missed parsing case where comic special parsing wasn't being called on comic libraries.

* Bump versions by dotnet-bump-version.

* iOS Admin page dropdown fix (#686)

# Fixed:
- Fixed: Fixed an issue where the dropdown on the admin server page would not work on Safari or other iOS browsers.

* When the DB fails to save, log out all the series the user should look into for constraint issues and push a message to the admins connected to webui. (#687)

* Bump versions by dotnet-bump-version.

* Bump versions by dotnet-bump-version.

* Stat upload will now schedule itself between midnight and 6am in server time for upload. (#688)

* Bump versions by dotnet-bump-version.

* EPUB CSS Parsing Issues (#690)

* WIP. Rewrote some of the Regex to better support css escaping. We now escape background-image, border-image, and list-style-image within css files.

* Added position relative to help with positioning on books that are just absolute positioned elements.

* When there is absolute positioning, like in some epub based comics, supress the bottom action bar since it wont render in the correct location.

* Fixed tests

* Commented out tests

* Bump versions by dotnet-bump-version.

* More EPUB Scoping Fixes (#691)

* Added better handling around when importing css files that are empty. Moved comment removal on css files to before some css whitespace cleanup to get better matches.

* Some enhancements on the checks to see if we need the bottom action bar on reader. Now we don't query DOM and have something that works more reliably.

* Bump versions by dotnet-bump-version.

* Fixed an issue where docker users were not properly backing up the database. Removed an empty File for when covers/ had nothing in it. (#692)

* Bump versions by dotnet-bump-version.

* Fallback to Folder Parsing Issue (#694)

* Fixed a bug in the scanner where we fall back to parsing from folders for poorly named files. The code was exiting early if a chapter or volume could be parsed out.

* Fixed a unit test by tweaking a regex for fallback

* Bump versions by dotnet-bump-version.

* KavitaStats Cleanup (#695)

* Refactored Stats code to be much cleaner and user better naming.

* Cleaned up the actual http code to use Flurl and to return if the upload was successful or not so we can delete the file where appropriate.

* More refactoring for the stats code to clean it up and keep it consistent with our standards.

* Removed a confusing log statement

* Added support for old api key header from original stat server

* Use the correct endpoint, not the new one.

* Code smell

* Bump versions by dotnet-bump-version.

* Bulk Deletion (#697)

* Implemented bulk deletion of series

* Don't show unauthorized exception on UI, just redirect to the login page.

* Bump versions by dotnet-bump-version.

* Cover Image Picking + Forwarding Headers with EPUBs (#700)

* Ensure Kavita knows about forwarding headers (fixes issue with epub urls not going through https with reverse proxy). Fixed a case where cover image selection preferred nested folders vs files in root directory.

* Fixed broken unit test

* Added bug that I fixed to the unit tests

* Cover Image Picking + Forwarding Headers with EPUBs (#702)

* Updating GA Bump version temporarily for fix (#703)

* Bump versions by dotnet-bump-version.

* Cover Image Picking + Forwarding Headers with EPUBs (GA Fix) (#704)

* Bump versions by dotnet-bump-version.

* Vacation Fixes (#709)

* Ignore system and hidden folders when performing directory scan.

* Fixed the comic parser tests not using Comic mode for parsing.

* Accept all forwarded headers and use them.

* Ignore some changes from another branch

* Bump versions by dotnet-bump-version.

* Breaking Changes: Docker Parity (#698)

* Refactored all the config files for Kavita to be loaded from config/. This will allow docker to just mount one folder and for Update functionality to be trivial.

* Cleaned up documentation around new update method.

* Updated docker files to support config directory

* Removed entrypoint, no longer needed

* Update appsettings to point to config directory for logs

* Updated message for docker users that are upgrading

* Ensure that docker users that have not updated their mount points from upgrade cannot start the server

* Code smells

* More cleanup

* Added entrypoint to fix bind mount issues

* Updated README with new folder structure

* Fixed build system for new setup

* Updated string path if user is docker

* Updated the migration flow for docker to work properly and Fixed LogFile configuration updating.

* Migrating docker images is now working 100%

* Fixed config from bad code

* Code cleanup

Co-authored-by: Chris Plaatjes <kizaing@gmail.com>

* Bump versions by dotnet-bump-version.

* Feature/docker parity (#714)

* Refactored all the config files for Kavita to be loaded from config/. This will allow docker to just mount one folder and for Update functionality to be trivial.

* Cleaned up documentation around new update method.

* Updated docker files to support config directory

* Removed entrypoint, no longer needed

* Update appsettings to point to config directory for logs

* Updated message for docker users that are upgrading

* Ensure that docker users that have not updated their mount points from upgrade cannot start the server

* Code smells

* More cleanup

* Added entrypoint to fix bind mount issues

* Updated README with new folder structure

* Fixed build system for new setup

* Updated string path if user is docker

* Updated the migration flow for docker to work properly and Fixed LogFile configuration updating.

* Migrating docker images is now working 100%

* Fixed config from bad code

* Code cleanup

* Fixed monorepo-build.sh

Co-authored-by: Chris Plaatjes <kizaing@gmail.com>

* Breaking Changes: Docker Parity (#715)

* Fixed a bug in the copy directory to directory in the migration

* Somehow GetFiles lost static modifier.

* Bump versions by dotnet-bump-version.

* Build issue (#716)

* Fixed a bug in the copy directory to directory in the migration

* Somehow GetFiles lost static modifier.

* Please work

* Bump versions by dotnet-bump-version.

* Bump versions by dotnet-bump-version.

* Shakeout Changes (#717)

* Make the appsettings public on Configuration and change how we detect when to migrate for non-docker users.

* Fixed up non-docker copy command and removed duplicate check on source directory for a copy.

* Don't delete files unless we know we are successful

* Bump versions by dotnet-bump-version.

* Fixed a migration issue on docker happening too many times or throwing exception when source wasn't there. (#719)

* Bump versions by dotnet-bump-version.

* Version bump for release (#718)

* Bump versions by dotnet-bump-version.

Co-authored-by: Robbie Davis <robbie@therobbiedavis.com>
Co-authored-by: YEGCSharpDev <89283498+YEGCSharpDev@users.noreply.github.com>
Co-authored-by: Chris Plaatjes <kizaing@gmail.com>
2021-11-04 05:29:02 -07:00

358 lines
16 KiB
C#

using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using API.DTOs;
using API.DTOs.Reader;
using API.Entities.Enums;
using API.Extensions;
using API.Interfaces;
using API.Interfaces.Services;
using API.Services;
using HtmlAgilityPack;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.Logging;
using VersOne.Epub;
namespace API.Controllers
{
public class BookController : BaseApiController
{
private readonly ILogger<BookController> _logger;
private readonly IBookService _bookService;
private readonly IUnitOfWork _unitOfWork;
private readonly ICacheService _cacheService;
private static readonly string BookApiUrl = "book-resources?file=";
public BookController(ILogger<BookController> logger, IBookService bookService, IUnitOfWork unitOfWork, ICacheService cacheService)
{
_logger = logger;
_bookService = bookService;
_unitOfWork = unitOfWork;
_cacheService = cacheService;
}
[HttpGet("{chapterId}/book-info")]
public async Task<ActionResult<BookInfoDto>> GetBookInfo(int chapterId)
{
var dto = await _unitOfWork.ChapterRepository.GetChapterInfoDtoAsync(chapterId);
var bookTitle = string.Empty;
if (dto.SeriesFormat == MangaFormat.Epub)
{
var mangaFile = (await _unitOfWork.ChapterRepository.GetFilesForChapterAsync(chapterId)).First();
using var book = await EpubReader.OpenBookAsync(mangaFile.FilePath);
bookTitle = book.Title;
}
return Ok(new BookInfoDto()
{
ChapterNumber = dto.ChapterNumber,
VolumeNumber = dto.VolumeNumber,
VolumeId = dto.VolumeId,
BookTitle = bookTitle,
SeriesName = dto.SeriesName,
SeriesFormat = dto.SeriesFormat,
SeriesId = dto.SeriesId,
LibraryId = dto.LibraryId,
IsSpecial = dto.IsSpecial,
Pages = dto.Pages,
});
}
[HttpGet("{chapterId}/book-resources")]
public async Task<ActionResult> GetBookPageResources(int chapterId, [FromQuery] string file)
{
var chapter = await _unitOfWork.ChapterRepository.GetChapterAsync(chapterId);
var book = await EpubReader.OpenBookAsync(chapter.Files.ElementAt(0).FilePath);
var key = BookService.CleanContentKeys(file);
if (!book.Content.AllFiles.ContainsKey(key)) return BadRequest("File was not found in book");
var bookFile = book.Content.AllFiles[key];
var content = await bookFile.ReadContentAsBytesAsync();
Response.AddCacheHeader(content);
var contentType = BookService.GetContentType(bookFile.ContentType);
return File(content, contentType, $"{chapterId}-{file}");
}
[HttpGet("{chapterId}/chapters")]
public async Task<ActionResult<ICollection<BookChapterItem>>> GetBookChapters(int chapterId)
{
// This will return a list of mappings from ID -> pagenum. ID will be the xhtml key and pagenum will be the reading order
// this is used to rewrite anchors in the book text so that we always load properly in FE
var chapter = await _unitOfWork.ChapterRepository.GetChapterAsync(chapterId);
using var book = await EpubReader.OpenBookAsync(chapter.Files.ElementAt(0).FilePath);
var mappings = await _bookService.CreateKeyToPageMappingAsync(book);
var navItems = await book.GetNavigationAsync();
var chaptersList = new List<BookChapterItem>();
foreach (var navigationItem in navItems)
{
if (navigationItem.NestedItems.Count > 0)
{
var nestedChapters = new List<BookChapterItem>();
foreach (var nestedChapter in navigationItem.NestedItems)
{
if (nestedChapter.Link == null) continue;
var key = BookService.CleanContentKeys(nestedChapter.Link.ContentFileName);
if (mappings.ContainsKey(key))
{
nestedChapters.Add(new BookChapterItem()
{
Title = nestedChapter.Title,
Page = mappings[key],
Part = nestedChapter.Link.Anchor ?? string.Empty,
Children = new List<BookChapterItem>()
});
}
}
if (navigationItem.Link == null)
{
var item = new BookChapterItem()
{
Title = navigationItem.Title,
Children = nestedChapters
};
if (nestedChapters.Count > 0)
{
item.Page = nestedChapters[0].Page;
}
chaptersList.Add(item);
}
else
{
var groupKey = BookService.CleanContentKeys(navigationItem.Link.ContentFileName);
if (mappings.ContainsKey(groupKey))
{
chaptersList.Add(new BookChapterItem()
{
Title = navigationItem.Title,
Page = mappings[groupKey],
Children = nestedChapters
});
}
}
}
}
if (chaptersList.Count == 0)
{
// Generate from TOC
var tocPage = book.Content.Html.Keys.FirstOrDefault(k => k.ToUpper().Contains("TOC"));
if (tocPage == null) return Ok(chaptersList);
// Find all anchor tags, for each anchor we get inner text, to lower then titlecase on UI. Get href and generate page content
var doc = new HtmlDocument();
var content = await book.Content.Html[tocPage].ReadContentAsync();
doc.LoadHtml(content);
var anchors = doc.DocumentNode.SelectNodes("//a");
if (anchors == null) return Ok(chaptersList);
foreach (var anchor in anchors)
{
if (anchor.Attributes.Contains("href"))
{
var key = BookService.CleanContentKeys(anchor.Attributes["href"].Value).Split("#")[0];
if (!mappings.ContainsKey(key))
{
// Fallback to searching for key (bad epub metadata)
var correctedKey = book.Content.Html.Keys.SingleOrDefault(s => s.EndsWith(key));
if (!string.IsNullOrEmpty(correctedKey))
{
key = correctedKey;
}
}
if (!string.IsNullOrEmpty(key) && mappings.ContainsKey(key))
{
var part = string.Empty;
if (anchor.Attributes["href"].Value.Contains("#"))
{
part = anchor.Attributes["href"].Value.Split("#")[1];
}
chaptersList.Add(new BookChapterItem()
{
Title = anchor.InnerText,
Page = mappings[key],
Part = part,
Children = new List<BookChapterItem>()
});
}
}
}
}
return Ok(chaptersList);
}
[HttpGet("{chapterId}/book-page")]
public async Task<ActionResult<string>> GetBookPage(int chapterId, [FromQuery] int page)
{
var chapter = await _cacheService.Ensure(chapterId);
var path = _cacheService.GetCachedEpubFile(chapter.Id, chapter);
using var book = await EpubReader.OpenBookAsync(path);
var mappings = await _bookService.CreateKeyToPageMappingAsync(book);
var counter = 0;
var doc = new HtmlDocument {OptionFixNestedTags = true};
var baseUrl = Request.Scheme + "://" + Request.Host + Request.PathBase + "/api/";
var apiBase = baseUrl + "book/" + chapterId + "/" + BookApiUrl;
var bookPages = await book.GetReadingOrderAsync();
foreach (var contentFileRef in bookPages)
{
if (page == counter)
{
var content = await contentFileRef.ReadContentAsync();
if (contentFileRef.ContentType != EpubContentType.XHTML_1_1) return Ok(content);
// In more cases than not, due to this being XML not HTML, we need to escape the script tags.
content = BookService.EscapeTags(content);
doc.LoadHtml(content);
var body = doc.DocumentNode.SelectSingleNode("//body");
if (body == null)
{
if (doc.ParseErrors.Any())
{
LogBookErrors(book, contentFileRef, doc);
return BadRequest("The file is malformed! Cannot read.");
}
_logger.LogError("{FilePath} has no body tag! Generating one for support. Book may be skewed", book.FilePath);
doc.DocumentNode.SelectSingleNode("/html").AppendChild(HtmlNode.CreateNode("<body></body>"));
body = doc.DocumentNode.SelectSingleNode("/html/body");
}
var inlineStyles = doc.DocumentNode.SelectNodes("//style");
if (inlineStyles != null)
{
foreach (var inlineStyle in inlineStyles)
{
var styleContent = await _bookService.ScopeStyles(inlineStyle.InnerHtml, apiBase, "", book);
body.PrependChild(HtmlNode.CreateNode($"<style>{styleContent}</style>"));
}
}
var styleNodes = doc.DocumentNode.SelectNodes("/html/head/link");
if (styleNodes != null)
{
foreach (var styleLinks in styleNodes)
{
var key = BookService.CleanContentKeys(styleLinks.Attributes["href"].Value);
// Some epubs are malformed the key in content.opf might be: content/resources/filelist_0_0.xml but the actual html links to resources/filelist_0_0.xml
// In this case, we will do a search for the key that ends with
if (!book.Content.Css.ContainsKey(key))
{
var correctedKey = book.Content.Css.Keys.SingleOrDefault(s => s.EndsWith(key));
if (correctedKey == null)
{
_logger.LogError("Epub is Malformed, key: {Key} is not matching OPF file", key);
continue;
}
key = correctedKey;
}
var styleContent = await _bookService.ScopeStyles(await book.Content.Css[key].ReadContentAsync(), apiBase, book.Content.Css[key].FileName, book);
if (styleContent != null)
{
body.PrependChild(HtmlNode.CreateNode($"<style>{styleContent}</style>"));
}
}
}
var anchors = doc.DocumentNode.SelectNodes("//a");
if (anchors != null)
{
foreach (var anchor in anchors)
{
BookService.UpdateLinks(anchor, mappings, page);
}
}
var images = doc.DocumentNode.SelectNodes("//img");
if (images != null)
{
foreach (var image in images)
{
if (image.Name != "img") continue;
// Need to do for xlink:href
if (image.Attributes["src"] != null)
{
var imageFile = image.Attributes["src"].Value;
if (!book.Content.Images.ContainsKey(imageFile))
{
var correctedKey = book.Content.Images.Keys.SingleOrDefault(s => s.EndsWith(imageFile));
if (correctedKey != null)
{
imageFile = correctedKey;
}
}
image.Attributes.Remove("src");
image.Attributes.Add("src", $"{apiBase}" + imageFile);
}
}
}
images = doc.DocumentNode.SelectNodes("//image");
if (images != null)
{
foreach (var image in images)
{
if (image.Name != "image") continue;
if (image.Attributes["xlink:href"] != null)
{
var imageFile = image.Attributes["xlink:href"].Value;
if (!book.Content.Images.ContainsKey(imageFile))
{
var correctedKey = book.Content.Images.Keys.SingleOrDefault(s => s.EndsWith(imageFile));
if (correctedKey != null)
{
imageFile = correctedKey;
}
}
image.Attributes.Remove("xlink:href");
image.Attributes.Add("xlink:href", $"{apiBase}" + imageFile);
}
}
}
// Check if any classes on the html node (some r2l books do this) and move them to body tag for scoping
var htmlNode = doc.DocumentNode.SelectSingleNode("//html");
if (htmlNode != null && htmlNode.Attributes.Contains("class"))
{
var bodyClasses = body.Attributes.Contains("class") ? body.Attributes["class"].Value : string.Empty;
var classes = htmlNode.Attributes["class"].Value + " " + bodyClasses;
body.Attributes.Add("class", $"{classes}");
// I actually need the body tag itself for the classes, so i will create a div and put the body stuff there.
return Ok($"<div class=\"{body.Attributes["class"].Value}\">{body.InnerHtml}</div>");
}
return Ok(body.InnerHtml);
}
counter++;
}
return BadRequest("Could not find the appropriate html for that page");
}
private void LogBookErrors(EpubBookRef book, EpubTextContentFileRef contentFileRef, HtmlDocument doc)
{
_logger.LogError("{FilePath} has an invalid html file (Page {PageName})", book.FilePath, contentFileRef.FileName);
foreach (var error in doc.ParseErrors)
{
_logger.LogError("Line {LineNumber}, Reason: {Reason}", error.Line, error.Reason);
}
}
}
}