Biggest change is no longer using standard library's tls.Config.getCertificate function to get a certificate during TLS handshake. Implemented our own cache which can be changed dynamically at runtime, even during TLS handshakes. As such, restarts are no longer required after certificate renewals or OCSP updates.
We also allow loading multiple certificates and keys per host, even by specifying a directory (tls got a new 'load' command for that).
Renamed the letsencrypt package to https in a gradual effort to become more generic; and https is more fitting for what the package does now.
There are still some known bugs, e.g. reloading where a new certificate is required but port 80 isn't currently listening, will cause the challenge to fail. There's still plenty of cleanup to do and tests to write. It is especially confusing right now how we enable "on-demand" TLS during setup and keep track of that. But this change should basically work so far.
Implements "on-demand TLS" as I call it, which means obtaining TLS certificates on-the-fly during TLS handshakes if a certificate for the requested hostname is not already available. Only the first request for a new hostname will experience higher latency; subsequent requests will get the new certificates right out of memory.
Code still needs lots of cleanup but the feature is basically working.
Before, Caddy couldn't support graceful (zero-downtime) restarts when the reloaded Caddyfile had a host in it that was elligible for a LE certificate because the port was already in use. This commit makes it possible to do zero-downtime reloads and issue certificates for new hosts that need it. Supports only http-01 challenge at this time.
OCSP stapling is improved in that it updates before the expiration time when the validity window has shifted forward. See 30c949085cad82d07562ca3403a22513b8fcd440. Before it only used to update when the status changed.
This commit also sets the user agent for Let's Encrypt requests with a string containing "Caddy".
Hosts which are eligible for automatic HTTPS have default port "https" but other hosts (wildcards, loopback, etc.) have the default port 2015. The default host of empty string should be more IPv6-compatible.
Added a -grace flag to customize graceful shutdown period, fixed bugs related to closing file descriptors (and dup'ed fds), improved healthcheck signaling to parent, fixed a race condition with the graceful listener, etc. These improvements mainly provide better support for frequent reloading or unusual use cases of Start and Stop after a Restart (POSIX systems). This forum thread was valuable help in debugging: https://forum.golangbridge.org/t/bind-address-already-in-use-even-after-listener-closed/1510?u=matt
Fixed pidfile writing problem where a pidfile would be written even if child failed, also cleaned up restarts a bit and fixed a few bugs, it's more robust now in case of failures and with logging.
Whether the original parent process or a child process as part of a restart, the pidfile will not be written/changed until that process has started successfully. It is written every time caddy.Start() succeeds (may be reundant, but that's probably okay).
We had to hack some special support into the server and caddy packages for this. There are some middlewares which should only execute commands when the original parent process first starts up. For example, someone using the startup directive to start a backend service would not expect the command to be executed every time the config was reloaded or changed - only once when they first started the original caddy process.
This commit adds FirstStartup to the virtualhost config
The error channel used when starting all the servers must be buffered so that, even if there are no errors at startup, the returns that insert into the error channel will not be blocked, since after startup, nobody is reading that channel anymore.
Previously, if a listener fails to bind (for example), there was a race in caddy.go between unblocking the startup waitgroup and returning the error and putting it into errChan. Now, an error is returned directly into errChan and the closing of the startup waitgroup is defered until after that return takes place.
Log file can also be stdout or stderr. Log output is disabled by default now, which makes it more feasible to add more log statements to trace program flow in debugging situations.
Fixed bug where manually specifying port 443 disabled TLS (whoops); otherHostHasScheme was the culprit, since it would return true even if it was the same config that had that scheme.
Also, an error at startup (if not a restart) is now fatal, rather than keeping a half-alive zombie server.
Before, we were activating Let's Encrypt after all the directives were executed. This means their setup functions had access to potentially erroneous information about the server's TLS setup, since the letsencrypt package makes changes to the port, etc. Now, we execute all directives up to and including tls, then activate letsencrypt, then finish with the rest of the directives. It's a bit ugly, but I do think it is more correct. It also fixes some bugs, for example: a host that only has a catch-all redirect.
The file path of the originally-loaded Caddyfile must be piped to the forked process; previously it was using stdin after the first fork, which wouldn't load the newest Caddyfile from disk, which is the point of SIGUSR1.
Merged config and app packages into one called caddy. Abstracted away caddy startup functionality making it easier to embed Caddy in any Go application and use it as a library. Graceful restart (should) now ensure child starts properly. Now piping a gob bundle to child process so that the child can match up inherited listeners to server address. Much cleanup still to do.