* Fix import resolution performance regression
Related to https://github.com/dhall-lang/dhall-haskell/issues/1511
This fixes a performance regression introduced in #1159, where `newManager`
was being called on every remote import, by going back to caching the
`Manager` created by the first request.
This leads to *dramatic* performance improvements for import-rich packages
(like the Prelude or `dhall-kubernetes`) on the first import. For example,
here are the performance numbers for importing the Prelude for a cold cache
before and after this change:
Before:
```
$ XDG_CACHE_HOME=.cache time dhall hash <<< 'https://prelude.dhall-lang.org/package.dhall'
sha256:99462c205117931c0919f155a6046aec140c70fb8876d208c7c77027ab19c2fa
64.10 real 10.83 user 2.73 sys
```
After:
```
$ XDG_CACHE_HOME=.cache2 time dhall hash <<< 'https://prelude.dhall-lang.org/package.dhall'
sha256:99462c205117931c0919f155a6046aec140c70fb8876d208c7c77027ab19c2fa
4.39 real 0.49 user 0.15 sys
```
That's ~15x faster!
The improvement for `dhall-kubernetes` is smaller, but still significant:
Before:
```
$ XDG_CACHE_HOME=.cache3 time dhall hash <<< ~/proj/dhall-kubernetes-charts/stable/jenkins/index.dhall
sha256:04ebd960f6af331c49c3ccaedb353ac8269032b54fe0a29bd167febcd7104d4f
833.59 real 145.36 user 36.16 sys
```
After:
```
$ XDG_CACHE_HOME=.cache4 time dhall hash <<< ~/proj/dhall-kubernetes-charts/stable/jenkins/index.dhall
sha256:04ebd960f6af331c49c3ccaedb353ac8269032b54fe0a29bd167febcd7104d4f
381.41 real 8.41 user 1.91 sys
```
... or a ~2.2x improvement.
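The caching idea behind this fix (create the expensive resource on the first
request, then reuse it) can be sketched as follows. This is a minimal
illustration and not the actual `Dhall.Import` code: `getOrCreate` and `demo`
are hypothetical names, and a plain `String` stands in for the `Manager` that
`newManager` would return.

```haskell
import Data.IORef (IORef, modifyIORef, newIORef, readIORef, writeIORef)

-- | Return the cached resource if there is one; otherwise run the (expensive)
--   creation action once and remember its result.
getOrCreate :: IORef (Maybe a) -> IO a -> IO a
getOrCreate cache create = do
    cached <- readIORef cache
    case cached of
        Just resource -> return resource
        Nothing -> do
            resource <- create
            writeIORef cache (Just resource)
            return resource

-- | Simulate three remote imports and count how often the resource is created.
demo :: IO Int
demo = do
    creations <- newIORef (0 :: Int)
    cache <- newIORef Nothing
    -- Hypothetical stand-in for an expensive call such as `newManager`
    let create = modifyIORef creations (+ 1) >> return "manager"
    _ <- getOrCreate cache create
    _ <- getOrCreate cache create
    _ <- getOrCreate cache create
    readIORef creations
```

In this sketch `demo` returns `1`: the creation action runs for the first
request only, and the two later requests reuse the cached value.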
* Fix `-f-with-http` build
* Remove unnecessary `CPP`
... as caught by @sjakobi
... by not going through a `Term` intermediate
This gives a ~28% improvement in decoding performance, which means that
cache lookups are now faster.
Here are the new decoding benchmarks before and after this change:
Before:
```
benchmarked Issue #108/Binary
time 266.5 μs (265.7 μs .. 267.4 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 266.3 μs (265.6 μs .. 267.1 μs)
std dev 2.418 μs (1.891 μs .. 3.436 μs)
benchmarking Kubernetes/Binary ... took 36.94 s, total 56 iterations
benchmarked Kubernetes/Binary
time 641.3 ms (623.0 ms .. 655.4 ms)
0.999 R² (0.997 R² .. 1.000 R²)
mean 679.7 ms (665.5 ms .. 702.6 ms)
std dev 29.48 ms (14.15 ms .. 39.05 ms)
```
After:
```
benchmarked Issue #108/Binary
time 282.2 μs (279.6 μs .. 284.7 μs)
1.000 R² (0.999 R² .. 1.000 R²)
mean 281.9 μs (280.7 μs .. 287.7 μs)
std dev 7.089 μs (2.550 μs .. 15.44 μs)
variance introduced by outliers: 11% (moderately inflated)
benchmarking Kubernetes/Binary ... took 27.57 s, total 56 iterations
benchmarked Kubernetes/Binary
time 499.1 ms (488.1 ms .. 506.6 ms)
0.999 R² (0.998 R² .. 1.000 R²)
mean 498.9 ms (494.4 ms .. 503.9 ms)
std dev 8.539 ms (6.236 ms .. 12.56 ms)
```
There's a slight performance regression for the decoding microbenchmark, but
in practice my testing on real examples matches the performance improvements
seen in the larger benchmark, which is based on an example cache product from
`dhall-kubernetes`.
Note that this is a breaking change because:
* There is no longer a `FromTerm` nor `ToTerm` class. Now we use the
`Serialise` class and `{encode,decode}Expression` now work on `ByteString`s
instead of `Term`s
* I further narrowed the types of several encoding/decoding utilities to expect a
`Void` for the first type parameter of `Expr`
* This is a regression with respect to stripping 55799 CBOR tags, mainly
because properly handling the tags at every possible point in the syntax tree
would considerably complicate the code
This updates the `dhall` package to have 100% haddock coverage and
also updates CI to enforce this going forward.
This also includes a change to deprecate the `X` type synonym, which
I noticed along the way
* Warning action about missing cache dir
* Add warning to executable
* Correct duplicate cacheName in getCacheFile
* Warn if dhall-haskell cache dir is not usable
* Improve warn message
Co-Authored-By: Simon Jakobi <simon.jakobi@gmail.com>
* Correct plural
Co-Authored-By: Simon Jakobi <simon.jakobi@gmail.com>
* Improve syntax
Co-Authored-By: Simon Jakobi <simon.jakobi@gmail.com>
* Use runMaybeT to make 2 step warnings
* Add FlexibleContexts to make lts-6 happy
* Catch `IOException`s when handling cache dir
This makes the Haskell implementation follow the standard.
* Correct unwanted indentation
* Push warnings to get* wrapper functions
* Remove unnecessary lang extension
* Inline warnings in get* functions
* Be consistent with line breaks
* Apply suggestions from code review
About phrasing, formatting, syntax, etc
Co-Authored-By: Gabriel Gonzalez <Gabriel439@gmail.com>
* doesPathExist is not in directory-1.2.2.0
* Make message fit in 80 cols
When enabled, we handle protected imports as if the semantic cache was
empty:
* Protected imports are resolved again, downloaded or read from
the filesystem as necessary.
* Protected imports are β-normalized, not αβ-normalized.
* Protected imports are checked against their SHA256 hashes,
failing to resolve if they don't match.
Context:
https://github.com/dhall-lang/dhall-haskell/pull/1275#issuecomment-528847192
* Add a haddock to explain the various `Binding` fields.
* Add combinators to make dealing with `Binding` less awkward.
With all of the source information flying around, manually
deconstructing and reconstructing `Binding`s is a pain. These
combinators cover some very common cases.
* Use `bindingExprs` to simplify `subExpressions`.
* Use bindingExprs and chunkExprs to simplify another traversal.
Closes #1185.
This mostly reverts "Add support for multi-`let` (#675)" /
8a5bfaa3b9.
Also:
* Add fields for Src
This is useful for making `Note`s less noisy during debugging:
first srcText expr
* Remove Dhall.X and replace with Data.Void
This commit removes the Dhall.X module and the Dhall.X.X type,
preferring the use of Data.Void.Void. As I'm sure a lot of people are
actually using X, I've added a type-alias type X = Void. However,
pattern matching on X would be a breaking change.
Fixes #1120.
* Restore unsafeCoerce
* Fix regression
* Unused
* Reorganise exports
* Fix dhall-nix
* Another fix
* Fix Dhall.LSP.Backend.Typing
* Fix dhall-bash
* Remove most uses of `StandardVersion` from the API
We no longer support multiple versions of the standard, except for
supporting old integrity checks, so this change removes all inessential
uses of `StandardVersion` from the API and command-line interface.
* Fix `dhall-lsp-server` build
The motivation for this change is to avoid α-normalizing all imported
expressions.
For example, before this change you would get the following behavior
beginning with an empty cache
```
$ cat ./example.dhall
λ(a : Type) → a
$ dhall <<< './example.dhall'
λ(_ : Type) → _
```
The reason why is that the current code α-normalizes all imported
expressions, even when returning them fresh.
To fix this, I changed the `ImportSemantics` type to not require that
expressions are α-normalized. Instead, the α-normalization only
happens at the last minute when interacting with the semantic cache, but
nowhere else.
I figured that this change would also be fine from the perspective of
the semi-semantic cache because false-negatives for this cache are
fine. In particular, we probably don't mind if we get a cache miss for
the semi-semantic cache if the user renames a variable.
After this change imports are no longer α-normalized, whether loaded
from a hot or cold cache:
```
$ cat ./example.dhall
λ(a : Type) → a
$ dhall <<< './example.dhall'
λ(a : Type) → a
$ dhall <<< './example.dhall'
λ(a : Type) → a
```
* Allow customization of remote import resolution
Makes the `Status` type more general; previously support for
`Network.HTTP.Client` was hardcoded. In short:
```
data Status = Status
{ _stack :: NonEmpty Chained
[...]
-- , _manager :: Maybe Dynamic
-- -- importing the same expression twice with different values
++ , _remote :: URL -> StateT Status IO Data.Text.Text
++ -- ^ The remote resolver, fetches the content at the given URL.
[...]
}
```
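To illustrate why storing the resolver as a function is useful, here is a
minimal sketch with simplified, hypothetical types: `URL` is just a `String`,
the resolver runs in plain `IO` rather than `StateT Status IO`, and
`mockStatus` and `fetchRemote` are illustrative helpers that are not part of
the `dhall` API.

```haskell
-- Simplified, hypothetical stand-in for the real `URL` type
type URL = String

-- Simplified, hypothetical stand-in for the real `Status` record
newtype Status = Status
    { _remote :: URL -> IO String
      -- The remote resolver, fetches the content at the given URL
    }

-- | A mock resolver backed by a lookup table instead of the network
mockStatus :: [(URL, String)] -> Status
mockStatus table = Status { _remote = resolve }
  where
    resolve url =
        case lookup url table of
            Just body -> return body
            Nothing -> ioError (userError ("mock: no response for " ++ url))

-- | Fetch a remote import using whichever resolver the `Status` carries
fetchRemote :: Status -> URL -> IO String
fetchRemote status url = _remote status url
```

Because callers only ever go through the `_remote` field, swapping a real HTTP
client for a table of canned responses needs no changes on the caller's side.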
* Simplify and expose `toHeaders`
`toHeaders` will be needed for mock http testing
* Fix compilation without `with-http` flag
* Fix compilation with `with-http` flag
* Fix tests without `with-http` flag
Implements a mock http client that handles requests to:
- `https://raw.githubusercontent.com/dhall-lang/dhall-lang/master/`
- `https://test.dhall-lang.org/Bool/package.dhall`
- `https://httpbin.org/user-agent`
This allows tests involving remote imports to succeed even when compiled
without the `with-http` flag.
* Build `dhall` with HTTP support compiled out in CI
... to prevent regressions from occurring in the future
* Tag ImportSemantics with their semantic hashes
This is in preparation for semi-semantic caching.
* Collect the list of imports during import resolution
The final step needed in preparation for semi-semantic caching!
* Implement semi-semantic caching
This completes the implementation of the "semi-semantic caching"
proposal (issue #1098).
We compute the semi-semantic hash of a dhall import/file/expression as
follows:
- Parse the input;
- compute the semantic hashes of all imports referenced in the AST, i.e.
the hashes of their normal forms;
- compute the syntactic hash of the input (hashing the parsed AST);
- concatenate the syntactic hash of the input with the semantic hashes
of its imports and hash the result.
The "semi-semantic" cache (normal forms, indexed by semi-semantic
hashes) has the following properties:
- For a given input we can quickly find out if it is in the cache: we
only need to parse the input – we don't need to typecheck or normalise
it!
- The cache stays consistent, that is, we don't need to ‘invalidate’ old
cache entries if their dependencies change!
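The hash computation described above can be sketched as follows. This is an
illustration under loud assumptions: `toyHash` is a small FNV-1a-style function
standing in for SHA256, the parsed AST is represented as a plain `String`, and
`semiSemanticHash` is a hypothetical name rather than the function used in
`Dhall.Import`.

```haskell
import Data.Bits (xor)
import Data.Char (ord)
import Data.List (foldl')
import Data.Word (Word64)

-- | Toy FNV-1a-style hash over characters; a stand-in for SHA256 (assumption)
toyHash :: String -> Word64
toyHash = foldl' step 14695981039346656037
  where
    step acc c = (acc `xor` fromIntegral (ord c)) * 1099511628211

-- | Combine the syntactic hash of the parsed input with the semantic hashes
--   of its imports, then hash the combination.
semiSemanticHash
    :: String    -- ^ Stand-in for the parsed AST of the input
    -> [Word64]  -- ^ Semantic hashes (hashes of normal forms) of its imports
    -> Word64
semiSemanticHash parsedInput importHashes =
    toyHash (unwords (map show (toyHash parsedInput : importHashes)))
```

With this shape the key depends only on the parsed input and on the normal
forms of its imports. If a dependency's normal form changes, the key changes
too, so a stale entry is simply never looked up again and nothing needs to be
invalidated.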
* Simplify semi-semantic hash
As suggested by @Gabriel439.
* Simplify code
We don't actually need to carry the list of imports around when loading.
* Restore `load`
Previously, `BAD="0 0" dhall <<< "env:BAD ? 0"` resulted in the
following error:
```
↳ env:BAD
Error: Not a function
1│ 0 0
BAD:1:1
```
According to the standard the above expression was supposed to evaluate
successfully to `0`. See #1146 for further discussion.
* Load imports recursively
This is the big change that enables us to implement 'semi-semantic'
caching.
* Use `throwM` instead of `liftIO . throwIO`
* Fix build with __GHCJS__
* Fix exceptions in Dhall.Import
* Fix dhall-lsp-server
* Revert exception behaviour on typecheck errors
This is one for a separate pull request!
* Make sure loadImportFresh returns alpha-normal expression
As caught by @Gabriel439, `loadImportFresh` violated the invariant that
`ImportSemantics` should be alpha-beta-normal. This fix also means that
we don't have to alpha-normalise again in `loadImportWithSemanticCache`.
* Remove old comment
* Fix regression test for issue 216
Turns out the test was testing the wrong thing, because it was
pretty-printing an import. This worked previously because when importing
uncached expressions we would not alpha-normalise them.
* Restore `dhall freeze` behaviour
Newly frozen imports should also be present in the cache.
* Fix misleading comment
* Add `Chained` type to capture fully chained imports
Until now we used `Import` to mean two different things:
- The syntactic construct; e.g. `./a.dhall` corresponds to the following
AST:
```
Embed
(Import
(ImportHashed Nothing (Local Here (Directory ["."]) "a.dhall"))
Code)
```
- The physical location the import is pointing to, computed by
'chaining' the syntactical import with the 'physical' parent import.
For example the syntactic import `./a.dhall` might actually refer to the
remote file `http://host/directory/a.dhall`.
This commit adds a `Chained` newtype on top of `Import` to make this
distinction explicit at type level.
* Use `HTTPHeaders` alias for binary headers
I claim that `HTTPHeaders` is more readable and informative than the
unfolded type `(CI ByteString, ByteString)`.
* Typecheck and normalise http headers earlier
Previously we would typecheck and normalise http headers in
`exprFromImport`, i.e. while loading the import. This commit adds the
invariant that any headers in 'Chained' imports are already typechecked
and normalised, and moves this step into `loadWith` accordingly.
This causes a subtle difference in behaviour when importing remote files
with headers `as Location`: previously, nonsensical expressions like
`http://a using 0 0 as Location` were valid, while they would now cause
a type error.
* Fix dhall-lsp-server
* Fix Dhall.Import API regarding `Chained` imports
Do not expose the `Chained` constructor; we don't want external code
breaking our invariants! Also further clarifies the comment describing
the `Chained` type.
* Fix dhall-lsp-server
Since we are no longer able to construct `Chained` imports directly we
need to export a few additional helper functions from Dhall.Import.
Furthermore, since VSCode (and presumably the other editors out there
implementing the LSP protocol) does not support opening remote files
anyway we can get rid of some complications by dropping support for
remote files entirely on the back-end.
* Generalise decodeExpression, fixes TODO
* Fix tests
* Fix benchmarks
* Remove Travis cache for `~/.local/bin`
* Fix copy-pasted comment
Thanks to @Gabriel439 for spotting this!
* Add clarifying comment to `toHeaders`
It is not the case that
canonicalize (a <> b) = canonicalize a <> canonicalize b.
For example
canonicalize (Directory ["asd"] <> Directory [".."])
= Directory [],
but
canonicalize (Directory ["asd"]) <> canonicalize (Directory [".."])
= Directory ["..", "asd"].
The law we want instead is:
canonicalize (a <> b)
= canonicalize (canonicalize a <> canonicalize b)
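The counterexample and the law above can be checked with a small
self-contained model. This is a sketch, not the definitions from the `dhall`
package: it assumes that `Directory` stores path components in reverse order
(which is what the examples above imply) and re-implements `canonicalize` and
`<>` only for directories.

```haskell
-- | Path components in reverse order, as the examples above imply (assumption)
newtype Directory = Directory [String]
    deriving (Eq, Show)

-- | Appending paths prepends the right-hand components to the reversed list
instance Semigroup Directory where
    Directory xs <> Directory ys = Directory (ys ++ xs)

-- | Drop "." components and cancel ".." against the preceding component
canonicalize :: Directory -> Directory
canonicalize (Directory components) = Directory (go components)
  where
    go [] = []
    go ("." : rest) = go rest
    go (".." : rest) =
        case go rest of
            [] -> [".."]
            ".." : rest' -> ".." : ".." : rest'
            _ : rest' -> rest'
    go (component : rest) = component : go rest
```

In this model, with `a = Directory ["asd"]` and `b = Directory [".."]`,
`canonicalize (a <> b)` is `Directory []`, while
`canonicalize a <> canonicalize b` is `Directory ["..", "asd"]`.
Canonicalizing that second result again gives `Directory []`, which is the
law stated above.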
* Expose `localToPath` in Dhall.Import
Also modifies `localToPath` to return a relative path if the input was
relative, rather than resolving relative paths by appending the current
directory.
* Turn imports into clickable links
This implements a handler for 'Document Link' requests. As a result,
imports are now clickable!
* Recover original behaviour
* Move "Dot" import graph generation to Dhall.Main
Previously `Dhall.Import` would generate the import graph in "dot"
format while resolving imports. This change simplifies `Dhall.Import` to
only keep track of the adjacency list representing the import graph,
moving the logic for generating "dot" files to Dhall.Main.
This change will allow us to implement proper cache invalidation for
`dhall-lsp-server`.
* Correctly invalidate transitive dependencies
Fixes `dhall-lsp-server`'s caching behaviour to correctly invalidate
cached imports that (possibly indirectly) depend on the changed file.
Example:
Suppose we have the following three files:
{- In A.dhall -} 2 : ./B.dhall
{- In B.dhall -} ./C.dhall
{- In C.dhall -} Natural
Previously, changing C.dhall to `Text` would not cause `A.dhall` to stop
type-checking, since the old version of `B.dhall` (which evaluated to
`Natural`) would still have been in the cache. This change fixes that
behaviour.
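A minimal sketch of the transitive invalidation described above, assuming the
import graph is available as a list of `(importer, imported)` edges. The names
`Edge`, `invalidated` and `exampleGraph` are hypothetical and are not taken
from `dhall-lsp-server`.

```haskell
-- | An edge (importer, imported) of the import graph
type Edge = (FilePath, FilePath)

-- | Every file whose cached result must be dropped when the given file is
--   edited: the file itself plus everything that (possibly indirectly)
--   imports it.
invalidated :: [Edge] -> FilePath -> [FilePath]
invalidated edges changed = go [changed] []
  where
    go [] seen = reverse seen
    go (file : queue) seen
        | file `elem` seen = go queue seen
        | otherwise = go (queue ++ importersOf file) (file : seen)

    importersOf file =
        [ importer | (importer, imported) <- edges, imported == file ]

-- | The three-file example from above: A imports B, B imports C
exampleGraph :: [Edge]
exampleGraph = [ ("A.dhall", "B.dhall"), ("B.dhall", "C.dhall") ]
```

For the example, `invalidated exampleGraph "C.dhall"` yields
`["C.dhall", "B.dhall", "A.dhall"]`, so the stale `B.dhall` entry is dropped
and `A.dhall` is type-checked again.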
* Make edges of import graph self-documenting
As suggested by @Gabriel439
* Don't cache expressions manually
After computing the diagnostics for a given file we added its normal
form to the cache, but forgot to add its dependencies to the dependency
graph. This bug points out that keeping the import graph consistent
manually is probably not a good idea. With this commit we never mess
with the import cache manually; this means that files are only cached
once they are depended upon by some other file, potentially causing us
to duplicate work (but no more than once).
* Fix left-overs from previous commit
Part of https://github.com/dhall-lang/dhall-lang/issues/563
This flag freezes imports in the same way as the Prelude by providing a
fallback unprotected import without an integrity check. The primary use
case for this is caching imports with a graceful fallback, which is why
the flag is named `--cache`
This adds a new `Dhall.Test.Util.discover` utility for auto-generating
a `TestTree` from a directory tree. This simplifies keeping up to date
with changes to the standard test suite.
- Dhall.Eval: new evaluator, conversion checker and normalizer.
There is no standalone alpha normalizer yet.
- There is a new option "new-normalize" for dhall executable, which uses
the new normalizer.
- Type checker is unchanged.
- new implementation: alphaNormalize, judgmentallyEqual, normalize
- normalizeWith takes a Maybe ReifiedNormalizer argument now, and switches to
the new evaluator whenever the input normalizer is Nothing
- QuickCheck test for isNormalized removed, because we don't support evaluation
of ill-typed terms, which the test would require.
... as standardized in https://github.com/dhall-lang/dhall-lang/pull/426
This adds two new `ToTerm`/`FromTerm` classes in order to minimize
code disruption. The main disruption is due to renaming the old
`encode`/`decode` to `encodeExpression`/`decodeExpression`
... as standardized in https://github.com/dhall-lang/dhall-lang/pull/438
This also adds `dhall-json` support for empty alternatives
In particular, this translates empty alternatives to strings encoding the alternative name
```haskell
-- ./example.dhall
let Role = < Wizard | Fighter | Rogue >
in [ Role.Wizard, Role.Fighter ]
```
```
$ dhall-to-json <<< './example.dhall'
["Wizard","Fighter"]
```