
Lazy initialization for recursive bindings#4283

Merged
JordanMartinez merged 1 commit into purescript:master from rhendric:rhendric/fix-4179
Apr 14, 2022

Conversation

@rhendric (Member) commented Apr 4, 2022

I wrote a book in src/Language/PureScript/CoreFn/Laziness.hs; read that
for details.

This is unlikely to break a working program, but the upshot for users is
that it's now possible to get a run-time error when dereferencing an
identifier in a recursive binding group before it has been initialized,
instead of silently getting an undefined value and having that maybe
or maybe not lead to an error somewhere else.
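
The behavioral difference can be sketched in plain JavaScript (hypothetical helper names; this is not the compiler's actual emitted code):

```javascript
// Before this change: reading a recursive binding too early silently
// yields undefined, and any error surfaces far from the real cause.
var tooEarly;
var x;
tooEarly = x; // x has not been initialized yet
x = 5;

// After this change: a lazy wrapper with an initialization flag raises
// an explicit error at the point of premature access instead.
// ($lazy_y is a hypothetical name for illustration.)
var $lazy_y = (function () {
  var initialized = false;
  var value;
  return {
    init: function (v) { value = v; initialized = true; },
    force: function () {
      if (!initialized) {
        throw new ReferenceError("y was needed before it finished initializing");
      }
      return value;
    }
  };
})();

var prematureAccessThrew = false;
try {
  $lazy_y.force(); // accessed before init: explicit run-time error
} catch (e) {
  prematureAccessThrew = true;
}
$lazy_y.init(5);
var y = $lazy_y.force();
```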

Description of the change

Fixes #4179.


Checklist:

  • Added a file to CHANGELOG.d for this PR (see CHANGELOG.d/README.md)
  • Added myself to CONTRIBUTORS.md (if this is my first contribution)
  • Linked any existing issues or proposals that this pull request should close
  • Updated or added relevant documentation
  • Added a test for the contribution (if applicable)

@rhendric force-pushed the rhendric/fix-4179 branch from 4bce1d2 to d9324a9 on April 4, 2022 at 23:04
@JordanMartinez (Contributor)

Thanks! (and wow... this looks complicated.) I'm going to take a look at this tomorrow.

@rhendric (Member, Author) commented Apr 5, 2022

Ha, hopefully it isn't as bad as it seems! (The main file is 60% comments, which I suppose is its own sort of red flag....)

I forgot to mention: re-running the benchmark and analysis I ran in #4179 (comment) produced few meaningful differences, with the one exception that the number of bindings made lazy has dropped from 2% then (142/6547) to just under 1% (65/6611). (The denominators are different because the package set libraries have changed!) Minified bundle sizes are still almost 4% larger with the laziness patch though, so those per-module runtime functions seem to be contributing the bulk of the bloat. If 4% is something we can live with for this release, we can design and implement an actual runtime module after 0.15 without changing any semantics, I think.

@JordanMartinez (Contributor)

Minified bundle sizes are still almost 4% larger with the laziness patch though, so those per-module runtime functions seem to be contributing the bulk of the bloat.

Given that esbuild produces a ~25% smaller bundle than purs bundle, I think that's ok for now. But fixing this bloat in the future would be desirable. Moreover, if this unblocks #3915, I think the tradeoff is worth it.

@garyb (Member) commented Apr 5, 2022

This looks great, and the comments are very helpful. I don't think the volume of them is pointing to a bad thing, the concept is simple enough, there's just a bunch of details involved in making it a reality 😉 (the infinite graph being the main culprit for that, understandably).

@JordanMartinez (Contributor)

Here's my summary of this doc comment: https://github.com/purescript/purescript/pull/4283/files#diff-052739b45547126f69f413fa5df905734d356e26eee8bf58e9cabd4eadd6799fR67-R77

Delay = # of lambdas enclosing a var (always known)

    Delay   Expression (for x)
    0       x
    1       \_ -> x
    2       \_ -> \_ -> x
    2       \_ _ -> x

Force = # of args being applied to expression (sometimes known)

    Force   Expression
    0       f
    1       f x
    2       f x x
    2       f x \y -> ...

My own syntax reference

  • x#n = a variable x with delay n (n cages surround x)
  • f!n = an expression f with force n (n forces before expression f is reduced)

I'm trying to understand how the rules work. Below is my current understanding, but I'm not sure I'm getting it right:

  • Delay
    • D-Init - "a Var with delay 0 needs to be initialized before the enclosing expression is evaluated"
      • x#0 + 1 means x needs to be initialized before the enclosing expression, x + 1, is evaluated
    • D-Delay - "a Var with more delay can wait"
      • \_ -> x#1 + 1 means x does not need to be initialized
      • Does this mean x does not need to be initialized before the enclosing expression is evaluated? (I assume not but that seems to be how the sentence should read.) Before \_ -> x + 1 is used as an argument somewhere? etc.
  • Force
    • F-Init0 - "a Var with force 0 only needs to have its delay-0 references initialized in order to be ready for use"
      • x!0 = foo!0#0 + 1 means x cannot be used until foo has been initialized
    • F-Init1 - "a Var with force 1 needs to have all of its delay-1 references initialized as well"
      • What would be an example of this? I'm not entirely sure how to translate this one to my syntax...
      • x!1 f!1#0 = f unit + 3 isn't quite an example because f has delay 0, not delay 1, even though x has force 1.
      • x!2 f!1#1 = \arg -> show (f arg) isn't quite right because x technically has force 2, not force 1, even though f has delay 1.
      • let f!1 = ...; x!1 = \arg -> f!1#1 arg seems to be this example...?

@rhendric (Member, Author) commented Apr 5, 2022

As an aside, I have to assume that somewhere someone has come up with this sort of delay/force analysis before now (probably with different names); if any of you know of a standard paper I could reference or even set of terminology I could use instead of developing the concepts from scratch, I figure that would be a big help.

I like your syntax annotations. I'll propose a slight extension: f!* means that the expression f is used with unknown (which practically gets treated as unlimited) force.

  • D-Delay - "a Var with more delay can wait"

    • \_ -> x#1 + 1 means x does not need to be initialized

    • Does this mean x does not need to be initialized before the enclosing expression is evaluated? (I assume not but that seems to be how the sentence should read.) Before \_ -> x + 1 is used as an argument somewhere? etc.

Force is a measure of ‘how much’ an expression like \_ -> x#1 + 1 gets reduced. x#1 means that force of at least 1 will be necessary to activate the reference to x. So, in f = (\_ -> x#1 + 1)!0, x doesn't need to be initialized first. But in:

f = (\_ -> x#1 + 1)!0
y = (f!1 unit)!0

x needs to be initialized before y (but still not necessarily before f; f hasn't changed, after all). The definition of y applies force 1 to f, and so all of f's delay-1 references need to be initialized first.

Similarly,

f = (\_ -> x#1 + 1)!0
g = (h f!*)!0

f is in argument position and thus is used with unknown force (we don't know what h is going to do to f). In this case, x again needs to be initialized before g, because unknown force is effectively unlimited force and requires references of every delay level to be initialized first.

  • F-Init0 - "a Var with force 0 only needs to have its delay-0 references initialized in order to be ready for use"

    • x!0 = foo!0#0 + 1 means x cannot be used until foo has been initialized

I would write this as x = foo!*#0 + 1. The LHS of a binding doesn't have delay or force; delay and force are properties of expressions. And foo here is properly speaking an argument to the function (+), and without knowing what (+) does, we don't know the level of force it applies to its arguments. And the conclusion to be drawn is that x can't be initialized (a stronger statement than ‘x can't be used’) until foo has been.

  • F-Init1 - "a Var with force 1 needs to have all of its delay-1 references initialized as well"

    • What would be an example of this? I'm not entirely sure how to translate this one to my syntax...

See my f and y example in my discussion of D-Delay.

@JordanMartinez (Contributor) commented Apr 5, 2022

The LHS of a binding doesn't have delay or force; delay and force are properties of expressions.

Ah! That clears up probably my biggest misunderstanding.

Let me try redefining the rules in light of this clarification.

-- D-Init - "a Var with delay 0 needs to be initialized
--           before the enclosing expression is evaluated"
let
  binding = (x#0 + 1)!0
in ...
-- Translation:
-- `x` needs to be initialized before the enclosing expression, `x + 1`, is evaluated

-- D-Delay - "a Var with more delay can wait"
let
  binding = (\_ -> x#1 + 1)!1
in ...
-- Translation:
--`x` does not need to be initialized

-- F-Init0 - "a Var with force 0 only needs to have its
--            delay-0 references initialized in order to be ready for use"
let
  foo = (1)!0
  bar = (foo!0#0)!0
in ...
-- Translation:
-- `bar` cannot be _used_ until `foo` has been initialized
-- However, `bar` _can_ be initialized before `foo` has been initialized

-- F-Init1 - "a Var with force 1 needs to have all of its
--            delay-1 references initialized as well"
let
  foo = (1)!0
  lazyFoo = (\_ -> foo#1)!1
  bar = (lazyFoo!1#0 true!0#0)!0
in ...
-- Translation:
-- For `lazyFoo`, `foo` does not need to be initialized due to the rule, D-Delay.
-- However, `bar` cannot be _initialized_ until after `foo` has been initialized
-- because the force needed to activate `foo` in `lazyFoo` is being applied.

-- Inferred rule based on `f!*` comment.
-- F-InitStar - "a Var with force * needs to have all of its
--               references (regardless of delay) initialized as well"
foreign import addFFI :: (Unit -> Int) -> Int -> Int

let
  foo = (1)!0
  lazyFoo = (\_ -> foo#1)!1
  bar = (addFFI!2 lazyFoo!*#0 2!0#0)!0
in ...
-- Translation:
-- Normally, `foo` in `lazyFoo` does not need to be initialized via rule, D-Delay.
-- However, `bar` cannot be _initialized_ until after all entities referenced
-- by `lazyFoo` (i.e. `foo`) have been initialized because we don't know
-- what `addFFI` will do with `lazyFoo`.
-- Since `lazyFoo` _might_ be used, we have to assume the force
-- needed to activate all of `lazyFoo`'s references is being applied.
-- Without knowing how much force is needed to fully activate all entities
-- referenced by `lazyFoo`, we use infinity as the amount of force.

@rhendric (Member, Author) commented Apr 5, 2022

-- D-Delay - "a Var with more delay can wait"
let
  binding = (\_ -> x#1 + 1)!1
in ...
-- Translation:
--`x` does not need to be initialized

This should be binding = (\_ -> x#1 + 1)!0. Force starts at 0.

Now, to head off a different potential misunderstanding, I also said that in the bindings of a Let, force is unknown. From the perspective of this analysis, there is a difference between the ‘top-level’ bindings and inner let-bindings. (I put ‘top-level’ in quotes because I don't necessarily mean module-level; this analysis runs once for every recursive binding group, and each time it's run, the given binding group acts as the top level.) So a more complicated example that illustrates this would be:

let
  f = (...)!0
  g = let
    x = (...)!*
    y = (...)!*
  in (...)!0
in ...

That is, assuming that f and g end up referencing each other recursively, of course! While this isn't a problem for understanding the concepts of force and delay, bear in mind that this analysis only ever runs for recursive binding groups, which none of these examples include—if the bindings can be ordered just by topsorting references, then the binding groups desugaring phase will do so and nothing is fed to the laziness transform. (It's pretty tricky to write examples that are recursive and type-check and aren't rejected by different cycle-checking logic in the type checker and demonstrate some of the weirder edge cases of these rules, so it definitely makes sense to drop some of these requirements for purposes of trying the ideas out.)

-- F-Init0 - "a Var with force 0 only needs to have its
--            delay-0 references initialized in order to be ready for use"
let
  foo = (1)!0
  bar = (foo!0#0)!0
in ...
-- Translation:
-- `bar` cannot be _used_ until `foo` has been initialized
-- However, `bar` _can_ be initialized before `foo` has been initialized

F-Init0 doesn't really apply to bar here, which only appears in the example as a binding, not as a Var. It applies—trivially—to foo. The foo!0 Var can't be used until all of foo's delay-0 references (an empty set) are initialized.

But D-Init also applies to this situation. The foo Var also has delay 0, and so foo needs to be initialized before the definition of bar can be evaluated, so bar can't be initialized first.

-- F-Init1 - "a Var with force 1 needs to have all of its
--            delay-1 references initialized as well"
let
  foo = (1)!0
  lazyFoo = (\_ -> foo#1)!1
  bar = (lazyFoo!1#0 true!0#0)!0
in ...
-- Translation:
-- For `lazyFoo`, `foo` does not need to be initialized due to the rule, D-Delay.
-- However, `bar` cannot be _initialized_ until after `foo` has been initialized
-- because the force needed to activate `foo` in `lazyFoo` is being applied.

I'll revise those annotations:

let
  foo = (1)!0
  lazyFoo = (\_ -> foo#1)!0
  bar = (lazyFoo!1#0 true!*#0)!0
in ...

But your translation is correct!

-- Inferred rule based on `f!*` comment.
-- F-InitStar - "a Var with force * needs to have all of its
--               references (regardless of delay) initialized as well"
foreign import addFFI :: (Unit -> Int) -> Int -> Int

let
  foo = (1)!0
  lazyFoo = (\_ -> foo#1)!1
  bar = (addFFI!2 lazyFoo!*#0 2!0#0)!0
in ...
-- Translation:
-- Normally, `foo` in `lazyFoo` does not need to be initialized via rule, D-Delay.
-- However, `bar` cannot be _initialized_ until after all entities referenced
-- by `lazyFoo` (i.e. `foo`) have been initialized because we don't know
-- what `addFFI` will do with `lazyFoo`.
-- Since `lazyFoo` _might_ be used, we have to assume the force
-- needed to activate all of `lazyFoo`'s references is being applied.
-- Without knowing how much force is needed to fully activate all entities
-- referenced by `lazyFoo`, we use infinity as the amount of force.

Again, fully correct translation, but the annotations should be:

let
  foo = (1)!0
  lazyFoo = (\_ -> foo#1)!0
  bar = (addFFI!2 lazyFoo!*#0 2!*#0)!0
in ...

This is all great feedback for me, by the way—in particular I'm learning that I should consolidate the language that describes the judgments about initialization order that can be made given delay and force information. In addition to the rules you've named so far, there's one more big one that I reveal a little later in the file:

  • F-InitTrans1: Before a Var x with force 1 can be used, not only must all of the delay-1 references in the definition of x be ready for use (F-Init1), so must all of the delay-0 references with 1 extra unit of force.

This generalizes to:

  • F-InitTrans: Before a Var x with force f can be used, for d from 0 to f, all delay-d references in the definition of x must be ready for use with f − d additional units of force.
  • F-InitTransStar: Before a Var x with unknown force can be used, all references in the definition of x must be ready for use with unknown force. (This extends F-InitStar, which as written only claims that all references in the definition of x must be initialized. It's also a straightforward consequence of F-InitTrans if you think of force as an element of the naturals plus infinity, with infinity ± any finite number = infinity. (I originally did use a data ExtendedNat = Fin Int | Inf type for force in this code, but I mapped it in and out of Maybe so frequently that I decided it was clearer using Maybe directly.))

The example from my comments, with relevant references newly annotated, is:

f = g#0!2 1 2
g x = h#1!2 3 x
h x y z = f#3!0

To initialize f, g must be used with force 2. To use g with force 2, h must be used with force 3 (2 − 1 + 2). To use h with force 3, f must be used with force 0. The algorithm should conclude that this is a loop that requires run-time laziness. Without F-InitTrans, the algorithm would not reach such a conclusion.
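
To see why run-time laziness is the only way out here, the cycle can be transliterated to JavaScript (an illustrative sketch, not compiler output; the curried shapes mirror the annotated example):

```javascript
// Naive top-to-bottom initialization fails outright: g is still
// undefined when the initializer of f calls it.
var naiveFailed = false;
try {
  var f = g(1)(2);
  var g = function (x) { return function (z) { return h(3)(x)(z); }; };
  var h = function (x) { return function (y) { return function (z) { return f; }; }; };
} catch (e) {
  naiveFailed = true; // TypeError: g is not a function
}

// With a lazy, state-checked f, the cycle is detected at run time:
// forcing f during its own initialization raises an explicit error
// instead of looping forever or producing undefined.
var g2 = function (x) { return function (z) { return h2(3)(x)(z); }; };
var h2 = function (x) { return function (y) { return function (z) { return f2(); }; }; };
var f2 = (function () {
  var state = 0; // 0 = uninitialized, 1 = initializing, 2 = done
  var value;
  return function () {
    if (state === 2) return value;
    if (state === 1) throw new ReferenceError("f was needed before it finished initializing");
    state = 1;
    value = g2(1)(2); // re-enters f2 through h2, hitting the check above
    state = 2;
    return value;
  };
})();

var cycleDetected = false;
try { f2(); } catch (e) { cycleDetected = e instanceof ReferenceError; }
```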

@JordanMartinez (Contributor)

It's pretty tricky to write examples that are recursive and type-check and aren't rejected by different cycle-checking logic in the type checker and demonstrate some of the weirder edge cases of these rules, so it definitely makes sense to drop some of these requirements for purposes of trying the ideas out.

Perhaps we should instead use the rules you've defined on the golden tests themselves? Since those are the "pretty tricky" examples that are otherwise hard to come up with?

@rhendric (Member, Author) commented Apr 7, 2022

Perhaps we should instead use the rules you've defined on the golden tests themselves? Since those are the "pretty tricky" examples that are otherwise hard to come up with?

Yeah, right on.

I just pushed a fixup with a rewording of the delay/force explanation. The new explanation uses three explicit rules:

-- USE-INIT: A binding x must be initialized before x is ready for use with
-- any force.
--
-- USE-IMMEDIATE: Initializing a binding x is equivalent to requiring a
-- reference to x to be ready for use with force 0.

-- USE-USE: Before a reference to x is ready for use with force f, every reference
-- in the initializer of x with delay d and force f' must be ready for use
-- with force f − d + f', provided d is not greater than f. (If f is
-- unknown, every reference must be ready for use with unknown force
-- irrespective of delay or the original force of that reference.)

And here's one of the more interesting golden tests, with top-level references annotated:

alpha :: Int -> Int -> Number
alpha = case _ of
  0 -> bravo#1!0
  1 -> charlie#1!0
  2 -> \y -> if y > 0 then bravo#2!1 y else charlie#2!1 y
  x -> \y -> delta#2!2 y x

bravo :: Int -> Number
bravo = (\_ -> alpha#0!1) {} 3

charlie :: Int -> Number
charlie = (\_ -> alpha#0!1) {} 4

delta :: Int -> Int -> Number
delta =
  let b = (\_ -> bravo#0!*) {}
  in \x y -> if x == y then b 0 else 1.0

alpha contains no delay-0 references, so USE-IMMEDIATE and USE-USE impose no constraints on its initialization.

USE-USE says that neither bravo nor charlie can be ready for use with force 0 until alpha is ready for use with force 1, and likewise alpha can't be ready for use with force 1 until both bravo and charlie are ready for use with force 0. By USE-INIT, the initialization of alpha must precede the initialization of bravo and charlie, and there is a cycle in making both bravo and charlie ready for use that requires them to be initialized together.

Finally, initializing delta requires bravo to be ready for use with unknown force. By USE-USE, bravo is not ready for use with unknown force until alpha is, and alpha is not ready for use with unknown force until delta (along with bravo and charlie) is. So delta must be initialized after the other three bindings, but there is a cycle in making delta ready for use by itself.

So the expected result is:

var alpha = ...;
var $lazy_bravo = ...;
var $lazy_charlie = ...;
var bravo = $lazy_bravo(...);
var charlie = $lazy_charlie(...);
var $lazy_delta = ...;
var delta = $lazy_delta(...);
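
A hypothetical sketch of what a `$lazy_*` helper can look like (the compiler's real emitted helper may differ in its details):

```javascript
// runtimeLazy is a hypothetical name; it memoizes the initializer and
// raises an explicit error if the binding is forced too early or
// re-entered during its own initialization.
function runtimeLazy(name, init) {
  var state = 0; // 0 = uninitialized, 1 = initializing, 2 = initialized
  var value;
  return function () {
    if (state === 2) return value;
    if (state === 1) throw new ReferenceError(name + " was needed before it finished initializing");
    state = 1;
    value = init();
    state = 2;
    return value;
  };
}

// Usage with a mutually recursive pair, each reference delayed behind a lambda:
var $lazy_even = runtimeLazy("even", function () {
  return function (n) { return n === 0 ? true : $lazy_odd()(n - 1); };
});
var $lazy_odd = runtimeLazy("odd", function () {
  return function (n) { return n === 0 ? false : $lazy_even()(n - 1); };
});
var even = $lazy_even();
var odd = $lazy_odd();
```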

@JordanMartinez (Contributor)

I'll be looking over more of your comment and the new changes after posting this comment, but I wanted to at least get this out.

I noticed that I was using n!0#0 (i.e. n with force 0 and delay 0) and you've swapped the order of delay and force to n#0!0 (i.e. n with delay 0 and force 0) in the above examples. I'm pointing that out for others who may have missed it.

In the doc comments you pushed, perhaps it would be beneficial to use this !0 and #1 syntax to help communicate various points?

The initial reading of USE-USE-NAIVE makes it seem like it's the third and final rule. However, you then clarify that it's not quite correct and you replace it with USE-USE. Could you instead prefix that part with something like:

Both of these observations are captured and generalized by the following rules. The initial definition of the third rule, USE-USE-NAIVE, is incorrect and will be further refined into the final definition, USE-USE:

@JordanMartinez (Contributor)

Using your example above, here are all the possible control flows:

alpha 0 3
bravo 3
((\_ -> alpha) {} 3) 3
(       alpha     3) 3
        alpha     3  3
alpha 3 3
(\y -> delta y 3) 3
       delta 3 3
delta 3 3
let b = (\_ -> bravo) {} in if 3 == 3 then b     0 else 1.0
let b =        bravo     in if 3 == 3 then b     0 else 1.0
                            if 3 == 3 then bravo 0 else 1.0
                                           bravo 0
bravo 0
((\_ -> alpha) {} 3) 0
(       alpha     3) 0
        alpha     3  0
alpha 3 0
delta 0 3
let b = (\_ -> bravo) {} in if 0 == 3 then b     0 else 1.0
let b =        bravo     in if 0 == 3 then b     0 else 1.0
                            if 0 == 3 then bravo 0 else 1.0
                                                        1.0
1.0
alpha 1 4
charlie 4
((\_ -> alpha) {} 4) 4
(       alpha     4) 4
        alpha     4  4
alpha 4 4
(\y -> delta y 4) 4
       delta 4 4
delta 4 4
let b = (\_ -> bravo) {} in if 4 == 4 then b     0 else 1.0
let b = bravo            in if 4 == 4 then b     0 else 1.0
                            if 4 == 4 then bravo 0 else 1.0
                                           bravo 0
bravo 0
((\_ -> alpha) {} 3) 0
(       alpha     3) 0
        alpha     3  0
alpha 3 0
delta 0 3
let b = (\_ -> bravo) {} in if 0 == 3 then b     0 else 1.0
let b =        bravo     in if 0 == 3 then b     0 else 1.0
                            if 0 == 3 then bravo 0 else 1.0
                                                        1.0
1.0
alpha 2 3
(alpha 2) 3
(\y -> if y > 0 then bravo y else charlie y) 3
       if 3 > 0 then bravo 3 else charlie 3
                     bravo 3
bravo 3
((\_ -> alpha) {} 3) 3
(       alpha     3) 3
        alpha     3  3
alpha 3 3
(\y -> delta y 3) 3
       delta 3 3
delta 3 3
let b = (\_ -> bravo) {} in if 3 == 3 then b     0 else 1.0
let b = bravo            in if 3 == 3 then b     0 else 1.0
                            if 3 == 3 then bravo 0 else 1.0
                                           bravo 0
bravo 0
((\_ -> alpha) {} 3) 0
(       alpha     3) 0
        alpha     3  0
alpha 3 0
delta 0 3
let b = (\_ -> bravo) {} in if 0 == 3 then b     0 else 1.0
let b =        bravo     in if 0 == 3 then b     0 else 1.0
                            if 0 == 3 then bravo 0 else 1.0
                                                        1.0
1.0
alpha 2 0
(alpha 2) 0
(\y -> if y > 0 then bravo y else charlie y) 0
       if 0 > 0 then bravo 0 else charlie 0
                                      charlie 0
charlie 0
((\_ -> alpha) {} 4) 0
(       alpha     4) 0
        alpha     4  0
alpha 4 0
(\y -> delta y 4) 0
       delta 0 4
delta 0 4
let b = (\_ -> bravo) {} in if 0 == 4 then b     0 else 1.0
let b =        bravo     in if 0 == 4 then b     0 else 1.0
                            if 0 == 4 then bravo 0 else 1.0
                                                        1.0
1.0
alpha 5 5
(alpha 5) 5
(\x -> \y -> delta y x) 5 5
       (\y -> delta y 5) 5
              delta 5 5
delta 5 5
let b = (\_ -> bravo) {} in if 5 == 5 then b     0 else 1.0
let b = bravo            in if 5 == 5 then b     0 else 1.0
                            if 5 == 5 then bravo 0 else 1.0
                                           bravo 0
bravo 0
((\_ -> alpha) {} 3) 0
(       alpha     3) 0
        alpha     3  0
alpha 3 0
delta 0 3
let b = (\_ -> bravo) {} in if 0 == 3 then b     0 else 1.0
let b =        bravo     in if 0 == 3 then b     0 else 1.0
                            if 0 == 3 then bravo 0 else 1.0
                                                        1.0
1.0
alpha 8 0
(alpha 8) 0
(\x -> \y -> delta y x) 8 0
       (\y -> delta y 8)   0
              delta 0 8
delta 0 8
let b = (\_ -> bravo) {} in if 0 == 8 then b     0 else 1.0
let b = bravo            in if 0 == 8 then b     0 else 1.0
                            if 0 == 8 then bravo 0 else 1.0
                                                        1.0
1.0

@JordanMartinez (Contributor) commented Apr 8, 2022

I found it easier to understand these rules when I made the following changes:

  • In USE-INIT, I changed the order of the sentence to put before first, since it matches the same logical flow as USE-USE.
  • In USE-USE, I tweaked the variables used, provided an example for illustration of what variables mean, and reformatted the parenthesis part.
  • I read them in the below order and with the following in mind:
    • When is a binding, x, "ready for use" with a given force?
      • Part 1: USE-INIT - after x has been initialized
      • Part 2: USE-USE - only if references in x's initializer are "ready for use" with their appropriate force(s).
    • USE-IMMEDIATE

USE-INIT: (edited by Jordan) Before a binding x is ready for use with any force, x must be initialized.

x = ...

-- before any these can use `x`, `x` must be initialized.
withForce0 = x#0!0
withForce0' = (\_ -> x#0!0) 1
withForce1 = foo#0!1 x#0!_
withForce1' = (\_ -> foo#0!1 x#0!_) 1
-- Note: `!_` means I don't know what the force is in that usage

USE-USE: (edited by Jordan) Before a reference to mainRef is ready for use with force f, every reference in the initializer of mainRef (e.g. ref) with delay d and force g must be ready for use with force h:

  • when f is unknown, h is unknown
  • when f is known, h = f - d + g
ref = ...
mainRef = ref#d!g
actualUsage = mainRef!f

In other words, USE-USE is the recursive case. To check whether A works, we check B. To check B, we check C. To check C, we check ....

Since a binding can ultimately reference itself (e.g. per your example, alpha references bravo which references alpha), this can sometimes lead to a cycle. Hence, the need to model this as a graph.

When f is unknown, we cannot calculate h. One could say f in this case acts like Infinity. Perhaps a ref must be ready for force 2, perhaps force 240. Mathematically, it's easier to model this as an infinite graph.

To better understand the relationship between f, d, and g, there are two calculation tables below. I found the "sorted by h" table most helpful as it shows under what conditions h has force 0, which is relevant for the remaining rules. Note: there are times when h is negative. I'm not sure if that's ever possible as I was just plugging in numbers to see what would happen.

Calculation Table (sorted by `f`, `d`, and `g`)
(h = the force with which ref must be ready for use)

    mainRef!f   ref#d!g    h   Calculation
    mainRef!0   ref#0!0    0   0 - 0 + 0
    mainRef!0   ref#0!1    1   0 - 0 + 1
    mainRef!0   ref#0!2    2   0 - 0 + 2
    mainRef!0   ref#1!0   -1   0 - 1 + 0
    mainRef!0   ref#1!1    0   0 - 1 + 1
    mainRef!0   ref#1!2    1   0 - 1 + 2
    mainRef!0   ref#2!0   -2   0 - 2 + 0
    mainRef!0   ref#2!1   -1   0 - 2 + 1
    mainRef!0   ref#2!2    0   0 - 2 + 2
    mainRef!1   ref#0!0    1   1 - 0 + 0
    mainRef!1   ref#0!1    2   1 - 0 + 1
    mainRef!1   ref#0!2    3   1 - 0 + 2
    mainRef!1   ref#1!0    0   1 - 1 + 0
    mainRef!1   ref#1!1    1   1 - 1 + 1
    mainRef!1   ref#1!2    2   1 - 1 + 2
    mainRef!1   ref#2!0   -1   1 - 2 + 0
    mainRef!1   ref#2!1    0   1 - 2 + 1
    mainRef!1   ref#2!2    1   1 - 2 + 2
    mainRef!*   ref#_!_    *   f is unknown

Calculation Table (sorted by `h`)

    mainRef!f   ref#d!g    h   Calculation
    mainRef!0   ref#2!0   -2   0 - 2 + 0
    mainRef!0   ref#1!0   -1   0 - 1 + 0
    mainRef!0   ref#2!1   -1   0 - 2 + 1
    mainRef!1   ref#2!0   -1   1 - 2 + 0
    mainRef!0   ref#0!0    0   0 - 0 + 0
    mainRef!0   ref#1!1    0   0 - 1 + 1
    mainRef!0   ref#2!2    0   0 - 2 + 2
    mainRef!1   ref#1!0    0   1 - 1 + 0
    mainRef!1   ref#2!1    0   1 - 2 + 1
    mainRef!0   ref#0!1    1   0 - 0 + 1
    mainRef!0   ref#1!2    1   0 - 1 + 2
    mainRef!1   ref#0!0    1   1 - 0 + 0
    mainRef!1   ref#1!1    1   1 - 1 + 1
    mainRef!1   ref#2!2    1   1 - 2 + 2
    mainRef!0   ref#0!2    2   0 - 0 + 2
    mainRef!1   ref#0!1    2   1 - 0 + 1
    mainRef!1   ref#1!2    2   1 - 1 + 2
    mainRef!1   ref#0!2    3   1 - 0 + 2
    mainRef!*   ref#_!_    *   f is unknown

USE-IMMEDIATE: Initializing a binding x is equivalent to requiring a reference to x to be ready for use with force 0.

This rule works with USE-USE to prevent the scenario described in the doc comment:

f = g 1 2
g x = h 3 x
h x y z = f!0

Initializing f (i.e. f = g 1 2) means f must be ready for use with force 0 in h (e.g. h x y z = f!0). f!0 triggers the USE-USE rule, which will catch this infinite loop.
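
How USE-IMMEDIATE seeds USE-USE can be sketched as a small reachability check over (binding, force) states (an illustrative sketch only; unknown force is omitted, a crude force cap stands in for the compactly represented infinite graph the real analysis works on):

```javascript
// Each binding maps to its references: name -> [{ref, delay, force}].
// Per USE-IMMEDIATE we start from (start, 0) and follow USE-USE edges;
// reaching the starting binding again means laziness is required.
function needsLaziness(bindings, start) {
  var seen = {};
  var stack = [{ name: start, force: 0 }];
  while (stack.length > 0) {
    var cur = stack.pop();
    if (cur.force > 99) continue; // crude bound; the real analysis handles unbounded force properly
    var key = cur.name + "!" + cur.force;
    if (seen[key]) continue;
    seen[key] = true;
    var refs = bindings[cur.name] || [];
    for (var i = 0; i < refs.length; i++) {
      var r = refs[i];
      if (r.delay > cur.force) continue; // USE-USE proviso: d must not exceed f
      if (r.ref === start) return true;  // back at the binding being initialized
      stack.push({ name: r.ref, force: cur.force - r.delay + r.force });
    }
  }
  return false;
}

// The example above, annotated: f = g#0!2 1 2; g x = h#1!2 3 x; h x y z = f#3!0
var cyclic = {
  f: [{ ref: "g", delay: 0, force: 2 }],
  g: [{ ref: "h", delay: 1, force: 2 }],
  h: [{ ref: "f", delay: 3, force: 0 }]
};

// A reference hidden behind enough delay imposes no requirement:
var safe = {
  a: [{ ref: "b", delay: 1, force: 0 }],
  b: []
};
```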

@JordanMartinez (Contributor)

Does code still compile if an invalid recursive binding is defined but never used? For example...

foo = 1
  where
  f = g 1 2
  g x = h 3 x
  h x y z = f

@rhendric (Member, Author) commented Apr 8, 2022

To better understand the relationship between f, d, and g, there are two calculation tables below.

Your rewrite of USE-USE is missing an important conditional which these tables should also incorporate: ‘provided d is not greater than f’. This prevents h from going negative, but also some cases where h is positive but still doesn't correspond to an actual dependency.

Does code still compile if an invalid recursive binding is defined but never used? For example...

In general, there's nothing in this PR that changes which programs compile. Think of the cycle detection done here as an optimization—with this PR, the default semantics of a recursive binding group is for all the bindings to be lazy with run-time initialization checking, and the analysis just removes the overhead of laziness where it can be proven superfluous.

PureScript currently considers that example invalid anyway, because the reference to g in f isn't allowed by the type checker. If the type checker is fooled by eta-expanding g, it's allowed but leads to f silently being undefined at run time (which doesn't matter given the definition of foo, I suppose, but one presumes that in real code it would). With this PR, eta-expansion is still necessary, and doing so causes an explicit error to be raised at run time instead.

@JordanMartinez (Contributor)

Your rewrite of USE-USE is missing an important conditional which these tables should also incorporate: ‘provided d is not greater than f’. This prevents h from going negative, but also some cases where h is positive but still doesn't correspond to an actual dependency.

Ah... Gotcha. Let me revise that.

@JordanMartinez (Contributor) commented Apr 9, 2022

Your rewrite of USE-USE is missing an important conditional which these tables should also incorporate: ‘provided d is not greater than f’. This prevents h from going negative, but also some cases where h is positive but still doesn't correspond to an actual dependency.

Sorry for the delay.

I forgot about this part, but you previously stated that Force is a non-negative integer, which makes sense since it represents the number of args passed to a function and we can't have a negative number of args.

Moreover, "provided d is not greater than f" makes sense because it summarizes this situation. Given foo = \_ -> ref, when foo must be ready for use with force 0 (i.e. foo!0), then ref doesn't need to be ready for use because it's enclosed in a lambda. When foo is passed an arg (i.e. foo!1 arg), the lambda will be unwrapped and ref needs to have already been initialized.

In other words, I believe the USE-USE rule could be restated to the following:

USE-USE: (Jordan's v2) Before a reference to mainRef is ready for use with force f,

  • when f is unknown, every reference in the initializer of mainRef (e.g. ref) must be ready for use with unknown force
     ref = ...
     mainRef = ref
     actualUsage = mainRef!*
    
  • when f is known, only references in the initializer of mainRef (e.g. ref) with delay d and force g where d <= f must be ready for use with force h where h = f - d + g. References where d > f do not need to be ready for use.
     ref = ...
     mainRef = ref#d!g
     actualUsage = mainRef!f
    

Is that rendering correct?
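
The v2 rule above can also be written out directly as a sketch (the patch itself represents unknown force with Haskell's Maybe; here `null` plays that role, and `requiredForce` is a name invented for illustration):

```javascript
// requiredForce computes h, the force with which `ref` must be ready,
// given f = force applied to mainRef, d = delay of the ref inside
// mainRef's initializer, g = force on that ref. Returns null for
// unknown force and undefined when no requirement exists (d > f).
function requiredForce(f, d, g) {
  if (f === null) return null; // unknown force propagates regardless of d and g
  if (d > f) return undefined; // the reference stays safely delayed
  return f - d + g;            // h = f - d + g, never negative since d <= f
}
```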

@JordanMartinez (Contributor)

If so, here are updated tables (that also include an example with an explanation):

Calculation Table (sorted by `f`, `d`, and `g`)

| `mainRef!f` | `ref#d!g` | mainRef definition | ref must be ready for use with force h | Calculation | Comments |
| --- | --- | --- | --- | --- | --- |
| `mainRef!0` | `ref#0!0` | `ref` | 0 | 0 - 0 + 0 | Calling mainRef with no args is the same as calling ref |
| `mainRef!0` | `ref#0!1` | `ref a` | 1 | 0 - 0 + 1 | Calling mainRef with no args is the same as calling ref with an arg applied |
| `mainRef!0` | `ref#0!2` | `ref a b` | 2 | 0 - 0 + 2 | Calling mainRef with no args is the same as calling ref with 2 args applied |
| `mainRef!0` | `ref#1!0` | `\_ -> ref` | - | d > f | Calling mainRef with no args does not unwrap the lambda |
| `mainRef!0` | `ref#1!1` | `\_ -> ref a` | - | d > f | Calling mainRef with no args does not unwrap the lambda |
| `mainRef!0` | `ref#1!2` | `\_ -> ref a b` | - | d > f | Calling mainRef with no args does not unwrap the lambda |
| `mainRef!0` | `ref#2!0` | `\_ _ -> ref` | - | d > f | Calling mainRef with no args does not unwrap the lambda |
| `mainRef!0` | `ref#2!1` | `\_ _ -> ref a` | - | d > f | Calling mainRef with no args does not unwrap the lambda |
| `mainRef!0` | `ref#2!2` | `\_ _ -> ref a b` | - | d > f | Calling mainRef with no args does not unwrap the lambda |
| `mainRef!1` | `ref#0!0` | `ref` | 1 | 1 - 0 + 0 | Calling mainRef with 1 arg propagates that arg to ref as arg 1 |
| `mainRef!1` | `ref#0!1` | `ref a` | 2 | 1 - 0 + 1 | Calling mainRef with 1 arg propagates that arg to ref as arg 2 |
| `mainRef!1` | `ref#0!2` | `ref a b` | 3 | 1 - 0 + 2 | Calling mainRef with 1 arg propagates that arg to ref as arg 3 |
| `mainRef!1` | `ref#1!0` | `\_ -> ref` | 0 | 1 - 1 + 0 | Calling mainRef with 1 arg is the same as calling ref |
| `mainRef!1` | `ref#1!1` | `\_ -> ref a` | 1 | 1 - 1 + 1 | Calling mainRef with 1 arg is the same as calling ref with an arg applied |
| `mainRef!1` | `ref#1!2` | `\_ -> ref a b` | 2 | 1 - 1 + 2 | Calling mainRef with 1 arg is the same as calling ref with 2 args applied |
| `mainRef!1` | `ref#2!0` | `\_ _ -> ref` | - | d > f | Calling mainRef with 1 arg unwraps a lambda but not the final one enclosing ref |
| `mainRef!1` | `ref#2!1` | `\_ _ -> ref a` | - | d > f | Calling mainRef with 1 arg unwraps a lambda but not the final one enclosing ref |
| `mainRef!1` | `ref#2!2` | `\_ _ -> ref a b` | - | d > f | Calling mainRef with 1 arg unwraps a lambda but not the final one enclosing ref |
| `mainRef!*` | `ref#0!0` | `ref` | 0-∞ | f is unknown | In the worst case scenario, a large number of args passed to mainRef propagates directly to ref as all of its args |
| `mainRef!*` | `ref#0!1` | `ref a` | 0-∞ | f is unknown | In the worst case scenario, a large number of args passed to mainRef propagates directly to ref as all but one of its args |
| `mainRef!*` | `ref#0!2` | `ref a b` | 0-∞ | f is unknown | In the worst case scenario, a large number of args passed to mainRef propagates directly to ref as all but two of its args |
| `mainRef!*` | `ref#1!0` | `\_ -> ref` | 0-∞ | f is unknown | In the worst case scenario, a large number of args passed to mainRef unwraps the enclosed lambda and then propagates the remaining args directly to ref as all of its args |
| `mainRef!*` | `ref#1!1` | `\_ -> ref a` | 0-∞ | f is unknown | In the worst case scenario, a large number of args passed to mainRef unwraps the enclosed lambda and then propagates the remaining args directly to ref as all but one of its args |
| `mainRef!*` | `ref#1!2` | `\_ -> ref a b` | 0-∞ | f is unknown | In the worst case scenario, a large number of args passed to mainRef unwraps the enclosed lambda and then propagates the remaining args directly to ref as all but two of its args |
| `mainRef!*` | `ref#2!0` | `\_ _ -> ref` | 0-∞ | f is unknown | In the worst case scenario, a large number of args passed to mainRef unwraps the enclosed lambdas and then propagates the remaining args directly to ref as all of its args |
| `mainRef!*` | `ref#2!1` | `\_ _ -> ref a` | 0-∞ | f is unknown | In the worst case scenario, a large number of args passed to mainRef unwraps the enclosed lambdas and then propagates the remaining args directly to ref as all but one of its args |
| `mainRef!*` | `ref#2!2` | `\_ _ -> ref a b` | 0-∞ | f is unknown | In the worst case scenario, a large number of args passed to mainRef unwraps the enclosed lambdas and then propagates the remaining args directly to ref as all but two of its args |
Calculation Table (sorted by `h`)

| `mainRef!f` | `ref#d!g` | mainRef definition | ref must be ready for use with force h | Calculation | Comments |
| --- | --- | --- | --- | --- | --- |
| `mainRef!0` | `ref#1!0` | `\_ -> ref` | - | d > f | Calling mainRef with no args does not unwrap the lambda |
| `mainRef!0` | `ref#1!1` | `\_ -> ref a` | - | d > f | Calling mainRef with no args does not unwrap the lambda |
| `mainRef!0` | `ref#1!2` | `\_ -> ref a b` | - | d > f | Calling mainRef with no args does not unwrap the lambda |
| `mainRef!0` | `ref#2!0` | `\_ _ -> ref` | - | d > f | Calling mainRef with no args does not unwrap the lambda |
| `mainRef!0` | `ref#2!1` | `\_ _ -> ref a` | - | d > f | Calling mainRef with no args does not unwrap the lambda |
| `mainRef!0` | `ref#2!2` | `\_ _ -> ref a b` | - | d > f | Calling mainRef with no args does not unwrap the lambda |
| `mainRef!1` | `ref#2!0` | `\_ _ -> ref` | - | d > f | Calling mainRef with 1 arg unwraps a lambda but not the final one enclosing ref |
| `mainRef!1` | `ref#2!1` | `\_ _ -> ref a` | - | d > f | Calling mainRef with 1 arg unwraps a lambda but not the final one enclosing ref |
| `mainRef!1` | `ref#2!2` | `\_ _ -> ref a b` | - | d > f | Calling mainRef with 1 arg unwraps a lambda but not the final one enclosing ref |
| `mainRef!0` | `ref#0!0` | `ref` | 0 | 0 - 0 + 0 | Calling mainRef with no args is the same as calling ref |
| `mainRef!1` | `ref#1!0` | `\_ -> ref` | 0 | 1 - 1 + 0 | Calling mainRef with 1 arg is the same as calling ref |
| `mainRef!0` | `ref#0!1` | `ref a` | 1 | 0 - 0 + 1 | Calling mainRef with no args is the same as calling ref with an arg applied |
| `mainRef!1` | `ref#0!0` | `ref` | 1 | 1 - 0 + 0 | Calling mainRef with 1 arg propagates that arg to ref as arg 1 |
| `mainRef!1` | `ref#1!1` | `\_ -> ref a` | 1 | 1 - 1 + 1 | Calling mainRef with 1 arg is the same as calling ref with an arg applied |
| `mainRef!0` | `ref#0!2` | `ref a b` | 2 | 0 - 0 + 2 | Calling mainRef with no args is the same as calling ref with 2 args applied |
| `mainRef!1` | `ref#0!1` | `ref a` | 2 | 1 - 0 + 1 | Calling mainRef with 1 arg propagates that arg to ref as arg 2 |
| `mainRef!1` | `ref#1!2` | `\_ -> ref a b` | 2 | 1 - 1 + 2 | Calling mainRef with 1 arg is the same as calling ref with 2 args applied |
| `mainRef!1` | `ref#0!2` | `ref a b` | 3 | 1 - 0 + 2 | Calling mainRef with 1 arg propagates that arg to ref as arg 3 |
| `mainRef!*` | `ref#0!0` | `ref` | 0-∞ | f is unknown | In the worst case scenario, a large number of args passed to mainRef propagates directly to ref as all of its args |
| `mainRef!*` | `ref#0!1` | `ref a` | 0-∞ | f is unknown | In the worst case scenario, a large number of args passed to mainRef propagates directly to ref as all but one of its args |
| `mainRef!*` | `ref#0!2` | `ref a b` | 0-∞ | f is unknown | In the worst case scenario, a large number of args passed to mainRef propagates directly to ref as all but two of its args |
| `mainRef!*` | `ref#1!0` | `\_ -> ref` | 0-∞ | f is unknown | In the worst case scenario, a large number of args passed to mainRef unwraps the enclosed lambda and then propagates the remaining args directly to ref as all of its args |
| `mainRef!*` | `ref#1!1` | `\_ -> ref a` | 0-∞ | f is unknown | In the worst case scenario, a large number of args passed to mainRef unwraps the enclosed lambda and then propagates the remaining args directly to ref as all but one of its args |
| `mainRef!*` | `ref#1!2` | `\_ -> ref a b` | 0-∞ | f is unknown | In the worst case scenario, a large number of args passed to mainRef unwraps the enclosed lambda and then propagates the remaining args directly to ref as all but two of its args |
| `mainRef!*` | `ref#2!0` | `\_ _ -> ref` | 0-∞ | f is unknown | In the worst case scenario, a large number of args passed to mainRef unwraps the enclosed lambdas and then propagates the remaining args directly to ref as all of its args |
| `mainRef!*` | `ref#2!1` | `\_ _ -> ref a` | 0-∞ | f is unknown | In the worst case scenario, a large number of args passed to mainRef unwraps the enclosed lambdas and then propagates the remaining args directly to ref as all but one of its args |
| `mainRef!*` | `ref#2!2` | `\_ _ -> ref a b` | 0-∞ | f is unknown | In the worst case scenario, a large number of args passed to mainRef unwraps the enclosed lambdas and then propagates the remaining args directly to ref as all but two of its args |

@JordanMartinez
Contributor

JordanMartinez commented Apr 9, 2022

I'm pretty sure I got the delta part wrong because it seems I'm supposed to 'abort' as soon as I encounter an unknown force. However, this is what I got for trying to visualize the algorithm once the force and delay of every var are known.

Visualization of the doc comment explaining the goal of delay and force for determining initialization order

[image: ryan-example-as-graph]

@rhendric
Member Author

rhendric commented Apr 10, 2022

I think all of that is right.

The reason we can abort the search in delta once we see bravo!* is what I tried to explain with this comment: ‘Every identifier in a recursiving [sic! I need to fix this] binding group is necessarily reachable from every other, ignoring delay and force, which is what arbitrarily high force lets you do.’

In other words, if we're doing this analysis at all, we already know that any identifier with unknown force is going to reference every other if we follow the rules as you did. Actually doing so doesn't hurt, but since we already have a case where we need to abort the search in order to prevent infinite loops, aborting in this case as well is a straightforward shortcut.

I hacked together some graphs that take a more mathematical approach to this example; maybe this is helpful. This is closer to how I initially conceived of this infinite graph search, but it's harder to explain in text this way, I think.

Start with a simple graph between the bindings in the group, where there is one edge for every Var of interest:

This is the graph that the binding groups desugaring would have used to conclude that this is a recursive binding group. For our purposes, we need to ‘expand’ it by adding a delay/force axis, like so:

There is again one edge here for every Var of interest; the number used at the source of the edge corresponds to the delay of the Var, and the number used at the sink of the edge corresponds to the force of the Var (with unknown force pessimized to infinity). You can see that the original graph is a flattening of this one, collapsing each column of vertices down to one.

This is the ‘base case’ graph. To properly represent USE-USE in this graph, we need to make an infinite number of copies, each shifted ‘upward’ by one:

Each color of edges represents one of these copies. In the infinite limit, all edges flatten back down to a copy of the very first graph (the blue edges).

This is the infinite graph we use to determine initialization order. We want to know the set of identifiers irreflexively reachable from each of the delay-0 vertices (the bottom row of the graph). We can use a few shortcuts in determining the set of reachable vertices: First, if we ever revisit the same column a second time further up, because the graph is made out of copies of itself we know we can trace copies of the path we took as a ladder to go arbitrarily high. Second, if we ever revisit the same column a second time lower down, we can prune that path because the identifiers reachable from that vertex will be a subset of the identifiers reachable from the higher vertex, again because the graph is made out of copies of itself. Third, because the subgraph of blue edges at the very top is strongly connected, we know that as soon as we visit any infinite-level vertex (either directly, as in the delta → bravo edge, or indirectly via an infinite ladder), we can stop searching and claim that every identifier is reachable.

Following the graph and computing these sets of irreflexively reachable identifiers gives us the following new graph:

The cyclic strongly connected components of that graph are the bindings that need to be lazily initialized together, the acyclic single-vertex SCCs can be strictly initialized, and their natural reverse topsorted order is the initialization order.

@JordanMartinez
Contributor

Thanks. That makes sense!

Comment on lines +330 to +393
cutLoops :: (Int, force) -> MaxRoseTree m force -> MaxRoseTree m force
cutLoops (i, force) = go
  where
  go = (=<<) . IM.traverseWithKey $ \i' (MaxRoseNode force' inner) ->
    MaxRoseNode force' <$> if i == i' then guard (force >= force') $> pure IM.empty else pure $ go inner
Contributor

Just explaining this to myself as it took me a while to understand how and why this worked.

This function can be rewritten to a more pointful version:

-- Since `cutLoops` is used in an applicative context (`cutLoops <*> ...`),
-- we need to return back a function in some context (i.e. `f (a -> b)`).
-- Hence, the final `(\aToMB ma -> aToMB =<< ma)` part.
-- The result of the `IM.traverseWithKey` is passed to the function,
-- so we end up with: `\ma -> ma >>= IM.traverseWithKey ...`
go = (\aToMB ma -> aToMB =<< ma) . IM.traverseWithKey $ \i' (MaxRoseNode force' inner) ->
    -- Identifiers are mapped to an integer. So, this searches
    -- for the specified identifier. If found, runs the next part
    -- or keeps searching within the nested maps.
    if i == i' then do
      -- We found the identifier in the current map.
      -- This is a Maybe monad context.
      -- If the force found here (i.e. `force'`) is
      -- greater than the current force (i.e. `force < force'`),
      -- this short-circuits the computation
      -- by passing `Nothing` to `bind`.
      -- Conceptually, this aborts the search
      guard (force >= force')
      -- If we get past the guard check,
      -- then we rewrite the map to an empty one.
      pure $ MaxRoseNode force' (pure IM.empty)
    else do
      -- keep searching for the identifier in the graph
      -- by traversing to the next map
      pure $ MaxRoseNode force' (go inner)

In short, the infinite graph is defined via mem and each of its nodes is memoized via memoizedNodes. Then, for each identifier, cutLoops is called, which rewrites references to that identifier in mem to either abort (i.e. Nothing) or be an empty map (i.e. Just IM.empty). Because all keys in the maps comprising the graph correspond to one of the identifiers, and because each identifier is called with cutLoops, eventually all identifiers will be rewritten. In other words, what appears to be an infinite loop in IM.traverseWithKey \_ -> if match then rewrite else loop eventually terminates because IM.traverseWithKey will be called with an empty map. Or at least, that's my current understanding.

Member Author

You got the internals correct; not sure about your first comment block. Remember that MaxRoseTree m force is a type alias for m (MonoidalIntMap (MaxRoseNode m force)); the outer (=<<) manages that m, since traverseWithKey gives us a Kleisli arrow (simplifying the types) Map -> m Map, and we want go to be a function m Map -> m Map (= Tree -> Tree).

Comment on lines +360 to +438
-- Now do the first delay/force traversal of all the bindings to find
-- references to other names in this binding group.
--
-- The parts of this type mean:
-- D is the maximum force (or Nothing if unknown) with which the identifier C
-- is referenced in any delay-B position inside the expression A.
--
-- where A, B, C, and D are as below:
-- A B (keys) C (keys) D
findReferences :: Expr Ann -> IM.MonoidalIntMap (IM.MonoidalIntMap (Ap Maybe (Max Int)))
findReferences = (getConst .) . onVarsWithDelayAndForce $ \delay force _ -> \case
  Qualified mn' ident | all (== mn) mn', Just i <- ident `S.lookupIndex` names
    -> Const . IM.singleton delay . IM.singleton i $ coerceForce force
  _ -> Const IM.empty
Contributor

@JordanMartinez commented Apr 12, 2022

To clarify this with our current syntax of # (delay) and ! (force), A, B, C, and D refer to this, right?

C = ...
A = C#B!D

and produces a map of B -> C -> D for each C-B-D combination.

Contributor

Oh wait... that's not quite right. A is an expression and part of it contains C#B!D.

Member Author

Yeah, I'm not sure how helpful the syntactical annotations are for this one, but an example would be

exprA = ... refC1#B1!D1 ... refC1#B1!D2 ... refC1#B2!D3 ... refC2#B1!D4 ... refC2#B1!* ...

findReferences <initializer of exprA> =
  IM.fromList [(B1, IM.fromList [(C1, Just (D1 `max` D2))
                                ,(C2, Nothing)
                                ])
              ,(B2, IM.fromList [(C1, Just D3)])
              ]

Comment on lines +384 to +464
-- Using the approach explained above, traverse the reference graph generated
-- by `refsByIndex` and find all reachable names.
--
-- The parts of this type mean:
-- D is the maximum force with which the identifier C is referenced,
-- directly or indirectly, during the initialization of identifier A. B is
-- Nothing if the analysis of A was inconclusive and A might need the entire
-- binding group.
--
-- where A, B, C, and D are as below:
-- A B C (keys) D
reachablesByIndex :: A.Array Int (Maybe (IM.MonoidalIntMap (Max Int)))
reachablesByIndex = searchReachable maxIdx $ \(i, force) ->
  getAp . flip IM.foldMapWithKey (dropKeysAbove force $ refsByIndex A.! i) $ \delay ->
    IM.foldMapWithKey $ \i' force' ->
      Ap $ IM.singleton i' . Max . (force - delay +) <$> uncoerceForce force'
Contributor

This function calculates h = f - d + g, but also accounts for expressions where an identifier occurs multiple times with different force. For example:

-- given `C` is some identifier
C = ...
-- and `C` is used multiple times in the initialization
-- of `A` with forces, `x`, `y`, and `z`...
A = C#_!x ... C#_!y ... C#_!z
-- Then `D` is the maximum of those force values
-- (e.g. `max [x, y, z]`)

Or put differently, who cares if C is called with force 1 or force 2 because if C is called with force 5, that's the case we need to care about.
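That merging behavior ("take the max, unless any use has unknown force") can be sketched in JavaScript (hypothetical names, example mine); `null` models unknown force (`!*`) and absorbs everything, mirroring how `Nothing` behaves in the `Ap Maybe (Max Int)` monoid:

```javascript
// Hypothetical sketch: combine the forces of several uses of one identifier.
// null models unknown force (!*) and absorbs everything, like Nothing under
// Ap Maybe (Max Int); otherwise keep the maximum of the two known forces.
function mergeForce(a, b) {
  if (a === null || b === null) return null;
  return Math.max(a, b);
}

// C used with forces 1, 2, and 5: only the 5 matters.
console.log([1, 2, 5].reduce(mergeForce)); // 5
// Any use with unknown force makes the whole result unknown.
console.log([1, null, 5].reduce(mergeForce)); // null
```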

Member Author

Correct, but note that this covers not just the places that C appears in the initializer of A as you illustrated, but also all the other places that C occurs in the binding group that could be indirectly used during the evaluation of the initializer of A.

Comment on lines +401 to +479
-- If `reachablesByIndex` is a sort of labeled relation, this function
-- produces part of the reverse relation, but only for the edges from the
-- given vertex.
--
-- The parts of this type mean:
-- The identifier A is reachable from the identifier B with maximum force C
-- (B is also the index provided to the function).
--
-- where A, B, and C are as below:
-- (B) A B (singleton key) C
reverseReachablesFor :: Int -> IM.MonoidalIntMap (IM.MonoidalIntMap (Ap Maybe (Max Int)))
reverseReachablesFor i = case reachablesByIndex A.! i of
  Nothing -> IM.fromAscList $ (, IM.singleton i $ Ap Nothing) <$> [0..maxIdx]
  Just im -> IM.singleton i . Ap . Just <$> im
Contributor

Using the delay/force syntax, this is...

A = ...
B = A#_!x ... A#_!y ... A#_!z
C = max [x, y, z]

Member Author

Yup, again with the comment about indirect use, which this inherits from reachablesByIndex.

@natefaubion
Contributor

One concern I have with this analysis is whether this is something that really needs to be spec'ed as a language guarantee, and inserted as corefn. #4179 shows that our "escape hatch" for instance cycles is unsound under strictness. This cycle analysis happens well before corefn. The analysis in this PR happens as part of JS code generation in order to restore that soundness for the JS backend. Does that not mean that all backends which have strict initialization (which is likely most of them) will suffer the same issue and need to implement the same analysis? Is the analysis for ordering an absolute requirement, or do we only need to spec that mutual binding groups must be lazily initialized to be sound?

@natefaubion
Contributor

What I want to avoid, as a reference backend, is backend "optimizations" that core code will end up relying on for soundness, wherein if a backend does not implement the same optimization, code will flat out not work. I'm thinking of previous issues with the function composition inliner. Would it be reasonable to add this to corefn, such as an extra isLazy :: Boolean flag to bindings in Rec groups?

@rhendric
Member Author

Does that not mean that all backends which have strict initialization (which is likely most of them) will suffer the same issue and need to implement the same analysis?

Yes, but shouldn't they be able to reuse this module since it operates on CoreFn? The backend just needs to provide an appropriate implementation of $runtime_lazy, which should be a Fn3 String String (Unit -> a) (Int -> a).
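For readers unfamiliar with the helper being discussed, a simplified JavaScript sketch of what a $runtime_lazy-style function could look like follows (this is my approximation, not the exact code purs emits): it memoizes the initializer and throws if the binding is dereferenced while it is still initializing.

```javascript
// Simplified sketch of a $runtime_lazy-style helper (approximation, not the
// exact code the JS backend emits). name/moduleName feed the error message;
// init builds the value; the returned accessor is called at each use site
// with a line number for diagnostics.
function runtimeLazy(name, moduleName, init) {
  var state = 0; // 0 = uninitialized, 1 = initializing, 2 = initialized
  var value;
  return function (lineNumber) {
    if (state === 2) return value;
    if (state === 1) {
      throw new ReferenceError(
        name + " was needed before it finished initializing (module " +
        moduleName + ", line " + lineNumber + ")");
    }
    state = 1;
    value = init();
    state = 2;
    return value;
  };
}

// A recursive binding that needs laziness: `ones` refers to itself, but only
// under a thunk, so the initializer never dereferences it eagerly.
var $lazy_ones = runtimeLazy("ones", "Example", function () {
  return { head: 1, tail: function () { return $lazy_ones(1); } };
});
var ones = $lazy_ones(1);
```

A binding that dereferences itself eagerly inside `init` would hit the `state === 1` branch and raise the run-time error described in the PR summary, instead of silently reading an undefined value.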

Is the analysis for ordering an absolute requirement, or do we only need to spec that mutual binding groups must be lazily initialized to be sound?

No, I think with a natively-lazy backend, or a backend that unconditionally makes recursive initializers lazy, we don't have soundness problems. The language spec would say that it must be at least a run-time error to dereference a variable before it has been initialized, and that variables must be initialized in a run-time order that prevents such errors if one exists, but that an implementation may raise such errors at compile time if it can prove that no such order exists, and/or elide the run-time code that does the initialization check if it can statically determine such an order.

A backend could theoretically do more advanced analysis than what I've implemented here, in which case even fewer bindings would need to be lazy. Also, with external CoreFn transformations, we would want the JS backend not to emit unsound code if a transformation changes a dependency. That's why I lean toward making this a post-CoreFn analysis.

@natefaubion
Contributor

Yes, but shouldn't they be able to reuse this module since it operates on CoreFn?

This assumes backends are implemented in Haskell.

and that variables must be initialized in a run-time order that prevents such errors if one exists

This is what concerns me, so I just need clarification that this is correct (which I think is what you're saying):

  • If all recursive bindings in the JS backend were replaced with $runtime_lazy and invoked in an arbitrary order after being declared, we would avoid soundness issues with the existing instance escape hatch.
  • The analysis exists as a way to remove 99% of the $runtime_lazy calls since we can determine that they are unnecessary for soundness because we can rearrange their initialization order instead.

If that's the case, I still think this sort of thing would be extremely helpful to have in corefn as an annotation (for more casual backend implementors), but I don't think that should block this PR.

@natefaubion
Contributor

natefaubion commented Apr 12, 2022

I also think that even though this is fixing a bug, it does have downstream effects on backends if we are going to exploit this in passes before codegen (such as in corefn optimizations that rely on lazy recursive semantics). So it may be worth noting as a breaking semantic change for backend implementors.

@natefaubion
Contributor

natefaubion commented Apr 12, 2022

I also think it's worth pointing out the alternative, which is to take out the instance escape hatch. Is there a reason we aren't considering that as a way to solve #4179? That would mean that laziness is then not necessarily required for soundness (for anything the compiler produces), so it wouldn't be a breaking semantic change. That's not to say this PR isn't still useful (I think it is), I just don't want to mask the underlying issue.

@rhendric
Member Author

This assumes backends are implemented in Haskell.

Ah... yes, so it does.

this is correct

That's my claim.

So it may be worth noting as a breaking semantic change for backend implementors.

Agreed that we should probably communicate something to backend implementors. I don't know if we should call that something a breaking semantic change? This feels more like a clarification of a previously-implicit semantic assumption—recursive bindings have always been able to be referenced before initialization, as long as you can get past the cycle checking in the binding group desugarer and the type checker. It's never been explicitly stated what code has the responsibility of preventing the consequent soundness issues. purs did a bad job of modeling what to do, by ignoring the problem and choosing a fixed, alphabetically-based ordering for initializers. With this patch, purs does the right thing, and we establish the precedent that backends are responsible for sorting out initialization order to preserve soundness. But they've always needed to do so; we've just never called it out before.

take out the instance escape hatch

#4179 (comment) shows that the problem is more fundamental than the instance escape hatch. There are basically two pre-CoreFn checks that minimized the impact of our failing to deal with recursive initialization: the binding group desugaring pass and an assertion in the type checker. Both can be circumvented with techniques that include but aren't limited to the instance escape hatch; enclosing a reference in a lambda, even if that lambda is then immediately used, causes both checks to ignore that reference. I think reconsidering that escape hatch, as well as revisiting the type checker assertion, would be good things to discuss going forward (both of which would more dramatically change which programs are accepted by the compiler). But I think the underlying issue is actually the problem corrected here: that recursive bindings are assumed to be available to each other during initialization, and a strict initialization order doesn't always fulfill that assumption.

@natefaubion
Contributor

natefaubion commented Apr 12, 2022

With this patch, purs does the right thing, and we establish the precedent that backends are responsible for sorting out initialization order to preserve soundness. But they've always needed to do so; we've just never called it out before.

Right, I think most backends follow the reference implementation's lead in cases like this (since we are a standard defined by its implementation), is all I'm really referring to. I think it's likely too that by clearing up this hole, the compiler will probably take advantage of it in the future. It's not really "breaking" since it was always a mess, but backend implementors may find more and more things broken in the future 😆.

Thanks for the clarifications! I agree with your assessment. Do you have an objection to potentially adding it as a CoreFn annotation in the future? It wasn't clear to me if there was something painful about that.

@rhendric
Member Author

Do you have an objection to potentially adding it as a CoreFn annotation in the future? It wasn't clear to me if there was something painful about that.

The two reasons I mentioned in #4283 (comment): it'd be an advisory annotation, since backends could choose to do more or less than what the frontend recommends and the frontend shouldn't offer a contract on what gets marked lazy; and for the purposes of #3339/#4092, we'd have to decide whether to trust the annotations on the CoreFn provided to the compiler (effectively making another invariant that CoreFn transformations would need to preserve) or ignore them and run the analysis a second time. Neither problem is a dealbreaker but it suggests to me that the fit with the CoreFn protocol is awkward. I don't have a better idea for how to make doing this analysis less miserable for non-Haskell backends, though.

@natefaubion
Contributor

natefaubion commented Apr 12, 2022

I don't have a better idea for how to make doing this analysis less miserable for non-Haskell backends, though.

One thing that may be useful (again not for this), is to analyze the number of runtime_lazy calls needed for more conservative approaches, such as:

  • Everything in the binding group is deferred under a function literal.
  • Everything in the binding group is deferred under Control.Lazy.defer or Data.Lazy.defer.

If it's the case that either of those gets you to a significant reduction (which I think is likely), I think that's likely a good enough "less miserable" option 😆.

@rhendric
Member Author

Could you elaborate on what you mean by those approaches?

@natefaubion
Contributor

Could you elaborate on what you mean by those approaches?

Sorry, I'm just saying it would be useful in general to know how far various naive approaches would get you in removing calls to a lazy runtime initialization call. For example, lots of mutual bindings are just functions that call each other recursively. If all the bindings are just functions, there's no need for a runtime call. Maybe that gets you to 50% of bindings. That's pretty good for something that's a single case on the RHS. You could extend that to if it's a call to defer. Maybe that gets you to 60%. You could then extend that to analyze a full application spine. Maybe that gets you to 90%. Maybe you implement this PR, and that gets you to 99%.

@rhendric
Member Author

I know you said this could wait but I was curious too, so here's a comparison of a few naive approaches for the preliminary 0.15 package set.

| Approach | Lazy binding count | Relative improvement |
| --- | --- | --- |
| Make every recursive binding lazy (reference) | 758 | 0% |
| ... except for groups that are entirely function literals | 224 | 70% |
| ... contained in zero or more object literals and constructors | 102 | 87% |
| This PR | 41 | 95% |

This PR doesn't do anything special with either defer function, so I didn't try that out; but I think we're in the good-enough zone without it?

@natefaubion
Contributor

Nice! That's great information. Thanks for doing that. Yeah, I think as long as @JordanMartinez feels good with his review, I have no problem with this moving forward 👍.

@JordanMartinez
Contributor

I think I'll spend tomorrow morning going through this again one last time. AFAICT, this PR does as Ryan explains in his above comments.

The only thing that stood out to me was the usage of Data.Function.Uncurried (runFn3), which won't normally exist unless the functions package is installed. However, I think that usage exists because it gets inlined later (i.e. the compiler looks for such usages and inlines what is otherwise a curried function into an uncurried one). @rhendric, is that correct?

@rhendric
Member Author

Yup! As long as that inliner remains downstream of this transformation, we don't pick up any dependency on functions.

@JordanMartinez
Contributor

Since there's been a lot of back and forth between Ryan and myself as I try to understand this, I'm writing this post to summarize things. This post may still be wrong in some areas (I'm guessing Ryan will correct me in various places in a follow up comment), but at least we can direct people to here rather than forcing them to read the entire thread above.

Summary

General Goal

This PR's primary concern is the correct initialization order of recursive binding groups (i.e. bindings that reference one another). The easiest way to implement this is to initialize all such bindings lazily and defer to the runtime for initialization checking (e.g. if foo needs bar to be initialized and bar hasn't yet been defined, a runtime error is thrown). A more performant way is to lazily initialize only the bindings that actually need such laziness, but knowing whether a binding can be initialized lazily or strictly requires analysis of its usage. This PR introduces such analysis using only information from the language's syntax, not from other sources of information (e.g. the type checker).

Key Concept: Delay and Force

Two concepts, delay and force, are central to the analysis of recursive bindings. Delay and force are properties of expressions.

Delay = a non-negative number of lambdas enclosing a var (always known)

Delay for x Expression
0 x
1 \_ -> x
2 \_ -> \_ -> x
2 \_ _ -> x

Force = a non-negative number of args being applied to an expression (sometimes known, sometimes unknown)

Force Expression
0 f
1 f x
2 f x x
2 f x \y -> ...
* f is passed as an arg to some other function h, which may apply it to any number of args

To refer to these concepts, we use the following syntax:

  • x#n = a variable x with delay n (n lambdas surround x)
  • x!n = an expression x with force n (n args are applied before x is reduced):
    • when force is known, n is a number (e.g. x!4)
    • when force is unknown, n is an asterisk (e.g. x!*)

Taken together, we get:

  • x#d!f = an expression x with delay d and force f
Revising the above table, we get the following:
Delay Force Expression Comments
0 0 f#0!0 -
0 0 (\_ -> f#0!0) a Although f is enclosed in a lambda, the immediate application of an arg a undoes the lambda's delay. So, we still consider f to have delay 0.
0 0 (\_ _ -> f#0!0) a b -
0 1 f#0!1 x -
0 1 (\_ -> f#0!1 x) a As above, the immediate application of the arg a undoes the lambda's delay, so f still has delay 0.
0 2 f#0!2 x y -
- - - -
1 0 \_ -> f#1!0 -
1 0 (\_ _ -> f#1!0) a Since the lambda takes two args and only one is applied, f is still enclosed by one lambda. So, f has delay 1.
1 1 \_ -> f#1!1 x -
1 1 (\_ _ -> f#1!1 x) a Since the lambda takes two args and only one is applied, f is still enclosed by one lambda. So, f has delay 1.
1 2 \_ -> f#1!2 x y -
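As an aside, JavaScript's own `let` semantics can illustrate the delay-0-vs-delay-1 distinction (an illustrative analogy, not the compiler's mechanism): a delay-0 reference needs its target immediately, while a delay-1 reference can be created before its target exists.

```javascript
// Accessing a `let` binding before its declaration throws (temporal dead
// zone), which mimics dereferencing a not-yet-initialized binding.
let caught = null;
try {
  (_ => f)(null); // delay 0: the immediate application dereferences f right now
} catch (e) {
  caught = e; // ReferenceError: Cannot access 'f' before initialization
}

const h = _ => f; // delay 1: f is not dereferenced until h is called
let f = 42;       // f is initialized here
console.log(h(null)); // 42: by the time h runs, f exists
```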

Key Concept: Argument Propagation

Argument Propagation in non-Recursive Binding Groups

Let's consider the following non-recursive binding group to highlight an important point: "argument propagation".

let
  a = b 3 4
  b = \_ -> c 1 2
  c one two four five = one + two + four + five

If we were to annotate the above binding group expressions with delay and force annotations, it might look like this:

let
  a = b#0!2 3 4
  b = \_ -> c#1!2 1 2
  c one two four five = one + two + four + five

To initialize a, obviously b must have already been initialized. However, to initialize b, does c need to be initialized? Since b wraps c in a lambda, it would seem like it's not necessary. However, a calls b with two arguments. The first arg will unwrap the lambda and the second arg will be applied to c as its third argument. In other words, a is the same as c 1 2 4.
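This equivalence is easy to check with a curried JavaScript transcription of the example (names as in the snippet above; illustration only):

```javascript
// c one two four five = one + two + four + five
const c = one => two => four => five => one + two + four + five;

// b = \_ -> c 1 2
const b = _ => c(1)(2);

// a = b 3 4: the 3 unwraps b's lambda, and the 4 propagates to c as its third arg
const a = b(3)(4);

// a now behaves exactly like c 1 2 4, a function awaiting c's fourth arg
console.log(a(5)); // 12, i.e. 1 + 2 + 4 + 5
```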

Looking at just b's definition, a's need for c isn't immediately obvious. But when we analyze their relationships using delay and force, this need appears.

In order for a to be ready for use (i.e. a!0, or a with no args passed to it), all references in its initializer with delay 0 (i.e. b) must be ready for use with their corresponding force (i.e. 2, since b!2). b!0 would be ready for use without c!0 being ready, because c is protected by a lambda; but b!2 is a different matter, because arguments passed to b "propagate" to functions in its definition (e.g. c).

In other words, in order for b to be ready for use at force f, all references in its initializer (i.e. c since no others exist) need to be ready for use with a force h that can be calculated based on f. If we keep this "argument propagation" in mind, we can figure out that formula: h = f - d + g where

  • f is the force at which b is ready for use
  • d is c's delay (i.e. the number of lambdas in which c is enclosed)
  • g is c's force (i.e. the number of args already being passed to c)
  • h is the force at which c is ready for use before b is ready for use

Put differently, we start with the number of args passed to b: 2. Since those args propagate to its definition, the first few args will unwrap lambda(s) (if any) protecting references like c and pass the remaining args to references like c directly. Since c is protected by 1 lambda, the first arg negates that lambda (i.e. f - d). A force of 2 minus a delay of 1 means c will be passed 1 arg. However, c is already being passed two args in the definition of b. So, in actuality, c will be passed 3 arguments, which means c has to be ready for use with force 3. Once we've calculated the args that propagate to c (i.e. f - d), we need to take the result and add the number of args g already being passed to c in b's definition (i.e. f - d + g).

Thus, b is ready for use with force f only once c is ready for use with force h:

  • h = f - d + g
  • 3 = 2 - 1 + 2

Fortunately, since the above binding group is not recursive, this level of analysis is not needed. We can determine initialization order just by looking at what each thing references. Since a references b, b must be initialized before a. Similarly, since b references c, c must be initialized before b. Thus, the initialization order is c, then b, then a.
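The h = f - d + g bookkeeping is small enough to sketch as a helper (an illustrative function, not the compiler's actual implementation):

```javascript
// Given a use of mainRef at force f, and a reference in mainRef's initializer
// with delay d and force g, return the force at which that reference must be
// ready: null when d > f (its lambdas are never fully unwrapped), Infinity
// when f is unknown ("*"), and f - d + g otherwise.
function requiredForce(f, d, g) {
  if (f === "*") return Infinity;
  if (d > f) return null;
  return f - d + g;
}

// From the example: a uses b at force 2, and b's initializer contains c#1!2,
// so c must be ready at force 2 - 1 + 2 = 3.
console.log(requiredForce(2, 1, 2)); // 3
```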

Argument Propagation in Recursive Binding Groups

However, this "argument propagation" analysis is needed when dealing with recursive binding groups. Let's use the below annotated example for illustration purposes:

f = g 1 2
g x = h 3 x
h x y z = f x y z

If we call f, what happens? Reducing step by step, we get...

f
g 1 2
(\x -> h 3 x) 1 2
(      h 3 1)   2
h 3 1 2
(\x y z -> f x y z) 3 1 2
f 3 1 2
(g 1 2) 3 1 2
(h 3 1 2) 3 1 2
(f 3 1 2) 3 1 2
((f 3 1 2) 3 1 2) 3 1 2
... infinite loop, applying three more args on every cycle

This highlights a problem that can occur in recursive binding groups but is impossible in non-recursive ones: because the bindings in a recursive binding group reference one another, it is possible to define a group where an entry point causes arguments to propagate to references in the group without bound. Using the delay-force analysis, we can 1) detect whether and when this occurs and 2) provide a better error message if/when it does. Unfortunately, to account for this possibility of infinite argument propagation, the analysis must model every reference in a recursive binding group as though it could be called with anywhere from 0 to infinitely many args.
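A depth-guarded JavaScript sketch of a group like this one (the thunking and the guard are ours, added so the demo terminates with an error instead of hanging; it is not how the compiler encodes bindings):

```javascript
let depth = 0;

// f = g 1 2, thunked so each re-entry into f's initializer can be counted
const f = () => {
  depth += 1;
  if (depth > 5) throw new RangeError("arguments propagate forever");
  return g(1)(2);
};

// g x = h 3 x
const g = x => h(3)(x);

// h x y z = ... f ...: h hands control back to f, restarting the cycle
const h = x => y => z => f();
```

Forcing `f()` cycles f → g 1 2 → h 3 1 2 → f until the guard trips.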

Key Concept: Unknown Force

Why is a force only sometimes known? Because of let bindings and case statements.

As a (probably non-compiling) example of let bindings, look at alpha's x case where b is bound to bravo. How many args will b be passed in the expression that follows the in keyword? Since we are only doing a syntax-based analysis, we cannot know. If we had more information, perhaps we could know.

                                                        {-
bravo :: Int -> Number                                  -}
bravo = (\_ -> alpha#0!1) {} 3

                                                        {-
alpha :: Int -> Int -> Number                           -}
alpha = case _ of
  0 -> \_ -> 1.0
  x -> \y -> do
    let
      b = (\_ -> bravo#0!*) {}
    in ...                          {-
    possible usages of `b`:
      no args - b
      1 arg   - b 0
      2 args  - b y x
      3 args  - b y y 3
      etc...                       -}

Since we cannot know based on syntax alone, we must make the code work in the worst case scenario (arguments propagate infinitely). To simplify this, we assume that alpha depends on every other identifier in the recursive binding group before alpha can be initialized.

Delay-Force Analysis of Recursive Binding Groups

Lazy or Strict Initialization

An identifier is initialized

  • strictly when it does not depend on any other identifier in the binding group.
  • lazily when it depends on itself (directly, or indirectly via another identifier) as part of its initialization.

Additionally, two or more identifiers that depend on one another should be initialized in the same group.

Annotated Example

It's pretty tricky to write examples that are recursive and type-check and aren't rejected by different cycle-checking logic in the type checker and demonstrate some of the weirder edge cases of these rules. However, the following delay-and-force-annotated example seems to suffice as a good example for understanding this analysis.

alpha :: Int -> Int -> Number
alpha = case _ of
  0 -> bravo#1!0
  1 -> charlie#1!0
  2 -> \y -> if y > 0 then bravo#2!1 y else charlie#2!1 y
  x -> \y -> delta#2!2 y x

bravo :: Int -> Number
bravo = (\_ -> alpha#0!1) {} 3

charlie :: Int -> Number
charlie = (\_ -> alpha#0!1) {} 4

gamma :: Int -> Int -> Number
gamma = (\_ -> alpha#0!1) {} 4

delta :: Int -> Int -> Number
delta =
  let b = (\_ -> bravo#0!*) {}
  in \x y -> if x == y then b 0 else 1.0

Rules

Read the three rules below with the following question in mind for the first two:

When is a binding x ready for use with a given force?

  • USE-INIT - after x has been initialized
  • USE-USE - only once the references in x's initializer are ready for use with their appropriate force(s)

USE-INIT: Before a binding x is ready for use with any force, x must be initialized.

x = ...

-- before any of these can use `x`, `x` must be initialized.
-- Note: examples below omit force for `x` as it's not relevant.
withForce0 = x#0
withForce0' = (\_ -> x#0) 1
withForce1 = foo#0!1 x#0
withForce1' = (\_ -> foo#0!1 x#0) 1

USE-USE: Before a reference to mainRef is ready for use with force f,

  • when f is unknown, every reference in the initializer of mainRef (e.g. ref) must be ready for use with unknown force
     ref = ...
     mainRef = ref
     actualUsage = mainRef!*
    
  • when f is known, only references in the initializer of mainRef (e.g. ref) with delay d and force g where d <= f must be ready for use with force h, where h = f - d + g. References where d > f do not need to be ready for use (i.e. not enough args are passed to the usage of mainRef to fully remove all the lambdas protecting ref in mainRef's definition).
     ref = ...
     mainRef = ref#d!g
     actualUsage = mainRef!f
    
Calculation Table (Sorted by `f`, `d`, and `g`)
mainRef!f ref#d!g mainRef definition h (force at which ref must be ready) Calculation Comments
mainRef!0 ref#0!0 ref 0 0 - 0 + 0 Calling mainRef with no args is the same as calling ref
mainRef!0 ref#0!1 ref a 1 0 - 0 + 1 Calling mainRef with no args is the same as calling ref with an arg applied
mainRef!0 ref#0!2 ref a b 2 0 - 0 + 2 Calling mainRef with no args is the same as calling ref with 2 args applied
mainRef!0 ref#1!0 \_ -> ref - d > f Calling mainRef with no args does not unwrap the lambda
mainRef!0 ref#1!1 \_ -> ref a - d > f Calling mainRef with no args does not unwrap the lambda
mainRef!0 ref#1!2 \_ -> ref a b - d > f Calling mainRef with no args does not unwrap the lambda
mainRef!0 ref#2!0 \_ _ -> ref - d > f Calling mainRef with no args does not unwrap the lambda
mainRef!0 ref#2!1 \_ _ -> ref a - d > f Calling mainRef with no args does not unwrap the lambda
mainRef!0 ref#2!2 \_ _ -> ref a b - d > f Calling mainRef with no args does not unwrap the lambda
mainRef!1 ref#0!0 ref 1 1 - 0 + 0 Calling mainRef with 1 arg propagates that arg to ref as arg 1
mainRef!1 ref#0!1 ref a 2 1 - 0 + 1 Calling mainRef with 1 arg propagates that arg to ref as arg 2
mainRef!1 ref#0!2 ref a b 3 1 - 0 + 2 Calling mainRef with 1 arg propagates that arg to ref as arg 3
mainRef!1 ref#1!0 \_ -> ref 0 1 - 1 + 0 Calling mainRef with 1 arg is the same as calling ref
mainRef!1 ref#1!1 \_ -> ref a 1 1 - 1 + 1 Calling mainRef with 1 arg is the same as calling ref with an arg applied
mainRef!1 ref#1!2 \_ -> ref a b 2 1 - 1 + 2 Calling mainRef with 1 arg is the same as calling ref with 2 args applied
mainRef!1 ref#2!0 \_ _ -> ref - d > f Calling mainRef with 1 arg unwraps a lambda but not the final one using ref
mainRef!1 ref#2!1 \_ _ -> ref a - d > f Calling mainRef with 1 arg unwraps a lambda but not the final one using ref
mainRef!1 ref#2!2 \_ _ -> ref a b - d > f Calling mainRef with 1 arg unwraps a lambda but not the final one using ref
mainRef!* ref#0!0 ref 0-∞ f is unknown In the worst case scenario, a large number of args passed to mainRef propagates directly to ref as all of its args
mainRef!* ref#0!1 ref a 0-∞ f is unknown In the worst case scenario, a large number of args passed to mainRef propagates directly to ref as all but one of its args
mainRef!* ref#0!2 ref a b 0-∞ f is unknown In the worst case scenario, a large number of args passed to mainRef propagates directly to ref as all but two of its args
mainRef!* ref#1!0 \_ -> ref 0-∞ f is unknown In the worst case scenario, a large number of args passed to mainRef unwraps the enclosed lambda and then propagates the remaining args directly to ref as all of its args
mainRef!* ref#1!1 \_ -> ref a 0-∞ f is unknown In the worst case scenario, a large number of args passed to mainRef unwraps the enclosed lambda and then propagates the remaining args directly to ref as all but one of its args
mainRef!* ref#1!2 \_ -> ref a b 0-∞ f is unknown In the worst case scenario, a large number of args passed to mainRef unwraps the enclosed lambda and then propagates the remaining args directly to ref as all but two of its args
mainRef!* ref#2!0 \_ _ -> ref 0-∞ f is unknown In the worst case scenario, a large number of args passed to mainRef unwraps the enclosed lambdas and then propagates the remaining args directly to ref as all of its args
mainRef!* ref#2!1 \_ _ -> ref a 0-∞ f is unknown In the worst case scenario, a large number of args passed to mainRef unwraps the enclosed lambdas and then propagates the remaining args directly to ref as all but one of its args
mainRef!* ref#2!2 \_ _ -> ref a b 0-∞ f is unknown In the worst case scenario, a large number of args passed to mainRef unwraps the enclosed lambdas and then propagates the remaining args directly to ref as all but two of its args
Calculation Table (Sorted by `h`)
mainRef!f ref#d!g mainRef definition h (force at which ref must be ready) Calculation Comments
mainRef!0 ref#1!0 \_ -> ref - d > f Calling mainRef with no args does not unwrap the lambda
mainRef!0 ref#1!1 \_ -> ref a - d > f Calling mainRef with no args does not unwrap the lambda
mainRef!0 ref#1!2 \_ -> ref a b - d > f Calling mainRef with no args does not unwrap the lambda
mainRef!0 ref#2!0 \_ _ -> ref - d > f Calling mainRef with no args does not unwrap the lambda
mainRef!0 ref#2!1 \_ _ -> ref a - d > f Calling mainRef with no args does not unwrap the lambda
mainRef!0 ref#2!2 \_ _ -> ref a b - d > f Calling mainRef with no args does not unwrap the lambda
mainRef!1 ref#2!0 \_ _ -> ref - d > f Calling mainRef with 1 arg unwraps a lambda but not the final one using ref
mainRef!1 ref#2!1 \_ _ -> ref a - d > f Calling mainRef with 1 arg unwraps a lambda but not the final one using ref
mainRef!1 ref#2!2 \_ _ -> ref a b - d > f Calling mainRef with 1 arg unwraps a lambda but not the final one using ref
mainRef!0 ref#0!0 ref 0 0 - 0 + 0 Calling mainRef with no args is the same as calling ref
mainRef!1 ref#1!0 \_ -> ref 0 1 - 1 + 0 Calling mainRef with 1 arg is the same as calling ref
mainRef!0 ref#0!1 ref a 1 0 - 0 + 1 Calling mainRef with no args is the same as calling ref with an arg applied
mainRef!1 ref#0!0 ref 1 1 - 0 + 0 Calling mainRef with 1 arg propagates that arg to ref as arg 1
mainRef!1 ref#1!1 \_ -> ref a 1 1 - 1 + 1 Calling mainRef with 1 arg is the same as calling ref with an arg applied
mainRef!0 ref#0!2 ref a b 2 0 - 0 + 2 Calling mainRef with no args is the same as calling ref with 2 args applied
mainRef!1 ref#0!1 ref a 2 1 - 0 + 1 Calling mainRef with 1 arg propagates that arg to ref as arg 2
mainRef!1 ref#1!2 \_ -> ref a b 2 1 - 1 + 2 Calling mainRef with 1 arg is the same as calling ref with 2 args applied
mainRef!1 ref#0!2 ref a b 3 1 - 0 + 2 Calling mainRef with 1 arg propagates that arg to ref as arg 3
mainRef!* ref#0!0 ref 0-∞ f is unknown In the worst case scenario, a large number of args passed to mainRef propagates directly to ref as all of its args
mainRef!* ref#0!1 ref a 0-∞ f is unknown In the worst case scenario, a large number of args passed to mainRef propagates directly to ref as all but one of its args
mainRef!* ref#0!2 ref a b 0-∞ f is unknown In the worst case scenario, a large number of args passed to mainRef propagates directly to ref as all but two of its args
mainRef!* ref#1!0 \_ -> ref 0-∞ f is unknown In the worst case scenario, a large number of args passed to mainRef unwraps the enclosed lambda and then propagates the remaining args directly to ref as all of its args
mainRef!* ref#1!1 \_ -> ref a 0-∞ f is unknown In the worst case scenario, a large number of args passed to mainRef unwraps the enclosed lambda and then propagates the remaining args directly to ref as all but one of its args
mainRef!* ref#1!2 \_ -> ref a b 0-∞ f is unknown In the worst case scenario, a large number of args passed to mainRef unwraps the enclosed lambda and then propagates the remaining args directly to ref as all but two of its args
mainRef!* ref#2!0 \_ _ -> ref 0-∞ f is unknown In the worst case scenario, a large number of args passed to mainRef unwraps the enclosed lambdas and then propagates the remaining args directly to ref as all of its args
mainRef!* ref#2!1 \_ _ -> ref a 0-∞ f is unknown In the worst case scenario, a large number of args passed to mainRef unwraps the enclosed lambdas and then propagates the remaining args directly to ref as all but one of its args
mainRef!* ref#2!2 \_ _ -> ref a b 0-∞ f is unknown In the worst case scenario, a large number of args passed to mainRef unwraps the enclosed lambdas and then propagates the remaining args directly to ref as all but two of its args

USE-IMMEDIATE: Initializing a binding x is equivalent to requiring a reference to x to be ready for use with force 0.

Implementation of the Analysis

There are two parts to modeling the relationships of these bindings as a graph: a finite "base" graph, and an infinite graph built from shifted copies of it.

First, we model the relationships between bindings using a finite graph.

  • A vertex/node is a pair of an identifier in the recursive binding group and some force: (identifier, force).
  • An edge, (identifier1, f) -> (identifier2, g), indicates that identifier1 is ready for use with force f only once identifier2 is ready for use with force g

When force is unknown, we assume it's infinite (e.g. (bravo, ∞)).

Base Graph

However, to account for the possibility of arguments propagating infinitely via the USE-USE rule, we make a copy of this "base" graph and shift it up one force level. We do this an infinite number of times to produce an infinite graph.

  • Black lines = original "base" graph
  • Red lines = original "base" graph shifted up 1 force level
  • Green lines = original "base" graph shifted up 2 force levels
  • ...

Infinite Graph

The blue lines are the original relationships that the 'binding group desugaring' pass identified, and which caused it to place these identifiers into the same recursive binding group. Conceptually, we can reproduce the blue lines by taking the infinite graph and collapsing it back down to a finite graph by dropping the 'force' part of the node and deduplicating the resulting nodes and edges. In other words, this infinite graph stores the same general structural information as the binding group desugaring pass' finite graph.

This resulting infinite graph produced by copying the base graph upwards ad infinitum is what we analyze. Our goal is to determine what vertices are irreflexively reachable from each identifier with force 0 (e.g. alpha!0).

The analysis exploits the fact that some edges in higher force levels (e.g. (alpha, 2) -> (bravo, 1) red edge) are copies of lower force levels (e.g. (alpha, 1) -> (bravo, 0) black edge) shifted up one force level. We reduce the infinite graph to a finite graph using the below 4 rules:

  1. Because every identifier in a recursive binding group references the others (i.e. the subgraph of blue edges at the very top is strongly connected), when searching for vertices reachable from (i, 0) for a given identifier i, as soon as we visit any infinite-level vertex (e.g. (bravo, ∞)), we can stop searching and claim that every identifier is reachable from i. This can happen for two reasons:
    • Directly (e.g. delta–bravo edge)
    • Indirectly via an "infinite ladder" (next rule).
  2. If we ever revisit the same column a second time further up, we know we can trace copies of the path we took as a ladder to go arbitrarily high because the graph is made out of copies of itself.
    • Visualizing this concept, we get something like this, which has no relationship to the actual example above.
    • ladder-revisit-higher-one
  3. If we ever revisit the same column a second time lower down, we can prune that path because the identifiers reachable from that vertex will be a subset of the identifiers reachable from the higher vertex, again because the graph is made out of copies of itself.
    • Visualizing this concept, I think we get something like this, which has no relationship to the actual example above.
    • prune-lower-level
  4. If searching for vertices reachable from a given identifier i with force f (e.g. (i, f)) returns multiple vertices from the same column (e.g. x!0, x!1, and x!2), we only continue our search using the vertex with the maximum force of all of them (e.g. x!2).
    • Using the example from USE-USE, the ref reference may be used multiple times with different forces in mainRef's definition.

Each identifier's search either aborts (because it reached an infinite-level vertex, i.e. some force is unknown) or terminates with a finite set of vertices. Once we know which vertices are reachable from a given identifier, we use that data to produce the below finite graph:

Final Graph

Using the above finite graph, we conclude the following:

  • lazy vs strict initialization:
    • Acyclic nodes (e.g. alpha) are strictly initialized
    • Cyclic nodes (e.g. bravo, charlie, and delta) are lazily initialized.
  • identifier initialization order
    • the reverse topologically-sorted order is the correct initialization order
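These two conclusions can be sketched over a small finite graph (a hypothetical adjacency-list encoding, not the compiler's data types; edges point from a binding to the bindings it must wait for):

```javascript
function analyze(graph) {
  const names = Object.keys(graph);

  // Is `to` reachable from `from` in one or more steps?
  const reaches = (from, to) => {
    const seen = new Set();
    const stack = [...graph[from]];
    while (stack.length > 0) {
      const n = stack.pop();
      if (n === to) return true;
      if (seen.has(n)) continue;
      seen.add(n);
      stack.push(...graph[n]);
    }
    return false;
  };

  // Bindings on a cycle (reachable from themselves) must be initialized lazily.
  const lazy = names.filter(n => reaches(n, n));

  // Depth-first post-order lists dependencies before dependents, i.e. a
  // reverse topological order; marking nodes up front tolerates cycles.
  const order = [];
  const visited = new Set();
  const visit = n => {
    if (visited.has(n)) return;
    visited.add(n);
    graph[n].forEach(visit);
    order.push(n);
  };
  names.forEach(visit);

  return { lazy, order };
}

// Hypothetical final graph: alpha waits on nothing; bravo and charlie wait
// on each other; delta waits on itself.
const result = analyze({
  alpha: [],
  bravo: ["charlie"],
  charlie: ["bravo"],
  delta: ["delta"],
});
console.log(result.lazy); // ["bravo", "charlie", "delta"]
```

Within a cycle the order among its members is arbitrary, which is fine: those bindings are lazy, so any member can safely be installed before the others.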

@JordanMartinez (Contributor) left a comment

AFAICT, this is good to go! I'm not yet approving due to the comments I have, but an approval will come shortly after that.

I think there are still parts of this code that are just hard for me to understand. I feel like I nearly grasp it but it's still somewhat out of reach. However, given that this code only triggers in recursive bindings (which tend not to be frequent), that it fixes an actual bug, that the tests feel comprehensive and actually pass, and that my general understanding is that the code correctly implements the solution you have proposed, I think this is ready to be merged.

I think the only question I have is whether this may introduce space leaks due to the heavy usage of laziness in this implementation.

Comment on lines +117 to +132
-- A clumsy hack to preserve TCO in a particular idiom of unsafePartial seen in Data.Map.Internal.
App a1 e1@(Var _ (Qualified (Just C.PartialUnsafe) (Ident up))) (Abs a2 i e2) | up == C.unsafePartial
-> App a1 e1 . Abs a2 i <$> handleExpr' e2
Contributor

This can be removed now that purescript/purescript-ordered-collections#57 has been merged.

I'm not sure if the ordered-collections dependency in the test directory's bower.json file only needs to be updated to master or whether all of them do.

Member Author

Since this is just a does-it-optimize issue and not a does-it-work issue, I don't think we need to worry about updating the test dependencies for this if we take it out.

But, continuing the conversation from purescript/purescript-ordered-collections#57 (comment):

@natefaubion

However... the real question here is whether unsafePartial case _ of should trigger TCO or not, which is difficult to answer, because there is no real spec for when TCO should specifically not trigger. From a syntactic/elaboration standpoint, it's clearly not what we consider a tail call. However, the optimizer treats unsafePartial as irrelevant, and removes it, which coincidentally means that it triggers TCO. Same thing happens if you do something like unsafeCoerce $ go .... This issue will come up with any sort of inlining optimizations. Should inlining open up opportunities for TCO, or should TCO only apply to source syntax?

That is the question here, I agree. If unsafePartial, unsafeCoerce, etc. are meant to be transparent to TCO, then they also have to be transparent to this analysis (via hacks like the above or something more principled that I haven't come up with yet).

Do we know how we want to make that decision?

Contributor

I think that, generally, the expectation around optimizations is that inlining opens up more opportunities for it. The advantage is that with a decent inliner, more things can be considered tail recursive. The disadvantage is that relying on optimizations for something like stack safety can be quite brittle.

Take for example, existentials. We can CPS encode existentials (or use purescript-exists), but that means it's impossible to write safe, fast, tail-recursive code with them. I've run up against this so, so many times and it's incredibly frustrating. Language support would be nice, but another alternative is if the eliminator were inlined (since it's the equivalent of aliasing unsafeCoerce), it would just work. Another option is tailRec, but it has significant overhead. If we could totally eliminate the overhead of tailRec, I might be OK with removing TCO altogether.

That doesn't really answer the question, except to say:

  • Given the status quo, I think we should grant the expectation that unsafeCoerce/unsafePartial are removed.
  • If we only had optimized tailRec (with multiple parameters), the issue would be moot because then it's no longer a correctness problem, but truly an optimization problem.

@rhendric (Member Author) Apr 13, 2022

I like the sound of that, but to be clear, when talking about optimizing tailRec, do you mean the case of tailRec (something that inlines to a literal lambda where there is a `Loop` or `Done` lexically present in every tail position) initValue, or the case of tailRec (arbitrary expression) initValue? Because the first sounds straightforward and lovely, and the second sounds impossible unless you have other special cases in mind that you would carve out.

Assuming we agree on what the above means, the logical plan would then be: keep the hacks in this analysis for 0.15, implement tailRec optimization for 0.16 and warn everyone to use it if they care about stack safety, then remove the hacks in 0.17?

Contributor

Yes, I mean the former. tailRec is nice because if the optimization bails it's still stack safe, but it may allocate more, so it's not an issue of something working vs not working, just how efficient it is. I don't know if we can realistically get rid of TCO in general (tailRec is implemented in terms of TCO, after all), but we can dissuade its usage, and basically say that it's necessary if you want to guarantee stack safety.

@JordanMartinez (Contributor) left a comment

Fantastic work! Thanks for your patience as I've tried to wrap my head around this.

@JordanMartinez (Contributor)

Just FYI. I haven't merged this because I wasn't sure whether more needed to be done in this thread: #4283 (comment)

This commit adds src/Language/PureScript/CoreFn/Laziness.hs, a
transformation on CoreFn that introduces call-by-need semantics to
bindings in recursive binding groups. This is the only way to restore
soundness to those groups without forbidding many programs we've
previously allowed, as historically any reference to another identifier
within the same binding group has been permitted as long as it occurs
within a lambda, regardless of whether that lambda is reduced in the
course of initialization.

The rest of PureScript remains call-by-value. Also, the imperfect
prohibitions on cyclic references enforced during desugaring and type
checking remain in place; consequently, this change does not alter the
set of programs accepted by the compiler.

For details, see the extensive commentary in Laziness.hs.

This is unlikely to break a working program, but the upshot for users is
that it's now possible to get a run-time error when dereferencing an
identifier in a recursive binding group before it has been initialized,
instead of silently getting an `undefined` value and having that maybe
or maybe not lead to an error somewhere else.
@rhendric rhendric force-pushed the rhendric/fix-4179 branch from 9fe742c to 814157b Compare April 14, 2022 17:31
@rhendric (Member Author)

I think we've settled on leaving the hack in place until we can give users a performant way to explicitly use tailCall to demand TCO, at which point we can remove it.

(The commit I just force-pushed is purely a rebase, squash, and commit message rewrite.)

@JordanMartinez (Contributor) left a comment

Yeah, you can click on the "force pushed" part of the last state change and see the diff there.

I'm approving this again to show that I've checked that.

@JordanMartinez (Contributor)

I think we've settled on leaving the hack in place until we can give users a performant way to explicitly use tailCall to demand TCO, at which point we can remove it.

Just as a sanity check, @natefaubion do you agree? If so, I'll merge.

@natefaubion (Contributor)

Yes, I agree with that statement.



Development

Successfully merging this pull request may close these issues.

Changing an instance name can cause a TypeError
