It’s been a little over a year since I introduced the Mgmt Configuration Language. A lot has happened since then, and I’d like to introduce some of the missing features that weren’t available when the language was first introduced. If you haven’t already read that post, please start there and come back when you’re finished. In this article we’ll learn about classes.
You might remember that the mgmt
language called mcl
has both statements
, and expressions
. Statements
ultimately produce the resources
and edges
that make up our output graph,
and expressions produce the values (strings, integers, lists, etc) that make up
the inputs to those statements.
As it turns out, it’s useful to have a grouping construct for statements to make them composable. That grouping construct is called the class statement. It looks like this:
# define a class named foo
class foo {
pkg "cowsay" {
state => "installed",
}
# you can add any number of statements in here...
}
To use a class statement, you reference a defined class by using the include
statement. This causes the scoped contents of the class definition to
“empty out” into the scope of the include
. It looks like this:
# it's now as if the resources and edges in the class were typed here
include foo
I think this is pretty straightforward.
One interesting thing about classes is that they can capture scope. Examine the following:
$x1 = "t1" # this variable gets captured by bar
class bar {
print $x1 {} # captures variable in the parent scope
}
include bar
In this scenario, when the class is defined, you’ll see that it “captures” the
$x1
variable from the parent scope. Later on when the class is include
-ed,
it builds the print
resource with the original $x1
variable.
You might assume that we’re lazy and we actually just use the $x1
variable
that exists in the same scope, but if you experiment, you’ll find that we
actually “do the right thing”. As an example, assume the following two files:
# whatever.mcl
$x1 = "t1"
class baz {
print $x1 {}
}
and
# main.mcl
import "whatever.mcl"
$x1 = "bad1" # this isn't consumed anywhere
include whatever.baz
If we run main.mcl
, you’ll see that it uses an imported class that was defined
in a different scope, and in fact, it retains the $x1
from that original scope
where it was created. Don’t worry that you haven’t seen import
yet, we’ll get
to that shortly.
It would be unfortunate if class definitions were only “single-use”, and it
would also be unfortunate if you couldn’t mix the local scope into a class when
you include
it. As a result, classes can be “parameterized”. They look like
this:
class fruit($colour, $shape) {
print $x1 {}
if $colour == "red" && $shape == "round" {
print "i am a tomato" {}
} else {
$n1 = "i am a " + $shape + " shaped fruit that's " + $colour
print $n1 {}
}
}
include fruit("red", "round")
include fruit("red", "round") # de-duplication happens at the resource level
include fruit("blue", "round")
As you can see, we can include
the class multiple times, and produce multiple
resources. If for some reason we end up producing identical (or more
specifically, compatible) resources, then they are de-duplicated by the resource
engine, and are not compile errors. This solves a well-known issue (I say bug)
with the Puppet tool. (This works correctly in mgmt and avoids the headaches
with two different modules requiring the same package dependency.)
When you define a class, you may also specify some type restrictions on the parameters. By default when you do not specify the type of a value, it is determined automatically by the type unification engine. This is the recommended way to use mgmt. You may add an additional type rule to the class parameters if you’d like. It looks like this:
import "fmt"
# specifying a fixed type for $b is a compile error, because it's sometimes str!
class c1($a, $b []str) { # note that we specified the type of $b
print $a {
msg => fmt.printf("len is: %d", len($b)),
}
}
# note that the class can have two separate types for $b
include c1("t1", [13, 42, 0, -37,]) # len of second arg is 4
include c1("t2", "hello") # len of second arg is 5
In this case you can see that while the code could allow more than one
different type for the $b
parameter, it was restricted to only allow []str
(a list of strings) and as a result, the above code won’t compile. It’s good to
know that these kinds of issues are always found at compile time, and not at
runtime. Once the program has compiled successfully, it always runs safely.
To get this to compile, you can either remove the type specification, or you can
remove the last line. If you remove the type specification, compile it, and look
at the internals, you might notice that the c1
class is actually polymorphic
before compile time, but afterwards at runtime every single variable and value
has a single static type.
The mgmt language is designed to be very safe. This is important because when
using a powerful, higher-level tool, you don’t want a simple programming error
to cause you to destroy an entire data-center, or triple your aws:ec2
bill.
As a result many common bugs are prevented at compile-time. Consider the
following:
$x1 = "hello"
$x1 = "world" # compile-time error
print $x1 {}
As you can see in the above example, variable re-assignment is not permitted, and as a result, that code will not compile. Variable shadowing however, is permitted, because it has many useful implications. It looks like this:
$msg = "nobody" # won't get used
class shadowme($x1, $msg) {
print "message" {
msg => $x1 + $msg,
}
}
include shadowme("hello", "world")
It’s important that the class author be able to define their own variables, and
as a result, they’re in charge of their own scope. The class parameter $msg
shadows the global var, and is what is actually used. As a quick quiz, consider
this example. What message will be printed?
$msg = "a"
class shadowme($msg) {
$msg = "d"
if true {
$msg = "c"
print "shadowed" {
msg => $msg,
}
}
}
include shadowme("b")
If you think you’ve got it all figured out, what do you think this will print? Getting the answer is easy, telling me why is more important.
$msg = "a"
class shadowme($msg) {
$msg = "c"
if true {
$msg = "d"
}
print "shadowed" {
msg => $msg,
}
}
include shadowme("b")
While it is possible to write the above code, it’s probably not considered good practice. Avoid using shadowing wherever possible, to avoid confusing the user. The answer for the above two problems was “c”.
In case it’s not implied, you should know that classes can be nested. The following is perfectly valid code:
class outside {
class inside {
print "hello" {}
}
include inside
}
include outside
It’s also worth mentioning that the following code will not compile:
class outside {
class inside {
print "hello" {}
}
}
include outside
include inside # the inside class is not in scope here, so this fails
We can’t generate new classes to appear in scope from the include
of a parent
class. As an example, the following is also invalid, and won’t compile:
class outside($b) {
if $b {
class inside {
print "hello" {}
}
}
}
include outside(true)
include inside # the inside class is not in scope here, so this fails
While this might seem unfortunate, it’s actually an important consequence of understanding the statement scope in general. The statement scope includes the classes and variables that exist at that time, and at compile time, they’re bindings (data flow mappings) are static. Consider the following code:
$foo = "a"
class weird($cond) {
$foo = "b"
if $cond {
$foo = "c"
}
print "hello" {}
}
include weird(true) # instead of a constant, this value could change over time
print "world" {
msg => $foo,
}
What does this print? While this code is valid and compiles, it must print “a”. If it didn’t, it would mean the variable binding could change over time, which would be very complex, and error prone. For a declarative DSL, this isn’t a net positive in my opinion. To do so would require a “higher-order” FRP.
For similar reasons to what was just mentioned, classes can’t be recursive. You
can include
other classes inside your class, but the chain must terminate
statically. The following code is valid:
class c1($m) {
print $m {}
}
class c2 {
include c1("hello")
print "world" {}
}
include c1("this is a test...")
include c2
The following code is not valid:
class c1($cond) {
print "nope" {}
if $cond {
include c1(false)
} else {
print "done" {}
}
}
include c1(true)
While it appears that it would terminate and be a “safe” program, for the reasons I mentioned above, the compiler will catch this and forbid it at compile time. Also not allowed:
class c1 {
include c2
}
class c2 {
include c1
}
This all helps prevent a common cause of bugs in programs: non-terminating programs that infinitely recurse. Iteration is not always your friend. But our compiler will be.
When you include
a class, it effectively copies the contents and dumps them
out in the requested location. You might think that this gets expensive quickly,
but in fact it is quite efficient. Consider the following code:
$big_data = "Now this is a story, all about how, my code got..."
class lyrics($in) {
print "${in}" {
msg => "lyrics: " + $big_data,
}
}
include lyrics("fresh code")
include lyrics("fresh rhymes")
Here’s a graph of the data flows that the function engine will run:
The function engine data flows of the above code. Notice how the raw value of the $big_data string only appears once. |
Even though the $big_data
variable gets pulled into the lyrics
class for
each invocation, the initial copy is the only one that’s used in the function
graph. This is because we perform an intelligent “light copy” when running
include
, and anything that is a bound constant, remains that way. Enjoy the
memory gains!
As I mentioned in the earlier language post,
code is actually a graph
and as a result can be written out-of-order. If you experiment with re-arranging
the ordering of the statements, you’ll find that everything still works
correctly. This can be useful for situations when you might prefer to define the
class
before it is used, or for when you want to see your business logic that
combines different include
s at the top, and have the internals down at the
bottom.
In general, it is recommended that you avoid out-of-order code, but at the moment it is allowed. There is a compile-time constant that specifies whether it is allowed or not, and there is another one that specifies whether it should allow it, but generate a warning. Neither is fully implemented because we are missing one function in our graph library. If you can write a function that returns whether a given ordering is a valid subset of one of the possible topological sorts for that graph, then please let us know!
I hope you try out the language! More documentation is available in the language guide.
Happy Hacking,
James
You can follow James on Mastodon for more frequent updates and other random thoughts.
You can follow James on Twitter for more frequent updates and other random thoughts.
You can support James on GitHub if you'd like to help sustain this kind of content.
You can support James on Patreon if you'd like to help sustain this kind of content.
Your comment has been submitted and will be published if it gets approved.
Click here to see the patch you generated.
Comments
Nothing yet.
Post a comment