This blog is not dead. It hasn’t seen any new content in a long while, which is a shame, but I did use the hiatus to recharge some creative energy. Here’s hoping progress will be more steady next year.
Narrator voice: It wasn’t more steady. In fact, this article was kept around as an unfinished draft throughout 2019, not to be published before Config Management Camp 2020. It still lacks diagrams that make the graph code understandable. These may or may not be added at a future date. I am publishing this now because the narrative around GAPI development may be helpful for future contributors (or myself). I hope it still makes for a somewhat enjoyable read. (End edit 2020-02-06.)
Speaking of 2019, another installment of Config Management Camp is coming up, and I felt compelled to get in position to present a cool new thing. Call it Config Management Camp Driven Development, if you will. My motivation notwithstanding, I built a new feature for mgmt.
Specifically, I meant to implement an idea that was hatched during Config Management Camp 2018. It’s about allowing mgmt to run code that is mixed from Puppet’s DSL and mgmt’s own language. This article is mainly about code, but let it be mentioned that the code now works, is merged into mgmt’s master branch, and has passed its most basic smoke test. The amount of testing was way lower than I would prefer, but there are only so many hours in the week.
This article really only deals with source code internals of mgmt (as seen in December 2018). It may some day serve as an extended guide to how this code works, but might also help some aspiring mgmt contributors and give them some starting pointers. The code has since mutated and will likely do it again, but this post will stay frozen in the code’s original look and feel.
Creating the outline
Before I start bragging about the amazing hacker feats that led to this particular patch, let me put emphasis on the very beginning of the process. I started by chatting up James on IRC (#mgmtconfig on Freenode) and asking for help. If you want to get into hacking on the project, I suggest you do the same. We hopped into a video call basically instantly, and he walked me through the code I needed to base my work on.
Specifically, we talked about GAPIs. The GAPI subsystem is the part of mgmt that provides all the alternative ways of supplying configuration instructions in the form of resource graphs. Right now, the following GAPIs exist:
- lang: interprets mgmt’s own DSL
- puppet: receives graphs generated from Puppet manifests
- yamlgraph: allows constructing raw graphs in YAML
- empty: is a pseudo-interface that falls back to running with an empty graph
In the course of this article I will describe how we added the newest GAPI, langpuppet. Yes, naming it was Very Hard.
In general, the GAPI is a Go interface defined in gapi/gapi.go by the full name of gapi.GAPI. This is important, because the current convention sees all its implementations using the same base name of GAPI as well, but in respective packages, such as lang.GAPI or puppet.GAPI. The rest of the article discusses all aspects of the GAPI interface.
The new langpuppet.GAPI wraps the existing lang and puppet GAPIs. All methods basically multiplex the functionality to the “child” GAPIs. This was daunting without a complete architectural overview of the code base, but here is where James’ help was most critical.
Essentially, we started just as if we were building any new GAPI. That is, we copied an existing implementation and threw out its specifics. I picked the Puppet GAPI, because that’s what I’m most familiar with.
So, step one was to copy puppet/gapi.go to langpuppet/gapi.go, remove anything that alluded to Puppet, and adjust the package name.
Designing the CLI
The first interaction of the mgmt core with any GAPI is its CliFlags method. mgmt uses the urfave/cli library for its CLI needs. The CliFlags method returns a slice of cli.Flag structures that describe CLI flags that are specific to the respective GAPI. Here’s how this looked for Puppet at the time:
func (obj *GAPI) CliFlags() []cli.Flag {
return []cli.Flag{
cli.StringFlag{
Name: "puppet, p",
Value: "",
Usage: "load graph from puppet, optionally takes a manifest or path to manifest file",
},
cli.StringFlag{
Name: "puppet-conf",
Value: "",
Usage: "the path to an alternate puppet.conf file",
},
}
}
This is sufficient in order to display the --puppet and --puppet-conf flags for this GAPI in the output of mgmt help run. The parameters will also be accepted from the mgmt run command line.
GAPI-specific parameters are quite important, because when invoked, mgmt run will pick the GAPI it should use by inspecting the parameters. The lang GAPI will be used when --lang is present, and puppet if there is a --puppet. This is why we decided to not allow a call such as the following:
mgmt run --lang /path/to/code.mcl --puppet /other/path/to/manifest.pp
If the new langpuppet GAPI were to expose these same parameters, the engine would pick either one of the lang, puppet, or langpuppet GAPIs in a semi-deterministic fashion. In other words, building this interface would have required changes in the way that GAPIs are selected. We didn’t want this just now.
What we decided instead was to make a “mirror CLI” that has each flag of the lang and puppet GAPIs, but prefixed with lp- for langpuppet. I was afraid this would make for a bad user experience, but found that typing the prefixed flags comes rather naturally after all.
This is the code in the langpuppet GAPI to make it happen:
func (obj *GAPI) CliFlags() []cli.Flag {
langFlags := (&lang.GAPI{}).CliFlags()
puppetFlags := (&puppet.GAPI{}).CliFlags()
var childFlags []cli.Flag
for _, flag := range append(langFlags, puppetFlags...) {
childFlags = append(childFlags, &cli.StringFlag{
Name: FlagPrefix + strings.Split(flag.GetName(), ",")[0],
Value: "",
Usage: fmt.Sprintf("equivalent for '%s' when using the lang/puppet entrypoint", flag.GetName()),
})
}
return childFlags
}
With this code, the new GAPI announces itself to the CLI subsystem and makes its flags known. To be more precise, this piece of code does the customization for this. There are two pieces of boilerplate code that are necessary for the actual registration. One is this function, that appears verbatim in each GAPI implementation:
func init() {
	gapi.Register(Name, func() gapi.GAPI { return &GAPI{} }) // register
}
The registered function returns a value of the generic gapi.GAPI interface type, but &GAPI{} always refers to the local GAPI structure, such as langpuppet.GAPI.
The other required step is the import of the package in lib/deploy.go:
import (
...
_ "github.com/purpleidea/mgmt/lang"
_ "github.com/purpleidea/mgmt/langpuppet"
_ "github.com/purpleidea/mgmt/puppet"
_ "github.com/purpleidea/mgmt/yamlgraph"
...
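The registration pattern described above can be tried in isolation. The following is a minimal sketch, not mgmt's actual code: the one-method GAPI interface, the registry map, and the Names helper are all stand-ins invented for illustration. It only shows why an init function plus a blank import is enough to make an implementation known to the core.

```go
package main

import (
	"fmt"
	"sort"
)

// GAPI is a hypothetical stand-in for the real interface; the single
// method exists only so that the sketch has something to call.
type GAPI interface {
	Describe() string
}

// registry maps a GAPI name to a constructor function.
var registry = map[string]func() GAPI{}

// Register stores a constructor under the given name. Panicking on
// duplicate names is an assumption made for this sketch.
func Register(name string, fn func() GAPI) {
	if _, exists := registry[name]; exists {
		panic(fmt.Sprintf("a GAPI named %s is already registered", name))
	}
	registry[name] = fn
}

// Names returns all registered names in sorted order.
func Names() []string {
	names := make([]string, 0, len(registry))
	for name := range registry {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// langpuppetGAPI plays the role of the package-local GAPI struct.
type langpuppetGAPI struct{}

// Describe implements the stand-in interface.
func (obj *langpuppetGAPI) Describe() string { return "langpuppet" }

// init runs when the package is loaded. In a multi-package program, a
// blank import of the package is what triggers it.
func init() {
	Register("langpuppet", func() GAPI { return &langpuppetGAPI{} })
}

func main() {
	for _, name := range Names() {
		g := registry[name]() // build a fresh object from the constructor
		fmt.Println(name, "->", g.Describe())
	}
}
```

Because the registry holds constructors rather than objects, the core can create as many short-lived GAPI values as it likes, which matches the ephemeral receivers discussed in the next section.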
So now that the CLI basics are here, let’s make sure that mgmt will initialize the GAPI object properly.
Getting up and running
My focus was on the Init method first, but much later I learned that the Cli method is actually more interesting and crucial, and that it runs earlier. The last bit should not have been surprising, because this method is the GAPI’s way to determine whether it should even run, given a specific mgmt CLI invocation. The method makes this decision right at its start:
func (obj *GAPI) Cli(c *cli.Context, fs engine.Fs) (*gapi.Deploy, error) {
if !c.IsSet(FlagPrefix+lang.Name) && !c.IsSet(FlagPrefix+puppet.Name) {
return nil, nil
}
Returning nil, nil means that the method determined that its GAPI should not activate. It activates when either --lp-lang or --lp-puppet is specified. FlagPrefix is a constant that is specific to the langpuppet package and is set to the string lp-. The suffixes lang and puppet are not used here directly, either. Instead, the code imports the Name constants from the GAPI packages lang and puppet respectively.
Next, the method does some consistency checking and aborts if only one of the required flags is passed:
if !c.IsSet(FlagPrefix+lang.Name) || c.String(FlagPrefix+lang.Name) == "" {
return nil, fmt.Errorf("%s input is empty", FlagPrefix+lang.Name)
}
if !c.IsSet(FlagPrefix+puppet.Name) || c.String(FlagPrefix+puppet.Name) == "" {
return nil, fmt.Errorf("%s input is empty", FlagPrefix+puppet.Name)
}
In order to do the actual work of passing the flag values to the lang and puppet child GAPIs, the method then needs to remove the lp- prefixes from them. This is more involved than I had originally anticipated. That’s because the Cli method does not accept some simple array of flag values. It needs a structured value of the type cli.Context (from urfave/cli).
I opted to create a new Context object using cli.NewContext, as one does. Doing so first requires a flag.FlagSet. The flag package is part of Go’s standard library. Here’s how that looks in code:
flagSet := flag.NewFlagSet(Name, flag.ContinueOnError)
for _, flag := range c.FlagNames() {
if !c.IsSet(flag) {
continue
}
childFlagName := strings.TrimPrefix(flag, FlagPrefix)
flagSet.String(childFlagName, "", "no usage string needed here")
flagSet.Set(childFlagName, c.String(flag))
}
I took a shortcut and just assumed that each flag that is of interest here will be of type string. Hence the flags are registered using flagSet.String(). Each flag is explicitly given its value using flagSet.Set. You can also specify a default value when registering the flag (with the flagSet.String call). But the IsSet check only passes when the value is specified explicitly. The child GAPI methods rely on this check, so the Set call is required.
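The difference between a registered default and an explicitly set value can be observed with the standard library alone. This is a hedged sketch: rebuildFlags and wasSet are hypothetical helpers, the map stands in for the values read from cli.Context, and the assumption that urfave/cli's IsSet behaves like FlagSet.Visit is not verified here for every version of that library.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// rebuildFlags registers every prefixed flag under its unprefixed name
// and sets it explicitly. The values map is a stand-in for the data the
// real code reads from cli.Context.
func rebuildFlags(prefix string, values map[string]string) (*flag.FlagSet, error) {
	fs := flag.NewFlagSet("child", flag.ContinueOnError)
	for name, value := range values {
		child := strings.TrimPrefix(name, prefix)
		fs.String(child, "", "no usage string needed here")
		if err := fs.Set(child, value); err != nil {
			return nil, err
		}
	}
	return fs, nil
}

// wasSet reports whether the named flag was set explicitly. It relies on
// FlagSet.Visit, which only visits flags that have been set.
func wasSet(fs *flag.FlagSet, name string) bool {
	found := false
	fs.Visit(func(f *flag.Flag) {
		if f.Name == name {
			found = true
		}
	})
	return found
}

func main() {
	fs, err := rebuildFlags("lp-", map[string]string{"lp-lang": "/path/to/code.mcl"})
	if err != nil {
		panic(err)
	}
	// registered with a default value, but never set explicitly
	fs.String("puppet", "default.pp", "only a default here")

	fmt.Println("lang set:", wasSet(fs, "lang"))
	fmt.Println("puppet set:", wasSet(fs, "puppet"))
}
```

Unlike the snippet above, the sketch checks the error returned by Set. That is a small robustness choice, not something the original code did at the time.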
Finally, the child GAPI methods can be called, and then something important happens.
The Cli methods return gapi.Deploy objects, which in turn point to respective GAPI objects. These are the workers that we actually hold on to, because they will do the work of generating graphs from mgmt and Puppet code respectively.
if langDeploy, err = (&lang.GAPI{}).Cli(cli.NewContext(c.App, flagSet, nil), fs); err != nil {
return nil, err
}
if puppetDeploy, err = (&puppet.GAPI{}).Cli(cli.NewContext(c.App, flagSet, nil), fs); err != nil {
return nil, err
}
return &gapi.Deploy{
Name: Name,
Noop: c.GlobalBool("noop"),
Sema: c.GlobalInt("sema"),
GAPI: &GAPI{
langGAPI: langDeploy.GAPI,
puppetGAPI: puppetDeploy.GAPI,
},
}, nil
The important thing to note here is that the Cli method is invoked on a given langpuppet.GAPI object, but it returns a brand new object, wrapped in a gapi.Deploy object. The engine will continue making use of that wrapped object. Doing any initialization on the receiver obj in either Cli or CliFlags will not work, because those receivers are ephemeral. This is also the reason that the child GAPI methods are invoked on ephemeral receivers that are summarily discarded.
The Init method has fairly little work left to do now. Its basis is some boilerplate code that you will find in any GAPI implementation.
func (obj *GAPI) Init(data gapi.Data) error {
if obj.initialized {
return fmt.Errorf("already initialized")
}
obj.data = data // store for later
obj.closeChan = make(chan struct{})
obj.initialized = true
return nil
}
In the middle part, I just added some code that takes care of passing on the initialization data to the child GAPIs.
dataLang := gapi.Data{
Program: obj.data.Program,
Hostname: obj.data.Hostname,
World: obj.data.World,
Noop: obj.data.Noop,
NoConfigWatch: obj.data.NoConfigWatch,
NoStreamWatch: obj.data.NoStreamWatch,
Debug: obj.data.Debug,
Logf: func(format string, v ...interface{}) {
obj.data.Logf(lang.Name+": "+format, v...)
},
}
dataPuppet := gapi.Data{
Program: obj.data.Program,
Hostname: obj.data.Hostname,
World: obj.data.World,
Noop: obj.data.Noop,
NoConfigWatch: obj.data.NoConfigWatch,
NoStreamWatch: obj.data.NoStreamWatch,
Debug: obj.data.Debug,
Logf: func(format string, v ...interface{}) {
obj.data.Logf(puppet.Name+": "+format, v...)
},
}
if err := obj.langGAPI.Init(dataLang); err != nil {
return err
}
if err := obj.puppetGAPI.Init(dataPuppet); err != nil {
return err
}
James helped me with the Logf functions. They prepend a lang: or puppet: marker to each log message generated within a child GAPI.
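The two Logf closures differ only in their prefix, so the idea can be captured in a small helper. This is an illustrative sketch, not code from mgmt: the logfFunc type and the prefixLogf function are hypothetical names.

```go
package main

import "fmt"

// logfFunc mirrors the shape of the Logf field used above.
type logfFunc func(format string, v ...interface{})

// prefixLogf wraps a logging function so that every message is prefixed
// with the given marker. The prefix becomes part of the format string,
// so it should not contain any formatting verbs.
func prefixLogf(prefix string, logf logfFunc) logfFunc {
	return func(format string, v ...interface{}) {
		logf(prefix+": "+format, v...)
	}
}

func main() {
	base := func(format string, v ...interface{}) {
		fmt.Printf(format+"\n", v...)
	}
	langLogf := prefixLogf("lang", base)
	puppetLogf := prefixLogf("puppet", base)

	langLogf("loaded %d resources", 3)
	puppetLogf("translated manifest %s", "site.pp")
}
```

With such a helper, the two gapi.Data literals would differ in a single argument, which makes a copy-and-paste slip between the lang and puppet variants less likely.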
This concludes the initialization methods. Once these worked, the engine was able to get this new GAPI in position to actually feed graphs to it. The mechanism for this is perhaps the most complicated part.
Collecting graphs
The graph generating code in mgmt is quite asynchronous in nature. That’s because the tool is designed to receive new data from the user or other systems at any time, and incorporate it as promptly as possible. For example, in its default configuration, the mgmt process will watch your mcl automation code, and when you make changes (e.g., push to your git repository), it compiles the new code into a graph and patches the current graph in memory with the differences.
The GAPI has three parts to enable this mechanism. The first part is the Next method that each GAPI needs to implement. The engine calls it to express the desire to receive graphs from the GAPI. The second part is a goroutine that is launched by the Next method. It is responsible for the actual signaling whenever a new graph can be produced by the GAPI. It writes structured values into a channel that is returned to the engine by the Next method itself.
The third part is the Graph GAPI method. At first glance it appears to be the most important part, but do note that implementing it will often be quite straightforward, and it cannot properly work without a robust Next loop.
The whole construct assumes that the engine plays nice and obeys the (G)API rules:
- Always call Next first
- Wait for a Next event before calling Graph
- Call Graph at most once per event
The langpuppet GAPI has to obey the same rules, because it lodges itself in between the engine and its child GAPIs. Let’s look at the Graph method first, because it is structurally much simpler than Next.
func (obj *GAPI) Graph() (*pgraph.Graph, error) {
if !obj.initialized {
return nil, fmt.Errorf("%s: GAPI is not initialized", Name)
}
var err error
if obj.langGraphReady {
obj.langGraphReady = false
obj.currentLangGraph, err = obj.langGAPI.Graph()
}
if err != nil {
return nil, err
}
if obj.puppetGraphReady {
obj.puppetGraphReady = false
obj.currentPuppetGraph, err = obj.puppetGAPI.Graph()
}
if err != nil {
return nil, err
}
g, err := mergeGraphs(obj.currentLangGraph, obj.currentPuppetGraph)
return g, err
}
It only calls the respective child Graph method if that GAPI has indicated that a graph is ready. Finally, it merges the graphs and returns the result. The flags langGraphReady and puppetGraphReady are set in the Next loop, so let’s look at that now.
// Next returns nil errors every time there could be a new graph.
func (obj *GAPI) Next() chan gapi.Next {
ch := make(chan gapi.Next)
obj.wg.Add(1)
go func() {
defer obj.wg.Done()
defer close(ch) // this will run before the obj.wg.Done()
if !obj.initialized {
next := gapi.Next{
Err: fmt.Errorf("%s: GAPI is not initialized", Name),
Exit: true, // exit, b/c programming error?
}
ch <- next
return
}
//
// ACTUAL EVENT LOOP HERE
//
}()
return ch
}
obj.wg is a sync.WaitGroup and should be used in all GAPI implementations. It allows the GAPI to cleanly shut down, including the goroutine launched here. It’s notable that the goroutine does not return errors, but feeds them into the event channel instead.
The meat of this goroutine is the event loop. Let’s dissect it a little:
nextLang := obj.langGAPI.Next()
nextPuppet := obj.puppetGAPI.Next()
firstLang := false
firstPuppet := false
for {
var err error
exit := false
select {
...
The loop initialization is two-fold: The child GAPIs’ Next methods must be called so that their own event loops start running. The respective channels are held in local variables nextLang and nextPuppet. The significance of the boolean flags firstLang and firstPuppet will be described a little farther below. First, let’s look at the first select block.
select {
case nextChild := <-nextLang:
if nextChild.Err != nil {
err = nextChild.Err
exit = nextChild.Exit
} else {
obj.langGraphReady = true
firstLang = true
}
case nextChild := <-nextPuppet:
if nextChild.Err != nil {
err = nextChild.Err
exit = nextChild.Exit
} else {
obj.puppetGraphReady = true
firstPuppet = true
}
case <-obj.closeChan:
return
}
Last things first: When the engine calls the Close method of a GAPI (which was not really mentioned here yet), the closeChan gets closed, and the event loop can terminate immediately.
Otherwise, a child GAPI event will flip both of that child’s flags to true, or in case of errors, remember the details in local variables, so that a wrapped error can be handed out. Let’s talk about the dual boolean flags first. The firstLang and firstPuppet flags are checked first, right after the select block.
if (!firstLang || !firstPuppet) && err == nil {
	continue
}
This is a weird statement, so let’s unpack it by examining the possible scenarios. Normally, the select block above will finish after receiving an event from one of the child GAPIs, say lang. The firstLang flag is true now, while firstPuppet is still false, and err is nil. The continue is hit and the loop starts over. It is supposed to get an event from the Puppet GAPI then, and now firstPuppet is true as well. Now the “or” expression in parentheses evaluates to false, and it will never flip back. The firstLang and firstPuppet flags never get reset.
Long story short, this little block makes sure that both child GAPIs have sent signals before any processing is done on them. However, if one of them produces an error, that will be processed immediately, and the continue is skipped even though the “first” flags are not yet set.
This is what the processing looks like:
if err == nil {
obj.data.Logf("generating new composite graph...")
}
next := gapi.Next{
Exit: exit,
Err: err,
}
select {
case ch <- next: // trigger a run (send a msg)
// unblock if we exit while waiting to send!
case <-obj.closeChan:
return
}
Usually, exit and err will have their zero values of false and nil respectively. The event does not carry additional information.
Handing out the event to the engine happens in yet another select block, so that closeChan events can be processed even when nobody is listening on our event channel ch anymore.
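The send-or-shut-down pattern from that last select block works the same way outside of mgmt. Here is a minimal sketch with a hypothetical helper named sendOrClose. It uses strings instead of gapi.Next values to stay self-contained.

```go
package main

import "fmt"

// sendOrClose tries to deliver msg on ch. It gives up and returns false
// as soon as closeChan is closed, so that the sender cannot block forever
// when nobody is receiving anymore.
func sendOrClose(ch chan<- string, msg string, closeChan <-chan struct{}) bool {
	select {
	case ch <- msg:
		return true
	case <-closeChan:
		return false
	}
}

func main() {
	ch := make(chan string)
	closeChan := make(chan struct{})
	done := make(chan struct{})

	// a receiver that consumes exactly one message
	go func() {
		defer close(done)
		fmt.Println("received:", <-ch)
	}()

	fmt.Println("sent:", sendOrClose(ch, "new graph", closeChan))
	<-done // the receiver is gone now

	close(closeChan) // what a Close method would do
	fmt.Println("sent:", sendOrClose(ch, "another graph", closeChan))
}
```

A plain `ch <- next` in place of the select would hang in the second call, because the channel is unbuffered and has no reader left.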
This is all. The event loop is complete, and will send appropriate events to the engine, which will then invoke our Graph method, which was described earlier. It’s almost time to look at the mergeGraphs function, but let’s revisit some sections of Next and Graph first. I have been lying to you a little, so that I can emphasize this now.
Let’s talk about race conditions
The first iteration of the Next loop didn’t use boolean flags to communicate with the Graph method. It just set obj.currentLangGraph and obj.currentPuppetGraph to nil, and there were no booleans at all. This is what Graph looked like:
if obj.currentLangGraph == nil {
obj.currentLangGraph, err = obj.langGAPI.Graph()
if err != nil {
return nil, err
}
}
if obj.currentPuppetGraph == nil {
obj.currentPuppetGraph, err = obj.puppetGAPI.Graph()
if err != nil {
return nil, err
}
}
This worked, but had a big problem: The event loop in Next will set the child graph pointers to nil at arbitrary times, effectively discarding the graphs. For example, this could happen right after the Graph method checked and found that they are in fact not nil.
This is a pretty terrible race. Adding the boolean flags made this business a lot safer already. Yet still, the flag itself is a value that is written by two independent threads. Bear in mind that the goroutine launched in Next runs concurrently with everything else. That’s why there is still a race, in the following scenario.
- The Next loop receives an event. It sets the boolean flag and passes on the event to the engine.
- The engine invokes Graph, which looks at the boolean flag and finds that it has been set.
- At this very moment, the Next loop receives another event and sets the flag to true again.
- Only now does the Graph method get around to resetting the flag to false.
The most recent event is lost, and the child GAPI is missing a call to Graph. The most recent input update is ignored, and mgmt will be none the wiser until randomly another event occurs. Latent subtle failures like this are frustrating for users, and exhausting for those who would try to debug them.
(FIXME: Is this even correct? Will the Graph method, after losing the race, request the most recent graph from the child GAPI and all is well anyway? Could we get away without locking after all?)
To be safe against this, the operation of testing and resetting the flags must be made atomic. Go offers no atomic test-and-set for a plain bool field (the compare-and-swap functions in sync/atomic work on integer and pointer types), so this calls for a classic mutex. This is how the code actually looks:
var err error
obj.graphFlagMutex.Lock()
if obj.langGraphReady {
obj.langGraphReady = false
obj.graphFlagMutex.Unlock()
obj.currentLangGraph, err = obj.langGAPI.Graph()
if err != nil {
return nil, err
}
} else {
obj.graphFlagMutex.Unlock()
}
The graphFlagMutex is shared by both the langGraphReady and the puppetGraphReady flags. This isn’t helpful, but it makes the variable name a little easier to read.
obj.graphFlagMutex.Lock()
if obj.puppetGraphReady {
obj.puppetGraphReady = false
obj.graphFlagMutex.Unlock()
obj.currentPuppetGraph, err = obj.puppetGAPI.Graph()
if err != nil {
return nil, err
}
} else {
obj.graphFlagMutex.Unlock()
}
What’s important here is that any reset of a flag is in the same critical section as the test of its current value.
In the Next method, the lock is respected with the following modification:
select {
case nextChild := <-nextLang:
if nextChild.Err != nil {
err = nextChild.Err
exit = nextChild.Exit
} else {
obj.graphFlagMutex.Lock()
obj.langGraphReady = true
obj.graphFlagMutex.Unlock()
firstLang = true
}
Setting the flag is guarded by the same lock. An equivalent change was applied to the handling of obj.puppetGraphReady.
obj.graphFlagMutex.Lock()
obj.puppetGraphReady = true
obj.graphFlagMutex.Unlock()
firstPuppet = true
Since these boolean flags are the only piece of information that is written from independent threads, this is all the synchronization that is required here.
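The lock-test-reset-unlock sequence appears twice in Graph and could be folded into a small type. This is a sketch of that idea, not the code in mgmt: the readyFlag type and its method names are invented for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// readyFlag is a boolean that can be set from one goroutine and tested
// and reset from another without losing updates.
type readyFlag struct {
	mu    sync.Mutex
	ready bool
}

// Set marks the flag as ready. This is what the Next loop would call.
func (f *readyFlag) Set() {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.ready = true
}

// TestAndReset returns the current value and clears the flag, all within
// one critical section. This is what the Graph method would call.
func (f *readyFlag) TestAndReset() bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	was := f.ready
	f.ready = false
	return was
}

func main() {
	var f readyFlag
	var wg sync.WaitGroup

	// many concurrent events collapse into a single pending flag
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			f.Set()
		}()
	}
	wg.Wait()

	fmt.Println(f.TestAndReset()) // the pending event is consumed
	fmt.Println(f.TestAndReset()) // nothing is pending anymore
}
```

As in the original code, the child's Graph call would happen after TestAndReset returns, outside of the critical section, so that a slow graph compilation never blocks the Next loop.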
Merging graphs
Up to this point, the code we have examined was arguably not very creative. (The reality is that actually writing and debugging it with little knowledge of the general architecture of the project was quite daunting.) It was all mechanics in order to properly integrate the lang and Puppet GAPIs with our new wrapper. However, we skipped a crucial part in the whole process of generating a graph out of two existing ones, implemented in the mergeGraphs function (see the last step of the Graph method above).
The algorithm for allowing useful graph merging was described in an earlier post. If you intend to follow along with the logic here, I suggest you read that article first (the last part, that is). The general approach is to build a new graph by including all vertices from both input graphs.
Some vertices are merged between the two input graphs. For example, the resource noop[puppet_handover] from mcl code will be merged with the empty Puppet class mgmt_handover (resulting in a new noop[handover]). Consistency checks should generate errors when either input includes at least one “mergeable” vertex that has no counterpart in the other input.
The complete implementation lives in langpuppet/merge.go. Let’s start with the boilerplate consistency checks.
func mergeGraphs(graphFromLang, graphFromPuppet *pgraph.Graph) (*pgraph.Graph, error) {
if graphFromLang == nil || graphFromPuppet == nil {
return nil, fmt.Errorf("cannot merge graphs until both child graphs are loaded")
}
result, err := pgraph.NewGraph(graphFromLang.Name + "+" + graphFromPuppet.Name)
if err != nil {
return nil, err
}
Once these have passed, the first step for the actual algorithm is to copy the graph produced by the lang GAPI verbatim. All vertices and edges are incorporated in the result graph. Along the way, the program takes note of vertices that represent noop resources with a name starting with puppet_. These are candidates for merging, which will be kept in a map.
mergeTargets := make(map[string]pgraph.Vertex)
// first add all vertices from the lang graph
for _, vertex := range graphFromLang.Vertices() {
if strings.Index(vertex.String(), "noop["+MergePrefixLang) == 0 {
resource, ok := vertex.(engine.Res)
if !ok {
return nil, fmt.Errorf("vertex %s is not a named resource", vertex.String())
}
basename := strings.TrimPrefix(resource.Name(), MergePrefixLang)
resource.SetName(basename)
mergeTargets[basename] = vertex
}
result.AddVertex(vertex)
for _, neighbor := range graphFromLang.OutgoingGraphVertices(vertex) {
result.AddVertex(neighbor)
result.AddEdge(vertex, neighbor, graphFromLang.FindEdge(vertex, neighbor))
}
}
Do note the type assertion of vertex to engine.Res. The pgraph.Graph type is very generic. Any stringable object can be a vertex. (Edges are just as indistinct.) So, in order to use any information from a vertex, its actual content type must be inferred and extracted with an assertion like the one above.
In order to make merging easier down the line, the “base names” of the pertinent resources are extracted here. For example, the resource noop[puppet_post_deployment] gets mapped to the string post_deployment. In the result graph, this string also becomes the name for this resource, e.g. noop[post_deployment].
The MergePrefixLang constant in the code above is defined as follows, along with its counterpart:
const (
	// MergePrefixLang is how a mergeable vertex name starts in mcl code.
	MergePrefixLang = "puppet_"
	// MergePrefixPuppet is how a mergeable Puppet class name starts.
	MergePrefixPuppet = "mgmt_"
)
Finally, the respective vertex is added to the result graph, along with all its outgoing edges. The pgraph API requires that the vertices on both ends of the edge are in the graph before an edge is added. That’s why each “neighbor” vertex is also added during this loop iteration.
This implies that most vertices are added at least twice, one time during their respective loop iteration, but also from whatever other vertices have outgoing edges to them. This is fine. The pgraph API implements AddVertex as an idempotent action that can be repeated at will.
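To see why repeated AddVertex calls are harmless in this copy loop, here is a toy model. It is explicitly not the pgraph API: the graph type is a plain adjacency map with string vertices and string edge labels, and its AddEdge enforces the "both ends must exist" rule as described above.

```go
package main

import "fmt"

// graph is a toy adjacency map: vertex -> neighbor -> edge label.
type graph map[string]map[string]string

// AddVertex is idempotent: adding a known vertex keeps its edges intact.
func (g graph) AddVertex(v string) {
	if _, ok := g[v]; !ok {
		g[v] = make(map[string]string)
	}
}

// AddEdge refuses to connect vertices that are not in the graph yet.
func (g graph) AddEdge(from, to, label string) error {
	if _, ok := g[from]; !ok {
		return fmt.Errorf("vertex %s is not in the graph", from)
	}
	if _, ok := g[to]; !ok {
		return fmt.Errorf("vertex %s is not in the graph", to)
	}
	g[from][to] = label
	return nil
}

// copyGraph visits every vertex, adds it along with all its outgoing
// neighbors, and then adds the connecting edges.
func copyGraph(src graph) (graph, error) {
	dst := make(graph)
	for vertex, edges := range src {
		dst.AddVertex(vertex)
		for neighbor, label := range edges {
			dst.AddVertex(neighbor) // may be a repeat, which is fine
			if err := dst.AddEdge(vertex, neighbor, label); err != nil {
				return nil, err
			}
		}
	}
	return dst, nil
}

func main() {
	src := graph{
		"a": {"b": "a->b", "c": "a->c"},
		"b": {"c": "b->c"},
		"c": {},
	}
	dst, err := copyGraph(src)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(dst), dst["a"]["c"], dst["b"]["c"])
}
```

Vertex c is added three times here (once on its own, twice as a neighbor), and the result still contains it exactly once.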
After the lang graph is processed entirely, the puppet graph gets incorporated. The algorithm does two passes on this graph. First, the algorithm builds an internal map of vertices that will be merged. During this pass, the program also identifies the “first” vertex of Puppet’s graph (the start marker of Puppet’s “main stage”) and saves a pointer to it. This is used to make the second pass more orderly.
var anchor pgraph.Vertex
mergePairs := make(map[pgraph.Vertex]pgraph.Vertex)
// do a scan through the Puppet graph, and mark all vertices that will be
// subject to a merge, so it will be easier to generate the new edges
// in the final pass
for _, vertex := range graphFromPuppet.Vertices() {
if vertex.String() == "noop[admissible_Stage[main]]" {
// we can start a depth first search here
anchor = vertex
continue
}
The top of the loop concludes the handling for the “anchor” vertex. The rest of the loop body deals with merging preparation.
// at this stage we don't distinguish between class start and end
if strings.Index(vertex.String(), "noop[admissible_Class["+strings.Title(MergePrefixPuppet)) != 0 &&
strings.Index(vertex.String(), "noop[completed_Class["+strings.Title(MergePrefixPuppet)) != 0 {
continue
}
resource, ok := vertex.(engine.Res)
if !ok {
return nil, fmt.Errorf("vertex %s is not a named resource", vertex.String())
}
The consistency check here should never fail, because Puppet’s graphs have no vertices that aren’t resources (after translation to mgmt, that is). Note the use of strings.Title, which capitalizes the start of the word in the MergePrefixPuppet constant. This is required because of Puppet’s convention to capitalize references in this fashion. The start and end markers of class mgmt_post_deployment in Puppet are named admissible_Class[Mgmt_post_deployment] and completed_Class[Mgmt_post_deployment] respectively. That’s why the program scans for Mgmt_ rather than mgmt_.
// strip either prefix (plus the closing bracket)
basename := strings.TrimSuffix(
strings.TrimPrefix(
strings.TrimPrefix(resource.Name(),
"admissible_Class["+strings.Title(MergePrefixPuppet)),
"completed_Class["+strings.Title(MergePrefixPuppet)),
"]")
Wow, who the hell wrote this? The idea is, we can just strip both possible prefixes from whatever vertex we’re looking at, because one of these operations will be a no-op. Also, since I’m lazy, the closing bracket gets stripped off in the same statement. A much better way to write this is:
basename := strings.TrimSuffix(resource.Name(), "]")
basename = strings.TrimPrefix(basename,"admissible_Class["+strings.Title(MergePrefixPuppet))
basename = strings.TrimPrefix(basename,"completed_Class["+strings.Title(MergePrefixPuppet))
This refactoring is probably already in mgmt’s master by the time you read this.
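For experimenting with the name handling, the stripping logic can be isolated into a function. This is a sketch under a few assumptions: the classBasename name and the two prefix constants are hypothetical, and the capitalized Mgmt_ prefix is spelled out literally instead of going through strings.Title, which has been marked as deprecated in newer Go releases.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	// startPrefix begins the name of a mergeable class start marker.
	startPrefix = "admissible_Class[Mgmt_"
	// endPrefix begins the name of a mergeable class end marker.
	endPrefix = "completed_Class[Mgmt_"
)

// classBasename extracts the base name from the resource name of a class
// start or end marker. The boolean result is false if the name does not
// look like a mergeable class marker at all.
func classBasename(name string) (string, bool) {
	for _, prefix := range []string{startPrefix, endPrefix} {
		if strings.HasPrefix(name, prefix) && strings.HasSuffix(name, "]") {
			trimmed := strings.TrimPrefix(name, prefix)
			return strings.TrimSuffix(trimmed, "]"), true
		}
	}
	return "", false
}

func main() {
	names := []string{
		"admissible_Class[Mgmt_post_deployment]",
		"completed_Class[Mgmt_post_deployment]",
		"File[/etc/motd]",
	}
	for _, name := range names {
		basename, ok := classBasename(name)
		fmt.Printf("%q %v\n", basename, ok)
	}
}
```

Returning a boolean alongside the base name would also allow the prefix check and the stripping to share one code path, instead of testing the prefixes with strings.Index first and trimming them afterwards.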
By this point, the algorithm has identified a vertex in the graph from the Puppet translator that can be merged. It tries to save a mapping from this vertex to its counterpart in the result graph.
if _, found := mergeTargets[basename]; !found {
// FIXME: should be a warning not an error?
return nil, fmt.Errorf("puppet graph has unmatched class %s%s", MergePrefixPuppet, basename)
}
mergePairs[vertex] = mergeTargets[basename]
Finally, perform another consistency check and make sure that merge classes are really empty.
if strings.Index(resource.Name(), "admissible_Class["+strings.Title(MergePrefixPuppet)) != 0 {
continue
}
// is there more than one edge outgoing from the class start?
if graphFromPuppet.OutDegree()[vertex] > 1 {
return nil, fmt.Errorf("class %s is not empty", basename)
}
// does this edge not lead to the class end?
next := graphFromPuppet.OutgoingGraphVertices(vertex)[0]
if next.String() != "noop[completed_Class["+strings.Title(MergePrefixPuppet)+basename+"]]" {
return nil, fmt.Errorf("class %s%s is not empty, start is followed by %s", MergePrefixPuppet, basename, next.String())
}
If such a class were non-empty, the algorithm would go ahead and merge vertices in a way that would lead to cycles in the resulting graph. (TODO: diagram!)
Now for the final pass of the input graph from Puppet. This one is a little peculiar. In order to not repeat the merging logic too often in code, vertices are only added indirectly. The loop visits each vertex, but only adds its “neighbors” (vertices at the end of outgoing edges) and not the current vertex itself. This will still happen multiple times to vertices with more than one incoming edge, but the code is a little more compact this way, and thus easier to reason about.
merged := make(map[pgraph.Vertex]bool)
result.AddVertex(anchor)
// traverse the puppet graph, add all vertices and perform merges
// using DFS so we can be sure the "admissible" is visited before the "completed" vertex
for _, vertex := range graphFromPuppet.DFS(anchor) {
source := vertex
// when adding edges, the source might be a different vertex
// than the current one, if this is a merged vertex
if _, found := mergePairs[vertex]; found {
source = mergePairs[vertex]
}
The first vertex must be added outside the loop, because it has no incoming edges. As mentioned in the code comments, a depth first search is performed so that I can make assumptions about the order in which some vertices are seen. The code above also takes care of half the merging logic. It ensures that edges that are outgoing from merging vertices will actually go out of the merged vertex.
TODO: diagram
Next is the actual work of adding all neighbors, i.e. vertices with edges that are incoming from the vertex currently being iterated.
// the current vertex has been added by previous iterations,
// we only add neighbors here
for _, neighbor := range graphFromPuppet.OutgoingGraphVertices(vertex) {
if strings.Index(neighbor.String(), "noop[admissible_Class["+strings.Title(MergePrefixPuppet)) == 0 {
result.AddEdge(source, mergePairs[neighbor], graphFromPuppet.FindEdge(vertex, neighbor))
continue
}
if strings.Index(neighbor.String(), "noop[completed_Class["+strings.Title(MergePrefixPuppet)) == 0 {
// mark target vertex as merged
merged[mergePairs[neighbor]] = true
continue
}
// if we reach here, this neighbor is a regular vertex
result.AddVertex(neighbor)
result.AddEdge(source, neighbor, graphFromPuppet.FindEdge(vertex, neighbor))
}
Note the assumption in the comment on top. It is safe because this is a depth first search through the graph received from Puppet.
The two if blocks in the loop body implement the second half of the vertex merge. Both cases skip the AddVertex call, because the resulting graph will contain the merged vertex instead of these class containment nodes. This merge vertex is already in the result graph. It was added in the earlier import of the graph that was compiled from mcl code.
The first if block handles the vertex that marks the start of the respective class. Since the merged vertex (mergePairs[neighbor]) is already in the result graph, only the edge to that vertex needs adding. It uses the local source vertex from the previous code snippet.
The other block is entered when the neighbor in question is the end node of a merging class. This vertex can only have one incoming edge, because it is mandated that the merge classes are empty. The only edge is from the admissible vertex to the completed one (or if you will, start to end). This edge is dropped completely, because both vertices are merged into one. The only thing that happens here is marking the “target” vertex in the result graph as completely merged, for later consistency checking.
Note that whenever the code adds an edge to the result graph, it looks up the very edge object in the source graph. It does not even really care what kind of object serves as an edge. Remember that any stringable structure can be an edge (or a vertex, for that matter).
The aforementioned consistency check is the last bit of code in the mergeGraphs function.
for _, vertex := range mergeTargets {
	if !merged[vertex] {
		// FIXME: should be a warning not an error?
		return nil, fmt.Errorf("lang graph has unmatched %s", vertex.String())
	}
}
I’ve left some FIXME notes like the one above, because I’m currently not sure what will make for a better UX: Fail a graph that cannot be cleanly merged, or simply log a warning about the fact. At the moment I lean towards failing outright, so that’s what the code still does.
Summary
The GAPI system in mgmt’s code base is not terribly complex, but it’s hard to infer its design from just reading code. Do ask James for help with new features. Once you grok the basics, making a new GAPI is not so hard.
The langpuppet GAPI almost completely relies on existing code from the lang and puppet GAPIs. It carefully implements the Next/Graph loop that is responsible for the GAPI’s functionality along most of its life cycle.
Although there is not much testing data yet, the support for mixed mcl/Puppet code is essentially functional now.