This repository has been archived by the owner on Jun 4, 2021. It is now read-only.

Node identity

David Lutterkort edited this page Oct 25, 2013 · 5 revisions

Identifying a node is a surprisingly hard problem, mostly because we need to identify the node in some circumstances solely based on somewhat unreliable and mutable hardware characteristics of the node. There are two important times when Razor needs to determine what a node is, and whether it is different from all the other nodes it knows about:

  1. when the generic bootstrap.ipxe script retrieves instructions on how to boot
  2. when the microkernel checks in with the Razor server

There are of course other places where a node and the server communicate, but those happen at times when the server already knows about the node and is in complete control of how the node identifies itself. In those cases the server can encode its idea of node identity into the node's request, which sidesteps the identity problem entirely.

Using hardware information to identify a node

The most difficult place to do node identification is from the generic bootstrap.ipxe file that you place on your TFTP server. What makes it difficult is that we can only use information available to iPXE, which is limited. If you never modify your hardware after first booting a node, you're in good shape. Trouble comes if you ever move hardware components between your nodes.

Currently, Razor gets the following pieces of information from iPXE to guide its decision about when a node is a certain known node, and when it is a completely new node; this data is collectively called the hardware information (hw_info) of the node. This information is updated every time a node boots.

  • MAC addresses of the NICs; more precisely, the MAC addresses of the first nic_max network interfaces, where nic_max is a parameter you can pass in when you generate bootstrap.ipxe
  • The asset tag, serial, and uuid as reported by the SMBIOS

When bootstrap.ipxe contacts the server, it sends the above pieces of hardware information; the setting match_nodes_on in config.yaml determines which of these pieces are used to identify a node. By default, only mac (the MAC addresses) is used. The server looks for a known node whose hardware information contains at least one of the reported values. If there is no such node, Razor assumes it is seeing a new node; if there is more than one, Razor will complain and not continue booting that node.
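The lookup rule described above can be sketched in a few lines of Python. This is only an illustration of the rule, not Razor's actual code: the function name find_node, the exception AmbiguousNodeError, and the representation of hw_info as a dict of lists are all assumptions made for this example.

```python
class AmbiguousNodeError(Exception):
    """Raised when the reported hardware matches more than one known node."""


def find_node(known_nodes, reported, match_nodes_on=("mac",)):
    """Return the name of the single matching node, or None for a new node.

    known_nodes: dict mapping node name -> hw_info dict (key -> list of values)
    reported:    hw_info dict sent by the booting machine
    Both structures are hypothetical; they only mirror the text above.
    """
    matches = []
    for name, hw_info in known_nodes.items():
        for key in match_nodes_on:
            shared = set(hw_info.get(key, [])) & set(reported.get(key, []))
            if shared:
                matches.append(name)
                break  # one shared value is enough for this node
    if len(matches) > 1:
        raise AmbiguousNodeError(
            "hw_info matches several nodes: %s" % sorted(matches)
        )
    return matches[0] if matches else None


known = {
    "node1": {"mac": ["52-54-00-aa-00-01"], "serial": ["S100"]},
    "node2": {"mac": ["52-54-00-aa-00-02"], "serial": ["S200"]},
}
# One reported MAC is already known, so this is an existing node.
print(find_node(known, {"mac": ["52-54-00-aa-00-01", "52-54-00-aa-00-09"]}))
# No reported MAC is known, so this would be treated as a new node.
print(find_node(known, {"mac": ["52-54-00-aa-00-77"]}))
```

Note that a single shared value is enough for a match, which is why adding or removing other components does not change a node's identity.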

The match_nodes_on setting should contain all the pieces of hardware information that are unique across all machines that Razor manages (or empty on machines where they are not set). Some vendors fill in some of these values with placeholders; for example, they set the asset tag to "no asset tag". In these cases, matching on the asset tag would make Razor think that all those nodes are really just one node whose remaining hardware keeps changing. In this situation, match_nodes_on must not contain asset, to avoid confusing Razor.
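One way to check which keys are safe to list in match_nodes_on is to look for non-empty values that occur on more than one machine. The sketch below assumes you have collected the hardware information of your machines yourself; the function name shared_value_keys and the inventory layout are hypothetical and not part of Razor.

```python
from collections import Counter


def shared_value_keys(inventory, candidates=("mac", "asset", "serial", "uuid")):
    """Return the keys for which some non-empty value occurs on several machines.

    inventory: list of hw_info dicts (key -> list of values), one per machine.
    Keys returned here should be left out of match_nodes_on.
    """
    unsafe = []
    for key in candidates:
        counts = Counter()
        for hw_info in inventory:
            # Count each value once per machine; empty values are allowed.
            for value in set(hw_info.get(key, [])):
                if value:
                    counts[value] += 1
        if any(n > 1 for n in counts.values()):
            unsafe.append(key)
    return unsafe


inventory = [
    {"mac": ["52-54-00-aa-00-01"], "asset": ["no asset tag"], "serial": ["S100"]},
    {"mac": ["52-54-00-aa-00-02"], "asset": ["no asset tag"], "serial": ["S200"]},
    {"mac": ["52-54-00-aa-00-03"], "asset": [""], "serial": ["S300"]},
]
print(shared_value_keys(inventory))
```

In this made-up inventory only asset is reported as unsafe, because two machines share the vendor placeholder; the empty asset tag on the third machine does not count against it.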

How to confuse Razor about node identity

If you never change existing hardware, you do not need to worry about confusing Razor. When you do change hardware, there are some things that you can do without any problems:

  • Remove hardware from an existing node
  • Add a piece of hardware, e.g. a new NIC, to a node, provided that NIC has not been used with Razor before
  • Replace hardware with a new component, e.g. to replace a faulty network card

Things get complicated if you move hardware components between nodes that Razor knows about:

  • if you move a component, e.g. a network card, from one node that Razor knows about to another known node, Razor will complain about having two nodes matching the same hw_info
  • if you move a component from a known node to a new node, Razor will get very confused: when the new node boots, Razor will identify it as the known node; when the known node boots after that, it will appear as a completely new node.
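Both scenarios can be reproduced with a toy model of the server's bookkeeping. The sketch below is a simplification under stated assumptions: it matches on MAC addresses only, it replaces a node's stored hardware information on every boot, and the Registry class and its generated node names are hypothetical.

```python
class Registry:
    """Toy model: node name -> set of MAC addresses seen at the last boot."""

    def __init__(self):
        self.nodes = {}

    def boot(self, macs):
        macs = set(macs)
        matches = [name for name, known in self.nodes.items() if known & macs]
        if len(matches) > 1:
            return "conflict: " + ", ".join(sorted(matches))
        if matches:
            name = matches[0]
            self.nodes[name] = macs  # hw_info is updated on every boot
            return "known: " + name
        name = "node%d" % (len(self.nodes) + 1)
        self.nodes[name] = macs
        return "new: " + name


# Scenario 1: a NIC ("aa") moves between two machines Razor already knows.
r = Registry()
r.boot({"aa", "ab"})        # first machine, registered as node1
r.boot({"bb"})              # second machine, registered as node2
print(r.boot({"aa", "bb"}))  # second machine now carries the moved NIC

# Scenario 2: a NIC ("aa") moves from a known machine to a brand-new one.
r = Registry()
r.boot({"aa", "ab"})        # known machine, registered as node1
print(r.boot({"aa", "zz"}))  # new machine is mistaken for node1
print(r.boot({"ab"}))        # original machine now looks like a new node
```

The first scenario ends in a conflict between node1 and node2; in the second, the new machine takes over the identity of node1 and the original machine is registered as node2.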

To support the above two scenarios, we need a command to tell Razor that a piece of hardware is being moved; comment on this issue if this affects you.

Microkernel checkins

For now, we assume that the Razor server is in control of booting the Microkernel. As such, the Microkernel command line receives a checkin URL that contains the node's internal identifier, so the server does not need to match on hardware information at checkin. This scheme will break down if you ever need to boot Microkernels from media, e.g. a CD. Please contact us if that affects you.
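The idea of encoding the node's identity into the checkin URL can be sketched as follows. The path layout /svc/checkin/<id>, the host name, and both function names are assumptions made for illustration; they are not taken from this page.

```python
from urllib.parse import urlsplit


def checkin_url(base_url, node_id):
    """Build the URL the server would hand to the Microkernel (hypothetical layout)."""
    return "%s/svc/checkin/%d" % (base_url.rstrip("/"), node_id)


def node_id_from_checkin(url):
    """Recover the node's internal identifier from a checkin URL."""
    path = urlsplit(url).path
    prefix, _, tail = path.rpartition("/")
    if not prefix.endswith("/checkin") or not tail.isdigit():
        raise ValueError("not a checkin URL: %r" % url)
    return int(tail)


url = checkin_url("http://razor.example.com:8080/", 42)
print(url)
print(node_id_from_checkin(url))
```

Because the identifier travels with the request, the server never has to guess which node is checking in. A Microkernel booted from a CD would not have such a URL, which is why that case is not covered.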