This document is intended as an introductory guide to the Yocton parsing API. Let's start with a basic example that shows how to open an input file and read a single property:
FILE *fs = fopen(filename, "r");
assert(fs != NULL);
struct yocton_object *obj = yocton_read_from(fs);
assert(obj != NULL);
struct yocton_prop *p = yocton_next_prop(obj);
assert(p != NULL);
printf("property %s has value %s\n", yocton_prop_name(p),
yocton_prop_value(p));
This example shows the basic boilerplate of how to get started with the API. A Yocton document is an object (@ref yocton_object) which contains properties (@ref yocton_prop). We can expand the example into one that prints every property in a file:
FILE *fs = fopen(filename, "r");
assert(fs != NULL);
struct yocton_object *obj = yocton_read_from(fs);
assert(obj != NULL);
struct yocton_prop *p;
while ((p = yocton_next_prop(obj)) != NULL) {
    printf("property %s has value %s\n", yocton_prop_name(p),
           yocton_prop_value(p));
}
However, this example only works when all property values are strings. Property values may instead be objects; these can be accessed using yocton_prop_inner(). Using this we can construct a recursive function that reads and prints all properties of all subobjects:
void print_obj(struct yocton_object *obj) {
    struct yocton_prop *p;
    while ((p = yocton_next_prop(obj)) != NULL) {
        if (yocton_prop_type(p) == YOCTON_PROP_OBJECT) {
            printf("property %s has subobject...\n", yocton_prop_name(p));
            print_obj(yocton_prop_inner(p));
            printf("end of property %s.\n", yocton_prop_name(p));
        } else {
            printf("property %s has value %s\n", yocton_prop_name(p),
                   yocton_prop_value(p));
        }
    }
}
The APIs for many serialization formats are document-based: data is deserialized into a document object that can then be inspected (examples are the XML DOM and protocol buffers). Yocton instead uses a pull parser. With a pull parser, it is up to the caller to read data one item at a time. This avoids the need for either autogenerated code (as with protobufs) or complicated APIs; Yocton's API is minimalist and simple to learn.
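To make the pull model concrete, here is a minimal, self-contained sketch of a pull-style reader over "name: value" lines held in memory. It is an illustration only and not how Yocton is implemented; struct pair_reader, pair_reader_init() and pair_reader_next() are hypothetical names invented for this sketch. The point is the calling pattern: the caller asks for one item at a time and decides what to do with it, and no document tree is ever built.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pull-style reader; not part of the Yocton API. */
struct pair_reader {
    const char *pos;    /* next unread character of the input */
    char name[32];      /* name of the most recently read pair */
    char value[64];     /* value of the most recently read pair */
};

void pair_reader_init(struct pair_reader *r, const char *input)
{
    assert(r != NULL && input != NULL);
    r->pos = input;
    r->name[0] = '\0';
    r->value[0] = '\0';
}

/* Pulls the next pair. Returns 1 and fills r->name and r->value on success;
 * returns 0 when the input is exhausted, or a line is malformed or too long. */
int pair_reader_next(struct pair_reader *r)
{
    const char *colon, *end, *val;
    size_t name_len, value_len;

    assert(r != NULL);
    if (*r->pos == '\0') {
        return 0;
    }
    end = strchr(r->pos, '\n');
    if (end == NULL) {
        end = r->pos + strlen(r->pos);
    }
    colon = memchr(r->pos, ':', (size_t) (end - r->pos));
    if (colon == NULL) {
        return 0;
    }
    name_len = (size_t) (colon - r->pos);
    val = colon + 1;
    while (val < end && *val == ' ') {
        val++;      /* skip spaces between the colon and the value */
    }
    value_len = (size_t) (end - val);
    if (name_len >= sizeof(r->name) || value_len >= sizeof(r->value)) {
        return 0;
    }
    memcpy(r->name, r->pos, name_len);
    r->name[name_len] = '\0';
    memcpy(r->value, val, value_len);
    r->value[value_len] = '\0';
    r->pos = (*end == '\n') ? end + 1 : end;
    return 1;
}
```

A caller would loop with while (pair_reader_next(&r)) { ... }, which has the same shape as the yocton_next_prop() loops shown above.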
The API has been designed with a particular approach in mind to using input data to populate data structures. It is assumed that Yocton objects will correspond to C structs, and object properties will correspond to C struct fields. Here's a simple example of how a struct might be read and populated; the example struct here is a minimal one containing a single string field:
struct foo {
    char *bar;
};
struct foo *read_foo(struct foo *f, struct yocton_object *obj) {
    struct yocton_prop *p;
    while ((p = yocton_next_prop(obj)) != NULL) {
        if (!strcmp(yocton_prop_name(p), "bar")) {
            f->bar = yocton_prop_value_dup(p);
        }
    }
    return f;
}
While this is relatively easy to understand, it looks quite verbose. It is therefore important to note that there are convenience functions and macros to make things much simpler, as will be explained in the sections below.
Yocton is a recursive format where objects can also contain other objects. The assumption is that a subobject likely corresponds to a field with a struct type. Consider the following input:
my_baz {
    my_foo {
        bar: "hello world!"
    }
}
This might be used to populate structs of the following types:
struct baz {
    struct foo *my_foo;
};
struct qux {
    struct baz my_baz;
};
When subobjects are mapped to struct types in this way, a function can be written to populate each type of struct. In the examples above, read_foo() might be complemented with read_baz() and read_qux() functions. This makes for clear and readable deserialization code; recursion in the programming language is used to handle recursion in the input file. The approach also means that the individual functions can be tested in isolation.
Yocton property values can contain arbitrary strings, the contents of which are open to interpretation. In practice though, the values are often likely to be one of several common base types which every C programmer is familiar with. There are convenience functions to help parse values into these types:
Function | Purpose |
---|---|
yocton_prop_int() | Parse value as a signed integer. Works with all integer types, performs bounds checking, etc. |
yocton_prop_uint() | Parse value as an unsigned integer. Works with all unsigned integer types, performs bounds checking, etc. |
yocton_prop_value_dup() | Returns the value as a plain, freshly allocated string, performing the appropriate checking for memory allocation failure. Useful for populating string fields. |
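As an illustration of what bounds checking involves when a string value is parsed into an integer, here is a minimal sketch built on the standard strtoll() function. It is not Yocton's implementation; parse_int_in_range() is a hypothetical helper, and the choice to reject trailing characters is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper, not part of Yocton: parses `text` as a base-10
 * signed integer and checks that it lies in [min, max]. Returns 1 and
 * stores the result in *out on success; returns 0 on any failure. */
int parse_int_in_range(const char *text, long long min, long long max,
                       long long *out)
{
    char *end;
    long long v;

    assert(text != NULL && out != NULL);
    errno = 0;
    v = strtoll(text, &end, 10);
    if (end == text || *end != '\0') {
        return 0;   /* no digits at all, or trailing characters */
    }
    if (errno == ERANGE) {
        return 0;   /* value does not fit even in long long */
    }
    if (v < min || v > max) {
        return 0;   /* value does not fit in the destination type */
    }
    *out = v;
    return 1;
}
```

Passing the limits of the destination type (for example INT_MIN and INT_MAX for an int field) is what lets one routine serve every signed integer type.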
While these functions are useful, in most cases it is more convenient to use the preprocessor macros which are specifically intended for populating variables (and struct fields).
Type | Macro |
---|---|
Signed integer | YOCTON_VAR_INT(property, property_name, type_name, variable) |
Unsigned integer | YOCTON_VAR_UINT(property, property_name, type_name, variable) |
String | YOCTON_VAR_STRING(property, property_name, variable) |
Consider the following input:
signed_val: -123
unsigned_val: 999
string_val: "hello world"
We might want to read this input and populate the following struct type:
struct foo {
    int signed_value;
    unsigned int unsigned_value;
    char *string_value;
};
In the following example, we populate a struct foo variable named x. A different YOCTON_VAR_... macro is used to match each property name and assign a value to a different struct field:
struct foo x = {0, 0, NULL};
struct yocton_prop *p;
while ((p = yocton_next_prop(obj)) != NULL) {
    YOCTON_VAR_INT(p, "signed_val", int, x.signed_value);
    YOCTON_VAR_UINT(p, "unsigned_val", unsigned int, x.unsigned_value);
    YOCTON_VAR_STRING(p, "string_val", x.string_value);
}
In the above example the fields of a struct are being populated, but this does not have to be the case; for example, the following sets an ordinary variable named string_value:
char *string_value = NULL;
struct yocton_prop *p;
while ((p = yocton_next_prop(obj)) != NULL) {
    YOCTON_VAR_STRING(p, "string_val", string_value);
}
It is important to note that these macros are designed to provide a simple and convenient API, not for efficiency. If performance is essential or becomes a bottleneck, it may be preferable to avoid using these macros.
C provides enumerated types (enums) which allow the programmer to define a set of integer values with symbolic names. Yocton provides support for enums through the yocton_prop_enum() function, which maps a property value to an integer value by looking it up in an array of strings. For example:
enum e { FIRST, SECOND, THIRD };
const char *enum_names[] = {"FIRST", "SECOND", "THIRD", NULL};
enum e enum_var = yocton_prop_enum(p, enum_names);
The array of strings must be NULL terminated. As with the other functions described in the previous section, it is usually simpler to use the YOCTON_VAR_ENUM() convenience macro:
struct yocton_prop *p;
while ((p = yocton_next_prop(obj)) != NULL) {
    YOCTON_VAR_ENUM(p, "enum_val", enum_var, enum_names);
}
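The lookup in a NULL-terminated array of strings described in this section can be sketched in a few lines. This is an illustration under stated assumptions, not Yocton's implementation: lookup_enum_name() is a hypothetical helper, and returning -1 for an unknown name is a choice made for this sketch only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Yocton: returns the index of `value`
 * in the NULL-terminated array `names`, or -1 if it is not present. */
int lookup_enum_name(const char *value, const char **names)
{
    int i;

    assert(value != NULL && names != NULL);
    /* The NULL terminator is what ends the loop; no length is passed in. */
    for (i = 0; names[i] != NULL; i++) {
        if (strcmp(value, names[i]) == 0) {
            return i;
        }
    }
    return -1;
}
```

Because the returned index is used as the enum value, the strings must be listed in the same order as the enum constants, as in the enum_names array above.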
Sometimes we might have a pointer variable, and want to initialize that variable when a particular property is read. For example, consider the following input:
foo {
    val: "hello world"
}
We might want to use this to initialize the following pointer variable:
struct foo {
    char *val;
};
struct foo *my_foo = NULL;
In this scenario, we can use YOCTON_VAR_PTR() to allocate a new struct foo. In the following example, when YOCTON_VAR_PTR() matches a property named foo, a new struct foo is allocated, my_foo is initialized to point to it, and parse_foo() is called to populate it from the property's object value.
void parse_foo(struct yocton_object *obj, struct foo *my_foo);
struct yocton_prop *p;
while ((p = yocton_next_prop(obj)) != NULL) {
    YOCTON_VAR_PTR(p, "foo", my_foo, {
        parse_foo(yocton_prop_inner(p), my_foo);
    });
}
The Yocton format has no special way of representing lists. Since property names do not have to be unique, it is simple enough to represent a list using multiple properties with the same name.
As with the previous example that described how to populate variables (and struct fields) with base types, convenience macros also exist for constructing arrays. The main difference is that an extra variable (or struct field) is needed to store the array length.
Type | Macro |
---|---|
String array | YOCTON_VAR_STRING_ARRAY(property, property_name, variable, length_variable) |
Signed integer array | YOCTON_VAR_INT_ARRAY(property, property_name, type_name, variable, length_variable) |
Unsigned integer array | YOCTON_VAR_UINT_ARRAY(property, property_name, type_name, variable, length_variable) |
Enum array | YOCTON_VAR_ENUM_ARRAY(property, property_name, variable, length_variable, enum_names) |
Array of pointers | YOCTON_VAR_PTR_ARRAY(property, property_name, variable, length_variable, code_block) |
Array of structs | YOCTON_VAR_ARRAY(property, property_name, variable, length_variable, code_block) |
Consider the following input:
signed_val: -123
signed_val: 456
unsigned_val: 999
unsigned_val: 12345
string_val: "hello"
string_val: "world"
enum_val: THIRD
enum_val: FIRST
We might want to parse this input to populate the following struct type:
enum e { FIRST, SECOND, THIRD };
struct bar {
    int *signed_values;
    size_t num_signed_values;
    unsigned int *unsigned_values;
    size_t num_unsigned_values;
    char **string_values;
    size_t num_string_values;
    enum e *enum_values;
    size_t num_enum_values;
};
The following code populates a single struct bar named x:
const char *enum_names[] = {"FIRST", "SECOND", "THIRD", NULL};
struct bar x = {NULL, 0, NULL, 0, NULL, 0, NULL, 0};
struct yocton_prop *p;
while ((p = yocton_next_prop(obj)) != NULL) {
    YOCTON_VAR_INT_ARRAY(p, "signed_val", int, x.signed_values,
                         x.num_signed_values);
    YOCTON_VAR_UINT_ARRAY(p, "unsigned_val", unsigned int,
                          x.unsigned_values, x.num_unsigned_values);
    YOCTON_VAR_STRING_ARRAY(p, "string_val", x.string_values,
                            x.num_string_values);
    YOCTON_VAR_ENUM_ARRAY(p, "enum_val", x.enum_values,
                          x.num_enum_values, enum_names);
}
While the above macros are convenient for building arrays of base types, it is often preferable to construct arrays of structs. The YOCTON_VAR_ARRAY() macro can be used to do this (in fact, it can be used to construct arrays of any type; it is what the previous macros are built upon). It does the following:
- Check if the name of the property matches a particular name.
- If the name matches, the array pointer is reallocated to allot space for a new element at the end of the array.
- An arbitrary block of code is executed that can (optionally) populate the contents of the new array element.
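The three steps listed above can be sketched as an ordinary function. This is only an illustration of the pattern, not the macro's actual expansion: struct item and append_if_match() are hypothetical names, and unlike the macro this helper increments the length itself and leaves populating the element to the caller.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct item {
    int val;
};

/* Hypothetical helper, not a Yocton function: if `name` equals `wanted`,
 * grows *items by one zeroed element, counts it, and returns a pointer to
 * it so the caller can populate it. Returns NULL when the name does not
 * match or allocation fails; the array is left unchanged in both cases. */
struct item *append_if_match(const char *name, const char *wanted,
                             struct item **items, size_t *num_items)
{
    struct item *grown;

    assert(name != NULL && wanted != NULL);
    assert(items != NULL && num_items != NULL);

    /* Step 1: check whether the property name matches. */
    if (strcmp(name, wanted) != 0) {
        return NULL;
    }
    /* Step 2: reallocate to make room for one more element at the end. */
    grown = realloc(*items, (*num_items + 1) * sizeof(**items));
    if (grown == NULL) {
        return NULL;    /* the old array is still valid and unchanged */
    }
    *items = grown;
    memset(&grown[*num_items], 0, sizeof(*grown));
    /* Step 3 happens in the caller: populate the returned element. */
    return &grown[(*num_items)++];
}
```

Note that realloc() may move the array, so a pointer returned by one call should not be kept across a later call.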
Consider the following input:
item { val: 1 }
item { val: 2 }
item { val: 3 }
We might want to parse this input into the following array:
struct foo {
    int val;
};
struct foo *items = NULL;
int num_items = 0;
In the following example, when YOCTON_VAR_ARRAY() matches a property named item, the items array is reallocated to allot space for a new element (items[num_items]). The parse_foo() function is then called to populate the contents of this new struct from the property's inner object value. Finally, the length of the array, num_items, is incremented.
void parse_foo(struct yocton_object *obj, struct foo *item);
struct yocton_prop *p;
while ((p = yocton_next_prop(obj)) != NULL) {
    YOCTON_VAR_ARRAY(p, "item", items, num_items, {
        parse_foo(yocton_prop_inner(p), &items[num_items]);
        num_items++;
    });
}
The previous section covered how to construct an array of structs. The analogous YOCTON_VAR_PTR_ARRAY() can be used to construct an array of struct pointers. Consider the following input (the same input as in the previous section):
item { val: 1 }
item { val: 2 }
item { val: 3 }
We might want to parse this input into the following array (note the difference from the previous section: this is an array of pointers to structs):
struct foo {
    int val;
};
struct foo **items = NULL;
int num_items = 0;
In the following example, when YOCTON_VAR_PTR_ARRAY() matches a property named item, a new struct foo is allocated and appended to the items array, and the parse_foo() function is called to populate the struct's contents from the property's inner object value. Finally, the length of the array, num_items, is incremented.
void parse_foo(struct yocton_object *obj, struct foo *item);
struct yocton_prop *p;
while ((p = yocton_next_prop(obj)) != NULL) {
    YOCTON_VAR_PTR_ARRAY(p, "item", items, num_items, {
        parse_foo(yocton_prop_inner(p), items[num_items]);
        num_items++;
    });
}
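To show how an array of pointers differs from an array of structs at the allocation level, here is a minimal sketch of the append step. It is not the macro's actual expansion; struct item and append_new_item() are hypothetical names, and zero-initializing the new struct with calloc() is an assumption of this sketch.

```c
#include <assert.h>
#include <stdlib.h>

struct item {
    int val;
};

/* Hypothetical helper, not a Yocton function: allocates a zeroed struct,
 * appends a pointer to it to *items, and returns the new struct. Returns
 * NULL, leaving the array unchanged, if either allocation fails. */
struct item *append_new_item(struct item ***items, size_t *num_items)
{
    struct item **grown;
    struct item *elem;

    assert(items != NULL && num_items != NULL);

    /* Each element gets its own allocation... */
    elem = calloc(1, sizeof(*elem));
    if (elem == NULL) {
        return NULL;
    }
    /* ...and only the array of pointers is reallocated. */
    grown = realloc(*items, (*num_items + 1) * sizeof(**items));
    if (grown == NULL) {
        free(elem);
        return NULL;
    }
    *items = grown;
    grown[(*num_items)++] = elem;
    return elem;
}
```

One practical consequence of this layout is that a pointer to an element stays valid when the array later grows, because only the pointer table moves. The cost is one extra allocation per element, each of which must eventually be freed.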
There are many different types of error that can occur while parsing a Yocton file. For example:
- Syntax error
- Memory allocation failure
- Property has unexpected type (string for an object property, or vice versa)
- Invalid property value (e.g. overflow when parsing an integer value)
- Violation of a user-provided constraint
Continual checking for error conditions can make for complicated code. The Yocton API instead adopts an "error state" mechanism for error reporting. Write your code assuming success, and at the end, check once if an error occurred.
Here's how this works in practice: most parsing code involves continually calling yocton_next_prop() to read new properties from the file. If an error condition is reached, this function will stop returning any more properties. In effect it is like reaching the end of file. So when "end of file" is reached, simply check if an error occurred or whether the document was successfully parsed.
Here is a simple example of what this might look like:
// Returns true if file was successfully parsed:
bool parse_config_file(const char *filename, struct config_data *cfg)
{
    FILE *fs;
    struct yocton_object *obj;
    const char *error_msg;
    int lineno;
    bool success;
    fs = fopen(filename, "r");
    if (fs == NULL) {
        return false;
    }
    obj = yocton_read_from(fs);
    parse_config_toplevel(obj, cfg);
    success = !yocton_have_error(obj, &lineno, &error_msg);
    if (!success) {
        fprintf(stderr, "Error in parsing config:\n%s:%d:%s\n",
                filename, lineno, error_msg);
    }
    yocton_free(obj);
    fclose(fs);
    return success;
}
Some of the API functions will also trigger the error state. It may be tempting to add extra checks in your code to avoid this happening, but it is better that you do not. If an error is triggered in this way, it is likely that it is due to an error in the file being parsed. Your API calls implicitly document the expected format of the input file. If the file does not conform to that format, it is the file that is wrong, not your code.
An example may be illustrative. Suppose your Yocton files contain a property called name which is expected to have a string value. If the property has an object value instead, a call to yocton_prop_value() to get the expected string value will trigger the error state. That is not a misuse of the API; your code is implicitly indicating that a string was expected, and the input is therefore erroneous. The line number where the error occurred is logged, just the same as if the file itself were syntactically incorrect.
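The "error state" idea can be illustrated with a small self-contained model. This is a sketch under stated assumptions, not Yocton's internals: struct reader, reader_next() and reader_string() are hypothetical names, the input is a pre-built array rather than a parsed file, and returning an empty string after a type mismatch is a choice made for this sketch only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a "sticky" error state; not Yocton code. */
enum value_kind { KIND_STRING, KIND_OBJECT };

struct entry {
    const char *name;
    enum value_kind kind;
    const char *text;       /* only meaningful for KIND_STRING */
    int lineno;
};

struct reader {
    const struct entry *entries;
    size_t count;
    size_t next;
    const char *error;      /* NULL while no error has occurred */
    int error_lineno;
};

/* Returns the next entry, or NULL at end of input or once an error is set.
 * The caller's loop therefore ends the same way in both cases. */
const struct entry *reader_next(struct reader *r)
{
    assert(r != NULL);
    if (r->error != NULL || r->next >= r->count) {
        return NULL;
    }
    return &r->entries[r->next++];
}

/* Returns the string value of an entry. Asking for a string from an
 * object entry records an error (first error wins) rather than requiring
 * the caller to check the type beforehand. */
const char *reader_string(struct reader *r, const struct entry *e)
{
    assert(r != NULL && e != NULL);
    if (e->kind != KIND_STRING) {
        if (r->error == NULL) {
            r->error = "string value expected";
            r->error_lineno = e->lineno;
        }
        return "";
    }
    return e->text;
}
```

With this shape the reading loop contains no error handling at all; the caller inspects the error and error_lineno fields once after the loop ends, which mirrors the single yocton_have_error() check in parse_config_file() above.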