1.) Basic Concept

This document describes the serialization format used by ObjFW. It is designed
to be easily parsable and usable from other programming languages while still
supporting ObjFW-specific features, all of which are optional.

Every object can have a set of parameters which are optional and can be ignored
if they don't exist in the language for which parsing is done. These parameters
are written in the form (parameter1,parameter2) and precede the object.
Parameters not known by the implementation should be ignored - they are
completely optional, except for those of the extension type described in 6.).

All spaces (except those in strings, of course) are optional and only to improve
readability - they are by no means required, but they are still recommended.
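
The parameter handling described above can be sketched in Python as follows.
This is only an illustration, not the ObjFW implementation: the function name
split_parameters, the KNOWN set and the parameter "frobnicate" are made up for
this example, and the sketch assumes that a parameter never contains a comma or
a parenthesis.

```python
def split_parameters(text):
    """Split an optional leading "(p1,p2)" group off a serialized object.

    Returns (parameters, rest). Assumes parameters contain neither commas
    nor parentheses (an assumption of this sketch, not a rule of the format).
    """
    text = text.lstrip()
    if not text.startswith("("):
        return [], text
    end = text.find(")")
    if end == -1:
        raise ValueError("unterminated parameter list")
    # Spaces around parameters are optional, so strip them.
    params = [p.strip() for p in text[1:end].split(",") if p.strip()]
    return params, text[end + 1:].lstrip()


# Parameters this hypothetical reader understands; anything else is ignored.
KNOWN = {"mutable"}

params, rest = split_parameters('( mutable , frobnicate ) "Hello"')
understood = [p for p in params if p in KNOWN]
print(params, understood, rest)
```

Only the edges of each parameter are stripped, so a parameter such as
unsigned int keeps its inner space.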

ObjFW serialization supports 4 basic types: strings, arrays, dictionaries and
numbers. It is limited to those 4 types because they are available in all
languages.
Lists are a special case of arrays in ObjFW serialization. For all objects
that cannot be serialized with those 4 basic types, there is the extension
type <>, which can be used for any object but is not portable between languages.
This type is described in 6.).


2.) Strings

Strings are very similar to strings in C. They start with a " and end with
a ". The escape sequences \", \\, \n, \r and \t exist, as in C strings. Using
\", \\, \n and \r is required: a literal newline inside a string would change
the string's value whenever the text is re-indented. Using \t is optional, but
recommended.
Strings can be split just like in C. For example "Hello " "World" is equivalent
to "Hello World". Because this is possible, it is recommended to end the string
after \n and continue it on a new line to increase readability.

Strings are required to be UTF-8 encoded.

The only accepted parameter for strings is mutable. If the language for which
parsing is done does not know the concept of mutable and immutable objects, this
parameter should be ignored.

Examples for strings:

	"This is a \"string!\""

	"This is a string containing a\n"
	    "new line!"

	"This is\ta string\tcontaining tabs!"

	(mutable)"This string is mutable!"
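
The string rules above (the five escape sequences, no literal newlines and
concatenation of adjacent literals) can be sketched in Python as follows. The
function name parse_string and the error messages are invented for this sketch.
It works on already decoded text and does not handle the parameter list.

```python
ESCAPES = {'"': '"', "\\": "\\", "n": "\n", "r": "\r", "t": "\t"}


def parse_string(text, pos=0):
    """Parse one or more adjacent string literals starting at text[pos].

    Returns (value, new_pos). Adjacent literals are concatenated.
    """
    parts = []
    while True:
        if pos >= len(text) or text[pos] != '"':
            raise ValueError("expected opening quote at offset %d" % pos)
        pos += 1
        while True:
            if pos >= len(text):
                raise ValueError("unterminated string")
            ch = text[pos]
            if ch == '"':
                pos += 1
                break
            if ch in "\r\n":
                # Newlines must be written as \n and \r.
                raise ValueError("literal newline at offset %d" % pos)
            if ch == "\\":
                pos += 1
                if pos >= len(text) or text[pos] not in ESCAPES:
                    raise ValueError("invalid escape at offset %d" % pos)
                parts.append(ESCAPES[text[pos]])
            else:
                parts.append(ch)
            pos += 1
        # Another literal after optional whitespace continues the string.
        look = pos
        while look < len(text) and text[look] in " \t\r\n":
            look += 1
        if look < len(text) and text[look] == '"':
            pos = look
            continue
        return "".join(parts), pos


value, end = parse_string('"This is a string containing a\\n"\n    "new line!"')
print(repr(value))
```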


3.) Arrays

Arrays start with a [ and end with a ]. The elements are separated by commas.
Whitespace is allowed between objects, after the initial [ and before the
final ].

Arrays accept the mutable parameter, which should be handled the same way as
for strings.

They also accept the list parameter. Specifying this parameter creates a
doubly linked list instead of an array.

A number may also be given as a parameter. It states the expected size of the
array and is only a hint: it is by no means required and should by no means be
assumed to be reliable. If the number is not equal to the actual size, the
parser should error out. To prevent a possible DoS, the parser should also
ignore the number if it is too big.

Examples for arrays:

	["This", "is", "an", "array", "with", "7", "strings"]

	(mutable)["This array", "is mutable"]

	(3)["This array specifies", "the number of elements", "for performance"]

	(mutable,2)["Parameters can be", "combined"]
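
One possible way to treat the size parameter is sketched below in Python. The
names reserve and finish_array and the cap MAX_PREALLOC are assumptions made for
this example, since the format does not define what "too big" means. The sketch
also assumes that a mismatch is still reported when an oversized number was
ignored for preallocation.

```python
MAX_PREALLOC = 1024  # assumed cap; the format itself does not define one


def reserve(size_hint):
    """Return how many slots to reserve before any element is parsed."""
    if size_hint is None or size_hint < 0 or size_hint > MAX_PREALLOC:
        return 0  # missing or implausible hint: reserve nothing up front
    return size_hint


def finish_array(size_hint, elements):
    """Check the declared size against the elements that were parsed."""
    if size_hint is not None and size_hint != len(elements):
        raise ValueError(
            "declared size %d, found %d elements" % (size_hint, len(elements))
        )
    return elements


print(reserve(3), reserve(10**12))
print(finish_array(2, ["Parameters can be", "combined"]))
```

The point of the split is that the untrusted number never drives an allocation
on its own; it is only compared against what was actually read.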


4.) Dictionaries

Dictionaries start with a { and end with a }. The elements are written in the
form key = value and each entry ends with a ;.

Dictionaries accept the mutable parameter, which should be handled the same way
as for strings.

They also accept a number as a parameter, which should be handled exactly as
for arrays.

Examples for dictionaries:

	{"This is a key" = "This is a value";}

	(mutable){"mutable" = (BOOL)1;}

	(2){
		"key1" = "value1";
		"key2" = "value2";
	}

	{
		["Mapping", "an", "array"] = "To a string";
		{ "mapping" = "a dictionary" } = ["To", "an", "array"];
	}
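
As an illustration of how strings, arrays and dictionaries nest, here is a
small serializer sketch in Python. The function names escape and serialize are
invented for this example. It covers only str, list/tuple and dict, writes no
parameters, and inherits Python's restriction that dictionary keys must be
hashable, so arrays or dictionaries as keys are not covered.

```python
def escape(s):
    """Apply the escape sequences from section 2; the backslash goes first."""
    return (
        s.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
        .replace("\t", "\\t")
    )


def serialize(obj):
    """Serialize nested str / list / dict values into the text format."""
    if isinstance(obj, str):
        return '"' + escape(obj) + '"'
    if isinstance(obj, (list, tuple)):
        return "[" + ", ".join(serialize(item) for item in obj) + "]"
    if isinstance(obj, dict):
        # Every entry is written as key = value and ends with a ;
        entries = "".join(
            " " + serialize(key) + " = " + serialize(value) + ";"
            for key, value in obj.items()
        )
        return "{" + entries + " }" if entries else "{}"
    raise TypeError("unsupported type: " + type(obj).__name__)


print(serialize({"key1": "value1", "key2": ["a", "b"]}))
```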


5.) Numbers

Numbers are written as plain numeric literals. The type of a number may be
specified by a parameter. If it is not specified, the implementation should
choose a type which fits the number. If the specified type is not big enough
for the number, the implementation should use another type that fits.

Known parameters are:
	BOOL
	char		(signed!)
	short		(signed!)
	int		(signed!)
	long		(signed!)
	int8_t
	int16_t
	int32_t
	int64_t
	unsigned char
	unsigned short
	unsigned int
	unsigned long
	uint8_t
	uint16_t
	uint32_t
	uint64_t
	size_t
	ssize_t
	intmax_t
	uintmax_t
	intptr_t
	uintptr_t
	float
	double

Examples for numbers:

	1
	2.6
	(BOOL)0
	(intmax_t)1234567
	(double)2.5
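
The rule "use the specified type if it fits, otherwise another type that fits"
could look like the following Python sketch for integers. It only covers the
fixed-width types, because the sizes of char, short, int and long depend on the
platform. The function name fit_integer_type and the fallback order are
assumptions of this example; the format does not prescribe which other type to
pick.

```python
RANGES = {
    "int8_t": (-2**7, 2**7 - 1),
    "int16_t": (-2**15, 2**15 - 1),
    "int32_t": (-2**31, 2**31 - 1),
    "int64_t": (-2**63, 2**63 - 1),
    "uint8_t": (0, 2**8 - 1),
    "uint16_t": (0, 2**16 - 1),
    "uint32_t": (0, 2**32 - 1),
    "uint64_t": (0, 2**64 - 1),
}

# Assumed fallback order: smallest signed type first, then uint64_t.
FALLBACK = ["int8_t", "int16_t", "int32_t", "int64_t", "uint64_t"]


def fits(value, type_name):
    low, high = RANGES[type_name]
    return low <= value <= high


def fit_integer_type(value, requested=None):
    """Return the requested type if it can hold value, else one that can."""
    if requested in RANGES and fits(value, requested):
        return requested
    for type_name in FALLBACK:
        if fits(value, type_name):
            return type_name
    raise OverflowError("no known integer type fits %d" % value)


print(fit_integer_type(1234567))
print(fit_integer_type(300, "uint8_t"))
```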


6.) Extension Type

The extension type allows adding new objects to ObjFW serialization. The
extension type has a parameter class= which specifies the class which should
handle deserialization. The extension type starts with a < and ends with a >.
Inside those brackets can be arbitrary basic types, which should be passed
unmodified to the class for deserialization.

If an implementation can't deserialize an extension type, it is required to
error out. Other languages are allowed to parse extension types of classes which
are in ObjFW, like OFXMLElement, but are by no means required to do so. Other
languages may also add their own extension types, but are required to add the
foreign= parameter and set it to their name, so that other implementations don't
try to deserialize them, but error out instead.

Other implementations are also allowed to serialize to ObjFW objects if they
know them. For example, it might be desirable to create OFXMLElements from
other languages as well.

Examples for using the extension type:

	(class=OFXMLElement)<"<some-xml/>">

	(class=OFURL)<
		"https://webkeks.org/objfw/"
	>

	(class=Foo,foreign=Foolang)<
		{
			"property1" = "value1";
		}
	>
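
The dispatch rules for extension types can be sketched in Python as follows.
Everything named here is hypothetical: the HANDLERS registry, the register
decorator and the own_language argument are inventions of this example, and
urllib.parse.urlsplit merely stands in for whatever a real implementation would
build for OFURL. The sketch assumes the parameters and the payload have already
been parsed.

```python
from urllib.parse import urlsplit

HANDLERS = {}  # hypothetical registry: class name -> deserialization function


def register(class_name):
    def decorator(func):
        HANDLERS[class_name] = func
        return func
    return decorator


@register("OFURL")
def make_url(payload):
    # Stand-in for a real URL class; the payload is passed on unmodified.
    return urlsplit(payload)


def deserialize_extension(params, payload, own_language=None):
    """Dispatch on class=, erroring out on anything that can't be handled."""
    fields = dict(p.split("=", 1) for p in params if "=" in p)
    if "class" not in fields:
        raise ValueError("extension type without class= parameter")
    foreign = fields.get("foreign")
    if foreign is not None and foreign != own_language:
        raise ValueError("extension type belongs to " + foreign)
    handler = HANDLERS.get(fields["class"])
    if handler is None:
        raise ValueError("unknown extension class " + fields["class"])
    return handler(payload)


url = deserialize_extension(["class=OFURL"], "https://webkeks.org/objfw/")
print(url.netloc)
```

Both failure cases raise instead of returning a partial object, matching the
requirement to error out.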