The idea of categories is to build a super-structure in which as much information as possible is implicit in the structure (deduced from what is specified, rather than explicitly specified). In SEMAK, we want a structure that can incorporate any information into the system in a usable and efficient way. That said, some things need to be implemented in C++, either to make certain functionality accessible or to make it run fast. Imagine doing a Fourier transform on a sound signal, with pattern matching on the major frequencies over time, to form symbols at a small level and a kind of "story" of events at a higher level, all while informing the more cognizant and delegative part of the system when something important enough occurs. Forming a symbol at the small level can mean any way we find to represent noteworthy information as part of the story, or as a message to the more cognitive part of the system. Two microphones pointing in opposite directions can give a pretty good sense of where an audio source is; a symbol might go into the story when a sensory (acoustic, in this example) thread detects an audio source with some sort of weird property (maybe it sounded like it came from an unexpected location, doesn't match up with any known sound patterns, or indicates something has moved somewhere; any number of things, really). Maybe we want these parts of the system to communicate by all having access to the same categorical information. That way, a sensory system could specify a story almost like code: it could give a code to represent adding a new entity into the story, then algorithmically specify the details (like category, instance data, etc.).
The possibilities are endless anyway, especially since I plan to release the source code so that people can tweak and extend the system as they please. If you remember the part of my rant about AIs hopping between systems running on different hardware, then maybe you've already got a sense of how categorization makes that feasible (even when people extend their systems). You can categorize things like interfaces implemented in C++, and sub-categorize those into things like system calls, or general device-related interfaces and information. There might be things somewhere in the category of 'Physics', maybe in a sub-category of 'Models', or some 'Algorithm' sub-category, that get used to make device drivers dynamically (like knowing the details of how to specify the right PWM duty cycle in order to get a servo motor to a certain position). All of this is information that a computer-jumping sentient robot may want to consider when packing its bags for another host system. Having hardware interfaces (or information very specific to them) is a matter of physical capabilities that can be controlled, and means nothing on another system (except maybe as a kind of communicable experience), whereas things like intent and other information can be worthwhile to send over.
Ultimately, there should be primitive categories and compositions of them into bigger and deeper things. By having separate command interfaces, we keep our instructions small. By categorizing these interfaces, we arrive at a categorization that has special meaning to the processor (it knows to use a specific call mechanism, which is probably different from the one implemented in the execution engine).
To introduce this concept, we'll consider an example of a primitive category. Consider something like a "Boolean composition". It isn't conceptually primitive; it could be thought of as an instance of the notion of a "Bitwise composition", for example. But being primitive is about what we implement in C++ to save space and time, rather than about making our notion of primitive match conceptual standards of primitivity. We might even have a way to specify the condition under which a "Bitwise composition" can be categorized more specifically as a "Boolean composition" (the bitwise entities in sequence are homogeneous and 1 bit apart), and perhaps we want this optimization to happen automatically as we make structures (call it dynamic optimization). The other thing is that homogeneity affects bitwise entity compositions the same way it affects byte-wise compositions (the entity offset calculation becomes a linear function, rather than storing offset sequences and doing a lookup), so there's a 2x2 matrix of possible categories defined over the homogeneity status and the "bit or byte entity" status. The entity type status defines whether offset/size calculations measure in bits or bytes, while the homogeneity status tells us whether we define the binary format using a size per element (in the homogeneous case) or a sequence of offsets (each provides quick random access, as long as you have hardware multipliers for the homogeneous case). Let the paradigm shift sink in; we've gone from:
In the second paradigm, we see what primitive means to the processor: orthogonal, independent things that get associated with their unique processor functions and combine to form a final accumulated interpretation. We need two bits of information: the one for homogeneity tells us how to interpret the rest of the category's information, and the one for bit/byte mode tells us how to get and set the value of an indexed entity from some data identified by a memory location (which depends on whether addressing is based on bits or bytes).
We use the idea of states when we talk about representing things: 1 bit is needed to represent homogeneity (homogeneous = 1, heterogeneous = 0), and the same goes for bit/byte mode. Any combination of homogeneity and bit/byte mode is valid, so there are 4 possible states (2 bits' worth). This is what happens when we need one state AND another state in our final sequence of data: we put the string of bits needed to represent each state in sequence, one string of bits after the other. What if we need one state OR another state? In that case, we use a string representing a different number for each potential state (the final state is one or the other, not a combination of them together).
There's information about how we represent the category, and information about how we get computation-ready binary representations out of instance memory, all packed into a sequence of bits and placed right at the start of a category definition, since the definition's format depends on the information encoded in the binary string (which will be useful when we start embedding category definitions in memory or in a file structure). Let's get formal: we want a way of representing or implying the state in the category definition, and how its interface is used. We want it to scale well: for 64 possible states, it would be ideal to deterministically decompose that into 6 binary sub-states, with potentially 6 procedural changes (one for each sub-state), rather than writing out 64 unique (and large) procedural changes for interpreting the category and its instances, which would be a mistake if we could factor out a few conceptual changes and represent those as substates in the final representation. So, formally, we want to define a representation for categorical interpretation (R) of the form:
$R = S_0 | S_1 | ... | S_K$
Where '|' represents the binary concatenation operator, and each substate $S_0,..,S_K$ is concatenated in a way that I'll describe. We can say: $\forall i \in \mathbb{N}[0, K], \exists M \in \mathbb{N} : S_i \in \mathbb{N}[0, 2^{M})$
In other words, we can assign $M$ bits to each substate, knowing it can be represented within those bits ($M$ bits can represent any unsigned number from $0$ to $2^{M} - 1$). We would like each substate to be allocated a personal length of bits, because this makes it computationally easy to decode from $R$. We can define a function $L$ such that $L : S_i \in \mathbb{N}[0, 2^{M_i} - 1] \Rightarrow M_i$. Let $R[m,n]$ represent the number you get when you go to the bit at index $m$ in $R$ and take $n$ bits starting at location $m$. Then:

$(R = S_i | S_j) \iff (R[0, L(S_i)] = S_i$ and $R[L(S_i), L(S_j)] = S_j)$
Basically, $R$ is made by starting with the bits of $S_i$, then continuing with the bits of $S_j$. Notice this operation ($f : (S_i, S_j) \Rightarrow S_i|S_j$) is associative, non-commutative, and conditionally invertible: provided we know $L(S_i)$ and $L(S_j)$, we know $f : S_i|S_j \Rightarrow S_i$ and $g : S_i|S_j \Rightarrow S_j$. So, referring back to the equation:
$R = S_0 | S_1 | ... | S_K$
We want to maximize the number of distinct groups of procedural modes, identified as substates $S_0,.., S_K$; essentially, we want to optimize the ratio of $K$ to the average of $L(S_i)$ over $i \in \mathbb{N}[0, K]$. Practically, consider that when $L(S) = 1$ we have two possible implementations to choose from (and that we have to implement); when $L(S) = 2$ we have four implementations (to write and potentially use); $L(S) = 3$ gives 8 implementations. The amount of work per substate is proportional to the number of potential states that the substate has. So we want our substates to usually be binary (this or that), since this is when $Work / Avg(L(S))$ is minimized, but the golden rule is that if two states cannot be composed at the same time, then combine them into the same substate, rather than making them parts of different substates.

What we've covered here is the basis of category definitions and what we want to implement in C++. Besides the mechanisms covered, we could also talk about using contexts to partition classes of objects by which substates are used in their representation code. Our binary concatenation mechanism combines groups of substates $\{S_{0i}\}$,..,$\{S_{Ui}\}$ in a way that can be semantically represented as:
($S_{00}$ OR ... OR $S_{0N}$) AND ... AND ($S_{U0}$ OR ... OR $S_{UM}$)
When a class of things can be described with this pattern, we tend to do exactly that. What cannot or should not be described with this pattern? Let us consider a kind of antithesis pattern. Let $\mathbb{P}$ represent the class of all things that can be described with this pattern, let $P_0$,..,$P_N$ $\in \mathbb{P}$. Consider the expression:
$P_0$ OR ... OR $P_N$ $= POR\{P_i\} = POR\{ PAND\{POR\{S_{ij}\}\} \}$
There may be shared substates between any two values $P_i$ and $P_j$ (represented as $S_i = POR\{S_{ij}\}$). There is a sense in which $P_i$ and $P_j$ can have common factors. We can find the largest possible $C \in \mathbb{P}$ such that $P = C$ AND $S$ for both $P = P_i$ and $P = P_j$, each sharing $C$ while having its own unique $S \in \mathbb{P}$. Let $P_i = C$ AND $S_i$, and let $P_j$ have the same form (note that we could define this expression for $S_i \in \mathbb{P}$, and the result would be the same). We can write:
($P_i$ OR $P_j$) = ($C$ AND $S_i$) OR ($C$ AND $S_j$) = $C$ AND ($S_i$ OR $S_j$)
This is a hint towards how we deal with categories whose primitives don't neatly fit a specific $S \in \mathbb{P}$. Here's one option that works with our formula:

$f(k) = \left\{ \begin{array}{ll} S_i[k] & 0 \le k < N_i \\ S_j[k - N_i] & N_i \le k < N_i + N_j \end{array} \right.$
This works because, given $k$, we know which state in $S_i$ or $S_j$ it corresponds to, provided we know at least $N_i$.