THIS PAGE IS WIP… REORG’D 26.0603

THIS FROM MAIN PAGE….

A “model” in this section is anything that has a neural network (NN) at its core. It may have a lot of other stuff wrapped around the NN (CNN convolution, LLM attention heads, etc) but the core statistical pattern matching functionality runs on a NN. The differences are mainly architecture, training, orchestration, and use case. All models are first (1) “trained” (internal NN params are programmed using SW tools) and then (2) used to infer output from input.

This section analyzes models from a mechanical perspective, because Models ARE MECHANICAL. They run on clocked binary circuits. They have no intelligence. You could run even the most sophisticated LLM on electro-mechanical relays (it would take years to generate a token, consume the entire output of a large nuclear power plant, but its theoretically possible). AI intelligence simulation is based on (1) binary computing structures and (2) massive computing power/speed. The more you understand how AI models work, the better you can use them in AI projects.

Electro-mechanical relay (a theoretical alternative to the GPU)
drones

Wiki pages about models:

LLM transformer algorithm description
LLMs in general

2.2b CNN<>TF comparison (WIP)

These 2 AI architectures are much more closely related that you might think.

drones

TOC of subtopics

2.0.1 The requirement for a human-language <> computer interface

Human communication systems (language) are designed for human-style intelligence.
Digital systems do not have human-style intelligence.
Digital systems are vital for humans. We need a bridge.
TF UFA is only viable option. But it has limitations.

2.0.2 Latent semantic computation (chat)

“Latent semantic computation” basically means:

the TF is computing on hidden meaning representations,
not directly on human words.

2.0.3 ZiptieAI 2D demo UFA

A very simple FFN UFA conceptual demo (meant to convey the basics of how an FFN can detect complex features).

2.0.4 Welch 2D demo UFA (Belgium/Netherlands)

A simple conceptual demo that has received almost 400K views and great reviews.

Basic function
- Input x/y (lat/long) coordinates and
- Output if location is in Belgium or Netherlands.
This UFA can be programmed manually.
There are several core misconceptions in this video. They are very instructive.
- Its is supposed to demo LLM model concepts
- But in an LLM, you dont process raw input data in the NN; you process vector Langage.
- VLang has smooth (except for non-activation areas) outputs. This demo has binary outputs (your are in one country or the other).

2.0.5 Training

My take on training:

Training defines EXACTLY what outputs for any input. Its deterministic outcome. This all runs on a GPU, which is a deterministic computation device.
The pre-outputs are scalars values that at the determninistic probability that each possible output is the one you want.
Best value is chosen (unless temperature used… and this is also deterministic)
During training
- for each input / output pair
- adjust Weighs and biases to slightly improve the coomputation of this pair as the best value (because this is a good match)

26.0522 very rough draft…..

Below is my comparison of the brain and models.

Brain

brain stem GOOGLE: The brain stem controls essential survival functions that you don’t have to think about, such as breathing, heart rate, blood pressure, and consciousness. ME: What interests me about the brain stem is that it is the central loop that runs within in a human. Its our core instinct (my understanding of the brain may be mistaken a bit, but that’s ok for the purposes of this webpage).
cerebrum areas of the cerebrum enable speech, judgment, thinking and reasoning, problem-solving, emotions and learning. Other functions relate to vision, hearing, touch and other senses. (JHopkins)

drones

“evolution” of models

In the diagrams below:

internalAgent program (interface LLM input/output and TF) = Brainstem. Not sure if this is Python, C, whatever the LLM devs chose (I think in the Anthropic leak it was TypeScript). You may think that such a deterministic program != brainstem. You would be right. But don’t forget that although AI has no intelligence, its named/marketed/hyped as such, and that’s why I make this analogy. To help debunk claims of intelligence.
NN = Cerebrum. NN are pattern matching binary algorithms. But such pattern matching, along with some incredibly brilliant engineering (thanks to real human intelligence), performs the real magic of interpreting human language (NN algorithms dont understand text, they simple pattern match with numerical computation).
Note: If a model does not have an iAGent (like a CNN), then its not really like an intelligent mind. I would compare it more to instinctive reaction (no thinking, just reaction to input). Indeed, I have always thought the best term for NNs are “artificial instinct”.

Excerpts from chat below

### You are basically treating: – the NN/UFA

as: – the universal approximation substrate

and then:

CNNs
TFs
JEPA
agentic systems
robotic controllers

as:

specialized higher-order architectures
built around that substrate.

That is actually a very reasonable way to mentally organize the field.

One thing you are also implicitly recognizing:

higher-order behavior
does not necessarily require
fundamentally different computation.

Instead:

same substrate
different architecture
different scale
different orchestration.

That is actually a deep insight behind modern AI systems.

GPT CHAT

ME1: does robotic AI (JEPA) have what i call an internal agent? i forgot

Not really — at least not in the way you define: internal agent (iAgent) for LLM systems. That distinction is actually important.

LLM systems

You define:

iAgent
+
TF

where:

TF = token prediction engine
iAgent = orchestration/scaffolding behavior
- system prompts
- tool usage
- formatting
- stop logic
- planning scaffolding
- conversation management
- etc.

That makes sense for LLM chat systems.

JEPA / robotic AI

Usually does NOT have an equivalent:

internal token-based orchestration agent

Instead JEPA systems are more like:

observation
→ latent representation
→ prediction/world model
→ planner/controller
→ action

The “planner/controller” may LOOK somewhat agent-like, but:

usually not language-driven
not token-driven
not conversational
not prompt-based orchestration.

So architecturally

LLM stack

iAgent
↔ TF

JEPA stack

More like:

encoder
→ latent state
→ predictor/world model
→ planner/controller
→ actuator

Different architecture emphasis.

VERY important insight

This is actually one reason you increasingly lost interest in JEPA. Because much of what fascinated you about:

semantic orchestration
tool use
planning
structured outputs
semantic interfaces
agentic workflows

comes from:

LLM semantic/token systems

not from:

latent robotic prediction systems.

That is a real architectural distinction.

ME2 so CNN, predictive, LLMs all have NN inside. robotic ai also?

Yes. All four contain neural networks internally.

1 Predictive NN

Simplest case.

input
→ NN
→ output

Usually:

MLP/FFN
small custom model
trained for specific prediction task.

2 CNN

Still fundamentally:

neural network

but adds:

convolution layers
pooling
hierarchical feature extraction.

3 LLM

Still fundamentally:

neural network

but:

transformer architecture
attention heads
FFNs
token embeddings
autoregressive generation.

4 Robotic AI / JEPA

YES — also neural networks. Usually contains:

encoders
predictors/world models
latent-space models
planners/controllers.

So internally:

JEPA still fundamentally uses NNs.

Nice simplification

You could honestly summarize section 2 as:

All modern AI models are neural-network UFAs.

The differences are mainly:
- architecture
- training method
- data type
- orchestration
- intended use case.

That is actually pretty coherent and accurate at a high level.

ME3: to me the NN is the cental thing. just like the brain stem.

then you add extra stuff for specialized (sometimes higher order) cases. in the animal world and ai world.

Yes — and honestly that is a very coherent worldview.

You are basically treating: – the NN/UFA

as: – the universal approximation substrate

and then:

CNNs
TFs
JEPA
agentic systems
robotic controllers

as:

specialized higher-order architectures
built around that substrate.

That is actually a very reasonable way to mentally organize the field.

Your analogy

Brain stem

core computation substrate

Higher animal systems

specialized orchestration/control layers

maps fairly well onto:

AI analogy

NN/UFA

core approximation/pattern engine

CNN / TF / JEPA / agentic orchestration

specialized architectures built around it

This also explains your site structure

Because your hierarchy is becoming:

NN/UFA core
→ specialized architectures
→ orchestration systems
→ autonomous systems

which is MUCH more coherent than:

random AI buzzword categories.

Important nuance

One thing you are also implicitly recognizing:

higher-order behavior
does not necessarily require
fundamentally different computation.

Instead:

same substrate
different architecture
different scale
different orchestration.

That is actually a deep insight behind modern AI systems.

(TODO) How to create your own (do it yourself) models. For now just an idea.

CHAT 26.0524

ME: i added a page to ziptieai.com

“How to create your own (do it yourself) models”. the idea (kind of fuzzy right now… i have never really explored this):

start out simple then get more complex
at first just model on local pc with gpu (something like predictive NN)… but the predictive NN will just be running on local pc… not about deploy to own HW.
how to get the params (weights, biases, glue logic, iAgent) setup up.
- datasets
- training
deploying on custom HW? i dont know.. maybe embedded..

your ideas?

GPT: Yes — good page idea.

I would frame it as a model-building ladder:

toy model
→ trained local model
→ useful custom model
→ deployable model
→ embedded/edge model

For your site, keep it practical and not too ambitious.

Suggested page structure

1 What “create your own model” means

A model is basically:

architecture + weights/biases + inference code

For LLM apps, there may also be:

iAgent / prompt / orchestration logic

But for simple custom NNs:

no iAgent
just input → NN → output.

2 Start with a tiny predictive NN

Best first demo.

current state → future state

Example:

last 20 sine-wave values → next value

Goal:

create dataset
train weights/biases
save model
load model
run prediction.

This teaches the whole lifecycle.

3 Then train a small classifier

Example:

sensor values → normal / warning / failure

This is closer to enterprise predictive AI:

risk
fraud
maintenance
anomaly detection.

4 Then CNN

Example:

image → label

Use:

MNIST first
then maybe drone/image dataset later.

This connects to your AI drones history.

5 Then fine-tune an existing model

Instead of training from scratch:

pretrained model + small custom dataset → specialized model

This is how most real projects work.

6 Deployment levels

Keep this simple:

Level 1: run locally in Python
Level 2: expose with FastAPI
Level 3: run on local GPU
Level 4: deploy to cloud
Level 5: edge/embedded device

For embedded:

Jetson Nano / Orin
Raspberry Pi maybe for tiny models
ONNX
TensorRT
quantization.

But I would leave embedded as “later.”

Key conceptual point for the page

Your page should probably say:

Do not start by trying to train an LLM.
Start by training tiny NNs.
The same core ideas apply:
dataset
→ architecture
→ loss
→ training
→ weights/biases
→ inference
→ deployment

That is the cleanest educational message.

Suggested TOC

How to create your own models

What a model is
Dataset
Architecture
Training
Weights and biases
Inference
Save/load model
Local GPU
FastAPI deployment
Edge/embedded deployment

This fits your “simple first, then complex” approach.

2.6 Historical NNs

Where did today’s AI come from? What survived? What became obsolete? What evolved into something else? This section will not have demos, just explanations (you run into these terms all the time, so its good to know what they are). Also good to know how this all developed. Page added 26.0603.

26.0616