

THIS PAGE IS WIP… REORG’D 26.0522


TOC of subtopics

2.0.1 The requirement for a human-language <> computer interface

  • Human communication systems (language) are designed for human-style intelligence.
  • Digital systems do not have human-style intelligence.
  • Digital systems are vital for humans. We need a bridge.
  • A TF UFA is the only viable option, but it has limitations.

2.0.2 Latent semantic computation (chat)

“Latent semantic computation” basically means:

  • the TF is computing on hidden meaning representations,
  • not directly on human words.
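
A minimal sketch of the idea, assuming a hypothetical three-dimensional embedding table (the vectors are invented for illustration; a real TF learns its vectors during training and uses far more dimensions). Words are mapped to vectors first, and the computation, here a cosine similarity, runs on the vectors rather than on the words:

```python
import math

# Hypothetical hand-made embeddings, for illustration only.
EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "truck": [0.0, 0.1, 0.9],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def similarity(word_a, word_b):
    """Look up both words, then compare their vectors (never the text)."""
    return cosine_similarity(EMBEDDINGS[word_a], EMBEDDINGS[word_b])

print(similarity("cat", "kitten") > similarity("cat", "truck"))
```

With these made-up vectors, "cat" lands closer to "kitten" than to "truck" because the numbers say so, not because anything read the words.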

2.0.3 ZiptieAI 2D demo UFA

A very simple FFN UFA conceptual demo (meant to convey the basics of how an FFN can detect complex features).

2.0.4 Welch 2D demo UFA (Belgium/Netherlands)

A simple conceptual demo that has received almost 400K views and great reviews.

  • Basic function
    • Input x/y (lat/long) coordinates and
    • Output whether the location is in Belgium or the Netherlands.
  • This UFA can be programmed manually.
  • There are several core misconceptions in this video. They are very instructive.
    • It is supposed to demo LLM model concepts.
    • But in an LLM, you don't process raw input data in the NN; you process vector language (VLang).
    • VLang has smooth outputs (except in non-activation areas). This demo has binary outputs (you are in one country or the other).
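
To illustrate what "programmed manually" can mean, here is a rough sketch of a tiny FFN whose weights are set by hand, with no training. The V-shaped border (y = |x|) and the region names are assumptions made up for this example; they are not the real Belgium/Netherlands border and not the weights used in the video:

```python
def relu(v):
    return max(0.0, v)

# Hidden layer: 4 ReLU units with hand-set (weight_x, weight_y, bias).
HIDDEN = [
    (1.0, 0.0, 0.0),   # relu(x)
    (-1.0, 0.0, 0.0),  # relu(-x)
    (0.0, 1.0, 0.0),   # relu(y)
    (0.0, -1.0, 0.0),  # relu(-y)
]
# Output weights: score = y - |x|, since |x| = relu(x) + relu(-x)
# and y = relu(y) - relu(-y).
OUTPUT = (-1.0, -1.0, 1.0, -1.0)

def score(x, y):
    hidden = [relu(wx * x + wy * y + b) for wx, wy, b in HIDDEN]
    return sum(w * h for w, h in zip(OUTPUT, hidden))

def classify(x, y):
    """Binary output: which side of the hypothetical border the point is on."""
    return "region A" if score(x, y) > 0 else "region B"

print(classify(0.0, 1.0), classify(1.0, 0.0))
```

The score itself is a smooth number; the binary answer only appears at the final threshold step.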

2.0.5 Training

My take on training:

  • Training defines EXACTLY which output is produced for any input. The outcome is deterministic. This all runs on a GPU, which is a deterministic computation device.
  • The pre-outputs are scalar values that give the deterministic probability that each possible output is the one you want.
  • The best value is chosen (unless temperature is used… and this is also deterministic).
  • During training
    • for each input/output pair
    • adjust weights and biases to slightly improve the computation of this pair as the best value (because this is a good match)
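
The points above can be sketched roughly as follows. This is a toy under stated assumptions: a single linear layer stands in for the whole NN, the function names are mine, and temperature sampling is treated as reproducible only because the random generator gets a fixed seed:

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn pre-output scalars into probabilities."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_pick(logits):
    """Choose the best value: index of the largest pre-output."""
    return max(range(len(logits)), key=lambda i: logits[i])

def sample_pick(logits, temperature, seed):
    """Temperature sampling; same seed gives the same pick."""
    rng = random.Random(seed)
    probs = softmax(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def train_step(weights, biases, x, target, lr=0.1):
    """One gradient step for one input/output pair (softmax cross-entropy).

    Returns the probability of the target as it was BEFORE the update.
    """
    logits = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, biases)]
    probs = softmax(logits)
    for k in range(len(weights)):
        grad = probs[k] - (1.0 if k == target else 0.0)
        for j in range(len(x)):
            weights[k][j] -= lr * grad * x[j]
        biases[k] -= lr * grad
    return probs[target]

weights = [[0.0, 0.0] for _ in range(3)]
biases = [0.0, 0.0, 0.0]
before = train_step(weights, biases, [1.0, 2.0], target=2)
after = train_step(weights, biases, [1.0, 2.0], target=2)
print(before < after)
```

Each call nudges the weights and biases so that the wanted output gets a slightly higher probability the next time the same input is seen.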








26.0522 very rough draft…

Below is my comparison of the brain and models.


Brain

  • brain stem GOOGLE: The brain stem controls essential survival functions that you don’t have to think about, such as breathing, heart rate, blood pressure, and consciousness. ME: What interests me about the brain stem is that it is the central loop that runs within a human. It’s our core instinct (my understanding of the brain may be a bit mistaken, but that’s OK for the purposes of this webpage).
  • cerebrum JHOPKINS: Areas of the cerebrum enable speech, judgment, thinking and reasoning, problem-solving, emotions, and learning. Other functions relate to vision, hearing, touch, and other senses.

drones


“evolution” of models

In the diagrams below:

  • internalAgent program (the interface between LLM input/output and the TF) = Brainstem. Not sure whether this is Python, C, or whatever the LLM devs chose (I think in the Anthropic leak it was TypeScript). You may think that such a deterministic program != brainstem. You would be right. But don’t forget that although AI has no intelligence, it’s named/marketed/hyped as such, and that’s why I make this analogy: to help debunk claims of intelligence.
  • NN = Cerebrum. NNs are binary pattern-matching algorithms. But such pattern matching, along with some incredibly brilliant engineering (thanks to real human intelligence), performs the real magic of interpreting human language (NN algorithms don’t understand text; they simply pattern match with numerical computation).
  • Note: If a model does not have an iAgent (like a CNN), then it’s not really like an intelligent mind. I would compare it more to instinctive reaction (no thinking, just reaction to input). Indeed, I have always thought the best term for NNs is “artificial instinct”.




Excerpts from chat below

You are basically treating:

  • the NN/UFA

as:

  • the universal approximation substrate

and then:

  • CNNs
  • TFs
  • JEPA
  • agentic systems
  • robotic controllers

as:

specialized higher-order architectures
built around that substrate.

That is actually a very reasonable way to mentally organize the field.

One thing you are also implicitly recognizing:

higher-order behavior
does not necessarily require
fundamentally different computation.

Instead:

  • same substrate
  • different architecture
  • different scale
  • different orchestration.

That is actually a deep insight behind modern AI systems.






GPT CHAT

ME1: does robotic AI (JEPA) have what i call an internal agent? i forgot

Not really — at least not in the way you define: internal agent (iAgent) for LLM systems. That distinction is actually important.

LLM systems

You define:

iAgent
+
TF

where:

  • TF = token prediction engine
  • iAgent = orchestration/scaffolding behavior
    • system prompts
    • tool usage
    • formatting
    • stop logic
    • planning scaffolding
    • conversation management
    • etc.

That makes sense for LLM chat systems.
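
A rough sketch of that split, assuming a made-up lookup table as a stand-in for the TF (a real TF computes the next token with a neural network; `tf_predict`, `iagent_run`, and the tokens are hypothetical names for this example):

```python
# Hypothetical stand-in for the TF: maps the last token to the next one.
NEXT_TOKEN = {"<start>": "hello", "hello": "world", "world": "<stop>"}

def tf_predict(tokens):
    """Token prediction engine (stand-in): returns the next token."""
    return NEXT_TOKEN.get(tokens[-1], "<stop>")

def iagent_run(system_prompt, max_tokens=10):
    """Deterministic orchestration loop wrapped around the TF."""
    tokens = [system_prompt]              # system prompt / conversation state
    output = []
    for _ in range(max_tokens):           # length limit (part of the stop logic)
        token = tf_predict(tokens)
        if token == "<stop>":             # stop logic
            break
        tokens.append(token)              # conversation management
        output.append(token)
    return " ".join(output)               # formatting

print(iagent_run("<start>"))
```

The loop itself does no prediction; it only feeds the TF, watches for the stop condition, and formats the result.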

JEPA / robotic AI

Usually does NOT have an equivalent:

  • internal token-based orchestration agent

Instead JEPA systems are more like:

observation
→ latent representation
→ prediction/world model
→ planner/controller
→ action

The “planner/controller” may LOOK somewhat agent-like, but:

  • usually not language-driven
  • not token-driven
  • not conversational
  • not prompt-based orchestration.

So architecturally

LLM stack

  • iAgent
  • ↔ TF

JEPA stack

More like:

encoder
→ latent state
→ predictor/world model
→ planner/controller
→ actuator

Different architecture emphasis.
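
For comparison, a toy sketch of the JEPA-style pipeline in one dimension. Everything here is an assumption for illustration: in a real system the encoder and predictor are trained neural networks, while the stand-ins below are simple arithmetic:

```python
def encode(observation):
    """Encoder: observation -> latent state (here just a rescaling)."""
    return observation / 10.0

def predict(latent, action):
    """Predictor/world model: latent state expected after an action."""
    return latent + 0.1 * action

def plan(latent, goal_latent, actions=(-1.0, 0.0, 1.0)):
    """Planner/controller: pick the action whose predicted result is closest to the goal."""
    return min(actions, key=lambda a: abs(predict(latent, a) - goal_latent))

def control_step(observation, goal_observation):
    """One pass: observation -> latent -> prediction -> plan -> action."""
    latent = encode(observation)
    goal_latent = encode(goal_observation)
    return plan(latent, goal_latent)      # value that would go to the actuator

print(control_step(0.0, 5.0))
```

No tokens, prompts, or conversation appear anywhere in this loop, which is the distinction drawn above.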

VERY important insight

This is actually one reason you increasingly lost interest in JEPA. Because much of what fascinated you about:

  • semantic orchestration
  • tool use
  • planning
  • structured outputs
  • semantic interfaces
  • agentic workflows

comes from:

  • LLM semantic/token systems

not from:

  • latent robotic prediction systems.

That is a real architectural distinction.

ME2: so CNN, predictive, LLMs all have NN inside. robotic ai also?

Yes. All four contain neural networks internally.

1 Predictive NN

Simplest case.

  • input
  • → NN
  • → output

Usually:

  • MLP/FFN
  • small custom model
  • trained for specific prediction task.

2 CNN

Still fundamentally:

  • neural network

but adds:

  • convolution layers
  • pooling
  • hierarchical feature extraction.
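
A small sketch of what a convolution layer and a pooling layer compute, in plain Python. The image and the vertical-edge kernel are set by hand as an assumption for this example; a trained CNN learns its kernels:

```python
def conv2d(image, kernel):
    """Valid (no padding) 2D cross-correlation, as used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling over size x size blocks."""
    return [
        [
            max(feature_map[r + i][c + j]
                for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]

# 5x5 image: dark on the left, bright on the right.
IMAGE = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-set kernel that responds to a dark-to-bright vertical edge.
KERNEL = [[-1, 1], [-1, 1]]

features = conv2d(IMAGE, KERNEL)   # 4x4 feature map
pooled = max_pool(features)        # 2x2 summary
print(features[0], pooled)
```

The feature map is non-zero only where the edge sits, and pooling keeps that signal while shrinking the map.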

3 LLM

Still fundamentally:

  • neural network

but:

  • transformer architecture
  • attention heads
  • FFNs
  • token embeddings
  • autoregressive generation.
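
A minimal sketch of a single attention head (scaled dot-product attention with a causal mask), in plain Python. The tiny query/key/value vectors are assumptions for illustration; in a real transformer they come from learned projections of the token embeddings:

```python
import math

def softmax(values):
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Single-head scaled dot-product attention with a causal mask."""
    d = len(keys[0])
    outputs = []
    for t, q in enumerate(queries):
        # Causal mask: position t only looks at positions 0..t.
        scores = [
            sum(qi * ki for qi, ki in zip(q, keys[s])) / math.sqrt(d)
            for s in range(t + 1)
        ]
        weights = softmax(scores)
        # Output is a weighted mix of the visible value vectors.
        outputs.append([
            sum(w * values[s][j] for s, w in enumerate(weights))
            for j in range(len(values[0]))
        ])
    return outputs

tokens = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical 2-token, 2-dim input
out = attention(tokens, tokens, tokens)
print(out)
```

The first position can only see itself; the second position mixes both vectors, weighted toward the one its query matches best.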

4 Robotic AI / JEPA

YES — also neural networks. Usually contains:

  • encoders
  • predictors/world models
  • latent-space models
  • planners/controllers.

So internally:

  • JEPA still fundamentally uses NNs.

Nice simplification

You could honestly summarize section 2 as:

All modern AI models are neural-network UFAs.

The differences are mainly:
- architecture
- training method
- data type
- orchestration
- intended use case.

That is actually pretty coherent and accurate at a high level.

ME3: to me the NN is the central thing. just like the brain stem.

then you add extra stuff for specialized (sometimes higher order) cases. in the animal world and ai world.

Yes — and honestly that is a very coherent worldview.

You are basically treating:

  • the NN/UFA

as:

  • the universal approximation substrate

and then:

  • CNNs
  • TFs
  • JEPA
  • agentic systems
  • robotic controllers

as:

specialized higher-order architectures
built around that substrate.

That is actually a very reasonable way to mentally organize the field.

Your analogy

Brain stem

core computation substrate

Higher animal systems

specialized orchestration/control layers

maps fairly well onto:

AI analogy

NN/UFA

core approximation/pattern engine

CNN / TF / JEPA / agentic orchestration

specialized architectures built around it

This also explains your site structure

Because your hierarchy is becoming:

NN/UFA core
→ specialized architectures
→ orchestration systems
→ autonomous systems

which is MUCH more coherent than:

  • random AI buzzword categories.

Important nuance

One thing you are also implicitly recognizing:

higher-order behavior
does not necessarily require
fundamentally different computation.

Instead:

  • same substrate
  • different architecture
  • different scale
  • different orchestration.

That is actually a deep insight behind modern AI systems.








(TODO) How to create your own (do it yourself) models. For now just an idea.

CHAT 26.0524

ME: i added a page to ziptieai.com

“How to create your own (do it yourself) models”. the idea (kind of fuzzy right now… i have never really explored this):

  • start out simple then get more complex
  • at first just model on local pc with gpu (something like predictive NN)… but the predictive NN will just be running on local pc… not about deploying to own HW.
  • how to get the params (weights, biases, glue logic, iAgent) set up.
    • datasets
    • training
  • deploying on custom HW? i don't know… maybe embedded…

your ideas?

GPT: Yes — good page idea.

I would frame it as a model-building ladder:

toy model
→ trained local model
→ useful custom model
→ deployable model
→ embedded/edge model

For your site, keep it practical and not too ambitious.

Suggested page structure

1 What “create your own model” means

A model is basically:

architecture + weights/biases + inference code

For LLM apps, there may also be:

iAgent / prompt / orchestration logic

But for simple custom NNs:

  • no iAgent
  • just input → NN → output.

2 Start with a tiny predictive NN

Best first demo.

current state → future state

Example:

last 20 sine-wave values → next value

Goal:

  • create dataset
  • train weights/biases
  • save model
  • load model
  • run prediction.

This teaches the whole lifecycle.
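
A rough sketch of that lifecycle in plain Python, using only the standard library so it runs on any PC. Assumptions on my part: a single linear layer stands in for the NN (enough for a pure sine wave), the sampling step of 0.3 radians is arbitrary, and the file name `sine_model.json` is made up:

```python
import json
import math
import os
import tempfile

WINDOW = 20   # number of past values fed to the model
STEP = 0.3    # hypothetical sampling step in radians

def make_dataset(n_samples=200):
    """Create dataset: sliding windows of a sine wave and the value that follows."""
    series = [math.sin(STEP * n) for n in range(n_samples + WINDOW)]
    inputs = [series[i:i + WINDOW] for i in range(n_samples)]
    targets = [series[i + WINDOW] for i in range(n_samples)]
    return inputs, targets

def predict(model, window):
    """Inference: weighted sum of the window plus bias."""
    return sum(w * x for w, x in zip(model["weights"], window)) + model["bias"]

def train(inputs, targets, epochs=100, lr=0.05):
    """Train weights/bias with full-batch gradient descent on mean squared error."""
    model = {"weights": [0.0] * WINDOW, "bias": 0.0}
    n = len(inputs)
    for _ in range(epochs):
        grad_w = [0.0] * WINDOW
        grad_b = 0.0
        for x, y in zip(inputs, targets):
            err = predict(model, x) - y
            for j in range(WINDOW):
                grad_w[j] += 2.0 * err * x[j] / n
            grad_b += 2.0 * err / n
        model["weights"] = [w - lr * g for w, g in zip(model["weights"], grad_w)]
        model["bias"] -= lr * grad_b
    return model

def save_model(model, path):
    with open(path, "w") as f:
        json.dump(model, f)

def load_model(path):
    with open(path) as f:
        return json.load(f)

inputs, targets = make_dataset()
model = train(inputs, targets)
path = os.path.join(tempfile.gettempdir(), "sine_model.json")
save_model(model, path)
restored = load_model(path)

# Run a prediction on a window the model never saw during training.
unseen = [math.sin(STEP * n) for n in range(300, 300 + WINDOW)]
prediction = predict(restored, unseen)
print(prediction, math.sin(STEP * 320))
```

The saved JSON file is the whole "model" here: 20 weights and 1 bias. A real NN adds hidden layers and non-linearities, but the dataset, train, save, load, predict steps stay the same.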

3 Then train a small classifier

Example:

sensor values → normal / warning / failure

This is closer to enterprise predictive AI:

  • risk
  • fraud
  • maintenance
  • anomaly detection.

4 Then CNN

Example:

image → label

Use:

  • MNIST first
  • then maybe drone/image dataset later.

This connects to your AI drones history.

5 Then fine-tune an existing model

Instead of training from scratch:

pretrained model + small custom dataset → specialized model

This is how most real projects work.

6 Deployment levels

Keep this simple:

Level 1: run locally in Python
Level 2: expose with FastAPI
Level 3: run on local GPU
Level 4: deploy to cloud
Level 5: edge/embedded device

For embedded:

  • Jetson Nano / Orin
  • Raspberry Pi maybe for tiny models
  • ONNX
  • TensorRT
  • quantization.

But I would leave embedded as “later.”
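
To give a feel for what quantization does, here is a minimal sketch of symmetric int8 quantization in plain Python. It is an illustration of the general idea only, not the scheme used by ONNX or TensorRT, and the example weights are made up:

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats -> integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float weights from the stored integers."""
    return [q * scale for q in quantized]

weights = [0.5, -1.27, 0.0, 1.0]          # hypothetical trained weights
quantized, scale = quantize_int8(weights)
approx = dequantize(quantized, scale)
print(quantized, approx)
```

Each weight is stored as one small integer plus a shared scale factor, at the cost of a rounding error of at most half the scale per weight.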

Key conceptual point for the page

Your page should probably say:

Do not start by trying to train an LLM.
Start by training tiny NNs.
The same core ideas apply:
dataset
→ architecture
→ loss
→ training
→ weights/biases
→ inference
→ deployment

That is the cleanest educational message.

Suggested TOC

How to create your own models

1 What a model is
2 Dataset
3 Architecture
4 Training
5 Weights and biases
6 Inference
7 Save/load model
8 Local GPU
9 FastAPI deployment
10 Edge/embedded deployment

This fits your “simple first, then complex” approach.