GitHub - systemshift/Tera

TERA

Terrestrial Extendable Retrieval & Addressing

A semantic content discovery network with cryptographic integrity guarantees.

What is TERA?

TERA is a peer-to-peer network that enables similarity-based content discovery while preventing spam through cryptographic verification. Unlike traditional content-addressed networks (like IPFS) that require exact content hashes, TERA allows you to find "similar" content with provable integrity.

The Core Innovation

Traditional DHTs have a fundamental limitation: they can only verify exact content identity. TERA introduces a dual-output hash function that provides:

  1. H_crypto - A homomorphic hash supporting O(1) incremental extensions
  2. H_semantic - Universal neural kernels for multi-modal similarity

This enables integrity-gated semantic search: nodes can verify that content legitimately extends a root hash while filtering spam based on relevance.

Key Properties

  • Universal: Works with any content type (text, images, code, audio, binary)
  • Extendable: Add content to a collection in O(1) time without recomputing the entire hash
  • Verifiable: Cryptographically prove that content B extends content A
  • Spam-resistant: Invalid extensions are automatically rejected by the network
  • Parameterized: Users define their own notion of "similarity" via runtime kernel parameters
  • Model-agnostic: Neural kernels are content-addressed and exchangeable (like codecs)

How It Works

Traditional IPFS:
Content → SHA-256 → Exact lookup → Retrieve

TERA:
Content → (H_crypto, H_semantic) → Similarity search + Verification → Discover

Example:

// Extract universal features (works for text, images, code, etc.)
features, _ := semantic.ExtractFeatures(content, filename)
// → FeatureVector{Modality: "text", Data: [512]float32, Hash: "..."}

// Load a neural kernel by content ID (cached locally like a codec)
registry := semantic.NewKernelRegistry("~/.tera/kernels")
kernel, _ := registry.Get("bafyreisemantic...")  // IPLD CID

// Compute similarity with runtime parameters
params := semantic.KernelParams{
    WeightSemantic:   0.7,
    WeightLexical:    0.3,
    WeightStructural: 0.1,
    Threshold:        0.6,
}
similarity, _ := kernel.ComputeSimilarity(featuresA, featuresB, params)

// Create extendable content with cryptographic proof
root := tera.NewContent(features)
extended := root.Extend(newFeatures)  // O(1) verification

// Network forwards only if:
// 1. H_crypto verifies (legitimate extension)
// 2. Neural kernel similarity exceeds threshold (relevant)
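The two forwarding conditions above can be collapsed into a single gatekeeping predicate. This is a sketch with hypothetical types (the real message shape lives in network/ and core/ and may differ):

```go
package main

import "fmt"

// Message is a hypothetical wire shape for illustration.
type Message struct {
	Root       string  // claimed root hash
	Extended   string  // homomorphic hash after the extension
	Similarity float64 // neural-kernel score against this node's interests
}

// shouldForward is the gatekeeping rule: forward only when the
// extension verifies cryptographically AND clears the relevance bar.
func shouldForward(m Message, verifies func(root, extended string) bool, threshold float64) bool {
	return verifies(m.Root, m.Extended) && m.Similarity >= threshold
}

func main() {
	// Stub verifier that accepts one known-good (root, extension) pair.
	stub := func(root, ext string) bool { return root == "r1" && ext == "r1+x" }

	ok := Message{Root: "r1", Extended: "r1+x", Similarity: 0.8}
	spam := Message{Root: "r1", Extended: "forged", Similarity: 0.99}
	offTopic := Message{Root: "r1", Extended: "r1+x", Similarity: 0.2}

	fmt.Println(shouldForward(ok, stub, 0.6))       // true
	fmt.Println(shouldForward(spam, stub, 0.6))     // false: fails H_crypto check
	fmt.Println(shouldForward(offTopic, stub, 0.6)) // false: below similarity threshold
}
```

Note that a high similarity score cannot rescue a forged extension: integrity gates first, so spam with attractive content is still dropped.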

Neural Kernels: The Codec Analogy

TERA solves the heterogeneous model problem in distributed systems. In a P2P network, different nodes may use different LLMs (Claude, GPT-4, local models), which produce incompatible embeddings. TERA's solution:

Content-Addressed Kernels (like audio/video codecs):

  • Small neural networks (100KB-1MB) for computing similarity
  • Referenced by IPLD CID (e.g., bafyreisemantic...)
  • Cached locally, downloaded on-demand
  • Multiple kernels coexist (like having H.264, VP9, AV1)

Universal Features (model-agnostic):

  • Fixed-size vectors (512 floats) for all content types
  • Extracted deterministically (no training needed)
  • Text: TF-IDF + n-grams, Images: color/edge histograms, Code: token frequency
  • Lightweight enough to transmit over network
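The "deterministic, no training needed" property can be illustrated with hashed character n-grams: every trigram is hashed into one of 512 bins and the vector is L2-normalized so cosine similarity is just a dot product. This is a sketch of the general technique, not the exact recipe in semantic/features_universal.go; the trigram size and FNV hash are illustrative choices.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math"
)

// textFeatures maps any string to a fixed 512-float vector by hashing
// character trigrams into bins, then L2-normalizing. Deterministic:
// the same input always yields the same vector, on every node.
func textFeatures(s string) [512]float32 {
	var v [512]float32
	for i := 0; i+3 <= len(s); i++ {
		h := fnv.New32a()
		h.Write([]byte(s[i : i+3]))
		v[h.Sum32()%512]++
	}
	var norm float64
	for _, x := range v {
		norm += float64(x * x)
	}
	if norm > 0 {
		n := float32(math.Sqrt(norm))
		for i := range v {
			v[i] /= n
		}
	}
	return v
}

// cosine of two unit vectors is their dot product.
func cosine(a, b [512]float32) float32 {
	var dot float32
	for i := range a {
		dot += a[i] * b[i]
	}
	return dot
}

func main() {
	a := textFeatures("neural networks and deep learning")
	b := textFeatures("deep learning with neural networks")
	c := textFeatures("grilled cheese sandwich recipe")
	fmt.Println(cosine(a, b) > cosine(a, c)) // related texts score higher
}
```

Because extraction is pure hashing, two nodes running different LLMs still produce byte-identical feature vectors for the same content, which is what makes the vectors safe to transmit and compare across the network.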

Runtime Parameters (high dexterity):

  • Users tune kernel behavior at query time
  • No need to recompute features or retrain models
  • Example: Adjust semantic vs lexical focus per query

This design means nodes can share a "taste" (kernel) without forcing everyone to use the same LLM.
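The runtime-parameter idea can be sketched as a weighted blend of per-channel scores. The field names mirror the KernelParams example above, but the blending formula here is a plausible illustration, not necessarily what a given kernel implements:

```go
package main

import "fmt"

// KernelParams mirrors the README's example shape.
type KernelParams struct {
	WeightSemantic, WeightLexical, WeightStructural float64
	Threshold                                       float64
}

// blend combines already-computed channel scores at query time.
// Changing the weights changes the "taste" of a query without
// re-extracting features or retraining anything.
func blend(semantic, lexical, structural float64, p KernelParams) float64 {
	total := p.WeightSemantic + p.WeightLexical + p.WeightStructural
	return (p.WeightSemantic*semantic + p.WeightLexical*lexical + p.WeightStructural*structural) / total
}

func main() {
	semanticHeavy := KernelParams{WeightSemantic: 0.7, WeightLexical: 0.3, WeightStructural: 0.1}
	lexicalHeavy := KernelParams{WeightSemantic: 0.1, WeightLexical: 0.8, WeightStructural: 0.1}

	// Identical channel scores, different verdicts, purely from parameters:
	// high semantic overlap (0.9) but low lexical overlap (0.2).
	fmt.Println(blend(0.9, 0.2, 0.5, semanticHeavy) > blend(0.9, 0.2, 0.5, lexicalHeavy)) // true
}
```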

Architecture

┌─────────────────────────────────────────┐
│ Application Layer                       │
│ (Queries, Publications, Subscriptions)  │
└─────────────────────────────────────────┘
                  ↓
┌─────────────────────────────────────────┐
│ Gossip Protocol + Gatekeeping           │
│ (Forward if: valid_extension ∧ similar) │
└─────────────────────────────────────────┘
                  ↓
┌──────────────────┬──────────────────────┐
│ H_crypto         │ H_semantic           │
│ (Homomorphic)    │ (Neural Kernels)     │
│ Integrity ✓      │ Discovery ✓          │
└──────────────────┴──────────────────────┘
                  ↓
┌──────────────────┬──────────────────────┐
│ Storage Layer    │ Kernel Registry      │
│ (BadgerDB)       │ (IPLD Cache)         │
└──────────────────┴──────────────────────┘
                  ↓
┌─────────────────────────────────────────┐
│ libp2p Transport Layer                  │
└─────────────────────────────────────────┘

Project Status

Phase 1: Primitives (Complete ✓)

  • Homomorphic hash implementation (crypto/)
  • Universal feature extraction (semantic/features_universal.go)
  • Native neural kernels (semantic/neural.go)
  • IPLD kernel descriptors (semantic/kernel_model.go)
  • Gatekeeping logic (core/)
  • Working demo (examples/demo.go)

Phase 2: Network (Complete ✓)

  • libp2p integration (network/)
  • Gossip protocol with pubsub
  • Basic node implementation
  • CLI tool (tera-node)

Phase 3: Storage & Kernels (In Progress)

  • Extension graph storage (storage/)
  • BadgerDB persistence layer
  • Kernel registry with caching
  • Multi-modal feature extraction
  • Content discovery API
  • Interest subscription mechanism
  • IPFS integration for content
  • DHT for peer discovery
  • Kernel training utilities

Quick Start

# Run tests
go test ./...

# Run demo (local simulation)
go run examples/demo.go

# Build and run network node
go build -o tera-node ./cmd/tera-node

# Start bootstrap node
./tera-node -port 9000 -interests "machine learning"

# Start second node (in another terminal)
# Replace <PEER-ID> with the peer ID from bootstrap node
./tera-node -port 9001 \
  -bootstrap "/ip4/127.0.0.1/tcp/9000/p2p/<PEER-ID>" \
  -interests "machine learning"

# In the node shell, publish content:
> publish neural networks and deep learning algorithms
> stats
> peers

Why "Terrestrial"?

It's a pun: IPFS is the "InterPlanetary" File System, so naturally this is the "Terrestrial" version. The priorities are deliberately inverted: we're focused on Earth-bound problems like spam, discovery, and semantic search rather than interplanetary data transfer.

Related Work

  • IPFS: Content-addressed storage (exact matching only)
  • DHT2VEC (2020): Early exploration of semantic DHTs
  • Kademlia: XOR-metric DHT routing (no semantic awareness)

TERA combines ideas from content-addressed networks, homomorphic cryptography, and kernel methods to create something new: semantic content addressing with integrity.

License

BSD 3-Clause License (see LICENSE file)

Contributing

This project is in early development. Contributions welcome once the core primitives are stable.
