A high-performance ONNX-based ranking inference service built with C++ and Go.
Longmen provides real-time ranking inference using ONNX models with efficient sparse embedding lookup and graceful resource management.
## Project Structure

```
longmen/
├── main.go # Service entry point
├── api/ # gRPC API definitions
│ ├── api.proto # Protocol buffer definitions
│ ├── api.pb.go # Generated Go code
│ ├── api_grpc.pb.go # Generated gRPC code
│ └── proto.sh # Script to generate protobuf code
├── services/ # Service layer - gRPC/HTTP handlers
│ └── services.go
├── ranker/ # Ranking engine - inference logic (Go + CGO)
│ └── ranker.go
├── config/ # Configuration management
│ └── config.go
├── third/ # Third-party C++ libraries
│ ├── CMakeLists.txt # CMake build configuration
│ ├── longmen/ # C++ inference engine
│ └── minia/ # Minia library
├── conf/ # Configuration files
│ └── local/ # Local environment config
├── dist/ # Distribution directory
│ ├── conf/ # Deployed config
│ ├── logs/ # Log files
│ └── longmen # Compiled binary
├── test/ # Test scripts
│ └── test.py
└── makefile # Build automation
```

## Prerequisites

- Go 1.24+
- C++20 compiler
- ONNX Runtime
- CMake 3.16+
- Etcd (for service registration)
## Building

Build the third-party C++ libraries first:

```bash
cd third
mkdir -p build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j$(nproc)
make install
cd ../..
```

Then build the service binary:

```bash
make prod
```

## Configuration

Create a configuration file at `conf/local/config.toml`:

```toml
[server]
project_name = "longmen"
grpc_port = 9527 # gRPC service port
http_port = 9528 # HTTP API port
prome_port = 9529 # Prometheus metrics port
pprof_port = 9530 # pprof profiling port
debug = true # Enable debug mode
[server.register.etcd]
name = "test" # Service name for registration
endpoints = ["http://127.0.0.1:2379"] # Etcd endpoints
[env]
workdir = "/tmp/dnnrec_dump" # Working directory for temp files
itemdir = "/tmp/items" # Item data directory[server]
- `project_name`: Service identifier
- `grpc_port`: gRPC service port (default: 9527)
- `http_port`: HTTP API port (default: 9528)
- `prome_port`: Prometheus metrics port (default: 9529)
- `pprof_port`: Go pprof profiling port (default: 9530)
- `debug`: Enable debug logging
#### [server.register.etcd]
- `name`: Service registration name in Etcd
- `endpoints`: Etcd server addresses
#### [env]
- `workdir`: Directory for temporary files
- `itemdir`: Directory containing item data
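For orientation only, the layout above maps naturally onto Go structs. The sketch below is a hypothetical illustration of how `config/config.go` might model and load it, assuming a TOML decoder such as `github.com/BurntSushi/toml`; the struct, field, and function names are assumptions, not the actual implementation.

```go
package config

import "github.com/BurntSushi/toml"

// Config mirrors the TOML layout documented above (names are hypothetical).
type Config struct {
	Server struct {
		ProjectName string `toml:"project_name"`
		GRPCPort    int    `toml:"grpc_port"`
		HTTPPort    int    `toml:"http_port"`
		PromePort   int    `toml:"prome_port"`
		PprofPort   int    `toml:"pprof_port"`
		Debug       bool   `toml:"debug"`
		Register    struct {
			Etcd struct {
				Name      string   `toml:"name"`
				Endpoints []string `toml:"endpoints"`
			} `toml:"etcd"`
		} `toml:"register"`
	} `toml:"server"`
	Env struct {
		Workdir string `toml:"workdir"`
		Itemdir string `toml:"itemdir"`
	} `toml:"env"`
}

// Load decodes a TOML config file into a Config.
func Load(path string) (*Config, error) {
	var c Config
	if _, err := toml.DecodeFile(path, &c); err != nil {
		return nil, err
	}
	return &c, nil
}
```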
## Running the Service

```bash
./dist/longmen -config conf/local/config.toml
```

To write logs to a specific directory:

```bash
./dist/longmen -config conf/local/config.toml -log ./dist/logs
```

## Verifying the Service

```bash
# List registered gRPC services
grpcurl -plaintext localhost:9527 list

# Check the HTTP health endpoint
curl http://localhost:9528/health
# Check Prometheus metrics
curl http://localhost:9529/metrics
# Check pprof
curl http://localhost:9530/debug/pprof/
```
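The `-config` and `-log` flags used in the commands above are parsed at startup in `main.go`. The snippet below is only a minimal sketch of that flag handling, assuming the standard `flag` package; the default values and everything after `flag.Parse()` are assumptions.

```go
package main

import "flag"

func main() {
	// Flag names match the commands shown above; defaults are assumptions.
	configPath := flag.String("config", "conf/local/config.toml", "path to the TOML config file")
	logDir := flag.String("log", "./dist/logs", "directory for log output")
	flag.Parse()

	// From here the service would load the config, initialize the ranker,
	// register with Etcd, and start the gRPC/HTTP servers.
	_, _ = configPath, logDir
}
```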
## Client Usage

### gRPC (Go)

```go
import (
	"context"
	"log"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	pb "path/to/api"
)

// Connect to the gRPC port configured above (default 9527).
conn, err := grpc.Dial("localhost:9527", grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
	log.Fatalf("dial: %v", err)
}
defer conn.Close()

client := pb.NewRankServiceClient(conn)
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()

// features is the feature payload for the candidates to rank.
resp, err := client.Rank(ctx, &pb.RankRequest{
	Features: features,
})
if err != nil {
	log.Fatalf("rank: %v", err)
}
```

### HTTP

```bash
curl -X POST http://localhost:9528/api/rank \
-H "Content-Type: application/json" \
-d '{"features": [...]}'cd api
./proto.sh
```

### Running Tests

```bash
# Python test
python test/test.py
```

### Debug Mode

Set `debug = true` in the config file to enable:
- Verbose logging
- Request/response tracing
- Performance profiling
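For illustration only: request/response tracing of this kind is often implemented as a gRPC unary interceptor that logs when the debug flag is set. The sketch below is a hypothetical example under that assumption, not longmen's actual implementation; the interceptor name and wiring are made up.

```go
package services

import (
	"context"
	"log"

	"google.golang.org/grpc"
)

// debugInterceptor logs every request and response when debug mode is enabled.
func debugInterceptor(debug bool) grpc.UnaryServerInterceptor {
	return func(ctx context.Context, req interface{}, info *grpc.UnaryServerInfo,
		handler grpc.UnaryHandler) (interface{}, error) {
		if debug {
			log.Printf("request  %s: %+v", info.FullMethod, req)
		}
		resp, err := handler(ctx, req)
		if debug {
			log.Printf("response %s: %+v err=%v", info.FullMethod, resp, err)
		}
		return resp, err
	}
}
```

It would then be attached when constructing the server, e.g. `grpc.NewServer(grpc.UnaryInterceptor(debugInterceptor(cfg.Server.Debug)))`.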
## Production Deployment

Create the working directories:

```bash
mkdir -p /var/lib/longmen/{dump,items,logs}
```

Create `conf/prod/config.toml`:

```toml
[server]
project_name = "longmen"
grpc_port = 9527
http_port = 9528
prome_port = 9529
pprof_port = 9530
debug = false
[server.register.etcd]
name = "longmen-prod"
endpoints = ["http://etcd1:2379", "http://etcd2:2379"]
[env]
workdir = "/var/lib/longmen/dump"
itemdir = "/var/lib/longmen/items"./dist/longmen -config conf/prod/config.toml -log /var/lib/longmen/logstail -f dist/logs/longmen.log
# Check metrics
curl http://localhost:9529/metrics | grep longmen
# Memory profiling
go tool pprof http://localhost:9530/debug/pprof/heap
# CPU profiling
go tool pprof http://localhost:9530/debug/pprof/profile
```

## Graceful Shutdown

The service handles the following signals for graceful shutdown:
- SIGINT (Ctrl+C)
- SIGTERM
- SIGQUIT
All resources are properly cleaned up before exit.
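As a rough sketch of how such signal handling is typically wired in Go (not longmen's actual shutdown path; the cleanup steps are placeholders):

```go
package main

import (
	"log"
	"os"
	"os/signal"
	"syscall"
)

func main() {
	// ... start the gRPC/HTTP servers and load the model ...

	// Block until one of the handled signals arrives.
	sigCh := make(chan os.Signal, 1)
	signal.Notify(sigCh, syscall.SIGINT, syscall.SIGTERM, syscall.SIGQUIT)
	sig := <-sigCh
	log.Printf("received %v, shutting down", sig)

	// Placeholder cleanup: deregister from Etcd, stop the servers,
	// and release the C++ inference engine before exiting.
}
```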
## License

GNU Affero General Public License v3.0 (AGPL-3.0)