Reka et al., 2025 - Google Patents

Multi-Modal 3D Mesh Reconstruction from Images and Text

Document ID
10602687320061270706
Author
Reka M
Pulli T
Vincze M
Publication year
2025
Publication venue
arXiv preprint arXiv:2503.07190

Snippet

6D object pose estimation for unseen objects is essential in robotics, but it traditionally relies on trained models that require large datasets, incur high computational costs, and struggle to generalize. Zero-shot approaches eliminate the need for training but depend on pre-existing …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/6217 Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06K9/6232 Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods
    • G06K9/6247 Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods based on an approximation criterion, e.g. principal component analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10024 Color image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30108 Industrial image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00221 Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
    • G06K9/00268 Feature extraction; Face representation
    • G06K9/00281 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20112 Image segmentation details
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/6201 Matching; Proximity measures
    • G06K9/6202 Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/36 Image preprocessing, i.e. processing the image information without deciding about the identity of the image
    • G06K9/46 Extraction of features or characteristics of the image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2200/00 Indexing scheme for image data processing or generation, in general
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration, e.g. from bit-mapped to bit-mapped creating a similar image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/10 Geometric effects
    • G06T15/20 Perspective computation

Similar Documents

Publication, Publication Date, Title
Yin et al. Towards accurate reconstruction of 3d scene shape from a single monocular image
Ren et al. Benchmarking and analyzing point cloud classification under corruptions
Zioulis et al. Omnidepth: Dense depth estimation for indoors spherical panoramas
Li et al. 3D IoU-Net: IoU guided 3D object detector for point clouds
Ye et al. Nef: Neural edge fields for 3d parametric curve reconstruction from multi-view images
CN103430218A (en) Method of augmented makeover with 3d face modeling and landmark alignment
Jeon et al. Struct-mdc: Mesh-refined unsupervised depth completion leveraging structural regularities from visual slam
Ramon et al. Multi-view 3d face reconstruction in the wild using siamese networks
Yan et al. Occlusion-aware unsupervised light field depth estimation based on multi-scale GANs
Chang et al. EI-MVSNet: Epipolar-guided multi-view stereo network with interval-aware label
Choi et al. Tmo: Textured mesh acquisition of objects with a mobile device by using differentiable rendering
CN114972539A (en) On-line calibration method, system, computer equipment and medium for camera plane in computer room
Piao et al. Dynamic fusion network for light field depth estimation
Bahat et al. Neural volume super-resolution
Jiang et al. Rrt-mvs: Recurrent regularization transformer for multi-view stereo
CN114612798B (en) Satellite image tampering detection method based on Flow model
Pini et al. Learning to generate facial depth maps
Reka et al. Multi-Modal 3D Mesh Reconstruction from Images and Text
Tombari et al. Evaluation of stereo algorithms for 3d object recognition
Sun et al. Intern-gs: Vision model guided sparse-view 3d gaussian splatting
Salimi et al. Geometry-aware diffusion models for multiview scene inpainting
Sheng et al. Rendering-enhanced automatic image-to-point cloud registration for roadside scenes
CN119625183A (en) A three-dimensional head model reconstruction method and device, and electronic equipment
Ge et al. Geobench: Benchmarking and analyzing monocular geometry estimation models
Achaibou et al. Guided Depth Inpainting in ToF Image Sensing Based on Near Infrared Information