Abstract of: US20260154949A1
A reliability prediction method for an image classification neural network model includes: training a reliability model; and predicting, by the trained reliability model, a reliability of the image classification neural network model. Reflecting the training and testing of the image classification neural network model, input features of the reliability model include a model training factor and a model testing factor. The model training factor characterizes data and model factors affecting the reliability of the image classification neural network model. The model testing factor characterizes a test sufficiency of the image classification neural network model. An output of the reliability model is a reliability prediction result of the image classification neural network model.
Abstract of: US20260154956A1
An image processing method is disclosed. The method includes receiving an input image including at least one object, and classifying the at least one object in the input image using a first model based on an artificial neural network trained to classify objects into one of a plurality of predetermined categories. At least one second model corresponding to the classified category of the at least one object is determined from among a plurality of second models, each of which is based on an artificial neural network trained to output a specialized processing applied image specific to a respective category. An output image is obtained by inputting the input image, or a region thereof corresponding to the at least one object, into the determined at least one second model.
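The two-stage routing described above can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in rather than the patented implementation: the brightness rule plays the role of the first (classification) model, the `darken` and `brighten` functions play the role of category-specific second models, and the names `classify`, `SECOND_MODELS`, and `process` are invented for illustration.

```python
from typing import Callable, Dict, List

Image = List[List[float]]  # hypothetical grayscale image as nested lists in [0, 1]


def classify(image: Image) -> str:
    """Stand-in for the first model: labels an image by its mean brightness."""
    pixels = [p for row in image for p in row]
    return "bright" if sum(pixels) / len(pixels) >= 0.5 else "dark"


def darken(image: Image) -> Image:
    """Stand-in second model for the 'bright' category: halves every pixel."""
    return [[p * 0.5 for p in row] for row in image]


def brighten(image: Image) -> Image:
    """Stand-in second model for the 'dark' category: adds 0.25, capped at 1."""
    return [[min(1.0, p + 0.25) for p in row] for row in image]


# One specialized second model per predetermined category (hypothetical mapping).
SECOND_MODELS: Dict[str, Callable[[Image], Image]] = {
    "bright": darken,
    "dark": brighten,
}


def process(image: Image) -> Image:
    """Classify first, then route the image to the model for that category."""
    category = classify(image)
    second_model = SECOND_MODELS[category]
    return second_model(image)
```

Keeping the category-to-model mapping in a lookup table means a new category only needs a new table entry, which is one plausible reading of "determined from among a plurality of second models".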
Abstract of: US20260154840A1
Systems and methods for artificial intelligence (“AI”)-assisted radiographic detection of leadless implanted electronic devices (“LLIEDs”) are provided. In general, AI systems and methods are constructed and implemented to provide a highly accurate C model that can assist physicians and support personnel with radiographic (e.g., chest x-ray) detection of the presence (or absence) of any LLIED, and the localization of any detected LLIED(s), prior to performing a scheduled or emergency MRI examination. A two-tier cascading neural network methodology is used to detect the locations of LLIEDs in the first tier and to classify or otherwise identify the type of detected LLIEDs in the second tier.
Abstract of: US20260154963A1
Apparatuses, systems, and techniques to generate an image. In at least one embodiment, one or more neural networks are to generate a second image based, at least in part, on a first image and information indicating zero or more differences between the first and second image.
Abstract of: US20260154982A1
Methods, apparatus, systems, and articles of manufacture are disclosed to tag segments in a document. An example apparatus includes processor circuitry to execute machine readable instructions to generate node embeddings for nodes of a graph, the node embeddings based on features extracted from text segments detected in a document, the text segments to be represented by the nodes of the graph; sample edges corresponding to the nodes to generate the graph; generate first updated node embeddings by passing the node embeddings and the graph through layers of a graph neural network, the first updated embeddings corresponding to the node embeddings augmented with neighbor information; generate second updated node embeddings by passing the first updated embeddings through layers of a recurrent neural network, the second updated embeddings corresponding to the first updated node embeddings augmented with sequential information; and classify the text segments based on the second updated node embeddings.
Abstract of: US20260153862A1
A method for anomaly detection in an operational asset includes collecting a source domain dataset corresponding to a first operating condition of the operational asset, wherein samples from the source domain dataset belong to a healthy class and a faulty class; collecting a target domain dataset corresponding to a second operating condition of the operational asset, wherein samples from the target domain dataset belong to the healthy class; inputting the source domain dataset and the target domain dataset as input data into a neural network; extracting, by the neural network, features from the input data, wherein a first subset of features is discriminative of the healthy class and a second subset of features is domain invariant; reducing a dimensionality of the features into reduced features; and classifying the reduced features into a normal class and an anomaly class using a one-class classifier.
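The final step, one-class classification of the reduced features, can be sketched as follows. The abstract does not say which one-class classifier is used, so this example swaps in a simple centroid-distance rule as an assumption: it is fitted on healthy samples only, and the class name `CentroidOneClass` and the `margin` parameter are hypothetical.

```python
import math
from typing import List, Sequence


class CentroidOneClass:
    """Hypothetical one-class rule: a sample far from the healthy centroid is an anomaly."""

    def fit(self, healthy: Sequence[Sequence[float]], margin: float = 1.5) -> "CentroidOneClass":
        dim = len(healthy[0])
        # Centroid of the (already dimensionality-reduced) healthy features.
        self.centroid: List[float] = [
            sum(sample[i] for sample in healthy) / len(healthy) for i in range(dim)
        ]
        # Threshold: largest healthy distance, scaled by an assumed safety margin.
        self.threshold = margin * max(self._distance(s) for s in healthy)
        return self

    def _distance(self, sample: Sequence[float]) -> float:
        return math.dist(sample, self.centroid)

    def predict(self, sample: Sequence[float]) -> str:
        return "normal" if self._distance(sample) <= self.threshold else "anomaly"
```

For example, after `CentroidOneClass().fit([[0, 0], [2, 0], [0, 2], [2, 2]])`, a sample near `[1, 1]` is labeled normal and a distant sample such as `[5, 5]` is labeled an anomaly. Only healthy samples are needed for fitting, which matches the target domain containing healthy data only.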
Abstract of: WO2026116831A1
A method for operating an information processing system for automating a basic evaluation of a beneficiary of a welfare facility may comprise the steps of: acquiring state data of the beneficiary of the welfare facility; acquiring service provision history for the beneficiary; inputting the state data of the beneficiary and the service provision history to an artificial neural network model; acquiring state change pattern data of the beneficiary output by the artificial neural network model; and generating basic evaluation data on the basis of the state change pattern data.
Abstract of: WO2026116516A1
Provided are a method and system for integrating a CNN with lightweight artificial intelligence for on-device image classification. An on-device integrated artificial intelligence system, according to an embodiment of the present invention, comprises: a first network which is a non-lightweight network that extracts features from an input image; and a second network which is a lightweight network that performs inference on the input image on the basis of the extracted features. Accordingly, the performance of the lightweight artificial intelligence can be complemented by integrating the feature point extraction function of the CNN into a lightweight network in an on-device environment.
Abstract of: WO2026116542A1
A processor of an electronic device, according to one embodiment, may acquire: from a first neural network into which a speech signal received through a microphone has been input, a first sequence of portions of the speech signal corresponding to designated frame units; from a second neural network into which designated text has been input, a second sequence of one or more phonetic symbols for the designated text; from a third neural network into which the first sequence and the second sequence have been input, a first dataset indicating the degree to which each of the one or more phonetic symbols corresponds to each of the portions of the speech signal; and, from a pattern discriminator into which the first dataset has been input, predicted phonemes of the portions of the speech signal.
Abstract of: US20260154775A1
A multi-node cluster-based inference method through GPU separation allocation of a pre-trained layer and a fine-tuning layer of multiple deep learning models. The method includes: receiving an input value from a client; distributing the received input value, and transmitting a first input value to a first computation node including a container in which a neural network bundle of a first stage is loaded; performing, by a first container of the first computation node, an operation through a neural network layer of a GPU by using the received first input value as an input, and generating a first output value; and selecting a container in which the neural network bundle of the next stage is loaded, and transmitting the second output value to the computation node that includes the container in which the next stage is to be executed or the container in which to execute the next stage.
Abstract of: US20260154549A1
A linguistic feature amount output part receives a text describing a base class image and outputs a linguistic feature amount. An image feature amount output part receives the base class image and outputs an image feature amount. A base class image selection part receives the linguistic feature amount, the image feature amount, and the base class image and selects a base class image corresponding to the image feature amount having a distance equal to or smaller than a predetermined threshold value from the linguistic feature amount. A neural network lower layer part receives the base class image selected by the base class image selection part and a novel class image and outputs a value based on the base class image and a value based on the novel class image. A base class classification output part outputs a base class classification based on the base class image and the novel class image. A novel class classification output part outputs a novel class classification based on the novel class image.
Abstract of: AU2025204589A1
The present disclosure relates to the field of laser micro-nano manufacturing technology and discloses a method and system for manufacturing a fiber Bragg grating based on machine vision. By combining advanced machine vision technology and neural network model technology, automatic recognition of the focal plane of the fiber core is realized, so that the laser focus can be automatically focused on the focal plane of the fiber core, thereby realizing the automatic and intelligent fiber Bragg grating manufacturing process. The method not only has the characteristics of high precision and high efficiency, but also shows broad application prospects, providing strong support for the development of optical fiber communication, sensing, and other fields. (Fig. 1)
Abstract of: US20260154568A1
A method for predicting growth on the basis of growth age and providing a solution by using an artificial intelligence model may include the steps of: receiving biometric data of a measurement target; extracting data regarding the predicted age of peak height velocity (APHV), at which the growth velocity is expected to reach the maximum value, by using the biometric data of the measurement target; classifying the growth step of the measurement target into one of multiple growth steps on the basis of the extracted data regarding the predicted APHV; predicting the final height by inputting the extracted data regarding the predicted APHV into a trained neural network; and providing a growth management solution on the basis of the classified growth step and the predicted final height.
Abstract of: US20260154378A1
Apparatuses, systems, and techniques to modify a set of training data used for machine learning. In at least one embodiment, a set of images used for training a machine learning system is resampled by augmenting the set of images with additional images of underrepresented object types extracted from portions of existing training images in the set.
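As an illustration of this kind of resampling, the sketch below balances a toy annotation list by re-using crops of rare object types. The record layout `(image_id, object_type, bounding_box)`, the balance-to-the-largest-class target, and the `#crop` naming are all assumptions made for the example; the abstract does not specify them.

```python
from collections import Counter
from typing import List, Tuple

# Hypothetical annotation record: (image_id, object_type, bounding_box).
Annotation = Tuple[str, str, Tuple[int, int, int, int]]


def resample(annotations: List[Annotation]) -> List[Annotation]:
    """Return the original annotations plus extra crops of rare object types."""
    counts = Counter(obj_type for _, obj_type, _ in annotations)
    target = max(counts.values())  # assumed goal: match the most common type
    extra: List[Annotation] = []
    for obj_type, count in counts.items():
        pool = [a for a in annotations if a[1] == obj_type]
        # Cycle through existing crops of this type until it reaches the target.
        for i in range(target - count):
            image_id, _, box = pool[i % len(pool)]
            extra.append((f"{image_id}#crop{i}", obj_type, box))
    return annotations + extra
```

In a real pipeline each added record would correspond to an image region cut from an existing training image, possibly with further augmentation; here the bounding box simply stands in for that crop.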
Abstract of: US20260154979A1
A method for training an image processing neural network. The method includes: providing a set of training images; feeding each training image to a first trained neural network, which assigns semantic information to pixels, other image portions, and/or image features of an input image; feeding each training image to a second trained neural network, which assigns depth information to pixels, other image portions, and/or image features of an input image; fusing the semantic information and depth information to form a target map, which assigns semantic information to locations in three-dimensional space; processing, using the image processing neural network to be trained, each training image to form a map, which assigns semantic information to locations in three-dimensional space; checking, using a cost function, to what extent the map thus obtained is in line with the target map; and optimizing parameters that characterize the behavior of the image processing neural network.
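The fusion step can be sketched as a back-projection of labeled pixels into three-dimensional space. The abstract does not state how the fusion is performed, so the pinhole camera model and the intrinsics `fx`, `fy`, `cx`, `cy` used here are assumptions, as is the function name `fuse`.

```python
from typing import Dict, List, Tuple

Point3D = Tuple[float, float, float]


def fuse(
    labels: List[List[str]],
    depth: List[List[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> Dict[Point3D, str]:
    """Combine per-pixel labels and depths into a map from 3-D points to labels.

    Assumes a pinhole camera: pixel (u, v) at depth z back-projects to
    x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    """
    target: Dict[Point3D, str] = {}
    for v, row in enumerate(labels):
        for u, label in enumerate(row):
            z = depth[v][u]
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            target[(x, y, z)] = label
    return target
```

The resulting dictionary plays the role of the target map: the network being trained would be scored by a cost function on how closely its own 3-D semantic map agrees with it.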
Abstract of: US20260154533A1
The present disclosure is directed to new, more efficient neural network architectures. As one example, in some implementations, the neural network architectures of the present disclosure can include a linear bottleneck layer positioned structurally prior to and/or after one or more convolutional layers, such as, for example, one or more depthwise separable convolutional layers. As another example, in some implementations, the neural network architectures of the present disclosure can include one or more inverted residual blocks where the input and output of the inverted residual block are thin bottleneck layers, while an intermediate layer is an expanded representation. For example, the expanded representation can include one or more convolutional layers, such as, for example, one or more depthwise separable convolutional layers. A residual shortcut connection can exist between the thin bottleneck layers that play a role of an input and output of the inverted residual block.
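A minimal sketch of an inverted residual block with a linear bottleneck follows, written in plain Python on a 1-D signal for brevity. Several details are assumptions not found in the abstract: the ReLU6 activation, the 3-tap depthwise kernel with zero padding, and all function names. It shows the thin-to-expanded-to-thin shape and the shortcut between the two thin ends.

```python
from typing import List

Tensor = List[List[float]]  # [channels][positions]; a 1-D signal keeps the sketch short


def relu6(x: float) -> float:
    return min(max(x, 0.0), 6.0)


def activate(x: Tensor) -> Tensor:
    return [[relu6(v) for v in channel] for channel in x]


def pointwise(x: Tensor, w: List[List[float]]) -> Tensor:
    """1x1 convolution: w[out][in] mixes channels independently at each position."""
    positions = range(len(x[0]))
    return [[sum(w_out[c] * x[c][p] for c in range(len(x))) for p in positions] for w_out in w]


def depthwise(x: Tensor, kernels: List[List[float]]) -> Tensor:
    """Per-channel 3-tap convolution with zero padding (no channel mixing)."""
    out: Tensor = []
    for channel, k in zip(x, kernels):
        padded = [0.0] + channel + [0.0]
        out.append([sum(k[j] * padded[p + j] for j in range(3)) for p in range(len(channel))])
    return out


def inverted_residual(x: Tensor, w_expand, dw_kernels, w_project) -> Tensor:
    h = activate(pointwise(x, w_expand))    # thin bottleneck -> expanded representation
    h = activate(depthwise(h, dw_kernels))  # depthwise convolution in the expanded space
    h = pointwise(h, w_project)             # linear bottleneck: projection, no activation
    # Residual shortcut between the thin input and thin output.
    return [[a + b for a, b in zip(cx, ch)] for cx, ch in zip(x, h)]
```

The pointwise and depthwise steps together form a depthwise separable convolution; omitting the activation after the projection is what makes the bottleneck "linear".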
Abstract of: US20260154955A1
Apparatuses, systems, and techniques are presented for generating instructional text. In at least one embodiment, an instructional video is analyzed to determine logical steps of a process or task demonstrated in that video, and instructive text is generated for those logical steps.
Abstract of: US20260154550A1
A neural network in one embodiment is built by decomposing a structure into different building materials, creating neurons that represent building materials and open spaces in a structure. Subsystems in the building have their neurons concatenated together to create same length neuron strings. In some embodiments, neurons in a short neuron string are split to make longer neuron strings. In some embodiments, neurons are added to some neuron strings to represent inside features, air features, and outside features.
Abstract of: US20260154959A1
Apparatuses, systems, and techniques to identify one or more objects in one or more images. In at least one embodiment, one or more objects are identified in one or more images based, at least in part, on a likelihood that one or more objects is different from other objects in one or more images.
Abstract of: EP4752843A2
Some embodiments of an example method disclosed herein may include receiving point cloud data representing one or more three-dimensional objects; receiving a viewpoint of the point cloud data; selecting a selected object from the one or more three-dimensional objects using the viewpoint; retrieving a neural network model for the selected object; generating a level of detail data for the selected object using the neural network model; and replacing, within the point cloud data, points corresponding to the selected object with the level of detail data.
Abstract of: GB2644802A
The present invention relates to an AI-driven system for generating style transfer fingerprints and compositions. It includes modules for integrating with sonic libraries, extracting metadata and audio features, and employing deep neural networks for style transfer. A style fingerprint generation module captures the artist's sonic characteristics, stored securely in a database linked to artist profiles. A composition generation module utilizes these fingerprints to create new audio compositions that authentically reflect the artist's unique style. The method involves connecting the artist's library, preprocessing audio, extracting features, training a style transfer model, generating a style fingerprint, and producing compositions.
Abstract of: EP4752837A2
Aspects of the present disclosure involve a system comprising a computer-readable storage medium storing a program and method for video synthesis. The program and method provide for accessing a primary generative adversarial network (GAN) comprising a pre-trained image generator, a motion generator comprising a plurality of neural networks, and a video discriminator; generating an updated GAN based on the primary GAN, by performing operations comprising identifying input data of the updated GAN, the input data comprising an initial latent code and a motion domain dataset, training the motion generator based on the input data, and adjusting weights of the plurality of neural networks of the primary GAN based on an output of the video discriminator; and generating a synthesized video based on the primary GAN and the input data.
Abstract of: AU2026203685A1
Various techniques facilitate the development of an image library that can be used to train and/or validate an automated visual inspection (AVI) model, such as an AVI neural network for image classification. In one aspect, an arithmetic transposition algorithm is used to generate synthetic images from original images by transposing features (e.g., defects) onto the original images, with pixel-level realism. In other aspects, digital inpainting techniques are used to generate realistic synthetic images from original images. Deep learning-based inpainting techniques may be used to add, remove, and/or modify defects or other depicted features. In still other aspects, quality control techniques are used to assess the suitability of image libraries for training and/or validation of AVI models, and/or to assess whether individual images are suitable for inclusion in such libraries.
Abstract of: US20260148518A1
A training method for a neural network model for image processing is provided. The present disclosure relates to the technical field of artificial intelligence, and in particular to the technical field of image recognition. The neural network model includes a first sub-model and a second sub-model, and a training method for the first sub-model includes: obtaining a first sample image and labeling a ground truth coordinate value of a region of interest in the first sample image; inputting the first sample image into the first sub-model to obtain a first output; and adjusting parameters of the first sub-model; a training method for the second sub-model includes: obtaining a second sample image and labeling a ground truth threshold; inputting the second sample image into the first sub-model and obtaining a second output of the first sub-model; inputting the second output into the second sub-model and obtaining a predicted threshold output by the second sub-model; and adjusting parameters of the second sub-model based on the ground truth threshold and the predicted threshold.
Publication No.: US20260148519A1 28/05/2026
Applicant:
QILU UNIV OF TECHNOLOGY SHANDONG ACADEMY OF SCIENCES [CN]
SHANDONG ARTIFICIAL INTELLIGENCE INST [CN]
Qilu University of Technology (Shandong Academy of Sciences)
Shandong Artificial Intelligence Institute
Abstract of: US20260148519A1
The disclosure relates to image retrieval, and provides a method for constructing an adaptive weight-based cross-camera proxy contrastive loss; the method includes: obtaining a pre-processed training dataset; inputting the pre-processed training dataset into a convolutional neural network to obtain a global feature; constructing a cross-camera proxy contrastive loss function based on the global feature; constructing an adaptive weight for the cross-camera proxy contrastive loss based on the cross-camera proxy contrastive loss function; training the network with adaptive weight-integrated cross-camera proxy contrastive loss; and optimizing the network by back-propagating an optimized network parameter. With the adaptive weight, the network model realizes adjustment of the contribution of each sample to the loss based on similarities between the samples and the feature centroids of the cameras; a higher weight is assigned to samples with similar features so that the model being trained focuses more on such higher weighted samples.
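One way to picture an adaptively weighted proxy contrastive loss is sketched below. The abstract gives no formulas, so every choice here is an assumption for illustration: cosine similarity, an InfoNCE-style loss over camera proxies (centroids), a temperature `tau`, and softmax weights computed from each sample's similarity to its positive proxy. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def proxy_contrastive(feature, proxies, positive: int, tau: float = 0.5) -> float:
    """Assumed InfoNCE-style loss: pull the feature toward its positive proxy."""
    logits = [cosine(feature, p) / tau for p in proxies]
    log_norm = math.log(sum(math.exp(l) for l in logits))
    return log_norm - logits[positive]


def adaptive_weights(features, proxies, positives: List[int]) -> List[float]:
    """Assumed weighting: softmax over each sample's similarity to its positive proxy."""
    scores = [math.exp(cosine(f, proxies[k])) for f, k in zip(features, positives)]
    total = sum(scores)
    return [s / total for s in scores]


def weighted_loss(features, proxies, positives: List[int], tau: float = 0.5) -> float:
    """Sum of per-sample losses, each scaled by its adaptive weight."""
    weights = adaptive_weights(features, proxies, positives)
    losses = [proxy_contrastive(f, proxies, k, tau) for f, k in zip(features, positives)]
    return sum(w * l for w, l in zip(weights, losses))
```

With this weighting, a sample whose feature lies closer to its proxy receives a larger share of the total loss, which is consistent with the abstract's statement that higher weights go to samples with similar features.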