Conversation

Retribution98 (Contributor) commented Dec 12, 2025

Description

This PR introduces comprehensive support for VLMPipeline in the JavaScript bindings.

VLMPipeline is implemented similarly to LLMPipeline, but with some improvements:

  • Error handling in vLmPerformInferenceThread was reworked for a much better user experience: the thread no longer throws fatal errors. Instead, it uses a callback to reject the promise, surfacing a catchable exception to the user; if that is not possible, an error message is logged.
  • The streamer and the callback were separated to make the code clearer. They serve different cases and have different semantics: the callback is always used to resolve the promise with the final result or reject it with an error, while the streamer is used only when the user provides it to receive intermediate chunks.
  • A wrapper property, is_initializing, was added to prevent the pipeline from being incorrectly loaded twice.
  • A wrapper property, is_generating, was added to prevent incorrect attempts to call generate in parallel, which would otherwise lead to an InferRequest exception that may confuse the user.
  • VLMPerfMetrics should extend PerfMetrics, so this was refactored to avoid code duplication. The Node Addon API doesn't support inheritance out of the box, so a workaround (a CRTP base template class) was used to resolve this issue.
  • The optional parameters for generate and stream were grouped into a single options argument. This allows named arguments and simplifies calls; see the usage sketch below.
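
A minimal usage sketch of the resulting API, assuming the VLMPipeline factory exported in this PR. The package entry point, option names (images, generationConfig, streamer), tensor type, and model path below are illustrative assumptions, not verbatim from the diff:

    // Illustrative sketch only: the import paths and option names are assumptions.
    import { VLMPipeline } from "openvino-genai-node";
    import type { Tensor } from "openvino-node";

    declare const imageTensor: Tensor; // an image tensor built elsewhere (see the helpers in tests/utils.js)

    async function describeImage(): Promise<void> {
        const pipeline = await VLMPipeline("./model_dir", "CPU"); // is_initializing guards against a second concurrent init
        try {
            // Optional parameters are grouped into a single options object.
            const result = await pipeline.generate("Describe this image", {
                images: [imageTensor],
                generationConfig: { max_new_tokens: 100 },
                streamer: (chunk: string) => process.stdout.write(chunk), // receives intermediate chunks
            });
            console.log(result.texts[0]); // the final DecodedResults is delivered via the internal callback
        } catch (err) {
            // Inference-thread errors reject the promise instead of crashing the process;
            // calling generate() again while one call is in flight is likewise rejected (is_generating).
            console.error(err);
        }
    }

Grouping the optional parameters this way lets callers pass only what they need, without positional placeholders.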

CVS-172789

Checklist:

Signed-off-by: Kirill Suvorov <[email protected]>
Copilot AI review requested due to automatic review settings December 12, 2025 14:13
github-actions bot added the "category: JS API" (GenAI JS API) label Dec 12, 2025
Copilot AI (Contributor) left a comment

Pull request overview

This PR introduces comprehensive JavaScript bindings support for the VLMPipeline (Visual Language Model Pipeline) with several improvements to error handling, code structure, and API design.

Key Changes:

  • Implemented VLMPipeline similarly to LLMPipeline, with enhanced error handling that uses callbacks to reject promises instead of raising fatal errors
  • Separated streamer and callback logic for clearer semantics and different use cases
  • Added wrapper properties is_initializing and is_generating to prevent incorrect concurrent operations
  • Refactored performance metrics to use inheritance through a base template class to avoid code duplication

Reviewed changes

Copilot reviewed 28 out of 28 changed files in this pull request and generated 7 comments.

Summary per file:
src/js/tests/vlmPipeline.test.js Comprehensive test suite for VLMPipeline functionality including text generation, streaming, image/video handling, and chat operations
src/js/tests/utils.js Helper functions to create test image and video tensors for VLM testing
src/js/tests/models.js Added VLLM model reference for testing
src/js/src/vlm_pipeline/vlm_pipeline_wrapper.cpp Core VLMPipeline implementation with improved error handling and async generation
src/js/src/vlm_pipeline/start_chat_worker.cpp AsyncWorker implementation for starting chat sessions
src/js/src/vlm_pipeline/perf_metrics.cpp VLM-specific performance metrics wrapper extending base metrics
src/js/src/vlm_pipeline/init_worker.cpp AsyncWorker for pipeline initialization with concurrent init prevention
src/js/src/vlm_pipeline/finish_chat_worker.cpp AsyncWorker for finishing chat sessions
src/js/src/perf_metrics.cpp Refactored to use base template class for code reuse
src/js/src/llm_pipeline/llm_pipeline_wrapper.cpp Updated to use shared helper function for decoded results
src/js/src/helper.cpp Added tensor conversion helpers and result conversion utilities
src/js/src/addon.cpp Registered VLMPipeline and VLMPerfMetrics classes
src/js/lib/utils.ts Added VLMPipelineProperties type definition
src/js/lib/pipelines/vlmPipeline.ts TypeScript VLMPipeline class with generate, stream, and chat methods (see the usage sketch after this table)
src/js/lib/pipelines/llmPipeline.ts Moved DecodedResults to separate module for reuse
src/js/lib/perfMetrics.ts Type definitions for performance metrics including VLM-specific metrics
src/js/lib/index.ts Exported VLMPipeline factory and related types
src/js/lib/decodedResults.ts Shared DecodedResults classes for LLM and VLM pipelines
src/js/lib/addon.ts TypeScript interface definitions for VLMPipeline addon
src/js/include/vlm_pipeline/vlm_pipeline_wrapper.hpp Header for VLMPipelineWrapper class
src/js/include/vlm_pipeline/start_chat_worker.hpp Header for VLMStartChatWorker AsyncWorker
src/js/include/vlm_pipeline/perf_metrics.hpp Header for VLMPerfMetricsWrapper
src/js/include/vlm_pipeline/init_worker.hpp Header for VLMInitWorker AsyncWorker
src/js/include/vlm_pipeline/finish_chat_worker.hpp Header for VLMFinishChatWorker AsyncWorker
src/js/include/perf_metrics.hpp Updated to use base template class
src/js/include/helper.hpp Added function declarations for tensor and result conversion
src/js/include/base/perf_metrics.hpp Base template class for performance metrics wrappers using CRTP pattern
src/js/include/addon.hpp Added VLM pipeline and metrics references to AddonData
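
To complement the table above, here is a hedged sketch of the chat flow; the startChat, finishChat, and stream names follow the file summary, while the exact signatures (for example, stream returning an async iterable of text chunks) are assumptions:

    // Illustrative chat flow; method names come from the file summary, but the
    // async-iterable shape of stream() is an assumption.
    await pipeline.startChat();
    for await (const chunk of pipeline.stream("What is in this image?", { images: [imageTensor] })) {
        process.stdout.write(chunk); // intermediate chunks; the final result resolves the promise internally
    }
    await pipeline.finishChat();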


Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 28 out of 28 changed files in this pull request and generated 3 comments.



github-actions bot added the "category: GHA" (CI based on Github actions) label Dec 15, 2025
Copilot AI review requested due to automatic review settings December 15, 2025 08:06
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 29 out of 29 changed files in this pull request and generated 5 comments.



std::shared_ptr<ov::genai::VLMPipeline>& pipe,
std::shared_ptr<bool> is_initializing,
const std::string model_path,
std::string device,
Copilot AI commented Dec 15, 2025

The device parameter should be passed by const reference to avoid unnecessary string copying in the constructor.

Suggested change:
-    std::string device,
+    const std::string& device,

github-actions bot removed the "category: GHA" (CI based on Github actions) label Dec 15, 2025
Copilot AI review requested due to automatic review settings December 15, 2025 12:46
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 28 out of 28 changed files in this pull request and generated 3 comments.



github-actions bot added the "category: GH Pages Docs" (Github Pages documentation) label Dec 15, 2025
Copilot AI review requested due to automatic review settings December 17, 2025 10:16
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 33 out of 33 changed files in this pull request and generated 2 comments.



Copilot AI review requested due to automatic review settings December 18, 2025 12:08
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 33 out of 33 changed files in this pull request and generated 3 comments.



        return this.texts[0];
    }
    const lines = this.scores.map((score, i) => `${score.toFixed(6)}: ${this.texts[i]}`);
    return lines.join("\n");
Copilot AI commented Dec 18, 2025

The toString() implementation differs from the original in llmPipeline.ts. The original handled the last element separately to avoid a trailing newline, but this version always adds newlines between all elements. While functionally similar, this creates an inconsistency in output format compared to the original implementation.

Suggested change:
-    return lines.join("\n");
+    let result = "";
+    for (let i = 0; i < lines.length; i++) {
+        result += lines[i];
+        if (i !== lines.length - 1) {
+            result += "\n";
+        }
+    }
+    return result;

    return env.Undefined();
}

Napi::Value VLMPipelineWrapper::start_chat(const Napi::CallbackInfo& info) {
Contributor suggestion:

Suggested change:
-Napi::Value VLMPipelineWrapper::start_chat(const Napi::CallbackInfo& info) {
+void VLMPipelineWrapper::start_chat(const Napi::CallbackInfo& info) {

    return env.Undefined();
}

Napi::Value VLMPipelineWrapper::set_chat_template(const Napi::CallbackInfo& info) {
Contributor comment:

Please remove the return value from void functions.

Suggested change:
-Napi::Value VLMPipelineWrapper::set_chat_template(const Napi::CallbackInfo& info) {
+void VLMPipelineWrapper::set_chat_template(const Napi::CallbackInfo& info) {

obj.Set("texts", cpp_to_js<std::vector<std::string>, Napi::Value>(env, results.texts));
obj.Set("scores", cpp_to_js<std::vector<float>, Napi::Value>(env, results.scores));
obj.Set("perfMetrics", PerfMetricsWrapper::wrap(env, results.perf_metrics));
obj.Set("subword", Napi::String::New(env, results));
Contributor comment:

Please check whether subword is used in the context of DecodedResults. I think it may be a leftover kept for backward compatibility.

Author reply:

We will remove it when LLMPipeline is refactored

almilosz (Contributor) left a comment

LGTM, a few comments

};

void VLMInitWorker::OnOK() {
    *this->is_initializing = false;
almilosz commented Dec 18, 2025

Is this variable also needed for LLMPipeline? If yes, please add it in a separate PR.

InstanceMethod("setGenerationConfig", &VLMPipelineWrapper::set_generation_config)});
}

Napi::Value VLMPipelineWrapper::init(const Napi::CallbackInfo& info) {
Contributor suggestion:

Suggested change:
-Napi::Value VLMPipelineWrapper::init(const Napi::CallbackInfo& info) {
+void VLMPipelineWrapper::init(const Napi::CallbackInfo& info) {

Author reply:

VALIDATE_ARGS_COUNT currently returns Env().Undefined(), which doesn't allow this to be done now. :(
Let's fix it later.

Comment on lines +197 to +200
    [context, this](Napi::Env) {  // Finalizer used to clean up threads
        context->native_thread.join();
        delete context;
    });

} else {
    // If no exceptions from streamer, call the final callback with the result
    napi_status status =
        context->callback.BlockingCall([result, &report_error](Napi::Env env, Napi::Function jsCallback) {
Contributor comment:

Here result is passed by value, and such a copy may affect performance. Maybe use a shared_ptr or a capture?

Retribution98 added this pull request to the merge queue Dec 19, 2025
Merged via the queue into openvinotoolkit:master with commit 109ee68 Dec 19, 2025
97 checks passed