
Screencap Plugin

Description

Screencap is a screen capture plugin that provides desktop and window capture capabilities for CVEDIA-RT. It enables real-time screen recording, window capture, and desktop monitoring for AI processing applications that require screen-based input sources. The plugin is particularly useful for analyzing user interfaces, monitoring applications, or processing desktop content.

On Windows, Screencap uses the modern Windows Graphics Capture API for efficient, high-performance screen capture with minimal system impact. On Linux, it integrates with GStreamer's ximagesrc element to provide similar functionality.

Key Features

  • Desktop Capture: Full desktop screen capture using Windows Graphics Capture API
  • Window Capture: Selective window capture through system picker dialog
  • Multi-monitor Support: Support for multiple display configurations
  • Real-time Processing: Live screen capture optimized for real-time AI processing
  • Performance Optimization: Efficient capture with minimal system impact using asynchronous operations
  • Cursor Control: Optional cursor inclusion/exclusion in captures
  • Format Support: Automatic pixel format conversion (BGRA to BGR)
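
The BGRA to BGR conversion listed above amounts to dropping the alpha byte of every pixel. The following is a minimal sketch of that idea, assuming a tightly packed buffer with no row padding; it is not the plugin's actual implementation, and `bgraToBgr` is a name chosen for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts a tightly packed BGRA buffer (4 bytes per pixel) to BGR
// (3 bytes per pixel) by copying the colour bytes and skipping alpha.
// Trailing bytes that do not form a whole pixel are ignored.
std::vector<std::uint8_t> bgraToBgr(const std::vector<std::uint8_t>& bgra) {
    const std::size_t pixels = bgra.size() / 4;
    std::vector<std::uint8_t> bgr;
    bgr.reserve(pixels * 3);
    for (std::size_t i = 0; i < pixels; ++i) {
        bgr.push_back(bgra[i * 4 + 0]);  // blue
        bgr.push_back(bgra[i * 4 + 1]);  // green
        bgr.push_back(bgra[i * 4 + 2]);  // red
    }
    return bgr;
}
```

Since OpenCV is listed as a dependency, the same result can be obtained with `cv::cvtColor(src, dst, cv::COLOR_BGRA2BGR)`; whether the plugin does so internally is an assumption.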

When to Use

  • Analyzing user interface interactions and workflows
  • Monitoring desktop applications for anomaly detection
  • Processing screen content for accessibility applications
  • Recording desktop demonstrations for analysis
  • Capturing application windows for quality assurance testing
  • Creating training data from desktop applications

Requirements

Windows Requirements

  • Windows 10 version 1903 (build 18362) or later
  • Windows Graphics Capture API support
  • DirectX 11 compatible graphics hardware
  • Visual Studio C++ runtime libraries
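
A deployment script or launcher may want to check the build requirement before starting the plugin. The sketch below compares a `major.minor.build` version string against the documented minimum; `meetsMinimumBuild` is a hypothetical helper, and how the version string is obtained is left to the caller.

```cpp
#include <cstdio>
#include <string>

// Minimum build taken from the requirements above (Windows 10 version 1903).
constexpr int kMinScreencapBuild = 18362;

// Parses "major.minor.build" (for example "10.0.19045") and reports whether
// the build component meets the documented minimum.
// Returns false for malformed input.
bool meetsMinimumBuild(const std::string& version) {
    int major = 0, minor = 0, build = 0;
    if (std::sscanf(version.c_str(), "%d.%d.%d", &major, &minor, &build) != 3) {
        return false;
    }
    if (major != 10) {
        return major > 10;
    }
    return build >= kMinScreencapBuild;
}
```

On Windows itself, `GraphicsCaptureSession::IsSupported()` from the Windows Graphics Capture API is a more direct runtime check than comparing build numbers.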

Linux Requirements

  • GStreamer with the "good" plugin set (gstreamer1.0-plugins-good, which provides ximagesrc)
  • X11 development libraries
  • XShm extension support

Software Dependencies

  • CVEDIA-RT Core runtime
  • Platform-specific capture libraries (Windows Runtime, GStreamer)
  • OpenCV for image processing

Configuration

Basic Configuration

{
  "screencap": {
    "uri": "screencap:///",
    "capture_cursor": false
  }
}

Advanced Configuration

{
  "screencap": {
    "uri": "screencap:///",
    "capture_cursor": true,
    "target_fps": 30,
    "buffer_size": 5
  }
}

Configuration Schema

Parameter       Type     Default          Description
uri             string   "screencap:///"  Screen capture URI scheme
capture_cursor  boolean  false            Include mouse cursor in capture
target_fps      integer  30               Target capture frame rate
buffer_size     integer  5                Frame buffer size for smooth capture
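
The schema can be mirrored in a small settings struct that carries the documented defaults and rejects obviously invalid values. This is an illustrative sketch only: `ScreencapSettings` and `validate` are hypothetical names, and the accepted ranges are assumptions rather than documented limits.

```cpp
#include <string>
#include <vector>

// Mirrors the schema; the defaults are the documented ones.
struct ScreencapSettings {
    std::string uri = "screencap:///";
    bool capture_cursor = false;
    int target_fps = 30;
    int buffer_size = 5;
};

// Returns a list of human-readable problems; an empty list means the
// settings passed these (assumed) sanity checks.
std::vector<std::string> validate(const ScreencapSettings& s) {
    std::vector<std::string> problems;
    if (s.uri.rfind("screencap://", 0) != 0) {
        problems.push_back("uri must use the screencap:// scheme");
    }
    if (s.target_fps <= 0) {
        problems.push_back("target_fps must be positive");
    }
    if (s.buffer_size <= 0) {
        problems.push_back("buffer_size must be positive");
    }
    return problems;
}
```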

API Reference

C++ API

The Screencap plugin implements the InputHandler interface. The declaration below is abridged; the base class and private members are omitted:

class Screencap {
public:
    // Core capture methods
    bool startCapture();
    cvec getNextFrame();
    int getCurrentFrame() const;

    // Control methods
    bool isPaused() const;
    bool isEnded();
    void pause(bool state);

    // InputHandler interface
    int getCurrentFrameIndex() override;
    float getCurrentFps(iface::FPSType fpsType = iface::FPSType::FPSType_TARGET) override;
    double getCurrentTimestamp() override;
    bool canRead() override;
    expected<void> openUri(std::string const& uri) override;
    expected<cvec> readFrame(bool ignore_skip_frame = false, cmap frameSettings = {}) override;

    // Configuration
    struct config {
        bool capture_cursor = false;
    };
};

Windows-Specific Implementation

// Windows Graphics Capture integration
class WindowsScreencap {
    winrt::Windows::Graphics::Capture::GraphicsCapturePicker picker;
    winrt::Windows::Graphics::Capture::GraphicsCaptureItem captureItem;
    winrt::Windows::Graphics::Capture::Direct3D11CaptureFramePool framePool;

    // Asynchronous frame processing
    winrt::Windows::Foundation::IAsyncAction ProcessFrameAsync();
    void OnFrameArrived(winrt::Windows::Graphics::Capture::Direct3D11CaptureFramePool const& sender,
                       winrt::Windows::Foundation::IInspectable const& args);
};

URI Registration

extern "C" EXPORT void registerHandler() {
    api::input::registerUriHandler("screencap", &ScreencapInput::create);
}
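
Registration by URI scheme suggests a lookup from scheme name to factory function. The sketch below shows one way such a registry could work; `UriRegistry`, `Handler` and `Factory` are hypothetical stand-ins and do not describe CVEDIA-RT internals.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for an input handler; not a CVEDIA-RT type.
struct Handler {
    std::string scheme;
};

using Factory = std::function<std::unique_ptr<Handler>()>;

// Hypothetical registry keyed by URI scheme.
class UriRegistry {
public:
    void registerUriHandler(const std::string& scheme, Factory factory) {
        factories_[scheme] = std::move(factory);
    }

    // Extracts the text before "://" and invokes the matching factory.
    // Returns nullptr when the URI has no scheme or no handler is registered.
    std::unique_ptr<Handler> create(const std::string& uri) const {
        const auto pos = uri.find("://");
        if (pos == std::string::npos) return nullptr;
        const auto it = factories_.find(uri.substr(0, pos));
        if (it == factories_.end()) return nullptr;
        return it->second();
    }

private:
    std::map<std::string, Factory> factories_;
};
```

With a registry of this shape, opening "screencap:///" would resolve to whatever factory was registered under "screencap", which matches the role the `registerHandler` export appears to play.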

Lua API

Screen capture is typically configured through the Input plugin interface:

-- Create input instance for screen capture
local instance = api.thread.getCurrentInstance()
local input = api.factory.input.create(instance, "Input")

-- Configure for screen capture
local config = {
    uri = "screencap:///",
    capture_cursor = false,
    buffer_size = 8
}
input:saveConfig(config)
input:setSourceFromConfig()

Examples

Basic Desktop Capture

-- Create screen capture input
local instance = api.thread.getCurrentInstance()
local input = api.factory.input.create(instance, "Input")

-- Configure for full desktop capture
local config = {
    uri = "screencap:///",
    capture_cursor = false,
    buffer_size = 10
}
input:saveConfig(config)

-- Start capture (will show Windows picker dialog)
input:setSourceFromConfig()

-- Process captured frames
while input:canRead() do
    local frames = input:readMetaFrames(false)
    if frames and #frames > 0 then
        local frame = frames[1]
        api.logging.LogInfo("Captured frame at " .. frame.timestamp)
        -- Process screen content for analysis
        processScreenFrame(frame)
    end
end

Application Window Monitoring

-- Monitor specific application window
local instance = api.thread.getCurrentInstance()
local input = api.factory.input.create(instance, "Input")

-- Configure with cursor capture for interaction analysis
local config = {
    uri = "screencap:///",
    capture_cursor = true,  -- Include cursor for UI interaction analysis
    target_fps = 15,        -- Lower FPS for monitoring applications
    buffer_size = 5
}
input:saveConfig(config)

-- User selects window through system dialog
input:setSourceFromConfig()

-- Analyze window content
local frame_count = 0
while input:canRead() do
    local frames = input:readMetaFrames(false)
    if frames and #frames > 0 then
        frame_count = frame_count + 1

        -- Process every 10th frame to reduce load
        if frame_count % 10 == 0 then
            analyzeApplicationWindow(frames[1])
        end
    end
end

Multi-Screen Setup

-- Capture from multiple screens (requires multiple instances)
local screens = {}
local screen_count = 2

local instance = api.thread.getCurrentInstance()
for i = 1, screen_count do
    screens[i] = api.factory.input.create(instance, "Input" .. i)

    local config = {
        uri = "screencap:///",
        capture_cursor = false,
        buffer_size = 8,
        target_fps = 20
    }
    screens[i]:saveConfig(config)

    -- Each instance will trigger a separate window picker
    api.logging.LogInfo("Select screen/window for monitor " .. i)
    screens[i]:setSourceFromConfig()
end

-- Process all screens round-robin until no input can be read
local any_active = true
while any_active do
    any_active = false
    for i = 1, screen_count do
        if screens[i]:canRead() then
            any_active = true
            local frames = screens[i]:readMetaFrames(false)
            if frames and #frames > 0 then
                processScreenContent(i, frames[1])
            end
        end
    end
end

Linux GStreamer Integration

-- Linux screen capture using GStreamer fallback
local instance = api.thread.getCurrentInstance()
local input = api.factory.input.create(instance, "Input")

-- Configure GStreamer pipeline for screen capture
-- ximagesrc end coordinates are inclusive, so a 1920x1080 region ends at 1919,1079
local config = {
    uri = "gstreamer:///ximagesrc startx=0 starty=0 endx=1919 endy=1079 ! videoconvert ! video/x-raw,format=BGR ! appsink drop=true name=cvdsink",
    buffer_size = 10
}
input:saveConfig(config)
input:setSourceFromConfig()

-- Process Linux screen capture
while input:canRead() do
    local frames = input:readMetaFrames(false)
    if frames and #frames > 0 then
        processLinuxScreenFrame(frames[1])
    end
end
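
When the capture region is computed rather than hard-coded, the pipeline string can be assembled from a position and size. ximagesrc treats `endx` and `endy` as inclusive coordinates, so a region of width W starting at x ends at x + W - 1. The helper below is a sketch; `buildXimagesrcUri` is a hypothetical name, and the rest of the pipeline is copied from the example above.

```cpp
#include <string>

// Builds a gstreamer:/// URI that captures a width x height region whose
// top-left corner is at (x, y). The end coordinates are inclusive,
// hence the "- 1".
std::string buildXimagesrcUri(int x, int y, int width, int height) {
    return "gstreamer:///ximagesrc startx=" + std::to_string(x) +
           " starty=" + std::to_string(y) +
           " endx=" + std::to_string(x + width - 1) +
           " endy=" + std::to_string(y + height - 1) +
           " ! videoconvert ! video/x-raw,format=BGR"
           " ! appsink drop=true name=cvdsink";
}
```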

Best Practices

Performance Optimization

  1. Use appropriate frame rates - Higher FPS increases system load
  2. Monitor system resources - Screen capture can be CPU intensive
  3. Optimize buffer size - Balance between smoothness and memory usage
  4. Disable cursor capture when not needed for better performance
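
To make the buffer size trade-off concrete, the memory held by the frame buffer can be estimated from the frame dimensions. The sketch below assumes the buffer stores uncompressed frames (3 bytes per pixel for BGR); how the plugin actually stores buffered frames is not documented here, so treat the result as an upper-bound estimate.

```cpp
#include <cstdint>

// Bytes needed to hold `frames` uncompressed frames of the given size.
std::uint64_t frameBufferBytes(std::uint32_t width, std::uint32_t height,
                               std::uint32_t bytesPerPixel, std::uint32_t frames) {
    return static_cast<std::uint64_t>(width) * height * bytesPerPixel * frames;
}
```

For a 1920x1080 BGR capture with the default buffer_size of 5, this gives 1920 × 1080 × 3 × 5 = 31,104,000 bytes, roughly 30 MiB.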

Window Selection

  1. Test window picker before deployment in production
  2. Handle picker cancellation gracefully in applications
  3. Document window selection process for end users
  4. Consider automation for unattended deployments

System Integration

  1. Check Windows version compatibility before deployment
  2. Test on target hardware to verify Graphics Capture API support
  3. Monitor capture stability during long-running sessions
  4. Handle display changes (resolution, orientation) appropriately

Security Considerations

  1. Be aware of sensitive content in screen captures
  2. Implement privacy controls when capturing user desktops
  3. Secure capture data appropriately during processing
  4. Consider user consent for screen capture applications

Troubleshooting

Common Issues

  1. "Graphics Capture not supported" error (Windows)

    • Verify Windows 10 build 18362 or later
    • Check for Windows Updates
    • Ensure Graphics Capture API is available
    • Test on different hardware if persistent
  2. Window picker dialog not appearing

    • Verify application has UI focus
    • Check for popup blockers or security software
    • Try running as administrator if needed
    • Ensure Windows Runtime components are installed
  3. Poor capture performance

    • Reduce target_fps to lower values (10-15 FPS)
    • Decrease buffer_size to reduce memory usage
    • Close unnecessary applications to free resources
    • Check for hardware acceleration availability
  4. Capture stops unexpectedly

    • Monitor target window state (minimized, closed)
    • Handle display configuration changes
    • Check for graphics driver updates
    • Implement capture session recovery
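
Capture session recovery, the last suggestion above, is often implemented as a bounded retry with increasing delays. The following is a generic sketch: `reopen` is a hypothetical callback that would wrap whatever call re-opens the capture source, and the doubling delay policy is an assumption rather than a CVEDIA-RT recommendation.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Calls `reopen` until it succeeds or `maxAttempts` is reached, doubling the
// wait between attempts. Returns the number of attempts used on success,
// or 0 if every attempt failed.
int reopenWithBackoff(const std::function<bool()>& reopen,
                      int maxAttempts,
                      std::chrono::milliseconds initialDelay) {
    auto delay = initialDelay;
    for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
        if (reopen()) {
            return attempt;
        }
        if (attempt < maxAttempts) {
            std::this_thread::sleep_for(delay);
            delay *= 2;
        }
    }
    return 0;
}
```

Note that on Windows a reopened session may show the picker dialog again, so unattended recovery may not be possible for picker-based capture.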

Linux-Specific Issues

  1. XImageSrc not available

    • Install gstreamer1.0-plugins-good package
    • Verify X11 development libraries are installed
    • Check DISPLAY environment variable
    • Ensure XShm extension is supported
  2. Permission denied for screen capture

    • Check X11 security settings
    • Verify user has display access permissions
    • Test with different window managers
    • Consider using xhost +local: if appropriate

Debugging Tips

-- Monitor screen capture performance
local function debugScreenCapture(input)
    local stats = {
        current_frame = input:getCurrentFrame(),
        fps = input:getFPS(3),  -- Real FPS (type 3)
        timestamp = input:getCurrentTimestamp(),
        can_read = input:canRead(),
        is_paused = input:isPaused()
    }

    local json = dofile(luaroot .. "/api/json.lua")
    api.logging.LogDebug("Screen Capture Stats: " .. json.encode(stats))

    -- Check for frame drops
    local config = input:getConfig()
    if config.target_fps and stats.fps < config.target_fps * 0.8 then
        api.logging.LogWarning("Frame rate below target")
    end
end

-- Monitor system resources
local function checkSystemLoad()
    -- System monitoring requires platform-specific implementation
    -- Option 1: Parse /proc files on Linux
    -- Option 2: Use os.execute() with system commands
    -- Option 3: Implement monitoring in C++ and expose to Lua

    -- Example using io.popen with command line tools (Linux only):
    local handle = io.popen("top -bn1 | grep 'Cpu(s)'")
    if handle then
        local result = handle:read("*a")
        handle:close()
        -- Parse CPU usage from result
        api.logging.LogInfo("CPU info: " .. result)
    end
end
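
For hosts that consume frames from C++, the same frame rate check can be done by measuring the rate from frame timestamps instead of relying on a reported value. The class below is a sketch; `FpsMeter` is a hypothetical helper, and it assumes timestamps are expressed in seconds.

```cpp
#include <cstddef>
#include <deque>

// Estimates frame rate from the timestamps (in seconds) of the most
// recent frames, using a sliding window.
class FpsMeter {
public:
    explicit FpsMeter(std::size_t window) : window_(window < 2 ? 2 : window) {}

    void addFrame(double timestampSeconds) {
        stamps_.push_back(timestampSeconds);
        if (stamps_.size() > window_) {
            stamps_.pop_front();
        }
    }

    // Frames per second over the window, or 0 if fewer than two frames
    // have been seen or no time has elapsed.
    double fps() const {
        if (stamps_.size() < 2) return 0.0;
        const double elapsed = stamps_.back() - stamps_.front();
        if (elapsed <= 0.0) return 0.0;
        return static_cast<double>(stamps_.size() - 1) / elapsed;
    }

    // Same 80% threshold as the Lua check above.
    bool belowTarget(double targetFps) const {
        return fps() < targetFps * 0.8;
    }

private:
    std::size_t window_;
    std::deque<double> stamps_;
};
```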

Integration Examples

UI Testing and Automation

-- Screen capture for automated UI testing
local instance = api.thread.getCurrentInstance()
local input = api.factory.input.create(instance, "Input")

local config = {
    uri = "screencap:///",
    capture_cursor = true,  -- Track cursor for interaction analysis
    target_fps = 10,        -- Sufficient for UI change detection
    buffer_size = 15
}
input:saveConfig(config)
input:setSourceFromConfig()

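-- Analyze UI changes
-- calculateFrameDifference, analyzeUIChange and threshold are application-defined
local threshold = 0.05  -- placeholder value; tune for your content and difference metric
local previous_frame = nil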
while input:canRead() do
    local frames = input:readMetaFrames(false)
    if frames and #frames > 0 then
        local current_frame = frames[1]

        if previous_frame then
            local diff = calculateFrameDifference(previous_frame, current_frame)
            if diff > threshold then
                api.logging.LogInfo("UI change detected at " .. current_frame.timestamp)
                analyzeUIChange(previous_frame, current_frame)
            end
        end

        previous_frame = current_frame
    end
end
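
The example leaves `calculateFrameDifference` undefined. One simple metric is the mean absolute difference between the raw bytes of two frames, scaled to the range 0 to 1. The sketch below shows that metric on plain byte buffers; it is an assumption about what such a helper could compute, not part of the CVEDIA-RT API.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Mean absolute per-byte difference between two frames, scaled to [0, 1].
// Returns 1.0 when the buffers differ in size (treated as a full change)
// and 0.0 when both are empty.
double frameDifference(const std::vector<std::uint8_t>& a,
                       const std::vector<std::uint8_t>& b) {
    if (a.size() != b.size()) return 1.0;
    if (a.empty()) return 0.0;
    std::uint64_t total = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const int delta = static_cast<int>(a[i]) - static_cast<int>(b[i]);
        total += static_cast<std::uint64_t>(std::abs(delta));
    }
    return static_cast<double>(total) / (255.0 * static_cast<double>(a.size()));
}
```

A threshold of a few percent on this scale tends to ignore cursor blinks while still catching dialog or page changes, but the right value depends on the captured content.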

Application Monitoring

-- Monitor application for anomaly detection
local json = dofile(luaroot .. "/api/json.lua")  -- needed for json.encode below
local instance = api.thread.getCurrentInstance()
local input = api.factory.input.create(instance, "Input")

local config = {
    uri = "screencap:///",
    capture_cursor = false,
    target_fps = 5,  -- Low frequency monitoring
    buffer_size = 10
}
input:saveConfig(config)
input:setSourceFromConfig()

-- Continuous monitoring
local monitoring_active = true
while monitoring_active and input:canRead() do
    local frames = input:readMetaFrames(false)
    if frames and #frames > 0 then
        local anomalies = detectApplicationAnomalies(frames[1])
        if #anomalies > 0 then
            api.logging.LogWarning("Application anomalies detected: " .. json.encode(anomalies))
            triggerAlert(anomalies)
        end
    end

    -- Check monitoring status
    monitoring_active = checkMonitoringStatus()
end

See Also